I have a question about best practices for String manipulation with regular expressions in Java.
I have a changing String template, let's say this time it looks like this:
/get/{id}/person
I have another String that matches this pattern, e.g.:
/get/1234ewq/person
Keep in mind that the pattern could change anytime, slashes could disappear etc.
I would like to extract the difference between the two of them i.e. the result of the processing would be 1234ewq.
I know I could iterate over them char by char and compare, but, if it is possible, I wanted to find some smart approach to it with regular expressions.
What would be the best Java approach?
Thank you.
To answer your question with a regex approach, I built a small example class that should point you in a direction you could take (see below).
The problem with this approach is that you dynamically create a regular expression that depends on your template strings. This means you have to verify somehow that your templates do not interfere with the regex compilation and matching process itself.
Also, at the moment, if you use the same placeholder multiple times within a template, the resulting HashMap only contains the value for the last occurrence of that placeholder.
Normally this is the expected behaviour, but it depends on your strategy for filling your templates.
For template processing in general, you could have a look at the mustache library.
Also, as Uli Sotschok mentioned, you would probably be better off using something like google-diff-match-patch.
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class StringExtractionFromTemplate {
    public static void main(String[] args) {
        String template = "/get/{id}/person";
        String filledTemplate = "/get/1234ewq/person";
        System.out.println(diffTemplateInsertion(template, filledTemplate).get("id"));
    }

    private static Map<String, String> diffTemplateInsertion(String template, String filledTemplate) {
        // Matches one {placeholder}; [^{}]+ keeps two placeholders from being merged into one match
        //language=RegExp
        String placeHolderPattern = "\\{([^{}]+)}";
        Map<String, String> templateTranslation = new HashMap<>();
        // Turn every placeholder into a capturing group, e.g. "/get/(.+)/person"
        String regexedTemplate = template.replaceAll(placeHolderPattern, "(.+)");
        Pattern pattern = Pattern.compile(regexedTemplate);
        Matcher templateMatcher = pattern.matcher(template);
        Matcher filledTemplateMatcher = pattern.matcher(filledTemplate);
        while (templateMatcher.find() && filledTemplateMatcher.find()) {
            if (templateMatcher.groupCount() == filledTemplateMatcher.groupCount()) {
                for (int i = 1; i <= templateMatcher.groupCount(); i++) {
                    // Key: placeholder name without braces; value: the text found at that position
                    templateTranslation.put(
                        templateMatcher.group(i).replaceAll(placeHolderPattern, "$1"),
                        filledTemplateMatcher.group(i)
                    );
                }
            }
        }
        return templateTranslation;
    }
}
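The concern about template characters interfering with the regex can be addressed by quoting the literal parts of the template. The following is only a sketch under that assumption: the class name `TemplateExtractor` and the method `extract` are made up for illustration, and the `{name}` placeholder syntax is taken from the example in the question.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library
class TemplateExtractor {
    // A placeholder is {name}; this syntax is assumed from the question's example
    private static final Pattern PLACEHOLDER = Pattern.compile("\\{([^{}]+)}");

    static Map<String, String> extract(String template, String filled) {
        List<String> names = new ArrayList<>();
        StringBuilder regex = new StringBuilder();
        Matcher m = PLACEHOLDER.matcher(template);
        int last = 0;
        while (m.find()) {
            // Quote the literal text so characters such as '.' or '?' are matched literally
            regex.append(Pattern.quote(template.substring(last, m.start())));
            regex.append("(.+?)");
            names.add(m.group(1));
            last = m.end();
        }
        regex.append(Pattern.quote(template.substring(last)));

        Map<String, String> values = new LinkedHashMap<>();
        Matcher filledMatcher = Pattern.compile(regex.toString()).matcher(filled);
        // matches() requires the whole filled string to fit the template
        if (filledMatcher.matches()) {
            for (int i = 0; i < names.size(); i++) {
                values.put(names.get(i), filledMatcher.group(i + 1));
            }
        }
        return values;
    }

    public static void main(String[] args) {
        System.out.println(extract("/get/{id}/person", "/get/1234ewq/person"));
    }
}
```

If the filled string does not fit the template, the returned map is empty. Two placeholders with no literal text between them remain ambiguous, so this sketch does not try to resolve that case.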
I have a property file (a.txt) with values like the examples below:
test1=10
test2=20
test33=34
test34=35
By reading this file, I need to produce an output like below
value = 35_20_34_10
which means => I have a pattern like test34_test2_test33_test1
Note, If the 'test33' has any value other than 34 then I need to produce the value like below
value = 35_20_10
which means => I have a pattern like test34_test2_test1
Now my problem is that every time the customer changes the logic, I have to change the code. So I want to keep the logic (pattern) in another property file and send two inputs to the util: the property file (A.txt) and 'pattern.txt'.
My util has to compare A.txt with the business logic in 'pattern.txt' and produce output like
value = 35_20_34_10 (or)
value = 35_20_10
Is there an example of such pattern-based logic?
Is there any predefined util / Java class that does this?
Any help would be great.
thanks,
Harry
First of all, svasa's answer makes a lot of sense, but it covers a different level of abstraction. I recommend you read that answer too; the pattern it describes should be useful.
You may want to look at the Apache Velocity and FreeMarker libraries to see how they structure their APIs.
Those are template engines: they usually have some abstraction of a pattern or format, and an abstraction of variable/value binding (or namespace, or source). You render a template by combining it with a binding/namespace, which yields the result.
For example, you may have a pattern "<a> + <b>" and a binding that looks like a map: {a: "1", b: "2"}. Applying that binding to that pattern gives "1 + 2", when <...> is interpreted as a variable.
You basically load the pattern from your pattern.txt, then load your data file A.txt (for example, by treating it as properties and using the Properties class) and construct the binding from these properties. You get your output, and you can customize the pattern at any time.
You may call sequences like test34_test2_test33_test1 a pattern; I will call them constraints that apply when building something.
To me this problem best fits the builder pattern.
When building the value you want, you tell the builder which constraints (pattern) and which original properties to use, like below:
new MyPropertiesBuilder().setConstraints(constraints).setProperties(original).buildValue();
Details:
Set some constraints in a separate file where you specify the order of the properties and their values like :
test34=desiredvalue-could-be-empty
test2=desiredvalue-could-be-empty
test33=34
test1=desiredvalue-could-be-empty
The builder goes over the constraints in the order specified, but gets the values from the original properties and builds the desired string.
One way to achieve your requirement through builder pattern is to define classes like below :
Interface:
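public interface IMyPropertiesBuilder
{
    // The setters return the builder so that the calls can be chained
    IMyPropertiesBuilder setConstraints( Properties constraints );
    IMyPropertiesBuilder setProperties( Properties properties );
    String buildValue();
}
Builder
public class MyPropertiesBuilder implements IMyPropertiesBuilder
{
    private Properties constraints;
    private Properties original;

    @Override
    public IMyPropertiesBuilder setConstraints( Properties constraints )
    {
        this.constraints = constraints;
        return this;
    }

    @Override
    public IMyPropertiesBuilder setProperties( Properties properties )
    {
        this.original = properties;
        return this;
    }

    @Override
    public String buildValue()
    {
        // Caveat: Properties is a Hashtable, so this loop does NOT follow the file order.
        // Read the constraints into an ordered structure if the order matters.
        StringBuilder value = new StringBuilder();
        for ( String key : constraints.stringPropertyNames() )
        {
            String expected = constraints.getProperty( key );
            String actual = original.getProperty( key );
            // Assumption: an empty constraint accepts any value; otherwise the values must match
            if ( actual != null && ( expected.isEmpty() || expected.equals( actual ) ) )
            {
                if ( value.length() > 0 )
                {
                    value.append( "_" );
                }
                value.append( actual );
            }
        }
        return value.toString();
    }
}
User
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;

public class MyPropertiesBuilderUser
{
    public String getValue() throws IOException
    {
        Properties original = load( "original.properties" );
        Properties constraints = load( "constraints.properties" );
        return new MyPropertiesBuilder().setConstraints( constraints ).setProperties( original ).buildValue();
    }

    // Properties.load() returns void, so the file has to be loaded in a separate step
    private static Properties load( String fileName ) throws IOException
    {
        Properties properties = new Properties();
        try ( InputStream in = new FileInputStream( fileName ) )
        {
            properties.load( in );
        }
        return properties;
    }
}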
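Because java.util.Properties does not keep the order of the entries in the file, one way to honour the order of the constraints is to read the pattern file as plain lines. The sketch below assumes a hypothetical line format of either `key` or `key=requiredValue`; the class `PatternValueBuilder` and its `build` method are made up for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: builds the value from ordered pattern lines
class PatternValueBuilder {
    // Each pattern line is "key" or "key=requiredValue" (assumed format);
    // the order of the lines is the order of the output
    static String build(List<String> patternLines, Map<String, String> values) {
        List<String> parts = new ArrayList<>();
        for (String line : patternLines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // skip blank lines
            }
            int eq = trimmed.indexOf('=');
            String key = eq < 0 ? trimmed : trimmed.substring(0, eq).trim();
            String required = eq < 0 ? "" : trimmed.substring(eq + 1).trim();
            String actual = values.get(key);
            // No required value: always include; otherwise include only on an exact match
            if (actual != null && (required.isEmpty() || required.equals(actual))) {
                parts.add(actual);
            }
        }
        return String.join("_", parts);
    }

    public static void main(String[] args) {
        Map<String, String> values = new HashMap<>();
        values.put("test1", "10");
        values.put("test2", "20");
        values.put("test33", "34");
        values.put("test34", "35");
        List<String> pattern = new ArrayList<>();
        pattern.add("test34");
        pattern.add("test2");
        pattern.add("test33=34");
        pattern.add("test1");
        System.out.println("value = " + build(pattern, values));
    }
}
```

In a real util the pattern lines could come from pattern.txt (for example via `Files.readAllLines`) and the values from A.txt loaded as Properties and copied into a map.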
Situation: I'm working on legacy code and trying to improve readability. The following example should visualize the intent:
private static final String CONSTANT_1 = "anyValue";
private static final String CONSTANT_2 = "anyValue";
private static final String CONSTANT_3 = "anyValue";
private static final String CONSTANT_4 = "anyValue";
private static final String CONSTANT_5 = "anyValue";
private final SomeType someField = new SomeType();
private void contentOfSomeMethods(){
someMethod(someField, CONSTANT_1, true);
someMethod(someField, CONSTANT_2, true);
someMethod(someField, CONSTANT_3, true);
someMethod(someField, CONSTANT_4, false);
someMethod(someField, CONSTANT_5, false);
}
private void someMethod(SomeType type, String value, boolean someFlag) { }
Imagine there are about 50 calls of someMethod using about 50 constants. I want to apply safe, automated refactorings to that code so that the contentOfSomeMethods method changes to
private void contentOfSomeMethods(){
doItWith(CONSTANT_1);
doItWith(CONSTANT_2);
doItWith(CONSTANT_3);
doItNotWith(CONSTANT_4);
doItNotWith(CONSTANT_5);
}
and two additional methods are generated:
private void doItWith(String value) {
someMethod(someField, value, true);
}
private void doItNotWith(String value) {
someMethod(someField, value, false);
}
The naive way is to extract all constants in contentOfSomeMethods inside local variables and use then the extract method refactoring to create the desired methods. And afterwards to inline back the local variables. But this solution doesn't scale up.
Another way is to use search and replace with regular expressions, but this is not a safe refactoring, so I could break the code without noticing it.
Do you have any better suggestions? Do you know some plugins for Eclipse that allow that?
I don't know of any utility that would do this directly.
I think using a regular expression is the only way to go. First, you will need to create the two target methods doItWith and doItNotWith. Then, you can highlight the contents of the method contentOfSomeMethods, hit Ctrl+F, and use the following regular expressions:
Find: someMethod\(someField, (\w*), true\);
Replace with: doItWith(\1);
and then
Find: someMethod\(someField, (\w*), false\);
Replace with: doItNotWith(\1);
Be sure to check "Regular Expressions" and "Selected lines".
The regular expressions match the constant that is used inside the function call with (\w*) and then it is used during the replacement with \1. Using this regular expression only on the selected lines minimizes the chance of breaking unrelated code.
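To try the two expressions outside the editor first, they can be run against a sample line with `String.replaceAll`. This is only a sketch: the class name `RefactorRegexDemo` is made up, and note that Java's replacement syntax refers to the captured group as `$1`.

```java
// Hypothetical demo class for checking the two find/replace expressions
class RefactorRegexDemo {
    static String rewrite(String source) {
        return source
            // (\w*) captures the constant name; $1 puts it back into the new call
            .replaceAll("someMethod\\(someField, (\\w*), true\\);", "doItWith($1);")
            .replaceAll("someMethod\\(someField, (\\w*), false\\);", "doItNotWith($1);");
    }

    public static void main(String[] args) {
        System.out.println(rewrite("someMethod(someField, CONSTANT_1, true);"));
        System.out.println(rewrite("someMethod(someField, CONSTANT_4, false);"));
    }
}
```

Calls that use a different first argument or different spacing are left untouched, which makes it easy to spot lines the expressions do not cover.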
Do it with a regular expression and verify it.
I'm assuming that each call to someMethod spans only one line. If not this method is still useful but slower.
Copy the original file.
Use ctrl+alt+h to show the Callers of someMethod and get a count of them.
Do regex search and replaces restricted to the proper area :
Find : someMethod\(someField,[ ]*(CONSTANT_[0-9]+)[ ]*,[ ]*true[ ]*\)[ ]*;
Replace : doItWith($1);
Find : someMethod\(someField,[ ]*(CONSTANT_[0-9]+)[ ]*,[ ]*false[ ]*\)[ ]*;
Replace : doItNotWith($1);
Make a diff of the original file and the new file showing only the lines of the original file which have changed.
diff --changed-group-format='%<' --unchanged-group-format='' original.java refactored.java | wc -l
You should get the same number of lines as you got in the callers of someMethod.
If the calls to someMethod are multiline, or if you want greater verification, just drop | wc -l to see the lines which were modified in the original file and make sure that only the correct lines have been modified.
Alas, I know of nothing in Eclipse that can do this today.
This is something I would like to achieve one day in AutoRefactor: https://github.com/JnRouvignac/AutoRefactor/issues/8
However the road to get there is quite long.
The only ways I know today are to extract local variables then extract method (as you suggested) or use regexes (as somebody else suggested).
I have an enum as follows:
public enum ServerTask {
HOOK_BEFORE_ALL_TASKS("Execute"),
COPY_MASTER_AND_SNAPSHOT_TO_HISTORY("Copy master db"),
PROCESS_CHECKIN_QUEUE("Process Check-In Queue"),
...
}
I also have a string (let's say string = "Execute") which I would like to turn into an instance of the ServerTask enum, based on which string in the enum it matches. Is there a better way to do this than doing equality checks between the string I want to match and every item in the enum? It seems like this would take a lot of if statements, since my enum is fairly large.
At some level you're going to have to iterate over the entire set of enumerations that you have, and you'll have to compare them to equal - either via a mapping structure (initial population) or through a rudimentary loop.
It's fairly easy to accomplish with a rudimentary loop, so I don't see any reason why you wouldn't want to go this route. The code snippet below assumes the field is named friendlyTask.
public static ServerTask forTaskName(String friendlyTask) {
for (ServerTask serverTask : ServerTask.values()) {
if(serverTask.friendlyTask.equals(friendlyTask)) {
return serverTask;
}
}
return null;
}
The caveat to this approach is that the data won't be stored internally, and depending on how many enums you actually have and how many times you want to invoke this method, it would perform slightly worse than initializing with a map.
However, this approach is the most straightforward. If you find yourself in a position where you have several hundred enums (even more than 20 is a smell to me), consider what it is those enumerations represent and what one should do to break it out a bit more.
Create a static reverse lookup map.
public enum ServerTask {
HOOK_BEFORE_ALL_TASKS("Execute"),
COPY_MASTER_AND_SNAPSHOT_TO_HISTORY("Copy master db"),
PROCESS_CHECKIN_QUEUE("Process Check-In Queue"),
...
FINAL_ITEM("Final item");
// For static data always prefer to use Guava's Immutable library
// http://docs.guava-libraries.googlecode.com/git/javadoc/com/google/common/collect/ImmutableMap.html
static ImmutableMap< String, ServerTask > REVERSE_MAP;
static
{
ImmutableMap.Builder< String, ServerTask > reverseMapBuilder =
ImmutableMap.builder( );
// Build the reverse map by iterating all the values of your enum
for ( ServerTask cur : values() )
{
reverseMapBuilder.put( cur.taskName, cur );
}
REVERSE_MAP = reverseMapBuilder.build( );
}
// Now is the lookup method
public static ServerTask fromTaskName( String friendlyName )
{
// Will return ENUM if friendlyName matches what you stored
// with enum
return REVERSE_MAP.get( friendlyName );
}
}
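The same reverse lookup can be sketched with only the JDK collections, in case Guava is not available. This is an illustration, not the code above: only the three constants shown in the question are included, and the field name `friendlyName` and the method `fromFriendlyName` are assumptions.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

enum ServerTask {
    HOOK_BEFORE_ALL_TASKS("Execute"),
    COPY_MASTER_AND_SNAPSHOT_TO_HISTORY("Copy master db"),
    PROCESS_CHECKIN_QUEUE("Process Check-In Queue");

    // Built once when the enum class is initialized, after all constants exist
    private static final Map<String, ServerTask> BY_NAME;
    static {
        Map<String, ServerTask> map = new HashMap<>();
        for (ServerTask task : values()) {
            map.put(task.friendlyName, task);
        }
        BY_NAME = Collections.unmodifiableMap(map);
    }

    // Hypothetical field name; the question does not show the enum's field
    private final String friendlyName;

    ServerTask(String friendlyName) {
        this.friendlyName = friendlyName;
    }

    // Returns null when no constant has the given friendly name
    static ServerTask fromFriendlyName(String name) {
        return BY_NAME.get(name);
    }

    public static void main(String[] args) {
        System.out.println(fromFriendlyName("Execute"));
    }
}
```

The lookup is a single map access per call, at the cost of building the map once.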
If you have to get the enum from the String often, then creating a reverse map like Alexander suggests might be worth it.
If you only have to do it once or twice, looping over the values with a single if statement might be your best bet (like Nizil's comment insinuates)
for (ServerTask task : ServerTask.values())
{
//Check here if strings match
}
However there is a way to not iterate over the values at all. If you can ensure that the name of the enum instance and its String value are identical, then you can use:
ServerTask.valueOf("EXECUTE")
which will give you ServerTask.EXECUTE.
Refer to this answer for more info.
Having said that, I would not recommend this approach unless you're OK with having instances have the same String representations as their identifiers and yours is a performance critical application which is most often not the case.
You could write a method like this:
static ServerTask getServerTask(String name)
{
    switch(name)
    {
        case "Execute": return HOOK_BEFORE_ALL_TASKS;
        case "Copy master db": return COPY_MASTER_AND_SNAPSHOT_TO_HISTORY;
        case "Process Check-In Queue": return PROCESS_CHECKIN_QUEUE;
        // Without a default branch the method does not compile (missing return)
        default: return null;
    }
}
It's smaller, but not automatic like @Alexander_Pogrebnyak's solution. If the enum changes, you would have to update the switch.
This question may have been answered before in some dark recess of the Interwebs, but I couldn't even figure out how to form a meaningful Google query to search for it.
So: Suppose I have a (simplified) XML document like so:
<root>
<tag1>Value</tag1>
<tag2>Word</tag2>
<tag3>
<something1>Foo</something1>
<something2>Bar</something2>
<something3>Baz</something3>
</tag3>
</root>
I know how to use JAXB to unmarshal this into a Java Object in the standard use cases.
What I don't know how to do is unmarshal tag3's contents wholesale into a String. By which I mean:
<something1>Foo</something1>
<something2>Bar</something2>
<something3>Baz</something3>
as a String, tags and all.
Use the @XmlAnyElement annotation.
I was looking for the same solution and expected to find an annotation that prevents parsing the DOM and leaves it as it is, but did not find one.
Detail at:
Using JAXB to extract inner text of XML element
and
http://blog.bdoughan.com/2011/04/xmlanyelement-and-non-dom-properties.html
I added one check in the getElement() method; otherwise we could get an IndexOutOfBoundsException:
if (xml.indexOf(START_TAG) < 0) {
return "";
}
For me this solution behaves quite strangely. The getElement() method is called for every tag of your XML: the first call is for "Value", the second for "ValueWord", etc. It appends each new tag to the previous ones.
Update:
I noticed that this approach works only for ONE occurrence of the tag that we want to parse to a String. It's impossible to correctly parse the following example:
<root>
<parent1>
<tag1>Value</tag1>
<tag2>Word</tag2>
<tag3>
<something1>Foo</something1>
<something2>Bar</something2>
<something3>Baz</something3>
</tag3>
</parent1>
<parent2>
<tag1>Value</tag1>
<tag2>Word</tag2>
<tag3>
<something1>TheSecondFoo</something1>
<something2>TheSecondBar</something2>
<something3>TheSecondBaz</something3>
</tag3>
</parent2>
</root>
"tag3" with parent tag "parent2" will contain parameters from the first tag (Foo, Bar, Baz) instead of (TheSecondFoo, TheSecondBar, TheSecondBaz)
Any suggestions are appreciated.
Thanks.
I have a utility method that might come in handy for you in that case. See if it helps. I made some sample code with your example:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public static void main(String[] args){
    String text= "<root><tag1>Value</tag1><tag2>Word</tag2><tag3><something1>Foo</something1><something2>Bar</something2><something3>Baz</something3></tag3></root>";
    System.out.println(extractTag(text, "<tag3>"));
}

public static String extractTag(String xml, String tag) {
    String value = "";
    String endTag = "</" + tag.substring(1);
    // Quote both tags so they are matched literally; DOTALL lets '.' span line breaks
    Pattern p = Pattern.compile(Pattern.quote(tag) + "(.*?)" + Pattern.quote(endTag), Pattern.DOTALL);
    Matcher m = p.matcher(xml);
    // Only the first occurrence of the tag is returned
    if (m.find()) {
        value = m.group(1);
    }
    return value;
}
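When the tag occurs more than once, as in the parent1/parent2 example, the same regex idea can collect every occurrence by looping over `find()`. This is a sketch only: `TagExtractor` and `extractAll` are made-up names, and a regex is not an XML parser, so nested tags with the same name or tags with attributes are not handled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: returns the raw content of every <tagName>...</tagName>
class TagExtractor {
    static List<String> extractAll(String xml, String tagName) {
        String quoted = Pattern.quote(tagName);
        // Reluctant (.*?) stops at the first closing tag; DOTALL lets '.' match line breaks
        Pattern p = Pattern.compile("<" + quoted + ">(.*?)</" + quoted + ">", Pattern.DOTALL);
        Matcher m = p.matcher(xml);
        List<String> contents = new ArrayList<>();
        while (m.find()) {
            contents.add(m.group(1));
        }
        return contents;
    }

    public static void main(String[] args) {
        String xml = "<root><parent1><tag3><something1>Foo</something1></tag3></parent1>"
            + "<parent2><tag3><something1>TheSecondFoo</something1></tag3></parent2></root>";
        for (String content : extractAll(xml, "tag3")) {
            System.out.println(content);
        }
    }
}
```

Each list entry corresponds to one occurrence in document order, so the second tag3 keeps its own content.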
I have written the following string comparison operation for a feature in my app.
But I hate the way this looks and how unwieldy it is.
String foo = "abc";
if(!foo.startsWith("ba") &&
!foo.equals("cab") &&
!foo.equals("bca") &&
!foo.equals("bbc") &&
!foo.equals("ccb") &&
!foo.equals("cca"))
{
// do something
}
Is there a more elegant and perhaps more maintainable way to write something like this?
You can use a regular expression.
private static final Pattern P = Pattern.compile(
"(ba.*|cab|bca|bbc|ccb|cca)");
String foo = "abc";
if (!P.matcher(foo).matches())
If you only have this condition once in your code, I would leave it like this. If you have it several times, initialize a constant set:
private static final Set<String> STRINGS_TO_AVOID =
Collections.unmodifiableSet(new HashSet<String>(Arrays.asList("cab", "bca", "bbc", "ccb", "cca")));
...
if (!foo.startsWith("ba") && !STRINGS_TO_AVOID.contains(foo)) {
...
}
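For comparison, the set-based check and the regex-based check can be put side by side. The sketch below is only an illustration: the class `ForbiddenStrings` and its method names are made up, and `Pattern.DOTALL` is added so that `ba.*` behaves like `startsWith("ba")` even when the string contains a line break.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.regex.Pattern;

// Hypothetical class comparing the two ways of writing the condition
class ForbiddenStrings {
    private static final Set<String> EXACT = Collections.unmodifiableSet(
        new HashSet<>(Arrays.asList("cab", "bca", "bbc", "ccb", "cca")));

    // matches() tests the whole string, so no ^ or $ anchors are needed
    private static final Pattern FORBIDDEN =
        Pattern.compile("ba.*|cab|bca|bbc|ccb|cca", Pattern.DOTALL);

    static boolean allowedBySet(String s) {
        return !s.startsWith("ba") && !EXACT.contains(s);
    }

    static boolean allowedByRegex(String s) {
        return !FORBIDDEN.matcher(s).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"abc", "bar", "cab", "cabx"}) {
            System.out.println(s + ": " + allowedBySet(s) + " / " + allowedByRegex(s));
        }
    }
}
```

Both methods give the same answer for every input; which one reads better is a matter of taste, as the answers here show.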
Regex (ideally) or if/else block. Here's a gentle introduction to Java regular expressions:
http://www.javamex.com/tutorials/regular_expressions/
With your example, you could use a HashSet to put all the Strings you want to compare.
Then you could simply type
if (!foo.startsWith("ba") && !mySet.contains(foo)) {
//doSomething
}
Edit: Dang... I was too late... someone else beat me to it. ;-)
One could argue that something like the following is more "elegant" than your original code.
if (!foo.matches("^ba.*|^cab$|^bca$|^bbc$|^ccb$|^cca$")) {
// do something
}
I prefer the original since it is simplest, uses natural language that anyone can understand and therefore it is more maintainable...