Java 8 find and replace matching string(s) - java

I am trying to find a string from messages.properties in an errorMessage, and if the errorMessage contains that string, I need to replace it with the corresponding value.
My messages.properties looks like this:
inv_date=INVOICE DATE
inv_id=INVOICE NUMBER
cl_id=CLIENT ID
lf_matter_id=LAW FIRM MATTER ID
inv_total_net_due=INVOICE TOTAL
(inv_start_date|invoice start date)=BILLING START DATE
(inv_end_date|invoice end date)=BILLING END DATE
inv_desc=INVOICE DESCRIPTION
units=LINE ITEM NUMBER OF UNITS
discount_amount=LINE ITEM ADJUSTMENT AMOUNT
total_amount=LINE ITEM TOTAL
(charge_date|charge date)= LINE ITEM DATE
acca_task=LINE ITEM TASK CODE
acca_expense=LINE ITEM EXPENSE CODE
acca_activity= LINE ITEM ACTIVITY CODE
(tk_id|time keeper id)=TIMEKEEPER ID
charge_desc=LINE ITEM DESCRIPTION
lf_id=LAW FIRM ID
(BaseRate|Rate|tk_rate)=LINE ITEM UNIT COST
tk_level=TIMEKEEPER CLASSIFICATION
cl_matter_id=CLIENT MATTER ID
My errorMessage can contain any of the left-side strings, and I need to replace each one with the corresponding right-side value.
Below are a couple of sample error messages:
String errorMessage1 = "Line 3 : Could not parse inv_date value";
String errorMessage2 = "Line : 1 BaseRate is a required field";
Below is my method, which converts the error message:
public static String toUserFriendlyErrorMessage(String message) {
ResourceBundle rb = ResourceBundle.getBundle("messages");
for(String key : rb.keySet()){
String header = rb.getString(key);
if(message.contains(key)) {
return message.replaceAll(key, rb.getString(key));
}
}
return message;
}
Below is the expected output.
For errorMessage1 it works fine:
System.out.println(toUserFriendlyErrorMessage(errorMessage1)); ==> Line 3 : Could not parse INVOICE DATE value
But for errorMessage2 it does not work. It doesn't replace BaseRate with LINE ITEM UNIT COST:
System.out.println(toUserFriendlyErrorMessage(errorMessage2)); ==> Line : 1 BaseRate is a required field
Is there a way to find occurrences of multiple strings and replace each one with its corresponding value?
For example: Find (BaseRate|Rate|tk_rate) and replace the string with LINE ITEM UNIT COST
Also, I am wondering whether this method can be simplified further in Java 8.

I think you should reconsider your design and use individual keys for the several "aliases" - or probably even better: no aliases at all and just one key per replacement. The problem is that the keys in those properties files are not supposed to contain unescaped spaces -- parentheses or not -- so the file is not parsed the way you expect. If you print the keys, you will see that they are truncated at the first space, e.g. (inv_start_date|invoice start date) becomes (inv_start_date|invoice.
Of course, this also means that, even if you split those "aliases" into separate keys, you cannot have keys like invoice start date as written: the key still contains spaces and will not be parsed correctly unless each space is escaped with a backslash.
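To see the truncation directly, here is a minimal sketch that loads a single line with java.util.Properties (the same parser a properties-based ResourceBundle relies on). The class name PropertyKeyDemo and the helper load are made up for illustration; the backslash-escaped variant is shown as one possible workaround, not as a recommendation over separate keys.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

class PropertyKeyDemo {
    // Hypothetical helper: parses one .properties line from memory.
    static Properties load(String line) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(line));
        } catch (IOException e) {
            throw new IllegalStateException(e); // not expected for an in-memory reader
        }
        return props;
    }

    public static void main(String[] args) {
        // Unescaped spaces: the key ends at the first space,
        // and everything after it becomes part of the value.
        Properties raw = load("(inv_start_date|invoice start date)=BILLING START DATE");
        System.out.println(raw.stringPropertyNames());

        // Backslash-escaped spaces: the whole alias group is kept as the key.
        Properties escaped = load("(inv_start_date|invoice\\ start\\ date)=BILLING START DATE");
        System.out.println(escaped.stringPropertyNames());
    }
}
```

In the first case the value ends up as start date)=BILLING START DATE, which is why neither contains nor replaceAll ever sees the intended key.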
You could just put those replacements into a regular Map in your Java source code:
static Map<String, String> replacements = new HashMap<>();
static {
replacements.put("inv_date", "INVOICE DATE");
replacements.put("(BaseRate|Rate|tk_rate)", "LINE ITEM UNIT COST");
// ... more stuff ...
}
Or parse the file manually, splitting the strings at = and putting them into a Map.
And since keys like (BaseRate|Rate|tk_rate) are in fact valid regular expressions, you can just use replaceAll to replace them in all their variants. If they are not contained in the string, replaceAll will just do nothing, so the contains check is not really necessary.
public static String toUserFriendlyErrorMessage(String message) {
for (String key : replacements.keySet()) {
message = message.replaceAll(key, replacements.get(key));
}
return message;
}
Example output:
Line 3 : Could not parse INVOICE DATE value
Line : 1 LINE ITEM UNIT COST is a required field
Or, if you want to use some "Java 8 magic", you could use reduce, but personally, I think the loop is more readable.
return replacements.keySet().stream()
.reduce(message, (s, k) -> s.replaceAll(k, replacements.get(k)));
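The "parse the file manually" option mentioned above could look roughly like the following sketch. It assumes each line has the form key=value and splits at the first = only, so keys may contain spaces. The class and method names (AliasReplacer, parse, toUserFriendly) are invented for this example, and Matcher.quoteReplacement is added so that a $ or backslash in a value would be taken literally.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;

class AliasReplacer {
    // Splits each "key=value" line at the first '=' and trims both sides.
    static Map<String, String> parse(List<String> lines) {
        Map<String, String> map = new LinkedHashMap<>();
        for (String line : lines) {
            int eq = line.indexOf('=');
            if (eq < 0) {
                continue; // skip lines without a separator
            }
            map.put(line.substring(0, eq).trim(), line.substring(eq + 1).trim());
        }
        return map;
    }

    // Treats every key as a regular expression and applies all of them in turn.
    static String toUserFriendly(String message, Map<String, String> replacements) {
        for (Map.Entry<String, String> e : replacements.entrySet()) {
            message = message.replaceAll(e.getKey(), Matcher.quoteReplacement(e.getValue()));
        }
        return message;
    }

    public static void main(String[] args) {
        Map<String, String> replacements = parse(Arrays.asList(
                "inv_date=INVOICE DATE",
                "(BaseRate|Rate|tk_rate)=LINE ITEM UNIT COST",
                "(charge_date|charge date)= LINE ITEM DATE"));
        System.out.println(toUserFriendly("Line : 1 BaseRate is a required field", replacements));
    }
}
```

Because the keys are plain regular expressions, an alias such as Rate would also match inside a longer word; wrapping the keys in \b word boundaries is one way to tighten that if it matters for your messages.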


How to create correct wildcarded query in Marklogic database using Java?

I am getting incorrect results when sending a wildcarded query from my Java project.
My MarkLogic database contains multiple JSON documents with the same structure.
I want to retrieve the documents in which a value of the field named "icsList" (a list of strings) starts with a given string.
Example icsList values look like this:
"icsList": ["11.040.40", "12.50.80"]
"icsList": ["12.50.60"]
"icsList": ["50.010.10"]
My example results:
Request - String "50"
Result - All jsons (even those which do not have "50" in their icsList)
Request - String "50."
Result - Jsons that have "50." inside icsList (for example: "50.010.10", "50.010.20" but also "12.50.60")
As I mentioned earlier, my primary goal is to get all JSON documents in which a value of the field named "icsList" STARTS with the given string.
My second goal is to get rid of the required dot at the end of the request string.
My code is:
StructuredQueryBuilder sqb = new StructuredQueryBuilder();
String[] wordOptions = {"wildcarded"};
StructuredQueryDefinition queryDefinitionIcs = sqb.word(sqb.jsonProperty("icsList"),
null, wordOptions, 1, searchText + "*");
StructuredQueryDefinition query = sqb.and(queryDefinitionIcs);
query.setCollections(DocumentCollection.ATTRIBUTES.getName());
try (DocumentPage search = jsonDocumentManager.search(query, 1L)) {
JacksonHandle handle = new JacksonHandle();
List<DocumentAttributes> documentAttributes = StreamSupport.stream(search.spliterator(), false)
.map(v -> mapToDocumentAttributes(handle, v))
.toList();
}
private DocumentAttributes mapToDocumentAttributes(JacksonHandle handle, DocumentRecord v) {
try {
var doc = objectMapper.treeToValue(v.getContent(handle).get(), DocumentAttributes.class);
return doc;
} catch (JsonProcessingException e) {
throw new RuntimeException(e);
}
}
In pom.xml I have:
<dependency>
<groupId>com.marklogic</groupId>
<artifactId>marklogic-client-api</artifactId>
<version>5.5.3</version>
</dependency>
If you enable Two Character Searches on the database, then you could search for "50*" instead of "50.*", but that could dramatically affect the size of your indexes and ingestion performance, so that may not be advisable.
You might need to enable Three Character Searches or Trailing Wildcard Searches on your database in order to be able to search efficiently with such a short wildcarded value as "50.*" or "50.* *".
https://docs.marklogic.com/guide/search-dev/wildcard#id_39731
If you used value() to construct a cts:json-property-value-query(), instead of a word query, and included the . in the wildcarded value, then it would find just that last document, whose value starts with "50.".
For example, this search:
cts:search(doc(), cts:json-property-value-query("icsList", "50.*"))
or:
cts:search(doc(), cts:json-property-value-query("icsList", "50* *"))
Note that the text content for the value in a cts:json-property-value-query is treated the same as a phrase in a cts:word-query, where the phrase is the property value. Therefore, any wildcard and/or stemming rules are treated like a phrase. For example, if you have a property value of "hello friend" with wildcarding enabled for a query, a cts:json-property-value-query for "he*" will not match because the wildcard matches do not span word boundaries, but a cts:json-property-value-query for "hello *" will match. A search for "*" will match, because a "*" wildcard by itself is defined to match the value. Similarly, stemming rules are applied to each term, so a search for "hello friends" would match when stemming is enabled for the query because "friends" matches "friend".
StructuredQueryBuilder sqb = new StructuredQueryBuilder();
String[] options = {"wildcarded"};
StructuredQueryDefinition queryDefinitionIcs = sqb.value(sqb.jsonProperty("icsList"),
null, options, 1, searchText + "*");
An alternative to making database-wide changes would be to create a field with the necessary index settings to facilitate a two-character wildcard search for that field.
field value searches
trailing wildcard searches
two character searches
Then you could search the field with trailing wildcard:
cts:search(doc(), cts:field-value-query("icsList", "50* *"))
Search against the field instead of a jsonProperty:
StructuredQueryDefinition queryDefinitionIcs = sqb.value(sqb.field("icsList"),
null, options, 1, searchText + "* *");

Extracting Substrings from a List in Java

If I have a parent string (let's call it output) that contains a list of variable assignments like so ...
status.availability-state available
status.enabled-state enabled
status.status-reason The pool is available
And I want to extract the values of each variable in that list given the variable names, ie the substring after the space following status.availability-state, status.enabled-state, and status.status-reason, such that I end up with three different variable assignments making each of the following String comparisons true ...
String availability = output.substring(TODO);
String enabled = output.substring(TODO);
String reason = output.substring(TODO);
availability.equals("available");
enabled.equals("enabled");
reason.equals("The pool is available");
What is the simplest way to do this? Should I even use substring for this?
This is a little tricky because you need to assign the value to a specific variable - you can't just have a map of keys to variables in Java.
I would consider doing this with a switch:
for (String line : output.split("\n")) {
String[] frags = line.split(" ", 2); // Split the line in 2 at the first space.
switch (frags[0]) { // This is the "key" of the variable.
case "status.availability-state":
availability = frags[1]; // This assigns the "value" to the relevant variable.
break;
case "status.enabled-state":
enabled = frags[1];
break;
// ... etc
}
}
It's not very pretty, but you don't have too many options.
There seem to be two questions here -- how to parse the string, and how to assign to variables by name.
Tackle the string parsing one step at a time:
first write a program to read one line at a time and output each one in the body of a loop. String.split() or StringTokenizer are two options here.
next enhance this by writing a method to handle one line. The same tools are helpful here, to split on spaces.
You should now have a program that can print name: status.availability-state, value: available for each line of input.
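The steps above can be sketched as follows. This is only an illustration under the assumption that every line is a name, one space, and then the value; the class name NameValueLines and the method describe are made up for the example.

```java
class NameValueLines {
    // Handles one line: everything before the first space is the name,
    // everything after it (spaces included) is the value.
    static String describe(String line) {
        String[] parts = line.split(" ", 2);
        return "name: " + parts[0] + ", value: " + parts[1];
    }

    public static void main(String[] args) {
        String output = "status.availability-state available\n"
                + "status.enabled-state enabled\n"
                + "status.status-reason The pool is available";
        // Read one line at a time and hand each one to the per-line method.
        for (String line : output.split("\n")) {
            System.out.println(describe(line));
        }
    }
}
```

The limit argument 2 in split matters here: without it, a value such as "The pool is available" would be broken into separate words.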
Next, you're asking to programmatically assign to variables based on the name of the parameter.
There is no legitimate way to look at a variable's name at runtime (OK, Java 8 reflection has ways, but it shouldn't be used without very good reason).
So, the best you can do is to use a switch or if statement:
switch (name) {
case "status.availability-state":
availability = value;
break;
// ... etc.
}
However, whenever you use switch or if you should think about whether there's a better way.
Is there any reason you can't turn these variables into Map entries?
configMap.put(name, value);
Then to read it:
doSomethingWith(configMap.get("status.availability-state"));
That's what maps are for. Use them.
This is a similar situation to the rookie mistake of using variables called person1, person2, person3... instead of using an array. Eventually they ask "How do I go from the number 25 to my variable person25?" -- and the answer is, you can't, but an array or list makes it easy. people[number] or people.get(number)
A valid alternative is to split the string by \n and add to a Map. Example:
String properties = "status.availability-state available\nstatus.enabled-state enabled\nstatus.status-reason The pool is available";
Map<String, String> map = Arrays.stream(properties.split("\n"))
.collect(Collectors.toMap(s -> s.split(" ")[0], s -> s.split(" ", 2)[1]));
System.out.println(map.get("status.status-reason"));
Should output The pool is available
This loop will match and extract the variables, and you can then assign them as you see fit:
Pattern regex = Pattern.compile("status\\.(.*?)-\\S* (.*)");
Matcher matcher = regex.matcher(output);
while (matcher.find()) {
System.out.println(matcher.group(1) + "=" + matcher.group(2));
}
status\\. matches "status."
(.*?) matches any sequence of characters but isn't greedy, and captures them
-\\S* followed by a space matches the dash, the rest of the key, and the separating space
(.*) matches the rest of the line and captures it, so values with capitals or spaces such as "The pool is available" are captured whole
Here's one way to do it:
Map<String, String> properties = getProperties(propertiesString);
availability = properties.get("availability-state");
enabled = properties.get("enabled-state");
reason = properties.get("status-reason");
// ...
public Map<String, String> getProperties(String input) {
Map<String, String> properties = new HashMap<>();
String[] lines = input.split("\n");
for (String line : lines) {
// Split at the first space only, so values may contain spaces.
String[] parts = line.split(" ", 2);
// Drop the "status." prefix from the key.
int keyStartIndex = parts[0].indexOf(".") + 1;
String key = parts[0].substring(keyStartIndex);
properties.put(key, parts[1]);
}
return properties;
}
This seems to be a bit more straight-forward, in terms of the code that's setting these values, as each value is set to exactly the value from the map, rather than iterating over some list of strings and seeing if it contains a particular value and doing different things based on that.
This is designed with the primary use-case being that the string is created at runtime in memory. If the properties are created in an external file, this code would still work (after creating the desired String in memory), but it may be a better idea to use either a Properties file, or perhaps a Scanner.

Find position of sections within a string marked by special markers

I have a string to process, parts of which are "marked" with custom tags to indicate an area of the string that is "different" from the rest.
Example:
This is an {TypeAStart}arbitrary long{TypeAEnd} text
The arbitrary long part is an area within the string that is of interest.
I wanted a good way to get the start and end index of this part of the string and with regex I can do that (regex question)
The problems with using such type of approach are:
1) I can not easily generalize it
2) My main target is to end up with the string This is an arbitrary long text and have another data structure that describes which marker was applied and where in the final string.
I can not see any straightforward way to do this via regular expressions.
What I would like to achieve is to have e.g. an array of these custom markers as pairs and process the string to find all these substrings.
Example input:
This is an {TypeAStart}arbitrary long {SomeOtherStart} very very very {SomeOtherEnd} long long{TypeAEnd} text
Known markers:
[TypeAStart, TypeAEnd], [SomeOtherStart, SomeOtherEnd] etc
Output:
This is an arbitrary long very very very long long text
TypeA [11, 50] , SomeOther [26, 40]
How can I implement this?
I have an open source project that can help with that:
http://mtimmerm.github.io/dfalex/
Let's say you have an enum Marker with values like TYPEASTART and TYPEAEND, and let's say the values implement a method String text() that returns the string to search for.
You can add patterns for your markers to a DfaBuilder, which will give you a DFA that you can use to find the markers in strings, like this:
DfaBuilder<Marker> builder = new DfaBuilder<>();
for (Marker val : Marker.values())
{
builder.addPattern(Pattern.match(val.text()), val);
}
DfaState<Marker> START_STATE = builder.build();
Then you can use this with a StringMatcher to find your patterns:
StringMatcher matcher = new StringMatcher(someString);
Marker found;
while((found = matcher.findNext(START_STATE))!=null)
{
//found is the kind of marker we found
//this is the start position in the string
int startPos = matcher.getLastMatchStart();
//this is the end position in the string
int endPos = matcher.getLastMatchEnd();
}
If you remember the positions of the start markers, you can easily extract the strings between markers when you find the matching end markers.
To get the string without the markers, open a StringBuilder and append the text between markers as you go until you reach the end of the input.
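For comparison, the same bookkeeping (remember start positions, append the text between markers to a StringBuilder) can be sketched with only java.util.regex. This is not the dfalex approach, just a plain-JDK illustration. It assumes every marker is written as {NameStart} or {NameEnd}; the class name MarkerSpans and its fields are invented for the example, and end offsets are exclusive.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MarkerSpans {
    // Assumed marker syntax: {<Name>Start} and {<Name>End}.
    private static final Pattern TAG = Pattern.compile("\\{(\\w+?)(Start|End)\\}");

    final StringBuilder text = new StringBuilder(); // input with all markers removed
    final List<String> spans = new ArrayList<>();   // e.g. "TypeA [11, 25)"

    MarkerSpans(String input) {
        // Open start positions per marker name; a stack allows same-name nesting.
        Map<String, Deque<Integer>> open = new HashMap<>();
        Matcher m = TAG.matcher(input);
        int last = 0;
        while (m.find()) {
            // Copy the plain text between the previous marker and this one.
            text.append(input, last, m.start());
            last = m.end();
            String name = m.group(1);
            if (m.group(2).equals("Start")) {
                // Positions are measured in the stripped text, not in the input.
                open.computeIfAbsent(name, k -> new ArrayDeque<>()).push(text.length());
            } else {
                Deque<Integer> starts = open.get(name);
                if (starts != null && !starts.isEmpty()) {
                    spans.add(name + " [" + starts.pop() + ", " + text.length() + ")");
                }
                // An End marker without a matching Start is silently dropped here.
            }
        }
        text.append(input, last, input.length());
    }

    public static void main(String[] args) {
        MarkerSpans r = new MarkerSpans("This is an {TypeAStart}arbitrary long{TypeAEnd} text");
        System.out.println(r.text);
        System.out.println(r.spans);
    }
}
```

Spans are recorded in the order their End markers appear, so a nested marker is listed before the marker that encloses it.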

search elements in an array in java

I'm wondering what kind of method I should use to search for elements in an array, and what data structure I should use to store the result.
For example, a txt file contains the following:
123 Name line Moon night table
124 Laugh Cry Dog
123 quote line make pet table
127 line array hello table
and the search elements are line+table.
I read every line as a string and then split it by spaces.
The output should look like this:
123 2 (ID 123 occurs twice that contains the search elements)
127 1
I'd like suggestions on what kind of method to use to search for the elements in the array, and what kind of data structure to use to store the result (the ID and the number of occurrences; I'm thinking of a HashMap).
Read the text file and store each line that ends with table in ArrayList<String>. Then use contains for each element in ArrayList<String>. Store result in HashMap<key,value> where key is ID and value is Integer which represent number of times ID occurs.
First, I would keep reading through the file line by line, there's really no other way of going about it other than that.
Second, to pick out the rows to save, you don't need to do the split (assumption: they all end in (space)table). You can just get them by using:
if (line.endsWith(" table"))
Then, I would suggest using a Map<String, Integer> to store your information. This way, you have the ID at the start of the line (key) and how many times it was found in the file (value).
Map<String, Integer> map = new HashMap<String, Integer>();
// ... reading the file, one line at a time ...
if (line.endsWith(" table")) {
String number = line.substring(0, line.indexOf(" "));
if (!map.containsKey(number)) {
map.put(number, 1);
} else {
Integer value = map.get(number);
value++;
map.put(number, value);
}
}
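A compact variant of the same counting idea is sketched below. Unlike the endsWith check, it looks for every search term among the words of the line, which matches the question's line+table example more literally. The class name IdCounter and the method count are assumptions made for this example, and Map.merge replaces the containsKey/put branches.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class IdCounter {
    // Counts, per leading ID, the lines whose remaining words contain all search terms.
    static Map<String, Integer> count(List<String> lines, Set<String> terms) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : lines) {
            String[] tokens = line.trim().split("\\s+");
            if (tokens.length < 2) {
                continue; // no words after the ID
            }
            List<String> words = Arrays.asList(tokens).subList(1, tokens.length);
            if (words.containsAll(terms)) {
                counts.merge(tokens[0], 1, Integer::sum); // insert 1 or add 1
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "123 Name line Moon night table",
                "124 Laugh Cry Dog",
                "123 quote line make pet table",
                "127 line array hello table");
        Set<String> terms = new HashSet<>(Arrays.asList("line", "table"));
        System.out.println(count(lines, terms));
    }
}
```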

How to design a string delimiter for a string that contains many params

I need to pass a string parameter that contains many params. When I receive the parameter, I use String.split() to split it and get all the params.
But one problem occurred: how do I design my string delimiter so that any ASCII character on the keyboard can be passed correctly?
Any advice is appreciated.
Maybe you could have a look at variadic arguments instead of splitting a string. For example:
public void method(String... strings) {
// strings is actually an array
String firstParam = strings[0];
String secondParam = strings[1];
// ...
}
Calling:
method("string1");
method("string1", "string2", "string3");
// as many string args as you want
If I understood correctly, you need to encode a set of parameters into one string. You can use some sequence of characters for this purpose, e.g.
final String delimiter = "###";
String value = "param1###param2###param3";
String[] parameters = value.split(delimiter);
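One thing to keep in mind with this approach is that String.split interprets its argument as a regular expression. "###" happens to be safe, but a delimiter such as | is a regex operator and will not split on the literal bar. A small sketch, with the hypothetical helper splitLiteral, that quotes the delimiter first:

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class DelimiterSplit {
    // Splits on the delimiter taken literally; the limit -1 keeps trailing empty params.
    static String[] splitLiteral(String value, String delimiter) {
        return value.split(Pattern.quote(delimiter), -1);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitLiteral("param1|param2|param3", "|")));
        System.out.println(Arrays.toString(splitLiteral("param1###param2###param3", "###")));
    }
}
```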
Choose a character which is easy to enter and unlikely to appear in the input. Let's assume that character is #.
Normal input would look like Item 1#Item 2#Item 3. Actually, you can .trim() every item and let the user enter Item 1 # Item 2 # Item 3 if s/he prefers.
However, like you describe, say the user would like to enter Item #1, Item #2, etc.. There are a few ways to let him/her do this, but the easier is to let them escape the delimiter. For example, instead of Item #1 # Item #2 # Item #3, which would result in 6 different items being found normally, let the user enter, for example Item ##1 # Item ##2 # Item ##3. Then in your parsing, make sure to handle the case when two or more #'s have been entered in a row. split likely won't be good enough, you'll have to go through the string yourself.
Here's a sketch of a method which would split the input string for you:
private static List<String> parseArguments(String input) {
ArrayList<String> arguments = new ArrayList<String>();
String[] prelArguments = input.split("#");
for (int i = 0; i < prelArguments.length; i++) {
String argument = prelArguments[i];
if (argument.equals("")) {
// We will enter here if there were two or more #'s in a row
StringBuilder combinedArgument = new StringBuilder(arguments.remove(arguments.size() - 1));
int inARow = 0;
while (prelArguments[i+inARow].equals("")) {
inARow++;
combinedArgument.append('#');
}
i += inARow;
combinedArgument.append(prelArguments[i]);
arguments.add(combinedArgument.toString());
} else {
arguments.add(argument);
}
}
return arguments;
}
Error handling, edge-case handling (for example, input that starts with #) and some performance improvements are missing from the above, but I think the idea comes through.
I would eliminate the problem, which is the misuse of String as an argument container. If you need to pass more parameters, pass more parameters. If this gets out of hand, consider passing a map, or a custom object that can contain all the parameters.
