Algorithm to search and replace delimited parameters - java

I have a string that contains multiple parameters delimited by #, like this :
.... #param1# ... #param2# ... #paramN# ...
And I want to replace the parameter placeholders by values.
The current algorithm looks like this:
//retrieve place holder into this SQL select
Pattern p = Pattern.compile(DIMConstants.FILE_LINE_ESCAPE_INDICATOR);
Matcher m = p.matcher(sqlToExec); // get a matcher object
int count = 0;
int start = 0;
int end = 0;
StringBuilder params = new StringBuilder();
while (m.find()) {
count++;
if (count % 2 == 0) {
// Second parameter delimiter
String patternId = sqlToExec.substring(start, m.end());
//Clean value (#value#->value)
String columnName = patternId.substring(1, patternId.length() - 1);
//Look for this column into preLoad row ResultSet and retrieve its value
String preLoadTableValue = DIMFormatUtil.convertToString(sourceRow.get(columnName));
if (!StringUtils.isEmpty(preLoadTableValue)) {
aSQL.append(loadGemaDao.escapeChars(preLoadTableValue).trim());
} else {
aSQL.append(DIMConstants.COL_VALUE_NULL);
}
params.append(" " + columnName + "=" + preLoadTableValue + " ");
end = m.end();
} else {
// First parameter delimiter
start = m.start();
aSQL.append(sqlToExec.substring(end, m.start()));
}
}
if (end < sqlToExec.length()) {
aSQL.append(sqlToExec.substring(end, sqlToExec.length()));
}
I'm looking for a simpler solution, using a regexp or another public API. The inputs would be the source string, a delimiter and a map of values; the output would be the source string with all the parameters replaced.

If this is for a normal SQL query, you might want to look into using PreparedStatements, which bind parameter values instead of concatenating them into the SQL text.
Beyond that, am I missing something? Why not just use String.replace()? Your code could look like this:
for (int i = 1; i <= n; i++) {
String paramName = "#param" + i + "#";
sqlToExec = sqlToExec.replace(paramName, values.get(paramName));
}
That assumes you have a map called "values" whose keys are the placeholders in the form "#paramN#" and whose values are the replacement strings, and that n is the number of parameters.

If you need it more generic, this will find and return the whole param including the #'s:
public class ParamFinder {
public static void main(String[] args) {
String foo = "#Field1# #Field2# #Field3#";
Pattern p = Pattern.compile("#.+?#");
Matcher m = p.matcher(foo);
List matchesFound = new ArrayList();
int ndx = 0;
while(m.find(ndx)){
matchesFound.add(m.group());
ndx = m.end();
}
for(Object o : matchesFound){
System.out.println(o);
}
}
}
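For the generic version the question asks for (source string, delimiter, map of values), one possible sketch uses Matcher.appendReplacement. The names PlaceholderReplacer and replaceParams are made up for illustration, and leaving unknown placeholders untouched is an assumption, not something the question specifies.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PlaceholderReplacer {

    // Replaces every <delimiter>name<delimiter> in source with values.get(name).
    // Assumption: placeholders without a mapping are left as they are.
    static String replaceParams(String source, String delimiter, Map<String, String> values) {
        String d = Pattern.quote(delimiter);              // treat the delimiter literally
        Matcher m = Pattern.compile(d + "(.+?)" + d).matcher(source);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = values.get(m.group(1));        // group(1) is the name without delimiters
            String replacement = (value != null) ? value : m.group();
            // quoteReplacement keeps '$' and '\' in the value from being interpreted
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> values = new HashMap<>();
        values.put("param1", "A");
        values.put("param2", "B");
        System.out.println(replaceParams("x #param1# y #param2# z", "#", values));
    }
}
```

If the result ends up in a SQL statement, the PreparedStatement suggestion above is still the safer route, since this sketch does no escaping of the values.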

Related

Find sequential occurrences of all String[] in a given String

I have a pair of Strings in an array to check in another String:
String[] validPair = {"[BOLD]", "[/BOLD]" };
String toCheck = "Example [BOLD]bold long text[/BOLD] other example [BOLD]bold short[/BOLD]";
I need to check that the tags are balanced. I know how to check whether a string is inside another string, and I could do this by calling indexOf with each entry of validPair across the string and saving the positions, but that is ugly and I don't want to reinvent the wheel.
Something like :
int lastIndex = 0;
while (lastIndex != -1) {
int index = toCheck.findNextOccurrence(validPair, lastIndex); // here use indexOf
System.out.println(index);
lastIndex = index;
}
I was wondering whether there is a way to find the next occurrence of any of the Strings in String[] validPair within the String toCheck.
A kind of Iterator or Tokenizer that does not split the string and only yields occurrences of the contents of the array (or List or any other Object).
OR:
OwnIterator<String> ownIterator = new OwnIterator<>(toCheck, validPair);
while (ownIterator.hasNext()) {
String next = ownIterator.next();
System.out.println(next);
}
OUTPUT:
[BOLD]
[/BOLD]
[BOLD]
[/BOLD]
This is the solution I came up with. It uses an array of regular expressions to search for every item in validPair separately, then combines all found occurrences into one sorted list (and its iterator).
public class OwnIterator implements Iterator
{
private Iterator<Integer> occurrencesItr;
public OwnIterator(String toCheck, String[] validPair) {
// build regex to search for every item in validPair
Matcher[] matchValidPair = new Matcher[validPair.length];
for (int i = 0 ; i < validPair.length ; i++) {
String regex =
"(" + // start capturing group
"\\Q" + // quote entire input string so it is not interpreted as regex
validPair[i] + // this is what we are looking for, duhh
"\\E" + // end quote
")" ; // end capturing group
Pattern p = Pattern.compile(regex);
matchValidPair[i] = p.matcher(toCheck);
}
// do the search, saving found occurrences in list
List<Integer> occurrences = new ArrayList<>();
for (int i = 0 ; i < matchValidPair.length ; i++) {
while (matchValidPair[i].find()) {
occurrences.add(matchValidPair[i].start(0)+1); // +1 if you want index to start at 1
}
}
// sort the list
Collections.sort(occurrences);
occurrencesItr = occurrences.iterator();
}
@Override
public boolean hasNext()
{
return occurrencesItr.hasNext();
}
@Override
public Object next()
{
return occurrencesItr.next();
}
}
a quick test :
public static void main(String[] args)
{
String[] validPair = {"[BOLD]", "[/BOLD]" };
String toCheck = "Example [BOLD]bold long text[/BOLD] other example [BOLD]bold short[/BOLD]";
OwnIterator itr = new OwnIterator(toCheck, validPair);
while (itr.hasNext()) {
System.out.println(itr.next());
}
}
prints the 1-based start index of each occurrence:
9
29
51
67
EDIT:
Found a better solution: a single regular expression that joins all items in validPair with the "or" operator (|). The Matcher's own find() method then acts as the iterator:
String regex = "(";
for (int i = 0 ; i < validPair.length ; i++) {
regex += (i == 0 ? "" : "|") + // add "or" after first item
"\\Q" + // quote entire input string so it is not interpreted as regex
validPair[i] + // this is what we are looking for, duhh
"\\E"; // end quote
}
regex += ")";
System.out.println("using regex : " + regex);
Pattern p = Pattern.compile(regex);
Matcher m = p.matcher(toCheck);
while (m.find()) {
System.out.println(m.group(0));
}
you get the output
using regex : (\Q[BOLD]\E|\Q[/BOLD]\E)
[BOLD]
[/BOLD]
[BOLD]
[/BOLD]
If you only need to know that an opening tag is followed somewhere by a closing tag, you can just do:
int first = toCheck.indexOf(validPair[0]);
boolean ok = first > -1 && toCheck.indexOf(validPair[1], first) > 0;
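Since the underlying goal is to check that the tags are balanced, the single alternation regex from the EDIT above can drive a simple depth counter. This is only a sketch: TagBalance and isBalanced are hypothetical names, and it assumes exactly one opening and one closing tag.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TagBalance {

    // Returns true if every close tag has an earlier unmatched open tag
    // and no open tag is left unclosed at the end.
    static boolean isBalanced(String text, String open, String close) {
        Pattern p = Pattern.compile(Pattern.quote(open) + "|" + Pattern.quote(close));
        Matcher m = p.matcher(text);
        int depth = 0;
        while (m.find()) {
            if (m.group().equals(open)) {
                depth++;
            } else if (--depth < 0) {
                return false; // a close tag appeared before its open tag
            }
        }
        return depth == 0;
    }

    public static void main(String[] args) {
        String toCheck = "Example [BOLD]bold long text[/BOLD] other example [BOLD]bold short[/BOLD]";
        System.out.println(isBalanced(toCheck, "[BOLD]", "[/BOLD]"));
    }
}
```

Unlike the two indexOf calls, this also rejects input where a closing tag comes first or where the last opening tag is never closed.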

Unformat formatted String

I have a simple formatted String:
double d = 12.348678;
int i = 9876;
String s = "ABCD";
System.out.printf("%08.2f%5s%09d", d, s, i);
// %08.2f = '12.348678' -> '00012,35'
// %5s = 'ABCD' -> ' ABCD'
// %09d = '9876' -> '000009876'
// %08.2f%5s%09d = '00012,35 ABCD000009876'
When I know the pattern %08.2f%5s%09d and the String 00012,35 ABCD000009876:
Can I "unformat" this String in some way?
E.g. the expected result would be something like 3 tokens: '00012,35', ' ABCD', '000009876'
This is specific to your pattern. A general parser for a format string (because what we call unformatting is really parsing) would look quite different.
public class Unformat {
public static Integer getWidth(Pattern pattern, String format) {
Matcher matcher = pattern.matcher(format);
if (matcher.find()) {
return Integer.valueOf(matcher.group(1));
}
return null;
}
public static String getResult(Pattern p, String format, String formatted, int start) {
// Cut `width` characters out of the formatted text, beginning at `start`.
Integer width = getWidth(p, format);
if (width != null) {
return formatted.substring(start, start + width);
}
return null;
}
public static void main(String[] args) {
String format = "%08.2f%5s%09d";
String formatted = "00012.35 ABCD000009876";
String[] formats = format.split("%");
List<String> result = new ArrayList<String>();
// The running offset has to be advanced here: reassigning a parameter
// inside getResult would not be visible to the caller.
int start = 0;
for (int j = 1; j < formats.length; j++) {
Pattern p;
// ([0-9]+) captures the whole width, not just its last digit
if (formats[j].endsWith("f")) {
p = Pattern.compile("([0-9]+)\\.[0-9]+f");
} else if (formats[j].endsWith("s")) {
p = Pattern.compile("([0-9]+)s");
} else if (formats[j].endsWith("d")) {
p = Pattern.compile("([0-9]+)d");
} else {
continue; // conversion not handled by this example
}
String token = getResult(p, formats[j], formatted, start);
if (token != null) {
result.add(token);
start += token.length();
}
}
System.out.println(result);
}
}
Judging by your output format of "%08.2f%5s%09d", it seems comparable to this pattern
"([0-9]{5,}[\\.|,][0-9]{2,})(.{5,})([0-9]{9,})"
Try the following:
public static void main(String[] args) {
String data = "00012,35 ABCD000009876";
Matcher matcher = Pattern.compile("([0-9]{5,}[\\.|,][0-9]{2,})(.{5,})([0-9]{9,})").matcher(data);
List<String> matches = new ArrayList<>();
if (matcher.matches()) {
for (int i = 1; i <= matcher.groupCount(); i++) {
matches.add(matcher.group(i));
}
}
System.out.println(matches);
}
Results:
[00012,35, ABCD, 000009876]
UPDATE
After seeing the comments, here's a generic example that does not use regular expressions, so as not to copy @bpgergo (+1 to you for the generic regular-expression approach). I also added some logic in case the format ever exceeds the width of the data.
public static void main(String[] args) {
String data = "00012,35 ABCD000009876";
// Format exceeds width of data
String format = "%08.2f%5s%09d%9s";
String[] formatPieces = format.replaceFirst("^%", "").split("%");
List<String> matches = new ArrayList();
int index = 0;
for (String formatPiece : formatPieces) {
// Remove any argument indexes or flags
formatPiece = formatPiece.replaceAll("^([0-9]+\\$)|[\\+|-|,|<]", "");
int length = 0;
switch (formatPiece.charAt(formatPiece.length() - 1)) {
case 'f':
if (formatPiece.contains(".")) {
length = Integer.parseInt(formatPiece.split("\\.")[0]);
} else {
length = Integer.parseInt(formatPiece.substring(0, formatPiece.length() - 1));
}
break;
case 's':
length = Integer.parseInt(formatPiece.substring(0, formatPiece.length() - 1));
break;
case 'd':
length = Integer.parseInt(formatPiece.substring(0, formatPiece.length() - 1));
break;
}
if (index + length < data.length()) {
matches.add(data.substring(index, index + length));
} else {
// We've reached the end of the data and need to break from the loop
matches.add(data.substring(index));
break;
}
index += length;
}
System.out.println(matches);
}
Results:
[00012,35, ABCD, 000009876]
You can do something like this:
//Find the end of the first value,
//this value will always have 2 digits after the decimal point.
int index = val.indexOf(".") + 3;
String tooken1 = val.substring(0, index);
//Remove the first value from the original String
val = val.substring(index);
//get all values after the last non-numerical character.
String tooken3 = val.replaceAll(".+\\D", "");
//remove the previously extracted value from the remainder of the original String.
String tooken2 = val.replace(tooken3, "");
This will fail if the String value contains a number at the end and probably in some other cases.
Since you know the pattern, you are effectively dealing with a kind of regular expression. Use one to suit your needs.
Java has a decent regular expression API for such tasks.
Regular expressions can have capturing groups and each group would have a single "unformatted" part just as you want. All depends on regex you will use/create.
Easiest thing to do would be to parse the string using a regex with myString.replaceAll(). myString.split(",") may also be helpful for splitting your string into a string array
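For format strings made only of fixed-width specifiers, like the one in the question, the widths can be read out of the format itself and used to slice the formatted text. The sketch below is hypothetical (FixedWidthUnformat and unformat are illustrative names) and assumes every specifier has an explicit width and that no value is wider than its field, since a width in a format string is only a minimum.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FixedWidthUnformat {

    // %, optional flags, width (captured), optional precision, conversion letter
    private static final Pattern SPEC =
            Pattern.compile("%[-#+ 0,(]*(\\d+)(?:\\.\\d+)?[a-zA-Z]");

    static List<String> unformat(String format, String formatted) {
        List<String> tokens = new ArrayList<>();
        Matcher m = SPEC.matcher(format);
        int pos = 0;
        while (m.find()) {
            if (pos >= formatted.length()) {
                break; // the format describes more fields than the data holds
            }
            int width = Integer.parseInt(m.group(1));
            int end = Math.min(pos + width, formatted.length());
            tokens.add(formatted.substring(pos, end));
            pos = end;
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(unformat("%08.2f%5s%09d", "00012,35 ABCD000009876"));
    }
}
```

Because the slicing is purely positional, it does not care whether the decimal separator is a dot or a comma.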

How to find if an ArrayList contains a character from a String?

I am trying to build a function that takes an ArrayList (with contents from a file) and checks whether all the characters of a given String are contained somewhere in it, and acts accordingly if that's not the case.
For example, ["hello", "I am new to Java", "Help me out!"].contains("aeIou") would be ok because all the chars in "aeIou" exist on the array. If it was "aeiou" it would return a message, because 'i' is not in the array, as it's case sensitive (it wouldn't need to test the others). But note that the test chars could be anything, not just letters.
I've built this function, and although it does compile without errors, it always returns that the character is not in the array, although it is:
private static void ValidateDecFile(String testStr, ArrayList<String> fcontents) {
int count = 0;
for(int j = 0; j < testStr.length(); j++) {
if(!Arrays.asList(fcontents).contains(testStr.charAt(j))) {
String errMsg = "Character '" + testStr.charAt(j) + "' is not in the string.";
}
}
}
From the searches I've made, I am assuming this is a variable type problem, so the comparison does not produce the expected result.
But I've printed testStr.length(), Arrays.asList(fcontents) and testStr.charAt(j), and they all return the expected results, so I have no idea what's going on!
Whatever I do, this function always returns the errMsg String, and the char that "fails" the comparison is always the first char of testStr.
You can do the test in one line:
List<String> list;
String chars;
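// The replacement needs \\\\Q and \\\\E: a lone backslash in a replacement string only escapes the next character
String regex = chars.replaceAll(".", "(?=.*\\\\Q$0\\\\E)") + ".*";
StringBuilder sb = new StringBuilder();
for (String s : list)
sb.append(s);
boolean hasAll = sb.toString().matches(regex);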
In java 8, the whole thing can be one line:
boolean hasAll = list.stream()
.collect(Collectors.joining(""))
.matches(chars.replaceAll(".", "(?=.*\\\\Q$0\\\\E)") + ".*");
The trick is to turn chars into a series of look-ahead assertions and run that over the list concatenated into one giant string.
This will work for any input chars and any test chars, due to the regex treating each char as a literal.
Invoking contains() on a List will compare each element in the list to the argument, so in your example it would be comparing hello, I am new to Java and so on to each one of the search characters.
Inside the loop, you should be testing if any of the Strings in the List contain the character, not if one of the Strings in the List is the character.
Note that String.contains() needs a CharSequence as an argument, and charAt returns a char. You could use indexOf instead and test whether it returns a non-negative number (0 is a valid index).
private static void ValidateDecFile(String testStr, ArrayList<String> fcontents) {
int count = 0;
String errMsg;
for(int j = 0; j < testStr.length(); j++) {
boolean found = false;
for (int i = 0;i<fcontents.size() && !found;i++) {
found = fcontents.get(i).indexOf(testStr.charAt(j)) >= 0;
}
if (!found){
errMsg = "Character '" + testStr.charAt(j) + "' is not in the string.";
}
}
}
Ideone demo.
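The type mismatch described in this answer can be made visible with a small side-by-side comparison. The class and method names below are illustrative only, not part of the original code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ContainsPitfall {

    // What the question's code effectively asks: is the char an ELEMENT
    // of a one-element list whose only element is the ArrayList itself?
    static boolean wrappedListContains(ArrayList<String> fcontents, char c) {
        // Arrays.asList(fcontents) is a List<ArrayList<String>>, and the char is boxed to a Character
        return Arrays.asList(fcontents).contains(c);
    }

    // What was intended: does any String in the list contain the char?
    static boolean anyStringContains(List<String> fcontents, char c) {
        for (String s : fcontents) {
            if (s.indexOf(c) >= 0) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        ArrayList<String> fcontents =
                new ArrayList<>(Arrays.asList("hello", "I am new to Java", "Help me out!"));
        System.out.println(wrappedListContains(fcontents, 'a')); // false
        System.out.println(anyStringContains(fcontents, 'a'));   // true
    }
}
```

The first call compiles because contains() accepts any Object, which is why the compiler gives no hint that a Character is being compared against a list.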
Try the one below. You need to iterate over the list and check each element, not the list itself.
Hope this gives you what you need.
private static void ValidateDecFile(String testStr, ArrayList<String> fcontents) {
int count = 0;
boolean flag = false;
for(int j = 0; j < testStr.length(); j++) {
flag = false;
for(String content:fcontents){
if(content.contains(""+(testStr.charAt(j)))){
flag=true;
}
}
if(!flag) {
String errMsg = "Character '" + testStr.charAt(j) + "' is not in the string.";
}
}
}
private static void ValidateDecFile(String testStr, ArrayList<String> fcontents) {
String fullStr = "";
for( int j = 0; j < fcontents.size(); j++ ) {
fullStr += fcontents.get( j );
}
for(int j = 0; j < testStr.length(); j++) {
// String.contains() needs a CharSequence, so look the char up with indexOf instead
if( fullStr.indexOf( testStr.charAt(j) ) < 0 ) {
String errMsg = "Character '" + testStr.charAt(j) + "' is not in the string.";
System.out.println( errMsg );
}
}
}
In Java 8 using streams you can simplify it with the following:
String searchStr = "aeliou";
List<String> data = Arrays.asList("hello", "I am new to Java", "Help me out!");
for(int i = 0; i < searchStr.length(); i++ )
{
final int c = i;
if( ! data.stream().anyMatch(t -> t.contains(Character.toString(searchStr.charAt(c)))) )
System.out.println("not found:" + searchStr.charAt(i));
}
Or even shorter, in a single statement using the chars() method from Java 8:
String searchStr = "aeliou";
List<String> data = Arrays.asList("hello", "I am new to Java", "Help me out!");
searchStr.chars().forEach(c -> {
if (!data.stream().anyMatch(t -> t.contains(Character.toString((char)c))))
System.out.println("not found:" + Character.toString((char)c));
} );
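The question also notes that the check can stop at the first missing character. A sketch along those lines, assuming Java 8 and using the hypothetical names MissingCharFinder and firstMissing, returns that character instead of printing every miss:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

class MissingCharFinder {

    // Returns the first char of testStr that occurs in none of the strings,
    // or an empty Optional if every char is found somewhere.
    static Optional<Character> firstMissing(String testStr, List<String> contents) {
        for (char c : testStr.toCharArray()) {
            boolean found = contents.stream().anyMatch(s -> s.indexOf(c) >= 0);
            if (!found) {
                return Optional.of(c); // stop here, later chars are not tested
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        List<String> data = Arrays.asList("hello", "I am new to Java", "Help me out!");
        System.out.println(firstMissing("aeIou", data).isPresent()); // false
        System.out.println(firstMissing("aeiou", data).get());       // i
    }
}
```

The caller can then build the error message from the returned character, which keeps the lookup separate from the reporting.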

Java Regex: How to detect the index of the unmatched char in a complex regex

I'm using regex to control an input and I want to get the exact index of the wrong char.
My regex is :
^[A-Z]{1,4}(/[1-2][0-9][0-9][0-9][0-1][0-9])?
If I type the following input :
DATE/201A08
Then matcher.group() (using the lookingAt() method) will return "DATE" instead of "DATE/201", so I can't find out that the offending character is at position 9.
If I read this right, you can't do this using only one regex.
^[A-Z]{1,4}(/[1-2][0-9][0-9][0-9][0-1][0-9])? assumes either a String starting with 1 to 4 characters followed by nothing, or followed by / and exactly 6 digits. So it correctly parses your input as "DATE" as it is valid according to your regex.
Try to split this into two checks. First check if it's a valid DATE
Then, if there's an actual / part, check this against the non-optional pattern.
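The two-step check suggested here could look roughly like the sketch below. DateFieldCheck and firstErrorIndex are made-up names, and unlike lookingAt() in the question it assumes the whole input must match, so trailing characters count as errors. It reports a 0-based index, so the 'A' in DATE/201A08 (the 9th character) comes back as 8.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DateFieldCheck {

    private static final Pattern NAME = Pattern.compile("[A-Z]{1,4}");
    // One character class per position of the optional /YYYYMM part
    private static final String[] DIGIT_CLASSES =
            {"[1-2]", "[0-9]", "[0-9]", "[0-9]", "[0-1]", "[0-9]"};

    // Returns -1 if the input is valid, otherwise the 0-based index of the
    // first offending character (or the input length if it ends too early).
    static int firstErrorIndex(String input) {
        Matcher m = NAME.matcher(input);
        if (!m.lookingAt()) {
            return 0;                       // step 1: the name part is wrong
        }
        int pos = m.end();
        if (pos == input.length()) {
            return -1;                      // name only, optional part absent
        }
        if (input.charAt(pos) != '/') {
            return pos;
        }
        pos++;
        for (String cls : DIGIT_CLASSES) {  // step 2: the non-optional date part
            if (pos >= input.length()
                    || !String.valueOf(input.charAt(pos)).matches(cls)) {
                return pos;
            }
            pos++;
        }
        return pos == input.length() ? -1 : pos;
    }

    public static void main(String[] args) {
        System.out.println(firstErrorIndex("DATE/201A08")); // 8
    }
}
```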
You want to know whether the entire pattern matched, and when not, how far it matched.
Here a single regex falls short. A regex test must succeed to give results in group(), and if it succeeds on only a part of the pattern, one cannot tell whether everything was matched.
The sensible thing to do is split the matching.
public class ProgressiveMatch {
private final String[] regexParts;
private String group;
ProgressiveMatch(String... regexParts) {
this.regexParts = regexParts;
}
// lookingAt with (...)?(...)?...
public boolean lookingAt(String text) {
StringBuilder sb = new StringBuilder();
sb.append('^');
for (int i = 0; i < regexParts.length; ++i) {
String part = regexParts[i];
sb.append("(");
sb.append(part);
sb.append(")?");
}
Pattern pattern = Pattern.compile(sb.toString());
Matcher m = pattern.matcher(text);
if (m.lookingAt()) {
boolean all = true;
group = "";
for (int i = 1; i <= regexParts.length; ++i) {
if (m.group(i) == null) {
all = false;
break;
}
group += m.group(i);
}
return all;
}
group = null;
return false;
}
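// Alternative implementation, renamed so that both variants compile side by side:
// lookingAt with multiple patterns, trying successively shorter prefixes of the parts
public boolean lookingAtStepwise(String text) {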
for (int n = regexParts.length; n > 0; --n) {
// Match for n parts:
StringBuilder sb = new StringBuilder();
sb.append('^');
for (int i = 0; i < n; ++i) {
String part = regexParts[i];
sb.append(part);
}
Pattern pattern = Pattern.compile(sb.toString());
Matcher m = pattern.matcher(text);
if (m.lookingAt()) {
group = m.group();
return n == regexParts.length;
}
}
group = null;
return false;
}
public String group() {
return group;
}
}
public static void main(String[] args) {
// ^[A-Z]{1,4}(/[1-2][0-9][0-9][0-9][0-1][0-9])?
ProgressiveMatch match = new ProgressiveMatch("[A-Z]{1,4}", "/",
"[1-2]", "[0-9]", "[0-9]", "[0-9]", "[0-1]", "[0-9]");
boolean matched = match.lookingAt("DATE/201A08");
System.out.println("Matched: " + matched);
System.out.println("Up to: " + match.group());
}
One could make a small DSL in Java, like:
ProgressiveMatch match = ProgressiveMatchBuilder
.range("A", "Z", 1, 4)
.literal("/")
.range("1", "2")
.range("0", "9", 3, 3)
.range("0", "1")
.range("0", "9")
.match();

Manipulating a user's input

So I'm trying to manipulate the user's input in such a way that when I find a certain string in his input I turn that into a variable and replace the string with the name of the variable. (jumbled explanation I know, maybe an example will make it more clear).
public class Test {
static List<String> refMap = new ArrayList<String>();
public static void main(String[] args) {
String x = "PROPERTY_X";
String y = "PROPERTY_Y";
refMap.add(x);
refMap.add(y);
String z = "getInteger(\"PROPERTY_X\")";
String text = "q=PROPERTY_X+10/(200*PROPERTY_X)";
String text1 = "if(PROPERTY_X==10){"
+ "j=1;"
+ "PROPERTY_X=5; "
+ "if(true){"
+ "m=4/PROPERTY_X"
+ "}"
+ "}";
detectEquals(text);
}
public static String detectEquals(String text) {
String a = null;
text = TestSplitting.addDelimiters(text);
String[] newString = text.split(" ");
List<String> test = Arrays.asList(newString);
StringBuilder strBuilder = new StringBuilder();
HashMap<String, Integer> signs = new HashMap<String, Integer>();
HashMap<String, Integer> references = new HashMap<String, Integer>();
List<String> referencesList = new ArrayList<String>();
List<Integer> indexList = new ArrayList<Integer>();
int index = 0;
for (int i = 0; i < test.size(); i++) {
a = test.get(i).trim();
//System.out.println("a= " + a);
strBuilder.append(a);
index = strBuilder.length() - a.length();
if (a.equals("=")) {
signs.put(a, index);
indexList.add(index);
// System.out.println("signs map--> : "+signs.get(a));
}
if (refMap.contains(a)) {
references.put(a, index);
// System.out.println("reference index-> "+references.get(a));
// System.out.println("reference-> "+references.toString());
}
}
//stuck here
for (String s : references.keySet()) {
//System.out.println("references-> " + s);
int position = references.get(s);
for (int j : indexList) {
if (j <= position) {
System.out.println(j);
}
}
//strBuilder.insert(j - 1, "temp1=\r\n");
}
System.out.println(strBuilder);
return a;
}
Say the user inputs the content of the string "text", I'm trying to parse that input so when I find "PROPERTY_X", I want to create a variable out of it and place it right before the occurrence of text, and then replace "PROPERTY_X" with the name of the newly created variable.
The reason I'm also searching for "=" sign is because I only want to do the above for the first occurrence of "PROPERTY_X" in the whole input and then just replace "PROPERTY_X" with tempVar1 wherever else I find "PROPERTY_X".
ex:
tempVar1=PROPERTY_X;
q=tempVar1+10/(200*tempVar1);
Things get more complex as the user input gets more complex, but for the moment I'm only trying to do it right for the first input example I created and then take it from there :).
As you can see, I'm a bit stuck on the logic part, the way I went with it was this:
I find all the "=" signs in the string (when I move on to more complex inputs I will also need to search for constructs like if, for, else and while) and save each of them with its index in a map, then I do the same for the occurrences of "PROPERTY_X" and their indexes. Then I try to find the index of the "=" which is closest to the index of "PROPERTY_X" and insert my new variable there, after which I go on to replace what I need with the name of the variable.
Oh, and the addDelimiters() method inserts spaces around certain delimiters so the text can be split; once split into the list, the "text" string will look something like this:
q
=
PROPERTY_X
+
10
etc..
Any suggestions are welcome.
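For the simple single-statement case described above, one possible starting point is to skip the index bookkeeping and let a word-boundary regex do the replacement. This is a rough sketch under stated assumptions: TempVarRewriter and rewrite are hypothetical names, the tempVarN naming follows the example in the question, and the declarations are simply placed in front of the whole input, which is only equivalent to "right before the first occurrence" when the input is a single statement.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class TempVarRewriter {

    // For each known property that occurs in text, emits "tempVarN=PROPERTY;"
    // in front of the text and replaces every occurrence with tempVarN.
    static String rewrite(String text, List<String> properties) {
        StringBuilder declarations = new StringBuilder();
        String body = text;
        int counter = 1;
        for (String prop : properties) {
            // \b keeps PROPERTY_X from matching inside e.g. PROPERTY_XY
            Pattern p = Pattern.compile("\\b" + Pattern.quote(prop) + "\\b");
            if (p.matcher(body).find()) {
                String temp = "tempVar" + counter++;
                declarations.append(temp).append('=').append(prop).append(";\n");
                body = p.matcher(body).replaceAll(temp);
            }
        }
        return declarations.toString() + body;
    }

    public static void main(String[] args) {
        List<String> refMap = Arrays.asList("PROPERTY_X", "PROPERTY_Y");
        System.out.println(rewrite("q=PROPERTY_X+10/(200*PROPERTY_X);", refMap));
    }
}
```

Inputs with nested blocks, such as text1 in the question, would need real scoping rules to decide where each declaration goes, which this sketch does not attempt.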
