Why does this Java regex fail? - java

In my Java code, I would like to match strings such as:
1m
112m
10million
9Million
I also want to match stuff like "100k". The following works for the "k" but not for the "m". Why is that?
if (moneyExp.matches("[-+]?\\d+[kK]")) {
String modMoney = moneyExp.replaceAll("[^\\d.]", "");
modMoney += "000";
mw.hashX.remove("Amount");
mw.doublePut("Amount", modMoney, 1);
tagMap = mw.hashX;
} else if (money.matches("[-+]?\\d+[mM]")) {
String modMoney = moneyExp.replaceAll("[^\\d.]", "");
modMoney += "000000";
mw.hashX.remove("Amount");
mw.doublePut("Amount", modMoney, 1);
tagMap = mw.hashX;
}

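Your code works only with inputs such as 100k or 34m, because matches() needs to match the whole string to return true. A single [mM] therefore cannot cover "10million". Also note that the second branch calls money.matches(...) rather than moneyExp.matches(...), so check that this is intended. To allow further word characters after the unit letter, you can try:
moneyExp.matches("[-+]?\\d+[kK]\\w*"); // for k
moneyExp.matches("[-+]?\\d+[mM]\\w*"); // for m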
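As a rough sketch of how the whole-string rule plays out, the two branches can be folded into one pattern with capturing groups. The class name MoneyNormalizer and the method normalize are made up for illustration, and unlike the question's code this version keeps a leading sign. It also assumes any letters after the unit (as in "million") can be ignored.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original code.
class MoneyNormalizer {
    // Optional sign and digits, a k/m unit letter, then any further letters ("illion").
    private static final Pattern AMOUNT =
            Pattern.compile("([-+]?\\d+)([kKmM])[a-zA-Z]*");

    /** Returns the expanded amount, or null if the input is not recognised. */
    static String normalize(String moneyExp) {
        Matcher m = AMOUNT.matcher(moneyExp);
        if (!m.matches()) { // matches() must consume the entire input
            return null;
        }
        String zeros = m.group(2).equalsIgnoreCase("k") ? "000" : "000000";
        return m.group(1) + zeros;
    }

    public static void main(String[] args) {
        for (String s : new String[] {"1m", "112m", "10million", "9Million", "100k"}) {
            System.out.println(s + " -> " + normalize(s));
        }
    }
}
```

Because the unit letter is captured, a single matches() call decides both whether the input is valid and how many zeros to append.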

Related

String Manipulation in java 1.6

String can be like below. Using java1.6
String example = "<number>;<name-value>;<name-value>";
String abc = "+17005554141;qwq=1234;ddd=ewew;otg=383";
String abc = "+17005554141;qwq=123454";
String abc = "+17005554141";
I want to remove qwq=1234 from the String if it is present. The key qwq is fixed, but its value can vary, for example 1234 or 12345.
Expected result:
String abc = "+17005554141;ddd=ewew;otg=383";
String abc = "+17005554141"; // removed ;qwq=123454
String abc = "+17005554141";
I tried
abc = abc.replaceAll(";qwq=.*;", "");
but it does not work.
I came up with this: qwq=\d*\;? and it works. It matches qwq= followed by zero or more digits. The trailing ; is optional because in your examples it is not always present after the number. Note that the ; in front of qwq= is left in place, so a qwq field at the very end leaves a trailing semicolon behind.
I know the question is not about JavaScript, but here's an example where you can see the regex working:
const regex = /qwq=\d*\;?/g;
var items = ["+17005554141;qwq=123454",
"+17005554141",
"+17005554141;qwq=1234;ddd=ewew;otg=383"];
for(let i = 0; i < items.length; i++) {
console.log("Item before replace: " + items[i]);
console.log("Item after replace: " + items[i].replace(regex, "") + "\n\n");
}
You can use regex for removing that kind of string like this. Use this code,
String example = "+17005554141;qwq=1234;ddd=ewew;otg=383";
System.out.println("Before: " + example);
System.out.println("After: " + example.replaceAll("qwq=\\d+;?", ""));
This gives following output,
Before: +17005554141;qwq=1234;ddd=ewew;otg=383
After: +17005554141;ddd=ewew;otg=383
.* matches any run of characters, not just digits, and it is greedy, so it can swallow later fields as well. Use a pattern that matches only digits, such as \\d+ (one or more digits):
abc.replaceAll(";qwq=\\d+", "")
please try
abc = abc.replaceAll("qwq=[0-9]*;", "");
If you would rather avoid regex, you can achieve this with plain String operations (indexOf, replace and substring). This is probably the most old-fashioned way to do it:
private static String replaceQWQ(String target)
{
    if (target.indexOf("qwq=") != -1) {
        if (target.indexOf(';', target.indexOf("qwq=")) != -1) {
            String replace =
                target.substring(target.indexOf("qwq="), target.indexOf(';', target.indexOf("qwq=")) + 1);
            target = target.replace(replace, "");
        } else {
            target = target.substring(0, target.indexOf("qwq=") - 1);
        }
    }
    return target;
}
Small test:
String abc = "+17005554141;qwq=1234;ddd=ewew;otg=383";
String def = "+17005554141;qwq=1234";
System.out.println(replaceQWQ(abc));
System.out.println(replaceQWQ(def));
outputs:
+17005554141;ddd=ewew;otg=383
+17005554141
Another one:
abc.replaceAll(";qwq=[^;]*;", ";");
You can also use capturing groups in the replaceAll method.
Here is an example:
abc.replaceAll("(.*;)(qwq=\\d*;)(.*)", "$1$3");
You can find more about groups at: http://www.vogella.com/tutorials/JavaRegularExpressions/article.html
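Several of the patterns above handle only one position of the qwq field, either in the middle or at the end. A minimal sketch that covers all three sample strings is to remove the field together with its leading semicolon. The class and method names are illustrative only, and the sketch assumes the value never contains a semicolon.

```java
// Hypothetical helper; works on Java 1.6 as it only uses String.replaceAll.
class QwqRemover {
    static String removeQwq(String s) {
        // ";qwq=" followed by everything up to (but not including) the next ';'
        return s.replaceAll(";qwq=[^;]*", "");
    }

    public static void main(String[] args) {
        System.out.println(removeQwq("+17005554141;qwq=1234;ddd=ewew;otg=383"));
        System.out.println(removeQwq("+17005554141;qwq=123454"));
        System.out.println(removeQwq("+17005554141"));
    }
}
```

Anchoring on the leading ; rather than the trailing one means the same pattern works whether or not another field follows.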

replaceFirst for character "`"

First time here. I'm trying to write a program that takes a string input from the user and encode it using the replaceFirst method. All letters and symbols with the exception of "`" (Grave accent) encode and decode properly.
e.g. When I input
`12
I am supposed to get 28AABB as my encryption, but instead, it gives me BB8AA2
import java.io.IOException;
import javax.swing.JOptionPane;

public class CryptoString {
    public static void main(String[] args) throws IOException, ArrayIndexOutOfBoundsException {
        String input = "";
        input = JOptionPane.showInputDialog(null, "Enter the string to be encrypted");
        JOptionPane.showMessageDialog(null, "The message " + input + " was encrypted to be " + encrypt(input));
    }

    public static String encrypt(String s) {
        String encryptThis = s.toLowerCase();
        String encryptThistemp = encryptThis;
        int encryptThislength = encryptThis.length();
        for (int i = 0; i < encryptThislength; ++i) {
            String test = encryptThistemp.substring(i, i + 1);
            //Took out all code with regard to all cases OTHER than "`" "1" and "2"
            //All other cases would have followed the same format, except with a different string replacement argument.
            if (test.equals("`")) {
                encryptThis = encryptThis.replaceFirst("`", "28");
            }
            else if (test.equals("1")) {
                encryptThis = encryptThis.replaceFirst("1", "AA");
            }
            else if (test.equals("2")) {
                encryptThis = encryptThis.replaceFirst("2", "BB");
            }
        }
        return encryptThis;
    }
}
I've tried putting escape characters in front of the grave accent, however, it is still not encoding it properly.
Take a look at how your program works in each loop iteration (I use ' instead of ` to make this post easier to write):
i=0
encryptThis = '12
we read the character at position 0, which is ', so
we replace ' with 28, making '12 -> 2812
i=1
we read the character at position 1 of the original string, which is 1, so
we replace the first 1 with AA, making 2812 -> 28AA2
i=2
we read the character at position 2 of the original string, which is 2, so
we replace the first 2 with BB, but the first 2 is now the one in the 28 produced earlier, making 28AA2 -> BB8AA2
You could instead use appendReplacement from the Matcher class in the java.util.regex package, for example:
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public static String encrypt(String s) {
    Map<String, String> replacementMap = new HashMap<>();
    replacementMap.put("`", "28");
    replacementMap.put("1", "AA");
    replacementMap.put("2", "BB");
    Pattern p = Pattern.compile("[`12]"); //regex that will match ` or 1 or 2
    Matcher m = p.matcher(s);
    StringBuffer sb = new StringBuffer();
    while (m.find()) { //we found one of `, 1, 2
        m.appendReplacement(sb, replacementMap.get(m.group()));
    }
    m.appendTail(sb);
    return sb.toString();
}
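If regex feels like overkill, the same single-pass idea can be sketched with a plain loop that appends to a separate output buffer, so text that has already been encoded is never scanned again. The class name SinglePassEncoder is made up, and only the three cases shown in the question are handled.

```java
// Hypothetical sketch: each input character is encoded exactly once.
class SinglePassEncoder {
    static String encrypt(String s) {
        StringBuilder out = new StringBuilder();
        for (char c : s.toLowerCase().toCharArray()) {
            switch (c) {
                case '`': out.append("28"); break;
                case '1': out.append("AA"); break;
                case '2': out.append("BB"); break;
                default:  out.append(c);    // characters without a mapping pass through
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(encrypt("`12")); // 28AABB
    }
}
```

Since the loop reads from the input and writes to out, the 2 inside an emitted "28" can never be mistaken for an input character.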
Regarding encryptThistemp.substring(i, i + 1): in Java the second parameter of substring is the exclusive end index, not a length, so this call always returns exactly one character and increasing i is fine. The problem lies in replaceFirst, which always replaces the first occurrence in a string that has already been partly encoded. This could also throw off your other cases, which we cannot see.

Deleting everything except last part of a String?

What kind of method would I use to make this:
http://www.site.net/files/file1.zip
To
file1.zip?
String yourString = "http://www.site.net/files/file1.zip";
int index = yourString.lastIndexOf('/');
String targetString = yourString.substring(index + 1);
System.out.println(targetString);// file1.zip
String str = "http://www.site.net/files/file1.zip";
str = str.substring(str.lastIndexOf("/")+1);
You could use regex to extract the last part:
@Test
public void extractFileNameFromUrl() {
final Matcher matcher = Pattern.compile("[\\w+.]*$").matcher("http://www.site.net/files/file1.zip");
Assert.assertEquals("file1.zip", matcher.find() ? matcher.group(0) : null);
}
It'll return only "file1.zip". Included here as a test as I used it to validate the code.
Use split:
String[] arr = "http://www.site.net/files/file1.zip".split("/");
Then:
String lastPart = arr[arr.length-1];
Update: Another simpler way to get this:
File file = new File("http://www.site.net/files/file1.zip");
System.out.printf("Path: [%s]%n", file.getName()); // file1.zip
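Since the input is a URL rather than a file system path, another option is to let java.net.URI isolate the path first, so that a query string or fragment does not end up in the result. This is only a sketch: the class and method names are invented, and it assumes a hierarchical URL (for an opaque URI such as mailto:, getPath() returns null).

```java
import java.net.URI;

// Hypothetical helper for illustration.
class UrlFileName {
    static String fileName(String url) {
        String path = URI.create(url).getPath();          // e.g. "/files/file1.zip"
        return path.substring(path.lastIndexOf('/') + 1); // text after the last '/'
    }

    public static void main(String[] args) {
        System.out.println(fileName("http://www.site.net/files/file1.zip"));
    }
}
```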

Validate File name using regex in java

I want to validate a file name using a regex in Java. I implemented the code below, but it does not work for the third type of file.
Can I also check the prefix and extension in the regex?
Valid file names take one of these 3 forms:
1) prefix_digit.digit.extension, for example: AB_1.1.fuij (here fuij is my extension)
2) prefix_digit.digit.digit.extension, for example: AB_1.1.1.fuij
3) prefix_digit.digit.B/P.digit.extension, for example: AB_1.1.B.1.fuij
Only these 3 types of file are valid. The third one is for beta and pilot version files; a beta or pilot file must look like the form shown above.
Here are some valid and invalid file names:
**Valid :**
AB_1.1.fuij
AB_1.4.fuij
AB_1.1.1.fuij
AB_1.1.B.1.fuij
AB_3.4.P.7.fuij
***Invalid :***
AB_0.1.fuij
AB_1.B.1.1.fuij (B/P must always be in the 3rd position)
AB_1.2.B.0.fuij
CODE :
import java.util.ArrayList;
import java.util.regex.Pattern;
public class democlass {
/**
* Test harness.
*/
public static void main(String[] args) {
ArrayList<String> demoversion = new ArrayList<String>();
System.out.println("Result >>>>>>>>>>>> "
+isFileValid("AB_1.1.fuij"));
System.out.println("Result >>>>>>>>>>>> "
+isFileValid("AB_1.B.fuij"));
System.out.println("Result >>>>>>>>>>>> "
+isFileValid("AB_1.1.1.fuij"));
System.out.println("Result >>>>>>>>>>>> "
+isFileValid("AB_1.P.1.1.fuij"));
System.out.println("Result >>>>>>>>>>>> "
+isFileValid("AB_1.1.B.1.fuij"));
}
private static boolean isFileValid(String input)
{
String regexFinalBugFix = "^\\d+\\.\\d+\\.\\d+$";
String regexFinal = "^\\d+\\.\\d+$";
String regexBetaPilot = "^\\d+\\.\\d+\\.\\[BP]+\\.\\d+$";
final Pattern pattern1 = Pattern.compile(regexFinal);
final Pattern pattern2 = Pattern.compile(regexBetaPilot);
final Pattern pattern3 = Pattern.compile(regexFinalBugFix);
String inputVersion = null;
int suffixIndex = input.lastIndexOf(".");
int prefixIndex = input.lastIndexOf("_");
if (suffixIndex > 0 && prefixIndex > 0) {
inputVersion = input.substring(prefixIndex + 1,
suffixIndex);
String prefixString1 = input.substring(0, 3);
String suffixString1 = input.substring(suffixIndex);
if(prefixString1.equals("AB_") && suffixString1.equals(".fuij"))
{
if (pattern1.matcher(inputVersion).matches()
|| pattern2.matcher(inputVersion).matches()
|| pattern3.matcher(inputVersion).matches()) {
return true;
}
return false;
}
return false;
}
return false;
}
}
OUTPUT :
Result >>>>>>>>>>>> true
Result >>>>>>>>>>>> false
Result >>>>>>>>>>>> true
Result >>>>>>>>>>>> false
Result >>>>>>>>>>>> false : It should be valid but it is false, why??
Your regexBetaPilot is wrong: you are escaping the opening bracket of the [BP] class. Try this instead:
String regexBetaPilot = "^\\d+\\.\\d+\\.[BP]+\\.\\d+$";
You can easily combine all three patterns into a single pattern:
String regex = "\\d+\\.(\\d+\\.([BP]+\\.)?)?\\d+";
You don't need the anchors (^ and $). Since you are using matches() instead of find(), it will always try to match the entire string.
EDIT I left in the + after [BP] because that's what you had in your original code. However, if you want to match a single B or P, then you should remove the + from the pattern.
You are escaping the opening bracket of [BP], so it tries to find a [ in the string.
This works:
String regexBetaPilot = "^\\d+\\.\\d+\\.[BP]+\\.\\d+$";
Something like this should work with AB being static:
Regular Expression: AB_\d+\.\d+((\.\d){0,1}|\.[BP]\.\d+)\.fuij
as a Java string AB_\\d+\\.\\d+((\\.\\d){0,1}|\\.[BP]\\.\\d+)\\.fuij
This misses two of your listed invalid names, but I was unsure why they should be invalid. I can help more if you explain the rules for success and failure in more detail.
You can simplify your regular expression to
AB_\d+\.\d+(?:(?:\.[BP])?\.\d+)?\.fuij
This matches AB_digits.digits. Then comes an optional .digits, .B.digits or .P.digits. And finally matches .fuij. From your examples, there might be only a single B or P. If you wish to match multiple Bs and Ps, just add the + again.
And then your isFileValid() function might be reduced to
private static boolean isFileValid(String input)
{
final String re = "AB_\\d+\\.\\d+(?:(?:\\.[BP])?\\.\\d+)?\\.fuij";
final Pattern pattern = Pattern.compile(re);
return pattern.matcher(input).matches();
}
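The question also lists AB_0.1.fuij and AB_1.2.B.0.fuij as invalid, which the patterns above still accept. The reason is not stated, so the following sketch rests on an assumption: every numeric component must be a positive integer without leading zeros. Under that assumption each \d+ can be swapped for [1-9]\d*. The class name and the POSITIVE constant are illustrative.

```java
import java.util.regex.Pattern;

// Hypothetical variant; the "no zero components" rule is an assumption.
class StrictFileNameValidator {
    private static final String POSITIVE = "[1-9]\\d*"; // 1, 2, ..., 10, 11, ...

    private static final Pattern FILE_NAME = Pattern.compile(
            "AB_" + POSITIVE + "\\." + POSITIVE
            + "(?:(?:\\.[BP])?\\." + POSITIVE + ")?\\.fuij");

    static boolean isFileValid(String input) {
        return FILE_NAME.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(isFileValid("AB_1.1.B.1.fuij"));
        System.out.println(isFileValid("AB_1.2.B.0.fuij"));
    }
}
```

If a zero is in fact allowed in some positions (for example a minor version of 0), only the corresponding POSITIVE occurrences would need to go back to \d+.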

How can you parse the string which has a text qualifier

How can I parse a String str = "abc, \"def,ghi\"";
such that I get the output as
String[] strs = {"abc", "\"def,ghi\""}
i.e. an array of length 2.
Should I use a regular expression, or is there a method in the Java API or any other open source
project which lets me do this?
Edited
To give context about the problem: I am reading a text file which has a list of records, one on each line. Each record has a list of fields separated by a delimiter (comma or semicolon). Now I have a requirement to support a text qualifier, similar to what Excel or OpenOffice supports. Suppose I have the record
abc, "def,ghi"
Here , is my delimiter and " is my text qualifier, so when I parse this string I should get two fields, abc and def,ghi, not {abc,def,ghi}.
I hope this clarifies my requirement.
Thanks
Shekhar
The basic algorithm is not too complicated:
public static List<String> customSplit(String input) {
    List<String> elements = new ArrayList<String>();
    StringBuilder elementBuilder = new StringBuilder();
    boolean isQuoted = false;
    for (char c : input.toCharArray()) {
        if (c == '\"') {
            isQuoted = !isQuoted;
            // continue; // changed according to the OP comment - \" shall not be skipped
        }
        if (c == ',' && !isQuoted) {
            elements.add(elementBuilder.toString().trim());
            elementBuilder = new StringBuilder();
            continue;
        }
        elementBuilder.append(c);
    }
    elements.add(elementBuilder.toString().trim());
    return elements;
}
This question seems appropriate: Split a string ignoring quoted sections
Along that line, http://opencsv.sourceforge.net/ seems appropriate.
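Try this. Note that the spaces inside the pattern are literal, so in effect it splits on a comma followed by a space, and it only works when the commas inside the quotes have no space after them: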
String str = "abc, \"def,ghi\"";
String regex = "([,]) | (^[\"\\w*,\\w*\"])";
for(String s : str.split(regex)){
System.out.println(s);
}
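Try:
List<String> res = new LinkedList<String>();
String[] chunks = str.split("\"", -1); // limit -1 keeps trailing empty chunks
if (chunks.length % 2 == 0) {
    // Mismatched quotes!
}
for (int i = 0; i < chunks.length; i++) {
    if (i % 2 == 0) {
        // Outside quotes: split on the delimiter (blank pieces are skipped)
        for (String part : chunks[i].split(",")) {
            if (!part.trim().isEmpty()) {
                res.add(part.trim());
            }
        }
    } else {
        // Inside quotes: keep the chunk whole
        res.add(chunks[i]);
    }
}
This only splits the portions that are not between quotes. Quoted chunks are kept whole, without the surrounding quote characters.
The unquoted pieces are already trimmed; call trim() on the quoted chunks as well if you want to get rid of their whitespace.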
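Another commonly used regex-only option is to split on a comma only when it is followed by an even number of quote characters, which means the comma sits outside any quoted section. The sketch below assumes quotes are balanced and never escaped inside a field. The class name QuotedSplit is made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; assumes balanced, unescaped quotes.
class QuotedSplit {
    // A comma, then (via lookahead) zero or more quote pairs up to the end of the line.
    private static final String DELIMITER = ",(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)";

    static List<String> split(String line) {
        List<String> fields = new ArrayList<String>();
        for (String field : line.split(DELIMITER, -1)) { // -1 keeps empty trailing fields
            fields.add(field.trim());
        }
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(split("abc, \"def,ghi\""));
    }
}
```

The quotes stay part of the field here, matching the output asked for at the top of the question. For real CSV input with escaped quotes or multi-line fields, a dedicated parser such as the opencsv project mentioned above is likely the safer choice.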
