Regex string replace considerably faster than regular replace in Kotlin - java

I have been trying to replace characters in a string in two ways. First with a regex for every replacement:
data = data.replace(Regex("[a-z:]", RegexOption.IGNORE_CASE), "")
.replace(Regex("/", RegexOption.IGNORE_CASE), ".")
.replace(Regex(",", RegexOption.IGNORE_CASE), "")
.replace(Regex("'", RegexOption.IGNORE_CASE), "")
.replace(Regex("é",RegexOption.IGNORE_CASE),"")
.replace(Regex("ê",RegexOption.IGNORE_CASE),"")
.replace(Regex("ö",RegexOption.IGNORE_CASE),"")
.replace(Regex("Ä",RegexOption.IGNORE_CASE),"")
.replace(Regex("ä",RegexOption.IGNORE_CASE),"")
.replace(Regex("ä |",RegexOption.IGNORE_CASE),"")
And
data = data.replace(Regex("[a-z:]", RegexOption.IGNORE_CASE), "")
.replace("/", ".")
.replace(",", "")
.replace("'", "")
.replace("é","")
.replace("ê","")
.replace("ö","")
.replace("Ä","")
.replace("ä","")
I measured the time taken by both versions and, surprisingly, the regex version turned out to be at least 20 times faster than the plain replace version.
Everything I have read about regex says it is an expensive operation. Am I missing something?
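For comparison, here is a minimal Java sketch that does all of the removals in one pass with a single precompiled character class, instead of one call per character. It makes no claim about which variant is faster in the measurement above; the class name SinglePassCleaner and the method clean are hypothetical, and the odd "ä |" pattern from the first snippet is deliberately left out.

```java
import java.util.regex.Pattern;

class SinglePassCleaner {
    // Compiled once and reused. The escapes are é, ê, ö, Ä and ä, written as
    // Unicode escapes so the source file encoding does not matter.
    private static final Pattern REMOVE =
            Pattern.compile("[a-zA-Z:,'\u00e9\u00ea\u00f6\u00c4\u00e4]");

    static String clean(String data) {
        // Drop every unwanted character in a single scan,
        // then turn the remaining slashes into dots.
        return REMOVE.matcher(data).replaceAll("").replace('/', '.');
    }

    public static void main(String[] args) {
        System.out.println(clean("a1/2,b'3")); // prints 1.23
    }
}
```

Timing either variant reliably needs warm-up runs and many iterations; a single measurement mostly reflects class loading and JIT compilation.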

Related

"How to replace "[" from string in java?"

I want to use the String.replaceAll method in Java. I have a string that includes "[" and I want to replace it with "", but it's showing an error.
String str="already data exists = [ abc,xyz,123 ]";
String replacedStr = str.replaceAll("Already Po Exits =", "");
String replacedStr1 = replacedStr.replaceAll("\\[", "");
The following replace function will replace [ in your string.
str.replaceAll("\\[", "")
or you can use replace function to achieve the same
str.replace("[", "")
replaceAll treats "[" as the start of a character class, which is why it reports an unclosed character class. You need to escape it in the pattern:
String replacedStr1 = replacedStr.replaceAll("\\[", "");
You can use the replace function to do this.
String replacedStr1 = replacedStr.replace("[", "");
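The answers above can be put side by side in a small sketch. The class and method names are made up for illustration; the calls themselves (String.replaceAll, String.replace, Pattern.quote) are standard JDK methods.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class BracketRemoval {
    // Regex version: the bracket is escaped so it is matched literally.
    static String viaEscapedRegex(String s) {
        return s.replaceAll("\\[", "");
    }

    // Regex version without manual escaping: Pattern.quote builds a literal pattern.
    static String viaQuote(String s) {
        return s.replaceAll(Pattern.quote("["), "");
    }

    // Non-regex version: replace(CharSequence, CharSequence) takes literal text.
    static String viaLiteral(String s) {
        return s.replace("[", "");
    }

    // Shows the original error: an unescaped "[" is not a valid regex.
    static boolean unescapedFails(String s) {
        try {
            s.replaceAll("[", "");
            return false;
        } catch (PatternSyntaxException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(viaLiteral("[abc]")); // prints abc]
    }
}
```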

String.replaceAll(String, String) method gives unexpected output

I have some amounts as input like:
Rs1 ,
INR10,954.00 ,
INR 45000 ,
INR 25000.70 ,
Rs.25000 ,
Rs.1,000 ,
Rs. 14000
These are the input formats I'm using.
String getRuppee="Rs1"; // this string can not get the format and gives wrong output.
String val=getRuppee.toLowerCase()
.replaceAll(",", "")
.replaceAll("rs.", "")
.replaceAll("\"rs\"", "")
.replaceAll("rs", "")
.replaceAll("inr", "")
.replaceAll("inr ", "")
.replaceAll("mrp", "")
.replaceAll(" ", "");
This gives me the expected output for the inputs shown above, except for the first one (Rs1).
Log.e("Converstion", "Converstion Balance."
+getRuppee.toLowerCase().
replaceAll("rs.", ""));
I need the output 1, but it gives me null.
Logcat Displays: "06-07 11:34:48.438: E/Converstion(8233): Converstion Balance."
Change your code as follows
String getRuppee="Rs1"; // this string can not get the format and gives wrong output.
String val=getRuppee.toLowerCase()
.replace(",", "")
.replace("rs.", "")
.replace("\"rs\"", "")
.replace("rs", "")
.replace("inr", "")
.replace("inr ", "")
.replace("mrp", "")
.replace(" ", "");
This is your problem:
.replaceAll("rs.", "")
replaceAll treats its first argument as a regular expression, and in a regular expression . matches any character. The pattern rs. therefore matches rs followed by any character, so for "rs1" it removes the 1 as well.
Use .replace instead, which replaces every occurrence of a literal string.
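The difference is easy to see in a short sketch. The class AmountCleaner and its methods are hypothetical names for illustration; the replacement chain in amount is the one from the question, rewritten with literal replace and with the redundant steps dropped.

```java
class AmountCleaner {
    // Regex version from the question: "." matches any character,
    // so "rs1" is matched as a whole and removed.
    static String regexStrip(String s) {
        return s.toLowerCase().replaceAll("rs.", "");
    }

    // Literal version: only the exact text "rs." is removed.
    static String literalStrip(String s) {
        return s.toLowerCase().replace("rs.", "");
    }

    // Whole chain with literal replacements only.
    static String amount(String s) {
        return s.toLowerCase()
                .replace(",", "")
                .replace("rs.", "")
                .replace("rs", "")
                .replace("inr", "")
                .replace("mrp", "")
                .replace(" ", "");
    }

    public static void main(String[] args) {
        System.out.println("[" + regexStrip("Rs1") + "]");   // prints []
        System.out.println("[" + literalStrip("Rs1") + "]"); // prints [rs1]
        System.out.println(amount("Rs1"));                   // prints 1
    }
}
```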
Try below code:
String[] getRuppee = {"RS1", "INR10,954.00" , "INR 45000" , "INR 25000.70" , "Rs.25000" , "Rs.1,000" , "Rs. 14000"};
for (String tmp : getRuppee){
String val=tmp.toLowerCase()
.replaceAll(",", "")
.replaceAll("rs\\.", "") //check here
.replaceAll("\"rs\"", "")
.replaceAll("rs", "")
.replaceAll("inr", "")
.replaceAll("inr ", "")
.replaceAll("mrp", "")
.replaceAll(" ", "");
System.out.println(val);
}

Replace String in Java with regex and replaceAll

Is there a simple solution to parse a String by using regex in Java?
I have to adapt a HTML page. Therefore I have to parse several strings, e.g.:
href="/browse/PJBUGS-911"
=>
href="PJBUGS-911.html"
The pattern of the strings is only different corresponding to the ID (e.g. 911). My first idea looks like this:
String input = "";
String output = input.replaceAll("href=\"/browse/PJBUGS\\-[0-9]*\"", "href=\"PJBUGS-???.html\"");
I want to replace everything except the ID. How can I do this?
Would be nice if someone can help me :)
You can capture substrings that were matched by your pattern, using parentheses. And then you can use the captured things in the replacement with $n where n is the number of the set of parentheses (counting opening parentheses from left to right). For your example:
String output = input.replaceAll("href=\"/browse/PJBUGS-([0-9]*)\"", "href=\"PJBUGS-$1.html\"");
Or if you want:
String output = input.replaceAll("href=\"/browse/(PJBUGS-[0-9]*)\"", "href=\"$1.html\"");
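As a self-contained sketch of the capture-group approach, applied to a whole HTML fragment rather than a single attribute: the class name HrefRewrite is hypothetical, and the pattern uses [0-9]+ instead of [0-9]* on the assumption that an ID always has at least one digit.

```java
class HrefRewrite {
    static String rewrite(String html) {
        // Group 1 captures "PJBUGS-<digits>"; $1 re-inserts it in the replacement.
        return html.replaceAll("href=\"/browse/(PJBUGS-[0-9]+)\"", "href=\"$1.html\"");
    }

    public static void main(String[] args) {
        System.out.println(rewrite("<a href=\"/browse/PJBUGS-911\">bug</a>"));
        // prints <a href="PJBUGS-911.html">bug</a>
    }
}
```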
This does not use regexp. But maybe it still solves your problem.
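// assumes input holds exactly one attribute of the form href="/browse/PJBUGS-911"
output = "href=\"" + input.substring(input.lastIndexOf("/") + 1, input.lastIndexOf("\"")) + ".html\"";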
This is how I would do it:
public static void main(String[] args)
{
String text = "href=\"/browse/PJBUGS-911\" blahblah href=\"/browse/PJBUGS-111\" " +
"blahblah href=\"/browse/PJBUGS-34234\"";
Pattern ptrn = Pattern.compile("href=\"/browse/(PJBUGS-[0-9]+?)\"");
Matcher mtchr = ptrn.matcher(text);
while(mtchr.find())
{
String match = mtchr.group(0);
String insMatch = mtchr.group(1);
String repl = match.replaceFirst(match, "href=\"" + insMatch + ".html\"");
System.out.println("orig = <" + match + "> repl = <" + repl + ">");
}
}
This just shows the regex and replacements, not the final formatted text, which you can get by using Matcher.replaceAll:
String allRepl = mtchr.replaceAll("href=\"$1.html\"");
If just interested in replacing all, you don't need the loop -- I used it just for debugging/showing how regex does business.

substring between two delimiters

I have a string as : "This is a URL http://www.google.com/MyDoc.pdf which should be used"
I just need to extract the URL that is starting from http and ending at pdf :
http://www.google.com/MyDoc.pdf
String sLeftDelimiter = "http://";
String[] tempURL = sValueFromAddAtt.split(sLeftDelimiter );
String sRequiredURL = sLeftDelimiter + tempURL[1];
This gives me the output as "http://www.google.com/MyDoc.pdf which should be used"
Need help on this.
This kind of problem is what regular expressions were made for:
Pattern findUrl = Pattern.compile("\\bhttp.*?\\.pdf\\b");
Matcher matcher = findUrl.matcher("This is a URL http://www.google.com/MyDoc.pdf which should be used");
while (matcher.find()) {
System.out.println(matcher.group());
}
The regular expression explained:
\b before the "http" there is a word boundary (i.e. xhttp does not match)
http the string "http" (be aware that this also matches "https" and "httpsomething")
.*? any character (.) any number of times (*), but try to use the least amount of characters (?)
\.pdf the literal string ".pdf"
\b after the ".pdf" there is a word boundary (i.e. .pdfoo does not match)
If you would like to match only http and https, try to use this instead of http in your string:
https?\: - this matches the string http, then an optional "s" (indicated by the ? after the s) and then a colon.
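Putting the stricter https? variant into the pattern gives roughly the following. This is a sketch under the assumption that the URL contains "://" right after the scheme; the class UrlFinder and the method findPdfUrls are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UrlFinder {
    // \b word boundary, http or https, "://", as few characters as possible,
    // then a literal ".pdf" followed by another word boundary.
    private static final Pattern PDF_URL = Pattern.compile("\\bhttps?://.*?\\.pdf\\b");

    static List<String> findPdfUrls(String text) {
        List<String> urls = new ArrayList<>();
        Matcher matcher = PDF_URL.matcher(text);
        while (matcher.find()) {
            urls.add(matcher.group());
        }
        return urls;
    }

    public static void main(String[] args) {
        System.out.println(findPdfUrls(
                "This is a URL http://www.google.com/MyDoc.pdf which should be used"));
        // prints [http://www.google.com/MyDoc.pdf]
    }
}
```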
Why don't you use the startsWith("http://") and endsWith(".pdf") methods of the String class?
Both methods return a boolean. If both return true, your condition succeeds; otherwise it fails.
Try this
String StringName="This is a URL http://www.google.com/MyDoc.pdf which should be used";
StringName=StringName.substring(StringName.indexOf("http:"),StringName.indexOf("which"));
You can use the power of regular expressions here.
First you have to find the URL in the original string, then remove the other parts.
The following code shows my suggestion:
String regex = "\\b(http|ftp|file)://[-a-zA-Z0-9+&##/%?=~_|!:,.;]*[-a-zA-Z0-9+&##/%=~_|]";
String str = "This is a URL http://www.google.com/MyDoc.pdf which should be used";
String[] splited = str.split(regex);
for(String current_part : splited)
{
str = str.replace(current_part, "");
}
System.out.println(str);
This snippet can retrieve a URL from any string that matches the pattern.
You can add custom protocols such as https to the protocol part of the regular expression above.
I hope my answer helps you ;)
public static String getStringBetweenStrings(String aString, String aPattern1, String aPattern2) {
String ret = null;
int pos1,pos2;
pos1 = aString.indexOf(aPattern1) + aPattern1.length();
pos2 = aString.indexOf(aPattern2);
if ((pos1>0) && (pos2>0) && (pos2 > pos1)) {
return aString.substring(pos1, pos2);
}
return ret;
}
You can use String.replaceAll with a capturing group and back reference for a very concise solution:
String input = "This is a URL http://www.google.com/MyDoc.pdf which should be used";
System.out.println(input.replaceAll(".*(http.*?\\.pdf).*", "$1"));
Here's a breakdown for the regex: https://regexr.com/3qmus

remove : in 12:45 (java)

time 12:45
I want to remove the : in Java. I just need 1245. How can I do that?
String time = "12:45".replace( ":", "" ); // "1245"
If you have Apache Commons Lang on your classpath and you are not sure that time is non-null, you could use StringUtils:
time = StringUtils.remove( time, ":" );
This is more compact than writing
if ( time != null ) {
time = time.replace( ":", "" );
}
There is the "replace" method.
s = s.replace(":", ""); // an empty char literal '' does not compile, so use the String overload
If you want to get fancy:
s = s.replaceAll("[^a-zA-Z0-9]", "");
This will remove all non-alpha numeric characters (including your ':')
All right there in the JavaDoc.
If "12:45" is a string, then just use "12:45".replaceAll(":", "").
String strTime = "12:45";
strTime = strTime.replace(":", ""); // strings are immutable, so assign the result
For the easiest method, use replace:
String time = "12:45";
time = time.replace(":", "");
but you can use regular expressions:
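Pattern pattern = Pattern.compile("(\\d{1,2}):(\\d{1,2})"); // Pattern has no public constructor
Matcher matcher = pattern.matcher("12:45");
// matches() must succeed before group() can be called
String noColon = matcher.matches() ? matcher.group(1) + matcher.group(2) : null;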
or the String API:
String time = "12:45";
int colonIndex = time.indexOf(':');
String noColon = time.substring(0, colonIndex) +
time.substring(colonIndex + 1, time.length());
As others have said, a method as simple as String's replace should suffice, but since I suspect your input is a date, have a look at SimpleDateFormat too.
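A minimal sketch of the SimpleDateFormat route, assuming the input is always a 24-hour time in HH:mm form. The class TimeReformat and the method stripColon are hypothetical names; unlike a plain replace, this variant rejects input that is not a valid time.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

class TimeReformat {
    static String stripColon(String time) {
        // Locale.ROOT keeps the digits ASCII regardless of the default locale.
        SimpleDateFormat in = new SimpleDateFormat("HH:mm", Locale.ROOT);
        in.setLenient(false); // reject values such as 25:99 instead of rolling them over
        try {
            Date parsed = in.parse(time);
            return new SimpleDateFormat("HHmm", Locale.ROOT).format(parsed);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not a valid HH:mm time: " + time, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(stripColon("12:45")); // prints 1245
    }
}
```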
