Java Regex multi delimiter split in order - java

I am trying to split a string that contains several delimiters, but I want to first check that the string matches a pattern and only then split it.
Example:
The test string will contain the delimiters ://, :, #, :, / in that specific order. I need to first check whether the given string satisfies the pattern and, if it does, split it. The other parts of the string may also contain these characters, but I need to split based on the order ://, :, #, :, /
String testString = "aman://jaspreet:raman!#127.0.0.1:5031/test";
String[] tokens = testString.split("://|\\:|#|\\:|\\/");
for (String s : tokens) {
    System.out.println(s);
}
I tried the regex above, but it does not split in order. It splits wherever any one of the delimiters occurs in the string, regardless of position.

If you first validate the pattern, then you shouldn't do split() after. Use capturing groups to gather the data you already validated.
E.g. in a simple case, foo#bar, with separator #, you would validate with ^([^#]+)#(.+)$, i.e. match and capture text up to #, match but don't capture the #, then match and capture the rest:
Pattern p = Pattern.compile("^([^#]+)#(.+)$");
Matcher m = p.matcher("foo#bar");
if (!m.matches()) {
    // invalid data
} else {
    String a = m.group(1); // a = "foo"
    String b = m.group(2); // b = "bar"
    // use a and b here
}
For the matching in the question, a lenient pattern could be:
^(.*?)://(.*?):(.*?)#(.*?):(.*?)/(.*)$
You would then use the code above, but with:
String scheme = m.group(1); // "aman"
String user = m.group(2); // "jaspreet"
String password = m.group(3); // "raman!"
String host = m.group(4); // "127.0.0.1"
String port = m.group(5); // "5031"
String path = m.group(6); // "test"
For a stricter matching, replace .*? with a pattern that only matches allowed characters, e.g. [^:]+ if it cannot be empty and cannot contain colons.
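Putting the validate-then-capture idea together, here is a minimal sketch using the lenient pattern above. The class and method names (OrderedDelimiterParser, parse) are made up for illustration; only the pattern comes from the answer.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OrderedDelimiterParser {
    // Lenient pattern from the answer: the delimiters ://, :, #, :, / must appear in this order.
    private static final Pattern PARTS =
            Pattern.compile("^(.*?)://(.*?):(.*?)#(.*?):(.*?)/(.*)$");

    /** Returns the six captured parts, or an empty Optional if the input does not match. */
    static Optional<String[]> parse(String input) {
        Matcher m = PARTS.matcher(input);
        if (!m.matches()) {
            return Optional.empty(); // invalid data, nothing to split
        }
        String[] parts = new String[m.groupCount()];
        for (int i = 0; i < parts.length; i++) {
            parts[i] = m.group(i + 1); // groups are numbered from 1
        }
        return Optional.of(parts);
    }

    public static void main(String[] args) {
        parse("aman://jaspreet:raman!#127.0.0.1:5031/test")
                .ifPresent(p -> System.out.println(String.join(" | ", p)));
    }
}
```

Validation and splitting happen in a single matches() call, so there is no second pass that could disagree with the first.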
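Alternatively, if the input is a standard URL, with @ rather than # between the user info and the host, you could just use the URI class to parse it. With # in that position, URI treats everything after it as the fragment, so the user info, host, port and path would not come out as shown here.
String testString = "aman://jaspreet:raman!@127.0.0.1:5031/test";
URI uri = URI.create(testString);
String scheme = uri.getScheme(); // "aman"
String userInfo = uri.getUserInfo(); // "jaspreet:raman!"
String host = uri.getHost(); // "127.0.0.1"
int port = uri.getPort(); // 5031
String path = uri.getPath(); // "/test"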

Related

How to get the string from matcher.group() and store it in separate string variable in java?

String emailAdress = "yourname#yourdomin.com";
Pattern emailAddress = Pattern.compile("(.*)(#)(.*)");
Matcher matchEmailAddress = emailAddress.matcher(emailAdress);
String secondPartOfEmail;
while (matchEmailAddress.find()) {
    System.out.println(matchEmailAddress.group(1));
    System.out.println(matchEmailAddress.group(3));
}
When I run this code, the output is:
yourname
yourdomin.com
I want to store yourdomin.com, that is, group(3) of the matchEmailAddress matcher, in a String variable to use it later.
I've already tried:
String secondPartOfEmail = matchEmailAddress.group(3);
but an error occurred.
Assuming you want to match only one email address, call find() once and read the group only if it succeeded:
String emailAdress = "yourname#yourdomin.com";
Pattern emailAddress = Pattern.compile("(.*)(#)(.*)");
Matcher matchEmailAddress = emailAddress.matcher(emailAdress);
String secondPartOfEmail = null;
if (matchEmailAddress.find()) { // find the next substring matching your pattern
    secondPartOfEmail = matchEmailAddress.group(3);
}
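The same idea can be wrapped in a small helper so the group is never read before a successful find(). This is only a sketch; the names EmailSplitter and secondPart are hypothetical, and returning null for a non-matching input is an assumption about what the caller wants.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EmailSplitter {
    // Same pattern as in the question: text, the # separator, then the rest.
    private static final Pattern PARTS = Pattern.compile("(.*)(#)(.*)");

    /** Returns the text after the separator, or null if the separator is missing. */
    static String secondPart(String address) {
        Matcher m = PARTS.matcher(address);
        // group(3) may only be read after find() has returned true
        return m.find() ? m.group(3) : null;
    }

    public static void main(String[] args) {
        String secondPartOfEmail = secondPart("yourname#yourdomin.com");
        System.out.println(secondPartOfEmail);
    }
}
```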

Split on a special character contained inside the string

I have a String """JBL#gmail.com from which I want to remove the """ which is located at the front of the email address. I tried to use split, but unfortunately it didn't work.
Here is my code:
String [] sender1 = SA1.split(" ");
String str1 = sender1[0];
System.out.println("the str1 is :"+str1);
String [] sender2 = str1.split("\\\"");
String str2 = sender2[0];
String str3 = sender2[1];
System.out.println("the str2 is :"+str2);
System.out.println("the str3 is :"+str3);
Here is the output of my code:
the str1 is :"""JBL#gmail.com""
the str2 is :
the str3 is :
My SA1 will contain """JBL#gmail.com"" <JBL#gmail.com>". The email address can be a mix of lower/upper case letters, numbers, and etc.
If SA1 does in fact contain
"\"\"\"JBL#gmail.com\"\" <JBL#gmail.com>\""
then you can use Pattern/Matcher with a Regular Expression of "<(.*?)>" to retrieve the E-Mail Address from the String:
String sa1 = "\"\"\"JBL#gmail.com\"\" <JBL#gmail.com>\"";
String email = "";
Pattern pattern = Pattern.compile("<(.*?)>");
Matcher matcher = pattern.matcher(sa1);
while (matcher.find()) {
    // Keep the match only if something was captured between < and >.
    if (!matcher.group(1).equals("")) {
        email = matcher.group(1);
    }
}
// Display the E-Mail Address in Console Window.
System.out.println("E-Mail Address is: " + email);
Console window will display:
E-Mail Address is: JBL#gmail.com
Regular Expression Explanation:
< matches a literal opening angle bracket
(.*?) captures any characters, as few as possible (group 1)
> matches a literal closing angle bracket
You can obtain the email in the first part of the string by removing all quotation marks (replace("\"", "")), splitting on spaces (split(" ")), and taking the first element of the result ([0]):
String str = "\"\"\"JBL#gmail.com\"\" <JBL#gmail.com>\"";
String email = str.replace("\"", "").split(" ")[0];
Note that the second element would be <JBL#gmail.com>.
To split on three consecutive quotation marks, escape each one:
"fdsd\"\"\" dsd".split("\"\"\"")
So in general you have to use
"yourWords".split("\"\"\"")
Another option is to split on <, take the second element, and strip the remaining > and quotation marks:
String s = "\"\"\"JBL#gmail.com\"\" <JBL#gmail.com>\""
        .split("<")[1].replace(">", "").replace("\"", "");
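As a rough sketch of why the split in the question printed empty strings, and of the angle-bracket approach as a reusable method: the names SenderAddress, bracketed and splitOnQuote are invented for this example, and the pattern uses .+? instead of .*? so that an empty <> is simply not matched.

```java
import java.util.Arrays;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SenderAddress {
    // Lazily captures the text between the first < and the next >.
    private static final Pattern BRACKETED = Pattern.compile("<(.+?)>");

    /** Returns the text inside the first <...> pair, if there is one. */
    static Optional<String> bracketed(String sender) {
        Matcher m = BRACKETED.matcher(sender);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    /** Splits on a single quotation mark, as the code in the question does. */
    static String[] splitOnQuote(String s) {
        return s.split("\"");
    }

    public static void main(String[] args) {
        String sa1 = "\"\"\"JBL#gmail.com\"\" <JBL#gmail.com>\"";
        System.out.println(bracketed(sa1).orElse("(no address found)"));
        // Each leading quotation mark produces an empty leading element,
        // which is why elements [0] and [1] are empty in the question.
        System.out.println(Arrays.toString(splitOnQuote("\"\"\"JBL#gmail.com\"\"")));
    }
}
```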

Java String tokens

I have a string line
String user_name = "id=123 user=aron name=aron app=application";
and I have a list that contains: {user,cuser,suser}
And I have to get the user part from the string. So I have code like this:
List<String> userName = Config.getConfig().getList(Configuration.ATT_CEF_USER_NAME);
String result = null;
for (String param : user_name.split("\\s", 0)) {
    for (String user : userName) {
        String userParam = user.concat("=.*");
        if (param.matches(userParam)) {
            result = param.split("=")[1];
        }
    }
}
But the problem is that this does not work if the user value contains spaces.
For example:
String user_name = "id=123 user=aron nicols name=aron app=application";
Here user has the value aron nicols, which contains a space. How can I write code that gets the exact user value, i.e. aron nicols?
If you want to split only on spaces that come right before a token followed by =, such as user=..., then add a look-ahead condition like
split("\\s(?=\\S*=)")
This regex will split on
\\s a whitespace character
(?=\\S*=) that is followed by zero or more (*) non-whitespace characters (\\S) ending with =. A look-ahead (?=...) is a zero-length match, so the part it matches is not consumed by the split and stays in the resulting token.
Demo:
String user_name = "id=123 user=aron nicols name=aron app=application";
for (String s : user_name.split("\\s(?=\\S*=)"))
    System.out.println(s);
output:
id=123
user=aron nicols
name=aron
app=application
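Building on the split above, here is a minimal sketch that turns the tokens into a map and then looks up the first key that is present, as the question does with its list {user,cuser,suser}. The class and method names (KeyValueLine, parse, firstValue) are hypothetical, and the list is hard-coded here instead of being read from Config.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class KeyValueLine {
    /** Splits a line into key=value tokens; values may contain spaces. */
    static Map<String, String> parse(String line) {
        Map<String, String> fields = new LinkedHashMap<>();
        // Split only on whitespace that is followed by "key=".
        for (String token : line.split("\\s(?=\\S*=)")) {
            String[] kv = token.split("=", 2); // limit 2 keeps any '=' inside the value
            if (kv.length == 2) {
                fields.put(kv[0], kv[1]);
            }
        }
        return fields;
    }

    /** Returns the value of the first key from keys that is present, or null. */
    static String firstValue(Map<String, String> fields, List<String> keys) {
        for (String key : keys) {
            String value = fields.get(key);
            if (value != null) {
                return value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Map<String, String> fields =
                parse("id=123 user=aron nicols name=aron app=application");
        System.out.println(firstValue(fields, Arrays.asList("user", "cuser", "suser")));
    }
}
```

Looking keys up in a map also avoids building a regex per candidate key, as user.concat("=.*") does in the question.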
From your comment on the other answer, it seems that an = escaped with \ should not be treated as the separator between key and value but as part of the value. In that case you can add a negative look-behind to check that there is no \ before the =: placing (?<!\\\\) right before = requires that = is not preceded by \.
BTW, a regex that matches \ must be written as \\, and in a Java string literal each \ must itself be escaped, which is why we end up with \\\\.
So you can use
split("\\s(?=\\S*(?<!\\\\)=)")
Demo:
String user_name = "user=Dist\\=Name1, xyz src=activedirectorydomain ip=10.1.77.24";
for (String s : user_name.split("\\s(?=\\S*(?<!\\\\)=)"))
    System.out.println(s);
output:
user=Dist\=Name1, xyz
src=activedirectorydomain
ip=10.1.77.24
Do it like this:
First split input string using this regex:
" +(?=\\w+(?<!\\\\)=)"
This will give you 4 name=value tokens like this:
id=123
user=aron nicols
name=aron
app=application
Now you can just split on = to get your name and value parts.
CODE FISH, this simple regex captures the user in Group 1: user=\s*(.*?)\s+name=
It will capture "Aron", "Aron Nichols", "Aron Nichols The Benevolent", and so on.
It relies on the knowledge that name= always follows user=.
However, if you're not sure that the token following user is name, you can use this:
user=\s*(.*?)(?=$|\s+\w+=)
Here is how to use the second expression (for the first, just change the string in Pattern.compile):
String ResultString = null;
try {
    Pattern regex = Pattern.compile("user=\\s*(.*?)(?=$|\\s+\\w+=)", Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);
    // subjectString is the input line, e.g. "id=123 user=aron nicols name=aron app=application"
    Matcher regexMatcher = regex.matcher(subjectString);
    if (regexMatcher.find()) {
        ResultString = regexMatcher.group(1);
    }
} catch (PatternSyntaxException ex) {
    // Syntax error in the regular expression
}

Substring between two delimiters

I have this string: "This is a URL http://www.google.com/MyDoc.pdf which should be used"
I just need to extract the URL that starts at http and ends at pdf:
http://www.google.com/MyDoc.pdf
String sLeftDelimiter = "http://";
String[] tempURL = sValueFromAddAtt.split(sLeftDelimiter);
String sRequiredURL = sLeftDelimiter + tempURL[1];
This gives me the output "http://www.google.com/MyDoc.pdf which should be used".
I need help with this.
This kind of problem is what regular expressions were made for:
Pattern findUrl = Pattern.compile("\\bhttp.*?\\.pdf\\b");
Matcher matcher = findUrl.matcher("This is a URL http://www.google.com/MyDoc.pdf which should be used");
while (matcher.find()) {
    System.out.println(matcher.group());
}
The regular expression explained:
\b before the "http" there is a word boundary (i.e. xhttp does not match)
http the string "http" (be aware that this also matches "https" and "httpsomething")
.*? any character (.) any number of times (*), but try to use the least amount of characters (?)
\.pdf the literal string ".pdf"
\b after the ".pdf" there is a word boundary (i.e. .pdfoo does not match)
If you would like to match only http and https, use this instead of http in the pattern:
https?\: - this matches the string http, then an optional "s" (indicated by the ? after the s), and then a colon.
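A minimal sketch of the stricter variant, with https?: substituted into the pattern explained above. The class and method names (PdfUrlFinder, findAll) are invented for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PdfUrlFinder {
    // Word boundary, "http" or "https", a colon, the shortest run of characters, then ".pdf".
    private static final Pattern PDF_URL = Pattern.compile("\\bhttps?:.*?\\.pdf\\b");

    /** Returns every substring of text that matches the pattern, in order. */
    static List<String> findAll(String text) {
        List<String> urls = new ArrayList<>();
        Matcher matcher = PDF_URL.matcher(text);
        while (matcher.find()) {
            urls.add(matcher.group());
        }
        return urls;
    }

    public static void main(String[] args) {
        String input = "This is a URL http://www.google.com/MyDoc.pdf which should be used";
        System.out.println(findAll(input));
    }
}
```

Because .*? also matches spaces, a line with a non-PDF URL followed later by ".pdf" would be matched across the gap; replacing .*? with \\S*? is one way to rule that out.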
Why don't you use the startsWith("http://") and endsWith(".pdf") methods of the String class?
Both methods return a boolean; if both return true, the condition succeeds, otherwise it fails. Note that they test the whole string, so they only apply once the URL has been isolated from the surrounding text.
Try this
String stringName = "This is a URL http://www.google.com/MyDoc.pdf which should be used";
stringName = stringName.substring(stringName.indexOf("http:"), stringName.indexOf("which")).trim();
You can use the power of regular expressions here.
First find the URL in the original string, then remove the other parts.
The following code shows my suggestion:
String regex = "\\b(http|ftp|file)://[-a-zA-Z0-9+&@#/%?=~_|!:,.;]*[-a-zA-Z0-9+&@#/%=~_|]";
String str = "This is a URL http://www.google.com/MyDoc.pdf which should be used";
String[] splited = str.split(regex);
for (String current_part : splited) {
    str = str.replace(current_part, "");
}
System.out.println(str);
This snippet retrieves any URL that the pattern matches from the surrounding text.
You can add custom protocols such as https to the protocol part of the regular expression above.
I hope my answer helps you ;)
public static String getStringBetweenStrings(String aString, String aPattern1, String aPattern2) {
    int start = aString.indexOf(aPattern1);
    if (start < 0) {
        return null; // left delimiter not found
    }
    int pos1 = start + aPattern1.length();
    // Look for the right delimiter only after the left one.
    int pos2 = aString.indexOf(aPattern2, pos1);
    if (pos2 < 0) {
        return null; // right delimiter not found
    }
    return aString.substring(pos1, pos2);
}
You can use String.replaceAll with a capturing group and back reference for a very concise solution:
String input = "This is a URL http://www.google.com/MyDoc.pdf which should be used";
System.out.println(input.replaceAll(".*(http.*?\\.pdf).*", "$1"));
Here's a breakdown for the regex: https://regexr.com/3qmus

Making a song name out of a URL

I have a URL and I want it to look like this:
Action Manatee - Action
http://xxxxxx.com/songs2/Music%20Promotion/Stream/Action%20Manatee%20-%20Action.mp3
What is the syntax for trimming everything up to and including "Stream/" and replacing each %20 with a space? I also want to trim the .mp3.
Hmm, for that particular example, I would split the string on the '/' character, then trim the text that follows the final '.' character. Finally, replace "%20" with " ". That should leave you with the string you want.
Tested
String initial = "http://xxxxxx.com/songs2/Music%20Promotion/Stream/Action%20Manatee%20-%20Action.mp3";
String[] split = initial.split("/");
String output = split[split.length-1];
int length = output.lastIndexOf('.');
output = output.substring(0, length);
output = output.replace("%20", " ");
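If the URL may contain escapes other than %20, one alternative is to let java.net.URI decode the path instead of replacing %20 by hand. This is only a sketch under that assumption; the class and method names (SongName, fromUrl) are made up for illustration.

```java
import java.net.URI;

class SongName {
    /** Returns the last path segment of url, percent-decoded and without its extension. */
    static String fromUrl(String url) {
        // getPath() returns the path with percent-escapes such as %20 already decoded.
        String path = URI.create(url).getPath();
        String file = path.substring(path.lastIndexOf('/') + 1);
        int dot = file.lastIndexOf('.');
        // Strip the extension only if there is one after at least one character.
        return dot > 0 ? file.substring(0, dot) : file;
    }

    public static void main(String[] args) {
        String initial = "http://xxxxxx.com/songs2/Music%20Promotion/Stream/Action%20Manatee%20-%20Action.mp3";
        System.out.println(fromUrl(initial));
    }
}
```

Note that URI.create throws IllegalArgumentException for strings that are not valid URIs, so untrusted input would need a try/catch around the call.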
String[] urlParts = url.split("/"); // url holds the URL string
String urlLast = urlParts[urlParts.length - 1];
String nameDotMp3 = urlLast.replace("%20", " ");
String name = nameDotMp3.substring(0, nameDotMp3.length() - 4); // drop ".mp3"
You could use the split() and replace() methods to accomplish this. Here are two ways:
Split your string apart on the forward slashes:
String yourUrl = "http://xxxxxx.com/songs2/Music%20Promotion/Stream/Action%20Manatee%20-%20Action.mp3";
// Breaks your URL into sections on slashes
String[] sections = yourUrl.split("/");
// Grabs the last section after the slashes, and replaces the %20 with spaces
String newString = sections[sections.length - 1].replace("%20", " ");
Split your string at the Stream/ section (only use this if you can guarantee it will be in that form):
String yourUrl = "http://xxxxxx.com/songs2/Music%20Promotion/Stream/Action%20Manatee%20-%20Action.mp3";
// This will get everything after Stream/ (your song name)
String newString = yourUrl.split("Stream/")[1];
// Replaces your %20s with spaces
newString = newString.replace("%20", " ");
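URL songURL = new URL("yourpath/filename"); // must be a full URL including the protocol, otherwise MalformedURLException is thrown
String filename = songURL.getFile(); // the path (and query, if any) of the URL; take the part after the last '/' to get the file name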
