Extracting a pattern from String - java

I have a random string from which I need to match a certain pattern and parse it out.
My string:
{"sid":"zw9cmv1pzybexi","parentId":null,"time":1373271966311,"color":"#e94d57","userId":"255863","st":"comment","type":"section","cType":"parent"},{},null,null,null,null,{"sid":"zwldv1lx4f7ovx","parentId":"zw9cmv1pzybexi","time":1373347545798,"color":"#774697","userId":"5216907","st":"comment","type":"section","cType":"child"},{},null,null,null,null,null,{"sid":"zw76w68c91mhbs","parentId":"zw9cmv1pzybexi","time":1373356224065,"color":"#774697","userId":"5216907","st":"comment","type":"section","cType":"child"},
From the above, I want to parse out (using regex) all the values of the userId attribute. Can anyone help me with this? It is a random string and not JSON, so I am looking for a regex solution.

Is that a random string? It looks like JSON to me, and if it is, I would recommend a JSON parser in preference to a regexp. The right thing to do when faced with a particular language/grammar is to use the corresponding parser, rather than a (potentially) fragile regexp.

To get the user Ids, you can use this pattern:
String input = "{\"sid\":\"zw9cmv1pzybexi\",\"parentId\":null,\"time\":1373271966311,\"color\":\"#e94d57\",\"userId\":\"255863\",\"st\":\"comment\",\"type\":\"section\",\"cType\":\"parent\"},{},null,null,null,null,{\"sid\":\"zwldv1lx4f7ovx\",\"parentId\":\"zw9cmv1pzybexi\",\"time\":1373347545798,\"color\":\"#774697\",\"userId\":\"5216907\",\"st\":\"comment\",\"type\":\"section\",\"cType\":\"child\"},{},null,null,null,null,null,{\"sid\":\"zw76w68c91mhbs\",\"parentId\":\"zw9cmv1pzybexi\",\"time\":1373356224065,\"color\":\"#774697\",\"userId\":\"5216907\",\"st\":\"comment\",\"type\":\"section\",\"cType\":\"child\"},";
Pattern p = Pattern.compile("\"userId\":\"(.*?)\"");
Matcher m = p.matcher(input);
while (m.find()) {
System.out.println(m.group(1));
}
which outputs:
255863
5216907
5216907
If you want the full string "userId":"xxxx", you can use m.group(); instead of m.group(1);.
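As a minimal sketch of the same idea, the loop can be wrapped in a helper that collects the IDs into a list. The class and method names (UserIdExtractor, extractUserIds) are hypothetical, and the sketch swaps (.*?) for [^"]*, which is an assumption that the value never contains an escaped quote.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UserIdExtractor {
    // Hypothetical helper, not part of the answer above.
    // [^"]* stops at the closing quote, so the match cannot run past the value.
    private static final Pattern USER_ID = Pattern.compile("\"userId\":\"([^\"]*)\"");

    static List<String> extractUserIds(String input) {
        List<String> ids = new ArrayList<String>();
        Matcher m = USER_ID.matcher(input);
        while (m.find()) {
            // group(1) is the value only; group() would be the whole "userId":"..." match
            ids.add(m.group(1));
        }
        return ids;
    }

    public static void main(String[] args) {
        // Shortened sample input in the same shape as the question's string
        String input = "{\"sid\":\"a\",\"userId\":\"255863\"},{},null,{\"sid\":\"b\",\"userId\":\"5216907\"},";
        System.out.println(extractUserIds(input)); // prints [255863, 5216907]
    }
}
```

Compiling the pattern once as a constant avoids recompiling it on every call.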

Use a JSON parser instead of regex; your code will be much more readable and maintainable:
http://json.org/java/
https://code.google.com/p/json-simple/

As others have already told you, it looks like a JSON string, but if you really want to parse this string on your own, you could use this piece of code:
final Pattern pattern = Pattern.compile("\"userId\":\"(\\d+)\"");
final Matcher matcher = pattern.matcher(line);
while (matcher.find()) {
System.out.println(matcher.group(1));
}
The matcher will match every "userId":"12345" pattern. matcher.group(1) will return each userId, 12345 in this case (matcher.group() without a parameter returns the entire match, i.e. "userId":"12345").

Here's the regex code you're asking for:
//assign subject
String subject = "{\"sid\":\"zw9cmv1pzybexi\",\"parentId\":null,\"time\":1373271966311,\"color\":\"#e94d57\",\"userId\":\"255863\",\"st\":\"comment\",\"type\":\"section\",\"cType\":\"parent\"},{},null,null,null,null,{\"sid\":\"zwldv1lx4f7ovx\",\"parentId\":\"zw9cmv1pzybexi\",\"time\":1373347545798,\"color\":\"#774697\",\"userId\":\"5216907\",\"st\":\"comment\",\"type\":\"section\",\"cType\":\"child\"},{},null,null,null,null,null,{\"sid\":\"zw76w68c91mhbs\",\"parentId\":\"zw9cmv1pzybexi\",\"time\":1373356224065,\"color\":\"#774697\",\"userId\":\"5216907\",\"st\":\"comment\",\"type\":\"section\",\"cType\":\"child\"},";
//specify pattern and matcher
Pattern pat = Pattern.compile( "userId\":\"(\\d+)", Pattern.CASE_INSENSITIVE|Pattern.DOTALL );
Matcher mat = pat.matcher( subject );
//browse all
while ( mat.find() )
{
System.out.println( "result [" + mat.group( 1 ) + "]" );
}
But of course I'd suggest solving this with a JSON parser such as
http://json.org/java/

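It's JSON format, so you should use a JSON parser (org.json here). Note that the fragment as posted is not a valid array on its own: it has to be wrapped in [ ] with the trailing comma removed first.
JSONArray array = new JSONArray(yourString); // yourString must be a valid JSON array
for (int i = 0; i < array.length(); i++) {
JSONObject jo = array.optJSONObject(i); // null for entries that are not objects, such as null
if (jo != null && jo.has("userId")) { // skip null entries and empty {} objects
String userId = jo.getString("userId");
}
}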
EDIT : Regex pattern
"userId"[ :]+((?=\[)\[[^]]*\]|(?=\{)\{[^\}]*\}|\"[^"]*\")
Result :
"userId" : "Some user ID (numeric or letters)"

Related

Java: get substring from an string without lang3 utils and StringUtils

I need to get an integer from a string using Java 7.
String is something like:
"\"Transformed\": any-number,
I need to get any-number substring after first match of \"Transformed\": string.
As advised in the comments, if it is JSON it is more appropriate to use a JSON parser. Otherwise you can use a regex, which might look similar to this:
String line = "\"Transformed\": 10";
String pattern = "Transformed.:.(\\d+)";
Pattern r = Pattern.compile(pattern);
Matcher m = r.matcher(line);
if (m.find()) {
System.out.println("Found value: " + m.group(0) ); // group(0) is the entire match
System.out.println("Found value: " + m.group(1) ); // group(1) is the captured number
}else {
System.out.println("NO MATCH");
}
Please note that this is the simplest possible case, and you will probably have to adapt the regex to the actual string format.
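A slightly stricter variant is sketched below. It matches the quotes and colon literally instead of using ".", tolerates optional whitespace, and converts the result to an Integer. The names TransformedParser and parseTransformed are hypothetical, and allowing a leading minus sign is an assumption about the data. The code avoids Java 8 features so it should compile on Java 7.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TransformedParser {
    // \s* tolerates optional whitespace around the colon;
    // -? allows a negative number (an assumption about the input)
    private static final Pattern TRANSFORMED =
            Pattern.compile("\"Transformed\"\\s*:\\s*(-?\\d+)");

    // Returns the number after the first "Transformed": occurrence, or null if there is none
    static Integer parseTransformed(String line) {
        Matcher m = TRANSFORMED.matcher(line);
        if (!m.find()) {
            return null;
        }
        // Integer.valueOf throws NumberFormatException if the digits do not fit in an int
        return Integer.valueOf(m.group(1));
    }

    public static void main(String[] args) {
        System.out.println(parseTransformed("\"Transformed\": 10,")); // prints 10
        System.out.println(parseTransformed("no such key"));          // prints null
    }
}
```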

How to extract id from url ? Google sheet

I have the following URLs.
https://docs.google.com/spreadsheets/d/1mrsetjgfZI2BIypz7SGHMOfHGv6kTKTzY0xOM5c6TXY/edit#gid=1842172258
https://docs.google.com/a/example.com/spreadsheets/d/1mrsetjgfZI2BIypz7SGHMOfHGv6PTKTzY0xOM5c6TXY/edit#gid=1842172258
https://docs.google.com/spreadsheets/d/1mrsetjgfZI2BIypz7SGHMOfHGv6kTKTzY0xOM5c6TXY
For each URL, I need to extract the sheet ID (for example 1mrsetjgfZI2BIypz7SGHMOfHGv6PTKTzY0xOM5c6TXY) into a Java String.
I am thinking of using split, but it doesn't work for all test cases:
String string = "https://docs.google.com/spreadsheets/d/1mrsetjgfZI2BIypz7SGHMOfHGv6kTKTzY0xOM5c6TXY/edit#gid=1842172258";
String[] parts = string.split("/");
String res = parts[parts.length-2];
Log.d("hello res",res );
How can I do that?
You can use the regex \/d\/(.*?)(\/|$) to solve your problem. If you look closely, the ID sits between d/ and either the next / or the end of the line, so you can capture everything in between:
String[] urls = new String[]{
"https://docs.google.com/spreadsheets/d/1mrsetjgfZI2BIypz7SGHMOfHGv6kTKTzY0xOM5c6TXY/edit#gid=1842172258",
"https://docs.google.com/a/example.com/spreadsheets/d/1mrsetjgfZI2BIypz7SGHMOfHGv6PTKTzY0xOM5c6TXY/edit#gid=1842172258",
"https://docs.google.com/spreadsheets/d/1mrsetjgfZI2BIypz7SGHMOfHGv6kTKTzY0xOM5c6TXY"
};
String regex = "\\/d\\/(.*?)(\\/|$)";
Pattern pattern = Pattern.compile(regex);
for (String url : urls) {
Matcher matcher = pattern.matcher(url);
while (matcher.find()) {
System.out.println(matcher.group(1));
}
}
Outputs
1mrsetjgfZI2BIypz7SGHMOfHGv6kTKTzY0xOM5c6TXY
1mrsetjgfZI2BIypz7SGHMOfHGv6PTKTzY0xOM5c6TXY
1mrsetjgfZI2BIypz7SGHMOfHGv6kTKTzY0xOM5c6TXY
It looks like the ID you are looking for always follows "/spreadsheets/d/". If that is the case, you can update your code like this:
String string = "https://docs.google.com/spreadsheets/d/1mrsetjgfZI2BIypz7SGHMOfHGv6kTKTzY0xOM5c6TXY/edit#gid=1842172258";
String[] parts = string.split("spreadsheets/d/");
String result;
if(parts[1].contains("/")){
String[] parts2 = parts[1].split("/");
result = parts2[0];
}
else{
result=parts[1];
}
System.out.println("hello "+ result);
Using regex
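Pattern pattern = Pattern.compile("(?<=\\/d\\/)[^\\/]*");
Matcher matcher = pattern.matcher(url);
if (matcher.find()) { // find() must succeed before group() can be called
System.out.println(matcher.group()); // the pattern has no capturing group, so use group() rather than group(1)
}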
Using Java
String result = url.substring(url.indexOf("/d/") + 3);
int slash = result.indexOf("/");
result = slash == -1 ? result
: result.substring(0, slash);
System.out.println(result);
Google uses fixed-length IDs. In your case they are 44 characters long and consist of alphanumeric characters, - and _, so you can use this regex (shown in Python):
regex = r"[\w-]{44}"
match = re.search(regex, url)
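Putting the observations from the answers above together, a Java helper might look like the sketch below. It assumes the ID always follows "/spreadsheets/d/" and contains only letters, digits, - and _. It deliberately does not hard-code a length of 44, since that is only what was observed for these URLs. The names SheetIdExtractor and extractSheetId are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SheetIdExtractor {
    // Assumption: the ID follows "/spreadsheets/d/" and uses only [A-Za-z0-9_-]
    private static final Pattern SHEET_ID = Pattern.compile("/spreadsheets/d/([\\w-]+)");

    // Returns the sheet ID, or null if the URL does not contain one
    static String extractSheetId(String url) {
        Matcher m = SHEET_ID.matcher(url);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String[] urls = {
            "https://docs.google.com/spreadsheets/d/1mrsetjgfZI2BIypz7SGHMOfHGv6kTKTzY0xOM5c6TXY/edit#gid=1842172258",
            "https://docs.google.com/a/example.com/spreadsheets/d/1mrsetjgfZI2BIypz7SGHMOfHGv6PTKTzY0xOM5c6TXY/edit#gid=1842172258",
            "https://docs.google.com/spreadsheets/d/1mrsetjgfZI2BIypz7SGHMOfHGv6kTKTzY0xOM5c6TXY"
        };
        for (String url : urls) {
            System.out.println(extractSheetId(url));
        }
    }
}
```

Because the character class excludes /, the match stops at the next path segment or at the end of the URL, so no separate end-of-line case is needed.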

What is the regex for date type

I have a string in which I want to find all date values and replace them with a specific string.
My code looks like:
String mydata = "{[... \"date\":\"2016-03-16T12:38:28.390Z\"]},{[ ... \"date\":\"2016-03-16T12:38:28.390Z\" ...]}";
Pattern pattern = Pattern.compile("");
Matcher matcher = pattern.matcher(mydata);
while(matcher.find()){
mydata = mydata.replace(matcher.group(), matcher.group().substring(0, 10));
}
System.out.println(mydata);
What regex string should I pass to Pattern.compile("")?
My output should look like:
{[... "date":"2016-03-16"]},{[ ... "date":"2016-03-16" ...]}
import org.json.*;
JSONObject obj = new JSONObject(mydata); // mydata must hold valid JSON; the elided sample above would not parse as-is
The rest of the code depends on your json-structure. Have a look at:
How to parse JSON in Java
or
Java Api Link JsonObject
I am not sure whether T and Z are always present in the input. If they are, the regex below will work. If they can vary, replace T and Z with [A-Z] in the regex.
I made one more change and replaced
mydata = mydata.replace(matcher.group(), matcher.group().substring(0, 10));
with
mydata = matcher.replaceAll("");
This produces the required output.
String mydata = "{[\"date\":\"2016-03-16T12:38:28.390Z\"]},{[\"date\":\"2016-03-16T12:38:28.390Z\"]}";
Pattern pattern = Pattern.compile("T\\d{2}\\:\\d{2}\\:\\d{2}\\.\\d{3}Z");
Matcher matcher = pattern.matcher(mydata);
mydata = matcher.replaceAll(""); // replaceAll already handles every match, so no find() loop is needed
System.out.println(mydata);
If you want a regex based solution this seems to work for your example
Pattern pattern = Pattern.compile("(\\{\\[.*?\"date\":\"\\d{4}\\-\\d{2}\\-\\d{2}).*?(\"\\]\\})");
Matcher matcher = pattern.matcher(mydata);
while(matcher.find()) {
System.out.println(matcher.group(1) + matcher.group(2));
}
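Another way to get the exact output requested in the question is to capture the date part and keep only that in the replacement, as sketched below. The names DateTrimmer and trimToDate are hypothetical, and the pattern assumes timestamps of the form yyyy-MM-ddTHH:mm:ss with optional fractional seconds and a trailing Z.

```java
import java.util.regex.Pattern;

class DateTrimmer {
    // Group 1 captures the yyyy-MM-dd part; the time portion (T...Z) is matched but not kept.
    // Assumption: fractional seconds are optional and the timestamp always ends in Z.
    private static final Pattern TIMESTAMP =
            Pattern.compile("(\\d{4}-\\d{2}-\\d{2})T\\d{2}:\\d{2}:\\d{2}(?:\\.\\d+)?Z");

    static String trimToDate(String data) {
        // "$1" in the replacement refers to capture group 1
        return TIMESTAMP.matcher(data).replaceAll("$1");
    }

    public static void main(String[] args) {
        String mydata = "{[\"date\":\"2016-03-16T12:38:28.390Z\"]},{[\"date\":\"2016-03-16T12:38:28.390Z\"]}";
        System.out.println(trimToDate(mydata));
        // prints {["date":"2016-03-16"]},{["date":"2016-03-16"]}
    }
}
```

Anchoring the time portion to a preceding date makes it less likely that an unrelated T...Z sequence elsewhere in the string gets removed.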

Extract json data from given string

I have a string like this:
a.b.c.d.e =
{"altImages":2,"available":1,"availableColorCount":3};
I only need to extract:
{"altImages":2,"available":1,"availableColorCount":3}
What regex would extract that part from the given string?
My attempt:
(?smi)a.b.c.d\\(.*\"e\"=(.*?)\\}\\);.*
But it doesn't work.
Try this (the literal braces are escaped, because Java rejects an unescaped { as an illegal repetition):
.+\s*=\s*(\{(?:.+:.+,?)+\})(?=;)
You can use something like:
.*?\n(.*);
Here is the version with named groups:
String text = "a.b.c.d.e = \n{\"altImages\":2,\"available\":1,\"availableColorCount\":3};";
Pattern pattern = Pattern.compile(".*?\n(?<JSON>.*);");
Matcher matcher = pattern.matcher(text);
if (matcher.matches()) {
System.out.println(matcher.group("JSON"));
}
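If the line break after the = sign is not guaranteed, a pattern keyed on the = and the trailing ; may be more robust. The sketch below is one such variant. The names JsonFragmentExtractor and extractJson are hypothetical, and it assumes the object is the last thing before the final semicolon.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class JsonFragmentExtractor {
    // DOTALL lets '.' cross line breaks; the greedy .* runs to the last '}'
    // that is followed by optional whitespace and ';'
    private static final Pattern ASSIGNMENT =
            Pattern.compile("=\\s*(\\{.*\\})\\s*;", Pattern.DOTALL);

    // Returns the {...} part of "name = {...};", or null if the text has no such part
    static String extractJson(String text) {
        Matcher m = ASSIGNMENT.matcher(text);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String text = "a.b.c.d.e = \n{\"altImages\":2,\"available\":1,\"availableColorCount\":3};";
        System.out.println(extractJson(text));
        // prints {"altImages":2,"available":1,"availableColorCount":3}
    }
}
```

Using find() instead of matches() means the text before the = does not have to be described by the pattern at all.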

Using Regex to get jsessionid

I have to get the jsessionid value from a URL, not the "jsessionid" string itself. Is it possible to match something and exclude it from the result?
https://esgf-data.dkrz.de/esgf-idp/idp/login.htm;jsessionid=436100313FAFBBB9B4DC8BA3C2EC267B
Result = 436100313FAFBBB9B4DC8BA3C2EC267B
Code added from comment:
Pattern pattern = Pattern.compile("/jsessionid=([a-z0-9]+)/i");
Matcher matcher=pattern.matcher(connection.getURL().toExternalForm());
/=([A-Z0-9]+)/ will get all uppercase and numbers after the equals sign = and move them to backreference #1
$subject = 'https://esgf-data.dkrz.de/esgf-idp/idp/login.htm;jsessionid=436100313FAFBBB9B4DC8BA3C2EC267B';
if (preg_match('/=([A-Z0-9]+)/', $subject, $regs)) {
$result = $regs[1];
} else {
$result = "";
}
Try this out :
String data = "https://esgf-data.dkrz.de/esgf-idp/idp/login.htm;jsessionid=436100313FAFBBB9B4DC8BA3C2EC267B";
Pattern pattern = Pattern.compile("jsessionid=(\\w+)");
Matcher matcher = pattern.matcher(data);
while (matcher.find()) {
System.out.println("Result is : " + matcher.group(1));
}
Just as a caveat: if you have a routing identifier on the end of your JSESSIONID, the previous regular expressions might fail. I found that this one will pick up the routing expression as well:
^([A-F0-9]+)((\.[A-Za-z0-9]+)*)$
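The anchored pattern above applies to the bare session ID value. To read both parts straight from a URL, the two ideas can be combined as in this sketch. The names JSessionIdParser and parse are hypothetical, and the ".worker1" suffix used in main is an invented example of a routing identifier.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class JSessionIdParser {
    // Group 1: the hexadecimal session ID.
    // Group 2: zero or more ".route" suffixes (empty string when absent).
    private static final Pattern JSESSIONID = Pattern.compile(
            "jsessionid=([A-F0-9]+)((?:\\.[A-Za-z0-9]+)*)", Pattern.CASE_INSENSITIVE);

    // Returns {sessionId, routeSuffix}, or null if the URL carries no jsessionid
    static String[] parse(String url) {
        Matcher m = JSESSIONID.matcher(url);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        String base = "https://esgf-data.dkrz.de/esgf-idp/idp/login.htm;jsessionid=436100313FAFBBB9B4DC8BA3C2EC267B";
        String[] plain = parse(base);
        System.out.println(plain[0] + " / '" + plain[1] + "'");
        String[] routed = parse(base + ".worker1"); // hypothetical routing suffix
        System.out.println(routed[0] + " / '" + routed[1] + "'");
    }
}
```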
