Using Regex to get jsessionid - java

I have to get the jsessionid value from a URL, not the literal string "jsessionid". Is it possible to match something and exclude it from the result?
https://esgf-data.dkrz.de/esgf-idp/idp/login.htm;jsessionid=436100313FAFBBB9B4DC8BA3C2EC267B
Result = 436100313FAFBBB9B4DC8BA3C2EC267B
Code added from comment:
Pattern pattern = Pattern.compile("/jsessionid=([a-z0-9]+)/i");
Matcher matcher=pattern.matcher(connection.getURL().toExternalForm());

/=([A-Z0-9]+)/ matches all uppercase letters and digits after the equals sign = and captures them in backreference #1:
$subject = 'https://esgf-data.dkrz.de/esgf-idp/idp/login.htm;jsessionid=436100313FAFBBB9B4DC8BA3C2EC267B';
if (preg_match('/=([A-Z0-9]+)/', $subject, $regs)) {
$result = $regs[1];
} else {
$result = "";
}

Try this out :
String data = "https://esgf-data.dkrz.de/esgf-idp/idp/login.htm;jsessionid=436100313FAFBBB9B4DC8BA3C2EC267B";
Pattern pattern = Pattern.compile("jsessionid=(\\w+)");
Matcher matcher = pattern.matcher(data);
while (matcher.find()) {
System.out.println("Result is : " + matcher.group(1));
}
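The pattern in the question wraps the expression in /.../i, which Java treats as literal characters rather than delimiters and a flag. A minimal sketch of the same idea in Java, assuming the id consists only of letters and digits (the class and method names here are illustrative, not from the original post):

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class JsessionIdExtractor {
    // Java patterns take no /.../ delimiters; flags are passed to compile() instead.
    private static final Pattern JSESSIONID =
            Pattern.compile("jsessionid=([a-z0-9]+)", Pattern.CASE_INSENSITIVE);

    /** Returns the value captured in group 1, or empty if the URL has no jsessionid. */
    static Optional<String> extract(String url) {
        Matcher matcher = JSESSIONID.matcher(url);
        return matcher.find() ? Optional.of(matcher.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        String url = "https://esgf-data.dkrz.de/esgf-idp/idp/login.htm;jsessionid=436100313FAFBBB9B4DC8BA3C2EC267B";
        System.out.println(extract(url).orElse("(no jsessionid)"));
    }
}
```

Because the literal "jsessionid=" sits outside the parentheses, it is matched but not part of group 1, which is what "match something and exclude it" amounts to here.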

Just as a caveat: if you have a routing identifier at the end of your JSESSIONID, the previous regular expressions might fail. I found that this one picks up the routing expression as well (Regex101).
^([A-F0-9]+)((\.[A-Za-z0-9]+)*)$
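As a rough sketch of how that pattern could be applied in Java, assuming it is run against the session id value on its own (not the whole URL) and that a route looks like ".node1" (a made-up example suffix):

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SessionIdParts {
    // Group 1: the hex id. Group 2: zero or more ".route" suffixes, possibly empty.
    private static final Pattern ID_WITH_ROUTE =
            Pattern.compile("^([A-F0-9]+)((\\.[A-Za-z0-9]+)*)$");

    /** Returns {id, route}; route is "" when there is no suffix. Empty if the value does not match. */
    static Optional<String[]> split(String value) {
        Matcher m = ID_WITH_ROUTE.matcher(value);
        if (!m.matches()) {
            return Optional.empty();
        }
        return Optional.of(new String[] { m.group(1), m.group(2) });
    }

    public static void main(String[] args) {
        // ".node1" is a hypothetical routing suffix used only for illustration.
        String[] parts = split("436100313FAFBBB9B4DC8BA3C2EC267B.node1").get();
        System.out.println("id=" + parts[0] + " route=" + parts[1]);
    }
}
```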

Related

Java: get substring from an string without lang3 utils and StringUtils

I need to get an integer from a string using Java 7.
String is something like:
"\"Transformed\": any-number,
I need to get the any-number substring after the first match of the \"Transformed\": string.
As it has been advised in the comments, if it is a JSON it is more appropriate to use JSON parser. Otherwise you can use regex which might look similar to this:
String line = "\"Transformed\": 10";
String pattern = "Transformed.:.(\\d+)";
Pattern r = Pattern.compile(pattern);
Matcher m = r.matcher(line);
if (m.find()) {
System.out.println("Full match: " + m.group(0) );
System.out.println("Found value: " + m.group(1) );
}else {
System.out.println("NO MATCH");
}
You can find more info about current regex here. Please note that this is the simplest case possible and you will probably have to update it, depending on the string format.
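For the stated goal of getting an integer, a small Java 7 compatible sketch could look like the following. It assumes the key is always the quoted "Transformed" followed by a colon and plain digits; the class name is illustrative.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TransformedValue {
    // Quoted key, optional whitespace around the colon, then the digits to capture.
    private static final Pattern TRANSFORMED =
            Pattern.compile("\"Transformed\"\\s*:\\s*(\\d+)");

    /** Returns the number after the first "Transformed": occurrence, or null if there is none. */
    static Integer parse(String line) {
        Matcher m = TRANSFORMED.matcher(line);
        if (!m.find()) {
            return null;
        }
        // Digit runs too long for an int would throw NumberFormatException here.
        return Integer.valueOf(m.group(1));
    }

    public static void main(String[] args) {
        System.out.println(parse("\"Transformed\": 10,"));
    }
}
```

Only the first find() result is used, so later occurrences of the key in the same string are ignored.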

How to extract id from url ? Google sheet

I have the following URLs.
https://docs.google.com/spreadsheets/d/1mrsetjgfZI2BIypz7SGHMOfHGv6kTKTzY0xOM5c6TXY/edit#gid=1842172258
https://docs.google.com/a/example.com/spreadsheets/d/1mrsetjgfZI2BIypz7SGHMOfHGv6PTKTzY0xOM5c6TXY/edit#gid=1842172258
https://docs.google.com/spreadsheets/d/1mrsetjgfZI2BIypz7SGHMOfHGv6kTKTzY0xOM5c6TXY
For each URL, I need to extract the sheet id, e.g. 1mrsetjgfZI2BIypz7SGHMOfHGv6PTKTzY0xOM5c6TXY, into a Java String.
I am thinking of using split, but it doesn't work for all test cases:
String string = "https://docs.google.com/spreadsheets/d/1mrsetjgfZI2BIypz7SGHMOfHGv6kTKTzY0xOM5c6TXY/edit#gid=1842172258";
String[] parts = string.split("/");
String res = parts[parts.length-2];
Log.d("hello res",res );
How can I do that?
You can use the regex \/d\/(.*?)(\/|$) (regex demo) to solve your problem. If you look closely, the ID always sits between d/ and either the next / or the end of the line, so you can capture everything in between. See this code demo:
String[] urls = new String[]{
"https://docs.google.com/spreadsheets/d/1mrsetjgfZI2BIypz7SGHMOfHGv6kTKTzY0xOM5c6TXY/edit#gid=1842172258",
"https://docs.google.com/a/example.com/spreadsheets/d/1mrsetjgfZI2BIypz7SGHMOfHGv6PTKTzY0xOM5c6TXY/edit#gid=1842172258",
"https://docs.google.com/spreadsheets/d/1mrsetjgfZI2BIypz7SGHMOfHGv6kTKTzY0xOM5c6TXY"
};
String regex = "\\/d\\/(.*?)(\\/|$)";
Pattern pattern = Pattern.compile(regex);
for (String url : urls) {
Matcher matcher = pattern.matcher(url);
while (matcher.find()) {
System.out.println(matcher.group(1));
}
}
Outputs
1mrsetjgfZI2BIypz7SGHMOfHGv6kTKTzY0xOM5c6TXY
1mrsetjgfZI2BIypz7SGHMOfHGv6PTKTzY0xOM5c6TXY
1mrsetjgfZI2BIypz7SGHMOfHGv6kTKTzY0xOM5c6TXY
It looks like the ID you are looking for always follows "/spreadsheets/d/". If that is the case, you can update your code like this:
String string = "https://docs.google.com/spreadsheets/d/1mrsetjgfZI2BIypz7SGHMOfHGv6kTKTzY0xOM5c6TXY/edit#gid=1842172258";
String[] parts = string.split("spreadsheets/d/");
String result;
if(parts[1].contains("/")){
String[] parts2 = parts[1].split("/");
result = parts2[0];
}
else{
result=parts[1];
}
System.out.println("hello "+ result);
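Using regex
Pattern pattern = Pattern.compile("(?<=\\/d\\/)[^\\/]*");
Matcher matcher = pattern.matcher(url);
// The lookbehind pattern has no capturing group, so call find() and read the whole match.
if (matcher.find()) {
System.out.println(matcher.group());
}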
Using Java
String result = url.substring(url.indexOf("/d/") + 3);
int slash = result.indexOf("/");
result = slash == -1 ? result
: result.substring(0, slash);
System.out.println(result);
Google uses fixed-length IDs; in your case they are 44 characters long, and the characters Google uses are alphanumerics, - and _, so you can use this regex:
import re
regex = r"[\w-]{44}"
match = re.search(regex, url)
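Combining the ideas from the answers above, one possible Java sketch anchors on the "/spreadsheets/d/" prefix and captures the run of ID characters after it. It assumes the ID contains only letters, digits, - and _, as the last answer suggests, and deliberately does not hard-code a length; the class name is illustrative.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SheetIdExtractor {
    // Assumption: the ID follows "/spreadsheets/d/" and uses only letters, digits, '-' and '_'.
    private static final Pattern SHEET_ID =
            Pattern.compile("/spreadsheets/d/([\\w-]+)");

    /** Returns the captured ID, or empty if the URL does not contain the prefix. */
    static Optional<String> extract(String url) {
        Matcher matcher = SHEET_ID.matcher(url);
        return matcher.find() ? Optional.of(matcher.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        String[] urls = {
            "https://docs.google.com/spreadsheets/d/1mrsetjgfZI2BIypz7SGHMOfHGv6kTKTzY0xOM5c6TXY/edit#gid=1842172258",
            "https://docs.google.com/a/example.com/spreadsheets/d/1mrsetjgfZI2BIypz7SGHMOfHGv6PTKTzY0xOM5c6TXY/edit#gid=1842172258",
            "https://docs.google.com/spreadsheets/d/1mrsetjgfZI2BIypz7SGHMOfHGv6kTKTzY0xOM5c6TXY"
        };
        for (String url : urls) {
            System.out.println(extract(url).orElse("(no id)"));
        }
    }
}
```

The character class stops at the first / or at the end of the string, so no alternation for the two cases is needed.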

Two separate patterns and matchers (java)

I'm working on a simple bot for discord and the first pattern reading works fine and I get the results I'm looking for, but the second one doesn't seem to work and I can't figure out why.
Any help would be appreciated
public void onMessageReceived(MessageReceivedEvent event) {
if (event.getMessage().getContent().startsWith("!")) {
String output, newUrl;
String word, strippedWord;
String url = "http://jisho.org/api/v1/search/words?keyword=";
Pattern reading;
Matcher matcher;
word = event.getMessage().getContent();
strippedWord = word.replace("!", "");
newUrl = url + strippedWord;
//Output contains the raw text from jisho
output = getUrlContents(newUrl);
//Searching through the raw text to pull out the first "reading: "
reading = Pattern.compile("\"reading\":\"(.*?)\"");
matcher = reading.matcher(output);
//Searching through the raw text to pull out the first "english_definitions: "
Pattern def = Pattern.compile("\"english_definitions\":[\"(.*?)]");
Matcher matcher2 = def.matcher(output);
event.getTextChannel().sendMessage(matcher2.toString());
if (matcher.find() && matcher2.find()) {
event.getTextChannel().sendMessage("Reading: "+matcher.group(1)).queue();
event.getTextChannel().sendMessage("Definition: "+matcher2.group(1)).queue();
}
else {
event.getTextChannel().sendMessage("Word not found").queue();
}
}
}
You have to escape the [ character as \\[ (once for the Java string and once for the regex). You also forgot the closing \".
The correct pattern looks like this:
Pattern def = Pattern.compile("\"english_definitions\":\\[\"(.*?)\"]");
In the output, you might want to re-add the \" at the start and end:
event.getTextChannel().sendMessage("Definition: \""+matcher2.group(1) + "\"").queue();
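To see what the corrected pattern captures without a Discord bot around it, here is a standalone sketch. The response text is a made-up fragment shaped like the fields in the question, not real API output, and the class and helper names are illustrative.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class JishoFields {
    private static final Pattern READING =
            Pattern.compile("\"reading\":\"(.*?)\"");
    // The [ and ] are escaped so they are literals instead of a character class.
    private static final Pattern DEFINITIONS =
            Pattern.compile("\"english_definitions\":\\[\"(.*?)\"\\]");

    /** Returns group 1 of the first match, or null if the pattern is not found. */
    static String firstGroup(Pattern pattern, String text) {
        Matcher m = pattern.matcher(text);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Hypothetical response fragment for illustration only.
        String output = "{\"reading\":\"neko\",\"senses\":[{\"english_definitions\":[\"cat\",\"feline\"]}]}";
        System.out.println("Reading: " + firstGroup(READING, output));
        // With several definitions the capture spans all of them, separated by "," (quotes included).
        String defs = firstGroup(DEFINITIONS, output);
        System.out.println("Definition: " + defs.replace("\",\"", ", "));
    }
}
```

Note that the capture runs up to the closing "] of the array, so an entry with more than one definition comes back as a single string that still contains the inner quote-comma-quote separators.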

Extracting a pattern from String

I have a random string from which I need to match a certain pattern and parse it out.
My String-
{"sid":"zw9cmv1pzybexi","parentId":null,"time":1373271966311,"color":"#e94d57","userId":"255863","st":"comment","type":"section","cType":"parent"},{},null,null,null,null,{"sid":"zwldv1lx4f7ovx","parentId":"zw9cmv1pzybexi","time":1373347545798,"color":"#774697","userId":"5216907","st":"comment","type":"section","cType":"child"},{},null,null,null,null,null,{"sid":"zw76w68c91mhbs","parentId":"zw9cmv1pzybexi","time":1373356224065,"color":"#774697","userId":"5216907","st":"comment","type":"section","cType":"child"},
From the above I want to parse out (using regex) all the values of the userId attribute. Can anyone help me with how to do this? It is a random string and not JSON. Can you provide a regex solution for this?
Is that a random string ? It looks like JSON to me, and if it is I would recommend a JSON parser in preference to a regexp. The right thing to do when faced with a particular language/grammar is to use the corresponding parser, rather than a (potentially) fragile regexp.
To get the user Ids, you can use this pattern:
String input = "{\"sid\":\"zw9cmv1pzybexi\",\"parentId\":null,\"time\":1373271966311,\"color\":\"#e94d57\",\"userId\":\"255863\",\"st\":\"comment\",\"type\":\"section\",\"cType\":\"parent\"},{},null,null,null,null,{\"sid\":\"zwldv1lx4f7ovx\",\"parentId\":\"zw9cmv1pzybexi\",\"time\":1373347545798,\"color\":\"#774697\",\"userId\":\"5216907\",\"st\":\"comment\",\"type\":\"section\",\"cType\":\"child\"},{},null,null,null,null,null,{\"sid\":\"zw76w68c91mhbs\",\"parentId\":\"zw9cmv1pzybexi\",\"time\":1373356224065,\"color\":\"#774697\",\"userId\":\"5216907\",\"st\":\"comment\",\"type\":\"section\",\"cType\":\"child\"},";
Pattern p = Pattern.compile("\"userId\":\"(.*?)\"");
Matcher m = p.matcher(input);
while (m.find()) {
System.out.println(m.group(1));
}
which outputs:
255863
5216907
5216907
If you want the full string "userId":"xxxx", you can use m.group(); instead of m.group(1);.
Use JSON parser instead of using Regex, your code will be much more readable and maintainable
http://json.org/java/
https://code.google.com/p/json-simple/
As others already told you, it looks like a JSON string, but if you really want to parse this string on your own, you could use this piece of code:
final Pattern pattern = Pattern.compile("\"userId\":\"(\\d+)\"");
final Matcher matcher = pattern.matcher(line);
while (matcher.find()) {
System.out.println(matcher.group(1));
}
The matcher will match every "userId":"12345" pattern. matcher.group(1) will return every userId, 12345 in this case (matcher.group() without a parameter returns the entire match, i.e. "userId":"12345").
Here's the regex-code you're asking for ..
//assign subject
String subject = "{\"sid\":\"zw9cmv1pzybexi\",\"parentId\":null,\"time\":1373271966311,\"color\":\"#e94d57\",\"userId\":\"255863\",\"st\":\"comment\",\"type\":\"section\",\"cType\":\"parent\"},{},null,null,null,null,{\"sid\":\"zwldv1lx4f7ovx\",\"parentId\":\"zw9cmv1pzybexi\",\"time\":1373347545798,\"color\":\"#774697\",\"userId\":\"5216907\",\"st\":\"comment\",\"type\":\"section\",\"cType\":\"child\"},{},null,null,null,null,null,{\"sid\":\"zw76w68c91mhbs\",\"parentId\":\"zw9cmv1pzybexi\",\"time\":1373356224065,\"color\":\"#774697\",\"userId\":\"5216907\",\"st\":\"comment\",\"type\":\"section\",\"cType\":\"child\"},";
//specify pattern and matcher
Pattern pat = Pattern.compile( "userId\":\"(\\d+)", Pattern.CASE_INSENSITIVE|Pattern.DOTALL );
Matcher mat = pat.matcher( subject );
//browse all
while ( mat.find() )
{
System.out.println( "result [" + mat.group( 1 ) + "]" );
}
But OF COURSE I'd suggest solving this using a JSON parser like
http://json.org/java/
Greetings
Christopher
It's in JSON format, so you should use a JSON parser:
JSONArray array = new JSONArray(yourString);
for (int i=0;i<array.length();i++){
// Read each element from the array that was just parsed.
JSONObject jo = array.getJSONObject(i);
String userId = jo.getString("userId");
}
EDIT : Regex pattern
"userId"[ :]+((?=\[)\[[^]]*\]|(?=\{)\{[^\}]*\}|\"[^"]*\")
Result :
"userId" : "Some user ID (numeric or letters)"

How to use pattern in Java to fetch groups like 'sscanf' does in C?

I have a String of the form user#domain:port
I want to fetch the user, domain and port from this String.
So I created this regex:
public static final String MATCH_USER_DOMAIN_PORT = "^([0-9,a-zA-Z-.*_]+)#([a-z0-9]+[\\.-][a-z0-9]+\\.[a-z]{2,}+):(6553[0-5]|655[0-2]\\d|65[0-4]\\d{2}|6[0-4]\\d{3}|[1-5]\\d{4}|[1-9]\\d{0,3})$";
and this is my unit test method so far:
public void test____matchesUserDomainWithPort(){
String identityText = "maxim#domain.com:5555";
String user = "";
String domain = "";
String port = "";
if(identityText.matches(MATCH_USER_DOMAIN_PORT))
{
Pattern p = Pattern.compile(MATCH_USER_DOMAIN_PORT);
Matcher m = p.matcher(identityText);
user = m.group(1);
domain= m.group(2);
port= m.group(3);
}
assertEquals("maxim", user);
assertEquals("domain.com", domain);
assertEquals("5555", port);
}
I get error:
java.lang.IllegalStateException: No successful match so far
at java.util.regex.Matcher.ensureMatch(Matcher.java:607)
....
in row: user = m.group(1);
I opened http://gskinner.com/RegExr/?2v5r0
and there all seems good:
Output:
RegExp: /^([0-9,a-zA-Z-.*_]+#[a-z0-9]+([\.-][a-z0-9]+)*)+\.[a-z]{2,}+:(6553[0-5]|655[0-2]\d|65[0-4]\d{2}|6[0-4]\d{3}|[1-5]\d{4}|[1-9]\d{0,3})$/
pattern: ^([0-9,a-zA-Z-.*_]+#[a-z0-9]+([\.-][a-z0-9]+)*)+\.[a-z]{2,}+:(6553[0-5]|655[0-2]\d|65[0-4]\d{2}|6[0-4]\d{3}|[1-5]\d{4}|[1-9]\d{0,3})$
flags:
3 capturing groups:
group 1: ([0-9,a-zA-Z-.*_]+#[a-z0-9]+([\.-][a-z0-9]+)*)
group 2: ([\.-][a-z0-9]+)
group 3: (6553[0-5]|655[0-2]\d|65[0-4]\d{2}|6[0-4]\d{3}|[1-5]\d{4}|[1-9]\d{0,3})
Am I missing something?
In C I would just write: sscanf(identityText,"%[^#]#%[^:]:%511s",user,domain,port);
Of course I can split this text on # and : and get the 3 values, but I'm interested in how to do it in a more elegant way :)
Please help.
Please use
if(identityText.matches(MATCH_USER_DOMAIN_PORT)){
Pattern p = Pattern.compile(MATCH_USER_DOMAIN_PORT);
Matcher m = p.matcher(identityText);
while(m.find()){
user = m.group(1);
domain= m.group(2);
port= m.group(3);
}
}
thanks
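The reason for the exception is that String.matches() runs its own internal match, while the freshly created Matcher has not attempted anything yet, so group() has no result to read. A small sketch of that behaviour, using a simplified stand-in pattern rather than the one from the question (class and method names are illustrative):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatcherStateDemo {
    // Simplified stand-in pattern: word characters on both sides of '#'.
    private static final Pattern PAIR = Pattern.compile("(\\w+)#(\\w+)");

    /** Returns true if calling group() before any match attempt throws IllegalStateException. */
    static boolean groupBeforeMatchThrows(String input) {
        Matcher m = PAIR.matcher(input);
        try {
            m.group(1); // no matches()/find() has been called on this Matcher yet
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    /** Calls matches() on the same Matcher first, then reads group 1; null if there is no match. */
    static String userPart(String input) {
        Matcher m = PAIR.matcher(input);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(groupBeforeMatchThrows("maxim#domain"));
        System.out.println(userPart("maxim#domain"));
    }
}
```

Calling m.matches() (whole input) or m.find() (next occurrence) on the Matcher itself makes the separate identityText.matches(...) check unnecessary.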
Yes, I think your regex is wrong.
public static final String MATCH_USER_DOMAIN_PORT = "^([0-9,a-zA-Z-.*_]+#[a-z0-9]+([\\.-][a-z0-9]+)*)+\\.[a-z]{2,}+:(6553[0-5]|655[0-2]\\d|65[0-4]\\d{2}|6[0-4]\\d{3}|[1-5]\\d{4}|[1-9]\\d{0,3})$";
To break it down:
^(
[0-9,a-zA-Z-.*_]+
any number of these characters, will match "maxim"
#
will match "#"
[a-z0-9]+
any number of these characters, will match "domain"
([\\.-][a-z0-9]+)*
will match ".com" (or theoretically ".somethingelse.com", nice)
)+
will make group #1 "maxim#domain.com", I believe, but what's with the "+" ?
\\.
nothing in the input string here
[a-z]{2,}+
is this for a country code like .eu ? Again, what's with the "+" ?
:
(6553[0-5]|655[0-2]\\d|65[0-4]\\d{2}|6[0-4]\\d{3}|[1-5]\\d{4}|[1-9]\\d{0,3})
seems overly complicated - probably don't do the numeric validation with the regex
$
Take a look at Using a regular expression to validate an email address for some advice on validation of email addresses.
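Following the suggestion to keep numeric validation out of the regex, here is one possible sketch that mirrors the sscanf format from the question: everything up to #, everything up to :, then digits, with the port range checked in code. The Identity class and its field names are illustrative, and the pattern is intentionally far looser than the original about what counts as a user or domain.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class Identity {
    // Mirrors "%[^#]#%[^:]:%d": no '#' in the user, no ':' in the domain, 1 to 5 digits for the port.
    private static final Pattern USER_DOMAIN_PORT =
            Pattern.compile("^([^#]+)#([^:]+):([1-9]\\d{0,4})$");

    final String user;
    final String domain;
    final int port;

    private Identity(String user, String domain, int port) {
        this.user = user;
        this.domain = domain;
        this.port = port;
    }

    /** Returns the parsed parts, or null if the text is malformed or the port exceeds 65535. */
    static Identity parse(String text) {
        Matcher m = USER_DOMAIN_PORT.matcher(text);
        if (!m.matches()) {
            return null;
        }
        int port = Integer.parseInt(m.group(3)); // at most 5 digits, so this cannot overflow
        if (port > 65535) {
            return null;
        }
        return new Identity(m.group(1), m.group(2), port);
    }

    public static void main(String[] args) {
        Identity id = parse("maxim#domain.com:5555");
        System.out.println(id.user + " " + id.domain + " " + id.port);
    }
}
```

With three flat groups and no nesting, group numbers line up directly with user, domain and port, which avoids the numbering confusion in the original expression.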
