How do I extract '1358751074-6824' from this
http://api.discogs.com/images/R-1169056-1358751074-6824.jpeg
and it also needs to extract '13587510746824' from this
http://api.discogs.com/images/R-1169056-13587510746824.jpeg
I thought I could take the substring from the second '-' of the last path component up to the final dot, but how do I find the position of that second '-'?
Depending on the allowed variations of the string, you could do something like:
String extract = s.replaceAll(".*?-.*?-([\\d-]+).*", "$1");
.*?- skips everything up to the first hyphen
.*?- skips everything up to the second hyphen
([\\d-]+) is the part you want to keep: digits and hyphens
.* skips the rest of the string
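As a quick check of the pattern above, here is a minimal sketch that runs it against both URLs from the question. The class name ExtractAfterSecondHyphen and the method name extract are placeholders chosen for illustration, not part of the original answer.

```java
final class ExtractAfterSecondHyphen {
    // Skips everything up to the second hyphen, keeps the following run of
    // digits and hyphens, and drops the rest of the string.
    static String extract(String s) {
        return s.replaceAll(".*?-.*?-([\\d-]+).*", "$1");
    }

    public static void main(String[] args) {
        System.out.println(extract("http://api.discogs.com/images/R-1169056-1358751074-6824.jpeg"));
        System.out.println(extract("http://api.discogs.com/images/R-1169056-13587510746824.jpeg"));
    }
}
```

If the input contains fewer than two hyphens, the pattern does not match and replaceAll returns the string unchanged, so callers may want to check for that case.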
You can work out the position of the second dash without regular expressions - by finding the position of the first dash, and working from there:
int pos = str.indexOf('-', str.indexOf('-')+1);
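Putting the indexOf idea together with the "up to the final dot" part of the question might look like the sketch below. Returning an empty string when a dash or the dot is missing is an assumption made for this example; the names are hypothetical.

```java
final class SecondDashSubstring {
    // Returns the text between the second '-' and the last '.',
    // or "" if the string does not have that shape (an assumption of this sketch).
    static String afterSecondDash(String str) {
        int first = str.indexOf('-');
        if (first < 0) {
            return "";
        }
        int second = str.indexOf('-', first + 1);
        int dot = str.lastIndexOf('.');
        if (second < 0 || dot <= second) {
            return "";
        }
        return str.substring(second + 1, dot);
    }

    public static void main(String[] args) {
        System.out.println(afterSecondDash("http://api.discogs.com/images/R-1169056-1358751074-6824.jpeg"));
        System.out.println(afterSecondDash("http://api.discogs.com/images/R-1169056-13587510746824.jpeg"));
    }
}
```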
You can try something like this:
// Your original String
String str = "http://api.discogs.com/images/R-1169056-1358751074-6824.jpeg";
// identify the one-before-last-dash
int i=str.lastIndexOf("-", str.lastIndexOf("-")-1);
// Extract the value you want
String newStr = str.substring(i+1, str.lastIndexOf("."));
// Optional: keep digits only (note that this also removes the hyphen)
String strNums = newStr.replaceAll("[^0-9]+", "");
How can I remove the whitespaces before and after a specific char? I want also to remove the whitespaces only around the first occurrence of the specific char. In the examples below, I want to remove the whitespaces before and after the first occurrence of =.
For example for those strings:
something = is equal to = something
something = is equal to = something
something =is equal to = something
I need to have this result:
something=is equal to = something
Is there any regular expression that I can use or should I check for the index of the first occurrence of the char =?
private String removeLeadingAndTrailingWhitespaceOfFirstEqualsSign(String s1) {
return s1.replaceFirst("\\s*=\\s*", "=");
}
Notice this matches all whitespace including tabs and new lines, not just space.
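A small sketch of how the one-liner above behaves on inputs shaped like the ones in the question. The sample strings are assumptions, since the exact spacing of the original examples is not preserved here; the class and method names are placeholders.

```java
final class TrimAroundFirstEquals {
    // replaceFirst only touches the first match, so later " = " sequences are left alone.
    static String trimAroundFirstEquals(String s) {
        return s.replaceFirst("\\s*=\\s*", "=");
    }

    public static void main(String[] args) {
        System.out.println(trimAroundFirstEquals("something = is equal to = something"));
        System.out.println(trimAroundFirstEquals("something     =     is equal to = something"));
        System.out.println(trimAroundFirstEquals("something =is equal to = something"));
    }
}
```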
You can use the regular expression \w*\s*=\s* to get all matches. From there call trim on the first index in the array of matches.
Yes - you can create a regex that matches optional whitespace, followed by your pattern, followed by optional whitespace, and then replace the first instance.
public static String replaceFirst(final String toMatch, final String forIP) {
// string you want to match before and after
final String quoted = Pattern.quote(toMatch);
final Pattern patt = Pattern.compile("\\s*" + quoted + "\\s*");
final Matcher match = patt.matcher(forIP);
return match.replaceFirst(toMatch);
}
For your inputs this gives the expected result, assuming toMatch is =. It also works with arbitrary longer strings; for example, passing "is equal to" instead gives
something =is equal to= something
For the simple case you can skip the quoting; for an arbitrary string it helps (although, as
many contributors have pointed out before, Pattern.quote isn't good for every case).
The simple case thus becomes
return forIP.replaceFirst("\\s*" + toMatch + "\\s*", toMatch);
OR
return forIP.replaceFirst("\\s*=\\s*", "=");
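One thing worth adding when toMatch is arbitrary: the replacement string given to replaceFirst is also interpreted, because $ and \ have special meaning there. The sketch below quotes both sides, using Pattern.quote for the pattern and Matcher.quoteReplacement for the replacement. The class and method names are placeholders.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class QuotedReplaceFirst {
    // Removes whitespace around the first occurrence of toMatch in input.
    static String trimAroundFirst(String toMatch, String input) {
        // Pattern.quote makes regex metacharacters in toMatch literal.
        Pattern p = Pattern.compile("\\s*" + Pattern.quote(toMatch) + "\\s*");
        // quoteReplacement keeps '$' and '\' from being read as group references or escapes.
        return p.matcher(input).replaceFirst(Matcher.quoteReplacement(toMatch));
    }

    public static void main(String[] args) {
        System.out.println(trimAroundFirst("=", "something = is equal to = something"));
        System.out.println(trimAroundFirst("+", "a  +  b + c"));
        System.out.println(trimAroundFirst("$", "cost  $  5"));
    }
}
```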
I have a string whose value comes from an HTML form, so it arrives as part of a URL. I want to remove all the characters before a specific character, which is =, and also remove that character itself. I only want to keep the value that comes after =, because I need to fetch that value from the variable.
EDIT: I need to remove the = too, since I'm trying to get the characters/value in the string after it.
You can use .substring():
String s = "the text=text";
String s1 = s.substring(s.indexOf("=") + 1);
s1 = s1.trim(); // strings are immutable, so the result of trim() must be assigned
Then s1 contains everything after = in the original string.
s1.trim() returns a copy of the string with leading and trailing whitespace removed (the whitespace before the first and after the last non-whitespace character). It does not modify s1 itself, which is why the result is assigned back.
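A minimal sketch of the substring approach wrapped in a helper. The behaviour when there is no = falls out of indexOf returning -1, so substring(0) returns the whole string; whether that is the desired result is an assumption you may want to revisit. The names are hypothetical.

```java
final class ValueAfterEquals {
    static String valueAfterEquals(String s) {
        // indexOf returns -1 when '=' is absent, so this becomes substring(0),
        // which is the whole string.
        String value = s.substring(s.indexOf('=') + 1);
        // trim() returns a new string; the original is not modified.
        return value.trim();
    }

    public static void main(String[] args) {
        System.out.println(valueAfterEquals("the text=text"));
        System.out.println(valueAfterEquals("key =  value  "));
    }
}
```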
While there are many answers, here is a regex example:
String test = "eo21jüdjüqw=realString";
test = test.replaceAll(".+=", "");
System.out.println(test);
// prints realString
Explanation:
. matches any character (except for line terminators)
+ Quantifier — Matches between one and unlimited times, as many times as possible, giving back as needed (greedy)
= matches the character = literally (case sensitive)
This is also a shady copy paste from https://regex101.com/ where you can try regex out.
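Because .+ is greedy, the pattern above removes everything up to the last = in the string, which matters when the value itself contains one. The sketch below contrasts it with a reluctant, anchored variant that stops at the first =. The variant and the names are illustrative assumptions, not part of the answer.

```java
final class GreedyVersusFirstEquals {
    // Greedy: ".+" grabs as much as possible, so the match ends at the LAST '='.
    static String afterLastEquals(String s) {
        return s.replaceAll(".+=", "");
    }

    // Reluctant and anchored: the match ends at the FIRST '='.
    static String afterFirstEquals(String s) {
        return s.replaceFirst("^.*?=", "");
    }

    public static void main(String[] args) {
        System.out.println(afterLastEquals("a=b=c"));
        System.out.println(afterFirstEquals("a=b=c"));
    }
}
```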
You can split the string on = into an array and take the second element, which holds the text after the = sign.
For example:
String CurrentString = "Fruit = they taste good";
String[] separated = CurrentString.split("=");
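String before = separated[0]; // "Fruit " (note the trailing space; call trim() to drop it)
String after = separated[1];  // " they taste good" (leading space included)
Then separated[1] contains the text after the first = in the original string (up to the next =, if there is one).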
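If the value may itself contain =, passing a limit of 2 to split keeps everything after the first = together. This is a sketch under that assumption; the helper name keyAndValue and the choice to return an empty value when no = is present are made up for the example.

```java
final class SplitOnFirstEquals {
    // Returns {key, value}, both trimmed; value is "" when there is no '='.
    static String[] keyAndValue(String s) {
        // The limit 2 means "at most two pieces", so only the first '=' splits.
        String[] parts = s.split("=", 2);
        String key = parts[0].trim();
        String value = parts.length > 1 ? parts[1].trim() : "";
        return new String[] { key, value };
    }

    public static void main(String[] args) {
        String[] kv = keyAndValue("Fruit = they taste good");
        System.out.println(kv[0] + " -> " + kv[1]);
    }
}
```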
I know this is asked about Java but this seems to also be the first search result for Kotlin so you should know that Kotlin has the String.substringAfter(delimiter: String, missingDelimiterValue: String = this) extension for this case.
Its implementation is:
val index = indexOf(delimiter)
return if (index == -1)
missingDelimiterValue
else
substring(index + delimiter.length, length)
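For Java readers, the same logic can be sketched as a small static helper. This is a hand-written port for illustration, not a method of the Java standard library; the class name is a placeholder.

```java
final class SubstringAfter {
    // Returns the text after the first occurrence of delimiter,
    // or missingDelimiterValue when the delimiter does not occur.
    static String substringAfter(String s, String delimiter, String missingDelimiterValue) {
        int index = s.indexOf(delimiter);
        return index == -1
                ? missingDelimiterValue
                : s.substring(index + delimiter.length());
    }

    public static void main(String[] args) {
        System.out.println(substringAfter("key=value", "=", ""));
        System.out.println(substringAfter("novalue", "=", "novalue"));
    }
}
```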
Maybe locate the first occurrence of the character in the URL String. For Example:
String URL = "http://test.net/demo_form.asp?name1=stringTest";
int index = URL.indexOf("=");
Then, split the String based on an index
String Result = URL.substring(index+1); //index+1 to skip =
String Result now contains the value: stringTest
If you use the Apache Commons Lang3 library, you can also use the substringAfter method of the StringUtils utility class.
Official documentation is here.
Examples:
String value = StringUtils.substringAfter("key=value", "=");
// in this case where a space is in the value (e.g. read from a file instead of a query params)
String value = StringUtils.trimToEmpty(StringUtils.substringAfter("key = value", "=")); // = "value"
It handles the case where the value itself contains the '=' character, because it splits at the first occurrence.
If the keys can also contain '=', it will not work (nor will the other methods); in URL query parameters such a character should be escaped anyway.
It's basically about getting the string value between two characters. SO has many questions related to this, such as:
How to get a part of a string in java?
How to get a string between two characters?
Extract string between two strings in java
and more.
But I found it quite confusing when dealing with multiple dots in the string and getting the value between two particular dots.
I have got the package name as :
au.com.newline.myact
I need to get the value between "com." and the next "dot(.)". In this case "newline". I tried
Pattern pattern = Pattern.compile("com.(.*).");
Matcher matcher = pattern.matcher(beforeTask);
while (matcher.find()) {
int ct = matcher.group();
I tried using substrings and IndexOf also. But couldn't get the intended answer. Because the package name in android varies by different number of dots and characters, I cannot use fixed index. Please suggest any idea.
As you probably know (based on .* part in your regex) dot . is special character in regular expressions representing any character (except line separators). So to actually make dot represent only dot you need to escape it. To do so you can place \ before it, or place it inside character class [.].
Also to get only part from parenthesis (.*) you need to select it with proper group index which in your case is 1.
So try with
String beforeTask = "au.com.newline.myact";
Pattern pattern = Pattern.compile("com[.](.*)[.]");
Matcher matcher = pattern.matcher(beforeTask);
while (matcher.find()) {
String ct = matcher.group(1);//remember that regex finds Strings, not int
System.out.println(ct);
}
Output: newline
If you want to get only one element before next . then you need to change greedy behaviour of * quantifier in .* to reluctant by adding ? after it like
Pattern pattern = Pattern.compile("com[.](.*?)[.]");
// ^
Another approach is instead of .* accepting only non-dot characters. They can be represented by negated character class: [^.]*
Pattern pattern = Pattern.compile("com[.]([^.]*)[.]");
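To see the difference between the variants discussed above, the sketch below runs each pattern against a package name with more than one segment after "com.". The helper firstGroup is a placeholder written for this comparison.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class DotQuantifierDemo {
    // Returns group 1 of the first match, or null when nothing matches.
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String pkg = "au.com.newline.myact.modelact";
        // Greedy: runs up to the last dot.
        System.out.println(firstGroup("com[.](.*)[.]", pkg));
        // Reluctant: stops at the first dot after "com.".
        System.out.println(firstGroup("com[.](.*?)[.]", pkg));
        // Negated class: cannot cross a dot at all.
        System.out.println(firstGroup("com[.]([^.]*)[.]", pkg));
    }
}
```

With only one segment after the one you want (as in the question's "au.com.newline.myact"), the greedy version happens to give the right answer, which is why the problem only shows up on longer package names.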
If you don't want to use regex you can simply use indexOf method to locate positions of com. and next . after it. Then you can simply substring what you want.
String beforeTask = "au.com.newline.myact.modelact";
int start = beforeTask.indexOf("com.") + 4; // +4 since we also want to skip 'com.' part
int end = beforeTask.indexOf(".", start); //find next `.` after start index
String result = beforeTask.substring(start, end);
System.out.println(result);
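The indexOf version assumes that "com." is present and that another dot follows it. A more defensive sketch is shown below; returning "" when "com." is missing and returning the remainder when no further dot exists are assumptions of this example, and the names are hypothetical.

```java
final class SegmentAfterCom {
    static String segmentAfterCom(String packageName) {
        String marker = "com.";
        int markerPos = packageName.indexOf(marker);
        if (markerPos < 0) {
            return ""; // no "com." at all
        }
        int start = markerPos + marker.length();
        int end = packageName.indexOf('.', start); // next dot after "com."
        return end < 0
                ? packageName.substring(start)      // "com." was followed by the last segment
                : packageName.substring(start, end);
    }

    public static void main(String[] args) {
        System.out.println(segmentAfterCom("au.com.newline.myact.modelact"));
        System.out.println(segmentAfterCom("com.example"));
    }
}
```

Note that a plain indexOf("com.") would also match inside a segment such as "telecom.", so splitting on dots is safer if such names can occur.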
You can use reflection to get the name of any class. For example:
If I have a class Runner in com.some.package, I can run
Runner.class.getName() // string is "com.some.package.Runner"
to get the fully qualified name of the class, which includes the package name. (Runner.class.toString() would prepend "class " to it.)
To get something after 'com' you can use Runner.class.getName().split("\\.") (the dot must be escaped, because split takes a regex) and then iterate over the returned array with a boolean flag.
All you have to do is split the strings by "." and then iterate through them until you find one that equals "com". The next string in the array will be what you want.
So your code would look something like:
String[] parts = packageName.split("\\.");
int i = 0;
for (String part : parts) {
    if (part.equals("com")) {
        break;
    }
    ++i;
}
// assumes "com" is present and is not the last part
String result = parts[i+1];
private String getStringAfterComDot(String packageName) {
String strArr[] = packageName.split("\\.");
for(int i=0; i<strArr.length-1; i++){ // stop one early so strArr[i+1] always exists
if(strArr[i].equals("com"))
return strArr[i+1];
}
return "";
}
I have done heaps of projects dealing with website scraping, and I
usually just write my own function/utils to get the job done. Regex might
be overkill sometimes if you just want to extract a substring from
a given string like the one you have. Below is the function I normally
use for this kind of task.
private String GetValueFromText(String sText, String sBefore, String sAfter)
{
String sRetValue = "";
int nPos = sText.indexOf(sBefore);
if ( nPos > -1 )
{
int nLast = sText.indexOf(sAfter,nPos+sBefore.length()+1);
if ( nLast > -1)
{
sRetValue = sText.substring(nPos+sBefore.length(),nLast);
}
}
return sRetValue;
}
To use it just do the following:
String sValue = GetValueFromText("au.com.newline.myact", ".com.", ".");
For the string value "ABCD_12" (including quotes), I would like to extract only the content and exclude out the double quotes i.e. ABCD_12 . My code is:
private static void checkRegex()
{
final Pattern stringPattern = Pattern.compile("\"([a-zA-Z_0-9])+\"");
Matcher findMatches = stringPattern.matcher("\"ABC_12\"");
if (findMatches.matches())
System.out.println("Match found" + findMatches.group(0));
}
Now I have tried doing findMatches.group(1);, but that only returns the last character in the string (I did not understand why !).
How can I extract only the content leaving out the double quotes?
Try this regex:
Pattern.compile("\"([a-zA-Z_0-9]+)\"");
OR
Pattern.compile("\"([^\"]+)\"");
The problem in your code is the + placed outside the closing parenthesis. The group then matches a single character, and because the group is repeated, it ends up holding only the last character it matched.
A nice simple (read: non-regex) way to do this is:
String myString = "\"ABC_12\"";
String myFilteredString = myString.replace("\"", ""); // replace() is literal; replaceAll() would treat its first argument as a regex
System.out.println(myFilteredString);
gets you
ABC_12
You should change your pattern to this:
final Pattern stringPattern = Pattern.compile("\"([a-zA-Z_0-9]+)\"");
Note that the + sign was moved inside the group, since you want the character repetition to be part of the group. In the code you posted, what you were actually searching for was a repetition of the group, which consisted of a single occurrence of a single character in [a-zA-Z_0-9].
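The effect of where the + sits can be checked directly. In the sketch below both patterns match the whole input, but group 1 differs: a repeated group keeps only its last iteration. The helper group1 is a placeholder for this comparison.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class QuantifierPlacement {
    // Returns group 1 if the whole input matches, otherwise null.
    static String group1(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String input = "\"ABC_12\"";
        // '+' outside the group: the group is repeated and keeps its last iteration.
        System.out.println(group1("\"([a-zA-Z_0-9])+\"", input));
        // '+' inside the group: the group captures the whole run.
        System.out.println(group1("\"([a-zA-Z_0-9]+)\"", input));
    }
}
```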
If your pattern is strictly any text in between double quotes, then you may be better off using substring:
String str = "\"ABC_12\"";
System.out.println(str.substring(1, str.lastIndexOf('\"')));
Assuming it is a bit more complex (double quotes in between a larger string), you can use the split() function in the Pattern class and use \" as your regex - this will split the string around the \" so you can easily extract the content you want
Pattern p = Pattern.compile("\"");
// Split input with the pattern
String[] result = p.split(str);
for (int i = 0; i < result.length; i++) {
    System.out.println(result[i]);
}
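When the quoted text is embedded in a larger string, the pieces produced by splitting on the quote character alternate between unquoted and quoted text, so the quoted content sits at the odd indices. The sketch below relies on that and assumes the quotes are balanced, not escaped, and not empty; the names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

final class QuotedSegments {
    // Collects the pieces at odd indices, which are the parts between quote pairs.
    static List<String> quoted(String input) {
        String[] pieces = Pattern.compile("\"").split(input);
        List<String> out = new ArrayList<>();
        for (int i = 1; i < pieces.length; i += 2) {
            out.add(pieces[i]);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(quoted("say \"hello\" and \"bye\""));
        System.out.println(quoted("\"ABC_12\""));
    }
}
```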
http://docs.oracle.com/javase/1.4.2/docs/api/java/util/regex/Pattern.html#split%28java.lang.CharSequence%29
Hi, please help me find a regular expression for the
following requirement.
I have strings such as:
String vStr = "Every 1 nature(s) - Universe: (Air,Earth,Water sea,Fire)";
String sStr = "Every 1 form(s) - Earth: (Air,Fire) ";
From these strings, after applying the regex, I need to get the values "Air,Earth,Water sea,Fire" and "Air,Fire",
that is:
String vStrRegex ="Air,Earth,Water sea,Fire";
String sStrRegex ="Air,Fire";
All input strings contain a ":" separator, and the values needed are always inside the brackets.
Thanks
The regular expression would be something like this:
: \((.*?)\)
Spelt out:
Pattern p = Pattern.compile(": \\((.*?)\\)");
Matcher m = p.matcher(vStr);
if (m.find()) { // group() may only be called after a successful find()
    String result = m.group(1);
}
This will capture the content of the parentheses as the first capture group.
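A runnable sketch of this pattern applied to both strings from the question. Returning "" when nothing matches is an assumption of this example, and the class and method names are placeholders.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class AfterColonParens {
    private static final Pattern P = Pattern.compile(": \\((.*?)\\)");

    // Returns the content of the first "(...)" that follows ": ", or "" if there is none.
    static String extract(String s) {
        Matcher m = P.matcher(s);
        return m.find() ? m.group(1) : "";
    }

    public static void main(String[] args) {
        System.out.println(extract("Every 1 nature(s) - Universe: (Air,Earth,Water sea,Fire)"));
        System.out.println(extract("Every 1 form(s) - Earth: (Air,Fire) "));
    }
}
```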
Try the following:
\(([^()]*)\)\s*$
The [^()]* keeps the match inside a single pair of brackets, and the ending $ is important, otherwise you'll accidentally match the "(s)".
If you have each string separately, try this expression: \(([^\(]*)\)\s*$
This would get you the content of the last pair of brackets, as group 1.
If the strings are concatenated by : try to split them first.
Ask yourself if you really need a regex. Does the text you need always appear within the last pair of parentheses? If so, you can keep it simple and use substring instead:
String vStr = "Every 1 nature(s) - Universe: (Air,Earth,Water sea,Fire)";
int lastOpeningParens = vStr.lastIndexOf('(');
int lastClosingParens = vStr.lastIndexOf(')');
String text = vStr.substring(lastOpeningParens + 1, lastClosingParens);
This is much more readable than a regex.
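If the input might not contain parentheses at all, substring would throw on the -1 indices. A guarded version could look like the sketch below; returning "" in that case is an assumption, and the names are hypothetical.

```java
final class LastParensContent {
    // Returns the text inside the last "(...)" pair, or "" if no such pair exists.
    static String lastParensContent(String s) {
        int close = s.lastIndexOf(')');
        // Search backwards from the closing parenthesis for its opening partner.
        int open = close < 0 ? -1 : s.lastIndexOf('(', close);
        if (open < 0) {
            return "";
        }
        return s.substring(open + 1, close);
    }

    public static void main(String[] args) {
        System.out.println(lastParensContent("Every 1 nature(s) - Universe: (Air,Earth,Water sea,Fire)"));
        System.out.println(lastParensContent("Every 1 form(s) - Earth: (Air,Fire) "));
    }
}
```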
I assume that there are only whitespace characters between : and the opening bracket (:
Pattern regex = Pattern.compile(":\\s+\\((.+)\\)");
You'll find your results in capturing group 1.
Try this regex:
.*\((.*)\)
$1 will contain the required string