Java RegExp to get variable image name with its extension from javascript

I am trying to get the image name from the following javascript.
var g_prefetch ={'Im': {url:'\/az\/hprichbg\/rb\/WhiteTippedRose_ROW10477559674_1366x768.jpg', hash:'674'}
Problem:
The name of the image is variable. That is, in the above example code the image changes regularly.
Output I want:
WhiteTippedRose_ROW10477559674_1366x768.jpg
I tried the following regex:
Pattern p = Pattern.compile("\{\'Im\'\: \{url\:\'\\\/az\\\/hprichbg\\\/rb\\\/(.*?)\.jpg\'\, hash\:\'674\'\}");
//System.out.println(p);
Matcher m=p.matcher(out);
if(m.find()) {
System.out.println(m.group());
}
I don't know much about regular expressions, so please help me understand the approach.
Thank you

I would use the following regex, it should be fast enough:
Pattern p = Pattern.compile("[^/]+\\.jpg");
Matcher m = p.matcher(str);
if (m.find()) {
String match = m.group();
System.out.println(match);
}
This matches a full sequence of characters ending with .jpg that does not include /.
I think a more robust approach is to check that the match is a legal file name.
Here is a list of characters that are not legal in Windows file names: "\\/:*?\"<>|"
for Mac: / and :
for Linux/Unix: /
Here is a more complex example, assuming the format may change; it is mostly designed for legal Windows file names:
String s = "{'Im': {url:'\\/az\\/hprichbg\\/rb\\/?*<>WhiteTippedRose_ROW10477559674_1366x768.jpg', hash:'674'}";
Pattern p = Pattern.compile("[^\\\\/:*?\"<>|]+\\.jpg"); // \\\\ excludes a literal backslash as well, as in the Windows list
Matcher m = p.matcher(s);
if (m.find()) {
String match = m.group();
System.out.println(match);
}
This will still print
WhiteTippedRose_ROW10477559674_1366x768.jpg

Assuming that the image is always placed after a / and does not contain any /, you can use the following:
String s = "{'Im': {url:'\\/az\\/hprichbg\\/rb\\/WhiteTippedRose_ROW10477559674_1366x768.jpg', hash:'674'}";
s = s.replaceAll(".*?([^/]*?\\.jpg).*", "$1");
System.out.println("s = " + s);
outputs:
s = WhiteTippedRose_ROW10477559674_1366x768.jpg
In substance:
.*? skip the beginning of the string until the next pattern is found
([^/]*?\\.jpg) a group like "xxx.jpg" where xxx does not contain any "/"
.* rest of the string
$1 returns the content of the group
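One caveat with the replaceAll form: when the pattern does not match, replaceAll returns the input unchanged, so a missing image name cannot be told apart from a found one. Here is a minimal sketch of a close variant that uses Matcher with a capturing group and makes the no-match case explicit. The names ImageNameExtractor and extractJpgName are hypothetical, and it assumes the file name ends in .jpg and contains no /.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ImageNameExtractor {
    // A run of non-slash characters followed by ".jpg", captured as group 1.
    private static final Pattern JPG_NAME = Pattern.compile("([^/]+\\.jpg)");

    // Returns the first file name ending in .jpg, or empty if none is present.
    static Optional<String> extractJpgName(String input) {
        Matcher m = JPG_NAME.matcher(input);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        String s = "{'Im': {url:'\\/az\\/hprichbg\\/rb\\/WhiteTippedRose_ROW10477559674_1366x768.jpg', hash:'674'}";
        System.out.println(extractJpgName(s).orElse("(no image found)"));
        System.out.println(extractJpgName("{'Im': {}}").orElse("(no image found)"));
    }
}
```

The first call prints the image name and the second prints the fallback text, so the caller can decide what to do when the page layout changes.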

If the String is always of this form, I would simply do:
int startIndex = s.indexOf("rb\\/") + 4;
int endIndex = s.indexOf('\'', startIndex);
String image = s.substring(startIndex, endIndex);

Related

Regex to get value between two colon excluding the colons

I have a string like this:
something:POST:/some/path
Now I want to take the POST alone from the string. I did this by using this regex
:([a-zA-Z]+):
But this gives me a value along with colons. ie I get this:
:POST:
but I need this
POST
My code to match the same and replace it is as follows:
String ss = "something:POST:/some/path/";
Pattern pattern = Pattern.compile(":([a-zA-Z]+):");
Matcher matcher = pattern.matcher(ss);
if (matcher.find()) {
System.out.println(matcher.group());
ss = ss.replaceFirst(":([a-zA-Z]+):", "*");
}
System.out.println(ss);
EDIT:
I've decided to use the lookahead/lookbehind regex since I did not want to use replace with colons such as :*:. This is my final solution.
String s = "something:POST:/some/path/";
String regex = "(?<=:)[a-zA-Z]+(?=:)";
Matcher matcher = Pattern.compile(regex).matcher(s);
if (matcher.find()) {
s = s.replaceFirst(matcher.group(), "*");
System.out.println("replaced: " + s);
}
else {
System.out.println("not replaced: " + s);
}

There are two approaches:
Keep your Java code, and use lookahead/lookbehind (?<=:)[a-zA-Z]+(?=:), or
Change your Java code to replace the result with ":*:"
Note: You may want to define a String constant for your regex, since you use it in different calls.

As pointed out, the regex captured group can be used for the replacement.
The following code did it:
String ss = "something:POST:/some/path/";
Pattern pattern = Pattern.compile(":([a-zA-Z]+):");
Matcher matcher = pattern.matcher(ss);
if (matcher.find()) {
ss = ss.replaceFirst(matcher.group(1), "*");
}
System.out.println(ss);
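Note that replaceFirst(matcher.group(1), "*") treats the captured text as a new regex and replaces its first occurrence anywhere in the string, which is not necessarily the one between the colons. A minimal sketch that replaces exactly the captured span using Matcher.start(1) and Matcher.end(1) could look like this; the names GroupSpanReplacer and maskFirst are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class GroupSpanReplacer {
    private static final Pattern BETWEEN_COLONS = Pattern.compile(":([a-zA-Z]+):");

    // Replaces only the text captured by group 1 of the first match; the colons stay in place.
    static String maskFirst(String input, String replacement) {
        Matcher m = BETWEEN_COLONS.matcher(input);
        if (!m.find()) {
            return input; // nothing between two colons
        }
        return new StringBuilder(input)
                .replace(m.start(1), m.end(1), replacement)
                .toString();
    }

    public static void main(String[] args) {
        System.out.println(maskFirst("something:POST:/some/path/", "*"));
        // The leading "POST" is not between colons, so it is left alone.
        System.out.println(maskFirst("POST:POST:/some/path/", "*"));
    }
}
```

Because the replacement is spliced in by index, it is also taken literally, so characters such as $ in the replacement need no escaping.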

UPDATE
Looking at your update, you only need replaceFirst:
String result = s.replaceFirst(":[a-zA-Z]+:", ":*:");
When you use (?<=:)[a-zA-Z]+(?=:), the regex engine checks each location inside the string for a : before it, and once found, tries to match 1+ ASCII letters and then asserts that there is a : after them. With :[A-Za-z]+:, the checking only starts after the regex engine has found a : character. Then, after matching :POST:, the replacement pattern replaces the whole match. It is totally OK to hardcode colons in the replacement pattern since they are hardcoded in the regex pattern.
Original answer
You just need to access Group 1:
if (matcher.find()) {
System.out.println(matcher.group(1));
}
Your :([a-zA-Z]+): regex contains a capturing group (see (....) subpattern). These groups are numbered automatically: the first one has an index of 1, the second has the index of 2, etc.
To replace it, use Matcher#appendReplacement():
String s = "something:POST:/some/path/";
StringBuffer result = new StringBuffer();
Matcher m = Pattern.compile(":([a-zA-Z]+):").matcher(s);
while (m.find()) {
m.appendReplacement(result, ":*:");
}
m.appendTail(result);
System.out.println(result.toString());

This is your solution:
regex = (:)([a-zA-Z]+)(:)
And code is:
String ss = "something:POST:/some/path/";
ss = ss.replaceFirst("(:)([a-zA-Z]+)(:)", "$1*$3");
ss now contains:
something:*:/some/path/
Which I believe is what you are looking for...

Text wrongly matches a substring of the words in the group

I want to check whether the text starts with what or who and is a question, so I wrote the following code:
private static void startWithQOrIf(String commentstr){
String urlPattern = "(|who|what).*\\?.*$";
Pattern p = Pattern.compile(urlPattern,Pattern.CASE_INSENSITIVE);
Matcher m = p.matcher(commentstr);
if (m.find()) {
System.out.println("yes");
}
}
Everything works well, but for example when I try:
whooooooooo is the follower?
it matches as well, but it should not, because I am looking for who, not whooooooooo.
Any ideas?

You can ensure a whole word using a word boundary \b:
(|who|what)\\b.*\\?.*$
           ^^^
If the words in the alternation group are supposed to appear at the start of the string, you can just use matches() and remove the $ anchor:
String urlPattern = "(|who|what)\\b.*\\?.*";
Pattern p = Pattern.compile(urlPattern,Pattern.CASE_INSENSITIVE);
Matcher m = p.matcher(commentstr);
if (m.matches()) { // <- matches() is used here instead of find()
System.out.println("yes");
}
Note that (|who|what) matches either an empty string, or who, or what. If you do not plan to allow empty string, use just (who|what).
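To illustrate the note above: with the empty alternative kept, \b on its own is satisfied at the start of any string that begins with a word character, so whooooooooo still passes. A small sketch comparing the two patterns with matches(); the class and method names are hypothetical.

```java
import java.util.regex.Pattern;

final class QuestionStartCheck {
    // With the empty alternative: (|who|what) may match nothing at all.
    private static final Pattern WITH_EMPTY =
            Pattern.compile("(|who|what)\\b.*\\?.*", Pattern.CASE_INSENSITIVE);
    // Without it: the text must start with the whole word "who" or "what".
    private static final Pattern WHOLE_WORD =
            Pattern.compile("(who|what)\\b.*\\?.*", Pattern.CASE_INSENSITIVE);

    static boolean matchesWithEmpty(String text) {
        return WITH_EMPTY.matcher(text).matches();
    }

    static boolean startsWithWhoOrWhat(String text) {
        return WHOLE_WORD.matcher(text).matches();
    }

    public static void main(String[] args) {
        String bad = "whooooooooo is the follower?";
        String good = "Who is the follower?";
        System.out.println(matchesWithEmpty(bad));     // true
        System.out.println(startsWithWhoOrWhat(bad));  // false
        System.out.println(startsWithWhoOrWhat(good)); // true
    }
}
```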

You must use word boundaries.
String urlPattern = "\\b(who|what)\\b.*\\?.*$";

Java Regex: Remove Everything After Last Instance Of Character

Say I have a string:
/first/second/third
And I want to remove everything after the last instance of / so I would end up with:
/first/second
What regular expression would I use? I've tried:
String path = "/first/second/third";
String pattern = "$(.*?)/";
Pattern r = Pattern.compile(pattern);
Matcher m = r.matcher(path);
if(m.find()) path = m.replaceAll("");

Why use a regex at all here? Look for the last / character with lastIndexOf. If it's found, then use substring to extract everything before it.
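A minimal sketch of that lastIndexOf/substring approach; the class and method names are hypothetical, and returning the input unchanged when no / is present is an assumption about the desired behavior.

```java
final class PathTrimmer {
    // Returns everything before the last '/', or the input unchanged if there is no '/'.
    static String stripAfterLastSlash(String path) {
        int lastSlash = path.lastIndexOf('/');
        return lastSlash >= 0 ? path.substring(0, lastSlash) : path;
    }

    public static void main(String[] args) {
        System.out.println(stripAfterLastSlash("/first/second/third")); // /first/second
        System.out.println(stripAfterLastSlash("no-slash"));            // no-slash
    }
}
```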

Do you mean something like this?
s = s.replaceAll("/[^/]*$", "");
Or, better, if you are working with paths:
File f = new File(s);
File dir = f.getParentFile(); // getParent() returns the same path as a String; on Windows \ works as a separator too

If you have a string that contains your character (whether a supplementary code point or not), then you can use Pattern.quote and match the inverse character set up to the end, thus:
String myCharEscaped = Pattern.quote(myCharacter);
Pattern pattern = Pattern.compile("[^" + myCharEscaped + "]*\\z");
That should do it, but really you can just use lastIndexOf, as in
myString.substring(0, myString.lastIndexOf(myCharacter) + myCharacter.length())
To get a code point as a string, just do
new StringBuilder().appendCodePoint(myCodePoint).toString()

Despite the answers avoiding regex, Pattern and Matcher are useful for performance (compiled patterns), and they're still pretty straightforward and worth mastering. :)
Not sure why you have "$" up front. Try either:
Matching starting group:
String path = "/first/second/third";
String pattern = "^(.*)/"; // * = "greedy": group 1 is the maximum string from the start up to the last "/"
Pattern r = Pattern.compile(pattern);
Matcher m = r.matcher(path);
if (m.find()) path = m.group(1); // group(1) excludes the trailing "/"
Stripping tail match:
String path = "/first/second/third";
String pattern = "/[^/]*$"; // the last "/" and everything after it; a reluctant "/(.*?)$" would still start at the first "/"
Pattern r = Pattern.compile(pattern);
Matcher m = r.matcher(path);
if (m.find()) path = m.replaceFirst("");

regex command(remove everything but specified txt)

Does anyone out there know of a regex command that will take the following string
url = http://184.154.145.114:8013/wlraac name = wlr samplerate = 44100 channels = 2 format = S16le
and remove everything but the following
wlr
This line will come up multiple times, where everything after each = sign changes, and each time all I want to keep is what's after name =
Any help is appreciated.

You could do something like
.*name =\s*(\w+).*
and replace with the content of group 1.
I search for "name =" and anything before it. The \s* matches the following whitespace.
Then comes \w+ inside parentheses. \w matches any letter, digit, or underscore (ASCII only by default; with the option Pattern.UNICODE_CHARACTER_CLASS it also covers Unicode letters and digits). Because of the parentheses, the match is stored in the first group.
String in = " url = http://184.154.145.114:8013/wlraac name = wlr samplerate = 44100 channels = 2 format = S16le";
Pattern r = Pattern.compile(".*name =\\s*(\\w+).*");
Matcher m = r.matcher(in);
String result = m.replaceAll("$1");
System.out.println(result);
Or, with your code (note the lowercase \\s and \\w; the uppercase forms match the opposite character sets):
String str = line2.replaceAll(".*name =\\s*(\\w+).*", "$1");

From your description it's a little hard to understand what you need.
But regex is overkill. You should use something like:
String s = myString.substring(myString.indexOf("name =")+6);
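On the full line from the question, that substring call returns everything after name =, including samplerate and the other fields. Here is a hedged sketch of the same indexOf approach that stops at the next space. The names NameFieldExtractor and valueAfterName are hypothetical, and it assumes the value itself contains no spaces.

```java
final class NameFieldExtractor {
    private static final String KEY = "name =";

    // Returns the token following "name =", or null if the key is absent.
    static String valueAfterName(String line) {
        int keyIndex = line.indexOf(KEY);
        if (keyIndex < 0) {
            return null;
        }
        // Skip the key, then trim the whitespace around what follows.
        String rest = line.substring(keyIndex + KEY.length()).trim();
        int space = rest.indexOf(' ');
        return space >= 0 ? rest.substring(0, space) : rest;
    }

    public static void main(String[] args) {
        String line = "url = http://184.154.145.114:8013/wlraac name = wlr samplerate = 44100 channels = 2 format = S16le";
        System.out.println(valueAfterName(line)); // wlr
    }
}
```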

I'd recommend extracting the word that appears after name =, i.e.
Pattern p = Pattern.compile("name\\s*=\\s*(\\S+)"); // anchor on the key; a bare = would match "url =" first
Matcher m = p.matcher(str);
if (m.find()) {
String value = m.group(1); // contains your wlr
...............
}

Java Regex: how to capture multiple matches in the same line

I am trying to match a regex pattern in Java, and I have two questions:
Inside the pattern I'm looking for there is a known beginning and then an unknown string that I want to get up until the first occurrence of an &.
There are multiple occurrences of this pattern in the line, and I would like to get each occurrence separately.
For example I have this input line:
1234567 100,110,116,129,139,140,144,146 http://www.gold.com/shc/s/c_10153_12605_Computers+%26+Electronics_Televisions?filter=Screen+Refresh+Rate%7C120HZ%5EScreen+Size%7C37+in.+to+42+in.&sName=View+All&viewItems=25&subCatView=true ISx20070515x00001a http://www.gold.com/shc/s/c_10153_12605_Computers+%26+Electronics_Televisions?filter=Screen+Refresh+Rate%7C120HZ&sName=View+All&subCatView=true 0 2819357575609397706
And I am interested in these strings:
Screen+Refresh+Rate%7C120HZ%5EScreen+Size%7C37+in.+to+42+in.
Screen+Refresh+Rate%7C120HZ

Assuming the known beginning is filter=**, the regular expression pattern (?:filter=\\*\\*)(.*?)(?:&) should get you what you need. Use Matcher.find() to get all occurrences of the pattern in a given string. Using the test string you provided, the following:
final Pattern p = Pattern.compile("(?:filter=\\*\\*)(.*?)(?:&)");
final Matcher m = p.matcher(testString);
int cnt = 0;
while (m.find()) {
System.out.println(++cnt + ": G1: " + m.group(1));
}
Will output:
1: G1: Screen+Refresh+Rate%7C120HZ%5EScreen+Size%7C37+in.+to+42+in.
2: G1: Screen+Refresh+Rate%7C120HZ**

If I knew that I might need other query parameters in the future, I think it would be more prudent to decode and parse the URL.
String url = URLDecoder.decode("http://www.gold.com/shc/s/c_10153_12605_" +
"Computers+%26+Electronics_Televisions?filter=Screen+Refresh+Rate" +
"%7C120HZ%5EScreen+Size%7C37+in.+to+42+in.&sName=View+All&viewItems=25&subCatView=true"
,"utf-8");
Pattern amp = Pattern.compile("&");
Pattern eq = Pattern.compile("=");
Map<String, String> params = new HashMap<String, String>();
String queryString = url.substring(url.indexOf('?') + 1);
for(String param : amp.split(queryString)) {
String[] pair = eq.split(param);
params.put(pair[0], pair[1]);
}
for(Entry<String, String> param : params.entrySet()) {
System.out.format("%s = %s\n", param.getKey(), param.getValue());
}
Output
subCatView = true
viewItems = 25
sName = View All
filter = Screen Refresh Rate|120HZ^Screen Size|37 in. to 42 in.

In your example, there is sometimes a "**" at the end before the "&". But basically (assuming "filter=" is the start pattern you are looking for), you want something like:
"filter=([^&]+)&"
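A minimal sketch of how that pattern could be applied with a Matcher.find() loop to collect every occurrence. The names FilterValues and extractAll are hypothetical, and the input is a shortened, made-up line in the same shape as the one in the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class FilterValues {
    // Captures everything between "filter=" and the next "&".
    private static final Pattern FILTER = Pattern.compile("filter=([^&]+)&");

    // Collects the filter value of every occurrence in the line, in order.
    static List<String> extractAll(String line) {
        List<String> values = new ArrayList<>();
        Matcher m = FILTER.matcher(line);
        while (m.find()) {
            values.add(m.group(1));
        }
        return values;
    }

    public static void main(String[] args) {
        // Shortened, made-up input; not the original line.
        String line = "http://example.com/a?filter=Rate%7C120HZ%5ESize%7C37&sName=View+All "
                + "http://example.com/b?filter=Rate%7C120HZ&sName=View+All";
        extractAll(line).forEach(System.out::println);
    }
}
```

Because the pattern requires a trailing &, a filter parameter that is the last one in its URL would not be captured; dropping the final & from the pattern is one way to cover that case.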

Using the regular expression (?<=filter=\*{0,2})[^&]*[^&*]+ in Java:
Pattern p = Pattern.compile("(?<=filter=\\*{0,2})[^&]*[^&*]+");
String s = "1234567 100,110,116,129,139,140,144,146 http://www.gold.com/shc/s/c_10153_12605_Computers+%26+Electronics_Televisions?filter=**Screen+Refresh+Rate%7C120HZ%5EScreen+Size%7C37+in.+to+42+in.&sName=View+All**&viewItems=25&subCatView=true ISx20070515x00001a http://www.gold.com/shc/s/c_10153_12605_Computers+%26+Electronics_Televisions?filter=**Screen+Refresh+Rate%7C120HZ**&sName=View+All&subCatView=true 0 2819357575609397706";
Matcher m = p.matcher(s);
while (m.find()) {
System.out.println(m.group());
}
EDIT:
Added [^&*]+ to the end of the regex to prevent the ** from being included in the second match.
EDIT2:
Changed regular expression to use lookbehind.

The regex you're looking for is
Screen\+Refresh\+Rate[^&]*
You could use Matcher.find() to find all matches.

Are you looking for a string that follows "filter=", ignores any leading "*", and ends at the first "&"?
You can try the following:
String str = "1234567 100,110,116,129,139,140,144,146 http://www.gold.com/shc/s/c_10153_12605_Computers+%26+Electronics_Televisions?filter=**Screen+Refresh+Rate%7C120HZ%5EScreen+Size%7C37+in.+to+42+in.&sName=View+All**&viewItems=25&subCatView=true ISx20070515x00001a http://www.gold.com/shc/s/c_10153_12605_Computers+%26+Electronics_Televisions?filter=**Screen+Refresh+Rate%7C120HZ**&sName=View+All&subCatView=true 0 2819357575609397706";
Pattern p = Pattern.compile("filter=(?:\\**)([^&]+?)(?:\\**)&");
Matcher matcher = p.matcher(str);
while(matcher.find()){
System.out.println(matcher.group(1));
}
