Regex command (remove everything but specified text) - Java

Does anyone out there know of a regex command that will take the following string
url = http://184.154.145.114:8013/wlraac name = wlr samplerate = 44100 channels = 2 format = S16le
and remove everything but the following
wlr
This line will come up multiple times, where everything after each = sign changes, and each time all I want to keep is what's after name =
Any help is appreciated.

You could do something like
.*name =\s*(\w+).*
and replace with the content of group 1
I search for "name =" and anything before it. The \s* matches the whitespace that follows.
Then comes \w+ inside parentheses. \w matches any letter, digit, or underscore (ASCII only by default; with the option Pattern.UNICODE_CHARACTER_CLASS it also covers Unicode letters and digits). Because of the parentheses, the match is stored in the first group.
String in = " url = http://184.154.145.114:8013/wlraac name = wlr samplerate = 44100 channels = 2 format = S16le";
Pattern r = Pattern.compile(".*name =\\s*(\\w+).*");
Matcher m = r.matcher(in);
String result = m.replaceAll("$1");
System.out.println(result);
Or, in your code:
String str = line2.replaceAll(".*name =\\s*(\\w+).*", "$1");
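As a hedged variant of the approach above: replaceAll returns the input unchanged when the pattern does not match, so a line without a name field would pass through silently. A minimal sketch that uses find() and makes the "no match" case explicit could look like this. The class and method names are made up for illustration.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NameFieldExtractor {
    // Compiled once and reused. \w+ is ASCII-only unless UNICODE_CHARACTER_CLASS is set.
    private static final Pattern NAME = Pattern.compile("\\bname\\s*=\\s*(\\w+)");

    /** Returns the word after "name =", or an empty Optional if the line has no such field. */
    static Optional<String> extractName(String line) {
        Matcher m = NAME.matcher(line);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        String line = "url = http://184.154.145.114:8013/wlraac name = wlr "
                + "samplerate = 44100 channels = 2 format = S16le";
        System.out.println(extractName(line).orElse("<no name>")); // prints "wlr"
    }
}
```

The \b in front of name is an assumption about the data: it keeps a key such as "filename =" from being picked up by accident.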

From your description it's a little hard to understand what you need.
But regex is overkill. You could use something like:
String s = myString.substring(myString.indexOf("name =") + 6); // everything after "name ="; cut at the next space to keep only wlr
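The substring idea can be sketched a little further so that only the value is kept. This assumes the fields are separated by single spaces, as in the sample line. The class and method names are hypothetical.

```java
class NameSubstring {
    /**
     * Returns the token that follows "name =", or null if the key is absent.
     * Assumes fields are separated by spaces, as in the sample line.
     */
    static String valueAfterNameKey(String line) {
        String key = "name =";
        int keyStart = line.indexOf(key); // note: would also hit a key like "filename ="
        if (keyStart < 0) {
            return null;
        }
        // Drop the key, then the whitespace around what is left.
        String rest = line.substring(keyStart + key.length()).trim();
        int end = rest.indexOf(' ');
        return end < 0 ? rest : rest.substring(0, end);
    }
}
```

For the sample line, valueAfterNameKey(line) returns "wlr".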

I'd recommend extracting the word that appears after name =, i.e.
Pattern p = Pattern.compile("name\\s*=\\s*(\\S+)");
Matcher m = p.matcher(str);
if (m.find()) {
    String value = m.group(1); // contains your wlr
    ...............
}

Related

Pattern Matching inside brackets with % symbol

I am a newbie to Java and have been trying to pattern match some data inside a TD tag and brackets with a percentage symbol, but for the life of me I cannot get it to work.
I am sure it is very simple. I just want to extract the numbers before the % symbol in here:
<td>0 items (0%)</td>
I have tried quite a number of suggestions but none seem to work.
linecache = readercache.readLine();
System.out.println(linecache);
Pattern patterncf1 = Pattern.compile("\\((.*?)\\)");
tried
Pattern patterncf1 = Pattern.compile("<td>\\d+ \\w+ \\((\\d+)?%\\)</td>");
tried
Pattern patterncf1 = Pattern.compile("<td>\\((\\d+)?%\\)</td>");
tried
Pattern patterncf1 = Pattern.compile("\\((\\d+)?%\\)");
but am always getting
<td>0 items (0%)</td>
Exception in thread "Thread-0" java.lang.IllegalStateException: No match found
I also tried the suggestion below but still erroring out and I would assume that this is the right group in this case.
linecache = readercache.readLine();
System.out.println(linecache);
String pattern = "\\d+(?=%)";
Pattern patterncf1 = Pattern.compile(pattern)
Matcher matchercf1 = patterncf1.matcher(linecache);
String passedvalue = matchercf1.group(1);
System.out.println(passedvalue);
This part in a different section of code works fine.
Pattern patternmb1 = Pattern.compile("<td>(.+?) GB</td>");
Matcher matchermb1 = patternmb1.matcher(line);
if (matchermb1.find()) {
String passedvalue = matchermb1.group(1);
String[] tmpStr = passedvalue.split("\\.") ;
String withoutDecStr = tmpStr[0];
Float passedvalue2 = Float.valueOf(withoutDecStr);
System.out.println("MIU: " + passedvalue2);
JVMinusearray.add(passedvalue2);
I would appreciate if someone could offer some advice please.
Thanks
You can use the following:
Pattern pattern = Pattern.compile("<td>.*\\((\\d+)%\\)</td>");
Matcher matcher = pattern.matcher("<td>0 items (2000%)</td>");
if(matcher.matches()) {
System.out.println(matcher.group(1));
}
You will get the number that precedes the %.
If you want to extract only the digits before %, the following will match:
(\\d+(?=%))
Edit:
From your comment, I understand that the problem is identifying the correct group to pick. In this regex, what you want is in group 1, so the pattern needs the capturing parentheses. You also have to call find() before group(1); otherwise the matcher throws IllegalStateException.
linecache = readercache.readLine();
System.out.println(linecache);
String pattern = "(\\d+(?=%))"; // just include ()
Pattern patterncf1 = Pattern.compile(pattern);
Matcher matchercf1 = patterncf1.matcher(linecache);
if (matchercf1.find()) { // group(1) is only valid after a successful find()
    String passedvalue = matchercf1.group(1);
    System.out.println(passedvalue);
}
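For reference, the "No match found" IllegalStateException in the question is what Matcher.group() throws when it is called before a successful find() or matches(). A minimal sketch that guards against this, and against the null that readLine() returns at end of file, might look like the following. The class and method names are illustrative only.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PercentExtractor {
    // Digits between "(" and "%)". Capped at 9 digits so Integer.parseInt cannot overflow.
    private static final Pattern PERCENT = Pattern.compile("\\((\\d{1,9})%\\)");

    /** Returns the number before the % sign, or empty if the line is null or has no match. */
    static OptionalInt extractPercent(String line) {
        if (line == null) { // BufferedReader.readLine() returns null at end of input
            return OptionalInt.empty();
        }
        Matcher m = PERCENT.matcher(line);
        // group(1) is read only after find() has succeeded
        return m.find() ? OptionalInt.of(Integer.parseInt(m.group(1))) : OptionalInt.empty();
    }

    public static void main(String[] args) {
        System.out.println(extractPercent("<td>0 items (0%)</td>").orElse(-1)); // prints 0
    }
}
```

Returning OptionalInt instead of throwing lets the caller skip lines of the HTML file that do not contain a percentage at all.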
Thanks for your help. It seems to work with a static string of text but not from the reading in of the data from the html file, so I will take this offline and see what's going on, but both suggestions have worked fine.
Thank you for your time. I appreciate it.
Regards,
Paul

How to write and use regular expression in java

I want to write a regular expression in Java which will accept a String containing letters, numbers, - and space, any number of times, anywhere.
The string should contain only the characters mentioned above and no other special characters. How do I code the regular expression in Java?
I tried the following. It works when I run it as a Java application.
But when I run the same code in a web application and accept the values through XML, it accepts '/'.
String test1 = null;
Scanner scan = new Scanner(System.in);
test1 = scan.nextLine();
String alphaExp = "^[a-zA-Z0-9-]*$";
Pattern r = Pattern.compile(alphaExp);
Matcher m = r.matcher(test1);
boolean flag = m.lookingAt();
System.out.println(flag);
Can anyone help me on this please?
You can try POSIX character classes:
Pattern p = Pattern.compile("^[\\p{Alnum}\\p{Space}-]*$");
Matcher m = p.matcher("asfsdf 1212sdfsd-gf121sdg5 4s");
boolean b = m.lookingAt();
With this regular expression, if the string you pass contains anything other than alphanumeric characters, whitespace, or hyphens, there will be no match.
I think you're just missing a space in the character class, since you mentioned it in your text: ^[a-zA-Z0-9 -]*$
You can add the Pattern.MULTILINE flag too, so you can specify how the pattern handles lines:
String alphaExp = "^[a-zA-Z0-9 -]*$";
Pattern r = Pattern.compile(alphaExp, Pattern.MULTILINE);
Matcher m = r.matcher(test1);
boolean flag = m.lookingAt();
Note that the * quantifier also matches the empty string (zero or more occurrences), so empty lines or blank tokens "" will pass.
If you instead use +, as in [\w\d\s-]+, it will match one or more characters (remember to write \\ for each \ in Java source, as follows: "[\\w\\d\\s-]+"). Be aware that \w also admits the underscore and \s admits tabs and line breaks.
* is a quantifier equivalent to {0,} and + is equivalent to {1,}.
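One more option, offered as a sketch rather than a diagnosis of the web application: Matcher.lookingAt() only requires the pattern to match at the start of the input, while Matcher.matches() requires the whole input to match, so with matches() the ^ and $ anchors become unnecessary. The class and method names below are hypothetical.

```java
import java.util.regex.Pattern;

class AllowedCharsValidator {
    // Letters, digits, space and hyphen. The hyphen is last in the class, so it is literal.
    // "+" rejects the empty string; use "*" instead if empty input should be accepted.
    private static final Pattern ALLOWED = Pattern.compile("[a-zA-Z0-9 -]+");

    /** True only if every character of the input is in the allowed set. */
    static boolean isValid(String input) {
        // matches() must consume the entire input, unlike lookingAt() or find()
        return input != null && ALLOWED.matcher(input).matches();
    }
}
```

With this, isValid("asfsdf 1212sdfsd-gf121") is true, while isValid("abc/def") and isValid("") are false.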

Java Regex: Remove Everything After Last Instance Of Character

Say I have a string:
/first/second/third
And I want to remove everything after the last instance of / so I would end up with:
/first/second
What regular expression would I use? I've tried:
String path = "/first/second/third";
String pattern = "$(.*?)/";
Pattern r = Pattern.compile(pattern2);
Matcher m = r.matcher(path);
if(m.find()) path = m.replaceAll("");
Why use a regex at all here? Look for the last / character with lastIndexOf. If it's found, then use substring to extract everything before it.
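A minimal sketch of that lastIndexOf/substring suggestion, with a hypothetical helper name:

```java
class PathTrimmer {
    /** Removes the last "/" and everything after it; returns the input unchanged if there is no "/". */
    static String removeLastSegment(String path) {
        int lastSlash = path.lastIndexOf('/');
        return lastSlash < 0 ? path : path.substring(0, lastSlash);
    }

    public static void main(String[] args) {
        System.out.println(removeLastSegment("/first/second/third")); // prints "/first/second"
    }
}
```

Note that a single-segment path such as "/first" becomes the empty string here; whether that or "/" is the right result depends on the caller.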
Do you mean like this
s = s.replaceAll("/[^/]*$", "");
Or better if you are using paths
File f = new File(s);
String dir = f.getParent(); // getParent() returns a String; on Windows it handles \ as well
If you have a string that contains your character (whether a supplemental code-point or not), then you can use Pattern.quote and match the inverse charset up to the end thus:
String myCharEscaped = Pattern.quote(myCharacter);
Pattern pattern = Pattern.compile("[^" + myCharEscaped + "]*\\z");
should do it, but really you can just use lastIndexOf, as in
myString.substring(0, myString.lastIndexOf(myCharacter) + 1)
To get a code-point as a string just do
new StringBuilder().appendCodePoint(myCodePoint).toString()
Despite the answers avoiding regex, Pattern and Matcher are useful for performance (compiled patterns), and they are still pretty straightforward and worth mastering. :)
Not sure why you have "$" up front. Try either:
Matching the leading group:
String path = "/first/second/third";
String pattern = "^(.*)/"; // * = "greedy": maximum string from start to last "/"
Pattern r = Pattern.compile(pattern);
Matcher m = r.matcher(path);
if (m.find()) path = m.group(1); // group 1 excludes the trailing "/"
Stripping the tail match:
String path = "/first/second/third";
String pattern = "/[^/]*$"; // last "/" to end; a reluctant "/(.*?)$" would still start at the first "/"
Pattern r = Pattern.compile(pattern);
Matcher m = r.matcher(path);
if (m.find()) path = m.replaceAll("");

Java RegExp to get variable image name with its extension from javascript

I am trying to get the image name from the following javascript.
var g_prefetch ={'Im': {url:'\/az\/hprichbg\/rb\/WhiteTippedRose_ROW10477559674_1366x768.jpg', hash:'674'}
Problem:
The name of the image is variable. That is, in the above example code the image changes regularly.
Output I want:
WhiteTippedRose_ROW10477559674_1366x768.jpg
and i tried the following regExp :
Pattern p = Pattern.compile("\{\'Im\'\: \{url\:\'\\\/az\\\/hprichbg\\\/rb\\\/(.*?)\.jpg\'\, hash\:\'674\'\}");
//System.out.println(p);
Matcher m=p.matcher(out);
if(m.find()) {
System.out.println(m.group());
}
I don't know too much RegExp so please help me and let me understand the approach.
Thank You
I would use the following regex, it should be fast enough:
Pattern p = Pattern.compile("[^/]+\\.jpg");
Matcher m = p.matcher(str);
if (m.find()) {
String match = m.group();
System.out.println(match);
}
This will match a full sequence of characters ending with .jpg and not containing /.
I think that the correct approach is to check that the file name is legal.
Here is a list of illegal characters for Windows: "\\/:*?\"<>|"
for Mac: / and :
for Linux/Unix: /
Here is a more complex example, assuming the format will change; it is mostly designed for legal Windows file names:
String s = "{'Im': {url:'\\/az\\/hprichbg\\/rb\\/?*<>WhiteTippedRose_ROW10477559674_1366x768.jpg', hash:'674'}";
Pattern p = Pattern.compile("[^\\/:*?\"<>|]+\\.jpg");
Matcher m = p.matcher(s);
if (m.find()) {
String match = m.group();
System.out.println(match);
}
This will still print
WhiteTippedRose_ROW10477559674_1366x768.jpg
Assuming that the image is always placed after a / and does not contain any /, you can use the following:
String s = "{'Im': {url:'\\/az\\/hprichbg\\/rb\\/WhiteTippedRose_ROW10477559674_1366x768.jpg', hash:'674'}";
s = s.replaceAll(".*?([^/]*?\\.jpg).*", "$1");
System.out.println("s = " + s);
outputs:
s = WhiteTippedRose_ROW10477559674_1366x768.jpg
In substance:
.*? skip the beginning of the string until the next pattern is found
([^/]*?\\.jpg) a group like "xxx.jpg" where xxx does not contain any "/"
.* rest of the string
$1 returns the content of the group
If the String is always of this form, I would simply do:
int startIndex = s.indexOf("rb\\/") + 4;
int endIndex = s.indexOf('\'', startIndex);
String image = s.substring(startIndex, endIndex);

How do I make a regex match for measurement units?

I'm building a small Java library which has to match units in strings. For example, if I have "300000000 m/s^2", I want it to match against "m" and "s^2".
So far, I have tried most imaginable (by me) configurations resembling (I hope it's a good start)
"[[a-zA-Z]+[\\^[\\-]?[0-9]+]?]+"
To clarify, I need something that will match letters[^[-]numbers] (where [ ] denotes non obligatory parts). That means: letters, possibly followed by an exponent which is possibly negative.
I have studied regex a little bit, but I'm really not fluent, so any help will be greatly appreciated!
Thank you very much,
EDIT:
I have just tried the first 3 replies
String regex1 = "([a-zA-Z]+)(?:\\^(-?\\d+))?";
String regex2 = "[a-zA-Z]+(\\^-?[0-9]+)?";
String regex3 = "[a-zA-Z]+(?:\\^-?[0-9]+)?";
and it doesn't work... I know the code which tests the patterns works, because if I try something simple, like matching "[0-9]+" in "12345", it will match the whole string. So I don't get what's still wrong. I'm trying changing my brackets to parentheses where needed at the moment...
CODE USED TO TEST:
public static void main(String[] args) {
String input = "30000 m/s^2";
// String input = "35345";
String regex1 = "([a-zA-Z]+)(?:\\^(-?\\d+))?";
String regex2 = "[a-zA-Z]+(\\^-?[0-9]+)?";
String regex3 = "[a-zA-Z]+(?:\\^-?[0-9]+)?";
String regex10 = "[0-9]+";
String regex = "([a-zA-Z]+)(?:\\^\\-?[0-9]+)?";
Pattern pattern = Pattern.compile(regex3);
Matcher matcher = pattern.matcher(input);
if (matcher.matches()) {
System.out.println("MATCHES");
do {
int start = matcher.start();
int end = matcher.end();
// System.out.println(start + " " + end);
System.out.println(input.substring(start, end));
} while (matcher.find());
}
}
([a-zA-Z]+)(?:\^(-?\d+))?
You don't need to use the character class [...] if you're matching a single character. (...) here is a capturing group that lets you extract the unit and exponent later. (?:...) is a non-capturing group.
You're mixing up square brackets, which denote character classes, with parentheses, which group. Try this instead:
[a-zA-Z]+(\^-?[0-9]+)?
In many regular expression dialects you can use \d to mean any digit instead of [0-9].
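One likely reason the test code in the question prints nothing with these patterns: Matcher.matches() succeeds only if the entire input matches, and "30000 m/s^2" begins with digits and a space. Scanning with find() in a loop picks out each unit instead. The following is a sketch under that assumption, and the class and method names are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UnitScanner {
    // group(1) = unit letters, group(2) = optional exponent (null when absent)
    private static final Pattern UNIT = Pattern.compile("([a-zA-Z]+)(?:\\^(-?\\d+))?");

    /** Collects every unit token, such as "m" or "s^2", found anywhere in the input. */
    static List<String> findUnits(String input) {
        List<String> units = new ArrayList<>();
        Matcher m = UNIT.matcher(input);
        while (m.find()) { // find() scans for the next match; matches() needs the whole input
            units.add(m.group());
        }
        return units;
    }

    public static void main(String[] args) {
        System.out.println(findUnits("30000 m/s^2")); // prints [m, s^2]
    }
}
```

Inside the loop, m.group(1) and m.group(2) give the unit and the exponent separately if the library needs them as distinct values.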
Try
"[a-zA-Z]+(?:\\^-?[0-9]+)?"
