Java Regex: Remove Everything After Last Instance Of Character

Say I have a string:
/first/second/third
And I want to remove everything after the last instance of / so I would end up with:
/first/second
What regular expression would I use? I've tried:
String path = "/first/second/third";
String pattern = "$(.*?)/";
Pattern r = Pattern.compile(pattern2);
Matcher m = r.matcher(path);
if(m.find()) path = m.replaceAll("");

Why use a regex at all here? Look for the last / character with lastIndexOf. If it's found, then use substring to extract everything before it.
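
The lastIndexOf suggestion above might look like the following minimal sketch. The class and method names (LastSlash, stripAfterLast) are made up for illustration, and returning the input unchanged when the character is absent is an assumption about the desired behaviour.

```java
final class LastSlash {
    // Returns the part of s before the last occurrence of ch,
    // or s unchanged when ch does not occur (an assumed policy).
    static String stripAfterLast(String s, char ch) {
        int idx = s.lastIndexOf(ch);
        return idx >= 0 ? s.substring(0, idx) : s;
    }

    public static void main(String[] args) {
        System.out.println(stripAfterLast("/first/second/third", '/')); // prints /first/second
    }
}
```

No Pattern is compiled here, so there is nothing to escape if the character happens to be a regex metacharacter.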

Do you mean like this?
s = s.replaceAll("/[^/]*$", "");
Or, better, if you are working with paths:
File f = new File(s);
File dir = f.getParentFile(); // getParent() returns a String; on Windows, \ separators work as well

If you have a string that contains your character (whether a supplementary code point or not), then you can use Pattern.quote and match the inverse character class up to the end, thus:
String myCharEscaped = Pattern.quote(myCharacter);
Pattern pattern = Pattern.compile("[^" + myCharEscaped + "]*\\z");
That should do it, but really you can just use lastIndexOf, as in
myString.substring(0, myString.lastIndexOf(myCharacter) + myCharacter.length())
To get a code-point as a string just do
new StringBuilder().appendCodePoint(myCodePoint).toString()
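
As a rough illustration of the lastIndexOf route for an arbitrary code point, here is a small sketch. The names CodePointCut and keepThroughLast are hypothetical, and keeping the matched character itself (as the substring call in this answer does) is an assumption about what is wanted.

```java
final class CodePointCut {
    // Keeps everything up to and including the last occurrence of codePoint;
    // returns s unchanged when it does not occur (an assumed policy).
    static String keepThroughLast(String s, int codePoint) {
        // String.lastIndexOf(int) also accepts supplementary code points
        int idx = s.lastIndexOf(codePoint);
        return idx >= 0 ? s.substring(0, idx + Character.charCount(codePoint)) : s;
    }

    public static void main(String[] args) {
        int smiley = 0x1F600; // outside the Basic Multilingual Plane, so two chars in a String
        String mark = new StringBuilder().appendCodePoint(smiley).toString();
        String input = "a" + mark + "b" + mark + "c";
        System.out.println(keepThroughLast(input, smiley).equals("a" + mark + "b" + mark)); // prints true
    }
}
```

Character.charCount is used so that a surrogate pair is never cut in half.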

Despite the answers avoiding regex, Pattern and Matcher are useful for performance (compiled patterns), and they're still pretty straightforward and worth mastering. :)
Not sure why you have "$" up front. Try either:
Matching starting group:
String path = "/first/second/third";
String pattern = "^(.*)/"; // * = "greedy": maximum string from start to last "/"
Pattern r = Pattern.compile(pattern);
Matcher m = r.matcher(path);
if (m.find()) path = m.group(1); // group 1 leaves out the trailing "/"
Stripping tail match:
String path = "/first/second/third";
String pattern = "/[^/]*$"; // the last "/" plus every non-"/" character up to the end
Pattern r = Pattern.compile(pattern);
Matcher m = r.matcher(path);
if (m.find()) path = m.replaceFirst("");

Related

No match for Java Regular Expression

I am running into an issue where my code is unable to find regex occurrences. Code:
String content = "This\ is\ an\ example.=This is an example\nThis\ is\ second\:=This is second"
String regex = "\"^.*(?=\\=)\"gm";
Pattern p = Pattern.compile(regex);
Matcher m = p.matcher(content);
List<String> mKeys = new ArrayList<>();
while (m.find()) {
mKeys.add(m.group());
}
mKeys turns out to be empty. I have already validated my regex here https://regex101.com/r/YResRc/3. I am expecting the list to contain two keys from the content.

Your content contains no " quotes, and no text gm, so why would you expect that regex to match?
FYI: Syntaxes like "foo"gm or /foo/gm are something other languages do for regex literals. Java doesn't do that.
The g flag is implied by the fact that you're using a find() loop, and m is the MULTILINE flag that affects ^ and $ and you can specify that using the (?m) pattern, or by adding a second parameter to compile(), i.e. one of these ways:
Pattern p = Pattern.compile("foo", Pattern.MULTILINE);
Pattern p = Pattern.compile("(?m)foo");
Your regex should simply be:
(?m)^.*(?==)
which means: Match everything from the beginning of a line up to the last = sign on the line.
Test
String content = "This is an example.=This is an example\nThis is second:=This is second";
String regex = "(?m)^.*(?==)";
Matcher m = Pattern.compile(regex).matcher(content);
List<String> mKeys = new ArrayList<>();
while (m.find()) {
mKeys.add(m.group());
}
System.out.println(mKeys);
Output
[This is an example., This is second:]
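
The two ways of turning on MULTILINE mentioned in this answer can be compared side by side. The following is only a sketch; the class and helper names (MultilineKeys, findAll) are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class MultilineKeys {
    // Collects every match of the given compiled pattern in content.
    static List<String> findAll(Pattern p, String content) {
        List<String> out = new ArrayList<>();
        Matcher m = p.matcher(content);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        String content = "This is an example.=This is an example\nThis is second:=This is second";
        Pattern viaFlag = Pattern.compile("^.*(?==)", Pattern.MULTILINE); // flag as second argument
        Pattern viaInline = Pattern.compile("(?m)^.*(?==)");              // flag embedded in the pattern
        System.out.println(findAll(viaFlag, content));
        System.out.println(findAll(viaInline, content));
    }
}
```

Both calls should print the same two keys. Without MULTILINE, ^ matches only at the start of the input, so only the first key would be found.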

Pattern matching with string containing dots

Pattern is:
private static Pattern r = Pattern.compile("(.*\\..*\\..*)\\..*");
String is:
sentVersion = "1.1.38.24.7";
I do:
Matcher m = r.matcher(sentVersion);
if (m.find()) {
guessedClientVersion = m.group(1);
}
I expect 1.1.38 but the pattern match fails. If I change to Pattern.compile("(.*\\..*\\..*)\\.*");
// notice I remove the "." before the last *
then 1.1.38.XXX fails
My goal is to find (x.x.x) in any incoming string.
Where am I wrong?

Problem is probably due to greedy-ness of your regex. Try this negation based regex pattern:
private static Pattern r = Pattern.compile("([^.]*\\.[^.]*\\.[^.]*)\\..*");
Online Demo: http://regex101.com/r/sJ5rD4

Make your .* matches reluctant with ?
Pattern r = Pattern.compile("(.*?\\..*?\\..*?)\\..*");
Otherwise each greedy .* grabs as much of the String as it can, so the group ends up as 1.1.38.24 instead of 1.1.38.
See here: http://regex101.com/r/lM2lD5
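
To see the difference between the greedy pattern from the question and the two fixes suggested in the answers, the sketch below runs all three against the same version string. The class and helper names (VersionPrefix, firstGroup) are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class VersionPrefix {
    // Returns group 1 of the first match, or null when the pattern does not match.
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String v = "1.1.38.24.7";
        System.out.println(firstGroup("(.*\\..*\\..*)\\..*", v));          // greedy
        System.out.println(firstGroup("(.*?\\..*?\\..*?)\\..*", v));       // reluctant
        System.out.println(firstGroup("([^.]*\\.[^.]*\\.[^.]*)\\..*", v)); // negated class
    }
}
```

The greedy version yields 1.1.38.24, while the reluctant and negated-class versions both yield 1.1.38. Note that all three patterns require a dot after the third component, so an input of just "1.1.38" does not match any of them.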

Java RegExp to get variable image name with its extension from javascript

I am trying to get the image name from the following javascript.
var g_prefetch ={'Im': {url:'\/az\/hprichbg\/rb\/WhiteTippedRose_ROW10477559674_1366x768.jpg', hash:'674'}
Problem:
The name of the image is variable. That is, in the above example code the image changes regularly.
Output I want:
WhiteTippedRose_ROW10477559674_1366x768.jpg
and i tried the following regExp :
Pattern p = Pattern.compile("\{\'Im\'\: \{url\:\'\\\/az\\\/hprichbg\\\/rb\\\/(.*?)\.jpg\'\, hash\:\'674\'\}");
//System.out.println(p);
Matcher m=p.matcher(out);
if(m.find()) {
System.out.println(m.group());
}
I don't know too much RegExp so please help me and let me understand the approach.
Thank You

I would use the following regex, it should be fast enough:
Pattern p = Pattern.compile("[^/]+\\.jpg");
Matcher m = p.matcher(str);
if (m.find()) {
String match = m.group();
System.out.println(match);
}
This will match a full sequence of characters ending with .jpg that does not include /.
I think the correct approach is to check that the file name is legal.
Characters that are not legal in file names:
Windows: \ / : * ? " < > |
Mac: / and :
Linux/Unix: /
Here is a more complex example, assuming the format will change; it is mostly designed for legal Windows file names:
String s = "{'Im': {url:'\\/az\\/hprichbg\\/rb\\/?*<>WhiteTippedRose_ROW10477559674_1366x768.jpg', hash:'674'}";
Pattern p = Pattern.compile("[^\\/:*?\"<>|]+\\.jpg");
Matcher m = p.matcher(s);
if (m.find()) {
String match = m.group();
System.out.println(match);
}
This will still print
WhiteTippedRose_ROW10477559674_1366x768.jpg

Assuming that the image is always placed after a / and does not contain any /, you can use the following:
String s = "{'Im': {url:'\\/az\\/hprichbg\\/rb\\/WhiteTippedRose_ROW10477559674_1366x768.jpg', hash:'674'}";
s = s.replaceAll(".*?([^/]*?\\.jpg).*", "$1");
System.out.println("s = " + s);
outputs:
s = WhiteTippedRose_ROW10477559674_1366x768.jpg
In substance:
.*? skip the beginning of the string until the next pattern is found
([^/]*?\\.jpg) a group like "xxx.jpg" where xxx does not contain any "/"
.* rest of the string
$1 returns the content of the group

If the String is always of this form, I would simply do:
int startIndex = s.indexOf("rb\\/") + 4;
int endIndex = s.indexOf('\'', startIndex);
String image = s.substring(startIndex, endIndex);

Replacing Pattern Matches in a String

String output = "";
pattern = Pattern.compile(">Part\s.");
matcher = pattern.matcher(docToProcess);
while (matcher.find()) {
match = matcher.group();
}
I'm trying to use the above code to find the pattern >Part\s. inside docToProcess (Which is a string of a large xml document) and then what I want to do is replace the content that matches the pattern with <ref></ref>
Any ideas how I can make the output variable equal to docToProcess except with the replacements as indicated above?
EDIT: I need to use the matcher somehow when replacing. I can't just use replaceAll()

You can use String#replaceAll method. It takes a Regex as first parameter: -
String output = docToProcess.replaceAll(">Part\\s\\.", "<ref></ref>");
Note that dot (.) is a special metacharacter in regex that matches any character (except line terminators, by default), not just a literal dot. So you need to escape it, unless you really want to match any character after >Part\\s. In a Java string literal the escape takes two backslashes (\\.).
If you want to use the Matcher class, then you can use the Matcher.appendReplacement method: -
String docToProcess = "XYZ>Part .asdf";
Pattern p = Pattern.compile(">Part\\s\\.");
Matcher m = p.matcher(docToProcess);
StringBuffer sb = new StringBuffer();
while (m.find()) {
m.appendReplacement(sb, "<ref></ref>");
}
m.appendTail(sb);
System.out.println(sb.toString());
OUTPUT : -
XYZ<ref></ref>asdf

This is what you need:
String docToProcess = "... your xml here ...";
Pattern pattern = Pattern.compile(">Part\\s.");
Matcher matcher = pattern.matcher(docToProcess);
StringBuffer output = new StringBuffer();
while (matcher.find()) matcher.appendReplacement(output, "<ref></ref>");
matcher.appendTail(output);
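Unfortunately, before Java 9 you can't use StringBuilder here due to historical constraints on the Java API; Java 9 added appendReplacement and appendTail overloads that accept a StringBuilder.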
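
On Java 9 or later, the same Matcher-driven loop can be written with a StringBuilder. This is a sketch rather than a drop-in replacement: the names RefReplacer and replaceParts are made up, and the pattern assumes the dot is meant literally (escaped), which the question does not confirm.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RefReplacer {
    // Assumption: the trailing dot is a literal dot, hence the escape.
    private static final Pattern PART = Pattern.compile(">Part\\s\\.");

    // Replaces every match with <ref></ref>, driving the Matcher by hand.
    // Requires Java 9+ for the StringBuilder overloads.
    static String replaceParts(String doc) {
        Matcher m = PART.matcher(doc);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            // quoteReplacement keeps '$' and '\' from being read as group references
            m.appendReplacement(out, Matcher.quoteReplacement("<ref></ref>"));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(replaceParts("XYZ>Part .asdf")); // prints XYZ<ref></ref>asdf
    }
}
```

Inside the loop the current match is still available through m.group(), which is the reason to drive the Matcher manually instead of calling replaceAll.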

docToProcess.replaceAll(">Part\\s[.]", "<ref></ref>");

String output = docToProcess.replaceAll(">Part\\s\\.", "<ref></ref>");

regex command(remove everything but specified txt)

Does anyone out there know of a regex command that will take the following string
url = http://184.154.145.114:8013/wlraac name = wlr samplerate = 44100 channels = 2 format = S16le
and remove everything but the following
wlr
This line will come up multiple times; everything after each = sign changes, and each time all I want to keep is what's after name =
Any help is appreciated.

You could do something like
.*name =\s*(\w+).*
and replace with the content of group 1
I search for "name =" and anything before it. The \s* matches the whitespace that follows.
Then comes the \w+ inside parentheses. \w matches any letter, digit, or underscore (Unicode-aware if you use the option Pattern.UNICODE_CHARACTER_CLASS; otherwise it sticks to ASCII only). Because of the parentheses, the match is stored in the first group.
String in = " url = http://184.154.145.114:8013/wlraac name = wlr samplerate = 44100 channels = 2 format = S16le";
Pattern r = Pattern.compile(".*name =\\s*(\\w+).*");
Matcher m = r.matcher(in);
String result = m.replaceAll("$1");
System.out.println(result);
Or, adapted to your code:
String str = line2.replaceAll(".*name =\\s*(\\w+).*", "$1");

From your description it's a little hard to understand what you need.
But regex is overkill. You should use something like:
String s = myString.substring(myString.indexOf("name =") + 6); // everything after "name =", not yet cut down to the first token
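
To end up with only wlr, the substring still has to be cut at the next whitespace. One possible sketch, with hypothetical names (NameField, extractName) and an assumed policy of returning null when the marker is missing:

```java
final class NameField {
    // Returns the whitespace-delimited token that follows "name =",
    // or null when the marker is not present (an assumed policy).
    static String extractName(String line) {
        String marker = "name =";
        int start = line.indexOf(marker);
        if (start < 0) {
            return null;
        }
        start += marker.length();
        // Skip the whitespace between "=" and the value.
        while (start < line.length() && Character.isWhitespace(line.charAt(start))) {
            start++;
        }
        // The value runs until the next whitespace or the end of the line.
        int end = start;
        while (end < line.length() && !Character.isWhitespace(line.charAt(end))) {
            end++;
        }
        return line.substring(start, end);
    }

    public static void main(String[] args) {
        String line = "url = http://184.154.145.114:8013/wlraac name = wlr samplerate = 44100 channels = 2 format = S16le";
        System.out.println(extractName(line)); // prints wlr
    }
}
```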

I'd recommend extracting the word that appears after name =, i.e.
Pattern p = Pattern.compile("name\\s*=\\s*(\\S+)");
Matcher m = p.matcher(str);
if (m.find()) {
String value = m.group(1); // contains your wlr
...............
}
