Exclude/remove a string between two special characters using regex - java

I am trying to remove the substring between a - and the following / (the <branch prefix>).
Example:
String name = "Application-2.0.2-bug/TEST-1.0.0.zip";
Expected output:
Application-2.0.2-TEST-1.0.0.zip
I tried the regex below, but it does not give the right result.
String FILENAME = "2.2.1-Application-2.0.2-bug/TEST-1.0.0.zip";
println(FILENAME.replaceAll(".+/", ""))

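There are many ways to do this. For example, you can replace \w+\/ with "". Here \w+ matches one or more word characters, so the pattern removes the run of word characters directly before the slash, together with the slash itself.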
Demo:
public class Main {
public static void main(String[] args) {
String FILENAME = "Application-2.0.2-bug/TEST-1.0.0.zip";
FILENAME = FILENAME.replaceAll("\\w+\\/", "");
System.out.println(FILENAME);
}
}
Output:
Application-2.0.2-TEST-1.0.0.zip
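The attempt with ".+/" in the question fails because .+ is greedy: it consumes everything up to the last slash, so the leading Application-2.0.2- part is removed as well. A minimal sketch of a narrower pattern follows, assuming the branch prefix itself contains no - or / character. The class and method names are made up for illustration.

```java
class BranchPrefixRemover {
    // Removes a run of characters that contains neither '-' nor '/'
    // and is immediately followed by '/', together with that '/'.
    // Assumption: the branch prefix never contains '-' or '/'.
    static String removeBranchPrefix(String fileName) {
        return fileName.replaceAll("[^-/]+/", "");
    }

    public static void main(String[] args) {
        System.out.println(removeBranchPrefix("Application-2.0.2-bug/TEST-1.0.0.zip"));
        System.out.println(removeBranchPrefix("2.2.1-Application-2.0.2-bug/TEST-1.0.0.zip"));
    }
}
```

Unlike \w+, the class [^-/] also covers prefixes that contain dots or other punctuation.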

Related

How to extract number suffix from a filename

In Java I have a filename example ABC.12.txt.gz, I want to extract number 12 from the filename. Currently I am using last index method and extracting substring multiple times.
You could try using pattern matching
import java.util.regex.Pattern;
import java.util.regex.Matcher;
// ... Other features
String fileName = "..."; // Filename with number extension
Pattern pattern = Pattern.compile("^\\D*(\\d+).*$"); // \D* skips the leading non-digits, so group 1 captures the whole first number (a greedy .* would leave only the last digit)
// Then try matching
Matcher matcher = pattern.matcher(fileName);
String numberExt = "";
if(matcher.matches()) {
numberExt = matcher.group(1);
} else {
// The filename has no numeric value in it.
}
// Use your numberExt here.
You can just separate every numeric part from alphanumeric ones by using a regular expression:
public static void main(String args[]) {
String str = "ABC.12.txt.gz";
String[] parts = str.split("(?<=\\D)(?=\\d)|(?<=\\d)(?=\\D)");
// view the resulting parts
for (String s : parts) {
System.out.println(s);
}
// do what you want with those values...
}
This will output
ABC.
12
.txt.gz
Then take the parts you need and do what you have to do with them.
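We can use something like this to extract the number from a string. It strips every non-digit character, so it only gives the expected result when the name contains a single number: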
String fileName="ABC.12.txt.gz";
String numberOnly= fileName.replaceAll("[^0-9]", "");
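If the filename may contain more than one number, stripping non-digits concatenates them. A minimal sketch of an alternative that returns only the first run of digits, using Matcher.find(); the class and method names are hypothetical, and "first number" is an assumption about what the asker wants:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FirstNumberExtractor {
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // Returns the first run of digits in the name, or "" if there is none.
    static String firstNumber(String fileName) {
        Matcher m = DIGITS.matcher(fileName);
        return m.find() ? m.group() : "";
    }

    public static void main(String[] args) {
        System.out.println(firstNumber("ABC.12.txt.gz"));                 // 12
        System.out.println(firstNumber("v2.ABC.12.txt.gz"));              // 2
        // For comparison: stripping non-digits merges both numbers.
        System.out.println("v2.ABC.12.txt.gz".replaceAll("[^0-9]", ""));  // 212
    }
}
```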

Remove all escape symbols from String svg JAVA

I am using this regex to remove all escape symbols from my SVG string: .replaceAll("\\{", "{"). I tested it in a simple main method like this and it works fine:
System.out.println("<svg xmlns:xlink=\"http://www.w3.org/1999/xlink\" version=\"1.1\" class=\"highcharts-root\" style=\"font-family:"lucida grande", "lucida sans unicode", arial, helvetica, sans-serif;font-size:12px;\" xmlns=\"http://www.w3.org/2000/svg\" width=\"600\" height=\"350\"><desc>Created"
+ " with Highcharts 5.0.7</desc><defs><clipPath id=\"highcharts-lqrco8y-45\"><rect x=\"0\" y=\"0\" width=\"580\" height=".replaceAll("\\{", "{"));
When I tried to use this in my code, there is no exception, but the replaceAll call does not seem to work.
@RequestMapping(value = URL, method = RequestMethod.POST)
public String svg(@RequestBody String svg) throws TranscoderException, IOException {
String result = svg;
String passStr = (String) result.subSequence(5, result.length() - 2);
passStr = passStr.replaceAll("\\{", "{");
InputStream is = new ByteArrayInputStream(Charset.forName("UTF-8").encode(passStr).array());
service.converter(is);
return result;
}
Try it this way:
public static void main(String[] args) {
String test = "abc\\{}def";
System.out.println("before: " + test);
System.out.println("after: " + test.replaceAll("\\\\[{]", "{"));
}
Output
before: abc\{}def
after: abc{}def
Your first example doesn't have any "{" characters, so I'm not surprised it works (??).
But anyway, your regex is wrong. Backslash is an escape character in both Java strings and regular expressions. So \\ in a string literal just means \. That means your regex is actually just \{, which matches a plain {. So all you are really doing is replacing { with {.
If you want a regex that replaces \{ with {, you need to double each backslash in the string literal and keep the brace escaped as well, because Java's regex engine rejects a bare { as an illegal repetition: \\\\\\{.
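To make the escaping levels concrete, here is a small sketch that removes the backslash in front of { in two ways: with String.replace, which takes literal text and involves no regex, and with replaceAll using a fully escaped pattern. The class and method names are made up for illustration.

```java
class BraceUnescaper {
    // Literal replacement: no regex is involved, so only Java string
    // escaping applies. "\\{" is the two characters \ and {.
    static String unescapeLiteral(String s) {
        return s.replace("\\{", "{");
    }

    // Regex replacement: the pattern text is an escaped backslash
    // followed by an escaped brace, written with six backslashes in Java.
    static String unescapeRegex(String s) {
        return s.replaceAll("\\\\\\{", "{");
    }

    public static void main(String[] args) {
        String test = "abc\\{}def"; // the characters abc\{}def
        System.out.println(unescapeLiteral(test));
        System.out.println(unescapeRegex(test));
    }
}
```

When no pattern matching is needed, String.replace is usually the simpler choice because it avoids the second level of escaping.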

Java replaceALL for string

I have a string:
100-200-300-400
I want to replace each dash with "," and add single quotes so it becomes:
'100','200','300','400'
My current code is only able to replace "-" with ",". How can I add the single quotes?
String str1 = "100-200-300-400";
String split = str1.replaceAll("-", ",");
if (split.endsWith(","))
{
split = split.substring(0, split.length()-1);
}
You can use:
split = str1.replaceAll("-", "','");
split = "'" + split + "'";
As an alternative, if you are using Java 1.8, you could split the String on - and feed the parts to a StringJoiner. This is a bit less time-efficient, but it is safer if you have to account for, say, a trailing -.
A small sample could look like this:
String string = "100-200-300-400-";
String[] splittet = string.split("-");
StringJoiner joiner = new StringJoiner("','", "'", "'");
for(String s : splittet) {
joiner.add(s);
}
System.out.println(joiner);
This will work for you :
public static void main(String[] args) throws Exception {
String s = "100-200-300-400";
System.out.println(s.replaceAll("(\\d+)(-|$)", "'$1',").replaceAll(",$", ""));
}
O/P :
'100','200','300','400'
Or, if you don't want to use replaceAll() twice:
public static void main(String[] args) throws Exception {
String s = "100-200-300-400";
s = s.replaceAll("(\\d+)(-|$)", "'$1',");
System.out.println(s.substring(0, s.length()-1));
}
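The split-and-join idea can also be written with Java 8 streams. This is a sketch rather than a drop-in replacement: the class and method names are hypothetical, and skipping empty parts (so that doubled dashes do not produce '' entries) is an assumption about the desired behavior.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

class QuoteJoiner {
    // Splits on '-', wraps each non-empty part in single quotes,
    // and joins the quoted parts with commas.
    static String quoteAll(String dashSeparated) {
        return Arrays.stream(dashSeparated.split("-"))
                .filter(part -> !part.isEmpty())
                .map(part -> "'" + part + "'")
                .collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        System.out.println(quoteAll("100-200-300-400"));
        System.out.println(quoteAll("100-200-300-400-"));
    }
}
```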

complex regular expression in Java

I have a rather complex (to me it seems rather complex) problem that I'm using regular expressions in Java for:
I can get any text string that must be of the format:
M:<some text>:D:<either a url or string>:C:<some more text>:Q:<a number>
I started with a regular expression for extracting the text between the M:/:D:/:C:/:Q: as:
String pattern2 = "(M:|:D:|:C:|:Q:.*?)([a-zA-Z_\\.0-9]+)";
And that works fine if the <either a url or string> is just an alphanumeric string. But it all falls apart when the embedded string is a url of the format:
tcp://someurl.something:port
Can anyone help me adjust the above regex so that the text after :D: can be either a URL or an alphanumeric string?
Here's an example:
public static void main(String[] args) {
String name = "M:myString1:D:tcp://someurl.com:8989:C:myString2:Q:1";
boolean matchFound = false;
ArrayList<String> values = new ArrayList<>();
String pattern2 = "(M:|:D:|:C:|:Q:.*?)([a-zA-Z_\\.0-9]+)";
Matcher m3 = Pattern.compile(pattern2).matcher(name);
while (m3.find()) {
matchFound = true;
String m = m3.group(2);
System.out.println("regex found match: " + m);
values.add(m);
}
}
In the above example, my results would be:
myString1
tcp://someurl.com:8989
myString2
1
Note that the strings can be of variable length and are alphanumeric, but some other characters are allowed too (such as the :// of a URL and the . and - characters).
You mention that the format is constant:
M:<some text>:D:<either a url or string>:C:<some more text>:Q:<a number>
Capture groups can do this for you with the pattern:
"M:(.*):D:(.*):C:(.*):Q:(.*)"
Or you can do a String.split() with a pattern of "M:|:D:|:C:|:Q:". However, the split will return an empty element at the first index. Everything else will follow.
public static void main(String[] args) throws Exception {
System.out.println("Regex: ");
String data = "M:<some text>:D:tcp://someurl.something:port:C:<some more text>:Q:<a number>";
Matcher matcher = Pattern.compile("M:(.*):D:(.*):C:(.*):Q:(.*)").matcher(data);
if (matcher.matches()) {
for (int i = 1; i <= matcher.groupCount(); i++) {
System.out.println(matcher.group(i));
}
}
System.out.println();
System.out.println("String.split(): ");
String[] pieces = data.split("M:|:D:|:C:|:Q:");
for (String piece : pieces) {
System.out.println(piece);
}
}
Results:
Regex:
<some text>
tcp://someurl.something:port
<some more text>
<a number>
String.split():
<some text>
tcp://someurl.something:port
<some more text>
<a number>
To extract the URL/text part you don't need the regular expression. Use
int startPos = input.indexOf(":D:")+":D:".length();
int endPos = input.indexOf(":C:", startPos);
String urlOrText = input.substring(startPos, endPos);
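The indexOf snippet assumes that both markers are present. If :D: is missing, indexOf returns -1 and startPos silently becomes 2. A minimal sketch with explicit checks follows; the class name, the method name, and the choice of Optional as the return type are illustrative assumptions.

```java
import java.util.Optional;

class DFieldExtractor {
    // Returns the text between ":D:" and the next ":C:",
    // or an empty Optional if either marker is missing.
    static Optional<String> extractD(String input) {
        int marker = input.indexOf(":D:");
        if (marker < 0) {
            return Optional.empty();
        }
        int startPos = marker + ":D:".length();
        int endPos = input.indexOf(":C:", startPos);
        if (endPos < 0) {
            return Optional.empty();
        }
        return Optional.of(input.substring(startPos, endPos));
    }

    public static void main(String[] args) {
        String name = "M:myString1:D:tcp://someurl.com:8989:C:myString2:Q:1";
        System.out.println(extractD(name).orElse("<not found>"));
    }
}
```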
Assuming you need to do some validation along with the parsing:
break the regex into different parts like this:
String m_regex = "[\\w.]+"; // in Java a . inside [] is just a plain dot
String url_regex = "."; // placeholder: there are plenty of URL regexes online, pick your favorite
String d_regex = "(?:" + url_regex + "|\\p{Alnum}+)"; // a URL or a sequence of alphanumeric characters
String c_regex = "[\\w.]+"; // I'm assuming you want this to be a bit more restrictive, not sure
String q_regex = "\\d+"; // what sort of number exactly? assuming any string of digits here
// each named group needs its own unique name, otherwise Pattern.compile throws
String regex = "M:(?<M>" + m_regex + "):"
+ "D:(?<D>" + d_regex + "):"
+ "C:(?<C>" + c_regex + "):"
+ "Q:(?<Q>" + q_regex + ")";
Pattern p = Pattern.compile(regex);
It might be a good idea to keep the compiled pattern in a static field and build it in a static block, so that the temporary regex strings don't clutter the class with fields that are useless once the pattern is compiled.
Then you can retrieve each part by its name:
Matcher m = p.matcher( input );
if (m.matches()) {
String m_part = m.group( "M" );
...
String q_part = m.group( "Q" );
}
You can go a step further and introduce a RegexGroup interface whose implementations each represent one part of the regex, with a name and the actual pattern. You lose simplicity that way, though, which makes the code harder to understand at a quick glance. (I wouldn't do this; I'm just pointing out that it's possible and has its own benefits.)
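Putting the named-group approach together, a self-contained sketch might look like the following. The URL sub-pattern (scheme://host with an optional :port) is a deliberately simple assumption rather than a general URL validator, and the class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MessageParser {
    // Assumption: a URL looks like scheme://host with an optional :port.
    private static final String URL_REGEX = "\\w+://[\\w.-]+(?::\\d+)?";

    // Compiled once and reused; each part has its own group name.
    private static final Pattern MESSAGE = Pattern.compile(
            "M:(?<M>[\\w.]+):"
          + "D:(?<D>" + URL_REGEX + "|\\p{Alnum}+):"
          + "C:(?<C>[\\w.]+):"
          + "Q:(?<Q>\\d+)");

    // Returns the four parts in order M, D, C, Q, or null if the
    // input does not match the expected format.
    static String[] parse(String input) {
        Matcher m = MESSAGE.matcher(input);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group("M"), m.group("D"), m.group("C"), m.group("Q") };
    }

    public static void main(String[] args) {
        String name = "M:myString1:D:tcp://someurl.com:8989:C:myString2:Q:1";
        for (String part : parse(name)) {
            System.out.println(part);
        }
    }
}
```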

Remove all words starting with "http" in a string?

I simply want to remove all words in a string that start with "http" and end with a space or "\n".
An example string is:
Full results below;
http://www.google.com/abc.jpg is a url of an image.
or some time it comes like https://www.youtube.com/watch?v=9Xwhatever this is an example text
Result of the string should be like
is a url of an image.
or some time it comes like this is an example text
I simply want to replace each such word with "". I know the logic, but I don't know which function to use.
My logic is:
string.startwith("http","\n")// starts with http and ends on next line or space
.replaceAll("")
public static void main(String[] args) {
String s = "https://www.google.com/abc.jpg is a url of an image.";
System.out.println(s.replaceAll("https?://.*?\\s+", ""));
}
O/P :
is a url of an image.
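The pattern above only matches when whitespace follows the URL, so a URL at the very end of the input is left in place. A minimal variant that matches the URL as a run of non-whitespace characters and then removes at most one trailing whitespace character could look like this (the class and method names are made up for illustration):

```java
class UrlStripper {
    // Removes every word starting with http:// or https://,
    // plus one whitespace character after it if there is one.
    static String stripUrls(String text) {
        return text.replaceAll("https?://\\S+\\s?", "");
    }

    public static void main(String[] args) {
        System.out.println(stripUrls("http://www.google.com/abc.jpg is a url of an image."));
        System.out.println(stripUrls(
                "or some time it comes like https://www.youtube.com/watch?v=9Xwhatever this is an example text"));
    }
}
```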
String.replaceAll() allows you to use a regex. In a regex, ^ anchors the match to the beginning of the String. Hence, you can do this:
System.out.print("http://google-http".replaceAll("^http", ""));
result:
://google-http
The http at the beginning has been removed, but not the one at the end.
public static void main(String[] args) {
String str = "https://www.google.com/abc.jpg is a url of an image.";
String subStr1 = "http://";
String substr2 = "https://";
String foundStr = "";
if(str.startsWith(subStr1)) {
foundStr = subStr1;
}
if (str.startsWith(subStr2)) {
foundStr = subStr2;
}
str = str.replaceAll(foundStr, "");
str = str.replaceAll(" ", "");
}
