I have a String xxxxxxxxsrc="/slm/attachment/63338424306/Note.jpg"xxxxxxxx. I want to extract the substrings slm/attachment/63338424306/Note.jpg and Note.jpg from it into two variables, temp1 and temp2.
How can I do that using regex in Java?
Note: 63338424306 could be any number, and Note.jpg could be any file name,
such as Note.png or abc.jpg or xxxx.yyy.
Please help me to extract these two strings using regex.
You can use a negative lookbehind to get the file name:
((?:.(?<!/))+)\"
and the regex below to get the full path:
/(.*)\"
Sample code:
public static void main(String[] args) {
    Pattern pattern = Pattern.compile("/(.*)\"");
    Pattern pattern1 = Pattern.compile("((?:.(?<!/))+)\"");
    String matchString = "/slm/attachment/63338424306/Note.jpg\"xxxxxxxx";
    Matcher matcher = pattern.matcher(matchString);
    String fullString = "";
    while (matcher.find()) {
        fullString = matcher.group(1);
    }
    matcher = pattern1.matcher(matchString);
    String fileName = "";
    while (matcher.find()) {
        fileName = matcher.group(1);
    }
    System.out.println(fullString + " " + fileName);
}
As per your comment, I am taking the string as declared in my code below.
Please clarify if your input string is different or if I'm missing something.
public static void main(String[] args) {
    String str = "xxxxxxxxsrc=\"/slm/attachment/63338424306/Note.jpg\"xxxxxxxx";
    String url = null;
    // The pattern below grabs the string between quotes
    Pattern p = Pattern.compile("\"([^\"]*)\"");
    Matcher m = p.matcher(str);
    while (m.find()) {
        System.out.println(m.group(1));
        url = m.group(1);
    }
    // and this one grabs the file name from the path (url)
    p = Pattern.compile("(?:.(?<!/))+$");
    m = p.matcher(url);
    while (m.find()) {
        System.out.println(m.group());
    }
}
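Both values can also be pulled out in a single pass with two nested capturing groups. The following is a minimal sketch, assuming the path always sits inside a src="..." attribute and starts with a slash; the class and method names (SrcPathExtractor, extract) are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SrcPathExtractor {
    // Group 1: the path without its leading slash.
    // Group 2: the last path segment (the file name).
    private static final Pattern SRC =
            Pattern.compile("src=\"/((?:[^\"/]+/)*([^\"/]+))\"");

    // Returns {path, fileName}, or null when no src="..." attribute is found.
    static String[] extract(String input) {
        Matcher m = SRC.matcher(input);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        String str = "xxxxxxxxsrc=\"/slm/attachment/63338424306/Note.jpg\"xxxxxxxx";
        String[] parts = extract(str);
        String temp1 = parts[0];
        String temp2 = parts[1];
        System.out.println(temp1);
        System.out.println(temp2);
    }
}
```

Because the character classes exclude both the quote and the slash, the match cannot run past the closing quote, which the greedy /(.*)\" could do if the input contained a second quote later on.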
In the following examples, I want to extract "Mywebsite.xx". How do I do it?
Search Mywebsite.de ----> Mywebsite.de
Mywebsite.de durchsuchen ----> Mywebsite.de
Search Mywebsite.co.uk ----> Mywebsite.co.uk
Mywebsite.co.uk something ----> Mywebsite.co.uk
I tried this but it's not working:
String mydata2 = "Mywebsite.de durchsuchen";
Matcher matcher = Pattern.compile("Mywebsite(.*?)").matcher(mydata2);
if (matcher.find())
{
    System.out.println(matcher.group(1));
}
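You can use the regex Mywebsite\.([a-z]+(?:\.[a-z]+)?), which accepts one or two domain labels after the name. Remember to call find() before reading the group:
public static void extractDomain(String domain) {
    // backslashes must be doubled inside a Java string literal
    Pattern domainPattern = Pattern.compile("Mywebsite\\.([a-z]+(?:\\.[a-z]+)?)");
    Matcher match = domainPattern.matcher(domain);
    if (match.find()) {
        System.out.println("Mywebsite." + match.group(1));
    }
}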
You can try this pattern against an array of possible input strings. The first four strings match; note that the matched text includes the surrounding whitespace, if there is any.
String patternStr = "(\\s|^)mywebsite([.][a-z][a-z]){1,2}(\\s|$)";
Pattern pattern = Pattern.compile(patternStr, Pattern.CASE_INSENSITIVE);
String [] stringsToMatch = {
    "Mywebsite.co.uk xyz",
    "abc Mywebsite.co.uk",
    "abc Mywebsite.co.uk xyz",
    "Mywebsite.co.uk",
    "Mywebsite.co.uk.us",
    "Mywebsite"
};
for (String str : stringsToMatch) {
    Matcher matcher = pattern.matcher(str);
    System.out.println(str);
    if (matcher.find()) {
        System.out.println(" " + str.substring(matcher.start(), matcher.end()));
    }
    else {
        System.out.println(" No match");
    }
}
To find URL-like tokens such as domain names in a string, you can use a regex like
(?:http[s]?:\/\/)?(?:[a-zA-Z]|[0-9]|[$-_#.&+]|[!*\(\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+
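This method collects every match from your string into a list. Note that the character classes also accept plain words, so filter the results if you only want domains: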
public static List<String> extractDomainNames(String input) {
    List<String> domainNames = new ArrayList<>();
    String domainNamePattern = "(?:http[s]?://)?(?:[a-zA-Z]|[0-9]|[$-_#.&+]|[!*\\(\\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+";
    Pattern pattern = Pattern.compile(domainNamePattern);
    Matcher matcher = pattern.matcher(input);
    while (matcher.find()) {
        domainNames.add(matcher.group());
    }
    return domainNames;
}
You could try this regex: Mywebsite\.[^\s]+
String input = "Mywebsite.de durchsuchen";
// backslashes must be doubled inside a Java string literal
Pattern regexPattern = Pattern.compile("Mywebsite\\.[^\\s]+");
Matcher regexMatcher = regexPattern.matcher(input);
while (regexMatcher.find()) {
    System.out.println(regexMatcher.group());
}
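Pulling the answers above together, here is a small self-contained sketch that runs against all four sample inputs from the question. It assumes the site name is the fixed literal Mywebsite and that each domain label consists of lowercase letters; the class and method names are illustrative only.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DomainFinder {
    // "Mywebsite" followed by one or more ".label" parts, e.g. .de or .co.uk
    private static final Pattern SITE = Pattern.compile("Mywebsite(?:\\.[a-z]+)+");

    // Returns the first match, or null if the input does not contain one.
    static String find(String input) {
        Matcher m = SITE.matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String[] samples = {
            "Search Mywebsite.de",
            "Mywebsite.de durchsuchen",
            "Search Mywebsite.co.uk",
            "Mywebsite.co.uk something"
        };
        for (String s : samples) {
            System.out.println(s + " ----> " + find(s));
        }
    }
}
```

The attempt in the question, Mywebsite(.*?), prints an empty string because a lazy group at the end of a pattern is allowed to match zero characters.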
I have a text file containing JSON, and I want to replace NumberInt(x) with the number x.
The records in the file have fields such as workYear: NumberInt(2010).
I want to turn this into workYear: 2010 by removing NumberInt( and ).
NumberInt(x) can appear anywhere in the text file, and I want to replace every occurrence with its number.
I can find all the occurrences, but I am not sure how to replace each one with just the number value.
String json = <json-file-content>
String sPattern = "NumberInt\\([0-9]+\\)";
Pattern pattern = Pattern.compile(sPattern);
Matcher matcher = pattern.matcher(json);
while (matcher.find()) {
    String s = matcher.group(0);
    int workYear = Integer.parseInt(s.replaceAll("[^0-9]", ""));
    System.out.println(workYear);
}
I would like to replace every NumberInt(x) in the JSON string with just its number value, and then update the text file (JSON file).
Thanks!
The following should work. You need to capture the number in a group.
String json = "workYear:NumberInt(2010) workYear:NumberInt(2011)";
String sPattern = "NumberInt\\(([0-9]+)\\)";
Pattern pattern = Pattern.compile(sPattern);
Matcher matcher = pattern.matcher(json);
List<String> numbers = new ArrayList<>();
while (matcher.find()) {
    String s = matcher.group(1);
    numbers.add(s);
}
for (String number: numbers) {
    json = json.replaceAll(String.format("NumberInt\\(%s\\)", number), number);
}
System.out.println(json);
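You could build the output using a StringBuilder as shown below.
Please refer to the JavaDoc for Matcher.appendReplacement for details on how this works. The StringBuilder overload requires Java 9 or later; on older versions use a StringBuffer instead.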
String s = "workYear: NumberInt(2010)\nworkYear: NumberInt(2012)";
String sPattern = "NumberInt\\([0-9]+\\)";
Pattern pattern = Pattern.compile(sPattern);
Matcher matcher = pattern.matcher(s);
StringBuilder sb = new StringBuilder();
while (matcher.find()) {
    String s2 = matcher.group(0);
    int workYear = Integer.parseInt(s2.replaceAll("[^0-9]", ""));
    matcher.appendReplacement(sb, String.valueOf(workYear));
}
matcher.appendTail(sb);
String result = sb.toString();
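For this particular case an explicit loop is not strictly needed: String.replaceAll accepts group references in the replacement, so a single call can keep the captured digits and drop the wrapper. A minimal sketch (the class name and sample data are illustrative):

```java
class NumberIntStripper {
    // Replaces every NumberInt(<digits>) with just the digits.
    // "$1" in the replacement refers to capturing group 1.
    static String strip(String json) {
        return json.replaceAll("NumberInt\\(([0-9]+)\\)", "$1");
    }

    public static void main(String[] args) {
        String json = "workYear: NumberInt(2010)\nworkYear: NumberInt(2012)";
        System.out.println(strip(json));
    }
}
```

The appendReplacement approach is still the better fit when the replacement has to be computed from each match rather than copied from a group.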
I have a method like this :
for(String abc:abcs){
    xyz = abc.replaceAll(abc+"\\(", "_"+abc+"\\(");
}
How can I skip the replacement for occurrences that have a specific prefix, in Java?
I tried this :
String data = "Today, abc.xyz is object oriented language";
String regex = "(?<!abc.)xyz";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(data);
System.out.println(matcher.find());
Does this work for you? It uses a negative lookbehind so that xyz is left alone when it directly follows the prefix:
package test;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class Test {
    public static void main(String[] args) {
        String prefix = "abc.";
        String replaceWith = "text";
        String input = "This xyz is an example xyz to show how you can replace certain values of the xyz.\n"
                + "The xyz can contain any arbitrary xyz, for example abc.xyz.";
        // Pattern.quote keeps the dot in the prefix literal
        Pattern pattern = Pattern.compile("(?<!" + Pattern.quote(prefix) + ")xyz");
        Matcher m = pattern.matcher(input);
        input = m.replaceAll(replaceWith);
        System.out.println(input);
    }
}
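Going back to the loop in the question, a negative lookbehind can be built per name. This is only a sketch under assumptions: that the goal is to turn name( into _name( unless it is directly preceded by a given prefix such as abc., and that the class, helper, and parameter names below (CallPrefixer, prefixCalls, skipPrefix) are invented for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CallPrefixer {
    // Rewrites "name(" to "_name(" for every given name, except where the
    // name is immediately preceded by skipPrefix (for example "abc.").
    static String prefixCalls(String source, List<String> names, String skipPrefix) {
        String result = source;
        for (String name : names) {
            // (?<!...) is the negative lookbehind; \b keeps "myfoo(" from matching "foo("
            String regex = "(?<!" + Pattern.quote(skipPrefix) + ")\\b"
                    + Pattern.quote(name) + "\\(";
            // quoteReplacement guards against "$" or "\" in the name
            String replacement = Matcher.quoteReplacement("_" + name + "(");
            result = result.replaceAll(regex, replacement);
        }
        return result;
    }

    public static void main(String[] args) {
        String source = "foo(1) + abc.foo(2) + bar(3)";
        System.out.println(prefixCalls(source, Arrays.asList("foo", "bar"), "abc."));
    }
}
```

Java requires the text inside a lookbehind to have a bounded length, which a quoted literal prefix always satisfies.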
I have string like
{Action}{RequestId}{Custom_21_addtion}{custom_22_substration}
{Imapact}{assest}{custom_23_multiplication}.
From this I want only those substrings that contain "custom".
For example, from the string above I want only
{Custom_21_addtion}{custom_22_substration}{custom_23_multiplication}.
How can I get this?
You can use a regular expression, looking from {custom to }. It will look like this:
Pattern pattern = Pattern.compile("\\{custom.*?\\}", Pattern.CASE_INSENSITIVE);
Matcher matcher = pattern.matcher(inputString);
while (matcher.find()) {
    System.out.print(matcher.group());
}
The .* after custom means 0 or more characters after the word "custom", and the question mark makes the quantifier lazy, so it matches as few characters as possible and stops at the first } it finds.
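To see this on the string from the question, here is a self-contained sketch. It swaps the lazy .*? for the negated class [^}]*, which matches the same text here but can never run past a closing brace; the class and method names are illustrative. Like the pattern above, it only finds tags that start with "custom".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CustomTagFilter {
    // "{custom", then anything except "}", then the closing "}"
    private static final Pattern CUSTOM =
            Pattern.compile("\\{custom[^}]*\\}", Pattern.CASE_INSENSITIVE);

    // Collects every {custom...} tag in the order it appears.
    static List<String> customTags(String input) {
        List<String> tags = new ArrayList<>();
        Matcher matcher = CUSTOM.matcher(input);
        while (matcher.find()) {
            tags.add(matcher.group());
        }
        return tags;
    }

    public static void main(String[] args) {
        String input = "{Action}{RequestId}{Custom_21_addtion}{custom_22_substration}"
                + "{Imapact}{assest}{custom_23_multiplication}";
        System.out.println(String.join("", customTags(input)));
    }
}
```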
If you want an alternative solution without regex:
String a = "{Action}{RequestId}{Custom_21_addtion}{custom_22_substration}{Imapact}{assest}{custom_23_multiplication}";
String[] b = a.split("}");
StringBuilder result = new StringBuilder();
for(String c : b) {
    // if you want case sensitivity, drop the toLowerCase()
    if(c.toLowerCase().contains("custom"))
        result.append(c).append("}");
}
System.out.println(result.toString());
You can do something like this:
StringTokenizer st = new StringTokenizer(yourString, "{");
List<String> llista = new ArrayList<String>();
// no \W boundary around "custom": the "_" that follows it is a word character
Pattern pattern = Pattern.compile("custom", Pattern.CASE_INSENSITIVE);
while (st.hasMoreTokens()) {
    // nextToken() returns a String; nextElement() returns Object
    String string = st.nextToken();
    Matcher matcher = pattern.matcher(string);
    if (matcher.find()) {
        llista.add(string);
    }
}
Another solution:
String inputString = "{Action}{RequestId}{Custom}{Custom_21_addtion}{custom_22_substration}{Imapact}{assest}" ;
String strTokens[] = inputString.split("\\}");
// compile once, outside the loop
Pattern pattern = Pattern.compile("custom", Pattern.CASE_INSENSITIVE);
for (String str : strTokens) {
    // match against the current token, not the whole input
    Matcher matcher = pattern.matcher(str);
    if (matcher.find()) {
        System.out.println("Tag Name:" + str.replace("{", ""));
    }
}
I have a String say:
<encoded:2,Message request>
Now I want to extract 2 and Message request from the line above.
private final String pString = "<encoded:[0-9]+,.*>";
private final Pattern pattern = Pattern.compile(pString);
private void parseAndDisplay(String line) {
    Matcher matcher = pattern.matcher(line);
    if (matcher.matches()) {
        while(matcher.find()) {
            String s = matcher.group();
            System.out.println("=====>"+s);
        }
    }
}
This doesn't retrieve it. What is wrong with it?
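You have to define capturing groups in your regex and read them with group(1) and group(2). Also, find() called after a successful matches() searches past the text that was already matched, so your while loop never runs; call only one of the two. The regex can be: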
"<encoded:([0-9]+),(.*?)>"
or
"<encoded:(\\d+),([^>]*)"
Try:
String s = "<encoded:2,Message request>";
String s1 = s.replaceAll("<encoded:(\\d+?),.*", "$1");
String s2 = s.replaceAll("<encoded:\\d+?,(.*)>", "$1");
Try
"<encoded:([0-9]+),([^>]*)"
Also, as suggested in other comments, use group(1) and group(2)
Try this out :
Matcher matcher = Pattern.compile("<encoded:(\\d+)\\,([\\w\\s]+)",Pattern.CASE_INSENSITIVE).matcher("<encoded:2,Message request>");
while (matcher.find()) {
    System.out.println(matcher.group(1));
    System.out.println(matcher.group(2));
}
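Putting the pieces together, the method from the question can be rewritten with capturing groups and a single matches() call. A minimal sketch; the class name EncodedLineParser and the String[] return shape are choices made for this example, not part of the original code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EncodedLineParser {
    // Group 1: the number; group 2: everything up to the closing ">".
    private static final Pattern ENCODED =
            Pattern.compile("<encoded:([0-9]+),([^>]*)>");

    // Returns {number, message}, or null if the line has a different shape.
    static String[] parse(String line) {
        Matcher matcher = ENCODED.matcher(line);
        // matches() must cover the whole line; no find() loop is needed after it
        if (!matcher.matches()) {
            return null;
        }
        return new String[] { matcher.group(1), matcher.group(2) };
    }

    public static void main(String[] args) {
        String[] parts = parse("<encoded:2,Message request>");
        System.out.println("=====>" + parts[0]);
        System.out.println("=====>" + parts[1]);
    }
}
```

Use find() instead of matches() if the encoded tag may be surrounded by other text on the line.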