I would like to split the following string by commas using a DOTALL regex pattern that accepts letters, numbers, whitespace and special characters such as underscores and asterisks, e.g. #input("Test_1, Test_TWO , TEST_THIRTY_3*"), so that the output looks like:
"Test_1",
"Test_TWO",
"TEST_THIRTY_3*"
public static void main(String args[])
{
String line = "#input(\"Test_1,Test_TWO , TEST_THIRTY_3*\"\\)";
String pattern = "#input(\"(.*?)\".*";
Pattern r = Pattern.compile(pattern, Pattern.DOTALL);
Matcher m = r.matcher(line);
while (m.find()) {
System.out.println("Found word: " + m.group(1) );
}
}
First, you have to escape the ( as \(, so your regex should look like this: #input\("(.*?)".* Second, you can use \s*,\s* to split the result, like this:
String line = "#input(\"Test_1,Test_TWO , TEST_THIRTY_3*\"\\)";
String pattern = "#input\\(\"(.*?)\".*";
Pattern r = Pattern.compile(pattern, Pattern.DOTALL);
Matcher m = r.matcher(line);
while (m.find()) {
System.out.println(Arrays.toString(m.group(1).split("\\s*,\\s*")));
//----------------------------------------------------^^^^^^^^
}
outputs
[Test_1, Test_TWO, TEST_THIRTY_3*]
If you do not have to stick to regex you might just take the string methods.
List<String> output = Arrays.asList(line.split(","));
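Applied to the whole line, a bare split(",") leaves the #input(" prefix on the first item and the surrounding spaces on the others. A minimal sketch of a string-methods-only variant, assuming the items always sit between the first and the last double quote of the line (the class name InputArgs and the method parse are made up for illustration):

```java
import java.util.ArrayList;
import java.util.List;

class InputArgs {
    // Hypothetical helper: returns the comma-separated items found between
    // the first and the last double quote of the line, each one trimmed.
    static List<String> parse(String line) {
        int open = line.indexOf('"');
        int close = line.lastIndexOf('"');
        List<String> items = new ArrayList<>();
        if (open < 0 || close <= open) {
            return items; // no quoted section found
        }
        // Cut out the quoted part first, then split and trim each piece.
        for (String part : line.substring(open + 1, close).split(",")) {
            items.add(part.trim());
        }
        return items;
    }

    public static void main(String[] args) {
        String line = "#input(\"Test_1, Test_TWO , TEST_THIRTY_3*\")";
        System.out.println(parse(line)); // [Test_1, Test_TWO, TEST_THIRTY_3*]
    }
}
```

This avoids regex entirely, but it would need adjusting if an item could itself contain a comma or a double quote.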
I have below string:
String line = "put retur#ERns between #errf #fgrf#re paragraphs #fg^%tg2#785Ty*";
How can I get below values with regex:
#ERns
#errf
#fgrf
#re
#fg^%tg2
#785Ty*
My code:
String pattern = "^#\\S+";
Pattern r = Pattern.compile(pattern);
Matcher m = r.matcher(line);
while (m.find()) {
Log.i("log", m.group());
}
You can use this regex instead:
#[^#\s]*
Negated character class [^#\s] matches a character that is not # and not a whitespace.
In Java use:
final String pattern = "#[^#\\s]*";
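Put together with a find() loop, the pattern above can be tried on the string from the question. This is only a sketch; the class name HashTokens and the method extract are illustrative, not part of the original answer:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HashTokens {
    // '#' followed by any run of characters that are neither '#' nor whitespace.
    private static final Pattern HASH_TOKEN = Pattern.compile("#[^#\\s]*");

    // Collects every match in order of appearance.
    static List<String> extract(String line) {
        List<String> tokens = new ArrayList<>();
        Matcher m = HASH_TOKEN.matcher(line);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String line = "put retur#ERns between #errf #fgrf#re paragraphs #fg^%tg2#785Ty*";
        System.out.println(extract(line));
        // [#ERns, #errf, #fgrf, #re, #fg^%tg2, #785Ty*]
    }
}
```

The original attempt fails for two reasons: ^ anchors the match to the start of the input, and \S+ also consumes the next #, so adjacent tokens such as #fgrf#re would be returned as one.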
So I have a few lines like such:
tag1:
 line1word1 lineoneanychar
 line2word1
tag2:
 line1word1 ....
 line2word1 .....
I am trying to build a java regex that extracts all the data under the tags. i.e:
String parsed1 = line1word1 lineone\nline2word1
String parsed2 = line1word1 ....\nline2word1 .....
I believe the right way to do this is using something like this, but I haven't quite got it right:
Pattern p = Pattern.compile("tag1:\n( {1}.*)\n(?!\\w+)", Pattern.DOTALL);
Matcher m = p.matcher(clean_data);
if(m.find()){
System.out.println(m.group(1));
}
Any help would be appreciated!
It could be something like this:
public static void main(String[] args) throws Exception {
String input = "tag1:\n"
+ " line1word1 lineoneanychar\n"
+ " line2word1\n"
+ "tag2:\n"
+ " line1word1 ....\n"
+ " line2word1 .....\n";
Pattern p = Pattern.compile("tag\\d+:$\\n((?:^\\s.*?$\\n)+)", Pattern.DOTALL|Pattern.MULTILINE);
Matcher m = p.matcher(input);
while(m.find()){
System.out.println(m.group(1));
}
}
Remember to double the backslashes (\\) when writing the regex as a Java string literal.
\d is a digit
\s is a whitespace character
(?:something) is a non-capturing group: it groups without creating a numbered group in the matcher
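If the tag name is needed together with its block, a variant with two capturing groups can fill a map. The following is a sketch under a few assumptions that go beyond the question: tags look like tag1:, tag2:, and so on, every data line starts with a space or tab, and lines end with \n. The names TagBlocks and parse are invented for the example:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TagBlocks {
    // Group 1: the tag name. Group 2: one or more following lines that
    // start with a space or tab (assumed to be the data under the tag).
    private static final Pattern BLOCK =
            Pattern.compile("^(tag\\d+):\\n((?:[ \\t].*\\n?)+)", Pattern.MULTILINE);

    static Map<String, String> parse(String input) {
        Map<String, String> blocks = new LinkedHashMap<>();
        Matcher m = BLOCK.matcher(input);
        while (m.find()) {
            // Drop the leading indentation of each line and the trailing newline.
            String body = m.group(2).replaceAll("(?m)^[ \\t]+", "").trim();
            blocks.put(m.group(1), body);
        }
        return blocks;
    }

    public static void main(String[] args) {
        String input = "tag1:\n"
                + " line1word1 lineoneanychar\n"
                + " line2word1\n"
                + "tag2:\n"
                + " line1word1 ....\n"
                + " line2word1 .....\n";
        for (Map.Entry<String, String> e : parse(input).entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue().replace("\n", "\\n"));
        }
    }
}
```

DOTALL is not needed here: without it, .* stops at the end of each line, which is exactly the per-line behaviour the repeated group relies on.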
How can I get the strings between double quotes using a regex in Java?
_settext(_textbox(0,_near(_span("My Name"))) ,"Brittas John");
For example, I need My Name and Brittas John.
Use the pattern below and read group 1, which captures whatever is enclosed in the parentheses (...):
"([^"]*)"
Pattern explanation:
"          a literal '"'
(          group and capture to \1:
  [^"]*    any character except '"' (0 or more times, greedy)
)          end of \1
"          a literal '"'
sample code:
Pattern p = Pattern.compile("\"([^\"]*)\"");
Matcher m = p.matcher("_settext(_textbox(0,_near(_span(\"My Name\"))) ,\"Brittas John\");");
while (m.find()) {
System.out.println(m.group(1));
}
Try this regex..
public static void main(String[] args) {
String s = "_settext(_textbox(0,_near(_span(\"My Name\"))) ,\"Brittas John\");";
Pattern p = Pattern.compile("\"(.*?)\"");
Matcher m = p.matcher(s);
while (m.find()) {
System.out.println(m.group(1));
}
}
Output:
My Name
Brittas John
I have a string like
{Action}{RequestId}{Custom_21_addtion}{custom_22_substration}
{Imapact}{assest}{custom_23_multiplication}.
From this I want only those substrings that contain "custom".
For example, from the string above I want only
{Custom_21_addtion}{custom_22_substration}{custom_23_multiplication}.
How can I get this?
You can use a regular expression, looking from {custom to }. It will look like this:
Pattern pattern = Pattern.compile("\\{custom.*?\\}", Pattern.CASE_INSENSITIVE);
Matcher matcher = pattern.matcher(inputString);
while (matcher.find()) {
System.out.print(matcher.group());
}
The .*? after custom matches zero or more characters after the word "custom"; the question mark makes the quantifier lazy, so it matches as few characters as possible and stops at the first } it can find.
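To see what the question mark changes, the lazy pattern can be compared with its greedy counterpart on the same input. This is an illustrative sketch; the class LazyVsGreedy and the helper findAll are not from the answer:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LazyVsGreedy {
    // Returns every case-insensitive match of regex in input.
    static List<String> findAll(String regex, String input) {
        List<String> matches = new ArrayList<>();
        Matcher m = Pattern.compile(regex, Pattern.CASE_INSENSITIVE).matcher(input);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String input = "{Action}{RequestId}{Custom_21_addtion}{custom_22_substration}"
                + "{Imapact}{assest}{custom_23_multiplication}";

        // Lazy: each match stops at the nearest '}', giving three separate tags.
        System.out.println(findAll("\\{custom.*?\\}", input));

        // Greedy: one match running from the first {Custom... to the last '}'.
        System.out.println(findAll("\\{custom.*\\}", input));
    }
}
```

A negated character class such as \{custom[^}]*\} would give the same three matches as the lazy version without relying on backtracking.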
If you want an alternative solution without regex:
String a = "{Action}{RequestId}{Custom_21_addtion}{custom_22_substration}{Imapact}{assest}{custom_23_multiplication}";
String[] b = a.split("}");
StringBuilder result = new StringBuilder();
for(String c : b) {
// if you want case sensitivity, drop the toLowerCase()
if(c.toLowerCase().contains("custom"))
result.append(c).append("}");
}
System.out.println(result.toString());
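You can do it something like this:
StringTokenizer st = new StringTokenizer(yourString, "{");
List<String> llista = new ArrayList<String>();
// "_" is a word character, so a trailing (\W|$) would reject "custom_22...";
// only the leading boundary is checked here
Pattern pattern = Pattern.compile("(\\W|^)custom", Pattern.CASE_INSENSITIVE);
while (st.hasMoreTokens()) {
    String string = st.nextToken(); // nextElement() returns Object, not String
    Matcher matcher = pattern.matcher(string);
    if (matcher.find()) {
        llista.add("{" + string); // restore the "{" removed by the tokenizer
    }
}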
Another solution:
String inputString = "{Action}{RequestId}{Custom}{Custom_21_addtion}{custom_22_substration}{Imapact}{assest}" ;
String strTokens[] = inputString.split("\\}");
// Compile the pattern once, outside the loop
Pattern pattern = Pattern.compile("custom", Pattern.CASE_INSENSITIVE);
for (String str : strTokens) {
    // Test the current token, not the whole input string
    Matcher matcher = pattern.matcher(str);
    if (matcher.find()) {
        System.out.println("Tag Name:" + str.replace("{", ""));
    }
}
How do I write a regex that matches multi-line input delimited by newlines and spaces?
The following code works for input with a single line break, but not when the input is
String input = "A1234567890\nAAAAA\nwwwwwwww";
That is, matches() returns false for this input.
Here is my code:
package patternreg;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class pattrenmatching {
public static void main(String[] args) {
String input = "A1234567890\nAAAAA";
String regex = ".*[\\w\\s\\w+].*";
Pattern p = Pattern.compile(regex,Pattern.MULTILINE);
Matcher m =p.matcher(input);
if (m.matches()) {
System.out.println("matches() found the pattern \""
+ "\" starting at index "
+ " and ending at index ");
} else {
System.out.println("matches() found nothing");
}
}
}
You could also add the DOTALL flag to get it working:
Pattern p = Pattern.compile(regex, Pattern.MULTILINE | Pattern.DOTALL);
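The effect of the flag can be checked directly on the input from the question. A small sketch (the class DotallDemo and the helper matchesWhole are illustrative names):

```java
import java.util.regex.Pattern;

class DotallDemo {
    // True if the whole input matches the regex under the given flags.
    static boolean matchesWhole(String regex, int flags, String input) {
        return Pattern.compile(regex, flags).matcher(input).matches();
    }

    public static void main(String[] args) {
        String input = "A1234567890\nAAAAA\nwwwwwwww";
        String regex = ".*[\\w\\s\\w+].*";

        // Without DOTALL, '.' does not match '\n'; only the character class can
        // consume one line break, so two line breaks cannot be covered.
        System.out.println(matchesWhole(regex, Pattern.MULTILINE, input)); // false

        // With DOTALL, '.' matches line terminators as well.
        System.out.println(matchesWhole(regex, Pattern.MULTILINE | Pattern.DOTALL, input)); // true
    }
}
```

MULTILINE only changes where ^ and $ may match; since the pattern uses neither anchor, it makes no difference to the result here.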
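I believe your problem is that, without DOTALL, . does not match '\n', so only the single character class in the middle of your pattern can consume a line break. With two '\n' in the input, matches() cannot cover the whole string.
If you want to stick with the code above, try "[\S]*[\s]+", which matches zero or more non-whitespace characters followed by one or more whitespace characters.
Fixed-up code: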
public static void main(String[] args) {
String input = "A1234567890\nAAAAA\nsdfasdf\nasdfasdf";
String regex = "[\\S]*[\\s]+";
Pattern p = Pattern.compile(regex, Pattern.MULTILINE);
Matcher m = p.matcher(input);
while (m.find()) {
System.out.println(input.substring(m.start(), m.end()) + "*");
}
if (m.matches()) {
System.out.println("matches() found the pattern \"" + "\" starting at index " + " and ending at index ");
} else {
System.out.println("matches() found nothing");
}
}
OUTPUT:
A1234567890
*
AAAAA
*
sdfasdf
*
matches() found nothing
Also, a pattern of
"([\\S]*[\\s]+)+([\\S])*"
will match the entire input (matches() returns true), but it breaks the token-by-token part of your code.