Unable to write regex in Java [closed] - java

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 4 days ago.
I am reading a CSV file using Java and want to write a regular expression that matches the pattern below. If a value does not match, it should be written to the output file. My issue is that I am not able to write the correct regex. Can someone help?
Pattern = Name+" - "+NUMBER
Example:
Foo, Blah - 2345 = This should not output in the file
Some-Crap-Hello-1234 = this should output in the file
Some Name, Foo - 7898 = This should not output since the pattern matches with the last hyphen and space and number after that
Some Name    - 1235 = This should output in the file since it has a bunch of whitespace before the hyphen
Foo, Blah - 11233 = This should not output in the file

Maybe this regex can help you:
^[A-Za-z]+ - \d+$
For example, this pattern would match the following strings:
John - 123
Sarah - 4567
But not:
Sarah - 4567 other string here
Alexandra-9 (no space around the hyphen)
Alexandra   -   9 (more than one space around the hyphen)
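The pattern above only allows letters in the name, so it would not match names from the question such as "Foo, Blah". A minimal sketch of a broader variant follows, assuming a name is made of words separated by a single space, optionally preceded by a comma, and that exactly one space surrounds the final hyphen. The class name `NameNumberFilter`, its methods, and the widened name pattern are illustrative assumptions, not part of the answer above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class NameNumberFilter {
    // Assumed shape: word(s) separated by " " or ", ", then " - ", then digits.
    private static final Pattern NAME_DASH_NUMBER =
            Pattern.compile("^[A-Za-z]+(?:,? [A-Za-z]+)* - \\d+$");

    // A value is written to the output only when it does NOT match the pattern.
    static boolean shouldOutput(String value) {
        return !NAME_DASH_NUMBER.matcher(value).matches();
    }

    static List<String> nonMatching(List<String> values) {
        return values.stream()
                .filter(NameNumberFilter::shouldOutput)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> values = Arrays.asList(
                "Foo, Blah - 2345",
                "Some-Crap-Hello-1234",
                "Some Name, Foo - 7898",
                "Some Name    - 1235",
                "Foo, Blah - 11233");
        // Prints only the values that fail the Name + " - " + NUMBER check.
        nonMatching(values).forEach(System.out::println);
    }
}
```

If names can contain other characters (digits, apostrophes, accented letters), the character classes would need to be widened accordingly.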

Related

Regex to remove prefix and suffix in a string Java [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 3 years ago.
I want to remove prefix and suffix in a String and extract the middle portion of the string.
For example, consider the strings "www.hello.com" and "www.test.com".
Here the prefix is "www." and the suffix is ".com". I want to extract the middle words: hello and test.
Currently I have achieved this using the String.replace() method in Java:
str.replace("www.","").replace(".com","");
I want to know whether there is a regular expression that can achieve this in a single method call in Java.
You could use a regex for that, it would work in the same way. Your regex would simply contain a capture group with both the prefix and the suffix in an OR operation.
(www\.|\.com)
You could then use this like you did with the replace.
String test = "www.test.com";
String output = test.replaceAll("(www\\.|\\.com)", "");
P.S. this code is untested. Please don't just copy and paste it expecting everything to work.
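(?<=www\.)(.*)(?=\.com)
This uses the lookbehind and lookahead features of regex, so the match itself contains only the text between the prefix and the suffix. The dots are escaped so that they match a literal "." rather than any character.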
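A minimal sketch of how a lookbehind/lookahead pattern of this kind might be used from Java, assuming the prefix and suffix are always "www." and ".com". The class name `MiddleExtractor` and the fallback of returning the input unchanged when nothing matches are assumptions made for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MiddleExtractor {
    // Lookbehind for "www." and lookahead for ".com"; neither is part of the match.
    private static final Pattern MIDDLE = Pattern.compile("(?<=www\\.).*(?=\\.com)");

    // Returns the text between prefix and suffix, or the input itself if no match.
    static String middle(String input) {
        Matcher m = MIDDLE.matcher(input);
        return m.find() ? m.group() : input;
    }

    public static void main(String[] args) {
        System.out.println(middle("www.hello.com"));
        System.out.println(middle("www.test.com"));
    }
}
```

Compared with the chained replace() calls, this only strips the prefix and suffix when both are present around the middle part.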

How to string match the entire line based on the keyword using regex and empty it [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 4 years ago.
A file contains thousands of lines. I want to empty every line that contains a particular keyword.
For Example:
Input:
Danny akdfkldnklnflnwklfnlwen
I am sam
asndklfnklnfkldn DANNYkandfnkldnfkldnfklnskln
He is very well mannered.
klansdfnmkldnfkldnfklnsd_danny
Output Should be:
I am sam
He is very well mannered.
The line with keyword Danny should be entirely deleted to get the desired output.
Thanks for your help.
I have tried this http://rubular.com/r/bF0RzeaFYW
But it is not case sensitive.
The easiest option would be to:
Read the file one line at a time.
Use the .contains() method.
If the condition holds, then do nothing
Else, write the output to another file.
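The steps above can be sketched roughly as follows. This version works on an in-memory list of lines and lower-cases both sides before calling contains(), since contains() on its own is case-sensitive. The class name `KeywordLineFilter` and the method `withoutKeyword` are hypothetical names chosen for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

class KeywordLineFilter {
    // Keeps only the lines that do not contain the keyword, ignoring case.
    static List<String> withoutKeyword(List<String> lines, String keyword) {
        String needle = keyword.toLowerCase(Locale.ROOT);
        List<String> kept = new ArrayList<>();
        for (String line : lines) {
            if (!line.toLowerCase(Locale.ROOT).contains(needle)) {
                kept.add(line);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "Danny akdfkldnklnflnwklfnlwen",
                "I am sam",
                "asndklfnklnfkldn DANNYkandfnkldnfkldnfklnskln",
                "He is very well mannered.",
                "klansdfnmkldnfkldnfklnsd_danny");
        withoutKeyword(lines, "danny").forEach(System.out::println);
    }
}
```

For real files, the list could be obtained with java.nio.file.Files.readAllLines and the result written with Files.write. For very large files, reading through a BufferedReader one line at a time avoids holding everything in memory.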
If you want to do the whole thing with a regular expression, you could search for the expression ^.*?danny.*?$ and replace each match with an empty string.
Note: You will need to provide the following flags to the engine:
Multi line
Case Insensitive
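In Java, these two flags correspond to Pattern.MULTILINE and Pattern.CASE_INSENSITIVE. A minimal sketch follows, assuming the whole file content is already in one string. The class name `DannyLineRemover` is hypothetical, and the trailing `\R?` is an addition to the expression above so that the line break is removed together with the line instead of leaving an empty line behind.

```java
import java.util.regex.Pattern;

class DannyLineRemover {
    // MULTILINE lets ^ and $ match at every line; CASE_INSENSITIVE ignores case.
    // \R? also consumes the line break that follows a matching line, if any.
    private static final Pattern DANNY_LINE = Pattern.compile(
            "^.*danny.*$\\R?",
            Pattern.MULTILINE | Pattern.CASE_INSENSITIVE);

    static String removeLines(String text) {
        return DANNY_LINE.matcher(text).replaceAll("");
    }

    public static void main(String[] args) {
        String text = "Danny akdfkldnklnflnwklfnlwen\n"
                + "I am sam\n"
                + "asndklfnklnfkldn DANNYkandfnkldnfkldnfklnskln\n"
                + "He is very well mannered.\n"
                + "klansdfnmkldnfkldnfklnsd_danny";
        System.out.print(removeLines(text));
    }
}
```

Dropping `\R?` from the pattern reproduces the "empty the line" behaviour, where each matching line is replaced by a blank one.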
From the command line: grep -vi danny input.txt
The -v flag inverts the match, so only lines that do not match the pattern (in this case "danny") are printed. The -i flag makes the match case-insensitive.

Regular Expression for EDI file [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 5 years ago.
I want to get a value from an EDI file that has the format below:
\nRJC*K3:0*20180105*U*127.35
\nRJC*K3:0*20180105*B*127.35
I want the value U in the first case, which comes between the 2nd and 3rd star after RJC*K3, and B in the second string.
More precisely, I want to fetch a single character from a string, where that character comes between the 2nd and 3rd star (*) after RJC*K3 (a static value).
You can use classic Pattern matching:
String str1 = "\\nRJC*K3:0*20180105*U*127.35";
Matcher m = Pattern.compile("RJC\\*K3.*\\*(\\w)\\*.*").matcher(str1);
String res1 = m.find() ? m.group(1) : "";
System.out.println(res1); // U
But if there is always the same number of * before the letter you want, you can easily split on * and take the element at index 3:
String str2 = "\\nRJC*K3:0*20180105*G*127.35";
String res2 = str2.split("\\*")[3];
System.out.println(res2); // G
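A stricter variant of the Pattern approach, sketched under the assumption that the wanted character always sits between the 2nd and 3rd star after the literal RJC*K3. Using `[^*]*` instead of `.*` pins the match to that exact field. The class name `RjcFieldExtractor` and the choice to return an empty string when nothing matches are assumptions for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RjcFieldExtractor {
    // RJC*K3, then two star-delimited fields, then one captured character and a star.
    private static final Pattern FIELD =
            Pattern.compile("RJC\\*K3[^*]*\\*[^*]*\\*([^*])\\*");

    static String extract(String segment) {
        Matcher m = FIELD.matcher(segment);
        return m.find() ? m.group(1) : "";
    }

    public static void main(String[] args) {
        System.out.println(extract("RJC*K3:0*20180105*U*127.35"));
        System.out.println(extract("RJC*K3:0*20180105*B*127.35"));
    }
}
```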
There is no need to fight with edi files, you can use available libraries.
Please take a look at https://github.com/imsweb/x12-parser/
RJCloop.getSegment("RJC").getElementValue("RJC02")
can get you the value needed.

Regex that removes everything but the number [closed]

Closed. This question needs debugging details. It is not currently accepting answers.
Closed 6 years ago.
I am trying to use Java's String.replaceAll() or replaceFirst() method to edit data read from a PDF document. A line of data that could be returned is:
21/1**E (6-11) 4479 77000327633 (U)
I want to store only the 77000327633 in a variable to work with, and I am looking for a regex that will capture ONLY this 11-digit number. I've tried searching for a regex, but nothing seems to give me the desired outcome.
It could be done like this:
String value = "21/1**E (6-11) 4479 77000327633 (U)";
Pattern pattern = Pattern.compile(".* (\\d{11}) .*");
System.out.println(pattern.matcher(value).replaceAll("$1"));
Output:
77000327633
NB: This assumes that your number has 11 digits and that there is a space before and after.
NB2: This is not meant to be perfect. It only shows the idea, which is to define a pattern that covers the whole line with one capture group, then replace everything with the content of that group.
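An alternative sketch that searches for the number directly instead of replacing the rest of the line, assuming the wanted value is the only standalone run of exactly 11 digits. The class name `ElevenDigitFinder` and the empty-string fallback are illustrative assumptions.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ElevenDigitFinder {
    // \b on both sides rejects longer digit runs that merely contain 11 digits.
    private static final Pattern ELEVEN_DIGITS = Pattern.compile("\\b\\d{11}\\b");

    static String find(String line) {
        Matcher m = ELEVEN_DIGITS.matcher(line);
        return m.find() ? m.group() : "";
    }

    public static void main(String[] args) {
        System.out.println(find("21/1**E (6-11) 4479 77000327633 (U)"));
    }
}
```

With Matcher.find() there is no need to describe the text before and after the number, so the pattern does not depend on the surrounding spaces.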
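Try this: (.*)[ ]([0-9]+)[ ](.*)
You can then access your value using $2. Note that the quantifier must sit inside the group: with ([0-9])* the group would hold only the last digit.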

How do I extract the following patterns in java [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 8 years ago.
I have a string in the following format:
String s = " some text....
[[Category:Anarchism| ]]
[[Category:Political culture]]
[[Category:Political ideologies]]
[[Category:Far-left politics]]
... some more text"
I want to extract all the categories from this text: [Anarchism, Political culture, ..., Far-left politics]
Also, is there a good tutorial where I can learn about regex pattern matching?
Thanks
You can use the following regex to get categories:
\[\[Category:(.+)\]\]
You can then read capture group 1 of each match to get the category values.
Remember to double the backslashes when you write the regex in a Java string literal:
\\[\\[Category:(.+)\\]\\]
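A minimal sketch of how the matches could be collected into a list with Matcher.find(). It uses a slightly narrower variant of the regex above (an assumption, not the answer's exact pattern): the capture group stops at "|" or "]", so "[[Category:Anarchism| ]]" yields "Anarchism" without the trailing "| ". The class name `CategoryExtractor` is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CategoryExtractor {
    // Group 1 is the category name; an optional "|..." part before "]]" is skipped.
    private static final Pattern CATEGORY = Pattern.compile(
            "\\[\\[Category:([^\\]|]+)(?:\\|[^\\]]*)?\\]\\]");

    static List<String> extract(String text) {
        List<String> categories = new ArrayList<>();
        Matcher m = CATEGORY.matcher(text);
        while (m.find()) {
            categories.add(m.group(1).trim());
        }
        return categories;
    }

    public static void main(String[] args) {
        String s = " some text....\n"
                + "[[Category:Anarchism| ]]\n"
                + "[[Category:Political culture]]\n"
                + "[[Category:Political ideologies]]\n"
                + "[[Category:Far-left politics]]\n"
                + "... some more text";
        System.out.println(extract(s));
    }
}
```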
Assuming you don't want to select the word "Category" itself, the regex would be:
(?<=Category:).*?(?=])
I'll break this down a bit for you.
The first bit in brackets, (?<=Category:), is a lookbehind: it looks for Category: without actually selecting it.
Next, .*? looks for zero or more characters (other than a newline), taking as few as possible and stopping as soon as the next part is matched.
The final brackets, (?=]), are a lookahead: it looks for a ] without actually selecting it.
