How to match multiple lines inside double quotes using regex? [closed] - java

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 6 years ago.
I have a CSV file which contains the following lines.
INPUT:
No,NAme,ID,Description
1,Stack,232,"ABCDEFGHIJKLMNO
-- Jiuaslkm asdasdasd"
2,Queue,454,"PQRSTUVWXYZ
-- Other
words here"
3,Que,4343,"sdfwerrew"
OUTPUT EXPECTED:
No,NAme,ID,Description
1,Stack,232,"ABCDEFGHIJKLMNO \n -- Jiuaslkm asdasdasd"
2,Queue,454,"PQRSTUVWXYZ \n -- Other \n words here"
3,Que,4343,"sdfwerrew"
or
No,NAme,ID,Description
1,Stack,232,"ABCDEFGHIJKLMNO -- Jiuaslkm asdasdasd"
2,Queue,454,"PQRSTUVWXYZ -- Other words here"
3,Que,4343,"sdfwerrew"
Is there a Java regex pattern that can find and merge the lines based on the opening and closing double quotes?

You are going down the wrong path. Not everything should be solved using regular expressions. CSV parsing is one of those things.
Seriously: you are about to re-invent the wheel. And the wheel you are about to create will be deficient, and prone to break over and over again.
The sane approach: there are many existing CSV parsers for Java out there, and they deal perfectly well with multi-line values. So use one of them (see here as a starting point for the many choices you have).
There is a nice rule of thumb: when your regex becomes so complicated that you can't write it down yourself, consider doing things differently. You are the person who owns this code; you will have to maintain and maybe enhance it, not the folks here who are able to write down a regex that solves this one flavor of CSV example input.
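To illustrate why this is a parsing problem rather than a pattern-matching one, here is a minimal sketch of the quote tracking a CSV parser performs for the multi-line case. It is not taken from any library, and the names QuotedNewlines and flatten are made up for this example. It assumes quotes inside a field are escaped by doubling them, and it ignores everything else a real parser handles (encodings, malformed input, custom delimiters), so treat it as an explanation, not as a replacement for a library.

```java
// Hypothetical helper for illustration only; a real CSV library covers far more cases.
final class QuotedNewlines {

    /** Replaces every line break that occurs inside a double-quoted field with one space. */
    static String flatten(String csv) {
        StringBuilder out = new StringBuilder(csv.length());
        boolean inQuotes = false;
        for (int i = 0; i < csv.length(); i++) {
            char c = csv.charAt(i);
            if (c == '"') {
                // A doubled quote ("") toggles the state twice, so it ends up unchanged.
                inQuotes = !inQuotes;
                out.append(c);
            } else if (inQuotes && (c == '\n' || c == '\r')) {
                // Treat "\r\n" as a single line break.
                if (c == '\r' && i + 1 < csv.length() && csv.charAt(i + 1) == '\n') {
                    i++;
                }
                out.append(' ');
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }
}
```

Calling QuotedNewlines.flatten on the whole file content produces the second form of the expected output, with each record on one line.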

Related

Regex to remove prefix and suffix in a string Java [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 3 years ago.
I want to remove a prefix and a suffix from a String and extract the middle portion of the string.
For example, consider the Strings "www.hello.com" and "www.test.com".
Here the prefix is "www." and the suffix is ".com". I want to extract the middle words: hello and test.
Currently I have achieved this using the String.replace() method in Java.
str.replace("www.","").replace(".com","");
I want to know whether there is a regular expression that achieves this in a single method call in Java.
You could use a regex for that, it would work in the same way. Your regex would simply contain a capture group with both the prefix and the suffix in an OR operation.
(www\.|\.com)
You could then use this like you did with the replace.
String test = "www.test.com";
String output = test.replaceAll("(www\\.|\\.com)","");
P.S. this code is untested. Please don't just copy and paste it expecting everything to work.
(?<=www.)(.*)(?=.com)
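This uses the lookbehind and lookahead features of regex: it matches only the middle part instead of removing the ends. Note that an unescaped dot matches any character, so to match literal dots the pattern should be written as (?<=www\.)(.*)(?=\.com).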
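As a sketch of how the two ideas above could look in Java, the class below shows one variant that captures the middle part and one that strips the ends in a single replaceAll call. The class and method names are made up for this example, and both patterns are anchored to the start and end of the string, which is an assumption about what "prefix" and "suffix" should mean here (the plain replace() version removes the text wherever it occurs).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; the names are illustrative only.
final class AffixStripper {

    // Literal "www." at the start, literal ".com" at the end, middle captured as group 1.
    private static final Pattern WRAPPED = Pattern.compile("^www\\.(.*)\\.com$");

    /** Returns the text between "www." and ".com", or the input unchanged if it does not fit. */
    static String middle(String s) {
        Matcher m = WRAPPED.matcher(s);
        return m.matches() ? m.group(1) : s;
    }

    /** Removes a leading "www." and a trailing ".com" in one call. */
    static String middleViaReplace(String s) {
        return s.replaceAll("^www\\.|\\.com$", "");
    }
}
```

For the inputs in the question, both methods return hello and test. They differ on strings that have only one of the two parts: middle leaves such a string untouched, while middleViaReplace strips whichever part is present.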

How to string match the entire line based on the keyword using regex and empty it [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 4 years ago.
A file contains thousands of lines. I want to empty every line that contains a particular keyword.
For Example:
Input:
Danny akdfkldnklnflnwklfnlwen
I am sam
asndklfnklnfkldn DANNYkandfnkldnfkldnfklnskln
He is very well mannered.
klansdfnmkldnfkldnfklnsd_danny
Output Should be:
I am sam
He is very well mannered.
The line with keyword Danny should be entirely deleted to get the desired output.
Thanks for your help.
I have tried this http://rubular.com/r/bF0RzeaFYW
But it is not case-insensitive.
The easiest option would be to:
Read the file one line at a time.
Use the .contains() method on a lower-cased copy of the line, since the match should ignore case.
If the condition holds, do nothing.
Otherwise, write the line to another file.
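The steps above could be sketched roughly as follows. The class name LineFilter, its method names and the UTF-8 encoding are assumptions made for this example; lower-casing with Locale.ROOT is one simple way to ignore case and may not be right for every language.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Locale;

// Hypothetical helper; names and encoding are assumptions for this sketch.
final class LineFilter {

    /** True if the line does NOT contain the keyword, ignoring case. */
    static boolean keep(String line, String keyword) {
        return !line.toLowerCase(Locale.ROOT).contains(keyword.toLowerCase(Locale.ROOT));
    }

    /** Copies every line of 'in' that passes keep() to 'out', one line at a time. */
    static void copyWithout(Path in, Path out, String keyword) throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(in, StandardCharsets.UTF_8);
             BufferedWriter writer = Files.newBufferedWriter(out, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (keep(line, keyword)) {
                    writer.write(line);
                    writer.newLine();
                }
            }
        }
    }
}
```

Because only one line is held in memory at a time, this works the same for a file with a thousand lines or a million.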
If you want to do the whole thing with a regular expression, you could search for the expression ^.*?danny.*?$ and replace each match with an empty string.
Note: You will need to provide the following flags to the engine:
Multi line
Case Insensitive
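In Java, those two flags correspond to Pattern.MULTILINE and Pattern.CASE_INSENSITIVE. The sketch below applies the expression from this answer; the class and method names are made up. The second method is a variant that is not part of the answer: it also consumes the line terminator, so that the matching lines disappear completely instead of being left as blank lines.

```java
import java.util.regex.Pattern;

// Hypothetical helper; names are illustrative only.
final class KeywordLineBlanker {

    private static final int FLAGS = Pattern.MULTILINE | Pattern.CASE_INSENSITIVE;

    // The expression from the answer: the whole line that contains "danny".
    private static final Pattern LINE = Pattern.compile("^.*?danny.*?$", FLAGS);

    // Variant (an addition, not from the answer): also swallow the line break after the line.
    private static final Pattern LINE_AND_BREAK = Pattern.compile("^.*danny.*(?:\\R|$)", FLAGS);

    /** Empties matching lines but keeps their line breaks. */
    static String blank(String text) {
        return LINE.matcher(text).replaceAll("");
    }

    /** Removes matching lines together with their line breaks. */
    static String delete(String text) {
        return LINE_AND_BREAK.matcher(text).replaceAll("");
    }
}
```

This requires the whole file content in one String, so for very large files the line-by-line approach is likely the better fit.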
From the command line: grep -vi danny input.txt
The -v flag removes all lines matching the pattern -- in this case "danny". The -i flag makes it case-insensitive.

How do I extract the following patterns in java [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 8 years ago.
I have a string in the following format:
String s = " some text....
[[Category:Anarchism| ]]
[[Category:Political culture]]
[[Category:Political ideologies]]
[[Category:Far-left politics]]
... some more text"
I want to extract all the categories from this text: [Anarchism,Political culture ....,Far-left politics]
Also, is there a good tutorial where I can learn about regex pattern matching?
Thanks
You can use the following regex to get categories:
\[\[Category:(.+)\]\]
You can then read the capture group to get each category value.
Remember to escape each backslash when you write the pattern as a Java string literal:
\\[\\[Category:(.+)\\]\\]
Assuming you don't want to select the word "Category" itself, the regex would be:
(?<=Category:).*?(?=])
I'll break this down a bit for you.
The first bit in parentheses, (?<=Category:), looks for Category: without actually selecting it.
Next, .*? looks for zero or more characters (other than a newline), but stops as soon as the next part is matched.
The final parentheses, (?=]), tell it to look for a ], but without actually selecting it.
For the sample text, the matches would be "Anarchism| " (including the pipe and the trailing space), "Political culture", "Political ideologies" and "Far-left politics".
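To collect the values in Java, a Matcher loop over the text is one option. The sketch below uses a slightly different pattern from the two answers: the capture group stops at the first ] or |, and the result is trimmed, so that "[[Category:Anarchism| ]]" yields just Anarchism. That pattern and the names CategoryExtractor and extract are choices made for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; names and the exact pattern are assumptions for this sketch.
final class CategoryExtractor {

    // Group 1 is everything after "[[Category:" up to the first ']' or '|'.
    private static final Pattern CATEGORY = Pattern.compile("\\[\\[Category:([^\\]|]+)");

    static List<String> extract(String text) {
        List<String> result = new ArrayList<>();
        Matcher m = CATEGORY.matcher(text);
        while (m.find()) {              // find() moves on to the next occurrence each time
            result.add(m.group(1).trim());
        }
        return result;
    }
}
```

Calling CategoryExtractor.extract(s) on the string from the question returns the four category names in the order they appear.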

How to validate math formula string using regex? [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 8 years ago.
I have a textbox for entering a mathematical formula that may include +,-,*,/,(,),TRUNC,ROUND,POWER,MOD,SQRT,FLOOR,DECODE. After the user enters a formula string, I want to validate it, but I don't know how.
Please help me out.
Unfortunately, you can't validate such expressions using regex. Regular expressions, by their nature, cannot check that parentheses are matched to arbitrary depth. Regex is simply too weak.
For more information why it is so: http://en.wikipedia.org/wiki/Regular_expression
In order to validate, parse or evaluate mathematical expressions you need a context-free grammar parser. You can generate one relatively easily using a parser generator. I would recommend:
JavaCC: https://javacc.java.net/
Antlr: http://www.antlr.org/
Context free grammars: http://en.wikipedia.org/wiki/Context-free_language
Have a look at this question on Code Review. It shows some code for parsing such expressions and the answers give examples on how to alternatively use the scripting engine that ships with Java.
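For a grammar this small, a hand-written recursive descent parser is an alternative to a parser generator. The sketch below only answers "is this formula well formed?" and evaluates nothing. The grammar is an assumption, since the question does not define one: numbers, the four operators, unary signs, parentheses, and the listed function names in upper case with one or more comma-separated arguments. It does not check how many arguments each function takes, and it does not allow variables. The class name FormulaValidator is made up.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical validator; the grammar below is an assumption, not taken from the question.
final class FormulaValidator {
    private static final Set<String> FUNCTIONS = new HashSet<>(Arrays.asList(
            "TRUNC", "ROUND", "POWER", "MOD", "SQRT", "FLOOR", "DECODE"));

    private final String src;
    private int pos;

    private FormulaValidator(String src) { this.src = src; }

    static boolean isValid(String formula) {
        FormulaValidator v = new FormulaValidator(formula);
        return v.expr() && v.atEnd();   // the whole input must be consumed
    }

    private void skipSpaces() {
        while (pos < src.length() && Character.isWhitespace(src.charAt(pos))) pos++;
    }

    private boolean atEnd() { skipSpaces(); return pos == src.length(); }

    // Next non-blank character, or '\0' at end of input.
    private char peek() { skipSpaces(); return pos < src.length() ? src.charAt(pos) : '\0'; }

    private boolean accept(char c) {
        if (peek() == c) { pos++; return true; }
        return false;
    }

    // expr := term (('+' | '-') term)*
    private boolean expr() {
        if (!term()) return false;
        while (accept('+') || accept('-')) {
            if (!term()) return false;
        }
        return true;
    }

    // term := factor (('*' | '/') factor)*
    private boolean term() {
        if (!factor()) return false;
        while (accept('*') || accept('/')) {
            if (!factor()) return false;
        }
        return true;
    }

    // factor := ('+' | '-') factor | '(' expr ')' | NUMBER | call
    private boolean factor() {
        if (accept('+') || accept('-')) return factor();
        if (accept('(')) return expr() && accept(')');
        char c = peek();
        if (Character.isDigit(c)) return number();
        if (Character.isLetter(c)) return call();
        return false;
    }

    // NUMBER := digits ('.' digits)?
    private boolean number() {
        while (pos < src.length() && Character.isDigit(src.charAt(pos))) pos++;
        if (pos < src.length() && src.charAt(pos) == '.') {
            pos++;
            int start = pos;
            while (pos < src.length() && Character.isDigit(src.charAt(pos))) pos++;
            if (pos == start) return false;   // a trailing dot such as "1." is rejected
        }
        return true;
    }

    // call := FUNCTION '(' expr (',' expr)* ')'
    private boolean call() {
        int start = pos;
        while (pos < src.length() && Character.isLetter(src.charAt(pos))) pos++;
        if (!FUNCTIONS.contains(src.substring(start, pos))) return false;
        if (!accept('(')) return false;
        do {
            if (!expr()) return false;
        } while (accept(','));
        return accept(')');
    }
}
```

The nested call from factor back into expr is what lets this handle parentheses of any depth, which is exactly the part a regular expression cannot do.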

How do I write a regex that fit a desirable pattern and not fit another pattern? [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 8 years ago.
I am writing a simplified Java compiler. I wrote a regex for variable names:
"(_?[a-zA-Z]+[\w]*)"
and I want to add that the name cannot be one of certain words, like int, double, true, false...
I tried using ^, but it is not working.
It can be done with a regex, but it's not easy for a human to write. Treat keywords as identifiers in the scanner and distinguish identifiers from keywords in the tokenizer afterwards. That should be substantially easier.
I don't believe this should be done via regular expressions; it is better done with a HashSet<String> of reserved words, rejecting any identifier name that the set contains.
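A minimal sketch of the HashSet idea could look like this. The name pattern is the one from the question; the class name, the method name and the short list of reserved words are placeholders, and a real compiler would list every keyword of its language.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.regex.Pattern;

// Hypothetical helper; the reserved-word list is an illustrative subset only.
final class Identifiers {

    // The pattern from the question, written as a Java string literal.
    private static final Pattern NAME = Pattern.compile("_?[a-zA-Z]+\\w*");

    private static final Set<String> RESERVED = new HashSet<>(Arrays.asList(
            "int", "double", "true", "false"));

    /** A token is a variable name if it has the right shape and is not reserved. */
    static boolean isVariableName(String token) {
        return NAME.matcher(token).matches() && !RESERVED.contains(token);
    }
}
```

Keeping the shape check and the keyword check separate means that adding a keyword later is a one-word change to the set, with no regex editing.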
^ is used for something else:
^ may appear at the beginning of a pattern to require the match to
occur at the very beginning of a line. For example, ^abc matches
abc123 but not 123abc.
Consider using "(?!...)":
(?!...) is a negative look-ahead because it requires that the
specified pattern not exist.
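Applied to the pattern from the question, a negative look-ahead could be used as sketched below. The reserved words are just the four from the question, and the class and method names are made up. The \b after the alternatives is there so that only the exact words are rejected; names that merely start with a keyword, such as integer, still pass.

```java
import java.util.regex.Pattern;

// Hypothetical helper; the keyword list is only the four words from the question.
final class IdentifierPattern {

    // (?!...) fails the match if one of the reserved words, followed by a word
    // boundary, starts at the current position; the rest is the original name pattern.
    private static final Pattern NAME = Pattern.compile(
            "(?!(?:int|double|true|false)\\b)_?[a-zA-Z]+\\w*");

    static boolean isVariableName(String token) {
        return NAME.matcher(token).matches();
    }
}
```

Every new keyword makes the alternation longer, which is one reason the set-based check can be easier to maintain.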
I suggest that if it's impossible or too hard, switch to regular code instead. Sometimes regular expressions can be much slower than hand-written, optimized code, and they can be very confusing, so you might have trouble finding what's wrong with what you've written.
For trying out your regular expressions, check this one:
http://gskinner.com/RegExr/
For quick reference, check this one:
http://www.autohotkey.com/docs/misc/RegEx-QuickRef.htm
