I am trying to remove specific lines from a text file using regex, but I am receiving an IllegalStateException. I have only recently started getting accustomed to regex and have tried to use match.matches(); but that solution has not worked for me. Any advice on what I am doing wrong?
try {
BufferedReader br = new BufferedReader(new FileReader("TestFile.txt"));
//System.out.println(br.toString());
ArrayList<String> list = new ArrayList<String>();
String line= br.readLine() ;
while (br.readLine() != null ) {
//System.out.println(line);
//System.out.println("test1"); {
Pattern regex = Pattern.compile("[^\\s\"]+|\"[^\"]*\"");
Matcher regexMatcher = regex.matcher(line);
String match = regexMatcher.group();// here is where the illegalstateexception occurs
match = removeLeadingChar(match, "\"");
match = removeLeadingChar(match, "\"");
list.add(match);
// }
// br.close();
System.out.println(br);
Exception in thread "main" java.lang.IllegalStateException: No match found
at java.base/java.util.regex.Matcher.group(Unknown Source)
at java.base/java.util.regex.Matcher.group(Unknown Source)
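Call Matcher.find() to check whether the pattern matches before you call group(); group() throws IllegalStateException when no match has been attempted or the last attempt failed. You can step through the results of regexMatcher.find() in your IDE's debugger (e.g. IntelliJ).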
try {
BufferedReader br = new BufferedReader(new FileReader("TestFile.txt"));
ArrayList<String> list = new ArrayList<>();
String line;
// Assign one line read from the file to a variable
while ((line = br.readLine()) != null) {
System.out.println(line);
Pattern regex = Pattern.compile("[^\\s\"]+|\"[^\"]*\"");
Matcher regexMatcher = regex.matcher(line);
// Returns true if a match is found for the regular expression pattern.
while (regexMatcher.find()) {
String match = regexMatcher.group();
match = removeLeadingChar(match, "\"");
match = removeLeadingChar(match, "\"");
list.add(match);
}
}
// What is the purpose of this code?
System.out.println(br);
// If you want to output the string elements of the list
System.out.println(list.toString());
// The reader must be closed after use (to avoid leaking the file handle)
br.close();
} catch (IOException e) {
// exception handling
e.printStackTrace();
}
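As a file-independent illustration of the same find()/group() loop, here is a minimal sketch. The class name QuotedTokenizer and the method tokenize are made up for this example, and the quote stripping is done inline instead of calling the question's removeLeadingChar helper, whose implementation is not shown.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class QuotedTokenizer {
    // Same pattern as in the question: a run of non-space, non-quote characters,
    // or a double-quoted string.
    private static final Pattern TOKEN = Pattern.compile("[^\\s\"]+|\"[^\"]*\"");

    // Hypothetical helper: returns the tokens of one line with surrounding quotes removed.
    static List<String> tokenize(String line) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(line);
        // find() advances to the next match; group() is only legal after it returned true.
        while (m.find()) {
            String token = m.group();
            if (token.length() >= 2 && token.startsWith("\"") && token.endsWith("\"")) {
                token = token.substring(1, token.length() - 1);
            }
            tokens.add(token);
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [alpha, beta gamma, delta]
        System.out.println(tokenize("alpha \"beta gamma\" delta"));
    }
}
```

A line with no tokens simply yields an empty list, because the loop body never runs when find() returns false.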
Your while loop was wrong: the condition read a line and threw it away, so line never changed after the first read. You also have to call find() before group(). Try this:
try {
    BufferedReader br = new BufferedReader(new FileReader("TestFile.txt"));
    ArrayList<String> list = new ArrayList<String>();
    // Compile the pattern once, outside the loop
    Pattern regex = Pattern.compile("[^\\s\"]+|\"[^\"]*\"");
    String line; // <--- FIXED
    while ((line = br.readLine()) != null) { // <--- FIXED
        Matcher regexMatcher = regex.matcher(line);
        while (regexMatcher.find()) { // <--- FIXED: group() is only valid after a successful find()
            String match = regexMatcher.group();
            match = removeLeadingChar(match, "\"");
            match = removeLeadingChar(match, "\"");
            list.add(match);
        }
    }
    br.close();
    System.out.println(list.toString());
} catch (IOException e) {
    e.printStackTrace();
}
I have code which removes some characters (space, tab) from a string entered by the user, and then shows the text:
System.out.println("Text:");
BufferedReader bufferedReader = new BufferedReader(new InputStreamReader(System.in));
try {
String text = bufferedReader.readLine();
text = text.replaceAll("\n", "");
text = text.replaceAll(" ", "");
text = text.replaceAll("\t", "");
System.out.println(text);
} catch (IOException e) {
}
But when I paste text spanning several lines:
First Substring Introduced
Second Substring Introduced
Third Substring Introduced
it shows just the substring before the first newline like:
FirstSubstringIntroduced
I want to obtain the following result for the whole pasted text:
FirstSubstringIntroducedSecondSubstringIntroducedThirdSubstringIntroduced
You are reading just one line, the first one:
String text = bufferedReader.readLine(); //just one line
That's why the output shows only the first line processed. You should use a loop to read all of the lines you enter:
while((text=bufferedReader.readLine())!=null)
{
text = text.replaceAll("\n", "");
text = text.replaceAll(" ", "");
text = text.replaceAll("\t", "");
System.out.print(text);
}
The first iteration will print FirstSubstringIntroduced, the second SecondSubstringIntroduced, and so on, until all the lines are processed.
Try aggregating all lines together, after removing tab and space from each line:
StringBuilder sb = new StringBuilder();
String text = "";
try {
while ((text = br.readLine()) != null) {
text = text.replaceAll("[\t ]", "");
sb.append(text);
}
}
catch (IOException e) {
}
System.out.println(sb);
The issue here is that your BufferedReader is reading one line at a time.
As an alternative, and closer to your current solution, you could just use System.out.print, which does not automatically print a newline, instead of System.out.println:
try {
while ((text = br.readLine()) != null) {
text = text.replaceAll("[\t ]", "");
System.out.print(text);
}
}
catch (IOException e) {
}
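Putting the pieces of this approach together, a self-contained sketch might look like the following. LineJoiner and joinWithoutWhitespace are names invented for illustration, and a StringReader stands in for System.in so the example runs without typing input. Keep in mind that readLine() on System.in only returns null at end of input (for example Ctrl-D on Unix terminals, Ctrl-Z on Windows), so an interactive loop keeps waiting until then.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

final class LineJoiner {
    // Reads every line from the source, strips spaces and tabs, and joins the results.
    static String joinWithoutWhitespace(Reader source) {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader br = new BufferedReader(source)) {
            String line;
            // readLine() already drops the line terminator, so no "\n" handling is needed.
            while ((line = br.readLine()) != null) {
                sb.append(line.replaceAll("[\\t ]", ""));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String pasted = "First Substring Introduced\nSecond Substring Introduced\n";
        // prints FirstSubstringIntroducedSecondSubstringIntroduced
        System.out.println(joinWithoutWhitespace(new StringReader(pasted)));
    }
}
```

To read from the console instead, pass new InputStreamReader(System.in) as the source.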
Note that String#replaceAll expects a regular expression. String#replace replaces all occurrences of the first argument with the second argument (which is what you want).
System.out.println(text.replace("\n", "").replace("\r", ""));
The method names are a little bit confusing.
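The difference is easiest to see with a character that has a special meaning in regular expressions. The sketch below uses made-up names (ReplaceDemo, literal, regex, stripWhitespace) purely for illustration.

```java
final class ReplaceDemo {
    // replace(): both arguments are literal text; every occurrence is replaced.
    static String literal(String s) {
        return s.replace(".", "");
    }

    // replaceAll(): the first argument is a regex, so "." matches any character.
    static String regex(String s) {
        return s.replaceAll(".", "");
    }

    // A regex character class strips spaces, tabs and line breaks in one pass.
    static String stripWhitespace(String s) {
        return s.replaceAll("[ \\t\\r\\n]", "");
    }

    public static void main(String[] args) {
        System.out.println(literal("a.b.c"));          // prints abc
        System.out.println(regex("a.b.c").isEmpty());  // prints true
        // prints FirstSubstringIntroduced
        System.out.println(stripWhitespace("First Substring\tIntroduced\r\n"));
    }
}
```

For fixed strings such as "\n" or " " both methods give the same result; replace() just avoids the regex machinery and any surprises from metacharacters.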
public static void main(String args[]) {
System.out.println("Text:");
StringBuilder stringBuilder = new StringBuilder();
try (InputStreamReader inputStreamReader = new InputStreamReader(System.in);
BufferedReader bufferedReader = new BufferedReader(inputStreamReader);
Scanner scanner = new Scanner(bufferedReader);
) {
while (scanner.hasNext()) {
stringBuilder.append(scanner.next());
}
} catch (IOException e) {
e.printStackTrace();
}
System.out.println(stringBuilder.toString());
}
I do think this is what you need.
{
"TEST":"189456",
"TEST1":"X_Y_Z",
"TEST2":"Y_Z_W",
"TEST3":"GGG ",
"TEST4":"32423423233322"
},
{
"TEST":"123456",
"TEST1":"X_E_Z",
"TEST2":"T_Z_W",
"TEST3":"EWE ",
"TEST4":"324234243234"
}
This is a .txt file. I want to read it and print only 189456 and 123456 from it. Can anyone help me do this? Please find the code for reference, and please post the simplest code you can.
Pattern p = Pattern.compile("\"Test\"\\s*:\\s*\"(.*)\"", Pattern.CASE_INSENSITIVE);
while ( (line = bf.readLine()) != null) {
linecount++;
Matcher m = p.matcher(line);
// indicate all matches on the line
while (m.find()) {
System.out.println(m.group(1));
}
}
Another way to do it:
while ((line = br.readLine()) != null) {
if(line.contains("\"TEST\":")){
String[] lineValues = line.split(":");
System.out.println(lineValues[1].replace("\"", "").replace(",",""));
}
}
As for a Regex solution :
(.*)\"TEST":\"(.*?)\"
Note the ?: it makes the quantifier lazy, so the group stops at the first " that follows.
With spaces in between :
(.*)\"TEST"\s*:\s*\"(.*?)\"
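As a runnable sketch of the lazy pattern in Java string syntax (the class name TestValueExtractor and the method extract are invented for this example, and the leading (.*) group is dropped because find() does not need it):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TestValueExtractor {
    // Matches the key "TEST" exactly (not "TEST1", because the closing quote must follow
    // directly), then captures the shortest run of characters up to the next quote.
    private static final Pattern TEST_VALUE = Pattern.compile("\"TEST\"\\s*:\\s*\"(.*?)\"");

    static List<String> extract(String text) {
        List<String> values = new ArrayList<>();
        Matcher m = TEST_VALUE.matcher(text);
        while (m.find()) {
            values.add(m.group(1));
        }
        return values;
    }

    public static void main(String[] args) {
        String sample = "{ \"TEST\":\"189456\", \"TEST1\":\"X_Y_Z\" },\n"
                + "{ \"TEST\" : \"123456\", \"TEST1\":\"X_E_Z\" }";
        System.out.println(extract(sample)); // prints [189456, 123456]
    }
}
```

The lazy group matters as soon as several key/value pairs share a line, as in the sample above: a greedy (.*) would run on to the last quote of that line.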
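With the provided input, you should read it as JSON instead of raw text. Note that the file is only valid JSON if the objects are wrapped in [ ], and that passing test.getClass() gives Jackson no element type, so it would build maps rather than TestObj instances; a TypeReference avoids that:
com.fasterxml.jackson.databind.ObjectMapper mapper = new com.fasterxml.jackson.databind.ObjectMapper();
List<TestObj> test = mapper.readValue(new File("c:\\YourFile.txt"),
new com.fasterxml.jackson.core.type.TypeReference<List<TestObj>>() {});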
Where TestObj is something like this:
class TestObj {
String test;
String test1; // You should use json annotation here because it does not match your json field name.
...
// getter setter methods
}
Hope I understood the question the right way :D
String saveData;
Pattern p = Pattern.compile("\"Test\"\\s*:\\s*\"(.*)\"", Pattern.CASE_INSENSITIVE);
while ( (line = bf.readLine()) != null) {
linecount++;
Matcher m = p.matcher(line);
// indicate all matches on the line
if(line.contains("189456") || line.contains("123456")) {
saveData = line;
}
}
If the String returned by readLine() contains one of the searched values, the whole line is stored in saveData (each later hit overwrites the earlier one).
FileInputStream fstream = new FileInputStream("D:\\prac\\src\\test.txt");
BufferedReader br = new BufferedReader(new InputStreamReader(fstream));
String strLine;
while ((strLine = br.readLine()) != null) {
if(strLine.contains("\"TEST\":")){
System.out.println(strLine.split(":")[1].replaceAll("\"","").replace(",",""));
}
}
br.close();
}
Output:
189456
123456
I have a pattern here which finds the integers after a comma.
The problem I have is that the response comes back on several lines, and the pattern only works on one of them. How do I fix this? I want it to find the pattern in every line.
All help is appreciated:
url = new URL("https://test.com");
con = url.openConnection();
is = con.getInputStream();
br = new BufferedReader(new InputStreamReader(is));
while ((line = br.readLine()) != null) {
String responseData = line;
System.out.println(responseData);
}
pattern = "(?<=,)\\d+";
pr = Pattern.compile(pattern);
match = pr.matcher(responseData); // String responseData
System.out.println();
while (match.find()) {
System.out.println("Found: " + match.group());
}
Here is the response returned as a string:
test.test.test.test.test-test,0,0,0
test.test.test.test.test-test,2,0,0
test.test.test.test.test-test,0,0,3
Here is the printout:
Found: 0
Found: 0
Found: 0
The problem is with building your String, you're assigning only the last line from the BufferedReader:
responseData = line;
If you print responseData before you try to match, you'll see it's only one line, and not what you expected.
Since you're printing the buffer's content using a System.out.println, you do see the whole result, but what's getting saved to responseData is actually the last line.
You should use a StringBuilder to build the whole string:
StringBuilder str = new StringBuilder();
while ((line = br.readLine()) != null) {
str.append(line).append('\n'); // keep a separator so adjacent lines do not run together
}
responseData = str.toString();
// now responseData contains the whole String, as you expected
Tip: Use the debugger; it will help you understand your code better and find bugs much faster.
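You can pass the Pattern.MULTILINE option when compiling your regex, although it only changes how ^ and $ behave; this pattern uses neither, so responseData still has to contain every line of the response: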
pattern = "(?<=,)\\d+";
pr = Pattern.compile(pattern, Pattern.MULTILINE);
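For reference, Pattern.MULTILINE only changes where ^ and $ may match. The sketch below shows this on a shortened stand-in for the response (the class name CommaNumbers, the helper findAll and the sample text are assumptions made for illustration).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class CommaNumbers {
    // Collects every match of the pattern in the given text.
    static List<String> findAll(Pattern pattern, String text) {
        List<String> found = new ArrayList<>();
        Matcher m = pattern.matcher(text);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String response = "a-b,0,0,0\na-b,2,0,0\na-b,0,0,3";
        // No anchors in the pattern: every number after a comma is found, on every line.
        System.out.println(findAll(Pattern.compile("(?<=,)\\d+"), response));
        // With an anchor and no flag, $ only matches at the end of the input: prints [3]
        System.out.println(findAll(Pattern.compile("\\d+$"), response));
        // With MULTILINE, $ also matches before each line break: prints [0, 0, 3]
        System.out.println(findAll(Pattern.compile("\\d+$", Pattern.MULTILINE), response));
    }
}
```

So for the lookbehind pattern from the question, what matters is that the matcher is given the complete text, not which flag is set.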
I want to apply my regular expression not just to the first line of the text file, but to all the lines together.
Currently it matches only when the entire match is on one line. If the match continues on the next line, it doesn't match at all.
class Parser {
public static void main(String[] args) throws IOException {
Pattern patt = Pattern.compile("(include|"
+ "integrate|"
+ "driven based on|"
+ "facilitate through|"
+ "contain|"
+ "using|"
+ "equipped|"
+ "integrate|"
+ "implement|"
+ "utilized to facilitate|"
+ "comprise){1}"
+ "[\\s\\w\\,\\(\\)\\;\\:]*\\."); //Regex
BufferedReader r = new BufferedReader(new FileReader("E:/test/test.txt")); // read the file
String line;
PrintWriter pWriter = null;
while ((line = r.readLine()) != null) {
Matcher matcher = patt.matcher(line);
while (matcher.find()) {
try{
pWriter = new PrintWriter(new BufferedWriter(new FileWriter("E:/test/test1.txt", true)));//append any given input
pWriter.println(matcher.group()); //write the result of matcher to the new file
} catch (IOException ioe) {
ioe.printStackTrace();
} finally {
if (pWriter != null){
pWriter.flush();
pWriter.close();
}
}
System.out.println(matcher.group());
}
}
}
}
Replace the while ((line = r.readLine()) != null) loop with this:
StringBuilder file = new StringBuilder(); // Basically, a conglomerate of all of the lines in the file
while ((line = r.readLine()) != null) {
    file.append(line).append('\n'); // Append each line, restoring the line break that readLine() strips
}
Matcher matcher = patt.matcher(file);
while (matcher.find()) {
    /* Blah blah blah, your outputting goes here. */
}
The reason why this happens is because you're doing each line individually. For what you want, you need to apply the regex to the file all at once.
Currently the matcher is applied per line, it needs to be applied to the whole file to work as intended.
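To make the difference concrete, here is a small sketch that runs one pattern both ways. The class WholeTextMatch, the shortened pattern and the sample sentence are assumptions chosen for illustration, not the asker's data.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class WholeTextMatch {
    // Shortened stand-in for the question's pattern: a keyword, then words,
    // whitespace and commas up to the next period.
    private static final Pattern CLAUSE = Pattern.compile("using[\\s\\w,]*\\.");

    // Matches each line separately, as the original loop does.
    static List<String> perLine(String text) {
        List<String> found = new ArrayList<>();
        for (String line : text.split("\\R")) {
            Matcher m = CLAUSE.matcher(line);
            while (m.find()) {
                found.add(m.group());
            }
        }
        return found;
    }

    // Matches against the complete text, so a match may span line breaks.
    static List<String> wholeText(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = CLAUSE.matcher(text);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String text = "The system works using a pump,\nvalve and sensor. Done.";
        System.out.println(perLine(text).size());   // prints 0
        System.out.println(wholeText(text).size()); // prints 1
    }
}
```

With Java 11 or later, Files.readString(Path.of(...)) is one way to obtain the whole file as a single String; on older versions a StringBuilder loop over readLine() does the same job.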
Regexes are greedy: the character class keeps consuming until it hits a character it does not contain, such as . (or another special character), so the first match can cover the whole String:
...
+ "comprise){1}"
+ "[\\s\\w\\,\\(\\)\\;\\:]*\\."); //Regex
The last line matches any whitespace and word characters, so pretty much anything except a period. The {1} and most of the backslashes are also superfluous (the characters sit inside []):
...
+ "comprise)"
+ "[\\s\\w,();:]*\\."); //Regex
If you don't care about the newline characters just remove them first and it should work (I see no way around it if you have something like "com\nprise" and want to match that):
s = s.replaceAll("[\\r\\n]+", ""); // also removes the \r of Windows line endings
I have a text file with several lines
Category: Type of problem you're having
Description: Overview of the problem
How To Fix: Directions to fix your problem (has carriage
returns, sometimes)
Related Links: Additional Resources
**There are no numbers in my list; it was the only way I could think of to make it neater...*
I've been trying to get my code to recognize all of the information between "How To Fix:" and "Related Links" when it spans more than one line. I know from my research that I have to use either (?s) or Pattern.DOTALL; however, neither of them seems to be working. I'm fairly new to Regex, so I expect it is something elementary. Here is my code:
String fileName = System.getProperty("user.home") + "/Desktop/Test.txt";
try {
FileReader fr = new FileReader(fileName);
BufferedReader br = new BufferedReader(fr);
sc1 = new Scanner(br);
String findingRegex = "(Description:.*)";
String recommRegex = "(?<=How To Fix:)(.*)(?=Related Links)";//regex I'm trying to use
Pattern pFinding = Pattern.compile(findingRegex);
Pattern pRecomm = Pattern.compile(recommRegex, Pattern.DOTALL);
while (sc1.hasNextLine()) {
String clean = sc1.nextLine().trim();
String clean2 = clean.replaceAll("\\\\x\\p{XDigit}{2}", "");
Matcher mFinding = pFinding.matcher(clean2);
Matcher mRecomm = pRecomm.matcher(clean2);
while (mFinding.find()) {
System.out.println(mFinding);
}
while (mRecomm.find()){
System.out.println(mRecomm); //nothing prints?
}
}
br.close();
fr.close();
System.out.println("The following data was imported: ");
try {
tbl.displayAll();
} catch (NullPointerException npe) {
System.out.println("You have no data.");
}
} catch (FileNotFoundException fnfe) {
System.out.println("File named Test.txt was not located on your desktop. Program Terminated.");
System.exit(0);
} catch (IOException ioe) {
System.out.println("The import operation failed. Program Terminated");
System.exit(0);
} finally {
sc1.close();
}
Lastly, I tested my Regex here and it worked as expected?
MY SOLUTION:
String findingRegex = "(?<=Description:)(.*)(?=How To Fix)";
String recommRegex = "(?<=How To Fix:)(.*)(?=Related Links)";
Pattern pFinding = Pattern.compile(findingRegex, Pattern.DOTALL);
Pattern pRecomm = Pattern.compile(recommRegex, Pattern.DOTALL);
StringBuilder sb = new StringBuilder();
String line = br.readLine();
while (line != null){
sb.append(line);
sb.append(System.lineSeparator());
line = br.readLine();
}
String newFile = sb.toString();
Matcher mFinding = pFinding.matcher(newFile);
Matcher mRecomm = pRecomm.matcher(newFile);
while (mFinding.find()) {
// Print the matched text rather than the Matcher object itself
System.out.println(mFinding.group().trim());
}
while (mRecomm.find()){
System.out.println(mRecomm.group().trim());
}
Here:
String clean = sc1.nextLine().trim();
You are breaking your input up by line. But then you're trying to match multiple lines. There aren't multiple lines to match, because you only kept the one.
You could read the entire file into memory first, and then match against it. Or you could do something like
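StringBuilder sb = new StringBuilder();
int state = 0; // 1 while inside the "How To Fix:" section
while (sc1.hasNextLine()) {
    String line = sc1.nextLine();
    if (line.contains("Related Links:")) {
        state = 0; // checked first, so the closing marker line is not appended
    }
    if (line.contains("How To Fix:")) {
        state = 1;
    }
    if (state == 1) {
        sb.append(line).append(System.lineSeparator());
    }
}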
(You'll need to modify this if you need to match more than once per file.)
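If the whole file is read into memory first, a lazy quantifier lets the DOTALL pattern return one match per record instead of one match spanning from the first "How To Fix:" to the last "Related Links". The following is a minimal sketch under that assumption; the class name HowToFixExtractor and the sample text are invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class HowToFixExtractor {
    // DOTALL lets "." cross line breaks; the lazy ".*?" stops at the nearest "Related Links".
    private static final Pattern FIX = Pattern.compile(
            "(?<=How To Fix:)(.*?)(?=Related Links)", Pattern.DOTALL);

    // Returns the trimmed text of every "How To Fix:" section in the input.
    static List<String> extract(String text) {
        List<String> fixes = new ArrayList<>();
        Matcher m = FIX.matcher(text);
        while (m.find()) {
            fixes.add(m.group(1).trim());
        }
        return fixes;
    }

    public static void main(String[] args) {
        String text = "Category: A\nDescription: B\n"
                + "How To Fix: step one\nstep two\nRelated Links: X\n"
                + "Category: C\nHow To Fix: reboot\nRelated Links: Y\n";
        // Two sections are found: "step one\nstep two" and "reboot"
        for (String fix : extract(text)) {
            System.out.println("[" + fix + "]");
        }
    }
}
```

Reading the whole file is the simpler option for small files; the line-by-line state machine avoids holding everything in memory.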