How to read a text file for particular kind of data? - java

I have a special scenario in which I read a file in two while loops. The second loop reads the file from the beginning, but I want it to continue from where the first loop stopped.
Here is my code:
while ((line = br.readLine()) != null) {
if (line.startsWith(rootId.trim())) {
break;
}
}
while (!(line = br.readLine()).contains("---------------------------------------------------")) {
// my other code stuff
}
Here my file stores data as follow,
-----------------------------------------------------
00001# // this is the rootId
N1
N2
-----------------------------------------------------
00002#
N1
N2
-----------------------------------------------------
00003#
N1
N2
This method takes a rootId and displays its nodes (N1, N2) along with my other output. My strategy is to read the file until I reach the rootId, and then, in another loop, do my processing until I reach a separator line (--------). But the second loop starts reading the file from the beginning again. How can I solve this? Can anyone help me with this?

boolean inBlock = false; // becomes true once the rootId line has been read
while ((line = br.readLine()) != null)
{
    if (!inBlock)
    {
        if (line.startsWith(rootId.trim()))
        {
            inBlock = true;
        }
        continue;
    }
    if (line.contains("---------------------------------------------------"))
    {
        break; // end of this rootId's block
    }
    // my other code stuff
}
All of your code should be in a single loop. Try it like this and it should work.

I don't think you need two loops. You should be able to process your file line by line in the first loop, using if statements:
while ((line = br.readLine()) != null)
{
    if (line.startsWith(rootId.trim()) || line.contains("----"))
    {
        continue;
    }
    // Process your nodes here - add more if statements if necessary
}
Note that you need
continue;
rather than
break;

Related

How to read specific lines with bufferedReader

I need to find some specific data in a txt file; see the code below.
while ((line = bufferedReader.readLine())!= null) {
// if the line contains the string, write it to the array
if (line.toLowerCase().contains("list c.")) {
parsedData.add(line);
}
if(line.toLowerCase().startsWith("re")) {
parsedData.add(line);//add found data to array
//i need to access and save second and third line after this one
}
System.out.println(line);
}
In the second condition, when I find a line that starts with "re" I need to save the second and third line after this specific one.
I am not sure from your question, but if your goal is to get the next couple of lines (for example 2) after a line that starts with "re", you can do it with a flag and a counter.
boolean needsToConsider = false;
int countOfLines = 0;
while ((line = bufferedReader.readLine()) != null) {
    if (needsToConsider && countOfLines > 0) {
        parsedData.add(line); // one of the lines following a "re" line
        countOfLines--;
        if (countOfLines == 0)
            needsToConsider = false;
    }
    // if the line contains the string, write it to the array
    if (line.toLowerCase().contains("list c.")) {
        parsedData.add(line);
    }
    if (line.toLowerCase().startsWith("re")) {
        parsedData.add(line); // add found data to array
        // capture the next two lines as well
        needsToConsider = true;
        countOfLines = 2;
    }
}
A simple approach here might be to just use a counter to keep track of hitting those second and third lines:
int counter = 0;
while ((line = bufferedReader.readLine())!= null) {
if (line.toLowerCase().contains("list c.")) {
parsedData.add(line);
}
else if (line.toLowerCase().startsWith("re")) {
parsedData.add(line);
counter = 2;
}
else if (counter > 0) {
// add second and third lines after "re" here
parsedData.add(line);
--counter;
}
}
A more advanced approach could be to read in the entire portion of text of interest, and then use a regex matcher to extract what you want.
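A rough sketch of that regex idea, assuming "second and third line" means the two lines directly after the "re" line and that the text of interest fits in memory as a single String. The class name FollowingLines and the sample text are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FollowingLines {

    // A line starting with "re" (case-insensitive), then the next two lines.
    // (?i) = case-insensitive, (?m) = ^ and $ match at line boundaries,
    // \R = any line break (Java 8+).
    private static final Pattern RE_BLOCK =
            Pattern.compile("(?im)^(re.*)\\R(.*)\\R(.*)$");

    static List<String> extract(String text) {
        List<String> parsed = new ArrayList<>();
        Matcher m = RE_BLOCK.matcher(text);
        while (m.find()) {
            parsed.add(m.group(1)); // the "re" line itself
            parsed.add(m.group(2)); // first line after it
            parsed.add(m.group(3)); // second line after it
        }
        return parsed;
    }

    public static void main(String[] args) {
        // Hypothetical sample text.
        String text = "header\nRe: order 17\nfirst detail\nsecond detail\nfooter\n";
        System.out.println(extract(text)); // prints [Re: order 17, first detail, second detail]
    }
}
```

For large files the counter version above is probably the better fit, since it never holds more than one line at a time.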

Ignoring blank lines in CSV file in Java

I am trying to iterate through a CSV file in Java. It iterates through the entire file, but will get to the end of the file and try to read the next blank line and throw an error. My code is below.
public class Loop {
public static void main(String[] args) {
BufferedReader br = null;
String line = "";
try {
HashMap<Integer, Integer> changeData = new HashMap<Integer, Integer>();
br = new BufferedReader(new FileReader("C:\\xxxxx\\xxxxx\\xxxxx\\the_file.csv"));
String headerLine = br.readLine();
while ((line = br.readLine()) != null) {
String[] data = line.split(",");
/*Below is my latest attempt at fixing this,*/
/*but I've tried other things too.*/
if (data[0].equals("")) { break; }
System.out.println(data[0] + " - " + data[6]);
int changeId = Integer.parseInt(data[0]);
int changeCv = Integer.parseInt(data[6]);
changeData.put(changeId, changeCv);
}
} catch (IOException e) {
e.printStackTrace();
}
}
}
Like I typed, this works fine until it gets to the end of the file. When it gets to the end of the file, I get the error Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 0 at com.ucg.layout.ShelfTableUpdates.main(ShelfTableUpdates.java:23). I've stepped through the code by debugging it in Spring Tool Suite. The error comes up whenever I try to reference data[0] or data[6]; likely because there is nothing in that line. Which leads me back to my original question of why it is even trying to read the line in the first place.
It was my understanding that while ((line = br.readLine()) != null) would detect the end of the file, but it doesn't seem to be. I've tried re-opening the file and deleting all of the blank rows, but that did not work.
Any idea how I can detect the end of the file so I don't get an error in this code?
ANSWER:
Credit goes to user @quemeraisc. I was also able to strip the commas from the line: if the result is empty, I treat it as the end of the file, since in my case there are no blank rows before the end. This still doesn't truly detect the end of the file: if I had blank rows between my data rows, this check would stop at those too.
Solution 1:
if (data.length < 7) {
System.out.println(data.length);
break;
}
Solution 2:
if (line.replace(",", "").isEmpty()) {
    System.out.println(line.replace(",", ""));
    break;
}
Just skip all blank lines:
while ((line = br.readLine()) != null) {
if( line.trim().isEmpty() ) {
continue;
}
....
....
The last line may contain invisible characters. String#trim() only strips leading and trailing characters up to U+0020 (space, newline, carriage return and the other ASCII control characters), so anything else, such as a non-breaking space or a byte-order mark, survives it. To see how to remove those, look at this answer: How can i remove all control characters from a java string?
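public String readLine() returns every line of your file, including empty ones. When you split an empty line, as in String[] data = line.split(","); you get an array of size 1. A line that consists only of commas gives an array of size 0, because split drops trailing empty strings, which explains the ArrayIndexOutOfBoundsException: 0.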
Why not try :
if (data.length >= 7)
{
System.out.println(data[0] + " - " + data[6]);
int changeId = Integer.parseInt(data[0]);
int changeCv = Integer.parseInt(data[6]);
changeData.put(changeId, changeCv);
}
which will make sure there are at least 7 elements in your array before proceeding.
To skip blank lines you could try:
while ((line = reader.readLine()) != null) {
if(line.length() > 0) {
String[] data = line.split(",");
/*Below is my latest attempt at fixing this,*/
/*but I've tried other things too.*/
if (data[0] == null || data[0].equals("")) { break; }
System.out.println(data[0] + " - " + data[6]);
int changeId = Integer.parseInt(data[0]);
int changeCv = Integer.parseInt(data[6]);
changeData.put(changeId, changeCv);
}
}
Instead of replace method use replaceAll method. Then it will work.
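Putting the suggestions above together, here is a minimal sketch that skips blank and comma-only rows instead of stopping at them. It assumes, as in the question, a header row and integer values in columns 0 and 6. The class name CsvBlankLines, the method load and the sample rows are made up for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;

public class CsvBlankLines {

    static Map<Integer, Integer> load(BufferedReader br) throws IOException {
        Map<Integer, Integer> changeData = new HashMap<>();
        br.readLine(); // skip the header row
        String line;
        while ((line = br.readLine()) != null) {
            // Skip rows that are empty or contain only commas and whitespace.
            if (line.replace(",", "").trim().isEmpty()) {
                continue;
            }
            String[] data = line.split(",");
            // Skip rows too short to have a column at index 6.
            if (data.length < 7) {
                continue;
            }
            changeData.put(Integer.parseInt(data[0].trim()),
                           Integer.parseInt(data[6].trim()));
        }
        return changeData;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical sample: a blank row in the middle, a comma-only row at the end.
        String csv = "id,a,b,c,d,e,cv\n"
                   + "1,x,x,x,x,x,10\n"
                   + "\n"
                   + "2,x,x,x,x,x,20\n"
                   + ",,,,,,\n";
        try (BufferedReader br = new BufferedReader(new StringReader(csv))) {
            System.out.println(load(br));
        }
    }
}
```

Using continue rather than break means a blank row in the middle of the data no longer ends the read early. readLine() still returns null only at the real end of the file.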

reversing order of lines in txt file every 'x' number of lines - JAVA

I have an input txt file where I have to take the first 50 lines and reverse their order, so that the output starts with the 50th line, then the 49th, down to the 1st, then continues with the 100th line followed by the 99th, and so on.
But I can only store at most 50 elements; I can't store more than that.
The code I've written so far only grabs the first 50 lines and reverses them, but I don't know how to make it continue.
This is what I have so far:
ArrayList<String> al = new ArrayList<String>();
int running = 0;
while(running == 0) {
for (String line = r.readLine(); line != null; line = r.readLine()) {
if(al.size() <50 ) {
al.add(line);
}
}
Collections.reverse(al);
for (String text : al) {
w.println(text);
}
if(al.size() < 50) {
break;
}
al.clear();
}
I don't know why my while loop won't keep running; I'm only getting the first 50 lines reversed in my output file.
This:
for (String line = r.readLine(); line != null; line = r.readLine()) {
if(al.size() <50 ) {
al.add(line);
}
}
reads all the lines in the file, stores the first fifty in al, and discards the rest. After you then process the results of al, there's nothing more for your program to do: it's read the whole file.
There's a great blog post on how to debug small programs, that I highly recommend: http://ericlippert.com/2014/03/05/how-to-debug-small-programs/
And in your specific case, I suggest breaking your program into functions. One of those functions will be "read up to n lines, and return them in an array-list". This sort of structure makes it easier to reason about each part of the program.
Your initial loop:
for (String line = r.readLine(); line != null; line = r.readLine()) {
if(al.size() <50 ) {
al.add(line);
}
}
Continues to read lines after you've filled al through to the end of the file - it just doesn't put them in the list.
You most likely need something like:
for (String line = r.readLine(); line != null; line = r.readLine()) {
if (al.size() == 50)
outputReverseLines();
al.add(line);
}
outputReverseLines();
Where outputReverseLines is a method that reverses, prints and clears the list.
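A self-contained sketch of that suggestion, with outputReverseLines written out. The class name ChunkReverser, the chunkSize parameter and the in-memory reader and writer in main are assumptions for illustration. With a real file you would pass your own r and w and a chunk size of 50.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.PrintWriter;
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class ChunkReverser {

    // Reverses, prints and clears the buffered lines.
    static void outputReverseLines(List<String> al, PrintWriter w) {
        Collections.reverse(al);
        for (String text : al) {
            w.println(text);
        }
        al.clear();
    }

    static void reverseInChunks(BufferedReader r, PrintWriter w, int chunkSize) throws IOException {
        List<String> al = new ArrayList<>(chunkSize);
        for (String line = r.readLine(); line != null; line = r.readLine()) {
            if (al.size() == chunkSize) {
                outputReverseLines(al, w); // buffer full: flush before adding more
            }
            al.add(line);
        }
        outputReverseLines(al, w); // flush the final, possibly shorter, chunk
        w.flush();
    }

    public static void main(String[] args) throws IOException {
        StringWriter out = new StringWriter();
        reverseInChunks(new BufferedReader(new StringReader("1\n2\n3\n4\n5\n")),
                        new PrintWriter(out), 2);
        System.out.print(out); // chunks of two: 2, 1, 4, 3, 5, one per line
    }
}
```

The list never holds more than chunkSize lines, which respects the 50-element limit from the question.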

How to read a ;-separated CSV in Java that can contain an unknown number of elements

I know there are a lot of questions about reading CSV files, but I simply can't find one that fits my needs.
I am trying to get keywords from a keywords.csv that can look like this. The delimiter is always ";".
SAP;BI; Business Intelligence;
ERP;
SOA;
SomethingElse;
I already looked into openCSV and so on, but I can't find a functioning example how to do that (simple) task.
I tried this:
public void getKeywords()
{
try {
int rowCount = 0;
CSVReader reader = new CSVReader(new FileReader(csvFilename), ';');
String[] row = null;
while((row = reader.readNext()) != null) {
System.out.println(row[rowCount]);
rowCount++;
}
//...
reader.close();
}
catch (IOException e) {
System.out.println("File Read Error");
}
But it just returns the first element. I don't know what I am doing wrong. I'm new to coding, as you may have noticed :)
EDIT: Got what I wanted, thanks for your help!
while((row = reader.readNext()) != null) {
    for (int i = 0; i < row.length; i++) {
        System.out.println(row[i]);
    }
}
Please help an old man out.
Thank you!
Using openCSV, you could use this code:
CSVReader reader = new CSVReader(new FileReader("yourfile.csv"), ';');
That will open the .csv file, read it in, and use a ; as the delimiter. A similar example can be found on the openCSV home page.
Once you have the file read in, you can use the data with something like the following:
String [] nextLine;
// Read from the csv sequentially until all the lines have been read.
while ((nextLine = reader.readNext()) != null) {
// nextLine[] is an array of values from the line
System.out.println(nextLine[0] + nextLine[1] + "etc...");
}
Where nextLine is a line from the file, and nextLine[0] will be the first element of the line, nextLine[1] will be the second, etc.
Edit:
In your comment below, you mentioned that you don't know how many elements will be in each row. You can handle that by using nextLine.length and figuring out how many elements are in that row.
For example, change the above code to something like:
String [] nextLine;
while ((nextLine = reader.readNext()) != null) {
if(nextLine.length == 1) {
// Do something with the first element, nextLine[0]
System.out.println(nextLine[0]);
}
else if(nextLine.length == 2) {
// Do something with both nextLine[0] and nextLine[1]
System.out.println(nextLine[0] + ", " + nextLine[1]);
}
// Continue depending on how you want to handle the different rows.
}
You can read the file using the nextLine() method of the Scanner class. Each call returns one line of the input file. You can then use String.split(";") to get the individual elements, move on to the next line with hasNextLine() and nextLine(), and continue from there.
You will get a number of arrays - one corresponding to each line from the input file. You can just combine them to get what you want.
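A minimal sketch of the Scanner approach, with no external library. The class name KeywordReader and the method readKeywords are made up for illustration, and the sample input is the one from the question. For a real file you could pass new FileReader(csvFilename) instead of the StringReader.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class KeywordReader {

    static List<String> readKeywords(Readable source) {
        List<String> keywords = new ArrayList<>();
        try (Scanner scanner = new Scanner(source)) {
            while (scanner.hasNextLine()) {
                // split(";") handles any number of fields per line and
                // drops the empty string after a trailing ';'.
                for (String field : scanner.nextLine().split(";")) {
                    String keyword = field.trim();
                    if (!keyword.isEmpty()) {
                        keywords.add(keyword);
                    }
                }
            }
        }
        return keywords;
    }

    public static void main(String[] args) {
        String csv = "SAP;BI; Business Intelligence;\nERP;\nSOA;\nSomethingElse;\n";
        System.out.println(readKeywords(new StringReader(csv)));
        // prints [SAP, BI, Business Intelligence, ERP, SOA, SomethingElse]
    }
}
```

Note that a plain split does not understand quoted fields. If a keyword could itself contain a ";", a CSV library such as openCSV is the safer choice.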

Java, how to extract some text from a large file and import it into a smaller file

I'm relatively new to Java programming and am trying to create an application which will help some colleagues.
The background is that I need to read the content of a large file, up to and possibly more than 400,000 lines, which contains XML but is not a valid XML document, as it's a kind of log.
I want to build an application where a user enters a unique ID and the application scans the document to check whether that ID exists. If it does (the unique ID often occurs several times in the XML), I want to traverse backwards to a node named <documentRequestMessage>, copy everything from that node to its closing tag, and put that into its own document.
I know how to create the new document, but I am struggling to work out how to essentially 'find backwards' and copy everything up to the closing tag. Any help is greatly appreciated.
EDIT
Unfortunately, I haven't been able to figure out how to implement any of the 3 suggestions so far.
The correlationId is the unique reference previously mentioned.
The current code I have, which works and outputs the findings to the console, is
String correlationId = correlationID.getText();
BufferedReader bf = new BufferedReader(new FileReader(f));
System.out.println("Looking for " + correlationId);
int lineCount = 0;
String line;
while ((line = bf.readLine()) != null) {
lineCount++;
int indexFound = line.indexOf(correlationId);
if (indexFound > -1) {
System.out.println("Found CorrelationID on line " + "\t" + lineCount + "\t" + line);
}
}
bf.close();
Any further help is gratefully appreciated. I'm not asking for someone to write it for me, just some really clear and basic instructions :) please
EDIT 2
A copy of the file I'm trying to read and extract from can be found here
While you are reading forward through the file looking for your unique ID, keep a reference to the most recent documentRequestMessage that you encounter. When you find the unique ID, you'll already have the reference that you need to extract the message.
In this context, "reference" can mean a couple of things. Since you are not traversing a DOM (because it's not valid XML), you will probably just store the position in the file where the documentRequestMessage is. If you're using a BufferedReader or BufferedInputStream (or any stream where mark is supported; a plain FileInputStream does not support it), you can use mark/reset to store and return to the place in the file where your message starts.
Here is an implementation of what I believe you are looking for. It makes a lot of assumptions based on the log file that you linked, but it works for the sample file:
private static void processMessages(File file, String correlationId)
{
BufferedReader reader = null;
try {
boolean capture = false;
StringBuilder buffer = new StringBuilder();
String lastDRM = null;
String line;
reader = new BufferedReader(new FileReader(file));
while ((line = reader.readLine()) != null) {
String trimmed = line.trim();
// Blank lines are boring
if (trimmed.length() == 0) {
continue;
}
// We only actively look for lines that start with an open
// bracket (after trimming)
if (trimmed.startsWith("[")) {
// Do some house keeping - if we have data in our buffer, we
// should check it to see if we are interested in it
if (buffer.length() > 0) {
String message = buffer.toString();
// Something to note here... at this point you could
// create a legitimate DOM Document from 'message' if
// you wanted to
if (message.contains("documentRequestMessage")) {
// If the message contains 'documentRequestMessage'
// then we save it for later reference
lastDRM = message;
} else if (message.contains(correlationId)) {
// If the message contains the correlationId we are
// after, then print out the last message with the
// documentRequestMessage that we found, or an error
// if we never saw one.
if (lastDRM == null) {
System.out.println(
"No documentRequestMessage found");
} else {
System.out.println(lastDRM);
}
// In either case, we're done here
break;
}
buffer.setLength(0);
capture = false;
}
// Based on the log file, the only interesting messages are
// the ones that are DEBUG
if (trimmed.contains("DEBUG")) {
// Some of the debug messages have the XML declaration
// on the same line, and some the line after, so let's
// figure out which is which...
if (trimmed.endsWith("?>")) {
buffer.append(
trimmed.substring(
trimmed.indexOf("<?")));
buffer.append("\n");
capture = true;
} else if (trimmed.endsWith("Message:")) {
capture = true;
} else {
System.err.println("Can't handle line: " + trimmed);
}
}
} else {
if (capture) {
buffer.append(line).append("\n");
}
}
}
} catch (IOException ex) {
ex.printStackTrace(System.err);
} finally {
if (reader != null) {
try {
reader.close();
} catch (IOException ex) {
/* Ignore */
}
}
}
}
What you can do is read the contents of the file and look for <documentRequestMessage> element. When you find one of the above elements, read till you find </documentRequestMessage> and store it in a list so all the documentRequestMessage will be available in a list.
You can iterate through this list at the end or while adding to list to find the unique id you're looking for. If you find it write to XML Files or ignore.
I'm assuming your log is a series of <documentRequestMessage> contents.
Don't scan the log at all.
Read the log, and each time you encounter a <documentRequestMessage> header, start saving the contents of that <documentRequestMessage> block into a block area.
I'm not sure if you have to parse the XML or you can just save it as a List of Strings.
When you encounter a </documentRequestMessage> trailer, check to see if the ID of the block matches the ID you're looking for.
If the ID matches, write the <documentRequestMessage> block to an output file. If the ID doesn't match, clear the block area and read to the next <documentRequestMessage> header.
This way, there's no backtracking in your file reading.
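A minimal sketch of this forward-only approach. It assumes the opening and closing <documentRequestMessage> tags can be recognised with a simple substring check, and that a block "matches" when its text contains the ID. The class name MessageExtractor, the inner correlationId element and the sample log lines are made up for illustration, since the real log layout is not shown here.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

public class MessageExtractor {

    private static final String OPEN = "<documentRequestMessage";
    private static final String CLOSE = "</documentRequestMessage>";

    // Returns every documentRequestMessage block that mentions correlationId.
    static List<String> extract(BufferedReader reader, String correlationId) throws IOException {
        List<String> matches = new ArrayList<>();
        StringBuilder block = new StringBuilder();
        boolean capturing = false;
        String line;
        while ((line = reader.readLine()) != null) {
            if (!capturing && line.contains(OPEN)) {
                capturing = true;
                block.setLength(0); // start a fresh block
            }
            if (capturing) {
                block.append(line).append('\n');
                if (line.contains(CLOSE)) {
                    capturing = false;
                    // Keep the block only if it mentions the ID we want.
                    if (block.indexOf(correlationId) >= 0) {
                        matches.add(block.toString());
                    }
                }
            }
        }
        return matches;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical log excerpt; the real file layout may differ.
        String log = String.join("\n",
                "[DEBUG] Message:",
                "<documentRequestMessage>",
                "  <correlationId>abc-1</correlationId>",
                "</documentRequestMessage>",
                "[INFO] something else",
                "<documentRequestMessage>",
                "  <correlationId>xyz-2</correlationId>",
                "</documentRequestMessage>");
        try (BufferedReader reader = new BufferedReader(new StringReader(log))) {
            for (String message : extract(reader, "xyz-2")) {
                System.out.print(message);
            }
        }
    }
}
```

Only one block is held in memory at a time, so this should cope with a 400,000-line file. Each returned string can then be written to its own output file.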
