Skip parts while reading and writing a file in Android/Java - java

I'm trying to learn Java/Android and right now I'm doing some experiments with the replaceAll function. But I've found that with large text files the process gets sluggish so I was wondering if there is a way to skip the "useless" parts of a file to have a better performance. (Note: Just skip them, not delete them)
Note: I am not trying to "count lines" or "println" or "system.out", I'm just replacing strings and saving the changes in the same file.
Example
AAAA
CCCC- 9234802394819102948102948104981209381'238901'2309'129831'2381'2381'23081'23081'284091824098304982390482304981'20841'948023984129048'1489039842039481'204891'29031'923481290381'20391'294872385710239841'20391'20931'20853029573098341'290831'20893'12894093274019799919208310293810293810293810293810298'120931¿2093¿12039¿120931¿203912¿0391¿203912¿039¿12093¿12093¿12093¿12093¿12093¿1209312¿0390¿... DDDD
AAAA
CCCC- 9234802394819102948102948104981209381'238901'2309'129831'2381'2381'23081'23081'284091824098304982390482304981'20841'948023984129048'1489039842039481'204891'29031'923481290381'20391'294872385710239841'20391'20931'20853029573098341'290831'20893'12894093274019799919208310293810293810293810293810298'120931¿2093¿12039¿120931¿203912¿0391¿203912¿039¿12093¿12093¿12093¿12093¿12093¿1209312¿0390¿... DDDD
and so on....like a zillion times
I want to replace all "AAAA" with "BBBB", but there are large portions of data between the strings I am replacing. Also, these portions always begin with "CCCC" and end with "DDDD".
Here's the code I am using to replace the string.
File file = new File("my_file.txt");
BufferedReader reader = new BufferedReader(new FileReader(file));
String line = "", oldtext = "";
while((line = reader.readLine()) != null) {
oldtext += line + "\r\n";
}
reader.close();
// Replacing "AAAA" strings
String newtext= oldtext.replaceAll("AAAA", "BBBB");
FileWriter writer = new FileWriter("my_file.txt");
writer.write(newtext);
writer.close();
I think reading all lines is inefficient, especially when you won't be modifying these parts (and they represent the 90% of the file).
Does anyone know a solution???

You are wasting a lot of time on this line --
oldtext += line + "\r\n";
In Java, a String is immutable, which means you can't modify it. Therefore, when you do the concatenation, Java is actually making a complete copy of oldtext. So, for every line in your file, you are recopying every line that came before into your new String. Take a look at StringBuilder for a way to build a String while avoiding these copies.
However, in your case, you do not need the whole file in memory, because you can process line by line. By moving your replaceAll and write into your loop, you can operate on each line as you read it. This will keep the memory footprint of the routine down, because you are only keeping a single line in memory.
Note that since the FileWriter is opened before you read the input file, you need to have a different name for the output file. If you want to keep the same name, you can do a renameTo on the File after you close it.
File file = new File("my_file.txt");
BufferedReader reader = new BufferedReader(new FileReader(file));
FileWriter writer = new FileWriter("my_out_file.txt");
String line;
while ((line = reader.readLine()) != null) {
// Replacing "AAAA" strings
String newtext = line.replaceAll("AAAA", "BBBB");
// readLine() strips the line terminator, so write it back
writer.write(newtext + "\r\n");
}
reader.close();
writer.close();
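The line-by-line approach and the rename step described above can be sketched together as follows. This is only a sketch under some assumptions: the class and method names (StreamingReplace, replaceInFile) are made up for illustration, the file is assumed to be UTF-8 with "\r\n" line endings as in the question, and java.nio.file must be available (Java 7+; on Android that requires a fairly recent API level).

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class StreamingReplace {

    // Hypothetical helper: rewrites `file` so that every literal `target` becomes `replacement`.
    // Only one line is held in memory at a time.
    public static void replaceInFile(Path file, String target, String replacement) throws IOException {
        // Temporary file in the same directory, so the final move stays on one file system.
        Path tmp = Files.createTempFile(file.toAbsolutePath().getParent(), "replace", ".tmp");
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8);
             BufferedWriter writer = Files.newBufferedWriter(tmp, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                // String.replace does a literal replacement; no regex is involved.
                writer.write(line.replace(target, replacement));
                // readLine() strips the terminator, so it is written back explicitly
                // (assumption: the file uses "\r\n").
                writer.write("\r\n");
            }
        }
        // Replace the original only after both streams have been closed.
        Files.move(tmp, file, StandardCopyOption.REPLACE_EXISTING);
    }

    public static void main(String[] args) throws IOException {
        // Small self-contained demo on a throwaway file.
        Path demo = Files.createTempFile("demo", ".txt");
        Files.write(demo, "AAAA\r\nCCCC- 123 DDDD\r\n".getBytes(StandardCharsets.UTF_8));
        replaceInFile(demo, "AAAA", "BBBB");
        System.out.print(new String(Files.readAllBytes(demo), StandardCharsets.UTF_8));
    }
}
```

Note that this still reads every line, including the "CCCC ... DDDD" ones; it just avoids holding the whole file in memory and avoids the repeated String copies.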

Related

How to Read csv file with line breaks when "\r\n" does not work

I am new to Java and have been reading the Java docs and other threads (1, 2) but couldn't make it work.
Basically my csv file has few records which read like this
How are
you
so I want my code to read it as one line
How are you
My code looks like this:
BufferedReader bReader = new BufferedReader(new InputStreamReader(new FileInputStream(csv),"utf-8"));
while ((line = bReader.readLine()) != null) {
String lines = line.replaceAll("\r\n", " ");
System.out.println(lines);
}
Manually, when I pressed backspace at "you", it moved back up next to "are", and then I pressed space. Then it was fine. But I have a big csv file with 29k records. There must be a way to fix this in code. Can you please point me in the right direction? Thank you.
[Edit]
This is how it appears.
Fav: Beaver tails.
Least fav: HST not included in prices.
Edit 2:
-3166,1054,CF ,5992841,15:37.5,en,13007,12,12,Comments: Favorite and/or least favorite things,0,"Cafe Fun
-Least favourite - cabs"
"Cafe Fun Least Favourite - cabs" should be on the same line.
readLine() will return the next line in the file, without the line separator. So on the first iteration of your loop, lines is "How are" and on the second iteration, lines is "you". Neither of these contain "\r\n", so your calls to replaceAll(...) just return the same string.
Then, System.out.println(...) prints the text with a line separator appended, so you get back to what you started with.
You can collect all the lines into a list:
List<String> lines = Files.readAllLines(csv);
and then concatenate them using String.join(...):
String allLines = String.join(" ", lines);
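To make the behaviour concrete, here is a minimal sketch of the same idea using an in-memory reader instead of the csv file, so it can be run on its own. The class and method names (JoinLines, joinWithSpaces) are invented for this illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

public class JoinLines {

    // Reads every line (readLine() already strips "\r\n", "\n" or "\r")
    // and joins the pieces with a single space.
    public static String joinWithSpaces(BufferedReader reader) throws IOException {
        List<String> lines = new ArrayList<>();
        String line;
        while ((line = reader.readLine()) != null) {
            lines.add(line);
        }
        return String.join(" ", lines);
    }

    public static void main(String[] args) throws IOException {
        BufferedReader reader = new BufferedReader(new StringReader("How are\r\nyou\r\n"));
        System.out.println(joinWithSpaces(reader)); // prints "How are you"
    }
}
```

Keep in mind that this joins every line, so in a real csv file the breaks between records disappear as well. Telling record breaks apart from breaks inside a quoted field probably needs a CSV-aware parser rather than plain readLine().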
BufferedReader.readLine() doesn't return the newline, so your String lines doesn't contain a line break.
The newline you see is only printed by System.out.println(lines). Change it to System.out.print(lines) and invoke System.out.println() after the while-loop.
BufferedReader bReader = new BufferedReader(new InputStreamReader(new FileInputStream(csv),"utf-8"));
while ((line = bReader.readLine()) != null) {
System.out.print(line);
}
System.out.println();
To start with, csv files (as the name implies) are separated by commas, not by spaces. But leaving that aside, readLine only reads the line where the "pointer" currently is, and in this case "you" is on a different line than "How are". I think that's where your problem lies. One way to solve it would be to use a StringBuilder and its append(String) method to add everything together. Regards

How to read from a file into a JTextArea (GUI) line by line?

I am reading in from a file (which is a list of names and their contact numbers) and displaying it in a textarea in a GUI. The file reads in fine and the text is displayed. But I want each line from the file to be on a new line on the GUI. So each name and address to be on a new line. How do I do this?
This is my code so far, but it doesn't display each line from the file on a new line on the GUI.
public void books() throws IOException {
String result = " ";
String line;
LineNumberReader lnr = new LineNumberReader(new FileReader(new File("books2.txt")));
while ((line = lnr.readLine()) != null) {
result += line;
}
area1 = new JTextArea(" label 1 ");
area1.setText(result);
area1.setBounds(50, 50, 900, 300);
area1.setForeground(Color.black);
panelMain.add(area1);
}
You don't really need to read it line by line. Something like this will do:
String result = new String(Files.readAllBytes(Paths.get("books2.txt")),
StandardCharsets.UTF_8);
This, of course, will require more memory: first to read bytes, and then to create a string. But if memory is a concern, then reading the whole file at once is probably a bad idea anyway, not to mention displaying it in a JTextArea!
It may not handle different line endings properly. When you use readLine(), it strips the line of all endings, be it CR LF, LF or CR. The way above will read the string as-is. So maybe reading it line-by-line is not a bad idea after all. But I've just checked—JTextArea seems to handle CR LF all right. It may cause other problems, though.
With line-by-line approach, I'd do something like
String result = String.join("\n",
Files.readAllLines(Paths.get("books2.txt"),
StandardCharsets.UTF_8));
This still strips the last line of EOL. If that's important (e. g., you want to be able to put text cursor on the line after the last one), just do one more + "\n".
All of the above requires Java 7/8.
If you're using Java 6 or something, then the way you do it is probably OK, except that:
Replace LineNumberReader with BufferedReader—you don't need line numbers, do you?
Replace String result with StringBuilder result = new StringBuilder(), and += with result.append(line).append('\n').
In the end, use result.toString() to get the string.
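Put together, the three changes above might look like the sketch below. It deliberately avoids Java 7 features (no try-with-resources, no diamond operator). The class and method names (ReadForTextArea, readAllLines) are placeholders, not part of the original code.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

public class ReadForTextArea {

    // Reads all lines from `source` and returns them as one String,
    // with each line terminated by '\n' so a JTextArea shows them on separate rows.
    public static String readAllLines(Reader source) throws IOException {
        BufferedReader reader = new BufferedReader(source);
        try {
            StringBuilder result = new StringBuilder();
            String line;
            while ((line = reader.readLine()) != null) {
                // readLine() drops the terminator, so put one back.
                result.append(line).append('\n');
            }
            return result.toString();
        } finally {
            reader.close();
        }
    }

    public static void main(String[] args) throws IOException {
        // In-memory input stands in for books2.txt so the demo runs anywhere.
        System.out.print(readAllLines(new StringReader("Alice 555-0100\r\nBob 555-0101")));
    }
}
```

In the books() method from the question, this would be used roughly as area1.setText(ReadForTextArea.readAllLines(new FileReader("books2.txt")));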

Remove new line character from the middle of a file line in java

I would like to remove the new line character in the middle of a line of a file while it is reading the file.
If I'm going to read the file with BufferedReader then it is recognised as a new line and split the line in the middle. I want to be able to read the file and remove those new line characters of the middle while reading.
The format of each line is a simple Json.
Thank you
If what you're saying is that you want to remove the newlines from the original file after reading them, I think you can just write to a new (temporary) file while you're reading the lines, and then replace the original with the new file after you're done writing.
If I'm interpreting your question correctly, what you want to do isn't quite as simple as "read and write at the same time". What you need is a loop and a StringBuilder.
public String readFileWithNoLines(BufferedReader reader) throws IOException {
StringBuilder builder = new StringBuilder();
String line;
while((line = reader.readLine()) != null) {
builder.append(line);
}
return builder.toString();
}
Then you want to write the return value of that function to the file.

How to copy a file line by line keeping its original line breaks

I need to replace keys in various files (all types and line break format).
To do this, i tried to copy the file line by line and replace keys in the line. This works but original line breaks are lost.
Here is my code, quite common:
FileInputStream fis = new FileInputStream(file);
BufferedReader reader = new BufferedReader(new InputStreamReader(fis));
FileWriter writer = new FileWriter(tmpFile);
BufferedWriter out = new BufferedWriter(writer);
String line;
while ((line = reader.readLine()) != null) {
String updatedLine = replaceKeys(line);
out.write(updatedLine);
out.newLine();
}
I need to read the file line by line to be able to replace keys correctly (keys are determined by some delimiters, they must not be cut during file reading).
The problem is that my unix files (.sh) have the wrong line breaks after replacement (the code is run on Windows). Also, files with only one line are changed into 2-line files.
Question is, how to keep original file line breaks while copying the file line by line or, at least, how to be able to determine the end of the file to not add an additional line at the end? Thanks for your help.
Edit: Useless DataInputStream removed.
You can use a Scanner for this job:
try(Scanner s=new Scanner(file).useDelimiter("(?<=\n)|(?!\n)(?<=\r)");
FileWriter out=new FileWriter(tmpFile)) {
while(s.hasNext()){
String line=s.next();
String updatedLine = replaceKeys(line);
out.write(updatedLine);
}
}
The key point is the regex specified as delimiter. The pattern used above will match what BufferedReader.readLine() matches for a line break, that is, a '\n', '\r' followed by '\n', or a lone '\r'. But it uses “zero width lookbehind” to match the position after the line break rather than the line break itself so the line break becomes part of the token returned by Scanner.next().
So the String line will contain the line break at its end, unless it’s the last line and is not terminated by a line break. So all you have to do, assuming that replaceKeys leaves the line break untouched, is to write the String as-is without appending a line break manually.
If replaceKeys can not cope with the String having a line break at its end, you have to split it before calling the method and joining afterwards.
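The split-and-rejoin step could be sketched like this. The helper name transformKeepingBreak is hypothetical, and the UnaryOperator parameter simply stands in for replaceKeys, whose real signature is not shown in the question.

```java
import java.util.function.UnaryOperator;

public class KeepLineBreak {

    // Applies `transform` to the text of a token while leaving a trailing
    // "\r\n", "\n" or "\r" untouched, then glues the two parts back together.
    public static String transformKeepingBreak(String token, UnaryOperator<String> transform) {
        int end = token.length();
        if (end > 0 && token.charAt(end - 1) == '\n') {
            end--;
            // A '\r' directly before the '\n' belongs to the same line break.
            if (end > 0 && token.charAt(end - 1) == '\r') {
                end--;
            }
        } else if (end > 0 && token.charAt(end - 1) == '\r') {
            end--;
        }
        String body = token.substring(0, end);
        String lineBreak = token.substring(end);
        return transform.apply(body) + lineBreak;
    }

    public static void main(String[] args) {
        // toUpperCase() is only a stand-in for the real key replacement.
        String out = transformKeepingBreak("echo ${key}\r\n", s -> s.toUpperCase());
        System.out.print(out);
    }
}
```

Inside the Scanner loop, the call would then be something like out.write(transformKeepingBreak(line, l -> replaceKeys(l)));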

java read properties and xml file using stringbuilder

I need to read a set of xml and property files and parse the data. Currently I am using an InputStream and a StringBuilder to do this. But this does not reproduce the content the same way as the input file. I do not want to remove the white spaces and new lines. How do I achieve this?
is = test.getInputStream();
br = new BufferedReader(new InputStreamReader(is));
String line5;
StringBuilder sb5 = new StringBuilder();
while ((line5 = br.readLine()) != null) {
sb5.append(line5);
}
String s = sb5.toString();
My output is:
#test 123 #test2 345
Expected output is:
#test
123
#test2
345
Any thoughts ? Thanks
br.readLine() consumes the line breaks, you need to add them to your StringBuilder after appending the line.
is = test.getInputStream();
br = new BufferedReader(new InputStreamReader(is));
String line5;
StringBuilder sb5 = new StringBuilder();
while ((line5 = br.readLine()) != null) {
sb5.append(line5);
sb5.append("\n");
}
If you want an extremely simple solution for reading a file to a String, Apache Commons-IO has a method for performing such a task (org.apache.commons.io.FileUtils).
FileUtils.readFileToString(File file, String encoding);
readLine() method doesn't add the EOL character (\n). So while appending the string to the builder, you need to add the EOL char, like sb5.append(line5+"\n");
The various readLine methods discard the newline from the input.
From the BufferedReader docs:
Returns: A String containing the contents of the line, not including any line-termination characters, or null if the end of the stream has been reached
A solution may be as simple as adding back a newline to your StringBuilder for every readLine: sb5.append(line5 + "\n");.
A better alternative is to read into an intermediate buffer first, using the read method and supplying your own char[]. You can still use StringBuilder.append, and you get a String that will match the file contents.
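A minimal sketch of that buffer-based approach is shown below. The class name, method name and the 8192-character buffer size are arbitrary choices for illustration; the reader is a StringReader here only so that the example is self-contained.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

public class ReadVerbatim {

    // Copies everything from `reader` into a String without interpreting line breaks,
    // so "\r\n", "\n" and "\r" all survive exactly as they appear in the input.
    public static String readAll(Reader reader) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buffer = new char[8192];
        int n;
        // read() returns the number of chars placed in the buffer, or -1 at end of stream.
        while ((n = reader.read(buffer, 0, buffer.length)) != -1) {
            sb.append(buffer, 0, n);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        String s = readAll(new StringReader("#test\n123\n#test2\n345\n"));
        System.out.print(s);
    }
}
```

With the code from the question, the call would be along the lines of String s = ReadVerbatim.readAll(new InputStreamReader(is)); passing an explicit charset to InputStreamReader is usually safer than relying on the platform default.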
