BufferedWriter to write at BufferedReader position - java

My code reads through a UTF-8 encoded XML file until a specified string has been found. It finds the specified string fine, but I wish to write at this point in the file.
I would much prefer to do this through a stream as only small tasks need to be done.
I cannot find a way to do this. Any alternative methods are welcome.
Code so far:
final String RESOURCE = "/path/to/file.xml";
BufferedReader in = new BufferedReader(new InputStreamReader(ClassLoader.class.getResourceAsStream(RESOURCE), "UTF-8"));
BufferedWriter out = new BufferedWriter(new OutputStreamWriter(new FileOutputStream(ClassLoader.class.getResource(RESOURCE).getPath()),"UTF-8"));
String fileLine = in.readLine();
while (!fileLine.contains("some string")) {
fileLine = in.readLine();
}
// File writing code here

You can't really write into the middle of a file, except by overwriting existing bytes (using something like RandomAccessFile). That would only work, however, if what you needed to write had exactly the same byte length as what you were replacing, which I highly doubt.
Instead, you need to rewrite the file to a new file, copying the input to the output and replacing the parts you need to replace in the process. There are a variety of ways you could do this. I would recommend using a StAX event reader and writer, as the StAX API is fairly user friendly (compared to SAX) as well as fast and memory efficient.
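The copy-and-replace idea can be sketched as follows. This is a minimal line-based sketch, not the StAX approach recommended above, and the names `InsertAfterMarker` and `insertAfter` are made up for illustration. It assumes the target is a regular file on disk (not a resource inside a JAR) and that the search string sits on a single line.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

class InsertAfterMarker {

    /**
     * Copies source line by line into a temporary file in the same directory,
     * writes extraLine after the first line containing marker, then replaces
     * source with the temporary file. Returns true if the marker was found.
     */
    static boolean insertAfter(Path source, String marker, String extraLine) throws IOException {
        Path dir = source.toAbsolutePath().getParent();
        Path temp = Files.createTempFile(dir, "rewrite", ".tmp");
        boolean found = false;
        try (BufferedReader in = Files.newBufferedReader(source, StandardCharsets.UTF_8);
             BufferedWriter out = Files.newBufferedWriter(temp, StandardCharsets.UTF_8)) {
            String line;
            while ((line = in.readLine()) != null) {
                out.write(line);
                out.newLine();
                if (!found && line.contains(marker)) {
                    // This is the "write at the reader's position" step.
                    out.write(extraLine);
                    out.newLine();
                    found = true;
                }
            }
        }
        // Both streams are closed here, so the original can be replaced.
        Files.move(temp, source, StandardCopyOption.REPLACE_EXISTING);
        return found;
    }
}
```

Note that readLine() drops the original line terminators, so the copy ends up with the platform's line separator throughout. For anything beyond simple line insertion in XML, a real XML API such as StAX is the safer choice.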

Does a Java InputStream help or hurt memory usage with large files?

I see some posts on StackOverflow that contradict each other, and I would like to get a definite answer.
I started with the assumption that using a Java InputStream would allow me to stream bytes out of a file, and thus save on memory, as I would not have to consume the whole file at once. And that is exactly what I read here:
Loading all bytes to memory is not a good practice. Consider returning the file and opening an input stream to read it, so your application won't crash when handling large files. – andrucz
Download file to stream instead of File
But then I used an InputStream to read a very large Microsoft Excel file (using the Apache POI library) and I ran into this error:
java.lang.outofmemory exception while reading excel file (xlsx) using POI
I got an OutOfMemory error.
And this crucial bit of advice saved me:
One thing that'll make a small difference is when opening the file to start with. If you have a file, then pass that in! Using an InputStream requires buffering of everything into memory, which eats up space. Since you don't need to do that buffering, don't!
I got rid of the InputStream and just used a bare java.io.File, and then the OutOfMemory error went away.
So using java.io.File is better than an InputStream when it comes to memory use? That doesn't make any sense.
What is the real answer?
So you are saying that an InputStream would typically help?
It entirely depends on how the application (or library) >>uses<< the InputStream
With what kind of follow up code? Could you offer an example of memory efficient Java?
For example:
// Efficient use of memory
try (InputStream is = new FileInputStream(largeFileName);
     BufferedReader br = new BufferedReader(new InputStreamReader(is))) {
    String line;
    while ((line = br.readLine()) != null) {
        // process one line
    }
}
// Inefficient use of memory
try (InputStream is = new FileInputStream(largeFileName);
     BufferedReader br = new BufferedReader(new InputStreamReader(is))) {
    StringBuilder sb = new StringBuilder();
    String line;
    while ((line = br.readLine()) != null) {
        sb.append(line).append("\n");
    }
    String everything = sb.toString();
    // process the entire string
}
// Very inefficient use of memory
try (InputStream is = new FileInputStream(largeFileName);
     BufferedReader br = new BufferedReader(new InputStreamReader(is))) {
    String everything = "";
    String line;
    while ((line = br.readLine()) != null) {
        everything += line + "\n";
    }
    // process the entire string
}
(Note that there are more efficient ways of reading a file into memory. The above examples are purely to illustrate the principles.)
The general principles here are:
avoid holding the entire file in memory, all at the same time
if you have to hold the entire file in memory, then be careful about how you "accumulate" the characters.
The posts that you linked to above:
The first one is not really about memory efficiency. Rather, it is talking about a limitation of the AWS client-side library. Apparently, the API doesn't provide an easy way to stream an object while reading it. You have to save the object to a file, then open the file as a stream. Whether that is memory efficient or not depends on what the application does with the stream; see above.
The second one is specific to the POI APIs. Apparently, the POI library itself reads the stream contents into memory if you use a stream. That would be an implementation limitation of that particular library. (But there could be a good reason; e.g. maybe POI needs to be able to "seek" or "rewind" the stream.)

Loading PNG as String and saving corrupts it

I have a program in which I have to load a PNG as a String and then save it again, but after I save it, it becomes unreadable. If I open both the loaded PNG and the saved String in an editor, I can see that Java created line breaks all over the file. If this is the problem, how can I avoid it?
public static void main(String[] args)
{
    try
    {
        File file1 = new File("C://andim//testFile.png");
        StringBuffer content = new StringBuffer();
        BufferedReader reader = null;
        reader = new BufferedReader(new FileReader(file1));
        String s = null;
        while ((s = reader.readLine()) != null)
        {
            content.append(s).append(System.getProperty("line.separator"));
        }
        reader.close();
        String loaded = content.toString();
        File file2 = new File("C://andim//testString.png");
        FileWriter filewriter = new FileWriter(file2);
        filewriter.write(loaded);
        filewriter.flush();
        filewriter.close();
    }
    catch (Exception exception)
    {
        exception.printStackTrace();
    }
}
I have program in which I have to load a PNG as a String and then save it again, but after I save it it becomes unreadable.
Yes, I'm not surprised. You're treating arbitrary binary data as if it's text data (in whatever your platform default encoding is, to boot). It's not. Don't do that. It's possible that in some encodings you'll get away with it - until you start trying to pass the string elsewhere in a way that strips unprintable characters etc.
If you must convert arbitrary binary data to text, use base64 or hex. If possible, avoid the conversion to text in the first place though. If you just want to copy a file, use InputStream and OutputStream - not Reader and Writer.
This is a big general point: keep data in its "native" representation as long as you possibly can. Only convert data to a different representation when you absolutely have to, and be very careful about it.
Don't use text-based APIs to read binary files. In this case, you don't want a BufferedReader, and you certainly don't want readLine, which may well treat more than just one thing as a line separator. Use an InputStream (for instance, FileInputStream) and an OutputStream (for instance, FileOutputStream), not readers and writers.
Don't do that
PNGs are not textual data.
If you try to read arbitrary bytes into a string, Java will mangle the bytes into actual text, corrupting the data you read.
You need to use byte[]s, not strings.
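The advice in the answers above can be sketched as follows: copy the file as raw bytes, and use base64 only if a String is genuinely required. The class and method names (`BinaryCopy`, `copy`, `toBase64`, `fromBase64`) are made up for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Base64;

class BinaryCopy {

    /** Copies a file byte for byte; no Reader, Writer, or charset is involved. */
    static void copy(Path from, Path to) throws IOException {
        try (InputStream in = Files.newInputStream(from);
             OutputStream out = Files.newOutputStream(to)) {
            byte[] buffer = new byte[8192];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
        }
    }

    /** Encodes arbitrary bytes as base64 text, which survives String handling. */
    static String toBase64(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    /** Decodes base64 text back into the original bytes. */
    static byte[] fromBase64(String text) {
        return Base64.getDecoder().decode(text);
    }
}
```

If all that is needed is a plain copy, Files.copy(from, to) does the same job in one call; the explicit loop is shown only to make the byte-oriented handling visible.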

Java String I/O

I have to write a code in JAVA like following structure:
Read String From File
// Perform some string processing
Write output string in file
Now, for reading/writing string to/from file, I am using,
BufferedReader br = new BufferedReader(new FileReader("Text.txt"), 32768);
BufferedWriter out = new BufferedWriter(new FileWriter("AnotherText.txt"), 32768);
String line;
while ((line = br.readLine()) != null) {
    // perform some string processing
    String outputString = line; // placeholder for the processed string
    out.write(outputString);
    out.newLine();
}
However, reading and writing seem quite slow. Is there a faster way to read/write strings to/from a file in Java?
Additional Info:
1) Read File is 144 MB.
2) I can allocate large memory (50 MB) for reading or writing.
3) I have to write it as a string, not as bytes.
It sounds slower than it should be.
You can try increasing the buffer size.
Maybe also try FileOutputStream instead of FileWriter.
You mentioned 50MB. Are you modifying the memory parameters of the program at all when you run it using a -X switch?
Ignoring the fact that you have not posted what your performance requirements are:
Try reading/writing the file as bytes and converting the bytes to characters/strings internally.
This question might be helpful: Number of lines in a file in Java
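As a rough sketch of the suggestions above, the version below reads and writes through byte streams with an explicit charset and a larger buffer. The names `LineProcessor` and `process`, the 64 KB buffer size, and the UTF-8 encoding are all assumptions for illustration. Whether it is actually faster depends on where the time goes (I/O versus the string processing itself), so it is worth measuring before and after.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.function.UnaryOperator;

class LineProcessor {

    private static final int BUFFER_SIZE = 1 << 16; // 64 KB, an arbitrary choice

    /** Streams input to output one line at a time, applying transform to each line. */
    static void process(Path input, Path output, UnaryOperator<String> transform) throws IOException {
        try (BufferedReader br = new BufferedReader(
                 new InputStreamReader(Files.newInputStream(input), StandardCharsets.UTF_8), BUFFER_SIZE);
             BufferedWriter bw = new BufferedWriter(
                 new OutputStreamWriter(Files.newOutputStream(output), StandardCharsets.UTF_8), BUFFER_SIZE)) {
            String line;
            while ((line = br.readLine()) != null) {
                bw.write(transform.apply(line));
                bw.newLine();
            }
        }
    }
}
```

Only one line is held in memory at a time, so the 144 MB input never has to fit in the heap.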

How to open a .dat file in java program

I was handed some data in a file with a .dat extension. I need to read this data in a Java program and build the data into some objects we defined. I tried the following, but it did not work:
FileInputStream fstream = new FileInputStream("news.dat");
BufferedReader br = new BufferedReader(new InputStreamReader(fstream));
Could someone tell me how to do this in Java?
What kind of file is it? Is it a binary file which contains serialized Java objects? If so, you need ObjectInputStream rather than DataInputStream to read it.
FileInputStream fis = new FileInputStream("news.dat");
ObjectInputStream ois = new ObjectInputStream(fis);
Object object = ois.readObject();
// ...
(don't forget to properly handle resources using close() in finally, but that's beyond the scope of this question)
See also:
Basic serialization tutorial
A .dat file is usually a binary file, without any specific associated format. You can read the raw bytes of the file in a manner similar to what you posted - but you will need to interpret these bytes according to the underlying format. In particular, when you say "open" the file, what exactly do you want to happen in Java? What kind of objects do you want to be created? How should the stream of bytes map to these objects?
Once you know this, you can either write this layer yourself or use an existing API (assuming it's a standard format).
For reference, your example doesn't work because it assumes that the binary format is a character representation in the platform's default charset (as per the InputStreamReader constructor). And as you say it's binary, this will fail to convert the binary to a stream of characters (since, after all, it's not).
// BufferedInputStream not strictly needed, but much more efficient than reading
// one byte at a time
BufferedInputStream in = new BufferedInputStream (new FileInputStream("news.dat"));
This will give you a buffered stream which will return the raw bytes of the file; you can now either read and process them yourself, or pass this input stream to some library API that will create appropriate objects for you (if such a library exists).
That entirely depends on what sort of file the .dat is. Unfortunately, .dat is often used as a generic extension for a data file. It could be binary, in which case you could use FileInputStream fstream = new FileInputStream(new File("news.dat")); and call read() to get bytes from the file, or text, in which case you could use BufferedReader buff = new BufferedReader(new InputStreamReader(new FileInputStream(new File("news.dat")))); and call readLine() to get each line of text. Or it could be serialized Java objects, in which case see what BalusC said.
In both cases, you'd then need to know what format the file was in to divide things up and get meaning from it, although this would be much easier if it was text as it could be done by inspection.
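To illustrate what "interpreting the bytes according to the underlying format" can look like, here is a minimal sketch using DataInputStream. The layout is purely hypothetical (an int record count followed by that many pairs of a writeUTF-style string and an int); the real news.dat format is unknown, and the names `DatReader` and `readRecords` are made up.

```java
import java.io.BufferedInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

class DatReader {

    /**
     * Reads a HYPOTHETICAL binary layout: an int count, then count records,
     * each a string in DataOutputStream.writeUTF format followed by an int.
     */
    static List<String> readRecords(Path file) throws IOException {
        try (DataInputStream in = new DataInputStream(
                 new BufferedInputStream(Files.newInputStream(file)))) {
            int count = in.readInt();
            List<String> records = new ArrayList<>();
            for (int i = 0; i < count; i++) {
                String title = in.readUTF();
                int year = in.readInt();
                records.add(title + " (" + year + ")");
            }
            return records;
        }
    }
}
```

A file in any other format would need different read calls in a different order, which is why knowing the format comes first.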
Please try the below code:
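FileReader file = new FileReader(new File("File.dat"));
BufferedReader br = new BufferedReader(file);
String temp = br.readLine();
while (temp != null) {
    // print the current line before reading the next one,
    // so the first line is not skipped and null is never printed
    System.out.println(temp);
    temp = br.readLine();
}
br.close();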
A better way would be to use try-with-resources so that you would not have to worry about closing the resources.
Here is the code.
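// both streams are declared in the try header so both are closed automatically
try (FileInputStream fis = new FileInputStream("news.dat");
     ObjectInputStream objectstream = new ObjectInputStream(fis)) {
    Object object = objectstream.readObject();
    // ...
} catch (IOException | ClassNotFoundException e) {
    // readObject() can also throw ClassNotFoundException
    e.printStackTrace();
}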

Newlines in string not writing out to file

I'm trying to write a program that manipulates unicode strings read in from a file. I thought of two approaches - one where I read the whole file containing newlines in, perform a couple regex substitutions, and write it back out to another file; the other where I read in the file line by line and match individual lines and substitute on them and write them out. I haven't been able to test the first approach because the newlines in the string are not written as newlines to the file. Here is some example code to illustrate:
String output = "Hello\nthere!";
BufferedWriter oFile = new BufferedWriter(new OutputStreamWriter(
new FileOutputStream("test.txt"), "UTF-16"));
System.out.println(output);
oFile.write(output);
oFile.close();
The print statement outputs
Hello
there!
but the file contents are
Hellothere!
Why aren't my newlines being written to file?
You should try using
System.getProperty("line.separator")
Here is an untested example
String output = String.format("Hello%sthere!",System.getProperty("line.separator"));
BufferedWriter oFile = new BufferedWriter(new OutputStreamWriter(
new FileOutputStream("test.txt"), "UTF-16"));
System.out.println(output);
oFile.write(output);
oFile.close();
I haven't been able to test the first approach because the newlines in the string are not written as newlines to the file
Are you sure about that? Could you post some code that shows that specific fact?
Use System.getProperty("line.separator") to get the platform specific newline.
Consider using PrintWriters to get the println method known from e.g. System.out
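The PrintWriter suggestion could look roughly like this. It is a minimal sketch: the names `PrintlnDemo` and `writeLines` are made up, and UTF-16 is kept only because the question uses it.

```java
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class PrintlnDemo {

    /** Writes each string followed by the platform line separator. */
    static void writeLines(Path file, String... lines) throws IOException {
        try (PrintWriter out = new PrintWriter(
                 new OutputStreamWriter(Files.newOutputStream(file), StandardCharsets.UTF_16))) {
            for (String line : lines) {
                out.println(line); // println appends System.lineSeparator()
            }
            // PrintWriter swallows IOExceptions, so check explicitly.
            if (out.checkError()) {
                throw new IOException("write to " + file + " failed");
            }
        }
    }
}
```

Calling PrintlnDemo.writeLines(path, "Hello", "there!") writes the two words on separate lines using whatever separator the current platform expects.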
