When loading huge files with ObjectInputStream, every object read is buffered by the stream for object graph resolution.
This causes a huge memory overhead which isn't needed in my case (the objects read are all independent).
Is there an equivalent to the reset() method of ObjectOutputStream which resets this buffer?
Code example:
try (FileInputStream fileInputStream = new FileInputStream(filename);
     BufferedInputStream bufferedInputStream = new BufferedInputStream(fileInputStream);
     ObjectInputStream objectInputStream = new ObjectInputStream(bufferedInputStream)) {
    while (object = objectInputStream.readObject()) {
        System.Out.println(object.toString());
    }
}
There is actually a reset() method on the class, but it does a completely different thing.
See Java APIs which cause memory bloat
It's up to the sender to decide when to break the integrity of sent object graphs, by calling ObjectOutputStream.reset(). Not the receiver.
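As a concrete illustration of that sender-side fix, a minimal sketch assuming you control the writing code (filename and objects stand in for your own data, and the batch size of 1000 is arbitrary):

try (ObjectOutputStream out = new ObjectOutputStream(
        new BufferedOutputStream(new FileOutputStream(filename)))) {
    int count = 0;
    for (Object obj : objects) {
        out.writeObject(obj);
        if (++count % 1000 == 0) {
            out.reset(); // clears the handle table on the writer and, when read back, on the reader
        }
    }
}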
NB your code doesn't compile, and wouldn't be valid if it did:
while (object = objectInputStream.readObject()) {
}
This should be
try {
    while (true) {
        object = objectInputStream.readObject();
        // ...
    }
} catch (EOFException exc) {
    // end of stream
}
There is a misconception abroad that readObject() returns null at end of stream. It doesn't. It throws EOFException. It can return null any time you wrote a null.
Hmm, it seems you need to use some sort of lazy-loading technique where you only load the necessary components of the object graph, not everything.
Related
I am trying to read a text file while running the program from a jar archive.
I came across advice that I need to use an InputStream to read the file. The snippet of code:
buffer = new BufferedInputStream(this.getClass().getResourceAsStream((getClass().getClassLoader().getResource("English_names.txt").getPath())));
System.out.println(buffer.read()+" yeas");
At the line System.out.println(buffer.read()+" yeas"); the program stops and nothing happens after that. If you print the buffer object itself, it is not null.
What might be the problem?
From InputStream#read():
This method blocks until input data is available, the end of the stream is detected, or an exception is thrown.
So basically, the stream appears to be waiting on content. I'm guessing it's how you've constructed the stream; you can simplify your construction to:
InputStream resourceStream = getClass().getResourceAsStream("/English_names.txt");
InputStream buffer = new BufferedInputStream(resourceStream);
I'd also check to make sure that resourceStream is not null.
You should not worry about the InputStream being null when passed into the BufferedInputStream constructor: the constructor accepts a null argument without throwing anything (an exception will only surface later, when you try to read). What you should check for null is the resource stream itself. Also, since InputStream implements AutoCloseable, the try-with-resources block will take care of closing your streams properly.
try (
    final InputStream is = getClass().getResourceAsStream("/English_names.txt");
    final BufferedInputStream bis = new BufferedInputStream(is);
) {
    if (null == is)
        throw new IOException("requested resource was not found");
    // Do your reading.
    // Do note that if you are using InputStream.read() you may want to call it in a loop until it returns -1
} catch (IOException ex) {
    // Either the resource was not found or another I/O error occurred
}
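As a sketch of that read loop (it goes inside the try block above; the char cast is only an illustration for ASCII text):

int b;
while ((b = bis.read()) != -1) {
    System.out.print((char) b);
}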
I just wanted to see if there was a better way I should be handling this. My understanding of streams is that as long as you close a stream, any streams encapsulated within it will be closed, which is why I only close TarArchiveOutputStream in the finally block. If I get a FileNotFoundException on rawDir or archiveFile I want to log it; anything else I want to throw.
public static void createTarGzOfDirectory(File rawDir, File archiveFile) throws IOException {
    FileOutputStream fOut = null;
    BufferedOutputStream bOut = null;
    GzipCompressorOutputStream gzOut = null;
    TarArchiveOutputStream tOut = null;
    try {
        fOut = new FileOutputStream(archiveFile);
        bOut = new BufferedOutputStream(fOut);
        gzOut = new GzipCompressorOutputStream(bOut);
        tOut = new TarArchiveOutputStream(gzOut);
        addFileToTarGz(tOut, rawDir, "");
    } catch (FileNotFoundException e) {
        log.error("File not found: " + e);
    } finally {
        if (tOut != null) {
            tOut.finish();
            tOut.close();
        }
    }
}
Any other considerations or thoughts on improving things?
My understanding of streams is that as long as you close a stream, any streams encapsulated within it will be closed ...
That is correct.
However, your code is (effectively) assuming that if tOut is null, then none of the other streams in the chain have been created. That's a somewhat dodgy assumption. Consider this sequence:
1. The FileOutputStream is created and is assigned to fOut.
2. The BufferedOutputStream is created and is assigned to bOut.
3. The GzipCompressorOutputStream constructor throws an exception or error. (Maybe the heap is full ...)
4. The catch is skipped ... wrong exception.
5. The finally checks tOut, finds it is null, and does nothing.
Net result: we've leaked the file descriptor / channel held by the FileOutputStream.
The key to getting this example (absolutely) right is to understand which of those stream objects holds the critical resources, and ensuring that THAT stream gets closed. The other streams that don't hold resources don't have to be closed.
} finally {
    if (fOut != null) {
        fOut.close();
    }
}
The other point is that you need to move the tOut.finish() call into the try block after the addFileToTarGz call.
If the addFileToTarGz call throws an exception, or if you don't get that far, the finish call is a waste of time.
The finish call will attempt to write the index to the archive, and THAT could throw an IOException. If this happens in the finally block, then any following code in the finally block to close the stream chain won't get executed ... and a file descriptor will be leaked.
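Putting both fixes together, a sketch of the corrected method, assuming the same addFileToTarGz helper and omitting the FileNotFoundException logging for brevity:

public static void createTarGzOfDirectory(File rawDir, File archiveFile) throws IOException {
    FileOutputStream fOut = null;
    try {
        fOut = new FileOutputStream(archiveFile);
        TarArchiveOutputStream tOut = new TarArchiveOutputStream(
                new GzipCompressorOutputStream(new BufferedOutputStream(fOut)));
        addFileToTarGz(tOut, rawDir, "");
        tOut.finish(); // inside the try block, so an IOException here propagates normally
        tOut.close();  // flushes and closes the whole chain
    } finally {
        if (fOut != null) {
            fOut.close(); // harmless if already closed; guarantees the descriptor is released
        }
    }
}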
Although it would look ugly and is, maybe, unlikely to be the case, you should close them all in cascade. Yes, if you close the TarArchiveOutputStream, it is supposed to close the underlying streams. But, depending on the implementation, that may not always be the case. Moreover, and probably more importantly, if one of the intermediate constructors throws an exception, tOut will be null but the other streams may not be, meaning that your streams are open but you did not close any of them.
You could chain all your constructors together like so:
tOut = new TarArchiveOutputStream(new GzipCompressorOutputStream(new BufferedOutputStream(new FileOutputStream(archiveFile))));
And save yourself 6 lines of initialization and 3 local variables for debugging. Not everyone likes chaining things that way - I personally find it more readable but the rest of your team may prefer it your way.
As far as closing the stream, it looks correct to me.
I have to read a series of objects from a binary file.
I use:
ObjectOutputStream obj = new ObjectOutputStream(new FileInputStream(fame));
obj.readObject(p);
where p is a reference to an object I had created. How can I read the entire file until the end?
Can I use the following?
while (p != null) { }
readObject() returns null if and only if you wrote a null. The correct technique is to catch EOFException, and when you get it, close the stream and exit the reading loop.
Let's assume you meant ObjectInputStream and p = obj.readObject().
I would do something like this: (this is wrong, see EDIT below)
FileInputStream fstream = new FileInputStream(fileName);
try {
    ObjectInputStream ostream = new ObjectInputStream(fstream);
    while (ostream.available() > 0) {
        Object obj = ostream.readObject();
        // do something with obj
    }
} finally {
    fstream.close();
}
EDIT
I take it back! EJP rightly points out that the use of available() is incorrect here. I think the fixed code might be:
FileInputStream fstream = new FileInputStream(fileName);
try {
    ObjectInputStream ostream = new ObjectInputStream(fstream);
    while (true) {
        Object obj;
        try {
            obj = ostream.readObject();
        } catch (EOFException e) {
            break;
        }
        // do something with obj
    }
} finally {
    fstream.close();
}
Although the documentation for readObject() doesn't explicitly say that EOFException is thrown at the end of the stream, it seems to be implied and may be the only way to detect the end of the stream.
Another option if you control the code that wrote the stream would be to write an object count at the beginning, or a flag after each object indicating whether the previous object was the final one.
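For instance, a sketch of the object-count variant, where out and in are the object streams and MyObj is an illustrative type:

// Writer: record how many objects follow.
out.writeInt(objects.size());
for (MyObj o : objects) {
    out.writeObject(o);
}

// Reader: read back exactly that many.
int n = in.readInt();
for (int i = 0; i < n; i++) {
    MyObj obj = (MyObj) in.readObject();
    // do something with obj
}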
If you want to read objects into your program, then you have to use ObjectInputStream, not ObjectOutputStream.
And if you are storing a bunch of objects, then write an appropriate Collection to the file and read it back as a single object. The API documentation for readObject does not state that it will return null or throw an exception when EOF is reached, so to be on the safe side, use a Collection.
You may also want to read the API docs on ObjectInputStream and ObjectOutputStream.
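A minimal sketch of that round trip (assuming java.io.* and java.util.* imports; the file name and String payload are illustrative):

static void roundTrip() throws IOException, ClassNotFoundException {
    List<String> items = new ArrayList<>(Arrays.asList("alpha", "beta"));

    // Write the whole collection as a single object ...
    try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream("items.bin"))) {
        out.writeObject(items);
    }

    // ... and read it back with a single readObject() call.
    try (ObjectInputStream in = new ObjectInputStream(new FileInputStream("items.bin"))) {
        @SuppressWarnings("unchecked")
        List<String> loaded = (List<String>) in.readObject();
        System.out.println(loaded);
    }
}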
boolean more = true;
while (more) {
    try {
        System.out.println(reader.readObject());
    } catch (Exception e) {
        more = false;
        System.out.println("Dead end");
    }
}
Guava's Files.toByteArray does what you want so if fame in your code is a File, then
import com.google.common.io.Files;
...
byte[] fameBytes = Files.toByteArray(fame);
My code uses a BufferedReader to read from a file [main.txt] and a PrintWriter to write to another temp file [main.temp]. I close both streams, and yet the delete() call on the File object associated with [main.txt] failed. Only after calling System.gc() after closing both streams was I able to delete the File object.
public static boolean delete (String str1, String str2, File FileLoc)
{
    File tempFile = null;
    BufferedReader Reader = null;
    PrintWriter Writer = null;
    try
    {
        tempFile = new File(FileLoc.getAbsolutePath() + ".tmp");
        Reader = new BufferedReader(new FileReader(FileLoc));
        Writer = new PrintWriter(new FileWriter(tempFile));
        String lsCurrLine = null;
        while ((lsCurrLine = Reader.readLine()) != null)
        {
            // ...
            // ...
            if (true)
            {
                Writer.println(lsCurrLine);
                Writer.flush();
            }
        }
        Reader.close();
        Writer.close();
        System.gc();
    }
    catch (FileNotFoundException loFileExp)
    {
        System.out.println("\n File not found . Exiting");
        return false;
    }
    catch (IOException loFileExp)
    {
        System.out.println("\n IO Exception while deleting the record. Exiting");
        return false;
    }
}
Is this reliable? Or is there a better fix?
#user183717 - that code you posted is clearly not all of the relevant code. For instance, those "..."'s and the fact that File.delete() is not actually called in that code.
When a stream object is garbage collected, its finalizer closes the underlying file descriptor. So, the fact that the delete only works when you added the System.gc() call is strong evidence that your code is somehow failing to close some stream for the file. It may well be a different stream object to the one that is opened in the code that you posted.
Properly written stream handling code uses a finally block to make sure that streams get closed no matter what. For example:
Reader reader = new BufferedReader(new FileReader(file));
try {
    // do stuff
} finally {
    try {
        reader.close();
    } catch (IOException ex) {
        // ...
    }
}
If you don't follow that pattern or something similar, there's a good chance that there are scenarios where streams don't always get closed. In your code for example, if one of the read or write calls threw an exception you'd skip past the statements that closed the streams.
Is this [i.e. calling System.gc();] reliable?
No.
The JVM may be configured to ignore your application's gc() call.
There's no guarantee that the lost stream will be unreachable ... yet.
There's no guarantee that calling System.gc() will notice that the stream is unreachable. Hypothetically, the stream object might be tenured, and calling System.gc() might only collect the Eden space.
Even if the stream is found to be unreachable by the GC, there's no guarantee that the GC will run the finalizer immediately. Hypothetically, running the finalizers can be deferred ... indefinitely.
Or is there a better fix ?
Yes. Fix your application to close its streams properly.
Try using the java.io.File class. Here is a simple example:
File f = new File("file path or file name");
f.delete();
When you say you "close both the streams" you mean the BufferedReader and the PrintWriter?
You should only need to close the BufferedReader before the delete will work, but you also need to close the underlying stream; normally calling BufferedReader.close() will do that. It sounds like you think you are closing the stream but you aren't actually succeeding.
One problem with your code: you don't close the streams if exceptions occur. It's usually best to close the streams in a finally block.
Also, the code you posted doesn't use File.delete() anywhere. And what exactly do the ... lines do - are they re-assigning Reader to a new stream by any chance?
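As a sketch of that pattern with try-with-resources (Java 7+), so both streams are guaranteed to be closed before the delete is attempted (fileLoc and tempFile mirror the question's variables):

try (BufferedReader reader = new BufferedReader(new FileReader(fileLoc));
     PrintWriter writer = new PrintWriter(new FileWriter(tempFile))) {
    String line;
    while ((line = reader.readLine()) != null) {
        writer.println(line);
    }
} // both streams are closed here, even if an exception was thrown
boolean deleted = fileLoc.delete();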
Try using Apache Commons IO:
http://commons.apache.org/io/description.html
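For example, a sketch using Commons IO's FileUtils, which opens and closes the streams internally so nothing is left open to block the delete (the paths are illustrative):

import java.io.File;
import java.nio.charset.StandardCharsets;
import java.util.List;
import org.apache.commons.io.FileUtils;

File source = new File("main.txt");
List<String> lines = FileUtils.readLines(source, StandardCharsets.UTF_8);
// ... filter or transform lines here ...
FileUtils.writeLines(new File("main.temp"), lines);
FileUtils.deleteQuietly(source); // returns false instead of throwing on failure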
A snippet from the server code:
public void run() {
    try {
        // Create data input and output streams
        ObjectInputStream inputFromClient = new ObjectInputStream(
                socket.getInputStream());
        ObjectOutputStream outputToClient = new ObjectOutputStream(
                socket.getOutputStream());
        while (true) {
            cop = inputFromClient.readObject();
            String[][] m1 = new String[][] {{"1", "1", "1"}};
            Object xx = new getSerialModel(m1);
            outputToClient.reset();
            outputToClient.writeObject(xx);
            outputToClient.flush();
        }
    }
}
A snippet from the client:
//////////////
/// socket job
try {
    // Create a socket to connect to the server
    socket = new Socket("127.0.0." + Math.round(50 + Math.random() * 50), 8000);
    // Create an output stream to send data to the server
    toServer = new ObjectOutputStream(socket.getOutputStream());
    toServer.flush();
}
catch (IOException ex) {
    msgArea.append('\n' + ex.toString() + '\n');
}
///////////////////
//***
///////////////////
buttonSave.addActionListener(new ActionListener() {
    public void actionPerformed(ActionEvent ev) {
        System.out.println("Saving data is not implemented yet.");
        String[][] m1 = {{"0", "0", "0"}};
        for (int i = 0; i < tableModel.getRowCount(); i++) {
            for (int j = 0; j < tableModel.getColumnCount(); j++) {
                m1[i][j] = (String) tableModel.getValueAt(i, j);
            }
        }
        getSerialModel obt = new getSerialModel(m1);
        try {
            toServer.reset();
            toServer.writeObject(obt);
            toServer.flush();
        }
        catch (Exception ex) {
            msgArea.append("cant reach the server its may be off" + '\n');
        }
    }
});
// button send msg
buttonsendtest.addActionListener(new ActionListener() {
    public void actionPerformed(ActionEvent ev) {
        try {
            fromServer = new ObjectInputStream(socket.getInputStream());
            Object mdata = fromServer.readObject();
            tableModel.setDataVector((((getSerialModel) mdata).getmodel()), columnNames);
            table.updateUI();
        }
        catch (Exception ex) {
            System.out.print(ex.getStackTrace());
            msgArea.append("cant reach the server its may be off " + ex.toString() + '\n');
        }
    }
});
When I try to read a serializable object from the server multiple times, I get this exception; the first time, the receiver reads it successfully.
java.io.StreamCorruptedException: invalid stream header: 00007571
How can I fix it?
If you are creating multiple ObjectInputStream instances in series for the same socket input stream, this seems like a bad idea. If the server is writing multiple objects to the same output stream, then there is serialization-related information that only gets sent once per unique object, and only the first ObjectInputStream instance on the client would be able to reliably read this. Using only one ObjectInputStream instance per socket input stream and one ObjectOutputStream instance per socket output stream is probably the safest implementation.
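A sketch of the client-side fix along those lines: create both streams once, right after connecting, and reuse them in every listener (host is illustrative):

socket = new Socket(host, 8000);
toServer = new ObjectOutputStream(socket.getOutputStream());
toServer.flush(); // pushes the stream header so the server can construct its ObjectInputStream
fromServer = new ObjectInputStream(socket.getInputStream());
// ... later, reuse toServer and fromServer; never wrap
// socket.getInputStream()/getOutputStream() a second time.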
Also, if you are writing multiple objects to the same ObjectOutputStream instance on the server side (i.e., multiple writeObject() calls), this can result in stream header problems due to potentially multiple references to the same objects (typically nested references) when they are read by the client's input stream.
This problem occurs when the object output stream wraps a socket output stream, since during normal serialization the second and later references to an object do not describe the object but rather only use a back-reference. The client's ObjectInputStream does not reconstruct the objects properly, for some reason, due to a difference in the header information it is expecting (it doesn't retain that information from previous readObject() calls); this only seems to happen with socket streams, not file I/O, etc. The problem does not occur with the first readObject() call, but rather with the second and subsequent ones.
If you want to continue to use the same socket stream to write multiple objects, you will need something like the following in the server code:
objectOut.reset();
objectOut.writeObject(foo);
The reset() call re-initializes the stream, ignoring the state of any objects previously sent along the stream. This ensures that each object is sent in its entirety without the handle-type references that are typically used to compress ObjectOutputStream data and avoid duplication. It's less efficient, but there should be no data corruption when read by the client.
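An illustration of the back-reference behaviour just described (foo and setValue() are hypothetical):

foo.setValue(1);
objectOut.writeObject(foo); // a full copy of foo is written
foo.setValue(2);
objectOut.writeObject(foo); // only a back-reference is written; the reader still sees value == 1
objectOut.reset();
objectOut.writeObject(foo); // a full copy again; the reader now sees value == 2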
From the documentation for ObjectInputStream.readObject(), I quote:
Read an object from the ObjectInputStream. The class of the object, the signature of the class, and the values of the non-transient and non-static fields of the class and all of its supertypes are read. Default deserializing for a class can be overridden using the writeObject and readObject methods. Objects referenced by this object are read transitively so that a complete equivalent graph of objects is reconstructed by readObject.

The root object is completely restored when all of its fields and the objects it references are completely restored. At this point the object validation callbacks are executed in order based on their registered priorities. The callbacks are registered by objects (in the readObject special methods) as they are individually restored.

Exceptions are thrown for problems with the InputStream and for classes that should not be deserialized. All exceptions are fatal to the InputStream and leave it in an indeterminate state; it is up to the caller to ignore or recover the stream state.
Specified by:
readObject in interface ObjectInput
Returns:
the object read from the stream
Throws:
ClassNotFoundException - Class of a serialized object cannot be found.
InvalidClassException - Something is wrong with a class used by serialization.
StreamCorruptedException - Control information in the stream is inconsistent.
OptionalDataException - Primitive data was found in the stream instead of objects.
IOException - Any of the usual Input/Output related exceptions.
I'd guess that you're trying to read an object before one has been written to the object stream, or one where the output stream hasn't been flushed.
You are trying to read in an object of type Object. Is that how it was serialized? You need to make sure that you are reading the object into the same class that it was written from; remember those pesky serialVersionUID warnings that come up? This is key to object serialization and reconstruction, hence the need for matching classes, and also the reason you need to update your UID when your class structure changes.
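For illustration, pinning the serialVersionUID in the serialized class (the class name follows the question's getSerialModel; the value 1L is arbitrary):

public class getSerialModel implements java.io.Serializable {
    private static final long serialVersionUID = 1L; // keeps writer and reader in agreement
    // ... fields and methods ...
}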
Perhaps you're trying to read multiple times the same object from the stream, while the server wrote the object only once.
Or you're trying to use an ObjectInputStream before a corresponding ObjectOutputStream is created, and that invalidates the communication between the two. An ObjectOutputStream writes a serialization stream header upon its creation, and if it's not created before the corresponding ObjectInputStream starts reading, that header is lost.