I am trying to read the number of lines in a binary file using readObject(), but I get an EOFException. Am I doing this the right way?
FileInputStream istream = new FileInputStream(fileName);
ObjectInputStream ois = new ObjectInputStream(istream);

/** calculate number of items **/
int line_count = 0;
while ((String) ois.readObject() != null) {
    line_count++;
}
readObject() doesn't return null at EOF. You could catch the EOFException and interpret it as EOF, but this would fail to distinguish a normal EOF from a file that has been truncated.
A better approach would be to use some meta-data. That is, rather than asking the ObjectInput how many objects are in the stream, you should store the count somewhere. For example, you could create a meta-data class that records the count and other meta-data and store an instance as the first object in each file. Or you could create a special EOF marker class and store an instance as the last object in each file.
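A minimal sketch of the count-first idea, assuming the objects are Strings; the file name and method names here are invented for illustration:

import java.io.*;
import java.util.List;

static void writeWithCount(List<String> items) throws IOException {
    try (ObjectOutputStream oos = new ObjectOutputStream(new FileOutputStream("data.bin"))) {
        oos.writeInt(items.size());      // meta-data: number of objects that follow
        for (String item : items) {
            oos.writeObject(item);
        }
    }
}

static int readWithCount() throws IOException, ClassNotFoundException {
    try (ObjectInputStream ois = new ObjectInputStream(new FileInputStream("data.bin"))) {
        int count = ois.readInt();       // read the count back before any objects
        for (int i = 0; i < count; i++) {
            String s = (String) ois.readObject();
            // process s here
        }
        return count;
    }
}

The count doubles as a sanity check: if the file was truncated, the read loop hits an EOFException before reaching the expected count, which is exactly the corruption signal a plain catch-EOF approach cannot give you.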
I had the same problem today. Although the question is quite old, the problem remains and there was no clean solution provided. Ignoring EOFException should be avoided, as it may also be thrown when some object was not saved correctly. Writing null obviously prevents you from using null values for any other purpose. Finally, calling available() on the object stream always returns zero, as the number of objects is unknown.
My solution is quite simple. ObjectInputStream is just a wrapper for some other stream, such as FileInputStream. Although ObjectInputStream.available() returns zero, FileInputStream.available() will return some value while bytes remain.
FileInputStream istream = new FileInputStream(fileName);
ObjectInputStream ois = new ObjectInputStream(istream);

/** calculate number of items **/
int line_count = 0;
while (istream.available() > 0) {   // check whether the underlying file stream is at its end
    ois.readObject();               // read from the object stream, which wraps the file stream
    line_count++;
}
No. Catch EOFException and use that to terminate the loop.
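A minimal sketch of that approach; the counting logic is taken from the question, and treating EOFException as the normal exit is the point of this answer:

import java.io.*;

static int countObjects(String fileName) throws IOException, ClassNotFoundException {
    int count = 0;
    try (ObjectInputStream ois = new ObjectInputStream(new FileInputStream(fileName))) {
        while (true) {
            ois.readObject();        // throws EOFException once the stream is exhausted
            count++;
        }
    } catch (EOFException endOfStream) {
        return count;                // EOF is the expected way out of the loop
    }
}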
If you write a null object at the end of the file, when you read it back you will get a null value and can terminate your loop.
Just add:
out.writeObject(null);
when you serialize the data.
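The reading side then becomes a sketch like this, assuming null is never a legitimate value anywhere else in the stream:

String s;
int line_count = 0;
while ((s = (String) ois.readObject()) != null) {   // the trailing null marks end of data
    line_count++;
}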
It's curious that the API doesn't supply a more elegant solution to this. I guess the EOFException would work but I've always been encouraged to see exceptions as unexpected events whereas here you would often expect the object stream to come to an end.
I tried to work around this by writing a kind of "marker" object to signify the end of the object stream:
import java.io.Serializable;

public enum ObjectStreamStatus implements Serializable {
    EOF
}
Then in the code reading the objects, I checked for this EOF marker in the reading loop.
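A sketch of such a reading loop; since deserialized enum constants are canonical singletons, a reference comparison against the marker is safe:

Object obj;
while ((obj = ois.readObject()) != ObjectStreamStatus.EOF) {
    String s = (String) obj;   // everything before the marker is assumed to be a String
    // process s ...
}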
No, you need to know how many objects there are in the binary file. You could write the number of objects at the beginning of the file (using writeInt, for example) and read it back before loading them.
Another option is to call ois.available() and loop until it returns 0. However, I am not sure this is 100% reliable.
It looks like the problem is with the data that you wrote out. Assuming the data is written as expected by this code, there shouldn't be a problem.
(I see you are reading Strings. ObjectInputStream isn't for reading text files. Use InputStreamReader and BufferedReader.readLine for that. Similarly, if you have written the file with DataOutputStream.writeUTF, read it with DataInputStream.readUTF.)
The available method of ObjectInputStream cannot be used to terminate the loop, as it returns 0 even if there are objects still to be read from the file. Writing a null to the file doesn't seem to be a good solution either, since objects can be null, and a legitimate null would then be interpreted as the end of the file. I think catching the EOFException to terminate the loop is the better practice: if an EOFException occurs (either because you reached the end of the file or for some other reason), you have to terminate the loop anyway.
The best way to end the loop is to add a null object at the end; while reading, the null object can then be used as the boundary condition to exit the loop. Catching the EOFException also serves the purpose, but throwing and catching the exception costs a few extra milliseconds.
Related
Consider the following code snippet for reading a file in Java/Android:
FileInputStream fis = openFileInput("myfile.txt");
BufferedInputStream bis = new BufferedInputStream(fis);
StringBuffer b = new StringBuffer();
while (bis.available() != 0) {
    char c = (char) bis.read();
    b.append(c);
}
bis.close();
fis.close();
I am talking about the available() method in the condition of the while loop. I looked at the API documentation for that method and I have the following questions:
How does the iteration inside the while loop happen, i.e. how does the file pointer move to the next chunk of data during each iteration of the while loop? This is not specified in the API documentation.
How can I figure out which method of which class should I use to accomplish a task?
The available method returns an int as an estimate of the number of bytes that can be read (or skipped over) from this input stream without blocking by the next invocation of a method for this input stream.
That while loop essentially iterates until the end of the file: when there are no bytes left to read, available() returns 0 and the loop ends.
Here is some documentation: http://docs.oracle.com/javase/7/docs/api/java/io/BufferedInputStream.html
The bis.available() call checks for the end of the file. In each iteration of the while loop, when you call bis.read(), one character is read and the file pointer automatically moves on to the next one.
As for your question about which method of which class to use: look at the parameters each method takes and what you need to accomplish. It's not very difficult to figure that out.
I am learning how to use an InputStream. I was trying to use mark() with a BufferedInputStream, but when I try to reset I get this exception:
java.io.IOException: Resetting to invalid mark
I think this means that my mark read limit is set wrong. I actually don't know how to set the read limit in mark(). I tried it like this:
is = new BufferedInputStream(is);
is.mark(is.available());
This is also wrong.
is.mark(16);
This also throws the same exception.
How do I know what read limit I am supposed to set? Since I will be reading different file sizes from the input stream.
mark is sometimes useful if you need to inspect a few bytes beyond what you've read to decide what to do next, then you reset back to the mark and call the routine that expects the file pointer to be at the beginning of that logical part of the input. I don't think it is really intended for much else.
If you look at the javadoc for BufferedInputStream it says
The mark operation remembers a point in the input stream and the reset operation causes all the bytes read since the most recent mark operation to be reread before new bytes are taken from the contained input stream.
The key thing to remember here is once you mark a spot in the stream, if you keep reading beyond the marked length, the mark will no longer be valid, and the call to reset will fail. So mark is good for specific situations and not much use in other cases.
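A small sketch of that "peek ahead" pattern; the file name and the 64-byte limit are arbitrary:

BufferedInputStream in = new BufferedInputStream(new FileInputStream("data.bin"));
in.mark(64);              // we may read up to 64 bytes and still reset() safely
int first = in.read();    // peek at the first byte to decide what to do next
in.reset();               // back to the mark; the peeked byte will be read again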
This will read 5 times from the same BufferedInputStream.
for (int i = 0; i < 5; i++) {
    inputStream.mark(inputStream.available() + 1);
    // Read from the input stream
    Thumbnails.of(inputStream).forceSize(160, 160).toOutputStream(out);
    inputStream.reset();
}
The value you pass to mark() is the maximum number of bytes you may read after setting the mark and still be able to reset back to it. If you need to reset to the beginning of the stream, you will need a buffer as big as the entire stream. This is probably not a great design, as it will not scale well to large streams. If you need to read the stream twice and you don't know the source of the data (e.g. if it's a file, you could just re-open it), then you should probably copy it to a temp file so you can re-read it at will.
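If the data fits in memory, a hedged alternative to the temp file is to copy the stream into a byte array once and open as many fresh streams over it as you need (readAllBytes is Java 9+; originalStream stands in for your source):

byte[] data = originalStream.readAllBytes();

// each pass gets its own independent stream over the same bytes
InputStream firstPass = new ByteArrayInputStream(data);
InputStream secondPass = new ByteArrayInputStream(data);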
I'm reading values from a file and storing them in a HashMap, using a BufferedReader, in the following manner --
while ((String str = buffread.readLine()).length() > 1)
{
    hashMap.put(str.substring(0, 5), str);
}
I can also verify that the hashmap has all data that was initially present in the file.
Now, I'm trying to write the values of that same HashMap to another file in the following manner --
FileWriter outFile = new FileWriter("file path");
PrintWriter out = new PrintWriter(outFile);
Set entries = hashMap.entrySet();
Iterator entryIter = entries.iterator();
while (entryIter.hasNext()) {
    Map.Entry entry = (Map.Entry) entryIter.next();
    Object value = entry.getValue(); // Get the value.
    out.println(value.toString());
}
But this seems to write fewer entries into the file than hashMap1.size() reports -- essentially, fewer than the number of entries that were initially read from the source file.
Though I have a hunch that it's because of the PrintWriter and FileWriter, if anyone could point out why this issue is occurring, it would be of great help.
Perhaps you left this out of the code you posted, but are you explicitly calling flush() and close() on the PrintWriter/FileWriter objects when you are done with them?
Each call to println() does not necessarily cause a line to be written to the underlying OutputStream/file.
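With try-with-resources the close, and therefore the final flush, happens automatically; a sketch based on the question's code:

try (PrintWriter out = new PrintWriter(new FileWriter("file path"))) {
    for (Object value : hashMap.values()) {
        out.println(value);    // may sit in the buffer until the stream is flushed
    }
}                              // close() flushes any remaining buffered output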
Unless the first 5 characters on every line of your source file are unique, this line
hashMap.put(str.substring(0,5),str);
will overwrite some entries in the Map, since duplicate keys replace the previous value.
There is a possibility that something fails when writing to file:
Methods in this (PrintWriter) class never throw I/O exceptions. The client may inquire as to whether any errors have occurred by invoking checkError().
In general, I don't think it's a problem with HashMap, as you said that the data was read correctly.
You can't possibly read a file correctly with that code. You have to check the result of readLine() for null before you do anything else with it, unless you like catching NullPointerExceptions of course.
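A corrected read loop as a sketch, keeping the substring keying from the question but guarding both the null and the length:

String str;
while ((str = buffread.readLine()) != null) {   // null, not length, signals end of file
    if (str.length() >= 5) {                    // substring(0, 5) needs at least 5 chars
        hashMap.put(str.substring(0, 5), str);
    }
}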
You don't need the Iterator here; just use the key set and iterate over it:
Set<String> keys = hashMap.keySet();
for (String key : keys) {
    out.println(hashMap.get(key));
}
should do it.
Is there a way to flush the input stream in Java, perhaps prior to closing it? In relation to
iteratively invoking the statements below while reading several files on disk:
InputStream fileStream = item.openStream();
fileStream.close();
InputStream cannot be flushed. Why do you want to do this?
OutputStream can be flushed because it implements the Flushable interface. IMHO, flushing only makes sense in scenarios where data is written (to force buffered data to actually be written out). Please see the documentation of Flushable for all implementing classes.
This is an old question but appears to be the only one of its kind, and I think there is a valid use case for "flushing" an input stream.
In Java, if you are using a BufferedReader or BufferedInputStream (which I believe is a common case), "flushing" the stream can be considered to be equivalent to discarding all data currently in the buffer -- i.e., flushing the buffer.
For an example of when this might be useful, consider implementing a REPL for a programming language such as Python or similar.
In this case you might use a BufferedReader on System.in. The user enters a (possibly large) expression and hits enter. At this point, the expression (or some part of it) is stored in the buffer of your Reader.
Now, if a syntax error occurs somewhere within the expression, it will be caught and displayed to the user. However, the remainder of the expression still resides in the input buffer.
When the REPL loop continues, it will begin reading just beyond the point where the syntax error occurred, in the middle of some erroneous expression. This is likely not desirable. Rather, it would be better to simply discard the remainder of the buffer and continue with a "fresh start."
In this sense, we can use the BufferedReader API method ready() to discard any remaining buffered characters. The documentation reads:
"Tells whether this stream is ready to be read. A buffered character stream is ready if the buffer is not empty, or if the underlying character stream is ready."
Then we can define a method to "flush" a BufferedReader as:
void flush(BufferedReader input) throws IOException
{
    while (input.ready())
        input.read();
}
which simply discards all remaining data until the buffer is empty. We then call flush() after handling a syntax error (i.e., after displaying it to the user). When the REPL loop resumes, you have an empty buffer and thus do not get a pile of meaningless errors caused by the "junk" left in the buffer.
I currently have 2 BufferedReaders initialized on the same text file. When I'm done reading the text file with the first BufferedReader, I use the second one to make another pass through the file from the top. Multiple passes through the same file are necessary.
I know about reset(), but it needs to be preceded with calling mark() and mark() needs to know the size of the file, something I don't think I should have to bother with.
Ideas? Packages? Libs? Code?
BufferedReaders are meant to read a file sequentially. What you are looking for is java.io.RandomAccessFile; you can then use seek() to jump wherever you want in the file.
A random access file is used like so:
try {
    String fileName = "c:/myraffile.txt";
    File file = new File(fileName);
    RandomAccessFile raf = new RandomAccessFile(file, "rw");
    raf.readChar();
    raf.seek(0);
} catch (FileNotFoundException e) {
    e.printStackTrace();
} catch (IOException e) {
    e.printStackTrace();
}
The "rw" is a mode character which is detailed here.
The reason the sequential-access readers are set up like this is so that they can maintain their buffers without things changing beneath their feet. For example, the file reader given to a buffered reader should only be operated on by that buffered reader. If another place could affect it, you could get inconsistent behavior: one reader advances the position in the underlying file reader while the other expects it to stay put, and when you then use the other reader it is at an undetermined location.
What's the disadvantage of just creating a new BufferedReader to read from the top? I'd expect the operating system to cache the file if it's small enough.
If you're concerned about performance, have you proved it to be a bottleneck? I'd just do the simplest thing and not worry about it until you have a specific reason to. I mean, you could just read the whole thing into memory and then do the two passes on the result, but again that's going to be more complicated than just reading from the start again with a new reader.
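A sketch of the "just open a new reader" approach (the file name is an example):

try (BufferedReader first = new BufferedReader(new FileReader("input.txt"))) {
    String line;
    while ((line = first.readLine()) != null) {
        // gather whatever the first pass needs
    }
}

// a brand-new reader starts at the top of the file again
try (BufferedReader second = new BufferedReader(new FileReader("input.txt"))) {
    String line;
    while ((line = second.readLine()) != null) {
        // do the second-pass work
    }
}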
The best way to proceed is to change your algorithm so that you will NOT need the second pass. I used this approach a couple of times, when I had to deal with huge (but not terrible, i.e. a few GB) files which didn't fit the available memory.
It might be hard, but the performance gain is usually worth the effort.
About mark/reset:
The mark method in BufferedReader takes a readAheadLimit parameter which limits how far you can read after a mark before resetting becomes impossible. Resetting doesn't actually perform a file-system seek(0); it just seeks inside the buffer. To quote the Javadoc:
readAheadLimit - Limit on the number of characters that may be read while still preserving the mark. After reading this many characters, attempting to reset the stream may fail. A limit value larger than the size of the input buffer will cause a new buffer to be allocated whose size is no smaller than limit. Therefore large values should be used with care.
"The whole business about mark() and reset() in BufferedReader smacks of poor design."
why don't you extend this class, have it call mark() in the constructor, and then have a topOfFile() method that does a reset() back to the start?
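A hedged sketch of that idea; the class name is invented here, and the read-ahead limit must be at least the number of characters you read before returning to the top:

import java.io.*;

class TopMarkedReader extends BufferedReader {
    TopMarkedReader(Reader in, int readAheadLimit) throws IOException {
        super(in);
        mark(readAheadLimit);   // remember the very top of the stream
    }

    void topOfFile() throws IOException {
        reset();                // return to the mark set in the constructor
    }
}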