How to re-read an InputStream after calling IOUtils.copy? - java

I simply use
IOUtils.copy(myInputStream, myOutputStream);
And I see that before calling IOUtils.copy the input stream is available to read, and afterwards it is not.
flux.available()
(int) 1368181 (before)
(int) 0 (after)
I saw some explanation on this post, and I see I can copy the bytes from my InputStream to a ByteArrayInputStream and then use mark(0) and reset() in order to read the input stream multiple times.
Here is the resulting code (which works).
I find this code very verbose, and I'd like to know if there is a better way to do this.
ByteArrayInputStream fluxResetable = new ByteArrayInputStream(IOUtils.toByteArray(myInputStream));
fluxResetable.mark(0);
IOUtils.copy(fluxResetable, myOutputStream);
fluxResetable.reset();

An InputStream, unless otherwise stated, is single shot: you consume it once and that's it.
If you want to read it many times, that isn't just a stream any more, it's a stream with a buffer. Your solution reflects that accurately, so it is acceptable. The one thing I would probably change is to store the byte array and always create a new ByteArrayInputStream from it when needed, rather than resetting the same one:
byte [] content = IOUtils.toByteArray(myInputStream);
IOUtils.copy(new ByteArrayInputStream(content), myOutputStream);
doSomethingElse(new ByteArrayInputStream(content));
The effect is more or less the same but it's slightly easier to see what you're trying to do.
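For readers without Commons IO on the classpath, the same idea can be sketched with the JDK alone. This is only a minimal illustration: the class name ReplayableContent is made up for the example, and it assumes Java 9 or later for readAllBytes() and transferTo().

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of any library.
class ReplayableContent {
    private final byte[] content;

    // Drains the single-shot stream once and keeps its bytes in memory.
    ReplayableContent(InputStream source) throws IOException {
        this.content = source.readAllBytes(); // Java 9+
    }

    // Every call returns an independent stream positioned at the first byte.
    InputStream newStream() {
        return new ByteArrayInputStream(content);
    }

    public static void main(String[] args) throws IOException {
        InputStream original =
                new ByteArrayInputStream("hello".getBytes(StandardCharsets.UTF_8));
        ReplayableContent replay = new ReplayableContent(original);

        ByteArrayOutputStream first = new ByteArrayOutputStream();
        replay.newStream().transferTo(first);  // Java 9+
        ByteArrayOutputStream second = new ByteArrayOutputStream();
        replay.newStream().transferTo(second); // same bytes again

        System.out.println(first.size() + " bytes, then " + second.size() + " bytes");
    }
}
```

As with the original solution, the whole content sits in memory, so this only suits streams of moderate size.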

Related

get size of InputStream in Java

I am trying to encrypt an InputStream which I am getting from FileItem.getInputStream(); and I want to know the encrypted stream's length.
If it were a simple FileInputStream, I could have tried File#length() or FileInputStream#getChannel().size(), though I am not sure even those would have given me exactly what I want. But in this case what I have is an InputStream (the encrypted one), and I want to find its length. I tried searching on the Internet but could not find any efficient solution.
Please help
You have to read the whole stream for that. Streams are probably not even complete when you start reading from them, so the size may not be known at that time.
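You can read everything in the InputStream and count the bytes as you go:
long length = 0;
byte[] buffer = new byte[8192];
int n;
// read() returns -1 once the stream is exhausted
while ((n = inputStreamObj.read(buffer)) != -1) {
    length += n; // bytes consumed in this pass
}
System.out.println(length);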

Can BufferedReader read bytes?

Sorry if this question is a duplicate, but I didn't get the answer I was looking for.
The Java docs say this:
In general, each read request made of a Reader causes a corresponding read request to be made of the underlying character or byte stream. It is therefore advisable to wrap a BufferedReader around any Reader whose read() operations may be costly, such as FileReaders and InputStreamReaders. For example,
BufferedReader in = new BufferedReader(new FileReader("foo.in"));
will buffer the input from the specified file. Without buffering, each invocation of read() or readLine() could cause bytes to be read from the file, converted into characters, and then returned, which can be very inefficient.
My first question: if BufferedReader can read bytes, why can't we use it to work with images, which are bytes?
My second question: does BufferedReader store characters in a buffer, and what does this line mean?
will buffer the input from the specified file.
My third question: what does this line mean?
In general, each read request made of a Reader causes a corresponding read request to be made of the underlying character or byte stream.
There are two questions here.
1. Buffering
Imagine you lived a mile from your nearest water source, and you drink a cup of water every hour. Well, you wouldn't walk all the way to the water for every cup. Go once a day, and come home with a bucket full of enough water to fill the cup 24 times.
The bucket is a buffer.
Imagine your village is supplied water by a river. But sometimes the river runs dry for a month; other times the river brings so much water that the village floods. So you build a dam, and behind the dam there is a reservoir. The reservoir fills up in the rainy season and gradually empties in the dry season. The village gets a steady flow of water all year round.
The reservoir is a buffer.
Data streams in computing are similar to both those scenarios. For example, you can get several kilobytes of data from a filesystem in a single OS system call, but if you want to process one character at a time, you need something similar to a reservoir.
A BufferedReader contains within it another Reader (for example a FileReader), which is the river -- and an array of chars, which is the reservoir. Every time you read from it, it does something like:
if there are not enough chars in the "reservoir" to fulfil this request
top up the "reservoir" by reading from the underlying Reader
endif
return some chars from the "reservoir".
However when you use a BufferedReader, you don't need to know how it works, only that it works.
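If you are curious anyway, the reservoir logic can be sketched in a few lines. This is a toy illustration and not the real BufferedReader source: the class ReservoirReader and its fields are invented for the example, and it handles only single-character reads.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Toy buffered reader: illustrative only, not how the JDK class is written.
class ReservoirReader {
    private final Reader river;      // the underlying Reader
    private final char[] reservoir;  // the buffer
    private int next = 0;            // index of the next char to hand out
    private int filled = 0;          // number of valid chars in the reservoir
    int refills = 0;                 // refills that actually delivered data

    ReservoirReader(Reader river, int capacity) {
        this.river = river;
        this.reservoir = new char[capacity];
    }

    // Returns the next char as an int, or -1 at end of stream.
    int read() throws IOException {
        if (next >= filled) {
            // Reservoir is empty: top it up with one bulk read.
            int n = river.read(reservoir, 0, reservoir.length);
            if (n <= 0) {
                return -1;
            }
            filled = n;
            next = 0;
            refills++;
        }
        return reservoir[next++];
    }

    public static void main(String[] args) throws IOException {
        ReservoirReader in = new ReservoirReader(new StringReader("abcdefgh"), 4);
        StringBuilder sb = new StringBuilder();
        int c;
        while ((c = in.read()) != -1) {
            sb.append((char) c);
        }
        System.out.println(sb + " read with " + in.refills + " refills");
    }
}
```

Eight single-character reads are served by two bulk reads of four chars each, which is the whole point of the bucket.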
2. Suitability for images
It's important to understand that BufferedReader and FileReader are examples of Readers. You might not have covered polymorphism in your programming education yet, so when you do, remember this. It means that if you have code which uses FileReader -- but only the aspects of it that conform to Reader -- then you can substitute a BufferedReader and it will work just the same.
It's a good habit to declare variables as the most general class that works:
Reader reader = new FileReader(file);
... because then this would be the only change you need to add buffering:
Reader reader = new BufferedReader(new FileReader(file));
I took that detour because it's all Readers that are less suitable for images.
Reader has two read methods:
int read(); // returns one character, cast to an int
int read(char[] block); // reads into block, returns how many chars it read
The second form is unsuitable for images because it definitely reads chars, not ints.
The first form looks as if it might be OK -- after all, it reads ints. And indeed, if you just use a FileReader, it might well work.
However, think about how a BufferedReader wrapped around a FileReader will work. The first time you call BufferedReader.read(), it will call FileReader.read(buffer) to fill its buffer. Then it will cast the first char of the buffer back to an int, and return that.
Especially when you bring multi-byte charsets into the picture, that can cause problems.
So if you want to read raw bytes, use InputStream, not Reader. InputStream has int read(byte[] buf, int offset, int length) -- bytes are much more reliably cast back and forth from int than chars.
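To make the problem concrete, here is a small sketch that pushes the first two bytes of a typical JPEG header (0xFF 0xD8) through both APIs. The class and method names are made up for the example, and UTF-8 is chosen as the charset only for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

class BinaryThroughReader {
    // Reads the data through an InputStream: each byte comes back as 0..255.
    static int[] viaInputStream(byte[] data) throws IOException {
        try (InputStream in = new ByteArrayInputStream(data)) {
            int[] out = new int[data.length];
            for (int i = 0; i < out.length; i++) {
                out[i] = in.read();
            }
            return out;
        }
    }

    // Reads the same data through a Reader that decodes UTF-8 and
    // returns the first decoded char as an int.
    static int firstCharViaReader(byte[] data) throws IOException {
        try (Reader reader = new InputStreamReader(
                new ByteArrayInputStream(data), StandardCharsets.UTF_8)) {
            return reader.read();
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] jpegStart = {(byte) 0xFF, (byte) 0xD8};
        int[] raw = viaInputStream(jpegStart);
        System.out.printf("InputStream: 0x%02X 0x%02X%n", raw[0], raw[1]);
        System.out.printf("Reader:      0x%04X%n", firstCharViaReader(jpegStart));
    }
}
```

The InputStream returns the original values, while the Reader returns U+FFFD, the replacement character, because 0xFF is not valid UTF-8. The original byte is lost at that point.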
Readers (and Writers) in Java are specialized classes for dealing with text (character) streams; the concept of a line is meaningless in any other type of stream.
For the general I/O equivalent, have a look at BufferedInputStream.
So, to answer your questions:
1. While the reader does eventually read bytes, it converts them to characters. It is not intended to read anything else (like images); use the InputStream family of classes for that.
2. A buffered reader will read large blocks of data from the underlying stream (which may be a file, socket, or anything else) into a buffer in memory and will then serve read requests from this buffer until the buffer is emptied. This behaviour of reading large chunks instead of smaller chunks every time improves performance.
3. It means that if you don't wrap a reader in a buffered reader, then every time you want to read a single character, it will access the disk or network to get just the single character you want. Doing I/O in such small chunks is usually terrible for performance.
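The difference can be observed by counting how many read requests reach the underlying Reader. The sketch below is illustrative only: CountingReader is a hypothetical wrapper written for this example, and a StringReader stands in for a file or socket.

```java
import java.io.BufferedReader;
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

class ReadCallCounter {
    // Hypothetical helper: counts read requests that reach the wrapped Reader.
    static class CountingReader extends FilterReader {
        int calls = 0;

        CountingReader(Reader in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            calls++;
            return super.read();
        }

        @Override
        public int read(char[] cbuf, int off, int len) throws IOException {
            calls++;
            return super.read(cbuf, off, len);
        }
    }

    // Reads the text one char at a time and reports how many requests hit the source.
    static int underlyingCalls(String text, boolean buffered) throws IOException {
        CountingReader source = new CountingReader(new StringReader(text));
        Reader reader = buffered ? new BufferedReader(source) : source;
        while (reader.read() != -1) {
            // discard the data; only the call count matters here
        }
        return source.calls;
    }

    public static void main(String[] args) throws IOException {
        String text = new String(new char[10_000]); // 10,000 chars of filler
        System.out.println("unbuffered: " + underlyingCalls(text, false));
        System.out.println("buffered:   " + underlyingCalls(text, true));
    }
}
```

Without buffering, roughly one request per character reaches the source. With a BufferedReader, only a handful of bulk requests do.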
1. By default a Reader converts what it reads to characters, but an image is not character data; it is raw bytes of pixel data. So you cannot use a Reader for it.
2. It is buffering, which means it reads a chunk of data into a char array. You can see this behaviour in the code:
public BufferedReader(Reader in) {
this(in, defaultCharBufferSize);
}
and the defaultCharBufferSize is as mentioned below:
private static int defaultCharBufferSize = 8192;
3. Without buffering, every read operation goes to the underlying stream, even when it fetches only one character.
So in a nutshell, buffered means it first reads a chunk of character data into a char array, serves reads from that array, and then reads the next chunk, until it reaches the end of the stream.
You can refer to the following to learn more:
BufferedReader

Buffered Input Stream mark read limit

I am learning how to use an InputStream. I was trying to use mark for BufferedInputStream, but when I try to reset I have these exceptions:
java.io.IOException: Resetting to invalid mark
I think this means that my mark read limit is set wrong. I actually don't know how to set the read limit in mark(). I tried like this:
is = new BufferedInputStream(is);
is.mark(is.available());
This is also wrong.
is.mark(16);
This also throws the same exception.
How do I know what read limit I am supposed to set? Since I will be reading different file sizes from the input stream.
mark is sometimes useful if you need to inspect a few bytes beyond what you've read to decide what to do next, then you reset back to the mark and call the routine that expects the file pointer to be at the beginning of that logical part of the input. I don't think it is really intended for much else.
If you look at the javadoc for BufferedInputStream it says
The mark operation remembers a point in the input stream and the reset operation causes all the bytes read since the most recent mark operation to be reread before new bytes are taken from the contained input stream.
The key thing to remember here is once you mark a spot in the stream, if you keep reading beyond the marked length, the mark will no longer be valid, and the call to reset will fail. So mark is good for specific situations and not much use in other cases.
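The effect of the read limit can be sketched as follows. The class and method names are invented for the example, and the buffer size is deliberately tiny so the behaviour shows up with only a few bytes. Note that BufferedInputStream may keep a mark valid for longer than the read limit when its internal buffer happens to be larger; only reads within the limit are guaranteed to be resettable.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;

class MarkLimitDemo {
    // Marks the start, reads bytesToRead bytes one at a time, then tries to reset.
    // Returns true if reset() succeeded, false if the mark had been invalidated.
    static boolean canResetAfterReading(int bufferSize, int readLimit, int bytesToRead)
            throws IOException {
        byte[] data = new byte[64];
        try (BufferedInputStream in =
                     new BufferedInputStream(new ByteArrayInputStream(data), bufferSize)) {
            in.mark(readLimit);
            for (int i = 0; i < bytesToRead; i++) {
                in.read();
            }
            try {
                in.reset();
                return true;
            } catch (IOException invalidMark) {
                return false;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Stayed within the read limit: reset works.
        System.out.println(canResetAfterReading(4, 4, 4));
        // Read past both the read limit and the buffer: the mark is gone.
        System.out.println(canResetAfterReading(4, 4, 10));
        // Larger read limit: the buffer grows to keep the marked bytes.
        System.out.println(canResetAfterReading(4, 16, 10));
    }
}
```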
This will read 5 times from the same BufferedInputStream.
for (int i=0; i<5; i++) {
inputStream.mark(inputStream.available()+1);
// Read from input stream
Thumbnails.of(inputStream).forceSize(160, 160).toOutputStream(out);
inputStream.reset();
}
The value you pass to mark() is the maximum number of bytes you can read before calling reset() while keeping the mark valid. If you need to reset to the beginning of the stream, you will need a buffer as big as the entire stream. This is probably not a great design, as it will not scale well to large streams. If you need to read the stream twice and you don't know the source of the data (if it's a file, for example, you could just re-open it), then you should probably copy it to a temp file so you can re-read it at will.
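A minimal sketch of the temp-file approach, using only java.nio.file. The class name TempFileReplay, the method name spool and the file prefix are arbitrary choices for the example, and readAllBytes() in the demo assumes Java 9 or later.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

class TempFileReplay {
    // Copies the single-shot stream to a temp file and returns its path.
    static Path spool(InputStream source) throws IOException {
        Path tmp = Files.createTempFile("replay-", ".bin");
        // createTempFile already created the file, so overwrite it.
        Files.copy(source, tmp, StandardCopyOption.REPLACE_EXISTING);
        return tmp;
    }

    public static void main(String[] args) throws IOException {
        InputStream original = new ByteArrayInputStream(new byte[1024]);
        Path tmp = spool(original);
        try {
            // Each pass opens a fresh stream over the same data.
            for (int pass = 1; pass <= 2; pass++) {
                try (InputStream in = Files.newInputStream(tmp)) {
                    System.out.println("pass " + pass + ": " + in.readAllBytes().length + " bytes");
                }
            }
        } finally {
            Files.deleteIfExists(tmp); // clean up when done
        }
    }
}
```

Unlike a byte array, this keeps memory use flat regardless of stream size, at the cost of disk I/O.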

Can I perform successive mark operations on an InputStream in Java

I'm trying to build a simple parser, and since InputStream doesn't have some peek-like method, I'm using mark and reset.
But I suspect that successive calls to mark invalidate the previous ones. Is that the case?
Is it possible to do something like
foo.mark(1);
...
foo.mark(2);
...
foo.reset();
...
foo.reset();
If not, is there some other way to simulate this or the peek method?
Thx.
Your suspicion is correct: the InputStream.mark(int readlimit) method will only allow you to reposition the stream to the last marked position, provided you have read no more than readlimit bytes since then. If you want a "peekable" InputStream you may want to consider the PushbackInputStream. It doesn't explicitly offer peek functionality, but it will allow you to "push back" bytes you have read.
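A one-byte peek on top of PushbackInputStream could look roughly like this. The helper class Peek is hypothetical, and the default PushbackInputStream constructor only allows a single byte to be pushed back, which is all this sketch needs.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.PushbackInputStream;
import java.nio.charset.StandardCharsets;

class Peek {
    // Returns the next byte without consuming it, or -1 at end of stream.
    static int peek(PushbackInputStream in) throws IOException {
        int b = in.read();
        if (b != -1) {
            in.unread(b); // put it back so the next read sees it again
        }
        return b;
    }

    public static void main(String[] args) throws IOException {
        PushbackInputStream in = new PushbackInputStream(
                new ByteArrayInputStream("ab".getBytes(StandardCharsets.US_ASCII)));
        System.out.println((char) peek(in));   // looks at 'a'
        System.out.println((char) in.read());  // consumes 'a'
        System.out.println((char) peek(in));   // looks at 'b'
    }
}
```

For a parser that needs more than one byte of lookahead, the two-argument constructor takes a larger pushback buffer size.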
Marks don't nest.
If you want to reread the stream several times, you might need to copy (a portion of) the stream into a byte array, and make a ByteArrayInputStream of it. You still can't have multiple marks, but you can have multiple ByteArrayInputStreams. (Or just forget about ByteArrayInputStream and pick bytes off the array directly.)

Testing for empty InputStream in Java

how do you guys test for an empty InputStream? I know that InputStream is designed to work with remote resources, so you can't know if it's there until you actually read from it. I cannot use read() because current position would change and using mark() and resetting for that seems to be inappropriate.
The problem is, that sometimes one can't test if read() returns -1, because if you have a stream and some third party library uses it, you need to test if it is empty before you send it there.
By empty InputStreams I mean something like new ByteArrayInputStream(new byte[0]).
You can wrap your InputStream in a PushbackInputStream. This class will store the first few bytes from read() in an internal buffer. You can later unread() the bytes and pass the object to the 3rd party library.
I don't really like ByteArrayInputStream, because it keeps all the data from the stream in memory.
Also, in any case, you will be forced to read() to check for the empty stream, which means you'll hit the network, at least for a few bytes.
A couple of alternatives:
ByteArrayInputStreams and several other similar classes are by definition non-blocking, as the data is already in the VM memory. In those cases the available() from InputStream could be what you need. This will not work when reading from an input source external to the program, e.g. a network socket, the standard input or perhaps even a file.
If the markSupported() method returns true for a specific InputStream instance, you may be able to use the mark() and reset() methods to return to the start of the stream after attempting to use read() on it.
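The mark/reset alternative might be sketched like this. EmptyStreamCheck is a made-up name, and the method refuses streams that do not support marking rather than silently consuming a byte. On a network-backed stream the read() call can still block until data arrives.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

class EmptyStreamCheck {
    // Returns true if the stream has no bytes left; leaves the position unchanged.
    static boolean isEmpty(InputStream in) throws IOException {
        if (!in.markSupported()) {
            throw new IllegalArgumentException("stream does not support mark/reset");
        }
        in.mark(1);            // we will read at most one byte
        int first = in.read();
        in.reset();            // rewind so the caller sees the byte again
        return first == -1;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(isEmpty(new ByteArrayInputStream(new byte[0])));
        InputStream oneByte = new ByteArrayInputStream(new byte[] {7});
        System.out.println(isEmpty(oneByte));
        System.out.println(oneByte.read()); // the byte is still there
    }
}
```

A stream that does not support marking can be wrapped in a BufferedInputStream first, which does.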
EDIT:
By the way, ByteArrayInputStreams support mark() and reset() quite nicely and they are by default marked at position 0. This code:
InputStream x = new ByteArrayInputStream(new String("1234567890").getBytes());
byte b[] = new byte[1];
x.read(b, 0 , 1);
System.out.println(b[0]);
x.read(b, 0 , 1);
System.out.println(b[0]);
x.reset();
x.read(b, 0 , 1);
System.out.println(b[0]);
has this output:
49
50
49
