How do I read more than one byte with BufferedInputStream - java

This is an exact quote from my text:
The purpose of I/O buffering is to improve system performance.
Rather than reading a byte at a time, a large number of bytes are read together
the first time the read() method is invoked.
However, when I use BufferedInputStream.read() all I can do is get a single byte. What am I doing wrong and what do I need to do?

It's not you; it is the stream that reads more than one byte at a time. The BufferedInputStream keeps an internal buffer, so the next time you call read() the next byte is returned from that buffer without any access to the physical drive (unless the buffer is empty and the next chunk of data has to be read into it).
Note that there are methods that read more than one byte per call, such as read(byte[]), but they are unrelated to the buffering behaviour you asked about.
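To see the buffering at work, here is a minimal sketch. CountingInputStream is a hypothetical helper written for this illustration (it is not part of the JDK), and a ByteArrayInputStream stands in for the file or socket. It counts how often the underlying stream is actually asked for data while the caller reads one byte at a time:

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

class BufferingDemo {
    /** Hypothetical helper: counts how often the wrapped stream is asked for data. */
    static class CountingInputStream extends FilterInputStream {
        int calls = 0;

        CountingInputStream(InputStream in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            calls++;
            return super.read();
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            calls++;
            return super.read(b, off, len);
        }
    }

    /**
     * Reads everything one byte at a time through a BufferedInputStream.
     * Returns {bytes read, calls made to the underlying stream}.
     */
    static int[] readByteByByte(int dataSize, int bufferSize) throws IOException {
        CountingInputStream source =
                new CountingInputStream(new ByteArrayInputStream(new byte[dataSize]));
        int bytes = 0;
        try (InputStream in = new BufferedInputStream(source, bufferSize)) {
            while (in.read() != -1) {
                bytes++;
            }
        }
        return new int[] { bytes, source.calls };
    }

    public static void main(String[] args) throws IOException {
        int[] result = readByteByByte(4096, 1024);
        System.out.println("bytes read: " + result[0]
                + ", calls to underlying stream: " + result[1]);
    }
}
```

The loop calls read() 4096 times, yet the underlying stream should only be hit a handful of times, once per buffer refill plus the final end-of-stream check.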

The BufferedInputStream class adds buffering to your input streams. Rather than reading one byte at a time from the network or disk, it reads a larger block at a time.
You can set the size of the buffer used internally by the BufferedInputStream with the following constructor:
InputStream input = new BufferedInputStream(new FileInputStream("PathOfFile"), 2 * 1024);
This example sets the buffer size to 2 KB.
When the BufferedInputStream is created, an internal buffer array is created. As bytes from the stream are read or skipped, the internal buffer is refilled as necessary from the contained input stream, many bytes at a time.
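If you do want several bytes per call, the multi-byte read(byte[], int, int) works on a BufferedInputStream just as on any other InputStream. A rough sketch, assuming an in-memory source in place of a real file; the 512-byte chunk size and the readAll helper are arbitrary choices for illustration:

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

class ChunkedReadDemo {
    /** Copies everything from the stream using the multi-byte read. */
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[512];
        int n;
        // read returns how many bytes it stored in chunk, or -1 at end of stream
        while ((n = in.read(chunk, 0, chunk.length)) != -1) {
            out.write(chunk, 0, n);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "hello, buffered world".getBytes(StandardCharsets.UTF_8);
        // 2 KB internal buffer, as in the constructor example above
        try (InputStream in = new BufferedInputStream(new ByteArrayInputStream(data), 2 * 1024)) {
            System.out.println(new String(readAll(in), StandardCharsets.UTF_8));
        }
    }
}
```

Always use the returned count rather than assuming the array was filled completely.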

Related

FileInputStream read byte by byte or block?

The reason given in "Why is using BufferedInputStream to read a file byte by byte faster than using FileInputStream?" for why BufferedInputStream (BIS) is faster than FileInputStream (FIS) is:
With a BufferedInputStream, the method delegates to an overloaded
read() method that reads 8192 amount of bytes and buffers them until
they are needed while FIS read the single byte
As I understand it, a disk is a 'block device': it always reads and writes entire blocks, even if the read request is for a smaller amount of data.
Isn't that so? If it is, both FIS and BIS will be reading a complete block rather than a single byte (as stated for FIS), right? So how is BIS faster than FIS?
The Java API of InputStream is what it is. Specifically, it has this method:
int read() throws IOException
which reads a single byte (it returns an int, so that it can return -1 to indicate EOF).
So, if you try to read a SINGLE BYTE from a file, it'll try to do that. In the case of a block device like a harddisk, that'll likely read the entire block, and then chuck everything except that one byte, so, if you call that read() method 8192 times, it reads the same block, over and over, 8192 times, each time chucking away 8191 bytes and giving you just the one you want. Thus, reading 67 million bytes in the entire process. Ouch. Not very efficient.
Given that the kernel, CPU, disk, etc all read in a block size of 8192, there is zero performance difference between a BufferedInputStream(new FileInputStream) and just the new FileInputStream, IF you use something like:
byte[] buffer = new byte[8192];
int n = in.read(buffer); // n is the number of bytes actually read, or -1 at EOF
Now even plain jane unbuffered new FileInputStream just ends up reading that block off of disk just once.
BufferedInputStream does that 'under the hood' even if you use the single-byte form of read(), and will then feed you data from that byte array for the next 8191 calls to read(). That's all BufferedInputStream does.
If you are using the read() (one byte at a time) variant (or the byte-array variant of read, but with really small byte arrays), then BufferedInputStream makes sense. Otherwise, that does nothing and there is no need to put that in there.
NB: As far as I know, Java makes no guesses about what the disk block size is and just uses some reasonable buffer size. The effect is the same: if you read a single byte at a time, wrapping your file stream in a buffered stream improves performance dramatically (by a factor of 1000 or more); if you use the byte-array variant, it makes no difference whatsoever.
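For reference, the byte-array variant on a plain FileInputStream looks roughly like this. It is only a sketch: countBytes is a hypothetical helper, the file is a throwaway temp file, and 8192 is the block size assumed above, not something Java detects:

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

class BlockReadDemo {
    /** Counts the bytes in a file using an unbuffered FileInputStream and an 8192-byte array. */
    static long countBytes(Path file) throws IOException {
        long total = 0;
        byte[] buffer = new byte[8192];
        try (InputStream in = new FileInputStream(file.toFile())) {
            int n;
            // each call fetches up to 8192 bytes in one go; no BufferedInputStream needed
            while ((n = in.read(buffer)) != -1) {
                total += n;
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("block-read-demo", ".bin");
        try {
            Files.write(tmp, new byte[20_000]);
            System.out.println("bytes counted: " + countBytes(tmp));
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

Wrapping the FileInputStream in a BufferedInputStream here would only add a second copy through another buffer of similar size.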

Java Bytebuffer can only read sequentially?

I am mapping a file to memory and reading it back with Java's ByteBuffer. This proves to be a really fast way of reading large files. However, I can only read the values sequentially: once I read them with buffer.getInt(), the buffer position moves on to the next bytes. So if I want to use a value more than once, I have to store it in another variable:
int a = buffer.getInt();
I am noticing that this approach of copying a piece of memory to another is taking a long time (especially with a very large file) compared to just reading bytes. Is there a way I can re-read those bytes instead of copying them?
Just use position(int) to seek in ByteBuffer. Then you can read from anywhere.
ByteBuffer buffer=ByteBuffer.allocate(1000);
byte[] data=new byte[10];
buffer.position(100);
// read 10 bytes starting at position 100
buffer.get(data);
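Besides position(int), ByteBuffer also has absolute getters such as getInt(int index), which read at a given byte offset without moving the position at all. A small sketch, using a heap buffer instead of a mapped file for brevity (MappedByteBuffer inherits the same methods); sumTwice is a made-up helper for illustration:

```java
import java.nio.ByteBuffer;

class AbsoluteGetDemo {
    /** Reads the int at byte offset index twice; the buffer's position is left untouched. */
    static int sumTwice(ByteBuffer buffer, int index) {
        int first = buffer.getInt(index);   // absolute get: does not move the position
        int second = buffer.getInt(index);  // reads the same four bytes again
        return first + second;
    }

    public static void main(String[] args) {
        ByteBuffer buffer = ByteBuffer.allocate(16);
        buffer.putInt(10).putInt(20).putInt(30).putInt(40);
        buffer.flip();

        System.out.println(sumTwice(buffer, 4));  // prints 40 (20 + 20)
        System.out.println(buffer.position());    // prints 0
    }
}
```

If you prefer relative reads, mark() and reset() (or rewind()) let you go back and read the same bytes again.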

Does java socket read the data exactly as it's sent

Suppose we have a socket connection between two devices (A and B). On side A, I write only 16 bytes (the size doesn't matter here) to the socket's output stream (not a BufferedOutputStream) three times, or in general more than once, like this:
OutputStream outputStream = socket.getOutputStream();
byte[] buffer = new byte[16];
outputStream.write(buffer);
outputStream.write(buffer);
outputStream.write(buffer);
On side B, I read the data using the socket's input stream (not a BufferedInputStream) with a buffer larger than the sending buffer, for example 1024 bytes:
InputStream inputStream = socket.getInputStream();
byte[] buffer = new byte[1024];
int read = inputStream.read(buffer);
Now I wonder how the data is received on side B. Can it accumulate, or is it read exactly as A sent it? In other words, can the read variable be greater than 16?
InputStream makes very few guarantees about how much data will be read by any invocation of the multi-byte read() methods. There is a whole class of common programming errors that revolve around misunderstandings and wrong assumptions about that. For example,
if InputStream.read(byte[]) reads fewer bytes than the provided array can hold, that does not imply that the end of the stream has been reached, or even that another read will necessarily block.
the number of bytes read by any one invocation of InputStream.read(byte[]) does not necessarily correlate to any characteristic of the byte source on which the stream draws, except that it will always read at least one byte when not at the end of the stream, and that it will not read more bytes than are actually available by the time it returns.
the number of available bytes indicated by the available() method does not reliably indicate how many bytes a subsequent read should or will read. A nonzero return value reliably indicates only that the next read will not block; a zero return value tells you nothing at all.
Subclasses may make guarantees about some of those behaviors, but most do not, and anyway you often do not know which subclass you actually have.
In order to use InputStreams properly, you generally must be prepared to perform repeated reads until you get sufficient data to process. That can mean reading until you have accumulated a specific number of bytes, or reading until a delimiter is encountered. In some cases you can handle any number of bytes from any given read; generally these are cases where you are looping anyway, and feeding everything you read to a consumer that can accept variable length chunks (many compression and encryption interfaces are like that, for example).
Per the docs:
public int read(byte[] b) throws IOException
Reads some number of bytes from the input stream and stores them into the buffer array b. The number of bytes
actually read is returned as an integer. This method blocks until
input data is available, end of file is detected, or an exception is
thrown. If the length of b is zero, then no bytes are read and 0 is
returned; otherwise, there is an attempt to read at least one byte. If
no byte is available because the stream is at the end of the file, the
value -1 is returned; otherwise, at least one byte is read and stored
into b.
The first byte read is stored into element b[0], the next one into
b[1], and so on. The number of bytes read is, at most, equal to the
length of b. Let k be the number of bytes actually read; these bytes
will be stored in elements b[0] through b[k-1], leaving elements b[k]
through b[b.length-1] unaffected.
read(...) tells you how many bytes it put into the array, and yes, you can keep reading; each call gives you whatever has already arrived.
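The "repeat until you have enough" pattern can be sketched like this. readExactly and trickle are hypothetical helpers written for this example; trickle only simulates a source that hands out data in small pieces, the way a socket might. The JDK offers similar functionality in DataInputStream.readFully and, since Java 9, InputStream.readNBytes.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

class ReadFullyDemo {
    /** Keeps calling read until exactly length bytes have arrived, or fails if the stream ends. */
    static byte[] readExactly(InputStream in, int length) throws IOException {
        byte[] result = new byte[length];
        int filled = 0;
        while (filled < length) {
            int n = in.read(result, filled, length - filled);
            if (n == -1) {
                throw new EOFException("stream ended after " + filled + " of " + length + " bytes");
            }
            filled += n;
        }
        return result;
    }

    /** Hypothetical test source: hands out at most max bytes per read call. */
    static InputStream trickle(byte[] data, int max) {
        return new ByteArrayInputStream(data) {
            @Override
            public synchronized int read(byte[] b, int off, int len) {
                return super.read(b, off, Math.min(len, max));
            }
        };
    }

    public static void main(String[] args) throws IOException {
        byte[] sent = new byte[48];               // three 16-byte writes' worth of data
        InputStream in = trickle(sent, 16);
        byte[] received = readExactly(in, 48);    // loops internally until all 48 are in
        System.out.println("received " + received.length + " bytes");
    }
}
```

The caller decides how many bytes make up a message; the stream itself carries no record of how the sender split its writes.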

Can InputStream.read overflow buffer

Does the read command check the size of the buffer when filling it with data, or is there a chance that data is lost because the buffer isn't big enough? In other words, if there are ten bytes of data available to be read, will the server continue to store the remaining 2 bytes of data until the next read?
I'm just using 8 as an example here to over-dramatise the situation.
InputStream stdout;
...
while(condition)
{
...
byte[] buffer = new byte[8];
int len = stdout.read(buffer);
}
No, read() won't lose any data just because you haven't given it enough space for all the available bytes.
It's not clear what you mean by "the server" here, but the final two bytes of a 10-byte message would still be available after the first read. (Or possibly the first read() would only read the first six bytes, leaving four still to read, for example.)
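A quick way to convince yourself, as a sketch: drain is a hypothetical helper, and a ByteArrayInputStream holding ten bytes stands in for the process output from the question. How the ten bytes are split across reads depends on the stream, but none of them go missing:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

class SmallBufferDemo {
    /** Reads the whole stream through a deliberately tiny 8-byte buffer. */
    static byte[] drain(InputStream in) throws IOException {
        ByteArrayOutputStream collected = new ByteArrayOutputStream();
        byte[] buffer = new byte[8];
        int len;
        while ((len = in.read(buffer)) != -1) {
            // only the first len bytes of buffer are valid for this round
            collected.write(buffer, 0, len);
        }
        return collected.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] tenBytes = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
        byte[] result = drain(new ByteArrayInputStream(tenBytes));
        System.out.println("collected " + result.length + " bytes");
    }
}
```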

What if we exceed the capacity of a buffer allocated with ByteBuffer.allocate(48) in Java NIO

RandomAccessFile file = new RandomAccessFile("xanadu.txt", "rw");
FileChannel channel = file.getChannel();
ByteBuffer buffer = ByteBuffer.allocate(48);
int byteReads = channel.read(buffer);
So I am allocating a capacity of 48 bytes for the buffer. Now
suppose the text file I am reading is about 10 MB, so logically it exceeds the allocated buffer size.
But when we try to read, we are able to read all the contents of the file regardless of its size.
So how is this possible?
I am new to streaming, so my question may seem very basic.
The read call simply won't read more than 48 bytes.
Nothing will overflow, you'll just need to call read repeatedly until you've read all the data you're expecting.
This is stated in the ReadableByteChannel interface docs:
Reads a sequence of bytes from this channel into the given buffer.
An attempt is made to read up to r bytes from the channel, where r is the number of bytes remaining in the buffer, that is, dst.remaining(), at the moment this method is invoked.
You need to clear() the buffer after processing its content before passing it back to read.
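Put together, the loop might look like the following sketch. readWholeFile is a hypothetical helper, the file is a throwaway temp file instead of xanadu.txt, and "processing" is reduced to counting bytes:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class ChannelReadDemo {
    /** Reads a whole file through a 48-byte buffer and returns the number of bytes seen. */
    static long readWholeFile(Path file) throws IOException {
        long total = 0;
        ByteBuffer buffer = ByteBuffer.allocate(48);
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            // each read stores at most buffer.remaining() bytes, so at most 48 here
            while (channel.read(buffer) != -1) {
                buffer.flip();                // switch from filling to draining
                total += buffer.remaining();  // process the chunk (here: just count it)
                buffer.clear();               // make the full capacity available again
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("channel-read-demo", ".txt");
        try {
            Files.write(tmp, new byte[1000]);
            System.out.println("bytes read: " + readWholeFile(tmp));
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

Without the clear() call the buffer would stay full, read would return 0 forever, and the loop would never reach the end of the file.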
