Why do the read() methods differ in the total number of bytes they report?
For example,
int n = System.in.read();
System.out.println("The total bytes are:"+System.in.available());
And in another place we use
byte [] in= new byte[30];
int n = System.in.read(in);
System.out.println("The total bytes are:"+System.in.available());
The word "Java" was entered in both cases.
The output of the first version is:
the total bytes are 5
while the output of the second version is:
the total bytes are 6
What is the difference in the bytes reported between these two methods?
As the Javadoc says of the available() method, it: "Returns an estimate of the number of bytes that can be read (or skipped over) from this input stream without blocking by the next invocation of a method for this input stream."
The exact way a stream determines this count is not strictly defined. In the case of System.in it may use the number of bytes currently available in its internal buffer, or it may delegate the call to the underlying input stream which may be implementation dependent (e.g. by operating system). The only thing you can really determine from the returned value is the number of bytes you should be able to safely read without blocking.
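System.in.read() reads a single byte from the standard input stream and returns it as an int.
byte [] in= new byte[30];
int n = System.in.read(in);
This reads bytes from the standard input stream into your in array and returns the number of bytes it stored there.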
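To make the difference between the two return values concrete, here is a small sketch. It uses a ByteArrayInputStream as a stand-in for System.in (an assumption, chosen so the result does not depend on the console or operating system); the class and method names are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

class ReadVariants {
    // Returns {value returned by read(), available() afterwards}.
    static int[] singleByteRead(InputStream in) throws IOException {
        int value = in.read(); // one byte as an int in 0..255, or -1 at end of stream
        return new int[] { value, in.available() };
    }

    // Returns {count returned by read(byte[]), available() afterwards}.
    static int[] arrayRead(InputStream in) throws IOException {
        byte[] buf = new byte[30];
        int count = in.read(buf); // number of bytes stored into buf, or -1 at end of stream
        return new int[] { count, in.available() };
    }

    public static void main(String[] args) throws IOException {
        // "Java" plus a newline: 5 bytes in US-ASCII.
        byte[] data = "Java\n".getBytes(StandardCharsets.US_ASCII);
        int[] a = singleByteRead(new ByteArrayInputStream(data));
        int[] b = arrayRead(new ByteArrayInputStream(data));
        System.out.println("read():       value=" + a[0] + ", available=" + a[1]);
        System.out.println("read(byte[]): count=" + b[0] + ", available=" + b[1]);
    }
}
```

With an in-memory stream, read() consumes one byte and read(byte[]) consumes as many as fit, so available() differs afterwards. With the real System.in the numbers may also depend on the platform's line terminator and buffering.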
I have a few questions about the way InputStream.read() works. I'm trying to listen on InputStream and whenever bytes are available, I need them to be copied to a byte array and then the byte array is handed off to another method for further processing and then return to listening to the InputStream for next set of bytes. Here's how far I've gotten:
while(true){
while((i = inputstream.read(recvBuffer, 0, recvBuffer.length)) != -1){
doSomething(recvBuffer);
}
}
I read that read(byte[] b, int off, int len) method returns an int that is supposed to represent the number of bytes read from the stream. Doesn't that mean whenever bytes are available, it sets the value of i to the respective number of bytes? Once it reads x number of bytes and reaches the end of the stream, wouldn't it return a positive integer representing the number of bytes, instead of -1? Then, when would the check against -1 happen for i? I know I'm interpreting this wrong but I can't say how. Any help understanding this would be appreciated
Also, I know the max amount of bytes that a sender can push onto a stream. In this case, is it sufficient to specify the size of recvBuffer as the max amount of bytes or is it prudent to allocate a bit more than that?
doSomething(recvBuffer);
That should be
doSomething(recvBuffer, i);
The method needs to know how many bytes were actually received.
I read that read(byte[] b, int off, int len) method returns an int that is supposed to represent the number of bytes read from the stream.
Correct.
Doesn't that mean whenever bytes are available, it sets the value of
i to the respective number of bytes?
Yes.
Once it reads x number of bytes
and reaches the end of the stream, wouldn't it return a positive
integer representing the number of bytes, instead of -1?
No. It transfers the bytes and returns the count. On the next call there are no more bytes, only end of stream, so that call returns -1.
Then, when would the check against -1 happen for i? I know I'm interpreting this
wrong but I can't say how.
See above.
Also, I know the max amount of bytes that a sender can push onto a stream. In this case, is it sufficient to specify the size of recvBuffer as the max amount of bytes or is it prudent to allocate a bit more than that?
Most people use 4096 or 8192 bytes. There's not a lot of point in specifying a buffer larger than the path MTU in truth, which is normally < 1500, unless you are slow at reading so that the kernel socket receive buffer fills up.
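Putting the points above together, the loop could look roughly like the sketch below. The doSomething(byte[], int) signature and the class name are assumptions for illustration; the outer while(true) from the question is left out because once read returns -1 it will keep returning -1.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

class ReadLoop {
    // Hypothetical consumer: only the first 'length' bytes of 'buffer' are valid.
    static void doSomething(byte[] buffer, int length) {
        // process buffer[0] .. buffer[length - 1] here
    }

    // Reads until end of stream; returns the total number of bytes handed to doSomething.
    static long drain(InputStream in) throws IOException {
        byte[] recvBuffer = new byte[8192];
        long total = 0;
        int n;
        // read returns a positive count while data arrives, then -1 once the stream has ended
        while ((n = in.read(recvBuffer, 0, recvBuffer.length)) != -1) {
            doSomething(recvBuffer, n);
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        long total = drain(new ByteArrayInputStream(new byte[20000]));
        System.out.println("bytes processed: " + total);
    }
}
```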
Suppose we have a socket connection between two devices (A and B). On side A, I write only 16 bytes (the size doesn't matter here) to the socket's output stream (not a BufferedOutputStream) three times, or in general more than once, like this:
OutputStream outputStream = socket.getOutputStream();
byte[] buffer = new byte[16];
outputStream.write(buffer);
outputStream.write(buffer);
outputStream.write(buffer);
On side B, I read the data using the socket's input stream (not a BufferedInputStream) with a buffer larger than the sending buffer, for example 1024 bytes:
InputStream inputStream = socket.getInputStream();
byte[] buffer = new byte[1024];
int read = inputStream.read(buffer);
Now I wonder how the data is received on side B. Can it accumulate, or is it read exactly as A sends it? In other words, can the read variable be greater than 16?
InputStream makes very few guarantees about how much data will be read by any invocation of the multi-byte read() methods. There is a whole class of common programming errors that revolve around misunderstandings and wrong assumptions about that. For example,
if InputStream.read(byte[]) reads fewer bytes than the provided array can hold, that does not imply that the end of the stream has been reached, or even that another read will necessarily block.
the number of bytes read by any one invocation of InputStream.read(byte[]) does not necessarily correlate to any characteristic of the byte source on which the stream draws, except that it will always read at least one byte when not at the end of the stream, and that it will not read more bytes than are actually available by the time it returns.
the number of available bytes indicated by the available() method does not reliably indicate how many bytes a subsequent read should or will read. A nonzero return value reliably indicates only that the next read will not block; a zero return value tells you nothing at all.
Subclasses may make guarantees about some of those behaviors, but most do not, and anyway you often do not know which subclass you actually have.
In order to use InputStreams properly, you generally must be prepared to perform repeated reads until you get sufficient data to process. That can mean reading until you have accumulated a specific number of bytes, or reading until a delimiter is encountered. In some cases you can handle any number of bytes from any given read; generally these are cases where you are looping anyway, and feeding everything you read to a consumer that can accept variable length chunks (many compression and encryption interfaces are like that, for example).
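As a rough illustration of "reading until you have accumulated a specific number of bytes", here is a sketch of such a loop. The names readExactly and atMost16PerRead are invented for this example, and the helper stream only imitates a source that delivers data in small pieces; DataInputStream.readFully does a similar job in the standard library.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

class ReadExactly {
    // Keeps calling read until exactly 'length' bytes have been collected.
    static byte[] readExactly(InputStream in, int length) throws IOException {
        byte[] result = new byte[length];
        int filled = 0;
        while (filled < length) {
            int n = in.read(result, filled, length - filled);
            if (n == -1) {
                throw new EOFException("stream ended after " + filled + " of " + length + " bytes");
            }
            filled += n; // a single read may deliver fewer bytes than requested
        }
        return result;
    }

    // Test helper: a stream that never returns more than 16 bytes per read call.
    static InputStream atMost16PerRead(InputStream source) {
        return new FilterInputStream(source) {
            @Override
            public int read(byte[] b, int off, int len) throws IOException {
                return super.read(b, off, Math.min(len, 16));
            }
        };
    }

    public static void main(String[] args) throws IOException {
        InputStream in = atMost16PerRead(new ByteArrayInputStream(new byte[48]));
        byte[] message = readExactly(in, 48);
        System.out.println("collected " + message.length + " bytes");
    }
}
```

The caller never needs to know how the sender chunked its writes; it only decides how many bytes make up one unit of work.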
Per the docs:
public int read(byte[] b) throws IOException
Reads some number of bytes from the input stream and stores them into the buffer array b. The number of bytes
actually read is returned as an integer. This method blocks until
input data is available, end of file is detected, or an exception is
thrown. If the length of b is zero, then no bytes are read and 0 is
returned; otherwise, there is an attempt to read at least one byte. If
no byte is available because the stream is at the end of the file, the
value -1 is returned; otherwise, at least one byte is read and stored
into b.
The first byte read is stored into element b[0], the next one into
b[1], and so on. The number of bytes read is, at most, equal to the
length of b. Let k be the number of bytes actually read; these bytes
will be stored in elements b[0] through b[k-1], leaving elements b[k]
through b[b.length-1] unaffected.
read(...) tells you how many bytes it put into the array. And yes, you can look at the array elements beyond that count; you'll just get whatever was already there.
This is an exact quote from my text:
The purpose of I/O buffering is to improve system performance.
Rather than reading a byte at a time, a large number of bytes are read together
the first time the read() method is invoked.
However, when I use BufferedInputStream.read() all I can do is get a single byte. What am I doing wrong and what do I need to do?
It's not you, it is the stream that reads more than one character at a time. The BufferedInputStream keeps a buffer, and next time you call read() the next byte from that buffer is returned without any access to a physical drive (unless the buffer is empty and the next chunk of data has to be read into the buffer).
Note there are methods that read more than a byte, but these don't really have to do with the difference you explicitly asked for in your question.
The BufferedInputStream class facilitates buffering to your input streams. Rather than read one byte at a time from the network or disk, you read a larger block at a time.
You can set the buffer size to be used internally by the BufferedInputStream with the following constructor
InputStream input = new BufferedInputStream(new FileInputStream("PathOfFile"), 2 * 1024);
This example sets the buffer size to 2 KB.
When the BufferedInputStream is created, an internal buffer array is created. As bytes from the stream are read or skipped, the internal buffer is refilled as necessary from the contained input stream, many bytes at a time
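One way to see the buffering at work is to count how often the wrapped stream is actually asked for data. The sketch below does that with a small counting wrapper; CountingInputStream and readByteByByte are names made up for this example, and an in-memory stream stands in for a file or socket.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

class BufferingDemo {
    // Wrapper that counts how often the wrapped stream is asked for data.
    static class CountingInputStream extends FilterInputStream {
        int calls = 0;

        CountingInputStream(InputStream in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            calls++;
            return super.read();
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            calls++;
            return super.read(b, off, len);
        }
    }

    // Reads every byte one at a time through a BufferedInputStream.
    // Returns {bytes read by the caller, read calls that reached the wrapped stream}.
    static int[] readByteByByte(int size, int bufferSize) throws IOException {
        CountingInputStream counting = new CountingInputStream(new ByteArrayInputStream(new byte[size]));
        int bytes = 0;
        try (InputStream in = new BufferedInputStream(counting, bufferSize)) {
            while (in.read() != -1) {
                bytes++;
            }
        }
        return new int[] { bytes, counting.calls };
    }

    public static void main(String[] args) throws IOException {
        int[] result = readByteByByte(10_000, 2 * 1024);
        System.out.println("caller read " + result[0] + " bytes using "
                + result[1] + " reads on the underlying stream");
    }
}
```

The caller still gets one byte per read() call, but the underlying stream is touched far less often, which is where the performance gain comes from.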
I have a binary file in a custom format that I have written using DataOutputStream.
The simplified format of the data in the file is: IntCharIntCharIntChar...IntChar
I am using DataInputStream to read from this file, and available() to determine whether or not the next read will be successful.
Everything works fine for small files. However, for big files, with filesize bigger than Integer.MAX_VALUE bytes, the available() call returns strange negative values after the first read. The file I am trying to read is 4751054632 bytes (about 4.8 gig).
simplified test code:
DataInputStream reader=new DataInputStream(new BufferedInputStream(new FileInputStream("/path/file")));
System.out.println("available at start:\t" + reader.available());
while(reader.available()>0){
int a=reader.readInt();
System.out.println("available after readInt:\t" + reader.available());
char b=reader.readChar();
System.out.println("available after readChar:\t" + reader.available());
//do something
}
output:
available at start: 2147483647 //this is equal to Integer.MAX_VALUE
available after readInt: -2147475461
available after readChar: -2147475463
Instead of using available() I could just execute the readInt() and readChar() commands in a try block and catch the exception when the file is finished, but I am trying to understand why this behaviour is happening. Essentially I am looking for a method that will return true if there is data available to read and false if the file is finished/ the stream has ended. I thought available()>0 would do exactly that but I guess not?
I am using DataInputStream to read from this file, and available() to determine whether or not the next read will be successful.
Then you're doing it wrong. See the Javadoc: "Returns an estimate of the number of bytes that can be read (or skipped over) from this input stream without blocking by the next invocation of a method for this input stream." Many implementations return zero, and the ones that don't return zero don't guarantee to return a positive number: they can't, when the number concerned exceeds Integer.MAX_VALUE.
Also, the size of the file could change between available() and read().
You should detect end of stream by catching EOFException on the methods that throw it, or by checking for -1 or null from the methods that don't (i.e. all overloads of read(), and readLine(), respectively).
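For the Int/Char record format from the question, catching EOFException might look like the sketch below. It writes a small sample into memory instead of using the real file; writePairs and countPairs are names invented for this example.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

class EofDetection {
    // Builds sample data in the IntCharIntChar... layout (6 bytes per pair).
    static byte[] writePairs(int count) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            for (int i = 0; i < count; i++) {
                out.writeInt(i);
                out.writeChar('a' + (i % 26));
            }
        }
        return bytes.toByteArray();
    }

    // Reads pairs until the stream ends; never consults available().
    static int countPairs(InputStream source) throws IOException {
        int pairs = 0;
        try (DataInputStream reader = new DataInputStream(new BufferedInputStream(source))) {
            while (true) {
                int a = reader.readInt();    // throws EOFException at end of stream
                char b = reader.readChar();
                pairs++;                     // do something with a and b here
            }
        } catch (EOFException endOfStream) {
            // Expected way out of the loop. Note that a file truncated in the
            // middle of a pair would end up here as well.
        }
        return pairs;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = writePairs(1000);
        System.out.println("pairs read: " + countPairs(new ByteArrayInputStream(data)));
    }
}
```

Because nothing here depends on an int-sized byte count, the same loop should behave the same way for files larger than Integer.MAX_VALUE bytes.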
According to Java specification, the size of a char data type is 16 bits or two bytes.
So, I have written the following code:
private static final int BUFFER_SIZE = 1024;
char[] buffer = new char[BUFFER_SIZE];
BufferedReader br = new BufferedReader(new InputStreamReader(conn.getInputStream()));
int byteFromStream;
long totalBytesLoaded = 0;
while (true) {
    byteFromStream = br.read(buffer); // number of chars read, or -1 at end of stream
    if (byteFromStream == -1) break;
    totalBytesLoaded = totalBytesLoaded + byteFromStream * 2;
}
But for some strange reason I am counting more bytes than are available on the stream. According to the specification, read() returns the number of characters actually read from the stream.
Oh, I am getting total stream size by
bytesTotal=conn.getContentLength();
Which is working pretty fine as I myself uploaded files on the server and I know their sizes.
The method returns the number of characters read. That value should not be multiplied by 2, because you cannot make a general assumption about how many bytes a character from a stream occupies.
The number of bytes per character depends on the character encoding (it can be 1 byte, for example). The reader knows the encoding and only tells you the number of characters it read.
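A quick way to convince yourself is to encode the same text with two charsets and compare the byte count with what the reader reports. This is only a sketch with an in-memory stream instead of the connection; countChars is a made-up helper name.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class CharsVersusBytes {
    // Decodes 'encoded' with 'charset' and returns the total number of chars the reader delivered.
    static int countChars(byte[] encoded, Charset charset) throws IOException {
        char[] buffer = new char[1024];
        int total = 0;
        try (Reader reader = new BufferedReader(
                new InputStreamReader(new ByteArrayInputStream(encoded), charset))) {
            int n;
            while ((n = reader.read(buffer)) != -1) {
                total += n; // n counts chars, not bytes
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        String text = "h\u00e9llo"; // 5 characters, one of them non-ASCII
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        byte[] latin1 = text.getBytes(StandardCharsets.ISO_8859_1);
        System.out.println("UTF-8:      " + utf8.length + " bytes, "
                + countChars(utf8, StandardCharsets.UTF_8) + " chars");
        System.out.println("ISO-8859-1: " + latin1.length + " bytes, "
                + countChars(latin1, StandardCharsets.ISO_8859_1) + " chars");
    }
}
```

If the goal is to compare progress against getContentLength(), counting bytes on the InputStream itself (before it is wrapped in a reader) is probably the more reliable approach.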