According to the API, the write(int b) method takes an int representing a byte, so that -1 can be returned when EOF comes.
However, it is possible to do the following:
FileOutputStream fi = new FileOutputStream(file);
fi.write(100000);
I expected this not to compile, as the number exceeds the byte range.
How does the JVM interpret it exactly?
Thanks in advance.
From the OutputStream.write(int) doc:
Writes the specified byte to this output stream. The general contract for write is that one byte is written to the output stream. The byte to be written is the eight low-order bits of the argument b. The 24 high-order bits of b are ignored.
Emphasis mine.
Note that the method takes an int, and since 100000 is a valid int literal, there is no reason for it not to compile.
Where did you read that part about EOF and -1?
The method just writes one byte, which for some reason is passed along as an int.
Writes the specified byte to this output stream. The general contract for write is that one byte is written to the output stream. The byte to be written is the eight low-order bits of the argument b. The 24 high-order bits of b are ignored.
I expected to not compile as the number exceeds the byte range
No, this will compile fine. The compiler just looks for an int (a long would not compile).
Everything except the lowest 8 bits will be ignored.
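To see what actually reaches the stream, you can write 100000 to a ByteArrayOutputStream instead of a file (a minimal sketch; LowByteDemo and lowByte are made-up names for illustration):

```java
import java.io.ByteArrayOutputStream;

public class LowByteDemo {
    // mirrors what happens on write(int): only the low 8 bits are kept
    static int lowByte(int value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(value);                    // OutputStream.write(int) ignores the high 24 bits
        return out.toByteArray()[0] & 0xFF;  // read the stored byte back as 0-255
    }

    public static void main(String[] args) {
        System.out.println(lowByte(100000)); // 160, i.e. 100000 % 256
    }
}
```

So fi.write(100000) silently writes the single byte 0xA0 (160).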
Related
I am currently going through a Java I/O tutorial and having a hard time understanding the read() method of the FileInputStream class. I know from the documentation that the read() method reads a "byte" of data from the stream and returns an integer representing the byte (between 0 and 255), or -1 if it reaches the end of the file.
A byte in Java has a range between -128 and 127, so how come, when I edit xanadu.txt and add the ASCII symbol "ƒ" (which has a decimal value of 131), Java does not complain by throwing an error that the value 131 is out of the range defined by byte (-128 to 127)? When I try to test this using literals I get two different results.
The following works:
byte b = 120;
int c = b;
System.out.println((char)c);
Output: x
But this does NOT work (even though it works when added to xanadu.txt):
byte b = 131;
int c = b;
System.out.println((char)c);
Output: error: incompatible types: possible lossy conversion from int to byte
byte b = 131;
I tried explicitly casting to byte (how is this possible?):
byte b = (byte)131;
int c = b;
System.out.println((char)c);
Output: テ
I am a total newbie when it comes to I/O streams; somebody please help me understand it.
EDIT: Turns out my knowledge of type casting was lacking, specifically in understanding the difference between "widening" and "narrowing". Reading up more on these concepts helped me understand why the explicit (aka narrowing) cast works.
Allow me to explain. Look at the 3rd code block, where I am explicitly casting the literal 131 to type byte. If we convert the literal 131 into its binary form as a 32-bit signed 2's-complement integer, we get 00000000 00000000 00000000 10000011, which is 32 bits or 4 bytes. Recall that the Java data type byte can only hold an 8-bit signed 2's-complement integer, so 131 is out of range and thus we get the error "possible lossy conversion from int to byte". But when we explicitly cast it to byte, we are 'chopping off', or, to use the correct term, 'narrowing' the binary down to an 8-bit integer. The resulting binary is 10000011, which is -125 in decimal. Since -125 is in the range -128 to 127, byte has no issue accepting and storing it.

Now when I try to store the value of the byte in int c, implicit or "widening" casting takes place, where -125 in its 8-bit binary form 10000011 is converted into the equivalent -125 in 32-bit binary form 11111111 11111111 11111111 10000011. Finally, System.out is trying to output the value of (char)c, which is another explicit or "narrowing" cast, shrinking from a 32-bit signed to a 16-bit unsigned value. When the casting is complete, we get 11111111 10000011 in binary form. When Java converts this binary into character form, it returns テ.
I can conclude by saying that it helps to convert everything into binary form and go from there. But make sure you understand encoding and 2's complement.
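The whole chain of conversions described above can be reproduced in a few lines (a minimal sketch; whether テ actually appears depends on your console's encoding):

```java
public class NarrowingDemo {
    public static void main(String[] args) {
        byte b = (byte) 131;   // narrowing: keeps the low 8 bits 1000_0011 -> -125
        int c = b;             // widening: sign-extends, value is still -125
        char ch = (char) c;    // narrowing: keeps the low 16 bits, 0xFF83

        System.out.println(b);          // -125
        System.out.println((int) ch);   // 65411, i.e. 0xFF83
        System.out.println(ch);         // テ on a UTF-8 capable console
    }
}
```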
I don't know where you got the value 131 from, but as far as I am concerned, LATIN SMALL LETTER F WITH HOOK (ƒ) is not in the original ASCII character set; in extended ASCII (code page 437) it has a decimal value of 159. See here. It is also encoded in UTF-16 (how Java chars are encoded) as the code unit hex 0x192 (decimal 402).
First, ensure that your text files are encoded in extended ASCII, and not UTF-8 (which is the most likely encoding). Then you can use a FileInputStream to read the file, and you will get 159.
Note that 159 is outside the range of the Java byte type. This is fine, because read returns an int. If the text file is encoded in UTF-8, however, ƒ is encoded in 2 bytes, so read will return each of those two bytes on consecutive calls.
Your second code block doesn't work because as you said, byte goes from -128 to 127, so 131 obviously doesn't fit.
Your third code block forces 131 into a byte, which causes overflow, and the value "wraps around" to -125. b and c are both -125. When you cast this to a char it becomes 65411, because the narrowing conversion keeps only the low 16 bits (11111111 10000011), which are then treated as an unsigned value.
The reason why this all works when you use FileInputStream.read instead of doing these conversions yourself, is because read actually returns an int, not a byte. It's just that the int it returns will always be in the range -1 ~ 255. This is why we say "read returns a byte", but its actual return type is int.
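A quick way to see this is to read the byte 131 from an in-memory stream rather than a file (a minimal sketch using ByteArrayInputStream, which behaves like FileInputStream for this purpose):

```java
import java.io.ByteArrayInputStream;

public class ReadRangeDemo {
    public static void main(String[] args) {
        byte[] data = { (byte) 0x83 };   // the byte 131 as it would sit in the file

        int r = new ByteArrayInputStream(data).read();
        System.out.println(r);           // 131, not -125: read() masks the byte with 0xFF
        System.out.println((byte) r);    // -125: narrowing recovers the signed byte value
    }
}
```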
byte b = 131; // this is 8 bits type, but >8 bits value
int c = b; // this is 32 bits type
System.out.println((char)c); // this is 16 bits type
Output: error: incompatible types: possible lossy conversion from int to byte
byte b = 131;
The two's complement encoding of 131 is:
2^7+2^1+2^0
^^^
sign bit
131 won't fit in a signed byte without an overflow in the two's complement representation that is used for signed types. The highest bit (the sign bit) is set, and it gets sign-extended when casting from byte to int.
The Java compiler notices that 131 won't fit properly in a byte, which leads to the error.
The read() method returns an int that represents the next byte of data, while the read(byte[] b) method assigns the byte values read to the array passed as an argument (its int return value is the number of bytes read, not the data itself).
I have made some tests with an image file, trying 2 approaches:
Print the results returned by the read() method until the result is -1 (which means that the end of the file has been reached).
Create an array of bytes, pass it as an argument to the read(byte[] b) method, and print the numbers that have been assigned to that array of bytes.
I have noticed that the results in the two cases are different: in the second case, as the results are of byte type, the numbers were never greater than 127 or less than -128; while in the first case, I found numbers greater than 200, for example.
Should the numbers not be the same in both cases, given that the file is the same and those numbers represent the data of that file?
I also used a FileOutputStream to write the data into another new file, and in both cases the new file had the same bytes and looked the same (as I said, it was an image).
Thank you.
Since Java has only signed data types, read(byte[] b) stores regular signed bytes, i.e. values from -128 to 127. However, read() returns an int so that it can indicate end of stream with -1, and therefore returns unsigned byte values from 0 to 255.
byte b = (byte)in.read(); // Provided that stream has data left
would give you a signed byte matching the values you got in your byte[] b.
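Both views of the same byte can be demonstrated with an in-memory stream (a minimal sketch; the byte 200 stands in for any value above 127 from your image):

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;

public class SignedVsUnsignedRead {
    public static void main(String[] args) throws IOException {
        byte[] data = { (byte) 200 };                 // stored as the signed byte -56

        int unsigned = new ByteArrayInputStream(data).read();
        System.out.println(unsigned);                 // 200: read() returns 0-255

        byte[] buf = new byte[1];
        new ByteArrayInputStream(data).read(buf);
        System.out.println(buf[0]);                   // -56: the raw signed byte

        System.out.println((byte) unsigned);          // -56: the cast reconciles the two views
    }
}
```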
There is a strange restriction in the java.io.DataOutputStream.writeUTF(String str) method, which limits the size of a UTF-8-encoded string to 65535 bytes:
if (utflen > 65535)
throw new UTFDataFormatException(
"encoded string too long: " + utflen + " bytes");
It is strange because:
there is no information about this restriction in the JavaDoc of this method
this restriction can easily be worked around by copying and modifying the internal static int writeUTF(String str, DataOutput out) method of this class
there is no such restriction in the opposite method, java.io.DataInputStream.readUTF().
Given the above, I cannot understand the purpose of such a restriction in the writeUTF method. What have I missed or misunderstood?
The Javadoc of DataOutputStream.writeUTF states:
First, two bytes are written to the output stream as if by the
writeShort method giving the number of bytes to follow. This value
is the number of bytes actually written out, not the length of the
string.
Two bytes means 16 bits: the maximum unsigned integer that can be encoded in 16 bits is 2^16 - 1 == 65535.
DataInputStream.readUTF has the exact same restriction, because it first reads the number of UTF-8 bytes to consume, in the form of a 2-byte integer, which again can only have a maximum value of 65535.
writeUTF first writes two bytes with the length, which has the same result as calling writeShort with the length and then writing the UTF-encoded bytes. writeUTF doesn't actually call writeShort - it builds up a single byte[] with both the 2-byte length and the UTF bytes. But that is why the Javadoc says "as if by the writeShort method" rather than just "by the writeShort method".
Hello, I have the following code:
int i=12345;
DataOutputStream dos=new DataOutputStream(new FileOutputStream("Raw.txt"));
dos.write(i);
dos.close();
System.out.println(new File("Raw.txt").length());
The file size is being reported as 1 byte. Why is it not 4 bytes when an integer is 4 bytes long?
Thanks
Because you only wrote one byte to it. See the Javadoc for DataOutputStream.write(int). It writes a byte, not an int.
While the DataOutputStream.write method takes an int argument, it actually only writes the bottom 8 bits of that argument. So you actually wrote only one byte ... and hence the file is one byte long.
If you want to write the entire int you should use the writeInt(int) method.
The underlying reason for this strangeness is (I believe) that the write(int) method is defined to be consistent with OutputStream.write(int), which in turn is defined to be consistent with InputStream.read(). InputStream.read() reads a byte and returns it as an int ... with the value -1 used to indicate the end-of-stream condition.
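The difference in output size is easy to check against an in-memory stream (a minimal sketch; WriteSizes and its helper methods are made-up names):

```java
import java.io.*;

public class WriteSizes {
    static int sizeAfterWrite(int value) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        new DataOutputStream(out).write(value);     // one byte: only the low 8 bits
        return out.size();
    }

    static int sizeAfterWriteInt(int value) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        new DataOutputStream(out).writeInt(value);  // all four bytes, big-endian
        return out.size();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(sizeAfterWrite(12345));    // 1
        System.out.println(sizeAfterWriteInt(12345)); // 4
    }
}
```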
How can I read bits from a file? I wrote bits to a file like this:
File plik=new File("bitowo");
FileOutputStream fos=new FileOutputStream(plik);
byte[] test =new byte[2];
test[0]=(byte)01101000;
test[1]=(byte)10101010;
fos.write(test);
fos.close();
and "bitowo" has only 2 bytes, but how can I read from the file "bitowo" bit after bit?
You can't read bit-by-bit. You can read byte-by-byte and then shift your byte bit-by-bit.
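The shifting looks like this (a minimal sketch; BitReader and bit are made-up names, and bit 0 here means the most significant bit):

```java
public class BitReader {
    // extract bit i (0 = most significant) from a byte
    static int bit(byte b, int i) {
        return (b >> (7 - i)) & 1;   // shift the wanted bit to position 0, mask the rest
    }

    public static void main(String[] args) {
        byte b = (byte) 0b01101000;
        StringBuilder bits = new StringBuilder();
        for (int i = 0; i < 8; i++) {
            bits.append(bit(b, i));
        }
        System.out.println(bits);    // 01101000
    }
}
```

In practice you would call bit(...) in a loop over every byte returned by your FileInputStream.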
This:
test[0]=(byte)01101000;
test[1]=(byte)10101010;
Does not do what you think it does. Specifically, it does not write two bytes with the bit patterns that the code seems to suggest.
The number 01101000 will be interpreted as an octal integer literal, because it starts with 0. In decimal, that would be the number 295424. When you cast that to a byte, only the lower 8 bits are kept, and those happen to be 0. So the first byte in your file is 0.
The number 10101010 will be interpreted as a decimal integer literal (the number ten million, one hundred and one thousand and ten). Again, by casting it to byte, only the lower 8 bits are kept, so the second byte in your file will contain the value 18 (decimal).
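You can verify both interpretations directly (a minimal sketch):

```java
public class LiteralPitfall {
    public static void main(String[] args) {
        System.out.println(01101000);        // 295424: the leading 0 makes it octal
        System.out.println((byte) 01101000); // 0: the low 8 bits of 295424

        System.out.println(10101010);        // plain decimal ten million, ...
        System.out.println((byte) 10101010); // 18: the low 8 bits of 10101010
    }
}
```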
If you're using Java 7, you can use binary literals in your code by prefixing the digits with 0b:
test[0]=(byte)0b01101000;
test[1]=(byte)0b10101010;
To read the two bytes back, just open the file with a FileInputStream and read two bytes from it.
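Putting it together, a round trip through a temporary file might look like this (a minimal sketch; ReadBitsBack is a made-up name, and note that read() returns the bytes as unsigned 0-255 values):

```java
import java.io.*;
import java.nio.file.*;

public class ReadBitsBack {
    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("bitowo", null);
        Files.write(file, new byte[] { (byte) 0b01101000, (byte) 0b10101010 });

        try (FileInputStream fis = new FileInputStream(file.toFile())) {
            int first = fis.read();   // 104 == 0b01101000
            int second = fis.read();  // 170 == 0b10101010, read back unsigned
            System.out.println(Integer.toBinaryString(first));  // 1101000 (leading zero dropped)
            System.out.println(Integer.toBinaryString(second)); // 10101010
        }
        Files.delete(file);
    }
}
```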