Java - InputStream .read() function - java

If I use int oneByte = dis.read(byteArray, 0, 1) does this mean I am reading only 1 byte and I am assigning its decimal value to integer oneByte?
If I want to check for | (pipe) character to break out of a loop can I do something like this:
while((oneByte = dis.read(byteArray, 0, 1)) != 124)

If I use int oneByte = dis.read(byteArray, 0, 1) does this mean I am
reading only 1 byte and I am assigning its decimal value to integer
oneByte?
Nope. You're reading up to 1 byte into byteArray and receiving the number of bytes read in oneByte. Perhaps you'd prefer:
int oneByte = dis.read();
Also be careful: read() gives you the byte's value as an int (0 to 255), not a "decimal". Keep in mind that it returns -1 when you reach the end of the stream.
If I want to check for | (pipe) character to break out of a loop can I
do something like this: while((oneByte = dis.read(byteArray, 0, 1)) !=
124)
You'll need to also check for the end of stream (-1). Try something more like this:
while(true) {
int oneByte = dis.read();
if(oneByte == -1 || oneByte == '|') {
break;
}
}
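For illustration, here is a minimal sketch of the same loop wrapped in a helper that keeps the bytes it sees. The class name PipeReader and the method readUntilPipe are hypothetical names chosen for this example, and a ByteArrayInputStream stands in for dis so the snippet runs on its own.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

class PipeReader {
    // Hypothetical helper: returns the bytes that come before the first '|',
    // or every remaining byte if the stream ends without one.
    static byte[] readUntilPipe(InputStream in) throws IOException {
        ByteArrayOutputStream collected = new ByteArrayOutputStream();
        while (true) {
            int oneByte = in.read(); // 0..255, or -1 at end of stream
            if (oneByte == -1 || oneByte == '|') {
                break;
            }
            collected.write(oneByte);
        }
        return collected.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Assumption: an in-memory stream replaces the real input source.
        InputStream in = new ByteArrayInputStream("abc|def".getBytes(StandardCharsets.US_ASCII));
        System.out.println(new String(readUntilPipe(in), StandardCharsets.US_ASCII)); // prints abc
    }
}
```

The pipe itself is consumed but not stored, so the next read() on the same stream continues with the byte after it.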

No, it means you are trying to read 1 byte and you are assigning the number of bytes actually read to oneByte. So oneByte cannot be greater than 1 in this case. If you want to check for the "|" char you have to do:
// stop at the pipe character, or when read() reports end of stream (-1)
while(dis.read(byteArray, 0, 1) == 1 && byteArray[0] != 124) {
// process byteArray[0] here
}

As Matthew says in a comment, you should read the API.
The read() method is overloaded. Without arguments, it does what you want. With the arguments you're passing, it returns the number of bytes read.
Make sure you check for EOF (-1) too.

No. In oneByte = dis.read(byteArray, 0, 1); the 0 is the offset into byteArray at which the first byte read is stored (it is not a position in the stream), the 1 means you want to read up to 1 byte, and byteArray is the array into which you're reading. The return value assigned to oneByte is the number of bytes read (if the stream has fewer bytes available than you asked for, this number is less than the len parameter), or -1 at the end of the stream.
You can break on pipe by using the read() method:
while(dis.read() != 124)
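A small sketch of the three-argument read described in this answer, assuming an in-memory stream holding the bytes A, B, C. The class name ReadOffsetDemo and the method run are made up for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

class ReadOffsetDemo {
    // Returns {count, buffer[0], buffer[2], countAtEndOfStream}.
    static int[] run() throws IOException {
        InputStream in = new ByteArrayInputStream(new byte[] {'A', 'B', 'C'});
        byte[] buffer = new byte[4];

        // off = 2 is an index into buffer, not a position in the stream:
        // the first stream byte ('A' = 65) is stored at buffer[2].
        int count = in.read(buffer, 2, 1);

        in.skip(2); // discard 'B' and 'C' so the stream is exhausted
        int atEof = in.read(buffer, 0, 1); // nothing left: returns -1, buffer untouched

        return new int[] {count, buffer[0], buffer[2], atEof};
    }

    public static void main(String[] args) throws IOException {
        System.out.println(Arrays.toString(run())); // prints [1, 0, 65, -1]
    }
}
```

The return value is a count (or -1), and the data is only ever found in the array.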

dis.read(buffer) will return the number of bytes read.
dis.read() will read ONE single byte and return its value as an integer (0 to 255), or -1 at the end of the stream.
The while loop should be like this to do what you want:
while ((oneByte = dis.read()) != 124)

If you check the Java API for InputStream, you'll see that the 'return' value for InputStream.read(byte[], int, int) is described as:
the total number of bytes read into the buffer, or -1 if there is
no more data because the end of the stream has been reached.
So, no. You're reading one byte and storing it in byteArray[0]. oneByte will be -1, 0 or 1, so comparing it against 124 will never break your loop. But, for the record, if you really only need one byte at a time, the InputStream.read() method will do the trick.
You also, for readability's sake, may want to check against the exact character you're looking for. So:
while (stream.read()!='|'){
//stuff
}
This way, anyone who reads your code (future coders, graders, etc.) will immediately know "Oh, it breaks on a Pipe Character".

System.in.read using my char value as an int? [duplicate]

What does
System.in.read()
return? The documentation says:
Returns:
the next byte of data, or -1 if the end of the stream is reached.
But for example, if I enter 10, I get back 49. Why is that?
49 is the ASCII value of the char 1. It is the value of the first byte.
The stream of bytes that is produced when you enter 10 followed by Enter on your console or terminal contains the three bytes {49,48,10} (on my Mac; it may end with 13,10 or 13 instead of 10, depending on your system).
So the output of the simple snippet
int b = System.in.read();
while (b != -1) {
System.out.println(b);
b = System.in.read();
}
after entering a 10 and hitting enter, is (on my machine)
49
48
10
System.in.read() reads just one byte.
49 is the Unicode code point of the character 1.
Try to print:
System.out.println((char)49);
This will help you to understand it more.
When you enter 10, it is not read as an integer but as a String or, more precisely here, an array of bytes.
49 is the ASCII code for the character 1.
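To make the answers above reproducible without a console, here is a sketch that feeds the same three bytes through an in-memory stream. It assumes the line ends with a single line feed (10); the class name ByteValuesDemo and the method readAll are invented for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class ByteValuesDemo {
    // Collects every value returned by read() until it reports end of stream.
    static List<Integer> readAll(InputStream in) throws IOException {
        List<Integer> values = new ArrayList<>();
        int b;
        while ((b = in.read()) != -1) {
            values.add(b);
        }
        return values;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: "10\n" stands in for typing 10 and pressing Enter.
        InputStream typed = new ByteArrayInputStream("10\n".getBytes(StandardCharsets.US_ASCII));
        List<Integer> values = readAll(typed);

        System.out.println(values);                          // prints [49, 48, 10]
        System.out.println((char) values.get(0).intValue()); // prints 1 (the character)
        System.out.println(values.get(0) - '0');             // prints 1 (the digit's numeric value)
    }
}
```

Subtracting '0' (48) is the usual way to turn a single ASCII digit into its numeric value; for whole numbers, reading a line and parsing it is less error-prone.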

Java char variable

The following code within a program allows 90 to be assigned to the variable 'ch'. 'Z' is then printed to the console.
char ch;
ch = 90;
System.out.println(ch);
However, the following code, that lies within a program, does not compile. If the following code requires the input to the ch variable to be a character type, i.e. (char) System.in.read();, then why does the same not apply when 90 is assigned to ch above? Why doesn't it have to be ch = (char) 90?
char ch;
ch = System.in.read();
The compiler knows that 90 is a valid value for char. However, System.in.read() can return any int, which may be out of the valid range for chars.
If you change 90 to 90000, the code won't compile:
char ch;
ch = 90000;
Whenever you are dealing with system I/O you need to ensure that you can handle all valid byte input values. However, you then need a mechanism to indicate that the stream has been exhausted. The javadoc for InputStream.read() (System.in is a global InputStream) says (emphasis added),
Reads the next byte of data from the input stream. The value byte is returned as an int in the range 0 to 255. If no byte is available because the end of the stream has been reached, the value -1 is returned.
If you were to cast -1 to char you would get 65535 because char is unsigned. With byte, it's even worse in that -1 is a valid value. Regardless, you aren't reading char values; you are reading byte(s) encoded as int in the range 0-255 plus -1 to indicate end of stream. If you want char(s) I suggest you look at an InputStreamReader which is described in the javadoc as
An InputStreamReader is a bridge from byte streams to character streams
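A minimal sketch of that suggestion, assuming the bytes are UTF-8 encoded. CharReadDemo and readChars are hypothetical names, and an in-memory stream is used in place of System.in so the example is self-contained.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

class CharReadDemo {
    // Decodes the stream as UTF-8 and returns the characters read.
    // The caller keeps ownership of the stream; it is not closed here.
    static String readChars(InputStream bytes) throws IOException {
        Reader reader = new InputStreamReader(bytes, StandardCharsets.UTF_8);
        StringBuilder text = new StringBuilder();
        int c; // still an int, so that -1 can signal end of stream
        while ((c = reader.read()) != -1) {
            text.append((char) c); // safe: c is in the range 0..65535 here
        }
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        byte[] encoded = "Z\u00e9".getBytes(StandardCharsets.UTF_8);
        System.out.println(encoded.length); // prints 3 (bytes)
        System.out.println(readChars(new ByteArrayInputStream(encoded)).length()); // prints 2 (chars)
    }
}
```

With System.in, the same wrapping would look like new InputStreamReader(System.in); the cast to char is then applied only after the -1 check.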

Java integer is equal to character?

I apologize if this question is a bit simplistic, but I'm somewhat puzzled as to why my professor has made the following the statement:
Notice that read() returns an integer value. Using an int as a return type allows read() to use -1 to indicate that it has reached the end of the stream. You will recall from your introduction to Java that an int is equal to a char which makes the use of the -1 convenient.
The professor was referencing the following sample code:
public class CopyBytes {
public static void main(String[] args) throws IOException {
FileInputStream in = null;
FileOutputStream out = null;
try {
in = new FileInputStream("Independence.txt");
out = new FileOutputStream("Independence.txt");
int c;
while ((c = in.read()) != -1) {
out.write(c);
}
} finally {
if (in != null) {
in.close();
}
if (out != null) {
out.close();
}
}
}
}
This is an advanced Java course, so obviously I've taken a few introductory courses prior to this one. Maybe I'm just having a "blonde moment" of sorts, but I'm not understanding in what context an integer could be equal to a character when making comparisons. The instance method read() returns an integer value when it comes to EOF. That I understand perfectly.
Can anyone shed light on the statement in bold?
In Java, a char is a more specific type of int. I can write:
char c = 65;
This code prints out "A". I need the cast there so Java knows I want the character representation and not the integer one.
public static void main(String... str) {
System.out.println((char) 65);
}
You can look up the int to character mapping in an ASCII table.
And per your teacher, int allows for more values. Since -1 isn't a character value, it can serve as a flag value.
To a computer a character is just a number (that may at some point be mapped to a picture of a letter for display to the user). Languages usually have a special character type to distinguish between "just a number" and "a number that refers to a character", but inside, it's still just some sort of integer.
The reason why read() returns an int is to have "one extra value" to represent EOF. All the values of char are already defined to mean something else, so it uses a larger type to get more values.
It means your professor has been spending too much time programming in C. The definition of read for InputStream (and FileInputStream) is:
Reads the next byte of data from the input stream. The value byte is returned as an int in the range 0 to 255. If no byte is available because the end of the stream has been reached, the value -1 is returned.
(See http://docs.oracle.com/javase/7/docs/api/java/io/InputStream.html#read())
A char in Java, on the other hand, represents a Unicode character, and is treated as an integer in the range 0 to 65535. (In C, a char is an 8-bit integral value, either 0 to 255 or -128 to 127.)
Please note that in Java, a byte is actually an integer in the range -128 to 127; but the definition of read has been specified to avoid the problem, by decreeing that it will return 0 to 255 anyway. The javadoc is using "byte" in a loose sense here.
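The difference between the signed byte type and the 0 to 255 range of read() can be sketched like this. UnsignedByteDemo and toUnsigned are names made up for the example; masking with 0xFF is one common way to recover the unsigned value.

```java
import java.io.ByteArrayInputStream;

class UnsignedByteDemo {
    // Maps a signed Java byte (-128..127) to the 0..255 value read() would report.
    static int toUnsigned(byte b) {
        return b & 0xFF;
    }

    public static void main(String[] args) {
        // A stream holding the single byte 0xFF.
        ByteArrayInputStream in = new ByteArrayInputStream(new byte[] {(byte) 0xFF});

        int value = in.read();       // 255: a real data byte, not end of stream
        byte asByte = (byte) value;  // narrowing to byte turns it into -1

        System.out.println(value);              // prints 255
        System.out.println(asByte);             // prints -1
        System.out.println(toUnsigned(asByte)); // prints 255
        System.out.println(in.read());          // prints -1 (now the stream really is empty)
    }
}
```

This is why the end-of-stream check has to happen on the int, before any narrowing cast.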
The char data type in Java is a single 16-bit Unicode character. It has a minimum value of '\u0000' (or 0) and a maximum value of '\uffff' (or 65,535 inclusive).
The int data type in Java is a 32-bit signed two's complement integer. It has a minimum value of -2,147,483,648 and a maximum value of 2,147,483,647 (inclusive).
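Since a char cannot be negative (it is a number between 0 and 65,535) and an int can be, -1 lies outside the char range and is free to signify that nothing is left. For InputStream.read() the other possible return values are 0 to 255; for Reader.read() they are 0 to 65,535 (the maximum value of a char).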
What your professor is referring to is the fact that characters are just integers used in a special context. If we ignore Unicode and other encoding types and focus on the old days of ASCII, there was an ASCII table (http://www.asciitable.com/). A string of characters is really just a sequence of integers, for example, TUV would be 84 followed by 85 followed by 86.
The 'char' type is an integer internally in the JVM and is more or less a hint that this integer should only be used in a character context.
You can even cast between them.
char a = (char) 65;
int i = (int) 'A';
Those two variables hold the same data in memory, but the compiler and JVM treat them slightly differently.
Because of this, read() returns an integer instead of char so as to allow a -1, which is not a valid character code. Values other than -1 can be cast to a char, while -1 indicates EOF.
Of course, Unicode changes all of this with multi-byte character and code points. I'll leave that as an exercise to you.
I am not sure what the professor means, but what it all comes down to is that computers only understand 1's and 0's. We don't understand 1's and 0's all that well, so we use a code system: first Morse code, then ASCII, now UTF-16, and so on. How large a number an int can hold varies from computer to computer; in the real world integers are infinite, they just keep counting. A char also has a size: in UTF-16, let's just say it's 16 bits (I will let you read up on that). So if char and int both take 16 bits, then, as the professor says, they are the same size, and reading 1 char is the same as reading 1 int. By the way, strictly speaking the set of characters is infinite as well: Chinese characters, French characters, and the character I just made up but can't post because it's not supported. So think of the code systems for int and char: the int -1 is the EOF (end of file) marker for char. Good luck, I hope this helped. What I don't understand is reading from and writing to the same file?

Why doesn't StringReader.Read() return a byte?

I was using StringReader in a Data Structures assignment (Huffman codes), and was testing whether the end of the string had been reached. I found that the int value that StringReader.read() returns is not -1, but 65535, so casting the result to a byte solved the infinite loop problem I was having.
Is this a bug in JDK, or is it common practice to cast values returned from Reader.read() calls to bytes? Or am I missing something?
The gist of my code was something like this:
StringReader sr = new StringReader("This is a test string");
char c;
do {
c = sr.read();
//} while (c != -1); //<--Broken
} while ((byte)c != -1); //<--Works
In fact that doesn't even compile. I get:
Type mismatch: cannot convert from int to char
Since the sr.read() call returns an int I suggest you store it as such.
This compiles (and works as expected):
StringReader sr = new StringReader("This is a test string");
int i; // <-- changed from char
do {
i = sr.read();
// ... and if you need a char...
char c = (char) i;
} while (i != -1); // <-- works :-)
Why doesn't StringReader.Read() return a byte?
Strings are composed of 16-bit unicode characters. These won't fit in an 8-bit byte. One could argue that a char would have been enough, but then there is no room for providing an indication that the EOF is reached.
Characters in Java are 2 bytes because they're encoded in UTF-16. This is why read() returns an int: a byte is not large enough.
char c = (char) -1;
System.out.println(""+c);
System.out.println(""+(byte)c);
This code should clear up your doubt.
A Java String is a sequence of chars, which are not bytes but values that represent UTF-16 code units. The semantics of read is to return the next atom from the input stream. In the case of a StringReader the atomic component is a 16-bit value, which cannot be represented as a single byte.
StringReader#read returns an int value which is -1 if the end of the stream has been reached.
The problem in your code is that you already convert the int value to a char and test the char:
System.out.println("Is it still (-1)?: " + (int) ((char) -1));
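To tie the answers in this thread together, here is a sketch of why the (byte) cast only appeared to work, and why it is fragile. The class name StringReaderEofDemo and both helper methods are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;

class StringReaderEofDemo {
    // Counts characters by keeping the result of read() in an int.
    static int countChars(String s) throws IOException {
        StringReader sr = new StringReader(s);
        int count = 0;
        int i;
        while ((i = sr.read()) != -1) {
            count++;
        }
        return count;
    }

    // Mimics the question's test: does a char, narrowed to byte, look like -1?
    static boolean looksLikeEofAsByte(char c) {
        return (byte) c == -1;
    }

    public static void main(String[] args) throws IOException {
        System.out.println((int) (char) -1);  // prints 65535: -1 stored in a char
        System.out.println((byte) (char) -1); // prints -1: the low 8 bits are all ones

        // Any char whose low 8 bits are 0xFF is mistaken for end of stream.
        System.out.println(looksLikeEofAsByte('\u00FF')); // prints true

        // With an int, the same character is counted like any other.
        System.out.println(countChars("a\u00FFb")); // prints 3
    }
}
```

So the byte cast would end the question's loop early on input containing a character such as U+00FF, while the int comparison does not.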
