BufferedReader and InputStreamReader in Java - java

I recently started with Java and want to understand a java module of a large app. I came across this line of java code:
String line = (new BufferedReader(new InputStreamReader(System.in))).readLine();
What does this Java code do? Is there a C/C++ equivalent of this?

System.in is the standard input.
InputStreamReader wraps a byte stream (in this case the standard input) and decodes its bytes into characters, so now we have a character stream.
BufferedReader is an "abstraction" that makes character streams easier to work with. For example, it implements readLine, so you don't have to read character by character until you find a '\n' to get the whole line. It just returns a String after this process.
So this line means: "Read a line from standard input and store it in the variable line".
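As a minimal sketch, the one-liner can be unrolled into its three layers. The class name ReadOneLine and the helper firstLine are hypothetical names used only for illustration, and the explicit UTF-8 charset is an assumption (the original line relies on the platform default). An in-memory stream stands in for System.in so the example runs without a console.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

class ReadOneLine {
    // Hypothetical helper: reads the first line from any byte stream.
    static String firstLine(InputStream bytes) throws IOException {
        // Bytes -> characters (UTF-8 is an assumption made for this sketch).
        InputStreamReader chars = new InputStreamReader(bytes, StandardCharsets.UTF_8);
        // Characters -> buffered characters, which adds readLine().
        BufferedReader lines = new BufferedReader(chars);
        // Returns the line without its terminator, or null at end of input.
        return lines.readLine();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for System.in so the sketch does not wait for keyboard input.
        InputStream demo = new ByteArrayInputStream(
                "hello\nworld\n".getBytes(StandardCharsets.UTF_8));
        System.out.println(firstLine(demo)); // prints "hello"
    }
}
```

Passing System.in instead of the in-memory stream should behave like the line in the question, apart from the explicit charset.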

> What does this java code do:
String line is your string object
new BufferedReader().readLine() creates an instance of a BufferedReader to read text from a character input stream; readLine() is a method it implements to read until a newline character.
new InputStreamReader() gives you an instance of an InputStreamReader, which is the "bridge" between the standard input byte stream and the character stream that a BufferedReader wants.
System.in is the standard input (byte stream)
> Is there a C/C++ equivalent of this
Well... there's no language called C/C++... ;)
So I'll assume you wanted an answer for each of them.
In C, there are no "strings"; you have to use a character array. You can read data into a character array from stdin with something like:
char input[100];
...
scanf("%99[^\n]", input);
or
fgets(input, 100, stdin);
In C++, you'd use:
#include <iostream>
#include <string>
using namespace std;
string line;
getline(cin, line);

Your snippet uses a BufferedReader, chained to an InputStreamReader, to read a line from the standard input console and store it in the String line.
BufferedReader
Read text from a character-input stream, buffering characters so as to provide for the efficient reading of characters, arrays, and lines.
The buffer size may be specified, or the default size may be used. The default is large enough for most purposes.
In general, each read request made of a Reader causes a corresponding read request to be made of the underlying character or byte stream. It is therefore advisable to wrap a BufferedReader around any Reader whose read() operations may be costly, such as FileReaders and InputStreamReaders.
BufferedReader#readLine()
Read a line of text. A line is considered to be terminated by any one of a line feed ('\n'), a carriage return ('\r'), or a carriage return followed immediately by a linefeed.
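A small sketch of those terminator rules in action. The class LineTerminators and the helper readAllLines are hypothetical names; a StringReader stands in for a real input source.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

class LineTerminators {
    // Hypothetical helper: collects every line that readLine() returns.
    static List<String> readAllLines(String text) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            String line;
            // readLine() returns null once the end of the input is reached.
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // "\n", "\r" and "\r\n" each end exactly one line.
        System.out.println(readAllLines("one\ntwo\rthree\r\nfour"));
        // prints [one, two, three, four]
    }
}
```

Note that the returned strings never contain the terminator, so the caller cannot tell which of the three forms ended a given line.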
InputStreamReader
An InputStreamReader is a bridge from byte streams to character streams: It reads bytes and decodes them into characters using a specified charset. The charset that it uses may be specified by name or may be given explicitly, or the platform's default charset may be accepted.
Each invocation of one of an InputStreamReader's read() methods may cause one or more bytes to be read from the underlying byte-input stream. To enable the efficient conversion of bytes to characters, more bytes may be read ahead from the underlying stream than are necessary to satisfy the current read operation.
System
The System class contains several useful class fields and methods. It cannot be instantiated.
Among the facilities provided by the System class are standard input, standard output, and error output streams; access to externally defined "properties"; a means of loading files and libraries; and a utility method for quickly copying a portion of an array.
System.in
The "standard" input stream. This stream is already open and ready to supply input data. Typically this stream corresponds to keyboard input or another input source specified by the host environment or user.

The code simply reads a line from the input stream. From a design-pattern point of view, this is a decorator. The BufferedReader is there to improve I/O performance.

For top efficiency, consider wrapping an InputStreamReader within a BufferedReader. For example:
BufferedReader in = new BufferedReader(new InputStreamReader(System.in));

Related

Java.io Two ways to obtain buffered character stream from unbuffered byte one

I am switching to Java from c++ and now going through some of the documentation on Java IO. So if I want to make buffered character stream from unbuffered byte stream, I can do this in two ways:
Reader input1 = new BufferedReader(new InputStreamReader(new FileInputStream("Xanadu.txt")));
and
Reader input2 = new InputStreamReader(new BufferedInputStream(new FileInputStream("Xanadu.txt")));
So I can make it a character stream first and then buffer it, or vice versa.
What is the difference between them and which is better?
Functionally, there is no difference. The two versions will behave the same way.
There is likely to be a difference in performance, with the first version likely to be a bit faster than the second when you read characters from the Reader one at a time.
In the first version, an entire buffer full of data will be converted from bytes to chars in a single operation. Then each read() call on the Reader will fetch a character directly from the character buffer.
In the second version, each read() call on the Reader performs one or more read() calls on the input stream and converts only those bytes read to a character.
If I was going to implement this (precise) functionality, I would do it like this:
Reader input = new BufferedReader(new FileReader("Xanadu.txt"));
and let FileReader deal with the bytes-to-characters decoding under the hood.
There is a case for using an InputStreamReader, but only if you need to specify the character set for the bytes-to-characters conversion explicitly.
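To illustrate the "no functional difference" point, here is a minimal sketch that builds both layerings over the same bytes and compares what they produce. The class LayeringOrder and the helper drain are hypothetical names, an in-memory byte array stands in for Xanadu.txt, and the explicit UTF-8 charset is an assumption. It does not measure performance.

```java
import java.io.BufferedInputStream;
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

class LayeringOrder {
    // Hypothetical helper: drains a Reader one character at a time.
    static String drain(Reader reader) throws IOException {
        StringBuilder out = new StringBuilder();
        int c;
        while ((c = reader.read()) != -1) {
            out.append((char) c);
        }
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the contents of Xanadu.txt.
        byte[] data = "In Xanadu did Kubla Khan".getBytes(StandardCharsets.UTF_8);

        // Version 1: decode first, then buffer the characters.
        Reader input1 = new BufferedReader(
                new InputStreamReader(new ByteArrayInputStream(data), StandardCharsets.UTF_8));
        // Version 2: buffer the bytes first, then decode.
        Reader input2 = new InputStreamReader(
                new BufferedInputStream(new ByteArrayInputStream(data)), StandardCharsets.UTF_8);

        System.out.println(drain(input1).equals(drain(input2))); // prints true
    }
}
```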

How to force UTF-16 while reading/writing in Java?

I see that you can specify UTF-16 as the charset via Charset.forName("UTF-16"), and that you can create a new UTF-16 decoder via Charset.forName("UTF-16").newDecoder(), but I only see the ability to specify a CharsetDecoder on InputStreamReader's constructor.
So how do you specify UTF-16 while reading any stream in Java?
Input streams deal with raw bytes. When you read directly from an input stream, all you get is raw bytes where character sets are irrelevant.
The interpretation of raw bytes into characters, by definition, requires some sort of translation: how do I translate from raw bytes into a readable string? That "translation" comes in the form of a character set.
This "added" layer is implemented by Readers. Therefore, to read characters (rather than bytes) from a stream, you need to construct a Reader of some sort (depending on your needs) on top of the stream. For example:
InputStream is = ...;
Reader reader = new InputStreamReader(is, Charset.forName("UTF-16"));
This will cause reader.read() to read characters using the character set you specified. If you would like to read entire lines, use BufferedReader on top:
BufferedReader reader = new BufferedReader(new InputStreamReader(is, Charset.forName("UTF-16")));
String line = reader.readLine();
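A self-contained sketch of the same idea, with a byte array standing in for the elided stream. The class Utf16Lines and the helper firstLine are hypothetical names. It uses the StandardCharsets.UTF_16 constant, which refers to the same charset as Charset.forName("UTF-16").

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

class Utf16Lines {
    // Hypothetical helper: decodes the stream as UTF-16 and returns its first line.
    static String firstLine(InputStream in) throws IOException {
        BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_16));
        return reader.readLine();
    }

    public static void main(String[] args) throws IOException {
        // Encode some text as UTF-16 bytes (the encoder writes a byte-order mark first).
        byte[] encoded = "h\u00e9llo\nworld\n".getBytes(StandardCharsets.UTF_16);
        String line = firstLine(new ByteArrayInputStream(encoded));
        // The decoder consumes the byte-order mark, so only the text comes back.
        System.out.println(line.equals("h\u00e9llo")); // prints true
    }
}
```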

Testing for unseen characters in java

I'm writing a program in Java that tests the validity of several FTP commands. These commands must end in a carriage return and line feed (the sequence "\r\n"). I'm using a BufferedReader to read in lines, but I cannot come up with a way to check if the line ends in this sequence. Any ideas?
Do not use the BufferedReader, because its abstraction level is too high for this specific task: readLine() strips the line terminator. Use the plain InputStream and read into a byte array. The InputStream delivers all bytes exactly as they are. You can process them and produce strings yourself later using new String(array, offset, length). The input may also contain other invalid characters, such as 0x0C.
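A minimal sketch of that approach, assuming a "line" is everything up to and including the next '\n'. The class RawLineCheck and the helpers readRawLine and endsWithCrLf are hypothetical names, and the sample commands are only illustrative input.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

class RawLineCheck {
    // Reads bytes up to and including the next '\n' (or to the end of the stream).
    static byte[] readRawLine(InputStream in) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != -1) {
            buffer.write(b);
            if (b == '\n') {
                break;
            }
        }
        return buffer.toByteArray();
    }

    // True only if the last two bytes are '\r' followed by '\n'.
    static boolean endsWithCrLf(byte[] line) {
        int n = line.length;
        return n >= 2 && line[n - 2] == '\r' && line[n - 1] == '\n';
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream(
                "USER anonymous\r\nQUIT\n".getBytes(StandardCharsets.US_ASCII));
        System.out.println(endsWithCrLf(readRawLine(in))); // prints true
        System.out.println(endsWithCrLf(readRawLine(in))); // prints false
    }
}
```

Reading one byte at a time can be slow on an unbuffered source such as a socket, so in practice the stream would probably be wrapped in a BufferedInputStream first.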

Why character streams?

I understand that Java character streams wrap byte streams such that the underlying byte stream is interpreted as per the system default or an otherwise specifically defined character set.
My system's default char-set is UTF-8.
If I use a FileReader to read in a text file, everything looks normal as the default char-set is used to interpret the bytes from the underlying InputStreamReader. If I explicitly define an InputStreamReader to read the UTF-8 encoded text file in as UTF-16, everything obviously looks strange. Using a byte stream like FileInputStream and redirecting its output to System.out, everything looks fine.
So, my questions are:
Why is it useful to use a character stream?
Why would I use a character stream instead of directly using a byte stream?
When is it useful to define a specific char-set?
Code that deals with strings should only "think" in terms of text - for example, reading an input source line by line, you don't want to care about the nature of that source.
However, storage is usually byte-oriented - so you need to create a conversion between the byte-oriented view of a source (encapsulated by InputStream) and the character-oriented view of a source (encapsulated by Reader).
So a method which (say) counts the lines of text in an input source should take a Reader parameter. If you want to count the lines of text in two files, one of which is encoded in UTF-8 and one of which is encoded in UTF-16, you'd create an InputStreamReader around a FileInputStream for each file, specifying the appropriate encoding each time.
(Personally I would avoid FileReader completely - the fact that it doesn't let you specify an encoding (before Java 11, at least) makes it useless IMO.)
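The line-counting idea could be sketched as follows. The class LineCounter and the helpers countLines and open are hypothetical names, and in-memory byte arrays stand in for the two files (the answer describes wrapping a FileInputStream in the same way).

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class LineCounter {
    // Works purely on characters; it neither knows nor cares how they were encoded.
    static int countLines(Reader source) throws IOException {
        BufferedReader reader = new BufferedReader(source);
        int count = 0;
        while (reader.readLine() != null) {
            count++;
        }
        return count;
    }

    // The encoding is chosen here, at the boundary between bytes and characters.
    static Reader open(byte[] bytes, Charset charset) {
        return new InputStreamReader(new ByteArrayInputStream(bytes), charset);
    }

    public static void main(String[] args) throws IOException {
        String text = "first\nsecond\nthird\n";
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        byte[] utf16 = text.getBytes(StandardCharsets.UTF_16);

        System.out.println(countLines(open(utf8, StandardCharsets.UTF_8)));   // prints 3
        System.out.println(countLines(open(utf16, StandardCharsets.UTF_16))); // prints 3
    }
}
```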
An InputStream reads bytes, while a Reader reads characters. Because of the way bytes map to characters, you need to specify the character set (or encoding) when you create an InputStreamReader, the default being the platform character set.
When you are reading/writing text which contains characters which could be > 127 , use a char stream. When you are reading/writing binary data use a byte stream.
You can read text as binary if you wish, but unless you make a lot of assumptions it rarely gains you much.

Why we writeBytes into a OutputStream and readLine out of a InputStream in java?

Here's just this example:
http://www.xyzws.com/Javafaq/how-to-use-httpurlconnection-post-data-to-web-server/139
Why does it feel so strange?
You are actually looking at two different kinds of stream.
The Writer / Reader classes and subclasses are for reading / writing character-based data. They take care of conversion between Java's internal UTF-16 representation of text and the character encoding used outside. The BufferedReader class adds a readLine() method that understands end-of-line markers.
The InputStream / OutputStream classes and subclasses are for reading and writing byte-based data without any assumptions about character encodings, or that the data is text. Since it eschews these assumptions, "line" has no clear meaning, and hence the BufferedInputStream class does not have a readLine() method.
(Incidentally, DataInputStream does have a readLine() method, but it is deprecated because it is broken. It makes assumptions about encodings, etc that are invalid on some platforms!)
In your particular example, the code is asymmetric because the HTTP service it is designed to talk to is asymmetric. The service expects a request with binary content (encoded using the DataOutputStream wrapper), and delivers a response with text content. This is not particularly unusual ... or wrong.
The strangeness of writing the "input" to a server to an "output" is merely a matter of perspective. In simple terms, an OutputStream / Writer is something you "write to" (i.e. a data sink) and an InputStream or Reader is something you "read from" (i.e. a data source). That's just the way it is, and it is not strange at all once you get used to it.
Actually, we don't. There is no method readLine defined in InputStream. It also operates on bytes only, just like OutputStream.
In the code you referenced, readLine is called on a BufferedReader.
Reader and Writer are for text data and operate on characters (and Strings), InputStream and OutputStream work with binary data (raw bytes). To convert between the two (i.e. wrap an InputStream into a Reader or an OutputStream into a Writer), you need to choose a character set.
> I'm feeling strange why not read out from OutputStream but from InputStream
That's just a matter of perspective.
An OutputStream or a Writer is where you write your output to.
An InputStream or a Reader is where you read your input from.
Of course, somewhere, on the other end of the stream, someone might treat your OutputStream as their InputStream ...
readLine does exactly what the name implies -- it reads a line of text until the end-of-line marker.
When you write to a stream, you already know where your line ends.
If you are looking for a way to write to streams in a more intuitive way, try PrintWriter.
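A minimal sketch of the PrintWriter suggestion, showing the write side and the read side next to each other. The class WriteThenRead and its helpers are hypothetical names, an in-memory byte array stands in for the network connection, and UTF-8 is an assumed encoding.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;

class WriteThenRead {
    // Writes text lines into a byte sink through a character-oriented PrintWriter.
    static byte[] writeLines(String... lines) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (PrintWriter out = new PrintWriter(
                new OutputStreamWriter(sink, StandardCharsets.UTF_8))) {
            for (String line : lines) {
                out.println(line); // the writer decides where each line ends
            }
        } // closing the writer flushes the characters into the byte sink
        return sink.toByteArray();
    }

    // Reads the first line back from the bytes, as the other end of the stream would.
    static String readFirstLine(byte[] bytes) throws IOException {
        BufferedReader in = new BufferedReader(new InputStreamReader(
                new ByteArrayInputStream(bytes), StandardCharsets.UTF_8));
        return in.readLine();
    }

    public static void main(String[] args) throws IOException {
        byte[] wire = writeLines("name=value", "done");
        System.out.println(readFirstLine(wire)); // prints "name=value"
    }
}
```

The bytes one side writes to its output are exactly what the other side reads as its input, which is the "matter of perspective" described above.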
