I can't seem to determine any difference between InputStreamReader and FileReader besides the way the two are initialized. Is there any benefit to using one or the other? Most other articles cover FileInputStream vs InputStreamReader, but I am contrasting with FileReader instead. Seems to me they both have the same purpose.
First, InputStreamReader can handle all input streams, not just files. Other examples are network connections, classpath resources and ZIP files.
Second, until Java 11 FileReader did not let you specify an encoding; it always used the platform default encoding. That made it pretty much useless, because code using it would corrupt data when run on a system with a different platform default encoding.
Since Java 11, FileReader is a useful shortcut for wrapping an InputStreamReader around a FileInputStream.
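As a rough illustration of that shortcut, the sketch below reads the same file both ways with an explicit charset. The class name, the method names and the readAll helper are made up for this example; the FileReader(File, Charset) constructor requires Java 11 or later.

```java
import java.io.FileInputStream;
import java.io.FileReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class FileReaderCharsetDemo {

    // Java 11+: FileReader accepts an explicit Charset.
    static String readWithFileReader(Path path) throws IOException {
        try (Reader reader = new FileReader(path.toFile(), StandardCharsets.UTF_8)) {
            return readAll(reader);
        }
    }

    // Works on older Java versions too: wrap a FileInputStream yourself.
    static String readWithInputStreamReader(Path path) throws IOException {
        try (Reader reader = new InputStreamReader(
                new FileInputStream(path.toFile()), StandardCharsets.UTF_8)) {
            return readAll(reader);
        }
    }

    // Hypothetical helper: drains a Reader into a String.
    static String readAll(Reader reader) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[1024];
        int n;
        while ((n = reader.read(buf)) != -1) {
            sb.append(buf, 0, n);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("demo", ".txt");
        Files.write(tmp, "gr\u00fc\u00df dich".getBytes(StandardCharsets.UTF_8));
        // Both approaches decode the same bytes with the same charset.
        System.out.println(readWithFileReader(tmp).equals(readWithInputStreamReader(tmp)));
        Files.delete(tmp);
    }
}
```

Either form is fine on Java 11+; the point is only that the charset is stated explicitly instead of being inherited from the platform.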
FileReader reads characters from a file in the file system. InputStreamReader reads characters from any kind of input stream. The stream could be a FileInputStream, but could also be a stream obtained from a socket, an HTTP connection, a database blob, whatever.
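A small sketch of that point: the method below takes any InputStream, and the demo feeds it an in-memory ByteArrayInputStream as a stand-in for a socket or HTTP stream. The class and method names are invented for illustration, and UTF-8 is an assumed encoding.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

class AnyStreamDemo {

    // Reads the first line of text from any byte source, decoding it as UTF-8.
    static String firstLine(InputStream in) throws IOException {
        BufferedReader reader =
                new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
        return reader.readLine();
    }

    public static void main(String[] args) throws IOException {
        // The in-memory stream stands in for a socket, HTTP body, or blob stream.
        InputStream in = new ByteArrayInputStream(
                "hello\nworld\n".getBytes(StandardCharsets.UTF_8));
        System.out.println(firstLine(in)); // prints "hello"
    }
}
```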
I usually prefer using an InputStreamReader wrapping a FileInputStream to read from a file because it allows specifying a specific character encoding.
FileReader extends InputStreamReader. The only difference is that FileReader has constructors that assume you are reading from a file, taking a String filename, a File file, or a FileDescriptor fd.
I suggest you have a look at the source for FileReader to know more.
Why do I always need to chain FileReader to other readers like BufferedReader or Scanner?
Why can't I just use a FileReader on its own, since it is a "convenience class for reading character files"?
I'm just practicing and I do not care about performance or functionality. Yet FileReader has a method public int read() that returns an int; how can I use that integer as a char?
BufferedReader is mainly used because it is more efficient than a bare FileReader. A FileReader reads characters from a file, whereas a BufferedReader wraps the FileReader and buffers its input (hence the name BufferedReader). Passing the FileReader to a BufferedReader therefore gives you a more efficient way of reading.
But as you asked in the question... It is perfectly fine to use FileReader on its own, as long as you are okay with the functionality it provides.
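On the question of what to do with the int from read(): it is either -1 (end of input) or a value in the range 0 to 65535 that can be cast to char. A minimal sketch follows, using a StringReader as a stand-in for a FileReader so it runs without a file; the class and method names are hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

class ReadIntDemo {

    // Plain Reader: read() returns one char as an int, or -1 at end of input.
    static String readCharByChar(Reader reader) throws IOException {
        StringBuilder sb = new StringBuilder();
        int c;
        while ((c = reader.read()) != -1) {
            sb.append((char) c); // safe cast: c is in 0..65535 here
        }
        return sb.toString();
    }

    // BufferedReader adds readLine(), which returns null at end of input.
    static int countLines(Reader reader) throws IOException {
        BufferedReader br = new BufferedReader(reader);
        int count = 0;
        while (br.readLine() != null) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(readCharByChar(new StringReader("abc"))); // prints "abc"
        System.out.println(countLines(new StringReader("a\nb\n")));  // prints 2
    }
}
```

The int return type exists so that -1 can signal end of input without colliding with any valid char value.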
I have a FileInputStream. I'd like to read character-oriented, linewise data from it, until I find a particular delimiter. Then I'd like to pass the FileInputStream, with the current position set immediately after the end of the delimiter line, to a library that needs an InputStream.
I can use a BufferedReader to walk through the file a line at a time, and everything works great:
BufferedReader br = new BufferedReader(new InputStreamReader(myFileStream));
However, this leaves the underlying file stream at a non-deterministic position -- the BufferedReader had to look ahead, and I don't know how far, and AFAICT there's no way to tell the BufferedReader to rewind the underlying stream to just after the last-returned line.
Is this the best solution? It seems crazy to have a ReaderInputStream(BufferedReader(InputStreamReader(FileInputStream))) but it's the only way I've seen to avoid rolling my own. I'd really like to avoid writing my own entire stream-that-reads-lines implementation if at all possible.
You cannot unbuffer a buffered reader. You have to use the same wrapper for the life of the application. In your situation I would use
DataInputStream dis = new DataInputStream(new BufferedInputStream(new FileInputStream(file)));
String line = dis.readLine();
While DataInputStream.readLine() is deprecated, it could work for you if you are careful. Otherwise your only option is to read the bytes yourself and parse the text using the required encoding.
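Reading the bytes yourself could look roughly like the sketch below. It consumes bytes one at a time up to and including the '\n', so the stream is left positioned exactly after the line. The class name, the "--END--" delimiter and the sample data are invented for illustration, and the approach assumes an encoding in which the byte 0x0A only ever means newline (true for UTF-8, ISO-8859-1 and US-ASCII, not for UTF-16).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class ByteLineReader {

    // Reads one line, consuming bytes up to and including '\n' and nothing beyond.
    // Returns null at end of stream. A trailing '\r' is dropped.
    static String readLine(InputStream in, Charset charset) throws IOException {
        ByteArrayOutputStream line = new ByteArrayOutputStream();
        boolean sawAny = false;
        int b;
        while ((b = in.read()) != -1) {
            sawAny = true;
            if (b == '\n') {
                break;
            }
            line.write(b);
        }
        if (!sawAny) {
            return null;
        }
        byte[] bytes = line.toByteArray();
        int len = bytes.length;
        if (len > 0 && bytes[len - 1] == '\r') {
            len--;
        }
        return new String(bytes, 0, len, charset);
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the FileInputStream from the question.
        InputStream in = new ByteArrayInputStream(
                "header\n--END--\nPAYLOAD".getBytes(StandardCharsets.UTF_8));
        String line;
        do {
            line = readLine(in, StandardCharsets.UTF_8);
        } while (line != null && !line.equals("--END--"));
        // The stream now sits on the first byte after the delimiter line.
        System.out.println((char) in.read()); // prints "P"
    }
}
```

Single-byte reads on a raw FileInputStream are slow. If that matters, one option is to wrap it in a BufferedInputStream once and hand that same wrapper, not the raw stream, to the library.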
I'm processing a Unicode text file using the Java platform on OS X. When I open the file using TextEdit or TextWrangler instead of seeing "Nattvardsgästerna" I see "Nattvardsg‰sterna" (which is incorrect). When I open the file using the Java io stream, I see the same incorrect String "Nattvardsg‰sterna".
When I open the file on my PC I see the correct String. I'm not sure where to start solving this problem... Is it an issue with my OS X set-up? Should I open the Java stream with a special flag?
Thanks.
P.S. I'm opening the file like so: fileReader = new BufferedReader(new FileReader(file));
P.S.S. Also, I should mention that I'd like to output the result as an SQL text file so it is important for the OS to distinguish ä correctly.
An InputStream reads bytes (not characters), so I assume when you say:
When I open the file using java io stream
... that you really mean "when I open the file using a Java Reader".
EDIT: Your comment says that you're doing this:
new BufferedReader(new FileReader(file));
An InputStreamReader has a constructor that allows you to set the character encoding. If you don't specify one, it will use the platform default. It's unlikely the platform default will be a Unicode encoding (on my MacBook, it's set to "US-ASCII").
In order to set the character encoding, you must create the intermediate InputStreamReader yourself rather than letting FileReader do it for you (because FileReader uses the platform default encoding).
Assuming the file is encoded using UTF-8, use:
new BufferedReader(new InputStreamReader(new FileInputStream(file),
Charset.forName("UTF-8")));
Alternatively, you can change the platform default by supplying an argument to the JVM. You can look at this answer for the full details, but the basic idea is that you set the file.encoding Java system property. The linked answer provides a few ways to achieve this.
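To see why the charset passed to the reader matters, here is a small sketch that decodes the same UTF-8 bytes twice, once correctly and once with a mismatched charset. The class and method names are made up, and ISO-8859-1 is used only as an example of a wrong choice; it is not a claim about what the file in the question actually contains.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class CharsetMismatchDemo {

    // Decodes the first line of the given bytes using the given charset.
    static String decode(byte[] bytes, Charset charset) throws IOException {
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(new ByteArrayInputStream(bytes), charset))) {
            return reader.readLine();
        }
    }

    public static void main(String[] args) throws IOException {
        String original = "Nattvardsg\u00e4sterna"; // \u00e4 is the letter a-umlaut
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);

        // Matching charset: the text survives the round trip.
        System.out.println(decode(utf8, StandardCharsets.UTF_8).equals(original));      // prints true
        // Mismatched charset: the two UTF-8 bytes of the umlaut become two wrong chars.
        System.out.println(decode(utf8, StandardCharsets.ISO_8859_1).equals(original)); // prints false
    }
}
```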
FURTHER EDIT:
P.S.S. Also, I should mention that I'd like to output the result as an SQL text file so it is important for the OS to distinguish ä correctly.
The OS hasn't got anything to do with this. The file system is just shuffling bytes around. How those bytes are interpreted is entirely up to the applications that are reading those files. This answer tells you how to make your Java program interpret the bytes correctly. For your database to be able to interpret the bytes correctly, you'll need to configure the database encoding.
Which class should be used in situations that require writing characters rather than bytes?
Please take a look at java.io.Writer and subclasses.
PrintWriter will be useful
http://download.oracle.com/javase/1.4.2/docs/api/java/io/PrintWriter.html
An important thing to know about I/O in Java is that streams (InputStream and OutputStream etc.) are used for reading and writing binary data (you read or write bytes exactly as they are in the file), and readers and writers (Reader and Writer etc.) are for reading and writing characters.
Readers and writers are a layer on top of streams. A Reader interprets the bytes from an InputStream using a character encoding (such as UTF-8, ISO-8859-1, US-ASCII) to convert them into characters, and a Writer uses a character encoding to turn characters into bytes.
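The layering can be sketched as a round trip: a Writer on top of an OutputStream turns characters into bytes, and a Reader on top of an InputStream turns them back. The class and method names below are illustrative only.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class LayeringDemo {

    // Writer layer: characters in, bytes out (via the wrapped OutputStream).
    static byte[] encode(String text, Charset charset) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(bytes, charset)) {
            writer.write(text);
        }
        return bytes.toByteArray();
    }

    // Reader layer: bytes in (via the wrapped InputStream), characters out.
    static String decode(byte[] data, Charset charset) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(data), charset)) {
            int c;
            while ((c = reader.read()) != -1) {
                sb.append((char) c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        String text = "\u00e4"; // one character: a-umlaut
        // The same character becomes a different number of bytes per encoding.
        System.out.println(encode(text, StandardCharsets.UTF_8).length);      // prints 2
        System.out.println(encode(text, StandardCharsets.ISO_8859_1).length); // prints 1
        System.out.println(decode(encode(text, StandardCharsets.UTF_8),
                StandardCharsets.UTF_8).equals(text));                        // prints true
    }
}
```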
Here's the example:
http://www.xyzws.com/Javafaq/how-to-use-httpurlconnection-post-data-to-web-server/139
Why does it feel so strange?
You are actually looking at two different kinds of stream.
The Writer / Reader classes and subclasses are for reading / writing character-based data. They take care of conversion between Java's internal UTF-16 representation of text and the character encoding used outside. The BufferedReader class adds a readLine() method that understands end-of-line markers.
The InputStream / OutputStream classes and subclasses are for reading and writing byte-based data without any assumptions about character encodings, or that the data is text. Since they eschew these assumptions, "line" has no clear meaning, and hence the BufferedInputStream class does not have a readLine() method.
(Incidentally, DataInputStream does have a readLine() method, but it is deprecated because it is broken. It makes assumptions about encodings, etc. that are invalid on some platforms!)
In your particular example, the code is asymmetric because the HTTP service it is designed to talk to is asymmetric. The service expects a request with binary content (encoded using the DataOutputStream wrapper), and delivers a response with text content. This is not particularly unusual ... or wrong.
The strangeness of writing the "input" to a server to an "output" is merely a matter of perspective. In simple terms, an OutputStream / Writer is something you "write to" (i.e. a data sink) and an InputStream or Reader is something you "read from" (i.e. a data source). That's just the way it is, and it is not strange at all once you get used to it.
Actually, we don't. There is no method readLine defined in InputStream. It also operates on bytes only, just like OutputStream.
In the code you referenced, readLine is called on a BufferedReader.
Reader and Writer are for text data and operate on characters (and Strings), InputStream and OutputStream work with binary data (raw bytes). To convert between the two (i.e. wrap an InputStream into a Reader or an OutputStream into a Writer), you need to choose a character set.
I'm feeling strange why not read out from OutputStream but from InputStream
That's just a matter of perspective.
An OutputStream or a Writer is where you write your output to.
An InputStream or a Reader is where you read your input from.
Of course, somewhere, on the other end of the stream, someone might treat your OutputStream as their InputStream ...
readLine does exactly what the name implies -- it reads a line of text until the end-of-line marker.
When you write to a stream, you already know where your line ends.
If you are looking for a way to write to streams in a more intuitive way, try PrintWriter.
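A minimal sketch of that suggestion: PrintWriter wrapped around an OutputStream through an OutputStreamWriter, so the encoding is explicit. The class name, method name and the sample key/value are invented for the example; a ByteArrayOutputStream stands in for a connection's output stream.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;

class PrintWriterDemo {

    // Writes "key=value" followed by '\n' to the stream as UTF-8 text.
    static void writePair(OutputStream out, String key, int value) {
        PrintWriter pw = new PrintWriter(new OutputStreamWriter(out, StandardCharsets.UTF_8));
        pw.print(key);
        pw.print('=');
        pw.print(value); // print() converts the int to its text form
        pw.print('\n');  // explicit '\n'; println() would use the platform line separator
        pw.flush();      // push buffered characters through to the byte stream
    }

    public static void main(String[] args) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writePair(out, "id", 42);
        System.out.print(new String(out.toByteArray(), StandardCharsets.UTF_8)); // prints "id=42" and a newline
    }
}
```

Note that PrintWriter does not throw IOException from its print methods; checkError() reports whether a write failed.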