I have a FileInputStream. I'd like to read character-oriented, linewise data from it, until I find a particular delimiter. Then I'd like to pass the FileInputStream, with the current position set immediately after the end of the delimiter line, to a library that needs an InputStream.
I can use a BufferedReader to walk through the file a line at a time, and everything works great:
BufferedReader br = new BufferedReader(new InputStreamReader(myFileStream));
However, this leaves the underlying file stream at a non-deterministic position: the BufferedReader had to look ahead, and I don't know how far, and AFAICT there's no way to tell the BufferedReader to rewind the underlying stream to just after the last-returned line.
Is this the best solution? It seems crazy to have a ReaderInputStream(BufferedReader(InputStreamReader(FileInputStream))) but it's the only way I've seen to avoid rolling my own. I'd really like to avoid writing my own entire stream-that-reads-lines implementation if at all possible.
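The read-ahead described in the question is easy to observe. The following is only a minimal sketch: an in-memory ByteArrayInputStream stands in for the file, and the names ReadAheadDemo and bytesLeftAfterFirstLine are made up for illustration.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

final class ReadAheadDemo {
    /** Reads one line through a BufferedReader, then reports how many bytes the raw stream still holds. */
    static int bytesLeftAfterFirstLine(byte[] data) throws IOException {
        InputStream raw = new ByteArrayInputStream(data); // stand-in for the FileInputStream
        BufferedReader br = new BufferedReader(new InputStreamReader(raw, StandardCharsets.US_ASCII));
        br.readLine();          // hands back only the first line...
        return raw.available(); // ...but the reader has already pulled more bytes into its buffers
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "header\n--delimiter--\npayload".getBytes(StandardCharsets.US_ASCII);
        System.out.println("total bytes: " + data.length);
        System.out.println("left in raw stream after one readLine(): " + bytesLeftAfterFirstLine(data));
    }
}
```

If no read-ahead happened, everything after the first line would still be waiting in the raw stream. With an input this small, the reader usually drains all of it on the first call.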
You cannot unbuffer a buffered reader. You have to use the same wrapper for the life of the application. In your situation I would use
DataInputStream dis = new DataInputStream(new BufferedInputStream(new FileInputStream(file)));
String line = dis.readLine();
While DataInputStream.readLine() is deprecated, it could work for you if you are careful. Otherwise your only option is to read the bytes yourself and parse the text using the required encoding.
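Reading the bytes yourself could look roughly like the sketch below. It assumes an encoding in which the byte 0x0A only ever means a newline (US-ASCII, ISO-8859-1, UTF-8, but not UTF-16), and it treats "\n" and "\r\n" as line endings. RawLineReader is a hypothetical helper, not a JDK class.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class RawLineReader {
    /**
     * Reads bytes up to and including the next '\n' and returns the line without its
     * terminator, or null at end of stream. It never reads past the '\n', so the stream
     * is left positioned at the first byte of the following line.
     */
    static String readLine(InputStream in, Charset charset) throws IOException {
        int b = in.read();
        if (b == -1) {
            return null; // nothing left to read
        }
        ByteArrayOutputStream line = new ByteArrayOutputStream();
        while (b != -1 && b != '\n') {
            line.write(b);
            b = in.read();
        }
        byte[] bytes = line.toByteArray();
        int len = bytes.length;
        if (len > 0 && bytes[len - 1] == '\r') {
            len--; // drop the CR of a CRLF pair
        }
        return new String(bytes, 0, len, charset);
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream(
                "header\r\n--END--\nbinary payload".getBytes(StandardCharsets.US_ASCII));
        String line;
        while ((line = readLine(in, StandardCharsets.US_ASCII)) != null && !line.equals("--END--")) {
            System.out.println("skipped: " + line);
        }
        // 'in' can now be handed to the library; it starts right after the delimiter line.
        System.out.println("next byte: " + (char) in.read());
    }
}
```

Single-byte reads on a bare FileInputStream are slow, so for long headers it may be worth wrapping the file in a BufferedInputStream once and passing that same wrapper on to the library.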
Why do I always need to chain FileReader to other readers like BufferedReader or Scanner?
Why can't I use just a FileReader, since it is a convenience class for reading character files?
I'm just practicing and I do not care about performance or functionality. Yet FileReader has a method public int read() that returns an int. How can I use that integer as a char?
BufferedReader is mainly used because it is more efficient than a bare FileReader. A FileReader reads characters from a file, whereas a BufferedReader wraps the FileReader and buffers its input (hence the name), so most read() calls are served from memory instead of going back to the file. That is why passing a FileReader to a BufferedReader gives you a more efficient way of reading.
But as you asked in the question... It is perfectly fine using FileReader as long as you are okay dealing with what it provides as functionality.
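On the question of the int returned by read(): it is an int so that -1 can signal end of stream, and any other value can be cast to char. A minimal sketch follows. A StringReader stands in for the FileReader so the example runs without a file, and ReadCharsDemo is a made-up name.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

final class ReadCharsDemo {
    /** Drains a Reader one character at a time using read(). */
    static String readAll(Reader reader) throws IOException {
        StringBuilder sb = new StringBuilder();
        int c;
        while ((c = reader.read()) != -1) { // -1 means end of stream
            sb.append((char) c);            // any other value is a char in the range 0..65535
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // With a real file this would be: new FileReader("some-file.txt")
        try (Reader reader = new StringReader("hello")) {
            System.out.println(readAll(reader));
        }
    }
}
```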
I am switching to Java from C++ and am now going through some of the documentation on Java IO. If I want to make a buffered character stream from an unbuffered byte stream, I can do this in two ways:
Reader input1 = new BufferedReader(new InputStreamReader(new FileInputStream("Xanadu.txt")));
and
Reader input2 = new InputStreamReader(new BufferedInputStream(new FileInputStream("Xanadu.txt")));
So I can make it a character stream and then buffer it, or vice versa.
What is the difference between them and which is better?
Functionally, there is no difference. The two versions will behave the same way.
There is likely to be a difference in performance, with the first version likely to be a bit faster than the second version when you read characters from the Reader one at a time.
In the first version, an entire buffer full of data will be converted from bytes to chars in a single operation. Then each read() call on the Reader will fetch a character directly from the character buffer.
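In the second version, each read() call on the Reader goes through the InputStreamReader's decoder, which may cause one or more bytes to be read from the input stream and converted. There is no large character buffer in front of it, which is why the InputStreamReader documentation suggests wrapping it in a BufferedReader for efficiency.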
If I were going to implement this (precise) functionality, I would do it like this:
Reader input = new BufferedReader(new FileReader("Xanadu.txt"));
and let FileReader deal with the bytes-to-characters decoding under the hood.
There is a case for using an InputStreamReader, but only if you need to specify the character set for the bytes-to-characters conversion explicitly.
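For the explicit-charset case, a sketch along these lines could work. It keeps the BufferedReader on the outside, as in the first version above. CharsetReaderDemo and open are illustrative names, and UTF-8 is just an assumed encoding.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class CharsetReaderDemo {
    /** Wraps a byte stream in a buffered reader that decodes with the given charset. */
    static BufferedReader open(InputStream in, Charset charset) {
        return new BufferedReader(new InputStreamReader(in, charset));
    }

    public static void main(String[] args) throws IOException {
        // In-memory bytes stand in for new FileInputStream("Xanadu.txt").
        byte[] utf8 = "caf\u00e9\n".getBytes(StandardCharsets.UTF_8);
        try (BufferedReader reader = open(new ByteArrayInputStream(utf8), StandardCharsets.UTF_8)) {
            System.out.println(reader.readLine());
        }
    }
}
```

FileReader without a charset argument decodes with the platform default charset, so the explicit form matters whenever the file's encoding may differ from that default.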
I have an InputStream that is returning, for example:
<?xml version='1.0' ?><env:Envelope xmlns:env="http://schemas.xmlsoap.org/soap/envelope/"><bbs:rule xmlns:bbs="http://com.foo/bbs">
I then pass the stream to a method that returns a byte array.
I'd like to substitute "com.foo" with something else, like "org.bar" before I pass to the byte[] method.
What is a good way to do that?
If you have a byte array, you can transform it into a String. Pay attention to the encoding; in the example I use UTF-8. I think this is a simple way to do that:
String newString = new String(byteArray, "utf-8");
newString = newString.replace("com.foo", "org.bar");
return newString.getBytes("utf-8");
One way is to wrap your InputStream in your own FilterInputStream subclass that does the transformation on the fly. It will have to be a look-ahead stream that checks every "c" character to see if it is followed by "om.foo" and, if so, makes the substitution. Overriding read() is the core of it, but note that FilterInputStream forwards read(byte[], int, int) directly to the wrapped stream, so that method needs to be overridden as well.
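A look-ahead filter of this kind might be sketched as follows. This is an illustration under stated assumptions, not a hardened implementation: ReplacingInputStream is a hypothetical class, it matches raw bytes (so both strings must be encoded the same way as the stream), and it leaves available(), skip() and mark() unadapted.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.Iterator;
import java.util.Objects;

/** Hypothetical filter that replaces every occurrence of one byte sequence with another. */
final class ReplacingInputStream extends FilterInputStream {
    private final byte[] pattern;
    private final byte[] replacement;
    private final ArrayDeque<Integer> lookahead = new ArrayDeque<>(); // source bytes not yet emitted
    private final ArrayDeque<Integer> pending = new ArrayDeque<>();   // replacement bytes to emit next

    ReplacingInputStream(InputStream in, byte[] pattern, byte[] replacement) {
        super(in);
        if (pattern.length == 0) {
            throw new IllegalArgumentException("pattern must not be empty");
        }
        this.pattern = pattern.clone();
        this.replacement = replacement.clone();
    }

    @Override
    public int read() throws IOException {
        while (true) {
            if (!pending.isEmpty()) {
                return pending.remove();
            }
            // Keep up to pattern.length bytes of look-ahead.
            while (lookahead.size() < pattern.length) {
                int b = in.read();
                if (b == -1) {
                    break;
                }
                lookahead.add(b);
            }
            if (lookahead.isEmpty()) {
                return -1;
            }
            if (!matchesPattern()) {
                return lookahead.remove(); // no match here: emit the oldest byte
            }
            lookahead.clear(); // swallow the match and queue the replacement
            for (byte r : replacement) {
                pending.add(r & 0xFF);
            }
        }
    }

    private boolean matchesPattern() {
        if (lookahead.size() < pattern.length) {
            return false;
        }
        Iterator<Integer> it = lookahead.iterator();
        for (byte p : pattern) {
            if (it.next() != (p & 0xFF)) {
                return false;
            }
        }
        return true;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        Objects.checkFromIndexSize(off, len, buf.length);
        if (len == 0) {
            return 0;
        }
        int n = 0;
        int b;
        while (n < len && (b = read()) != -1) {
            buf[off + n++] = (byte) b;
        }
        return n == 0 ? -1 : n;
    }

    @Override
    public boolean markSupported() {
        return false;
    }

    public static void main(String[] args) throws IOException {
        byte[] xml = "<bbs:rule xmlns:bbs=\"http://com.foo/bbs\">".getBytes(StandardCharsets.UTF_8);
        try (InputStream in = new ReplacingInputStream(new ByteArrayInputStream(xml),
                "com.foo".getBytes(StandardCharsets.UTF_8), "org.bar".getBytes(StandardCharsets.UTF_8))) {
            System.out.println(new String(in.readAllBytes(), StandardCharsets.UTF_8));
        }
    }
}
```

The bulk read simply loops over read(), which keeps the sketch short at the cost of speed. Wrapping the source in a BufferedInputStream before passing it in would help.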
A stream reads/writes bytes. Trying to replace text in a binary representation is asking for trouble. So the first thing to do would be wrapping this stream into a Reader (like InputStreamReader) which will take care of translating the binary data into character information for you. You'll have to know the encoding of your streamed data, however, to make sure it is interpreted correctly. For example, UTF-8 or ISO-8859-1.
Once you have your textual data, you can think of how to replace parts of it. One way to do this is using regular expressions. However, this means you'll first have to read the entire stream into a string, do the substitution and then return the byte array. For large amounts of data, this might be inefficient.
Since you're dealing with XML data, you could make use of a higher-level approach and parse the XML in some way that allows you to process the contents without having to store them entirely in an intermediate format. A SAXParser with your own ContentHandler would do the trick. As events arrive, simply write them out again but with the proper alterations. Another approach would be an XSLT transformation with some extension function magic.
Wasn't there supposed to be some support for stream manipulations like this in java.nio? Or was this planned for an upcoming Java version?
This may not be the most efficient way to do it, but it certainly works.
InputStream is = // input;
ByteArrayOutputStream baos = new ByteArrayOutputStream();
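BufferedReader reader = new BufferedReader(new InputStreamReader(is));
BufferedWriter writer = new BufferedWriter(new OutputStreamWriter(baos));
String line;
while ((line = reader.readLine()) != null) {
    // replace() returns the line unchanged when there is no match
    writer.write(line.replace("com.foo", "org.bar"));
    // readLine() strips the line terminator, so write one back
    writer.newLine();
}
// Flush the BufferedWriter, otherwise baos may still be missing data
writer.flush();
return baos.toByteArray();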
Here's the example:
http://www.xyzws.com/Javafaq/how-to-use-httpurlconnection-post-data-to-web-server/139
Why does it feel so strange?
You are actually looking at two different kinds of stream.
The Writer / Reader classes and subclasses are for reading / writing character-based data. They take care of conversion between Java's internal UTF-16 representation of text and the character encoding used outside. The BufferedReader class adds a readLine() method that understands end-of-line markers.
The InputStream / OutputStream classes and subclasses are for reading and writing byte-based data without any assumptions about character encodings, or that the data is text. Since it eschews these assumptions, "line" has no clear meaning, and hence the BufferedInputStream class does not have a readLine() method.
(Incidentally, DataInputStream does have a readLine() method, but it is deprecated because it is broken. It makes assumptions about encodings, etc. that are invalid on some platforms!)
In your particular example, the code is asymmetric because the HTTP service it is designed to talk to is asymmetric. The service expects a request with binary content (encoded using the DataOutputStream wrapper), and delivers a response with text content. This is not particularly unusual ... or wrong.
The strangeness of writing the "input" to a server to an "output" is merely a matter of perspective. In simple terms, an OutputStream / Writer is something you "write to" (i.e. a data sink) and an InputStream or Reader is something you "read from" (i.e. a data source). That's just the way it is, and it is not strange at all once you get used to it.
Actually, we don't. There is no method readLine defined in InputStream. It also operates on bytes only, just like OutputStream.
In the code you referenced, readLine is called on a BufferedReader.
Reader and Writer are for text data and operate on characters (and Strings), InputStream and OutputStream work with binary data (raw bytes). To convert between the two (i.e. wrap an InputStream into a Reader or an OutputStream into a Writer), you need to choose a character set.
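Why the character set has to be chosen can be seen from a small sketch: the same one-character string becomes a different number of bytes depending on the encoding. CharsetDemo and encodedLength are names invented for this illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class CharsetDemo {
    /** Number of bytes the text occupies once encoded with the given charset. */
    static int encodedLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        String text = "\u00e9"; // e with acute accent, a single Java char
        System.out.println(encodedLength(text, StandardCharsets.UTF_8));      // 2
        System.out.println(encodedLength(text, StandardCharsets.ISO_8859_1)); // 1

        // Decoding with the wrong charset silently produces different text.
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        String misread = new String(utf8, StandardCharsets.ISO_8859_1);
        System.out.println(misread.length()); // 2
    }
}
```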
I'm feeling strange why not read out from OutputStream but from InputStream
That's just a matter of perspective.
An OutputStream or a Writer is where you write your output to.
An InputStream or a Reader is where you read your input from.
Of course, somewhere, on the other end of the stream, someone might treat your OutputStream as their InputStream ...
readLine does exactly what the name implies -- it reads a line of text until the end-of-line marker.
When you write to a stream, you already know where your line ends.
If you are looking for a way to write to streams in a more intuitive way, try PrintWriter.
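As a rough illustration of the PrintWriter suggestion, the sketch below writes lines into an OutputStream through an explicit charset. A ByteArrayOutputStream stands in for the real output stream, UTF-8 is an assumed encoding, and PrintWriterDemo and writeLines are hypothetical names.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;

final class PrintWriterDemo {
    /** Writes each argument as one line and returns the resulting bytes. */
    static byte[] writeLines(String... lines) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (PrintWriter pw = new PrintWriter(new OutputStreamWriter(sink, StandardCharsets.UTF_8))) {
            for (String line : lines) {
                pw.println(line); // appends the platform line separator
            }
        } // close() flushes the writer into the sink
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        byte[] bytes = writeLines("first", "second");
        System.out.print(new String(bytes, StandardCharsets.UTF_8));
    }
}
```

One thing to keep in mind is that PrintWriter does not throw IOException. Call checkError() if you need to know whether a write failed.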
I want to read a file one character at a time and write the contents of the first file to another file, one character at a time.
I have asked this question before but didn't get a satisfactory answer.
I am able to read the file and print it to standard output, but I can't write the characters I read to a file.
It may have been useful to link to your previous question to see what was unsatisfactory. Here's a basic example:
public static void copy(File src, File dest) throws IOException {
    Reader reader = new FileReader(src);
    Writer writer = new FileWriter(dest);
    int oneChar = 0;
    while ((oneChar = reader.read()) != -1) {
        writer.write(oneChar);
    }
    writer.close();
    reader.close();
}
Additional things to consider:
wrap reader/writer with BufferedReader/Writer for better performance
the close calls should be in a finally block to prevent resource leaks
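Both points can be covered in one sketch. Note that it swaps the explicit finally block for try-with-resources (Java 7+), which closes both streams even when an exception is thrown. BufferedCopy is an illustrative name.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.io.Reader;
import java.io.Writer;

final class BufferedCopy {
    /** Copies src to dest one character at a time, using the default charset. */
    static void copy(File src, File dest) throws IOException {
        // Resources declared here are closed automatically, in reverse order.
        try (Reader reader = new BufferedReader(new FileReader(src));
             Writer writer = new BufferedWriter(new FileWriter(dest))) {
            int oneChar;
            while ((oneChar = reader.read()) != -1) {
                writer.write(oneChar); // served from and into in-memory buffers
            }
        }
    }
}
```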
You can read characters from a file by using a FileReader (there's a read method that lets you do it one character at a time if you like), and you can write characters to a file using a FileWriter (there's a one-character-at-a-time write method). There are also methods to do blocks of characters rather than one character at a time, but you didn't seem to want those, so...
That's great if you're not worried about setting the character encoding. If you are, look at using FileInputStream and FileOutputStream with InputStreamReader and OutputStreamWriter wrappers (respectively). The FileInputStream and FileOutputStream classes work with bytes, and the stream reader/writers convert between bytes and characters according to the encoding you choose.
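With explicit encodings, the copy might look like the sketch below. EncodingCopy and its parameter names are made up for illustration, and the two charsets are whatever your files actually use.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.Charset;

final class EncodingCopy {
    /** Copies src to dest character by character, decoding and encoding with the given charsets. */
    static void copy(File src, Charset srcCharset, File dest, Charset destCharset) throws IOException {
        try (Reader reader = new BufferedReader(
                     new InputStreamReader(new FileInputStream(src), srcCharset));
             Writer writer = new BufferedWriter(
                     new OutputStreamWriter(new FileOutputStream(dest), destCharset))) {
            int c;
            while ((c = reader.read()) != -1) {
                writer.write(c);
            }
        }
    }
}
```

For example, calling copy(in, StandardCharsets.ISO_8859_1, out, StandardCharsets.UTF_8) would transcode a Latin-1 file to UTF-8 while copying it.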