Is there a -1 at the end of an InputStream?

I am quite new to programming.
While reading the article Byte Streams in "Basic I/O" in The Java Tutorials by Oracle, I came across this code:
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
public class CopyBytes {
public static void main(String[] args) throws IOException {
FileInputStream in = null;
FileOutputStream out = null;
try {
in = new FileInputStream("xanadu.txt");
out = new FileOutputStream("outagain.txt");
int c;
while ((c = in.read()) != -1) {
out.write(c);
}
} finally {
if (in != null) {
in.close();
}
if (out != null) {
out.close();
}
}
}
}
I do not understand the condition of the while loop. Is -1 some kind of sign that the message is over? Does the FileOutputStream add it at the end?
Thank you all for your attention. I hope you have a wonderful New Year's Eve.

To add to the other answers, the tool for figuring this out is the documentation.
For the 'read' method of FileInputStream:
public int read()
throws IOException
Reads a byte of data from this input stream. This method blocks if no input is yet available.
Specified by: read in class InputStream
Returns: the next byte of data, or -1 if the end of the file is reached.
This is definitive.
All standard Java classes are documented in this manner. In case of uncertainty, a quick check will reassure you.

EDIT: The Javadoc for EOFException says: "Signals that an end of file or end of stream has been reached unexpectedly during input.
This exception is mainly used by data input streams, which generally expect a binary file in a specific format, and for which an end of stream is an unusual condition. Most other input streams return a special value on end of stream."
FileInputStream is one of those "other" streams: read() does not throw EOFException at the end of the file, it returns the special value -1. This works because read() returns each byte as an int in the range 0 to 255, so -1 can never be mistaken for data. Nothing stores a -1 in the file, and FileOutputStream does not append one. Checking while ((c = in.read()) >= 0) {} is equivalent, so != -1 is correct and will work.
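To see the return values concretely, here is a minimal sketch that records everything read() hands back for a three-byte stream. The class and method names are made up for illustration, and a ByteArrayInputStream stands in for the file so the example needs nothing on disk.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not part of the tutorial code.
class EndOfStreamDemo {
    /** Collects every value returned by read(), including the final -1. */
    static List<Integer> readValues(InputStream in) throws IOException {
        List<Integer> values = new ArrayList<>();
        int c;
        do {
            c = in.read();      // 0..255 for data, -1 once the stream is exhausted
            values.add(c);
        } while (c != -1);
        return values;
    }

    public static void main(String[] args) throws IOException {
        // 0xFF is stored as the byte -1, but read() reports it as 255
        byte[] data = {0x41, (byte) 0xFF, 0x00};
        System.out.println(readValues(new ByteArrayInputStream(data)));
        // prints [65, 255, 0, -1]
    }
}
```

The byte 0xFF comes back as 255, not -1, which is why read() returns an int rather than a byte: the extra range leaves room for an end-of-stream marker that cannot collide with data.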

Related

FileInputStream.read() vs FileOutputStream.write()

I tried to make a simple program that copies a file. According to the documentation, FileInputStream.read() and FileOutputStream.write() seemed similar to me. They read and write an int, from and to a file, respectively. So then, why does the following not work?
import java.io.*;
class CopyFile {
public static void main(String[] args) throws IOException {
FileInputStream original = new FileInputStream(args[0]);
FileOutputStream copy = new FileOutputStream(args[1]);
while (original.read() != -1) {
copy.write(original.read());
}
}
}
The resulting file is totally different from the original. Why isn't this working as I expected?

Look at your code:
while (original.read() != -1) {
copy.write(original.read());
}
You read one byte to test whether it's the end of the file, then you read another byte to write.
Hence the byte you read in the while condition is skipped, and only every second byte reaches the copy.
The correct way is:
int b;
while ((b=original.read()) != -1) {
copy.write(b);
}
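The effect is easy to reproduce in memory. The sketch below (hypothetical class and method names, byte arrays instead of files) runs both loops over the same input so the difference is visible.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class comparing the two loops from this answer.
class DoubleReadDemo {
    /** The loop from the question: two read() calls per iteration. */
    static byte[] buggyCopy(byte[] src) throws IOException {
        InputStream in = new ByteArrayInputStream(src);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while (in.read() != -1) {     // consumes a byte and throws it away
            out.write(in.read());     // writes the byte after it (or -1 at the end)
        }
        return out.toByteArray();
    }

    /** The corrected loop: one read() per iteration, result kept in a variable. */
    static byte[] correctCopy(byte[] src) throws IOException {
        InputStream in = new ByteArrayInputStream(src);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != -1) {
            out.write(b);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] src = "abcd".getBytes(StandardCharsets.US_ASCII);
        System.out.println(new String(buggyCopy(src), StandardCharsets.US_ASCII));   // prints bd
        System.out.println(new String(correctCopy(src), StandardCharsets.US_ASCII)); // prints abcd
    }
}
```

With an odd number of input bytes the buggy loop also writes the -1 from the final read() as a stray 0xFF byte, since write(int) only keeps the low eight bits.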

BufferedReader#readLine() hangs even though a line has been read

Updated Question (to be more clear):
Is there a way to design the InputStream below such that BufferedReader#readLine() will return after reading the new line character?
In the example below, readLine() hangs forever even though the reader has read a new line because (presumably) it is waiting for the buffer to fill up. Ideally, readLine() would return after reading the new line character.
I know something like what I want is possible, because when you read from System.in using BufferedReader#readLine(), it does not wait for the buffer to fill up before returning.
import java.io.*;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
public class Example {
private static final class MyInputStream extends InputStream {
public final BlockingQueue<String> lines = new LinkedBlockingQueue<>();
private InputStream current = null;
@Override
public int read() throws IOException {
try {
if(current == null || current.available() == 0)
current = new ByteArrayInputStream(lines.take().getBytes("UTF-8"));
return current.read();
}
catch(InterruptedException ex) {
return -1;
}
}
}
public static void main(String[] args) throws Exception {
MyInputStream myin = new MyInputStream();
myin.lines.offer("a line\n");
BufferedReader in = new BufferedReader(new InputStreamReader(myin));
System.out.println(in.readLine());
}
}
Also, if there is a better way to send a string to an InputStream, I'm open to suggestions.
Accepted Solution:
Based on a suggestion from Sotirios Delimanolis in one of the comments on his solution, I'm just going to use a PipedInputStream instead. I've coupled it to a PipedOutputStream, and BufferedReader#readLine() returns immediately as long as I call PipedOutputStream#flush() after sending a string that contains a new line character.
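For reference, a minimal sketch of that piped setup might look like the following. The class and method names are invented for the example. Writing and reading on the same thread is only safe here because the payload is far smaller than the pipe's internal buffer; in real use the writer would normally run on a separate thread.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class for the PipedInputStream/PipedOutputStream approach.
class PipedLineDemo {
    /** Pushes text into a pipe and reads the first line back out. */
    static String sendAndReadLine(String text) throws IOException {
        PipedOutputStream sink = new PipedOutputStream();
        PipedInputStream source = new PipedInputStream(sink); // connects the two ends
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(source, StandardCharsets.UTF_8))) {
            sink.write(text.getBytes(StandardCharsets.UTF_8));
            sink.flush(); // wakes up any reader waiting on the pipe
            // The pipe hands over only the bytes it currently holds,
            // so readLine() returns as soon as it sees the '\n'.
            return reader.readLine();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(sendAndReadLine("a line\n")); // prints a line
    }
}
```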

After updated question, the only way to get the BufferedReader to stop reading after the new line character is to set the buffer size to 1, which completely removes the need for a BufferedReader.
You'll have to write your own implementation.
A BufferedReader reads more bytes than required. In your case, that means it will read further than the new line character. For example, with the Oracle JVM, it will attempt to read 8192 bytes. Through your inheritance hierarchy, this
System.out.println(in.readLine());
will attempt to invoke your read() method 8192 times.
The first 7 calls will return a value, one for each byte of "a line\n". The next call will reach
if(current == null || current.available() == 0)
current = new ByteArrayInputStream(lines.take().getBytes("UTF-8"));
and current.available() will return 0 since the ByteArrayInputStream has been fully consumed. It will then attempt to take from the BlockingQueue and block indefinitely.

Also, if there is a better way to send a string to an InputStream, I'm open to suggestions.
Well, instead of an InputStream you can try a BufferedReader, with something that looks like this:
// needs java.io.* and java.util.*
public List<String> readLines(String path) throws IOException {
    List<String> lines = new ArrayList<>();
    // try-with-resources closes the reader even if reading fails
    try (BufferedReader br = new BufferedReader(new FileReader(path))) {
        String line;
        // call readLine() exactly once per iteration; null means end of stream
        while ((line = br.readLine()) != null) {
            lines.add(line);
        }
    }
    return lines;
}

How to read a string stream in Java discarding illegal characters?

I have to parse a stream of bytes coming from a TCP connection that's supposed to only give me printable characters, but in reality that's not always the case. I've seen some binary zeros in there, at the start and end of some fields. I have no control over the source of the data and I need to process the "dirty" lines. If I could just filter out the invalid characters, that'd be OK. The relevant code is as such:
srvr = new ServerSocket(myport);
skt = srvr.accept();
// Tried with no encoding argument too
in = new Scanner(skt.getInputStream(), "ISO-8859-1");
in.useDelimiter("[\r\n]");
for (;;) {
String myline = in.next();
if (!myline.equals(""))
ProcessRecord(myline);
}
I get an exception at every line that has "dirt." What's a good way to filter out invalid characters while still being able to obtain the rest of the string?

You can wrap your InputStream in an InputStreamReader that uses a CharsetDecoder configured to ignore decoding errors:
//let's create a decoder for ISO-8859-1 which will just ignore invalid data
CharsetDecoder decoder=Charset.forName("ISO-8859-1").newDecoder();
decoder.onMalformedInput(CodingErrorAction.IGNORE);
decoder.onUnmappableCharacter(CodingErrorAction.IGNORE);
//let's wrap the input stream in a reader that uses the decoder
InputStream is=skt.getInputStream();
in = new Scanner(new InputStreamReader(is, decoder));
CodingErrorAction only offers IGNORE, REPLACE and REPORT; with REPLACE you can choose the substitute text via decoder.replaceWith(...). Note that ISO-8859-1 maps every byte value to a character, so with this charset the decoder never reports an error and a zero byte simply decodes to '\u0000'.
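A self-contained sketch of lenient decoding follows. It uses US-ASCII instead of ISO-8859-1 purely as an assumption for the demo, so that there are byte values the charset cannot decode; the class and method names are hypothetical, and a ByteArrayInputStream replaces the socket.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStreamReader;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;

// Hypothetical demo class: a decoder that silently drops undecodable bytes.
class LenientDecodeDemo {
    /** Returns the first line-delimited token, skipping bytes US-ASCII cannot decode. */
    static String firstToken(byte[] raw) {
        CharsetDecoder decoder = StandardCharsets.US_ASCII.newDecoder()
                .onMalformedInput(CodingErrorAction.IGNORE)
                .onUnmappableCharacter(CodingErrorAction.IGNORE);
        try (Scanner in = new Scanner(
                new InputStreamReader(new ByteArrayInputStream(raw), decoder))) {
            in.useDelimiter("[\r\n]");
            return in.next();
        }
    }

    public static void main(String[] args) {
        // 0xFF is not a valid US-ASCII byte, so the decoder leaves it out
        byte[] raw = {'A', (byte) 0xFF, 'B', '\n'};
        System.out.println(firstToken(raw)); // prints AB
    }
}
```

A zero byte is valid in both US-ASCII and ISO-8859-1, so this technique alone does not strip binary zeros; those have to be filtered separately.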

The purest solution is to filter the InputStream (binary bytes-level I/O).
in = new Scanner(new DirtFilterInputStream(skt.getInputStream()), "Windows-1252");
public class DirtFilterInputStream extends InputStream {
private InputStream in;
public DirtFilterInputStream(InputStream in) {
this.in = in;
}
@Override
public int read() throws IOException {
int ch = in.read();
// Skip zero bytes; a loop avoids deep recursion on long runs of zeros
while (ch == 0) {
ch = in.read();
}
return ch;
}
}
(You need to override all methods, and delegate to the original stream.)
Windows-1252 is Windows Latin-1, an extended version of ISO-8859-1 (Latin 1) that assigns printable characters to the range 0x80 - 0x9F.
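A fuller sketch of the filtering stream is shown below, assuming the only "dirt" is zero bytes. It extends FilterInputStream so that the remaining methods delegate automatically, and it also overrides the bulk read, which is the method readers and scanners typically call. The class name is hypothetical, and the clean helper relies on readAllBytes(), which needs Java 9 or later.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical filter that drops NUL bytes from the wrapped stream.
class NulStrippingInputStream extends FilterInputStream {
    NulStrippingInputStream(InputStream in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        int b;
        do {
            b = in.read();
        } while (b == 0);            // skip zeros, stop on data or -1
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        if (len == 0) {
            return 0;
        }
        while (true) {
            int n = in.read(buf, off, len);
            if (n <= 0) {
                return n;            // -1 signals end of stream
            }
            int kept = 0;
            for (int i = 0; i < n; i++) {
                byte b = buf[off + i];
                if (b != 0) {
                    buf[off + kept++] = b;   // compact the non-zero bytes
                }
            }
            if (kept > 0) {
                return kept;
            }
            // The whole chunk was zeros: read again instead of returning 0.
        }
    }

    /** Convenience helper for the demo: decode the filtered bytes as ISO-8859-1. */
    static String clean(byte[] raw) throws IOException {
        try (InputStream s = new NulStrippingInputStream(new ByteArrayInputStream(raw))) {
            return new String(s.readAllBytes(), StandardCharsets.ISO_8859_1);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(clean(new byte[]{0, 'A', 0, 0, 'B', 0})); // prints AB
    }
}
```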

I was completely off base. I get the "dirty" strings no problem (and NO, I have NO option to clean up the data source, it's from a closed system and I have to just grin and deal with it) but trying to store them in PostgreSQL is what gets me the exception. That means I have total freedom to clean it up before processing.

Appending characters onto string

I am trying to write code to import all characters (including spaces) of a given text file into a single string for analysis. I am using the classes Java provides for this, and ran across a strange error while putting it together. I'm not really familiar with coding at all, and would appreciate clarification. What happens is that in the code below, when I write
text.append(ch);
I get the error "Default constructor cannot handle exception thrown by X, must define explicit constructor";
and when I write text.append('ch');
the above errors go away and my 'ch' line just gives an "invalid char const." error, fixable by removing the quotes.
So I take it I have to write an explicit constructor for the Java classes I am using. Is this necessary? As I have no idea how to do so, it would be nice to have a workaround.
import java.io.FileInputStream;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.lang.StringBuilder;
public class TextReader //cannot place inputs/outputs of string on this line
{
StringBuilder text = new StringBuilder();
//StringBuilder google
//google end of file check java
InputStream in = new FileInputStream("charfile.txt");
Reader r = new InputStreamReader(in, "US-ASCII");
int intch;
{
while ((intch = r.read()) != -1)
{
char ch = (char) intch;
// ...
text.append(ch); //if I make this a 'ch', the errors above go away, what's the problem?
}
}
}

You need to place your statements in a code block, e.g. main method.
public static void main(String[] args) throws IOException {
StringBuilder text = new StringBuilder();
// StringBuilder google
// google end of file check java
InputStream in = new FileInputStream("charfile.txt");
Reader r = new InputStreamReader(in, "US-ASCII");
int intch;
{
while ((intch = r.read()) != -1) {
char ch = (char) intch;
// ...
text.append(ch);
}
}
}
The statements
InputStream in = new FileInputStream("charfile.txt");
Reader r = new InputStreamReader(in, "US-ASCII");
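can both throw checked exceptions. Field initializers in the class body may only do that if every constructor declares the exception, and the implicit default constructor declares none.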

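I/O calls in Java throw checked exceptions, so they must either be wrapped in a try/catch block or declared with a throws clause; otherwise the code will not compile. In the code above, you would also have to move the statements into an explicitly defined constructor that declares the exception:
TextReader() throws IOException
{
//----------- Your Code here.
}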

When you write text.append(ch);, the error should not come from this line. The compiler is more likely complaining about another issue, such as checked exceptions that are neither handled nor declared. For example:
Handled:
try{
while ((intch = r.read()) != -1){
char ch = (char) intch;
// ...
text.append(ch);
}
}catch(IOException ioex){
ioex.printStackTrace();
}
Thrown:
Change your method declaration to include a throws clause:
public static void main(String[] args) throws IOException{
When you write text.append('ch');, the argument is no longer a variable or a valid single-character literal, so you should get a compilation error at that line. You can write something like text.append('c');, since 'c' is a single character.
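Putting the pieces together, a working version could look like the sketch below. The class and method names are hypothetical; the file is passed in as a parameter rather than hard-coded, and try-with-resources is used so the reader is closed even when an exception is thrown.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;

// Hypothetical rewrite of the question's TextReader with the I/O inside a method.
class TextReaderDemo {
    /** Reads every character of the file, spaces and newlines included, into one String. */
    static String readAll(Path file) throws IOException {
        StringBuilder text = new StringBuilder();
        // The checked IOException is declared on the method, so this compiles.
        try (Reader r = new InputStreamReader(
                new FileInputStream(file.toFile()), StandardCharsets.US_ASCII)) {
            int intch;
            while ((intch = r.read()) != -1) {
                text.append((char) intch);   // append the variable, not a literal
            }
        }
        return text.toString();
    }
}
```

A caller would use it as String text = TextReaderDemo.readAll(Paths.get("charfile.txt")); and either catch IOException or declare it in turn.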

Java: Issue with available() method of BufferedInputStream

I'm dealing with the following code that is used to split a large file into a set of smaller files:
FileInputStream input = new FileInputStream(this.fileToSplit);
BufferedInputStream iBuff = new BufferedInputStream(input);
int i = 0;
FileOutputStream output = new FileOutputStream(fileArr[i]);
BufferedOutputStream oBuff = new BufferedOutputStream(output);
int buffSize = 8192;
byte[] buffer = new byte[buffSize];
while (true) {
if (iBuff.available() < buffSize) {
byte[] newBuff = new byte[iBuff.available()];
iBuff.read(newBuff);
oBuff.write(newBuff);
oBuff.flush();
oBuff.close();
break;
}
int r = iBuff.read(buffer);
if (fileArr[i].length() >= this.partSize) {
oBuff.flush();
oBuff.close();
++i;
output = new FileOutputStream(fileArr[i]);
oBuff = new BufferedOutputStream(output);
}
oBuff.write(buffer);
}
} catch (Exception e) {
e.printStackTrace();
}
This is the weird behavior I'm seeing: when I run this code using a 3GB file, the initial iBuff.available() call returns a value of approximately 2,100,000,000 and the code works fine. When I run this code on a 12GB file, the initial iBuff.available() call only returns a value of 200,000,000 (which is smaller than the split file size of 500,000,000 and causes the processing to go awry).
I'm thinking this discrepancy in behavior has something to do with the fact that this is on 32-bit Windows. I'm going to run a couple more tests on a 4.5 GB file and a 3.5 GB file. If the 3.5 GB file works and the 4.5 GB one doesn't, that will further confirm the theory that it's a 32-bit vs 64-bit issue, since 4GB would then be the threshold.

Well if you read the javadoc it quite clearly states:
Returns the number of bytes that can
be read from this input stream
without blocking (emphasis added by me)
So it's quite clear that this method does not offer what you want. Depending on the underlying InputStream, you may run into problems much earlier (e.g. a stream over the network with a server that doesn't return the file size: you'd have to read the complete file and buffer it just to return the "correct" available() count, which would take a lot of time. What if you only want to read a header?)
So the correct way to handle this is to change your parsing method so that it can handle the file in pieces. Personally, I don't see much reason to use available() here at all: just calling read() and stopping as soon as read() returns -1 should work fine. It gets slightly more complicated if you want to ensure that every file really contains blockSize bytes; just add an inner loop if that scenario is important.
int blockSize = XXX;
byte[] buffer = new byte[blockSize];
int i = 0;
int read = in.read(buffer);
while(read != -1) {
out[i++].write(buffer, 0, read);
read = in.read(buffer);
}
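The inner loop mentioned above could be sketched as follows. The names are hypothetical, and the trickle helper exists only to simulate a stream that returns fewer bytes than requested. On Java 9 and later, InputStream.readNBytes(byte[], int, int) offers the same behavior out of the box.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper that fills a whole block despite short reads.
class BlockReader {
    /** Reads until the block is full or the stream ends; returns the bytes stored (0 at EOF). */
    static int readBlock(InputStream in, byte[] block) throws IOException {
        int filled = 0;
        while (filled < block.length) {
            int n = in.read(block, filled, block.length - filled);
            if (n == -1) {
                break;               // end of stream: return what we have
            }
            filled += n;
        }
        return filled;
    }

    /** Demo-only stream that never returns more than 3 bytes per read call. */
    static InputStream trickle(byte[] data) {
        return new FilterInputStream(new ByteArrayInputStream(data)) {
            @Override
            public int read(byte[] b, int off, int len) throws IOException {
                return super.read(b, off, Math.min(len, 3));
            }
        };
    }

    public static void main(String[] args) throws IOException {
        InputStream in = trickle(new byte[10]);
        byte[] block = new byte[4];
        System.out.println(readBlock(in, block)); // prints 4
        System.out.println(readBlock(in, block)); // prints 4
        System.out.println(readBlock(in, block)); // prints 2
        System.out.println(readBlock(in, block)); // prints 0
    }
}
```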

There are few correct uses of available(), and this isn't one of them. You don't need all that junk. Memorize this:
int count;
byte[] buffer = new byte[8192]; // or more
while ((count = in.read(buffer)) > 0)
out.write(buffer, 0, count);
That's the canonical way to copy a stream in Java.
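Wrapped in a method, the same loop might look like this. The class name and the returned byte count are additions for the example. On Java 9 and later, InputStream.transferTo(OutputStream) performs the same copy.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical utility wrapping the copy loop from this answer.
class StreamCopy {
    /** Copies everything from in to out and returns the number of bytes copied. */
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int count;
        while ((count = in.read(buffer)) > 0) {
            out.write(buffer, 0, count);   // write only the bytes actually read
            total += count;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] src = new byte[20000];
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        System.out.println(copy(new ByteArrayInputStream(src), out)); // prints 20000
    }
}
```

Note that no call to available() is needed: the return value of read(buffer) says both how much data arrived and when the stream has ended.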

You should not use the InputStream.available() function at all. It is only needed in very special circumstances.
You should also not create byte arrays that are larger than 1 MB. It's a waste of memory. The commonly accepted way is to read a small block (4 kB up to 1 MB) from the source file and then store only as many bytes as you have read in the destination file. Do that until you have reached the end of the source file.

available() isn't a measure of how much is still to be read; it is an estimate of how many bytes can be read without blocking.
Also, put the close calls in finally blocks:
BufferedInputStream iBuff = new BufferedInputStream(input);
int i = 0;
try {
    int buffSize = 8192;
    int offset = 0;
    byte[] buffer = new byte[buffSize];
    while (true) {
        int len = iBuff.read(buffer, offset, buffSize - offset);
        if (len != -1) {
            offset += len;
        }
        // write a chunk when the buffer is full, or at EOF if data remains
        if (offset == buffSize || (len == -1 && offset > 0)) {
            // try-with-resources closes the output file even if write() fails
            try (BufferedOutputStream oBuff =
                     new BufferedOutputStream(new FileOutputStream(fileArr[i]))) {
                oBuff.write(buffer, 0, offset);
            }
            ++i;
            offset = 0;
        }
        if (len == -1) {
            break;
        }
    }
} finally {
    iBuff.close();
}

Here is some code that splits a file. If performance is critical to you, you can experiment with the buffer size.
package so6164853;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Formatter;
public class FileSplitter {
private static String printf(String fmt, Object... args) {
Formatter formatter = new Formatter();
formatter.format(fmt, args);
return formatter.out().toString();
}
/**
* @param outputPattern see {@link Formatter}
*/
public static void splitFile(String inputFilename, long fragmentSize, String outputPattern) throws IOException {
InputStream input = new FileInputStream(inputFilename);
try {
byte[] buffer = new byte[65536];
int outputFileNo = 0;
OutputStream output = null;
long writtenToOutput = 0;
try {
while (true) {
int bytesToRead = buffer.length;
if (bytesToRead > fragmentSize - writtenToOutput) {
bytesToRead = (int) (fragmentSize - writtenToOutput);
}
int bytesRead = input.read(buffer, 0, bytesToRead);
if (bytesRead != -1) {
if (output == null) {
String outputName = printf(outputPattern, outputFileNo);
outputFileNo++;
output = new FileOutputStream(outputName);
writtenToOutput = 0;
}
output.write(buffer, 0, bytesRead);
writtenToOutput += bytesRead;
}
if (output != null && (bytesRead == -1 || writtenToOutput == fragmentSize)) {
output.close();
output = null;
writtenToOutput = 0; // otherwise the next read would be asked for 0 bytes
}
if (bytesRead == -1) {
break;
}
}
} finally {
if (output != null) {
output.close();
}
}
} finally {
input.close();
}
}
public static void main(String[] args) throws IOException {
splitFile("d:/backup.zip", 1440 << 10, "d:/backup.zip.part%04d");
}
}
Some remarks:
Only those bytes that have actually been read from the input file are written to one of the output files.
I left out the BufferedInputStream and BufferedOutputStream since their default buffer size is only 8192 bytes, which is less than the buffer I use in the code.
As soon as I open a file, I make sure that it will be closed at the end, no matter what happens. (The finally blocks.)
The code contains only one call to input.read and only one call to output.write. This makes it easier to check for correctness.
The code for splitting a file does not catch the IOException, since it doesn't know what to do in such a case. It is just passed to the caller; maybe the caller knows how to handle it.

Both @ratchet and @Voo are correct.
As for what is happening:
The maximum int value is 2,147,483,647 (http://download.oracle.com/javase/tutorial/java/nutsandbolts/datatypes.html).
12 gigabytes is 12,884,901,888 bytes, which clearly doesn't fit in an int.
According to the API Javadoc (http://download.oracle.com/javase/6/docs/api/java/io/BufferedInputStream.html#available%28%29) and as stated by @Voo, this doesn't break the method contract at all (it just isn't what you are looking for).
