Flaws with PipedInputStream/PipedOutputStream - java

I've seen two answers on SO that claim that the PipedInputStream and PipedOutputStream classes provided by Java are flawed. But they did not elaborate on what was wrong with them. Are they really flawed, and if so in what way? I'm currently writing some code that uses them, so I'd like to know whether I'm taking a wrong turn.
One answer said:
PipedInputStream and PipedOutputStream are broken (with regards to threading). They assume each instance is bound to a particular thread. This is bizarre.
To me that seems neither bizarre nor broken. Perhaps the author also had some other flaws in mind?
Another answer said:
In practice they are best avoided. I've used them once in 13 years and I wish I hadn't.
But that author could not recall what the problem was.
As with all classes, and especially classes used in multiple threads, you will have problems if you misuse them. So I do not consider the unpredictable "write end dead" IOException that PipedInputStream can throw to be a flaw (failing to close() the connected PipedOutputStream is a bug; see the article Whats this? IOException: Write end dead, by Daniel Ferbers, for more information). What other claimed flaws are there?

They are not flawed.
As with all classes, and especially classes used in multiple threads, you will have problems if you misuse them. The unpredictable "write end dead" IOException that PipedInputStream can throw is not a flaw (failing to close() the connected PipedOutputStream is a bug; see the article Whats this? IOException: Write end dead, by Daniel Ferbers, for more information).

I have used them nicely in my project and they are invaluable for modifying streams on the fly and passing them around. The only drawback seemed to be that PipedInputStream had a short buffer (around 1024) and my outputstream was pumping in around 8KBs.
There is no defect with it and it works perfectly well.
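The short default buffer mentioned above can be enlarged through the two-argument PipedInputStream constructor. The following is only a minimal sketch: the class name, method name, and sizes are made up for illustration. It writes an 8 KB chunk into a pipe with a 16 KB buffer, so the write completes even though nobody is reading yet.

```java
import java.io.IOException;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;

public class PipeBufferSizeDemo {
    // Writes chunkSize bytes into a pipe whose buffer holds pipeSize bytes and
    // returns how many bytes are then waiting in the buffer.
    // chunkSize must not exceed pipeSize, otherwise write() blocks with no reader.
    static int bufferedAfterWrite(int pipeSize, int chunkSize) throws IOException {
        PipedOutputStream out = new PipedOutputStream();
        PipedInputStream in = new PipedInputStream(out, pipeSize);
        try {
            out.write(new byte[chunkSize]);
            return in.available();
        } finally {
            out.close();
            in.close();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(bufferedAfterWrite(16 * 1024, 8 * 1024));
    }
}
```

With the default buffer size the same single-threaded write would block, because the producer has to wait for a consumer to drain the buffer.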
-------- Example in groovy
public class Runner{
final PipedOutputStream source = new PipedOutputStream();
PipedInputStream sink = new PipedInputStream();
public static void main(String[] args) {
new Runner().doit()
println "Finished main thread"
}
public void doit() {
sink.connect(source)
(new Producer(source)).start()
BufferedInputStream buffer = new BufferedInputStream(sink)
(new Consumer(buffer)).start()
}
}
class Producer extends Thread {
OutputStream source
Producer(OutputStream source) {
this.source=source
}
@Override
public void run() {
byte[] data = new byte[1024];
println "Running the Producer..."
FileInputStream fout = new FileInputStream("/Users/ganesh/temp/www/README")
int amount=0
while((amount=fout.read(data))>0)
{
// Write the raw bytes; a String round-trip could corrupt non-text data
source.write(data, 0, amount)
synchronized (this) {
wait(5);
}
}
source.close()
}
}
class Consumer extends Thread{
InputStream ins
Consumer(InputStream ins)
{
this.ins = ins
}
public void run()
{
println "Consumer running"
int amount;
byte[] data = new byte[1024];
while ((amount = ins.read(data)) >= 0) {
String s = new String(data, 0, amount);
println "< $s"
synchronized (this) {
wait(5);
}
}
}
}

One flaw might be that there is no clear way for the writer to indicate to the reader that it encountered a problem:
PipedOutputStream out = new PipedOutputStream();
PipedInputStream in = new PipedInputStream(out);
new Thread(() -> {
try {
writeToOut(out);
out.close();
}
catch (SomeDataProviderException e) {
// Have to notify the reading side, but how?
}
}).start();
readFromIn(in);
The writer could close out, but maybe the reader misinterprets that as end of data. To handle this correctly additional logic is needed. It would be easier if functionality to manually break the pipe was provided.
There is now JDK-8222924 which requests a way to manually break the pipe.
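Until something like that exists, the "additional logic" could be a small side channel next to the pipe. The sketch below is one possible approach, not part of the JDK pipe API: the writer records its failure in an AtomicReference before closing, and the reader checks that reference once it sees end of stream. All class, method, and variable names are invented for the example.

```java
import java.io.IOException;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.util.concurrent.atomic.AtomicReference;

public class PipeErrorSignalDemo {
    // Reads the whole pipe; after end of stream, rethrows any failure the writer recorded.
    static int readAllOrFail(boolean writerFails) throws IOException, InterruptedException {
        PipedOutputStream out = new PipedOutputStream();
        PipedInputStream in = new PipedInputStream(out);
        AtomicReference<Exception> writerError = new AtomicReference<>();

        Thread writer = new Thread(() -> {
            try {
                out.write(new byte[] {1, 2, 3});
                if (writerFails) {
                    throw new IllegalStateException("data provider failed"); // simulated failure
                }
                out.write(new byte[] {4, 5});
            } catch (Exception e) {
                writerError.set(e); // record the failure before closing the pipe
            } finally {
                try { out.close(); } catch (IOException ignored) { }
            }
        });
        writer.start();

        int count = 0;
        try {
            while (in.read() != -1) {
                count++;
            }
        } finally {
            in.close();
        }
        writer.join();

        // End of stream alone does not mean success: consult the side channel.
        Exception failure = writerError.get();
        if (failure != null) {
            throw new IOException("writer failed after " + count + " bytes", failure);
        }
        return count;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(readAllOrFail(false));
    }
}
```

The drawback is that the reader only learns about the failure at end of stream, so any bytes written before the failure have already been consumed.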

From my point of view there is a flaw. More precisely, there is a high risk of a deadlock if the thread which should pump data into the PipedOutputStream dies prematurely, before it actually writes a single byte into the stream. The problem in such a situation is that the implementation of the piped streams is not able to detect the broken pipe. Consequently the thread reading from the PipedInputStream will wait forever (i.e. deadlock) in its first call to read().
Broken pipe detection actually relies on the first call to write(), as the implementation will then lazily initialize the write-side thread, and only from that point in time will broken pipe detection work.
The following code reproduces the situation:
import java.io.IOException;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import org.junit.Test;
public class PipeTest
{
@Test
public void test() throws IOException
{
final PipedOutputStream pout = new PipedOutputStream();
PipedInputStream pin = new PipedInputStream();
pout.connect(pin);
Thread t = new Thread(new Runnable()
{
public void run()
{
try
{
if(true)
{
throw new IOException("asd");
}
pout.write(0); // first byte, which never gets written
pout.close();
}
catch(IOException e)
{
throw new RuntimeException(e);
}
}
});
t.start();
pin.read(); // waits forever, i.e. deadlocks
}
}
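One way to avoid the hang, sketched under the assumption that the writer code is under your control, is to close the PipedOutputStream in a finally block. The reader then sees end of stream even if the writer failed before its first write(). The class and method names below are invented for the example.

```java
import java.io.IOException;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;

public class PipeCloseInFinallyDemo {
    // Returns what the reader's first read() yields when the writer fails
    // before writing anything but still closes its end in a finally block.
    static int firstReadWhenWriterFailsEarly() throws IOException, InterruptedException {
        final PipedOutputStream pout = new PipedOutputStream();
        PipedInputStream pin = new PipedInputStream(pout);

        Thread t = new Thread(() -> {
            try {
                failBeforeWriting(); // simulated early failure
                pout.write(0);       // never reached
            } catch (IOException e) {
                // a real writer would log or record the failure here
            } finally {
                try { pout.close(); } catch (IOException ignored) { }
            }
        });
        t.start();

        int first = pin.read(); // sees end of stream instead of blocking forever
        t.join();
        pin.close();
        return first;
    }

    static void failBeforeWriting() throws IOException {
        throw new IOException("asd");
    }

    public static void main(String[] args) throws Exception {
        System.out.println(firstReadWhenWriterFailsEarly());
    }
}
```

This does not help if the writer thread is killed without running its finally block, so it narrows the problem rather than removing it.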

The flaws that I see with the JDK implementation are:
1) No timeouts, reader or writer can block infinitely.
2) Suboptimal control over when data is transferred (should be done only with flush, or when circular buffer is full)
So I created my own to address the above, (timeout value passed via a ThreadLocal):
PipedOutputStream
How to use:
PipedOutputStreamTest
Hope it helps...

Related

Read from ByteArrayOutputStream while it's being written to

I have a class that is constantly producing data and writing it to a ByteArrayOutputStream on its own thread. I have a 2nd thread that gets a reference to this ByteArrayOutputStream. I want the 2nd thread to read any data (and empty) the ByteArrayOutputStream and then stop when it doesn't get any bytes and sleep. After the sleep, I want it to try to get more data and empty it again.
The examples I see online say to use PipedOutputStream. If my first thread is making the ByteArrayOutputStream available to the outside world from a separate reusable library, I don't see how to hook up the inputStream to it.
How would one setup the PipedInputStream to connect it to the ByteArrayOutputStream to read from it as above? Also, when reading the last block from the ByteArrayOutputStream, will I see bytesRead == -1, indicating when the outputStream is closed from the first thread?
Many thanks,
Mike
Write to the PipedOutputStream directly (that is, don't use a ByteArrayOutputStream at all). They both extend OutputStream and so have the same interface.
There are connect methods in both PipedOutputStream and PipedInputStream that are used to wire two pipes together, or you can use one of the constructors to create a pair.
Writes to the PipedOutputStream will block when the buffer in the PipedInputStream fills up, and reads from the PipedInputStream will block when the buffer is empty, so the producer thread will sleep (block) if it gets "ahead" of the consumer and vice versa.
After blocking the threads wait for 1000ms before rechecking the buffer, so it's good practice to flush the output after writes complete (this will wake the reader if it is sleeping).
Your input stream will see the EOF (bytesRead == -1) when you close the output stream in the producer thread.
import java.io.*;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
public class PipeTest {
public static void main(String[] args) throws IOException {
PipedOutputStream out = new PipedOutputStream();
// Wire an input stream to the output stream, and use a buffer of 2048 bytes
PipedInputStream in = new PipedInputStream(out, 2048);
ExecutorService executor = Executors.newCachedThreadPool();
// Producer thread.
executor.execute(() -> {
try {
for (int i = 0; i < 10240; i++) {
out.write(0);
// flush to wake the reader
out.flush();
}
out.close();
} catch (IOException e) {
throw new UncheckedIOException(e);
}
});
// Consumer thread.
executor.execute(() -> {
try {
int b, read = 0;
while ((b = in.read()) != -1) {
read++;
}
System.out.println("Read " + read + " bytes.");
} catch (IOException e) {
throw new UncheckedIOException(e);
}
});
executor.shutdown();
}
}

GZIPOutputStream that does its compression in a separate thread

Is there an implementation of GZIPOutputStream that would do the heavy lifting (compressing + writing to disk) in a separate thread?
We are continuously writing huge amounts of GZIP-compressed data. I am looking for a drop-in replacement that could be used instead of GZIPOutputStream.
You can write to a PipedOutputStream and have a thread which reads the PipedInputStream and copies it to any stream you like.
This is a generic implementation. You give it an OutputStream to write to and it returns an OutputStream for you to write to.
public static OutputStream asyncOutputStream(final OutputStream out) throws IOException {
PipedOutputStream pos = new PipedOutputStream();
final PipedInputStream pis = new PipedInputStream(pos);
new Thread(new Runnable() {
@Override
public void run() {
try {
byte[] bytes = new byte[8192];
for(int len; (len = pis.read(bytes)) > 0;)
out.write(bytes, 0, len);
} catch(IOException ioe) {
ioe.printStackTrace();
} finally {
close(pis);
close(out);
}
}
}, "async-output-stream").start();
return pos;
}
static void close(Closeable closeable) {
if (closeable != null) try {
closeable.close();
} catch (IOException ignored) {
}
}
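Applied to the GZIP case, the same idea might look like the sketch below. It is self-contained rather than built on the helper above, and it joins the background thread so the caller knows when compression has finished, which the generic helper does not offer. The ByteArrayOutputStream stands in for a file, and all names are illustrative.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class BackgroundGzipDemo {
    // Compresses data on a helper thread and returns the gzip bytes.
    static byte[] gzipInBackground(byte[] data) throws IOException, InterruptedException {
        ByteArrayOutputStream target = new ByteArrayOutputStream(); // stands in for a file
        PipedOutputStream pos = new PipedOutputStream();
        PipedInputStream pis = new PipedInputStream(pos, 64 * 1024);

        Thread compressor = new Thread(() -> {
            // Closing gz finishes the gzip stream; closing src releases the pipe.
            try (PipedInputStream src = pis; OutputStream gz = new GZIPOutputStream(target)) {
                byte[] buf = new byte[8192];
                for (int len; (len = src.read(buf)) != -1; ) {
                    gz.write(buf, 0, len);
                }
            } catch (IOException e) {
                e.printStackTrace();
            }
        }, "background-gzip");
        compressor.start();

        try (OutputStream out = pos) {
            out.write(data); // the caller only pays for the copy into the pipe
        }
        compressor.join(); // wait until the trailer has been written
        return target.toByteArray();
    }

    // Helper used only to check the round trip.
    static byte[] gunzip(byte[] gz) throws IOException {
        ByteArrayOutputStream plain = new ByteArrayOutputStream();
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(gz))) {
            byte[] buf = new byte[8192];
            for (int len; (len = in.read(buf)) != -1; ) {
                plain.write(buf, 0, len);
            }
        }
        return plain.toByteArray();
    }
}
```

Without the join (or some other completion signal), closing the pipe's output side returns before the compressed data has necessarily reached its destination.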
I published some code that does exactly what you are looking for. It has always frustrated me that Java doesn't automatically pipeline calls like this across multiple threads, in order to overlap computation, compression, and disk I/O:
https://github.com/lukehutch/PipelinedOutputStream
This class splits writing to an OutputStream into separate producer and consumer threads (actually, starts a new thread for the consumer), and inserts a blocking bounded buffer between them. There is some data copying between buffers, but this is done as efficiently as possible.
You can even layer this twice to do the disk writing in a separate thread from the gzip compression, as shown in README.md.

Best practice for reading / writing to a java server socket

How do you design a read and write loop which operates on a single socket (which supports parallel read and write operations)? Do I have to use multiple threads? Is my (java) solution any good? What about that sleep command? How do you use that within such a loop?
I'm trying to use 2 Threads:
Read
public void run() {
InputStream clientInput;
ByteArrayOutputStream byteBuffer;
BufferedInputStream bufferedInputStream;
byte[] data;
String dataString;
int lastByte;
try {
clientInput = clientSocket.getInputStream();
byteBuffer = new ByteArrayOutputStream();
bufferedInputStream = new BufferedInputStream(clientInput);
while(isRunning) {
while ((lastByte = bufferedInputStream.read()) > 0) {
byteBuffer.write(lastByte);
}
data = byteBuffer.toByteArray();
dataString = new String(data);
byteBuffer.reset();
}
} catch (IOException e) {
e.printStackTrace();
}
}
Write
public void run() {
OutputStream clientOutput;
byte[] data;
String dataString;
try {
clientOutput = clientSocket.getOutputStream();
while(isOpen) {
if(!commandQueue.isEmpty()) {
dataString = commandQueue.poll();
data = dataString.getBytes();
clientOutput.write(data);
}
Thread.sleep(1000);
}
clientOutput.close();
}
catch (IOException e) {
e.printStackTrace();
}
catch (InterruptedException e) {
e.printStackTrace();
}
}
Read fails to deliver a proper result, since there is no -1 sent.
How do I solve this issue?
Is this sleep / write loop a good solution?
There are basically three ways to do network I/O:
Blocking. In this mode reads and writes will block until they can be fulfilled, so if you want to do both simultaneously you need separate threads for each.
Non-blocking. In this mode reads and writes will return zero (Java) or in some languages (C) a status indication (return == -1, errno=EAGAIN/EWOULDBLOCK) when they cannot be fulfilled, so you don't need separate threads, but you do need a third API that tells you when the operations can be fulfilled. This is the purpose of the select() API.
Asynchronous I/O, in which you schedule the transfer and are given back some kind of a handle via which you can interrogate the status of the transfer, or, in more advanced APIs, a callback.
You should certainly never use the while (in.available() > 0)/sleep() style you are using here. InputStream.available() has few correct uses and this isn't one of them, and the sleep is literally a waste of time. The data can arrive within the sleep time, and a normal read() would wake up immediately.
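The same reasoning applies to the write loop in the question: instead of polling the queue and sleeping, the writer thread can block on the queue itself. A minimal sketch follows, assuming the command queue can be a BlockingQueue. The STOP marker and all names are hypothetical, and the ByteArrayOutputStream stands in for the socket's output stream.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class QueueWriterDemo {
    // Hypothetical marker object telling the writer loop to stop.
    static final String STOP = new String("STOP");

    // Blocks in take() until a command arrives: no polling, no sleep.
    static void writeLoop(BlockingQueue<String> commands, OutputStream out)
            throws IOException, InterruptedException {
        while (true) {
            String command = commands.take();
            if (command == STOP) { // identity check: only the marker stops the loop
                break;
            }
            out.write(command.getBytes(StandardCharsets.UTF_8));
            out.flush();
        }
    }

    static String demo() throws Exception {
        BlockingQueue<String> commands = new LinkedBlockingQueue<>();
        ByteArrayOutputStream sink = new ByteArrayOutputStream(); // stands in for socket.getOutputStream()
        Thread writer = new Thread(() -> {
            try {
                writeLoop(commands, sink);
            } catch (IOException | InterruptedException e) {
                e.printStackTrace();
            }
        });
        writer.start();
        commands.put("PING\n");
        commands.put("QUIT\n");
        commands.put(STOP);
        writer.join();
        return sink.toString("UTF-8");
    }
}
```

Each command is written as soon as it is queued, rather than up to a second later.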
You should use a boolean flag rather than while(true), so that you can shut the thread down cleanly when you want to. And yes, you should create multiple threads, one per connected client, because each thread blocks until new data is received (with DataInputStream.read(), for example). And no, this is not really a design question: each library, framework, or language has its own way of listening on a socket. For example, to listen on a socket in Qt you use what is called "signals and slots", not an infinite loop.

Write end dead exception using PipedInputStream java

Write end dead exception occurs in the following situation:
Two threads:
A: PipedOutputStream put = new PipedOutputStream();
String msg = "MESSAGE";
put.write(msg.getBytes());
put.flush();
B: PipedInputStream get = new PipedInputStream(A.put);
byte[] get_msg = new byte[1024];
get.read(get_msg);
Here is the situation: A and B run concurrently; A writes to the pipe and B reads from it. B has just read from the pipe, so the pipe's buffer is empty. A then does not write to the pipe for an unknown interval. At some point B reads the pipe again and java.io.IOException: write end dead occurs, because the buffer of the pipe is still empty. I don't want to sleep() thread B to wait for A to write to the pipe, since that is also unreliable. How can I avoid this problem? Thanks
"Write end dead" exceptions will arise when you have:
A PipedInputStream connected to a PipedOutputStream, and
The ends of the pipe are read/written by two different threads, and
The threads finish without closing their side of the pipe.
To resolve this exception, simply close your Piped Stream in your Thread's runnable after you have completed writing and reading bytes to/from the pipe stream.
Here is some sample code:
final PipedOutputStream output = new PipedOutputStream();
final PipedInputStream input = new PipedInputStream(output);
Thread thread1 = new Thread(new Runnable() {
@Override
public void run() {
try {
output.write("Hello Piped Streams!! Used for Inter Thread Communication".getBytes());
output.close();
} catch(IOException io) {
io.printStackTrace();
}
}
});
Thread thread2 = new Thread(new Runnable() {
@Override
public void run() {
try {
int data;
while((data = input.read()) != -1) {
System.out.println(data + " ===> " + (char)data);
}
input.close();
} catch(IOException io) {
io.printStackTrace();
}
}
});
thread1.start();
thread2.start();
Complete code is here: https://github.com/prabhash1785/Java/blob/master/JavaCodeSnippets/src/com/prabhash/java/io/PipedStreams.java
For more details, please have a look at this nice blog: https://techtavern.wordpress.com/2008/07/16/whats-this-ioexception-write-end-dead/
You need to close the PipedOutputStream before the writing thread finishes (and, of course, after all data is written). PipedInputStream throws this exception on read() when the writing thread is no longer alive and the writer was not properly closed.
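The difference can be reproduced deterministically with a small sketch like the one below (names are invented for the example). The writer sends one byte and exits, either closing its end or not. The reader waits for the writer thread to die, consumes the byte, and then reads again. The exact exception message is an implementation detail, so the sketch only distinguishes a clean end of stream from an IOException.

```java
import java.io.IOException;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;

public class WriteEndDeadDemo {
    // Returns "EOF" if the second read sees a clean end of stream,
    // otherwise the message of the IOException that read() threw.
    static String secondRead(boolean closeWhenDone) throws IOException, InterruptedException {
        PipedOutputStream output = new PipedOutputStream();
        PipedInputStream input = new PipedInputStream(output);

        Thread writer = new Thread(() -> {
            try {
                output.write(42);
                if (closeWhenDone) {
                    output.close();
                }
            } catch (IOException e) {
                e.printStackTrace();
            }
        });
        writer.start();
        writer.join(); // the writer thread is dead from here on

        input.read(); // consume the single byte that was written
        try {
            return input.read() == -1 ? "EOF" : "data";
        } catch (IOException e) {
            return String.valueOf(e.getMessage());
        } finally {
            input.close();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("closed:     " + secondRead(true));
        System.out.println("not closed: " + secondRead(false));
    }
}
```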

How to create a Java non-blocking InputStream from a HttpsURLConnection?

Basically, I have a URL that streams xml updates from a chat room when new messages are posted. I'd like to turn that URL into an InputStream and continue reading from it as long as the connection is maintained and as long as I haven't sent a Thread.interrupt(). The problem I'm experiencing is that BufferedReader.ready() doesn't seem to become true when there is content to be read from the stream.
I'm using the following code:
BufferedReader buf = new BufferedReader(new InputStreamReader(ins));
String str = "";
while(Thread.interrupted() != true)
{
connected = true;
debug("Listening...");
if(buf.ready())
{
debug("Something to be read.");
if ((str = buf.readLine()) != null) {
// str is one line of text; readLine() strips the newline character(s)
urlContents += String.format("%s%n", str);
urlContents = filter(urlContents);
}
}
// Give the system a chance to buffer or interrupt.
try{Thread.sleep(1000);} catch(Exception ee) {debug("Caught thread exception.");}
}
When I run the code, and post something to the chat room, buf.ready() never becomes true, resulting in the lines never being read. However, if I skip the "buf.ready()" part and just read lines directly, it blocks further action until lines are read.
How do I either a) get buf.ready() to return true, or b) do this in such a way as to prevent blocking?
Thanks in advance,
James
How to create a Java non-blocking InputStream
You can't. Your question embodies a contradiction in terms. Streams in Java are blocking. There is therefore no such thing as a 'non-blocking InputStream'.
Reader.ready() returns true when data can be read without blocking. Period. InputStreams and Readers are blocking. Period. Everything here is working as designed. If you want more concurrency with these APIs you will have to use multiple threads. Or Socket.setSoTimeout() and its near relation in HttpURLConnection.
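To illustrate the timeout option, here is a minimal sketch using a plain Socket on the loopback interface. The server side never sends anything, so the read gives up after the configured timeout. The class and method names are made up; for an HttpURLConnection the corresponding call would be setReadTimeout(int), which is not exercised here.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

public class ReadTimeoutDemo {
    // Connects to a local server that never sends anything and reports
    // whether the blocking read gave up after timeoutMillis.
    static boolean readTimesOut(int timeoutMillis) throws IOException {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
             Socket accepted = server.accept()) {
            client.setSoTimeout(timeoutMillis); // bounds every read() on this socket
            InputStream in = client.getInputStream();
            try {
                in.read(); // would block indefinitely without the timeout
                return false;
            } catch (SocketTimeoutException e) {
                // The socket is still usable; a loop could check a stop flag here and retry.
                return true;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(readTimesOut(200));
    }
}
```

The read is still blocking; the timeout only bounds how long each call may block before control returns to the loop.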
For non-blocking I/O, don't use InputStream and Reader (or OutputStream/Writer); use the java.nio.* classes instead, in this case a SocketChannel (and additionally a CharsetDecoder).
Edit: as an answer to your comment:
Specifically looking for how to create a socket channel to an https url.
Sockets (and also SocketChannels) work on the transport layer (TCP), one (or two) level(s) below application layer protocols like HTTP. So you can't create a socket channel to an https url.
You would instead have to open a Socket-Channel to the right server and the right port (443 if nothing else given in the URI), create an SSLEngine (in javax.net.ssl) in client mode, then read data from the channel, feeding it to the SSL engine and the other way around, and send/get the right HTTP protocol lines to/from your SSLEngine, always checking the return values to know how many bytes were in fact processed and what would be the next step to take.
This is quite complicated (I did it once), and you don't really want to do this if you are not implementing a server with lots of clients connected at the same time (where you can't have a single thread for each connection). Instead, stay with your blocking InputStream which reads from your URLConnection, and put it simply in a spare thread which does not hinder the rest of your application.
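The "spare thread" approach could be sketched as follows: a daemon thread blocks in readLine() and hands each line to a queue, while the caller polls the queue with a timeout and stays free to do other work. The END marker and all names are hypothetical, and any InputStream can be passed in, for example the one obtained from a URLConnection.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

public class LineReaderThreadDemo {
    // Hypothetical marker placed on the queue when the stream ends.
    static final String END = new String("END");

    // Starts a daemon thread that blocks in readLine() and hands lines to a queue.
    static BlockingQueue<String> startReader(InputStream ins) {
        BlockingQueue<String> lines = new LinkedBlockingQueue<>();
        Thread reader = new Thread(() -> {
            try (BufferedReader buf = new BufferedReader(
                    new InputStreamReader(ins, StandardCharsets.UTF_8))) {
                String line;
                while ((line = buf.readLine()) != null) {
                    lines.add(line);
                }
            } catch (IOException e) {
                e.printStackTrace();
            } finally {
                lines.add(END); // always tell the consumer that no more lines will come
            }
        }, "line-reader");
        reader.setDaemon(true);
        reader.start();
        return lines;
    }

    // Collects all lines; the caller never blocks for more than one second at a time.
    static List<String> collect(InputStream ins) throws InterruptedException {
        BlockingQueue<String> lines = startReader(ins);
        List<String> result = new ArrayList<>();
        while (true) {
            String line = lines.poll(1, TimeUnit.SECONDS);
            if (line == null) {
                continue; // nothing yet; other work could be done here
            }
            if (line == END) { // identity check against the marker object
                break;
            }
            result.add(line);
        }
        return result;
    }
}
```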
You can use the Java NIO library which provides non-blocking I/O capabilities. Take a look at this article for details and sample code: http://www.drdobbs.com/java/184406242.
There is no HTTP/HTTPS implementation using Channels. There is no way to read the InputStream from an HttpURLConnection in a non-blocking way. You either have to use a third-party library or implement HTTP over SocketChannel yourself.
import java.io.InputStream;
import java.util.Arrays;
/**
* This code demonstrates non blocking read from standard input using separate
* thread for reading.
*/
public class NonBlockingRead {
// Holder for temporary store of read(InputStream is) value
private static String threadValue = "";
public static void main(String[] args) throws InterruptedException {
NonBlockingRead test = new NonBlockingRead();
while (true) {
String tmp = test.read(System.in, 100);
if (tmp.length() > 0)
System.out.println(tmp);
Thread.sleep(1000);
}
}
/**
* Non blocking read from input stream using controlled thread
*
* @param is
* InputStream to read
* @param timeout
* timeout in milliseconds, should not be less than 10
* @return the text read within the timeout, or an empty string if nothing arrived
*/
String read(final InputStream is, int timeout) {
// Start reading bytes from stream in separate thread
Thread thread = new Thread() {
public void run() {
byte[] buffer = new byte[1024]; // read buffer
byte[] readBytes = new byte[0]; // holder of actually read bytes
try {
Thread.sleep(5);
// Read available bytes from stream
int size = is.read(buffer);
if (size > 0)
readBytes = Arrays.copyOf(buffer, size);
// and save read value in static variable
setValue(new String(readBytes, "UTF-8"));
} catch (Exception e) {
System.err.println("Error reading input stream:");
e.printStackTrace();
}
}
};
thread.start(); // Start thread
try {
thread.join(timeout); // and join it with specified timeout
} catch (InterruptedException e) {
System.err.println("Data were not read in " + timeout + " ms");
}
return getValue();
}
private synchronized void setValue(String value) {
threadValue = value;
}
private synchronized String getValue() {
String tmp = new String(threadValue);
setValue("");
return tmp;
}
}
