I have a class that implements 'Runnable' to read data from a data stream. The data comes from a Channel which is stored as a member variable in another of my classes, and I can get an instance of this channel by simply calling the getter getInputChannel(). Now, for my Runnable to read the data from the channel, it needs to know what type of channel it is so that it can use the channel's read method. The channel type may be one of either FileChannel or SocketChannel, and is decided at run time, i.e.,
private class ReadInputStream implements Runnable {

    Thread thread;
    boolean running = true;
    ByteBuffer buffer = ByteBuffer.allocate(1024);
    FileChannel or SocketChannel channel;

    public ReadInputStream() {
        // Need to cast type channel at run time
        Channel ch = getInputChannel();
        this.channel = (FileChannel or SocketChannel) ch;
    }

    public void run() {
        while (running) {
            channel.read(buffer);
            // etc.
        }
    }
}
What is the best way to get the right type of channel so that I can implement its read method in the runnable's run() method?
Both FileChannel and SocketChannel implement ByteChannel, which exposes the read(ByteBuffer) method (declared in its ReadableByteChannel superinterface), so that's the type your getInputChannel() should return.
Edit: Or, if you only ever read from the channel, return a ReadableByteChannel, as Darkhogg says. Since this is an input channel, that is most likely the case anyway.
There is no way in Java to express union types. Your best bet is to use some common interface that applies to both.
If you're using the channel only for reading, define it to be a ReadableByteChannel.
If you're using it for writing, use a WritableByteChannel.
If you need both, use ByteChannel.
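For illustration, here is a minimal sketch of the Runnable written against ReadableByteChannel (assuming getInputChannel() is changed to return that interface and the usual java.io / java.nio imports are present in the enclosing class):

private class ReadInputStream implements Runnable {

    // Works for FileChannel and SocketChannel alike; no cast needed
    private final ReadableByteChannel channel = getInputChannel();
    private final ByteBuffer buffer = ByteBuffer.allocate(1024);
    private volatile boolean running = true;

    public void run() {
        try {
            while (running && channel.read(buffer) != -1) {
                buffer.flip();
                // process the bytes in 'buffer' here
                buffer.clear();
            }
        } catch (IOException e) {
            // handle or log the read failure
        }
    }
}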
You could use a simple if/else clause and cycle through the available instance types, for instance (no pun intended..):
if (channel instanceof FileChannel)
{
    ((FileChannel) channel).read(buffer);
}
else if (channel instanceof SocketChannel)
{
    ((SocketChannel) channel).read(buffer);
}
etc.
Related
I developed an application using Java sockets. I exchange messages with this application using byte arrays. I have a message named M1, 1979 bytes long. My socket buffer length is 512 bytes, so I read this message in 4 parts, each of 512 bytes, except the last one, which is of course 443 bytes. I will call these parts A, B, C, and D, so ABCD, in that order, is one valid message.
I have a loop in a thread like the one below.
BlockingQueue<Chunk> queue = new LinkedBlockingQueue<>();
InputStream in = socket.getInputStream();
byte[] buffer = new byte[512];

while (true) {
    int readResult = in.read(buffer);
    if (readResult != -1) {
        byte[] arr = Arrays.copyOf(buffer, readResult);
        Chunk c = new Chunk(arr);
        queue.put(c);
    }
}
I'm filling the queue with the code above. When the message sending starts, I see the queue fill up in ABCD form, but sometimes the data ends up in the queue as BACD. I know that this should be impossible because the TCP connection guarantees ordering.
I looked at the dumps with Wireshark. This message arrives correctly in a single TCP packet, so there is no problem on the sender side. I am 100% sure that the message arrives correctly, but the read method does not seem to read in the correct order, and this doesn't always happen. I could not find a valid reason for what causes this.
When I tried the same code on two different computers, I noticed that the problem occurred on only one of them. The JDK versions on these computers are different, so I looked at the differences between the two versions. With JDK 8u202 I get the incorrect behaviour; with JDK 8u271 there was no problem. Maybe it is related to that, but I'm not sure, because I have no solid evidence.
I am open to all kinds of ideas and suggestions. It's really on its way to being the most interesting problem I've ever encountered.
Thank you for your help.
EDIT: I found a similar question:
Blocking Queue Take out of Order
EDIT:
Ok, I have read all the answers given below. Thank you for providing different perspectives for me. I will try to supplement some missing information.
Actually I have 2 threads. Thread 1 (SocketReader) is responsible for reading the socket. It wraps the information it reads in a Chunk class and puts it on the queue in Thread 2. So the queue lives in Thread 2. Thread 2 (MessageDecoder) consumes the blocking queue. There are no threads other than these. This is basically a simple example of the producer-consumer design pattern.
And yes, other messages are sent, but the other messages take up less than 512 bytes, so I can read them in one go and do not encounter any ordering problem with them.
MessageDecoder.java
public class MessageDecoder implements Runnable {

    private BlockingQueue<Chunk> queue = new LinkedBlockingQueue<>();

    public MessageDecoder() {
    }

    public void run() {
        while (true) {
            Chunk c;
            try {
                c = queue.take();
                System.out.println(c.toString());
            } catch (InterruptedException e) {
                e.printStackTrace();
                continue; // 'c' was never assigned, so skip this iteration
            }
            decodeMessageChunk(c);
        }
    }

    public void put(Chunk c) {
        try {
            queue.put(c);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }
}
SocketReader.java
public class SocketReader implements Runnable {

    private final MessageDecoder msgDec;
    private final InputStream in;
    byte[] buffer = new byte[512];

    public SocketReader(InputStream in, MessageDecoder msgDec) {
        this.in = in;
        this.msgDec = msgDec;
    }

    public void run() {
        try {
            while (true) {
                int readResult = in.read(buffer);
                if (readResult != -1) {
                    byte[] arr = Arrays.copyOf(buffer, readResult);
                    Chunk c = new Chunk(arr);
                    msgDec.put(c);
                }
            }
        } catch (IOException e) {
            e.printStackTrace(); // read() can fail; run() cannot rethrow it
        }
    }
}
Even if it's a FIFO queue, the locking of the LinkedBlockingQueue is unfair, so you can't guarantee the ordering of elements. More info regarding this here.
I'd suggest using an ArrayBlockingQueue instead. Like the LinkedBlockingQueue, the order is not guaranteed by default, but it offers a slightly different locking mechanism.
This class supports an optional fairness policy for ordering waiting
producer and consumer threads. By default, this ordering is not
guaranteed. However, a queue constructed with fairness set to true
grants threads access in FIFO order. Fairness generally decreases
throughput but reduces variability and avoids starvation.
In order to set fairness, you must initialize it using this constructor:
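ArrayBlockingQueue(int capacity, boolean fair)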
So, for example:
ArrayBlockingQueue<Chunk> fairQueue = new ArrayBlockingQueue<>(1000, true);
/*.....*/
Chunk c = new Chunk(arr);
fairQueue.add(c);
As the docs state, this should grant threads access in FIFO order, keeping retrieval of the elements consistent while avoiding the lock-stealing that can happen with LinkedBlockingQueue's locking mechanism.
I am using JeroMQ in a multithreaded environment as shown below. In the code below, the constructor of SocketManager first connects to all the available sockets and puts them in the liveSocketsByDatacenter map in the connectToZMQSockets method. After that, in the same constructor, I start a background thread that runs every 30 seconds and calls the updateLiveSockets method to ping all the sockets already in the liveSocketsByDatacenter map, updating the map with whether each socket is alive or not.
The getNextSocket() method is then called by multiple reader threads concurrently to get the next live socket, and we use that socket to send data. So my question is: are we using JeroMQ correctly in a multithreaded environment? We just saw an exception in our production environment with this stack trace while we were trying to send data to a live socket, so I am not sure whether it's a bug or something else.
java.lang.ArrayIndexOutOfBoundsException: 256
at zmq.YQueue.push(YQueue.java:97)
at zmq.YPipe.write(YPipe.java:47)
at zmq.Pipe.write(Pipe.java:232)
at zmq.LB.send(LB.java:83)
at zmq.Push.xsend(Push.java:48)
at zmq.SocketBase.send(SocketBase.java:590)
at org.zeromq.ZMQ$Socket.send(ZMQ.java:1271)
at org.zeromq.ZFrame.send(ZFrame.java:131)
at org.zeromq.ZFrame.sendAndKeep(ZFrame.java:146)
at org.zeromq.ZMsg.send(ZMsg.java:191)
at org.zeromq.ZMsg.send(ZMsg.java:163)
Below is my code:
public class SocketManager {
private static final Random random = new Random();
private final ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
private final Map<Datacenters, List<SocketHolder>> liveSocketsByDatacenter = new ConcurrentHashMap<>();
private final ZContext ctx = new ZContext();
private static class Holder {
private static final SocketManager instance = new SocketManager();
}
public static SocketManager getInstance() {
return Holder.instance;
}
private SocketManager() {
connectToZMQSockets();
scheduler.scheduleAtFixedRate(this::updateLiveSockets, 30, 30, TimeUnit.SECONDS);
}
// during startup, making a connection and populate once
private void connectToZMQSockets() {
Map<Datacenters, List<String>> socketsByDatacenter = Utils.SERVERS;
for (Map.Entry<Datacenters, List<String>> entry : socketsByDatacenter.entrySet()) {
List<SocketHolder> addedColoSockets = connect(entry.getValue(), ZMQ.PUSH);
liveSocketsByDatacenter.put(entry.getKey(), addedColoSockets);
}
}
private List<SocketHolder> connect(List<String> addresses, int socketType) {
List<SocketHolder> socketList = new ArrayList<>();
for (String address : addresses) {
try {
Socket client = ctx.createSocket(socketType);
// Set random identity to make tracing easier
String identity = String.format("%04X-%04X", random.nextInt(), random.nextInt());
client.setIdentity(identity.getBytes(ZMQ.CHARSET));
client.setTCPKeepAlive(1);
client.setSendTimeOut(7);
client.setLinger(0);
client.connect(address);
SocketHolder zmq = new SocketHolder(client, ctx, address, true);
socketList.add(zmq);
} catch (Exception ex) {
// log error
}
}
return socketList;
}
// this method will be called by multiple threads concurrently to get the next live socket
// is there any concurrency or thread safety issue or race condition here?
public Optional<SocketHolder> getNextSocket() {
for (Datacenters dc : Datacenters.getOrderedDatacenters()) {
Optional<SocketHolder> liveSocket = getLiveSocket(liveSocketsByDatacenter.get(dc));
if (liveSocket.isPresent()) {
return liveSocket;
}
}
return Optional.absent();
}
private Optional<SocketHolder> getLiveSocket(final List<SocketHolder> listOfEndPoints) {
if (!CollectionUtils.isEmpty(listOfEndPoints)) {
// The list of live sockets
List<SocketHolder> liveOnly = new ArrayList<>(listOfEndPoints.size());
for (SocketHolder obj : listOfEndPoints) {
if (obj.isLive()) {
liveOnly.add(obj);
}
}
if (!liveOnly.isEmpty()) {
// The list is not empty, so pick one element at random and return it
return Optional.of(liveOnly.get(random.nextInt(liveOnly.size()))); // just pick one
}
}
return Optional.absent();
}
// runs every 30 seconds to ping all the sockets to check whether they are alive or not
private void updateLiveSockets() {
Map<Datacenters, List<String>> socketsByDatacenter = Utils.SERVERS;
for (Map.Entry<Datacenters, List<String>> entry : socketsByDatacenter.entrySet()) {
List<SocketHolder> liveSockets = liveSocketsByDatacenter.get(entry.getKey());
List<SocketHolder> liveUpdatedSockets = new ArrayList<>();
for (SocketHolder liveSocket : liveSockets) { // LINE A
Socket socket = liveSocket.getSocket();
String endpoint = liveSocket.getEndpoint();
Map<byte[], byte[]> holder = populateMap();
Message message = new Message(holder, Partition.COMMAND);
// pinging to see whether a socket is live or not
boolean status = SendToSocket.getInstance().execute(message.getAdd(), holder, socket);
boolean isLive = (status) ? true : false;
SocketHolder zmq = new SocketHolder(socket, liveSocket.getContext(), endpoint, isLive);
liveUpdatedSockets.add(zmq);
}
liveSocketsByDatacenter.put(entry.getKey(), Collections.unmodifiableList(liveUpdatedSockets));
}
}
}
And here is how I am using the getNextSocket() method of the SocketManager class concurrently from multiple reader threads:
// this method will be called from multiple threads
public boolean sendAsync(final long addr, final byte[] reco) {
Optional<SocketHolder> liveSockets = SocketManager.getInstance().getNextSocket();
return sendAsync(addr, reco, liveSockets.get().getSocket(), false);
}
public boolean sendAsync(final long addr, final byte[] reco, final Socket socket,
final boolean messageA) {
ZMsg msg = new ZMsg();
msg.add(reco);
boolean sent = msg.send(socket);
msg.destroy();
retryHolder.put(addr, reco);
return sent;
}
public boolean send(final long address, final byte[] encodedRecords, final Socket socket) {
boolean sent = sendAsync(address, encodedRecords, socket, true);
// if the record was sent successfully, then only sleep for timeout period
if (sent) {
try {
TimeUnit.MILLISECONDS.sleep(500);
} catch (InterruptedException ex) {
Thread.currentThread().interrupt();
}
}
// ...
return sent;
}
I don't think this is correct. It seems getNextSocket() could return a 0MQ socket to thread A while, concurrently, the timer thread accesses the same 0MQ socket to ping it. In that case thread A and the timer thread are mutating the same 0MQ socket, which will lead to problems. So what is the best and most efficient way to fix this issue?
Note: SocketHolder is an immutable class
Update:
I just noticed the same issue happened on another box with the same ArrayIndexOutOfBoundsException, but this time at line 71 of the YQueue file. The only consistent thing is the 256, so there must be something related to 256, and I am not able to figure out what this 256 is.
java.lang.ArrayIndexOutOfBoundsException: 256
at zmq.YQueue.backPos(YQueue.java:71)
at zmq.YPipe.write(YPipe.java:51)
at zmq.Pipe.write(Pipe.java:232)
at zmq.LB.send(LB.java:83)
at zmq.Push.xsend(Push.java:48)
at zmq.SocketBase.send(SocketBase.java:590)
at org.zeromq.ZMQ$Socket.send(ZMQ.java:1271)
at org.zeromq.ZFrame.send(ZFrame.java:131)
at org.zeromq.ZFrame.sendAndKeep(ZFrame.java:146)
at org.zeromq.ZMsg.send(ZMsg.java:191)
at org.zeromq.ZMsg.send(ZMsg.java:163)
Fact #0: ZeroMQ is not thread-safe -- by definition
While the ZeroMQ documentation and Pieter HINTJENS' excellent book "Code Connected. Volume 1" take care to repeat this fact wherever possible, the idea of returning or even sharing a ZeroMQ socket instance among threads appears from time to time. Sure, class instances' methods may deliver this almost "hidden" inside their internal methods and attributes, but proper design effort ought to prevent any such side-effects, with no exceptions and no excuses.
Sharing, if reasonably supported by quantitative facts, may be a way for a common instance of the zmq.Context(), but a crystal-clear distributed system design may live on a truly multi-agent scheme, where each agent operates its own Context()-engine, fine-tuned to the respective mix of configuration and performance preferences.
So what is the best and efficient way to fix this issue?
Never share a ZeroMQ socket. Never, indeed. Even if the newest development started to promise some near future changes in this direction. It is a bad habit to pollute any high-performance, low-latency distributed system design with sharing. Share nothing is the best design principle for this domain.
Yeah, I can see we should not share sockets between threads, but in my code what do you think is the best way to resolve this?
Yeah, the best and efficient way to fix this issue is to never share a ZeroMQ socket.
This means never returning any object whose attributes are ZeroMQ sockets (which you actively build and return en masse from the .connect(){...} class-method). In your case, all the class-methods seem to be kept private, which may act as a fuse against the problem of allowing "other threads" to touch the class-private socket instances, but the same principle must also be enforced at the attribute level to be effective. Finally, this "fusing" gets short-circuited and violated by the
public static SocketManager getInstance(),
which promiscuously offers any external caller straight access to sharing the class-private instances of the ZeroMQ sockets.
If some documentation explicitly warns in almost every chapter not to share things, one should rather not share them.
So, re-design the methods so that SocketManager gains more functionality as its own class-methods, which execute the must-have functionality internally, so as to explicitly prevent any external thread from touching a non-shareable instance, as documented in the ZeroMQ publications.
Next comes the inventory of resources: your code seems to re-check, every 30 seconds, the state of the world in all the DataCenters-of-Interest. This actually creates new List objects twice a minute. While you may speculatively let the Java Garbage Collector tidy up all the trash that is no longer referenced from anywhere, that is not a good idea for the ZeroMQ-related objects embedded inside the Lists from your previous re-check runs. ZeroMQ objects are still referenced from inside the ZContext() (the ZeroMQ Context()-core-factory instantiated I/O thread(s)), which can also be viewed as the ZeroMQ socket-inventory resource manager. So all the newly created socket instances get not only an external handle on the Java side, but also an internal handle inside the (Z)Context(). So far so good. But what is not seen anywhere in the code is any method that would decommission any and all the ZeroMQ sockets in object instances that have become dissociated on the Java side yet remain referenced from the (Z)Context() side. Explicit decommissioning of allocated resources is fair design-side practice, all the more so for resources that are limited or otherwise constrained. How to do this may differ with the { "cheap" | "expensive" } maintenance costs of such resource-management processing (ZeroMQ socket instances being remarkably expensive to handle as some lightweight "consumable/disposable"... but that is another story).
So, also add a set of proper resource-reuse / resource-dismantling methods that bring the total number of newly created sockets back under your control (your code is responsible for how many socket handlers may get created inside the (Z)Context()-domain of resource control, and they must remain managed, be it knowingly or not); a sketch follows below.
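For illustration only, a minimal sketch of what such a decommissioning method might look like, assuming the ctx, scheduler, and liveSocketsByDatacenter fields from the question (the method and its name are made up, not part of the original code):

// Hypothetical shutdown hook: explicitly decommission every socket we created,
// then terminate the Context, instead of relying on garbage collection.
public void shutdown() {
    scheduler.shutdownNow();                          // stop the 30-second pinger first
    for (List<SocketHolder> holders : liveSocketsByDatacenter.values()) {
        for (SocketHolder holder : holders) {
            ctx.destroySocket(holder.getSocket());    // closes the socket and removes it from the ZContext inventory
        }
    }
    liveSocketsByDatacenter.clear();
    ctx.destroy();                                    // terminates the underlying ZeroMQ Context and its I/O thread(s)
}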
One may object that there might be some "promises" from automated detection and (potentially well-deferred) garbage collection, but still, your code is responsible for proper resource management, and even the LMAX guys would never have got such brave performance if they were relying on "promises" from the standard GC. Your problem is way worse than what LMAX top performance had to fight with. Your code (as published so far) does nothing to .close() and .term() the ZeroMQ-associated resources at all. This is simply an impossible practice inside an ecosystem with uncontrolled (distributed-demand-driven) consumption. You have to protect your boat from getting loaded beyond a limit you know it can safely handle, and dynamically unload each and every box that has no recipient on the "opposite coast".
That is the Captain's ( your code designer's ) responsibility.
If you do not explicitly tell the sailor in charge of inventory management on the lowest level (the ZeroMQ Context()-floor) that some boxes are to be unloaded, the problem is still yours. The standard GC chain-of-command will not do this "automatically", whatever the "promises" may look like. So be explicit about your ZeroMQ resource management, evaluate the return codes from ordering these steps, and handle appropriately any and all exceptions raised by these resource-management operations under your code's explicit control.
Lower (if not the lowest achievable) resource-utilisation envelopes and higher (if not the highest achievable) performance are the bonus from doing this job right. The LMAX guys are a good example of doing this remarkably well, beyond the standard Java "promises", so one can learn from the best of the best.
Call signatures declared, vs. used, do not seem to match:
While I may be wrong on this point, as most of my design work is not in Java polymorphic call interfaces, there seems to be a mismatch in a signature, published as:
private List<SocketHolder> connect( Datacenters dc, // 1-st
List<String> addresses, // 2-nd
int socketType // 3-rd
) {
... /* implementation */
}
and the actual method invocation inside the connectToZMQSockets() method, which is just:
List<SocketHolder> addedColoSockets = connect( entry.getValue(), // 1-st
ZMQ.PUSH // 2-nd
);
As the title says: does closing a FileChannel close the underlying file stream?
From the AbstractInterruptibleChannel.close() API docs you can read:
Closes this channel.
If the channel has already been closed then this method returns
immediately. Otherwise it marks the channel as closed and then invokes
the implCloseChannel method in order to complete the close operation.
Which invokes AbstractInterruptibleChannel.implCloseChannel:
Closes this channel.
This method is invoked by the close method in order to perform the
actual work of closing the channel. This method is only invoked if the
channel has not yet been closed, and it is never invoked more than
once.
An implementation of this method must arrange for any other thread
that is blocked in an I/O operation upon this channel to return
immediately, either by throwing an exception or by returning normally.
And that doesn't say anything about the stream. So in fact, when I do:
public static void copyFile(File from, File to)
        throws IOException, FileNotFoundException {
    FileChannel sc = null;
    FileChannel dc = null;
    try {
        to.createNewFile();
        sc = new FileInputStream(from).getChannel();
        dc = new FileOutputStream(to).getChannel();
        long pos = 0;
        long total = sc.size();
        while (pos < total)
            pos += dc.transferFrom(sc, pos, total - pos);
    } finally {
        if (sc != null)
            sc.close();
        if (dc != null)
            dc.close();
    }
}
...do I leave the streams open?
The answer is 'yes', closing the channel does close the underlying stream, but there's nothing in the Javadoc that actually says so. The reason is that FileChannel itself is an abstract class, and its concrete implementation provides the implCloseChannel() method, which closes the underlying FD. However, due to that architecture and the fact that implCloseChannel() is protected, this doesn't get documented.
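As a quick check (a minimal sketch, not part of the original answer; the file path is a made-up placeholder), you can observe the behaviour directly: after closing the channel, reads on the originating stream fail because the shared file descriptor has been closed:

import java.io.FileInputStream;
import java.io.IOException;
import java.nio.channels.FileChannel;

public class ChannelCloseDemo {
    public static void main(String[] args) throws IOException {
        FileInputStream in = new FileInputStream("some-existing-file.txt"); // hypothetical path
        FileChannel ch = in.getChannel();
        ch.close();                 // closes the channel AND the stream's file descriptor
        try {
            in.read();              // now fails
        } catch (IOException expected) {
            System.out.println("Stream was closed along with the channel: " + expected);
        }
    }
}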
I have a Socket that I am both reading and writing to, via BufferedReaders and BufferedWriters. I'm not sure which operations are okay to do from separate threads. I would guess that writing to the socket from two different threads at the same time is a bad idea. Same with reading off the socket from two different threads at the same time. What about reading on one thread while writing on another?
I ask because I want to have one thread blocked for a long time on a read as it waits for more data, but during this wait I also have occasional data to send on the socket. I'm not clear if this is threadsafe, or if I should cancel the read before I write (which would be annoying).
Sockets are not thread-safe at the stream level. You have to provide synchronization. The only guarantee is that you won't get copies of the exact same bytes in different read invocations, regardless of concurrency.
But at the Reader and, especially, Writer level, you might have some locking problems.
Anyway, you can handle read and write operations with the Socket's streams as if they were completely independent objects (they are; the only thing they share is their lifecycle).
Once you have provided correct synchronization among reader threads on one hand, and writer threads on the other hand, any number of readers and writers will be okay. This means that, yes, you can read on one thread and write on another (in fact that's very frequent), and you don't have to stop reading while writing.
One last piece of advice: all of these operations involving threads have associated timeouts; make sure that you handle the timeouts correctly.
You actually read from InputStream and write to OutputStream. They are fairly independent and for as long as you serialize access to each of them you are ok.
You have to correlate, however, the data that you send with data that you receive. That's different from thread safety.
Java's java.net.Socket is not actually thread-safe: open the Socket source and look at, say, the connected member field and how it is used. You will see that it is not volatile, and is read and updated without synchronization. This indicates that the Socket class is not designed to be used by multiple threads. Although there are some locks and synchronization there, it is not consistent.
I recommend not doing it. Rather, use buffers (NIO), and do socket reads/writes in one thread.
For details, go to the discussion.
You can have one thread reading the socket and another thread writing to it. You may want to have a number of threads write to the socket, in which case you have to serialize your access with synchronization or you could have a single writing thread which gets the data to write from a queue. (I prefer the former)
You can use non-blocking IO and share the reading and writing work in a single thread. However this is actually more complex and tricky to get right. If you want to do this I suggest you use a library to help you such as Netty or Mina.
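For the single-writing-thread-with-a-queue approach mentioned above, a minimal sketch might look like this (the class name, field names, and line-oriented protocol are made up for illustration):

import java.io.BufferedWriter;
import java.io.IOException;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// One thread owns the writer; other threads only enqueue messages.
class SocketWriterThread implements Runnable {
    private final BufferedWriter writer;
    private final BlockingQueue<String> outbound = new LinkedBlockingQueue<>();

    SocketWriterThread(BufferedWriter writer) {
        this.writer = writer;
    }

    // Called from any thread.
    void send(String message) {
        outbound.add(message);
    }

    @Override
    public void run() {
        try {
            while (!Thread.currentThread().isInterrupted()) {
                String message = outbound.take();   // blocks until there is something to write
                writer.write(message);
                writer.newLine();
                writer.flush();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();     // shutting down
        } catch (IOException e) {
            // socket broken; handle/log as appropriate
        }
    }
}

Meanwhile a separate thread can stay blocked on reads from the socket's BufferedReader without interfering with the writer.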
Very interesting: the NIO SocketChannel writes are synchronized:
http://www.docjar.com/html/api/sun/nio/ch/SocketChannelImpl.java.html
The old IO Socket stuff depends on the OS, so you would have to look at the OS native code to know for sure (and that may vary from OS to OS)...
Just look at java.net.SocketOutputStream.java which is what Socket.getOutputStream returns.
(unless of course I missed something).
Oh, one more thing: they could have put synchronization in the native code in every JVM on each OS, but who knows for sure. Only with NIO is it obvious that synchronization exists.
This is how socketWrite looks in native code, so it's not thread-safe judging from the code:
JNIEXPORT void JNICALL
Java_java_net_SocketOutputStream_socketWrite0(JNIEnv *env, jobject this,
jobject fdObj,
jbyteArray data,
jint off, jint len) {
char *bufP;
char BUF[MAX_BUFFER_LEN];
int buflen;
int fd;
if (IS_NULL(fdObj)) {
JNU_ThrowByName(env, "java/net/SocketException", "Socket closed");
return;
} else {
fd = (*env)->GetIntField(env, fdObj, IO_fd_fdID);
/* Bug 4086704 - If the Socket associated with this file descriptor
* was closed (sysCloseFD), the the file descriptor is set to -1.
*/
if (fd == -1) {
JNU_ThrowByName(env, "java/net/SocketException", "Socket closed");
return;
}
}
if (len <= MAX_BUFFER_LEN) {
bufP = BUF;
buflen = MAX_BUFFER_LEN;
} else {
buflen = min(MAX_HEAP_BUFFER_LEN, len);
bufP = (char *)malloc((size_t)buflen);
/* if heap exhausted resort to stack buffer */
if (bufP == NULL) {
bufP = BUF;
buflen = MAX_BUFFER_LEN;
}
}
while(len > 0) {
int loff = 0;
int chunkLen = min(buflen, len);
int llen = chunkLen;
(*env)->GetByteArrayRegion(env, data, off, chunkLen, (jbyte *)bufP);
while(llen > 0) {
int n = NET_Send(fd, bufP + loff, llen, 0);
if (n > 0) {
llen -= n;
loff += n;
continue;
}
if (n == JVM_IO_INTR) {
JNU_ThrowByName(env, "java/io/InterruptedIOException", 0);
} else {
if (errno == ECONNRESET) {
JNU_ThrowByName(env, "sun/net/ConnectionResetException",
"Connection reset");
} else {
NET_ThrowByNameWithLastError(env, "java/net/SocketException",
"Write failed");
}
}
if (bufP != BUF) {
free(bufP);
}
return;
}
len -= chunkLen;
off += chunkLen;
}
if (bufP != BUF) {
free(bufP);
}
}
I have a BufferedReader (generated by new BufferedReader(new InputStreamReader(process.getInputStream()))). I'm quite new to the concept of a BufferedReader but as I see it, it has three states:
A line is waiting to be read; calling bufferedReader.readLine will return this string instantly.
The stream is open, but there is no line waiting to be read; calling bufferedReader.readLine will hang the thread until a line becomes available.
The stream is closed; calling bufferedReader.readLine will return null.
Now I want to determine the state of the BufferedReader, so that I can determine whether I can safely read from it without hanging my application. The underlying process (see above) is notoriously unreliable and so might have hung; in this case, I don't want my host application to hang. Therefore I'm implementing a kind of timeout. I tried to do this first with threading but it got horribly complicated.
Calling BufferedReader.ready() will not distinguish between cases (2) and (3) above. In other words, if ready() returns false, it might be that the stream just closed (in other words, my underlying process closed gracefully) or it might be that the underlying process hung.
So my question is: how do I determine which of these three states my BufferedReader is in without actually calling readLine? Unfortunately I can't just call readLine to check this, as it opens my app up to a hang.
I am using JDK version 1.5.
There is a state where some data may be in the buffer, but not necessarily enough to fill a line. In this case, ready() would return true, but calling readLine() would block.
You should easily be able to build your own ready() and readLine() methods. Your ready() would actually try to build up a line, and only when it has done so successfully would it return true. Then your readLine() could return the fully-formed line.
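A minimal sketch of that idea (illustrative only; the class and field names are made up): the wrapper pulls characters only while the underlying reader reports ready(), and reports a line only once a full newline-terminated line has been accumulated:

import java.io.IOException;
import java.io.Reader;

// Assembles lines character by character without ever blocking on the underlying Reader.
class NonBlockingLineReader {
    private final Reader in;
    private final StringBuilder current = new StringBuilder();
    private String pendingLine;                 // a fully assembled line, if any

    NonBlockingLineReader(Reader in) {
        this.in = in;
    }

    // Returns true only when a complete line is available; never blocks.
    boolean ready() throws IOException {
        while (pendingLine == null && in.ready()) {
            int c = in.read();                  // will not block: in.ready() was true
            if (c == -1) {
                break;                          // end of stream
            }
            if (c == '\n') {
                pendingLine = current.toString();
                current.setLength(0);
            } else if (c != '\r') {
                current.append((char) c);
            }
        }
        return pendingLine != null;
    }

    // Returns the assembled line, or null if none is complete yet.
    String readLine() throws IOException {
        if (!ready()) {
            return null;
        }
        String line = pendingLine;
        pendingLine = null;
        return line;
    }
}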
Finally I found a solution to this. Most of the answers here rely on threads, but as I specified earlier, I am looking for a solution that doesn't require threads. However, my solution is based on the process itself. What I found is that processes seem to have exited only once both the output (confusingly called the "input" stream) and error streams are empty and closed. This makes sense if you think about it.
So I just polled the output and error streams and also tried to determine if the process had exited or not. Below is a rough copy of my solution.
public String readLineWithTimeout(Process process, long timeout) throws IOException, TimeoutException {
    BufferedReader output = new BufferedReader(new InputStreamReader(process.getInputStream()));
    BufferedReader error = new BufferedReader(new InputStreamReader(process.getErrorStream()));
    boolean finished = false;
    long startTime = 0;
    while (!finished) {
        if (output.ready()) {
            return output.readLine();
        } else if (error.ready()) {
            error.readLine();
        } else {
            try {
                process.exitValue();
                return null;
            } catch (IllegalThreadStateException ex) {
                // Expected behaviour: the process is still running
            }
        }
        if (startTime == 0) {
            startTime = System.currentTimeMillis();
        } else if (System.currentTimeMillis() > startTime + timeout) {
            throw new TimeoutException();
        }
    }
    return null; // not reached in practice, but required by the compiler
}
This is a pretty fundamental issue with java's blocking I/O API.
I suspect you're going to want to pick one of:
(1) Re-visit the idea of using threading. Done properly, this doesn't have to be complicated, and it would let your code escape a blocked I/O read fairly gracefully, for example:
final BufferedReader reader = ...
ExecutorService executor = Executors.newSingleThreadExecutor(); // or another executor from the Executors factory class

Callable<String> task = new Callable<String>() {
    public String call() throws IOException {
        return reader.readLine();
    }
};

Future<String> futureResult = executor.submit(task);
String line = futureResult.get(timeout, TimeUnit.SECONDS); // throws a TimeoutException if the read doesn't return in time
(2) Use java.nio instead of java.io. This is a more complicated API, but it has non-blocking semantics.
Have you confirmed by experiment your assertion that ready() will return false even if the underlying stream is at end of file? Because I would not expect that assertion to be correct (although I haven't done the experiment).
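If you want to run that experiment, a minimal sketch might look like this (the echo command and the printed expectations are illustrative assumptions, not verified results; adjust the command for your OS):

import java.io.BufferedReader;
import java.io.InputStreamReader;

public class ReadyAtEofTest {
    public static void main(String[] args) throws Exception {
        // Start a short-lived process whose output ends almost immediately
        Process p = new ProcessBuilder("echo", "hello").start();
        BufferedReader r = new BufferedReader(new InputStreamReader(p.getInputStream()));
        System.out.println(r.readLine());   // consume the single line of output
        p.waitFor();                        // process has exited, stream is at end of file
        System.out.println("ready() after EOF: " + r.ready());
        System.out.println("readLine() after EOF: " + r.readLine()); // expected: null
    }
}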
You could use InputStream.available() to see if there is new output from the process. This should work the way you want it if the process outputs only full lines, but it's not really reliable.
A more reliable approach to the problem would be to have a separate thread dedicated to reading from the process, pushing every line it reads to some queue or consumer.
In general, you have to implement this with multiple threads. There are special cases, like reading from a socket, where the underlying stream has a timeout facility built-in.
However, it shouldn't be horribly complicated to do this with multiple threads. This is a pattern I use:
private static final ScheduledExecutorService worker =
    Executors.newSingleThreadScheduledExecutor();

private static class Timeout implements Callable<Void> {

    private final Closeable target;

    private Timeout(Closeable target) {
        this.target = target;
    }

    public Void call() throws Exception {
        target.close();
        return null;
    }
}

...

InputStream stream = process.getInputStream();
Future<?> task = worker.schedule(new Timeout(stream), 5, TimeUnit.SECONDS);
/* Use the stream as you wish. If it hangs for more than 5 seconds,
   the underlying stream is closed, raising an IOException here. */
...
/* If you get here without timing out, cancel the asynchronous timeout
   and close the stream explicitly. */
if (task.cancel(false))
    stream.close();
You could make your own wrapper around InputStream or InputStreamReader that works on a byte-by-byte level, for which ready() returns accurate values.
Your other options are threading, which could be done simply (look into some of the concurrent data structures Java offers), and NIO, which is very complex and probably overkill.
If you just want the timeout then the other methods here are possibly better. If you want a non-blocking buffered reader, here's how I would do it, with threads: (please note I haven't tested this and at the very least it needs some exception handling added)
public class MyReader implements Runnable {
    private final BufferedReader reader;
    private ConcurrentLinkedQueue<String> queue = new ConcurrentLinkedQueue<String>();
    private volatile boolean closed = false;    // volatile so other threads see the update

    public MyReader(BufferedReader reader) {
        this.reader = reader;
    }

    public void run() {
        String line;
        try {
            while ((line = reader.readLine()) != null) {
                queue.add(line);
            }
        } catch (IOException e) {
            // readLine() can fail; treat it as a closed connection
        }
        closed = true;
    }

    // Returns true iff there is at least one line on the queue
    public boolean ready() {
        return (queue.peek() != null);
    }

    // Returns true if the underlying connection has closed
    // Note that there may still be data on the queue!
    public boolean isClosed() {
        return closed;
    }

    // Get next line
    // Returns null if there is none
    // Never blocks
    public String readLine() {
        return (queue.poll());
    }
}
Here's how to use it:
BufferedReader b; // Initialise however you normally do
MyReader reader = new MyReader(b);
new Thread(reader).start();
// True if there is data to be read regardless of connection state
reader.ready();
// True if the connection is closed
reader.isClosed();
// Gets the next line, never blocks
// Returns null if there is no data
// This doesn't necessarily mean the connection is closed, it might be waiting!
String line = reader.readLine(); // Gets the next line
There are four possible states:
Connection is open, no data is available
Connection is open, data is available
Connection is closed, data is available
Connection is closed, no data is available
You can distinguish between them with the isClosed() and ready() methods.