I got assigned to work on some performance and random crashing issues of a multi-threaded Java server. Even though threads and thread-safety are not really new topics for me, I found out designing a new multi-threaded application is probably half as difficult as trying to tweak some legacy code. I skimmed through some well-known books in search of answers, but the weird thing is, as long as I read about it and analyze the examples provided, everything seems clear. However, the second I look at the code I'm supposed to work on, I'm no longer sure about anything! Must be too much theoretical knowledge and too little real-world experience or something.
Anyway, getting back on topic, as I was doing some on-line research, I came across this piece of code. The question which keeps bothering me is: Is it really safe to invoke getInputStream() and getOutputStream() on the socket from two separate threads without synchronization? Or am I now getting a bit too paranoid about the whole thread-safety issue? Guess that's what happens when like the 5th book in a row tells you how many things can possibly go wrong with concurrency.
PS. Sorry if the question is a bit lengthy or maybe too 'noobie', please be easy on me - that's my first post here.
Edit: Just to be clear, I know sockets work in full-duplex mode and it's safe to concurrently use their input and output streams. Seems fine to me when you acquire those references in the main thread and then initialize thread objects with those, but is it also safe to get those streams in two different threads?
#rsp:
So I've checked Sun's code and PlainSocketImpl does synchronize on those two methods, just as you said. Socket, however, doesn't. getInputStream() and getOutputStream() are pretty much just wrappers for SocketImpl, so probably concurrency issues wouldn't cause the whole server to explode. Still, with a bit of unlucky timing, seems like things could go wrong (e.g. when some other thread closes the socket when the method already checked for error conditions).
As you pointed out, from a code structure standpoint, it would be a good idea to supply each thread with a stream reference instead of a whole socket. I would've probably already restructured the code I'm working on if not for the fact that each thread also uses the socket's close() method (e.g. when the socket receives a "shutdown" command). As far as I can tell, the main purpose of those threads is to queue messages for sending or for processing, so maybe it's a Single Responsibility Principle violation and those threads shouldn't be able to close the socket (compare with Separated Modem Interface)? But then if I keep analysing the code for too long, it appears the design is generally flawed and the whole thing requires rewriting. Even if the management was willing to pay the price, seriously refactoring legacy code that has no unit tests whatsoever, while dealing with hard-to-debug concurrency issues, would probably do more harm than good. Wouldn't it?
The input stream and output stream of the socket represent two separate data streams or channels. It is perfectly safe to use both streams in threads that are not synchronised with each other. The socket streams themselves will block reading and writing on empty or full buffers.
Edit: the socket implementation classes from Sun do synchronize the getInputStream() and getOutputStream() methods, so calling them from different threads should be OK. I agree with you, however, that passing the streams to the threads using them might make more sense from a code structure standpoint (dependency injection helps testing, for instance).
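To make the "pass the streams, not the socket" suggestion concrete, here is a minimal sketch. The class and method names (StreamWorkers, reader, writer) are made up for illustration and are not from the code under discussion. Each worker only sees the stream it needs, so it can be exercised with in-memory streams in a test.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper: workers receive a stream, never the Socket itself.
class StreamWorkers {

    /** Copies everything from 'in' into 'sink' until end of stream. */
    static Thread reader(InputStream in, ByteArrayOutputStream sink) {
        return new Thread(() -> {
            try {
                byte[] buf = new byte[1024];
                int n;
                while ((n = in.read(buf)) != -1) {
                    sink.write(buf, 0, n);
                }
            } catch (IOException e) {
                // With a real socket, a close() from another thread surfaces here.
            }
        }, "reader");
    }

    /** Writes one message to 'out' and flushes it. */
    static Thread writer(OutputStream out, byte[] message) {
        return new Thread(() -> {
            try {
                out.write(message);
                out.flush();
            } catch (IOException e) {
                // The connection is gone; nothing more to send.
            }
        }, "writer");
    }
}
```

With a real connection, the owning thread would call socket.getInputStream() and socket.getOutputStream() once, hand the results to reader(...) and writer(...), and keep the responsibility for close() to itself.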
Related
I have a problem caused by multi-threading and Android Open Accessory.
I need to communicate with a USB Accessory, but I need to do it from 2 threads. One thread generates and sends data the other one reads data.
Why I don't use a single thread? Because there can be 1 or more writes before a read and reads are blocking, so that is not an option.
If using multiple threads, I do run into "I/O Error (No such device)" sooner or later, because I will have a collision between read & write being executed at the same time.
Locking will more or less put me back in single-thread situation, so not good.
The .available() method on the input stream is not supported, so I cannot check whether anything is available before doing a read.
Since it's not a socket-based stream I cannot set timeout either.
I have tried getting the FileDescriptor from the USBAccessory and passing to JNI to handle it there, but after the first read/write the device becomes inaccessible.
Question/Suggestion needed:
What will be a suggested/best-practice approach to this? I do not expect written code, I just need some guidance on how to approach this problem.
To clarify:
The software at the other end might or might NOT respond with any data. There are some so-called silent sends, where the data sent is just received but there is no ACK. Since the app I'm working on is only a proxy, I do not have a clear picture of whether the data will or will not produce an answer. That would require analysis of the data as well, which isn't on the books at the moment.
Since you want to read and write in parallel, a write will always make a read pause whenever both touch the same part.
Maybe you can follow an approach similar to ConcurrentHashMap: use different locks for different segments, and block the read only if the write is on the same segment; otherwise let the read proceed.
This will
Avoid blocking read during write in most scenarios
Avoid collision and
Definitely won't be a single-thread approach.
Hope that helps.
If using multiple threads, I do run into I/O Error (No such device)
sooner or later, because I will have a collision between read & write
being executed at the same time.
This says it all. Since you are doing read and write on the same channel that does not support concurrent access, you are required to have your thread wait until the other thread is done doing read/write.
Your two-thread approach is what I would do, more or less. Good luck and trust in yourself.
I'm trying to build a Java BitTorrent client. From what I understand, after peers handshake with one another they may start sending messages to each other, often sporadically.
Using a DataInputStream connection I can read messages, but if I call a read and nothing is on the stream, the call blocks. Is there a way I can tell if something is being sent over the stream? Or should I create a new thread for each peer that reads the stream for messages continuously until the client shuts it down?
I think you need to do some major experimenting so that you can start to learn the basics of socket I/O. Trying to answer your question "as is" is difficult, because you don't yet understand enough to ask the question in a manner that it can be answered.
If you want to be able to tell if there is data to read, then you should not use the blocking I/O approach. Instead, you will need to use the APIs known as "NIO", which allow you to "select" a socket that has data to read (i.e. a socket that is associated with a buffer that already has data in it).
This will make much more sense after you write a lot of code and mess it up a few times. The underlying I/O primitives are actually quite primitive (pun intended). In this industry, we just made up lots of complicated terms and function names and API descriptions so that people would think that network communication is magic. It's not. It's generally no more complicated than memcpy().
There is a function in C called select(). In the scenario you've described, you need an equivalent of select in Java. And that is, as cpurdy mentioned, non-blocking socket I/O, or NIO. Cursory googling returned the following links:
http://tutorials.jenkov.com/java-nio/socket-channel.html
http://www.owlmountain.com/tutorials/NonBlockingIo.htm
http://rox-xmlrpc.sourceforge.net/niotut/index.htm
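As a rough illustration of the select-style approach, the sketch below registers peer channels with a java.nio.channels.Selector and reads only from those that actually have data. PeerPoller and its method names are hypothetical, and a real BitTorrent client would still need message framing on top of this.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.SocketChannel;
import java.util.Iterator;

// Hypothetical helper: one selector watches many peers from a single thread.
class PeerPoller {
    private final Selector selector;

    PeerPoller() throws IOException {
        selector = Selector.open();
    }

    /** Registers a connected peer; a channel must be non-blocking to be selectable. */
    void add(SocketChannel peer) throws IOException {
        peer.configureBlocking(false);
        peer.register(selector, SelectionKey.OP_READ);
    }

    /**
     * Waits up to timeoutMillis (must be > 0, since 0 means "wait forever")
     * for any peer to become readable, then reads what is available into dst.
     * Returns the number of bytes read in this pass.
     */
    int pollOnce(long timeoutMillis, ByteBuffer dst) throws IOException {
        if (selector.select(timeoutMillis) == 0) {
            return 0; // no peer has data; the caller can do other work
        }
        int total = 0;
        Iterator<SelectionKey> it = selector.selectedKeys().iterator();
        while (it.hasNext()) {
            SelectionKey key = it.next();
            it.remove(); // the selected-key set is not cleared automatically
            if (key.isValid() && key.isReadable()) {
                int n = ((SocketChannel) key.channel()).read(dst);
                if (n == -1) {
                    key.cancel();          // peer closed the connection
                    key.channel().close();
                } else {
                    total += n;
                }
            }
        }
        return total;
    }
}
```

The point of the design is that a silent peer costs nothing: no thread sits blocked in read() waiting for it.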
You might want to take a look at the Netty project: http://netty.io/
It is very easy with Netty to get started on network programming.
I would like opinion on this to settle a small dispute. Any help would be greatly appreciated.
I have written my own file handler that is attached to the logger. This being a file handler and being accessed by multiple threads, I am using synchronization in order to ensure that there is no collision during the writing process. Additionally it is a rolling log, so I also close and open files, and do not want any problems there either.
His response to it was (as pasted from email)
I strongly believe that Synchronization is very bad in the Handler. It
is too complex for such easy task. So, I would say why do not use one
instance per Thread?
What would you say is better from a performance and memory-management perspective?
Thank you very much for any response. Whenever writing and reading are involved in multithreaded applications, I have used synchronization in Java applications all my life, and have not heard of any severe performance issues.
So please I would like to know if there are any issues and I really should switch to one instance per thread.
And in general, what would be the downfall of using synchronization?
EDIT: the reason why I wrote a custom file handler (yes I do love slf4j), is because my custom handler is dealing with two files at once, and additionally I have few other functions I perform on top of writing to files.
another solution would be to use a separate thread to do the (costly on its own) writing and use concurrent queues to pass the log messages from the domain threads
the key part here is that pushing to a queue is much less costly than writing to a file and means that there is less interference from concurrent log calls
the call to log would then look like
private static BlockingQueue<String> logQueue = //...
public static void log(String message){
//construct&filter message
logQueue.add(message);
}
then in the logger thread it will look like
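while(true){
String message = logQueue.take();//blocks until a message arrives (poll() would return null right away on an empty queue); throws InterruptedException
logFile.println(message);//or whatever you are doing
}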
As with all I/O, you have little choice but mutual exclusion. You may theoretically build up a complex scheme with a lock-free queue which accumulates logging entries, but its utility, and especially its reliability, would be very questionable: without careful design you could get a logging-caused OOME, have the application hang on due to threads which you didn't clean up, etc.
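The two risks named above (unbounded memory growth and a writer thread that never exits) can be reduced with a bounded queue and an explicit stop signal. The following is only a sketch under those assumptions: QueueLogger, its capacity parameter, and the drop-when-full policy are illustrative choices, not part of the original handler.

```java
import java.io.IOException;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical queue-backed logger with a bounded buffer and clean shutdown.
class QueueLogger {
    // Sentinel compared by identity, so a real message "STOP" is still logged.
    private static final String STOP = new String("STOP");

    private final BlockingQueue<String> queue;
    private final Thread writer;

    QueueLogger(Appendable sink, int capacity) {
        queue = new ArrayBlockingQueue<>(capacity); // bounded: cannot grow into an OOME
        writer = new Thread(() -> {
            try {
                while (true) {
                    String message = queue.take(); // blocks while the queue is empty
                    if (message == STOP) {
                        return;
                    }
                    sink.append(message).append('\n');
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } catch (IOException e) {
                // The sink failed; a real handler would report this somewhere.
            }
        }, "log-writer");
        writer.setDaemon(true); // a forgotten logger will not keep the JVM alive
        writer.start();
    }

    /** Never blocks the caller; returns false if the queue was full and the message was dropped. */
    boolean log(String message) {
        return queue.offer(message);
    }

    /** Lets queued messages drain, then stops the writer thread. */
    void close() throws InterruptedException {
        queue.put(STOP);
        writer.join();
    }
}
```

Dropping messages when the queue is full trades completeness for predictable latency; calling put(...) instead of offer(...) in log() would make callers wait rather than lose entries.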
Keep in mind that, assuming you are using buffered I/O, you already have an equivalent of a queue, minimizing the time spent occupying the lock.
The downside of synchronisation is that only one thread can access that part of the code at any one time, meaning your code will see little benefit from multithreading there, i.e. the synchronised part of your application will only be as fast as a single thread. (There is a small overhead for handling the synchronised status too, so it may be a little slower.)
However, in subjects where you don't want the threads to interfere with one another, such as writing to files, the security gained from the synchronisation is paramount, and the performance loss should just be accepted.
I know that thread safety of Java sockets has been discussed in several threads here on Stack Overflow, but I haven't been able to find a clear answer to this question - Is it, in practice, safe to have multiple threads concurrently write to the same SocketOutputStream, or is there a risk that the data sent from one thread gets mixed up with the data from another thread? (For example the receiver on the other end first receives the first half of one thread's message and then some data from another thread's message and then the rest of the first thread's message)
The reason I said "in practice" is that I know the Socket class isn't documented as thread-safe, but if it actually is safe in current implementations, then that's good enough for me. The specific implementation I'm most curious about is Hotspot running on Linux.
When looking at the Java layer of hotspot's implementation, more specifically the implementation of socketWrite() in SocketOutputStream, it looks like it should be thread safe as long as the native implementation of socketWrite0() is safe. However, when looking at the implementation of that method (j2se/src/solaris/native/java/net/SocketOutputStream.c), it seems to split the data to be sent into chunks of 64 or 128kb (depending on whether it's a 64bit JVM) and then sends the chunks in separate writes.
So - to me, it looks like sending more than 64kb from different threads is not safe, but if it's less than 64kb it should be safe... but I could very well be missing something important here. Has anyone else here looked at this and come to a different conclusion?
I think it's a really bad idea to depend so heavily on the implementation details of something that can change beyond your control. If you do something like this you will have to very carefully control the versions of everything you use to make sure it's what you expect, and that's very difficult to do. And you will also have to have a very robust test suite to verify that the multithreaded operation functions correctly, since you are depending on code inspection and rumors from randoms on StackOverflow for your solution.
Why can't you just wrap the SocketOutputStream into another passthrough OutputStream and then add the necessary synchronization at that level? It's much safer to do it that way and you are far less likely to have unexpected problems down the road.
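A passthrough wrapper of the kind suggested here could look roughly like this. SynchronizedOutputStream is a made-up name, and the sketch assumes every message is handed over in a single write(byte[]) call, since the lock only covers one call at a time.

```java
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical wrapper: serializes all writes to the underlying stream.
class SynchronizedOutputStream extends FilterOutputStream {

    SynchronizedOutputStream(OutputStream out) {
        super(out);
    }

    @Override
    public synchronized void write(int b) throws IOException {
        out.write(b);
    }

    // FilterOutputStream.write(byte[]) delegates to this method, so whole-array
    // writes are covered by the same lock.
    @Override
    public synchronized void write(byte[] b, int off, int len) throws IOException {
        out.write(b, off, len);
    }

    @Override
    public synchronized void flush() throws IOException {
        out.flush();
    }
}
```

Usage would be something like new SynchronizedOutputStream(socket.getOutputStream()), shared by all sending threads. A message assembled from several write calls can still be interleaved with another thread's data, so build the full message in a byte array first.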
According to this documentation http://www.docjar.com/docs/api/java/net/SocketOutputStream.html, the class does not claim to be thread safe, so assume it is not. It inherits from FileOutputStream, and file I/O is normally not inherently thread safe.
My advice is to assume that any class related to hardware or communications is not thread safe or "blocking". The reason is that thread-safe operations take more time, which you may not want. My background is not in Java, but other libraries are similar in philosophy.
I notice you tested the class extensively, but you may test it all day for many days, and it may not prove anything, my 2-cents.
Good luck & have fun with it.
Tommy Kwee
Sometimes, while sending a large amount of data via SocketChannel.write(), the underlying TCP buffer gets filled up, and I have to continually re-try the write() until the data is all sent.
So, I might have something like this:
public void send(ByteBuffer bb, SocketChannel sc) throws IOException, InterruptedException {
sc.write(bb);
while (bb.remaining()>0){
Thread.sleep(10);
sc.write(bb);
}
}
The problem is that the occasional issue with a large ByteBuffer and an overflowing underlying TCP buffer means that this call to send() will block for an unexpected amount of time. In my project, there are hundreds of clients connected simultaneously, and one delay caused by one socket connection can bring the whole system to a crawl until this one delay with one SocketChannel is resolved. When a delay occurs, it can cause a chain reaction of slowing down in other areas of the project, and having low latency is important.
I need a solution that will take care of this TCP buffer overflow issue transparently and without causing everything to block when multiple calls to SocketChannel.write() are needed. I have considered putting send() into a separate class extending Thread so it runs as its own thread and does not block the calling code. However, I am concerned about the overhead necessary in creating a thread for EACH socket connection I am maintaining, especially when 99% of the time, SocketChannel.write() succeeds on the first try, meaning there's no need for the thread to be there. (In other words, putting send() in a separate thread is really only needed if the while() loop is used -- only in cases where there is a buffer issue, perhaps 1% of the time) If there is a buffer issue only 1% of the time, I don't need the overhead of a thread for the other 99% of calls to send().
I hope that makes sense... I could really use some suggestions. Thanks!
Prior to Java NIO, you had to use one Thread per socket to get good performance. This is a problem for all socket based applications, not just Java. Support for non-blocking IO was added to all operating systems to overcome this. The Java NIO implementation is based on Selectors.
See The definitive Java NIO book and this On Java article to get started. Note however, that this is a complex topic and it still brings some multithreading issues into your code. Google "non blocking NIO" for more information.
The more I read about Java NIO, the more it gives me the willies. Anyway, I think this article answers your problem...
http://weblogs.java.net/blog/2006/05/30/tricks-and-tips-nio-part-i-why-you-must-handle-opwrite
It sounds like this guy has a more elegant solution than the sleep loop.
Also I'm fast coming to the conclusion that using Java NIO by itself is too dangerous. Where I can, I think I'll probably use Apache MINA which provides a nice abstraction above Java NIO and its little 'surprises'.
You don't need the sleep() as the write will either return immediately or block.
You could have an executor which you pass the write to if it doesn't write the first time.
Another option is to have a small pool of threads to perform the writes.
However, the best option for you may be to use a Selector (as has been suggested) so you know when a socket is ready to perform another write.
For hundreds of connections, you probably don't need to bother with NIO. Good old fashioned blocking sockets and threads will do you.
With NIO, you can register interest in OP_WRITE for the selection key, and you will get notified when there is room to write more data.
There are a few things you need to do, assuming you already have a loop using Selector.select(); to determine which sockets are ready for I/O.
Set the socket channel to non-blocking after you've created it, sc.configureBlocking(false);
Write (possibly parts of) the buffer and check if there's anything left. The buffer itself takes care of current position and how much is left.
Something like
sc.write(bb);
if (bb.remaining() == 0) {
//we're done with this buffer, remove it from the select set if there's nothing else to send.
} else {
//do other stuff/return to select loop
}
Get rid of your while loop that sleeps
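Putting those steps together, one possible shape for the write path is sketched below. PendingWrites and its method names are hypothetical. The sketch assumes one instance per connection, a non-blocking channel, and that every method is called from the selector thread only.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.SocketChannel;
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical per-connection helper that replaces the sleep-and-retry loop.
class PendingWrites {
    private final Queue<ByteBuffer> pending = new ArrayDeque<>();

    boolean hasPending() {
        return !pending.isEmpty();
    }

    /** Tries to write immediately; anything left over waits for OP_WRITE. */
    void send(SelectionKey key, ByteBuffer bb) throws IOException {
        if (pending.isEmpty()) {
            ((SocketChannel) key.channel()).write(bb); // may write only part of bb
        }
        if (bb.hasRemaining()) {
            pending.add(bb); // keeps ordering behind any earlier leftovers
            key.interestOps(key.interestOps() | SelectionKey.OP_WRITE);
        }
    }

    /** Call from the select loop when key.isWritable() is true. */
    void onWritable(SelectionKey key) throws IOException {
        SocketChannel sc = (SocketChannel) key.channel();
        while (!pending.isEmpty()) {
            ByteBuffer head = pending.peek();
            sc.write(head);
            if (head.hasRemaining()) {
                return; // socket buffer is full again; wait for the next OP_WRITE
            }
            pending.remove();
        }
        // Nothing left to send: stop asking for write readiness, otherwise
        // select() would return immediately on every pass.
        key.interestOps(key.interestOps() & ~SelectionKey.OP_WRITE);
    }
}
```

In the common case where the first write() takes the whole buffer, send() returns without touching the interest set, so the extra machinery is only exercised when the TCP buffer is actually full.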
I am facing some of the same issues right now:
- If you have a small number of connections, but with large transfers, I would just create a thread pool and let the writes block in the writer threads.
- If you have a lot of connections, then you could use full Java NIO: register OP_WRITE on your accept()ed sockets, and then wait for the selector to report them as writable.
The O'Reilly Java NIO book has all this.
Also:
http://www.exampledepot.com/egs/java.nio/NbServer.html?l=rel
Some research online has led me to believe NIO is pretty much overkill unless you have a lot of incoming connections. Otherwise, if it's just a few large transfers, then just use a write thread. It will probably have a quicker response. A number of people have issues with NIO not responding as quickly as they want. Since your write thread blocks on its own, it won't hurt you.