I have a server application which receives requests and forwards them on a Unix Domain Socket. This works perfectly under reasonable usage, but when I run load tests with a few thousand requests I get a Broken Pipe error.
I am using Java 7 with junixsocket to send the requests. I have lots of concurrent requests, but I have a thread pool of 20 workers which is writing to the unix domain socket, so there is no issue of too many concurrent open connections.
For each request I am opening, sending and closing the connection with the Unix Domain Socket.
What is the reason that could cause a Broken Pipe on Unix Domain Sockets?
UPDATE:
Putting a code sample if required:
byte[] mydata = new byte[1024];
//fill the data with bytes ...
AFUNIXSocketAddress socketAddress = new AFUNIXSocketAddress(new File("/tmp/my.sock"));
Socket socket = AFUNIXSocket.connectTo(socketAddress);
OutputStream out = new BufferedOutputStream(socket.getOutputStream());
InputStream in = new BufferedInputStream(socket.getInputStream());
out.write(mydata);
out.flush(); //The Broken Pipe occurs here, but only after a few thousand times
//read the response back...
out.close();
in.close();
socket.close();
I have a thread pool of 20 workers, and they are doing the above concurrently (so up to 20 concurrent connections to the same Unix Domain Socket), with each one opening, sending and closing. This works fine for a load test with a burst of 10,000 requests, but when I add a few thousand more I suddenly get this error, so I am wondering whether it's coming from some OS limit.
Keep in mind that this is a Unix Domain Socket, not a network TCP socket.
'Broken pipe' means you have written to a connection that had already been closed by the other end. It is detected somewhat asynchronously due to buffering. It basically means you have an error in your application protocol.
From the Linux Programmer's Manual (similar language is also in the socket man page on Mac):
The communications protocols which implement a SOCK_STREAM ensure that data is not lost or duplicated. If a piece of data for which the peer protocol has buffer space cannot be successfully transmitted within a reasonable length of time, then the connection is considered to be dead. When SO_KEEPALIVE is enabled on the socket the protocol checks in a protocol-specific manner if the other end is still alive. A SIGPIPE signal is raised if a process sends or receives on a broken stream; this causes naive processes, which do not handle the signal, to exit.
In other words, if data gets stuck in a stream socket for too long, you'll end up with a SIGPIPE. It's reasonable that you would end up with this if you can't keep up with your load test.
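A minimal sketch of how a write to an already-closed peer surfaces in Java. It uses a loopback TCP socket instead of junixsocket (an assumption, so that the example needs only the JDK), and the class and method names are hypothetical. The peer accepts and closes immediately; the writer keeps writing until the failure is reported, which usually takes more than one write because of buffering.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical demo class; TCP loopback stands in for the Unix Domain Socket.
final class BrokenPipeDemo {
    /** Returns how many writes succeeded before an IOException, or -1 if none failed. */
    static int writesBeforeFailure(int maxWrites) throws Exception {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort())) {
            server.accept().close();          // the peer closes without reading anything
            OutputStream out = client.getOutputStream();
            byte[] data = new byte[1024];
            for (int i = 0; i < maxWrites; i++) {
                try {
                    out.write(data);
                    out.flush();
                } catch (IOException e) {
                    return i;                 // the error shows up only on a later write
                }
                Thread.sleep(10);             // give the peer's reset time to arrive
            }
            return -1;
        }
    }
}
```

If the server side closes a connection before the client has finished writing (for example on an error path, or because of a timeout under load), the client sees exactly this kind of delayed failure.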
Deployment environment:
I have created a TCP server in Java on Windows 10. My TCP client program is written in VC++ and runs on Windows 7 (I don't have any control over this part of the code; it is a black box to me).
My TCP server code is like this:
Socket s = ss.accept();
s.setReceiveBufferSize(2000);
s.setSendBufferSize(2000);
s.setTcpNoDelay(true);
s.setKeepAlive(true);
new TcpConnectionHandler(s,this.packetHandler);
Following is the TCP connection handler snippet:
InputStream incomingPacketBuffer = this.clientSocket.getInputStream();
OutputStream outgoingPacketBuffer = this.clientSocket.getOutputStream();
int bufferLen=0;
byte inBuffer[] = new byte[this.clientSocket.getReceiveBufferSize()];
byte outBuffer[] = new byte[this.clientSocket.getSendBufferSize()];
while(this.clientSocket.isConnected())
{
bufferLen = incomingPacketBuffer.read(inBuffer);
if(bufferLen>0)
{
outBuffer = (byte[]) this.packetHandlerModule.invoke(this.packetHandler,Arrays.copyOf(inBuffer, bufferLen));
}
if(outBuffer != null)
{
if(this.clientSocket.isConnected())
{
outgoingPacketBuffer.write(outBuffer);
outgoingPacketBuffer.flush();
}
}
}
this.clientSocket.close();
The communication is packet based and the protocol/parsing is handled by packetHandler.
Two more variants I've tried:
I have tried to close the socket as and when a reply is sent back to the client. That is, after receiving one packet of data, I reply to the client and close the connection.
I used inputStream.available before using the read method.
The problem I face:
Most of the time the TCP server replies to incoming packets within a second. If the server receives a packet after some idle time, it doesn't reply to the packet. Sometimes the reply is not transmitted even while active communication is going on. Secondly, the isConnected function returns true even when the client has closed the connection.
Debugging attempts:
I used teraterm to send packets and checked it. The behavior is same. As long as I send packets one after another, I don't have an issue. If one packet doesn't get a reply, then every packet sent after that does not get reply from the server.
When I press Ctrl+C in the server console, all the packets sent from teraterm are processed by the TCP server and the replies are sent back. After this the server works properly for some time.
I checked the packet flow with Wireshark. When the replies are sent back normally, the reply is sent along with the ACK of the client request (SYN, SYN+ACK, ACK, PSH, PSH+ACK, FIN, FIN+ACK, ACK). When the reply stalls (may not be the right term; it is stuck in inputStream.available or inputStream.read), only an ACK packet is sent by the server (SYN, SYN+ACK, ACK, PSH, ACK).
I checked many forums and other threads on Stack Exchange and learned about Nagle's algorithm, and that the application must take care of packetization in TCP: TCP may deliver 10+10 bytes as 8+12 or 15+5 or in any such manner. The server code takes care of packetization, and setKeepAlive is set to true (there is no problem when a packet is sent from the server).
Problem in short: "At times, the TCP read call is blocked for a long duration even when there are incoming packets. When Ctrl+C is pressed, they get processed."
PS: I just started posting queries on Stack Exchange, so kindly let me know if there are any issues with the way I formulated the query.
PPS: Sorry for such a long post.
UPDATE
The comment from EJB helped me to identify the peer disconnect.
I made another setup with Ubuntu 16.04 as the operating system for the server. Over 3 days, the Windows system had the issue occasionally; Ubuntu 16.04 never stalled.
Some things to consider:
the TCP buffer sizes are usually at least 8K and I don't think you can shrink them to 2000 bytes, or if you can, I don't think it's a good idea.
the size of the byte[] doesn't really matter above about 2K; you may as well pick a fixed value.
you don't need to create a buffer more than once.
So in short I would try:
Socket s = ss.accept();
s.setTcpNoDelay(true);
s.setKeepAlive(true);
new TcpConnectionHandler(s,this.packetHandler);
and
try {
InputStream in = this.clientSocket.getInputStream();
OutputStream out = this.clientSocket.getOutputStream();
int bufferLen = 0;
byte[] buffer = new byte[2048];
while ((bufferLen = in.read(buffer)) > 0) {
out.write(buffer, 0, bufferLen); // not buffered so no need to flush
}
} finally {
this.clientSocket.close();
}
At times, TCP read call is getting blocked for a long duration even when there is incoming packets.
I would write a test client in Java to check that this is not due to behaviour in Java.
public static void main(String args[]){
byte[] message = ...
Socket socket = ...
DataOutputStream dOut = new DataOutputStream(socket.getOutputStream());
dOut.write(message); //#1
dOut.close();
socket.close();
}
Let's assume that line #1 puts the data into a buffer, waiting to be flushed to the remote machine. After that the stream and socket are closed.
Assume that some unknown network problem occurs while sending, so the operating system keeps resending the buffered packet until the TCP retransmission timeout.
I am wondering how I can catch this error in a Java program. The code above has already handed the data to the buffer and has probably closed the stream and socket (and probably exited the Java main body), leaving all the remaining work (TCP-related, retransmission) to the operating system.
My question is: will the TCP retransmission (assuming packet loss) continue even after the Java program exits? What is the best way to catch the retransmission timeout error?
TCP will continue to try to cleanly shutdown the connection even after the program exits. It is generally recommended that the application perform the shutdown itself. Basically, you perform the following sequence of steps:
Shutdown the TCP connection in the send direction triggering the normal close sequence. If the protocol prohibits the other side from sending any data, you can shutdown the connection in the receive direction as well, however, if you do this and the other side sends any data, it may cause the other side to detect an abnormal shutdown as the data it sent will be lost.
Continue to read from the connection until you detect a clean or abnormal shutdown from the other end. If all goes well, you will detect a clean shutdown as soon as you finish receiving any data the other side has sent.
Close the handle or delete the object/reference to the connection. The actual TCP connection is already shut down.
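The three steps above could be sketched in Java roughly as follows. The class name, method name and the drain timeout are assumptions for illustration; a real protocol would process the drained data rather than discard it.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.Socket;

// Hypothetical helper illustrating the shutdown sequence described above.
final class GracefulClose {
    /**
     * Half-closes the socket, drains whatever the peer still sends, then closes.
     * Returns true if the peer's end of stream was seen (a clean shutdown).
     */
    static boolean closeGracefully(Socket socket, int drainTimeoutMillis) throws IOException {
        try {
            socket.shutdownOutput();                 // step 1: FIN goes out after pending data
            socket.setSoTimeout(drainTimeoutMillis); // do not wait forever for the peer
            InputStream in = socket.getInputStream();
            byte[] scratch = new byte[2048];
            while (in.read(scratch) != -1) {         // step 2: read until end of stream
                // data is discarded in this sketch
            }
            return true;
        } catch (IOException e) {
            return false;                            // abnormal shutdown: reset, timeout, ...
        } finally {
            socket.close();                          // step 3: release the handle
        }
    }
}
```

A false return value is the closest the application can get to learning that the data may not have been delivered; once the program has closed the socket and exited, there is nothing left to report the error to.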
I am using the below code to send data to a tcp server. I am assuming that I need to use socket.shutdownOutput() to properly indicate that the client is done sending the request. Is my assumption correct? If not please let me know the purpose of shutdownOutput(). Also appreciate any further optimizations I can make.
Client
def address = new InetSocketAddress(tcpIpAddress, tcpPort as Integer)
clientSocket = new Socket()
clientSocket.connect(address, FIVE_SECONDS)
clientSocket.setSoTimeout(FIVE_SECONDS)
// default to 4K when writing to the server
BufferedOutputStream outputStream = new BufferedOutputStream(clientSocket.getOutputStream(), 4096)
//encode the data
final byte[] bytes = reqFFF.getBytes("8859_1")
outputStream.write(bytes,0,bytes.length)
outputStream.flush()
clientSocket.shutdownOutput()
Server
ServerSocket welcomeSocket = new ServerSocket(6789)
while(true)
{
println "ready to accept connections"
Socket connectionSocket = welcomeSocket.accept()
println "accepted client req"
BufferedInputStream inFromClient = new BufferedInputStream(connectionSocket.getInputStream())
BufferedOutputStream outToClient = new BufferedOutputStream(connectionSocket.getOutputStream())
ByteArrayOutputStream bos=new ByteArrayOutputStream()
println "reading data byte by byte"
int b=inFromClient.read() // read() returns an int; a byte would confuse 0xFF with end of stream (-1)
while(b!=-1)
{
bos.write(b)
b=inFromClient.read()
}
String s=bos.toString()
println("Received request: [" + s +"]")
def resp = "InvalidInput"
if(s=="hit") { resp = "some data" }
println "Sending resp: ["+resp+"]"
outToClient.write(resp.getBytes());
outToClient.flush()
}
I am using the below code to send data to a tcp server. I am assuming
that I need to use socket.shutdownOutput() to properly indicate that
the client is done sending the request. Is my assumption correct?
Yes, your assumption is correct. This output shutdown is known as a half-close. With a half-close, TCP lets one end of the connection terminate its output while still receiving data from the other end. Let me walk you through the effects of the socket.shutdownOutput() method:
Locally, the local socket and its input stream behave normally for reading
purposes, but for writing purposes the socket and its output stream behave
as though the socket had been closed by this end: subsequent writes to the
socket will throw an IOException
TCP’s normal connection-termination sequence (a FIN acknowledged by
an ACK) is queued to be sent after any pending data has been sent and acknowledged.
Remotely, the remote socket behaves normally for writing purposes, but for
reading purposes the socket behaves as though it had been closed by this
end: further reads from the socket return an EOF condition, i.e. a read count
of -1 or an EOFException, depending on the method being called.
When the local socket is finally closed, the connection-termination sequence
has already been sent, and is not repeated; if the other end has already
done a half-close as well, all protocol exchanges on the socket are now
complete.
Hence we see that when the EOF is received, that end is assured that the other end has done the output shutdown. This is exactly what socket.shutdownOutput() on the other side achieves.
Source: Fundamental Networking in Java, Esmond Pitt
Socket.shutdownOutput() means that the client has finished sending data through the TCP connection. It will send the remaining data followed by a termination sequence which completely closes its OUTGOING direction. No further data can be sent afterwards, which also tells the receiving program that the request is completely finished. So it's recommended if you are sure you don't have to send any more data.
But it's not needed to indicate that the request is finished (you don't have to open/close the output all the time if you have multiple requests), there are other ways.
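One of those other ways is a length prefix: each request is preceded by its size, so the receiver knows where it ends and the connection can stay open for further requests. This is a minimal sketch, not the protocol from the question; the class name, the 4-byte big-endian prefix and the MAX_LENGTH limit are all assumptions.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical framing helper: a 4-byte length followed by the payload.
final class LengthPrefixedFraming {
    static final int MAX_LENGTH = 1 << 20; // assumed sanity limit of 1 MiB

    static void writeMessage(DataOutputStream out, byte[] payload) throws IOException {
        out.writeInt(payload.length);      // the receiver reads this first
        out.write(payload);
        out.flush();
    }

    static byte[] readMessage(DataInputStream in) throws IOException {
        int length = in.readInt();         // throws EOFException if the stream has ended
        if (length < 0 || length > MAX_LENGTH) {
            throw new IOException("invalid message length: " + length);
        }
        byte[] payload = new byte[length];
        in.readFully(payload);             // blocks until the whole message has arrived
        return payload;
    }
}
```

On the client this would wrap clientSocket.getOutputStream() in a DataOutputStream, and on the server connectionSocket.getInputStream() in a DataInputStream, instead of reading byte by byte until -1.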
It's not my first time trying to understand this issue, but I hope it will be the last one.
Some background:
I have a Java NIO SocketChannel server working in non-blocking mode.
This server has multiple clients which send messages to it and receive messages from it.
Each client maintains its connection to the server with "keepalive" messages every once in a while.
The main idea of the server is that the clients remain connected "all the time" and receive messages from it in "push" mode.
Now to my question:
In the Java NIO read() function, when read() returns -1 it means end of stream (EOS).
From the question I asked here, I understood that it means the socket has finished its current stream and doesn't need to be closed.
When searching Google a bit more about this, I found out that it does mean that the connection is closed on the other side.
What exactly does the word "stream" mean? Is it the current message being sent from the client? Is it the ability of the client side of the connection to send any more messages?
Why would a SocketChannel be closed on the client side if the client never told it to close?
What is the difference between read() returning -1 and a "connection reset by peer" I/O error?
This is how I read from the SocketChannel:
private JSONObject readIncomingData(SocketChannel socketChannel)
throws JSONException, InvalidKeyException, IllegalBlockSizeException, BadPaddingException, IOException {
JSONObject returnObject = null;
ByteBuffer buffer = ByteBuffer.allocate(1024);
Charset charset = Charset.forName("UTF-8");
String endOfMesesage = "\"}";
String message = "";
StringBuilder input = new StringBuilder();
boolean continueReading = true;
while (continueReading && socketChannel.isOpen())
{
buffer.clear();
int bytesRead = socketChannel.read(buffer);
if (bytesRead == -1)
{
continueReading = false;
continue;
}
buffer.flip();
input.append(charset.decode(buffer));
message = input.toString();
if (message.contains(endOfMesesage))
continueReading = false;
}
if (input.length() > 0 && message.contains(endOfMesesage))
{
JSONObject messageJson = new JSONObject(input.toString());
returnObject = new JSONObject(encrypter.decrypt(messageJson.getString("m")));
}
return returnObject;
}
What does the word "stream" exactly means? is it the current message being sent from the client? is it the ability of the client side connection to send anymore messages ?
The stream means the data that is flowing between two locations, usually between the client and the server but effectively it's any kind of data flowing. E.g. if you read a file from your hard disc you use a FileInputStream which represents data flowing from the file on disc to your program. It's a very generic concept. Think of it as a river where the water is the data. Plus it's a very cool kind of river which allows you to control how the water/data is flowing.
Why would a SocketChannel be closed on the client side if the client never told him to be closed ?
That can happen if the connection between client and server is reset or interrupted. Your program should never assume that connections just live and are never interrupted. Connections are interrupted for all kinds of reasons, may it be a flaky network component, someone pulling a plug that should better be left where it was or the wireless network is going down. Also the server might close the connection, e.g. if the server program goes down, has a bug or the connection runs into a timeout. Always remember that open connections are a limited resource so servers might decide to close them if they are idle for too long.
What is the difference between read() return -1 and connection reset by peer I/O error ?
When read() returns -1, it means the end of the stream has been reached: the other side has closed the connection cleanly and no more data will arrive. A connection reset means there was probably more data, but the connection no longer exists and therefore this data cannot be read anymore. Again taking the river analogy: think of the data as some quantity of water being sent from a village upstream (aka Serverville) to a village downstream (aka Clientville) using a riverbed that connects the two villages (the connection). Now someone at Serverville pulls the big lever and the water (the data) flows down from Serverville to Clientville. After Serverville has sent all the water it wanted to send, it closes the lever and the riverbed will be empty again (and actually destroyed, as the connection got closed). This is where Clientville gets the -1. Now imagine some bulldozer interrupting the riverbed so that some of the water never makes it to Clientville. This is the "connection reset" situation.
Hope this helps :)
what does the word "stream" exactly means? is it the current message being sent from the client?
It is a stream of bytes, not messages. You can use those bytes to form a message but the stream has no idea you are doing this, nor does it support messages in any way.
why would a SocketChannel be closed on the client side if the client never told him to be closed ?
It can only be closed with a -1 if the other end closed it.
what is the difference between read() return -1 and connection reset by peer I/O error ?
You can close or drop a connection in other ways, such as closing it from the same side, or through a timeout in the connection, e.g. you pulled out the network cable.
BTW: The way you have written the code is better suited to blocking NIO. For example, if you receive more than one whole message, anything after the first one is discarded. If you use blocking IO and keep everything you read you will not get corrupted or dropped messages.
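A minimal sketch of "keep everything you read": bytes that arrive after the end of one message are kept for the next one instead of being thrown away. The class name is hypothetical, the terminator is the "\"}" marker from the question, and the input is assumed to be already-decoded text (decoding chunk by chunk can itself split a multi-byte character, which this sketch does not handle).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical framer that accumulates text and hands out complete messages.
final class MessageFramer {
    private final StringBuilder pending = new StringBuilder();
    private final String terminator;

    MessageFramer(String terminator) {
        this.terminator = terminator;
    }

    /** Appends a chunk and returns every message completed so far, in order. */
    List<String> feed(String chunk) {
        pending.append(chunk);
        List<String> messages = new ArrayList<String>();
        int end;
        while ((end = pending.indexOf(terminator)) >= 0) {
            int cut = end + terminator.length();
            messages.add(pending.substring(0, cut));
            pending.delete(0, cut);        // anything after the terminator stays buffered
        }
        return messages;
    }
}
```

One framer instance per connection would replace the local StringBuilder in readIncomingData, so a partial second message survives until the next read.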
What does the word "stream" exactly means? is it the current message being sent from the client?
It basically means one side of the connection, which is full-duplex. TCP is a byte-stream protocol, providing two independent byte streams, one in each direction.
Why would a SocketChannel be closed on the client side if the client never told him to be closed?
It wouldn't. The client did close the connection. That's what read() returning -1 means.
What is the difference between read() return -1 and connection reset by peer I/O error ?
read() returning -1 means the peer closed the connection properly. 'Connection reset by peer' indicates a protocol error of some kind, usually that you have written data to a connection that had already been closed by the peer.
Re your code, if read() returns -1 you must close the channel. There is no other sensible way to proceed.
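As a rough sketch of that last point (the class and method names are hypothetical): treat -1 as the signal to close the channel rather than just leaving the read loop. A protocol that relies on half-close would send its response first and then close.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.SocketChannel;

// Hypothetical helper: one read, with end of stream handled by closing the channel.
final class EofHandling {
    /** Returns the number of bytes read, or -1 after closing the channel at end of stream. */
    static int readOnce(SocketChannel channel, ByteBuffer buffer) throws IOException {
        int n = channel.read(buffer);
        if (n == -1) {
            channel.close();   // the peer has shut down its output; nothing more will arrive
        }
        return n;
    }
}
```

In a selector loop, closing the channel also cancels its selection key, so the server stops being woken up for a connection that can never deliver data again.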
For work I have written a specialized HTTP server which only performs 301/302/Frame redirections for web sites. Recently, some nefarious clients have been intentionally opening sockets and writing one character every 500 milliseconds in order to defeat my TCP socket timeout. Then they keep the socket open indefinitely and have multiple clients doing the same thing in a distributed denial of service. This eventually exhausts the thread pool which handles the TCP connections. How would you write your code to make it less susceptible to this sort of bad behavior? Here's my socket accept code:
while (true) {
// Blocks while waiting for a new connection
log.debug("Blocking while waiting for a new connection.") ;
try {
Socket server = httpServer.accept() ;
// After receiving a new connection, set the SO_LINGER and SO_TIMEOUT options
server.setReuseAddress(true) ;
server.setSoTimeout(timeout) ;
server.setSoLinger(true, socketTimeout) ;
// Hand off the new socket connection to a worker thread
threadPool.execute(new Worker(cache, server, requests, geoIp)) ;
} catch (IOException e) {
log.error("Unable to accept socket connection.", e) ;
continue ;
}
}
timeout and socketTimeout are currently set to 500 milliseconds.
Start closing sockets after a certain time has passed. If a socket has stayed open too long just close it down. You could do this in two ways:
You could also put a time limit on how long the client takes to send you a request. If they don't sustain a certain level of throughput, close them. That can be pretty easy to do in your read loop: record System.currentTimeMillis() at the start and compare against it as you loop. If the elapsed time drifts past a certain limit, the connection is shut down and dropped.
An alternative is to not reject them but let your thread return to the pool and put the socket on a stack to watch. Let the bytes pile up, and after they have reached a certain size you can then pass them to a thread in the pool to process. This is the hybrid approach between cutting them off and allowing that they may not be bad, just slow.
Another way to handle that is watch how long a thread has been working on a request, and if it's not finished within a time limit close the underlying socket. Then the thread will get a SocketException and it can shutdown and clean up.
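A minimal sketch of the time-limit idea, assuming the worker reads the request headers itself. The class name, the budget and the size limit are hypothetical. The point is that SO_TIMEOUT alone only bounds each individual read, so a client trickling one byte every 500 ms never trips it; here the per-read timeout shrinks as an overall budget is used up.

```java
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical reader enforcing a total time budget on one request.
final class DeadlineReader {
    /** Reads until the blank line ending the headers, or fails once the budget is spent. */
    static byte[] readRequest(Socket socket, long budgetMillis, int maxBytes) throws IOException {
        long deadline = System.nanoTime() + budgetMillis * 1_000_000L;
        InputStream in = socket.getInputStream();
        ByteArrayOutputStream request = new ByteArrayOutputStream();
        byte[] buffer = new byte[1024];
        while (true) {
            long remainingMillis = (deadline - System.nanoTime()) / 1_000_000L;
            if (remainingMillis <= 0) {
                throw new SocketTimeoutException("request exceeded its time budget");
            }
            // Each read may wait only for what is left of the overall budget.
            socket.setSoTimeout((int) Math.min(remainingMillis, Integer.MAX_VALUE));
            int n = in.read(buffer);
            if (n == -1) {
                throw new EOFException("client closed before finishing the request");
            }
            request.write(buffer, 0, n);
            if (request.size() > maxBytes) {
                throw new IOException("request too large");
            }
            // Simplification: rescans the whole request for the header terminator.
            if (request.toString("ISO-8859-1").contains("\r\n\r\n")) {
                return request.toByteArray();
            }
        }
    }
}
```

The worker would catch the IOException, close the socket and return to the pool, so a slow client can hold a thread for at most the budget.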
Here are some other ideas that mostly involve using outside hardware like firewalls, load balancers, etc.
https://security.stackexchange.com/questions/114/what-techniques-do-advanced-firewalls-use-to-protect-againt-dos-ddos/792#792