We have a Java 8 application served by Apache Tomcat 8 behind an Apache HTTP server, and it calls multiple web services in parallel using CXF. From time to time, one of those calls takes exactly 3 seconds longer than the rest (which should take only about 500 ms).
I've activated CXF debug logging, and I have now located the place inside CXF where the 3 seconds are lost:
14/03/2018 09:20:49.061 [pool-838-thread-1] DEBUG o.a.cxf.transport.http.HTTPConduit - No Trust Decider for Conduit '{http://ws.webapp.com/}QueryWSImplPort.http-conduit'. An affirmative Trust Decision is assumed.
14/03/2018 09:20:52.077 [pool-838-thread-1] DEBUG o.a.cxf.transport.http.HTTPConduit - Sending POST Message with Headers to http://172.16.56.10:5050/services/quertServices Conduit :{http://ws.webapp.com/}QueryWSImplPort.http-conduit
As you can see, there are three seconds between these two lines. When the request is OK, there are usually 0 ms between them.
I've been looking into the CXF code, but I have no clue about the reason for these 3 seconds...
The server application (which is also maintained by us) is served from another Apache Tomcat, 6.0.49, which is behind another Apache server. The thing is that the server's Apache seems to receive the request after the 3 seconds.
Could anyone help me?
EDIT:
We've monitored the packets sent and received on both machines, and it seems that the client side sends its negotiation (SYN) packet at the time it should, while the server replies 3 seconds later.
These are the packets we've found:
481153 11:31:32 14/03/2018 2429.8542795 tomcat6.exe SOLTESTV010 SOLTESTV002 TCP TCP:Flags=CE....S., SrcPort=65160, DstPort=5050, PayloadLen=0, Seq=2858646321, Ack=0, Win=8192 ( Negotiating scale factor 0x8 ) = 8192 {TCP:5513, IPv4:62}
481686 11:31:35 14/03/2018 2432.8608381 tomcat6.exe SOLTESTV002 SOLTESTV010 TCP TCP:Flags=...A..S., SrcPort=5050, DstPort=65160, PayloadLen=0, Seq=436586023, Ack=2858646322, Win=8192 ( Negotiated scale factor 0x8 ) = 2097152 {TCP:5513, IPv4:62}
481687 11:31:35 14/03/2018 2432.8613607 tomcat6.exe SOLTESTV010 SOLTESTV002 TCP TCP:Flags=...A...., SrcPort=65160, DstPort=5050, PayloadLen=0, Seq=2858646322, Ack=436586024, Win=256 (scale factor 0x8) = 65536 {TCP:5513, IPv4:62}
481688 11:31:35 14/03/2018 2432.8628380 tomcat6.exe SOLTESTV010 SOLTESTV002 HTTP HTTP:Request, POST /services/consultaServices {HTTP:5524, TCP:5513, IPv4:62}
So it seems the server's Tomcat is the one that is blocked on something. Any clue?
EDIT 2:
Although that happened yesterday (the first server waiting 3 s for the ACK from the second), it is not the most common scenario. What usually happens is what I described at the beginning: 3 seconds between the two CXF log lines, and the server receiving the request from the first one 3 seconds after it was sent.
There have been some occasions when the server (the one receiving the request) hangs for 3 seconds. For instance:
Server 1 sends 5 requests at (supposedly) the same time to server 2.
Server 2 receives 4 of them within that same second and starts to process them.
Server 2 finishes processing 2 of those 4 requests in 30 ms and replies to server 1.
At more or less that same second, nothing is registered in the application logs.
After three seconds, logs are registered again and the server finishes processing the remaining 2 requests. So, although the processing itself takes only a few milliseconds, response_time - request_time is 3 seconds and a few ms.
At that same time, the remaining request (the last of the 5 that were sent) shows up in the network monitor and is processed by the application in just a few milliseconds. However, its overall processing time is also over 3 s, as it reached the server 3 seconds after being sent.
So there is something like a hang in the middle of the process: 2 requests were processed and answered before the hang in just a fraction of a second; 2 other requests took a little longer, the hang happened, and they ended with a processing time of 3 seconds; and the last one reached the server just when the hang happened, so it didn't get into the application until after the hang.
It sounds like a GC stop-the-world pause... but we have analyzed the GC logs and there is nothing wrong there... could there be any other reason?
Thanks!
EDIT 3:
Looking at TCP flags like the ones I pasted last week, we've noticed that there are lots of packets with the CE flag, which is a notification of TCP congestion. We're not network experts, but we have found that this could lead to a 3-second delay before the packet is retransmitted...
Could anyone give us some help with that?
Thanks. Kind regards.
In the end, everything was caused by the network congestion we discovered by looking at the TCP flags. Our network admins have been working on the problem, trying to reduce the congestion and lowering the retransmission timeout.
The thing is that it seems that the server's Apache receives the request after the 3 seconds.
How do you figure this out? If you're looking at Apache logs, you can be misled by wrong timestamps.
I first thought that your Tomcat 6 takes 3 seconds to answer instead of 0 to 500 ms, but from the question and the comments, that is not the case.
Hypothesis 1: Garbage Collector
The GC is known for introducing latency.
Highlight the GC activity in your logs by using the GC verbosity parameters. If it is too difficult to correlate, you can use the jstat command with the -gcutil option and compare it easily with Tomcat's logs.
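For example (only a sketch; the log path and the PID are placeholders), on a Java 8 JVM the GC activity can be written to its own file with flags like these, and jstat can sample the same process once per second:

-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCApplicationStoppedTime -Xloggc:/path/to/gc.log

jstat -gcutil <tomcat_pid> 1000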
Hypothesis 2: network timeout
Although 3s is a very short time (in comparison with the 21s TCP default timeout on Windows for example), it could be a timeout.
To track timeouts, you can use the netstat command. With netstat -an, look for SYN_SENT connections, and with netstat -s, look at the error counters. Please also check whether there is any network resource that must be resolved or accessed by this guilty web service caller.
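If you also want to make the client-side timeouts explicit instead of relying on the defaults, CXF lets you set them on the same HTTP conduit that appears in your logs. A minimal sketch, assuming port is your generated JAX-WS client proxy and the millisecond values are examples only:

import org.apache.cxf.endpoint.Client;
import org.apache.cxf.frontend.ClientProxy;
import org.apache.cxf.transport.http.HTTPConduit;
import org.apache.cxf.transports.http.configuration.HTTPClientPolicy;

public class CxfTimeouts {
    // 'port' is the JAX-WS client proxy for QueryWSImplPort (an assumption about your code)
    static void configureTimeouts(Object port) {
        Client client = ClientProxy.getClient(port);
        HTTPConduit conduit = (HTTPConduit) client.getConduit();
        HTTPClientPolicy policy = new HTTPClientPolicy();
        policy.setConnectionTimeout(5000);   // ms allowed to establish the TCP connection
        policy.setReceiveTimeout(10000);     // ms allowed to wait for the response
        conduit.setClient(policy);
    }
}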
Related
I have been working on socket programming in Java recently, and something is confusing me. I have three questions about it.
The first one is:
There is a ServerSocket constructor in Java that can take up to 3 parameters: port, backlog and IP address. Backlog means the number of clients that can wait in a queue to connect to the server. Now let's think about this situation.
What happens if 10 clients try to connect to this server at the same time?
Does the server drop the last 5 clients which tried to connect? Let's increase the number of clients to 1 million per hour. How can I handle all of them?
The second question is:
Can a client send messages concurrently without waiting for the server's response? What happens if a client sends 5 messages to a server that has a backlog size of 5?
The last one is not actually a question. I have a plan in mind to manage load balancing. Let's assume we have 3 servers running on a machine.
Let the servers' names be A, B and C, and all of them are running smoothly. According to my plan, if I give them a priority according to incoming messages, then the smallest priority means the most available server. For example:
Initial priorities -> A(0), B(0), C(0), and the response time is at the end of the 5th time unit.
1. Message -> A(1), B(0), C(0)
2. Message -> A(1), B(1), C(0)
3. Message -> A(1), B(1), C(1)
4. Message -> A(2), B(1), C(1)
5. Message -> A(2), B(2), C(1)
6. Message -> A(1), B(2), C(2)
.
.
.
Is this logic good? I bet there is far better logic. What should I do to handle more or less a few million requests a day?
PS: All this logic is going to be implemented in a Java Spring Boot project.
Thanks
What happens if 10 clients try to connect to this server at the same time?
The javadoc explains it:
The backlog argument is the requested maximum number of pending connections on the socket. Its exact semantics are implementation specific. In particular, an implementation may impose a maximum length or may choose to ignore the parameter altogether.
Let's increase the number of clients to 1 million per hour. How can I handle all of them?
By accepting them fast enough to handle them all in one hour. Either the conversations are so quick that you can just handle them one after another, or, more realistically, you handle the various messages in several threads, or use non-blocking IO.
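A bare-bones sketch of the non-blocking-IO route with java.nio (the port number is arbitrary, and the actual reading of messages is left as a comment):

import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Iterator;

public class NonBlockingServer {
    public static void main(String[] args) throws IOException {
        Selector selector = Selector.open();
        ServerSocketChannel server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress(8080));
        server.configureBlocking(false);
        server.register(selector, SelectionKey.OP_ACCEPT);

        while (true) {
            selector.select();   // one thread waits for events on all connections
            Iterator<SelectionKey> it = selector.selectedKeys().iterator();
            while (it.hasNext()) {
                SelectionKey key = it.next();
                it.remove();
                if (key.isAcceptable()) {
                    SocketChannel client = server.accept();
                    client.configureBlocking(false);
                    client.register(selector, SelectionKey.OP_READ);
                } else if (key.isReadable()) {
                    // read from ((SocketChannel) key.channel()) and handle the message here
                }
            }
        }
    }
}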
Can a client send messages concurrently without waiting for the server's response?
Yes.
What happens if a client sends 5 messages to a server that has a backlog size of 5?
Sending messages has nothing to do with the backlog size. The backlog is for pending connections. Messages can only be sent once you're connected.
All this logic is going to be implemented in a Java Spring Boot project.
Spring Boot is, most of the time, not used for low-level socket communication, but to expose web services. You should probably do that, and let standard solutions (a reverse proxy, software or hardware) do the load-balancing for you. Especially given that you don't seem to understand how sockets, non-blocking IO, threads, etc. work yet.
So, for your first question: the backlog queue is where clients are held waiting while you are busy with other work (IO with an already connected client, for example). If the queue grows beyond the backlog, those new clients will get a connection refused. You should be fine with 10 clients connecting at the same time. It's a long discussion, but keep a thread pool: as soon as you get a connected socket from accept, hand it to your thread pool and go back to waiting in accept, as in the sketch below. You can't "practically" support millions of clients on one single server, period! You'll need to load balance.
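For example, a hand-off loop like this (the port, backlog, bind address and pool size are arbitrary values for illustration):

import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ThreadPoolServer {
    public static void main(String[] args) throws IOException {
        // port, backlog and bind address: the three ServerSocket parameters from the question
        ServerSocket serverSocket = new ServerSocket(8080, 50, InetAddress.getByName("0.0.0.0"));
        ExecutorService pool = Executors.newFixedThreadPool(20);   // worker threads; size is arbitrary here

        while (true) {
            Socket client = serverSocket.accept();   // takes the next pending connection off the backlog
            pool.execute(() -> handle(client));      // hand it to the pool and go straight back to accept()
        }
    }

    private static void handle(Socket client) {
        try (Socket s = client) {
            // read the request and write the response here
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}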
Your second question is not clear. Clients can't send messages while they are in the backlog queue; they are taken off the queue once you accept them, and after that the length of the queue is irrelevant.
And lastly, about your load-balancing question: if you are going to have to serve millions of clients, I'd suggest investing in a good dedicated load balancer :) that can do round robin as well, as you mentioned.
With all that said, don't reinvent the wheel :). There are some open-source Java servers out there; my favorite is https://netty.io/
In the docs (Tomcat 7 Config), it is written:
The number of milliseconds this Connector will wait, after accepting a connection, for the request URI line to be presented. Use a value of -1 to indicate no (i.e. infinite) timeout. The default value is 60000 (i.e. 60 seconds) but note that the standard server.xml that ships with Tomcat sets this to 20000 (i.e. 20 seconds). Unless disableUploadTimeout is set to false, this timeout will also be used when reading the request body (if any).
When a client sends a request to a server, it takes N milliseconds to establish the connection. If N exceeds the connection timeout set on the client's end, the request fails on the client side, as expected.
I'm not able to understand what Tomcat's connectionTimeout does differently. Specifically, what does "after accepting a connection, for the request URI line to be presented" mean?
The connectionTimeout is the limit of time after which the server will automatically close the connection with the client, not the other way around. It is a way to limit the impact of a denial-of-service (DoS) attack. Indeed, a typical way to mount a DoS attack is to launch many requests against a given server, each of which lasts forever, making the server wait for nothing and filling up its pool of threads so that it cannot accept any new requests. Thanks to this timeout, after x milliseconds the server gives up on the request, considering it a potential attack.
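For reference, the attribute lives on the HTTP Connector in conf/server.xml; this is roughly what the default shipped configuration looks like (20 seconds, as the quoted documentation says):

<Connector port="8080" protocol="HTTP/1.1"
           connectionTimeout="20000"
           redirectPort="8443" />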
Here is an interesting discussion on broadly the same subject that goes a little deeper.
I face a scenario where everything works fine with session.setMaxInactiveInterval(15*60) // 15 minutes
Now, there is one request in the application where a number of users get saved, and it takes more than 15 minutes on the server side, as many internal operations are involved in saving each user.
When the saveUser operation is performed from the browser, the request is received at the server and processing starts; after 15 minutes the client sees "Session Time out", but the user-saving operation is still going on.
I want to avoid this scenario. What I need is that, from the time the request is received by the server until the time it responds back to the client, the in-between time is not counted as inactive time.
I could do this by calling setMaxInactiveInterval(-1) as the first line of my saveUser method and setMaxInactiveInterval(15*60) just before sending the response.
I am looking for the standard approach; or is this the standard approach to follow?
Is there any built-in Tomcat configuration provided for such a scenario?
The standard Java EE approach for this would be as follows:
Instead of doing all the processing in the web layer, put all the details on a JMS queue and return from the request straight away.
The data would be taken off the JMS queue by a worker (possibly on a different server, as you wouldn't want to have the load of creating something for 15 minutes on your web layer)
Once the data is created for the user, a notification would be generated (e.g. the browser could poll every 30 seconds to check whether the user generation has finished).
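A minimal sketch of that first hand-off step, using plain JMS (the JNDI names, the queue name and the string payload are assumptions, not a prescription):

import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.MessageProducer;
import javax.jms.Queue;
import javax.jms.Session;
import javax.jms.TextMessage;
import javax.naming.InitialContext;

public class SaveUserDispatcher {
    public void enqueue(String userPayload) throws Exception {
        InitialContext ctx = new InitialContext();
        ConnectionFactory cf = (ConnectionFactory) ctx.lookup("jms/ConnectionFactory"); // JNDI name is an assumption
        Queue queue = (Queue) ctx.lookup("jms/saveUserQueue");                          // queue name is an assumption

        Connection connection = cf.createConnection();
        try {
            Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
            MessageProducer producer = session.createProducer(queue);
            TextMessage message = session.createTextMessage(userPayload);
            producer.send(message);   // respond to the browser right after this; a worker consumes the queue
        } finally {
            connection.close();
        }
    }
}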
N.B. blocking a thread in a server for 15 minutes is a very bad idea, because essentially it means that your servlet container can't do anything else with that thread. Imagine if a large number of those requests came in at the same time; your web layer would probably buckle under the pressure...
N.B.2: 15*50 does not make 15 minutes
Client sends a message. Server reads the message and writes a reply. Client reads the reply. Repeat. Each message is shorter than 500 bytes. Sockets are not closed.
I get around 800 request+response pairs per second between two desktop PCs on a LAN. Network activity on the hosts is barely noticeable.
If I don't do readReply (or do it in a separate thread), throughput explodes to something like 30,000 msg/sec or more! This also maxes out the network activity on the hosts.
My questions:
Is 800 msg/sec a reasonable number for a request/response protocol on a single socket?
How is it that removing the readReply call can increase performance like that???
What can be done to improve this, apart from using UDP? Any other protocol that might be used?
Server:
while (true) {
    message = readMessage();
    writeReply("Thanks");
}
Client:
while (true) {
    writeMessage("A message");
    reply = readReply();
}
Notes:
I implemented this in both Java and PHP and got about the same results.
Ping latency is <1 ms
The basic problem is latency: the time it takes for a network frame/packet to reach the destination.
For instance, 1 ms latency limits the speed to at most 1000 frames/second. A latency of 2 ms can handle 500 fps, 10 ms gives 100 fps, etc.
In this case, handling 1600 frames per second (800*2: one request frame plus one reply frame per exchange) is what you would expect when the latency is around 0.5 ms.
I think this is because you manage to send more data per frame. It will fill up the TCP buffer in the client after a while though.
Batch (pipeline) the messages if possible. Send 10 messages from the client in a batch and then wait for the server to reply. The server should send all 10 replies in a single chunk as well. This should make the speed 10x faster in theory.
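A rough sketch of what the client side of such batching could look like (the host, port, newline framing and batch size of 10 are all assumptions for illustration):

import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.net.Socket;

public class BatchingClient {
    public static void main(String[] args) throws Exception {
        try (Socket socket = new Socket("server-host", 9000);
             BufferedWriter out = new BufferedWriter(new OutputStreamWriter(socket.getOutputStream()));
             BufferedReader in = new BufferedReader(new InputStreamReader(socket.getInputStream()))) {

            while (true) {
                // write a whole batch before waiting for anything
                for (int i = 0; i < 10; i++) {
                    out.write("A message");
                    out.newLine();
                }
                out.flush();   // one flush (roughly one round trip) per batch, not per message

                // now read the 10 replies for this batch
                for (int i = 0; i < 10; i++) {
                    String reply = in.readLine();   // handle each reply here
                }
            }
        }
    }
}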
We have a huge record set on an AIX box that we send over the network to a Linux box for processing.
Each record is about 277 bytes in size.
The complete flow is:
i) Program A sends records to Java process B (both on the AIX box).
ii) Java process B on AIX sends the records to Java program C on Linux. They communicate through Java sockets, where B is the client and C is the server.
iii) Program C processes each record and sends an ACK back to Program B.
iv) Program B sends the ACK back to Program A, which then sends the next record.
I think all these ACKs eat up the network and the overall process is becoming very slow. For example, in the latest run it processed 330,000 records in 4 hours and then we got a socket reset and the client failed.
I was trying to find out what would be a better protocol in this case to generate less network traffic and finish faster. 330,000 records in 4 hours is really slow, as processing each record on Program C takes less than 5-10 seconds, but the overall flow is such that we are facing this slowness.
Thanks in advance,
-JJ
Waiting for the ACK to go all the way back to A before sending the next record will definitely slow you down, because C is essentially idle while this is happening. Why don't you move to a queuing architecture? Why not create a persistent queue on C which can receive the records from A (via B), and then have one (or many) processors for this queue sitting on C?
This way you decouple how fast A can send from how fast C can process them. A's ack becomes the fact that the message was delivered to the queue successfully. I would use HornetQ for this purpose.
EDIT
The HornetQ getting-started guide is here.
If you can't use this, for the simplest non-persistent in-memory queue, simply use a ThreadPoolExecutor from Java's concurrency libraries. You create a ThreadPoolExecutor like this:
new ThreadPoolExecutor(
    threadPoolSize, threadPoolSize, KEEP_ALIVE, TimeUnit.MILLISECONDS,
    new LinkedBlockingQueue<Runnable>(queueSize),
    new ThreadPoolExecutor.DiscardOldestPolicy());   // drops the oldest queued task if the queue ever fills
Here queueSize can be as large as Integer.MAX_VALUE. You call execute() with a Runnable on the thread pool to get tasks carried out. So your receiving code in C can simply put these Runnables, created and parameterized with the record, onto the thread pool and then return the ACK immediately to A (via B), as in the sketch below.
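For example, C's receive loop could hand each record to the pool and acknowledge straight away. This is only a sketch, assuming newline-delimited records, the executor built above, and a process method standing in for your real per-record work:

import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.net.Socket;
import java.util.concurrent.ThreadPoolExecutor;

public class RecordReceiver {
    static void receiveLoop(Socket socket, ThreadPoolExecutor executor) throws IOException {
        BufferedReader in = new BufferedReader(new InputStreamReader(socket.getInputStream()));
        BufferedWriter out = new BufferedWriter(new OutputStreamWriter(socket.getOutputStream()));
        String record;
        while ((record = in.readLine()) != null) {   // one record per line (assumption)
            final String r = record;
            executor.execute(() -> process(r));      // queue the work for the pool
            out.write("ACK");                        // ACK to B immediately, before processing finishes
            out.newLine();
            out.flush();
        }
    }

    static void process(String record) {
        // the real per-record work goes here
    }
}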
If each record took 5 seconds and there are 330,000 records, this would take 1,650,000 seconds, which is about 19 days. Since you are taking 4 hours to process 330,000 records, each one is actually taking about 43 ms.
One reason they might take 43 ms per request is if you are creating and closing a connection for each request. It could be spending most of its time creating/closing connections rather than doing useful work. A simple way around this is to create the connection once, and only reconnect if there is an error.
If you use a persistent connection, your overhead could drop below 100 microseconds per request.
Is there any reason you cannot send a batch of data of say 1000 records to process, which would return 1 ACK and cut the overhead by a factor of 1000?