Apache HttpClient and Load Balancers - Java

After spending a few hours reading the HttpClient documentation and source code, I have decided that I should ask for help here.
I have a load balancer server using a round-robin algorithm, something like this:
                         +---> RESTServer1
client --> load balancer +---> RESTServer2
                         +---> RESTServer3
Our client is using HttpClient to direct requests to our load balancer server, which in turn round-robins the requests to the corresponding RESTServer.
Now, Apache HttpClient creates, by default, a pool of connections (2 per route). These connections are persistent by default, since I am using HTTP/1.1 and my servers are emitting Connection: Keep-Alive headers.
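For reference, the defaults described above correspond to PoolingHttpClientConnectionManager's settings; a minimal sketch making them explicit (assuming HttpClient 4.x):

    import org.apache.http.impl.client.CloseableHttpClient;
    import org.apache.http.impl.client.HttpClients;
    import org.apache.http.impl.conn.PoolingHttpClientConnectionManager;

    // Sketch: the pooling defaults, written out explicitly.
    PoolingHttpClientConnectionManager cm = new PoolingHttpClientConnectionManager();
    cm.setDefaultMaxPerRoute(2); // the 2-connections-per-route default mentioned above
    cm.setMaxTotal(20);          // default cap across all routes
    CloseableHttpClient client = HttpClients.custom()
            .setConnectionManager(cm)
            .build();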
So the problem is that because HttpClient creates these persistent connections, the connections are no longer subject to the round-robin algorithm at the balancer level: they always hit the same server every time.
This creates two problems:
I can see that sometimes one or more of the balanced servers are overloaded with traffic, whereas one or more of the other servers are idle; and
even if I take one of my REST servers out of the balancer, it still receives requests while the persistent connections are alive.
This is definitely not the intended behavior.
I suppose I could force a Connection: close header in my responses, or I could run HttpClient without a connection pool or with a NoConnectionReuseStrategy. But the HttpClient documentation states that the idea behind the pool is to improve performance by avoiding having to open a socket (and do all the TCP handshaking and related work) for every request. So I have to conclude that the connection pool is beneficial to the performance of my application.
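For reference, disabling reuse entirely would look something like this (a sketch, assuming HttpClient 4.x):

    import org.apache.http.impl.NoConnectionReuseStrategy;
    import org.apache.http.impl.client.CloseableHttpClient;
    import org.apache.http.impl.client.HttpClients;

    // Every request gets a fresh connection, so the balancer sees each one
    // separately, but we pay the TCP (and possibly TLS) setup cost every time.
    CloseableHttpClient client = HttpClients.custom()
            .setConnectionReuseStrategy(NoConnectionReuseStrategy.INSTANCE)
            .build();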
So my question is: is there a way to use persistent connections with a load balancer in this scenario, or am I forced to use non-persistent connections?
I want the performance that comes with reusing connections, but I want them properly load-balanced. Any thoughts on how I can configure this scenario with Apache HttpClient, if it is possible at all?

Your question is perhaps more related to your load balancer configuration and the style of load balancing. There are several ways:
HTTP Redirection
LB acts as a reverse proxy
Pure packet forwarding
In scenarios 1 and 3 you do not have a chance with persistent connections. If your load balancer acts as a reverse proxy, there might be a way to achieve persistent connections with balancing. "Dumb" balancers (as used for SMTP or LDAP, for example) select the target per TCP connection, not on a per-request basis.
For example, the Apache HTTPd server with the balancer module (see http://httpd.apache.org/docs/2.2/mod/mod_proxy_balancer.html) can dispatch every request (even those arriving on a persistent connection) to a different server.
Also check that you are not receiving a balancer cookie that enables session stickiness; in that case the cause would not be the persistent connection but the cookie.
HTH, Mark

+1 to @mp911de's answer
One can also make scenarios 1 and 3 work reasonably well by limiting the total time to live of persistent connections to some short period of time, say 15 seconds. That way connections live long enough to get re-used during periods of activity, but go away during periods of relative inactivity.
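A minimal sketch of that idea, assuming HttpClient 4.4+ where the builder exposes a TTL setting:

    import java.util.concurrent.TimeUnit;
    import org.apache.http.impl.client.CloseableHttpClient;
    import org.apache.http.impl.client.HttpClients;

    // Connections are re-used within a 15-second window, then discarded,
    // so the balancer regularly gets a fresh chance to pick a backend.
    CloseableHttpClient client = HttpClients.custom()
            .setConnectionTimeToLive(15, TimeUnit.SECONDS)
            .build();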

Related

Custom Load Balancer on Kubernetes based on number of available connections on the pod

There is a system that allows us to establish at most 100 HTTP connections.
We have a Spring Boot app that connects to that system using a connection pool of limited size; as requests come in, if there are no free connections in the pool, the request waits in a queue for an available connection.
The app is deployed to Kubernetes, which scales it and round-robins requests to the pods.
When load comes in, some pods have available connections and others don't. I was wondering: is there a way to configure a custom load balancer on Kubernetes to forward requests to the pods with the highest available connection count, rather than round robin?
The HTTP protocol has a feature called HTTP keep-alive, or HTTP connection reuse, that uses a single TCP connection to send and receive multiple HTTP requests and responses. Follow the doc "Load balancing long-lived connections in Kubernetes" for details.
Here are a few examples of how to implement keep-alive in different languages:
Keep-alive in Spring Boot
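For instance, a sketch of a pooled, keep-alive-capable RestTemplate (assuming Apache HttpClient 4.x is on the classpath; the pool sizes are illustrative):

    import org.apache.http.impl.client.CloseableHttpClient;
    import org.apache.http.impl.client.HttpClients;
    import org.apache.http.impl.conn.PoolingHttpClientConnectionManager;
    import org.springframework.http.client.HttpComponentsClientHttpRequestFactory;
    import org.springframework.web.client.RestTemplate;

    // Pool at most 100 connections in total (matching the backend's limit)
    // and keep them alive between requests.
    PoolingHttpClientConnectionManager cm = new PoolingHttpClientConnectionManager();
    cm.setMaxTotal(100);
    cm.setDefaultMaxPerRoute(100); // all of them may target the single backend system

    CloseableHttpClient httpClient = HttpClients.custom()
            .setConnectionManager(cm)
            .build();

    // RestTemplate backed by the pooled, keep-alive-capable client.
    RestTemplate restTemplate =
            new RestTemplate(new HttpComponentsClientHttpRequestFactory(httpClient));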

Is there a way to maintain a connection to a restful web service?

I'm working on an application which is a monolith. We have some features in our roadmap that I think would fit into a microservices architecture and am toying around with building them as such.
My problem: the application processes ~150 requests per second during peak times. These requests come in on raw TCP/IP connections which are kept alive at all times. We have very strict latency requirements (the majority of our requests are responded to within 25-50 milliseconds). Each request would need to consume one to many microservices. My concern is that consuming multiple RESTful web services (specifically creating/destroying the connection each time a service is consumed, as well as the TLS handshakes) is going to add too much latency for processing these requests.
My question: is it possible (and is there a best practice) to maintain a connection to a RESTful web service while multiple threads consume that web service? Each request to the web service would be self-contained, but we would keep the physical connection alive.
The JVM naturally pools HTTP connections for HttpURLConnection (see http://docs.oracle.com/javase/8/docs/technotes/guides/net/http-keepalive.html), so it should happen for JAX-WS and JAX-RS out of the box. Other non-HttpURLConnection-based frameworks (like Netty) usually support HTTP connection pooling as well, so it's very likely you don't need to handle this yourself in your code. You do need to work out how many connections you should pool, but that is a configuration matter.
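A minimal sketch of relying on that built-in behavior (the URL is hypothetical; http.maxConnections controls how many idle keep-alive connections the JVM caches per host, default 5):

    import java.io.InputStream;
    import java.net.HttpURLConnection;
    import java.net.URL;

    // Raise the per-host keep-alive cache from the default of 5.
    System.setProperty("http.maxConnections", "20");

    HttpURLConnection conn =
            (HttpURLConnection) new URL("http://example.com/api").openConnection();
    try (InputStream in = conn.getInputStream()) {
        // Read the body fully (and don't call disconnect()) so the JVM can
        // return the underlying TCP connection to its keep-alive cache.
        byte[] buf = new byte[8192];
        while (in.read(buf) != -1) {
            // discard; reading to EOF is what makes the connection reusable
        }
    }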
You can check that TCP connections are not closed after an HTTP response by sniffing traffic from your application with tcpdump or Wireshark and verifying that no TCP FIN occurs after you get the result.

Shall I use persistent connections to upload files in intranet?

In the intranet, the network is good
Server A will send lots of files to server B over an HTTP service at the same time
The protocol is HTTP/1.1, which uses persistent connections by default
[update] A connection pool holds 100 connections
[update] One connection sends one file at a time
[update] A connection will not be closed (persistent connection) and will be reused to send the next file
Each file is 7K to 30K in size
Question:
Under the above conditions, will persistent connections give better performance than non-persistent connections?
I ask this question because we found the connections would be blocked for a very long time when uploading files. I suggested using non-persistent connections, since I think they are more stable, but my colleague insists on using persistent connections, because he thinks they have better performance.
UPDATE
See the updated question, thank you ~
In HTTP/1.1, a persistent connection allows pipelining but not parallelism (see RFC 2616). That means that if you share your connection among 100 threads and each one sends a file, you will send those 100 files one at a time (in some order) and receive a response for each file in the order it was sent. You are not getting any advantage out of sending on 100 threads, because they are just lining up to send and receive one at a time.
You may be able to send faster using multiple connections, because that would allow them to actually run in parallel. But this depends on lots of other factors. Depending on your network, setting up and tearing down 100 connections may be slower than pipelining through one connection. Also, the server may not appreciate you opening 100 separate connections. Worse, the server may reject you only some of the time, which is a big headache.
I suggest taking the middle road: open, say, 5 persistent connections (using only 5 threads) and send 20 documents down each connection. HttpClient has a BasicConnPool for this sort of thing, though it may be too basic for your needs.
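A sketch of that middle road, using HttpClient 4.x's PoolingHttpClientConnectionManager rather than BasicConnPool (the upload URL and the 'files' list are placeholders):

    import java.io.File;
    import java.util.List;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import org.apache.http.client.methods.CloseableHttpResponse;
    import org.apache.http.client.methods.HttpPost;
    import org.apache.http.entity.ContentType;
    import org.apache.http.entity.FileEntity;
    import org.apache.http.impl.client.CloseableHttpClient;
    import org.apache.http.impl.client.HttpClients;
    import org.apache.http.impl.conn.PoolingHttpClientConnectionManager;
    import org.apache.http.util.EntityUtils;

    // 5 persistent connections shared by 5 upload threads.
    PoolingHttpClientConnectionManager cm = new PoolingHttpClientConnectionManager();
    cm.setMaxTotal(5);
    cm.setDefaultMaxPerRoute(5); // all five may go to the same server
    CloseableHttpClient client = HttpClients.custom().setConnectionManager(cm).build();

    ExecutorService uploaders = Executors.newFixedThreadPool(5);
    for (File file : files) { // 'files' is the List<File> of documents to send
        uploaders.submit(() -> {
            HttpPost post = new HttpPost("http://serverB/upload"); // placeholder URL
            post.setEntity(new FileEntity(file, ContentType.APPLICATION_OCTET_STREAM));
            try (CloseableHttpResponse rsp = client.execute(post)) {
                EntityUtils.consume(rsp.getEntity()); // drain so the connection returns to the pool
            }
            return null;
        });
    }
    uploaders.shutdown();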

Apache HttpClient 4 persistent connection per Proxy instead of per route

My understanding is that all implementations of ClientConnectionManager persist connections based on route. This results in basically no persistent connections if a proxy is involved. For example, if HttpClient needs to visit 1000 different domains via an HTTP proxy with a fixed IP, it has to establish at least 1000 connections to the proxy instead of creating one persistent connection to the proxy and reusing it for the 1000 requests.
I'm simulating multiple users visiting thousands of domains (fake domains, all DNS-resolved to a couple of IPs; the resolving happens after the proxy, so it has nothing to do with HttpClient). The above behavior quickly uses up all available ports on the localhost as I increase the number of users and domains, and address bind errors occur as a result.
Is there a way to make HttpClient persist connections on a per-proxy basis, i.e. have HttpClient maintain only a specified number of connections to a given proxy?
After intensive research, it seems that Apache HttpClient doesn't support this behavior out of the box. I had to modify the HttpClient/HttpCore source in order to get this feature, i.e. maintain persistent connections based only on the local address and the first proxy address.
The classes I modified are:
org.apache.http.conn.routing.HttpRoute.java and
org.apache.http.conn.routing.BasicRouteDirector.java.
Basically I changed the hashCode and equals methods in HttpRoute (which is used as the key in the hash table for persistent connection lookup), so the lookup doesn't consider the target address if a proxy is involved.
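To illustrate the idea, here is a sketch of the comparison logic only, not the actual patch to those classes:

    import java.util.Objects;
    import org.apache.http.conn.routing.HttpRoute;

    // Sketch: treat two routes as the same pool entry when a proxy is involved
    // and the local address and first proxy hop match, ignoring the final target.
    static boolean sameProxyRoute(HttpRoute a, HttpRoute b) {
        if (a.getProxyHost() == null || b.getProxyHost() == null) {
            return a.equals(b); // no proxy: fall back to the stock per-target comparison
        }
        return Objects.equals(a.getLocalAddress(), b.getLocalAddress())
                && a.getProxyHost().equals(b.getProxyHost());
    }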
Initial test results of the above modification show about a 100x improvement in request throughput in my scenario. So far it works fine for me.
Kevin

How many requests can a port handle at a time

I am creating a web application with a login page, where a number of users may try to log in at the same time, so I need to handle a number of requests at once.
I know this is already solved for a number of popular sites like Google Talk.
So I have some questions in my mind.
"How many requests can a port handle at a time ?"
How many sockets can I(server) create ? is there any limitations?
For e.g . As we know when we implement client server communication using Socket programming(TCP), we pass 'a port number(unreserved port number)to server for creating a socket .
So I mean to say if 100000 requests came at a single time then what will be approach of port to these all requests.
Is he manitains some queue for all these requests , or he just accepts number of requests as per his limit? if yes what is handling request limit size of port ?
Summary:
I want to know how a server serves multiple requests simultaneously; I don't know anything about it. I know we connect to a server via its IP address and port number, and that's it.
So I thought: there is only one port, and many requests come to that one port from different clients, so how does the server manage all the requests?
This is all I want to know. If you could explain this concept in detail, it would be very helpful. Thanks anyway.
A port doesn't handle requests, it receives packets. Depending on the implementation of the server, these packets may be handled by one or more processes/threads, so in theory this is unlimited. But you'll always be limited by bandwidth and processing performance.
If lots of packets arrive at one port and cannot be handled in a timely manner, they will be buffered (by the server, the operating system, or hardware). If those buffers are full, the congestion may be handled by network components (routers, switches) and the protocols the network traffic is based on. TCP, for example, has methods to avoid or control congestion: http://en.wikipedia.org/wiki/Transmission_Control_Protocol#Congestion_control
This is typically configured in the application/web server you are using. You limit the number of concurrent requests by limiting the number of parallel worker threads you allow the server to spawn to serve requests. If more requests come in than there are available threads to handle them, they will start to queue up. The second thing you typically configure is the socket backlog size: when the backlog is full, the server will start responding with "connection refused" to new requests.
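A minimal sketch of the backlog at the socket level in Java (the port and backlog values are arbitrary, handleRequest() is a placeholder, and the OS may round the backlog):

    import java.net.ServerSocket;
    import java.net.Socket;

    // The second constructor argument is the accept backlog: the OS-level queue of
    // completed connections waiting for accept(). When it fills up, further
    // connection attempts are refused.
    ServerSocket server = new ServerSocket(8080, 500);
    while (true) {
        Socket client = server.accept();                 // take the next pending connection
        new Thread(() -> handleRequest(client)).start(); // simplest model: one worker thread each
    }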
Then you'll probably be restricted by the number of file descriptors your OS supports (on *nix) or the number of simultaneous connections your web server supports. The OS maximum on my machine seems to be 75,000.
100,000 concurrent connections should be easily possible in Java if you use something like Netty.
You need to be able to:
Accept incoming connections fast enough. The NIO framework helps enormously here, which is what Netty uses internally. There is a smallish queue for incoming requests, so you need to be able to handle these faster than the queue can fill up.
Create connections for each client (this implies some memory overhead for things like connection info, buffers etc.) - you may need to tweak your VM settings to have enough free memory for all the connections. A minimal server sketch follows this list.
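As a rough illustration, a minimal Netty 4 echo server along those lines (the class name, port, and backlog are illustrative):

    import io.netty.bootstrap.ServerBootstrap;
    import io.netty.channel.ChannelFuture;
    import io.netty.channel.ChannelHandlerContext;
    import io.netty.channel.ChannelInboundHandlerAdapter;
    import io.netty.channel.ChannelInitializer;
    import io.netty.channel.ChannelOption;
    import io.netty.channel.EventLoopGroup;
    import io.netty.channel.nio.NioEventLoopGroup;
    import io.netty.channel.socket.SocketChannel;
    import io.netty.channel.socket.nio.NioServerSocketChannel;

    public class ManyConnectionsServer {
        public static void main(String[] args) throws InterruptedException {
            EventLoopGroup boss = new NioEventLoopGroup(1);   // accepts incoming connections
            EventLoopGroup workers = new NioEventLoopGroup(); // handles I/O; ~2x cores threads by default
            try {
                ServerBootstrap b = new ServerBootstrap();
                b.group(boss, workers)
                 .channel(NioServerSocketChannel.class)
                 .option(ChannelOption.SO_BACKLOG, 1024)      // pending-accept queue size
                 .childHandler(new ChannelInitializer<SocketChannel>() {
                     @Override
                     protected void initChannel(SocketChannel ch) {
                         ch.pipeline().addLast(new ChannelInboundHandlerAdapter() {
                             @Override
                             public void channelRead(ChannelHandlerContext ctx, Object msg) {
                                 ctx.writeAndFlush(msg); // echo the bytes back
                             }
                         });
                     }
                 });
                ChannelFuture f = b.bind(8080).sync();        // start listening
                f.channel().closeFuture().sync();             // run until the channel closes
            } finally {
                boss.shutdownGracefully();
                workers.shutdownGracefully();
            }
        }
    }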
See this article from 2009 where they discuss achieving 100,000 concurrent connections with about 20% CPU usage on a quad-core server.
