I finished coding a Java application that uses 25 different threads; each thread is an infinite loop in which an HTTP request is sent and the (small) JSON object that is returned is processed. It is crucial that the time between two requests sent by a specific thread is less than 500ms. However, I benchmarked my program, and that time is well over 1000ms. So my question is: is there a better way to handle multiple connections other than creating multiple threads?
I am in desperate need of help, so I'm thankful for any advice you may have!
PS: I have a decent internet connection (my ping to the destination server of the requests is about 120ms).
I'd suggest looking at Apache HttpClient:
Specifically, you'll be interested in constructing a client that has a pooling connection manager. You can then share that same client across all of your threads.
import org.apache.http.client.HttpClient;
import org.apache.http.impl.client.DefaultHttpClient;
import org.apache.http.impl.conn.PoolingClientConnectionManager;

PoolingClientConnectionManager connectionManager = new PoolingClientConnectionManager();
connectionManager.setMaxTotal(number);           // number = total connections kept in the pool
connectionManager.setDefaultMaxPerRoute(number); // the per-route default is only 2, which would throttle your 25 threads
HttpClient client = new DefaultHttpClient(connectionManager);
Here's a specific example that handles your use-case:
PoolingConnectionManager example
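As a rough sketch of how the shared, pooled client could serve your 25 polling threads (the URL, pool sizes, and processing step are placeholders, and this assumes the server supports HTTP keep-alive):

import org.apache.http.HttpResponse;
import org.apache.http.client.HttpClient;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.DefaultHttpClient;
import org.apache.http.impl.conn.PoolingClientConnectionManager;
import org.apache.http.util.EntityUtils;

public class SharedClientExample {
    public static void main(String[] args) {
        PoolingClientConnectionManager cm = new PoolingClientConnectionManager();
        cm.setMaxTotal(25);           // one pooled connection per worker thread
        cm.setDefaultMaxPerRoute(25); // all 25 threads talk to the same host
        final HttpClient client = new DefaultHttpClient(cm);

        for (int i = 0; i < 25; i++) {
            new Thread(() -> {
                while (true) {
                    try {
                        HttpResponse response =
                                client.execute(new HttpGet("http://example.com/api")); // placeholder URL
                        // reading the entity also releases the connection back to the pool
                        String json = EntityUtils.toString(response.getEntity());
                        // process the small JSON object here
                    } catch (Exception e) {
                        e.printStackTrace(); // don't let one failed request kill the thread
                    }
                }
            }).start();
        }
    }
}

Because the pooled connections stay open between requests, each iteration skips the TCP (and TLS, if any) handshake, which is usually what pushes per-request time past your 500ms budget.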
I'm evaluating AsyncHttpClient for big loads (~1M HTTP requests).
For each request I would like to invoke a callback using an AsyncCompletionHandler which will just insert the result into a blocking queue.
My question is: if I'm sending asynchronous requests in a tight loop, how many threads will AsyncHttpClient use? (I know you can set the max, but apparently you then take the risk of losing requests; I've seen it here.)
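For reference, a minimal sketch of that callback-into-queue pattern using the 1.9 (com.ning) API; the URLs are stand-ins for the real id-derived ones:

import com.ning.http.client.AsyncCompletionHandler;
import com.ning.http.client.AsyncHttpClient;
import com.ning.http.client.Response;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class QueueingExample {
    public static void main(String[] args) {
        final BlockingQueue<Response> results = new LinkedBlockingQueue<Response>();
        AsyncHttpClient client = new AsyncHttpClient();
        String[] urls = { "http://example.com/1", "http://example.com/2" }; // placeholders
        for (String url : urls) {
            client.prepareGet(url).execute(new AsyncCompletionHandler<Response>() {
                @Override
                public Response onCompleted(Response response) throws Exception {
                    results.put(response); // a consumer thread drains this queue
                    return response;
                }

                @Override
                public void onThrowable(Throwable t) {
                    t.printStackTrace(); // failed requests never reach the queue
                }
            });
        }
        // drain `results` elsewhere, then client.close() when done
    }
}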
I'm currently using the Netty implementation with these versions:
async-http-client v1.9.33
netty v3.10.5.Final
I don't mind using other versions if there are any optimizations in later versions.
EDIT:
I read that Netty uses the reactor pattern for reacting to HTTP responses, which means it allocates a very small number of threads to act as selectors. This also means that the number of allocated threads doesn't increase with a high request volume. However, that seems to contradict the need to set a max number of connections.
Can anyone explain what I'm missing?
Thanks in advance
The AsyncHttpClient client (and other non-blocking IO clients, for that matter) doesn't need to allocate a thread per request, and the client need not resize its thread pool even if you bombard it with requests. You do initiate many connections if you don't use HTTP keep-alive or if you call multiple hosts, but it can all be handled by a single-threaded client (there may be more than one IO thread, depending on the implementation).
However, it's always a good idea to limit the max requests per host, and max requests per domain, to avoid overloading a service on a specific host, or a site, and avoid getting blocked. This is why HTTP clients add a maxConnectionsPerXxx setting.
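For example, with the 2.x API (org.asynchttpclient), those caps live on the client config; the limits below are arbitrary illustrative numbers, not recommendations:

import org.asynchttpclient.AsyncHttpClient;
import static org.asynchttpclient.Dsl.asyncHttpClient;
import static org.asynchttpclient.Dsl.config;

public class LimitedClient {
    public static void main(String[] args) throws Exception {
        AsyncHttpClient client = asyncHttpClient(config()
                .setMaxConnections(500)         // cap on open connections overall
                .setMaxConnectionsPerHost(50)); // cap per host, so one service isn't overloaded
        // ... issue requests ...
        client.close();
    }
}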
AHC has two types of threads:
For I/O operations. In a thread dump these show up as AsyncHttpClient-x-x threads; AHC creates 2 * core_count of them.
For timeouts. In a thread dump this is the AsyncHttpClient-timer-1-1 thread. There should be only one.
And as you mentioned:
maxConnections just means the number of open connections, which does not directly affect the number of threads
Source: issue on GitHub: https://github.com/AsyncHttpClient/async-http-client/issues/1658
I'll get right into the subject.
I have a server that runs a music recommendation system (for some kind of application).
The server has a very large database, so I made the recommendation system a singleton.
My problem is that the first time this singleton's constructor runs, it has to process training data and connect to the database many times, which is a time-consuming operation.
According to my singleton design, this has to run only the first time; afterwards, the results of the constructor can be used right away.
My problem is that on the first HTTP request from my PC to the server, the browser times out and the singleton object is never created on the server.
I think my solution would be to extend the browser's wait time until the server finishes the computation and returns a result; however, if someone has a better solution, I'd be greatly in their debt.
I really need an easily applicable solution that requires minimal effort, because the delivery deadline is closing in and I need to wrap up the project as fast as possible.
Thanks again
A few comments/suggestions:
Increasing the timeout is one way, but it's not a surefire way of solving the problem: the time taken by the recommendation system may not always be the same.
I suggest another approach. I'm not sure if it's an option for you, but would it be possible to create the recommendation system asynchronously in a separate thread, so that server start-up is not held back by it? (A minimal sketch follows below.)
If you can do the above, then provide a flag which indicates whether the recommendation system has finished starting.
Meanwhile, if you receive any request, first check the flag; if the flag indicates that the recommendation system has not yet started, return some meaningful message/status.
This way you will get a response immediately, and based on the response you can implement retries on the client side.
Please note that this will be a substantial change on the server side. It's just an opinion on improving things further, and a foolproof way of avoiding the timeout.
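A minimal sketch of that idea, with a hypothetical RecommendationSystem class standing in for your singleton:

import java.util.concurrent.atomic.AtomicBoolean;

public class RecommendationHolder {
    // placeholder for your real class; the constructor is the expensive part
    static class RecommendationSystem {
        RecommendationSystem() { /* training data + database access here */ }
    }

    private static final AtomicBoolean READY = new AtomicBoolean(false);
    private static volatile RecommendationSystem INSTANCE;

    // call once at server start-up (e.g. from a ServletContextListener)
    public static void initAsync() {
        new Thread(new Runnable() {
            public void run() {
                INSTANCE = new RecommendationSystem(); // slow, but no request is waiting on it
                READY.set(true);
            }
        }, "recsys-init").start();
    }

    // request handlers check the flag instead of blocking
    public static RecommendationSystem getIfReady() {
        return READY.get() ? INSTANCE : null; // null -> reply "still warming up" (e.g. HTTP 503)
    }
}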
You can increase the connection timeout using the code below. Note that the connection timeout only covers establishing the TCP connection; the socket (read) timeout governs how long the client waits for a slow response, so you likely want to raise both:

import org.apache.http.client.HttpClient;
import org.apache.http.impl.client.DefaultHttpClient;
import org.apache.http.params.BasicHttpParams;
import org.apache.http.params.HttpConnectionParams;
import org.apache.http.params.HttpParams;

final HttpParams httpParams = new BasicHttpParams();
// 60 second connection timeout
HttpConnectionParams.setConnectionTimeout(httpParams, 60000);
// 60 second socket (read) timeout
HttpConnectionParams.setSoTimeout(httpParams, 60000);
HttpClient httpClient = new DefaultHttpClient(httpParams);
I want to make a few million HTTP requests to a web service of the form:
http://(some ip)/{id}
I have the list of ids with me.
A simple calculation has shown that my Java code will take around 4-5 hours to get the data from the API.
The code is:
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;

URL getUrl = new URL("http url"); // the real id-specific URL goes here
URLConnection conn = getUrl.openConnection();
BufferedReader rd = new BufferedReader(new InputStreamReader(conn.getInputStream()));
StringBuilder sbGet = new StringBuilder(); // StringBuilder: no synchronization needed here
String getline;
while ((getline = rd.readLine()) != null)
{
    sbGet.append(getline);
}
rd.close();
String getResponse = sbGet.toString();
Is there a way to make such requests more efficiently, so that they take less time?
One way is to use an executor service with a fixed thread pool (the size depends on how much load the target HTTP service can handle) and bombard the service with requests in parallel. The Runnable would basically perform the steps you outlined in your sample code.
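A sketch of that approach; the endpoint, pool size, and id list are placeholders:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ParallelFetcher {
    public static void main(String[] args) throws InterruptedException {
        List<String> ids = Arrays.asList("1", "2", "3"); // stand-in for your real id list
        ExecutorService pool = Executors.newFixedThreadPool(20); // tune to what the service tolerates
        for (final String id : ids) {
            pool.submit(new Runnable() {
                public void run() {
                    try {
                        URL getUrl = new URL("http://example.com/" + id); // hypothetical endpoint
                        URLConnection conn = getUrl.openConnection();
                        BufferedReader rd = new BufferedReader(
                                new InputStreamReader(conn.getInputStream()));
                        StringBuilder sbGet = new StringBuilder();
                        String line;
                        while ((line = rd.readLine()) != null) {
                            sbGet.append(line);
                        }
                        rd.close();
                        // process sbGet.toString() here
                    } catch (Exception e) {
                        e.printStackTrace(); // don't let one failure kill the worker
                    }
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.HOURS); // wait for all submitted requests to finish
    }
}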
You need to profile your code before you start optimizing it. Otherwise you may end up optimizing the wrong part. Depending on the results you obtain from profiling consider the following options.
Change the protocol to allow you to batch the requests
Issue multiple requests in parallel (use multiple threads or execute multiple processes in parallel; see this article)
Cache previous results to reduce the number of requests
Compress the request or the response
Persist the HTTP connection
Is there a way to make such requests more efficiently, so that they take less time?
Well you probably could run a small number of requests in parallel, but you are likely to saturate the server. Beyond a certain number of requests per second, the throughput is likely to degrade ...
To get past that limit, you will need to redesign the server and/or the server's web API. For instance:
Changing your web API to allow a client to fetch a number of objects in each request will reduce the request overheads.
Compression could help, but you are trading off network bandwidth for CPU time and/or latency. If you have a fast, end-to-end network then compression might actually slow things down.
Caching helps in general, but probably not in your use-case. (You are requesting each object just once ...)
Using persistent HTTP connections avoids the overhead of creating a new TCP/IP connection for each request, but I don't think you can do this for HTTPS. (And that's a shame, because HTTPS connection establishment is considerably more expensive.)
I've built a simple Java program that works as a server locally.
At the moment it does a few things, such as previewing directories, forwarding to index.html if a directory contains it, sending the Last-Modified header, and responding properly to a client's If-Modified-Since request.
What I need to do now is make my program accept persistent connections. It's threaded at the moment so that each connection has its own thread. I want to put my entire thread code within a loop that continues until either a Connection: close header arrives or a specified timeout elapses.
Does anybody have any ideas where to start?
Edit: This is a university project, and has to be done without the use of Frameworks.
I have a main method which loops indefinitely; each time it loops, it creates a Socket object, and an HTTPThread object (a class of my own creation) is then created that processes the single request.
I want to allow multiple requests to work within a single connection, making use of the Connection: keep-alive request header. I expect to use a loop in my HTTPThread class; I'm just not sure how to process multiple requests over one connection.
Thanks in advance :)
I assume that you are implementing the HTTP protocol code yourself starting with the Socket APIs. And that you are implementing the persistent connections part of the HTTP spec.
You can put the code in the loop as you propose, and use Socket.setSoTimeout to set the timeout on blocking operations, and hence your HTTP timeouts. You don't need to do anything to reuse the streams for your connection ... apart from not closing them.
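A minimal sketch of that loop, assuming a hypothetical handleRequest(...) that wraps your existing single-request code and returns false when the client sends Connection: close:

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.Socket;
import java.net.SocketTimeoutException;

public class PersistentConnectionHandler implements Runnable {
    private final Socket socket;

    public PersistentConnectionHandler(Socket socket) { this.socket = socket; }

    @Override
    public void run() {
        try {
            socket.setSoTimeout(5000); // idle keep-alive connections time out after 5s
            BufferedReader in = new BufferedReader(new InputStreamReader(socket.getInputStream()));
            OutputStream out = socket.getOutputStream();
            boolean keepAlive = true;
            while (keepAlive) {
                // parses one request, writes one response, returns false on "Connection: close"
                keepAlive = handleRequest(in, out);
            }
        } catch (SocketTimeoutException e) {
            // a blocked read exceeded the keep-alive timeout: fall through and close
        } catch (IOException e) {
            // client went away mid-request
        } finally {
            try { socket.close(); } catch (IOException ignored) {}
        }
    }

    private boolean handleRequest(BufferedReader in, OutputStream out) throws IOException {
        // your existing single-request code goes here; return false when the client
        // sent "Connection: close" (or an HTTP/1.0 request without keep-alive)
        return false;
    }
}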
I would point out that there are much easier ways to implement a web server. There are many existing Java web server frameworks and application servers, or you could repurpose the Apache HTTP protocol stacks.
If it should act like a web service: open two sockets from the client side, one for requests and one for responses, and keep the sockets and streams open.
You need to define a separator to notify the other side that a transfer is over: a special bit string for a binary protocol, or a special character (usually newline) for a text-based protocol (like XML).
If you really want to implement your own HTTP server, you should make use of a library that already implements the HTTP 1.1 connection keep-alive standard.
Some ideas to get you started:
This wikipedia article describes HTTP 1.1 persistent connections:
http://en.wikipedia.org/wiki/HTTP_persistent_connection
You don't want to close the socket right away, but you do want to close it after some period of inactivity (Apache 2.2 uses 5 seconds).
You have two ways to implement this:
1. In your thread, do not close the socket and do not exit the thread; instead, put a read timeout on the socket (whatever period you want to support). The call to read will block, and if the timeout expires you close the socket; otherwise you read the next request. The downside is that each persistent connection holds both a thread and a socket for up to your max wait period, meaning your solution doesn't scale because you're holding threads for too long (but that may be fine for the purposes of a school project)!
2. You can get around the limitation of (1) by maintaining a list of {socket, timestamp} tuples, having a background thread monitor and close connections that time out, and using NIO to detect a new read on an existing open socket. So after you finish reading the initial request, you just exit the thread (returning it to the thread pool). Obviously this is much more complicated, but it has the benefit of freeing up request threads; a minimal skeleton of this approach is sketched below.
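A compressed skeleton of option (2), with the idle check folded into the select loop instead of a separate background thread; the port and timeouts are placeholders, and request parsing/handoff is left out:

import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class KeepAliveSelectorLoop {
    private static final long IDLE_TIMEOUT_MS = 5000; // Apache-2.2-style 5s keep-alive

    public static void main(String[] args) throws IOException {
        Selector selector = Selector.open();
        ServerSocketChannel server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress(8080));
        server.configureBlocking(false);
        server.register(selector, SelectionKey.OP_ACCEPT);

        Map<SocketChannel, Long> lastActive = new ConcurrentHashMap<SocketChannel, Long>();

        while (true) {
            selector.select(1000); // wake at least once a second to sweep idle sockets
            Iterator<SelectionKey> it = selector.selectedKeys().iterator();
            while (it.hasNext()) {
                SelectionKey key = it.next();
                it.remove();
                if (key.isAcceptable()) {
                    SocketChannel client = server.accept();
                    client.configureBlocking(false);
                    client.register(selector, SelectionKey.OP_READ);
                    lastActive.put(client, System.currentTimeMillis());
                } else if (key.isReadable()) {
                    SocketChannel client = (SocketChannel) key.channel();
                    ByteBuffer buf = ByteBuffer.allocate(8192);
                    int n = client.read(buf);
                    if (n == -1) { // peer closed the connection
                        lastActive.remove(client);
                        key.cancel();
                        client.close();
                    } else {
                        lastActive.put(client, System.currentTimeMillis());
                        // hand the request bytes off to a worker thread pool here
                    }
                }
            }
            // close connections that have been idle longer than the keep-alive timeout
            long now = System.currentTimeMillis();
            for (Map.Entry<SocketChannel, Long> e : lastActive.entrySet()) {
                if (now - e.getValue() > IDLE_TIMEOUT_MS) {
                    lastActive.remove(e.getKey());
                    e.getKey().close(); // closing also cancels its selection key
                }
            }
        }
    }
}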
I'm wondering which is the best solution for maintaining a huge number of small TCP connections in a multi-threaded application without them locking up after some time.
Assume that we have to visit a lot of HTTP web sites (~200,000, on different domains, servers, etc.) from multiple threads. Which classes are best for making the safest connections? (By "locking" I don't mean a multi-threading lock, but a TCP connection that stops reacting to anything.) Will HttpURLConnection & BufferedReader do the job with connection and read timeouts set? I saw the locking when I was using this simple solution:
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;

URL url = new URL(xurl);
// url.openStream() applies no connect or read timeout, so a stalled server blocks the thread forever
BufferedReader in = new BufferedReader(new InputStreamReader(url.openStream()));
All threads were locked/dead after 2-3 hours.
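For comparison, a minimal sketch of the same fetch with both timeouts set via HttpURLConnection (the timeout values are arbitrary):

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

public class TimedFetch {
    static String fetch(String xurl) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(xurl).openConnection();
        conn.setConnectTimeout(10000); // fail if the TCP connection isn't established within 10s
        conn.setReadTimeout(15000);    // fail if the server stops sending data for 15s
        BufferedReader in = new BufferedReader(new InputStreamReader(conn.getInputStream()));
        try {
            StringBuilder sb = new StringBuilder();
            String line;
            while ((line = in.readLine()) != null) {
                sb.append(line);
            }
            return sb.toString();
        } finally {
            in.close();
            conn.disconnect(); // release the underlying connection
        }
    }
}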
Is it better to have a constant number of threads, say 10, running all the time and taking URLs to request from the main thread, or to create one thread for each URL and then kill it in some way if it does not respond after some time? (And how do you kill a sub-thread?)
Well, if these are going to be HTTP connections, I really doubt you can cache them, because keeping an HTTP connection alive is not only a client-side matter; it requires server-side support too. Most of the time the server will close the connection after its timeout period (which is configured on the server). So check what maximum timeout is configured on the server side and compare it with how long you want to keep the connection cached.