I'm interested in executing about 50 HTTP requests/second from a single machine. I don't care so much about latency but I do care about throughput.
I'm trying to decide whether to use Apache HttpAsyncClient or Netty. Could someone shed some light on the advantages of each for my problem?
I've found this comparison, but I was hoping for a somewhat more detailed explanation of which one is better and for what use case. Also, does the comparison mean that using the synchronous Apache HTTP client with 200 threads would be better than the other options? Isn't 200 threads a bit much (assuming I'm using a normal computer with 4 cores, 2 threads per core and 12GB of RAM)?
Thanks in advance
The main problem with these benchmarks is that in real life you have more threads and much more noise, so you can't really expect to get similar results in production unless you go for the async IO option.
You're looking to get more throughput, and as expected the Netty-based clients win big time in their benchmark. So it's probably your best bet.
We're using Netty very successfully for a wide array of applications, and it never fails us. You can use ning async-http-client, and then you don't have to implement a client all by yourself.
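For reference, here's a minimal sketch of firing off requests with ning's AsyncHttpClient (the com.ning.http.client package); the target URL, request count, and the crude sleep at the end are placeholder assumptions for the demo, not anything from the question:

import com.ning.http.client.AsyncCompletionHandler;
import com.ning.http.client.AsyncHttpClient;
import com.ning.http.client.Response;

public class NingDemo {
    public static void main(String[] args) throws Exception {
        AsyncHttpClient client = new AsyncHttpClient();
        // Fire ~50 requests without blocking; completions arrive on IO threads.
        for (int i = 0; i < 50; i++) {
            client.prepareGet("http://example.com/")
                  .execute(new AsyncCompletionHandler<Response>() {
                      @Override
                      public Response onCompleted(Response response) {
                          System.out.println(response.getStatusCode());
                          return response;
                      }
                  });
        }
        Thread.sleep(5000);   // crude wait so the demo sees the responses
        client.close();
    }
}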
Do note, however, as I stated in the comments, that I base my answer on my personal experience and on our production metrics. Never believe a random benchmark post you see on the internet, nor a StackOverflow answer. Test it yourself ;)
How do I figure out and prove the optimal port/socket/thread ratio for my application?
At the moment I am considering something like this:
Each thread handles all the traffic of a single port, each client gets their own socket, and the sockets are split between the available ports, and thus threads. This solution is based on the assumption that I should create approximately one thread per CPU core, and that sockets are fairly cheap to open. Is this a good solution, and more importantly how do I mathematically prove that this, or any other solution, is a good one?
I know I can write a sample program for every solution and test the results, but I would much prefer a mathematical proof over an empirical one, especially where the test is done on a machine that does not reflect the server hardware and configuration.
I don't have much experience with ports and sockets, and I am having a tough time finding information to answer my question. The best resources I could find so far are these Stack Overflow questions:
Forcing multiple threads to use multiple CPUs when they are available
When Should I Use Threads?
What is the difference between a port and a socket?
If I simply overlooked something, or am misunderstanding the way ports, sockets and threads are/should be used, I will be quite content with a simple "rtfm: [link]" answer to point me in the right direction. However, if you are feeling magnanimous and provide me with a good explanation, I will be much obliged.
If you use non-blocking NIO, the optimal number of threads is one per core. The reason for this is very simple:
With non-blocking IO, whenever there is work available, it is executed at full speed by an available thread (nothing blocks).
You can never exceed 100% CPU usage anyway.
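To illustrate the model (not from the answer above): a minimal non-blocking echo server in plain JDK NIO, where a single selector thread services every connection; you would run one such loop per core. The port and the echo behaviour are arbitrary choices for the sketch:

import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Iterator;

public class NioEchoServer {
    public static void main(String[] args) throws IOException {
        Selector selector = Selector.open();
        ServerSocketChannel server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress(9000));
        server.configureBlocking(false);
        server.register(selector, SelectionKey.OP_ACCEPT);

        ByteBuffer buf = ByteBuffer.allocate(4096);
        while (true) {
            selector.select();   // blocks only when there is no work at all
            Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
            while (keys.hasNext()) {
                SelectionKey key = keys.next();
                keys.remove();
                if (key.isAcceptable()) {
                    SocketChannel client = server.accept();
                    client.configureBlocking(false);
                    client.register(selector, SelectionKey.OP_READ);
                } else if (key.isReadable()) {
                    SocketChannel client = (SocketChannel) key.channel();
                    buf.clear();
                    if (client.read(buf) < 0) { client.close(); continue; }
                    buf.flip();
                    client.write(buf);   // echo back what we read
                }
            }
        }
    }
}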
I want to build a service that basically pulls data from various APIs.
On a typical server, is there a thread limit that one should adhere to?
Does anyone have any experience building something similar? How many threads were considered ideal, and what kind of requests per second can one expect?
Is 100 threads too much? 200?
I realize this is something I'm going to have to test, but I'm looking for someone who has built something similar and can share their experience.
It depends on your bottlenecks and your requirements. How fast do you need to complete the operations? Do the threads do IO? From your explanation I know they make a lot of network requests.
So the threads are going to wait on the network. Why do you need many threads, then? Maybe async operations would be faster.
And in general, as Robert Harvey commented: It's going to take us longer to answer your question than it is for you to test it and tweak the number. The number of threads depends on all sorts of variables which you haven't specified, so any answer is going to be a guess
For your particular case, an asynchronous style of programming may be more suitable. That way you could achieve a large throughput of API calls with a small number of threads, perhaps even comparable to the number of available cores.
There are several available libraries to achieve this (Twitter is the king here).
Finagle - General purpose, supports multiple transport protocols
Scrooge - For Thrift only
Async Http Client - Java-oriented async http client
And there are many others.
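As a rough illustration of the style (using the JDK 11+ java.net.http client rather than any of the libraries above, with placeholder URLs): a handful of threads can keep many requests in flight at once.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.Collectors;

public class AsyncFanOut {
    public static void main(String[] args) {
        // A small, core-sized pool is enough because no thread blocks on IO.
        ExecutorService pool = Executors.newFixedThreadPool(4);
        HttpClient client = HttpClient.newBuilder().executor(pool).build();
        List<URI> apis = List.of(
                URI.create("https://api.example.com/a"),
                URI.create("https://api.example.com/b"));
        List<CompletableFuture<String>> calls = apis.stream()
                .map(uri -> client.sendAsync(
                        HttpRequest.newBuilder(uri).build(),
                        HttpResponse.BodyHandlers.ofString())
                        .thenApply(HttpResponse::body))
                .collect(Collectors.toList());
        calls.forEach(f -> System.out.println(f.join().length()));
        pool.shutdown();
    }
}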
I would like to test a servlet that I've made with simultaneous requests (100 or 1000 or more). I was searching about it and found JMeter, but I am not quite sure it does what I want.
Do you know how I could do something like that? Do you know any tutorial or guide that could help me? (I am not experienced in programming.)
p.s.
I run my servlet on Jetty because I am using Jetty continuations. This is also what I want to test.
JMeter is rather easy to use. Also consider installing the JMeter plugins, which enable a richer set of graphs and sampling methods. See my blog post for some instructions, and also have a look at a sample performance test plan for JMeter.
JMeter is a good choice, it can do this job. See the user manual, it explains in detail how to set up the test.
Btw: running the test tool and the application on the same machine is not a realistic performance/throughput test scenario and can only provide an indication of how your servlet will behave in the real world.
You can just use any HTTP performance tester, for example Apache Bench (ab):
ab -c 100 -n 100000 http://localhost/
# Hit the http://localhost/ with 100000 requests, 100 at a time
This will output something like:
Requests per second: 4497.18 [#/sec] (mean)
JMeter is a good choice - easy to use and sufficient for most cases.
If you want to do simple tests, it should be enough. However, if you are interested in writing more complex test scenarios, I'd recommend HP LoadRunner (but it's commercial software).
You may want to rethink the precision of your test. Unless your connecting agents are driven by a synchronized clock, the odds of truly simultaneous connections are pretty low. Humans are chaotic, organic computing units tied to imprecise clocks dictating the interval between requests to a service. You have to generate a very high number of chaotic requests before you naturally see some number of users in the same section of code, having made their requests at the same time mark. It is quite likely that you will see a high number of coincident requests within a short window, such as 200 per second, but truly simultaneous behavior is rare in real-world conditions.
Food for thought....
I am building a complex HTML5 application that takes advantage of websockets. I am getting to the point where I have a lot of different types of data that get updated in real time on the screen.
I want to know whether it is going to be better to have fewer, more complex websockets, or a lot of simple websockets open per page.
I added the http://github.com/TooTallNate/Java-WebSocket websocket server to my Grails application.
Right now I am going down the path of using a lot of simple websockets, one for each task. I know using more sockets will use more memory on the server side, but more sockets also mean more concurrent processing.
Does anyone have any advice on how I can balance this?
Thanks for any tips in advance. Keith Blanchard
I think it is hard to make any reasonable statements about websockets without measuring the actual performance in specific browsers.
My inclination would be to have a single websocket per client.
There are some pretty hard limits on capacity server-side when doing IO ... it is relatively easy to saturate the channel when you have many connections (something that can bite heavily ajaxified systems as well).
Again, you really need to measure to make intelligent statements about this.
Websocket-per-client would also make the application much more manageable ... it depends on your actual use case, but "more concurrency" is not necessarily better and can make managing state incredibly complex.
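For what it's worth, here is a sketch of the single-socket approach using the TooTallNate Java-WebSocket server mentioned in the question (assuming a recent version of the library; the "type|payload" framing and the handler names are made up for illustration, JSON would work just as well):

import java.net.InetSocketAddress;
import org.java_websocket.WebSocket;
import org.java_websocket.handshake.ClientHandshake;
import org.java_websocket.server.WebSocketServer;

public class MultiplexServer extends WebSocketServer {
    public MultiplexServer(int port) { super(new InetSocketAddress(port)); }

    @Override
    public void onMessage(WebSocket conn, String message) {
        // One socket per client: route on a type tag instead of opening
        // a separate websocket per task.
        String[] parts = message.split("\\|", 2);
        switch (parts[0]) {
            case "quotes": conn.send("quotes|..."); break;  // placeholder handlers
            case "alerts": conn.send("alerts|..."); break;
            default:       conn.send("error|unknown type");
        }
    }

    @Override public void onOpen(WebSocket conn, ClientHandshake handshake) { }
    @Override public void onClose(WebSocket conn, int code, String reason, boolean remote) { }
    @Override public void onError(WebSocket conn, Exception ex) { ex.printStackTrace(); }
    @Override public void onStart() { }

    public static void main(String[] args) {
        new MultiplexServer(8887).start();   // 8887 is an arbitrary port
    }
}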
I personally ran a benchmark on this, and the results were:
10 websockets on a single page make the page a little unresponsive when data is coming in from each socket.
50 websockets on a single page cause an unbearable freeze.
So around 10, or fewer, would be your upper limit.
I am developing an application for benchmarking purposes, for which I need to create a large number of HTTP connections in a short time. I wrote a program in Java to test how many threads Java is able to create; it turns out that on my 2GB single-core machine, with 1GB of memory given to the JVM, the limit varies between 5000 and 6000 threads, after which it hits an OutOfMemoryError with the heap limit reached.
It is suggested that Erlang can handle many more concurrent processes. I am willing to learn Erlang if it can solve this problem: can Erlang create somewhere around 100,000 processes, each essentially an HTTP request waiting for a response, in a matter of a few seconds without hitting any limit such as a memory error?
According to Richard Jones' well-known blog post, you can handle 100k connections almost out of the box. You have to increase the process limit (see the +P parameter), and it needs a little bit of memory-management trickery, e.g. gc or hibernate. To achieve significantly more, you have to do more hacking in C.
It can handle pretty much anything you throw at it. Erlang processes are extremely lightweight.
See http://www.sics.se/~joe/apachevsyaws.html for a concurrency benchmark between Yaws and Apache. It gives you a good idea of what's possible.
An interesting data point: someone was able to get a million comet connections with mochiweb: http://www.metabrew.com/article/a-million-user-comet-application-with-mochiweb-part-1
As previously stated, "a lot" is a good answer. And given that your requirement is to write a tsunami of a client, you can/should distribute the code over several Erlang nodes.
Even better, you might want to check out Tsung, a distributed load-testing application written in Erlang.
And if you don't want to use it, I am pretty sure there's code in there you'll want to read.