tl;dr: Setting the polling-interval to 0 has given my performance a huge boost, but I am worried about possible problems down the line.
In my application, I am doing a fair amount of publishing from our java server to our flex client, publishing on a variety of topics and sub-topics.
Recently, we have been on a round of performance improvements system-wide, and the messaging layer was proving to be a big bottleneck.
A few minutes ago, I discovered that setting the <polling-interval-millis> property in our services-config.xml to 0 causes published messages, even when there are lots of them, to be recognized by the client almost instantly, instead of after the 3-second delay that is the default value for polling-interval-millis. That has obviously had a tremendous impact.
So, I'm pretty happy with the current performance; the only thing is, I'm a bit nervous about unintended side effects of this change. In particular, I am worried about our Flash client slowing way down, and about generating way too much unwanted traffic.
My preliminary testing has not borne out this fear, but before I commit the change to our repository, I was hoping that somebody with experience with this stuff would chime in.
Unfortunately your question is too general... there is no way to give a specific answer. I'll write down some ideas below; maybe they are helpful.
Decreasing the value from 3 to 0 means that you are receiving new data much faster. If your Flex client uses this data for complex computations, it is possible to slow the client down or to show obsolete data (it is a known pattern, see http://help.adobe.com/en_US/LiveCycleDataServicesES/3.1/Developing/WS3a1a89e415cd1e5d1a8a18fb122bdc0aad5-8000Update.html ). You need to understand how the data is processed and probably do some client benchmarking.
Also, the server will have to handle more requests, and it would be good to identify the maximum number of requests per second it can handle. For that, you can use a tool like JMeter to detect the maximum capacity of your system; after that you can do some calculations to figure out how many requests per second you will have after reducing the interval from 3 to 0, taking into account that the number of clients is growing by 10% per month, etc.
The main idea is that you should do some performance testing against your API and save the scripts, so you can see whether future modifications slow the system down too much. Without that, it is quite hard to guess whether it is OK to change configuration parameters.
You might want to try out long-polling. For our WebLogic servers, we don't get any problems unless we let the poll request go to 5 minutes, so we keep it to 4, then give it a 1-second rest before starting again. We have a couple of hundred total users, with 60-70 on it hard-core all day. The thing to keep in mind is that you're basically turning intermittent user requests into what amounts to almost-always-connected telnet sessions. Depending on the browser your users are using, there can be implications from that as well, but overall we've been very pleased.
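For reference, a long-polling setup is usually configured on the channel definition in services-config.xml. The sketch below shows roughly what that can look like; I'm going from memory of the BlazeDS documentation, so treat the property names and values as assumptions to verify against your LCDS/BlazeDS version:

<channel-definition id="my-longpolling-amf" class="mx.messaging.channels.AMFChannel">
    <endpoint url="http://{server.name}:{server.port}/{context.root}/messagebroker/amflongpolling" class="flex.messaging.endpoints.AMFEndpoint"/>
    <properties>
        <polling-enabled>true</polling-enabled>
        <!-- 0 = the client polls again immediately after each response -->
        <polling-interval-millis>0</polling-interval-millis>
        <!-- let the server park each poll for up to 30 s waiting for messages -->
        <wait-interval-millis>30000</wait-interval-millis>
        <!-- cap parked polls so the servlet container's thread pool is not exhausted -->
        <max-waiting-poll-requests>100</max-waiting-poll-requests>
    </properties>
</channel-definition>

The combination of a non-zero wait interval and a capped number of waiting poll requests is what gives near-instant delivery without turning every client into a tight polling loop.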
I have an EC2 instance doing a long-running job. The job should take about a week, but after a week it is only at 31%. I believe this is because of the low average CPU usage (less than 1%) and because the instance very rarely receives a GET request (just me checking the status).
Reason for the low CPU:
This Java service performs many GET requests and then processes a batch of pages once it has a few hundred (not arbitrary; there is a reason they are all required first). But to avoid getting HTTP 429 (Too Many Requests), I must space my GET requests apart using Thread.sleep(x) and synchronization. This results in very low CPU usage that spikes every so often.
I think Amazon's pre-emptive systems think that my service is arbitrarily waiting, when in actual fact it needs to wake up at a specific moment. I also notice that if I check the status more often, it goes quicker.
How do I stop Amazon's pre-emptive system from thinking my service isn't doing anything?
I have thought of two solutions, though neither seems ideal:
Have another process running to keep the CPU at ~25%, which would really only consist of something like:
// requires: import java.time.LocalDateTime; import java.time.temporal.ChronoUnit;
while (true) {
    Thread.sleep(300);
    // LocalDateTime has no plusMillis(), so add 100 ms via ChronoUnit
    LocalDateTime until = LocalDateTime.now().plus(100, ChronoUnit.MILLIS);
    while (LocalDateTime.now().isBefore(until)) {
        // empty busy-wait loop, burns CPU for roughly 100 ms
    }
}
However, this just seems like an unnecessary use of resources.
Have a process on my laptop perform a GET request to the AWS service every 10 minutes (sketched below). But one of the reasons I put it on AWS was to free up my laptop, although this would use orders of magnitude fewer of my laptop's resources than running the service locally.
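If you go with the second idea, it can be as small as a scheduled HTTP ping. A minimal sketch, assuming Java 8 and a hypothetical status URL (replace it with your real endpoint):

import java.net.HttpURLConnection;
import java.net.URL;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class KeepAlivePinger {
    public static void main(String[] args) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(() -> {
            try {
                // Hypothetical status endpoint; replace with the real one.
                HttpURLConnection conn =
                        (HttpURLConnection) new URL("http://my-ec2-host/status").openConnection();
                conn.setConnectTimeout(5000);
                conn.setReadTimeout(5000);
                System.out.println("status check returned HTTP " + conn.getResponseCode());
                conn.disconnect();
            } catch (Exception e) {
                System.out.println("status check failed: " + e.getMessage());
            }
        }, 0, 10, TimeUnit.MINUTES); // every 10 minutes, as described above
    }
}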
Is one of these solutions more desirable than the other? Is there another solution which would be more appropriate?
Many Thanks,
Edit: note that I use the free-tier services only.
I have a Tomcat server.
In it, I handle a RESTful request that triggers a very memory-intensive operation, which can last 15 minutes and can eventually crash Tomcat.
How can I run this request:
1. without crashing Tomcat?
2. without exceeding the 3-minute limit on RESTful requests?
Thank you.
Try another architectural approach.
REST is designed to be stateless, so you have to introduce status tracking yourself.
I suggest you implement:
1. the long-running task as a batch job in the background (as #kamran-ghiasvand suggests),
2. a submit request that starts the batch and returns a unique ID (see the sketch below),
3. a status request that reports the status of the task (auto-refresh the screen every 5 s, for example). You can do that on an HTML/page basis or via Ajax.
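A rough illustration of the submit/status part, using plain servlets since you are already on Tomcat. This is only a sketch under assumed names (the class, the URL, and runExpensiveBatch() are hypothetical placeholders), assuming Java 8 and Servlet 3.0+:

import java.io.IOException;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import javax.servlet.annotation.WebServlet;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

@WebServlet("/longTask")
public class LongTaskServlet extends HttpServlet {

    private static final ExecutorService BATCH_POOL = Executors.newFixedThreadPool(2);
    private static final ConcurrentHashMap<String, String> STATUS = new ConcurrentHashMap<>();

    // POST submits the long-running job and returns immediately with a unique id.
    @Override
    protected void doPost(HttpServletRequest req, HttpServletResponse resp) throws IOException {
        final String id = UUID.randomUUID().toString();
        STATUS.put(id, "RUNNING");
        BATCH_POOL.submit(() -> {
            try {
                runExpensiveBatch();            // the 15-minute, memory-hungry task
                STATUS.put(id, "DONE");
            } catch (Exception e) {
                STATUS.put(id, "FAILED: " + e.getMessage());
            }
        });
        resp.setStatus(HttpServletResponse.SC_ACCEPTED); // 202
        resp.getWriter().write(id);
    }

    // GET ?id=... is the cheap status request the client polls every few seconds.
    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws IOException {
        String id = req.getParameter("id");
        String status = (id == null) ? "UNKNOWN" : STATUS.getOrDefault(id, "UNKNOWN");
        resp.getWriter().write(status);
    }

    private void runExpensiveBatch() {
        // existing heavy logic goes here
    }
}

In a real system you would persist the status in a database or cache instead of a static map, so it survives restarts and works across multiple Tomcat instances.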
To give you an idea of what you might need on the backend, I quote our PaymentService interface below.
public interface PaymentService {
    PnExecution createPaymentExecution(List<Period> periods, Date calculationDate) throws PnValidationException;
    Long createPaymentExecutionAsync(List<Period> periods, Date calculationDate);
    PnExecution simulatePaymentExecution(Period period, Date calculationDate) throws PnValidationException;
    Void deletePaymentExecution(Long pnExecutionId, AsyncTaskListener<?, ?> listener);
    Long deletePaymentExecutionAsync(Long pnExecutionId);
    void removePaymentNotificationFromPaymentExecution(Long pnExecutionId, Pn paymentNotification);
}
About performance:
Try to find the memory consumers, and try to sequentialize the problem by cutting it into steps. Make sure you have not created memory leaks by keeping references to objects you no longer use. A last resort would be concurrency (of independent tasks) or parallelism (processing of similar tasks). But most of these problems are the result of an overly straightforward architectural approach.
A Tomcat crash has nothing to do with request processing time; it might occur due to JVM heap exhaustion (or thousands of other reasons). You should pin down the reason for the crash by investigating the Tomcat logs carefully. If the reason is lack of memory, you can allocate more memory to the JVM when starting Tomcat using the '-Xmx' flag. For example, you can add the following line to your setenv.sh to allocate 2 GB of RAM to Tomcat:
CATALINA_OPTS="-Xmx2048m"
As for the request timeout, there are also many factors that play a role here: for example, the connectionTimeout of your HTTP connector (see server.xml), network, browser, or web-client limitations, and many others.
Generally speaking, it's very bad practice to run such a long task synchronously inside a RESTful request. I suggest you consider other approaches, such as WebSockets or push notifications, to tell the user that the time-consuming request has completed on the server side.
Basically what you are asking boils down to this:
For some task running on Tomcat, that I have not told you anything about, how do I make it run faster, use less memory and not crash.
In the general case, you need to analyze your code to work out why it is taking so long and using so much memory. Then you need to modify or rewrite it as required to reduce memory utilization and improve its efficiency.
I don't think we can offer sensible advice on how to make the request faster, etc. without more details. For example, the advice that some people have been offering to split the request into smaller requests, or to perform the large request asynchronously, won't necessarily help. You should not try these ideas without first understanding what the real problem is.
It is also possible that your task is taking too long and crashing Tomcat for a specific reason:
It is possible that the request's use of (too much) memory is actually causing the requests to take too long. If a JVM is running out of heap memory, it will spend more and more time running the GC. Ultimately it will fail with an OutOfMemoryError.
The excessive memory use could be related to the size of the task that the request is performing.
The excessive memory use could be caused by a bug (memory leak) in your code or some 3rd party library that you are using.
Depending on the above, the problem could be solved by:
increasing Tomcat's heap size,
fixing the memory leak, or
limiting the size of the "problem" that the request is trying to solve.
It is also possible that you just have a bug in your code; e.g. an infinite loop.
In summary, you have not provided enough information to allow a proper diagnosis. The best we can do is suggest possible causes. Guesses, really.
We have a Google App Engine Java app with 50 - 120 req/s depending on the hour of the day.
Our frontend appengine-web.xml looks like this:
<instance-class>F1</instance-class>
<automatic-scaling>
    <min-idle-instances>3</min-idle-instances>
    <max-idle-instances>3</max-idle-instances>
    <min-pending-latency>300ms</min-pending-latency>
    <max-pending-latency>1.0s</max-pending-latency>
    <max-concurrent-requests>100</max-concurrent-requests>
</automatic-scaling>
Usually one frontend instance manages to handle around 20 req/s. Start-up time is around 15 s.
I have a few questions:
When I change the frontend default version, I get thousands of "Error 500 - Request was aborted after waiting too long to attempt to service your request."
So, to avoid that, I switch from one version to the other using the traffic-splitting feature by IP address, going from 1% to 100% in steps of 5%; it takes around 5 minutes to do it properly and avoid massive numbers of 500 errors. Moreover, that feature seems to be available only for the default frontend module.
-> Is there a better way to switch versions?
To avoid thousands of "Error 500 - Request was aborted after waiting too long to attempt to service your request," we must use at least 3 resident (min-idle) instances. And as our traffic grows, even with 3 we sometimes still get massive numbers of Error 500. Am I supposed to go to 4 residents? I thought App Engine was nice because you only pay for the instances you use, so if we need at least half our running instances sitting idle in order to work properly, that's not great, is it? It's not really cost-effective: when the load is low, still having 4 idle instances is a big waste :( What's weird is that they seem to wait only 10 s before responding with a 500: pending_ms=10248
-> Do you have any advice on avoiding that?
Quite often, we also get thousands of "Error 500 - A problem was encountered with the process that handled this request, causing it to exit. This is likely to cause a new process to be used for the next request to your application. If you see this message frequently, you may be throwing exceptions during the initialization of your application. (Error code 104)". I don't understand: there aren't any exceptions, and we get hundreds of these within a few seconds.
-> Do you have any advice on avoiding that?
Thanks a lot in advance for your help! ;)
Those error messages are mostly related to loading requests that take too long to complete, so they end in something similar to a DeadlineExceededException, which dramatically affects performance and user experience, as you probably already know.
This is a very common issue, especially when using DI frameworks with Google App Engine, and so far it's an unavoidable and serious unfixed issue with automatic scaling, which is the scaling policy that App Engine has provided for handling public requests since its inception.
Try changing the frontend instance class to F2, especially if your memory consumption is higher than 128 MB per instance, and set the min/max pending latency to 15 s so your requests get more chances to be processed by a resident instance. However, you will still get long response times for some requests, since Google App Engine may not issue a warm-up request every time your application needs a new instance, and I understand that F4 would break the bank.
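Concretely, that would mean something like the following in appengine-web.xml. This is only a sketch of the suggestion above, not a tested configuration; double-check the allowed ranges for the pending-latency settings against the current App Engine documentation:

<instance-class>F2</instance-class>
<automatic-scaling>
    <min-idle-instances>3</min-idle-instances>
    <max-idle-instances>3</max-idle-instances>
    <min-pending-latency>15s</min-pending-latency>
    <max-pending-latency>15s</max-pending-latency>
    <max-concurrent-requests>100</max-concurrent-requests>
</automatic-scaling>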
I am providing a RESTful service that is served by a servlet (running inside Tomcat 7.0.x on Ubuntu Linux). I'm already getting about 20 thousand queries per hour, and it will grow much higher. The servlet receives a request, prepares the response, inserts a record in a MySQL database table, and delivers the response. The log in the database is absolutely mandatory. Until recently, all of this happened synchronously: before the Tomcat thread delivered the response, it had to create the record in the database table. The problem is that this logging used to take more than 90% of the total time, and even worse: when the database got slower, the service took about 10-15 seconds instead of just 20 milliseconds.
I recently made an improvement: each Tomcat thread creates an extra thread, doing a "(new Thread(new SomeRunnable())).start();", that takes care of the SQL insertion asynchronously, so the response gets to the clients faster. But these threads take too much RAM when MySQL runs slower and threads multiply, and with a few thousand of them the Tomcat JVM runs out of memory.
What I need is to accept as many HTTP requests as possible, to log every one of them as fast as possible (not synchronously), and to do all of that with very low RAM usage when MySQL gets slow and inserts need to queue up. I think I need some kind of queue to buffer the entries when the rate of HTTP requests is higher than the rate of insertions into the database log.
I'm thinking about these ideas:
1- Creating some kind of FIFO queue myself, maybe using one of those Apache Commons collections, plus some kind of thread that polls the collection and creates the database records. But which collection should I use? And how should I program the polling thread so it won't monopolize the CPU? I think a "do while (true)..." loop would eat CPU cycles. And what about making it thread-safe? How do I do that? I think doing it myself is too much effort, and most likely I would be reinventing the wheel.
2- Log4j? I have never used it directly, but it seems that this framework is also designed for creating "appenders" that talk to the database. Would that be the way to do it?
3- Using some other framework that specializes in this?
What would you suggest?
Thanks in advance!
What comes to mind right away is a queue like you said. You can use things like ActiveMQ http://activemq.apache.org/ or RabbitMQ http://www.rabbitmq.com/.
The idea is to just fire and forget. There should be almost no overhead to send the messages.
Then you can connect some "offline" consumer to pick the messages up off the queue and write them to the database at the speed you need.
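With ActiveMQ, for example, the fire-and-forget side can be as small as the sketch below. The broker URL and queue name are placeholders, and in production you would reuse or pool the connection (e.g. via ActiveMQ's pooled connection factory) rather than open one per message:

import javax.jms.Connection;
import javax.jms.DeliveryMode;
import javax.jms.JMSException;
import javax.jms.MessageProducer;
import javax.jms.Session;
import org.apache.activemq.ActiveMQConnectionFactory;

public class RequestLogPublisher {

    private final ActiveMQConnectionFactory factory =
            new ActiveMQConnectionFactory("tcp://localhost:61616"); // placeholder broker URL

    public void publish(String logLine) throws JMSException {
        Connection connection = factory.createConnection();
        try {
            Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
            MessageProducer producer = session.createProducer(session.createQueue("request.log"));
            producer.setDeliveryMode(DeliveryMode.NON_PERSISTENT); // fire and forget
            producer.send(session.createTextMessage(logLine));
        } finally {
            connection.close();
        }
    }
}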
I feel like I plug this all day on Stack Overflow, but we use Mule (http://www.mulesoft.org/) at work to do this. One of the great things about Mule is that you can explicitly set the number of threads that read from the queue and the number of threads that write to the database. It gives you fine-grained control over throttling messages.
Definitely take a look at using a ThreadPoolExecutor. You can provide the thread pool size, and it will handle all the concurrency and queuing for you. The only possible issue is that if your JVM crashes for any reason, you'll lose any items still queued in the pool.
http://docs.oracle.com/javase/6/docs/api/java/util/concurrent/ThreadPoolExecutor.html
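A minimal sketch of how that could look for your logging case. The pool sizes and queue capacity are guesses to tune, and insertIntoDatabase stands in for your existing JDBC insert:

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class AsyncLogWriter {

    // Bounded queue keeps RAM flat when MySQL slows down;
    // CallerRunsPolicy applies back-pressure instead of silently dropping log entries.
    private final ThreadPoolExecutor executor = new ThreadPoolExecutor(
            2, 4,                                   // core and max pool size
            60, TimeUnit.SECONDS,                   // idle thread keep-alive
            new ArrayBlockingQueue<Runnable>(10000),
            new ThreadPoolExecutor.CallerRunsPolicy());

    public void log(final String requestInfo) {
        executor.execute(new Runnable() {
            public void run() {
                insertIntoDatabase(requestInfo);    // your existing MySQL insert
            }
        });
    }

    private void insertIntoDatabase(String requestInfo) {
        // existing JDBC code goes here
    }

    public void shutdown() throws InterruptedException {
        executor.shutdown();
        executor.awaitTermination(30, TimeUnit.SECONDS);
    }
}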
I would also definitely look into optimizing the MySQL database as much as possible. 20k entries per hour can get hairy pretty quickly. The better optimized your hardware, OS, and indexes are, the quicker your inserts run and the smaller your queue will be.
First of all: Thanks a lot for your valuable suggestions!
So far I have found a partial solution to my need, and I have already implemented it successfully:
http://docs.oracle.com/javase/6/docs/api/java/util/concurrent/LinkedBlockingQueue.html
Now I'm also thinking about using a queue provider as a failover solution in case that queue fills up. So far I have thought about Amazon's queue service, but it costs money. I will also check out the queue solutions that Ryan suggested.
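The basic shape of that LinkedBlockingQueue approach looks something like this (a simplified sketch, not the exact production code; insertBatch stands in for the real JDBC batch insert):

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class LogQueue {

    // Bounded capacity keeps RAM usage predictable when MySQL slows down.
    private final BlockingQueue<String> queue = new LinkedBlockingQueue<String>(50000);

    // Called from the Tomcat request threads; returns false if the queue is full.
    public boolean offer(String logEntry) {
        return queue.offer(logEntry);
    }

    // A single consumer thread drains the queue in batches.
    public void startConsumer() {
        Thread consumer = new Thread(new Runnable() {
            public void run() {
                List<String> batch = new ArrayList<String>();
                try {
                    while (true) {
                        batch.add(queue.take());     // blocks, so no busy-waiting
                        queue.drainTo(batch, 99);    // grab up to 99 more if available
                        insertBatch(batch);          // one multi-row INSERT
                        batch.clear();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });
        consumer.setDaemon(true);
        consumer.start();
    }

    private void insertBatch(List<String> batch) {
        // existing JDBC batch insert goes here
    }
}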
We are doing some Java stress runs (involving network I/O). Initially things are all fine and the system responds very fast (average latency in the test is 2 ms). But hours later, when I rerun the same test, I observe that performance goes down (20-60 ms). It's the same JAR files, the same JVM, and the same LAN over which the stress is running. I don't understand the reason for this behavior.
The LAN is 1 Gbps, and for the stress requirements I'm sure we are not using all of it.
So my questions:
Could it be because of some switches in the LAN?
Does the machine slow down after some time? (The machines were last restarted about 6 months back, well before the stress runs started; they are RHEL 5, 64-bit quad-core Xeon.)
What is the general way to debug such issues?
A few questions...
How much of the environment is under your control, and are you putting any measures in place to ensure it's consistent for each run? I.e., are you sharing the network with other systems? Is the machine you're using dedicated solely to your stress testing?
The way I'd look at this is to start gathering details on what your machine and code are up to. That means using perfmon (Windows) or sar (Unix) to find out what the OS and hardware are doing, and attaching a profiler to make sure your code is doing the same thing and to help pinpoint where the bottleneck is occurring from a code perspective.
Nothing terribly detailed, but something I hope will help get you started.
The general way is "measure everything". This, in particular, might mean:
1. Ensure the time on all servers is the same (use NTP or something similar).
2. Measure how long it takes to generate a request (what if the request generator has a bug?).
3. Measure when the request leaves the client machine(s), or at least how long the I/O takes. Sometimes it is enough to know the average over many requests.
4. Measure when the request arrives at the server.
5. Measure how long it takes to generate a response.
6. Measure how long it takes to send the response.
You can probably start with the 5th item, as this is (you believe) your critical chain. But it is best to log as much as you can, since, as you've said yourself, it takes a long time to produce different results.
If you don't want to modify your code, look for cases where you can sniff data without intervening (e.g. define a servlet filter in your web.xml).
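For example, a timing filter along these lines (a sketch with a hypothetical class name; map it to "/*" in web.xml) records per-request latency without touching the application code:

import java.io.IOException;
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;

public class TimingFilter implements Filter {

    public void init(FilterConfig config) { }

    public void doFilter(ServletRequest req, ServletResponse res, FilterChain chain)
            throws IOException, ServletException {
        long start = System.nanoTime();
        try {
            chain.doFilter(req, res);
        } finally {
            long elapsedMs = (System.nanoTime() - start) / 1000000;
            System.out.println("request took " + elapsedMs + " ms"); // or your logger of choice
        }
    }

    public void destroy() { }
}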