I'm hosting a Java app on App Engine, and for some reason since today the response times are extremely slow - 10,000 ms and more! The GAE status page shows everything is OK. Does anyone have an answer or a similar experience?
Second issue: I see that many requests start executing only a few seconds after they have been received, i.e. there is a delay before the request is processed. Does anyone know how I can fix that?
P.S.
I changed my instances from F1 to F2 to see if maybe it would help, but the result is the same.
Thank you
The GAE Google group is likely still the best place to ask questions like this.
Could it be that you are just seeing an increased number of warmup (loading) requests? In that case going from F1 to F2 will not make a huge difference. Depending on your application, instance startup time can be reduced by changing the instance class, but this change alone will not bring you down to a more reasonable response time of ~1 second.
The following best practices allow you to reduce the duration of loading requests:
Load only the code needed for startup.
Access the disk as little as possible.
In some cases, loading code from a zip or jar file is faster than loading from many separate files.
You can also try to add a few resident instances. The GAE scheduler will then put extra traffic on resident instances and launch new dynamic instances in the background. Since residents are started ahead of time this will hide some latency from users.
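For illustration, here is a minimal appengine-web.xml sketch of that setup. The element names are the standard automatic-scaling settings shown elsewhere in this thread; the instance class and the count of 2 resident instances are only assumptions to adapt to your app:

<appengine-web-app xmlns="http://appengine.google.com/ns/1.0">
  <instance-class>F2</instance-class>
  <automatic-scaling>
    <!-- resident (idle) instances are started ahead of time, so traffic spikes land on an already-running instance -->
    <min-idle-instances>2</min-idle-instances>
  </automatic-scaling>
</appengine-web-app>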
Related
I have an EC2 instance doing a long-running job. The job should take about a week, but after a week it is only at 31%. I believe this is due to the low average CPU usage (less than 1%) and because the instance very rarely receives a GET request (just me checking the status).
Reason for the low CPU:
This Java service performs many GET requests, then processes a batch of pages once it has a few hundred (non-arbitrary; there is a reason they are all required first). But to prevent getting HTTP 429 (Too Many Requests) I must space my GET requests apart using Thread.sleep(x) and synchronization. This results in very low CPU usage that spikes every so often.
I think Amazon's preemptive systems think that my service is waiting arbitrarily, when in actual fact it needs to wake up at a specific moment. I also notice that if I check the status more often, the job goes quicker.
How do I stop Amazon's preemptive system from thinking my service isn't doing anything?
I have thought of two solutions, but neither seems intuitive:
Have another process running to keep the CPU at ~25%, which would only really consist of:
while (true) {
    Thread.sleep(300);
    // busy-wait for ~100 ms (LocalDateTime has no plusMillis, so use plusNanos)
    LocalDateTime until = LocalDateTime.now().plusNanos(100_000_000);
    while (LocalDateTime.now().isBefore(until)) {
        // empty loop
    }
}
However, this just seems like an unnecessary use of resources.
Have a process on my laptop perform a GET request to the AWS service every 10 minutes. But one of the reasons I put it on AWS was to free up my laptop, although this would use orders of magnitude fewer of my laptop's resources than running the whole service locally.
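The second option would be little more than a scheduled ping. Roughly what I have in mind, as a sketch (the URL is just a placeholder for my status endpoint):

import java.net.HttpURLConnection;
import java.net.URL;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class KeepAlive {
    public static void main(String[] args) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        // ping the (placeholder) status endpoint every 10 minutes
        scheduler.scheduleAtFixedRate(() -> {
            try {
                HttpURLConnection conn =
                        (HttpURLConnection) new URL("http://my-ec2-host/status").openConnection();
                System.out.println("status ping returned " + conn.getResponseCode());
                conn.disconnect();
            } catch (Exception e) {
                e.printStackTrace();
            }
        }, 0, 10, TimeUnit.MINUTES);
    }
}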
Is one of these solutions more desirable than the other? Is there another solution which would be more appropriate?
Many Thanks,
Edit: note that I use the free-tier services only.
I have built a Java program within an Android app that parses some data from an online website. This data is used by the app to show the content (organized) to the user. I tried to optimize my code as much as I could, but there's not much I can do since it's a lot of data. So what I did instead was deploy the Java program on Heroku and let the server side do the work and return the result as simple HTML, which I can easily parse with no major delay.
The thing is, this worked out pretty well. I got a big increase in performance, yet with one little problem: when I open the app for the first time in, say, 2 days, it turns out to be a lot slower, but on the very next run it seems to be a lot faster. I am guessing that the Heroku server works in some cache-like way, where the least recently run apps get no priority until a request comes from the client side, considering that after 2-3 consecutive runs I get a very large increase in performance.
My question is: is there a way I can sort of "give priority" to my Heroku Java program, or is there another free hosting service that lets you deploy a WAR and has no such performance issues? To some it might seem like no big deal. In particular, the performance improves from about 6 seconds to about 2 seconds, which is actually quite a big deal, since app users usually do not tolerate such delays.
Heroku puts free apps to sleep. The first request after it sleeps will restart the app, which means you have to wait longer.
For more info see Heroku's sleeping policy for Free dynos.
We have a Google App Engine Java app with 50 - 120 req/s depending on the hour of the day.
Our frontend appengine-web.xml looks like this:
<instance-class>F1</instance-class>
<automatic-scaling>
  <min-idle-instances>3</min-idle-instances>
  <max-idle-instances>3</max-idle-instances>
  <min-pending-latency>300ms</min-pending-latency>
  <max-pending-latency>1.0s</max-pending-latency>
  <max-concurrent-requests>100</max-concurrent-requests>
</automatic-scaling>
Usually 1 frontend instance manages to handle around 20 req/s. Start up time is around 15s.
I have a few questions:
When I change the frontend Default version, I get thousands of Error 500 - Request was aborted after waiting too long to attempt to service your request.
So, to avoid that, I switch from one version to the other using the traffic-splitting feature by IP address, going from 1% to 100% in steps of 5%. It takes around 5 minutes to do it properly and avoid massive numbers of 500 errors. Moreover, that feature seems to be available only for the default frontend module.
-> Is there a better way to switch versions?
To avoid thousands of Error 500 - Request was aborted after waiting too long to attempt to service your request, we must use at least 3 resident (min-idle) instances. And as our traffic grows, even with 3, we sometimes still get massive numbers of Error 500. Am I supposed to go to 4 residents? I thought App Engine was nice because you only pay for the instances you use, so if, in order to work properly, we need at least half our running instances in idle mode, that's not great, is it? It's not really cost effective: when the load is low, still having 4 idle instances is a big waste :( What's weird is that they seem to wait only 10 s before responding with a 500: pending_ms=10248
-> Do you have any advice on how to avoid that?
Quite often, we also get thousands of Error 500 - A problem was encountered with the process that handled this request, causing it to exit. This is likely to cause a new process to be used for the next request to your application. If you see this message frequently, you may be throwing exceptions during the initialization of your application. (Error code 104). I don't understand: there aren't any exceptions, yet we get hundreds of these within a few seconds.
-> Do you have any advice on how to avoid that?
Thanks a lot in advance for your help! ;)
Those error messages are mostly related to loading requests that take too long to complete and therefore end in something similar to a DeadlineExceededException, which, as you probably already know, dramatically affects performance and user experience.
This is a very common issue, especially when using DI frameworks with Google App Engine, and so far it is an unavoidable and serious unfixed problem when using automatic scaling, which is the scaling policy App Engine has provided for handling public requests since its inception.
Try changing the frontend instance class to F2, especially if your memory consumption is higher than 128 MB per instance, and set the min/max pending latency to 15s so your requests get more of a chance to be processed by a resident instance. However, you will still get long response times for some requests, since Google App Engine may not issue a warmup request every time your application needs a new instance, and I understand that F4 would break the bank.
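A sketch of what that could look like in appengine-web.xml, reusing the settings you already posted (the idle-instance counts are kept from your original config; only the instance class and pending-latency values change, and they are suggestions rather than a guaranteed fix):

<instance-class>F2</instance-class>
<automatic-scaling>
  <min-idle-instances>3</min-idle-instances>
  <max-idle-instances>3</max-idle-instances>
  <!-- give pending requests more time to land on a resident instance instead of triggering a loading request -->
  <min-pending-latency>15s</min-pending-latency>
  <max-pending-latency>15s</max-pending-latency>
  <max-concurrent-requests>100</max-concurrent-requests>
</automatic-scaling>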
I have given some thought to how to calculate how many users I can handle with Heroku and one dyno.
But to figure it out I need some input.
And I must say the official documentation isn't easy to navigate and interpret, so I haven't read it all. My complaint is that it doesn't describe things very well: sometimes it describes old stacks, sometimes it's Ruby-specific, sometimes something isn't described at all, and so on.
So, to make my calculations, I need some input on how Heroku's Cedar stack handles requests.
You are more than welcome to correct me on my assumptions as I am relatively new to dyno theory.
Let's say I have a controller that takes a request and calculates a JSON response in 10 ms locally. Will I be able to serve 100 requests a second?
As I understand it, the Cedar stack doesn't have a fronting caching solution, so many questions arise:
Do requests for static content take up dyno time?
Does transfer time count toward request time?
Can one dyno transfer many responses to requests at the same time if the requests require little CPU utilization?
Some of the questions are intertwined, so a combined answer or other thoughts are welcome.
An example:
Static HTML page.
<HTML>...<img><css><script>...
AjaxCall //dyno processing time 10ms
AjaxCall //dyno processing time 10ms
AjaxCall //dyno processing time 10ms
AjaxCall //dyno processing time 10ms
...</HTML>
Can I serve (1000 ms / (10 ms x 4)) = 25 HTML pages a second?
This assumes that static content isn't provided by a dyno.
This assumes that transfer time isn't blamed on the dyno.
If this isn't the case it would be a catastrophe. Let's say a mobile phone in Africa makes 10 requests, each with a 10-second transfer time; then my app would be unavailable for over 1½ minutes.
I can only really answer the first question: static assets most certainly do take up dyno time. In fact, I think it's best to keep all static assets, including stylesheets and JS, on an asset server when using Heroku's free package. (If everyone did that, Heroku would benefit and so would you.) I recommend using the asset_sync gem to handle that. The README does explain that there are one or two easily resolved current issues.
Regarding your last point, sorry if I'm misinterpreting here, but a user in South Africa might take 10 seconds to have their request routed to Heroku, yet most of that time is probably spent traveling around the maze of telephone exchanges between SA and the USA. Your dyno is only tied up for the portion of the request that takes place inside Heroku's servers, not the 9.9 seconds your request spent getting there. So effectively Heroku is oblivious to whether your request is coming from South Africa or Sweden.
There are all sorts of things you can do to speed your app up: caching, more dynos, Unicorn with several workers.
You're making two wrong assumptions. The good news is that your problem becomes much simpler once you think about things differently.
First off, remember that a dyno is a single process, not a single thread. If you're using Java you'll be utilizing many request threads, so you don't have to worry about your application being unavailable while a request is being processed: you'll be able to process requests in parallel.
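As a concrete (hypothetical) illustration, one common way to run Java on Heroku is an embedded servlet container. The sketch below uses Jetty with an explicit thread pool, so a single dyno process handles many requests concurrently; the class name, pool size, and handler are illustrative only:

import org.eclipse.jetty.server.Request;
import org.eclipse.jetty.server.Server;
import org.eclipse.jetty.server.ServerConnector;
import org.eclipse.jetty.server.handler.AbstractHandler;
import org.eclipse.jetty.util.thread.QueuedThreadPool;

import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class App {
    public static void main(String[] args) throws Exception {
        // one dyno = one JVM process, but that process serves requests from a thread pool
        Server server = new Server(new QueuedThreadPool(50));
        ServerConnector connector = new ServerConnector(server);
        connector.setPort(Integer.parseInt(System.getenv().getOrDefault("PORT", "8080")));
        server.addConnector(connector);

        server.setHandler(new AbstractHandler() {
            @Override
            public void handle(String target, Request baseRequest,
                               HttpServletRequest request, HttpServletResponse response)
                    throws java.io.IOException {
                // the ~10 ms of work would happen here, on one of the pool's threads
                response.setContentType("application/json");
                response.getWriter().print("{\"ok\":true}");
                baseRequest.setHandled(true);
            }
        });

        server.start();
        server.join();
    }
}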
Also, dyno time refers to the amount of time your process is running, not just the time spent processing requests. So a web process that is waiting for a request still consumes dyno time, since the process is up while it waits. This is why you get 750 free dyno hours a month: you'll be able to run a single dyno for the entire month (720 hours).
As for computing how many requests your application can serve per second, the best way is to test it. You can use New Relic to monitor your application while you load-test it with JMeter or whatever your favorite load-testing tool is: http://devcenter.heroku.com/articles/newrelic
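If you want a rough number before reaching for JMeter, even a crude sketch like this gives a ballpark requests-per-second figure (the URL, thread count, and duration are placeholders):

import java.net.HttpURLConnection;
import java.net.URL;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class CrudeLoadTest {
    public static void main(String[] args) throws Exception {
        String target = "https://your-app.herokuapp.com/endpoint"; // placeholder URL
        int threads = 20;
        int durationSeconds = 30;
        AtomicInteger completed = new AtomicInteger();

        ExecutorService pool = Executors.newFixedThreadPool(threads);
        long end = System.currentTimeMillis() + durationSeconds * 1000L;
        for (int i = 0; i < threads; i++) {
            pool.submit(() -> {
                while (System.currentTimeMillis() < end) {
                    try {
                        HttpURLConnection conn =
                                (HttpURLConnection) new URL(target).openConnection();
                        conn.getResponseCode(); // blocks until the response arrives
                        conn.disconnect();
                        completed.incrementAndGet();
                    } catch (Exception e) {
                        // failed request: not counted
                    }
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(durationSeconds + 30, TimeUnit.SECONDS);
        System.out.println("~" + (completed.get() / (double) durationSeconds) + " requests/second");
    }
}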
I'm planning on hosting a JRuby on Rails app on Google AppEngine/Java. I found a great blog post by Ola Bini on how to do this, but in the summary he says:
Overall, JRuby on Rails works very well on the App Engine, except for some smaller details. The major ones are the startup cost and testing. As it happens, you can't actually get GAE/J to precreate things. Instead you'll have to let the first release take the hit of this. Now, GAE/J does a lot of preverifying of bytecodes and so on, so startup is a bit heavier than on other JDKs. One runtime takes about 20 seconds wall time to start up, so the first hit takes some time.
I don't fully understand this. How often, and under what circumstances, will a runtime need to be started up? A regular 20-second lag is likely to be an issue.
App Engine will start new runtimes for you whenever demand outstrips the currently running instances. It will then shut down instances when demand is lower. Ultimately, this means that all of your instances could be shut down if your app is not used for a certain amount of time. Then, the next time a user tries to access your app, a new instance will need to be started, or "spun up" as some people call it.
As of March, the App Engine team wouldn't give any official estimate of how long an instance will stay up:
[7:40pm] nwinter: Is it possible to get a rough estimate of how long an app instance will stick around once spawned?
[7:40pm] marzia_google: #nwinter, not really
[7:40pm] marzia_google: there are no garuntees
[7:41pm] nwinter: No average time or anything?
[7:42pm] marzia_google: #nwinter i'm not sure an average time would be meaningful, even if i knew off hand what it was ( i don't)
[7:42pm] marzia_google: since it really can be quite variable
[7:42pm] Kardax: Re instance lifetime: So an app instance could last a few seconds or a few hours? Just curious
[7:43pm] dan_google: nwinter: Apps are evicted by least-recently-used on an app server. As someone noted recently (forums or chat I forget), low traffic could mean lots of "restarts", but so could spikes in traffic which may start new instances on multiple app servers.
[7:43pm] nwinter: #dan_google: good to know!
[7:43pm] dan_google: Kardax: Yes, depending on the weather. By which I mean, request patterns, other apps on each app server, and so forth. Not really predictable.
This is the transcript of a chat with the App Engine team. I have deleted the non-relevant lines in the transcript, like "so and so logged in." The full transcript can be found here