Google App Engine - How can I improve my cold start JVM time?

I'm not new to improving my cold start time; I've spent many hours trying different things. If possible, I'd like to know exactly what Google App Engine does during a cold start.
I have a log statement as described here http://code.google.com/intl/nl/appengine/kb/java.html#performance to show when my code first gets control.
I have been testing two apps. One is simple, and my code first gets control after about 1 second.
The other has many more files, and my code first gets control after about 2 seconds. It doesn't use any more libraries than the first one, but it does have a lot more JSPs and Java classes.
Could simply having more Java classes and JSPs cause a slower cold start, even if those classes are never used?

Related

Deploying a Java WAR to a dynamic website: performance issue

I have built a Java program within an Android app that parses some data from a website. The app uses this data to show organized content to the user. I tried to optimize my code as much as I could, but there's not much more I can do since it's a lot of data. So instead I deployed the Java program on Heroku and let the server side do the work and return the result as simple HTML, which I can parse easily with no major delay. This worked out pretty well and gave me a big increase in performance, with one small problem. When I open the app for the first time in, say, 2 days, it is a lot slower, but on the second run just after that it is a lot faster.
I am guessing that the Heroku server works in some cache-like way, in that the least recently run dynamic websites get no priority until a request comes from the client side, considering that over 2 or 3 consecutive runs I get a very large increase in performance. My question is: is there a way to "give priority" to my Heroku Java program, or is there another free host for dynamic websites that lets you deploy a WAR and has no such performance issues? To some it might seem like no big deal, but the response time goes from about 6 seconds to 2 seconds, which matters because app users usually do not tolerate such delays.

Heroku puts free apps to sleep. The first request after it sleeps will restart the app, which means you have to wait longer.
For more info see Heroku's sleeping policy for Free dynos.

How can I profile a Java application non-interactively?

What I want to do is generate a call tree with CPU timing information for a Java application as it goes through a scripted task. The idea is to see how much time is spent in each part of the code, and how this changes when I change the code or the task, but to do so in a consistently repeatable way.
In Java VisualVM I can do this interactively by clicking to start and stop profiling, but I would like to automate the process so I can get more consistent results (and not get so bored). Can VisualVM do this, or is there another profiler that can?

If I were a profiler vendor I would have to be concerned about providing people what they think they want, even if what they think they want does not solve the problem they have.
The thing is, only some problems can be found by knowing how long routines typically take, and if you ignore the ones you don't find that way, they will become the dominant part of how much time your program takes.
Here is a recent example of what I mean:
A program spends 50% of its wall-clock time reading .dll files to look up string resources to get the names of files so that the strings can be displayed on a splash screen so the user can see that something is happening during application startup. That means, if there were some other way to provide eye-candy to the user, the app could start up twice as fast.
During this process, the call stack is typically 15-20 functions deep, so it's really hard to tell what's going on just by having timing numbers for the functions.
What makes the problem difficult is that it is semantic. No particular routine is "hot" in a way that it could be sped up.
The only "hot" thing is the general description, overall, of what the program is doing, and no tool can isolate it for you.
Only you can recognize it.
However, if you simply interrupted the program and examined the call stack during startup, the probability is 50% that you would see the entire explanation for the time being spent.
If you do it several times, it's the basis of the random pausing technique, which some programmers rely on because it will find every problem profilers can find, and more, and which others look down on because it isn't a tool.
You can do it interactively, or you can extract a small number of stack samples by using something analogous to pstack.
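
For a Java program, the stack-sampling idea above can be sketched in-process with Thread.getAllStackTraces(). This is only a minimal illustration, not something from the original answer: the class name StackSampler, the sample count, the fixed interval (rather than truly random pauses), and the busy-loop worker standing in for the scripted task are all assumptions. The JDK's jstack tool takes similar snapshots from outside the process.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class StackSampler {

    /** Takes `count` snapshots of every live thread's stack, `intervalMs` apart. */
    public static List<Map<Thread, StackTraceElement[]>> sample(int count, long intervalMs) {
        List<Map<Thread, StackTraceElement[]>> snapshots = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            snapshots.add(Thread.getAllStackTraces());
            try {
                Thread.sleep(intervalMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt and stop early
                break;
            }
        }
        return snapshots;
    }

    public static void main(String[] args) {
        // Hypothetical workload standing in for the scripted task being profiled.
        Thread worker = new Thread(() -> {
            long x = 0;
            while (!Thread.currentThread().isInterrupted()) {
                x += System.nanoTime() % 7;
            }
        }, "worker");
        worker.setDaemon(true);
        worker.start();

        int n = 0;
        for (Map<Thread, StackTraceElement[]> snapshot : sample(5, 200)) {
            System.out.println("--- sample " + (++n) + " ---");
            StackTraceElement[] stack = snapshot.get(worker);
            if (stack != null) {
                for (StackTraceElement frame : stack) {
                    System.out.println("    at " + frame);
                }
            }
        }
        worker.interrupt();
    }
}
```

Reading the handful of printed stacks by eye, rather than aggregating them into timings, is what the technique relies on.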

Can anyone recommend an open source C- or Java-based scheduler for embedding as the heart of another program?

Can someone recommend an open source scheduler that would lend itself to being embedded as the heart of a specialized web-based scheduler? A C- or Java-based scheduler would be my first choice to work with.
The finished project would allow someone, via the internet, to add, delete and change tasks scheduled on the local machine on a monthly, weekly, or daily basis and by time of day. The tasks would be fairly simple: to display messages and play WAV files on the local machine on scheduled dates and at times specified by the remote programmer.
Ok, why? Well, my wife and I moved her mother to our town a couple of years ago, because she could not or would not tend to her own affairs, including eating and taking her insulin on a rigorous schedule. She is a type 1 diabetic. She is a widow and had been living by herself for about ten years. My wife had been tending to her bills and affairs remotely from our home a thousand miles away. My mother-in-law had a dozen different doctors who did not know of one another, and she was being badly over-medicated, with one medication counteracting another. We found out that she had not been careful with her diabetes and that this had resulted in a trip by EMS to the emergency room every other month on average. Strangely, she is not totally senile, though her short-term memory is pretty much shot, but she is, and always has been, a "dyed-in-the-wool" slacker. My wife and I both work full-time jobs, from before daylight until after dark. Still, my wife manages to call her mother three times a day to tell her to eat and take her insulin, and then spends about two hours each evening with her mother before coming home.
This machine would be essentially a headless system that serves no function other than to flash pre-programmed messages on a small monitor and play audio "nags" at the appropriate times: "Get up and eat your breakfast", "It's time to do your insulin", "Give the dog its pill", "Get ready to go to your doctor's appointment" and so on. With no keyboard or mouse and the front panel switches disabled, she is enough of a Luddite that I don't think she will think of pulling the cord out of the wall socket.
Anyway, that's where I'm trying to go. I'm a reluctant programmer, but, I have written some large and complex programs over the years and in a number of different languages and to make the wife's life a bit easier, I can do this. A scheduler that could be modified to become a large block of code within the overall program, would save me a great deal of time and head scratching.

Get a Linux box, SSH in to it, and add entries to the crontab. As for the alerting program, that'll be specific to your task.

You got me interested. We're looking into Spring Batch at the office, but that's less about scheduling jobs and more about heavy processing. I checked out the FAQ, which led me to Quartz... it looks pretty nifty. Here are its features: http://www.quartz-scheduler.org/overview/features.html

Quartz, as mentioned by Dondo, is a kind of industry standard for scheduling. It's very popular and widely used.
Alternatively, you could use the Timer API that comes with Java EE. It is fairly basic, but still quite powerful. See "Simplest Possible EJB 3.1 Timer" for a small example.
Java EE also gives you the tools to easily create a GUI (via JavaServer Faces) and to have some CRUD logic to enter new tasks into your system and persist them to a DB with the Java Persistence API.
Of course, if you don't have any experience yet with Java EE (or Spring, or Quartz) simply learning those technologies may be more time consuming than actually building what you have in mind.
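
If the plain JDK is enough, the scheduling core can also be sketched with java.util.concurrent.ScheduledExecutorService and no framework at all. This is an illustrative sketch, not the Quartz or EJB timer API: DailyNag, untilNext, and scheduleDaily are made-up names, and printing to the console stands in for showing a message or playing a WAV file.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.LocalTime;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

public class DailyNag {

    /** Time from `now` until the next occurrence of the wall-clock time `at`. */
    public static Duration untilNext(LocalDateTime now, LocalTime at) {
        LocalDateTime next = now.toLocalDate().atTime(at);
        if (!next.isAfter(now)) {
            next = next.plusDays(1); // already passed today, so use tomorrow
        }
        return Duration.between(now, next);
    }

    /** Schedules `message` to be printed every 24 hours, starting at the next `at`. */
    public static ScheduledFuture<?> scheduleDaily(
            ScheduledExecutorService scheduler, LocalTime at, String message) {
        long initialDelayMs = untilNext(LocalDateTime.now(), at).toMillis();
        return scheduler.scheduleAtFixedRate(
                () -> System.out.println(message), // placeholder for display/audio output
                initialDelayMs,
                TimeUnit.DAYS.toMillis(1),
                TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        ScheduledFuture<?> breakfast =
                scheduleDaily(scheduler, LocalTime.of(8, 0), "Get up and eat your breakfast");
        ScheduledFuture<?> insulin =
                scheduleDaily(scheduler, LocalTime.of(8, 30), "It's time to do your insulin");
        System.out.println("Breakfast nag in " + breakfast.getDelay(TimeUnit.MINUTES) + " min");
        System.out.println("Insulin nag in " + insulin.getDelay(TimeUnit.MINUTES) + " min");
        // A real deployment would leave the scheduler running; the demo stops it so main() returns.
        scheduler.shutdownNow();
    }
}
```

A fixed 24-hour period drifts by an hour across daylight saving changes, and nothing here is persisted or editable remotely, which is where Quartz or a Java EE timer would earn its keep.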

Can't see my own application methods in Java VisualVM

I'm trying to profile my java app, just to find out the methods in which most time is being spent. Given the poor reactions here to TPTP, I thought I'd give Java VisualVM a go.
It all seemed rather simple to use - except that I can't seem to get anything consistent or useful out of it.
I can't seem to see anything relating to MY OWN code - all I get is a whole bunch of calls to things like java.* methods.
I've tried restricting instrumentation to only my own packages, which seems to cut down the number of methods instrumented, but still I don't ever seem to see my own.
Each time I run, I get varying numbers of methods instrumented, ranging from tens to thousands.
I've tried putting in a sleep at the start of my app, to make sure I get VisualVM up and running before my app starts to do anything interesting, to make sure it's profiling when the interesting stuff is running.
Is there something I have to do to ensure my classes get instrumented?
Are there timing issues? For example, do I have to wait for classes to be loaded?
I've also tried running the guts of the code twice, to make sure all the code does get exercised...
I'm just running an app, with a main, from Eclipse. I've tried using the Eclipse integration so that VisualVM starts up when I start the app - results are the same.
I've also tried exporting the app as a runnable app, and running it standalone from the command line, rather than through Eclipse - same result.
My app is not a long running web app etc - just a main that calls some other of my own classes to do some processing, then quits.
I'd be grateful for any advice about what I might be doing wrong ! :)
Thanks !

I too am struggling with VisualVM, which is a shame because its user interface is fantastic while its profiling output seems horrific. You can see my question here:
Java VisualVM giving bizarre results for CPU profiling - Has anyone else run into this?
I can tell you a couple of odd things that I have learned about VisualVM and the way it seems to do its profiling.
VisualVM appears to be counting the total time spent inside a method (wall-clock time). I have a thread in my application which starts a number of other threads and then immediately blocks waiting for a message on a queue. VisualVM will not register this method in the profiler until one of the other threads sends the message the first thread was waiting for (when the application terminates). Suddenly the blocking method call dominates the profiling output and is recorded as taking up more than 80% of the application time.
Other profilers, such as JProfiler and the one used by Azul, do not count a blocked thread as taking up time for the profiler. This means that blocking methods, which probably aren't interesting (situation dependent) for performance profiling, are obscuring your view of the code that is eating your CPU time.
When I am running my profiling I end up with
sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run()
obscuring my profiling right up until that message comes back to the waiting thread and then the top spot is shared between these two totally irrelevant methods, as well as various other uninteresting methods which don't appear on other profilers.
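
The difference between wall-clock time and CPU time for a blocked thread can be seen with the standard java.lang.management.ThreadMXBean. This is a small sketch of the distinction, not a description of how VisualVM or JProfiler measure internally; the class name WallVsCpu and the 500 ms sleep are arbitrary, and it assumes the JVM supports thread CPU time measurement.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

public class WallVsCpu {

    /** Runs `task` on the current thread and returns {wallNanos, cpuNanos}. */
    public static long[] measure(Runnable task) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        long wallStart = System.nanoTime();
        long cpuStart = threads.getCurrentThreadCpuTime(); // -1 if CPU timing is disabled
        task.run();
        long cpuNanos = threads.getCurrentThreadCpuTime() - cpuStart;
        long wallNanos = System.nanoTime() - wallStart;
        return new long[] {wallNanos, cpuNanos};
    }

    public static void main(String[] args) {
        long[] blocked = measure(() -> {
            try {
                Thread.sleep(500); // stands in for blocking on a queue
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        System.out.printf("wall: %d ms, cpu: %d ms%n",
                blocked[0] / 1_000_000, blocked[1] / 1_000_000);
    }
}
```

A profiler that reports the first number will rank the blocked method near the top; one that reports the second will barely show it.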
Secondly, and I think quite importantly, the method filtering mechanism doesn't work as I would have expected. This means that I can't filter out the [...] I am trying to track down what the story is with this right now.
Not a really helpful answer. The solution as I see it right now is to pay for JProfiler - VisualVM just doesn't seem trustworthy for this task.

You could take a look at AppDynamics Lite. It has some nice features, such as business transaction discovery, which can sample all calls made to a specific method in your code.
The Lite version has a lot of limitations, such as a 10-minute sampling maximum and a maximum of 30 discovered business transactions.
It would be nice to have a free tool that does the same.

I assume this isn't just an academic question - you would like to see if you could make the app run faster. I assume you also wouldn't mind a little "out of the box" thinking. There are many popular ideas about performance that are actually pretty fuzzy.
For example, you say you're looking for "methods in which most time is being spent". If by that you mean "self time" (program counter actually in the method) there is probably very little, unless you've got some intense loops. Methods generally spend time by calling other methods, sometimes doing I/O.
Another fuzzy idea is that measuring method time or counting the number of calls can tell you very much about where bottlenecks are. Bottlenecks are specific lines of code, not methods, so even if you know approximately where to look, you're still playing detective.
So those are a few of the fuzzy ideas. Here is a bunch more. Let me suggest how one should think about it, and how that leads to results.
When you eventually fix something, it will reduce execution time by some percent, like (pick a number) 30%, right? (Otherwise you didn't fix anything.) OK, during that 30% it was doing something, something that it didn't need to do because later you got rid of it. So, you don't need to measure. You do need to find out what it is doing in that time, so you know what to get rid of.
A very simple way is to "pause" it 10 (or some number of) times at random. Understand what it is doing and why, by looking at the call stack and possibly some of the data. On about 3 of those times you will see it doing something you could get rid of.
You will know approximately how much it will save by seeing what percent of samples is showing it. Approximate is good enough. You can easily see exactly how much time is saved by stopwatching it before and after.
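
The arithmetic behind that estimate is short enough to write down. A rough sketch, with a made-up class name SampleEstimate: if the removable activity shows up on k of n samples, it accounts for roughly k/n of the run time, and removing it gives a speedup of about 1 / (1 - k/n). For 3 of 10 samples that is about 30% of the time and a speedup of roughly 1.43x. With so few samples the figure is only approximate, which is the point made above.

```java
public class SampleEstimate {

    /** Rough fraction of run time spent in an activity seen on k of n samples. */
    public static double fractionOfTime(int samplesShowingIt, int totalSamples) {
        return (double) samplesShowingIt / totalSamples;
    }

    /** Approximate speedup if that activity were removed entirely. */
    public static double speedup(int samplesShowingIt, int totalSamples) {
        return 1.0 / (1.0 - fractionOfTime(samplesShowingIt, totalSamples));
    }

    public static void main(String[] args) {
        int seen = 3;   // samples where the removable activity was on the stack
        int total = 10; // samples taken
        System.out.printf("about %.0f%% of the time, speedup about %.2fx%n",
                100 * fractionOfTime(seen, total), speedup(seen, total));
    }
}
```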
Then, don't stop. You've made the app faster. Do it again, and make it faster yet. Sooner or later you get to a point where you can't make it any faster, but it's probably in more than one step.

Google AppEngine: how often does a "runtime startup" occur

I'm planning on hosting a JRuby on Rails app on Google AppEngine/Java. I found a great blog post by Ola Bini on how to do this, but in the summary he says:
"Overall, JRuby on Rails works very well on the App Engine, except for some smaller details. The major ones are the startup cost and testing. As it happens, you can’t actually get GAE/J to precreate things. Instead you’ll have to let the first release take the hit of this. Now, GAE/J does a lot of preverifying of bytecodes and so on, so startup is a bit more heavy than on other JDKs. One runtime takes about 20 seconds wall time to startup, so the first hit takes some time."
I don't fully understand this. How often, under what circumstances, will a runtime need to be started up? A regular 20 second lag is likely to be an issue.

App Engine will start new runtimes for you whenever demand is outstripping the currently running instances. It will then shut down instances when demand is lower. Ultimately, this means that all of your instances could be shut down if your app is not used for a certain amount of time. Then, the next time a user tries to access your app, a new instance will need to be started, or "spun up" as some people call it.
As of March, the App Engine team wouldn't give any official estimate of how long an instance will stay up:
[7:40pm] nwinter: Is it possible to get a rough estimate of how long an app
instance will stick around once spawned?
[7:40pm] marzia_google: #nwinter, not really
[7:40pm] marzia_google: there are no garuntees
[7:41pm] nwinter: No average time or anything?
[7:42pm] marzia_google: #nwinter i'm not sure an average time would be
meaningful, even if i knew off hand what it was ( i don't)
[7:42pm] marzia_google: since it really can be quite variable
[7:42pm] Kardax: Re instance lifetime: So an app instance could last a few
seconds or a few hours? Just curious
[7:43pm] dan_google: nwinter: Apps are evicted by least-recently-used on an
app server. As someone noted recently (forums or chat I forget), low
traffic could mean lots of "restarts", but so could spikes in traffic which
may start new instances on multiple app servers.
[7:43pm] nwinter: #dan_google: good to know!
[7:43pm] dan_google: Kardax: Yes, depending on the weather. By which I
mean, request patterns, other apps on each app server, and so forth. Not
really predictable.
This is the transcript of a chat with the app engine team. I have deleted the non-relevant lines in the transcript like "so and so logged in." The full transcript can be found here
