I run Java program like script many times per second and every time loading of script takes a lot of time
register.put("some word", new SomeClass(/* code */), some vars);
and this code repeats ~200 times (yeah)
there is a way to precompile this code? (like C++ precompiled headers)
Java gets compiled, not to machine code, but to bytecode.
If you wish to compile your Java application in machine-readable code (i.e. an executable), you can use GraalVM's native-image feature. Be aware, though, that there are certain limitations. For one, compilation is quite slow and memory-intensive. For another, SubstrateVM (the framework used for native compilation) operates under a closed-world assumption. This means that you cannot load classes during program execution. Another downside is that native images are not able to use the JIT compiler and thus the overall throughput of the program may decrease (see this GitHub issue for details). This list of limitations is not extensive. Please check the limitations when you run into problems.
You can also try to improve application startup by creating your own JVM runtime image with jlink. There is an article at medium that seems like a good entry point.
Related
I'm building a java CLI utility application that processes some data from a file.
Apart from reading from a file, all the operations are done in-memory. The in-memory processing part is taking a surprisingly long time so I tried profiling it but could not pinpoint any specific function that performed particularly bad.
I was afraid that JIT was not able to optimize the program during a single run, so I benchmarked how the runtime changes between the consecutive executions of the function with all the program logic (including reading the input file) and sure enough, the runtime for the in-memory processing part goes down for several executions and becomes almost 10 times smaller already on the 5th run.
I tried shuffling the input data before every execution, but it doesn't have any visible effect on this. I'm not sure if some caching may be responsible for this improvement or the JIT optimizations done during the program run, but since usually the program is ran once at time, it always shows the worst performance.
Would it be possible to somehow get a good performance during the first run? Is there a generic way to optimize performance for a short-running java applications?
You probably cannot optimize startup time and performance by changing your application1, 2. And especially for a small application3. And I certainly don't think there are "generic" ways to do it; i.e. optimizations that will work for all cases.
However, there are a couple of JVM features that should improve performance for a short-lived JVM.
Class Data Sharing (CDS) is a feature that allows JIT compiled classes to be cached in the file system (as a CDS archive) and which is then reused by later of runs of your application. This feature has been available since Java 5 (though with limitations in earlier Java releases).
The CDS feature is controlled using the -Xshare JVM option.
-Xshare:dump generates a CDS archive during the run
-Xshare:off -Xshare:on and -Xshare:auto control whether an existing CDS archive will be used.
The other way to improve startup times for a HotSpot JVM is (was) to use Ahead Of Time (AOT) compilation. Basically, you compile your application to a native code binary using the jaotc command, and then run the executable it produces rather than the java command. The jaotc command is experimental and was introduced in Java 9.
It appears that jaotc was not included in the Java 16 builds published by Oracle, and is scheduled for removal in Java 17. (See JEP 410: Remove the Experimental AOT and JIT Compiler).
The current recommended way to get AOT compilation for Java is to use the GraalVM AOT Java compiler.
1 - You could convert into a client-server application where the server "up" all of the time. However, that has other problems, and doesn't eliminate the startup time issue for the client ... assuming that is coded in Java.
2 - According to #apangin, there are some other application tweaks that may could make you code more JIT friendly, though it will depend on what you code is currently doing.
3 - It is conceivable that the startup time for a large (long running) monolithic application could be improved by refactoring it so that subsystems of the application can be loaded and initialized only when they are needed. However, it doesn't sound like this would work for your use-case.
You could have the small processing run as a service: when you need to run it, "just" make a network call to that service (easier if it's HTTP because there are easy way to do it in Java). That way, the processing itself stays in the same JVM and will eventually get faster when JIT kicks in.
Of course, because it could require significant development, that is only valid if the processing itself:
is called often
has arguments that are easy to pass to the service (usually serialized as strings)
has arguments that don't require too much data to pass to the service (e.g. several MB binary content)
I recently started reading about Java compilers. So far my understanding is, that optimization comes from techniques like tired compilation or code profiling. Now I read that Java 9 respectively Java 10 (Windows) provides the option of AOT compilation. Now I wonder: what use case would justify the use of AOT compilation?
To have better startup performance, like simple desktop app, it would be annoying for user to wait for it to load and then it still will be pretty slow until JIT kicks in. So then you can use AOT to already provide optimized code - it might not be as good as JIT, but will be much faster at startup.
Also some applications are used only for few seconds or even less - JIT will never have a chance to kick in. Like simple command line app that just sends single request and closes. Each function will be probably only executed once - so there is no reason to use JIT at all.
Also it might help to decrease binary size or allow for creation of very simple and small standalone binary. Same for memory usage - as JIT needs some memory to work.
I have a Java program where I'm trying new algorithms for square rooting and comparing them to the native Math.sqrt(a) method in Java. What I find weird is that the first time the .sqrt(a) method is called in the program, it takes at least 50,000ns whereas the times after, it only takes a few thousand. Does this have to do with how the system time is being calculated during the first few moments of running the program, or are the methods actually executing slower for some reason?
There are significant overheads in starting a Java application.
The JVM (the java executable) needs to be loaded.
The JVM needs to bootstrap:
creating and initializing the heap
classloading various system classes
and so on
Your classes need to be classloaded. This typically triggers further classloading of system classes, third party libraries and so on.
After a bit ... the JIT compiler starts to compile methods to native code.
While this is happening, the GC may run to clean up garbage that was created by JIT compilation and classloading.
All of this adds up to significant startup costs ... compared to (say) an application that is implemented in C or C++, compiled and linked to an executable.
However, this should bot be relevant to developing and benchmarking algorithms in Java. You simply need to do the benchmarking in a way that eliminates the "JVM warmup" overheads. For more details:
How do I write a correct micro-benchmark in Java?
#user7859067 comments:
need very awesome performance, go native.
I assume you mean ... implement the code as a Java native method. That doesn't help with JVM bootstrap overheads. And "going native" isn't always a win, since there are overheads when calling into custom native method from Java.
However, it is a fact that the implementations of many Math functions are in native code ... for speed. (The JIT compiler has tweaks to generate special fast calls to "intrinsic" native methods, but (AFAIK) you can't use this yourself without modifying the JRE codebase ...) Anyhow, if you compare your (pure Java) implementation's performance against the standard (native) Math.sqrt method, you are comparing apples and oranges.
I read some materials about JVM and bytecode. I think it would be more efficient if JVM can translate bytecode into platform dependent machine code in the first time run, instead of interpreting them all the time.
However, I could not find such files in my project folders. There are only bin and src folders, which contain *.class bytecodes and *.java source codes.
So my questions are:
If Java interprets bytecode all the time, why not translate bytecode to machine code after the first run?
If they do generate machine code, where are the files?
Not an option since the environment can change between runs (e.g. upgrade of JVM)
In memory (or serialized to disk when needed)
If Java interprets bytecode all the time, why not translate bytecode
to machine code after the first run?
There are pros and cons to both ahead of time (AOT) and just in time (JIT) compilation.
The main advantage of AOT is that the compiler is generally allowed to take longer, so it can perform more sophisticated analysis and optimization. Another advantage is that the compiler doesn't have to be present at runtime on the target machine. The disadvantages are everything else.
The main advantage of JIT is that the compiler is able to make optimizations based on information known only at runtime. In fact, it is even possible to unoptimize and reoptimize code when conditions change. Furthermore, the JIT doesn't have to waste time optimizing code that is never or rarely run, unlike the AOT compiler.
Some languages are designed to favor one approach over the other. For example, C/C++ are designed for AOT, while Java is designed for JIT (though it can be compiled AOT with some restrictions). For example, Java has a heavy emphasis on virtual getters and setters, possibly for classes not loaded until runtime. But the JIT can see and inline these functions at runtime. By contrast, if you used virtual methods for every field access in C++, you'd pay a huge performance penalty.
It doesn't interpret code all the time. Interpreted code is translated into byte code after some time. You can tweak this "time" using -XX:CompileThreshold= (default is 10000) or you can turn off compilation completely.
In memory. There's a special area in memory called "Code cache". You can see how methods a compiled into the cache and how they are evicted from the cache using -XX:+PrintCompilation. The size of the cache is also configurable, see -XX:ReservedCodeCacheSize=.
Well, the JVM has preprocessed data but only for its own classes. Given the size of the JRE library and the fact that it usually doesn’t change, it’s a big win (you might look for files called classes.jsa).
However, even these files are not containing native code but only easier-to-process byte code.
The big point about code generation in Hot Spot JVMs is that they don’t compile code on a class or method basis as you seem to think. These JVMs compile code fragments spanning multiple interacting methods as the interaction is discovered during the self-profiling. These code blocks may span methods from the JRE, the extension libraries, 3rd party libraries in your class path and your application classes and hence are only valid for this specific combination.
During the compilation the information gathered about your program’s behavior will be used, e.g. code paths not taken might be elided and conditionals might be asserted to evaluate to a certain result as they did in previous evaluations. This yields to a high performance but it might happen that the JVM has to drop the code even during the same execution when one of the assertions does not hold anymore, e.g. the program might take a code path it didn’t before or a new class has been loaded into the JVM which extends a class whose code has been optimized as-if having no subclasses, etc.
So if optimized and compiled code might be rendered obsolete even within the same environment, it is even much likelier to be obsolete in the next execution. In the end, the JVM would have to check whether the old code is still appropriate which might turn out to be even costlier than simply gathering the new environment’s data and program behavior.
As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 9 years ago.
It seems like anything you can do with bytecode you can do just as easily and much faster in native code. In theory, you could even retain platform and language independence by distributing programs and libraries in bytecode then compiling to native code at installation, rather than JITing it.
So in general, when would you want to execute bytecode instead of native?
Hank Shiffman from SGI said (a long time ago, but it's till true):
There are three advantages of Java
using byte code instead of going to
the native code of the system:
Portability: Each kind of computer has its unique instruction
set. While some processors include the
instructions for their predecessors,
it's generally true that a program
that runs on one kind of computer
won't run on any other. Add in the
services provided by the operating
system, which each system describes in
its own unique way, and you have a
compatibility problem. In general, you
can't write and compile a program for
one kind of system and run it on any
other without a lot of work. Java gets
around this limitation by inserting
its virtual machine between the
application and the real environment
(computer + operating system). If an
application is compiled to Java byte
code and that byte code is interpreted
the same way in every environment then
you can write a single program which
will work on all the different
platforms where Java is supported.
(That's the theory, anyway. In
practice there are always small
incompatibilities lying in wait for
the programmer.)
Security: One of Java's virtues is its integration into the Web. Load
a web page that uses Java into your
browser and the Java code is
automatically downloaded and executed.
But what if the code destroys files,
whether through malice or sloppiness
on the programmer's part? Java
prevents downloaded applets from doing
anything destructive by disallowing
potentially dangerous operations.
Before it allows the code to run it
examines it for attempts to bypass
security. It verifies that data is
used consistently: code that
manipulates a data item as an integer
at one stage and then tries to use it
as a pointer later will be caught and
prevented from executing. (The Java
language doesn't allow pointer
arithmetic, so you can't write Java
code to do what we just described.
However, there is nothing to prevent
someone from writing destructive byte
code themselves using a hexadecimal
editor or even building a Java byte
code assembler.) It generally isn't
possible to analyze a program's
machine code before execution and
determine whether it does anything
bad. Tricks like writing
self-modifying code mean that the evil
operations may not even exist until
later. But Java byte code was designed
for this kind of validation: it
doesn't have the instructions a
malicious programmer would use to hide
their assault.
Size: In the microprocessor world RISC is generally preferable
over CISC. It's better to have a small
instruction set and use many fast
instructions to do a job than to have
many complex operations implemented as
single instructions. RISC designs
require fewer gates on the chip to
implement their instructions, allowing
for more room for pipelines and other
techniques to make each instruction
faster. In an interpreter, however,
none of this matters. If you want to
implement a single instruction for the
switch statement with a variable
length depending on the number of case
clauses, there's no reason not to do
so. In fact, a complex instruction set
is an advantage for a web-based
language: it means that the same
program will be smaller (fewer
instructions of greater complexity),
which means less time to transfer
across our speed-limited network.
So when considering byte code vs native, consider which trade-offs you want to make between portability, security, size, and execution speed. If speed is the only important factor, go native. If any of the others are more important, go with bytecode.
I'll also add that maintaining a series of OS and architecture-targeted compilations of the same code base for every release can become very tedious. It's a huge win to use the same Java bytecode on multiple platforms and have it "just work."
The performance of essentially any program will improve if it is compiled, executed with profiling, and the results fed back into the compiler for a second pass. The code paths which are actually used will be more aggressively optimized, loops unrolled to exactly the right degree, and the hot instruction paths arranged to maximize I$ hits.
All good stuff, yet it is almost never done because it is annoying to go through so many steps to build a binary.
This is the advantage of running the bytecode for a while before compiling it to native code: profiling information is automatically available. The result after Just-In-Time compilation is highly optimized native code for the specific data the program is processing.
Being able to run the bytecode also enables more aggressive native optimization than a static compiler could safely use. For example if one of the arguments to a function is noted to always be NULL, all handling for that argument can simply be omitted from the native code. There will be a brief validity check of the arguments in the function prologue, if that argument is not NULL the VM aborts back to the bytecode and starts profiling again.
Bytecode creates an extra level of indirection.
The advantages of this extra level of indirection are:
Platform independence
Can create any number of programming languages (syntax) and have them compile down to the same bytecode.
Could easily create cross language converters
x86, x64, and IA64 no longer need to be compiled as seperate binaries. Only the proper virtual machine needs to be installed.
Each OS simply needs to create a virtual machine and it will have support for the same program.
Just in time compilation allows you to update a program just by replacing a single patched source file. (Very beneficial for web pages)
Some of the disadvantages:
Performance
Easier to decompile
All good answers, but my hot-button has been hit - performance.
If the code being run spends all its time calling library/system routines - file operations, database operations, sending windows messages, then it doesn't matter very much if it's JITted, because most of the clock time is spent waiting for those lower-level operations to complete.
However, if the code contains things we usually call "algorithms", that have to be fast and don't spend much time calling functions, and if those are used often enough to be a performance problem, then JIT is very important.
I think you just answered your own question: platform independence. Platform-independent bytecode is produced and distributed to its target platform. When executed it's quickly compiled to native code either before execution begins, or simultaneously (Just In Time). The Java JVM and presumably the .NET runtimes operate on this principle.
Here: http://slashdot.org/developers/02/01/31/013247.shtml
Go see what the geeks of Slashdot have to say about it! Little dated, but very good comments!
Ideally you would have portable bytecode that compiles Just In Time to native code. I think the reason bytecode interpreters exist without JIT is due primarily to the practical fact that native code compilation adds complexity to a virtual machine. It takes time to build, debug, and maintain that additional component. Not everyone has the time or resources to make that commitment.
A secondary factor is safety. It's much easier to verify an interpreter won't crash than to guarantee the same for native code.
Third is performance. It can often take more time to generate machine code than to interpret bytecode for small pieces of code that only run once.
Portability and platform independence are probably the most notable advantages of bytecode over native code.