This question has received a total of several paragraphs of answer. Here is the only sentence that actually tells me what I was looking for:
Your examples would make little difference since intermediate computations need to be stored temporarily on the stack so they can be used later on.
In fact, it answers my question perfectly and completely =)
Unlike all the cruft telling me "nooo don't ask that question". >_<
Like if you have a method, and you change it by increasing the number of local variables but make no other changes, does it make the method slower? Here's an example:
void makeWindow() {
    Display
        .getContext()
        .windowBuilder()
        .setSize(800, 600)
        .setBalloonAnimal(BalloonAnimal.ELDER_GOD.withColor(PUCE))
        .build();
}
or
void makeWindow() {
    DisplayContext dc = Display.getContext();
    WindowBuilder wb = dc.windowBuilder();
    BalloonAnimal god = BalloonAnimal.ELDER_GOD;
    BalloonAnimal puceGod = god.withColor(PUCE);
    wb.setSize(800, 600).setBalloonAnimal(puceGod).build();
}
Another example:
int beAnExample(int quiche) {
    return areGlobalsEvil ?
        quiche * TAU / 5 :
        highway(quiche, Globals.frenchFrenchRevolution);
}
or
int beAnExample(int quiche) {
    if (areGlobalsEvil) {
        int to5 = TAU / 5;
        int result = quiche * to5;
        return result;
    } else {
        Game french = Globals.frenchFrenchRevolution;
        int result = highway(quiche, french);
        return result;
    }
}
Really, what I'm asking is: Is the number of this sort of local variable even relevant by the time the method's compiled to bytecode? If so, what about once Hotspot gets to work on it?
This question is relevant to the code generator I'm working on.
The easy answer is no. Local variables consume runtime stack space. Allocating space for them only marginally increases the number of instructions. Your examples would make little difference since intermediate computations need to be stored temporarily on the stack so they can be used later on. Focus more on the readability of your programs rather than needless micro-optimizations.
If you're interested in looking at the actual bytecode of a class, investigate the javap program.
Don't worry about it. The compiler can do all sorts of crazy, make-your-head-asplode optimizations. Start with code that's correct and maintainable. Programmer time is worth far more than processor time.
Test it by running each method 1,000,000 times and divide the total time to calculate the cost per execution. In all likelihood, it won't be noticeable.
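For instance, a minimal timing harness might look like the sketch below (illustrative only: it assumes the makeWindow method from the question, the timeIt name is made up, and a real measurement would be better done with a tool such as JMH):
void timeIt() {
    final int iterations = 1_000_000;
    // Warm-up pass so the JIT has a chance to compile makeWindow first.
    for (int i = 0; i < iterations; i++) {
        makeWindow();
    }
    long start = System.nanoTime();
    for (int i = 0; i < iterations; i++) {
        makeWindow();
    }
    long elapsed = System.nanoTime() - start;
    System.out.println("approx. cost per call: " + (elapsed / (double) iterations) + " ns");
}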
Actually, Java compilers may even be smart enough to just compile it out.
Write your code for readability to reduce long term cost of maintenance. Then tune it in the 5% of places where you really need to.
The chances are that it will make little (if any) difference, and the "little" will be insignificant.
Focus on making your generator correct and maintainable, and let the Java compiler (particularly the JIT compiler) do the micro-optimization of the generated code.
Note that @Edawg's advice on looking at the bytecode is not necessarily helpful. The JIT compiler aggressively optimizes the native code that it generates from the bytecodes. It can be difficult to predict which of two bytecode sequences is going to be faster. Certainly, counting bytecodes, method calls and so on can be misleading.
The only way to be sure that you are generating "optimal" Java source code would be to compile it and benchmark it on your target platform. Even then, there's a good chance that your efforts at source-code level optimization will be negated ... by JIT compiler changes.
This is not a hotspot issue. There may need to be additional byte codes to load and store the local variables, but let the compiler worry about optimizing that.
You should concentrate on issues like null pointer checking in your generated code and how to report errors in a meaningful way that is tied to the source input that you are code generating from.
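As a purely illustrative sketch (it reuses the makeWindow names from the question, and the placeholder source location stands in for whatever position information your generator tracks), generated code with such checks might look like:
DisplayContext dc = Display.getContext();
if (dc == null) {
    // Report the failure in terms of the input the code was generated from.
    throw new IllegalStateException(
        "Display.getContext() returned null (generated from <input spec>, line <n>)");
}
WindowBuilder wb = dc.windowBuilder();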
Related
I have been pulled into a performance investigation of some code similar to the following:
private void someMethod(String id) {
    boolean isHidden = someList.contains(id);
    boolean isDisabled = this.checkIfDisabled(id);
    if (isHidden && isDisabled) {
        // Do something here
    }
}
When I started investigating it, I was hoping that the compiled version would look like this:
private void someMethod(String id) {
    if (someList.contains(id) && this.checkIfDisabled(id)) {
        // Do something here
    }
}
However, to my surprise, the compiled version looks exactly like the first one, local variables and all, which means the checkIfDisabled call behind isDisabled is always made, and that is where the performance problem lies.
My solution was to inline it myself, so the method now short circuits at isHidden, but it left me wondering: Why isn't the Java Compiler smart enough in this case to inline these calls for me? Does it really need to have the local variables in place?
Thank you :)
First: the Java compiler (javac) does almost no optimizations; that job is almost entirely done by the JVM itself at runtime.
Second: optimizations like that can only be done when there is no observable difference in behaviour of the optimized code vs. the un-optimized code.
Since we don't know (and the compiler presumably also doesn't know) if checkIfDisabled has any observable side-effects, it has to assume that it might. Therefore even when the return value of that method is known to not be needed, the call to the method can't be optimized away.
There is, however, an option for this kind of optimization to be done at runtime: if the body (or bodies, due to polymorphism) of the checkIfDisabled method is simple enough, then it's quite possible that the runtime can actually optimize away that code, if it recognizes that the calls never have a side effect (but I don't know if any JVM actually does this specific kind of optimization).
But that optimization is only possible at a point where there is definite information about what checkIfDisabled does. And due to the dynamic class-loading nature of Java that basically means it's almost never during compile time.
Generally speaking, while some minor optimizations could possibly be done during compile time, the range of possible optimizations is much larger at runtime (due to the much increased amount of information about the code available), so the Java designers decided to put basically all optimization effort into the runtime part of the system.
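To make the side-effect point concrete, here is a hedged sketch of a hypothetical checkIfDisabled (the auditLog and disabledIds fields are invented purely for illustration): if removing the call would change what the program does, neither javac nor the JIT may drop it.
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class Example {
    private final Set<String> disabledIds = new HashSet<>();
    private final List<String> auditLog = new ArrayList<>();

    boolean checkIfDisabled(String id) {
        // Observable side effect: this write must happen even if the caller
        // never reads the returned value, so the call cannot be optimized away
        // unless the compiler can prove the method is pure.
        auditLog.add("checked " + id);
        return disabledIds.contains(id);
    }
}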
The most-obvious solution to this problem is simply to rewrite the code something like this:
if (someList.contains(id)) {
    if (this.checkIfDisabled(id)) {
        // do something here
    }
}
If, in your human estimation of the problem, one test is likely to mean that the other test does not need to be performed at all, then simply "write it that way."
Java compiler optimizations are tricky. Most optimizations are done at runtime by the JIT compiler. There are several levels of optimizations; by default, the maximum level is applied after 5000 method invocations. But it is rather hard to see which optimizations are applied, since the JIT compiles the code directly into the platform's native code.
I am currently working on code that will have hundreds of thousands of iterations and want to know if modern Java compilers automatically handle intermediate values during optimization into assembly.
For instance, I have the following code in the loop (simplified):
arrayA[i] += doubleA*doubleB;
arrayB[i] += doubleA*doubleB;
Is a modern Java compiler 'intelligent' enough to store doubleA*doubleB into a multiplication register (and then proceed to read from the multiplication register for the second array, avoiding a second floating point operation)? Or, would I be better off with the following:
double product = doubleA*doubleB;
arrayA[i] += product;
arrayB[i] += product;
For the second option, I would primarily be concerned about the overhead of Java's garbage collector dealing with the product variable every single time it goes out of scope.
If you are running the code millions of times it is highly probable that the code will be JIT compiled. If you want to see the JIT output and verify that it is being natively compiled, you can enable that with a JVM flag (-XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly; you will also have to build the hsdis disassembler library beforehand, since it doesn't come pre-packaged due to licensing issues).
When the JIT compiles code into native machine code it will usually perform optimizations on the code. There is also a flag that makes it optimize more and more aggressively over time as the usage count goes up. Note that JIT compilation won't usually occur until the method has been executed around 10,000 times, and unfortunately there is no way to force the JIT to compile code at program launch. The JIT shouldn't add noticeable overhead: it will typically compile the code in the background on another thread and then swap in the native code when it is finished (the compilation itself should still take less than half a second).
As for storing the result in a double, that won't have any negative performance impact. You also don't need to worry about the GC for that: since it is a primitive, it lives on the stack and is popped off when the scope exits (the variable will simply be re-declared in the next loop iteration).
You'll practically never know what a JIT does, but you can easily look at the bytecode with javap. If javac/the IDE didn't optimize it, I won't presume the JIT will. Just write good code; it's easier on the eyes anyway.
I want to do the following
int sum = x+y;
sum = Math.max(sum,x);
but that line of code tends to take longer than
int sum = x+y;
if(x>sum)sum=x;
I hope this is not inappropriate to ask, but can someone explain why this is?
I already looked in the source code and all Java is doing is
return (a >= b) ? a : b;
Maybe because Java's Math class is being loaded and initialized the first time it is used, like any other class (or singleton) that nothing has touched before, so the class-loading work is charged to that first call.
Method calls aren't free (even ignoring the potential class load that Roey pointed out): They involve pushing the arguments and a return address on the stack, jumping to a different location in the code, popping the arguments off the stack, doing the work, pushing the result on the stack, jumping back, and popping the result off the stack.
However, I suspect you'd find that if you had a Math.max call in a hotspot in your code (a place that was run a LOT), Oracle's JVM's JIT would optimize it into an inline operation to speed it up. It won't bother if there doesn't seem to be any need, preferring speed of compilation of bytecode to machine code over optimization; but it's a two-stage compiler, where the second stage kicks in to more aggressively optimize hotspots it detects in the code.
Microbenchmarking in Java is a very hard job in general. The example and statement cannot be generalized and, as usual, the answer to your question is "it depends". ;) First of all, the source code you see in the JDK implementation of Math.max is a fallback that is not actually used on modern hardware: the JIT compiler replaces the call with an intrinsic CPU instruction.
This of course does not answer your question of why your code is 'faster' now. Probably it was not executed at all, because of dead-code elimination, a compiler feature. Can you give us some surrounding code? Details about how often it is called are useful as well, as are details about the hardware. Also very important: disable all power-save features and all background tasks if you do measurements. Best is to use something like JMH.
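For example, a minimal JMH sketch for this comparison could look like the following (illustrative only: the class name and field values are made up, and returning the result lets JMH consume it so dead-code elimination doesn't remove the work):
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;

@State(Scope.Thread)
public class MaxBenchmark {
    public int x = 123;
    public int y = 456;

    @Benchmark
    public int mathMax() {
        int sum = x + y;
        return Math.max(sum, x);
    }

    @Benchmark
    public int manualMax() {
        int sum = x + y;
        if (x > sum) sum = x;
        return sum;
    }
}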
Cheers Benni
Consider the following method:
private static long maskAndNegate(long l) {
    int numberOfLeadingZeros = Long.numberOfLeadingZeros(l);
    long mask = CustomBitSet.masks[numberOfLeadingZeros];
    long result = (~l) & mask;
    return result;
}
The method can be abbreviated to:
private static long maskAndNegate(long l) {
    return (~l) & CustomBitSet.masks[Long.numberOfLeadingZeros(l)];
}
Are these two representations equal in actual run time? In other words, does the Java compiler optimize away the unnecessary definition of extra variables, that I've placed for readability and debugging?
The Java compiler itself hardly does any optimization. It's the JIT that does almost everything.
Local variables themselves are somewhat irrelevant to optimization, though - a multi-operator expression still logically needs the various operands to go on the stack, just in unnamed "slots". You may well find that the generated bytecode for your two implementations is very similar, just without the names in the second case.
More importantly, any performance benefit that might occur very occasionally from reducing the number of local variables you use is almost certainly going to be insignificant. The readability benefit of the first method is much more likely to be significant. As always, avoid micro-optimizing without first having evidence that the place you're trying to optimize is a bottleneck, and then only allow optimizations which have proved their worth.
(By the time you've proved you need to optimize a particular method, you'll already have the tools to test any potential optimization, so you won't need to guess.)
For starters, the code is not large enough for this to matter. In the second version you are only saving the local-variable slots used for numberOfLeadingZeros, mask and result.
But when this code runs often enough at runtime (on the order of 10,000 invocations), the JIT will identify it as hot code and optimize it with neat tricks such as method inlining.
In your case the first version is preferable anyway, as it is more readable. You should not compromise readability for small optimizations.
double calcTaxAmount() {
    double price = getA() * getB() + getC();
    double taxRate = getD() + getE();
    return price * taxRate;
}
The function above calculates the amount of tax payment.
The price and the rate are calculated by calling some other functions.
I introduced two local variables price and taxRate to just improve code readability, so both will be used only once.
Will those kinds of "one-time" local variables be substituted and inlined at compile time with most modern compilers?
Obviously, it depends on the compiler. Quite a few compilers are actually brain-dead when it comes to optimization, because they are dealing with dynamic languages which are complex enough that most optimizations are invalid and many others are only safe if very restrictive conditions are met (for instance, any function call could have nearly any effect). For instance, all Python implementations feature a compiler, but most of them only do very few peephole optimizations, which may not be sufficient to eliminate all overhead.
That said, if you're talking about statically typed languages (which your example hints at), then usually yes. Liveness analysis can detect the equivalence (you still need a storage location for the value, but its lifetime is the same), and any reasonable register allocator can avoid spilling the values needlessly.
That said, this is a really bad focus for optimization. If you actually want to make stuff faster, look at the final code and profile it with realistic scenarios. And if you're going to micro-optimize, apply some common sense. Even assuming this function is a hotspot, the actual computation and fetching the values may easily take 100x more time. A non-inlined function call takes pretty long compared to a stack store, and a cache miss is also pretty costly at this level.
Generally yes.
Java only compiles code to native code after it has been called many times (10,000 times by default). If the method is not called very often, it won't make much difference in any case.
Even if it makes a difference of, say, 1 ns each time, you would need to call this method 2 billion times to add a delay of 2 seconds. If it's only 10 million times, you are unlikely to notice the difference.
As long as the compiler can prove that they aren't aliased and modified externally, the compiler should be able to optimize them away (and I suspect that it can determine that here).
If you make them const I can't think of a compiler that couldn't optimize that.
All that said, this sounds like premature optimization and I wouldn't change the code even if it's fractionally slower because it adds to clarity.
Depends entirely on the compiler for C. Presumably yes for current compilers with proper optimization options turned on.
For Java it will not be optimized by the compiler (javac), but it may get optimized by the JIT when the code actually executes.
That being said, those local variables add very little overhead anyway. If the compiler decides to optimize the expressions to the equivalent of:
return (getA() * getB() + getC()) * (getD() + getE());
It will still require some form of temporary storage (stack or register) to store the intermediate results of the subexpressions. So it shouldn't make much of a difference anyway.
I wouldn't worry about it and go with what offers better readability.