Why is Math.max so expensive in Java?

I want to do the following
int sum = x+y;
sum = Math.max(sum,x);
but that line of code tends to take longer than
int sum = x+y;
if(x>sum)sum=x;
I hope this is not inappropriate to ask, but can someone explain why this is?
I already looked in the source code and all Java is doing is
return (a >= b) ? a : b;

Maybe it's because Java's Math class has to be loaded and initialized the first time it is used, like any other class (or singleton): nothing had touched it before that point, so the class loader has to do its work first.

Method calls aren't free (even ignoring the potential class load that Roey pointed out): They involve pushing the arguments and a return address on the stack, jumping to a different location in the code, popping the arguments off the stack, doing the work, pushing the result on the stack, jumping back, and popping the result off the stack.
However, I suspect you'd find that if you had a Math.max call in a hotspot in your code (a place that was run a LOT), Oracle's JVM's JIT would optimize it into an inline operation to speed it up. It won't bother if there doesn't seem to be any need, preferring speed of compilation of bytecode to machine code over optimization; but it's a two-stage compiler, where the second stage kicks in to more aggressively optimize hotspots it detects in the code.

Microbenchmarking in Java is a very hard job in general. The example and statement cannot be generalized and, as usual, the answer to your question is "it depends". ;) First of all, the source code you see in the JDK implementation of Math.max is a default fallback, which is not used at all on modern hardware: the compiler replaces the call with a CPU operation (an intrinsic). Read more here.
This of course does not answer the question of why your code is 'faster' now. Probably it was not executed at all, because of dead code elimination, a compiler feature. Can you give us some surrounding code? Details about how often it is called would be useful as well, and details about the hardware. Also very important: disable all power-saving features and all background tasks if you do 'measurements'. Best is to use something like JMH.
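For example, a minimal JMH sketch of this comparison could look like the class below (the class and field names are made up, and it assumes the JMH annotations are on the classpath and the class is run through the JMH harness). Returning the result from each benchmark method keeps dead code elimination from throwing the measured work away:
import java.util.concurrent.TimeUnit;
import org.openjdk.jmh.annotations.*;

@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@State(Scope.Thread)
public class MaxBenchmark {
    int x = 123;
    int y = 456;

    @Benchmark
    public int mathMax() {
        // the library call the question is about
        int sum = x + y;
        return Math.max(sum, x);
    }

    @Benchmark
    public int manualMax() {
        // the hand-written comparison from the question
        int sum = x + y;
        if (x > sum) sum = x;
        return sum;
    }
}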
Cheers Benni

Related

variable declaration inside/outside loop

If for some reason I have to use as little memory as possible, is the second code below better than the first? (The code is just for illustration and doesn't have any meaning.) (Edit: imagine I want to produce the assembly/bytecode for this before any optimization is done by the JVM. Do I then use 99999998 extra memory locations in the first procedure compared to the second one? The focus is just on memory usage.)
First:
for(int i=0; i<99999999; i++){
    int k = 2*i;
}
Second:
int k = 0;
for(int i=0; i<99999999; i++){
    k = 2*i;
}
What I'm sure of :
In every case, the difference will not be visible. If you want to make such a small optimization, Java is surely not the best technology, which makes me recommend the first one because it makes the code more readable and logical. (I find it strange to declare a variable outside the for loop if you only use it inside; it's confusing.)
What I think :
In your small example, and since you're looking at memory footprint, the first one is better because it follows the implicit rule mentioned in this comment: the smaller the scope, the better.
In the first case the variable k is used only in a really small loop, so the optimizer will easily understand it and use only a register: no memory usage and fewer instructions.
In the second case it is harder for the optimizer to determine that k is not used elsewhere, so it could allocate memory (a stack slot) instead of using a register. It would then use memory and be less optimized, since it needs extra instructions to load and store that memory.
As mentioned in this comment, it will mostly depend on how you use it. In your example the optimizer will detect that the usage is the same in both cases and will use no memory, but in more complex code it will not always work that out. So I recommend keeping the scope as small as possible.
@pranay-khandelwal's response links to an excellent discussion of this question in a different language. The Java JVM, however, recompiles and optimizes bytecode at runtime to try to achieve better performance, which can complicate things.
There's actually another discussion here on a similar topic of in/out best practices in general for readability, which resulted in some benchmarks and discussion that one of the participants documented here
As a general rule of thumb the second option will be better for memory and performance under almost all circumstances - where the former may be more self-documenting and maintainable, and avoid accidental use elsewhere in the outer scope.
Since you mention that the code in your post is only a representative example and this could apply to more than just simple types:
Replacing the contents of an already registered memory area is less costly than registering a new one and deregistering the old one (or waiting for it to be garbage collected). Even where a language's compiler or interpreter smartly uses recently unreferenced things to store new things, that work also takes work, and is overhead that can optimistically be avoided with outer declarations - though as others mention, this is usually an unnecessary and potentially bug-spawning micro-optimization.
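To illustrate that last point for non-primitive types, here is a hedged sketch (class and method names invented) that contrasts allocating a fresh StringBuilder on every iteration with reusing one declared outside the loop; whether the difference matters at all depends on the workload and on what the JIT and garbage collector do with it:
import java.util.List;

public class ReuseExample {
    static void freshEachIteration(List<String> names) {
        for (String name : names) {
            StringBuilder sb = new StringBuilder();   // new object every iteration
            sb.append("Hello, ").append(name);
            System.out.println(sb);
        }
    }

    static void reuseOneBuffer(List<String> names) {
        StringBuilder sb = new StringBuilder();       // allocated once, reused
        for (String name : names) {
            sb.setLength(0);                          // reset the buffer instead of reallocating
            sb.append("Hello, ").append(name);
            System.out.println(sb);
        }
    }

    public static void main(String[] args) {
        List<String> names = List.of("Ada", "Linus");
        freshEachIteration(names);
        reuseOneBuffer(names);
    }
}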
Judging from the short examples you provided: the second option. However, it always depends on the logic of your code.
Thinking about performance and minimising execution space and time, the second code scales better even though it seems to go against some good coding practices.
k in your code is used only inside the loop block; however, it is "reused" over multiple iterations. Take a look at the syntax of your for loop: it declares i (int i) at the beginning of the statement, and that declaration happens just once. Declaring a variable multiple times may lead to wasted time and memory.
The JVM optimiser might do a good job in general, simple cases. However, it might fail to capture the semantics of your code (Java).
for(int i=0, k=0; i<99999999; i++){
    k = 2*i;
}
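If you want to see what javac itself emits for the two original variants (before the JIT gets involved), you can compile a small self-contained class like the hypothetical one below and compare the bytecode of the two methods with javap -c LoopScope; the class and method names are made up for illustration:
public class LoopScope {
    static void declaredInside() {
        for (int i = 0; i < 99999999; i++) {
            int k = 2 * i;    // declared inside the loop body
        }
    }

    static void declaredOutside() {
        int k = 0;
        for (int i = 0; i < 99999999; i++) {
            k = 2 * i;        // reuses the declaration outside the loop
        }
    }
}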

Does adding local variables to methods make them slower?

This question has received a total of several paragraphs of answer. Here is the only sentence that actually tells me what I was looking for:
Your examples would make little difference since intermediate computations need to be stored temporarily on the stack so they can be used later on.
In fact, it answers my question perfectly and completely =)
Unlike all the cruft telling me "nooo don't ask that question". >_<
Like if you have a method, and you change it by increasing the number of local variables but make no other changes, does it make the method slower? Here's an example:
void makeWindow() {
    Display
        .getContext()
        .windowBuilder()
        .setSize(800, 600)
        .setBalloonAnimal(BalloonAnimal.ELDER_GOD.withColor(PUCE))
        .build();
}
or
void makeWindow() {
    DisplayContext dc = Display.getContext();
    WindowBuilder wb = dc.windowBuilder();
    BalloonAnimal god = BalloonAnimal.ELDER_GOD;
    BalloonAnimal puceGod = god.withColor(PUCE);
    wb.setSize(800, 600).setBalloonAnimal(puceGod).build();
}
Another example:
int beAnExample(int quiche) {
return areGlobalsEvil?
quiche * TAU/5:
highway(quiche, Globals.frenchFrenchRevolution);
}
or
int beAnExample(int quiche) {
    if (areGlobalsEvil) {
        int to5 = TAU/5;
        int result = quiche * to5;
        return result;
    } else {
        Game french = Globals.frenchFrenchRevolution;
        int result = highway(quiche, french);
        return result;
    }
}
Really, what I'm asking is: Is the number of this sort of local variable even relevant by the time the method's compiled to bytecode? If so, what about once Hotspot gets to work on it?
This question is relevant to the code generator I'm working on.
The easy answer is no. Local variables consume runtime stack space. Allocating space for them only marginally increases the number of instructions. Your examples would make little difference since intermediate computations need to be stored temporarily on the stack so they can be used later on. Focus more on the readability of your programs rather than needless micro-optimizations.
If you're interested in looking at the actual bytecode of a class, investigate the javap program (for example, javap -c YourClass prints the compiled bytecode).
Don't worry about it. The compiler can do all sorts of crazy, make-your-head-asplode optimizations. Start with code that's correct and maintainable. Programmer time is worth far more than processor time.
Test it by running each method 1,000,000 times and divide the total time to calculate the cost per execution. In all likelihood, it won't be noticeable.
Actually, Java compilers may even be smart enough to just compile it out.
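A rough, hand-rolled sketch of that kind of timing loop is shown below (JMH would be more trustworthy; the method names and arithmetic are invented, and the checksum is printed so the JIT cannot discard the work as dead code):
public class LocalsBench {
    // the same computation written with and without intermediate locals
    static int withLocals(int x) {
        int a = x * 3;
        int b = a + 7;
        return b;
    }

    static int withoutLocals(int x) {
        return x * 3 + 7;
    }

    public static void main(String[] args) {
        long sum = 0;
        long start = System.nanoTime();
        for (int i = 0; i < 1_000_000; i++) {
            sum += withLocals(i);
        }
        long mid = System.nanoTime();
        for (int i = 0; i < 1_000_000; i++) {
            sum += withoutLocals(i);
        }
        long end = System.nanoTime();
        System.out.println("withLocals:    " + (mid - start) / 1_000_000.0 + " ns/call");
        System.out.println("withoutLocals: " + (end - mid) / 1_000_000.0 + " ns/call");
        System.out.println("checksum: " + sum);   // keeps the loops from being eliminated
    }
}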
Write your code for readability to reduce long term cost of maintenance. Then tune it in the 5% of places where you really need to.
The chances are that it will make little (if any) difference, and the "little" will be insignificant.
Focus on making your generator correct and maintainable, and let the Java compiler (particularly the JIT compiler) do the micro-optimization of the generated code.
Note that @Edawg's advice on looking at the bytecode is not necessarily helpful. The JIT compiler aggressively optimizes the native code that it generates from the bytecodes. It can be difficult to predict which of two bytecode sequences is going to be faster. Certainly, counting bytecodes, method calls and so on can be misleading.
The only way to be sure that you are generating "optimal" Java source code would be to compile it and benchmark it on your target platform. Even then, there's a good chance that your efforts at source-code level optimization will be negated ... by JIT compiler changes.
This is not a hotspot issue. There may need to be additional byte codes to load and store the local variables, but let the compiler worry about optimizing that.
You should concentrate on issues like null pointer checking in your generated code and how to report errors in a meaningful way that is tied to the source input that you are code generating from.

Will the Java optimizer remove parameter construction for empty method calls?

Suppose I have code like :
log.info("Now the amount" + amount + " seems a bit high")
and I would replace the log method with a dummy implementation like :
class Logger {
    ...
    public void info(String message) {}
}
Will the optimizer inline it and will the dead code removal remove the parameter construction if no side effects are detected?
I would guess that the compiler (javac) will not, but that the just in time compiler very likely will.
For it to work, however, it has to be able to deduce that whatever instructions you use to generate the parameter have no side effects. It should be able to do that for the special case of strings, but it might not for other methods.
To be sure, make a small benchmark that compares this with the code that Jesper suggested and see which is faster, or whether they are equally fast.
Also see here: http://www.ibm.com/developerworks/java/library/j-jtp12214/index.html#3.0
The important answer is: it might do.
I don't think you should be relying on any particular behaviour of the optimiser. If your code runs acceptably fast without optimisation, then you may or may not get a performance boost, but the good news is your solution will be fine regardless. If performance is dreadful when non-optimised, it's not a good solution.
The problem with relying on the optimiser is that you're setting yourself up to conform to an invisible contract that you have no idea of. Hotspot will perform differently across JDK/JRE versions, so there's no guarantee that just because it runs fine on your exact JVM, it'll run fine elsewhere. Beyond that, the exact optimisations that take place may depend on environmental factors such as the amount of free heap, the number of cores on the machine, etc.
And even if you manage to confirm it works fine in your situation right now, you've just made your codebase incredibly unstable. I know for a fact that one of the optimisations/inlinings that Hotspot does depends on the number of subclasses loaded and used for a non-final class. If you start using another library/module which loads a second implementation of log - BANG, the optimisation gets unwound and performance is terrible again. And good luck working out how adding a third party library to your classpath toasts your app's internal performance...
Anyway, I don't think you're asking the real question. In the case you've described, the better solution is not to change the info method's implementation, but change the calls to no-ops (i.e. comment them out). If you want to do this across a whole slew of classes at once, at compile time, you can use a type of IFDEF like so:
public class Log
{
    public static final boolean USE_INFO = true;

    public void info(String message)
    {
        ...
    }
    ...
}
and then in your class:
if (Log.USE_INFO)
{
    log.info("Now the amount" + amount + " seems a bit high");
}
Now the compiler (javac, not Hotspot) can see that the boolean condition is constant and will elide it when compiling. If you set the boolean flag to false and recompile, then all of the info statements will be stripped from the file completely as javac can tell that they will never be called.
If you want to be able to enable/disable info logging at runtime, then instead of a constant boolean flag you'll need to use a method, and then there's nothing for it but to call that method every time. Depending on the implementation of the method, Hotspot may be able to optimise the check, but as I mentioned above, don't count on it.
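As a hedged sketch of that runtime variant (the names here are invented), the usual pattern is to expose the flag through a query method and guard the call site with it, so the message string is not even built when info logging is off:
public class Log
{
    private static volatile boolean infoEnabled = true;

    public static void setInfoEnabled(boolean enabled) { infoEnabled = enabled; }

    public static boolean isInfoEnabled() { return infoEnabled; }

    public static void info(String message)
    {
        System.out.println("INFO: " + message);   // stand-in for the real logging
    }

    public static void main(String[] args)
    {
        int amount = 1_000_000;
        Log.setInfoEnabled(false);
        if (Log.isInfoEnabled())   // Hotspot may or may not fold this check away; don't count on it
        {
            Log.info("Now the amount " + amount + " seems a bit high");
        }
    }
}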

does a Java getter incur a performance penalty

If I have the code
int getA(){
return a;
}
and then do something like
int b = obj.getA();
instead of
int b = obj.a;
will that mean that the stack will have to be pushed and popped ultimately slowing down my code?
The JIT compiler will inline the method.
The code should look like
int b = obj.getA();
I have two answers for you:
I don't think that there is a significant performance penalty for using the getter vs accessing the variable directly. I would worry more about how understandable and readable the code is than performance for this sort of decision.
According to OO design principles, which may or may not be important to you, you would normally hide the data and provide the getter method to access it—there is a detailed discussion on the merits of this approach here.
Theoretically there is some runtime penalty, due to a method call being made. In reality, this has very little effect on the overall performance due to two reasons:
Unless the obj.getA() is taking place inside the inner-most loop of your program, its effect on the overall performance of your code will be negligible. When performance is an issue you should consider the bottlenecks of your code; there's no point in optimizing code that is not at these hot spots. In order to identify these spots you need to analyze the execution of your code with a profiler.
As @Michael was saying, the JVM uses a "Just In Time" compiler/optimizer that inlines code based on actual execution. It performs exactly this kind of optimization (see this talk).
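As a minimal sketch of the point about inlining (the class name is invented), a trivial accessor like the one below is a prime candidate for HotSpot's inliner once the calling code gets hot, so obj.getA() and obj.a typically end up as the same machine code; if you are curious, the diagnostic flags -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining let you watch those decisions:
public class Holder {
    private int a = 42;

    public int getA() {
        return a;   // trivial getter; typically inlined by the JIT once hot
    }

    public static void main(String[] args) {
        Holder obj = new Holder();
        int b = obj.getA();   // after inlining this is equivalent to reading the field directly
        System.out.println(b);
    }
}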

How would you find out if a machine’s stack grows up or down in memory? (Java)

I have a C program to check whether the machine stack grows up or down in memory.
It goes like this:
#include <stdio.h>

void sub(int *a) {
    int b;
    if (&b > a) {
        printf("Stack grows up.\n");
    } else {
        printf("Stack grows down.\n");
    }
}

int main(void) {
    int a;
    sub(&a);
    return 0;
}
Now I want to do the same in Java. :-)
Does anyone know a solution without writing any native code?
Thanks
If you're not doing any native code, then I can't imagine a situation where it could possibly matter in pure Java code. After all, the Java stack might be allocated in any direction at all instead of being a strictly contiguous block of memory (like the machine stack).
Java source code compiles to Java bytecode, an assembly-like language that runs on the JVM. The JVM is a virtual machine, so by definition it looks exactly the same on machines whose stack grows up and on machines whose stack grows down.
Because of this, it is not possible to tell from Java code whether the stack on a specific machine grows up or down.
This cannot be done in Java code. It cannot be done in C code either. The code you posted invokes undefined behavior (&b > a). According to the standard, the result of comparing two pointers is undefined unless the pointers point to elements within the same array. The standard says nothing about the direction of stack growth or whether a stack even exists.
Whoa, you will not be able to get any useful information out of such simple code in Java, at least not that I know of.
The code you have makes a lot of assumptions that, even in C actually, may or may not be true. It will depend on the platform and OS that is running your program.
In Java you will be completely dependent on the JVM's implementation for addressing and as such will not be able to do this.
My first answer would be to use a profiler. You can also create your own profiling agent using the API provided for this purpose (JVMTI). It is certainly a lot more complex than your approach, but you should be able to get what you need.
There is also this page at IBM that can be of help.
This is pretty much all I have on the subject; I hope it helps.
