Output timing problem - java

The following code:
String str1 = "asdfavaxzvzxvc";
String str2 = "werwerzsfaasdf";
Object c = str1;
Object d = str2;
System.out.println(c);
long time1 = System.currentTimeMillis();
for (int i = 0; i < 1000000000; i++) {
    if (c.equals(d)) {
        //System.out.println("asfasdfasdf"); // line 9
    }
}
long time2 = System.currentTimeMillis();
System.out.println("time taken in this is " + (time2 - time1));
When I uncomment line 9, that is, let it print if the condition is true (though that is never going to happen, since the two objects are not equal), it takes 5000+ milliseconds. To my surprise, just commenting it out brings that down to only 5 milliseconds. I don't see why it takes so much time when the line is uncommented, since it's never going to be executed...
Is this some sort of branch prediction effect? Or some sort of compiler optimization?

The compiler optimizes away dead code — in this case, the entire loop is removed. This might be done by the bytecode compiler (e.g. javac) or, more likely, by HotSpot's JIT.
Why does it still take a whopping 5 ms for this to execute? It doesn't necessarily take all that long. Instead, you might be hitting the resolution limit on System.currentTimeMillis(). Try it with System.nanoTime() instead. FWIW, using nanoTime() agrees with currentTimeMillis() on my Windows system.
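For example, here is a minimal re-timing of the same loop with nanoTime (a sketch of the suggestion above, not code from the question):
// Same experiment, timed with the higher-resolution System.nanoTime().
Object c = "asdfavaxzvzxvc";
Object d = "werwerzsfaasdf";
long start = System.nanoTime();
for (int i = 0; i < 1000000000; i++) {
    if (c.equals(d)) {
        System.out.println("never printed"); // the strings differ, so this is dead code
    }
}
long elapsed = System.nanoTime() - start;
System.out.println("time taken: " + (elapsed / 1000000.0) + " ms");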
You might be interested in reading How do I write a correct micro-benchmark in Java? and Is stopwatch benchmarking acceptable?
Further reading
White Paper: The Java HotSpot Performance Engine Architecture
HotSpot Home Page

The compiler will optimise the entire loop away, because it has no observable side-effects.

When the Java "Compiler" compiles your code, it does some optimizing on it. Empty if-clauses are deleted, so you just have a long for-loop, which is pretty fast.
But since the "Compiler" doesn't that the if is always false and the code in the clause is never executed, it test's it every single time. That takes much longer.

This is interesting.
It can't be a compile-time optimization. The compiler can't just remove the whole loop body, because there's a call to the equals() method inside the loop. The compiler can't assume that the method has no side effects, or that it always returns the same result.
But it is possible for the JIT compiler to make those optimizations at run-time. So that's probably what's happening.
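One way to see this for yourself (my sketch, not part of the original answer) is to give the loop an observable side effect; a result that is printed afterwards makes it much harder for the JIT to discard the loop as dead code:
Object c = "asdfavaxzvzxvc";
Object d = "werwerzsfaasdf";
long matches = 0;
for (int i = 0; i < 1000000000; i++) {
    if (c.equals(d)) {
        matches++; // side effect that feeds into the printed result
    }
}
// Using the result below forces the JIT to keep the computation (or prove it constant).
System.out.println("matches = " + matches);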

Related

Is it better to use the same int for every for loop?

In a program where I have, say, 50 for loops that run every so often, there are 50 ints being created every time they run, each consuming memory. If I declare an int i at the beginning of every class and then reuse it in each for loop, there would now be only, say, 5 ints i (one per class) created and then reassigned a value of 0 in each loop, saving space by not flooding the heap with i's. Would that help the program's performance, or would it not make much of a difference? Does heap memory work the way I think it does in this case?
Reusing simple, small, short-lived ints across multiple loops is a code smell - don't do it.
If your loops are in the same method, having many loops might be a code smell on its own. You should probably split the method into smaller chunks of code that can be tested more easily, independently of each other.
Counters in loops are almost always independent of each other. An exception might be a loop that counts up followed by one that counts down, where the initial value depends on the last value of the former loop. If you reinitialize your counter as the very first statement of a for loop, which is the case for 99% of the loops I have seen so far, the loops are independent of each other, and the declaration of the counter variable should reflect that. It's an important way of documenting your code.
Reusing a counter only documents that some coders have wrong ideas about optimization.
An even worse idea is to define some counters at class level (some, because you have nested loops) and use the same counter from different methods. Wow! You wouldn't consider that, would you? That would not only prevent the class from being thread-safe; you couldn't call one method from another without shooting yourself in both feet.
Think of the case where you have a method with a for loop that you are going to refactor for some reason. Maybe a more elegant solution is possible with Java 9, but it has a for loop with some i, and the i is initialized at the top of the loop: for (i = 0; .... Now you look for where it comes from and realize it is not declared in that method; it is an instance variable! And the class is 500 lines long with, let's say, 20 methods, each on average 25 lines long, many of them using that same i. Now you have to investigate which methods are called by your method, which methods call your method and might depend on the value of i you end with, and which might run in parallel. You're doomed! Instead of 25 lines of code, you have to keep 500 lines of code in your head to reason about what is going on.
The scope of local variables should be as narrow as possible, as long as there is no good reason to widen it. The fewer lines a variable is in scope, the easier the code is to understand.
You always have to look at where a variable is initialized to find its current value, where it is declared to find its type (is it a long, an int, a byte, a short?), and where it is reused later. Counters are almost never reused. Make it easy to reason about your code by declaring them right in the for loop's header, so everybody knows they have no influence on later code.
Well, and use the enhanced for loop where it makes sense, to prevent ugly off-by-one errors. Then you don't have to reason about a few bytes being used.
But in the end, I admit, for a beginner, it is a reasonable question to come up with. I had the idea in former times, too. :)
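To illustrate the scoping advice above, a minimal sketch (class and method names are made up):
// Counter scoped to the loop: self-contained and thread-safe by default.
class NarrowScope {
    int sum(int[] values) {
        int total = 0;
        for (int i = 0; i < values.length; i++) { // i exists only inside this loop
            total += values[i];
        }
        return total;
    }
}

// Counter as an instance field: every method shares the same mutable state.
class SharedCounter {
    private int i; // which other method touches this? not thread-safe
    int sum(int[] values) {
        int total = 0;
        for (i = 0; i < values.length; i++) {
            total += values[i];
        }
        return total;
    }
}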
I don't see why you'd need that many loops in your main method. If those integers are only being used as iterators, and you really have that many loops in your program, you should probably arrange them into specific functions/methods that perform whatever tasks these loops are doing.
By doing so you'd have local variables being used as iterators in each of those methods, and the memory used for these variables would be freed up when the method finishes its execution.
No, it doesn't really matter, as compilers these days are very powerful and the difference is negligible. The main issue here is whether you really need to set the value of i at the top: as your code grows and you start using i in different/complex scenarios, you might start seeing i behave differently, have no clue why it's happening, and face the pain of debugging the whole thing to find what caused the issue.
If you really want to know the difference, you can refer to this link, which talks about almost the same scenario you're talking about: Difference between declaring variables before or in loop?

Method optimization

Which one would be faster and why?
public void increment() {
    synchronized (this) {
        i++;
    }
}
or
public synchronized void increment() {
    i++;
}
Would method inlining improve the second option?
The difference is unlikely to matter or be visible, but the second example could be faster. Why? In Oracle/OpenJDK, the less byte code a method uses, the more it can be optimised, e.g. inlined.
There are a number of thresholds for when a method can be optimised, and these thresholds are based on the number of bytes of byte code. The second example uses less byte code, so it is possible it could be optimised more.
One of these optimisations is escape analysis, which can eliminate the use of synchronized for thread-local objects. This would make it much faster. The threshold for escape analysis is methods which are under 150 bytes (after inlining) by default. You could see cases where the first solution makes the method just over 150 bytes and the second just under 150 bytes.
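For illustration (my sketch, not part of the answer): the textbook case where escape analysis allows lock elision is a thread-local StringBuffer, whose methods are all synchronized:
// sb never escapes this method, so escape analysis can prove that no other
// thread can ever lock it, and HotSpot may elide the synchronization
// inside the StringBuffer methods entirely.
public String build() {
    StringBuffer sb = new StringBuffer(); // synchronized methods, thread-local use
    sb.append("hello, ");
    sb.append("world");
    return sb.toString();
}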
NOTE:
Don't write code on this basis; the difference is too trivial to matter. Code clarity is far, far more important.
If performance does matter, using an alternative like AtomicInteger is likely to be an order of magnitude faster (consistently, not just in rare cases); see the sketch after this note.
BTW, AFAIK the concurrency libraries use few assert statements, not because the asserts aren't optimised away, but because they still count toward the size of the byte code and so can indirectly slow things down by making the code less optimisable in some cases. (This claim of mine should have a reference, but I couldn't find it.)
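A minimal sketch of that AtomicInteger alternative (the class around it is made up):
import java.util.concurrent.atomic.AtomicInteger;

class Counter {
    private final AtomicInteger i = new AtomicInteger();

    public void increment() {
        i.incrementAndGet(); // lock-free atomic update; no synchronized needed
    }

    public int get() {
        return i.get();
    }
}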

Java, optimal calling of objects and methods

Let's say I have the following code:
private Rule getRuleFromResult(Fact result) {
    Rule output = null;
    for (int i = 0; i < rules.size(); i++) {
        if (rules.get(i).getRuleSize() == 1) { output = rules.get(i); return output; }
        if (rules.get(i).getResultFact().getFactName().equals(result.getFactName())) output = rules.get(i);
    }
    return output;
}
Is it better to leave it as it is or to change it as follows:
private Rule getRuleFromResult(Fact result) {
    Rule output = null;
    Rule current = null;
    for (int i = 0; i < rules.size(); i++) {
        current = rules.get(i);
        if (current.getRuleSize() == 1) { return current; }
        if (current.getResultFact().getFactName().equals(result.getFactName())) output = current;
    }
    return output;
}
When executing, the program goes through rules.get(i) each time as if it were the first time, and I think that in a much more advanced example (say, as in the second if) this takes more time and slows execution. Am I right?
Edit: To answer a few comments at once: I know that in this particular example the time gain will be super tiny, but it was just to get the general idea across. I've noticed I tend to have very long chains like object.get.set.change.compareTo... etc., and many of them repeat. Across the whole codebase, that time gain can be significant.
Your instinct is correct--saving intermediate results in a variable rather than re-invoking a method multiple times is faster. Often the performance difference will be too small to measure, but there's an even better reason to do this--clarity. By saving the value into a variable, you make it clear that you are intending to use the same value everywhere; if you re-invoke the method multiple times, it's unclear if you are doing so because you are expecting it to return different results on different invocations. (For instance, list.size() will return a different result if you've added items to list in between calls.) Additionally, using an intermediate variable gives you an opportunity to name the value, which can make the intention of the code clearer.
The only difference between the two versions is that in the first you may call rules.get(i) twice for the same index.
So the second version is a little bit faster in general, but you will not feel any difference if the list is not big.
It depends on the type of the data structure that the "rules" object is. If it is a (linked) list, then yes, the second one is much faster, as it has to search for element i on every rules.get(i) call. If it is a data type that can hand you rules.get(i) immediately (like an array), then it is the same.
In general, yes, it's probably a tiny bit faster (nanoseconds, I guess), at least when called the first time. Later on it will probably be improved by the JIT compiler either way.
But what you are doing is so-called premature optimization. You usually should not think about things that only provide an insignificant performance improvement.
What is more important is readability, so the code can be maintained later on.
You could even do more premature optimization, like saving the list's size in a local variable, which is what the for-each loop does internally. But again, in 99% of cases it doesn't make sense to do it.
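For what it's worth, a sketch of the same method using the enhanced for loop mentioned above (assuming rules is an Iterable<Rule>), which avoids the repeated rules.get(i) calls entirely:
private Rule getRuleFromResult(Fact result) {
    Rule output = null;
    for (Rule current : rules) { // each element is fetched exactly once
        if (current.getRuleSize() == 1) {
            return current;
        }
        if (current.getResultFact().getFactName().equals(result.getFactName())) {
            output = current;
        }
    }
    return output;
}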

Java: how much time does an empty loop use?

I am trying to test the speed of autoboxing and unboxing in Java, but when I compared it against an empty loop on a primitive, I noticed one curious thing. This snippet:
for (int j = 0; j < 10; j++) {
    long t = System.currentTimeMillis();
    for (int i = 0; i < 10000000; i++)
        ;
    t = System.currentTimeMillis() - t;
    System.out.print(t + " ");
}
Every time I run this, it returns the same result:
6 7 0 0 0 0 0 0 0 0
Why do the first two loops always take some time, while the rest just seem to be skipped by the system?
In this answer to this post, it is said that Just-In-Time compilation will be able to optimize this away. But if so, why did the first two loops still take some time?
JIT triggers AFTER a certain piece of code has been executed many times.
The HotSpot JVM will try to identify "hot spots" in your code. Hot spots are pieces of your code that are executed many many times. To do this, the JVM will "count" the executions of various instructions, and when it determines a certain piece is executed frequently, it will trigger the JIT. (this is an approximation, but it's easy to understand explained this way).
The JIT (Just-In-Time) takes that piece of code, and tries to make it faster.
The JIT uses a lot of techniques to make your code run faster, but the ones that most commonly create confusion are:
It will try to determine if that piece of code uses variables that are not used anywhere else (useless variables), and remove them.
If you acquire and release the same lock multiple times (like calling synchronized methods of the same object), it can acquire the lock once and do all the calls in a single synchronized block
If you access members of an object that are not declared volatile, it can decide to optimize the accesses (placing values in registers and the like), creating strange results in multi-threaded code.
It will inline methods, to avoid the cost of the call.
It will translate bytecode to machine code.
If the loop is completely useless, it could be completely removed.
So, the proper answer to your question is that an empty loop, after being JITed, takes no time to execute; most probably it is not there anymore.
Again, there are many other optimizations, but in my experience these are among those that have created most headaches.
Moreover, the JIT is improved in every new version of Java, and it is sometimes even a bit different depending on the platform (since it is to some extent platform-specific). Optimizations done by the JIT are difficult to inspect, because you usually cannot find them by running javap and examining the bytecode, even though in recent versions of Java some optimizations have moved into the compiler directly (for example, since Java 6 the compiler is able to detect and warn about unused local variables and private methods).
If you are writing some loops to test something, it is usually good practice to put the loop inside a method and call the method a few times BEFORE timing it, to give it a "warm-up" round, and only then perform the timed loop.
This usually triggers the JIT in a simple program like yours, even though there is no guarantee that it will actually trigger (or that a JIT even exists on a given platform).
If you want to get paranoid about JIT versus non-JIT timing (I did): do a first round timing each execution of the loop, wait until the timings stabilize (for example, until the difference from the average is less than 10%), and then start your "real" timing.
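A minimal sketch of that warm-up pattern (round counts and names are my own choices):
public class LoopTiming {
    // The code under test lives in its own method so the JIT can compile it as a unit.
    static long loop(int n) {
        long dummy = 0;
        for (int i = 0; i < n; i++) {
            dummy += i; // tiny side effect so the loop is not trivially removed
        }
        return dummy;
    }

    public static void main(String[] args) {
        // Warm-up rounds: give the JIT a chance to compile loop() before timing it.
        for (int round = 0; round < 20; round++) {
            loop(10000000);
        }
        // Timed rounds, using nanoTime for better resolution.
        for (int round = 0; round < 10; round++) {
            long t = System.nanoTime();
            long result = loop(10000000);
            t = System.nanoTime() - t;
            System.out.println(t + " ns (result " + result + ")");
        }
    }
}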
The JIT doesn't kick in on a chunk of code until it determines that there is some benefit to doing so. That means the first few passes through some code won't be JITed.

Java execution speed

I'm new to Java programming.
I am curious about the speed of execution, and also the speed of creation and destruction of objects.
I've got several methods like the following:
private static void getAbsoluteThrottleB() {
    int A = Integer.parseInt(Status.LineToken.nextToken());
    Status.AbsoluteThrottleB = A * 100 / 255;
    Log.level1("Absolute Throttle Position B: " + Status.AbsoluteThrottleB);
}
and
private static void getWBO2S8Volts() {
    int A = Integer.parseInt(Status.LineToken.nextToken());
    int B = Integer.parseInt(Status.LineToken.nextToken());
    int C = Integer.parseInt(Status.LineToken.nextToken());
    int D = Integer.parseInt(Status.LineToken.nextToken());
    Status.WBO2S8Volts = ((A * 256) + B) / 32768;
    Status.WBO2S8VoltsEquivalenceRatio = ((C * 256) + D) / 256 - 128;
    Log.level1("WideBand Sensor 8 Voltage: " + Double.toString(Status.WBO2S8Volts));
    Log.level1("WideBand Sensor 8 Volt EQR:" + Double.toString(Status.WBO2S8VoltsEquivalenceRatio));
}
Would it be wise to create a separate method to process the data, since it is repetitive? Or would it be faster to execute it all as a single method? I have several of these that would need to be rewritten, and I am wondering whether it would actually improve execution speed, whether it is just as good as it is, or whether there is some number of instructions at which it becomes a good idea to create a new method.
Basically, what is faster or when does it become faster to use a single method to process objects versus using another method to process several like objects?
It seems like at runtime, pulling a new variable and then performing a math operation on it is quicker than calling a new method, pulling a variable there, and then performing the math operation. My question is really about where the speed difference actually lies.
These methods are all called only to read data and set a Status.Variable. There are nearly 200 methods in my class which generate data.
The speed difference between invoking a piece of code inside a method or outside of it is negligible, especially compared with using the right algorithm for the task.
I would recommend you use the method anyway, not for performance but for maintainability. If you need to change one line of code that turns out to introduce a bug, and this code segment is copy/pasted in 50 different places, it will be much harder to change (and to spot) than if it lives in one single place.
So, don't worry about the performance penalty introduced by using methods, because it is practically nothing (even better, the VM may inline some of the calls).
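For example, the repeated token parsing could be factored into one small helper (a sketch; nextInt is a made-up name, the other identifiers are from the question):
// Hypothetical helper: reads the next token from the shared tokenizer
// and parses it as an int.
private static int nextInt() {
    return Integer.parseInt(Status.LineToken.nextToken());
}

private static void getAbsoluteThrottleB() {
    int a = nextInt();
    Status.AbsoluteThrottleB = a * 100 / 255;
    Log.level1("Absolute Throttle Position B: " + Status.AbsoluteThrottleB);
}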
I think S. Lott's comment on your question probably hits the nail perfectly on the head - there's no point optimizing code until you're sure the code in question actually needs it. You'll most likely end up spending a lot of time and effort for next to no gain, otherwise.
I'll also second Support's answer, in that the difference in execution time between invoking a separate method and invoking the code inline is negligible (this was actually what I wanted to post, but he kinda beat me to it). It may even be zero, if an optimizing compiler or JIT decides to inline the method anyway (I'm not sure if there are any such compilers/JITs for Java, however).
There is one advantage of the separate method approach however - if you separate your data-processing code into a separate method, you could in theory achieve some increased performance by having that method called from a separate thread, thus decoupling your (possibly time-consuming) processing code from your other code.
I am curious about speed of execution and also speed of creation and destruction of objects.
Creation of objects in Java is fast enough that you shouldn't need to worry about it, except in extreme and unusual situations.
Destruction of objects in a modern Java implementation has zero cost ... unless you use finalizers. And there are very few situations that you should even think of using a finalizer.
Basically, what is faster or when does it become faster to use a single method to process objects versus using another method to process several like objects?
The difference is negligible relative to everything else that is going on.
As #S.Lott says: "Please don't micro-optimize". Focus on writing code that is simple, clear, precise and correct, and that uses the most appropriate algorithms. Only "micro" optimize when you have clear evidence of a critical bottleneck.
