Consider the following cases:
Case 1: (Fewer comments in the for loop)
import java.io.IOException;

public class Stopwatch {
    private static long start;

    public static void main(String args[]) throws IOException {
        start = System.currentTimeMillis();
        for (int i = 0; i < 1000000000; i++) {
            /**
             * Comment Line 1
             * Comment Line 2
             * Comment Line 3
             * Comment Line 4
             */
        }
        System.out.println("The time taken to execute the code is: " + (System.currentTimeMillis() - start) / 1000.0);
    }
}
The time taken to execute the code is: 2.259
Case 2: (More comments in the for loop)
import java.io.IOException;

public class Stopwatch {
    private static long start;

    public static void main(String args[]) throws IOException {
        start = System.currentTimeMillis();
        for (int i = 0; i < 1000000000; i++) {
            /**
             * Comment Line 1
             * Comment Line 2
             * Comment Line 3
             * Comment Line 4
             * Comment Line 5
             * Comment Line 6
             * Comment Line 7
             * Comment Line 8
             */
        }
        System.out.println("The time taken to execute the code is: " + (System.currentTimeMillis() - start) / 1000.0);
    }
}
The time taken to execute the code is: 2.279
Case 3: (No comments, empty for loop)
import java.io.IOException;

public class Stopwatch {
    private static long start;

    public static void main(String args[]) throws IOException {
        start = System.currentTimeMillis();
        for (int i = 0; i < 1000000000; i++) {
        }
        System.out.println("The time taken to execute the code is: " + (System.currentTimeMillis() - start) / 1000.0);
    }
}
The time taken to execute the code is: 2.249
Configuration: JDK 1.5, 3rd-gen Intel Core i5, 4 GB RAM.
Question: If we add more comments, does the program take more time to execute? Why?
No. Comments have no effect on execution.
They will slow the compiler down a tiny bit - but even that should be imperceptible unless you have a ridiculous number of comments.
The "effect" you're noticing has more to do with the way you're timing things. Using System.currentTimeMillis for benchmarking is a bad idea; you should use System.nanoTime() instead, as it typically uses a higher-accuracy clock (suitable only for measuring elapsed time, not for determining the "wall clock" time). Additionally, benchmark programs should run their "target" code for long enough to warm up the JIT compiler before actually measuring. Then you need to consider other things which might have been running on your system at the same time. Basically, there's a lot involved in writing a good benchmark. I suggest you look at Caliper if you're going to be writing any significant benchmarks in the future.
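To make that structure concrete, here is a minimal sketch of what a better-behaved harness for this experiment might look like (the class name and run counts are mine, for illustration, not from the original post):

public class LoopBenchmark {
    public static void main(String[] args) {
        // Warm-up runs give the JIT compiler a chance to compile the hot loop
        // before anything is measured.
        for (int run = 0; run < 5; run++) {
            runLoop();
        }
        // Measured runs: repeat several times and look at the spread rather
        // than trusting a single figure.
        for (int run = 0; run < 5; run++) {
            long start = System.nanoTime();
            runLoop();
            long elapsed = System.nanoTime() - start;
            System.out.println("Run " + run + ": " + (elapsed / 1000000.0) + " ms");
        }
    }

    private static void runLoop() {
        // Same empty loop as in the question; note that after warm-up the JIT
        // may eliminate it entirely, which is itself part of the lesson.
        for (int i = 0; i < 1000000000; i++) {
        }
    }
}

Don't be surprised if the warmed-up runs report times near zero: an empty loop is exactly the kind of useless work the JIT is entitled to remove.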
You can verify that there's no difference just due to the comments though - compile your code and then run
javap -c Stopwatch
and you can look at the bytecode. You'll see there's no difference between the different versions.
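As a rough illustration (this is a sketch; exact offsets and constant-pool indices will vary with your compiler and source), the loop in every case compiles down to the same handful of instructions:

// for (int i = 0; i < 1000000000; i++) { }
 8: iconst_0
 9: istore_1
10: iload_1
11: ldc           #3    // int 1000000000
13: if_icmpge     22
16: iinc          1, 1
19: goto          10

No trace of the comments survives compilation; at most, the classes differ in debug metadata such as the LineNumberTable, because the comments shift later code onto different source lines.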
No, comments are ignored by the compiler, so they cannot affect execution time. The difference you are seeing is very small; if you repeat your test 10 times, you will see that the differences fall within the range of statistical error.
A computer does many tasks simultaneously. If you want to compare the performance of two pieces of code that give similar execution times, you need many experiments to prove that one of them is faster than the other.
It may slow the compilation process down on an extremely minute level - all the compiler is doing is reading the // or /* */ and skipping everything after it until a line break or end-of-comment marker, respectively. If you want more accurate results, run each case 10 times and compare the average execution times.
Adding comments will increase the time taken to compile the program, but only minutely: when compiling, the compiler has to read each line to know whether to skip it (in the case of a comment) or translate it.
I've got a simple program that I got from my Java programming book, just added a bit to it.
package personal;

public class SpeedTest {
    public static void main(String[] args) {
        double DELAY = 5000;
        long startTime = System.currentTimeMillis();
        long endTime = (long) (startTime + DELAY);
        long index = 0;
        while (true) {
            double x = Math.sqrt(index);
            long now = System.currentTimeMillis();
            if (now >= endTime) {
                break;
            }
            index++;
        }
        System.out.println(index + " loops in " + (DELAY / 1000) + " seconds.");
    }
}
This returns 128478180 loops in 5.0 seconds.
If I add System.out.println(x); before the if statement, the number of loops in 5 seconds drops to around 400,000. Is that due to the latency of System.out.println(), or is it that x was not being calculated at all when I wasn't printing it out?
Anytime you "do output" within a very-busy loop, in any programming language whatsoever, you introduce two possibly-very-significant delays:
The data must be converted to printable characters and then written to whatever display or device it is going to; and
"the act of outputting anything" obliges the process to synchronize itself with any and every other process that might also be generating output.
One alternative strategy often used for this purpose is a "trace table." This is an in-memory array, of some fixed size, which contains strings. Entries are added to this table in round-robin fashion: the oldest entry is continuously replaced by the newest one. This strategy provides a history without requiring output. (The only remaining requirement is that anyone adding an entry to the table, or reading from it, must synchronize their activities, e.g. using a mutex.)
Processes which wish to display the contents of the trace table should grab the mutex, make an in-memory copy of the content of interest, then release the mutex before preparing their output. In this way, the various processes contributing entries to the trace table will not be delayed by I/O-associated sources of delay.
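A minimal sketch of such a trace table in Java (the class and method names are mine, for illustration, not from any particular library): a fixed-size ring buffer guarded by a lock, so that adding an entry costs an array write rather than an I/O call.

import java.util.ArrayList;
import java.util.List;

public class TraceTable {
    private final String[] entries;
    private int next = 0;   // index of the slot the next entry will overwrite
    private int count = 0;  // how many slots are filled so far

    public TraceTable(int capacity) {
        entries = new String[capacity];
    }

    // Adding an entry is a cheap in-memory write, done while holding the lock.
    public synchronized void add(String entry) {
        entries[next] = entry;
        next = (next + 1) % entries.length;
        if (count < entries.length) {
            count++;
        }
    }

    // Readers copy the contents under the lock, then format/print the copy
    // afterwards, so writers are never blocked by slow I/O.
    public synchronized List<String> snapshot() {
        List<String> copy = new ArrayList<String>(count);
        int start = (next - count + entries.length) % entries.length;
        for (int i = 0; i < count; i++) {
            copy.add(entries[(start + i) % entries.length]);
        }
        return copy;
    }
}

A busy loop records its history by calling add(...) each iteration; anything that wants to inspect it calls snapshot() and prints the copy outside the lock.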
I've run into a really strange bug, and I'm hoping someone here can shed some light as it's way out of my area of expertise.
First, relevant background information: I am running OS X 10.9.4 on a Late 2013 MacBook Pro Retina with a 2.4 GHz Haswell CPU, using JDK SE 8u5 for OS X from Oracle, and running my code in the latest version of IntelliJ IDEA. The bug also seems to be specific to OS X: when I posted about it on Reddit, other OS X users were able to recreate it, while users on Windows and Linux (including myself) had the program run as expected, with the println() version running half a second slower than the version without println().
Now for the bug: my code contains a println() statement which, when included, makes the program run in ~2.5 seconds. If I remove the println() statement, either by deleting it or commenting it out, the program counterintuitively takes longer, at ~9 seconds. This is extremely strange, as I/O should in theory slow the program down, not speed it up.
For my actual code, it's my implementation of Project Euler Problem 14. Please keep in mind I'm still a student so it's not the best implementation:
public class ProjectEuler14
{
    public static void main(String[] args)
    {
        final double TIME_START = System.currentTimeMillis();
        Collatz c = new Collatz();
        int highestNumOfTerms = 0;
        int currentNumOfTerms = 0;
        int highestValue = 0; // Value which produces most number of Collatz terms
        for (double i = 1.; i <= 1000000.; i++)
        {
            currentNumOfTerms = c.startCollatz(i);
            if (currentNumOfTerms > highestNumOfTerms)
            {
                highestNumOfTerms = currentNumOfTerms;
                highestValue = (int) (i);
                System.out.println("New term: " + highestValue); // THIS IS THE OFFENDING LINE OF CODE
            }
        }
        final double TIME_STOP = System.currentTimeMillis();
        System.out.println("Highest term: " + highestValue + " with " + highestNumOfTerms + " number of terms");
        System.out.println("Completed in " + ((TIME_STOP - TIME_START) / 1000) + " s");
    }
}
public class Collatz
{
    private static int numOfTerms = 0;
    private boolean isFirstRun = false;

    public int startCollatz(double n)
    {
        isFirstRun = true;
        runCollatz(n);
        return numOfTerms;
    }

    private void runCollatz(double n)
    {
        if (isFirstRun)
        {
            numOfTerms = 0;
            isFirstRun = false;
        }
        if (n == 1)
        {
            // Reached the last term; does nothing, so control returns to startCollatz()
        }
        else if (n % 2 == 0)
        {
            // Divides n by 2 following the Collatz rule, recursing
            numOfTerms = numOfTerms + 1;
            runCollatz(n / 2);
        }
        else if (n % 2 == 1)
        {
            // Multiplies n by 3 and adds one, following the Collatz rule, recursing
            numOfTerms = numOfTerms + 1;
            runCollatz((3 * n) + 1);
        }
    }
}
The line of code in question is marked with an all-caps comment, as it doesn't look like SO does line numbers. If you can't find it, it's within the nested if statement inside the for loop in my main method.
I've run my code multiple times with and without that line, and I consistently get the ~2.5 s times stated above with println() and ~9 s without it. I've also rebooted my laptop multiple times to make sure it wasn't specific to the current OS session; the times stay consistent.
Since other OS X 10.9.4 users were able to replicate the behaviour, I suspect a low-level bug in the compiler, the JVM, or the OS itself. In any case, this is way outside my knowledge. It's not a critical bug, but I'm definitely interested in why it happens and would appreciate any insight.
I did some research, and then some more together with #ekabanov, and here are the findings:
The effect you are seeing only happens with Java 8 and not with Java 7.
The extra line triggers a different JIT compilation/optimisation
The assembly code of the faster version is ~3 times larger, and a quick glance shows that it was loop-unrolled.
The JIT compilation log shows that the slower version successfully inlined runCollatz, while the faster one didn't, stating that the callee was too large (probably because of the unrolling).
There is a great tool that helps you analyse such situations; it is called jitwatch. If you need to go down to assembly level, you also need the HotSpot Disassembler.
I'll also post my log files. You can feed the HotSpot log files to jitwatch, and the assembly extraction is something you can diff to spot the differences.
Fast version's hotspot log file
Fast version's assembly log file
Slow version's hotspot log file
Slow version's assembly log file
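For reference, a sketch of how logs like these are typically produced using HotSpot's diagnostic flags (PrintAssembly additionally requires the hsdis disassembler library to be installed; the class name is the one from the question):

java -XX:+UnlockDiagnosticVMOptions -XX:+LogCompilation ProjectEuler14
java -XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly ProjectEuler14

The LogCompilation run writes a hotspot_pid<pid>.log file in the working directory, which jitwatch can open directly.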
http://s24.postimg.org/9y073weid/refactor_vs_non_refactor.png
Here are the results, in nanoseconds, for the execution time of the refactored and non-refactored code for a simple addition operation; 1 to 5 are consecutive runs of the code.
My intent was simply to find out whether splitting logic across multiple methods slows execution down, and the results suggest that considerable time does go into just pushing the methods onto the stack.
I invite people who have researched this before, or who want to investigate this area, to correct me if I am doing something wrong and to draw some conclusive results from this.
In my opinion, code refactoring does help make code more structured and understandable, but in time-critical systems such as real-time game engines I would prefer not to refactor.
The simple code I used follows:
package com.sim;

public class NonThreadedMethodCallBenchMark {
    public static void main(String args[]) {
        NonThreadedMethodCallBenchMark testObject = new NonThreadedMethodCallBenchMark();
        System.out.println("************Starting***************");
        long startTime = System.nanoTime();
        for (int i = 0; i < 900000; i++) {
            // testObject.method(1, 2);  // Uncomment this line and comment the line below to test refactored time
            // testObject.method5(1, 2); // Uncomment this line and comment the line above to test non-refactored time
        }
        long endTime = System.nanoTime();
        System.out.println("Total :" + (endTime - startTime) + " nanoseconds");
    }

    public int method(int a, int b) {
        return method1(a, b);
    }

    public int method1(int a, int b) {
        return method2(a, b);
    }

    public int method2(int a, int b) {
        return method3(a, b);
    }

    public int method3(int a, int b) {
        return method4(a, b);
    }

    public int method4(int a, int b) {
        return method5(a, b);
    }

    public int method5(int a, int b) {
        return a + b;
    }

    public void run() {
        int x = method(1, 2);
    }
}
You should consider that code which doesn't do anything useful can be optimised away. If you are not careful you can be timing how long it takes to detect the code doesn't do anything useful, rather than run the code. If you use multiple methods, it can take longer to detect useless code and thus give different results. I would always look at the steady state performance after the code has warmed up.
For the most expensive parts of your code, small methods will be inlined so they won't make any difference to performance cost. What can happen is
smaller methods can be optimised better as complex methods can defeat optimisation tricks.
smaller methods can be eliminated as they are inlined.
If you don't ever warm up the code, it is likely to be slower. However, if code is called rarely, that is unlikely to matter (except in a low-latency system, in which case I suggest you warm up your code on startup).
If you run
System.out.println("************Starting***************");
for (int j = 0; j < 10; j++) {
    long startTime = System.nanoTime();
    for (int i = 0; i < 1000000; i++) {
        testObject.method(1, 2);
        // testObject.method5(1, 2); // uncomment this line and comment the above line to test non-refactored time
    }
    long endTime = System.nanoTime();
    System.out.println("Total :" + (endTime - startTime) + " nanoseconds");
}
prints
************Starting***************
Total :8644835 nanoseconds
Total :3363047 nanoseconds
Total :52 nanoseconds
Total :30 nanoseconds
Total :30 nanoseconds
Note: the 30 nanoseconds is the time it takes to perform a System.nanoTime() call; the inner loop and the method calls have been eliminated.
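A caveat worth adding (my sketch, not part of the original answer): if you want the later runs to keep measuring the calls rather than their elimination, consume the result so the JIT cannot simply discard the work:

long sink = 0;
for (int j = 0; j < 10; j++) {
    long startTime = System.nanoTime();
    for (int i = 0; i < 1000000; i++) {
        // Accumulating the result makes wholesale dead-code elimination less
        // likely, though constant folding may still simplify the call.
        sink += testObject.method(1, 2);
    }
    long endTime = System.nanoTime();
    System.out.println("Total :" + (endTime - startTime) + " nanoseconds");
}
System.out.println("sink = " + sink); // printing the accumulator keeps it live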
Yes, extra method calls cost extra time (except where the compiler/JIT actually does inlining, which I believe some of them do sometimes, but under circumstances you will find hard to control).
You shouldn't worry about this except in the most expensive parts of your code, because you won't be able to see the difference in most cases. Where performance doesn't matter, you should refactor to maximize code clarity.
I suggest you refactor at will and occasionally measure performance with a profiler to find the expensive parts. Only when the profiler shows that some specific code uses a significant fraction of the time should you worry about function calls (or other speed-versus-clarity trade-offs) in that specific code. You will discover that the performance bottlenecks are often in places you wouldn't have guessed.
I would like to compare the performance (if there is any difference) of the two readDataMethod() variants illustrated below.
private void readDataMethod1(List<Integer> numbers) {
    final long startTime = System.nanoTime();
    for (int i = 0; i < numbers.size(); i++) {
        numbers.get(i);
    }
    final long endTime = System.nanoTime();
    System.out.println("method 1 : " + (endTime - startTime));
}

private void readDataMethod2(List<Integer> numbers) {
    final long startTime = System.nanoTime();
    int i = numbers.size();
    while (i-- > 0) {
        numbers.get(i);
    }
    final long endTime = System.nanoTime();
    System.out.println("method 2 : " + (endTime - startTime));
}
Most of the time, the results show that method 2 has the lower value.
Run    readDataMethod1    readDataMethod2
1      636331             468876
2      638256             479269
3      637485             515455
4      716786             420756
Does this test prove that the readDataMethod2 is faster than the earlier one ?
You are on the right track in that you're measuring comparative performance, rather than making assumptions.
However, there are lots of potential issues to be aware of when writing micro-benchmarks in Java. I would recommend that you read
How do I write a correct micro-benchmark in Java?
In the first one, you are calling numbers.size() on every iteration.
Try storing it in a variable, and check again.
The reason the second version runs faster is that you call numbers.size() on each iteration; storing the size in a variable first would make the two methods perform almost the same.
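For instance, here is a sketch of the first method with the size hoisted out of the loop (the method name is mine, for illustration):

private void readDataMethod1Hoisted(List<Integer> numbers) {
    final long startTime = System.nanoTime();
    final int size = numbers.size(); // evaluated once, not on every iteration
    for (int i = 0; i < size; i++) {
        numbers.get(i);
    }
    final long endTime = System.nanoTime();
    System.out.println("method 1 (hoisted) : " + (endTime - startTime));
}

In practice the JIT can often hoist a size() call on an ArrayList by itself, so don't expect a dramatic difference once the code is warm.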
Does this test prove that the readDataMethod2 is faster than the earlier one ?
As #aix says, you are on the right track. However, there are a couple of specific issues with your methodology:
It doesn't look like you are "warming up" the JVM, so it is conceivable that your figures are distorted by startup effects (JIT compilation), or that none of the code has been JIT-compiled at all.
I'd also argue that your runs are doing too little work. 500000 nanoseconds is 0.0005 seconds, and that's not much work. The risk is that "other things" external to your application could be introducing noise into the measurements. I'd have more confidence in runs that take tens of seconds.
I came across a simple Java program with two for loops. The question was whether these for loops take the same time to execute, or whether the first executes faster than the second.
Below is the program:
public static void main(String[] args) {
    Long t1 = System.currentTimeMillis();
    for (int i = 999; i > 0; i--) {
        System.out.println(i);
    }
    t1 = System.currentTimeMillis() - t1;
    Long t2 = System.currentTimeMillis();
    for (int j = 0; j < 999; j++) {
        System.out.println(j);
    }
    t2 = System.currentTimeMillis() - t2;
    System.out.println("for loop1 time : " + t1);
    System.out.println("for loop2 time : " + t2);
}
After executing this, I found that the first for loop takes more time than the second. But after swapping their locations, the result was the same: whichever for loop is written first always takes more time than the other. I was quite surprised by this result. Can anybody please explain how the above program behaves?
The time taken by either loop will be dominated by I/O (i.e. printing to screen), which is highly variable. I don't think you can learn much from your example.
The first loop will allocate 1000 Strings in memory, while the second loop, regardless of whether it works forwards or not, can use the already pre-allocated objects.
Although with System.out.println involved, any allocation should be negligible in comparison.
Long (and the other primitive wrapper classes) has a cache (see the LongCache class) for the values -128...127. It is populated during the first loop's run.
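A quick way to see this cache in action (a standalone illustration, not tied to the question's code): autoboxing goes through Long.valueOf, which returns a shared instance for small values.

Long a = 127L, b = 127L;
System.out.println(a == b); // true: both references come from the cache
Long c = 128L, d = 128L;
System.out.println(c == d); // false: 128 is outside the cached -128..127 range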
I think that if you are going to do a real benchmark, you should run the loops in different threads, use a higher iteration count (not just 1000), do no I/O (no printing during the timed section), and run them one at a time rather than back to back in the same program.
In my experience, executing the same code a few times can take a different execution time on each run.
And in my opinion, the two tests won't differ.