I know there's a lot of topics talking about Reflection performance.
Even official Java docs says that Reflection is slower, but I have this code:
public class ReflectionTest {
public static void main(String[] args) throws Exception {
Object object = new Object();
Class<Object> c = Object.class;
int loops = 100000;
long start = System.currentTimeMillis();
Object s;
for (int i = 0; i < loops; i++) {
s = object.toString();
System.out.println(s);
}
long regularCalls = System.currentTimeMillis() - start;
java.lang.reflect.Method method = c.getMethod("toString");
start = System.currentTimeMillis();
for (int i = 0; i < loops; i++) {
s = method.invoke(object);
System.out.println(s);
}
long reflectiveCalls = System.currentTimeMillis() - start;
start = System.currentTimeMillis();
for (int i = 0; i < loops; i++) {
method = c.getMethod("toString");
s = method.invoke(object);
System.out.println(s);
}
long reflectiveLookup = System.currentTimeMillis() - start;
System.out.println(loops + " regular method calls:" + regularCalls
+ " milliseconds.");
System.out.println(loops + " reflective method calls without lookup:"
+ reflectiveCalls+ " milliseconds.");
System.out.println(loops + " reflective method calls with lookup:"
+ reflectiveLookup + " milliseconds.");
}
}
That I don't think is a valid benchmark, but at least should show some difference.
I executed it waiting to see the reflection normal calls being a bit slower than regular ones.
But this prints this:
100000 regular method calls:1129 milliseconds.
100000 reflective method calls without lookup:910 milliseconds.
100000 reflective method calls with lookup:994 milliseconds.
Just for note, first I executed it without that bunch of sysouts, and then I realized that some JVM optimization are just making it goes faster, so I added these printls to see if reflection was still faster.
The result without sysouts are:
100000 regular method calls:68 milliseconds.
100000 reflective method calls without lookup:48 milliseconds.
100000 reflective method calls with lookup:168 milliseconds.
I saw over internet that the same test executed on old JVMs make the reflective without lookup are two times slower than regular calls, and that speed falls over new updates.
If anyone can execute it and say me I'm wrong, or at least show me if there's something different than the past that make it faster.
Following instructions, I ran every loop separated and the result are (without sysouts)
100000 regular method calls:70 milliseconds.
100000 reflective method calls without lookup:120 milliseconds.
100000 reflective method calls with lookup:129 milliseconds.
Never performance test different bits of code in the same "run". The JVM has various optimisations that mean it though the end result is the same, how the internals are performed may differ. In more concrete terms, during your test the JVM may have noticed you are calling Object.toString a lot and have started to inline the method calls to Object.toString. It may have started to perform loop unfolding. Or there could have been a garbage collection in the first loop but not the second or third loops.
To get a more meaningful, but still not totally accurate picture you should separate your test into three separate programs.
The results on my computer (with no printing and 1,000,000 runs each)
All three loops run in same program
1000000 regular method calls: 490 milliseconds.
1000000 reflective method calls without lookup: 393 milliseconds.
1000000 reflective method calls with loopup: 978 milliseconds.
Loops run in separate programs
1000000 regular method calls: 475 milliseconds.
1000000 reflective method calls without lookup: 555 milliseconds.
1000000 reflective method calls with loopup: 1160 milliseconds.
There's an article by Brian Goetz on microbenchmarks that's worth reading. It looks like you're not doing anything to warm up the JVM (meaning give it a chance to do whatever inlining or other optimizations it's going to do) before doing your measurements, so it's likely the non-reflective test is still not warmed-up yet, and that could skew your numbers.
When you have multiple long running loops, the first loop can trigger the method to compile resulting in the later loops being optimised from the start. However the optimisation can be sub-optimal as it has no runtime information for those loops. The toString is relatively expensive and couple be taking longer than the reflections calls.
You don't need separate programs to avoid loop being optimised due to an earlier loop. You can run them in different methods.
The results I get are
Average regular method calls:2 ns.
Average reflective method calls without lookup:10 ns.
Average reflective method calls with lookup:240 ns.
The code
import java.lang.reflect.Method;
public class ReflectionTest {
public static void main(String[] args) throws Exception {
int loops = 1000 * 1000;
Object object = new Object();
long start = System.nanoTime();
Object s;
testMethodCall(object, loops);
long regularCalls = System.nanoTime() - start;
java.lang.reflect.Method method = Object.class.getMethod("getClass");
method.setAccessible(true);
start = System.nanoTime();
testInvoke(object, loops, method);
long reflectiveCalls = System.nanoTime() - start;
start = System.nanoTime();
testGetMethodInvoke(object, loops);
long reflectiveLookup = System.nanoTime() - start;
System.out.println("Average regular method calls:"
+ regularCalls / loops + " ns.");
System.out.println("Average reflective method calls without lookup:"
+ reflectiveCalls / loops + " ns.");
System.out.println("Average reflective method calls with lookup:"
+ reflectiveLookup / loops + " ns.");
}
private static Object testMethodCall(Object object, int loops) {
Object s = null;
for (int i = 0; i < loops; i++) {
s = object.getClass();
}
return s;
}
private static Object testInvoke(Object object, int loops, Method method) throws Exception {
Object s = null;
for (int i = 0; i < loops; i++) {
s = method.invoke(object);
}
return s;
}
private static Object testGetMethodInvoke(Object object, int loops) throws Exception {
Method method;
Object s = null;
for (int i = 0; i < loops; i++) {
method = Object.class.getMethod("getClass");
s = method.invoke(object);
}
return s;
}
}
Micro-benchmarks like this are never going to be accurate at all - as the VM "warms up" it'll inline bits of code and optimise bits of code as it goes along, so the same thing executed 2 minutes into a program could vastly outperform it right at the start.
In terms of what's happening here, my guess is that the first "normal" method call block warms it up, so the reflective blocks (and indeed all subsequent calls) would be faster. The only overhead added through reflectively calling a method that I can see is looking up the pointer to that method, which is a nanosecond-scale operation anyway and would be easily cached by the JVM. The rest would be on how the VM is warmed up, which it is by the time you reach the reflective calls.
There is no inherent reason why reflective call should be slower than a normal call. JVM can optimize them into the same thing.
Practically, human resources are limited, and they had to optimize normal calls first. As time passes by they can work on optimizing reflective calls; especially when reflection becomes more and more popular.
I have been writing my own micro-benchmark, without loops, and with System.nanoTime():
public static void main(String[] args) throws NoSuchMethodException, IllegalArgumentException, IllegalAccessException, InvocationTargetException
{
Object obj = new Object();
Class<Object> objClass = Object.class;
String s;
long start = System.nanoTime();
s = obj.toString();
long directInvokeEnd = System.nanoTime();
System.out.println(s);
long methodLookupStart = System.nanoTime();
java.lang.reflect.Method method = objClass.getMethod("toString");
long methodLookupEnd = System.nanoTime();
s = (String) (method.invoke(obj));
long reflectInvokeEnd = System.nanoTime();
System.out.println(s);
System.out.println(directInvokeEnd - start);
System.out.println(methodLookupEnd - methodLookupStart);
System.out.println(reflectInvokeEnd - methodLookupEnd);
}
I have been executing that in Eclipse on my machine a dozen times, and the results vary quite a bit, but here is what I typically get:
the direct method invocation clocks at 40-50 microseconds
method lookup clocks at 150-200 microseconds
reflective invocation with the method variable clocks at 250-310 microseconds.
Now, do not forget the caveats on microbenchmarks described in Nathan's reply - there are certainly a lot of flaws in that micro benchmark - and trust the documentation if they say that reflection is a LOT slower than direct invocation.
It strikes me that you have placed a "System.out.println(s)" call inside your inner benchmark loop.
Since performing IO is bound to be slow, it actually "swallows up" your benchmark and the overhead of the invoke becomes negligible.
Try removing the "println()" call and running code like this, I'm sure you'd be surprised by the result (some of the silly calculations are needed to avoid the compiler optimizing away the calls altogether):
public class Experius
{
public static void main(String[] args) throws Exception
{
Experius a = new Experius();
int count = 10000000;
int v = 0;
long tm = System.currentTimeMillis();
for ( int i = 0; i < count; ++i )
{
v = a.something(i + v);
++v;
}
tm = System.currentTimeMillis() - tm;
System.out.println("Time: " + tm);
tm = System.currentTimeMillis();
Method method = Experius.class.getMethod("something", Integer.TYPE);
for ( int i = 0; i < count; ++i )
{
Object o = method.invoke(a, i + v);
++v;
}
tm = System.currentTimeMillis() - tm;
System.out.println("Time: " + tm);
}
public int something(int n)
{
return n + 5;
}
}
-- TR
Even if you look up the method in both cases (i.e. before 2nd and 3rd loop),
the first lookup takes way less time than the second lookup, which should have been the other way around and less than a regular method call on my machine.
Neverthless, if you use the 2nd loop with method lookup, and System.out.println statement, I get this:
regular call : 740 ms
look up(2nd loop) : 640 ms
look up ( 3rd loop) : 800 ms
Without System.out.println statement, I get:
regular call : 78 ms
look up (2nd) : 37 ms
look up (3rd ) : 112 ms
Related
http://s24.postimg.org/9y073weid/refactor_vs_non_refactor.png
Well here is the result for the execution time in nanoseconds for re-factored and non re-factored code for a simple addition operation. 1 to 5 are the consecutive runs of the code.
My intent was just to find out whether splitting up logic into multiple methods makes execution slow or not and here is the result which shows that yes there is considerable time that goes into just putting the methods on stack.
I invite people who have done some research on it before or want to investigate on this area to correct me if I am doing something wrong and draw some conclusive results out of this.
In my opinion yes code re-factoring does help in making code more structured and understandable but in time critical systems like real time game engines I would prefer not to re-factor.
Following was the simple code which I used:
package com.sim;
public class NonThreadedMethodCallBenchMark{
public static void main(String args[]){
NonThreadedMethodCallBenchMark testObject = new NonThreadedMethodCallBenchMark();
System.out.println("************Starting***************");
long startTime =System.nanoTime();
for(int i=0;i<900000;i++){
//testObject.method(1, 2); // Uncomment this line and comment the line below to test refactor time
//testObject.method5(1,2); // uncomment this line and comment the above line to test non refactor time
}
long endTime =System.nanoTime();
System.out.println("Total :" +(endTime-startTime)+" nanoseconds");
}
public int method(int a , int b){
return method1(a,b);
}
public int method1(int a, int b){
return method2(a,b);
}
public int method2(int a, int b){
return method3(a,b);
}
public int method3(int a, int b){
return method4(a,b);
}
public int method4(int a, int b){
return method5(a,b);
}
public int method5(int a, int b){
return a+b;
}
public void run() {
int x=method(1,2);
}
}
You should consider that code which doesn't do anything useful can be optimised away. If you are not careful you can be timing how long it takes to detect the code doesn't do anything useful, rather than run the code. If you use multiple methods, it can take longer to detect useless code and thus give different results. I would always look at the steady state performance after the code has warmed up.
For the most expensive parts of your code, small methods will be inlined so they won't make any difference to performance cost. What can happen is
smaller methods can be optimised better as complex methods can defeat optimisation tricks.
smaller methods can be eliminated as they are inlined.
If you don't ever warm up the code, it is likely to be slower. However, if code is called rarely, it is unlikely to matter (except in low latency system, in which case I suggest you warm up your code on startup)
If you run
System.out.println("************Starting***************");
for (int j = 0; j < 10; j++) {
long startTime = System.nanoTime();
for (int i = 0; i < 1000000; i++) {
testObject.method(1, 2);
//testObject.method5(1,2); // uncomment this line and comment the above line to test non refactor time
}
long endTime = System.nanoTime();
System.out.println("Total :" + (endTime - startTime) + " nanoseconds");
}
prints
************Starting***************
Total :8644835 nanoseconds
Total :3363047 nanoseconds
Total :52 nanoseconds
Total :30 nanoseconds
Total :30 nanoseconds
Note: the 30 nano-seconds is the time it takes to perform a System.nanoTime() call. The inner loop and the methods calls have been eliminated.
Yes, extra method calls cost extra time (except in the circumstance where the compiler/jitter actually does inlining, which I think some of them will do sometimes but under circumstances that you will find hard to control).
You shouldn't worry about this except for the most expensive part of your code, because you won't be able to see the difference in most cases. In those cases where performance doesn't matter, you should refactor to maximize code clarity.
I suggest you refactor at will, occasionally measure performance using a profiler to find the expensive parts. Only when the profiler shows that some specific code is using a significant fraction of the time, should you worry about about function calls (or other speed vs. clarity tradeoffs) in that specific code. You will discover that the performance bottlenecks are often in places that you wouldn't have guessed.
I want to test the time of adding and getting item in simple and generic hashmap:
public void TestHashGeneric(){
Map hashsimple = new HashMap();
long startTime = System.currentTimeMillis();
for (int i = 0; i < 100000; i++) {
hashsimple.put("key"+i, "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx" );
}
for (int i = 0; i < 100000; i++) {
String ret =(String)hashsimple.get("key"+i);
}
long endTime =System.currentTimeMillis();
System.out.println("Hash Time " + (endTime - startTime) + " millisec");
Map<String,String> hm = new HashMap<String,String>();
startTime = System.currentTimeMillis();
for (int i = 0; i < 100000; i++) {
hm.put("key"+i, "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx" );
}
for (int i = 0; i < 100000; i++) {
String ret = hm.get("key"+i);
}
endTime = System.currentTimeMillis();
System.out.println("Hash generic Time " + (endTime - startTime) + " millisec");
}
The problem is that I get different time if I change places between hashmap's code section!
if i put the loops (with the time print ofcourse) of generic below the simple I get better time for generic and if i put simple below the generic I get better time for simple!
Same happens if I use different methods for this.
The JIT will compile and optimise your program while it is running, so the 2nd run will always be faster.
You should make the following modifications:
Run both tests untimed first, then re-run them timed so that you don't get affected by the JIT.
You should use System.nanoTime() as this is more accurate for timing (you should never get a diff of 0).
You should also test against some empty methods, as you are also timing the string concatenation operation in each loop.
Also note that in Java generic types are erased, so there should be no runtime difference at all.
The Java Runtime is pretty sophistocated - it does some learning/optimisation while it's running. You might be able to get the test you want by "warming up" the JVM first. Try calling TestHashGeneric() twice and see what the second set of results gives you.
You've also got twice as much stuff in memory on the second run. There's all sorts of variables that can affect this test.
This is not the correct way to perform micro-benchmarks as it is very involved (Why?). I suggest you use Caliper framework for this.
When you have a single loop which reaches the compile threshold (about 10K) the whole method is compiled. This can make either the first loop appear faster (as it is optimised correctly as the second loop has no counter information) or the second loop appear after (as its compiled before it is even started)
This simplest way to fix this is to place each test in its own method and they will be compiled and optimised independently. (The order can still matter but is less important) I would still suggest running the test a number of times to see how the results vary.
I came across simple java program with two for loops. The question was whether these for loops will take same time to execute or first will execute faster than second .
Below is programs :
public static void main(String[] args) {
Long t1 = System.currentTimeMillis();
for (int i = 999; i > 0; i--) {
System.out.println(i);
}
t1 = System.currentTimeMillis() - t1;
Long t2 = System.currentTimeMillis();
for (int j = 0; j < 999; j++) {
System.out.println(j);
}
t2 = System.currentTimeMillis() - t2;
System.out.println("for loop1 time : " + t1);
System.out.println("for loop2 time : " + t2);
}
After executing this I found that first for loop takes more time than second. But after swapping there location the result was same that is which ever for loop written first always takes more time than the other. I was quite surprised with result. Please anybody tell me how above program works.
The time taken by either loop will be dominated by I/O (i.e. printing to screen), which is highly variable. I don't think you can learn much from your example.
The first loop will allocate 1000 Strings in memory while the second loop, regardsless of working forwards or not, can use the already pre-allocated objects.
Although working with System.out.println, any allocation should be neglible in comparison.
Long (and other primitive wrappers) has cache (look here for LongCache class) for values -128...127. It is populated at first loop run.
i think, if you are going to do a real benchmark, you should run them in different threads and use a higher value (not just 1000), no IO (printing output during execution time), and not to run them sequentially, but one by one.
i have an experience executing the same code a few times may takes different execution time.
and in my opinion, both test won't be different.
Today I was faced with the method constructServiceUrl() of the org.jasig.cas.client.util.CommonUtils class. I thought he was very strange:
final StringBuffer buffer = new StringBuffer();
synchronized (buffer)
{
if (!serverName.startsWith("https://") && !serverName.startsWith("http://"))
{
buffer.append(request.isSecure() ? "https://" : "http://");
}
buffer.append(serverName);
buffer.append(request.getRequestURI());
if (CommonUtils.isNotBlank(request.getQueryString()))
{
final int location = request.getQueryString().indexOf(
artifactParameterName + "=");
if (location == 0)
{
final String returnValue = encode ? response.encodeURL(buffer.toString()) : buffer.toString();
if (LOG.isDebugEnabled())
{
LOG.debug("serviceUrl generated: " + returnValue);
}
return returnValue;
}
buffer.append("?");
if (location == -1)
{
buffer.append(request.getQueryString());
}
else if (location > 0)
{
final int actualLocation = request.getQueryString()
.indexOf("&" + artifactParameterName + "=");
if (actualLocation == -1)
{
buffer.append(request.getQueryString());
}
else if (actualLocation > 0)
{
buffer.append(request.getQueryString().substring(0, actualLocation));
}
}
}
}
Why did the author synchronizes a local variable?
This is an example of manual "lock coarsening" and may have been done to get a performance boost.
Consider these two snippets:
StringBuffer b = new StringBuffer();
for(int i = 0 ; i < 100; i++){
b.append(i);
}
versus:
StringBuffer b = new StringBuffer();
synchronized(b){
for(int i = 0 ; i < 100; i++){
b.append(i);
}
}
In the first case, the StringBuffer must acquire and release a lock 100 times (because append is a synchronized method), whereas in the second case, the lock is acquired and released only once. This can give you a performance boost and is probably why the author did it. In some cases, the compiler can perform this lock coarsening for you (but not around looping constructs because you could end up holding a lock for long periods of time).
By the way, the compiler can detect that an object is not "escaping" from a method and so remove acquiring and releasing locks on the object altogether (lock elision) since no other thread can access the object anyway. A lot of work has been done on this in JDK7.
Update:
I carried out two quick tests:
1) WITHOUT WARM-UP:
In this test, I did not run the methods a few times to "warm-up" the JVM. This means that the Java Hotspot Server Compiler did not get a chance to optimize code e.g. by eliminating locks for escaping objects.
JDK 1.4.2_19 1.5.0_21 1.6.0_21 1.7.0_06
WITH-SYNC (ms) 3172 1108 3822 2786
WITHOUT-SYNC (ms) 3660 801 509 763
STRINGBUILDER (ms) N/A 450 434 475
With JDK 1.4, the code with the external synchronized block is faster. However, with JDK 5 and above the code without external synchronization wins.
2) WITH WARM-UP:
In this test, the methods were run a few times before the timings were calculated. This was done so that the JVM could optimize code by performing escape analysis.
JDK 1.4.2_19 1.5.0_21 1.6.0_21 1.7.0_06
WITH-SYNC (ms) 3190 614 565 587
WITHOUT-SYNC (ms) 3593 779 563 610
STRINGBUILDER (ms) N/A 450 434 475
Once again, with JDK 1.4, the code with the external synchronized block is faster. However, with JDK 5 and above, both methods perform equally well.
Here is my test class (feel free to improve):
public class StringBufferTest {
public static void unsync() {
StringBuffer buffer = new StringBuffer();
for (int i = 0; i < 9999999; i++) {
buffer.append(i);
buffer.delete(0, buffer.length() - 1);
}
}
public static void sync() {
StringBuffer buffer = new StringBuffer();
synchronized (buffer) {
for (int i = 0; i < 9999999; i++) {
buffer.append(i);
buffer.delete(0, buffer.length() - 1);
}
}
}
public static void sb() {
StringBuilder buffer = new StringBuilder();
synchronized (buffer) {
for (int i = 0; i < 9999999; i++) {
buffer.append(i);
buffer.delete(0, buffer.length() - 1);
}
}
}
public static void main(String[] args) {
System.out.println(System.getProperty("java.version"));
// warm up
for(int i = 0 ; i < 10 ; i++){
unsync();
sync();
sb();
}
long start = System.currentTimeMillis();
unsync();
long end = System.currentTimeMillis();
long duration = end - start;
System.out.println("Unsync: " + duration);
start = System.currentTimeMillis();
sync();
end = System.currentTimeMillis();
duration = end - start;
System.out.println("sync: " + duration);
start = System.currentTimeMillis();
sb();
end = System.currentTimeMillis();
duration = end - start;
System.out.println("sb: " + duration);
}
}
Inexperience, incompetence, or more likely dead yet benign code that remains after refactoring.
You're right to question the worth of this - modern compilers will use escape analysis to determine that the object in question cannot be referenced by another thread, and so will elide (remove) the synchronization altogether.
(In a broader sense, it is sometimes useful to synchronize on a local variable - they are still objects after all, and another thread can still have a reference to them (so long as they have been somehow "published" after their creation). Still, this is seldom a good idea as it's often unclear and very difficult to get right - a more explicitly locking mechanism with other threads is likely to prove better overall in these cases.)
I don't think the synchronization can have any effect, since buffer is never passed to another method or stored in a field before it goes out of scope, so no other thread can possibly have access to it.
The reason it is there could be political - I've been in a similar situation: A "pointy-haired boss' insisted that I clone a string in a setter method instead of just storing the reference, for fear of having the contents changed. He didn't deny that strings are immutable but insisted on cloning it "just in case." Since it was harmless (just like this synchronization) I did not argue.
That's a bit insane...it doesn't do anything except add overhead. Not to mention that calls to StringBuffer are already synchronized, which is why StringBuilder is preferred for cases where you won't have multiple threads accessing the same instance.
IMO, there is no need for synchronization that local variable. Only if it was exposed to others, e.g. by passing it to a function that will store it and potentially use it in another thread, the synchronization would make sense.
But as this is not the case, I see no use of it
I have created a class to handle my debug outputs so that I don't need to strip out all my log outputs before release.
public class Debug {
public static void debug( String module, String message) {
if( Release.DEBUG )
Log.d(module, message);
}
}
After reading another question, I have learned that the contents of the if statement are not compiled if the constant Release.DEBUG is false.
What I want to know is how much overhead is generated by running this empty method? (Once the if clause is removed there is no code left in the method) Is it going to have any impact on my application? Obviously performance is a big issue when writing for mobile handsets =P
Thanks
Gary
Measurements done on Nexus S with Android 2.3.2:
10^6 iterations of 1000 calls to an empty static void function: 21s <==> 21ns/call
10^6 iterations of 1000 calls to an empty non-static void function: 65s <==> 65ns/call
10^6 iterations of 500 calls to an empty static void function: 3.5s <==> 7ns/call
10^6 iterations of 500 calls to an empty non-static void function: 28s <==> 56ns/call
10^6 iterations of 100 calls to an empty static void function: 2.4s <==> 24ns/call
10^6 iterations of 100 calls to an empty non-static void function: 2.9s <==> 29ns/call
control:
10^6 iterations of an empty loop: 41ms <==> 41ns/iteration
10^7 iterations of an empty loop: 560ms <==> 56ns/iteration
10^9 iterations of an empty loop: 9300ms <==> 9.3ns/iteration
I've repeated the measurements several times. No significant deviations were found.
You can see that the per-call cost can vary greatly depending on workload (possibly due to JIT compiling),
but 3 conclusions can be drawn:
dalvik/java sucks at optimizing dead code
static function calls can be optimized much better than non-static
(non-static functions are virtual and need to be looked up in a virtual table)
the cost on nexus s is not greater than 70ns/call (thats ~70 cpu cycles)
and is comparable with the cost of one empty for loop iteration (i.e. one increment and one condition check on a local variable)
Observe that in your case the string argument will always be evaluated. If you do string concatenation, this will involve creating intermediate strings. This will be very costly and involve a lot of gc. For example executing a function:
void empty(String string){
}
called with arguments such as
empty("Hello " + 42 + " this is a string " + count );
10^4 iterations of 100 such calls takes 10s. That is 10us/call, i.e. ~1000 times slower than just an empty call. It also produces huge amount of GC activity. The only way to avoid this is to manually inline the function, i.e. use the >>if<< statement instead of the debug function call. It's ugly but the only way to make it work.
Unless you call this from within a deeply nested loop, I wouldn't worry about it.
A good compiler removes the entire empty method, resulting in no overhead at all. I'm not sure if the Dalvik compiler already does this, but I suspect it's likely, at least since the arrival of the Just-in-time compiler with Froyo.
See also: Inline expansion
In terms of performance the overhead of generating the messages which get passed into the debug function are going to be a lot more serious since its likely they do memory allocations eg
Debug.debug(mymodule, "My error message" + myerrorcode);
Which will still occur even through the message is binned.
Unfortunately you really need the "if( Release.DEBUG ) " around the calls to this function rather than inside the function itself if your goal is performance, and you will see this in a lot of android code.
This is an interesting question and I like #misiu_mp analysis, so I thought I would update it with a 2016 test on a Nexus 7 running Android 6.0.1. Here is the test code:
public void runSpeedTest() {
long startTime;
long[] times = new long[100000];
long[] staticTimes = new long[100000];
for (int i = 0; i < times.length; i++) {
startTime = System.nanoTime();
for (int j = 0; j < 1000; j++) {
emptyMethod();
}
times[i] = (System.nanoTime() - startTime) / 1000;
startTime = System.nanoTime();
for (int j = 0; j < 1000; j++) {
emptyStaticMethod();
}
staticTimes[i] = (System.nanoTime() - startTime) / 1000;
}
int timesSum = 0;
for (int i = 0; i < times.length; i++) { timesSum += times[i]; Log.d("status", "time," + times[i]); sleep(); }
int timesStaticSum = 0;
for (int i = 0; i < times.length; i++) { timesStaticSum += staticTimes[i]; Log.d("status", "statictime," + staticTimes[i]); sleep(); }
sleep();
Log.d("status", "final speed = " + (timesSum / times.length));
Log.d("status", "final static speed = " + (timesStaticSum / times.length));
}
private void sleep() {
try {
Thread.sleep(10);
} catch (InterruptedException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
private void emptyMethod() { }
private static void emptyStaticMethod() { }
The sleep() was added to prevent overflowing the Log.d buffer.
I played around with it many times and the results were pretty consistent with #misiu_mp:
10^5 iterations of 1000 calls to an empty static void function: 29ns/call
10^5 iterations of 1000 calls to an empty non-static void function: 34ns/call
The static method call was always slightly faster than the non-static method call, but it would appear that a) the gap has closed significantly since Android 2.3.2 and b) there's still a cost to making calls to an empty method, static or not.
Looking at a histogram of times reveals something interesting, however. The majority of call, whether static or not, take between 30-40ns, and looking closely at the data they are virtually all 30ns exactly.
Running the same code with empty loops (commenting out the method calls) produces an average speed of 8ns, however, about 3/4 of the measured times are 0ns while the remainder are exactly 30ns.
I'm not sure how to account for this data, but I'm not sure that #misiu_mp's conclusions still hold. The difference between empty static and non-static methods is negligible, and the preponderance of measurements are exactly 30ns. That being said, it would appear that there is still some non-zero cost to running empty methods.