The problem is less generic than the one in the subject. Here I have the Builder pattern for the user's convenience and a method with multiple ifs, but each if statement tests one of the object's non-final fields. There are no assignments to these fields within the body of the method under consideration, and the class's API provides no setters. Example:
public class MyFormatter {
    public static class Builder {
        private final boolean notOptional; // don't mind this one, just the Builder pattern
        private boolean optionalA, optionalB, optionalC; // these matter further on

        // public, so callers can actually configure the builder
        public Builder optionalA() { optionalA = true; return this; }
        public Builder optionalB() { optionalB = true; return this; }
        public Builder optionalC() { optionalC = true; return this; }

        public Builder(boolean notOptional) {
            this.notOptional = notOptional;
        }

        public MyFormatter build() {
            MyFormatter formatter = new MyFormatter(notOptional);
            formatter.optionalA = optionalA;
            formatter.optionalB = optionalB;
            formatter.optionalC = optionalC;
            return formatter;
        }
    }
    private final boolean notOptional;
    private boolean optionalA, optionalB, optionalC; // Not final

    private MyFormatter(boolean notOptional) {
        this.notOptional = notOptional;
    }

    protected String publish(String msg) {
        StringBuilder sb = new StringBuilder();
        // Here we go: a lot of IFs, though the conditions "effectively never" change
        if (optionalA) {
            sb.append("something");
        }
        if (optionalB) {
            sb.append("something else");
        }
        if (optionalC) {
            sb.append("and something else");
        }
        return sb.toString();
    }
}
OK, now the questions are: how much can the JIT compiler do to optimize this code, and is there anything I can do myself to optimize it (some lazy initialization, etc.)?
p.s. (Harder question) Imagine this code being translated into JavaScript (by GWT), i.e., no JVM would be involved in executing/optimizing this method. What can a programmer do in that case to improve the performance?
It is absolutely crucial for devs to see the real-time dynamics, and each millisecond matters a lot.
That's it. Unless your devs can read many thousands of messages per second, you're fine. The cost of
if (optionalA) {
    sb.append("something");
}
consists of two parts.
The conditional branch and the appending. A mispredicted branch takes 10-20 cycles, i.e., up to about 7 nanoseconds on a 3 GHz CPU (20 cycles / 3 cycles per nanosecond). A correctly predicted branch is essentially free, and because the boolean is constant and the code is hot, you can assume the branches will be predicted correctly.
Depending on the length of "something", the appending may be more costly, but no details are given, so there's nothing to optimize.
I don't think the JIT will find anything to optimize here. You could pre-size your StringBuilder to gain a bit.
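For what it's worth, a minimal sketch of that pre-sizing (the capacity of 64 is an arbitrary guess, not a measured value):

protected String publish(String msg) {
    // Pre-sizing avoids the builder's internal char array being grown and copied.
    // Base the capacity on the typical final message length; 64 is a placeholder.
    StringBuilder sb = new StringBuilder(64);
    if (optionalA) {
        sb.append("something");
    }
    // ... remaining appends unchanged
    return sb.toString();
}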
All in all, it is premature optimization.
Imagine this code being translated into JavaScript (by GWT)
Modern browsers have an advanced JIT just like Java does. Because JavaScript is weakly typed, it can't be quite as fast, but it comes pretty close.
Measure before optimizing, so you don't spend your time where the CPU does not.
Related
Java's assert mechanism allows assertions to be disabled; disabled assertions have essentially no run-time cost (aside from a bigger class file). But this may not cover all situations.
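As a small illustration of that zero-cost-when-disabled property (the check method is hypothetical):

class Invariants {
    // Imagine a costly consistency scan here.
    private boolean expensiveCheck() {
        return true;
    }

    void mutate() {
        // Evaluated only when assertions are enabled (java -ea);
        // when disabled, expensiveCheck() is never called.
        assert expensiveCheck() : "invariant violated";
    }
}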
For instance, many of Java's collections feature "fail-fast" iterators that attempt to detect when you're using them in a thread-unsafe way. But this requires both the collection and the iterator itself to maintain extra state that would not be needed if these checks weren't there.
Suppose someone wanted to do something similar, but allow the checks to be disabled and if they are disabled, it saves a few bytes in the iterator and likewise a few more bytes in the ArrayList, or whatever.
Alternatively, suppose we're doing some sort of object pooling that we want to be able to turn on and off at runtime; when it's off, it should just use Java's garbage collection and take no room for reference counts, like this (note that the code as written is very broken):
class MyClass {
    static final boolean useRefCounts = my.global.Utils.useRefCounts();

    static {
        if (useRefCounts)
            int refCount; // want instance field, not local variable
    }

    void incrementRefCount() {
        if (useRefCounts) refCount++; // only use field if it exists
    }

    /** return true if ready to be collected and reused */
    boolean decrementAndTestRefCount() {
        // rely on Java's garbage collector if ref counting is disabled.
        return useRefCounts && --refCount == 0;
    }
}
The trouble with the above code is that the static block makes no sense. But is there some trick using low-powered magic to make something along these lines work? (If high-powered magic is allowed, the nuclear option is to generate two versions of MyClass and arrange to put the correct one on the class path at start time.)
NOTE: You might not need to do this at all. The JIT is very good at inlining constants known at runtime, especially booleans, and at optimising away the code which isn't used.
The int field is not ideal; however, if you are using a 64-bit JVM, the object size might not change.
On the OpenJDK/Oracle JVM (64-bit), the header is 12 bytes by default. The object alignment is 8 bytes, so the object will use 16 bytes. The field adds 4 bytes, which after alignment still fits in 16 bytes.
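If you want to verify such layout numbers on your own JVM, the OpenJDK JOL tool can print them; a hedged sketch, assuming the org.openjdk.jol:jol-core dependency is available:

import org.openjdk.jol.info.ClassLayout;

public class LayoutCheck {
    static class Plain { }
    static class WithField { int refCount; }

    public static void main(String[] args) {
        // Prints header size, field offsets and alignment padding for each class.
        System.out.println(ClassLayout.parseClass(Plain.class).toPrintable());
        System.out.println(ClassLayout.parseClass(WithField.class).toPrintable());
    }
}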
To answer the question as asked, you need two classes (unless you use generated code or hacks):
class MyClass {
    static final boolean useRefCounts = my.global.Utils.useRefCounts();

    public static MyClass create() {
        return useRefCounts ? new MyClassPlus() : new MyClass();
    }

    void incrementRefCount() {
    }

    boolean decrementAndTestRefCount() {
        return false;
    }
}

class MyClassPlus extends MyClass {
    int refCount; // want instance field, not local variable

    @Override
    void incrementRefCount() {
        refCount++; // only use field if it exists
    }

    @Override
    boolean decrementAndTestRefCount() {
        return --refCount == 0;
    }
}
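A hedged usage sketch: because useRefCounts is fixed when the class is loaded, only one of the two classes is ever instantiated, so the JIT can typically devirtualize these calls.

MyClass obj = MyClass.create();   // a MyClassPlus when ref counting is enabled
obj.incrementRefCount();          // a no-op in the plain MyClass
if (obj.decrementAndTestRefCount()) {
    // ready to be collected and reused
}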
If you accept a slightly higher overhead in the case where you are using your ref count, you may resort to external storage, i.e.
class MyClass {
    static final WeakHashMap<MyClass, Integer> REF_COUNTS
        = my.global.Utils.useRefCounts() ? new WeakHashMap<>() : null;

    void incrementRefCount() {
        if (REF_COUNTS != null) REF_COUNTS.merge(this, 1, Integer::sum);
    }

    /** return true if ready to be collected and reused */
    boolean decrementAndTestRefCount() {
        return REF_COUNTS != null
            && REF_COUNTS.compute(this, (me, i) -> --i == 0 ? null : i) == null;
    }
}
There is a behavioral difference for the case that someone invokes decrementAndTestRefCount() more often than incrementRefCount(). While your original code silently runs into a negative ref count, this code will throw a NullPointerException. I prefer failing with an exception in this case…
The code above will leave you with the overhead of a single static field in case you’re not using the feature. Most JVMs should have no problems eliminating the conditionals regarding the state of a static final variable.
Note further that the code allows MyClass instances to get garbage collected while having a non-zero ref count, just like when it was an instance field, but also actively removes the mapping when the count reaches the initial state of zero again, to minimize the work needed for cleanup.
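A hedged usage sketch of this external-storage variant:

MyClass obj = new MyClass();
obj.incrementRefCount();               // REF_COUNTS now maps obj -> 1
if (obj.decrementAndTestRefCount()) {  // count back to zero, mapping removed
    // obj may be recycled, or simply left to the garbage collector
}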
While poking around the JDK 1.7 source I noticed these methods in Boolean.java:
public static Boolean valueOf(String s) {
    return toBoolean(s) ? TRUE : FALSE;
}

private static boolean toBoolean(String name) {
    return ((name != null) && name.equalsIgnoreCase("true"));
}
So valueOf() internally calls toBoolean(), which is fine. I did find it interesting to read how the toBoolean() method was implemented, namely:
1. the equalsIgnoreCase() call is reversed from what I would normally do (I would put the string literal first), and
2. there is a null check first, which seems redundant if point 1 were adopted, since calling equalsIgnoreCase() on the literal handles a null argument anyway.
So I thought I would put together a quick test and check how my implementation would work compared with the JDK one. Here it is:
import org.junit.Test;

public class BooleanTest {
    private final String[] booleans = {"false", "true", "null"};

    @Test
    public void testJdkToBoolean() {
        long start = System.currentTimeMillis();
        for (int i = 0; i < 1000000; i++) {
            for (String aBoolean : booleans) {
                Boolean someBoolean = Boolean.valueOf(aBoolean);
            }
        }
        long end = System.currentTimeMillis();
        System.out.println("JDK Boolean Runtime is: " + (end - start));
    }

    @Test
    public void testModifiedToBoolean() {
        long start = System.currentTimeMillis();
        for (int i = 0; i < 1000000; i++) {
            for (String aBoolean : booleans) {
                Boolean someBoolean = ModifiedBoolean.valueOf(aBoolean);
            }
        }
        long end = System.currentTimeMillis();
        System.out.println("ModifiedBoolean Runtime is: " + (end - start));
    }
}
class ModifiedBoolean {
    public static Boolean valueOf(String s) {
        return toBoolean(s) ? Boolean.TRUE : Boolean.FALSE;
    }

    private static boolean toBoolean(String name) {
        return "true".equalsIgnoreCase(name);
    }
}
Here is the result:
Running com.app.BooleanTest
JDK Boolean Runtime is: 37
ModifiedBoolean Runtime is: 34
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.128 sec
So not much of a gain, especially when distributed over 1m runs. Really not all that surprising.
What I would like to understand is how these differ at the bytecode level. I am interested in delving into this area but don't have any experience. Is this more work than is worth while? Would it provide a useful learning experience? Is this something people do on a regular basis?
There would be no performance gain for a couple of reasons:
It's just not that expensive of an operation to check whether or not name == null.
The thing that takes time is loading the value of name...which has to be loaded in either case.
name == null is faster than calling String.equalsIgnoreCase(), since it's a simple equality test rather than a method call.
These don't matter anyway, because the CPU will likely use branch prediction: if most of your calls aren't for null strings, it will start fetching the instructions of the non-null branch as if your strings were not null.
First, bytecode is very close to the Java source. It can't give you much more information about performance, except in some special cases (e.g., compile-time expression evaluation). Much more important is the JIT compilation done by the JVM.
Some background: in early Java versions, bytecode was more or less a machine-readable version of the source code, and decompiling it was rather straightforward; you would lose the comments, the code would come out slightly different, and the hardest part for the decompiler was probably reconstructing the loops. In today's Java versions, decompilers have to be somewhat more complex, because the language has changed (inner classes, generics, …) more than the bytecode has. But the bytecode is still very close to the source, even today.
Second, the redundant null check might not be important. The JVM is able to remove some unneeded checks, even the automatically generated array bounds checks, when they are provably unneeded.
Third, benchmarks are very tricky, and even more tricky on the JVM. The JVM "warms up", so the second benchmark might benefit from optimizations done for the first one. In some cases the opposite can also happen: some optimistic optimisation must be discarded and the second benchmark runs slower. Moreover, running the code only once creates a huge error in the results.
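For this reason, if you want trustworthy numbers, use a harness such as JMH, which takes care of warm-up, forking and statistical reporting. A minimal sketch, assuming the org.openjdk.jmh:jmh-core dependency and its annotation processor are set up:

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;

@State(Scope.Thread)
public class BooleanBenchmark {
    private final String[] booleans = {"false", "true", "null"};

    @Benchmark
    public int jdkValueOf() {
        int acc = 0;
        for (String s : booleans) {
            // Consume the result so dead-code elimination can't remove the call.
            acc += Boolean.valueOf(s) ? 1 : 0;
        }
        return acc;
    }
}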
Assume that we have a given interface:
public interface StateKeeper {
    public abstract void negateWithoutCheck();
    public abstract void negateWithCheck();
}
and the following implementations:
class StateKeeperForPrimitives implements StateKeeper {
    private boolean b = true;

    public void negateWithCheck() {
        if (b == true) {
            this.b = false;
        }
    }

    public void negateWithoutCheck() {
        this.b = false;
    }
}

class StateKeeperForObjects implements StateKeeper {
    private Boolean b = true;

    @Override
    public void negateWithCheck() {
        if (b == true) {
            this.b = false;
        }
    }

    @Override
    public void negateWithoutCheck() {
        this.b = false;
    }
}
Moreover, assume that the negate*Check() methods can be called once or many times, and it is hard to say what the upper bound on the number of calls is.
The question is: which method in both implementations is 'better'
with respect to execution speed, garbage collection, memory allocation, etc.,
negateWithCheck or negateWithoutCheck?
Does the answer depend on which of the two proposed
implementations we use, or doesn't it matter?
Does the answer depend on the estimated number of calls? For what number of calls is it better to use one method or the other?
There might be a slight performance benefit in using the one with the check. I highly doubt that it matters in any real life application.
premature optimization is the root of all evil (Donald Knuth)
You could measure the difference between the two. Let me emphasize that these kind of things are notoriously difficult to measure reliably.
Here is a simple-minded way to do this. You can hope for performance benefits if the check recognizes that the value doesn't have to be changed, saving you an expensive write to memory. So I have changed your code accordingly:
interface StateKeeper {
    public abstract void negateWithoutCheck();
    public abstract void negateWithCheck();
}

class StateKeeperForPrimitives implements StateKeeper {
    private boolean b = true;

    public void negateWithCheck() {
        if (b == false) {
            this.b = true;
        }
    }

    public void negateWithoutCheck() {
        this.b = true;
    }
}

class StateKeeperForObjects implements StateKeeper {
    private Boolean b = true;

    public void negateWithCheck() {
        if (b == false) {
            this.b = true;
        }
    }

    public void negateWithoutCheck() {
        this.b = true;
    }
}

public class Main {
    public static void main(String args[]) {
        StateKeeper[] array = new StateKeeper[10_000_000];

        for (int i = 0; i < array.length; ++i)
            //array[i] = new StateKeeperForObjects();
            array[i] = new StateKeeperForPrimitives();

        long start = System.nanoTime();
        for (StateKeeper e : array)
            e.negateWithCheck();
            //e.negateWithoutCheck();
        long end = System.nanoTime();

        System.err.println("Time in milliseconds: " + ((end - start) / 1000000));
    }
}
I get the following:

            check    no check
primitive   17 ms    24 ms
Object      21 ms    24 ms
Measured the other way around, when the check is always superfluous because the value always has to be changed, I didn't find any performance penalty for the check either.
Two things: (1) These timings are unreliable. (2) This benchmark is far from any real life application; I had to make an array of 10 million elements to actually see something.
I would simply pick the function with no check. I highly doubt that in any real application you would get any measurable performance benefit from the function that has the check, but that check is error-prone and harder to read.
Short answer: the version without the check will always be faster.
An assignment takes a lot less computation time than a comparison. Therefore, an if statement is always slower than a plain assignment.
When comparing 2 variables, your CPU will fetch the first variable, fetch the second variable, compare the two, and store the result in a temporary register. That's 2 fetches, 1 compare, and 1 store.
When you assign a value, your CPU will fetch the value on the right-hand side of the '=' and store it in memory. That's 1 fetch and 1 store.
In general, if you need to set some state, just set the state. If, on the other hand, you have to do something more, like log the change, inform about the change, etc., then you should first inspect the old value.
But in the case when methods like the ones you provided are called very frequently, there may be some performance difference between checking and not checking (whether the new value is different). The possible outcomes are:
1-a) check returns false
1-b) check returns true, value is assigned
2) value is assigned without check
As far as I know, writing is always slower than reading (all the way down to the register level), so the fastest outcome is 1-a. If, in your case, the most common outcome is that the value is not changed ('more than 50%' logic is just not good enough; the exact percentage has to be determined empirically), then you should go with checking, as this eliminates the redundant write (the value assignment). If, on the other hand, the value differs more often than not, assign it without checking.
You should test your concrete cases, do some profiling, and based on the result determine the best implementation. There is no general "best way" for this case (apart from "just set the state").
As for boolean vs Boolean here, I would say (off the top of my head) that there should be no performance difference.
Just today I've seen a few answers and comments repeating that
Premature optimization is the root of all evil
Well, obviously one more if statement is one more thing to do, but... it doesn't really matter.
And garbage collection and memory allocation... not an issue here.
I would generally expect negateWithCheck to be slightly slower, due to there always being a comparison. Also notice that in StateKeeperForObjects you are introducing some autoboxing; 'true' and 'false' are primitive boolean values.
Assuming you fix StateKeeperForObjects to use objects throughout, then potentially, but most likely not noticeably.
The speed will depend slightly on the number of calls, but in general the speed should be considered the same whether you call it once or many times (ignoring secondary effects such as caching, JIT, etc.).
It seems to me, a better question is whether or not the performance difference is noticeable. I work on a scientific project that involves millions of numerical computations done in parallel. We started off using Objects (e.g. Integer, Double) and had less than desirable performance, both in terms of memory and speed. When we switched all of our computations to primitives (e.g. int, double) and went over the code to make sure we were not introducing anything funky through autoboxing, we saw a huge performance increase (both memory and speed).
I am a huge fan of avoiding premature optimization, unless it is something that is "simple" to implement. Just be wary of the consequences. For example, do you have to represent null values in your data model? If so, how do you do that using a primitive? Doubles can be done easily with NaN, but what about Booleans?
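To make that last point concrete, here is one hedged way to represent "missing" values with primitives (the encoding below is my own illustration, not from the project described above):

final class Row {
    static final byte BOOL_NULL = -1, BOOL_FALSE = 0, BOOL_TRUE = 1;

    double value = Double.NaN; // NaN marks a missing double
    byte flag = BOOL_NULL;     // a three-valued byte replaces a nullable Boolean

    boolean hasValue()   { return !Double.isNaN(value); }
    boolean hasFlag()    { return flag != BOOL_NULL; }
    boolean flagIsTrue() { return flag == BOOL_TRUE; }
}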
negateWithoutCheck() is preferable because it performs only one operation, the assignment this.b = false, whereas negateWithCheck() performs an extra comparison on top of that.
I've written a few multithreaded hobby programs, and some during my previous (engineering/physics) studies as well, so I consider myself to have above-beginner knowledge in the areas of synchronization/thread safety and primitives, and in what the average user finds challenging about the JMM and multiple threads.
What I find I need, and what there is no proper method for, is marking instance or static members of classes as shared between different threads. Think about it: we have access rules such as private/protected/public, and we have conventions on how to name getters/setters, and a lot of other things.
But what about threading? What if I want to mark a variable as thread-shared and have it follow certain rules? Volatile/atomic references might do the job, but sometimes you really do need to use mutexes. And when you have to remember to use something manually... you will forget about it at some point :).
So I had an idea, and I see I am not the first; I also checked out http://checkthread.org/example-threadsafe.html. They seem to have a pretty decent code analyzer, which I might try later; it sort of lets me do some of the things I want.
But coming back to the initial problem: let's say we need something a little more low-level than a message-passing framework and something a little more high-level than primitive mutexes... What do we have? Well... nothing?
So basically, what I've made is a sort of pure-Java, super-simple framework for threading that lets you declare class members as shared or non-shared... well, sort of :).
Below is an example of how it could be used:
public class SimClient extends AbstractLooper {
    private static final int DEFAULT_HEARTBEAT_TIMEOUT_MILLIS = 2000;

    // Accessed by single threads only
    private final SocketAddress socketAddress;
    private final Parser parser;
    private final Callback cb;
    private final Heart heart;
    private boolean lookingForFirstMsg = true;
    private BufferedInputStream is;

    // May be accessed by several threads (T*)
    private final Shared<AllThreadsVars> shared = new Shared<>(new AllThreadsVars());

    .
    .
    .
    .

    static class AllThreadsVars {
        public boolean connected = false;
        public Socket socket = new Socket();
        public BufferedOutputStream os = null;
        public long lastMessageAt = 0;
    }
And to access the variables marked as thread-shared, you must send a runnable-like functor to the Shared object:
public final void transmit(final byte[] data) {
    shared.run(new SharedRunnable<AllThreadsVars, Object, Object>() {
        @Override
        public Object run(final AllThreadsVars sharedVariable, final Object input) {
            try {
                if (sharedVariable.socket.isConnected() && sharedVariable.os != null) {
                    sharedVariable.os.write(data);
                    sharedVariable.os.flush();
                }
            } catch (final Exception e) { // Disconnected
                setLastMessageAt(0);
            }
            return null;
        }
    }, null);
}
Where a shared runnable is defined like:
public interface SharedRunnable<SHARED_TYPE, INPUT, OUTPUT> {
    OUTPUT run(final SHARED_TYPE s, final INPUT input);
}
Where is this going?
Well, this gives me real help (yes, you can leak the shared state out and break it, but that is far less likely to happen by accident): I can mark variable sets (not just single variables) as thread-shared and, once that is done, have it guaranteed at compile time (I cannot forget to synchronize some method). It also allows me to standardize and perform tests that look for possible deadlocks, also at compile time (though at the moment I have only implemented this at runtime, because doing it at compile time with the above framework would probably require more than just the Java compiler).
Basically, this is extremely useful to me, and I'm wondering whether I'm just reinventing the wheel here or whether this could be some anti-pattern I don't know of. And I really don't know whom to ask. (Oh yeah, and Shared.run(SharedRunnable r, INPUT input) works just like
private final <OUTPUT, INPUT> OUTPUT run(final SharedRunnable<SHARED_TYPE, INPUT, OUTPUT> r, final INPUT input) {
    lock.lock(); // acquire before the try block, so unlock() never runs without the lock held
    try {
        return r.run(sharedVariable, input);
    } finally {
        lock.unlock();
    }
}
This is just my own experimentation so it's not really finished by any means, but I have one decent project using it right now and it's really helping out a lot.
You mean something like this? (Which can be enforced by tools like FindBugs.)
If you have values which should be shared, the best approach is to encapsulate this within the class. This way the caller does not need to know what threading model you are using. If you want to know what model is used internally, you can read the source; the caller, however, cannot forget to access a ConcurrentMap (for example) correctly, because all its methods are thread-safe.
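A minimal sketch of that encapsulation idea (the class and its methods are illustrative, not from your code):

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public final class HitCounter {
    // The map and the locking policy stay private, so callers cannot misuse them.
    private final ConcurrentMap<String, Long> counts = new ConcurrentHashMap<>();

    public void increment(String key) {
        counts.merge(key, 1L, Long::sum); // atomic read-modify-write
    }

    public long get(String key) {
        return counts.getOrDefault(key, 0L);
    }
}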
Our server has recently been going down a lot, and I was tasked with improving the memory usage of a set of classes identified as the culprit.
I have code which initializes an instance of an object and goes like this:
boolean var1;
boolean var2;
.
.
.
boolean var100;

void setup() {
    var1 = map.hasFlag("var1");
    var2 = map.hasFlag("var2");
    .
    .
    .
    if (map.hasFlag("some flag")) {
        doSomething();
    }
    if (var1) {
        // increment something
    }
    if (var2) {
        // increment something
    }
}
The setup code takes about 1300 lines. My question is whether this method can be made more efficient, given that it uses so many instance variables.
The instance variables, by the way, are used in a "main" method handleRow(), where for example:
void handleRow() {
    if (var1) {
        doSomething();
    }
    .
    .
    .
    if (var100) {
        doSomething();
    }
}
One solution I am considering is to change the implementation by removing the instance variables set up in the setup method and instead querying the map directly whenever I need a flag:
void handleRow() {
    if (map.hasFlag("var1")) {
        doSomething();
    }
    .
    .
    .
    if (map.hasFlag("var100")) {
        doSomething();
    }
}
That's one solution I am considering but I would like to hear the inputs of the community. :)
If these are really all boolean variables, consider using a BitSet instead. You may find that reduces the memory footprint by a factor of 8 or possibly even 32 depending on padding.
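A hedged sketch of that idea (FlagMap and the index constants are illustrative, not from the original code):

import java.util.BitSet;

class RowHandler {
    // One bit per flag instead of one boolean field per flag.
    private static final int VAR1 = 0, VAR2 = 1; // ... up to VAR100 = 99
    private final BitSet flags = new BitSet(100);

    void setup(FlagMap map) {
        flags.set(VAR1, map.hasFlag("var1"));
        flags.set(VAR2, map.hasFlag("var2"));
        // ...
    }

    void handleRow() {
        if (flags.get(VAR1)) {
            // doSomething();
        }
    }
}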
100 boolean variables will take 1.6 kB of memory if every boolean with overhead takes 16 bytes (which is a bit much, imho). I do not think this will be the source of the problem.
Replacing these flags with calls into the map will negatively impact performance, so your change will probably make things worse.
Before you go redesigning your code (a command pattern looks like a good candidate), you should look further into where the memory leak you were asked to solve actually is.
Look for maps that the classes keep adding to, collections held in static variables, etc. Once you find out where the reason for the memory growth lies, you can decide which part of your classes to refactor.
You could save memory at the cost of time (though if your memory use is a real problem, it's probably a net gain in time overall) by storing the values in a bit set.
If the class is immutable (once you create it, you never change it), then you can perhaps gain by using a variant of the Flyweight pattern: you keep a store of in-use objects in a weak hash map and create your objects through a factory. If you ask for an object that is identical to an existing one, the factory returns the previous object instead. The saving in memory can be negligible or massive, depending on how many repeated objects there are.
If the class is not immutable but there is such repetition, you can still use the Flyweight pattern, but you will have to do a sort of copy-on-write, where altering an object makes it switch from the shared internal representation to one of its own (or a new one from the flyweight store). This is yet more complicated and more expensive in terms of time, but again, where it's appropriate, the savings can be great.
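A minimal sketch of the immutable flyweight factory described above (the class name and fields are illustrative):

import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.Objects;
import java.util.WeakHashMap;

final class FlagSet {
    private final boolean a, b, c;

    // Weak keys (and weak values) let pooled instances be collected
    // once nothing outside the pool references them.
    private static final Map<FlagSet, WeakReference<FlagSet>> POOL = new WeakHashMap<>();

    private FlagSet(boolean a, boolean b, boolean c) {
        this.a = a; this.b = b; this.c = c;
    }

    static synchronized FlagSet of(boolean a, boolean b, boolean c) {
        FlagSet candidate = new FlagSet(a, b, c);
        WeakReference<FlagSet> ref = POOL.get(candidate);
        FlagSet cached = (ref == null) ? null : ref.get();
        if (cached != null) return cached; // reuse the identical instance
        POOL.put(candidate, new WeakReference<>(candidate));
        return candidate;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof FlagSet)) return false;
        FlagSet f = (FlagSet) o;
        return a == f.a && b == f.b && c == f.c;
    }

    @Override
    public int hashCode() {
        return Objects.hash(a, b, c);
    }
}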
You can use the Command pattern:
import java.util.HashMap;
import java.util.Map;

public enum Command {
    SAMPLE_FLAG1("FLAG1") {
        @Override
        public void call() {
            // do your increment here
        }
    },
    SAMPLE_FLAG2("FLAG2") {
        @Override
        public void call() {
            // do your increment here
        }
    };

    // The lookup map must be static: it is shared by all constants.
    private static final Map<String, Command> commands = new HashMap<>();

    static {
        for (Command cmd : Command.values()) {
            commands.put(cmd.name, cmd);
        }
    }

    private final String name;

    private Command(String name) {
        this.name = name;
    }

    public static Command fromString(String cmd) {
        return commands.get(cmd);
    }

    public abstract void call();
}
and then:
for (String flag : flagMap.keySet()) {
    Command.fromString(flag).call();
}