I wrote two methods to compare their performance:
public class Test1 {
private String value;
public void notNull(){
if( value != null) {
//do something
}
}
public void nullNot(){
if( null != value) {
//do something
}
}
}
and checked the bytecode after compiling:
public void notNull();
Code:
Stack=1, Locals=1, Args_size=1
0: aload_0
1: getfield #2; //Field value:Ljava/lang/String;
4: ifnull 7
7: return
LineNumberTable:
line 6: 0
line 9: 7
StackMapTable: number_of_entries = 1
frame_type = 7 /* same */
public void nullNot();
Code:
Stack=2, Locals=1, Args_size=1
0: aconst_null
1: aload_0
2: getfield #2; //Field value:Ljava/lang/String;
5: if_acmpeq 8
8: return
LineNumberTable:
line 12: 0
line 15: 8
StackMapTable: number_of_entries = 1
frame_type = 8 /* same */
}
Here two different opcodes are used to implement the if condition: the first case uses ifnull (checks whether the top value of the stack is null), and the second case uses if_acmpeq (checks whether the top two values on the stack are equal).
So, will this have an effect on performance?
(This would help me show that the first form of the null check is good in terms of performance as well as readability :) )
Comparing the generated bytecodes is mostly meaningless, since most of the optimization happens in run time with the JIT compiler. I'm going to guess that in this case, either expression is equally fast. If there's any difference, it's negligible.
This is not something that you need to worry about. Look for big picture optimizations.
Don't optimize at the expense of readability if the speed (or memory/whatever the case may be) gain will be negligible. I think !=null is generally more readable, so use that.
With questions like this, it's hard to know how smart the JVM will be (though the answer is "usually pretty smart if possible" and it looks very possible in this case). But just to be sure, test it:
class Nullcheck {
public static class Fooble { }
Fooble[] foo = {null , new Fooble(), null , null,
new Fooble(), null, null, new Fooble() };
public int testFirst() {
int sum = 0;
for (int i=0 ; i<1000000000 ; i++) if (foo[i&0x7] != null) sum++;
return sum;
}
public int testSecond() {
int sum = 0;
for (int i=0 ; i<1000000000 ; i++) if (null != foo[i&0x7]) sum++;
return sum;
}
public void run() {
long t0 = System.nanoTime();
int s1 = testFirst();
long t1 = System.nanoTime();
int s2 = testSecond();
long t2 = System.nanoTime();
System.out.printf("Difference=%d; %.3f vs. %.3f ns/loop (diff=%.3f)\n",
s2-s1,(t1-t0)*1e-9,(t2-t1)*1e-9,(t0+t2-2*t1)*1e-9);
}
public static void main(String[] args) {
Nullcheck me = new Nullcheck();
for (int i=0 ; i<5 ; i++) me.run();
}
}
And on my machine this yields:
Difference=0; 2.574 vs. 2.583 ns/loop (diff=0.008)
Difference=0; 2.574 vs. 2.573 ns/loop (diff=-0.001)
Difference=0; 1.584 vs. 1.582 ns/loop (diff=-0.003)
Difference=0; 1.582 vs. 1.584 ns/loop (diff=0.002)
Difference=0; 1.582 vs. 1.582 ns/loop (diff=0.000)
So the answer is: no, no meaningful difference at all. (And the JIT compiler can find extra tricks to speed each up after the same number of repeat runs.)
Update: The code above runs an ad-hoc benchmark. Using JMH (now that it exists!) is a good way to help avoid (some) microbenchmarking pitfalls. The code above avoids the worst pitfalls but it doesn't give explicit error estimates and ignores various other things that sometimes matter. These days: use JMH! Also, when in doubt, run your own benchmarks. Details sometimes matter — not very often for something as straightforward as this, but if it is really important to you you should check in a condition as close to production as you can manage.
Apart from the hard-earned wisdom of avoiding accidental assignment in C, which favors putting the constant on the left of the binary operator, I find the constant on the left to be more readable because it puts the crucial value in the most prominent position.
Usually a function body will use only a few variables, and it's usually apparent by way of context which variable is under inspection. By putting the constant on the left, we more closely mimic switch and case: given this variable, select a matching value. Seeing the value on the left, one focuses on the particular condition being selected.
When I scan
if (var == null)
I read it as, "We're inspecting var here, and we're comparing it for equality, against ... ah, null." Conversely, when I scan
if (null == var)
I think, "We're seeing if a value is null, and ... yes, it's var we're inspecting." It's an even stronger recognition with
if (null != var)
which my eye just picks up on immediately.
This intuition comes from consistency of habit, preferring to read what one writes, and writing what one prefers to read. One can learn it either way, but it's not objectively true as others have answered here that putting the variable on the left is clearer. It depends on what aspect of the expression one wants to be most clear first.
Seeing the bytecode difference was fascinating. Thanks for sharing that.
The difference will be negligible, so go with what's most readable (!= null, in my opinion).
I'd stick with (value != null) for readability. But you can always use Assertions.
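As a minimal sketch of the assertion idea, assuming a hypothetical class named Greeter (not from the question): assert is only evaluated when the JVM runs with -ea, while Objects.requireNonNull is always enforced.

```java
import java.util.Objects;

// Hypothetical example class; the name and fields are illustrative only.
class Greeter {
    private final String name;

    Greeter(String name) {
        // Always enforced, regardless of JVM flags.
        this.name = Objects.requireNonNull(name, "name must not be null");
    }

    int nameLength() {
        // Only evaluated when assertions are enabled (java -ea).
        assert name != null : "name must not be null";
        return name.length();
    }

    public static void main(String[] args) {
        System.out.println(new Greeter("Ada").nameLength());
    }
}
```

Either way the check reads as value != null, with the variable first.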
Minute optimization like that is the job of the compiler, especially in high-level languages like Java.
Although strictly it's not relevant here, don't optimize prematurely!
From a performance point of view, there is no significant difference.
However, writing null first can be useful for catching typos.
For example, if you are used to writing this code:
if (obj == null)
it could be written by mistake as:
if (obj = null)
To a C or C++ compiler this is fine (in Java it compiles only when obj is a Boolean).
However, if you are used to writing the code as:
if (null == obj)
and make the mistake of writing:
if (null = obj)
the compiler will tell you that you made a mistake on that line.
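In Java specifically, the accidental assignment is only accepted when the variable is a boolean or Boolean, because an if condition must have boolean type. The sketch below uses a hypothetical class AssignmentTypo to show the one case that slips through.

```java
// Hypothetical demo class; names are illustrative only.
class AssignmentTypo {
    // The typo '=' assigns instead of comparing. This compiles only
    // because the variable is a Boolean, which is unboxed for the if.
    static boolean typoWithBoolean() {
        Boolean flag = Boolean.FALSE;
        if (flag = Boolean.TRUE) { // assignment, not comparison
            return true;
        }
        return false;
    }

    // For any other reference type, 'if (obj = null)' is rejected by javac,
    // so both orders of the comparison are equally safe.
    static boolean isNull(Object obj) {
        return null == obj;
    }

    public static void main(String[] args) {
        System.out.println(typoWithBoolean());
        System.out.println(isNull("x"));
    }
}
```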
Putting null first seems to generate an extra bytecode instruction, but aside from that there may not be a performance difference.
Personally, I wouldn't worry about performance until it's time to worry about performance.
I would use the notNull() approach, just so you don't throw a compiler error if you forget the ! and accidentally type null = value.
Oh, if you ask for ultimate performance, don't create additional class or methods. Even static methods would take a bit of time as the Java class loader needs to JIT load it.
So, whenever you need to check if a variable is null, you just test it by either
if (x == null)
or
if (null == x)
Frankly I reckon the performance bonus to pick one of the two is easily offset by the overhead of introducing unnecessary methods.
You can ignore this kind of very minute optimisation while coding.
As you can see, the performance difference is very small. Don't worry about the small things; it is always better to focus on the algorithm. And obviously readability is a factor.
I would use the "new" Java 8 feature, Optional. Here are several examples:
import java.util.Optional;
public class SillyExample {
public void processWithValidation(final String sampleStringParameter){
final String sampleString = Optional.ofNullable(sampleStringParameter).orElseThrow(() -> new IllegalArgumentException("String must not be null"));
//Do what you want with sampleString
}
public void processIfPressent(final String sampleStringParameter){
Optional.ofNullable(sampleStringParameter).ifPresent(sampleString -> {
//Do what you want with sampleString
});
}
public void processIfPressentWithFilter(final String sampleStringParameter){
Optional.ofNullable(sampleStringParameter).filter("hello"::equalsIgnoreCase).ifPresent(sampleString -> {
//Do what you want with sampleString
});
}
}
In Java 8, two additional methods were added to the Objects class:
Objects#nonNull and Objects#isNull, which you can use to replace null checks. An interesting thing is that both of them put the object first:
public static boolean isNull(Object obj) {
return obj == null;
}
and
public static boolean nonNull(Object obj) {
return obj != null;
}
respectively. I guess this means that it is the recommended way (at least the core JDK developers used that approach).
Objects source code
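These two methods are most convenient as method references in stream pipelines. A minimal sketch, assuming a hypothetical helper class NonNullFilter:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

// Hypothetical helper; the class and method names are illustrative only.
class NonNullFilter {
    // Returns a copy of the input without its null elements.
    static List<String> dropNulls(List<String> input) {
        return input.stream()
                .filter(Objects::nonNull) // same as s -> s != null
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(dropNulls(Arrays.asList("a", null, "b")));
    }
}
```

For a plain if statement, writing obj != null directly is just as clear.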
I would prefer null != object as it makes clearly visible that it's just for null check.
Byte code is just a simple translation of the source code.
Java's assert mechanism allows putting in assertions that have essentially no run-time cost (aside from a bigger class file) when assertions are disabled. But this may not cover all situations.
For instance, many of Java's collections feature "fail-fast" iterators that attempt to detect when you're using them in a thread-unsafe way. But this requires both the collection and the iterator itself to maintain extra state that would not be needed if these checks weren't there.
Suppose someone wanted to do something similar, but allow the checks to be disabled and if they are disabled, it saves a few bytes in the iterator and likewise a few more bytes in the ArrayList, or whatever.
Alternatively, suppose we're doing some sort of object pooling that we want to be able to turn on and off at runtime; when it's off, it should just use Java's garbage collection and take no room for reference counts, like this (note that the code as written is very broken):
class MyClass {
static final boolean useRefCounts = my.global.Utils.useRefCounts();
static {
if(useRefCounts)
int refCount; // want instance field, not local variable
}
void incrementRefCount(){
if(useRefCounts) refCount++; // only use field if it exists;
}
/**return true if ready to be collected and reused*/
boolean decrementAndTestRefCount(){
// rely on Java's garbage collector if ref counting is disabled.
return useRefCounts && --refCount == 0;
}
}
The trouble with the above code is that the static block makes no sense. But is there some trick using low-powered magic to make something along these lines work? (If high-powered magic is allowed, the nuclear option is to generate two versions of MyClass and arrange to put the correct one on the class path at start time.)
NOTE: You might not need to do this at all. The JIT is very good at inlining constants known at runtime especially boolean and optimising away the code which isn't used.
The int field is not ideal; however, if you are using a 64-bit JVM, the object size might not change.
On the OpenJDK/Oracle JVM (64-bit), the header is 12 bytes by default. Object alignment is 8 bytes, so the object will use 16 bytes. The field adds 4 bytes, which after alignment still gives 16 bytes.
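The arithmetic above can be sketched as follows. This is only an estimate under the stated assumptions (12-byte header, 8-byte alignment); the class ObjectSizeEstimate is hypothetical and ignores field layout details that a real JVM may apply.

```java
// Hypothetical estimator; it only reproduces the rounding arithmetic.
class ObjectSizeEstimate {
    // Rounds size up to the next multiple of alignment.
    static long alignUp(long size, long alignment) {
        return (size + alignment - 1) / alignment * alignment;
    }

    // Header plus fields, rounded up to the object alignment.
    static long estimate(long headerBytes, long fieldBytes, long alignment) {
        return alignUp(headerBytes + fieldBytes, alignment);
    }

    public static void main(String[] args) {
        System.out.println("no fields:     " + estimate(12, 0, 8));
        System.out.println("one int field: " + estimate(12, 4, 8));
    }
}
```

Under these assumptions the 4-byte field fits into the padding that the empty object already carries.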
To answer the question, you need two classes (unless you use generated code or hacks)
class MyClass {
static final boolean useRefCounts = my.global.Utils.useRefCounts();
public static MyClass create() {
return useRefCounts ? new MyClassPlus() : new MyClass();
}
void incrementRefCount() {
}
boolean decrementAndTestRefCount() {
return false;
}
}
class MyClassPlus extends MyClass {
int refCount; // want instance field, not local variable
void incrementRefCount() {
refCount++; // only use field if it exists;
}
boolean decrementAndTestRefCount() {
return --refCount == 0;
}
}
If you accept a slightly higher overhead in the case you’re using your ref count, you may resort to external storage, i.e.
class MyClass {
static final WeakHashMap<MyClass,Integer> REF_COUNTS
= my.global.Utils.useRefCounts()? new WeakHashMap<>(): null;
void incrementRefCount() {
if(REF_COUNTS != null) REF_COUNTS.merge(this, 1, Integer::sum);
}
/**return true if ready to be collected and reused*/
boolean decrementAndTestRefCount() {
return REF_COUNTS != null
&& REF_COUNTS.compute(this, (me, i) -> --i == 0? null: i) == null;
}
}
There is a behavioral difference for the case that someone invokes decrementAndTestRefCount() more often than incrementRefCount(). While your original code silently runs into a negative ref count, this code will throw a NullPointerException. I prefer failing with an exception in this case…
The code above will leave you with the overhead of a single static field in case you’re not using the feature. Most JVMs should have no problems eliminating the conditionals regarding the state of a static final variable.
Note further that the code allows MyClass instances to get garbage collected while having a non-zero ref count, just like when it was an instance field, but also actively removes the mapping when the count reaches the initial state of zero again, to minimize the work needed for cleanup.
While poking around the JDK 1.7 source I noticed these methods in Boolean.java:
public static Boolean valueOf(String s) {
return toBoolean(s) ? TRUE : FALSE;
}
private static boolean toBoolean(String name) {
return ((name != null) && name.equalsIgnoreCase("true"));
}
So valueOf() internally calls toBoolean(), which is fine. I did find it interesting to read how the toBoolean() method was implemented, namely:
1. equalsIgnoreCase() is reversed from what I would normally do (put the string literal first), and then
2. there is a null check first. This seems redundant if point 1 were adopted, as the first/second check in that method is a null check.
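The two styles can be compared directly, since String.equalsIgnoreCase returns false for a null argument. A minimal sketch, assuming a hypothetical class NullSafeEquals that mirrors both variants:

```java
// Hypothetical comparison class; only the two method bodies matter.
class NullSafeEquals {
    // JDK 1.7 style: explicit null check, then call on the variable.
    static boolean jdkStyle(String name) {
        return (name != null) && name.equalsIgnoreCase("true");
    }

    // Literal-first style: equalsIgnoreCase(null) simply returns false.
    static boolean literalFirst(String name) {
        return "true".equalsIgnoreCase(name);
    }

    public static void main(String[] args) {
        String[] inputs = {null, "true", "TRUE", "false", "null"};
        for (String s : inputs) {
            System.out.println(s + " -> " + jdkStyle(s) + " / " + literalFirst(s));
        }
    }
}
```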
So I thought I would put together a quick test and check how my implementation would work compared with the JDK one. Here it is:
public class BooleanTest {
private final String[] booleans = {"false", "true", "null"};
@Test
public void testJdkToBoolean() {
long start = System.currentTimeMillis();
for (int i = 0; i < 1000000; i++) {
for (String aBoolean : booleans) {
Boolean someBoolean = Boolean.valueOf(aBoolean);
}
}
long end = System.currentTimeMillis();
System.out.println("JDK Boolean Runtime is: " + (end-start));
}
@Test
public void testModifiedToBoolean() {
long start = System.currentTimeMillis();
for (int i = 0; i < 1000000; i++) {
for (String aBoolean : booleans) {
Boolean someBoolean = ModifiedBoolean.valueOf(aBoolean);
}
}
long end = System.currentTimeMillis();
System.out.println("ModifiedBoolean Runtime is: " + (end-start));
}
}
class ModifiedBoolean {
public static Boolean valueOf(String s) {
return toBoolean(s) ? Boolean.TRUE : Boolean.FALSE;
}
private static boolean toBoolean(String name) {
return "true".equalsIgnoreCase(name);
}
}
Here is the result:
Running com.app.BooleanTest
JDK Boolean Runtime is: 37
ModifiedBoolean Runtime is: 34
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.128 sec
So not much of a gain, especially when distributed over 1m runs. Really not all that surprising.
What I would like to understand is how these differ at the bytecode level. I am interested in delving into this area but don't have any experience. Is this more work than is worth while? Would it provide a useful learning experience? Is this something people do on a regular basis?
There would be no performance gain for a couple of reasons:
It's just not that expensive of an operation to check whether or not name == null.
The thing that takes time is loading the value of name...which has to be loaded in either case.
name==null is faster than calling String.equalsIgnoreCase since it's a simple equality test rather than a function call.
These don't matter anyway because the CPU will likely use branch prediction, so if most of your calls aren't for null strings, it will start executing the branch speculatively as if your strings are not null.
First, bytecode is very close to Java source. It can't give you much more information about the performance except some special cases (e.g. compile-time expression evaluation). Much more important is JIT compilation done by the JVM.
Some background: In early Java versions, bytecode was essentially a machine-readable version of the source code. Decompiling such early Java versions is rather straightforward. You will lose comments and the code will be slightly different. The hardest work for such a decompiler is probably reconstructing the loops. In today's Java versions, decompilers have to be slightly more complex, because the language has changed (inner classes, generics, …) more than the bytecode. But the bytecode is still very close to the source, even today.
Second, the redundant null check might not be important. The JVM is able to remove some unneeded checks, even the automatically generated array bounds checks if they are provably unneeded.
Third, benchmarks are very tricky, and even more tricky on the JVM. The JVM "warms up", so the second benchmark might benefit from some optimizations done for the first benchmark. In some cases, the opposite might also happen: some optimistic optimisation must be discarded and the second benchmark is slower. Moreover, running the code only once creates a huge error in the results.
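To make the warm-up and repetition points concrete, here is a rough sketch of a hand-rolled timing loop. The class TinyBench and its methods are hypothetical, and this is not a substitute for a proper harness such as JMH; it only shows warm-up rounds followed by several measured rounds instead of a single run.

```java
import java.util.Arrays;
import java.util.function.IntSupplier;

// Hypothetical mini harness; illustrative only.
class TinyBench {
    // Written on every run so the JIT cannot discard the task's result.
    static volatile int sink;

    // Runs the task warmupRounds times unmeasured, then returns one
    // duration in nanoseconds for each of the measuredRounds runs.
    static long[] measure(IntSupplier task, int warmupRounds, int measuredRounds) {
        for (int i = 0; i < warmupRounds; i++) {
            sink = task.getAsInt();
        }
        long[] nanos = new long[measuredRounds];
        for (int i = 0; i < measuredRounds; i++) {
            long start = System.nanoTime();
            sink = task.getAsInt();
            nanos[i] = System.nanoTime() - start;
        }
        return nanos;
    }

    // Example workload: counts non-null slots while cycling over the array.
    static int countNonNull(Object[] items, int iterations) {
        int sum = 0;
        for (int i = 0; i < iterations; i++) {
            if (items[i % items.length] != null) sum++;
        }
        return sum;
    }

    public static void main(String[] args) {
        Object[] items = {null, new Object(), null, new Object()};
        long[] nanos = measure(() -> countNonNull(items, 1_000_000), 10, 20);
        Arrays.sort(nanos);
        System.out.println("min ns: " + nanos[0] + ", median ns: " + nanos[nanos.length / 2]);
    }
}
```

Reporting the minimum and median over many rounds gives at least a crude sense of the spread, which a single run cannot.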
For the following piece of code, sonarqube computes the method cyclomatic complexity as 9
String foo() {
if (cond1) return a;
if (cond2) return b;
if (cond3) return c;
if (cond4) return d;
return e;
}
I understand as per the rules for computation http://docs.sonarqube.org/display/SONAR/Metrics+-+Complexity the complexity of 9 is correct.
So complexity of the method is = 4 (if) + 4 (return) + 1 (method) = 9
This complexity can be reduced, if I have a single exit point.
String foo() {
String temp;
if (cond1) {
temp = a;
} else if (cond2) {
temp = b;
} else if (cond3) {
temp = c;
} else if (cond4) {
temp = d;
} else {
temp = e;
}
return temp;
}
I believe this code is more cluttered and unreadable than the previous version and I feel having methods with return on guard conditions is a better programming practice. So is there a good reason why return statement is considered for computation of cyclomatic complexity? Can the logic for computation be changed so that it doesn't promote single exit point.
I agree you should use some common sense and go with the code which you believe is simplest.
BTW, you can simplify your code and have just one return if you use ? :
String foo() {
return cond1 ? a :
cond2 ? b :
cond3 ? c :
cond4 ? d : e;
}
"So is there a good reason why return statement is considered for
computation of cyclomatic complexity? Can the logic for computation be
changed so that it doesn't promote single exit point."
In your example, having multiple returns doesn't add to the complexity, and as @Peter Lawrey says, you should employ common sense.
Does this mean that multiple return statements never add to complexity and the rule should be removed? I don't think so. It would be very easy to come up with an example of a method that is hard to read because of multiple return statements. Just imagine a 100-line method with 4 different return statements sprinkled throughout. That is the kind of issue this rule tries to catch.
This is a known problem with cyclomatic complexity.
Also there is good reason to think that cyclomatic complexity is useless. It correlates strongly with SLOC and only weakly with actual bugs. In fact SLOC is just as good a predictor of defects as cyclomatic complexity. The same goes for most other complexity metrics.
See http://www.leshatton.org/Documents/TAIC2008-29-08-2008.pdf, starting around slide 16.
Other answers have made good points about the computation involved.
I'd like to point out that your assertion that the code is less readable is false, because in one instance you have braces, and in the other you don't.
String foo() {
String output = e;
if (cond1) output = a;
else if (cond2) output = b;
else if (cond3) output = c;
else if (cond4) output = d;
return output;
}
This is as readable as the example you gave with return statements.
Whether or not you allow braceless if statements is a question of style that you should probably be consistent with across all your code.
The more important issue that cyclomatic complexity does address is that if computing the value of cond1, cond2 etc have side effects, i.e. if they were a stateful method rather than a field in this case, then the conceptual complexity of the code is much higher if you might return early compared to if you can't.
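A small sketch of what side effects in the conditions mean in practice, using a hypothetical class EarlyReturnSideEffects in which every condition check bumps a counter. Which exit is taken decides how much state has changed by the time the method returns.

```java
// Hypothetical illustration; check() stands in for a stateful condition.
class EarlyReturnSideEffects {
    private int checksRun = 0;

    // A condition with a side effect: it records that it was evaluated.
    private boolean check(boolean result) {
        checksRun++;
        return result;
    }

    String foo(boolean c1, boolean c2, boolean c3) {
        if (check(c1)) return "a"; // later checks never run
        if (check(c2)) return "b";
        if (check(c3)) return "c";
        return "d";
    }

    int checksRun() {
        return checksRun;
    }

    public static void main(String[] args) {
        EarlyReturnSideEffects early = new EarlyReturnSideEffects();
        System.out.println(early.foo(true, true, true) + " after " + early.checksRun() + " check(s)");
        EarlyReturnSideEffects late = new EarlyReturnSideEffects();
        System.out.println(late.foo(false, false, false) + " after " + late.checksRun() + " check(s)");
    }
}
```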
Assume that we have a given interface:
public interface StateKeeper {
public abstract void negateWithoutCheck();
public abstract void negateWithCheck();
}
and following implementations:
class StateKeeperForPrimitives implements StateKeeper {
private boolean b = true;
public void negateWithCheck() {
if (b == true) {
this.b = false;
}
}
public void negateWithoutCheck() {
this.b = false;
}
}
class StateKeeperForObjects implements StateKeeper {
private Boolean b = true;
@Override
public void negateWithCheck() {
if (b == true) {
this.b = false;
}
}
@Override
public void negateWithoutCheck() {
this.b = false;
}
}
Moreover, assume that the negate*Check() methods can be called one or more times, and that it is hard to say what the upper bound on the number of calls is.
The question is which method in both implementations is 'better'
in terms of execution speed, garbage collection, memory allocation, etc.:
negateWithCheck or negateWithoutCheck?
Does the answer depend on which of the two proposed
implementations we use, or does it not matter?
Does the answer depend on the estimated number of calls? For what number of calls is it better to use one method or the other?
There might be a slight performance benefit in using the one with the check. I highly doubt that it matters in any real life application.
premature optimization is the root of all evil (Donald Knuth)
You could measure the difference between the two. Let me emphasize that these kinds of things are notoriously difficult to measure reliably.
Here is a simple-minded way to do this. You can hope for performance benefits if the check recognizes that the value doesn't have to be changed, saving you an expensive write into the memory. So I have changed your code accordingly.
interface StateKeeper {
public abstract void negateWithoutCheck();
public abstract void negateWithCheck();
}
class StateKeeperForPrimitives implements StateKeeper {
private boolean b = true;
public void negateWithCheck() {
if (b == false) {
this.b = true;
}
}
public void negateWithoutCheck() {
this.b = true;
}
}
class StateKeeperForObjects implements StateKeeper {
private Boolean b = true;
public void negateWithCheck() {
if (b == false) {
this.b = true;
}
}
public void negateWithoutCheck() {
this.b = true;
}
}
public class Main {
public static void main(String args[]) {
StateKeeper[] array = new StateKeeper[10_000_000];
for (int i=0; i<array.length; ++i)
//array[i] = new StateKeeperForObjects();
array[i] = new StateKeeperForPrimitives();
long start = System.nanoTime();
for (StateKeeper e : array)
e.negateWithCheck();
//e.negateWithoutCheck();
long end = System.nanoTime();
System.err.println("Time in milliseconds: "+((end-start)/1000000));
}
}
I get the following:
            check    no check
primitive   17ms     24ms
Object      21ms     24ms
I didn't find any performance penalty of the check the other way around when the check is always superfluous because the value always has to be changed.
Two things: (1) These timings are unreliable. (2) This benchmark is far from any real life application; I had to make an array of 10 million elements to actually see something.
I would simply pick the function with no check. I highly doubt that in any real application you would get any measurable performance benefit from the function that has the check but that check is error prone and is harder to read.
Short answer: the Without check will always be faster.
An assignment takes a lot less computation time than a comparison. Therefore: an IF statement is always slower than an assignment.
When comparing 2 variables, your CPU will fetch the first variable, fetch the second variable, compare those 2 and store the result into a temporary register. That's 2 fetches, 1 compare and a 1 store.
When you assign a value, your CPU will fetch the value on the right hand of the '=' and store it into the memory. That's 1 fetch and 1 store.
In general, if you need to set some state, just set the state. If, on the other hand, you have to do something more (like log the change, notify someone about the change, etc.), then you should first inspect the old value.
But, in the case when methods like the ones you provided are called very intensely, there may be some performance difference in checking vs non-checking (whether the new value is different). Possible outcomes are:
1-a) check returns false
1-b) check returns true, value is assigned
2) value is assigned without check
As far as I know, writing is always slower than reading (all the way down to register level), so the fastest outcome is 1-a. If in your case the most common outcome is that the value will not be changed ('more than 50%' logic is just not good enough, the exact percentage has to be figured out empirically), then you should go with checking, as this eliminates a redundant write operation (the value assignment). If, on the other hand, the value is different more often than not, assign it without checking.
You should test your concrete cases, do some profiling, and based on the result determine the best implementation. There is no general "best way" for this case (apart from "just set the state").
As for boolean vs Boolean here, I would say (off the top of my head) that there should be no performance difference.
Only today I've seen few answers and comments repeating that
Premature optimization is the root of all evil
Well obviously one if statement more is one thing more to do, but... it doesn't really matter.
And garbage collection and memory allocation... not an issue here.
I would generally consider negateWithCheck to be slightly slower because there is always a comparison. Also notice that in StateKeeperForObjects you are introducing some autoboxing: 'true' and 'false' are primitive boolean values.
Assuming you fix StateKeeperForObjects to use all objects, then the choice of implementation potentially matters, but most likely not noticeably.
The speed will depend slightly on the number of calls, but in general the speed should be considered to be the same whether you call it once or many times (ignoring secondary effects such as caching, jit, etc).
It seems to me, a better question is whether or not the performance difference is noticeable. I work on a scientific project that involves millions of numerical computations done in parallel. We started off using Objects (e.g. Integer, Double) and had less than desirable performance, both in terms of memory and speed. When we switched all of our computations to primitives (e.g. int, double) and went over the code to make sure we were not introducing anything funky through autoboxing, we saw a huge performance increase (both memory and speed).
I am a huge fan of avoiding premature optimization, unless it is something that is "simple" to implement. Just be wary of the consequences. For example, do you have to represent null values in your data model? If so, how do you do that using a primitive? Doubles can be done easily with NaN, but what about Booleans?
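One possible way to approach that last question is sketched below. The class MissingValues and its byte encoding are hypothetical, not something from the project described above: doubles use NaN as the missing marker, and booleans are packed into a byte with a third state.

```java
// Hypothetical encoding of "missing" values without boxed types.
class MissingValues {
    static final double MISSING = Double.NaN;

    // NaN is never equal to itself, so == cannot be used for this test.
    static boolean isMissing(double value) {
        return Double.isNaN(value);
    }

    // Three-state boolean stored in a primitive byte.
    static final byte NO = 0;
    static final byte YES = 1;
    static final byte UNKNOWN = -1;

    static byte encode(Boolean value) {
        if (value == null) return UNKNOWN;
        return value ? YES : NO;
    }

    static Boolean decode(byte code) {
        if (code == UNKNOWN) return null;
        return Boolean.valueOf(code == YES);
    }

    public static void main(String[] args) {
        System.out.println(isMissing(MISSING));
        System.out.println(decode(encode(null)));
        System.out.println(decode(encode(Boolean.TRUE)));
    }
}
```

A byte[] of such codes keeps the memory layout of a primitive array while still being able to express "no value".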
negateWithoutCheck() is preferable because, if we count the operations, negateWithoutCheck() performs only one, i.e. this.b = false;, whereas negateWithCheck() performs one extra operation in addition to that.
I have an assignment wherein I have to parse the field access flags of a java .class file.
The specification for a .class file can be found here: Class File Format (page 26 & 27 have the access flags and hex vals).
This is fine, I can do this no worries.
My issue is that there is a large number of combinations.
I know that public, private and protected are mutually exclusive, which reduces the combinations somewhat. Final and volatile are also mutually exclusive. The rest however are not.
At the moment, I have a large switch statement to do the comparison. I read in the hex value of the access flag and then increment a counter, depending on if it is public, private or protected. This works fine, but it seems quite messy to just have every combination listed in a switch statement. i.e. public static, public final, public static final, etc.
I thought of doing modulo on the access flag and the appropriate hex value for public, private or protected, but public is 0x0001, so that won't work.
Does anyone else have any ideas as to how I could reduce the amount of cases in my switch statement?
What is the problem? The specification says that it's a bit flag, that means that you should look at a value as a binary number, and that you can test if a specific value is set by doing a bitwise AND.
E.g
/*
ACC_VOLATILE = 0x0040 = 01000000
ACC_PUBLIC = 0x0001 = 00000001
Public and volatile = 01000001
*/
// Parentheses are required: in Java, > and != bind tighter than &.
publicCount += (flag & ACC_PUBLIC) != 0 ? 1 : 0;
volatileCount += (flag & ACC_VOLATILE) != 0 ? 1 : 0;
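As a fuller sketch of the same bit test, the hypothetical class FieldFlags below decodes a field's access_flags value into modifier names by masking each bit in turn. The constants are the field access flag values from the class file format; the class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical decoder for field access flags.
class FieldFlags {
    static final int ACC_PUBLIC = 0x0001;
    static final int ACC_PRIVATE = 0x0002;
    static final int ACC_PROTECTED = 0x0004;
    static final int ACC_STATIC = 0x0008;
    static final int ACC_FINAL = 0x0010;
    static final int ACC_VOLATILE = 0x0040;
    static final int ACC_TRANSIENT = 0x0080;

    private static final int[] MASKS = {
        ACC_PUBLIC, ACC_PRIVATE, ACC_PROTECTED, ACC_STATIC,
        ACC_FINAL, ACC_VOLATILE, ACC_TRANSIENT
    };
    private static final String[] NAMES = {
        "public", "private", "protected", "static",
        "final", "volatile", "transient"
    };

    // Returns the names of all flags whose bit is set in flags.
    static List<String> decode(int flags) {
        List<String> names = new ArrayList<>();
        for (int i = 0; i < MASKS.length; i++) {
            if ((flags & MASKS[i]) != 0) {
                names.add(NAMES[i]);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(decode(0x0041));
        System.out.println(decode(0x0019));
    }
}
```

Each flag is tested independently, so no combination ever needs its own switch case.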
If you are trying to avoid a pattern like this one I just stole:
if ((access_flag & ACC_PUBLIC) != 0)
{
publicCount++;
}
if ((access_flag & ACC_FINAL) != 0)
{
finalCount++;
}
...
It's a great instinct. I make it a rule never to write code that looks redundant like that. Not only is it error-prone and more code in your class, but copy & paste code is really boring to write.
So the big trick is to make this access "Generic" and easy to understand from the calling class--pull out all the repeated crap and just leave "meat", push the complexity to the generic routine.
So an easy way to call a method would be something like this that gives an array of bitfields that contain many bit combinations that need counted and a list of fields that you are interested in (so that you don't waste time testing fields you don't care about):
int[] counts = sumUpBits(arrayOfFlagBitfields, ACC_PUBLIC | ACC_FINAL | ACC_...);
That's really clean, but then how do you access the return fields? I was originally thinking something like this:
System.out.println("Number of public classes="+counts[findBitPosition(ACC_PUBLIC)]);
System.out.println("Number of final classes="+counts[findBitPosition(ACC_FINAL)]);
Most of the boilerplate here is gone except the need to change the bitfields to their position. I think two changes might make it better--encapsulate it in a class and use a hash to track positions so that you don't have to convert bitPosition all the time (if you prefer not to use the hash, findBitPosition is at the end).
Let's try a full-fledged class. How should this look from the caller's point of view?
BitSummer bitSums=new BitSummer(arrayOfFlagBitfields, ACC_PUBLIC, ACC_FINAL);
System.out.println("Number of public classes="+bitSums.getCount(ACC_PUBLIC));
System.out.println("Number of final classes="+bitSums.getCount(ACC_FINAL));
That's pretty clean and easy--I really love OO! Now you just use the bitSums to store your values until they are needed (It's less boilerplate than storing them in class variables and more clear than using an array or a collection)
So now to code the class. Note that the constructor uses variable arguments now--less surprise/more conventional and makes more sense for the hash implementation.
By the way, I know this seems like it would be slow and inefficient, but it's probably not bad for most uses--if it is, it can be improved, but this should be much shorter and less redundant than the switch statement (which is really the same as this, just unrolled--however this one uses a hash & autoboxing which will incur an additional penalty).
import java.util.HashMap;

public class BitSummer {
    // sums stores the counts as <flag, count>
    private final HashMap<Integer, Integer> sums = new HashMap<Integer, Integer>();

    // Constructor does all the work, the rest is just an easy lookup.
    public BitSummer(int[] arrayOfFlagBitfields, int... positionsToCount) {
        // Loop over each bitfield we want to count
        for (int bitfield : arrayOfFlagBitfields) {
            // and over each flag to check
            for (int flag : positionsToCount) {
                // Test to see if we actually should count this bitfield as having the flag set
                if ((bitfield & flag) != 0) {
                    // Increment; merge starts at 1 when the flag has no entry yet
                    // (sums.get(flag) + 1 would throw on the first occurrence)
                    sums.merge(flag, 1, Integer::sum);
                }
            }
        }
    }

    // Return the count for a given flag, or 0 if it was never seen
    public int getCount(int bit) {
        return sums.getOrDefault(bit, 0);
    }
}
I didn't test this but I think it's fairly close. I wouldn't use it for processing video packets in realtime or anything, but for most purposes it should be fast enough.
As for maintenance, the code may look long compared to the original example, but if you have more than 5 or 6 fields to check, this will actually be a shorter solution than the chained if statements, significantly less error-prone, and more maintainable. It is also more interesting to write.
If you really feel the need to eliminate the hashtable you could easily replace it with a sparse array with the flag position as the index (for instance the count of a flag 00001000/0x08 would be stored in the fourth array position). This would require a function like this to calculate the bit position for array access (both storing in the array and retrieving)
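private int findBitPosition(int flag) {
    int ret = 0;
    // Shift right until the single set bit has been shifted out;
    // the number of shifts before that is the zero-based bit position.
    while ((flag >>>= 1) != 0)
        ret++;
    return ret;
}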
That was fun.
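For the array-based variant described above, here is a minimal sketch. It assumes every flag passed in is a single non-zero bit, and it swaps the hand-written bit search for the standard Integer.numberOfTrailingZeros; the class name BitCounter is hypothetical.

```java
// Hypothetical array-backed alternative to the hash-based counter.
class BitCounter {
    // One slot per bit position of an int.
    private final int[] counts = new int[32];

    BitCounter(int[] bitfields, int... flags) {
        for (int bitfield : bitfields) {
            for (int flag : flags) {
                if ((bitfield & flag) != 0) {
                    // Zero-based position of the flag's single set bit.
                    counts[Integer.numberOfTrailingZeros(flag)]++;
                }
            }
        }
    }

    // Assumes flag is a single non-zero bit; returns 0 if never counted.
    int getCount(int flag) {
        return counts[Integer.numberOfTrailingZeros(flag)];
    }

    public static void main(String[] args) {
        int[] fields = {0x01, 0x11, 0x10, 0x41};
        BitCounter counter = new BitCounter(fields, 0x01, 0x10);
        System.out.println("public=" + counter.getCount(0x01));
        System.out.println("final=" + counter.getCount(0x10));
    }
}
```

This avoids autoboxing entirely, at the cost of a fixed 32-entry array per counter.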
I'm not sure that's what you're looking for, but I would use if-cases with binary AND to check if a flag is set:
if ((access_flag & ACC_PUBLIC) != 0)
{
// class is public
}
if ((access_flag & ACC_FINAL) != 0)
{
// class is final
}
....