Does the Java compiler or the JIT compiler optimize divisions or multiplications by a constant power of two down to bitshifting?
For example, are the following two statements optimized to be the same?
int median = start + ((end - start) >>> 1);
int median = start + (end - start) / 2;
(basically this question but for Java)
While the accepted answer is right in the sense that the division can't be simply replaced by a right shift, the benchmark is terribly wrong. Any Java benchmark running for less than one second is probably measuring the interpreter's performance - not something you usually care about.
I couldn't resist and wrote my own benchmark, which mainly shows that it's all more complicated (a sketch of a proper benchmark setup follows the list below). I'm not trying to fully explain the results, but I can say that:
a general division is a damn slow operation
it gets avoided as much as possible
division by a constant AFAIK always gets optimized somehow
division by a power of two gets replaced by a right shift and an adjustment for negative numbers
a manually optimized expression might be better
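For reference, a minimal sketch of how such a measurement could be set up with JMH (my own illustration, not the benchmark I ran; JMH's warmup phases make sure the JIT-compiled code is measured rather than the interpreter):

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;

@State(Scope.Thread)
public class DivVsShiftBenchmark {
    int x = 1234567;

    @Benchmark
    public int divide() {
        return x / 2;  // division by a constant power of two
    }

    @Benchmark
    public int shift() {
        return x >> 1; // rounds differently from / 2 for negative x
    }
}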
No, the Java compiler doesn't do that, because it can't be sure what the sign of (end - start) will be. Why does this matter? Bit shifts on negative integers yield a different result than ordinary division. Here's a simple test:
System.out.println((-10) >> 1); // prints -5
System.out.println((-11) >> 1); // prints -6
System.out.println((-11) / 2); // prints -5
Also note that I used >> instead of >>>. The >>> is an unsigned bitshift, while >> is signed.
System.out.println((-10) >>> 1); // prints 2147483643
@Mystical: I wrote a benchmark that shows that the compiler/JVM doesn't do that optimization: https://ideone.com/aKDShA
If the JVM does not do it, you can easily do it yourself.
As noted, right shifts on negative numbers do not behave the same as division because the result is rounded in the wrong direction. If you know that the dividend is non-negative, you can safely replace the division by a shift. If it might be negative, you can use the following technique.
If you can express your original code in this form:
int result = x / (1 << shift);
You can replace it with this optimized code:
int result = (x + (x >> 31 >>> (32 - shift))) >> shift;
Or, alternatively:
int result = (x + ((x >> 31) & ((1 << shift) - 1))) >> shift;
These formulae compensate for the incorrect rounding by adding a small number computed from the sign bit of the dividend. This works for any x with all shift values from 1 to 30.
If the shift is 1 (i.e., you are dividing by 2) then the >> 31 can be removed in the first formula to give this very tidy snippet:
int result = (x + (x >>> 31)) >> 1;
I have found these techniques to be faster even when the shift is non-constant, but obviously they benefit the most if the shift is constant. Note: For long x instead of int, change 31 and 32 respectively to 63 and 64.
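To convince yourself that the formulae really agree with division, here is a small brute-force check (my own test harness, not part of the original answer):

java.util.Random rnd = new java.util.Random(42);
for (int i = 0; i < 1_000_000; i++) {
    int x = rnd.nextInt();
    for (int shift = 1; shift <= 30; shift++) {
        int expected = x / (1 << shift);
        int a = (x + (x >> 31 >>> (32 - shift))) >> shift;
        int b = (x + ((x >> 31) & ((1 << shift) - 1))) >> shift;
        if (a != expected || b != expected)
            throw new AssertionError(x + " / 2^" + shift);
    }
}
System.out.println("all formulae agree with division");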
Examining the generated machine code shows that (unsurprisingly) the HotSpot Server VM can do this optimization automatically when the shift is constant, but (also unsurprisingly) the HotSpot Client VM is too stupid to.
Related
OK, so I know that typically left and right shifts are well defined only for shift counts 0..31. I was thinking about how best to extend this to include 32, which simplifies some algorithms. I came up with:
int32 << n & (n-32) >> 5
This seems to work. The question is: is it guaranteed to work on any architecture (C, C++, Java), and can it be done more efficiently?
In Java it's guaranteed to work if those variables are of type int, since >> in Java does an arithmetic right shift and shifting by more than 31 also has defined behavior (the shift count is masked to five bits for int). But beware of operator precedence:
int lshift(int x, int n)
{
    return (x << n) & ((n-32) >> 5);
}
This will work for shift counts up to 32. It can be modified so that any int shift count larger than 31 returns 0:
return (x << n) & ((n-32) >> 31);
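A quick sanity check of either variant (the example values are my own):

System.out.println(Integer.toHexString(lshift(0xFF, 4)));  // ff0
System.out.println(Integer.toHexString(lshift(0xFF, 31))); // 80000000
System.out.println(Integer.toHexString(lshift(0xFF, 32))); // 0 (plain 0xFF << 32 would give ff)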
However, in C and C++ the size of the int type and the behavior of the >> operator on negative values are implementation-defined. Most (if not all) modern implementations implement it as an arithmetic shift for signed types, though. Moreover, shifting by more than the variable's width is undefined behavior. Worse yet, signed overflow invokes UB, so even a left shift by 31 was UB (until C++14). Therefore, to get well-defined output you need to:
Use an unsigned fixed-width type like uint32_t (so x << 31 isn't UB)
Use a compiler that emits an arithmetic right shift instruction for >> and use a signed type for n, or implement the arithmetic shift yourself
Mask the shift amount to limit it to 5 bits for int32_t
The result would be
uint32_t lshift(uint32_t x, int32_t n)
{
    return (x << (n & 0x1F)) & ((n-32) >> 31);
}
If the architecture supports conditional instructions, as x86 and ARM do, then the following may be faster:
return n < 32 ? x << n : 0;
On a 64-bit platform you can make this even simpler by shifting in a 64-bit type and then masking. Some 32-bit platforms like ARM do support shifting by 32, so this method is also efficient there:
return ((uint64_t)x << (n & 0x3F)) & 0xFFFFFFFFU;
You can see the output assembly here. I don't see how it can be improved further.
I stumbled upon a question that asks whether you ever had to use bit shifting in real projects. I have used bit shifts quite extensively in many projects, however, I never had to use arithmetic bit shifting, i.e., bit shifting where the left operand could be negative and the sign bit should be shifted in instead of zeros. For example, in Java, you would do arithmetic bit shifting with the >> operator (while >>> would perform a logical shift). After thinking a lot, I came to the conclusion that I have never used the >> with a possibly negative left operand.
As stated in this answer, arithmetic shifting is even implementation-defined in C++, so, in contrast to Java, there is not even a standardized operator in C++ for performing arithmetic shifting. The answer also states an interesting problem with shifting negative numbers that I was not even aware of:
+63 >> 1 = +31 (integral part of quotient E1/2^E2)
00111111 >> 1 = 00011111
-63 >> 1 = -32
11000001 >> 1 = 11100000
So -63>>1 yields -32 which is obvious when looking at the bits, but maybe not what most programmers would anticipate on first sight. Even more surprising (but again obvious when looking at the bits) is that -1>>1 is -1, not 0.
So, what are concrete use cases for arithmetic right shifting of possibly negative values?
Perhaps the best known is the branchless absolute value:
int m = x >> 31;
int abs = (x + m) ^ m;
Which uses an arithmetic shift to copy the sign bit to all bits. Most uses of arithmetic shift that I've encountered were of that form. Of course an arithmetic shift is not required for this; you could replace all occurrences of x >> 31 (where x is an int) by -(x >>> 31).
The value 31 comes from the size of int in bits, which is 32 by definition in Java. Shifting right by 31 shifts out all bits except the sign bit, which (since it's an arithmetic shift) is copied to those 31 bits, leaving a copy of the sign bit in every position.
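A quick demonstration (my own snippet), including the usual caveat that Integer.MIN_VALUE has no positive counterpart, exactly as with Math.abs:

int x = -42;
int m = x >> 31;                 // -1 for negative x, 0 otherwise
System.out.println((x + m) ^ m); // prints 42
int y = Integer.MIN_VALUE;
int n = y >> 31;
System.out.println((y + n) ^ n); // prints -2147483648, same as Math.abs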
It has come in handy for me before, in the creation of masks that were then used with the & or | operators when manipulating bit fields, either for bitwise data packing or bitwise graphics.
I don't have a handy code sample, but I do recall using that technique many years ago in black-and-white graphics to zoom in (by extending a bit, either 1 or 0). For a 3x zoom, '0' would become '000' and '1' would become '111' without having to know the initial value of the bit. The bit to be expanded would be placed in the high order position, then an arithmetic right shift would extend it, regardless of whether it was 0 or 1. A logical shift, either left or right, always brings in zeros to fill vacated bit positions. In this case the sign bit was the key to the solution.
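A sketch of that idea in Java (my own reconstruction, not the original code):

// Hypothetical 3x zoom of one bit: 0 -> 000, 1 -> 111.
static int zoom3(int bit) {
    // Put the bit in the sign position; the arithmetic shift then
    // copies it into every position, and the mask keeps three of them.
    return ((bit << 31) >> 31) & 0b111;
}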
Here's an example of a function that will find the least power of two greater than or equal to the input. There are other solutions to this problem that are probably faster, namely any hardware-oriented solution or just a series of right shifts and ORs. This solution uses an arithmetic shift to perform a binary search.
unsigned ClosestPowerOfTwo(unsigned num) {
    int mask = 0xFFFF0000;
    mask = (num & mask) ? (mask << 8) : (mask >> 8);
    mask = (num & mask) ? (mask << 4) : (mask >> 4);
    mask = (num & mask) ? (mask << 2) : (mask >> 2);
    mask = (num & mask) ? (mask << 1) : (mask >> 1);
    mask = (num & mask) ? mask : (mask >> 1);
    return (num & mask) ? -mask : -(mask << 1);
}
Indeed, logical right shift is much more commonly used. However, there are many operations that require an arithmetic shift (or are solved much more elegantly with one):
Sign extension:
Most of the time you only deal with the types available in C, and the compiler will automatically sign-extend when casting/promoting a narrower type to a wider one (like short to int), so you may not notice it, but under the hood a left-then-right shift is used if the architecture doesn't have an instruction for sign extension. For "odd" numbers of bits you'll have to do the sign extension manually, so this is much more common. For example, if a 10-bit pixel or ADC value is read into the top bits of a 16-bit register, value >> 6 will move the bits to the lower 10 bit positions and sign-extend to preserve the value. If they're read into the low 10 bits with the top 6 bits being zero, you'll use value << 6 >> 6 to sign-extend the value in order to work with it.
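In Java, the low-bits case looks like this (a small sketch with a made-up 10-bit value):

int raw = 0b11_1111_0000;      // a 10-bit two's-complement value (-16) in the low bits
int value = (raw << 22) >> 22; // shift up to the sign position, arithmetic-shift back down
System.out.println(value);     // prints -16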
You also need sign extension when working with signed bit fields:
struct bitfield {
    int x: 15;
    int y: 12;
    int z: 5;
};

int f(bitfield b) {
    return (b.x/8 + b.y/5) * b.z;
}
Demo on Godbolt. The shifts are generated by the compiler, but usually you don't use bit fields (as they're not portable) and operate on raw integer values instead, so you'll need to do the arithmetic shifts yourself to extract the fields.
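For example, extracting fields equivalent to that struct from a packed 32-bit int might look like this in Java (assuming a layout with x in bits 0-14, y in bits 15-26, and z in bits 27-31):

static int f(int packed) {
    int x = (packed << 17) >> 17; // bits 0-14, sign-extended
    int y = (packed << 5) >> 20;  // bits 15-26, sign-extended
    int z = packed >> 27;         // bits 27-31, sign-extended
    return (x / 8 + y / 5) * z;
}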
Another example: sign-extending a pointer to make a canonical address in x86-64. This is used to store additional data in the pointer: char* pointer = (char*)((intptr_t)address << 16 >> 16). You can think of this as a 48-bit bit field at the bottom of the pointer.
The V8 engine's SMI optimization stores the value in the top 31 bits, so it needs an arithmetic right shift to restore the signed integer.
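Sketched in Java for a 32-bit word (my own illustration of the tagging scheme):

int value = -123;
int tagged = value << 1;      // encode: value in the top 31 bits, tag bit 0 clear
int restored = tagged >> 1;   // decode: arithmetic shift restores the sign
System.out.println(restored); // prints -123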
Rounding signed division properly when converting it to a multiplication. For example, x/12 will be optimized to x*43691 >> 19 with some additional rounding. Of course you'll never do this in normal scalar code, because the compiler already does it for you, but sometimes you may need to vectorize the code or write related libraries, and then you'll need to calculate the rounding yourself with an arithmetic shift. You can see how compilers round the division results in the output assembly for the bit field example above.
Saturated shifts, i.e. shifts where the value becomes zero when the shift count >= the bit width:
uint32_t lsh_saturated(uint32_t x, int32_t n) // returns 0 if n == 32
{
    return (x << (n & 0x1F)) & ((n-32) >> 5);
}

uint32_t lsh(uint32_t x, int32_t n) // returns 0 if n >= 32
{
    return (x << (n & 0x1F)) & ((n-32) >> 31);
}
Bit masks, useful in various cases like branchless selection (i.e. a muxer). You can see lots of ways to conditionally do something on the famous bithacks page. Most of them are done by generating a mask of all ones or all zeros. The mask is usually calculated by propagating the sign bit of a subtraction, like this: (x - y) >> 31 (for 32-bit ints). Of course it can be changed to -(unsigned(x - y) >> 31), but that requires two's complement and needs more operations. Here's the way to get the min and max of two integers without branching:
min = y + ((x - y) & ((x - y) >> (sizeof(int) * CHAR_BIT - 1)));
max = x - ((x - y) & ((x - y) >> (sizeof(int) * CHAR_BIT - 1)));
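The same idea in Java, assuming x - y does not overflow (my own transcription):

static int min(int x, int y) {
    int diff = x - y;
    int mask = diff >> 31;    // all ones if x < y, else all zeros
    return y + (diff & mask); // x when x < y, otherwise y
}

static int max(int x, int y) {
    int diff = x - y;
    int mask = diff >> 31;
    return x - (diff & mask); // y when x < y, otherwise x
}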
Another example is m = m & -((signed)(m - d) >> s); from "Compute modulus division by (1 << s) - 1 in parallel without a division operator".
I am not too sure what you mean, but I'm going to speculate that you want to use bit shifts as arithmetic operations. One interesting thing I have seen is this property of binary numbers:
int n = 4;
int k = 1;
n = n << k; // is the same as n = n * 2^k
//now n = (4 * 2) i.e. 8
n = n >> k; // is the same as n = n / 2^k
//now n = (8 / 2) i.e. 4
Hope that helps. But yes, you want to be careful with negative numbers; I would mask and then convert back accordingly.
In C when writing device drivers, bit shift operators are used extensively, since bits are used as switches that need to be turned on and off. Bit shifts allow one to easily and correctly target the right switch.
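For instance (a generic sketch, not from any particular driver):

int READY = 1;      // bit 0
int ERROR = 1 << 3; // bit 3
int status = 0;
status |= READY;                       // switch a bit on
status &= ~ERROR;                      // switch a bit off
boolean ready = (status & READY) != 0; // test a bit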
Many hashing and cryptographic functions make use of bit shifts. Take a look at the Mersenne Twister.
Lastly, it is sometimes useful to use bitfields to contain state information. Bit manipulation functions including bit shift are useful for these things.
What is the advantage of using the >> operator over the / operator? It was used extensively in the code I am maintaining.
For example:
int width = previousWidth >> 2;
When you want to shift a value by a certain number of bits, it's considerably simpler to understand. For example:
byte[] bits = new byte[4];
bits[0] = (byte) (value >> 24);
bits[1] = (byte) (value >> 16);
bits[2] = (byte) (value >> 8);
bits[3] = (byte) (value >> 0);
That's clearly shifting by different numbers of bits. Would you really want to express that in terms of division instead?
Now of course when what you really want is division, you should use the division operator for the sake of readability. Some people may use bitshifting for the sake of performance, but as ever, readability is more important than micro-optimization for most code. So in your case, if what's actually desired is for width to be previousWidth divided by 4, the code should absolutely reflect that:
int width = previousWidth / 4;
I'd only use bitshifting for this after proving that the performance difference was significant.
Most likely nothing at all. It would be a JVM optimisation to turn a division into a bitshift.
If it's being done to a bitmask then fine, but if as in the above case it's what appears to be an awkward way of writing a mathematical operation then consider changing it as part of your maintenance (or just accept it as a bad idea but leave it alone).
They are different operations:
>> 2 shifts a bit pattern two bits to the right (essentially dividing an integer by 2^2 = 4).
/ 2 divides the number by two (which could be implemented as >> 1).
Note that using >> 1 to divide other numeric types by 2 is likely to give unexpected results.
If you mean "divide by two", type " / 2". People reading your code (including you, some time later) will thank you for not obscuring the meaning.
As others already said, I suggest you always favor the more readable option over the optimization unless you:
definitely need that optimization
know that it works by
knowing your compiler
knowing your JVM
having put performance measurements in place
considering statistical evaluation of those measurements (extrema, variance, first vs. later measurements, ...)
Possible Duplicate:
Java's >> versus >>> Operator?
Hi,
I know >> or << can improve performance, but what's the purpose of the >>> operator?
For example, in the PriorityQueue class in the JDK source:
private void heapify() {
    for (int i = (size >>> 1) - 1; i >= 0; i--)
        siftDown(i, (E) queue[i]);
}
Don't tell me how >>> works, just why I'd use it.
Thank you
The >> operator preserves the leftmost bit, but >>> does not.
This means if you shift a negative number with >> it stays negative, but if you use >>> it does not.
So use >> for mathematical operations, and >>> for bit-based operations.
Use <<, >> and >>> for operations on integral types that represent bit patterns.
DO NOT use them to "speed up" multiplication and division. The chances are that it won't make any difference, and it may actually make your code slower.
The Java JIT compiler should be capable of generating machine code for simple arithmetic expressions that is optimal for the hardware that it is currently running on.
If you implement your arithmetic using clever masking and shifting, there is a chance that 1) the code won't be optimal for the machine you are running on, 2) the JIT optimizer won't figure out that you are actually doing arithmetic ... and therefore won't be able to optimize. The end result will be slower code.
The difference between the >> and >>> operators is that >>> is an unsigned shift: >>> clears the leftmost bit, while >> preserves the value of that bit (because in fact this bit indicates a negative value).
Example:
you have the value FFFFFFFEh = -2 (signed); then:
-2 >> 1 = FFFFFFFF = -1 // >> preserves the highest bit's value; note this is wrong if we treat FFFFFFFF as an unsigned value
-2 >>> 1 = 7FFFFFFF // >>> clears highest bit
So if you want to operate on unsigned values, you should use >>> instead of >>.
The other day I decided to write an implementation of radix sort in Java. Radix sort is supposed to be O(k*N), but mine ended up being O(k^2*N) because of the process of breaking each number down into single digits. I broke out each digit by modding (%) away the preceding digits and dividing by ten to eliminate the succeeding digits. I asked my professor if there would be a more efficient way of doing this, and he said to use bit operators. Now for my question: which method would be the fastest at breaking down each number in Java: 1) the method stated above, 2) converting the number to a String and using substrings, or 3) using bit operations?
If 3) then how would that work?
As a hint, try using a radix other than 10, since computers handle binary arithmetic better than decimal.
x >>> n is equivalent to x / 2^n
x & (2^n - 1) is equivalent to x % 2^n
By the way, Java's >> performs sign extension, which is probably not what you want in this case. Use >>> instead.
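For example, with radix 256 each "digit" of an int can be peeled off with a shift and a mask (a sketch of the idea, with my own method name):

// "Digit" k (0 = least significant byte) of x in base 256:
static int digit(int x, int k) {
    return (x >>> (8 * k)) & 0xFF; // >>> so a negative x still yields 0..255
}
// digit(0x12345678, 1) == 0x56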
Radix_sort_(Java)
The line of code that does this:
int key = (a[p] & mask) >> rshift;
is the bit manipulation part.
& is the bitwise AND operator, and >> is a right shift.