Whenever I need to average two numbers for an algorithm like binary search, I always do something like this:
int mid = low + ((high - low) / 2);
I recently saw another way to do it in this post, but I don't understand it. It says you can do this in Java:
int mid = (low + high) >>> 1;
or this in C++:
int mid = ((unsigned int)low + (unsigned int)high)) >> 1;
The C++ version essentially makes both operands unsigned, so doing a shift results in an arithmetic shift instead of a signed shift. I understand what both these pieces of code are doing, but how does this solve the overflow issue? I thought the whole issue was that the intermediate value high + low could overflow?
Edit:
Oh, duh. All the answers didn't exactly answer my question, but it was #John Zeringue's answer that made it click. I'll try to explain here.
The issue with (high + low)/2 in Java isn't exactly that high + low overflows (it does overflow since the integers are both signed, but all the bits are still there, and no information is lost). The issue with taking the average like this is the division. The division is operating on a signed value, so your result will be negative. Using the shift instead will divide by two but consider the bits instead of the sign (effectively treating it as unsigned).
The code you saw is broken: it doesn't compute the average of negative numbers correctly. If you are operating on non-negative values only, like indexes, that's OK, but it's not a general replacement.
The code you have originally,
int mid = low + ((high - low) / 2);
isn't safe from overflow either because the difference high - low may overflow the range for signed integers. Again, if you only work with non-negative integers it's fine.
Using the fact that A+B = 2*(A&B) + A^B we can compute the average of two integers without overflow like this:
int mid = (high&low) + (high^low)/2;
You can compute the division by 2 using a bit shift, but keep in mind the two are not the same: the division rounds towards 0 while the bit shift always rounds down.
int mid = (high&low) + ((high^low)>>1);
So let's consider bytes instead of ints. The only difference is that a byte is an 8-bit integer, while an int has 32 bits. In Java, both are always signed, meaning that the leading bit indicates whether they're positive (0) or negative (1).
byte low = Byte.valueOf("01111111", 2); // The maximum byte value
byte high = low; // This copies low.
byte sum = low + high; // The bit representation of this is 11111110, which, having a
// leading 1, is negative. Consider this the worst case
// overflow, since low and high can't be any larger.
byte mid = sum >>> 1; // This correctly gives us 01111111, fixing the overflow.
For ints, it's the same thing. Basically the gist of all this is that using an unsigned bitshift on signed integers allows you to leverage the leading bit to handle the largest possible values of low and high.
The C++ version has a hidden cheat: low and high are ints but they're never negative. When you cast them to unsigned int your sign bit becomes an extra precision bit, which a single addition cannot overflow.
It's not a very good cheat because array indices should be unsigned anyway.
Like was said elsewhere, i >> 1 means /2 for unsigned integers.
The C++ version doesn't solve the overflow issue. It only solves the issue of successfully dividing by 2 using shift instead of /, an optimization that your compiler should be able to make itself if it would be a performance improvement.
On the other hand overflow may not be a real problem, if your integral types are large enough to hold a reasonable range of indices.
You cannot use an unsigned int in Java. In case of overflow, the low 32 bits are considered, and the high order bits are discarded. The unsigned right shift will help u treat the int as unsigned int. However, in C++ you won't have the overflow.
You are safe from integer overflows by using the way you said you already use, which is:
int mid = low + ((high - low) / 2);
Let you compiler do it's job to optimize this if it needs to.
Program synthesis techniques appear to solve such problems.
In this video, the programmer specifies constraints a) no overflow, b) no division, and c) no if-then-else. The synthesizer automatically came up with something pretty nice.
https://youtu.be/jZ-mMprVVBU
Related
I need to multiply two 8 byte (64 bit) arrays in the fastest way possible. The byte arrays are little endian. The arrays can be wrapped in a ByteBuffer and treated as little endian to easily resolve a java "long" value that correctly represents the bytes (but not the real nominal value since java longs are 2s compliment).
Java's standard way to handle large math is BigInteger. But that implementation is slow and unnecessary since im very strictly working with 64 bits x 64 bits. In addition, you can't throw the "long" value into one because the nominal value is incorrect, and I can't use the byte array directly because it's little endian. I need to be able to do this without having to use up more memory / CPU to reverse the array. This type of multiplication should be able to execute 1m+ times per second. BigInteger doesn't really come close to meeting that requirement anyway, so I'm trying to do it via splitting the high order bits from the low order bits, but I can't get it working consistently.
The high-order-bits-only code is only working for a subset of longs because even the intermediate addition can overflow. I got my current code from this answer....
high bits of long multiplication in Java?
Is there a more generic pattern for getting hi/lo order bits from 128 bit multiplication? That works for the largest long values?
Edit:
FWIW I'm prepared for the answer to be.. "cant do that in java, do it in c++ and call via JNI". Though I'm hoping someone can give a java solution before it comes to that.
As of Java 9 (which was a bit too new at the time this question was asked), there is now a trivial way to get the upper half of the 128-bit product of two signed 64-bit integers: Math.multiplyHigh
There is a relatively simple conversion from "upper half of signed product" to "upper half unsigned product" (see Hacker's Delight chapter 8), which can be used to implement an unsigned multiply high like this:
static long multiplyHighUnsigned(long x, long y) {
long signedUpperHalf = Math.multiplyHigh(x, y);
return signedUpperHalf + ((x >> 63) & y) + ((y >> 63) & x);
}
This has the potential to be more efficient (on platforms on which multiplyHigh is treated as an intrinsic function by the JIT) than the more manual approach used by the old answer, which I will leave below the line.
It can be done manually without BigInteger by splitting the longs up into two halves, creating the partial products, and then summing them up. Naturally the low half of the sum can be left out.
The partial products overlap, like this:
LL
LH
HL
HH
So the high halves of LH and HL must be added to the high result, and furthermore the low halves of LH and HL together with the high half of LL may carry into the bits of the high half of the result. The low half of LL is not used.
So something like this (only slightly tested):
static long hmul(long x, long y) {
long m32 = 0xffffffffL;
// split
long xl = x & m32;
long xh = x >>> 32;
long yl = y & m32;
long yh = y >>> 32;
// partial products
long t00 = xl * yl;
long t01 = xh * yl;
long t10 = xl * yh;
long t11 = xh * yh;
// resolve sum and carries
// high halves of t10 and t01 overlap with the low half of t11
t11 += (t10 >>> 32) + (t01 >>> 32);
// the sum of the low halves of t10 + t01 plus
// the high half of t00 may carry into the high half of the result
long tc = (t10 & m32) + (t01 & m32) + (t00 >>> 32);
t11 += tc >>> 32;
return t11;
}
This of course treats the input as unsigned, which does not mean they have to be positive in the sense that Java would treat them as positive, you can absolutely input -1501598000831384712L and -735932670715772870L and the right answer comes out, as confirmed by wolfram alpha.
If you are prepared to interface with native code, in C++ with MSVC you could use __umulh, and with GCC/Clang you can make the product as an __uint128_t and just shift it right, the codegen for that is actually fine, it doesn't cause a full 128x128 multiply.
I'm implementing Karatsuba multiplication in Scala (my choice) for an online course. Considering the algorithm is meant to multiply large numbers, I chose the BigInt type which is backed by Java BigInteger. I'd like to implement the algorithm efficiently, which using base 10 arithmetic is copied below from Wikipedia:
procedure karatsuba(num1, num2)
if (num1 < 10) or (num2 < 10)
return num1*num2
/* calculates the size of the numbers */
m = max(size_base10(num1), size_base10(num2))
m2 = floor(m/2)
/* split the digit sequences in the middle */
high1, low1 = split_at(num1, m2)
high2, low2 = split_at(num2, m2)
/* 3 calls made to numbers approximately half the size */
z0 = karatsuba(low1, low2)
z1 = karatsuba((low1 + high1), (low2 + high2))
z2 = karatsuba(high1, high2)
return (z2 * 10 ^ (m2 * 2)) + ((z1 - z2 - z0) * 10 ^ m2) + z0
Given that BigInteger is internally represented as an int[], if I can calculate m2 in terms of the int[], I can use bit shifting to extract the lower and higher halves of the number. Similarly, the last step can be achieved by bit shifting too.
However, it's easier said than done, as I can't seem to wrap my head around the logic. For example, if the max number is 999, the binary representation is 1111100111, lower half is 99 = 1100011, upper half is 9 = 1001. How do I get the above split?
Note:
There is an existing question that shows how to implement using arithmetic on BigInteger, but not bit shifting. Hence, my question is not a duplicate.
To be able to use bit shifting to do the splits and recombination, the base needs to be a power of two. Using two itself, as in the linked answer, is probably reasonable. Then the "length" of the inputs can be found directly with bitLength, and the split could be implemented as:
// x = a + 2^N b
BigInteger b = x.shiftRight(N);
BigInteger a = x.subtract(b.shiftLeft(N));
Where N is the size that a will have in bits.
Given that BigInteger is implemented with 32bit limbs, it makes sense to use 2³² as the base, ensuring that the big shifts involve only the movement of whole integers, and not also the slower code path where the BigInteger is shifted by a value between 1 and 31. This could be accomplished by rounding N to a multiple of 32.
The specific constant in this line,
if (N <= 2000) return x.multiply(y); // optimize this parameter
Should probably not be trusted too much, given that comment. For performance there should be some bound though, otherwise the recursive splitting goes too deeply. For example, when the size of the numbers is 32 or less, it's clearly better to just multiply, but probably a good cut-off is much higher. In this source of BigInteger itself, the cutoff is expressed in terms of the number of limbs instead of bits, and set to 80 (so 2560 bits) - it also has an other threshold above which it switches to 3-way Toom-Cook multiplication instead of Karatsuba multiplication.
I need to multiply two 8 byte (64 bit) arrays in the fastest way possible. The byte arrays are little endian. The arrays can be wrapped in a ByteBuffer and treated as little endian to easily resolve a java "long" value that correctly represents the bytes (but not the real nominal value since java longs are 2s compliment).
Java's standard way to handle large math is BigInteger. But that implementation is slow and unnecessary since im very strictly working with 64 bits x 64 bits. In addition, you can't throw the "long" value into one because the nominal value is incorrect, and I can't use the byte array directly because it's little endian. I need to be able to do this without having to use up more memory / CPU to reverse the array. This type of multiplication should be able to execute 1m+ times per second. BigInteger doesn't really come close to meeting that requirement anyway, so I'm trying to do it via splitting the high order bits from the low order bits, but I can't get it working consistently.
The high-order-bits-only code is only working for a subset of longs because even the intermediate addition can overflow. I got my current code from this answer....
high bits of long multiplication in Java?
Is there a more generic pattern for getting hi/lo order bits from 128 bit multiplication? That works for the largest long values?
Edit:
FWIW I'm prepared for the answer to be.. "cant do that in java, do it in c++ and call via JNI". Though I'm hoping someone can give a java solution before it comes to that.
As of Java 9 (which was a bit too new at the time this question was asked), there is now a trivial way to get the upper half of the 128-bit product of two signed 64-bit integers: Math.multiplyHigh
There is a relatively simple conversion from "upper half of signed product" to "upper half unsigned product" (see Hacker's Delight chapter 8), which can be used to implement an unsigned multiply high like this:
static long multiplyHighUnsigned(long x, long y) {
long signedUpperHalf = Math.multiplyHigh(x, y);
return signedUpperHalf + ((x >> 63) & y) + ((y >> 63) & x);
}
This has the potential to be more efficient (on platforms on which multiplyHigh is treated as an intrinsic function by the JIT) than the more manual approach used by the old answer, which I will leave below the line.
It can be done manually without BigInteger by splitting the longs up into two halves, creating the partial products, and then summing them up. Naturally the low half of the sum can be left out.
The partial products overlap, like this:
LL
LH
HL
HH
So the high halves of LH and HL must be added to the high result, and furthermore the low halves of LH and HL together with the high half of LL may carry into the bits of the high half of the result. The low half of LL is not used.
So something like this (only slightly tested):
static long hmul(long x, long y) {
long m32 = 0xffffffffL;
// split
long xl = x & m32;
long xh = x >>> 32;
long yl = y & m32;
long yh = y >>> 32;
// partial products
long t00 = xl * yl;
long t01 = xh * yl;
long t10 = xl * yh;
long t11 = xh * yh;
// resolve sum and carries
// high halves of t10 and t01 overlap with the low half of t11
t11 += (t10 >>> 32) + (t01 >>> 32);
// the sum of the low halves of t10 + t01 plus
// the high half of t00 may carry into the high half of the result
long tc = (t10 & m32) + (t01 & m32) + (t00 >>> 32);
t11 += tc >>> 32;
return t11;
}
This of course treats the input as unsigned, which does not mean they have to be positive in the sense that Java would treat them as positive, you can absolutely input -1501598000831384712L and -735932670715772870L and the right answer comes out, as confirmed by wolfram alpha.
If you are prepared to interface with native code, in C++ with MSVC you could use __umulh, and with GCC/Clang you can make the product as an __uint128_t and just shift it right, the codegen for that is actually fine, it doesn't cause a full 128x128 multiply.
I understand what the unsigned right shift operator ">>>" in Java does, but why do we need it, and why do we not need a corresponding unsigned left shift operator?
The >>> operator lets you treat int and long as 32- and 64-bit unsigned integral types, which are missing from the Java language.
This is useful when you shift something that does not represent a numeric value. For example, you could represent a black and white bit map image using 32-bit ints, where each int encodes 32 pixels on the screen. If you need to scroll the image to the right, you would prefer the bits on the left of an int to become zeros, so that you could easily put the bits from the adjacent ints:
int shiftBy = 3;
int[] imageRow = ...
int shiftCarry = 0;
// The last shiftBy bits are set to 1, the remaining ones are zero
int mask = (1 << shiftBy)-1;
for (int i = 0 ; i != imageRow.length ; i++) {
// Cut out the shiftBits bits on the right
int nextCarry = imageRow & mask;
// Do the shift, and move in the carry into the freed upper bits
imageRow[i] = (imageRow[i] >>> shiftBy) | (carry << (32-shiftBy));
// Prepare the carry for the next iteration of the loop
carry = nextCarry;
}
The code above does not pay attention to the content of the upper three bits, because >>> operator makes them
There is no corresponding << operator because left-shift operations on signed and unsigned data types are identical.
>>> is also the safe and efficient way of finding the rounded mean of two (large) integers:
int mid = (low + high) >>> 1;
If integers high and low are close to the the largest machine integer, the above will be correct but
int mid = (low + high) / 2;
can get a wrong result because of overflow.
Here's an example use, fixing a bug in a naive binary search.
Basically this has to do with sign (numberic shifts) or unsigned shifts (normally pixel related stuff).
Since the left shift, doesn't deal with the sign bit anyhow, it's the same thing (<<< and <<)...
Either way I have yet to meet anyone that needed to use the >>>, but I'm sure they are out there doing amazing things.
As you have just seen, the >> operator automatically fills the
high-order bit with its previous contents each time a shift occurs.
This preserves the sign of the value. However, sometimes this is
undesirable. For example, if you are shifting something that does not
represent a numeric value, you may not want sign extension to take
place. This situation is common when you are working with pixel-based
values and graphics. In these cases you will generally want to shift a
zero into the high-order bit no matter what its initial value was.
This is known as an unsigned shift. To accomplish this, you will use
java’s unsigned, shift-right operator,>>>, which always shifts zeros
into the high-order bit.
Further reading:
http://henkelmann.eu/2011/02/01/java_the_unsigned_right_shift_operator
http://www.java-samples.com/showtutorial.php?tutorialid=60
The signed right-shift operator is useful if one has an int that represents a number and one wishes to divide it by a power of two, rounding toward negative infinity. This can be nice when doing things like scaling coordinates for display; not only is it faster than division, but coordinates which differ by the scale factor before scaling will differ by one pixel afterward. If instead of using shifting one uses division, that won't work. When scaling by a factor of two, for example, -1 and +1 differ by two, and should thus differ by one afterward, but -1/2=0 and 1/2=0. If instead one uses signed right-shift, things work out nicely: -1>>1=-1 and 1>>1=0, properly yielding values one pixel apart.
The unsigned operator is useful either in cases where either the input is expected to have exactly one bit set and one will want the result to do so as well, or in cases where one will be using a loop to output all the bits in a word and wants it to terminate cleanly. For example:
void processBitsLsbFirst(int n, BitProcessor whatever)
{
while(n != 0)
{
whatever.processBit(n & 1);
n >>>= 1;
}
}
If the code were to use a signed right-shift operation and were passed a negative value, it would output 1's indefinitely. With the unsigned-right-shift operator, however, the most significant bit ends up being interpreted just like any other.
The unsigned right-shift operator may also be useful when a computation would, arithmetically, yield a positive number between 0 and 4,294,967,295 and one wishes to divide that number by a power of two. For example, when computing the sum of two int values which are known to be positive, one may use (n1+n2)>>>1 without having to promote the operands to long. Also, if one wishes to divide a positive int value by something like pi without using floating-point math, one may compute ((value*5468522205L) >>> 34) [(1L<<34)/pi is 5468522204.61, which rounded up yields 5468522205]. For dividends over 1686629712, the computation of value*5468522205L would yield a "negative" value, but since the arithmetically-correct value is known to be positive, using the unsigned right-shift would allow the correct positive number to be used.
A normal right shift >> of a negative number will keep it negative. I.e. the sign bit will be retained.
An unsigned right shift >>> will shift the sign bit too, replacing it with a zero bit.
There is no need to have the equivalent left shift because there is only one sign bit and it is the leftmost bit so it only interferes when shifting right.
Essentially, the difference is that one preserves the sign bit, the other shifts in zeros to replace the sign bit.
For positive numbers they act identically.
For an example of using both >> and >>> see BigInteger shiftRight.
In the Java domain most typical applications the way to avoid overflows is to use casting or Big Integer, such as int to long in the previous examples.
int hiint = 2147483647;
System.out.println("mean hiint+hiint/2 = " + ( (((long)hiint+(long)hiint)))/2);
System.out.println("mean hiint*2/2 = " + ( (((long)hiint*(long)2)))/2);
BigInteger bhiint = BigInteger.valueOf(2147483647);
System.out.println("mean bhiint+bhiint/2 = " + (bhiint.add(bhiint).divide(BigInteger.valueOf(2))));
System.out.println(Integer.MAX_VALUE + 1 == Integer.MIN_VALUE);
is true.
I understand that integer in Java is 32 bit and can't go above 231-1, but I can't understand why adding 1 to its MAX_VALUE results in MIN_VALUE and not in some kind of exception. Not mentioning something like transparent conversion to a bigger type, like Ruby does.
Is this behavior specified somewhere? Can I rely on it?
Because the integer overflows. When it overflows, the next value is Integer.MIN_VALUE. Relevant JLS
If an integer addition overflows, then the result is the low-order bits of the mathematical sum as represented in some sufficiently large two's-complement format. If overflow occurs, then the sign of the result is not the same as the sign of the mathematical sum of the two operand values.
The integer storage gets overflowed and that is not indicated in any way, as stated in JSL 3rd Ed.:
The built-in integer operators do not indicate overflow or underflow in any way. Integer operators can throw a NullPointerException if unboxing conversion (§5.1.8) of a null reference is required. Other than that, the only integer operators that can throw an exception (§11) are the integer divide operator / (§15.17.2) and the integer remainder operator % (§15.17.3), which throw an ArithmeticException if the right-hand operand is zero, and the increment and decrement operators ++(§15.15.1, §15.15.2) and --(§15.14.3, §15.14.2), which can throw an OutOfMemoryError if boxing conversion (§5.1.7) is required and there is not sufficient memory available to perform the conversion.
Example in a 4-bits storage:
MAX_INT: 0111 (7)
MIN_INT: 1000 (-8)
MAX_INT + 1:
0111+
0001
----
1000
You must understand how integer values are represented in binary form, and how binary addition works. Java uses a representation called two's complement, in which the first bit of the number represents its sign. Whenever you add 1 to the largest java Integer, which has a bit sign of 0, then its bit sign becomes 1 and the number becomes negative.
This links explains with more details: http://www.cs.grinnell.edu/~rebelsky/Espresso/Readings/binary.html#integers-in-java
--
The Java Language Specification treats this behavior here: http://docs.oracle.com/javase/specs/jls/se6/html/expressions.html#15.18.2
If an integer addition overflows, then the result is the low-order bits of the mathematical sum as represented in some sufficiently large two's-complement format. If overflow occurs, then the sign of the result is not the same as the sign of the mathematical sum of the two operand values.
Which means that you can rely on this behavior.
On most processors, the arithmetic instructions have no mode to fault on an overflow. They set a flag that must be checked. That's an extra instruction so probably slower. In order for the language implementations to be as fast as possible, the languages are frequently specified to ignore the error and continue. For Java the behaviour is specified in the JLS. For C, the language does not specify the behaviour, but modern processors will behave as Java.
I believe there are proposals for (awkward) Java SE 8 libraries to throw on overflow, as well as unsigned operations. A behaviour, I believe popular in the DSP world, is clamp the values at the maximums, so Integer.MAX_VALUE + 1 == Integer.MAX_VALUE [not Java].
I'm sure future languages will use arbitrary precision ints, but not for a while yet. Requires more expensive compiler design to run quickly.
The same reason why the date changes when you cross the international date line: there's a discontinuity there. It's built into the nature of binary addition.
This is a well known issue related to the fact that Integers are represented as two's complement down at the binary layer. When you add 1 to the max value of a two's complement number you get the min value. Honestly, all integers behaved this way before java existed, and changing this behavior for the Java language would have added more overhead to integer math, and confused programmers coming from other languages.
When you add 3 (in binary 11) to 1 (in binary 1), you must change to 0 (in binary 0) all binary 1 starting from the right, until you got 0, which you should change to 1. Integer.MAX_VALUE has all places filled up with 1 so there remain only 0s.
Easy to understand with byte example=>
byte a=127;//max value for byte
byte b=1;
byte c=(byte) (a+b);//assigns -128
System.out.println(c);//prints -128
Here we are forcing addition and casting it to be treated as byte.
So what will happen is that when we reach 127 (largest possible value for a byte) and we add plus 1 then the value flips (as shown in image) from 127 and it becomes -128.
The value starts circling around the type.
Same is for integer.
Also integer + integer stays integer ( unlike byte + byte which gets converted to int [unless casted forcefully as above]).
int int1=Integer.MAX_VALUE+1;
System.out.println(int1); //prints -2147483648
System.out.println(Integer.MIN_VALUE); //prints -2147483648
//below prints 128 as converted to int as not forced with casting
System.out.println(Byte.MAX_VALUE+1);
Cause overflow and two-compliant nature count goes on "second loop", we was on far most right position 2147483647 and after summing 1, we appeared at far most left position -2147483648, next incrementing goes -2147483647, -2147483646, -2147483645, ... and so forth to the far most right again and on and on, its nature of summing machine on this bit depth.
Some examples:
int a = 2147483647;
System.out.println(a);
gives: 2147483647
System.out.println(a+1);
gives: -2147483648 (cause overflow and two-compliant nature count goes on "second loop", we was on far most right position 2147483647 and after summing 1, we appeared at far most left position -2147483648, next incrementing goes -2147483648, -2147483647, -2147483646, ... and so fores to the far most right again and on and on, its nature of summing machine on this bit depth)
System.out.println(2-a);
gives:-2147483645 (-2147483647+2 seems mathematical logical)
System.out.println(-2-a);
gives: 2147483647 (-2147483647-1 -> -2147483648, -2147483648-1 -> 2147483647 some loop described in previous answers)
System.out.println(2*a);
gives: -2 (2147483647+2147483647 -> -2147483648+2147483646 again mathematical logical)
System.out.println(4*a);
gives: -4 (2147483647+2147483647+2147483647+2147483647 -> -2147483648+2147483646+2147483647+2147483647 -> -2-2 (according to last answer) -> -4)`