The other day I decided to write an implementation of radix sort in Java. Radix sort is supposed to be O(k*N), but mine ended up being O(k^2*N) because of the process of breaking each number down into single digits. I broke out each digit by modding (%) away the preceding digits and dividing by ten to eliminate the succeeding ones. I asked my professor if there would be a more efficient way of doing this, and he said to use bit operators. Now for my question: which of these would be the fastest way to break down each number in Java? 1) The method stated above. 2) Convert the number to a String and use substrings. 3) Use bit operations.
If 3), then how would that work?
As a hint, try using a radix other than 10, since computers handle binary arithmetic better than decimal.
x >>> n is equivalent to x / 2^n
x & (2^n - 1) is equivalent to x % 2^n
By the way, Java's >> performs sign extension, which is probably not what you want in this case. Use >>> instead.
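For instance, with a radix of 256 each digit is one byte, and extracting it becomes a single shift and mask instead of repeated division. A rough sketch (the helper name is mine, and it assumes non-negative keys):

// Digit number "pass" of "value" in base 256: pass 0 is the least
// significant byte, pass 3 the most significant.
static int digit(int value, int pass) {
    return (value >>> (8 * pass)) & 0xFF; // x >>> n is x / 2^n; x & 255 is x % 256
}

A radix sort over 32-bit keys then needs only four counting passes, calling digit(key, pass) on each element.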
Radix_sort_(Java)
The line of code that does this:
int key = (a[p] & mask) >> rshift;
is the bit manipulation part.
& is the operator to do a bitwise AND and >> is a right-shift.
I am writing a program to do number system conversions.
The problem I am having is that when I input a two's complement number, Java takes it in as a positive binary number. For example, if I input 11111101, intending -3, the program takes in 253. How can I make it so the program takes in -3?
Though not addressing the title directly, sign-extending your input is a step towards that goal. Here are some ways to sign-extend that work in a more general case, not just when the input is 8 or 16 or 32 bits.
Relying on arithmetic right shift: (x << k) >> k where k is the number of "unused" bits. So for example to extend an 8-bit number into a 32-bit number, k would be 24.
A XOR trick: (x ^ m) - m, where m is a mask that has a 1 exactly where the sign bit of your input is. E.g. for 8-bit input, m = 1 << 7. The advantage is that it depends only on the input, not on the type you're converting to; for Java that's not really a consideration, but it makes the trick attractive in templated C++ code. (Both of the first two techniques are illustrated in the sketch after this list.)
Explicit wrapping, as in jsheeran's comment.
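Here is how the first two techniques look for the 8-bit case (a small sketch; the variable names are mine):

int x = 0b11111101;           // read in as 253; we want -3

// Arithmetic right shift: move the input's sign bit up to bit 31, then back down.
int k = 24;                   // 32 - 8 "unused" bits
int viaShift = (x << k) >> k; // -3

// XOR trick: m marks the sign bit of the 8-bit input.
int m = 1 << 7;
int viaXor = (x ^ m) - m;     // -3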
Currently I am working with the % operator in Java. I thought I understood it, but after coming across this question I am now confused. I know that 10 % 3 = remainder 1. My question is: why does 3 % 8 = remainder 3? I thought it would instead equal 0, because 8 goes into 3 zero times. Another example: why is the remainder of 2 % 8 equal to 2 and not zero as well? If someone could explain why this is, that would be awesome. Thank you!
The remainder operator is explained in the Java tutorial at https://docs.oracle.com/javase/tutorial/java/nutsandbolts/op1.html
If you are looking for a modulo operator, then use Math.floorMod, as it treats negative numbers as you'd expect. The documentation gives examples of the difference between % and floorMod: http://docs.oracle.com/javase/8/docs/api/java/lang/Math.html#floorMod-int-int-
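A quick illustration of the difference:

System.out.println(-7 % 3);               // -1: % takes the sign of the dividend
System.out.println(Math.floorMod(-7, 3)); //  2: floorMod takes the sign of the divisor
System.out.println(3 % 8);                //  3: 3 = 0 * 8 + 3, so the remainder is 3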
This is the correct behavior, mathematically speaking.
Hopefully people will indulge me a digression, but the integers mod n are all of the integers from 0 to n - 1. They form a group under addition that's important to number theory and cryptography. In fact, if you do
for (int i = 0; i < x; i++) {
    System.out.println(i % n);
}
this will give you all the numbers from 0 to (n - 1) over and over again, just like the group.
Without going into details, a mathematical group is a set with an operation that behaves more or less like you'd expect it to with "ordinary" arithmetic. For example, if a set forms a group under multiplication, that means that multiplication "works" for that set.
If n is prime, then the integers mod n form a finite field, meaning that arithmetic there behaves quite a bit like ordinary arithmetic. (Finite fields whose size is a prime power, e.g. 32, also exist, but they are not the integers mod n.)
TL;DR This behavior is by design. It's actually mathematically perfectly correct as well.
Does the Java compiler or the JIT compiler optimize divisions or multiplications by a constant power of two down to bitshifting?
For example, are the following two statements optimized to be the same?
int median = start + ((end - start) >>> 1);
int median = start + (end - start) / 2;
(basically this question but for Java)
While the accepted answer is right in the sense that the division can't simply be replaced by a right shift, the benchmark is terribly wrong. Any Java benchmark running for less than one second is probably measuring the interpreter's performance, not something you usually care about.
I couldn't resist and wrote my own benchmark, which mainly shows that it's all more complicated. I'm not trying to fully explain the results, but I can say that
a general division is a damn slow operation
it gets avoided as much as possible
division by a constant always gets optimized somehow, AFAIK
division by a power of two gets replaced by a right shift and an adjustment for negative numbers
a manually optimized expression might be better
No, the Java compiler doesn't do that, because it can't be sure what the sign of (end - start) will be. Why does this matter? Bit shifts on negative integers yield a different result than ordinary division. Here is a simple demonstration:
System.out.println((-10) >> 1); // prints -5
System.out.println((-11) >> 1); // prints -6
System.out.println((-11) / 2); // prints -5
Also note that I used >> instead of >>>. A >>> is an unsigned bitshift, while >> is signed.
System.out.println((-10) >>> 1); // prints 2147483643
@Mystical: I wrote a benchmark that shows that the compiler / JVM doesn't do that optimization: https://ideone.com/aKDShA
If the JVM does not do it, you can easily do it yourself.
As noted, right shifts on negative numbers do not behave the same as division because the result is rounded in the wrong direction. If you know that the dividend is non-negative, you can safely replace the division by a shift. If it might be negative, you can use the following technique.
If you can express your original code in this form:
int result = x / (1 << shift);
You can replace it with this optimized code:
int result = (x + (x >> 31 >>> (32 - shift))) >> shift;
Or, alternatively:
int result = (x + ((x >> 31) & ((1 << shift) - 1))) >> shift;
These formulae compensate for the incorrect rounding by adding a small number computed from the sign bit of the dividend. This works for any x with all shift values from 1 to 30.
If the shift is 1 (i.e., you are dividing by 2) then the >> 31 can be removed in the first formula to give this very tidy snippet:
int result = (x + (x >>> 31)) >> 1;
I have found these techniques to be faster even when the shift is non-constant, but obviously they benefit the most if the shift is constant. Note: For long x instead of int, change 31 and 32 respectively to 63 and 64.
Examining the generated machine code shows that (unsurprisingly) the HotSpot Server VM can do this optimization automatically when the shift is constant, but (also unsurprisingly) the HotSpot Client VM is too stupid to do so.
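If you want to convince yourself the formulae are right, here is a small self-check (my own sketch, not part of the answer above) comparing the corrected shift against plain division:

int shift = 3;
for (int x : new int[] { -100, -9, -1, 0, 1, 9, 100 }) {
    int byDivision = x / (1 << shift);
    int byShift = (x + ((x >> 31) >>> (32 - shift))) >> shift;
    System.out.println(x + " -> " + byDivision + " == " + byShift);
}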
I understand what the unsigned right shift operator ">>>" in Java does, but why do we need it, and why do we not need a corresponding unsigned left shift operator?
The >>> operator lets you treat int and long as 32- and 64-bit unsigned integral types, which are missing from the Java language.
This is useful when you shift something that does not represent a numeric value. For example, you could represent a black-and-white bitmap image using 32-bit ints, where each int encodes 32 pixels on the screen. If you need to scroll the image to the right, you would prefer the bits on the left of an int to become zeros, so that you can easily put in the bits carried over from the adjacent int:
int shiftBy = 3;
int[] imageRow = ...
int carry = 0;
// The last shiftBy bits are set to 1, the remaining ones are zero
int mask = (1 << shiftBy) - 1;
for (int i = 0; i != imageRow.length; i++) {
    // Cut out the shiftBy bits on the right
    int nextCarry = imageRow[i] & mask;
    // Do the shift, and move the carry into the freed upper bits
    imageRow[i] = (imageRow[i] >>> shiftBy) | (carry << (32 - shiftBy));
    // Prepare the carry for the next iteration of the loop
    carry = nextCarry;
}
The code above does not pay attention to the original content of the upper three bits, because the >>> operator sets them to zero.
There is no corresponding unsigned left-shift operator because left-shift operations on signed and unsigned data types are identical.
>>> is also the safe and efficient way of finding the rounded mean of two (large) integers:
int mid = (low + high) >>> 1;
If integers high and low are close to the largest machine integer, the above will be correct but
int mid = (low + high) / 2;
can get a wrong result because of overflow.
Here's an example use, fixing a bug in a naive binary search.
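To see the overflow concretely (the values here are just an illustration):

int low = 2_000_000_000, high = 2_100_000_000;
System.out.println((low + high) / 2);   // -97483648: the int sum wrapped around
System.out.println((low + high) >>> 1); // 2050000000: the carry lands in the sign bit,
                                        // and the unsigned shift brings it back down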
Basically this has to do with signed (numeric) shifts versus unsigned shifts (normally pixel-related stuff).
Since a left shift doesn't deal with the sign bit anyhow, a hypothetical <<< would be the same thing as <<...
Either way, I have yet to meet anyone who needed to use the >>>, but I'm sure they are out there doing amazing things.
As you have just seen, the >> operator automatically fills the
high-order bit with its previous contents each time a shift occurs.
This preserves the sign of the value. However, sometimes this is
undesirable. For example, if you are shifting something that does not
represent a numeric value, you may not want sign extension to take
place. This situation is common when you are working with pixel-based
values and graphics. In these cases you will generally want to shift a
zero into the high-order bit no matter what its initial value was.
This is known as an unsigned shift. To accomplish this, you will use
Java's unsigned, shift-right operator, >>>, which always shifts zeros
into the high-order bit.
Further reading:
http://henkelmann.eu/2011/02/01/java_the_unsigned_right_shift_operator
http://www.java-samples.com/showtutorial.php?tutorialid=60
The signed right-shift operator is useful if one has an int that represents a number and one wishes to divide it by a power of two, rounding toward negative infinity. This can be nice when doing things like scaling coordinates for display; not only is it faster than division, but coordinates which differ by the scale factor before scaling will differ by one pixel afterward. If instead of using shifting one uses division, that won't work. When scaling by a factor of two, for example, -1 and +1 differ by two, and should thus differ by one afterward, but -1/2=0 and 1/2=0. If instead one uses signed right-shift, things work out nicely: -1>>1=-1 and 1>>1=0, properly yielding values one pixel apart.
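The contrast in code (a minimal illustration):

// Scaling coordinates down by 2: shifting rounds toward negative infinity,
// while division truncates toward zero.
System.out.println(-1 >> 1); // -1
System.out.println(1 >> 1);  //  0: still one pixel apart
System.out.println(-1 / 2);  //  0
System.out.println(1 / 2);   //  0: collapsed onto the same pixel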
The unsigned operator is useful either in cases where the input is expected to have exactly one bit set and one wants the result to do so as well, or in cases where one will be using a loop to output all the bits in a word and wants it to terminate cleanly. For example:
void processBitsLsbFirst(int n, BitProcessor whatever)
{
    while (n != 0)
    {
        whatever.processBit(n & 1);
        n >>>= 1;
    }
}
If the code were to use a signed right-shift operation and were passed a negative value, it would output 1's indefinitely. With the unsigned-right-shift operator, however, the most significant bit ends up being interpreted just like any other.
The unsigned right-shift operator may also be useful when a computation would, arithmetically, yield a positive number between 0 and 4,294,967,295 and one wishes to divide that number by a power of two. For example, when computing the sum of two int values which are known to be positive, one may use (n1+n2)>>>1 without having to promote the operands to long. Also, if one wishes to divide a positive int value by something like pi without using floating-point math, one may compute ((value*5468522205L) >>> 34) [(1L<<34)/pi is 5468522204.61, which rounded up yields 5468522205]. For dividends over 1686629712, the computation of value*5468522205L would yield a "negative" value, but since the arithmetically-correct value is known to be positive, using the unsigned right-shift would allow the correct positive number to be used.
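Here is that last trick as runnable code (a sketch built around the constant quoted above):

int value = 2_000_000_000;                   // above 1686629712, so the product wraps
long product = value * 5468522205L;          // negative when viewed as a signed long
System.out.println((int) (product >>> 34));  // 636619772: still correct, thanks to >>>
System.out.println((int) (value / Math.PI)); // 636619772, for comparison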
A normal right shift >> of a negative number will keep it negative. I.e. the sign bit will be retained.
An unsigned right shift >>> will shift the sign bit too, replacing it with a zero bit.
There is no need to have the equivalent left shift because there is only one sign bit and it is the leftmost bit so it only interferes when shifting right.
Essentially, the difference is that one preserves the sign bit, the other shifts in zeros to replace the sign bit.
For positive numbers they act identically.
For an example of using both >> and >>> see BigInteger shiftRight.
In Java, the typical way to avoid overflow in most applications is to use casting or BigInteger, widening int to long as in the previous examples:
int hiint = 2147483647;
System.out.println("mean hiint+hiint/2 = " + (((long) hiint + (long) hiint) / 2));
System.out.println("mean hiint*2/2 = " + (((long) hiint * (long) 2) / 2));
BigInteger bhiint = BigInteger.valueOf(2147483647);
System.out.println("mean bhiint+bhiint/2 = " + bhiint.add(bhiint).divide(BigInteger.valueOf(2)));
We can shift using the >> operator, and we can use / to divide in Java. What I am asking is what really happens behind the scenes when we do these operations. Are the two exactly the same or not?
No, absolutely not the same.
You can use >> to divide, yes, but only by powers of 2, because >> shifts all the bits to the right, and each position shifted divides the number by 2.
This is just a consequence of how binary arithmetic works. It holds directly for unsigned numbers; for signed ones the result depends on the representation in use and on what kind of shift it is.
E.g.
122 = 01111010 >> 1 = 00111101 = 61
Check this out for an explanation of bit shifting:
What are bitwise shift (bit-shift) operators and how do they work?
Once you understand that, you should understand the difference between that and the division operation.