Java "Bit Shifting" Tutorial? [closed] - java

I would be thankful for a good tutorial that explains to Java newbies how all the "bit shifting" in Java works.
I always stumble across it, but have never understood how it works. It should explain all the operations and concepts that are possible with bit shifting/bit manipulation in Java.
This is just an example of what I mean (but I am looking for a tutorial that explains every possible operation):
byte b = (byte)(l >> (8 - i << 3));
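For what it's worth, here is a minimal sketch of what that line probably does. The question never shows the surrounding code, so it is an assumption here that l is a long and that i runs from 1 to 8; the class and method names are made up for illustration.

```java
class ByteExtractDemo {
    // Assumption: l is a long and i runs from 1 to 8 (not stated in the question).
    static byte[] toBytes(long l) {
        byte[] out = new byte[8];
        for (int i = 1; i <= 8; i++) {
            // '-' binds tighter than '<<', so the shift distance is (8 - i) << 3,
            // i.e. (8 - i) * 8 bits. The cast keeps only the low 8 bits.
            out[i - 1] = (byte) (l >> (8 - i << 3));
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] bytes = toBytes(0x0102030405060708L);
        System.out.println(java.util.Arrays.toString(bytes)); // [1, 2, 3, 4, 5, 6, 7, 8]
    }
}
```

Under those assumptions the expression picks out the i-th byte of the long, most significant byte first.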

Well, the official Java tutorial Bitwise and Bit Shift Operators covers the actual operations that are available in Java, and how to invoke them.
If you're wondering "what can I do with bit-shifting", then that's not Java specific, and since it's a low-level technique I'm not aware of any list of "cool things you can do" per se. It'd be worth becoming familiar with the definitions, and keeping your eyes open for other code where this is used, to see what they've done.
Note that often bit-twiddling is an efficiency gain at the expense of clarity. For example, a << 1 is usually the same as a * 2 but arguably less clear. Repeated XORs can swap two numbers without using a temporary variable, but it's generally considered better form to write the code more clearly with the temporary variable (or even better, in a utility method). So in this respect it's hard to give great examples, because you're not likely to achieve anything new or profound on an architecture level; it's all about the low-level details. (And I'd estimate that a vast number of uses of bit-twiddling "in the wild" are instances of premature optimisation.)
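To make the two examples above concrete, here is a small sketch (the class and method names are made up for illustration). For a Java int, a << 1 and a * 2 wrap around in the same way on overflow, so the difference really is only about readability.

```java
class ShiftVsMultiplyDemo {
    // XOR swap of two array slots. The guard matters: XOR-swapping a slot
    // with itself would zero it, which is one reason a temporary is clearer.
    static void xorSwap(int[] v, int i, int j) {
        if (i == j) {
            return;
        }
        v[i] ^= v[j];
        v[j] ^= v[i];
        v[i] ^= v[j];
    }

    public static void main(String[] args) {
        int a = 21;
        System.out.println((a << 1) + " " + (a * 2)); // 42 42

        int[] v = {3, 5};
        xorSwap(v, 0, 1);
        System.out.println(v[0] + " " + v[1]); // 5 3
    }
}
```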

When using the shift operators, be very careful not to repeat a common error!
As the author of the accepted answer to a related SO post mentions:
"In some languages, applying the shift operators to any datatype smaller than int automatically resizes the operand to be an int."
Java is one of those languages. This is absolutely crucial to remember when operating on bytes, for example; otherwise you may get unexpected results (as I did).
Given a byte with the following bit pattern:
1001 0000
When I tried to bit shift by 4, and assigned to an int, such as:
int value = byteValue >>> 4;
I would expect to have:
0000 1001 (or a value of 9)
But I would get a HUGE number! That's because byteValue is sign-extended to an int BEFORE the bit shift operation, so the shift is applied to 1111 1111 1111 1111 1111 1111 1001 0000 and results in this instead:
0000 1111 1111 1111 1111 1111 1111 1001
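A minimal sketch of the pitfall and the usual workaround, which is to mask with 0xFF before shifting so that the sign-extended upper bits are cleared. The class and method names are made up for illustration.

```java
class ByteShiftDemo {
    // The byte is promoted to int (with sign extension) before the shift.
    static int highNibbleNaive(byte b) {
        return b >>> 4;
    }

    // Masking first leaves only the original 8 bits in the int.
    static int highNibbleMasked(byte b) {
        return (b & 0xFF) >>> 4;
    }

    public static void main(String[] args) {
        byte byteValue = (byte) 0x90; // bit pattern 1001 0000
        System.out.println(highNibbleNaive(byteValue));  // 268435449 (0x0FFFFFF9)
        System.out.println(highNibbleMasked(byteValue)); // 9
    }
}
```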

There is an infinite number of possible combinations. However, they are all built from one or more of the following operators:
>> shift right with sign extension.
>>> shift right without sign extension.
<< shift left.
To get an understanding, I suggest you write the binary numbers on paper and work out what happens. Reading about it in a tutorial won't guarantee understanding, especially if tutorials haven't helped so far.
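If pen and paper gets tedious, a small helper that prints all 32 bits can stand in for it. This is only a sketch; the class name and the bits helper are made up for illustration.

```java
class ShiftOperatorsDemo {
    // Renders all 32 bits, zero-padded, so the bits shifted in are visible.
    static String bits(int x) {
        return String.format("%32s", Integer.toBinaryString(x)).replace(' ', '0');
    }

    public static void main(String[] args) {
        int x = -16; // 1111...1111 0000
        System.out.println(bits(x) + "  x");
        System.out.println(bits(x >> 2) + "  x >> 2  = " + (x >> 2));   // -4
        System.out.println(bits(x >>> 2) + "  x >>> 2 = " + (x >>> 2)); // 1073741820
        System.out.println(bits(x << 2) + "  x << 2  = " + (x << 2));   // -64
    }
}
```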

There is simple but clear tutorial that I find useful here

It's not exactly a tutorial, but I have a personal library of bit-shifting functions in Java which you are very welcome to study!
Also, if you do a Google search for "bitwise tricks" you will find a lot of material. Much of it is in C/C++ but is generally trivial to convert to Java, as most of the syntax is the same.

Here are the details of how bit shifting works.
There is some non-intuitive behavior that is not covered by the official tutorial. For instance, the right operand has a limited range (0-31 for int, 0-63 for long), and you will not get a warning if you exceed that range. Only the low 5 bits (for int) or 6 bits (for long) of the shift count are used, i.e. count & 31 or count & 63, which may give behavior other than you expect.
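A quick sketch of that range behavior (the class and method names are made up for illustration). Note that a negative count is masked as well, so it is not the same as Java's % operator for negative values.

```java
class ShiftCountDemo {
    static int shiftLeft(int value, int count) {
        return value << count; // only count & 31 is used for an int operand
    }

    static long shiftLeft(long value, int count) {
        return value << count; // only count & 63 is used for a long operand
    }

    public static void main(String[] args) {
        System.out.println(shiftLeft(1, 32));  // 1, same as 1 << 0
        System.out.println(shiftLeft(1, 33));  // 2, same as 1 << 1
        System.out.println(shiftLeft(1, -1));  // -2147483648, same as 1 << 31
        System.out.println(shiftLeft(1L, 64)); // 1, same as 1L << 0
    }
}
```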

This site seems to give a pretty good tutorial on what you can do with bit manipulation (it is not specific to Java, but the examples are pretty easy to translate):
http://www.bogotobogo.com/cplusplus/quiz_bit_manipulation.html
The tutorial above covers:
Bitwise Operations
Setting and Clearing a Bit
Displaying an Integer with Bits
Converting Decimal to Hex
The Number of Bits Set in an Integer (Number of Ones)
The Bit Set Position of an Integer
In-Place Integer Swap with Bit Manipulation
The Number of Bits Required to Convert an Integer A to Integer B
Swap Odd and Even Bits in an Integer
What (n & (n-1) == 0) is checking?
Two's Complement
Fliping n-th bit of an integer
Floating Point Number Bit Pattern
Bit pattern palindrome of an integer
Here's a file that has a bunch of java implementations
http://geekviewpoint.com/
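To give a flavour of how a few of those topics translate to Java, here is a hedged sketch. It is not taken from either site, and the class and method names are made up for illustration. One Java-specific point: == binds tighter than &, so the power-of-two check needs the extra parentheses.

```java
class BitTricksDemo {
    static int setBit(int x, int n)   { return x | (1 << n); }
    static int clearBit(int x, int n) { return x & ~(1 << n); }
    static int flipBit(int x, int n)  { return x ^ (1 << n); }

    static boolean isBitSet(int x, int n) {
        return (x & (1 << n)) != 0;
    }

    // True when n is 0 or a power of two: n - 1 flips the lowest set bit
    // and everything below it, so the AND clears that bit.
    static boolean hasAtMostOneBitSet(int n) {
        return (n & (n - 1)) == 0;
    }

    // Number of bit positions in which a and b differ.
    static int bitsToConvert(int a, int b) {
        return Integer.bitCount(a ^ b);
    }

    public static void main(String[] args) {
        System.out.println(Integer.toBinaryString(setBit(0b1000, 1))); // 1010
        System.out.println(hasAtMostOneBitSet(64));                    // true
        System.out.println(bitsToConvert(29, 15));                     // 2
    }
}
```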

These are two good tutorials I found while learning about bit shifting. They aren't in Java, but most languages use the same operators and the theory is the same.
Bit twiddling
PHP Bitwise Tutorial by Jim Plush


Exact mixed comparison BigInteger and double in Java

Just noticed that Python and JavaScript have exact comparison.
Example in Python:
>>> 2**1023+1 > 8.98846567431158E307
True
>>> 2**1023-1 < 8.98846567431158E307
True
And JavaScript:
> 2n**1023n+1n > 8.98846567431158E307
true
> 2n**1023n-1n < 8.98846567431158E307
true
Anything similar available for Java, except converting both arguments to
BigDecimal?
Preliminary answer, i.e. verbal solution sketch:
I am skeptical about a solution that would convert
to BigDecimal, since this conversion results in
a shift of the base from base=2 to base=10. As soon
as the exponent of the Java double floating point value
is different from the binary precision, this leads to additional digits and lengthy
pow() operations, which one can verify by inspecting
some open source BigDecimal(double)
constructor implementation.
One can get the mantissa via Double.doubleToRawLongBits(d).
If the Java double floating point value is not a sub-normal,
all that needs to be done is (raw & DOUBLE_SNIF_MASK) +
(DOUBLE_SNIF_MASK+1), where DOUBLE_SNIF_MASK is 0x000fffffffffffffL.
This means the integer Java primitive type long is enough to
carry the mantissa. The challenge is now to perform a comparison
that also takes the exponent of the double into account.
But I must admit I didn't have time yet to work
out some Java code. I also have in mind some
optimizations using bitLength() of the other
argument, which is in this setting a BigInteger.
The use of bitLength() would speed up the comparison.
A simple heuristic can implement a fast path, so that
the mantissa can be ignored. Already the exponent
of the double and the bitLength() of the BigInteger
give enough information for a comparison result.
As soon as I have time and a prototype running, I might
publish some Java code fragment here. But maybe somebody
faced the problem already. My general hypothesis is that a fast or
even ultra fast routine is possible. But I didn't have
much time to search the internet and to find an
implementation; that's why I deferred the problem to
Stack Overflow. Maybe somebody else had the same
problem as well and/or might point to a complete solution?
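The verbal sketch above could look roughly like the following in Java. This is an untuned illustration, not a finished routine: the class name, method names, and FRACTION_MASK (playing the role of DOUBLE_SNIF_MASK) are made up here, and rejecting NaN with an exception is an arbitrary choice.

```java
import java.math.BigInteger;

class ExactCompare {
    // Low 52 bits of the raw representation hold the stored fraction.
    private static final long FRACTION_MASK = 0x000fffffffffffffL;

    /** Returns a negative, zero, or positive int as b is <, ==, or > d. */
    static int compare(BigInteger b, double d) {
        if (Double.isNaN(d)) {
            throw new IllegalArgumentException("NaN is unordered");
        }
        if (Double.isInfinite(d)) {
            return d > 0 ? -1 : 1;
        }
        int signB = b.signum();
        int signD = d > 0 ? 1 : (d < 0 ? -1 : 0); // -0.0 counts as zero
        if (signB != signD) {
            return Integer.compare(signB, signD);
        }
        if (signB == 0) {
            return 0;
        }
        int cmp = compareMagnitude(b.abs(), Math.abs(d));
        return signB > 0 ? cmp : -cmp;
    }

    // Requires mag > 0 and a finite ad > 0.
    private static int compareMagnitude(BigInteger mag, double ad) {
        if (ad < 1.0) {
            return 1; // mag >= 1 > ad; this also covers sub-normals
        }
        int exp = Math.getExponent(ad); // 2^exp <= ad < 2^(exp+1)
        int top = mag.bitLength() - 1;  // 2^top <= mag < 2^(top+1)
        if (top != exp) {
            return Integer.compare(top, exp); // fast path, mantissa not needed
        }
        long raw = Double.doubleToRawLongBits(ad);
        long mantissa = (raw & FRACTION_MASK) + (FRACTION_MASK + 1); // 53 bits
        int shift = exp - 52; // ad == mantissa * 2^shift
        BigInteger m = BigInteger.valueOf(mantissa);
        return shift >= 0
                ? mag.compareTo(m.shiftLeft(shift))
                : mag.shiftLeft(-shift).compareTo(m);
    }
}
```

The slow path only runs when both values fall between the same two powers of two, and then it needs a single shift and one BigInteger comparison.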

Shifting bit values [closed]

I have recently started learning Java on my own and I am having trouble with one part in particular. Today I read up on shifting bit values and I was wondering if what I am doing is correct.
I want to shift a value to the right by 16 bits and then clear the upper 24 bits by ANDing the value with an 8-bit mask of ones. Here is a segment of my code:
int shift() {
    point = point >> 16; // shifts the value to the right by 16 bits
    point = point & 0xFF; // clear the upper 24 bits
    return point;
}
Is this correct? Am I using this technique correctly?
Thanks!
Yes.
some content to fill out the 30 characters minimum length
OK, looks like that joke was not well received; I was expecting people to like this satirical answer. Oh well, apparently people on SO are more serious than I thought.
Well, like others have mentioned, you can actually check it yourself!
As your code comments show, you have already correctly translated your intention (human understanding) into code (machine understanding).
How do I know that you're correct? Well, you can check the online resources:
For bit manipulation
Or just this Wikipedia entry
So far your code seems to match your intention, after consulting those resources.
But, you say, that's theoretical, how do I know empirically that my answer is correct?
Well, you can do this method:
Build some sample test cases.
For this case you can use hexadecimal numbers so that you can easily confirm the result (because each hex digit maps to exactly four bits).
For example: try 0xFFFFFFFF, 0x80000000, 0xC0C0C0C0.
Find the correct answer manually.
In this case, try shifting the bit yourself (using pen and paper)
For 0xFFFFFFFF, which is a 32-bit number with all 1's, note that >> is a signed shift: the value is -1 as a Java int, so shifting 16 bits to the right copies the sign bit in from the left and you still have 0xFFFFFFFF. (The unsigned shift >>> would give 0x0000FFFF, a 32-bit number with 1's only in the last 16 bits.)
Then you do an AND operation with 0xFF, which is a 32-bit number with 1's only in the last 8 bits. Either way this gives you 0xFF, since only in the last 8 bits do both numbers have a 1 bit.
Repeat for the other examples. You should get 0x00 and 0xC0 respectively.
Run your code on those inputs.
To run your code, you need the Java compiler (it's called javac on most systems).
If you really are a beginner, you can try online compiler like this
Just put your code there and run (with Input/Output (I/O) management, explained here)
Compare your output with the program output.
Usually, this alone will give you confidence that your code is correct.
But sometimes there are tricky cases which make the code incorrect even though it's correct for some small examples. Fortunately we already checked the logic using theoretical answer above.
So I hope that helps!
First let me welcome you to Java. Good choice!
About your question: If this is correct or not depends on what you expect.
But first of all, when learning Java you should do two things:
Get a development environment like Eclipse.
Learn how to write little test routines with JUnit. Here's a JUnit Tutorial
I've taken your code and embedded it in a test routine to see what actually happens:
import org.junit.Test;

public class Stackoverflow {
    @Test
    public void test() {
        testNprint(1234);
        testNprint(-1234);
        testNprint(0);
        testNprint(255);
        testNprint(256);
        testNprint(Integer.MAX_VALUE);
        testNprint(Integer.MIN_VALUE);
    }

    // Prints the input and the shifted result, each in decimal and hex.
    private void testNprint(int point) {
        System.out.printf("int: %1$d (0x%1$X) -> shifted: %2$d (0x%2$X)\n",
                point, shift(point));
    }

    private int shift(int point) {
        point = point >> 16; // shifts the value to the right by 16 bits
        point = point & 0xFF; // clear the upper 24 bits
        return point;
    }
}
And here's the result. Now you can answer your question: Are the numbers as expected?
int: 1234 (0x4D2) -> shifted: 0 (0x0)
int: -1234 (0xFFFFFB2E) -> shifted: 255 (0xFF)
int: 0 (0x0) -> shifted: 0 (0x0)
int: 255 (0xFF) -> shifted: 0 (0x0)
int: 256 (0x100) -> shifted: 0 (0x0)
int: 2147483647 (0x7FFFFFFF) -> shifted: 255 (0xFF)
int: -2147483648 (0x80000000) -> shifted: 0 (0x0)
BTW: I guess the result is not what you expected :-) To see why, read up on the difference between >> and >>>.
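One thing worth checking for yourself: for this particular shift-then-mask, >> and >>> should give the same answer, because the 0xFF mask throws away every bit that sign extension could have touched. A small sketch (class and method names are made up for illustration) that compares the two, plus a case where the difference does show up:

```java
class ShiftMaskDemo {
    static int byteTwoSigned(int point) {
        return (point >> 16) & 0xFF;  // bits 16..23 end up in the low byte
    }

    static int byteTwoUnsigned(int point) {
        return (point >>> 16) & 0xFF; // same low byte, upper bits masked anyway
    }

    public static void main(String[] args) {
        int point = -1234; // 0xFFFFFB2E
        System.out.println(byteTwoSigned(point));   // 255
        System.out.println(byteTwoUnsigned(point)); // 255
        // Without a mask the two operators differ:
        System.out.println(point >> 24);  // -1
        System.out.println(point >>> 24); // 255
    }
}
```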

Java/Python MD5 implementation- how to overcome unsigned 32-bit requirement?

I'm attempting to implement MD5 (for curiosity's sake) in Python/Java, and am effectively translating the Wikipedia MD5 page's pseudocode into either language. First, I used Java, only to encounter frustration with its negative/positive integer overflow (because unsigned ints aren't an option: every int satisfies -2147483648 <= integer <= 2147483647). I then employed Python, after deciding that it's better suited for heavy numerical computation, but realized that I wouldn't be able to overcome the unsigned 32-bit integer requirement, either (as Python immediately casts wrapped ints to longs).
Is there any way to hack around Java/Python's lack of unsigned 32-bit integers, which are required by the aforementioned MD5 pseudocode?
Since the operations are bitwise operations and additions modulo 2^32, they produce the same bit patterns for signed and unsigned ints and don't suffer from sign extension (which would cause you problems), except for right shift.
Java has a >>> operator for this purpose.
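As a rough sketch of the building blocks this implies (not an MD5 implementation; the class and method names are made up for illustration): a left rotate built on >>>, plain int addition as addition modulo 2^32, and a widening mask for when the unsigned value has to be printed or compared.

```java
class UnsignedOpsDemo {
    // Left rotate as used in the MD5 pseudocode; >>> shifts in zeros,
    // so no sign bits leak into the result. Integer.rotateLeft does the same.
    static int leftRotate(int x, int c) {
        return (x << c) | (x >>> (32 - c));
    }

    // Two's-complement int addition already wraps modulo 2^32.
    static int addMod32(int a, int b) {
        return a + b;
    }

    // Widens to long to read the same 32 bits as an unsigned value.
    static long toUnsigned(int x) {
        return x & 0xFFFFFFFFL;
    }

    public static void main(String[] args) {
        System.out.println(leftRotate(0x80000001, 1));     // 3
        System.out.println(toUnsigned(-1));                // 4294967295
        System.out.println(toUnsigned(addMod32(-1, 1)));   // 0
        System.out.println(Integer.toUnsignedString(-1));  // 4294967295 (Java 8+)
    }
}
```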
As a note beforehand - I don't know if this is a good solution, but it appears to give the behaviour you want.
Using the ctypes module, you can access the underlying low-level data-type directly, and hence have an unsigned int in Python.
Specifically, ctypes.c_uint:
>>> import ctypes
>>> i = ctypes.c_uint(0)
>>> i.value -= 1
>>> i
c_uint(4294967295)
>>> i.value += 1
>>> i
c_uint(0)
This is arguably abuse of the module - it's designed for using C code easily from within Python, but as I say, it appears to work. The only real downside I can think of is that I assume ctypes is CPython specific.

Difference in behaviour of unsigned and signed integer when integer overflow occurs [closed]

Reading this article on Wikipedia about integer overflow,
I don't quite understand why overflow of signed integers causes undefined behaviour but overflow of unsigned integers causes wrapping around. Why is there a difference in their behaviour?
Another question: do programming languages in general have any safeguards against integer overflow?
The primary rationale for the difference is the fact that C and C++ language specification allows the implementation to use one of the following three different signed integer representations
1's complement
2's complement
Signed magnitude
If the language specification mandated some specific behavior in case of signed overflow (i.e. favored one of the above representations over the other two), it would force platforms based on the two non-favored representations to implement "heavy" signed integer arithmetic. It would become necessary, since the natural machine-level behavior of such platforms would not match the behavior required by the language standard. These implementations would have to constantly watch for signed overflow and adjust the results to match the standard requirements.
That would severely degrade the performance of signed integer arithmetic on platforms that use non-favored signed representations, which is, of course, completely unacceptable in such languages as C and C++, which are designed to be as close as possible to the underlying hardware when it comes to such basic operations as integer arithmetic.
The reason the behavior is undefined (as opposed to "unspecified") is that there are platforms out there which deliberately generate hardware exceptions in case of signed overflow during integer arithmetic. Note that the behavior is undefined only for arithmetic operations, which are typically performed by the machine. For value conversions signed overflow does not produce undefined behavior (the behavior is actually implementation-defined).
As for unsigned types, they are represented identically on all platforms, which means that requiring consistent behavior across all platforms is not a problem. The wraparound that conceptually matches the "modulo 2^width" behavior is the natural behavior on virtually all known binary hardware platforms.
Because that's the way the language is defined. It permits development of conforming implementations more easily on more kinds of hardware (like DSPs with saturating arithmetic, for example).
Depends on the language. Some hardware does, and you might be able to take advantage of that in your program.
C/C++ methodology on integer overflow is to provide behaviour that is fastest on the machine you are working on, so on some machines (here assuming 16-bit signed integers):
32766 + 2 == -32768
but on some machines is:
32766 + 2 == 32767
for other machines you can have trap value or whatever the CPU will do.
Note Java has integer overflow perfectly defined, to achieve "write once, run everywhere".
As for the unsigned integers - most of their applications are bitmasks, bitfields and number manipulation (modulo operations, identifiers) - exactly the operations you don't want to be undefined.
Some of the programming languages have such safety measures, some don't:
Python 2 auto-converts int values that overflow to type long (arbitrarily large integers); in Python 3 the int type itself has arbitrary precision.
In C/C++ you have to check overflow conditions yourself; the limits.h (C) and climits/limits (C++) headers define the maximum and minimum values for each type.
Programming in x86 assembly - there is CF (carry flag) for unsigned and OF (overflow flag) for signed in FLAGS and EFLAGS registers to check when overflow occurs.
Many languages also have arbitrary precision type, in case you want to avoid overflow, but operations are slower, because such variables can be (in theory) as big as your memory.
In Java, you only have signed int and long values, and their behaviour is consistent wherever you run it. If you add 1 to Integer.MAX_VALUE you get Integer.MIN_VALUE (it wraps), and if you subtract 1 from Long.MIN_VALUE you get Long.MAX_VALUE.
So I have no idea why the behaviour of signed values would be undefined in other languages.
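A short sketch of the Java behaviour just described, together with the checked alternative that exists since Java 8 (Math.addExact and friends throw instead of wrapping). The class and method names are made up for illustration.

```java
class OverflowDemo {
    // Reports whether a + b would overflow an int, using the checked
    // arithmetic helper instead of comparing against limits by hand.
    static boolean addOverflows(int a, int b) {
        try {
            Math.addExact(a, b);
            return false;
        } catch (ArithmeticException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(Integer.MAX_VALUE + 1 == Integer.MIN_VALUE); // true
        System.out.println(Long.MIN_VALUE - 1 == Long.MAX_VALUE);       // true
        System.out.println(addOverflows(Integer.MAX_VALUE, 1));         // true
        System.out.println(addOverflows(1, 2));                         // false
    }
}
```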

Why would you need unsigned types in Java?

I have often heard complaints against Java for not having unsigned data types. See for example this comment. I would like to know how is this a problem? I have been programming in Java for 10 years more or less and never had issues with it. Occasionally when converting bytes to ints a & 0xFF is needed, but I don't consider that as a problem.
Since unsigned and signed numbers are represented with the same bit values, the only places I can think of where signedness matters are:
When converting the numbers to other bit representation. Between 8, 16 and 32 bit integer types you can use bitmasks if needed.
When converting numbers to decimal format, usually to Strings.
Interoperating with non-Java systems through APIs or protocols. Again the data is just bits, so I don't see the problem here.
Using the numbers as memory or other offsets. With 32 bit ints this might be a problem for very large offsets.
Instead I find it easier that I don't need to consider operations between unsigned and signed numbers and the conversions between those. What am I missing? What are the actual benefits of having unsigned types in a programming language and how would having those make Java better?
Occasionally when converting bytes to ints a & 0xFF is needed, but I don't consider that as a problem.
Why not? Is "applying a bitwise AND with 0xFF" actually part of what your code is trying to represent? If not, why should it have to be part of how you write it? I actually find that almost anything I want to do with bytes beyond just copying them from one place to another ends up requiring a mask. I want my code to be cruft-free; the lack of unsigned bytes hampers this :(
Additionally, consider an API which will always return a non-negative value, or only accepts non-negative values. Using an unsigned type allows you to express that clearly, without any need for validation. Personally I think it's a shame that unsigned types aren't used more in .NET, e.g. for things like String.Length, ICollection.Count etc. It's very common for a value to naturally only be non-negative.
Is the lack of unsigned types in Java a fatal flaw? Clearly not. Is it an annoyance? Absolutely.
The comment that you quote hits the nail on the head:
Java's lack of unsigned data types also stands against it. Yes, you can work around it, but it's not ideal and you'll be using code that doesn't really reflect the underlying data correctly.
Suppose you are interoperating with another system, which wants an unsigned 16 bit integer, and you want to represent the number 65535. You claim "the data is just bits, so I don't see the problem here" - but having to pass -1 to mean 65535 is a problem. Any impedance mismatch between the representation of your data and its underlying meaning introduces an extra speedbump when writing, reading and testing the code.
Instead I find it easier that I don't need to consider operations between unsigned and signed numbers and the conversions between those.
The only times you would need to consider those operations is when you were naturally working with values of two different types - one signed and one unsigned. At that point, you absolutely want to have that difference pointed out. With signed types being used to represent naturally unsigned values, you should still be considering the differences, but the fact that you should is hidden from you. Consider:
// This should be considered unsigned - so a value of -1 is "really" 65535
short length = /* some value */;
// This is really signed
short foo = /* some value */;
boolean result = foo < length;
Suppose foo is 100 and length is -1. What's the logical result? The value of length represents 65535, so logically foo is smaller than it. But you'd probably go along with the code above and get the wrong result.
Of course they don't even need to represent different types here. They could both be naturally unsigned values, represented as signed values with negative numbers being logically greater than positive ones. The same error applies, and wouldn't be a problem if you had unsigned types in the language.
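To make the foo/length example concrete, here is a small sketch. The class and method names are made up for illustration; Short.toUnsignedInt is a Java 8 helper, which is exactly the kind of explicit workaround being discussed.

```java
class UnsignedCompareDemo {
    // What the code above does: a plain signed comparison.
    static boolean lessThanSigned(short a, short b) {
        return a < b;
    }

    // What was meant if both values are "really" unsigned 16-bit numbers.
    static boolean lessThanUnsigned(short a, short b) {
        return Short.toUnsignedInt(a) < Short.toUnsignedInt(b);
    }

    public static void main(String[] args) {
        short foo = 100;
        short length = (short) 65535; // stored as -1
        System.out.println(lessThanSigned(foo, length));   // false
        System.out.println(lessThanUnsigned(foo, length)); // true
    }
}
```

Nothing in the types tells the reader which of the two comparisons is the right one; that knowledge lives only in the comments.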
You might also want to read this interview with Joshua Bloch (Google cache, as I believe it's gone from java.sun.com now), including:
Ooh, good question... I'm going to say that the strangest thing about the Java platform is that the byte type is signed. I've never heard an explanation for this. It's quite counterintuitive and causes all sorts of errors.
If you like, yes, everything is ones and zeroes. However, your hardware arithmetic and logic unit doesn't work that way. If you want to store your bits in a signed integer value but perform operations that are not natural to signed integers, you will usually waste both storage space and processing time.
An unsigned integer type stores twice as many non-negative values in the same space as the corresponding signed integer type. So if you want to take into Java any data commonly used in a language with unsigned values, such as a POSIX date value (unsigned number of seconds) that is normally used with C, then in general you will need to use a wider integer type than C would use. If you are processing many such values, again you will waste both storage space and fetch-execute time.
The times I have used unsigned data types have been when I read in large blocks of data that correspond to images, or worked with openGL. I personally prefer unsigned if I know something will never be negative, as a "safety feature" of sorts.
Unsigned types are useful for bit-by-bit comparisons, and I'm pretty sure they are used extensively in graphics.
