Python bitshifting to java

I found some Python code on GitHub and need to do the same thing in Java. I have almost converted it, but I'm getting a warning saying: Shift operation '>>' by overly large constant value
This is the Python code that I'm trying to convert:
if i > 32:
return (int(((j >> 32) & ((1 << i))))) >> i
return (int((((1 << i)) & j))) >> i
And this is the Java code I wrote while trying to convert the Python code:
if (i > 32) {
return (j >> 32) & (1 << i) >> i;
}
return ((1 << i) & j) >> i;
The warning is on this expression: (j >> 32)

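Java's int is 32 bits wide, so there are no bits above position 31 for a 32-bit shift to bring down. Java also uses only the low five bits of the shift distance for an int, so j >> 32 is evaluated as j >> 0 and returns j unchanged. Either way, shifting an int by 32 doesn't make much sense, which is why the warning appears.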
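A minimal sketch of how the shift distance is treated for int and long operands. The class and method names (ShiftDistanceDemo, shiftInt, shiftLong) are made up for illustration.

```java
final class ShiftDistanceDemo {
    // For an int, Java uses only the low 5 bits of the shift distance,
    // so a distance of 32 behaves like a distance of 0.
    static int shiftInt(int value, int distance) {
        return value >> distance;
    }

    // For a long, the low 6 bits are used, so a distance of 32 is meaningful.
    static long shiftLong(long value, int distance) {
        return value >> distance;
    }

    public static void main(String[] args) {
        int j = 0x12345678;
        System.out.println(shiftInt(j, 32) == j); // true
        System.out.println(Long.toHexString(shiftLong(0x1234567800000000L, 32))); // 12345678
    }
}
```

If the value really carries more than 32 bits, declaring j as a long is what makes j >> 32 useful.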

This doesn't really make sense for an int, because an int only has 32 bits. However, if you want to implement the same method using long, here is the code I wrote to do so.
public int bitShift(long j, int i) {
// Build the mask as a long: an int 1 << i wraps around once i reaches 32.
long mask = 1L << i;
long source = i > 32 ? j >> 32 : j;
return (int) ((source & mask) >>> i);
}

Related

Java: IEEE Doubles to IBM Float

I am working on a side project at work where I would like to read/write SAS Transport files. The challenge is that numbers are encoded as 64-bit IBM floating point numbers. While I have been able to find plenty of great resources for reading a byte array (containing an IBM float) into IEEE 32-bit floats and 64-bit floats, I'm struggling to find the code to convert floats/doubles back to IBM floats.
I recently found some code for writing a 32-bit IEEE float back out to a byte array (containing an IBM float). It seems to be working, so I've been trying to translate it to a 64-bit version. I've reverse engineered where most of the magic numbers are coming from, but I've been stumped for over a week now.
I have also tried to translate the functions listed at the end of the SAS Transport documentation to Java, but I've run into a lot of issues related to endianness, Java's lack of unsigned types, and so on. Can anyone provide the code to convert doubles to IBM floating point format?
Just to show the progress I've made, here are some shortened versions of the code I've written so far:
This grabs a 32-bit IBM float from a byte array and generates an IEEE float:
public static double fromIBMFloat(byte[] data, int offset) {
int temp = readIntFromBuffer(data, offset);
int mantissa = temp & 0x00FFFFFF;
int exponent = ((temp >> 24) & 0x7F) - 64;
boolean isNegative = (temp & 0x80000000) != 0;
double result = mantissa * Math.pow(2, 4 * exponent - 24);
if (isNegative) {
result = -result;
}
return result;
}
This is the same thing for 64-bit:
public static double fromIBMDouble(byte[] data, int offset) {
long temp = readLongFromBuffer(data, offset);
long mantissa = temp & 0x00FFFFFFFFFFFFFFL;
long exponent = ((temp >> 56) & 0x7F) - 64;
boolean isNegative = (temp & 0x8000000000000000L) != 0;
// The 64-bit format carries a 56-bit fraction, so scale by 2^-56 rather than 2^-24.
double result = mantissa * Math.pow(2, 4 * exponent - 56);
if (isNegative) {
result = -result;
}
return result;
}
Great! These work for going to IEEE floats, but now I need to go the other way. This simple implementation seems to be working for 32-bit floats:
public static void toIBMFloat(double value, byte[] xport, int offset) {
if (value == 0.0 || Double.isNaN(value) || Double.isInfinite(value)) {
writeIntToBuffer(xport, offset, 0);
return;
}
int fconv = Float.floatToIntBits((float)value);
int fmant = (fconv & 0x007FFFFF) | 0x00800000;
int temp = (fconv & 0x7F800000) >> 23;
int t = (temp & 0xFF) - 126;
while ((t & 0x3) != 0) {
++t;
fmant >>= 1;
}
fconv = (fconv & 0x80000000) | (((t >> 2) + 64) << 24) | fmant;
writeIntToBuffer(xport, offset, fconv);
}
Now, the only thing left is to translate that to work with 64-bit IBM floats. A lot of the magic numbers listed relate to the number of bits in the IEEE 32-bit floating point exponent (8-bits) and mantissa (23-bit). So for 64-bit, I just need to switch those to use the 11-bit exponent and 52-bit mantissa. But where does that 126 come from? What is the point of the 0x3 in the while loop?
Any help breaking down the 32-bit version so I can implement a 64-bit version would be greatly appreciated.

I circled back and took another swing at the C implementations provided at the end of the SAS transport documentation. It turns out the issue wasn't with my implementation; it was an issue with my tests.
TL;DR
These are my 64-bit implementations:
public static void writeIBMDouble(double value, byte[] data, int offset) {
long ieee8 = Double.doubleToLongBits(value);
long ieee1 = (ieee8 >>> 32) & 0xFFFFFFFFL;
long ieee2 = ieee8 & 0xFFFFFFFFL;
writeLong(0L, data, offset);
long xport1 = ieee1 & 0x000FFFFFL;
long xport2 = ieee2;
int ieee_exp = 0;
if (xport2 != 0 || ieee1 != 0) {
ieee_exp = (int)(((ieee1 >>> 16) & 0x7FF0) >>> 4) - 1023;
int shift = ieee_exp & 0x3;
xport1 |= 0x00100000L;
if (shift != 0) {
xport1 <<= shift;
xport1 |= ((byte)(((ieee2 >>> 24) & 0xE0) >>> (5 + (3 - shift))));
xport2 <<= shift;
}
xport1 |= (((ieee_exp >>> 2) + 65) | ((ieee1 >>> 24) & 0x80)) << 24;
}
if (-260 <= ieee_exp && ieee_exp <= 248) {
long temp = ((xport1 & 0xFFFFFFFFL) << 32) | (xport2 & 0xFFFFFFFFL);
writeLong(temp, data, offset);
return;
}
writeLong(0xFFFFFFFFFFFFFFFFL, data, offset);
if (ieee_exp > 248) {
data[offset] = 0x7F;
}
}
public static void writeLong(long value, byte[] buffer, int offset) {
buffer[offset] = (byte)(value >>> 56);
buffer[offset + 1] = (byte)(value >>> 48);
buffer[offset + 2] = (byte)(value >>> 40);
buffer[offset + 3] = (byte)(value >>> 32);
buffer[offset + 4] = (byte)(value >>> 24);
buffer[offset + 5] = (byte)(value >>> 16);
buffer[offset + 6] = (byte)(value >>> 8);
buffer[offset + 7] = (byte)value;
}
And:
public static double readIBMDouble(byte[] data, int offset) {
long temp = readLong(data, offset);
long ieee = 0L;
long xport1 = temp >>> 32;
long xport2 = temp & 0x00000000FFFFFFFFL;
long ieee1 = xport1 & 0x00ffffff;
long ieee2 = xport2;
if (ieee2 == 0L && xport1 == 0L) {
return Double.longBitsToDouble(ieee);
}
int shift = 0;
int nib = (int)xport1;
if ((nib & 0x00800000) != 0) {
shift = 3;
} else if ((nib & 0x00400000) != 0) {
shift = 2;
} else if ((nib & 0x00200000) != 0) {
shift = 1;
}
if (shift != 0) {
ieee1 >>>= shift;
ieee2 = (xport2 >>> shift) | ((xport1 & 0x00000007) << (29 + (3 - shift)));
}
ieee1 &= 0xffefffff;
ieee1 |= (((((long)(data[offset] & 0x7f) - 65) << 2) + shift + 1023) << 20) | (xport1 & 0x80000000);
ieee = ieee1 << 32 | ieee2;
return Double.longBitsToDouble(ieee);
}
public static long readLong(byte[] buffer, int offset) {
long result = unsignedByteToLong(buffer[offset]) << 56;
result |= unsignedByteToLong(buffer[offset + 1]) << 48;
result |= unsignedByteToLong(buffer[offset + 2]) << 40;
result |= unsignedByteToLong(buffer[offset + 3]) << 32;
result |= unsignedByteToLong(buffer[offset + 4]) << 24;
result |= unsignedByteToLong(buffer[offset + 5]) << 16;
result |= unsignedByteToLong(buffer[offset + 6]) << 8;
result |= unsignedByteToLong(buffer[offset + 7]);
return result;
}
private static long unsignedByteToLong(byte value) {
return (long)value & 0xFF;
}
These are basically a one-to-one translation from what's in the document, except I convert the byte[] into a long up-front and just do bit-twiddling instead of working directly with bytes.
I also realized the code in the documentation had some special cases included for "missing" values that are specific to the SAS transport standard and have nothing to do with IBM hexadecimal floating point numbers. In fact, the Double.longBitsToDouble method detects the invalid bit-sequence and just sets the value to NaN. I moved this code out since it wasn't going to work anyway.
The good thing is that as part of this exercise I did learn a lot of tricks to bit manipulation in Java. For instance, a lot of the issues I ran into involving sign were resolved by using the >>> operator instead of the >> operator. Other than that, you just need to be careful upcasting to mask with 0xFF, 0xFFFF, etc. to make sure the sign is ignored.
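As a small illustration of those two points, here is a sketch contrasting >> with >>> and showing the 0xFF mask on a byte. The class and method names are hypothetical.

```java
final class SignBitDemo {
    // >> is an arithmetic shift: the sign bit is copied into the vacated bits.
    static int arithmeticShift(int value, int distance) {
        return value >> distance;
    }

    // >>> is a logical shift: the vacated bits are filled with zeros.
    static int logicalShift(int value, int distance) {
        return value >>> distance;
    }

    // A byte is sign-extended when widened to int; masking keeps only its 8 bits.
    static int unsignedByte(byte value) {
        return value & 0xFF;
    }

    public static void main(String[] args) {
        System.out.println(arithmeticShift(0x80000000, 28)); // -8
        System.out.println(logicalShift(0x80000000, 28));    // 8
        System.out.println(unsignedByte((byte) 0xF0));       // 240
    }
}
```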
I also learned about ByteBuffer which can facilitate loading back and forth among byte[] and primitives/strings; however, that comes with some minor overhead. But it would handle any endianness issues. It turns out endianness wasn't even a concern since most architectures in use today (x86) are little endian to begin with.
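For comparison, a rough sketch of what the ByteBuffer route might look like for the same 8-byte big-endian layout that writeLong and readLong produce by hand. The class name ByteBufferLongDemo is an assumption; the byte order is stated explicitly even though big-endian is ByteBuffer's default.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class ByteBufferLongDemo {
    // Writes value into buffer[offset .. offset + 7], most significant byte first.
    static void writeLong(long value, byte[] buffer, int offset) {
        ByteBuffer.wrap(buffer, offset, Long.BYTES)
                .order(ByteOrder.BIG_ENDIAN)
                .putLong(value);
    }

    // Reads 8 bytes starting at offset as a big-endian long.
    static long readLong(byte[] buffer, int offset) {
        return ByteBuffer.wrap(buffer, offset, Long.BYTES)
                .order(ByteOrder.BIG_ENDIAN)
                .getLong();
    }

    public static void main(String[] args) {
        byte[] data = new byte[8];
        writeLong(0x0102030405060708L, data, 0);
        System.out.println(Long.toHexString(readLong(data, 0))); // 102030405060708
    }
}
```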
It seems reading/writing SAS transport files is a pretty common need, especially in the clinical trials arena so hopefully anyone working in Java/C# won't have to go through the trouble I did.

PMD UselessParentheses violation

I have the following Java method:
private int calculate() {
return (bytes[0] & 0xff) + ((bytes[1] & 0xff) << 8);
}
PMD complains on this code with "UselessParentheses" violation.
I've reviewed the operator precedence rules and I still don't see redundant parentheses in that code. Am I missing something?

There are no unnecessary parentheses in this code, as you can see if you run this:
byte [] bytes = new byte[] {1,2};
System.out.println( (bytes[0] & 0xff) + ((bytes[1] & 0xff) << 8));
System.out.println( bytes[0] & 0xff + ((bytes[1] & 0xff) << 8));
System.out.println( (bytes[0] & 0xff) + (bytes[1] & 0xff) << 8);
System.out.println( (bytes[0] & 0xff) + (bytes[1] & 0xff << 8));
Moreover, sometimes it's actually good to add extra parentheses for readability. For example:
int i = x << y + z; // this will shift x by y+z bits
int j = x << (y + z); // equivalent, but more readable
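To make the difference concrete, the four variants can be wrapped in a small class and evaluated for bytes {1, 2}. This is only a sketch; the class and method names are invented here.

```java
final class PrecedenceDemo {
    // Original expression: low byte plus high byte shifted into bits 8..15.
    static int original(byte[] b) {
        return (b[0] & 0xff) + ((b[1] & 0xff) << 8);
    }

    // Without the first pair: + binds tighter than &, so b[0] is ANDed with the whole sum.
    static int withoutFirstPair(byte[] b) {
        return b[0] & 0xff + ((b[1] & 0xff) << 8);
    }

    // Without the outer pair: + binds tighter than <<, so the sum is shifted.
    static int withoutOuterPair(byte[] b) {
        return (b[0] & 0xff) + (b[1] & 0xff) << 8;
    }

    // With the shift moved inside: << binds tighter than &, so the mask is shifted.
    static int shiftInsideMask(byte[] b) {
        return (b[0] & 0xff) + (b[1] & 0xff << 8);
    }

    public static void main(String[] args) {
        byte[] bytes = {1, 2};
        System.out.println(original(bytes));         // 513
        System.out.println(withoutFirstPair(bytes)); // 1
        System.out.println(withoutOuterPair(bytes)); // 768
        System.out.println(shiftInsideMask(bytes));  // 1
    }
}
```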

After reading the operator precedence rules, the line of code, and the PMD warning, this is probably one of those rare cases where the precedence is meant to be applied like
PMD complains on this code with a useless (parenthesis warning)
rather than
PMD complains on this code with a (useless parenthesis) warning.
Your code is right, and the parentheses aren't superfluous. Removing them would make the code less readable, and every one of them is needed. In fact, this whole issue is worthy of an xkcd comic.

Operator >> can not be applied to operand of type char and long

I am trying to convert Java code to C# code. I got this error:
Operator >> can not be applied to operand of type char and long.
Code is:
static int getPruningP(byte[] table, long index, long THRESHOLD)
{
if (index < THRESHOLD)
{
return tri2bin[table[(int)(index >> 2)] & 0xff] >> ((index & 3) << 1) & 3;
}
else {
return tri2bin[table[(int)(index - THRESHOLD)] & 0xff] >> 8 & 3;
}
}

You need to cast the long parameter to an int before doing the bitwise and.
Use
return tri2bin[table[(int)(index >> 2)] & 0xff] >> (((int)index & 3) << 1 ) & 3;
instead of
return tri2bin[table[(int)(index >> 2)] & 0xff] >> ((index & 3) << 1) & 3;
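Binary & operators are predefined for the integral types and bool, and the & operator evaluates both operands regardless of the first one's value.
With a long operand, index & 3 is a long, so the shift count ends up as a long. C#'s predefined >> operators only accept an int shift count, which is what the error reports; casting index to int first keeps the shift count an int.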

Actually, it has nothing to do with the '&' or shift operators - the function returns 'int' and the result of the return statements is 'long', so you need to cast the return values:
static int getPruningP(byte[] table, long index, long THRESHOLD)
{
if (index < THRESHOLD)
{
return (int)(tri2bin[table[(int)(index >> 2)] & 0xff] >> ((index & 3) << 1) & 3);
}
else {
return (int)(tri2bin[table[(int)(index - THRESHOLD)] & 0xff] >> 8 & 3);
}
}

Java Bitshift error with negatives?

http://www.fastcgi.com/devkit/doc/fcgi-spec.html
In section 3.4:
typedef struct {
unsigned char nameLengthB0; /* nameLengthB0 >> 7 == 0 */
unsigned char valueLengthB0; /* valueLengthB0 >> 7 == 0 */
unsigned char nameData[nameLength];
unsigned char valueData[valueLength];
} FCGI_NameValuePair11;
typedef struct {
unsigned char nameLengthB0; /* nameLengthB0 >> 7 == 0 */
unsigned char valueLengthB3; /* valueLengthB3 >> 7 == 1 */
unsigned char valueLengthB2;
unsigned char valueLengthB1;
unsigned char valueLengthB0;
unsigned char nameData[nameLength];
unsigned char valueData[valueLength
((B3 & 0x7f) << 24) + (B2 << 16) + (B1 << 8) + B0];
} FCGI_NameValuePair14;
typedef struct {
unsigned char nameLengthB3; /* nameLengthB3 >> 7 == 1 */
unsigned char nameLengthB2;
unsigned char nameLengthB1;
unsigned char nameLengthB0;
unsigned char valueLengthB0; /* valueLengthB0 >> 7 == 0 */
unsigned char nameData[nameLength
((B3 & 0x7f) << 24) + (B2 << 16) + (B1 << 8) + B0];
unsigned char valueData[valueLength];
} FCGI_NameValuePair41;
typedef struct {
unsigned char nameLengthB3; /* nameLengthB3 >> 7 == 1 */
unsigned char nameLengthB2;
unsigned char nameLengthB1;
unsigned char nameLengthB0;
unsigned char valueLengthB3; /* valueLengthB3 >> 7 == 1 */
unsigned char valueLengthB2;
unsigned char valueLengthB1;
unsigned char valueLengthB0;
unsigned char nameData[nameLength
((B3 & 0x7f) << 24) + (B2 << 16) + (B1 << 8) + B0];
unsigned char valueData[valueLength
((B3 & 0x7f) << 24) + (B2 << 16) + (B1 << 8) + B0];
} FCGI_NameValuePair44;
I'm implementing this in Java, and in order to do the valueLengthB3 >> 7 == 1, etc, part, I'm just setting it negative. This doesn't work. How do negatives work in Java, and how do you do this operation in Java?
My current code:
public void param(String name, String value) throws IOException {
if (fp) {
throw new IOException("Params are already finished!");
}
if (name.length() < 128) {
dpout.write(name.length());
}else {
dpout.writeInt(-name.length());
}
if (value.length() < 128) {
dpout.write(value.length());
}else {
dpout.writeInt(-value.length());
}
dpout.write(name.getBytes());
dpout.write(value.getBytes());
}

Java uses pretty routine integer operations. The two main peculiarities relative to C and C++ are
Java has no unsigned integer types other than char (which is 16-bits wide), and
Java has separate arithmetic (>>) and logical (>>>) right-shift operators. The former preserves sign by filling in the needed most-significant bits of the result with copies of the most-significant bit of the left operand, whereas the latter fills in the most-significant bits of the result with zeroes.
Java has the advantage that all primitive types have well-known, consistent sizes and signedness on all platforms, and that its two right-shift operators have well-defined semantics for all valid operands. In contrast, in C, the result of performing a right shift on a negative value is implementation-defined, all of the standard data types have implementation-defined sizes, and some types (char) have implementation-defined signedness.
Now that you have posted some code, however, it appears that none of that is actually your problem. I am at a loss to understand why you think that negating a number would perform any kind of shifting, or indeed, why you think shifting is required at all for what you are trying to do.
Note especially that Java uses two's complement integer representation (as is by far the most common choice of C compilers, too), so negating a number modifies more than just the sign bit. If instead you want to set only the sign bit of an int, then you could spell that
value.length() | 0x80000000
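A minimal sketch of how the length prefix could be written with that idea, assuming a DataOutputStream like the dpout in the question. FcgiLengthWriter, writeLength and encode are hypothetical names. The spec's lengths appear to count bytes, so passing name.getBytes().length rather than name.length() is probably what is wanted.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class FcgiLengthWriter {
    // Lengths below 128 fit in one byte whose top bit is 0.
    // Longer lengths use four big-endian bytes with the top bit of the first byte set to 1.
    static void writeLength(DataOutputStream out, int length) throws IOException {
        if (length < 128) {
            out.writeByte(length);
        } else {
            out.writeInt(length | 0x80000000);
        }
    }

    // Convenience wrapper (for illustration only) that returns the encoded bytes.
    static byte[] encode(int length) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try {
            writeLength(new DataOutputStream(bytes), length);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }
}
```

With this sketch, encode(5) yields the single byte 0x05, and encode(300) yields 0x80 0x00 0x01 0x2C.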

If you were to receive bytes over the wire, they'd be signed, meaning that the most significant bit is the sign bit. If you want to extract that bit from a byte, there are two sensible ways that come to mind: test negativity by comparing against 0, or mask the byte with 0xff before shifting it right by 7. Switching from >> to >>> alone is not enough, because a Java byte is sign-extended to int before either shift is applied.
The following code shows how I'd deserialise such an array of signed chars in C. I can't imagine why this wouldn't work in Java, assuming data is instead an array of bytes... though I'm sure it'd be quite hideous.
long offset = 0;
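/* & 0x7f drops the length flag bit; & 0xff undoes sign extension of the other bytes. */
/* Java evaluates the repeated offset++ left to right; in C they are unsequenced. */
long nameLength = data[offset] >= 0 ? data[offset++] : ((long)(data[offset++] & 0x7f) << 24)
+ ((long)(data[offset++] & 0xff) << 16)
+ ((long)(data[offset++] & 0xff) << 8)
+ (data[offset++] & 0xff);
long valueLength = data[offset] >= 0 ? data[offset++] : ((long)(data[offset++] & 0x7f) << 24)
+ ((long)(data[offset++] & 0xff) << 16)
+ ((long)(data[offset++] & 0xff) << 8)
+ (data[offset++] & 0xff);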
for (long x = 0; x < nameLength; x++) {
/* XXX: Copy data[offset++] into name */
}
for (long x = 0; x < valueLength; x++) {
/* XXX: Copy data[offset++] into value */
}
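For the Java side, the same decoding might be sketched as follows, masking every byte so that sign extension cannot leak into the result. FcgiLengthReader and its methods are illustrative names, not part of any FastCGI library.

```java
final class FcgiLengthReader {
    // Decodes the 1-byte or 4-byte length that starts at data[offset].
    static int readLength(byte[] data, int offset) {
        int b0 = data[offset] & 0xFF;
        if ((b0 >> 7) == 0) {
            return b0; // short form: the byte is the length
        }
        // Long form: clear the flag bit, then combine four big-endian bytes.
        return ((b0 & 0x7F) << 24)
                | ((data[offset + 1] & 0xFF) << 16)
                | ((data[offset + 2] & 0xFF) << 8)
                | (data[offset + 3] & 0xFF);
    }

    // Number of bytes the encoded length occupies, so the caller can advance its offset.
    static int encodedSize(byte[] data, int offset) {
        return (data[offset] & 0x80) == 0 ? 1 : 4;
    }
}
```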

how to read signed int from bytes in java?

I have a spec which says the next two bytes are a signed int.
When I read them in Java using the following code, I get a value of 65449.
Logic for the unsigned calculation:
int a =(byte[1] & 0xff) <<8
int b =(byte[0] & 0xff) <<0
int c = a+b
I believe this is wrong, because ANDing with 0xff gives me the unsigned equivalent,
so I removed the & 0xff, and the logic is as given below
int a = byte[1] <<8
int b = byte[0] << 0
int c = a+b
which gives me the value -343
byte[1] =-1
byte[0]=-87
I tried to offset these values the way the spec reads, but this looks wrong, since the size of the heap doesn't fall under this.
Which is the right way to calculate a signed int in Java?
Here is how the spec goes
somespec() { xtype 8 uint8 xStyle 16 int16 }
xStyle :A signed integer that represents an offset (in bytes) from the start of this Widget() structure to the start of an xStyle() structure that expresses inherited styles for defined by page widget as well as styles that apply specifically to this widget.

If your value is a signed 16-bit integer, you want a short; an int is 32-bit and can hold the same values, but not so naturally.
It appears you want a signed little-endian 16-bit value.
byte[] bytes =
short s = ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN).getShort();
or
short s = (short) ((bytes[0] & 0xff) | (bytes[1] << 8));
BTW: You can use an int, but it's not so simple.
// to get a sign extension.
int i = ((bytes[0] & 0xff) | (bytes[1] << 8)) << 16 >> 16;
or
int i = (bytes[0] & 0xff) | (short) (bytes[1] << 8);
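As a quick check against the bytes from the question (bytes[0] = -87, bytes[1] = -1), both little-endian approaches can be wrapped in a small class. This is a sketch with made-up names.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class SignedShortDemo {
    // Lets ByteBuffer assemble the little-endian 16-bit value.
    static short viaByteBuffer(byte[] bytes) {
        return ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN).getShort();
    }

    // Same result by hand: mask the low byte, shift the high byte, narrow to short.
    static short viaShifts(byte[] bytes) {
        return (short) ((bytes[0] & 0xff) | (bytes[1] << 8));
    }

    public static void main(String[] args) {
        byte[] bytes = {-87, -1};
        System.out.println(viaByteBuffer(bytes)); // -87
        System.out.println(viaShifts(bytes));     // -87
    }
}
```

Assigning either result to an int sign-extends it, so the 32-bit value is -87 as well.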

Assuming that bytes[1] is the MSB, and bytes[0] is the LSB, and that you want the answer to be a 16 bit signed integer:
short res16 = (short) ((bytes[1] << 8) | (bytes[0] & 0xff));
Then to get a 32 bit signed integer:
int res32 = res16; // sign extends.
By the way, the specification should say which of the two bytes is the MSB, and which is the LSB. If it doesn't and if there aren't any examples, you can't implement it!
Somewhere in the spec it will say how an "int16" is represented. Paste THAT part. Or paste a link to the spec so that we can read it ourselves.

Take a look at DataInputStream.readInt(). You can either steal the code from there or just use DataInputStream: wrap your input stream with it and then read typed data easily.
For your convenience this is the code:
public final int readInt() throws IOException {
int ch1 = in.read();
int ch2 = in.read();
int ch3 = in.read();
int ch4 = in.read();
if ((ch1 | ch2 | ch3 | ch4) < 0)
throw new EOFException();
return ((ch1 << 24) + (ch2 << 16) + (ch3 << 8) + (ch4 << 0));
}
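For a two-byte signed value, the same class also offers readShort(). The sketch below wraps a byte array for illustration (the class and method names are assumptions). Note that DataInputStream always reads big-endian, so bytes stored low byte first would need to be swapped before this approach applies.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class DataInputShortDemo {
    // Reads the first two bytes as a signed, big-endian 16-bit value.
    static short readBigEndianShort(byte[] bytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            return in.readShort();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // High byte first: 0xFF 0xA9 is the 16-bit pattern 0xFFA9.
        System.out.println(readBigEndianShort(new byte[] {-1, -87})); // -87
    }
}
```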

I can't compile it right now, but I would do the following (assuming byte1 and byte0 are really of byte type).
int result = byte1 & 0xFF;
result = result << 8;
result = result | (byte0 & 0xFF); // binary OR; the mask stops byte0 from sign-extending
if ((result & 0x8000) == 0x8000) { // sign extension
result = result | 0xFFFF0000;
}
The & 0xFF masks are needed whether byte1 and byte0 are bytes or ints.
UPDATE: because Java forces the expression of an if to be a boolean, the test is parenthesized and compared explicitly.

Do you have a way of finding the correct output for a given input?
Technically, an int is 4 bytes, so with just 2 bytes you can't reach the sign bit.

I ran across this same problem reading a MIDI file. A MIDI file has signed 16 bit as well as signed 32 bit integers. In a MIDI file, the most significant bytes come first (big-endian).
Here's what I did. It might be crude, but it maintains the sign. If the least significant bytes come first (little-endian), reverse the order of the indexes.
pos is the position in the byte array where the number starts.
length is the length of the integer, either 2 or 4. Yes, a 2 byte integer is a short, but we all work with ints.
private int convertBytes(byte[] number, int pos, int length) {
int output = 0;
if (length == 2) {
output = ((int) number[pos]) << 24;
output |= convertByte(number[pos + 1]) << 16;
output >>= 16;
} else if (length == 4) {
output = ((int) number[pos]) << 24;
output |= convertByte(number[pos + 1]) << 16;
output |= convertByte(number[pos + 2]) << 8;
output |= convertByte(number[pos + 3]);
}
return output;
}
private int convertByte(byte number) {
return (int) number & 0xff;
}
