Can I simplify reading a short from binary data - java

I'm trying to simplify some code for decoding data in a file and I've written a test case to show the issue.
Given the two bytes 0xFE and 0xFF, I want them to be read as 0xFFFE (65534);
the existing code does
headerBuffer.get() & 0xff + (headerBuffer.get() & 0xff) * 256
I thought that if I made the buffer's byte order little endian, I could get the same result by reading a short. But I do not get the same result. Why not?
headerBuffer.getShort();
public void testReadingOfShort() {
ByteBuffer headerBuffer = ByteBuffer.allocate(2);
headerBuffer.order(ByteOrder.LITTLE_ENDIAN);
headerBuffer.put((byte) 0xFE);
headerBuffer.put((byte)0xFF);
headerBuffer.position(0);
int format = headerBuffer.get() & 0xff + (headerBuffer.get() & 0xff) * 256;
headerBuffer.position(0);
int formatNew = headerBuffer.getShort();
System.out.println("Format:"+format+"("+ Hex.asHex(format)+")"+":FormatNew:"
+formatNew+"("+Hex.asHex(formatNew)+")");
}
Outputs
Format:65534(0xfffe):FormatNew:-2(0xfffffffffffffffe)

You do get the same value. The problem happens when you assign the short to an int on this line:
int formatNew = headerBuffer.getShort();
When you do this, Java performs sign extension to ensure that the numeric value in the short gets converted to the same numeric value in the int. In your case, that is -2.
The representation of -2 as a short is 0xFFFE, while the int representation is 0xFFFFFFFE. In other words, the sign bit of the short is copied into the additional upper bits of int.
You can address this by not assigning the short to int. You also need to make sure that your Hex.asHex has a proper overload for short, otherwise the same conversion would happen when formatNew gets passed as an argument.
Alternatively, if you would like to treat the value of the short as unsigned, and assign it to an int, you can mask the result with 0xFFFF, like this:
int formatNew = headerBuffer.getShort() & 0xFFFF;
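To make the sign extension concrete, here is a minimal sketch that widens the same 16-bit pattern both ways. The class and method names are made up for illustration and are not part of the original code.

```java
// Hypothetical demo class; names are illustrative only.
final class SignExtensionDemo {

    /** Widens a short to int the way a plain assignment does (sign-extending). */
    static int widenSigned(short s) {
        return s;
    }

    /** Widens a short to int while treating its 16 bits as an unsigned value. */
    static int widenUnsigned(short s) {
        return s & 0xFFFF;
    }

    public static void main(String[] args) {
        short raw = (short) 0xFFFE; // same bit pattern as the two bytes FE FF read little endian
        System.out.println(Integer.toHexString(widenSigned(raw)));   // fffffffe
        System.out.println(Integer.toHexString(widenUnsigned(raw))); // fffe
    }
}
```

Both results carry the same low 16 bits; only the upper 16 bits of the int differ.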

My hypothesis here is that in your Hex class you have a .asHex() method taking a short as an argument which does something like:
int value = (int) argument;
Tough luck. If you "upcast" from one integer type to another, the sign bit, if present, is carried. Which means that if you try and cast short 0xfffe to an int, you will NOT end up with 0x0000fffe but... 0xfffffffe. Hence your result.
If you wanted to cast it as an unsigned value you'd have to mask it, like so:
int value = (int) argument & 0xffff;

You can simply obtain the desired value as
int formatNew = headerBuffer.getShort() & 0xFFFF;
or, alternatively, if you use Java 8:
int formatNew = Short.toUnsignedInt(headerBuffer.getShort());
This will basically drop all bits from the int that are not part of a short. But it won't relieve you of the responsibility of carefully checking where you expect unsigned values, and how to handle the (naturally) signed values in the respective context.
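Put together with the little-endian buffer from the question, the read could be sketched as below. This assumes Java 8 or later for Short.toUnsignedInt; the class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper; names are illustrative only.
final class UnsignedShortReader {

    /** Reads the first two bytes of data as an unsigned little-endian 16-bit value. */
    static int readUnsignedShortLE(byte[] data) {
        ByteBuffer buffer = ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN);
        // getShort() returns a signed short; toUnsignedInt widens it without sign extension.
        return Short.toUnsignedInt(buffer.getShort());
    }

    public static void main(String[] args) {
        byte[] header = {(byte) 0xFE, (byte) 0xFF};
        System.out.println(readUnsignedShortLE(header)); // 65534
    }
}
```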

Related

Fixing "incompatible types: possible lossy conversion from int to byte" in Java

I have question regarding my code here:
public class Main
{
public static void main(String[] args) {
System.out.println("Hello World\n");
int x = 36;
byte b1 = ((byte) x) & ((byte) 0xff); // it seems it is the part after &, but I have 0xff cast to byte by using (byte)0xff, so not sure where exactly the error is coming from.
System.out.println(b1);
}
}
I am not sure exactly which part is causing the error of:
incompatible types: possible lossy conversion from int to byte
You appear to be confused.
There is no point to your code. Taking any number, calculating that & 0xFF, and then storing it in a byte is always a no-op: it does nothing.
You additionally get an error because & inherently always produces at least an int (it'll upcast anything smaller to match), so you're trying to assign an int to a byte.
What are you trying to accomplish?
"I want to have my byte be unsigned"!
No can do. Java doesn't have unsigned bytes. A Java byte is signed. Period. It can hold a value between -128 and +127. For calculation purposes, -1 and 255 are identical (they are both the bit sequence 1111 1111, in hex 0xFF, and they act identically under all relevant arithmetic, though it does get tricky when converting them to another numeric type such as int).
"I just want to store 255"!
Then use int. This is where most & 0xFF you'll ever see in java code comes from: When you have a byte value which java inherently treats as signed, but you wish to treat it as unsigned and, therefore (given that in java bytes can't do that), you want to upcast it to an int, containing the unsigned representation. This is how to do that:
int x = y & 0xFF;
Where y is any byte.
You presumably saw this somewhere and are now trying to apply it, but assigning the result of y & 0xFF to a byte doesn't mean anything. You'd assign it to an int variable, or just use it as expression in a further calculation (y & 0xFF is an int - make sure you add the appropriate parentheses, & has perhaps unexpected precedence).
int x = 36;
byte b1 = ((byte) x) & ((byte) 0xff);
Every imaginable way of this actually working would mean that b1 is... still 36.
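On the precedence remark above: in Java, + and * bind more tightly than &, so an unparenthesized mask inside a larger expression may not group the way it reads. The sketch below uses a hypothetical two-byte combine to show the difference; the class and method names are made up for illustration.

```java
// Hypothetical demo class; names are illustrative only.
final class MaskPrecedenceDemo {

    /** Parsed as low & (0xff + ((high & 0xff) * 256)), because + binds tighter than &. */
    static int withoutParens(byte low, byte high) {
        return low & 0xff + (high & 0xff) * 256;
    }

    /** Masks each byte first, then combines them. */
    static int withParens(byte low, byte high) {
        return (low & 0xff) + (high & 0xff) * 256;
    }

    public static void main(String[] args) {
        System.out.println(withoutParens((byte) 1, (byte) 1)); // 1
        System.out.println(withParens((byte) 1, (byte) 1));    // 257
    }
}
```

For some inputs the two forms happen to agree, which makes the missing parentheses easy to overlook.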
To compute x & y where the two operands are bytes, they must first be promoted to int values. There is no & between bytes. The result is therefore of type int.
That is, what you wrote is effectively evaluated as if you'd written it as the following, making explicit what the language gives you implicitly:
byte b1 = ((int) (byte) x) & ((int) (byte) 0xff);
Just do the arithmetic and then cast the result to byte.
byte b1 = (byte)(x & 0xff);
Link to Java Language Specification
Edited to add, thanks to @rzwitserloot: masking a byte value with 0xff before casting is pointless, however. If you need to assign an integer to a byte, just write the cast:
byte b1 = (byte)x;
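A quick way to convince yourself that the mask adds nothing before a narrowing cast is to compare both forms on a few values. This is only a sketch; the class and method names are hypothetical.

```java
// Hypothetical demo class; names are illustrative only.
final class NarrowingDemo {

    /** Masks to the low 8 bits (result is an int), then narrows to byte. */
    static byte maskThenCast(int x) {
        return (byte) (x & 0xff);
    }

    /** Narrows directly; the cast already keeps only the low 8 bits. */
    static byte castOnly(int x) {
        return (byte) x;
    }

    public static void main(String[] args) {
        int[] samples = {36, 255, 300, -1};
        for (int x : samples) {
            System.out.println(x + " -> " + maskThenCast(x) + " / " + castOnly(x));
        }
        // prints:
        // 36 -> 36 / 36
        // 255 -> -1 / -1
        // 300 -> 44 / 44
        // -1 -> -1 / -1
    }
}
```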

How can I mask a hexadecimal int using Java?

I have an integer that holds a hexadecimal value. I want to extract the leading digits of this value as if it were a String, but I don't want to convert it to a String.
int a = 0x63C5;
int afterMask= a & 0xFFF;
System.out.println(afterMask); // this gives me "3C5" but I want to get the value "63C"
In my case I can't use String utilities like substring.
It's important to understand that an integer is just a number. There's no difference between:
int x = 0x10;
int x = 16;
Both end up with integers with the same value. The first is written in the source code as hex but it's still representing the same value.
Now, when it comes to masking, it's simplest to think of it in terms of binary, given that the operation will be performed bit-wise. So it sounds like you want bits 4-15 of the original value, but then shifted to be bits 0-11 of the result.
That's most simply expressed as a mask and then a shift:
int afterMask = (a & 0xFFF0) >> 4;
Or a shift then a mask:
int afterMask = (a >> 4) & 0xFFF;
Both will give you a value of (decimal) 1596 = (hex) 63C.
In this particular case, as your input didn't have anything in bits 16+, the mask is unnecessary - but it would be needed if you wanted an input of (say) 0x1263c5 to still give you an output corresponding to 0x63c.
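The difference the mask makes only shows up with wider inputs. A minimal sketch, with a made-up class and method names, comparing the shift alone against shift-then-mask:

```java
// Hypothetical demo class; names are illustrative only.
final class NibbleExtractDemo {

    /** Returns bits 4-15 of value, moved down to bits 0-11. */
    static int extractBits4To15(int value) {
        return (value >> 4) & 0xFFF;
    }

    /** Shift only: drops the lowest nibble but keeps everything above it. */
    static int shiftOnly(int value) {
        return value >> 4;
    }

    public static void main(String[] args) {
        System.out.println(Integer.toHexString(extractBits4To15(0x63C5)));   // 63c
        System.out.println(Integer.toHexString(extractBits4To15(0x1263C5))); // 63c
        System.out.println(Integer.toHexString(shiftOnly(0x1263C5)));        // 1263c
    }
}
```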
If you want "63C" all you need is to shift right 4 bits (to drop the right most nibble). Like,
int a = 0x63C5;
int afterMask = a >> 4;
System.out.println(Integer.toHexString(afterMask));
Outputs (as requested)
63c
int a = 0x63C5;
int aftermask = a >> 4 ;
System.out.println( String.format("%X", aftermask) );
The mask you need to use is 0XFFF0

Java - why does readAllBytes return incorrect byte codes? [duplicate]

This question already has answers here:
Java byte array contains negative numbers
(5 answers)
Closed 6 years ago.
I have the following problem. I'm using Java to create a byte array from a file. So I do the following:
byte[] myByteArray = Files.readAllBytes(filename);
However, for many of the bytes it is returning incorrect/negative values.
For instance, if I test using javascript, to read every byte of a file e.g.
function readbytes(s){
var f = new File(s);
var i,a,c;
var d = [];
if (f.isopen) {
c = f.eof;
for(i=0;i<c ;i++){
a = f.readbytes(1);
d.push(a);
}
f.close();
return d;
} else {
post("could not open file: " + s + "\n");
}
}
(readbytes is a function in the program I'm using that gives the byte at a specific position.)
This returns the correct bytes.
So I'm wondering, why does Java return incorrect codes? Is this something to do with unsigned values?
Java has no unsigned bytes. For instance, the unsigned byte 255 is printed as its signed counterpart, -1. In memory, however, the bits are the same (0xFF); only the interpretation differs.
If you'd like to convert a byte to its unsigned representation, you may use the bitwise AND operator.
For instance:
bytes[x] & 0xff
Java doesn't know about bytes at runtime either, at least not for operands pushed onto the Java virtual machine's stack. In fact, every arithmetic or bitwise operation you apply to a byte yields an int. That's why (((byte)-1) & 0xff) results in an int whose value is 255. If you would like to store that value back into a byte, you'd have to cast it to byte again, which of course gives -1.
byte x = -1; // java is friendly enough to insert the implicit cast here
System.out.println(x); // -1
System.out.println(x & 0xff); // 255
byte y = (byte)(x & 0xff); // must add (byte) cast
System.out.println(y); // -1
Also keep in mind that only the printed output differs; the content is the same, since a signed Java byte can always be mapped to its unsigned representation. Ideally, you'd use something like DataInputStream, which offers int readUnsignedByte().
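As a rough illustration of the DataInputStream route, the sketch below reads every byte of an in-memory array as an unsigned value. The class and method names are hypothetical, and a ByteArrayInputStream stands in for the file stream so the example is self-contained.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical demo class; names are illustrative only.
final class UnsignedByteReadDemo {

    /** Reads each byte of data as an int in the range 0..255. */
    static int[] readUnsigned(byte[] data) {
        int[] result = new int[data.length];
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            for (int i = 0; i < result.length; i++) {
                result[i] = in.readUnsignedByte();
            }
        } catch (IOException e) {
            // Not expected for an in-memory stream; rethrown unchecked to keep the sketch short.
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        byte[] raw = {(byte) 0xFF, (byte) 0x80, 0x7F};
        System.out.println(Arrays.toString(raw));               // [-1, -128, 127]
        System.out.println(Arrays.toString(readUnsigned(raw))); // [255, 128, 127]
    }
}
```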

FindBugs: INT_VACUOUS_BIT_OPERATION

In order to convert from an int to an IP String, I am using the approach from "Going from 127.0.0.1 to 2130706433, and back again":
private static final byte BYTE_MASK = (byte)0xff;
protected byte[] unpack(int bytes) {
return new byte[] {
(byte)((bytes >>> 24) & BYTE_MASK),
(byte)((bytes >>> 16) & BYTE_MASK),
(byte)((bytes >>> 8) & BYTE_MASK),
(byte)((bytes ) & BYTE_MASK)
};
}
But FindBugs in Eclipse generates bugs: INT_VACUOUS_BIT_OPERATION.
INT_VACUOUS_BIT_OPERATION: bit operations that don't do any meaningful work.
Why is that and how to fix it?
I suspect it's because you don't need the & BYTE_MASK if you're also casting to byte. I'm assuming that BYTE_MASK is 0xff... in which case it's basically pointless. Just casting will have the same effect.
From section 5.1.3 of the JLS:
A narrowing conversion of a signed integer to an integral type T simply discards all but the n lowest order bits, where n is the number of bits used to represent type T. In addition to a possible loss of information about the magnitude of the numeric value, this may cause the sign of the resulting value to differ from the sign of the input value.
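One further detail worth checking: in the question BYTE_MASK is declared as a byte, so (byte)0xff is -1, and when it is promoted to int for the & it becomes 0xFFFFFFFF. ANDing with all ones leaves the value unchanged, which would also explain the "vacuous" warning. The sketch below reuses the constant from the question; the class and helper names are made up for illustration.

```java
import java.util.Arrays;

// Hypothetical demo class; names are illustrative only.
final class VacuousMaskDemo {

    private static final byte BYTE_MASK = (byte) 0xff; // as in the question: a byte holding -1

    /** The mask as the & operator sees it, after promotion to int. */
    static int promotedMask() {
        return BYTE_MASK;
    }

    /** Unpacks an int into four bytes, most significant first, using only the casts. */
    static byte[] unpack(int bytes) {
        return new byte[] {
            (byte) (bytes >>> 24),
            (byte) (bytes >>> 16),
            (byte) (bytes >>> 8),
            (byte) bytes
        };
    }

    public static void main(String[] args) {
        System.out.println(Integer.toHexString(promotedMask())); // ffffffff
        System.out.println(Arrays.toString(unpack(2130706433))); // [127, 0, 0, 1]
    }
}
```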

Read two bytes into an integer?

I have a byte[] that I've read from a file, and I want to get an int from two bytes in it. Here's an example:
byte[] bytes = new byte[] {(byte)0x00, (byte)0x2F, (byte)0x01, (byte)0x10, (byte)0x6F};
int value = bytes.getInt(2,4); //This method doesn't exist
This should make value equal to 0x0110, or 272 in decimal. But obviously, byte[].getInt() doesn't exist. How can I accomplish this task?
The above array is just an example. Actual values are unknown to me.
You should just opt for the simple:
int val = ((bytes[2] & 0xff) << 8) | (bytes[3] & 0xff);
You could even write your own helper function getBytesAsWord (byte[] bytes, int start) to give you the functionality if you didn't want the calculations peppering your code but I think that would probably be overkill.
Try:
public static int getInt(byte[] arr, int off) {
return arr[off]<<8 &0xFF00 | arr[off+1]&0xFF;
} // end of getInt
Your question didn't indicate what the two args (2,4) meant. 2 and 4 don't make sense in your example as indices in the array to find 0x01 and 0x10, so I guessed you wanted to take two consecutive elements, a common thing to do, and used off and off+1 in my method.
You can't extend the byte[] class in java, so you can't have a method bytes.getInt, so I made a static method that uses the byte[] as the first arg.
The 'trick' to the method is that your bytes are 8-bit signed integers, so values of 0x80 and above are negative and would be sign extended (i.e. 0x80 becomes 0xFFFFFF80 when used as an int). That is why the '&0xFF' masking is needed. The '<<8' shifts the more significant byte 8 bits left.
The '|' combines the two values -- just as '+' would. The order of the operators is important because << has highest precedence, followed by & followed by | -- thus no parentheses are needed.
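To see what the masking buys you, the sketch below runs the method above next to an unmasked variant on a low byte of 0x80. The class name and the unmasked helper are hypothetical and exist only for the comparison.

```java
// Hypothetical demo class; getIntUnmasked exists only to show the failure mode.
final class TwoByteIntDemo {

    /** Big-endian combine of two bytes, masked so sign extension cannot leak in. */
    static int getInt(byte[] arr, int off) {
        return arr[off] << 8 & 0xFF00 | arr[off + 1] & 0xFF;
    }

    /** Same combine without masks: a negative low byte overwrites the upper bits. */
    static int getIntUnmasked(byte[] arr, int off) {
        return (arr[off] << 8) | arr[off + 1];
    }

    public static void main(String[] args) {
        byte[] data = {0x12, (byte) 0x80};
        System.out.println(getInt(data, 0));         // 4736 (0x1280)
        System.out.println(getIntUnmasked(data, 0)); // -128 (0xFFFFFF80)
    }
}
```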
Here's a nice simple reliable way.
ByteBuffer byteBuffer = ByteBuffer.allocateDirect(4);
// by choosing big endian, high order bytes must be put
// to the buffer before low order bytes
byteBuffer.order(ByteOrder.BIG_ENDIAN);
// since ints are 4 bytes (32 bit), you need to put all 4, so put 0
// for the high order bytes
byteBuffer.put((byte)0x00);
byteBuffer.put((byte)0x00);
byteBuffer.put((byte)0x01);
byteBuffer.put((byte)0x10);
byteBuffer.flip();
int result = byteBuffer.getInt();
Alternatively, you could use addition instead of |, as long as both bytes are still masked:
int val = ((bytes[2] & 0xff) << 8) + (bytes[3] & 0xff);
You can use ByteBuffer. It has the getInt method you are looking for (which reads four bytes; getShort reads two), along with many other useful methods.
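For the two-byte case in the question, a sketch along these lines might do: wrap the existing array and use the absolute getShort(int index), masking the result so it is treated as unsigned. The class and method names are hypothetical; ByteBuffer defaults to big-endian order.

```java
import java.nio.ByteBuffer;

// Hypothetical helper; names are illustrative only.
final class WrapAndReadDemo {

    /** Reads two bytes starting at offset as an unsigned big-endian 16-bit value. */
    static int unsignedShortAt(byte[] bytes, int offset) {
        // wrap() does not copy the array; getShort(int) reads at an absolute index.
        return ByteBuffer.wrap(bytes).getShort(offset) & 0xFFFF;
    }

    public static void main(String[] args) {
        byte[] bytes = {(byte) 0x00, (byte) 0x2F, (byte) 0x01, (byte) 0x10, (byte) 0x6F};
        System.out.println(unsignedShortAt(bytes, 2)); // 272
    }
}
```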
The Google Base16 class is from Guava-14.0.1.
new BigInteger(com.google.common.io.BaseEncoding.base16().encode(bytesParam),16).longValue();
