Read binary stream containing unsigned numbers - java

I want to read a binary file containing 32-bit unsigned integers and 8-bit unsigned integers. I already know DataInputStream, but its readInt method returns signed integers and there is no method for reading unsigned ints (there are such methods for 16-bit and 8-bit integers).
Is reading separate bytes and concatenating them bitwise the “official” way to do it? Would reading bytes into a byte array and composing integers from them using bit shifts and bitwise ORs significantly decrease performance?

You can use
long value = Integer.toUnsignedLong(dataInputStream.readInt());
This is equivalent to the pre-Java 8 code
long value = dataInputStream.readInt() & 0xFFFFFFFFL;
The key point is that signed or unsigned are just different interpretations of the bit pattern, but to read the four byte quantity, readInt() is always sufficient. The operation above converts to a signed long, a datatype capable of covering all values of unsigned int.
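As a minimal sketch of the approach described above, the following reads one unsigned 32-bit value and one unsigned 8-bit value from a stream. The class name, the helper name and the sample bytes are assumptions made for illustration; the input is assumed to be big-endian, which is what DataInputStream expects.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

final class UnsignedStreamDemo {
    /** Reads a big-endian 32-bit unsigned integer and widens it to a long. */
    static long readUnsignedInt(DataInputStream in) throws IOException {
        return Integer.toUnsignedLong(in.readInt());
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical sample data: 0xFFFFFFFE as a 32-bit value, then 0xC8 as an 8-bit value.
        byte[] sample = {(byte) 0xFF, (byte) 0xFF, (byte) 0xFF, (byte) 0xFE, (byte) 0xC8};
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(sample))) {
            long u32 = readUnsignedInt(in);  // 4294967294
            int u8 = in.readUnsignedByte();  // 200
            System.out.println(u32 + " " + u8);
        }
    }
}
```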
But since the int does already hold all information, there is no need to convert it to a long immediately. The Two’s Complement used to represent the signed numbers even allows performing basic operations, i.e. +, -, and *, without differentiating between signed and unsigned numbers. For other operations, Java 8 introduced methods to perform them by interpreting the int value as unsigned:
Integer.divideUnsigned(…)
Integer.remainderUnsigned(…)
Integer.compareUnsigned(…)
Integer.toUnsignedString(…)
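To make the difference concrete, here is a small sketch that keeps the values in plain int variables and only uses the unsigned methods where the interpretation matters. The class name and the two sample values are chosen for illustration only.

```java
final class UnsignedIntOps {
    public static void main(String[] args) {
        int a = 0xFFFFFFF0; // 4294967280 when read as unsigned, -16 when read as signed
        int b = 0x00000020; // 32 either way

        // Addition yields the same bit pattern under both interpretations.
        int sum = a + b;    // 0x00000010, wraps modulo 2^32
        System.out.println(Integer.toUnsignedString(sum));     // 16

        // Division, remainder, comparison and printing need the unsigned variants.
        System.out.println(a / b);                              // 0 (signed: -16 / 32)
        System.out.println(Integer.divideUnsigned(a, b));       // 134217727
        System.out.println(Integer.remainderUnsigned(a, b));    // 16
        System.out.println(Integer.compareUnsigned(a, b) > 0);  // true
        System.out.println(Integer.toUnsignedString(a));        // 4294967280
    }
}
```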
A practical example I encountered is parsing class files. These files have sizes encoded as unsigned int in some places, but with most standard Java APIs, class files are delivered as byte array or ByteBuffer instances, which cannot contain more than 2³¹ bytes. So dealing with larger numbers is an unnecessary complication for something that can’t be correct anyway, as a class file containing such a large size specification must be truncated.
So the code to handle this looks basically like:
int size = input.readInt();
if(Integer.compareUnsigned(size, Integer.MAX_VALUE)>0) throw new IllegalArgumentException(
"truncated class file (attribute size "+Integer.toUnsignedString(size)+')');
// just use the int value
or without Java 8 features
(even simpler, as long as the reader understands the Two’s Complement):
int size = input.readInt();
if(size < 0) throw new IllegalArgumentException(
"truncated class file (attribute size "+(size&0xFFFFFFFFL)+')');
// just use the int value
(see also this answer)

Related

How to handle unsigned shorts/ints/longs in Java

I'm reading a file format that specifies some types are unsigned integers and shorts. When I read the values, I get them as a byte array. The best route to turning them into shorts/ints/longs I've seen is something like this:
ByteBuffer wrapped = ByteBuffer.wrap(byteArray);
int x = wrapped.getInt();
That looks like it could easily overflow for unsigned ints. Is there a better way to handle this scenario?
Update: I should mention that I'm using Groovy, so I absolutely don't care if I have to use a BigInteger or something like that. I just want the maximum safety on keeping the value intact.
A 32bit value, signed or unsigned, can always be stored losslessly in an int*. This means that you never have to worry about putting unsigned values in signed types from a data safety point of view.
The same is true for 8bit values in bytes, 16bit values in shorts and 64bit values in longs.
Once you've read an unsigned value into the corresponding signed type, you can promote it to a signed value of a larger type to more easily work with the intended value:
Integer.toUnsignedLong(int)
Short.toUnsignedInt(short)
Byte.toUnsignedInt(byte)
Since there's no primitive type larger than long, you can either go via BigInteger, or use the convenience methods on Long to do unsigned operations:
new BigInteger(Long.toUnsignedString(long))
Long.divideUnsigned(long,long) and friends
* This is thanks to the JVM requiring integer types to be two's complement.
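A short sketch of these promotions, applied to values taken from a ByteBuffer as in the question. The class name, the helper name and the all-ones sample bytes are assumptions for illustration; the BigInteger step goes through the String constructor because BigInteger.valueOf only accepts a long.

```java
import java.math.BigInteger;
import java.nio.ByteBuffer;
import java.util.Arrays;

final class UnsignedPromotion {
    /** Widens an unsigned 64-bit value stored in a long to a BigInteger. */
    static BigInteger toUnsignedBigInteger(long value) {
        return new BigInteger(Long.toUnsignedString(value));
    }

    public static void main(String[] args) {
        // Hypothetical input: 8 bytes with every bit set.
        byte[] byteArray = new byte[8];
        Arrays.fill(byteArray, (byte) 0xFF);
        ByteBuffer wrapped = ByteBuffer.wrap(byteArray); // big-endian by default

        System.out.println(Byte.toUnsignedInt(wrapped.get(0)));        // 255
        System.out.println(Short.toUnsignedInt(wrapped.getShort(0)));  // 65535
        System.out.println(Integer.toUnsignedLong(wrapped.getInt(0))); // 4294967295
        System.out.println(toUnsignedBigInteger(wrapped.getLong(0)));  // 18446744073709551615
    }
}
```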
To hold an unsigned int/short/byte, you need to use the next "bigger" type, i.e. long/int/short. If you already hold the value in the signed type that can overflow, the conversion can be done by doing the following:
int unsignedVal = byteVal & 0xff;
If you just cast the value, the sign bit is extended into the wider type and you will still end up with the negative value.
If you have to handle unsigned longs you need to "switch" to java.math.BigInteger.
Unsigned primitives are a pain in Java.
There's no clean way of handling them, except using larger types with more bits, and taking care to avoid automatic sign extension when casting.
In your case, you can do something like this:
ByteBuffer wrapped = ByteBuffer.wrap(byteArray);
int signedInt = wrapped.getInt();
long unsigned = signedInt & 0xffffffffL;
I usually write the required conversion(s) in a utility class someplace, since they're easy to get wrong. If you copy & paste that one liner conversion everywhere, eventually one will be wrong.
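Such a utility class might look like the following sketch. The class and method names are invented for illustration and are not taken from any library.

```java
final class Unsigned {
    private Unsigned() {} // static helpers only

    /** Interprets the 8 bits of b as an unsigned value in 0..255. */
    static int u8(byte b) { return b & 0xFF; }

    /** Interprets the 16 bits of s as an unsigned value in 0..65535. */
    static int u16(short s) { return s & 0xFFFF; }

    /** Interprets the 32 bits of i as an unsigned value in 0..4294967295. */
    static long u32(int i) { return i & 0xFFFFFFFFL; } // the L suffix matters: 0xFFFFFFFF alone is the int -1
}
```

Keeping the masks in one place means the easy-to-miss L suffix only has to be written correctly once.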
Note that if you need unsigned longs, the only larger type is BigInteger.
If you need anything more than simple conversions, I suggest using Guava, since it has some nice classes for dealing with unsigned types (see the Guava documentation).

How to send value bigger than 127 in byte Java

I am working on an Smart Card where there is a method in javax.smartcardio.CommandAPDU.
CommandAPDU(int cla, int ins, int p1, int p2, byte[] data, int ne)
I need to send data as byte[] (5th argument). Now my problem is that, as Java primitive data types are signed the max value of a byte can not exceed 127. I need to send a value bigger than 127. To be precise, the hex value 94 which is equal to 148.
Some solutions suggest converting the byte to an int:
byte b = -108;
int i = b & 0xff;
I can't do that, as the CommandAPDU() constructor doesn't take an int[]. So how do I do it?
Depending on how it is interpreted by the smart card, you could just send the correct negative value. If the smart card interprets value as unsigned, you could for example send -1 for 255.
You're calculating the APDU with unsigned bytes, while Java uses signed bytes.
It's just a matter of how the data is interpreted, sending -108 to the smart card will be interpreted in exactly the same way as sending 148 from a platform using unsigned bytes. The bit combination is exactly the same.
Java can even do the conversion itself, so that you can write the code using unsigned numbers:
byte data = (byte)0x94; // stores -108 in "data", which will be interpreted
// as 148 on an unsigned platform
For long blocks of data, it is probably best to use a hexadecimal encoder/decoder. But be sure that you handle the data as bytes internally (directly decode and don't look back to the hex String). The Apache codec library contains a good encoder/decoder, or you can use Bouncy Castle or Guava or use one of the many examples on SO.
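If pulling in a library is not an option, a hand-written decoder is short. The following is only a sketch with invented names, not the API of any of the libraries mentioned; it turns a hex string straight into the byte[] that would be passed as the data argument.

```java
final class HexBytes {
    /** Decodes a hex string such as "94A0FF" into bytes; throws on malformed input. */
    static byte[] decode(String hex) {
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("odd number of hex digits");
        }
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            int hi = Character.digit(hex.charAt(2 * i), 16);
            int lo = Character.digit(hex.charAt(2 * i + 1), 16);
            if (hi < 0 || lo < 0) {
                throw new IllegalArgumentException("not a hex digit near index " + (2 * i));
            }
            out[i] = (byte) (hi << 4 | lo); // values above 0x7F become negative, the bits are unchanged
        }
        return out;
    }
}
```

For example, HexBytes.decode("94") yields a one-element array holding -108, which is the same bit pattern as the unsigned value 148.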

How to read unsigned values from files

I am trying to read binary data (Doom WAD files), which contain a lot of unsigned short and unsigned byte values.
At the moment I read the file into a byte[], wrap a ByteBuffer with little-endian order around it and access the values by bbuf.getShort() etc. respectively.
Converting those to, e.g., 2D coordinates is not a problem, because in the end it won't matter whether they range from -128 to 127 or from 0 to 255, but more often the short values are used as array indices and the short/byte values as flags, so I need a fast way to treat them as unsigned types.
I know, Java doesn't have unsigned types "for sake of simplicity...".
Can you make any suggestions?
To hold an unsigned int you need a long. You then need to keep only the low 32 bits, clearing the sign-extended upper bits. You can use the following trick to do it:
final long UNSIGNED_INT_BITS = 0xffffffffL;
int a = -3;
long b = UNSIGNED_INT_BITS & a;
System.out.println(a);
System.out.println(b);
System.out.println(Long.toHexString(UNSIGNED_INT_BITS));
Output:
-3
4294967293
ffffffff
If all else fails, you could always store them internally as ints and make sure you do proper conversion when reading/writing.
(Read as byte/short, cast to int, add 2^bits if negative. Just truncate to 8/16 bits when writing.)
Hardly the most elegant solution, I admit.
If you need to interpret the byte 0xFF as 255, do the following:
int n = b & 0xFF;
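Putting the masking together with the little-endian ByteBuffer from the question could look like this sketch. The class name, helper names and sample bytes are assumptions for illustration and do not reflect the actual WAD layout.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class LittleEndianUnsigned {
    /** Reads the 16-bit value at offset and returns it as 0..65535. */
    static int unsignedShortAt(ByteBuffer buf, int offset) {
        return buf.getShort(offset) & 0xFFFF;
    }

    /** Reads the 8-bit value at offset and returns it as 0..255. */
    static int unsignedByteAt(ByteBuffer buf, int offset) {
        return buf.get(offset) & 0xFF;
    }

    public static void main(String[] args) {
        // Hypothetical data: the little-endian short 0xF234 followed by the byte 0x80.
        byte[] data = {(byte) 0x34, (byte) 0xF2, (byte) 0x80};
        ByteBuffer bbuf = ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN);

        int index = unsignedShortAt(bbuf, 0); // 62004, safe to use as an array index
        int flags = unsignedByteAt(bbuf, 2);  // 128
        System.out.println(index + " " + flags);
    }
}
```

Both helpers return int, so the results can be used directly as array indices or tested with bitwise flags without further conversion.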

How to read file created by C++ program in java?

I have a file created by a C++ program, which is in an encrypted format. I want to read it in my Java program. The decryption algorithm operates on bytes (unsigned char, i.e. BYTE, in C/C++). I used the same decryption algorithm that I used in my C/C++ program. This algorithm contains ^, %, * and - operations on bytes. But the byte datatype in Java is signed, because of which I am facing problems in decryption. How can I read the file, or process the data I read, one unsigned byte at a time?
thanks in advance.
byte b = <as read from file>;
int i = b & 0xFF;
Perform operations on i as required
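The following sketch shows why the masking step matters for an operator like %, while ^ is unaffected once the result is cast back to byte. The class name, method names and sample values are invented for illustration and are not the asker's algorithm.

```java
final class UnsignedByteMath {
    /** Remainder of the byte's unsigned value (0..255) divided by m. */
    static int unsignedMod(byte b, int m) {
        return (b & 0xFF) % m;
    }

    /** XOR of two bytes; the low 8 bits are the same for signed and unsigned input. */
    static byte xor(byte b, byte key) {
        return (byte) (b ^ key);
    }

    public static void main(String[] args) {
        byte b = (byte) 0xF0; // 240 as unsigned char in C, -16 as a Java byte

        System.out.println(b % 7);                    // -2, not what the C code computes
        System.out.println(unsignedMod(b, 7));        // 2, matches unsigned char arithmetic
        System.out.println(xor(b, (byte) 0x0F) & 0xFF); // 255
    }
}
```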
The standard method InputStream.read() reads one byte and fits it into an int, so in practice it is an unsigned byte. There are no unsigned primitive data types in Java, so the only approach is to fit it into a wider primitive.
That being said, you should have no trouble performing encryption/decryption over data bytes read from the file, since the bytes are the same no matter whether they are interpreted as signed or unsigned (0xFF can be 255 or -1). This holds for ^ and for the low 8 bits of +, - and *; operations such as % and comparisons do depend on the interpretation, so mask with & 0xFF before applying them. You say the algorithm contains "^, %, *", etc. That is an interpretation of raw bytes, taking into account a character encoding (that fits 8 bits per character, I suppose). You should not perform encryption/decryption operations over anything other than raw bytes.
First, InputStream.read() returns an int but it holds a byte; it uses an int so -1 can be returned if the EOF is reached. If the int is not -1, you can cast it to byte.
Second, there are read() methods that allow storing the bytes directly in a byte[].
And last, if you are going to use the file as a byte[] (and it is not too big), it may be worth copying the data from a FileInputStream into a ByteArrayOutputStream. You can get the resulting byte[] from the latter object (note: do not use the single-byte .read() method; use .read(byte[], int, int) for performance).
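A minimal sketch of that copy loop follows. It accepts any InputStream, so a FileInputStream can be passed in; the class name, method name and the 8192-byte chunk size are arbitrary choices for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

final class StreamBytes {
    /** Copies everything from in into a byte[] using bulk reads. */
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        // read(byte[], int, int) returns the number of bytes read, or -1 at end of stream.
        while ((n = in.read(chunk, 0, chunk.length)) != -1) {
            out.write(chunk, 0, n);
        }
        return out.toByteArray();
    }
}
```

The caller remains responsible for closing the stream, for example with a try-with-resources block around the FileInputStream.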
Since there is no unsigned primitive type in Java, I think what you can do is convert the signed byte into an int (which will effectively be unsigned, because the masked value is never negative). See the question "Can we make unsigned byte in Java" for the conversion.

Similar functionality for java to struct for python

I have a program that I made in Python to find specific tags in TIFF IFD's and return the values. It was just a proof of concept thing in python, and now I need to move the functionality to java. I think I can just use the String(byteArray[]) constructor for the ASCII data types, but I still need to get Unsigned short (2 byte) and unsigned long (4 byte) values. I don't need to write them back to the file or modify them, all I need to do is get a Java Integer or Long object from them. This is easy in python with the struct and mmap classes, does any one know of a similar way in java? I looked at the DataInput class, but the readUnsignedLong method reads 8 bytes.
DataInputStream allows you to read 16-bit and 32-bit values. You should mask them with the appropriate bit mask (0xFFFF for 16 bit, 0xFFFFFFFFL for 32 bit; the L suffix is required so that the mask is a long) in order to account for the difference between signed/unsigned types.
e.g.
// omits error handling
FileInputStream fis = ...;
DataInputStream stream = new DataInputStream(fis);
int short_value = 0xFFFF & stream.readShort();
long long_value = 0xFFFFFFFFL & stream.readInt(); // L suffix: without it the mask is the int -1 and does nothing
If you're sure that the data won't be towards the high end of the 2 byte field, or 4 byte field, you can forego the bit masking. Otherwise, you need to use a wider data type to account for the fact that unsigned values hold a larger range of values than their signed counterparts.
I looked at the DataInput class, but the readUnsignedLong method reads 8 bytes.
Java does not have unsigned types. It takes 4 bytes to make an int, and 8 bytes to make a long, unsigned or otherwise.
If you don't want to use DataInput, you can read the bytes into byte arrays (byte[]) and use a ByteBuffer to turn those byte values into ints and longs with left padding. See ByteBuffer#getInt() and ByteBuffer#getLong().
DataInput would be the preferred method. You can use readUnsignedShort for the two byte values. For the 4 byte values you'll have to use this workaround...
long l = dis.readInt() & 0xffffffffL;
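One caveat for TIFF specifically: DataInput always reads big-endian, while a TIFF file announces its byte order in its first two bytes ("II" for little-endian, "MM" for big-endian). The sketch below therefore uses a ByteBuffer with the detected order instead; the class name, helper names and the sample header bytes are assumptions for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class TiffHeaderSketch {
    /** Returns the byte order announced by the first two bytes of the file. */
    static ByteOrder byteOrderOf(byte[] file) {
        if (file[0] == 'I' && file[1] == 'I') return ByteOrder.LITTLE_ENDIAN;
        if (file[0] == 'M' && file[1] == 'M') return ByteOrder.BIG_ENDIAN;
        throw new IllegalArgumentException("not a TIFF byte-order mark");
    }

    /** Reads a 2-byte unsigned value at offset. */
    static int unsignedShort(ByteBuffer buf, int offset) {
        return buf.getShort(offset) & 0xFFFF;
    }

    /** Reads a 4-byte unsigned value at offset. */
    static long unsignedInt(ByteBuffer buf, int offset) {
        return buf.getInt(offset) & 0xFFFFFFFFL;
    }

    public static void main(String[] args) {
        // Hypothetical little-endian header: "II", the number 42, then the offset of the first IFD.
        byte[] header = {'I', 'I', 42, 0, 8, 0, 0, 0};
        ByteBuffer buf = ByteBuffer.wrap(header).order(byteOrderOf(header));

        System.out.println(unsignedShort(buf, 2)); // 42
        System.out.println(unsignedInt(buf, 4));   // 8
    }
}
```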
You could use Javolution's Struct class which provides structure to regions of data. You set up a wrapper and then use the wrapper to access the data. Simples. Java really needs this super-useful class in its default classpath TBQH.
The Preon library is good for creating structs in Java. I tried Javolution's Struct, but it was not helpful in my case. Preon is open source and a very good library.
