Java - indexing into array with a byte

Java - indexing into array with a byte - java

Is it possible to index a Java array based on a byte?
i.e. something like
array[byte b] = x;
I have a very performance-critical application which reads b (in the code above) from a file, and I don't want the overhead of converting this to an int. What is the best way to achieve this? Is there a performance-decrease as a result of using this method of indexing rather than an int?
With many thanks,
Froskoy.

There's no overhead for "converting this to an int." At the Java bytecode level, all bytes are already ints.
In any event, doing array indexing will automatically upcast to an int anyway. None of these things will improve performance, and many will decrease performance. Just leave your code using an int.
The JVM specification, section 2.11.1:
Note that most instructions in Table 2.2 do not have forms for the integral types byte, char, and short. None have forms for the boolean type. Compilers encode loads of literal values of types byte and short using Java virtual machine instructions that sign-extend those values to values of type int at compile-time or runtime. Loads of literal values of types boolean and char are encoded using instructions that zero-extend the literal to a value of type int at compile-time or runtime. Likewise, loads from arrays of values of type boolean, byte, short, and char are encoded using Java virtual machine instructions that sign-extend or zero-extend the values to values of type int. Thus, most operations on values of actual types boolean, byte, char, and short are correctly performed by instructions operating on values of computational type int.

As all integer types in java are signed you have anyway to mask out 8 bits of b's value provided you do expect to read from the file values greater than 0x7F:
byte b;
byte a[256];
a [b & 0xFF] = x;

No; array indices are non-negative integers (JLS 10.4), but byte indices will be promoted.

No, there is no performance decrease, because on the moment you read the byte, you store it in a CPU register sometime. Those registers always works with WORDs, which means that the byte is always "converted" to an int (or a long, if you are on a 64 bit machine).
So, simply read your byte like this:
int b = (in.readByte() & 0xFF);
If your application is that performance critical, you should be optimizing elsewhere.

Related

How to handle unsigned shorts/ints/longs in Java

I'm reading a file format that specifies some types are unsigned integers and shorts. When I read the values, I get them as a byte array. The best route to turning them into shorts/ints/longs I've seen is something like this:
ByteBuffer wrapped = ByteBuffer.wrap(byteArray);
int x = wrapped.getInt();
That looks like it could easily overflow for unsigned ints. Is there a better way to handle this scenario?
Update: I should mention that I'm using Groovy, so I absolutely don't care if I have to use a BigInteger or something like that. I just want the maximum safety on keeping the value intact.

A 32bit value, signed or unsigned, can always be stored losslessly in an int*. This means that you never have to worry about putting unsigned values in signed types from a data safety point of view.
The same is true for 8bit values in bytes, 16bit values in shorts and 64bit values in longs.
Once you've read an unsigned value into the corresponding signed type, you can promote them to signed values of a larger types to more easily work with the intended value:
Integer.toUnsignedLong(int)
Short.toUnsignedInt(short)
Byte.toUnsignedInt(byte)
Since there's no primitive type larger than long, you can either go via BigInteger, or use the convenience methods on Long to do unsigned operations:
BigInteger.valueOf(Long.toUnsignedString(long))
Long.divideUnsigned(long,long) and friends
* This is thanks to the JVM requiring integer types to be two's complement.

To hold an unsigned int/short/byte, you need to use the next "bigger" type, i.e. long/int/short. If you already hold the value in the signed type that can overflow, the conversion can be done by doing the following:
int unsignedVal = byteVal & 0xff
If you just cast them, the negative-bit will be regarded and you will still end up with the negative value.
If you have to handle unsigned longs you need to "switch" to java.math.BigInteger.

Unsigned primitives are a pain in Java.
There's no clean way of handing them, except using larger types with more bits, and taking care to avoid automatic sign extension when casting.
In your case, you can do something like this:
ByteBuffer wrapped = ByteBuffer.wrap(byteArray);
int signedInt = wrapped.getInt();
long unsigned = signedInt & 0xffffffffL;
I usually write the required conversion(s) in a utility class someplace, since they're easy to get wrong. If you copy & paste that one liner conversion everywhere, eventually one will be wrong.
Note that if you need unsigned longs, the only larger type is BigInteger.
If you need anything more than simple conversions, I suggest using Guava since it has some nice classes for dealing with unsigned types. See documentation here.

Conversion of a byte value into an int

I know that if we are performing arithmetic operations on byte value,then implicitly it is promoted to an int and the result will be an int,and hence we need to explicitly convert it into byte in order to store the result in a byte variable.
But I wanted to ask-
Does the conversion from byte to an int happens at the time when it
is declared or does it happens when we use it in arithmetic
operation? Because the java decompiler I am using converts it from
byte to an int at the time of declaration.So,is it the decompiler
problem or is it really so.
And if it really happens at the time of declaration,then why storing
a value beyond the range of byte shows an error?
Eg-
The code is-
public class Hello
{
public static void main(String[] args)
{
byte a=90;
}
}
Output from decompiler-

A byte remains a byte, and is as such statically typed by the compiler.
The JVM machine implements holding a single byte (as opposed to a byte array) as variable in an int slot. And uses int opcodes for arithmetic.
Also assigning a final int constant to a byte will be done by the compiler, as long as it is in the byte range -128, ... 127.

Implicit type conversion in Java is a compile-time mechanism. So yes, it will show errors such as value beyond range. Here is a great SO answer on this topic:
Runtime vs Compile time

The compiler will crop any value declared on the stack to an integer, if it is within range.
Taken from infoq article
When loading a value onto the operand stack, the JVM treats primitive
types that are smaller than an integer as if they were integers.
Consequently, it makes little difference to a program’s bytecode
representation if a method variable is represented by, for example, a
short instead of an int. Instead, the Java compiler inserts additional
bytecode instructions that crop a value to the allowed range of a
short when assigning values. Using shorts over integers in a method’s
body can therefore rather result in additional bytecode instructions
rather than presumably optimizing a program.

Is Byte implemented as Int anyway?

I had a java game book recommend implementing all data as Int when possible, that that type runs the fastest. It said the Byte, Char, and Boolean are implemented as Int anyway, so you don't save space and the casting you end up having to do in the code because of the Byte data will slow it down. For instance, a cast is needed for
a = (byte)(b+c);
since the addition result is an Int, even when a,b, and c are all declared as Bytes.
I currently have a huge 2D array declared as Byte for my game, to save space and for bitwise operations. Is it actually saving space? I've also seen bitwise operations done on Ints in examples, do bitwise operations work as expected on Ints?

This is generally incorrect. In fact, this is outlined in the JVM Specification §2.3:
The primitive data types supported by the Java Virtual Machine are the numeric types, the boolean type (§2.3.4), and the returnAddress type (§2.3.3).
The numeric types consist of the integral types (§2.3.1) and the floating-point types (§2.3.2).
The integral types are:
byte, whose values are 8-bit signed two's-complement integers, and whose default value is zero
short, whose values are 16-bit signed two's-complement integers, and whose default value is zero
int, whose values are 32-bit signed two's-complement integers, and whose default value is zero
long, whose values are 64-bit signed two's-complement integers, and whose default value is zero
char, whose values are 16-bit unsigned integers representing Unicode code points in the Basic Multilingual Plane, encoded with UTF-16, and whose default value is the null code point ('\u0000')
Now, for boolean it's slightly a different story. From §2.3.4:
Although the Java Virtual Machine defines a boolean type, it only provides very limited support for it. There are no Java Virtual Machine instructions solely dedicated to operations on boolean values. Instead, expressions in the Java programming language that operate on boolean values are compiled to use values of the Java Virtual Machine int data type.
You can see differences in the bytecode depending on whether you use a byte[] or an int[], so they're not identical:
byte[] b = {42};
ICONST_1
NEWARRAY T_BYTE
DUP
ICONST_0
BIPUSH 42
BASTORE
ASTORE 1
vs
int[] b = {42};
ICONST_1
NEWARRAY T_INT
DUP
ICONST_0
BIPUSH 42
IASTORE
ASTORE 1
Is it actually saving space?
Yes, it likely is, especially if the array is very large.
do bitwise operations work as expected on Ints?
Yes, they do.

It is true that byte + byte = int, requiring a cast, but bytes are implemented with 8 bits of data in memory, while ints are 32 bits. Therefore, using bytes will decrease the amount of memory that an array takes up by 4 times.
For example, if you had a 10 by 10 array of bytes, its size would be 800, but a 10 by 10 array of ints' size would be 3200.
More information on this

The answer depends on whether you are working with a single variable of type byte or with a byte array, byte[]. A byte array does indeed save space, with each Java byte using just a byte of memory (plus a constant amount of housekeeping data for the array object). However, a single local variable of type byte is actually stored as an int on the stack and takes the corresponding 4 bytes of memory. This is even expressed in the bytecode - there is the opcode baload - "Load byte or boolean from array" but there is no opcode for loading a byte from local variable like there is iload for ints. Similarily, local char and boolean variables actually are stored on the stack as an int and int-based opcodes are used to access them.
The entry 2.6.1 in JLS also says that all local variables take either one or two "slots" on the stack, so the single-slot types byte, char, float and int all take the same space. The JVM cannot address a unit smaller than a single such slot, so in the case of a byte, 3 bytes are so to say wasted.
To wrap it up: do use byte arrays to save space, but in the case of individual variables, using a byte instead of int won't save space and may even have a small negative performance impact (it may however be required if you need byte for the semantics, e.g. counter wrapping behavior and such).

Yes, when you perform a byte addition the return value is always an integer, reason being that 8bit + 8bit addition can always result in a value which is greater than 8bit.
e.g.
decimal | binary
255 | 1111 1111
121 | 0111 1001
376 | 1 0111 1000
So if you try to store it in a byte again will definitely result in loss of data.
Instead of 376 you would get "120" if typecasting to byte is done.
You can definitely use int instead of byte and also the bit wise operations work perfectly on int.
refer: http://www.tutorialspoint.com/java/java_bitwise_operators_examples.htm

Objective-C and Java Primitive Data Types

I need to convert a piece of code from Objective-C to Java, but I have a problem understanding the Primitive Types in Objective-C. So I had their data types in my Objective-C code :
UInt64, Uint32, UInt8 ,
which are unsigned integers (as I understand from internet). So my question is, can I use Java primitive types like byte (8bit) - instead of UInt8, int (32bit) - instead of UInt32, and long (64bit) - instead of UInt64.

Unfortunately, it isn't a straight translation and without knowing more about your program, its hard to suggest what the "right" approach is.
If your UInt8 values really range from 0-255, you may have to use Java signed int to be able to hold the entire range.
If you are dealing with byte streams or memory layouts and really need to use just a single byte of memory, than you could try byte, but you may have to test and handle cases to handle when the high-bit is set (value > 127). Ditto with the other unsigned types.
Ideally, if your code just kind of "defaulted" to the unsigned types, but really the signed versions would have worked fine too (i.e. the ranges of your values never equal or exceed 2^7, 2^15, or 2^31 respectively), then you may be fine with the "straight" translation to byte, int, and long.

Yes, those are the correctly sized data types to use in Java. Make sure you take into account that Java does not have unsigned types and the trick is to use the next largest size. 64 bit unsigned arithmetic requires special consideration.

Conversions and Promotions

I just wondering why this works (in Java):
byte b = 27;
but being method declared like this:
public void method(byte b){
System.out.println(b);
}
This doesn't work:
a.method(27);
Gives a Compiler error as follows:
`The method method(byte) in the type App is not applicable for the arguments (int)`
Reading this doesn't give me any clue (probably i am missunderstanding something).
Thanks in advance.

The reason the assignment
byte b = 27;
works is due to section 5.2 of the Java Language Specification (assignment conversion), which includes:
In addition, if the expression is a constant expression (§15.28) of type
byte, short, char or int :
A narrowing primitive conversion may
be used if the type of the variable is
byte, short, or char, and the value of
the constant expression is
representable in the type of the
variable.
In other words, the language has special provision for this case with assignments. Normally, there's no implicit conversion from int to byte.
Interestingly, C# works differently in this respect (despite being like Java in so many other aspects of core functionality) - the method call is valid in C#.

Simple answer - a byte is 1 byte of memory, and an integer is typically 4 bytes.
Without explicitly trying to cast the int to a byte, there is no implied coercion because an integer of size say 10790 would lose information if truncated down to one byte.

The process of taking one numeric value and converting it to another without your help is called a promotion - You are taking for example a 1 byte number, and making it into a 4 byte number by filling in zeroes. This is (generally) safe since the range of possible values in the target type is greater than in the source type.
When the opposite occurs, such as taking an int and converting to a byte, it is a "demotion" ad it is not automatic, you have to force it since there's a loss of value.
What happens here is that 27 is interpreted as an int. The expression itself is an int, and then it gets sent to a method that expects a byte. In other words, the fact that you are putting it inside a call to a method that takes a byte doesn't change the fact that Java considers it to be a bit to begin with.
I don't remember exactly how to define byte constants in Java, but generally speaking, a number like 27 is a magic number. You may want to write a
final byte MY_CONSTANT_THAT_MEANS_27_BUT_WITH_MEANINGFUL_NAME = 27;
And then call your function with that constant instead of the 27.

By default, Java will look at a numeric value such as 27 as an integer type. Think of this in a similar aspect, if you say 123.45, it will see it as a float, and "hello" is a string. These are static values, and not variables, so you need to cast it to a byte, like so:
a.method((byte)27);

We Keep Coding

Java is a programming language and computing platform first released by Sun Microsystems in 1995.