Java buffer indexed by long instead of int? - java

It looks like all Java containers, buffers, arrays, etc. can only be indexed by int. In C++, I can index by unsigned long, for example.
What is the solution for this in Java? I could certainly write my own class that spreads the data across several int-indexable buffers and picks the right one, but is there a better and simpler way?

According to the Java language specification.
10.4 Array Access
Arrays must be indexed by int values; short, byte, or char values may also be used as index values because they are subjected to unary numeric promotion (§5.6) and become int values.
An attempt to access an array component with a long index value results in a compile-time error.
And from my own perspective: since an array's length is an int, no array can have an index beyond Integer.MAX_VALUE, so indexing by long isn't necessary.
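The approach the question mentions (several int-indexable buffers behind one long index) can be sketched roughly as follows. This is only an illustration: the class name LongByteBuffer, the 1 MiB chunk size, and the method names are assumptions, not an existing JDK API.

```java
/** Hypothetical long-indexed byte buffer backed by several int-indexed chunks. */
final class LongByteBuffer {
    // A power-of-two chunk size lets the index be split with a shift and a mask.
    private static final int CHUNK_BITS = 20;               // 1 MiB per chunk (arbitrary choice)
    private static final int CHUNK_SIZE = 1 << CHUNK_BITS;
    private static final int CHUNK_MASK = CHUNK_SIZE - 1;

    private final byte[][] chunks;
    private final long length;

    LongByteBuffer(long length) {
        if (length < 0) {
            throw new IllegalArgumentException("negative length: " + length);
        }
        this.length = length;
        // Round up to a whole number of chunks; throws if the count does not fit in an int.
        int chunkCount = Math.toIntExact((length + CHUNK_MASK) >>> CHUNK_BITS);
        chunks = new byte[chunkCount][];
        long remaining = length;
        for (int i = 0; i < chunkCount; i++) {
            chunks[i] = new byte[(int) Math.min(remaining, CHUNK_SIZE)];
            remaining -= chunks[i].length;
        }
    }

    long length() {
        return length;
    }

    byte get(long index) {
        checkIndex(index);
        // High bits select the chunk, low bits select the position inside it.
        return chunks[(int) (index >>> CHUNK_BITS)][(int) (index & CHUNK_MASK)];
    }

    void set(long index, byte value) {
        checkIndex(index);
        chunks[(int) (index >>> CHUNK_BITS)][(int) (index & CHUNK_MASK)] = value;
    }

    private void checkIndex(long index) {
        if (index < 0 || index >= length) {
            throw new IndexOutOfBoundsException("index " + index + ", length " + length);
        }
    }
}
```

Every access still ends in an int-indexed array access, as the specification requires; the long index is only split into a chunk number and an offset.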

Related

How to handle unsigned shorts/ints/longs in Java

I'm reading a file format that specifies some types are unsigned integers and shorts. When I read the values, I get them as a byte array. The best route to turning them into shorts/ints/longs I've seen is something like this:
ByteBuffer wrapped = ByteBuffer.wrap(byteArray);
int x = wrapped.getInt();
That looks like it could easily overflow for unsigned ints. Is there a better way to handle this scenario?
Update: I should mention that I'm using Groovy, so I absolutely don't care if I have to use a BigInteger or something like that. I just want the maximum safety on keeping the value intact.
A 32-bit value, signed or unsigned, can always be stored losslessly in an int*. This means that you never have to worry about putting unsigned values in signed types from a data safety point of view.
The same is true for 8-bit values in bytes, 16-bit values in shorts and 64-bit values in longs.
Once you've read an unsigned value into the corresponding signed type, you can promote it to a signed value of a larger type to work with the intended value more easily:
Integer.toUnsignedLong(int)
Short.toUnsignedInt(short)
Byte.toUnsignedInt(byte)
Since there's no primitive type larger than long, you can either go via BigInteger, or use the convenience methods on Long to do unsigned operations:
new BigInteger(Long.toUnsignedString(long))
Long.divideUnsigned(long,long) and friends
* This is thanks to the JVM requiring integer types to be two's complement.
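As a rough sketch of how these JDK methods combine with ByteBuffer: the class name UnsignedReads and its method names are made up for illustration, while the JDK calls themselves (available since Java 8) are real.

```java
import java.math.BigInteger;
import java.nio.ByteBuffer;

/** Hypothetical helpers that read unsigned values from a ByteBuffer. */
final class UnsignedReads {
    private UnsignedReads() {}

    static int readUnsignedByte(ByteBuffer buf) {
        return Byte.toUnsignedInt(buf.get());          // 0 .. 255
    }

    static int readUnsignedShort(ByteBuffer buf) {
        return Short.toUnsignedInt(buf.getShort());    // 0 .. 65535
    }

    static long readUnsignedInt(ByteBuffer buf) {
        return Integer.toUnsignedLong(buf.getInt());   // 0 .. 4294967295
    }

    static BigInteger readUnsignedLong(ByteBuffer buf) {
        // No primitive is wider than long, so go through the unsigned decimal string.
        return new BigInteger(Long.toUnsignedString(buf.getLong()));
    }
}
```

Note that ByteBuffer reads big-endian by default; if the file format is little-endian, the buffer's byte order would have to be set first with order(ByteOrder.LITTLE_ENDIAN).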
To hold an unsigned int/short/byte, you need to use the next "bigger" type, i.e. long/int/short. If you already hold the value in the signed type that can overflow, the conversion can be done like this:
int unsignedVal = byteVal & 0xff;
If you just cast the value, the sign bit is extended and you still end up with the negative value.
If you have to handle unsigned longs, you need to "switch" to java.math.BigInteger.
Unsigned primitives are a pain in Java.
There's no clean way of handling them, except using larger types with more bits, and taking care to avoid automatic sign extension when casting.
In your case, you can do something like this:
ByteBuffer wrapped = ByteBuffer.wrap(byteArray);
int signedInt = wrapped.getInt();
long unsigned = signedInt & 0xffffffffL;
I usually write the required conversion(s) in a utility class someplace, since they're easy to get wrong. If you copy & paste that one liner conversion everywhere, eventually one will be wrong.
Note that if you need unsigned longs, the only larger type is BigInteger.
If you need anything more than simple conversions, I suggest using Guava, since it has some nice classes for dealing with unsigned types (see its documentation).
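The utility class suggested above might look something like the sketch below. The class name Unsigned and its method names are purely illustrative; only the masking technique comes from the answers.

```java
import java.math.BigInteger;

/** Hypothetical utility class collecting the mask-based conversions in one place. */
final class Unsigned {
    private static final BigInteger TWO_POW_64 = BigInteger.ONE.shiftLeft(64);

    private Unsigned() {}

    // Each mask keeps the original bits and clears the sign-extended ones.
    static int toInt(byte value) {
        return value & 0xFF;
    }

    static int toInt(short value) {
        return value & 0xFFFF;
    }

    static long toLong(int value) {
        return value & 0xFFFFFFFFL;   // the L suffix matters: it makes the mask a long
    }

    static BigInteger toBigInteger(long value) {
        // A negative long stands for value + 2^64 when read as unsigned.
        BigInteger signed = BigInteger.valueOf(value);
        return value >= 0 ? signed : signed.add(TWO_POW_64);
    }
}
```

For comparison, a plain cast such as (long) signedInt sign-extends, so an int holding 0xFFFFFFFF becomes -1L, whereas Unsigned.toLong yields 4294967295L.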

Can you index an array with a long int?

Is it somehow possible to use a long int to index an array? Or is this not allowed?
What I mean is shown in the code below.
long x = 20;
char[] array = new char[x];
or
long x = 5;
char res;
res = array[x];
If you look at section 10.4 of the Java Language Specification:
Arrays must be indexed by int values; short, byte, or char values may
also be used as index values because they are subjected to unary
numeric promotion (§5.6.1) and become int values.
An attempt to access an array component with a long index value
results in a compile-time error.
The error you would get would look something like this:
test.java:12: possible loss of precision
found : long
required: int
System.out.println(array[index]);
^
1 error
If for some reason you have an index stored in a long, just cast it to an int and then index your array. An array in Java can never be so large that an int cannot index it, so long indices are not needed here.
No, it's not possible. JLS 15.10 states that each dimension expression in an array creation expression must be promoted to an int:
Each dimension expression undergoes unary numeric promotion (§5.6.1). The promoted type must be int, or a compile-time error occurs.
The same applies to array access expressions (JLS 15.13):
The index expression undergoes unary numeric promotion (§5.6.1). The promoted type must be int, or a compile-time error occurs.
If you want to use a long, you'll have to cast it to int first:
char[] array = new char[(int) x];
res = array[(int) x];
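One caveat about the cast: (int) x silently drops the upper 32 bits, so a long such as 4294967296L (2^32) becomes 0 and would quietly index the first element. A possible safeguard, sketched here with the illustrative names LongIndex and charAt, is Math.toIntExact (Java 8+), which throws instead of truncating.

```java
/** Hypothetical helper showing a checked long-to-int conversion for array access. */
final class LongIndex {
    private LongIndex() {}

    static char charAt(char[] array, long index) {
        // Math.toIntExact throws ArithmeticException if index does not fit in an int,
        // rather than wrapping around the way a plain (int) cast does.
        return array[Math.toIntExact(index)];
    }
}
```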
Technically, you can build such a structure with the Unsafe class, which lets you allocate as much memory as you want (or rather, as much as you actually have). Note that this is native memory, not heap memory. That brings downsides compared with ordinary arrays: the memory isn't garbage collected (you must deallocate it manually), and there are no bounds checks, so a careless access can cause a segmentation fault and crash the JVM.
See example here, at Big Array section: http://mishadoff.com/blog/java-magic-part-4-sun-dot-misc-dot-unsafe/
There are also rumors that future versions of the JVM and the language will support arrays whose size is a long.

Java - indexing into array with a byte

Is it possible to index a Java array based on a byte?
i.e. something like
array[byte b] = x;
I have a very performance-critical application which reads b (in the code above) from a file, and I don't want the overhead of converting this to an int. What is the best way to achieve this? Is there a performance-decrease as a result of using this method of indexing rather than an int?
With many thanks,
Froskoy.
There's no overhead for "converting this to an int." At the Java bytecode level, all bytes are already ints.
In any event, doing array indexing will automatically upcast to an int anyway. None of these things will improve performance, and many will decrease performance. Just leave your code using an int.
The JVM specification, section 2.11.1:
Note that most instructions in Table 2.2 do not have forms for the integral types byte, char, and short. None have forms for the boolean type. Compilers encode loads of literal values of types byte and short using Java virtual machine instructions that sign-extend those values to values of type int at compile-time or runtime. Loads of literal values of types boolean and char are encoded using instructions that zero-extend the literal to a value of type int at compile-time or runtime. Likewise, loads from arrays of values of type boolean, byte, short, and char are encoded using Java virtual machine instructions that sign-extend or zero-extend the values to values of type int. Thus, most operations on values of actual types boolean, byte, char, and short are correctly performed by instructions operating on values of computational type int.
Since all integer types in Java are signed, you have to mask off the low 8 bits of b anyway if you expect to read values greater than 0x7F from the file:
byte b;                      // assigned from the file
byte[] a = new byte[256];
a[b & 0xFF] = x;             // the mask maps -128..-1 to 128..255
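To illustrate the masking idiom in a complete form, here is a small sketch that counts how often each byte value occurs. The class name ByteHistogram and the method count are made up for this example.

```java
/** Hypothetical example: a 256-entry table indexed by unsigned byte value. */
final class ByteHistogram {
    private ByteHistogram() {}

    static int[] count(byte[] data) {
        int[] counts = new int[256];
        for (byte b : data) {
            // Without the mask, bytes above 0x7F would be negative and
            // the access would throw ArrayIndexOutOfBoundsException.
            counts[b & 0xFF]++;
        }
        return counts;
    }
}
```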
No; array indices are non-negative integers (JLS 10.4), but byte indices will be promoted.
No, there is no performance decrease, because once you have read the byte, it ends up in a CPU register at some point. Those registers always work with whole words, which means that the byte is always "converted" to an int (or a long, if you are on a 64-bit machine).
So, simply read your byte like this:
int b = (in.readByte() & 0xFF);
If your application is that performance critical, you should be optimizing elsewhere.

Why HashMap internal work helper variables are int, which can be byte datatype

HashMap internally has its own static final variables for its working.
static final int DEFAULT_INITIAL_CAPACITY = 16;
Why can't they use the byte data type instead of int, since the value is so small?
They could, but it would be a micro-optimization, and the tradeoff would be less readable and maintainable code (Premature optimization, anyone?).
This is a static final variable, so it's allocated only once per classloader. I'd say we can spare those 3 (I'm guessing here) bytes.
I think this is because the capacity for a Map is expressed in terms of an int. When you try to work with a byte and an int, because of promotion rules, the byte will anyways be converted to an int. The default capacity is expressed in terms of an int to maybe avoid those needless promotions.
Using byte or short for variables and constants instead of int is a premature optimization that has next to no effect.
Most arithmetic and logical instructions of the JVM work only with int, long, float and double, other data types have to be cast to (usually) ints in order for these instructions to be executed on them.
The default type of number literals is int for integral and double for floating point numbers. Using byte, short and float types can thus cause some subtle programming bugs and generally worsens code readability.
A little example from the Java Puzzlers book:
public static void main(String[] args) {
for (byte b = Byte.MIN_VALUE; b < Byte.MAX_VALUE; b++) {
if (b == 0x90)
System.out.print("Joy!");
}
}
This program doesn't print Joy!, because the hex value 0x90 is implicitly promoted to an int with the value 144. Since bytes in Java are signed (which itself is very inconvenient), the variable b is never assigned to this value (Byte.MAX_VALUE = 127) and therefore, the condition is never satisfied.
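A small sketch of the difference, assuming an illustrative class JoyPuzzle: casting the constant to byte (or, alternatively, masking b with 0xFF) makes both sides of the comparison use the same representation.

```java
/** Hypothetical demo of the byte-versus-int comparison from the puzzle above. */
final class JoyPuzzle {
    private JoyPuzzle() {}

    /** Counts loop iterations in which the comparison succeeds. */
    static int countMatches(boolean castConstant) {
        int matches = 0;
        for (byte b = Byte.MIN_VALUE; b < Byte.MAX_VALUE; b++) {
            // (byte) 0x90 is -112, which b does reach; the int 0x90 is 144, which it never does.
            boolean hit = castConstant ? b == (byte) 0x90 : b == 0x90;
            if (hit) {
                matches++;
            }
        }
        return matches;
    }
}
```

Here countMatches(false) returns 0 and countMatches(true) returns 1.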
All in all, the reduction in memory footprint is simply too small (next to none) to justify such micro-optimisation. In general, explicitly sized numeric types are neither necessary nor well suited to higher-level programming. I personally think the only case where smaller numeric types are acceptable is byte arrays.
The byte values still take the same space in the JVM, and they still have to be converted to int, explicitly or implicitly, for practical purposes such as array sizes and indexes.
Converting from a byte to an int (as it needs to be an int in any case) would make the code slower if anything. The cost of memory is pretty trivial in the overall scheme of things.
Given the default could be any int value, I think int makes sense.
A lot of data can be represented as a series of Bytes.
int is the default data type that most users reach for when counting or working with whole numbers.
The issue with using byte is that the compiler will not narrow to it implicitly.
Any time you tried
byte variableName = intVariable;
the assignment would not compile without an explicit cast, whereas the widening assignments
int variableName = byteVariable;
double variableName = intVariable;
would work.

Question about sizeof. I want to reverse bits in number

What will be equivalent of this in Java?
for (i = (sizeof(num)*8-1); i; i--)
num is given number, not array. I want to reverse bits in integer.
Java does not have sizeof. Arrays have the length property, and many collections have size() and similar things like that, but a linguistic sizeof for any arbitrary object is both not supported and not needed.
Related questions
Is there any sizeof-like method in Java?
In Java, what is the best way to determine the size of an object?
Getting bits of an integer in LSB-MSB order
To get the bits of an integer from its least significant bit to its most significant bit, you can do something like this:
int num = 0xABCD1234;
System.out.println(Integer.toBinaryString(num));
for (int i = 0; i < Integer.SIZE; i++) {
System.out.print((num >> i) & 1);
}
This prints:
10101011110011010001001000110100 // MSB-LSB order from toBinaryString
00101100010010001011001111010101 // LSB-MSB order from the loop
So in this specific case, the sizeof * 8 translates to Integer.SIZE, "the number of bits used to represent an int value in two's complement binary form". In Java, this is fixed at 32.
JLS 4.2.1 Integral types and values
For int, from -2147483648 to 2147483647, inclusive
This loop is likely iterating over an array in reverse order. In this case, it is an array of 'num' objects, and there are 8 elements in the array (the '-1' is necessary because an array of 8 elements has valid indices 0...7).
To do that in Java, the equivalent would be:
for(int i = array.length-1; i >= 0; --i)
In C/C++, the sizeof operator tells you how many bytes a variable or type takes on the current target platform. That is to say, the size depends on the target platform, and therefore there is a keyword for discovering it.
Java targets a virtual machine, and the size of types is constant.
If num is an int, it is 4 bytes (32-bits). If it is long, it is 8 bytes (64 bits).
Furthermore, you cannot treat a variable as an array of bytes. You have to use bitwise operators (shifts, and, or etc) to manipulate the bits in a primitive like an int or long.
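As a sketch of what using bitwise operators looks like in practice, the following splits an int into its four bytes and reassembles it. The class name IntBytes, the method names, and the big-endian byte order are assumptions made for this example.

```java
/** Hypothetical example: taking an int apart and rebuilding it with shifts and masks. */
final class IntBytes {
    private IntBytes() {}

    /** Returns the four bytes of value, most significant byte first. */
    static byte[] toBytes(int value) {
        return new byte[] {
            (byte) (value >>> 24),
            (byte) (value >>> 16),
            (byte) (value >>> 8),
            (byte) value
        };
    }

    /** Inverse of toBytes; expects exactly four bytes, most significant first. */
    static int fromBytes(byte[] bytes) {
        // Each byte is masked so its sign extension does not leak into higher bits.
        return ((bytes[0] & 0xFF) << 24)
             | ((bytes[1] & 0xFF) << 16)
             | ((bytes[2] & 0xFF) << 8)
             |  (bytes[3] & 0xFF);
    }
}
```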
There isn't a direct equivalent. The sizeof returns the size of a type or the type of the expression in bytes, and this information is not available in Java.
It's not required, as the sizes in bytes of the built-in types are fixed, lengths of arrays are obtained using the .length pseudo-field, and memory for objects is allocated using new, so the object size is not required.
If you tell us what the type of num is, then it can be translated.
In addition to polygenelubricants' answer, there's another way to reverse the bits of an integer in Java:
int reversed = Integer.reverse(input);
Easy!
It's worth checking out the source code for Integer.reverse; it's rather nifty (and extremely scary).
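For comparison, a straightforward loop-based reversal (much simpler than the JDK's implementation, and likely slower) could look like this sketch; the class name BitReverse and the method reverseBits are illustrative.

```java
/** Hypothetical loop-based bit reversal, for comparison with Integer.reverse. */
final class BitReverse {
    private BitReverse() {}

    static int reverseBits(int num) {
        int result = 0;
        for (int i = 0; i < Integer.SIZE; i++) {
            // Shift the result left and append bit i of num, so the
            // least significant bit of num ends up as the most significant.
            result = (result << 1) | ((num >>> i) & 1);
        }
        return result;
    }
}
```

For the value used earlier, reverseBits(0xABCD1234) gives 0x2C48B3D5, the same result as Integer.reverse(0xABCD1234).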
