Regarding my previous question, "Why do == comparisons with Integer.valueOf(String) give different results for 127 and 128?", we know that the Integer class has a cache which stores values between -128 and 127.
Just wondering, why between -128 and 127?
The Integer.valueOf() documentation states that it is "caching frequently requested values". But are values between -128 and 127 really that frequently requested? I thought "frequently requested values" would be very subjective.
Is there any possible reason behind this?
The documentation also states: "..and may cache other values outside of this range."
How can this be achieved?
Just wondering, why between -128 and 127?
A larger range of integers may be cached, but at least those between -128 and 127 must be cached because it is mandated by the Java Language Specification (emphasis mine):
If the value p being boxed is true, false, a byte, or a char in the range \u0000 to \u007f, or an int or short number between -128 and 127 (inclusive), then let r1 and r2 be the results of any two boxing conversions of p. It is always the case that r1 == r2.
The rationale for this requirement is explained in the same paragraph:
Ideally, boxing a given primitive value p, would always yield an identical reference. In practice, this may not be feasible using existing implementation techniques. The rules above are a pragmatic compromise. The final clause above requires that certain common values always be boxed into indistinguishable objects. [...]
This ensures that in most common cases, the behavior will be the desired one, without imposing an undue performance penalty, especially on small devices. Less memory-limited implementations might, for example, cache all char and short values, as well as int and long values in the range of -32K to +32K.
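A minimal sketch of the mandated behavior (the false result on the second comparison assumes a JVM running with the default cache range):

Integer a = 127, b = 127;   // boxed via Integer.valueOf, hits the cache
System.out.println(a == b); // true: both refer to the same cached instance

Integer c = 128, d = 128;   // outside the mandated range
System.out.println(c == d); // false with default settings: two distinct objects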
How can I cache other values outside of this range?
You can use the -XX:AutoBoxCacheMax JVM option, which is not really documented in the list of available HotSpot JVM options. However, it is mentioned in the comments inside the Integer class, around line 590:
The size of the cache may be controlled by the -XX:AutoBoxCacheMax=<size> option.
Note that this is implementation specific and may or may not be available on other JVMs.
-128 to 127 is the default range. But the javadoc also says that the size of the Integer cache may be controlled by the -XX:AutoBoxCacheMax=<size> option. Note that it sets only the high value; the low value is always -128. This feature was introduced in Java 1.6.
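For example, the (hypothetical) Test class below prints false under default settings but true when run with java -XX:AutoBoxCacheMax=1000, because 500 then falls inside the enlarged cache (HotSpot only; other JVMs may ignore or reject the flag):

public class Test {
    public static void main(String[] args) {
        Integer a = 500, b = 500;   // autoboxed via Integer.valueOf
        System.out.println(a == b); // false by default, true with -XX:AutoBoxCacheMax=1000
    }
}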
As for why -128 to 127 - this is byte value range and it is natural to use it for a very small cache.
The reason for caching small integers, if that's what you're asking, is that many algorithms use small integers in their calculations, so avoiding the object-creation overhead for these values tends to be worthwhile.
The question then becomes which Integers to cache. Again, speaking in general, the frequency with which constant values are used tends to decrease as the absolute value of the constant increases: everyone spends a lot of time using the values 1 or 2 or 10; relatively few use the value 109 very intensively; and fewer still will have performance depend on how quickly one can obtain an Integer for 722. Java chose to allocate 256 slots spanning the range of a signed byte value. This decision may have been informed by analyzing programs in existence at the time, but is just as likely to have been a purely arbitrary one. It's a reasonable amount of space to invest, it can be accessed rapidly (mask to find out whether the value is in the cache's range, then a quick table lookup to access the cache), and it will definitely cover the most common cases.
In other words, I think the answer to your question is "it isn't as subjective as you thought, but the exact bounds are largely a rule-of-thumb decision ... and experimental evidence has been that it was good enough."
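To make the "mask then table lookup" idea concrete, here is a hedged sketch of what such a cache amounts to; the names are hypothetical, and the real JDK source appears in the next answer:

static final int LOW = -128, HIGH = 127;
static final Integer[] CACHE = new Integer[HIGH - LOW + 1];
static {
    for (int k = 0; k < CACHE.length; k++)
        CACHE[k] = new Integer(LOW + k); // pre-populate eagerly
}

static Integer boxOf(int i) {
    if (i >= LOW && i <= HIGH)   // cheap range check...
        return CACHE[i - LOW];   // ...then a direct array lookup
    return new Integer(i);       // everything else allocates
}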
The maximum high integer value to be cached can be configured via the system property java.lang.Integer.IntegerCache.high (set by -XX:AutoBoxCacheMax). The cache is implemented using an array.
private static class IntegerCache {
    static final int high;
    static final Integer cache[];

    static {
        final int low = -128;

        // high value may be configured by property
        int h = 127;
        if (integerCacheHighPropValue != null) {
            // Use Long.decode here to avoid invoking methods that
            // require Integer's autoboxing cache to be initialized
            int i = Long.decode(integerCacheHighPropValue).intValue();
            i = Math.max(i, 127);
            // Maximum array size is Integer.MAX_VALUE
            h = Math.min(i, Integer.MAX_VALUE - -low);
        }
        high = h;

        cache = new Integer[(high - low) + 1];
        int j = low;
        for(int k = 0; k < cache.length; k++)
            cache[k] = new Integer(j++);
    }

    private IntegerCache() {}
}
Since Integer values are only guaranteed to be boxed to cached objects within the range -128 to 127, it is always better to convert the Integer object into an int value before comparing, as below.
<Your Integer Object>.intValue()
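For instance (a short sketch; unboxing to int sidesteps reference identity entirely):

Integer x = 1000, y = 1000;
System.out.println(x == y);                       // may be false: compares references
System.out.println(x.intValue() == y.intValue()); // true: compares primitive values
System.out.println(x.equals(y));                  // true: value comparison as well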
Take a look at the following code:
Long minima = -9223372036854775808L;
Long anotherminima = -9223372036854775808L;
System.out.println(minima==anotherminima); //evaluates to false
System.out.println(Long.MIN_VALUE);
Long another= 1L;
Long one = 1L;
System.out.println(another == one); //evaluates to true
I am not able to understand this behavior. I was expecting the first evaluation to be true as well.
There is a caching technique used by the JVM for a certain range of autoboxed values, as specified in the spec (http://docs.oracle.com/javase/specs/jls/se7/html/jls-5.html#jls-5.1.7):
If the value p being boxed is true, false, a byte, or a char in the range \u0000 to \u007f, or an int or short number between -128 and 127 (inclusive), then let r1 and r2 be the results of any two boxing conversions of p. It is always the case that r1 == r2.
Ideally, boxing a given primitive value p, would always yield an identical reference. In practice, this may not be feasible using existing implementation techniques. The rules above are a pragmatic compromise. The final clause above requires that certain common values always be boxed into indistinguishable objects. The implementation may cache these, lazily or eagerly. For other values, this formulation disallows any assumptions about the identity of the boxed values on the programmer's part. This would allow (but not require) sharing of some or all of these references.
This ensures that in most common cases, the behavior will be the desired one, without imposing an undue performance penalty, especially on small devices. Less memory-limited implementations might, for example, cache all char and short values, as well as int and long values in the range of -32K to +32K.
So 1L is among these cached values, and autoboxing then gives the exact same reference; but a number outside of this range (like Long.MIN_VALUE) isn't cached, and thus a different instance/reference is given.
In any case, when comparing objects you should always use .equals()
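Applied to the code from the question, a brief illustration:

Long minima = -9223372036854775808L;              // Long.MIN_VALUE, outside the cache
Long anotherminima = -9223372036854775808L;
System.out.println(minima == anotherminima);      // false: two distinct objects
System.out.println(minima.equals(anotherminima)); // true: compares the wrapped values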
Just a guess here, but it could be that 1L is in the constant pool, and thus the reference equality evaluates to true (just like how sometimes, even with Strings, == will evaluate to true), while that other huge number isn't. Not sure how to check which constants are in the pool at initialization.
Edit: Java has a cache of certain constant objects (including the wrapper classes for primitives, and String). Thus, if you write
String st1 = "A";
if "A" is in the constant pool, Java won't create a new String object- it will just create a reference to the already existing one. So if you then did
String st2 = "A";
System.out.println(st1 == st2);
It would print out true.
Now, not all Integer, Long, Short, etc... are cached (there are way too many), but the lower values are. So I would assume that 1L is. That means that in your question, both another and one refer to the same object, and thus it returns true even for reference equality.
First of all, you should use long instead of Long. Secondly, == between Integer, Long, etc. checks for reference equality. You may check 5.1.7 Boxing Conversion. Also, 1L is within the cached range, so the second case returns true.
On a side note, you should use .equals for comparing Long values.
From the Oracle docs:
If the value p being boxed is true, false, a byte, or a char in the range \u0000 to \u007f, or an int or short number between -128 and 127 (inclusive), then let r1 and r2 be the results of any two boxing conversions of p. It is always the case that r1 == r2.
Ideally, boxing a given primitive value p, would always yield an identical reference. In practice, this may not be feasible using existing implementation techniques. The rules above are a pragmatic compromise. The final clause above requires that certain common values always be boxed into indistinguishable objects. [...]
This ensures that in most common cases, the behavior will be the desired one, without imposing an undue performance penalty, especially on small devices. Less memory-limited implementations might, for example, cache all char and short values, as well as int and long values in the range of -32K to +32K.
Adding on to this answer: everything from -128 to 127 is cached as an instance in permanent Java memory, meaning that those numbers (almost) always have reference equality, because new instances are not created for them. Outside of the cache, a new instance is created every time the number is used, so such instances are not referentially equal.
The problem:
(minima==anotherminima)
You are comparing the memory locations of the objects, not their values, thus it returns false.
If you want to compare two Long wrapper objects, you need to call compareTo.
From the JLS:
If the value p being boxed is true, false, a byte, or a char in the range \u0000 to \u007f, or an int or short number between -128 and 127 (inclusive), then let r1 and r2 be the results of any two boxing conversions of p. It is always the case that r1 == r2.
Now if you initialize the Long wrapper class with a value between -128 and 127, the boxing conversion will yield the same reference.
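For example (a small sketch using the question's values):

Long minima = Long.MIN_VALUE;
Long anotherminima = Long.MIN_VALUE;
System.out.println(minima.compareTo(anotherminima) == 0);            // true
System.out.println(minima.longValue() == anotherminima.longValue()); // true: primitive comparison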
You should use long instead of Long. Note that Long is a class and so you must use equals() to compare instances of it. On the other hand, if you use long instead, you can compare with == because these are primitives.
HashMap internally has its own static final variables for its operation.
static final int DEFAULT_INITIAL_CAPACITY = 16;
Why don't they use the byte datatype instead of int, since the value is so small?
They could, but it would be a micro-optimization, and the tradeoff would be less readable and maintainable code (Premature optimization, anyone?).
This is a static final variable, so it's allocated only once per classloader. I'd say we can spare those 3 (I'm guessing here) bytes.
I think this is because the capacity of a Map is expressed in terms of an int. When you try to work with a byte and an int, the promotion rules mean the byte will be converted to an int anyway. Expressing the default capacity as an int perhaps avoids those needless promotions.
Using byte or short for variables and constants instead of int is a premature optimization that has next to no effect.
Most arithmetic and logical instructions of the JVM work only with int, long, float and double, other data types have to be cast to (usually) ints in order for these instructions to be executed on them.
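A small sketch of that: even arithmetic on two bytes is carried out on ints, so the result must be explicitly narrowed back down:

byte a = 1, b = 2;
// byte c = a + b;          // compile error: a + b has type int
byte c = (byte) (a + b);    // explicit narrowing cast required
int d = a + b;              // fine: the operands are promoted to int anyway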
The default type of number literals is int for integral and double for floating point numbers. Using byte, short and float types can thus cause some subtle programming bugs and generally worsens code readability.
A little example from the Java Puzzlers book:
public static void main(String[] args) {
    for (byte b = Byte.MIN_VALUE; b < Byte.MAX_VALUE; b++) {
        if (b == 0x90)
            System.out.print("Joy!");
    }
}
This program doesn't print Joy!, because the hex value 0x90 is an int literal with the value 144, and b is promoted to an int for the comparison. Since bytes in Java are signed (which is itself very inconvenient), the variable b is never assigned this value (Byte.MAX_VALUE = 127), and therefore the condition is never satisfied.
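One possible fix (a sketch) is to cast the literal down to a byte so both sides of the comparison agree; (byte) 0x90 is -112, a value b does reach:

for (byte b = Byte.MIN_VALUE; b < Byte.MAX_VALUE; b++) {
    if (b == (byte) 0x90)       // both sides now compare as -112
        System.out.print("Joy!");
}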
All in all, the reduction of the memory footprint is simply too small (next to none) to justify such micro-optimisation. Generally, explicit numeric types of different sizes are neither necessary nor suitable for higher-level programming. I personally think the only case where smaller numeric types are acceptable is byte arrays.
A byte value still takes the same space in the JVM, and it will also need to be converted to an int for practical purposes, explicitly or implicitly, including array sizes, indexes, etc.
Converting from a byte to an int (as it needs to be an int in any case) would make the code slower if anything. The cost of memory is pretty trivial in the overall scheme of things.
Given the default could be any int value, I think int makes sense.
A lot of data can be represented as a series of bytes.
int is the default data type that most users will use when counting or working with whole numbers.
The issue with using byte is that the compiler will not recognize it for type conversion.
Anytime you tried
int variablename = bytevariable;
it wouldn't complete the assignment; however,
double variablename = intVariable;
would work.
Can anyone confirm if this is true?
Java will turn a long % int into a long value. However, the result can never be greater in magnitude than the modulus, so it is always safe to cast it to an int.
long a = ...;
int b = ...;
int c = (int) (a % b); // cast is always safe.
Similarly a long % short will always be safe to cast to a short.
If true, does anyone know why Java has a longer type for % than needed?
Additionally, there is a similar case for long & int (if you ignore sign extension)
For most (if not all) arithmetic operations, Java will assume you want the maximum defined precision. Imagine if you did this:
long a = ...;
int b = ...;
long c = a % b + Integer.MAX_VALUE;
If Java automatically down-casted a % b to an int, then the above code would cause an int overflow rather than setting c to a perfectly reasonable long value.
This is the same reason that performing operations with a double and an int will produce a double. It's much safer to up-cast the least-accurate value to a more accurate one. Then if the programmer knows more than the compiler and wants to down-cast, he can do it explicitly.
Update
Also, after thinking more about this, I'm guessing most CPU architectures don't have operations that combine 32-bit and 64-bit values. So the 32-bit value would need to be promoted to a 64-bit value just to use it as an argument to the CPU's mod operation, and the result of that operation would be a 64-bit value natively. A down-cast to an int would add an operation, which has performance implications. Combining that fact with the idea that you might actually want to keep a long value for other operations (as I mention above), it really wouldn't make sense to force the result into an int unless the developer explicitly wants it to be one.
It is always safe! (Math agrees with me.)
The magnitude of the result of a mod operation is always less than that of the divisor. Since the result of a mod operation is essentially the remainder after performing integer division, you will never have a remainder larger in magnitude than the divisor.
I suspect the reason for having the operation return a long is that the divisor gets expanded to a long before the operation takes place. This makes a long result possible. (Note that even though the variable is expanded in memory, its value will not change. An expanded int will never be larger than an int can hold.)
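A quick illustration (in Java the result takes the sign of the dividend, but its magnitude is still strictly smaller than the divisor's):

long a = 10_000_000_000L;
int b = 3;
long r = a % b;               // the expression's type is long...
int c = (int) r;              // ...but |a % b| < |b|, so this cast never truncates
System.out.println(r);        // 1
System.out.println(-10_000_000_000L % 3); // -1: sign follows the dividend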
As Marc B alluded to, Java will promote b to a long before actually doing the % operation. This promotion applies to all the arithmetic operations, even << and >> I believe.
In other words, if you have a binary operation and the two arguments don't have the same type, the smaller one will be promoted so that both sides will have the same type.
This is a late chime-in to the party, but the reason is pretty simple:
The bytecode operands do need explicit casts (L2I), and longs need 2 stack positions compared to 1 for int, char, short, and byte (casting from byte to int doesn't need a bytecode instruction). After the mod operation, the result takes 2 positions on top of the stack.
Edit: Also, I forgot to mention that Java doesn't have division/remainder instructions for 64-bit/32-bit operands. There are only 64->64-bit operations, i.e. LDIV and LREM.
does anyone know why Java has a longer type for % than needed?
I don't know for sure. Maybe to make it work exactly the same way as the other multiplicative operators, * and /. Per the JLS, "The type of a multiplicative expression is the promoted type of its operands." Adding an exception for long % int would be confusing.
Is there a way in Java to use unsigned numbers like in (My)SQL?
For example: I want to use an 8-bit variable (byte) with a range like 0 ... 255, instead of -128 ... 127.
No, Java doesn't have any unsigned primitive types apart from char (which has values 0-65535, effectively). It's a pain (particularly for byte), but that's the way it is.
Usually you either stick with the same size, and overflow into negatives for the "high" numbers, or use the wider type (e.g. short for byte) and cope with the extra memory requirements.
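The usual idiom for reading a byte as if it were unsigned is to mask it into an int (a sketch; since Java 8, Byte.toUnsignedInt does the same thing):

byte b = (byte) 200;          // stored as -56 in a signed byte
int unsigned = b & 0xFF;      // masking recovers 200
System.out.println(b);        // -56
System.out.println(unsigned); // 200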
You can use a class to simulate an unsigned number. For example
public class UInt8 implements Comparable<UInt8>, Serializable
{
    public static final short MAX_VALUE = 255;
    public static final short MIN_VALUE = 0;
    private short storage; // internal storage in a 16-bit short

    public UInt8(short value)
    {
        if (value < MIN_VALUE || value > MAX_VALUE) throw new IllegalArgumentException();
        this.storage = value;
    }

    public byte toByte()
    {
        // narrowing keeps the low 8 bits; values above 127 wrap to negatives
        return (byte) storage;
    }

    // etc...
}
You can mostly use signed numbers as if they were unsigned. Most operations stay the same, some need to be modified. See this post.
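Comparison and division are the main operations that need modification, and Java 8 added helper methods for exactly those cases (a brief sketch):

int a = 0xFFFFFFFE;  // -2 when read as signed, 4294967294 when read as unsigned
int b = 2;
System.out.println(Integer.compareUnsigned(a, b) > 0);  // true: a > b as unsigned
System.out.println(Integer.divideUnsigned(a, b));       // 2147483647
System.out.println(Integer.toUnsignedString(a));        // 4294967294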
Internally, you shouldn't be using the smaller values--just use int. As I understand it, using smaller units does nothing but slow things down. It doesn't save memory because internally Java uses the system's word size for all storage (it won't pack words).
However if you use a smaller size storage unit, it has to mask them or range check or something for every operation.
Ever notice that char (any operation) char yields an int? They just really don't expect you to use these other types.
The exceptions are arrays (which I believe will get packed) and I/O where you might find using a smaller type useful... but masking will work as well.
Nope, you can't change that. If you need something larger than 127 choose something larger than a byte.
If you need to optimize your storage (e.g. a large matrix), you can encode bigger positive numbers as negative numbers, so as to save space. Then you have to shift the number's value to get the actual value when needed. For instance, say I want to manipulate short positive numbers only. Here is how this is possible in Java:
short n = 32767;
n = (short) (n + 10);
System.out.println(n);
int m = (int) (n>=0?n:n+65536);
System.out.println(m);
So when a short integer exceeds its range, it becomes negative. Yet at least you can store this number in 16 bits, and restore its correct value by adding the shift value (the number of distinct values that can be encoded). The value should be restored into a larger type (int in our case). This may not be very convenient, but I find it is in my case.
I'm quite new to Java and to programming.
Yet, I encountered the same situation recently: the need for unsigned values.
It took me around two weeks to code everything I had in mind, but I'm a total noob, so you might spend much less.
The general idea is to create an interface, which I have named UnsignedNumber<Base, Shifted>, and to extend Number.class whilst implementing an abstract AbstractUnsigned<Base, Shifted, Impl extends AbstractUnsigned<Base, Shifted, Impl>> class.
So, the Base type parameter represents the base type, Shifted represents the actual Java type, and Impl is a shortcut for the implementation of this abstract class.
Most of the time was consumed by the boilerplate of Java 8 lambdas, internal private classes, and safety procedures. The important thing was to achieve the unsigned behavior when a mathematical operation like subtraction or a negative addition crosses the zero boundary: to wrap around past the upper unsigned limit.
Finally, it took another couple of days to code factories and implementation sub classes.
So far I have:
UByte and MUByte
UShort and MUShort
UInt and MUInt
... Etc.
They are descendants of AbstractUnsigned:
UByte or MUByte extend AbstractUnsigned<Byte, Short, UByte> or AbstractUnsigned<Byte, Short, MUByte>
UShort or MUShort extend AbstractUnsigned<Short, Integer, UShort> or AbstractUnsigned<Short, Integer, MUShort>
...etc.
The general idea is to store the value in a wider (shifted, cast) type and to encode negative values as if they counted down not from zero, but from the unsigned upper limit.
UPDATE:
(Thanks to Ajeans' kind and polite directions)
/**
 * Adds value to the current number and returns either
 * a new or this {@linkplain UnsignedNumber} instance based on
 * {@linkplain #isImmutable()}
 *
 * @param value value to add to the current value
 * @return new or same instance
 * @see #isImmutable()
 */
public Impl plus(N value) {
    return updater(number.plus(convert(value)));
}
This is an externally accessible method of AbstractUnsigned<N, Shifted, Impl> (or as it was said before AbstractUnsigned<Base, Shifted, Impl>);
Now, to the under-the-hood work:
private Impl updater(Shifted invalidated) {
    if (mutable) {
        number.setShifted(invalidated);
        return caster.apply(this);
    } else {
        return shiftedConstructor.apply(invalidated);
    }
}
In the above private method, mutable is a private final boolean of AbstractUnsigned, and number is an instance of one of the internal private classes, which takes care of transforming Base to Shifted and vice versa.
What matters, in correspondence with the previous part, is two internal objects: caster and shiftedConstructor:
final private Function<UnsignedNumber<N, Shifted>, Impl> caster;
final private Function<Shifted, Impl> shiftedConstructor;
These are the parameterized functions to cast N (or Base) to Shifted, or to create a new Impl instance if the current implementation instance of AbstractUnsigned<> is immutable.
Shifted plus(Shifted value) {
    return spawnBelowZero.apply(summing.apply(shifted, value));
}
This fragment shows the adding method of the number object. The idea was to always use Shifted internally, because it is uncertain when the positive limit of the 'original' type will be crossed. shifted is an internal parameterized field which holds the value of the whole AbstractUnsigned<>. The other two Function<> derivative objects are given below:
final private BinaryOperator<Shifted> summing;
final private UnaryOperator<Shifted> spawnBelowZero;
The former performs addition of two Shifted values, and the latter performs the below-zero wrap-around transposition.
And now an example from one of the factory boilerplate 'hells' for AbstractUnsigned<Byte, Short>, specifically for the previously mentioned spawnBelowZero UnaryOperator<Shifted>:
...,
v-> v >= 0
? v
: (short) (Math.abs(Byte.MIN_VALUE) + Byte.MAX_VALUE + 2 + v),
...
If Shifted v is positive, nothing really happens and the original value is returned. Otherwise, there is a need to calculate the upper limit of the Base type (which is Byte) and add the negative v to it. If, let's say, v == -8, then Math.abs(Byte.MIN_VALUE) produces 128 and Byte.MAX_VALUE produces 127, which gives 255; +1 recovers the original upper limit that was cut off by the sign bit, and the so desirable 256 is in place. But the very first negative value actually maps to that 256, hence +1 again, or +2 in total. Finally, 255 + 2 + v, with v == -8, gives 249.
Or in a more visual way:
  0   1   2   3 ... 245 246 247 248 249 250 251 252 253 254 255 256
                                     -8  -7  -6  -5  -4  -3  -2  -1
And to finalize all that: this definitely does not ease your work or save memory bytes, but you get pretty much the desired behavior when it is needed. And you can use that behavior with pretty much any other Number.class subclass. AbstractUnsigned, being a subclass of Number.class itself, provides all the convenience methods and constants similar to the other 'native' Number.class subclasses, including MIN_VALUE and MAX_VALUE and a lot more; for example, I coded a convenience method for mutable subclasses called makeDivisibleBy(Number n), which performs the simplest operation of value - (value % n).
My initial endeavour here was to show that even a noob, such as I am, can code it. My initial endeavour when I was coding that class was to get a conveniently versatile tool for working with constants.