I heard that the BigDecimal.valueOf() method is better than calling new BigDecimal() because it caches common values. I wanted to know how the caching mechanism of valueOf works.
Looking at the JDK 1.8 sources, it looks like it's just a static array which is initialized as part of class initialization - it only caches the values 0 to 10 inclusive, but that's an implementation detail. For example, given dasblinkenlight's post, it looks like earlier versions only cached 0 and 1.
For more detail - and to make sure you're getting information about the JDK which you're actually running - look at the source of the JDK you're using for yourself - most IDEs will open the relevant source code automatically, if they detect the source archive has been included in your JDK installation. Of course, if you're using a different JRE at execution time, you'd need to validate that too.
It's easy to tell whether or not a value has been cached, based on reference equality. Here's a short but complete program which finds the first non-negative value which isn't cached:
import java.math.BigDecimal;
public class Test {
public static void main(String[] args) {
for (long x = 0; x < Long.MAX_VALUE; x++) {
if (BigDecimal.valueOf(x) != BigDecimal.valueOf(x)) {
System.out.println("Value for " + x + " wasn't cached");
break;
}
}
}
}
On my machine with Java 8, the output is:
Value for 11 wasn't cached
Of course, an implementation could always cache the most recently requested value, in which case the above code would run for a very long time and then finish with no output...
If I m not wrong it caches only zero,one, two and ten, Thats why we have only
public static final BigInteger ZERO = new BigInteger(new int[0], 0);
public static final BigInteger ONE = valueOf(1);
private static final BigInteger TWO = valueOf(2);
public static final BigInteger TEN = valueOf(10);
That also calls valueOf(x) method.
Related
Java's assert mechanism allows disabling putting in assertions which have essentially no run time cost (aside from a bigger class file) if assertions are disabled. But this may cover all situations.
For instance, many of Java's collections feature "fail-fast" iterators that attempt to detect when you're using them in a thread-unsafe way. But this requires both the collection and the iterator itself to maintain extra state that would not be needed if these checks weren't there.
Suppose someone wanted to do something similar, but allow the checks to be disabled and if they are disabled, it saves a few bytes in the iterator and likewise a few more bytes in the ArrayList, or whatever.
Alternatively, suppose we're doing some sort of object pooling that we want to be able to turn on and off at runtime; when it's off, it should just use Java's garbage collection and take no room for reference counts, like this (note that the code as written is very broken):
class MyClass {
static final boolean useRefCounts = my.global.Utils.useRefCounts();
static {
if(useRefCounts)
int refCount; // want instance field, not local variable
}
void incrementRefCount(){
if(useRefCounts) refCount++; // only use field if it exists;
}
/**return true if ready to be collected and reused*/
boolean decrementAndTestRefCount(){
// rely on Java's garbage collector if ref counting is disabled.
return useRefCounts && --refCount == 0;
}
}
The trouble with the above code is that the static bock makes no sense. But is there some trick using low-powered magic to make something along these lines work? (If high powered magic is allowed, the nuclear option is generate two versions of MyClass and arrange to put the correct one on the class path at start time.)
NOTE: You might not need to do this at all. The JIT is very good at inlining constants known at runtime especially boolean and optimising away the code which isn't used.
The int field is not ideal, however, if you are using a 64 bit JVM, the object size might not change.
On the OpenJDK/Oracle JVM (64-bit), the header is 12 bytes by default. The object alignment is 8 byte so the object will use 16 bytes. The field, adds 4 bytes, which after alignment is also 16 bytes.
To answer the question, you need two classes (unless you use generated code or hacks)
class MyClass {
static final boolean useRefCounts = my.global.Utils.useRefCounts();
public static MyClass create() {
return useRefCounts ? new MyClassPlus() : new MyClass();
}
void incrementRefCount() {
}
boolean decrementAndTestRefCount() {
return false;
}
}
class MyClassPlus extends MyClass {
int refCount; // want instance field, not local variable
void incrementRefCount() {
refCount++; // only use field if it exists;
}
boolean decrementAndTestRefCount() {
return --refCount == 0;
}
}
If you accept a slightly higher overhead in the case you’re using your ref count, you may resort to external storage, i.e.
class MyClass {
static final WeakHashMap<MyClass,Integer> REF_COUNTS
= my.global.Utils.useRefCounts()? new WeakHashMap<>(): null;
void incrementRefCount() {
if(REF_COUNTS != null) REF_COUNTS.merge(this, 1, Integer::sum);
}
/**return true if ready to be collected and reused*/
boolean decrementAndTestRefCount() {
return REF_COUNTS != null
&& REF_COUNTS.compute(this, (me, i) -> --i == 0? null: i) == null;
}
}
There is a behavioral difference for the case that someone invokes decrementAndTestRefCount() more often than incrementRefCount(). While your original code silently runs into a negative ref count, this code will throw a NullPointerException. I prefer failing with an exception in this case…
The code above will leave you with the overhead of a single static field in case you’re not using the feature. Most JVMs should have no problems eliminating the conditionals regarding the state of a static final variable.
Note further that the code allows MyClass instances to get garbage collected while having a non-zero ref count, just like when it was an instance field, but also actively removes the mapping when the count reaches the initial state of zero again, to minimize the work needed for cleanup.
I wrote a minimal somewhat-lazy (int) sequence class, GarbageTest.java, as an experiment, to see if I could process very long, lazy sequences in Java, the way I can in Clojure.
Given a naturals() method that returns the lazy, infinite, sequence of natural numbers; a drop(n,sequence) method that drops the first n elements of sequence and returns the rest of the sequence; and an nth(n,sequence) method that returns simply: drop(n, lazySeq).head(), I wrote two tests:
static int N = (int)1e6;
// succeeds # N = (int)1e8 with java -Xmx10m
#Test
public void dropTest() {
assertThat( drop(N, naturals()).head(), is(N+1));
}
// fails with OutOfMemoryError # N = (int)1e6 with java -Xmx10m
#Test
public void nthTest() {
assertThat( nth(N, naturals()), is(N+1));
}
Note that the body of dropTest() was generated by copying the body of nthTest() and then invoking IntelliJ's "inline" refactoring on the nth(N, naturals()) call. So it seems to me that the behavior of dropTest() should be identical to the behavior of nthTest().
But it isn't identical! dropTest() runs to completion with N up to 1e8 whereas nthTest() fails with OutOfMemoryError for N as small as 1e6.
I've avoided inner classes. And I've experimented with a variant of my code, ClearingArgsGarbageTest.java, that nulls method parameters before calling other methods. I've applied the YourKit profiler. I've looked at the byte code. I just cannot find the leak that causes nthTest() to fail.
Where's the "leak"? And why does nthTest() have the leak while dropTest() does not?
Here's the rest of the code from GarbageTest.java in case you don't want to click through to the Github project:
/**
* a not-perfectly-lazy lazy sequence of ints. see LazierGarbageTest for a lazier one
*/
static class LazyishSeq {
final int head;
volatile Supplier<LazyishSeq> tailThunk;
LazyishSeq tailValue;
LazyishSeq(final int head, final Supplier<LazyishSeq> tailThunk) {
this.head = head;
this.tailThunk = tailThunk;
tailValue = null;
}
int head() {
return head;
}
LazyishSeq tail() {
if (null != tailThunk)
synchronized(this) {
if (null != tailThunk) {
tailValue = tailThunk.get();
tailThunk = null;
}
}
return tailValue;
}
}
static class Incrementing implements Supplier<LazyishSeq> {
final int seed;
private Incrementing(final int seed) { this.seed = seed;}
public static LazyishSeq createSequence(final int n) {
return new LazyishSeq( n, new Incrementing(n+1));
}
#Override
public LazyishSeq get() {
return createSequence(seed);
}
}
static LazyishSeq naturals() {
return Incrementing.createSequence(1);
}
static LazyishSeq drop(
final int n,
final LazyishSeq lazySeqArg) {
LazyishSeq lazySeq = lazySeqArg;
for( int i = n; i > 0 && null != lazySeq; i -= 1) {
lazySeq = lazySeq.tail();
}
return lazySeq;
}
static int nth(final int n, final LazyishSeq lazySeq) {
return drop(n, lazySeq).head();
}
In your method
static int nth(final int n, final LazyishSeq lazySeq) {
return drop(n, lazySeq).head();
}
the parameter variable lazySeq hold a reference to the first element of your sequence during the entire drop operation. This prevents the entire sequence from getting garbage collected.
In contrast, with
public void dropTest() {
assertThat( drop(N, naturals()).head(), is(N+1));
}
the first element of your sequence is returned by naturals() and directly passed to the invocation of drop, thus removed from the operand stack and does not exist during the execution of drop.
Your attempt to set the parameter variable to null, i.e.
static int nth(final int n, /*final*/ LazyishSeq lazySeqArg) {
final LazyishSeq lazySeqLocal = lazySeqArg;
lazySeqArg = null;
return drop(n,lazySeqLocal).head();
}
does not help, as now, the lazySeqArg variable is null, but the lazySeqLocal holds a reference to the first element.
A local variable does not prevent garbage collection in general, the collection of otherwise unused objects is permitted, but that doesn’t imply that a particular implementation is capable of doing it.
In case of the HotSpot JVM, only optimized code will get rid of such unused references. But here, nth is not a hot spot, as the heavy things happen within drop method.
This is the reason why the same issue does not appear at the drop method, despite it also holds a reference to the first element in its parameter variable. The drop method contains the loop doing the actual work, hence, is very likely to get optimized by the JVM, which may cause it to eliminate unused variables, allowing the already processed part of the sequence to become collected.
There are many factors which may affect the JVM’s optimizations. Besides the different shape of the code, it seems that that rapid memory allocations during the unoptimized phase may also reduce the optimizer’s improvements. Indeed, when I run with -Xcompile, to forbid interpreted execution altogether, both variants run successfully, even int N = (int)1e9 is no problem anymore. Of course, forcing compilation raises the startup time.
I have to admit that I do not understand why the mixed mode performs that much worse and I’ll investigate further. But generally, you have to be aware that the efficiency of the garbage collector is implementation dependent, so objects collected in one environment may stay in memory in another.
Clojure implements a strategy for dealing with this sort of scenario which it calls "locals clearing". There's support for it in the compiler that makes it kick in automatically where required in pure Clojure code (unless disabled at compilation time – this is sometimes useful for debugging). Clojure does also clear locals in various places in its Java runtime, however, and the way it does that could be used in Java libraries and possibly even application code, though it would undoubtedly be somewhat cumbersome.
Before I get into what Clojure does, here's a short summary of what is going on in this example:
nth(int, LazyishSeq) is implemented in terms of drop(int, LazyishSeq) and LazyishSeq.head().
nth passes both its arguments to drop and has no further use for them.
drop can easily be implemented so as to avoid holding on to the head of the passed-in sequence.
Here nth still holds on to the head of its sequence argument. The runtime may potentially discard that reference, but it is not guaranteed that it will.
The way Clojure deals with this is by clearing the reference to the sequence explicitly before control is handed off to drop. This is done using a rather elegant trick (link to the below snippet on GitHub as of Clojure 1.9.0):
// clojure/src/jvm/clojure/lang/Util.java
/**
* Copyright (c) Rich Hickey. All rights reserved.
* The use and distribution terms for this software are covered by the
* Eclipse Public License 1.0 (http://opensource.org/licenses/eclipse-1.0.php)
* which can be found in the file epl-v10.html at the root of this distribution.
* By using this software in any fashion, you are agreeing to be bound by
* the terms of this license.
* You must not remove this notice, or any other, from this software.
**/
// … beginning of the file omitted …
// the next line is the 190th in the file as of Clojure 1.9.0
static public Object ret1(Object ret, Object nil){
return ret;
}
static public ISeq ret1(ISeq ret, Object nil){
return ret;
}
// …
Given the above, the call to drop inside nth can be changed to
drop(n, ret1(lazySeq, lazySeq = null))
Here lazySeq = null is evaluated as an expression before control is transferred to ret1; the value is null and there is also the side effect of setting the lazySeq reference to null. The first argument to ret1 will have been evaluated by this point, however, so ret1 receives the reference to the sequence in its first argument and returns it as expected, and that value is then passed to drop.
Thus drop receives the original value held by the lazySeq local, but the local itself is cleared before control is transferred to drop.
Consequently nth no longer holds on to the head of the sequence.
I want to parse a String, which contains a number, using JDT to find out whether the contained number is inside the valid Range of one of the Primitive Types.
Let's say i got a float value like this as String "1.7976931348623157e350" and want to see whether it is still inside the allowed range for primitive type 'double'. (In this case it would not be inside the valid range, because the maximum exponent of double is 308).
I don't want to use the standard methods like : Double.parseDouble("1.7976931348623157e350"), because I'm afraid it might be too slow if I have a big amount of primitive types, which I want to check .
If you know the Eclipse development environment you will know that inside a normal java file, eclipse is able to tell whether a variable is out of range or not, by underlining it red, in the the case of 'out of range'. So basically i want to use this functionality. But as you can guess - it's easier said then done!
I have started experimenting with the ASTParser from this library: org.eclipse.jdt.core.dom
But I must admit I was not very successful here.
First i tried calling some of those vistor methods using methods like:
resolveBinding() , but they always only returned me "Null".
I have found some interesting class called ASTSyntaxErrorPropagator , but i'm not sure how this is used correctly. It seems to propagate parsing problems or something like that and gets it's information delivered by some thing class called CodeSnippetParsingUtil I assume. Anyways, these are only speculations.
Does anyone know how to use this ASTParser correctly?
I would be really thankful for some advice.
Here is some basic code-snipped which I tried to debug:
public class DatatypesParser {
public static void main(String[] args) {
ASTParser parser = ASTParser.newParser(AST.JLS4);
Map options = JavaCore.getOptions();
JavaCore.setComplianceOptions(JavaCore.VERSION_1_7, options);
String statement = new String("int i = " + Long.MAX_VALUE + ";");
parser.setSource(statement.toCharArray());
parser.setKind(ASTParser.K_STATEMENTS);
parser.setResolveBindings(true);
parser.setBindingsRecovery(true);
ASTNode ast = parser.createAST(null);
ast.accept(new ASTVisitor() {
#Override
public boolean visit(VariableDeclarationStatement node) {
CodeSnippetParsingUtil util = new CodeSnippetParsingUtil();
return true;
}
});
}
I don't want to use the standard methods like :
Double.parseDouble("1.7976931348623157e350"), because i'm afraid it
might be too slow if i have a big amount of primitive types, which i
want to check .
Under the hood JDT is actually using the standard methods of Double to parse the value, and quite a bit more - so you should always use the standard methods if performance is a concern.
Here is how the double gets parsed by JDT.
From org.eclipse.jdt.internal.compiler.ast.DoubleLiteral:
public void computeConstant() {
Double computedValue;
[...]
try {
computedValue = Double.valueOf(String.valueOf(this.source));
} catch (NumberFormatException e) {
[...]
return;
}
final double doubleValue = computedValue.doubleValue();
if (doubleValue > Double.MAX_VALUE) {
// error: the number is too large to represent
return;
}
[...]
}
I have been asked to use Google's Caliper project to create a few microbenchmarks. I would very much like to use the annotation features of the newest beta snapshot, but aside from a few small examples I am having trouble finding good documentation on how to actually run the thing... There is a video tutorial up which instructs users on the new maven integration feature, which I was also asked NOT to use.
Right now I just have a small example stripped from one of theirs, modified with some other information I gleaned from another SO question:
public class Benchmarks {
public class Test {
#Param int size; // set automatically by framework
private int[] array; // set by us, in setUp()
#BeforeExperiment void setUp() {
// #Param values are guaranteed to have been injected by now
array = new int[size];
}
#Benchmark int timeArrayIteration(int reps) {
int dummy = 0;
for (int i = 0; i < reps; i++) {
for (int doNotIgnoreMe : array) {
dummy += doNotIgnoreMe;
}
}
return dummy;
}
}
//(Questionable practice here?)
public static void main (String args[]) {
CaliperMain.main(Test.class, args);
}
}
Running it gives me the message that I did not set a default value for size. I am having trouble tracking down where I should be putting it.
Removing "size" entirely by commenting out the #Param line and giving a hard value to the array declaration in setUp just leads to it deciding that there are "No Experiments to be done," which makes sense, I suppose.
If there are any up-to-date resources or tutorials that could point out what I am doing wrong (probably a whole lot, honestly) I would be very appreciative.
EDIT:
I have updated to this as per some advice:
public class Benchmarks {
#Param({"1", "10", "1000"}) int size; // set automatically by framework
private int[] array; // set by us, in setUp()
#BeforeExperiment void setUp() {
// #Param values are guaranteed to have been injected by now
array = new int[size];
}
#Benchmark int timeArrayIteration(int reps) {
int dummy = 0;
for (int i = 0; i < reps; i++) {
for (int doNotIgnoreMe : array) {
dummy += doNotIgnoreMe;
}
}
return dummy;
}
}
I am running through the beta snapshot and passing in the Benchmarks class as an argument. I receive the following:
Experiment selection:
Instruments: []
User parameters: {size=[1, 10, 1000]}
Virtual machines: [default]
Selection type: Full cartesian product
There were no experiments to be performed for the class Benchmarks using the instruments [allocation, runtime]
It doesn't seem to be detecting any Instruments. I am not passing any in, as it's mentioned in the documentation that it simply uses default allocation, runtime (which is fine for my purposes).
DOUBLE EDIT: Found that problem, stupid mistake. Will do a quick write-up once I confirm it.
Running it gives me the message that I did not set a default value for size.
Parameters are set either from default values:
#Param({"1", "10", "1000"}) int size;
Or by passing values via the the -D flag. E.g.: -Dsize=1,10,1000. Enums and booleans get special treatment in that it uses all possible values without having to list them in the annotation.
Removing "size" entirely by commenting out the #Param line and giving a hard value to the array declaration in setUp just leads to it deciding that there are "No Experiments to be done," which makes sense, I suppose.
The issue is likely that that your benchmark is an inner class and needs a reference to the enclosing class (though this should have been an error). Either make your benchmark class a top-level class (recommended) or make it static.
Also, there is no particular need to include the main method. Invoking CaliperMain with your benchmark class as the first parameter is equivalent.
Running it gives me the message that I did not set a default value for size.
That's very simple:
#Param({"1", "10", "1000"}) int size;
Removing "size" entirely by commenting out the #Param line and giving a hard value to the array declaration in setUp just leads to it deciding that there are "No Experiments to be done," which makes sense, I suppose.
No, it doesn't. Without any params, each benchmark method is to be run exactly once. See the other answer for the solution.
There's a quite some Javadoc, e.g., on #Param. Actually, not so much has changed. Annotations have replaced conventions (now you don't need the time prefix), params stayed the same, setup uses an annotation instead of inheritance.
This is a simplified example. I have this enum declaration as follows:
public enum ELogLevel {
None,
Debug,
Info,
Error
}
I have this code in another class:
if ((CLog._logLevel == ELogLevel.Info) || (CLog._logLevel == ELogLevel.Debug) || (CLog._logLevel == ELogLevel.Error)) {
System.out.println(formatMessage(message));
}
My question is if there is a way to shorten the test. Ideally i would like somethign to the tune of (this is borrowed from Pascal/Delphi):
if (CLog._logLevel in [ELogLevel.Info, ELogLevel.Debug, ELogLevel.Error])
Instead of the long list of comparisons. Is there such a thing in Java, or maybe a way to achieve it? I am using a trivial example, my intention is to find out if there is a pattern so I can do these types of tests with enum value lists of many more elements.
EDIT: It looks like EnumSet is the closest thing to what I want. The Naïve way of implementing it is via something like:
if (EnumSet.of(ELogLevel.Info, ELogLevel.Debug, ELogLevel.Error).contains(CLog._logLevel))
But under benchmarking, this performs two orders of magnitude slower than the long if/then statement, I guess because the EnumSet is being instantiated every time it runs. This is a problem only for code that runs very often, and even then it's a very minor problem, since over 100M iterations we are talking about 7ms vs 450ms on my box; a very minimal amount of time either way.
What I settled on for code that runs very often is to pre-instantiate the EnumSet in a static variable, and use that instance in the loop, which cuts down the runtime back down to a much more palatable 9ms over 100M iterations.
So it looks like we have a winner! Thanks guys for your quick replies.
what you want is an enum set
http://docs.oracle.com/javase/1.5.0/docs/api/java/util/EnumSet.html
put the elements you want to test for in the set, and then use the Set method contains().
import java.util.EnumSet;
public class EnumSetExample
{
enum Level { NONE, DEBUG, INFO, ERROR };
public static void main(String[] args)
{
EnumSet<Level> subset = EnumSet.of(Level.DEBUG, Level.INFO);
for(Level currentLevel : EnumSet.allOf(Level.class))
{
if (subset.contains(currentLevel))
{
System.out.println("we have " + currentLevel.toString());
}
else
{
System.out.println("we don't have " + currentLevel.toString());
}
}
}
}
There's no way to do it concisely in Java. The closest you can come is to dump the values in a set and call contains(). An EnumSet is probably most efficient in your case. You can shorted the set initialization a little using the double brace idiom, though this has the drawback of creating a new inner class each time you use it, and hence increases the memory usage slightly.
In general, logging levels are implemented as integers:
public static int LEVEL_NONE = 0;
public static int LEVEL_DEBUG = 1;
public static int LEVEL_INFO = 2;
public static int LEVEL_ERROR = 3;
and then you can test for severity using simple comparisons:
if (Clog._loglevel >= LEVEL_DEBUG) {
// log
}
You could use a list of required levels, ie:
List<ELogLevel> levels = Lists.newArrayList(ELogLevel.Info,
ELogLevel.Debug, ELogLevel.Error);
if (levels.contains(CLog._logLevel)) {
//
}