Java 7
I'm reading J. Bloch's Effective Java and I'm now at the section about initializing fields lazily. He introduces the so-called double-check idiom as follows:
public static void main(String[] args) {
    Test t = new Test();
    long now = System.nanoTime();
    t.getT();
    long invocationTime = System.nanoTime() - now;
    System.out.println(invocationTime); // Prints 3299 on average
}

private static class Test {
    private volatile Test field;

    public Test getT() {
        Test result = field; // <----- Note the local variable here
        if (result == null) {
            synchronized (this) {
                result = field;
                if (result == null)
                    field = result = new Test();
            }
        }
        return result;
    }
}
DEMO
He gave the following explanation of using the local variable:
What this variable does is to
ensure that field is read only once in the common case where it’s
already initialized.
Now, let's consider the following code:
public static void main(String[] args) {
    Test t = new Test();
    long now = System.nanoTime();
    t.getT();
    long invocationTime = System.nanoTime() - now;
    System.out.println(invocationTime); // Prints 3101 on average
}

private static class Test {
    private volatile Test field;

    public Test getT() {
        if (field == null) {
            synchronized (this) {
                if (field == null)
                    field = new Test();
            }
        }
        return field;
    }
}
DEMO
On my machine, the second lazy-init method is even faster. But on ideone's machine they took approximately 7985 and 10630 respectively, in line with what J. Bloch said. So is it worth using such local variables for optimization? As far as I could figure out, the costs of reading and writing a variable are almost equal.
So we should worry about this only if the method consists mostly of such lightweight operations, right?
It's sort of worth it, because you've already decided that eager loading and synchronization are both too expensive. But really, it almost certainly isn't. The cases where you actually need double-checked locking tend to be in non-locking data structures and the like. Good designs for performant code push that into isolated places, so you're not shooting yourself in the foot every time you touch the code.
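Worth noting: when the field being initialized is static, Bloch's lazy initialization holder class idiom sidesteps the whole question, needing neither volatile nor synchronization. A minimal sketch:

```java
// Lazy initialization holder class idiom: the JVM guarantees that the
// nested Holder class is initialized exactly once, on the first call to
// getInstance(), with no explicit locking or volatile field needed.
public class LazyHolderDemo {
    private LazyHolderDemo() { }

    private static class Holder {
        static final LazyHolderDemo INSTANCE = new LazyHolderDemo();
    }

    public static LazyHolderDemo getInstance() {
        return Holder.INSTANCE;
    }

    public static void main(String[] args) {
        // Both calls observe the same instance.
        System.out.println(getInstance() == getInstance()); // prints "true"
    }
}
```

For instance fields, the double-check idiom above remains the recommended choice; the holder idiom only applies to static state.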
Related
I have a class like this:
class Test
{
    public static Test getInstance()
    {
        return new Test();
    }

    public void firstMethod()
    {
        // do something
    }

    public void secondMethod()
    {
        // do something
    }

    public void thirdMethod()
    {
        // do something
    }
}
In another class, if we call Test.getInstance().methodName() several times with different methods, what happens?
Which of the following will be faster and use less memory?
Test.getInstance().firstMethod()
Test.getInstance().secondMethod()
Test.getInstance().thirdMethod()
or
Test test = Test.getInstance();
test.firstMethod();
test.secondMethod();
test.thirdMethod();
Test.getInstance().firstMethod()
Test.getInstance().secondMethod()
Test.getInstance().thirdMethod()
This will create three different instances of the Test class and call a method on each.
Test test = Test.getInstance();
test.firstMethod();
test.secondMethod();
test.thirdMethod();
Will create only one instance and invoke the three methods on that instance.
So it's completely different behavior to begin with. Obviously, since the first creates three objects, it should take up more heap space.
If you're intending to implement a singleton class, however, both are equivalent.
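A minimal sketch of what such a singleton would look like, so that both calling styles touch the same cached instance (eagerly initialized here for simplicity):

```java
public class Test {
    // Single cached instance; getInstance() allocates nothing per call.
    private static final Test INSTANCE = new Test();

    private Test() { }

    public static Test getInstance() {
        return INSTANCE;
    }

    public void firstMethod()  { /* do something */ }
    public void secondMethod() { /* do something */ }
    public void thirdMethod()  { /* do something */ }

    public static void main(String[] args) {
        // Chained calls and a stored reference now hit the same object.
        System.out.println(Test.getInstance() == Test.getInstance()); // prints "true"
    }
}
```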
Every time you call getInstance the system has to allocate heap storage for a Test object and initialize it.
Furthermore, somewhere down the line the system will have to garbage-collect all those extra Test objects. With a copying collector the overhead per object is minimal, but there is some -- if for no other reason than you're causing GC to occur more often.
class Test
{
    public static Test getInstance()
    {
        return new Test();
    }

    public void firstMethod()
    {
        // do something
    }

    public void secondMethod()
    {
        // do something
    }

    public void thirdMethod()
    {
        // do something
    }
}

public class Blah
{
    public static void main(String[] args)
    {
        int i = 0;
        long start = System.nanoTime();
        Test t = new Test();
        for (; i < 100000; i++)
        {
            t.firstMethod();
        }
        long stop = System.nanoTime();
        System.out.println(stop - start);

        i = 0;
        start = System.nanoTime();
        for (; i < 100000; i++)
        {
            Test.getInstance().firstMethod();
        }
        stop = System.nanoTime();
        System.out.println(stop - start);
    }
}
output:
~3486938
~4894574
Creating a single instance with new Test() proved to be consistently faster, by about 30%.
Memory calculations are harder, because they can't be done in one run. However, if we run only the first loop (changing what's inside) and use:

Runtime runtime = Runtime.getRuntime();
long memory = runtime.totalMemory() - runtime.freeMemory();

just before printing, we can determine the difference across two separate runs: ~671200 or ~1342472 for new Test() (this seemed to vary between runs rather randomly, with no clear influence on runtime), versus ~2389288 for getInstance() (no big differences this time), over 100000 iterations. Again, a clear victory for the single instance.
PerformanceTest1:
public class PerformanceTest1 {
    public static void main(String[] args) {
        boolean i = false;
        if (i == false)
            i = true;
        System.out.println(i);
    }
}
PerformanceTest2:
public class PerformanceTest2 {
    public static void main(String[] args) {
        boolean i = false;
        i = true;
        System.out.println(i);
    }
}
I've been wondering which of these two possibilities gives the best performance. I don't know whether checking if (i == false) (as in PerformanceTest1) on every pass of a while(true) loop would perform worse than just setting i = true on every pass.
Q: So, would PerformanceTest1 or PerformanceTest2 give the best performance? Why?
EDIT:
So, based on the answers, I suppose that the performance of the code below would be the same too?
public class PerformanceTest1 {
    public static void main(String[] args) {
        Point i = null;
        if (i == null)
            i = new Point();
    }
}

public class PerformanceTest2 {
    public static void main(String[] args) {
        Point i = new Point();
    }
}
After a few iterations, the branch predictor will correctly predict the path through the if inside the while, so there will be no difference: the condition will always be false.
The CPU will keep executing on the assumption that the if is not taken, achieving an essentially 100% prediction hit rate. So there are no rollbacks, and the two become basically equivalent.
Just as a side note, there's no need to have i == false, !i is enough.
There is no real difference in terms of performance between the two methods.
The first if-test will only succeed once. The JVM will most likely stop performing the test after a few iterations, since i will always be true from then on. I'm no expert on the JVM and its runtime, but you might even expect the if-test to run only once.
They will both have roughly the same performance, though PerformanceTest1 might cost slightly more, since it performs an extra operation: simply assigning a value to a boolean variable is cheaper than first checking whether it equals false.
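If you want to measure this yourself, a rough harness might look like the sketch below. Plain nanoTime timing like this is only indicative: the JIT may reorder, distort, or eliminate these loops entirely as dead code, so a real measurement should use a harness such as JMH.

```java
public class BranchVsStore {
    static final int ITER = 10_000_000;

    // Returns {guardedNanos, unconditionalNanos} for the two variants.
    static long[] measure() {
        boolean flag = false;

        // Variant 1: guard the write with a check (taken only once).
        long start = System.nanoTime();
        for (int i = 0; i < ITER; i++) {
            if (!flag) flag = true;
        }
        long guarded = System.nanoTime() - start;

        // Variant 2: write unconditionally on every iteration.
        flag = false;
        start = System.nanoTime();
        for (int i = 0; i < ITER; i++) {
            flag = true;
        }
        long unconditional = System.nanoTime() - start;

        return new long[] { guarded, unconditional };
    }

    public static void main(String[] args) {
        long[] t = measure();
        System.out.println("guarded: " + t[0] + " ns, unconditional: " + t[1] + " ns");
    }
}
```

On a warmed-up JVM, expect the two numbers to be close enough that the difference is lost in measurement noise, which matches the answers above.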
You can enable/disable assertions on the ClassLoader.
But how can you determine whether they are already enabled?
(I want to take some code paths that perform expensive checks only if the JVM is invoked with assertions enabled.)
public static boolean areAssertsEnabled() {
    boolean assertsEnabled = false;
    assert assertsEnabled = true; // Intentional side effect!!!
    return assertsEnabled;
}
boolean assertEnabled = false;
try {
    assert false;
} catch (AssertionError e) {
    assertEnabled = true;
}
ManagementFactory.getRuntimeMXBean().getInputArguments().contains("-ea");
The sibling answer is correct. But I question the utility and the generality of this approach. (Jump to “Alternative approach” for another way to deal with this problem.)
The simplest way for assertions to be enabled is if they are enabled for all classes.
-ea
or:
-enableassertions
In that case you can store that fact in one variable and use it throughout your program:
public class Main {
    public static boolean assertsEnabled = false;
    static {
        assert assertsEnabled = true;
    }
    […]
}
But say I have classes
Main, A, B, C
And:
-ea:Main -ea:A
I.e. assertions are only enabled for Main and A. The intent must thus be that assertions inside B and C shouldn’t be run.
Given this:
public class Main {
    public static boolean assertsEnabled = false;
    static {
        assert assertsEnabled = true;
    }

    public static void main(String[] args) {
        System.out.println("Hello from main()");
        m();
        assert A.print();
        A.print2();
        assert B.print();
        B.print2();
        assert C.print();
        C.print2();
    }

    private static void m() {
        if (assertsEnabled) {
            System.out.println("  main() again (using static variable)");
        }
    }
}
It is clear how the print() methods will be handled: they will be run, since -ea:Main. With -da:Main they would not be run.
m() will print the string since we know that assertsEnabled.
The print2() functions look like this:
// C
public static void print2() {
    if (Main.assertsEnabled) {
        System.out.println("  assert inside C (using variable from Main)");
    }
}
Here, it is also clear what will happen: the program will print that string, because of -ea:Main and the way we initialized Main.assertsEnabled. But hold on: assertions are disabled for C (effectively -da:C). So is this really what we intended? Perhaps. Or perhaps we just used the static variable belonging to Main because it was convenient enough, and didn't consider that this snippet in Main:
public static boolean assertsEnabled = false;
static {
    assert assertsEnabled = true;
}

will behave differently from the exact same code copy-pasted into C.
So code that acts differently based on the assertion inclusion of other classes seems potentially confusing. Let’s instead just copy–paste this snippet into every class which uses assertions:
private static boolean assertsEnabled = false;
static {
    assert assertsEnabled = true;
}
And use it like this:
if (assertsEnabled) {
    // Error checking
}
But I think there is a more straightforward approach.
Alternative approach
OP:
I want to take some code paths that perform expensive checks only if the JVM is invoked with assertions enabled.
For any block of code X which should only run when assertions are enabled:

- Make a static method x() with return type boolean.
- Just put return true at the end to satisfy the type checker (it could also be whatever you want to assert, but since you want to check whether assertions are enabled and then run some code, the checks are presumably more involved than a single boolean expression can conveniently express).
- Put X inside the method body.
- Put assert in front of all invocations of x():

assert x();

[…]

private static boolean x() {
    // X
}
For example:
private static boolean x() {
    var setup = new Setup();
    assert ...;
    assert ...;
    [more setup and preparation]
    assert ...;
    return true;
}
Interleaving regular code and assertion code
The “time how long this runs” problem: sometimes you have cross-cutting concerns. In this case, you might want to run some assertion-only code, then the regular code, and then finally the other part of the assertion-only code (which uses the first part).
The Java article on assertions covers how to approach this problem:
Occasionally it is necessary to save some data prior to performing a computation in order to check a postcondition. You can do this with two assert statements and a simple inner class that saves the state of one or more variables so they can be checked (or rechecked) after the computation. […]
Here’s a more simplified and hand-wavy example:
private static void doWork(Work work) {
    ConsistencyCheck cc = null;
    assert ((cc = new ConsistencyCheck(work)) != null);
    doWorkInner(work);
    assert cc.check(work);
}
The only overhead here when running with assertions disabled (if it isn't removed by the JIT as dead code) is initializing a reference to null, which shouldn't be expensive.
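A runnable version of this save-and-check pattern, where ConsistencyCheck, the sorting "work", and the postcondition are all illustrative stand-ins rather than anything from the article:

```java
import java.util.Arrays;

public class PostconditionDemo {
    // Illustrative stand-in: snapshots an array so a postcondition can
    // compare the result against the pre-computation state.
    static final class ConsistencyCheck {
        private final int[] before;
        ConsistencyCheck(int[] work) { before = work.clone(); }

        boolean check(int[] work) {
            // Postcondition: sorting must not change the multiset of elements.
            int[] sortedBefore = before.clone();
            Arrays.sort(sortedBefore);
            return Arrays.equals(sortedBefore, work);
        }
    }

    static void doWork(int[] work) {
        ConsistencyCheck cc = null;
        assert (cc = new ConsistencyCheck(work)) != null; // snapshot only with -ea
        Arrays.sort(work); // the "real" work
        assert cc.check(work);
    }

    public static void main(String[] args) {
        int[] data = { 3, 1, 2 };
        doWork(data);
        System.out.println(Arrays.toString(data)); // prints [1, 2, 3]
    }
}
```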
In the code below, assume that getAndClear() will be called billions of times, i.e. assume that performance matters. It returns an array only on its first call; it must return null on all further calls. (That is, my question is about micro-optimization in some sense, and I'm aware that's bad practice, but you can also read it as a question of which code is nicer or more elegant.)
public class Boo {
    public static int[] anything = new int[] { 2, 3, 4 };
    private static int[] something = new int[] { 5, 6, 7 }; // this may be much bigger as well

    public static final int[] getAndClear() {
        int[] st = something;
        something = null;
        // ... (do something else, useful)
        return st;
    }
}
Is the below code faster? Is it better practice?
public static int[] getAndClear() {
    int[] array = something;
    if (array != null) {
        something = null;
        // ... (do something else, useful)
        return array;
    }
    // ... (do something else, useful)
    return null;
}
A further variant could be this:
public static int[] getAndClear() {
    int[] array = something;
    if (array != null) {
        something = null;
    }
    // ... (do something else, useful)
    return array;
}
I know it probably comes down to hardware architecture and CPU instructions (setting something to 0 vs. checking for 0), and performance-wise it doesn't matter, but I would still like to know which is good practice or higher-quality code. In this case, the question reduces to this:
private static boolean value = true;

public static boolean getTrueOnlyOnFirstCall() {
    boolean b = value;
    value = false;
    return b;
}
If the method is called 100000 times, value will be set to false 99999 times unnecessarily. The other variant (faster? nicer?) would look like this:
public static boolean getTrueOnlyOnFirstCall() {
    boolean b = value;
    if (b) {
        value = false;
        return true;
    }
    return false;
}
Moreover, compile-time and JIT-time optimizations may also play a role here, so this question could be extended with "and what about in C++?". (If my example is not applicable to C++ in this form, feel free to substitute the statics with member fields of a class.)
IMHO, it's not worth doing the micro-optimization. One drawback of optimizing is that it relies heavily on the environment (as you mentioned JIT: the version of the JDK plays a strong role, and what is faster now may be slower in the future).
Code maintainability is (in my opinion) far more important over the long haul. Implement the version which is the clearest. I like the getTrueOnlyOnFirstCall() which contains the if statement, for example.
In all of these examples, though, you would need synchronization around the getters and around the portions which modify the boolean.
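On that synchronization point: java.util.concurrent already provides get-and-clear semantics in a single atomic step via AtomicReference.getAndSet, which removes the need for explicit locking. A minimal sketch (class and field names here are illustrative):

```java
import java.util.concurrent.atomic.AtomicReference;

public class OneShot {
    private static final AtomicReference<int[]> something =
            new AtomicReference<>(new int[] { 5, 6, 7 });

    // Atomically swaps in null: returns the array exactly once, even if
    // many threads race here, and null on every later call.
    public static int[] getAndClear() {
        return something.getAndSet(null);
    }

    public static void main(String[] args) {
        System.out.println(getAndClear() != null); // prints "true"  (first call)
        System.out.println(getAndClear() == null); // prints "true"  (every later call)
    }
}
```

This sidesteps the check-vs-store question entirely: the atomic exchange both reads and clears the field in one operation.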
I hope in a good manner :-)
I wrote this piece of code. What I wanted to do is build something like a cache.
I assumed I had to watch out for different threads, as many calls might hit this class, so I tried the ThreadLocal functionality.
The basic pattern is to have many sets of vectors. Each vector holds something like:

VECTOR.FieldName = "X"
VECTOR.FieldValue = "Y"

So there are many Vector objects in a set, and a different set for different calls from different machines, users, and objects.
private static CacheVector instance = null;
private static SortedSet<SplittingVector> s = null;
private static TreeSet<SplittingVector> t = null;
private static ThreadLocal<SortedSet<SplittingVector>> setOfVectors = new ThreadLocal<SortedSet<SplittingVector>>();

private static class MyComparator implements Comparator<SplittingVector> {
    public int compare(SplittingVector a, SplittingVector b) {
        return 1;
    }
    // No need to override equals.
}

private CacheVector() {
}

public static SortedSet<SplittingVector> getInstance(SplittingVector vector) {
    if (instance == null) {
        instance = new CacheVector();
        t = new TreeSet<SplittingVector>(new MyComparator());
        t.add(vector);
        s = Collections.synchronizedSortedSet(t); // Sort the set of vectors
        CacheVector.assign(s);
    } else {
        t.add(vector);
        s = Collections.synchronizedSortedSet(t); // Sort the set of vectors
        CacheVector.assign(s);
    }
    return CacheVector.setOfVectors.get();
}

public SortedSet<SplittingVector> retrieve() throws Exception {
    SortedSet<SplittingVector> set = setOfVectors.get();
    if (set == null) {
        throw new Exception("SET IS EMPTY");
    }
    return set;
}

private static void assign(SortedSet<SplittingVector> nSet) {
    CacheVector.setOfVectors.set(nSet);
}
So... I have it in the attachment, and I use it like this:

SortedSet<SplittingVector> cache = CacheVector.getInstance(bufferedline);

The nice part: bufferedline is a line from a data file, split on some delimiter. Files can be of any size.
So how does this code look to you? Should I be worried?
I apologise for the size of this message!
Writing correct multi-threaded code is not that easy (for instance, your singleton fails to be thread-safe), so try to rely on existing solutions where possible. If you're searching for a thread-safe cache implementation in Java, check out LinkedHashMap: you can use it to implement an LRU cache, and Collections.synchronizedMap() can make it thread-safe.
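A minimal sketch of that suggestion: a LinkedHashMap in access order with removeEldestEntry overridden, wrapped in Collections.synchronizedMap for thread safety (names and the capacity of 2 are illustrative):

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

public class LruCacheDemo {
    // Access-ordered LinkedHashMap that evicts the least-recently-used
    // entry once the size exceeds maxEntries.
    static <K, V> Map<K, V> newLruCache(final int maxEntries) {
        Map<K, V> lru = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxEntries;
            }
        };
        // Wrap for thread safety, as suggested above.
        return Collections.synchronizedMap(lru);
    }

    public static void main(String[] args) {
        Map<String, String> cache = newLruCache(2);
        cache.put("a", "1");
        cache.put("b", "2");
        cache.get("a");      // touch "a", so "b" becomes the eldest entry
        cache.put("c", "3"); // evicts "b"
        System.out.println(cache.keySet()); // prints [a, c]
    }
}
```

Note that synchronizedMap only makes individual operations atomic; compound check-then-act sequences still need external synchronization on the map.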