I have the below code:
public class Foo {
private volatile Map<String, String> map;
public Foo() {
refresh();
}
public void refresh() {
map = getData();
}
public boolean isPresent(String id) {
return map.containsKey(id);
}
public String getName(String id) {
return map.get(id);
}
private Map<String, String> getData() {
// logic
}
}
Is the above code thread safe or do I need to add synchronized or mutexes in there? If it's not thread safe, please clarify why.
Also, I've read that one should use AtomicReference instead of this, but in the source of the AtomicReference class, I can see that the field used to hold the value is volatile (along with a few convenience methods).
Is there a specific reason to use AtomicReference instead?
I've read multiple answers related to this, but the concept of volatile still confuses me. Thanks in advance.
If you're not modifying the contents of map (except inside of refresh() when creating it), then there are no visibility issues in the code.
It's still possible to do isPresent(), refresh(), getName() (if no outside synchronization is used) and end up with isPresent()==true and getName()==null.
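One way to avoid that particular mismatch is to read the volatile field once and answer both questions from the same map instance. This is only a sketch: the class name SnapshotFoo and the combined method findName are made up for illustration, refresh takes its data as a parameter to keep the example self-contained, and Map.copyOf assumes Java 10 or later.

```java
import java.util.Map;
import java.util.Optional;

class SnapshotFoo {
    // Replaced wholesale by refresh(); never mutated in place.
    private volatile Map<String, String> map = Map.of();

    void refresh(Map<String, String> freshData) {
        map = Map.copyOf(freshData); // immutable copy, published with one volatile write
    }

    // Hypothetical combined lookup: a single volatile read, so the presence
    // check and the returned value come from the same map instance.
    Optional<String> findName(String id) {
        Map<String, String> snapshot = map;
        return Optional.ofNullable(snapshot.get(id));
    }

    public static void main(String[] args) {
        SnapshotFoo foo = new SnapshotFoo();
        foo.refresh(Map.of("X", "Xavier"));
        System.out.println(foo.findName("X").orElse("absent")); // prints "Xavier"
        System.out.println(foo.findName("Y").orElse("absent")); // prints "absent"
    }
}
```

A caller that still invokes isPresent() and getName() separately can be interleaved with refresh(); the single-call form removes that window.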
A class is "thread safe" if it does the right thing when it is used by multiple threads at the same time. There is no way to tell whether a class is thread safe unless you can say what "the right thing" means, and especially, what "the right thing when used by multiple threads" means.
What is the right thing if thread A calls foo.isPresent("X") and it returns true, and then thread B calls foo.refresh(), and then thread A calls foo.getName("X")?
If you are going to claim "thread safety", then you must be very explicit about what the caller should expect in cases like that.
In this scenario, volatile only guarantees that other threads see the new reference as soon as it has been written. By itself it doesn't make the code thread-safe.
But because, as you've stated in your comment, you only update the reference, and because switching a reference is atomic, the given code is thread-safe.
If I understood your question correctly and your comments - your class Foo holds a Map in which only the reference should be updated e.g. a whole new Map added instead of mutating it. If this is the premise:
As far as atomicity is concerned, it does not make any difference whether you declare it as volatile or not. Reads and writes of references are always atomic in Java, so you will never see a half-written reference. See JLS 17.7:
17.7. Non-Atomic Treatment of double and long
For the purposes of the Java programming language memory model, a single write to a non-volatile long or double value is treated as two separate writes: one to each 32-bit half. This can result in a situation where a thread sees the first 32 bits of a 64-bit value from one write, and the second 32 bits from another write.
Writes and reads of volatile long and double values are always atomic.
Writes to and reads of references are always atomic, regardless of whether they are implemented as 32-bit or 64-bit values.
Some implementations may find it convenient to divide a single write action on a 64-bit long or double value into two write actions on adjacent 32-bit values. For efficiency's sake, this behavior is implementation-specific; an implementation of the Java Virtual Machine is free to perform writes to long and double values atomically or in two parts.
Implementations of the Java Virtual Machine are encouraged to avoid splitting 64-bit values where possible. Programmers are encouraged to declare shared 64-bit values as volatile or synchronize their programs correctly to avoid possible complications.
EDIT: Although the top statement still stands as it is, for thread safety it's necessary to add volatile so that the reference update becomes visible to other threads. Without it, a thread may keep working with its own cached copy of the reference; with volatile, the write happens-before every subsequent read, so all threads see the same state of the Map.
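On the AtomicReference part of the question: get() and set() on an AtomicReference have the memory effects of reading and writing a volatile field, so for a plain reference swap the two are interchangeable. What AtomicReference adds are atomic read-modify-write operations such as compareAndSet. A minimal sketch, in which the class RefreshOnce and its method names are made up for illustration (Map.of assumes Java 9 or later):

```java
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

class RefreshOnce {
    private final AtomicReference<Map<String, String>> ref =
            new AtomicReference<>(Map.of());

    Map<String, String> current() {
        return ref.get(); // same visibility guarantee as a volatile read
    }

    // Installs 'next' only if the current map is still 'expected' (compared by
    // reference). A plain volatile field cannot do this check-then-replace
    // as one atomic step.
    boolean replaceIfUnchanged(Map<String, String> expected, Map<String, String> next) {
        return ref.compareAndSet(expected, next);
    }

    public static void main(String[] args) {
        RefreshOnce holder = new RefreshOnce();
        Map<String, String> seen = holder.current();
        System.out.println(holder.replaceIfUnchanged(seen, Map.of("a", "1"))); // prints "true"
        System.out.println(holder.replaceIfUnchanged(seen, Map.of("b", "2"))); // prints "false"
    }
}
```

If refresh() only ever overwrites the reference unconditionally, as in the question, the volatile field is enough.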
Since the Integer class is also immutable, and we know that immutable classes are thread-safe, what is the need for AtomicInteger?
I am confused.
Is the reason that reads and writes of immutable objects need not be atomic, whereas reads and writes of an AtomicInteger are atomic?
That would mean atomic classes are also thread-safe.
AtomicInteger is used in multithreaded environments when you need to make sure that only one thread at a time can update an int variable. The advantage is that no external synchronization is required, since the operations which modify its value are executed in a thread-safe way.
Consider the following code:
private int count;
public int updateCounter() {
return ++count;
}
If multiple threads call the updateCounter method, it's possible that some of them receive the same value. The reason is that the ++count operation isn't atomic: it is not one operation but three, namely read count, add 1 to its value, and write it back. Several calling threads could each read the variable before the others have written their updates.
The above code should be replaced with this:
private AtomicInteger count = new AtomicInteger(0);
public int updateCounter() {
return count.incrementAndGet();
}
The incrementAndGet method is guaranteed to atomically increment the stored value and return the new value without using any external synchronization.
If your value never changes, you don't have to use AtomicInteger; it's enough to use int.
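To see the guarantee under real contention, here is a small sketch that hammers one AtomicInteger from several threads. The class and method names (CounterDemo, countWithAtomic) are invented for this example.

```java
import java.util.concurrent.atomic.AtomicInteger;

class CounterDemo {
    // Starts 'threads' workers that each increment the shared counter
    // 'incrementsPerThread' times, then returns the final count.
    static int countWithAtomic(int threads, int incrementsPerThread) {
        AtomicInteger count = new AtomicInteger(0);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    count.incrementAndGet(); // atomic read-modify-write
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join(); // wait for the worker and make its updates visible
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return count.get();
    }

    public static void main(String[] args) {
        System.out.println(countWithAtomic(4, 100_000)); // prints 400000
    }
}
```

With a plain int and ++count in place of the AtomicInteger, the printed total could come out lower, because concurrent increments can overwrite each other.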
AtomicInteger is thread safe (in fact, all classes from java.util.concurrent.atomic package are thread safe), while normal integers are NOT threadsafe.
You would require the 'synchronized' and 'volatile' keywords when you are using an 'Integer' variable in a multi-threaded environment (to make it thread safe), whereas with atomic integers you don't need them, as atomic integers take care of thread safety.
Also, I would recommend the below helpful tutorial on the same subject:
http://tutorials.jenkov.com/java-concurrency/compare-and-swap.html
Please refer below oracle doc for more information on 'atomic' package:
https://docs.oracle.com/javase/7/docs/api/java/util/concurrent/atomic/package-summary.html
While immutable objects are thread-safe by definition, mutable objects can be thread safe too.
That is precisely the purpose of the Atomic... classes (AtomicInteger, AtomicBoolean, and so on).
The various ...get... and ...set... methods allow thread-safe access and mutation of the object.
Not surprisingly, the class is declared in the java.util.concurrent.atomic package.
You only have to browse the API for the java.util.concurrent.atomic package:
A small toolkit of classes that support lock-free thread-safe programming on single variables.
Consider a variable
int myInt = 3;
AtomicInteger relates to myInt.
Integer relates to 3.
In other words, your variable is mutable and can change its value, while the value 3 is an integer literal, a constant, an immutable expression.
Integers are object representations of literals and are therefore immutable; you can basically only read them.
AtomicIntegers are containers for those values. You can read and set them, the same as assigning a value to a variable. But unlike changing the value of an int variable, operations on an AtomicInteger are atomic.
For example this is not atomic
if(myInt == 3) {
myInt++;
}
This is atomic
AtomicInteger myInt = new AtomicInteger(3);
//atomic
myInt.compareAndSet(3, 4);
I think the main difference between AtomicInteger and normal immutable Integer will come into the picture, once we understand why even immutable Integers are not thread-safe.
Let's see with an example.
Suppose, we have a value of int count = 5, which is being shared by two different threads named T1 and T2 with both reading and writing at the same time.
We know that, if there is any value being reassigned into an immutable object, the old object remains at the pool and the new one takes over.
Now, when T1 and T2 are updating their values into count variable, Java might take this value into some cache and will do the set operations there and we won't know when JVM will write the updated value into main memory, so there might be a possibility that one of the threads may be updating the value into a totally stale value.
This brings us to the volatile keyword.
Volatile - This keyword ensures that all reads and writes of the variable go to main memory, so that all threads work with the most up-to-date value.
Consider, if 1 Thread is writing and all other threads are reading then, volatile will solve our problem, but if all the threads are reading and writing on the same variable at the same time, then we need synchronizing to ensure thread-safety.
Volatile keyword does not ensure thread-safety.
Now, coming to why AtomicIntegers. Even if we are using the synchronized keyword to ensure thread-safety, the actual update operation of the count variable will be a three-step process:
get updated value of count variable
increment the value by 1
set the value to count variable
This is why, once thread safety is taken into consideration, updating a normal Integer takes slightly longer.
AtomicIntegers address both concerns, thread safety and update speed, with an optimized lock-free technique called compare-and-swap (CAS).
Each update operation is performed atomically, as if it were a single step.
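A compare-and-swap update is typically written as a retry loop: read the current value, compute the new one, and install it only if nobody else changed the value in the meantime. The sketch below uses AtomicInteger.compareAndSet for that; the class CasLoop and the method raiseTo are invented for the example.

```java
import java.util.concurrent.atomic.AtomicInteger;

class CasLoop {
    // Atomically raises 'target' to 'candidate' if candidate is larger,
    // and returns the value held afterwards.
    static int raiseTo(AtomicInteger target, int candidate) {
        while (true) {
            int current = target.get();
            if (candidate <= current) {
                return current; // nothing to do
            }
            if (target.compareAndSet(current, candidate)) {
                return candidate; // our write won
            }
            // Another thread changed the value between get() and
            // compareAndSet(): loop and try again with the fresh value.
        }
    }

    public static void main(String[] args) {
        AtomicInteger max = new AtomicInteger(5);
        System.out.println(raiseTo(max, 3)); // prints 5
        System.out.println(raiseTo(max, 9)); // prints 9
    }
}
```

No thread ever blocks here; a thread that loses the race simply retries.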
Since Java 5, we've had boxing/unboxing of primitive types so that int is wrapped to be java.lang.Integer, and so on and so forth.
I see a lot of new Java projects lately (that definitely require a JRE of at least version 5, if not 6) that are using int rather than java.lang.Integer, though it's much more convenient to use the latter, as it has a few helper methods for converting to long values et al.
Why do some still use primitive types in Java? Is there any tangible benefit?
In Joshua Bloch's Effective Java, Item 5: "Avoid creating unnecessary objects", he posts the following code example:
public static void main(String[] args) {
Long sum = 0L; // uses Long, not long
for (long i = 0; i <= Integer.MAX_VALUE; i++) {
sum += i;
}
System.out.println(sum);
}
and it takes 43 seconds to run. Taking the Long into the primitive brings it down to 6.8 seconds... If that's any indication why we use primitives.
The lack of native value equality is also a concern (.equals() is fairly verbose compared to ==)
for biziclop:
class Biziclop {
public static void main(String[] args) {
System.out.println(new Integer(5) == new Integer(5));
System.out.println(new Integer(500) == new Integer(500));
System.out.println(Integer.valueOf(5) == Integer.valueOf(5));
System.out.println(Integer.valueOf(500) == Integer.valueOf(500));
}
}
Results in:
false
false
true
false
EDIT Why does (3) return true and (4) return false?
Because they are two different objects. The 256 integers closest to zero [-128; 127] are cached by the JVM, so they return the same object for those. Beyond that range, though, they aren't cached, so a new object is created. To make things more complicated, the JLS demands that at least 256 flyweights be cached. JVM implementers may add more if they desire, meaning this could run on a system where the nearest 1024 are cached and all of them return true... #awkward
Autounboxing can lead to hard to spot NPEs
Integer in = null;
...
...
int i = in; // NPE at runtime
In most situations the null assignment to in is a lot less obvious than above.
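A short sketch of the failure and of one defensive pattern. The helper unboxOr is hypothetical, not a JDK method; it just makes the null case explicit at the point of unboxing.

```java
class UnboxingDemo {
    // Hypothetical helper: unbox with an explicit fallback instead of an NPE.
    static int unboxOr(Integer boxed, int fallback) {
        return boxed != null ? boxed : fallback;
    }

    // Returns true if auto-unboxing a null Integer throws, as described above.
    static boolean unboxingNullThrows() {
        Integer in = null;
        try {
            int i = in; // compiled as in.intValue(), which fails on null
            return false;
        } catch (NullPointerException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(unboxingNullThrows()); // prints "true"
        System.out.println(unboxOr(null, -1));    // prints -1
    }
}
```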
Boxed types have poorer performance and require more memory.
Primitive types:
int x = 1000;
int y = 1000;
Now evaluate:
x == y
It's true. Hardly surprising. Now try the boxed types:
Integer x = 1000;
Integer y = 1000;
Now evaluate:
x == y
It's false. Probably. Depends on the runtime. Is that reason enough?
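If boxed values do have to be compared, comparing by value sidesteps the cache question entirely. A minimal sketch, where the class BoxedEquality and the method sameValue are names made up for illustration:

```java
import java.util.Objects;

class BoxedEquality {
    // Compares by value and tolerates nulls, unlike == on two Integer references.
    static boolean sameValue(Integer a, Integer b) {
        return Objects.equals(a, b);
    }

    public static void main(String[] args) {
        Integer x = 1000;
        Integer y = 1000;
        System.out.println(sameValue(x, y));                // prints "true"
        System.out.println(x.intValue() == y.intValue());   // prints "true"
    }
}
```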
Besides performance and memory issues, I'd like to come up with another issue: The List interface would be broken without int.
The problem is the overloaded remove() method (remove(int) vs. remove(Object)). remove(Integer) would always resolve to calling the latter, so you could not remove an element by index.
On the other hand, there is a pitfall when trying to add and remove an int:
final int i = 42;
final List<Integer> list = new ArrayList<Integer>();
list.add(i); // add(Object)
list.remove(i); // remove(int) - Ouch!
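To remove by value rather than by index, the argument has to be an Integer at the call site, for example via Integer.valueOf. The sketch below contrasts the two overloads; RemoveDemo and its method names are invented for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RemoveDemo {
    // Selects remove(Object): deletes the first element equal to 'value'.
    static List<Integer> removeValue(List<Integer> source, int value) {
        List<Integer> copy = new ArrayList<>(source);
        copy.remove(Integer.valueOf(value));
        return copy;
    }

    // Selects remove(int): deletes the element at position 'index'.
    static List<Integer> removeAt(List<Integer> source, int index) {
        List<Integer> copy = new ArrayList<>(source);
        copy.remove(index);
        return copy;
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(10, 20, 1);
        System.out.println(removeValue(numbers, 1)); // prints [10, 20]
        System.out.println(removeAt(numbers, 1));    // prints [10, 1]
    }
}
```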
Can you really imagine a
for (int i=0; i<10000; i++) {
do something
}
loop with java.lang.Integer instead? A java.lang.Integer is immutable, so each increment round the loop would create a new java object on the heap, rather than just increment the int on the stack with a single JVM instruction. The performance would be diabolical.
I would really disagree that it's much more convenient to use java.lang.Integer than int. On the contrary. Autoboxing means that you can use int where you would otherwise be forced to use Integer, and the java compiler takes care of inserting the code to create the new Integer object for you. Autoboxing is all about allowing you to use an int where an Integer is expected, with the compiler inserting the relevant object construction. It in no way removes or reduces the need for the int in the first place. With autoboxing you get the best of both worlds. You get an Integer created for you automatically when you need a heap based java object, and you get the speed and efficiency of an int when you are just doing arithmetic and local calculations.
Primitive types are much faster:
int i;
i++;
Integer (like the other boxed number types, and also String) is an immutable type: once created it cannot be changed. If i were an Integer, then i++ would create a new Integer object, which is much more expensive in terms of memory and processor time.
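The rebinding can be observed by keeping a second reference to the original object. A small sketch (the class and method names are made up):

```java
class BoxedIncrement {
    // Shows that ++ on an Integer variable points the variable at a different
    // Integer instead of changing the existing object.
    static boolean incrementRebinds() {
        Integer a = 1000;
        Integer alias = a; // second reference to the same object
        a++;               // shorthand for a = Integer.valueOf(a.intValue() + 1)
        return alias == 1000 && a == 1001; // comparisons against int unbox first
    }

    public static void main(String[] args) {
        System.out.println(incrementRebinds()); // prints "true"
    }
}
```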
First and foremost, habit. If you've coded in Java for eight years, you accumulate a considerable amount of inertia. Why change if there is no compelling reason to do so? It's not as if using boxed primitives comes with any extra advantages.
The other reason is to assert that null is not a valid option. It would be pointless and misleading to declare the sum of two numbers or a loop variable as Integer.
There's the performance aspect of it too, while the performance difference isn't critical in many cases (though when it is, it's pretty bad), nobody likes to write code that could be written just as easily in a faster way we're already used to.
By the way, Smalltalk has only objects (no primitives), and yet they had optimized their small integers (using not all 32 bits, only 27 or such) to not allocate any heap space, but simply use a special bit pattern. Also other common objects (true, false, null) had special bit patterns here.
So, at least on 64-bit JVMs (with a 64 bit pointer namespace) it should be possible to not have any objects of Integer, Character, Byte, Short, Boolean, Float (and small Long) at all (apart from these created by explicit new ...()), only special bit patterns, which could be manipulated by the normal operators quite efficiently.
I can't believe no one has mentioned what I think is the most important reason:
"int" is so, so much easier to type than "Integer". I think people underestimate the importance of a concise syntax. Performance isn't really a reason to avoid them because most of the time when one is using numbers is in loop indexes, and incrementing and comparing those costs nothing in any non-trivial loop (whether you're using int or Integer).
The other given reason was that you can get NPEs but that's extremely easy to avoid with boxed types (and it is guaranteed to be avoided as long as you always initialize them to non-null values).
The other reason was that (new Long(1000))==(new Long(1000)) is false, but that's just another way of saying that ".equals" has no syntactic support for boxed types (unlike the operators <, >, =, etc), so we come back to the "simpler syntax" reason.
I think Steve Yegge's non-primitive loop example illustrates my point very well:
http://sites.google.com/site/steveyegge2/language-trickery-and-ejb
Think about this: how often do you use function types in languages that have good syntax for them (like any functional language, python, ruby, and even C) compared to java where you have to simulate them using interfaces such as Runnable and Callable and nameless classes.
Couple of reasons not to get rid of primitives:
Backwards compatibility.
If it's eliminated, any old programs wouldn't even run.
JVM rewrite.
The entire JVM would have to be rewritten to support this new thing.
Larger memory footprint.
You'd need to store the value and the reference, which uses more memory. If you have a huge array of bytes, using byte is significantly smaller than using Byte.
Null pointer issues.
Declaring int i then doing stuff with i would result in no issues, but declaring Integer i and then doing the same would result in an NPE.
Equality issues.
Consider this code:
Integer i1 = 500;
Integer i2 = 500;
i1 == i2; // Currently false on a default JVM: == compares references, and 500 is outside the Integer cache.
Operators would have to be overloaded, and that would result in a major rewrite of stuff.
Slow
Object wrappers are significantly slower than their primitive counterparts.
Objects are much more heavyweight than primitive types, so primitive types are much more efficient than instances of wrapper classes.
Primitive types are very simple: for example an int is 32 bits and takes up exactly 32 bits in memory, and can be manipulated directly. An Integer object is a complete object, which (like any object) has to be stored on the heap, and can only be accessed via a reference (pointer) to it. It most likely also takes up more than 32 bits (4 bytes) of memory.
That said, the fact that Java has a distinction between primitive and non-primitive types is also a sign of age of the Java programming language. Newer programming languages don't have this distinction; the compiler of such a language is smart enough to figure out by itself if you're using simple values or more complex objects.
For example, in Scala there are no primitive types; there is a class Int for integers, and an Int is a real object (that you can call methods on etc.). When the compiler compiles your code, it uses primitive ints behind the scenes, so using an Int is just as efficient as using a primitive int in Java.
In addition to what others have said, primitive local variables are not allocated from the heap, but instead on the stack. But objects are allocated from the heap and thus have to be garbage collected.
It's hard to know what kind of optimizations are going on under the covers.
For local use, when the compiler has enough information to make optimizations excluding the possibility of the null value, I expect the performance to be the same or similar.
However, arrays of primitives are apparently very different from collections of boxed primitives. This makes sense given that very few optimizations are possible deep within a collection.
Furthermore, Integer has a much higher logical overhead as compared with int: now you have to worry about whether or not int a = b + c; throws an exception.
I'd use the primitives as much as possible and rely on the factory methods and autoboxing to give me the more semantically powerful boxed types when they are needed.
int loops = 100000000;
long start = System.currentTimeMillis();
for (Long l = new Long(0); l<loops;l++) {
//System.out.println("Long: "+l);
}
System.out.println("Milliseconds taken to loop '"+loops+"' times around Long: "+ (System.currentTimeMillis()- start));
start = System.currentTimeMillis();
for (long l = 0; l<loops;l++) {
//System.out.println("long: "+l);
}
System.out.println("Milliseconds taken to loop '"+loops+"' times around long: "+ (System.currentTimeMillis()- start));
Milliseconds taken to loop '100000000' times around Long: 468
Milliseconds taken to loop '100000000' times around long: 31
On a side note, I wouldn't mind seeing something like this find its way into Java.
Integer loop1 = new Integer(0);
for (loop1.lessThan(1000)) {
...
}
Where the for loop automatically increments loop1 from 0 to 1000
or
Integer loop1 = new Integer(1000);
for (loop1.greaterThan(0)) {
...
}
Where the for loop automatically decrements loop1 from 1000 to 0.
Primitive types have many advantages:
Simpler code to write
Performance is better since you are not instantiating an object for the variable
Since they do not represent a reference to an object there is no need to check for nulls
Use primitive types unless you need to take advantage of the boxing features.
You need primitives for doing mathematical operations
Primitives take less memory, as answered above, and perform better
You should ask why a Class/Object type is required
The reason for having an Object type is to make our life easier when we deal with Collections. Primitives cannot be added directly to a List/Map; you would need to write a wrapper class. Ready-made classes such as Integer help you here, plus they have many utility methods like Integer.parseInt(str)
I agree with previous answers: using primitive wrapper objects can be expensive.
But, if performance is not critical in your application, wrapper objects let the compiler catch some overflow bugs for you. For example:
long bigNumber = Integer.MAX_VALUE + 2;
The value of bigNumber is -2147483647, and you would expect it to be 2147483649. It's a bug in the code that would be fixed by doing:
long bigNumber = Integer.MAX_VALUE + 2l; // note that '2' is a long now (it is '2L').
And bigNumber would be 2147483649. These kinds of bugs are sometimes easy to miss and can lead to unknown behavior or vulnerabilities (see CWE-190).
If you use wrapper objects, the equivalent code won't compile.
Long bigNumber = Integer.MAX_VALUE + 2; // Not compiling
So it's easier to stop these kinds of issues by using primitive wrapper objects.
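When the arithmetic has to stay in primitives, the same bug can be caught either by widening one operand before the addition or by using Math.addExact, which throws ArithmeticException on int overflow (Java 8 and later). A brief sketch with made-up class and method names:

```java
class OverflowDemo {
    // Widening one operand first makes the whole addition happen in long.
    static long widenedSum() {
        return (long) Integer.MAX_VALUE + 2;
    }

    // Math.addExact(int, int) refuses to wrap around silently.
    static boolean intSumOverflows() {
        try {
            Math.addExact(Integer.MAX_VALUE, 2);
            return false;
        } catch (ArithmeticException overflow) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(widenedSum());      // prints 2147483649
        System.out.println(intSumOverflows()); // prints "true"
    }
}
```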
Your question has been answered so thoroughly already that I reply just to add a little more information not mentioned before.
Java performs all mathematical operations on primitive types. Consider this example:
public static int sumEven(List<Integer> li) {
int sum = 0;
for (Integer i: li)
if (i % 2 == 0)
sum += i;
return sum;
}
Here, the remainder (%) and compound addition (+=) operations cannot be applied to the Integer (reference) type directly, so the compiler unboxes the values and then performs the operations.
So, be aware of how many autoboxing and unboxing operations happen in a Java program, since these operations take time.
Generally, it is better to keep arguments of reference type and results of primitive type.
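One way to make the unboxing explicit, and to do it exactly once per element, is to switch to an IntStream before doing any arithmetic. This is a sketch of an alternative formulation rather than a claim about what the compiler generates for the loop; the class name SumEven is made up.

```java
import java.util.Arrays;
import java.util.List;

class SumEven {
    // Takes a reference-typed argument and returns a primitive result.
    static int sumEven(List<Integer> numbers) {
        return numbers.stream()
                .mapToInt(Integer::intValue) // unbox once per element
                .filter(n -> n % 2 == 0)     // arithmetic on primitive int
                .sum();
    }

    public static void main(String[] args) {
        System.out.println(sumEven(Arrays.asList(1, 2, 3, 4, 6))); // prints 12
    }
}
```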
The primitive types are much faster and require much less memory. Therefore, we might want to prefer using them.
On the other hand, current Java language specification doesn’t allow usage of primitive types in the parameterized types (generics), in the Java collections or the Reflection API.
When our application needs collections with a large number of elements, we should consider using arrays with the most “economical” type possible.
*For detailed info see the source: https://www.baeldung.com/java-primitives-vs-objects
To be brief: primitive types are faster and require less memory than boxed ones
I am going through a Java threads book. I came across this statement:
Statement 1:- "volatile variables can be safely used only for single load or store operation and can't be applied to long or double variables. These restrictions make the use of volatile variables uncommon"
I did not understand what "single load or store operation" means here. Why can't volatile be applied to long or double variables?
Statement 2:- "A Volatile integer can not be used with the ++ operator because ++ operator contains multiple instructions. The AtomicInteger class has a method that allows the integer it holds to be incremented atomically."
Why can a volatile integer not be used with the ++ operator, and how does AtomicInteger address it?
Statement 1:- "volatile variables can be safely used only for single load or store operation and can't be applied to long or double variables. These restrictions make the use of volatile variables uncommon"
What?! I believe this is simply flat-out wrong. Maybe your book is out of date.
Statement 2:- "A Volatile integer can not be used with the ++ operator because ++ operator contains multiple instructions. The AtomicInteger class has a method that allows the integer it holds to be incremented atomically."
Exactly what it says. The ++ operator actually translates to machine code like this (in Java-like pseudocode):
sync_CPU_caches();
int processorRegister = variable;
processorRegister = processorRegister + 1;
variable = processorRegister;
sync_CPU_caches();
This is not thread-safe, because even though it has a memory barrier, and reads atomically, and writes atomically, it is not guaranteed that you won't get a thread switch in the middle, and processor registers are local to a CPU core (think of them as like "local variables" inside the CPU core). But an AtomicInteger is thread-safe - it probably is implemented using special machine code instructions such as compare-and-swap.
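The lost update is easier to see when one problematic interleaving is replayed step by step. The sketch below does that on a single thread, so it is a deterministic illustration of what two threads could do, not an actual race; all names in it are invented.

```java
class LostUpdate {
    static volatile int variable = 0;

    // Replays, in order, an interleaving that two threads A and B could
    // produce when both execute variable++ at about the same time.
    static int replayInterleaving() {
        variable = 0;
        int registerA = variable;  // thread A reads 0
        int registerB = variable;  // thread B reads 0 before A has written
        registerA = registerA + 1; // A computes 1
        registerB = registerB + 1; // B computes 1
        variable = registerA;      // A writes 1
        variable = registerB;      // B writes 1 as well: one increment is lost
        return variable;
    }

    public static void main(String[] args) {
        System.out.println(replayInterleaving()); // prints 1, not 2
    }
}
```

AtomicInteger.incrementAndGet() rules this interleaving out because the read, the addition and the write take effect as one indivisible step.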
The main purpose of volatile variables is not to cause immediate thread-safe access to that variable, but to ensure a so called happens-before safety.
Theoretically a call to
volatile int i = 0;
and
int i = 0;
has no difference in terms of atomicity, as a 32-bit word is written atomically anyway (on 32-bit and higher machines, to be precise). Since pointers are internally 32/64-bit ints as well, there is basically only one operation that volatile makes atomic, and that is a write to a 64-bit long in a 32-bit environment.
The happens-before however is something that actually messes up the above example. To understand this you need to know that threads don't use the actual memory of the variable in question but might make copies of it to speed up execution and can re-order the statements for optimization. Now if you have something like:
Thread A: value = 1; doIt = true;
Thread B: if (doIt) { doDoIt(value); }
It is possible that in Thread B doIt is true, but value is not yet 1, because the order of execution might have been changed by the JVM, or the new value has just not yet been broadcast to Thread B's copy of value.
If doIt is declared volatile instead, then at the moment of accessing it, the JVM ensures that all code before that access has already been executed and its results broadcast. So the above example is the actual reason to use volatile.
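The same two-thread example as a runnable sketch: the writer sets the plain field value and then the volatile flag doIt; once the reader sees the flag, it is guaranteed to see value == 1 as well. The class name Publication and the method publishAndRead are made up for illustration.

```java
class Publication {
    private int value;             // plain field, published via the flag below
    private volatile boolean doIt; // volatile flag

    // Starts a reader thread, publishes 'value' through the volatile flag,
    // and returns what the reader observed.
    int publishAndRead() {
        int[] seen = new int[1];
        Thread reader = new Thread(() -> {
            while (!doIt) {
                Thread.yield(); // spin until the volatile write becomes visible
            }
            seen[0] = value;    // ordered after the volatile read of doIt
        });
        reader.start();

        value = 1;   // ordinary write ...
        doIt = true; // ... made visible by the volatile write that follows it

        try {
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(new Publication().publishAndRead()); // prints 1
    }
}
```

Without volatile on doIt, the reader would be allowed to spin forever or to read a stale value.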
Are there any concurrency problems with one thread reading from one index of an array, while another thread writes to another index of the array, as long as the indices are different?
e.g. (this example not necessarily recommended for real use, only to illustrate my point)
import java.util.concurrent.atomic.AtomicInteger;

class Test1
{
static final private int N = 4096;
final private int[] x = new int[N];
final private AtomicInteger nwritten = new AtomicInteger(0);
// invariant:
// all values x[i] where 0 <= i < nwritten.get() are immutable
// read() is not synchronized since we want it to be fast
int read(int index) {
if (index >= nwritten.get())
throw new IllegalArgumentException();
return x[index];
}
// write() is synchronized to handle multiple writers
// (using compare-and-set techniques to avoid blocking algorithms
// is nontrivial)
synchronized void write(int x_i) {
int index = nwritten.get();
if (index >= N)
throw new IllegalStateException("array is full");
x[index] = x_i;
// from this point forward, x[index] is fixed in stone
nwritten.set(index+1);
}
}
edit: critiquing this example is not my question, I literally just want to know if array access to one index, concurrently to access of another index, poses concurrency problems, couldn't think of a simple example.
While you will not get an invalid state by changing arrays as you mention, you will have the same problem that happens when two threads are viewing a non volatile integer without synchronization (see the section in the Java Tutorial on Memory Consistency Errors). Basically, the problem is that Thread 1 may write a value in space i, but there is no guarantee when (or if) Thread 2 will see the change.
The class java.util.concurrent.atomic.AtomicIntegerArray does what you want to do.
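A minimal sketch of AtomicIntegerArray with one writer thread per index. Its get and set have volatile read and write semantics per element, so a value stored by one thread is visible to any thread that reads the element afterwards. The class SharedSlots and the method fillConcurrently are names invented for this example.

```java
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicIntegerArray;

class SharedSlots {
    // Each thread writes to its own index; nothing else is synchronized.
    static int[] fillConcurrently(int size) {
        AtomicIntegerArray slots = new AtomicIntegerArray(size);
        Thread[] writers = new Thread[size];
        for (int i = 0; i < size; i++) {
            final int index = i;
            writers[i] = new Thread(() -> slots.set(index, index * 10));
            writers[i].start();
        }
        for (Thread writer : writers) {
            try {
                writer.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        int[] result = new int[size];
        for (int i = 0; i < size; i++) {
            result[i] = slots.get(i); // volatile-style read of one element
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(fillConcurrently(4))); // prints [0, 10, 20, 30]
    }
}
```

In this sketch join() alone would already make the writes visible; the per-element guarantee matters when readers run while the writers are still active.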
The example has a lot of stuff that differs from the prose question.
The answer to that question is that distinct elements of an array are accessed independently, so you don't need synchronization if two threads change different elements.
However, the Java memory model makes no guarantees (that I'm aware of) that a value written by one thread will be visible to another thread, unless you synchronize access.
Depending on what you're really trying to accomplish, it's likely that java.util.concurrent already has a class that will do it for you. And if it doesn't, I still recommend taking a look at the source code for ConcurrentHashMap, since your code appears to be doing the same thing that it does to manage the hash table.
I am not really sure whether synchronizing only the write method, while leaving the read method unsynchronized, would work. I am not sure what all the consequences are, but at the very least it might lead to the read method returning a value that has just been overwritten by write.
Yes, as bad cache interleaving can still happen in a multi-cpu/core environment. There are several options to avoid it:
Use the Unsafe Sun-private library to atomically set an element in an array (or the jsr166y added feature in Java7)
Use AtomicXYZ[] array
Use custom object with one volatile field and have an array of that object.
Use the ParallelArray of jsr166y addendum instead in your algorithm
Since read() is not synchronized you could have the following scenario:
Thread A enters write() method
Thread A writes to nwritten = 0;
Thread B reads from nwritten = 0;
Thread A increments nwritten. nwritten = 1
Thread A exits write();
Since you want to guarantee that your variable addresses never conflict, what about something like (discounting array index issues):
int i;
synchronized int curr(){ return i; }
synchronized int next(){ return ++i;}
int read( ) {
return values[curr()];
}
void write(int x){
values[next()]=x;
}