Same with the follow link, I use the same code with the questioner.
Java multi-threading atomic reference assignment
In my code, there
HashMap<String,String> cache = new HashMap<String,String>();
public class myClass {
private HashMap<String,String> cache = null;
public void init() {
refreshCache();
}
// this method can be called occasionally to update the cache.
//Only one threading will get to this code.
public void refreshCache() {
HashMap<String,String> newcache = new HashMap<String,String>();
// code to fill up the new cache
// and then finally
cache = newcache; //assign the old cache to the new one in Atomic way
}
//Many threads will run this code
public void getCache(Object key) {
ob = cache.get(key)
//do something
}
}
I read the sjlee's answer again and again, I can't understand in which case these code will go wrong. Can anyone give me a example?
Remember I don't care about the getCache function will get the old data.
I'm sorry I can't add comment to the above question because I don't have 50 reputation.
So I just add a new question.
Without a memory barrier you might see null or an old map but you could see an incomplete map. I.e. you see bits of it but not all. Thus is not a problem if you don't mind entries being missing but you risk seeing the Map object but not anything it refers to resulting in a possible NPE.
There is no guarantee you will see a complete Map.
final fields will be visible but non - final fields might not.
this is a very interesting problem, and it shows that one of your core assumptions
"Remember I don't care about the getCache function will get the old
data."
is not correct.
we think, that if "refreshCache" and "getCache" is not synchronized, then we will only get old data, which is not true.
Their call by the initial thread may never reflect in other threads. Since cache is not volatile, every thread is free to keep it's own local copy of it and never make it consistent across threads.
Because the "visibility" aspect of multi-threading, which says that unless we use appropriate locking, or use volatile, we do not trigger a happens-before scenario, which forces threads to make shared variable value consistent across the multiple processors they are running on, which means "cache" , may never get initialized causing an obvious NPE in getCache
to understand this properly, i would recommend reading section 16.2.4 of "Java concurrency in practice" book which deals with a similar problem in double checked locking code.
Solution: would be
To make refreshCache synchronized to force, all threads to update their copy of HashMap whenever any one thread calls it, or
To make cache volatile or
You would have to call refreshCache in every single thread that calls getCache which kind of defeats the purpose of a common cache.
I want to make sure that I correctly understand the 'Effectively Immutable Objects' behavior according to Java Memory Model.
Let's say we have a mutable class which we want to publish as an effectively immutable:
class Outworld {
// This MAY be accessed by multiple threads
public static volatile MutableLong published;
}
// This class is mutable
class MutableLong {
private long value;
public MutableLong(long value) {
this.value = value;
}
public void increment() {
value++;
}
public long get() {
return value;
}
}
We do the following:
// Create a mutable object and modify it
MutableLong val = new MutableLong(1);
val.increment();
val.increment();
// No more modifications
// UPDATED: Let's say for this example we are completely sure
// that no one will ever call increment() since now
// Publish it safely and consider Effectively Immutable
Outworld.published = val;
The question is:
Does Java Memory Model guarantee that all threads MUST have Outworld.published.get() == 3 ?
According to Java Concurrency In Practice this should be true, but please correct me if I'm wrong.
3.5.3. Safe Publication Idioms
To publish an object safely, both the reference to the object and the
object's state must be made visible to other threads at the same time.
A properly constructed object can be safely published by:
- Initializing an object reference from a static initializer;
- Storing a reference to it into a volatile field or AtomicReference;
- Storing a reference to it into a final field of a properly constructed object; or
- Storing a reference to it into a field that is properly guarded by a lock.
3.5.4. Effectively Immutable Objects
Safely published effectively immutable objects can be used safely by
any thread without additional synchronization.
Yes. The write operations on the MutableLong are followed by a happens-before relationship (on the volatile) before the read.
(It is possible that a thread reads Outworld.published and passes it on to another thread unsafely. In theory, that could see earlier state. In practice, I don't see it happening.)
There is a couple of conditions which must be met for the Java Memory Model to guarantee that Outworld.published.get() == 3:
the snippet of code you posted which creates and increments the MutableLong, then sets the Outworld.published field, must happen with visibility between the steps. One way to achieve this trivially is to have all that code running in a single thread - guaranteeing "as-if-serial semantics". I assume that's what you intended, but thought it worth pointing out.
reads of Outworld.published must have happens-after semantics from the assignment. An example of this could be having the same thread execute Outworld.published = val; then launch other the threads which could read the value. This would guarantee "as if serial" semantics, preventing re-ordering of the reads before the assignment.
If you are able to provide those guarantees, then the JMM will guarantee all threads see Outworld.published.get() == 3.
However, if you're interested in general program design advice in this area, read on.
For the guarantee that no other threads ever see a different value for Outworld.published.get(), you (the developer) have to guarantee that your program does not modify the value in any way. Either by subsequently executing Outworld.published = differentVal; or Outworld.published.increment();. While that is possible to guarantee, it can be so much easier if you design your code to avoid both the mutable object, and using a static non-final field as a global point of access for multiple threads:
instead of publishing MutableLong, copy the relevant values into a new instance of a different class, whose state cannot be modified. E.g.: introduce the class ImmutableLong, which assigns value to a final field on construction, and doesn't have an increment() method.
instead of multiple threads accessing a static non-final field, pass the object as a parameter to your Callable/Runnable implementations. This will prevent the possibility of one rogue thread from reassigning the value and interfering with the others, and is easier to reason about than static field reassignment. (Admittedly, if you're dealing with legacy code, this is easier said than done).
The question is: Does Java Memory Model guarantee that all threads
MUST have Outworld.published.get() == 3 ?
The short answer is no. Because other threads might access Outworld.published before it has been read.
After the moment when Outworld.published = val; had been performed, under condition that no other modifications done with the val - yes - it always be 3.
But if any thread performs val.increment then its value might be different for other threads.
Which one among ThreadLocal or a local variable in Runnable will be preferred? For performance reasons. I hope using a local variable will give more chances for cpu caching, etc.
Which one among ThreadLocal or a local variable in Runnable will be preferred.
If you have a variable that is declared inside the thread's class (or the Runnable) then a local variable will work and you don't need the ThreadLocal.
new Thread(new Runnable() {
// no need to make this a thread local because each thread already
// has their own copy of it
private SimpleDateFormat format = new SimpleDateFormat(...);
public void run() {
...
// this is allocated per thread so no thread-local
format.parse(...);
...
}
}).start();
On the other hand, ThreadLocals are used to save state on a per thread basis when you are executing common code. For example, the SimpleDateFormat is (unfortunately) not thread-safe so if you want to use it in code executed by multiple threads you would need to store one in a ThreadLocal so that each thread gets it's own version of the format.
private final ThreadLocal<SimpleDateFormat> localFormat =
new ThreadLocal<SimpleDateFormat>() {
#Override
protected SimpleDateFormat initialValue() {
return new SimpleDateFormat(...);
}
};
...
// if a number of threads run this common code
SimpleDateFormat format = localFormat.get();
// now we are using the per-thread format (but we should be using Joda Time :-)
format.parse(...);
An example of when this is necessary is a web request handler. The threads are allocated up in Jetty land (for example) in some sort of pool that is outside of our control. A web request comes in which matches your path so Jetty calls your handler. You need to have a SimpleDateFormat object but because of its limitations, you have to create one per thread. That's when you need a ThreadLocal. When you are writing reentrant code that may be called by multiple threads and you want to store something per-thread.
Instead, if you want pass in arguments to your Runnable then you should create your own class and then you can access the constructor and pass in arguments.
new Thread(new MyRunnable("some important string")).start();
...
private static class MyRunnable implements {
private final String someImportantString;
public MyRunnable(String someImportantString) {
this.someImportantString = someImportantString;
}
// run by the thread
public void run() {
// use the someImportantString string here
...
}
}
Whenever your program could correctly use either of the two (ThreadLocal or local variable), choose the local variable: it will be more performant.
ThreadLocal is for storing per-thread state past the execution scope of a method. Obviously local variables can't persist past the scope of their declaration. If you needed them to, that's when you might start using a ThreadLocal.
Another option is using synchronized to manage access to a shared member variable. This is a complicated topic and I won't bother to go into it here as it's been explained and documented by more articulate people than me in other places. Obviously this is not a variant of "local" storage -- you'd be sharing access to a single resource in a thread-safe way.
I was also confused why i need ThreadLocal when i can just use local variables, since they both maintain their state inside a thread. But after a lot of searching and experimenting i see why is ThreadLocal needed.
I found two uses so far -
Saving thread specific values inside the same shared object
Alternative to passing variables as parameters through N-layers of code
1:
If you have two threads operating on the same object and both threads modify this object - then both threads keep losing their modifications to each other.
To make this object have two separate states for each thread, we declare this object or part of it ThreadLocal.
Of course, ThreadLocal is only beneficial here because both threads are sharing the same object. If they are using different objects, there's no need for the objects to be ThreadLocal.
2:
The second benefit of ThreadLocal, seems to be a side effect of how its implemented.
A ThreadLocal variable can be .set() by a thread, and then be .get() anywhere else. .get() will retrieve the same value that this thread had set anywhere else. We'll need a globally available wrapper to do a .get() and .set(), to actually write down the code.
When we do a threadLocalVar.set() - its as if its put inside some global "map", where this current thread is the key.
As if -
someGlobalMap.put(Thread.currentThread(),threadLocalVar);
So ten layers down, when we do threadLocalVar.get() - we get the value that this thread had set ten layers up.
threadLocalVar = someGlobalMap.get(Thread.currentThread());
So the function at tenth level does not have to lug around this variable as parameter, and can access it with a .get() without worrying about if it is from the right thread.
Lastly, since a ThreadLocal variable is a copy to each thread, of course, it does not need synchronization. I misunderstood ThreadLocal earlier as an alternative to synchronization, that it is not. It is just a side effect of it, that we dont need to synchronize the activity of this variable now.
Hope this has helped.
This question is answered by the simple rule that a variable should be declared in the smallest possible enclosing scope. A ThreadLocal is the largest possible enclosing scope so you should only use it for data that is needed across many lexical scopes. If it can be a local variable, it should be.
When should I use a ThreadLocal variable?
How is it used?
One possible (and common) use is when you have some object that is not thread-safe, but you want to avoid synchronizing access to that object (I'm looking at you, SimpleDateFormat). Instead, give each thread its own instance of the object.
For example:
public class Foo
{
// SimpleDateFormat is not thread-safe, so give one to each thread
private static final ThreadLocal<SimpleDateFormat> formatter = new ThreadLocal<SimpleDateFormat>(){
#Override
protected SimpleDateFormat initialValue()
{
return new SimpleDateFormat("yyyyMMdd HHmm");
}
};
public String formatIt(Date date)
{
return formatter.get().format(date);
}
}
Documentation.
Since a ThreadLocal is a reference to data within a given Thread, you can end up with classloading leaks when using ThreadLocals in application servers using thread pools. You need to be very careful about cleaning up any ThreadLocals you get() or set() by using the ThreadLocal's remove() method.
If you do not clean up when you're done, any references it holds to classes loaded as part of a deployed webapp will remain in the permanent heap and will never get garbage collected. Redeploying/undeploying the webapp will not clean up each Thread's reference to your webapp's class(es) since the Thread is not something owned by your webapp. Each successive deployment will create a new instance of the class which will never be garbage collected.
You will end up with out of memory exceptions due to java.lang.OutOfMemoryError: PermGen space and after some googling will probably just increase -XX:MaxPermSize instead of fixing the bug.
If you do end up experiencing these problems, you can determine which thread and class is retaining these references by using Eclipse's Memory Analyzer and/or by following Frank Kieviet's guide and followup.
Update: Re-discovered Alex Vasseur's blog entry that helped me track down some ThreadLocal issues I was having.
Many frameworks use ThreadLocals to maintain some context related to the current thread. For example when the current transaction is stored in a ThreadLocal, you don't need to pass it as a parameter through every method call, in case someone down the stack needs access to it. Web applications might store information about the current request and session in a ThreadLocal, so that the application has easy access to them. With Guice you can use ThreadLocals when implementing custom scopes for the injected objects (Guice's default servlet scopes most probably use them as well).
ThreadLocals are one sort of global variables (although slightly less evil because they are restricted to one thread), so you should be careful when using them to avoid unwanted side-effects and memory leaks. Design your APIs so that the ThreadLocal values will always be automatically cleared when they are not needed anymore and that incorrect use of the API won't be possible (for example like this). ThreadLocals can be used to make the code cleaner, and in some rare cases they are the only way to make something work (my current project had two such cases; they are documented here under "Static Fields and Global Variables").
In Java, if you have a datum that can vary per-thread, your choices are to pass that datum around to every method that needs (or may need) it, or to associate the datum with the thread. Passing the datum around everywhere may be workable if all your methods already need to pass around a common "context" variable.
If that's not the case, you may not want to clutter up your method signatures with an additional parameter. In a non-threaded world, you could solve the problem with the Java equivalent of a global variable. In a threaded word, the equivalent of a global variable is a thread-local variable.
There is very good example in book Java Concurrency in Practice. Where author (Joshua Bloch) explains how Thread confinement is one of the simplest ways to achieve thread safety and ThreadLocal is more formal means of maintaining thread confinement. In the end he also explain how people can abuse it by using it as global variables.
I have copied the text from the mentioned book but code 3.10 is missing as it is not much important to understand where ThreadLocal should be use.
Thread-local variables are often used to prevent sharing in designs based on mutable Singletons or global variables. For example, a single-threaded application might maintain a global database connection that is initialized at startup to avoid having to pass a Connection to every method. Since JDBC connections may not be thread-safe, a multithreaded application that uses a global connection without additional coordination is not thread-safe either. By using a ThreadLocal to store the JDBC connection, as in ConnectionHolder in Listing 3.10, each thread will have its own connection.
ThreadLocal is widely used in implementing application frameworks. For example, J2EE containers associate a transaction context with an executing thread for the duration of an EJB call. This is easily implemented using a static Thread-Local holding the transaction context: when framework code needs to determine what transaction is currently running, it fetches the transaction context from this ThreadLocal. This is convenient in that it reduces the need to pass execution context information into every method, but couples any code that uses this mechanism to the framework.
It is easy to abuse ThreadLocal by treating its thread confinement property as a license to use global variables or as a means of creating “hidden” method arguments. Like global variables, thread-local variables can detract from reusability and introduce hidden couplings among classes, and should therefore be used with care.
Essentially, when you need a variable's value to depend on the current thread and it isn't convenient for you to attach the value to the thread in some other way (for example, subclassing thread).
A typical case is where some other framework has created the thread that your code is running in, e.g. a servlet container, or where it just makes more sense to use ThreadLocal because your variable is then "in its logical place" (rather than a variable hanging from a Thread subclass or in some other hash map).
On my web site, I have some further discussion and examples of when to use ThreadLocal that may also be of interest.
Some people advocate using ThreadLocal as a way to attach a "thread ID" to each thread in certain concurrent algorithms where you need a thread number (see e.g. Herlihy & Shavit). In such cases, check that you're really getting a benefit!
ThreadLocal in Java had been introduced on JDK 1.2 but was later generified in JDK 1.5 to introduce type safety on ThreadLocal variable.
ThreadLocal can be associated with Thread scope, all the code which is executed by Thread has access to ThreadLocal variables but two thread can not see each others ThreadLocal variable.
Each thread holds an exclusive copy of ThreadLocal variable which becomes eligible to Garbage collection after thread finished or died, normally or due to any Exception, Given those ThreadLocal variable doesn't have any other live references.
ThreadLocal variables in Java are generally private static fields in Classes and maintain its state inside Thread.
Read more: ThreadLocal in Java - Example Program and Tutorial
The documentation says it very well: "each thread that accesses [a thread-local variable] (via its get or set method) has its own, independently initialized copy of the variable".
You use one when each thread must have its own copy of something. By default, data is shared between threads.
Webapp server may keep a thread pool, and a ThreadLocal var should be removed before response to the client, thus current thread may be reused by next request.
Two use cases where threadlocal variable can be used -
1- When we have a requirement to associate state with a thread (e.g., a user ID or Transaction ID). That usually happens with a web application that every request going to a servlet has a unique transactionID associated with it.
// This class will provide a thread local variable which
// will provide a unique ID for each thread
class ThreadId {
// Atomic integer containing the next thread ID to be assigned
private static final AtomicInteger nextId = new AtomicInteger(0);
// Thread local variable containing each thread's ID
private static final ThreadLocal<Integer> threadId =
ThreadLocal.<Integer>withInitial(()-> {return nextId.getAndIncrement();});
// Returns the current thread's unique ID, assigning it if necessary
public static int get() {
return threadId.get();
}
}
Note that here the method withInitial is implemented using lambda expression.
2- Another use case is when we want to have a thread safe instance and we don't want to use synchronization as the performance cost with synchronization is more. One such case is when SimpleDateFormat is used. Since SimpleDateFormat is not thread safe so we have to provide mechanism to make it thread safe.
public class ThreadLocalDemo1 implements Runnable {
// threadlocal variable is created
private static final ThreadLocal<SimpleDateFormat> dateFormat = new ThreadLocal<SimpleDateFormat>(){
#Override
protected SimpleDateFormat initialValue(){
System.out.println("Initializing SimpleDateFormat for - " + Thread.currentThread().getName() );
return new SimpleDateFormat("dd/MM/yyyy");
}
};
public static void main(String[] args) {
ThreadLocalDemo1 td = new ThreadLocalDemo1();
// Two threads are created
Thread t1 = new Thread(td, "Thread-1");
Thread t2 = new Thread(td, "Thread-2");
t1.start();
t2.start();
}
#Override
public void run() {
System.out.println("Thread run execution started for " + Thread.currentThread().getName());
System.out.println("Date formatter pattern is " + dateFormat.get().toPattern());
System.out.println("Formatted date is " + dateFormat.get().format(new Date()));
}
}
Since Java 8 release, there is more declarative way to initialize ThreadLocal:
ThreadLocal<String> local = ThreadLocal.withInitial(() -> "init value");
Until Java 8 release you had to do the following:
ThreadLocal<String> local = new ThreadLocal<String>(){
#Override
protected String initialValue() {
return "init value";
}
};
Moreover, if instantiation method (constructor, factory method) of class that is used for ThreadLocal does not take any parameters, you can simply use method references (introduced in Java 8):
class NotThreadSafe {
// no parameters
public NotThreadSafe(){}
}
ThreadLocal<NotThreadSafe> container = ThreadLocal.withInitial(NotThreadSafe::new);
Note:
Evaluation is lazy since you are passing java.util.function.Supplier lambda that is evaluated only when ThreadLocal#get is called but value was not previously evaluated.
You have to be very careful with the ThreadLocal pattern. There are some major down sides like Phil mentioned, but one that wasn't mentioned is to make sure that the code that sets up the ThreadLocal context isn't "re-entrant."
Bad things can happen when the code that sets the information gets run a second or third time because information on your thread can start to mutate when you didn't expect it. So take care to make sure the ThreadLocal information hasn't been set before you set it again.
ThreadLocal will ensure accessing the mutable object by the multiple
threads in the non synchronized method is synchronized, means making
the mutable object to be immutable within the method. This
is achieved by giving new instance of mutable object for each thread
try accessing it. So It is local copy to the each thread. This is some
hack on making instance variable in a method to be accessed like a
local variable. As you aware method local variable is only available
to the thread, one difference is; method local variables will not
available to the thread once method execution is over where as mutable
object shared with threadlocal will be available across multiple
methods till we clean it up.
By Definition:
The ThreadLocal class in Java enables you to create variables that can
only be read and written by the same thread. Thus, even if two threads
are executing the same code, and the code has a reference to a
ThreadLocal variable, then the two threads cannot see each other's
ThreadLocal variables.
Each Thread in java contains ThreadLocalMap in it.
Where
Key = One ThreadLocal object shared across threads.
value = Mutable object which has to be used synchronously, this will be instantiated for each thread.
Achieving the ThreadLocal:
Now create a wrapper class for ThreadLocal which is going to hold the mutable object like below (with or without initialValue()). Now getter and setter of this wrapper will work on threadlocal instance instead of mutable object.
If getter() of threadlocal didn't find any value with in the threadlocalmap of the Thread; then it will invoke the initialValue() to get its private copy with respect to the thread.
class SimpleDateFormatInstancePerThread {
private static final ThreadLocal<SimpleDateFormat> dateFormatHolder = new ThreadLocal<SimpleDateFormat>() {
#Override
protected SimpleDateFormat initialValue() {
SimpleDateFormat dateFormat = new SimpleDateFormat("yyyy-MM-dd") {
UUID id = UUID.randomUUID();
#Override
public String toString() {
return id.toString();
};
};
System.out.println("Creating SimpleDateFormat instance " + dateFormat +" for Thread : " + Thread.currentThread().getName());
return dateFormat;
}
};
/*
* Every time there is a call for DateFormat, ThreadLocal will return calling
* Thread's copy of SimpleDateFormat
*/
public static DateFormat getDateFormatter() {
return dateFormatHolder.get();
}
public static void cleanup() {
dateFormatHolder.remove();
}
}
Now wrapper.getDateFormatter() will call threadlocal.get() and that will check the currentThread.threadLocalMap contains this (threadlocal) instance.
If yes return the value (SimpleDateFormat) for corresponding threadlocal instance
else add the map with this threadlocal instance, initialValue().
Herewith thread safety achieved on this mutable class; by each thread is working with its own mutable instance but with same ThreadLocal instance. Means All the thread will share the same ThreadLocal instance as key, but different SimpleDateFormat instance as value.
https://github.com/skanagavelu/yt.tech/blob/master/src/ThreadLocalTest.java
when?
When an object is not thread-safe, instead of synchronization which hampers the scalability, give one object to every thread and keep it thread scope, which is ThreadLocal. One of most often used but not thread-safe objects are database Connection and JMSConnection.
How ?
One example is Spring framework uses ThreadLocal heavily for managing transactions behind the scenes by keeping these connection objects in ThreadLocal variables. At high level, when a transaction is started it gets the connection ( and disables the auto commit ) and keeps it in ThreadLocal. on further db calls it uses same connection to communicate with db. At the end, it takes the connection from ThreadLocal and commits ( or rollback ) the transaction and releases the connection.
I think log4j also uses ThreadLocal for maintaining MDC.
ThreadLocal is useful, when you want to have some state that should not be shared amongst different threads, but it should be accessible from each thread during its whole lifetime.
As an example, imagine a web application, where each request is served by a different thread. Imagine that for each request you need a piece of data multiple times, which is quite expensive to compute. However, that data might have changed for each incoming request, which means that you can't use a plain cache. A simple, quick solution to this problem would be to have a ThreadLocal variable holding access to this data, so that you have to calculate it only once for each request. Of course, this problem can also be solved without the use of ThreadLocal, but I devised it for illustration purposes.
That said, have in mind that ThreadLocals are essentially a form of global state. As a result, it has many other implications and should be used only after considering all the other possible solutions.
There are 3 scenarios for using a class helper like SimpleDateFormat in multithread code, which best one is use ThreadLocal
Scenarios
1- Using like share object by the help of lock or synchronization mechanism which makes the app slow
Thread pool Scenarios
2- Using as a local object inside a method
In thread pool, in this scenario, if we have 4 thread each one has 1000 task time then we have
4000 SimpleDateFormat object created and waiting for GC to erase them
3- Using ThreadLocal
In thread pool, if we have 4 thread and we gave to each thread one SimpleDateFormat instance
so we have 4 threads, 4 objects of SimpleDateFormat.
There is no need of lock mechanism and object creation and destruction. (Good time complexity and space complexity)
https://www.youtube.com/watch?v=sjMe9aecW_A
Nothing really new here, but I discovered today that ThreadLocal is very useful when using Bean Validation in a web application. Validation messages are localized, but by default use Locale.getDefault(). You can configure the Validator with a different MessageInterpolator, but there's no way to specify the Locale when you call validate. So you could create a static ThreadLocal<Locale> (or better yet, a general container with other things you might need to be ThreadLocal and then have your custom MessageInterpolator pick the Locale from that. Next step is to write a ServletFilter which uses a session value or request.getLocale() to pick the locale and store it in your ThreadLocal reference.
As was mentioned by #unknown (google), it's usage is to define a global variable in which the value referenced can be unique in each thread. It's usages typically entails storing some sort of contextual information that is linked to the current thread of execution.
We use it in a Java EE environment to pass user identity to classes that are not Java EE aware (don't have access to HttpSession, or the EJB SessionContext). This way the code, which makes usage of identity for security based operations, can access the identity from anywhere, without having to explicitly pass it in every method call.
The request/response cycle of operations in most Java EE calls makes this type of usage easy since it gives well defined entry and exit points to set and unset the ThreadLocal.
Thread-local variables are often used to prevent sharing in designs based on
mutable Singletons or global variables.
It can be used in scenarios like making seperate JDBC connection for each thread when you are not using a Connection Pool.
private static ThreadLocal<Connection> connectionHolder
= new ThreadLocal<Connection>() {
public Connection initialValue() {
return DriverManager.getConnection(DB_URL);
}
};
public static Connection getConnection() {
return connectionHolder.get();
}
When you call getConnection, it will return a connection associated with that thread.The same can be done with other properties like dateformat, transaction context that you don't want to share between threads.
You could have also used local variables for the same, but these resource usually take up time in creation,so you don't want to create them again and again whenever you perform some business logic with them. However, ThreadLocal values are stored in the thread object itself and as soon as the thread is garbage collected, these values are gone too.
This link explains use of ThreadLocal very well.
Caching, sometime you have to calculate the same value lots of time so by storing the last set of inputs to a method and the result you can speed the code up. By using Thread Local Storage you avoid having to think about locking.
ThreadLocal is a specially provisioned functionality by JVM to provide an isolated storage space for threads only. like the value of instance scoped variable are bound to a given instance of a class only. each object has its only values and they can not see each other value. so is the concept of ThreadLocal variables, they are local to the thread in the sense of object instances other thread except for the one which created it, can not see it. See Here
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.IntStream;
public class ThreadId {
private static final AtomicInteger nextId = new AtomicInteger(1000);
// Thread local variable containing each thread's ID
private static final ThreadLocal<Integer> threadId = ThreadLocal.withInitial(() -> nextId.getAndIncrement());
// Returns the current thread's unique ID, assigning it if necessary
public static int get() {
return threadId.get();
}
public static void main(String[] args) {
new Thread(() -> IntStream.range(1, 3).forEach(i -> {
System.out.println(Thread.currentThread().getName() + " >> " + new ThreadId().get());
})).start();
new Thread(() -> IntStream.range(1, 3).forEach(i -> {
System.out.println(Thread.currentThread().getName() + " >> " + new ThreadId().get());
})).start();
new Thread(() -> IntStream.range(1, 3).forEach(i -> {
System.out.println(Thread.currentThread().getName() + " >> " + new ThreadId().get());
})).start();
}
}
The ThreadLocal class in Java enables you to create variables that can only be read and written by the same thread. Thus, even if two threads are executing the same code, and the code has a reference to a ThreadLocal variable, then the two threads cannot see each other's ThreadLocal variables.
Read more
[For Reference]ThreadLocal cannot solve update problems of shared object. It is recommended to use a staticThreadLocal object which is shared by all operations in the same thread.
[Mandatory]remove() method must be implemented by ThreadLocal variables, especially when using thread pools in which threads are often reused. Otherwise, it may affect subsequent business logic and cause unexpected problems such as memory leak.
Threadlocal provides a very easy way to achieve objects reusability with zero cost.
I had a situation where multiple threads were creating an image of mutable cache, on each update notification.
I used a Threadlocal on each thread, and then each thread would just need to reset old image and then update it again from the cache on each update notification.
Usual reusable objects from object pools have thread safety cost associated with them, while this approach has none.
Try this small example, to get a feel for ThreadLocal variable:
public class Book implements Runnable {
private static final ThreadLocal<List<String>> WORDS = ThreadLocal.withInitial(ArrayList::new);
private final String bookName; // It is also the thread's name
private final List<String> words;
public Book(String bookName, List<String> words) {
this.bookName = bookName;
this.words = Collections.unmodifiableList(words);
}
public void run() {
WORDS.get().addAll(words);
System.out.printf("Result %s: '%s'.%n", bookName, String.join(", ", WORDS.get()));
}
public static void main(String[] args) {
Thread t1 = new Thread(new Book("BookA", Arrays.asList("wordA1", "wordA2", "wordA3")));
Thread t2 = new Thread(new Book("BookB", Arrays.asList("wordB1", "wordB2")));
t1.start();
t2.start();
}
}
Console output, if thread BookA is done first:
Result BookA: 'wordA1, wordA2, wordA3'.
Result BookB: 'wordB1, wordB2'.
Console output, if thread BookB is done first:
Result BookB: 'wordB1, wordB2'.
Result BookA: 'wordA1, wordA2, wordA3'.
1st Use case - Per thread context which gives thread safety as well as performance
Real-time example in SpringFramework classes -
LocaleContextHolder
TransactionContextHolder
RequestContextHolder
DateTimeContextHolder
2nd Use case - When we don't want to share something among threads and at the same time don't want to use synchronize/lock due to performance cost
example - SimpleDateFormat to create the custom format for dates
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
/**
* #author - GreenLearner(https://www.youtube.com/c/greenlearner)
*/
public class ThreadLocalDemo1 {
SimpleDateFormat sdf = new SimpleDateFormat("dd-mm-yyyy");//not thread safe
ThreadLocal<SimpleDateFormat> tdl1 = ThreadLocal.withInitial(() -> new SimpleDateFormat("yyyy-dd-mm"));
public static void main(String[] args) {
ThreadLocalDemo1 d1 = new ThreadLocalDemo1();
ExecutorService es = Executors.newFixedThreadPool(10);
for(int i=0; i<100; i++) {
es.submit(() -> System.out.println(d1.getDate(new Date())));
}
es.shutdown();
}
String getDate(Date date){
// String s = tsdf.get().format(date);
String s1 = tdl1.get().format(date);
return s1;
}
}
Usage Tips
Use local variables if possible. This way we can avoid using ThreadLocal
Delegate the functionality to frameworks as and when possible
If using ThreadLocal and setting the state into it then make sure to clean it after using otherwise it can become the major reason for OutOfMemoryError
This question already has an answer here:
Closed 10 years ago.
Possible Duplicate:
What is the best way to increase number of locks in java
Suppose I want to lock based on an integer id value. In this case, there's a function that pulls a value from a cache and does a fairly expensive retrieve/store into the cache if the value isn't there.
The existing code isn't synchronized and could potentially trigger multiple retrieve/store operations:
//psuedocode
public Page getPage (Integer id){
Page p = cache.get(id);
if (p==null)
{
p=getFromDataBase(id);
cache.store(p);
}
}
What I'd like to do is synchronize the retrieve on the id, e.g.
if (p==null)
{
synchronized (id)
{
..retrieve, store
}
}
Unfortunately this won't work because 2 separate calls can have the same Integer id value but a different Integer object, so they won't share the lock, and no synchronization will happen.
Is there a simple way of insuring that you have the same Integer instance? For example, will this work:
syncrhonized (Integer.valueOf(id.intValue())){
The javadoc for Integer.valueOf() seems to imply that you're likely to get the same instance, but that doesn't look like a guarantee:
Returns a Integer instance
representing the specified int value.
If a new Integer instance is not
required, this method should generally
be used in preference to the
constructor Integer(int), as this
method is likely to yield
significantly better space and time
performance by caching frequently
requested values.
So, any suggestions on how to get an Integer instance that's guaranteed to be the same, other than the more elaborate solutions like keeping a WeakHashMap of Lock objects keyed to the int? (nothing wrong with that, it just seems like there must be an obvious one-liner than I'm missing).
You really don't want to synchronize on an Integer, since you don't have control over what instances are the same and what instances are different. Java just doesn't provide such a facility (unless you're using Integers in a small range) that is dependable across different JVMs. If you really must synchronize on an Integer, then you need to keep a Map or Set of Integer so you can guarantee that you're getting the exact instance you want.
Better would be to create a new object, perhaps stored in a HashMap that is keyed by the Integer, to synchronize on. Something like this:
public Page getPage(Integer id) {
Page p = cache.get(id);
if (p == null) {
synchronized (getCacheSyncObject(id)) {
p = getFromDataBase(id);
cache.store(p);
}
}
}
private ConcurrentMap<Integer, Integer> locks = new ConcurrentHashMap<Integer, Integer>();
private Object getCacheSyncObject(final Integer id) {
locks.putIfAbsent(id, id);
return locks.get(id);
}
To explain this code, it uses ConcurrentMap, which allows use of putIfAbsent. You could do this:
locks.putIfAbsent(id, new Object());
but then you incur the (small) cost of creating an Object for each access. To avoid that, I just save the Integer itself in the Map. What does this achieve? Why is this any different from just using the Integer itself?
When you do a get() from a Map, the keys are compared with equals() (or at least the method used is the equivalent of using equals()). Two different Integer instances of the same value will be equal to each other. Thus, you can pass any number of different Integer instances of "new Integer(5)" as the parameter to getCacheSyncObject and you will always get back only the very first instance that was passed in that contained that value.
There are reasons why you may not want to synchronize on Integer ... you can get into deadlocks if multiple threads are synchronizing on Integer objects and are thus unwittingly using the same locks when they want to use different locks. You can fix this risk by using the
locks.putIfAbsent(id, new Object());
version and thus incurring a (very) small cost to each access to the cache. Doing this, you guarantee that this class will be doing its synchronization on an object that no other class will be synchronizing on. Always a Good Thing.
Use a thread-safe map, such as ConcurrentHashMap. This will allow you to manipulate a map safely, but use a different lock to do the real computation. In this way you can have multiple computations running simultaneous with a single map.
Use ConcurrentMap.putIfAbsent, but instead of placing the actual value, use a Future with computationally-light construction instead. Possibly the FutureTask implementation. Run the computation and then get the result, which will thread-safely block until done.
Integer.valueOf() only returns cached instances for a limited range. You haven't specified your range, but in general, this won't work.
However, I would strongly recommend you not take this approach, even if your values are in the correct range. Since these cached Integer instances are available to any code, you can't fully control the synchronization, which could lead to a deadlock. This is the same problem people have trying to lock on the result of String.intern().
The best lock is a private variable. Since only your code can reference it, you can guarantee that no deadlocks will occur.
By the way, using a WeakHashMap won't work either. If the instance serving as the key is unreferenced, it will be garbage collected. And if it is strongly referenced, you could use it directly.
Using synchronized on an Integer sounds really wrong by design.
If you need to synchronize each item individually only during retrieve/store you can create a Set and store there the currently locked items. In another words,
// this contains only those IDs that are currently locked, that is, this
// will contain only very few IDs most of the time
Set<Integer> activeIds = ...
Object retrieve(Integer id) {
// acquire "lock" on item #id
synchronized(activeIds) {
while(activeIds.contains(id)) {
try {
activeIds.wait();
} catch(InterruptedExcption e){...}
}
activeIds.add(id);
}
try {
// do the retrieve here...
return value;
} finally {
// release lock on item #id
synchronized(activeIds) {
activeIds.remove(id);
activeIds.notifyAll();
}
}
}
The same goes to the store.
The bottom line is: there is no single line of code that solves this problem exactly the way you need.
How about a ConcurrentHashMap with the Integer objects as keys?
You could have a look at this code for creating a mutex from an ID. The code was written for String IDs, but could easily be edited for Integer objects.
As you can see from the variety of answers, there are various ways to skin this cat:
Goetz et al's approach of keeping a cache of FutureTasks works quite well in situations like this where you're "caching something anyway" so don't mind building up a map of FutureTask objects (and if you did mind the map growing, at least it's easy to make pruning it concurrent)
As a general answer to "how to lock on ID", the approach outlined by Antonio has the advantage that it's obvious when the map of locks is added to/removed from.
You may need to watch out for a potential issue with Antonio's implementation, namely that the notifyAll() will wake up threads waiting on all IDs when one of them becomes available, which may not scale very well under high contention. In principle, I think you can fix that by having a Condition object for each currently locked ID, which is then the thing that you await/signal. Of course, if in practice there's rarely more than one ID being waited on at any given time, then this isn't an issue.
Steve,
your proposed code has a bunch of problems with synchronization. (Antonio's does as well).
To summarize:
You need to cache an expensive
object.
You need to make sure that while one thread is doing the retrieval, another thread does not also attempt to retrieve the same object.
That for n-threads all attempting to get the object only 1 object is ever retrieved and returned.
That for threads requesting different objects that they do not contend with each other.
pseudo code to make this happen (using a ConcurrentHashMap as the cache):
ConcurrentMap<Integer, java.util.concurrent.Future<Page>> cache = new ConcurrentHashMap<Integer, java.util.concurrent.Future<Page>>;
public Page getPage(Integer id) {
Future<Page> myFuture = new Future<Page>();
cache.putIfAbsent(id, myFuture);
Future<Page> actualFuture = cache.get(id);
if ( actualFuture == myFuture ) {
// I am the first w00t!
Page page = getFromDataBase(id);
myFuture.set(page);
}
return actualFuture.get();
}
Note:
java.util.concurrent.Future is an interface
java.util.concurrent.Future does not actually have a set() but look at the existing classes that implement Future to understand how to implement your own Future (Or use FutureTask)
Pushing the actual retrieval to a worker thread will almost certainly be a good idea.
See section 5.6 in Java Concurrency in Practice: "Building an efficient, scalable, result cache". It deals with the exact issue you are trying to solve. In particular, check out the memoizer pattern.
(source: umd.edu)