Does any one have an example how to do this? Are they handled by the garbage collector? I'm using Tomcat 6.
The javadoc says this:
"Each thread holds an implicit reference to its copy of a thread-local variable as long as the thread is alive and the ThreadLocal instance is accessible; after a thread goes away, all of its copies of thread-local instances are subject to garbage collection (unless other references to these copies exist).
If your application or (if you are talking about request threads) container uses a thread pool that means that threads don't die. If necessary, you would need to deal with the thread locals yourself. The only clean way to do this is to call the ThreadLocal.remove() method.
There are two reasons you might want to clean up thread locals for threads in a thread pool:
to prevent memory (or hypothetically resource) leaks, or
to prevent accidental leakage of information from one request to another via thread locals.
Thread local memory leaks should not normally be a major issue with bounded thread pools since any thread locals are likely to get overwritten eventually; i.e. when the thread is reused. However, if you make the mistake of creating a new ThreadLocal instances over and over again (instead of using a static variable to hold a singleton instance), the thread local values won't get overwritten, and will accumulate in each thread's threadlocals map. This could result in a serious leak.
Assuming that you are talking about thread locals that are created / used during a webapp's processing of an HTTP request, then one way to avoid the thread local leaks is to register a ServletRequestListener with your webapp's ServletContext and implement the listener's requestDestroyed method to cleanup the thread locals for the current thread.
Note that in this context you also need to consider the possibility of information leaking from one request to another.
Here is some code to clean all thread local variables from the current thread when you do not have a reference to the actual thread local variable. You can also generalize it to cleanup thread local variables for other threads:
private void cleanThreadLocals() {
try {
// Get a reference to the thread locals table of the current thread
Thread thread = Thread.currentThread();
Field threadLocalsField = Thread.class.getDeclaredField("threadLocals");
threadLocalsField.setAccessible(true);
Object threadLocalTable = threadLocalsField.get(thread);
// Get a reference to the array holding the thread local variables inside the
// ThreadLocalMap of the current thread
Class threadLocalMapClass = Class.forName("java.lang.ThreadLocal$ThreadLocalMap");
Field tableField = threadLocalMapClass.getDeclaredField("table");
tableField.setAccessible(true);
Object table = tableField.get(threadLocalTable);
// The key to the ThreadLocalMap is a WeakReference object. The referent field of this object
// is a reference to the actual ThreadLocal variable
Field referentField = Reference.class.getDeclaredField("referent");
referentField.setAccessible(true);
for (int i=0; i < Array.getLength(table); i++) {
// Each entry in the table array of ThreadLocalMap is an Entry object
// representing the thread local reference and its value
Object entry = Array.get(table, i);
if (entry != null) {
// Get a reference to the thread local object and remove it from the table
ThreadLocal threadLocal = (ThreadLocal)referentField.get(entry);
threadLocal.remove();
}
}
} catch(Exception e) {
// We will tolerate an exception here and just log it
throw new IllegalStateException(e);
}
}
There is no way to cleanup ThreadLocal values except from within the thread that put them in there in the first place (or when the thread is garbage collected - not the case with worker threads). This means you should take care to clean up your ThreadLocal's when a servlet request is finished (or before transferring AsyncContext to another thread in Servlet 3), because after that point you may never get a chance to enter that specific worker thread, and hence, will leak memory in situations when your web app is undeployed while the server is not restarted.
A good place to do such cleanup is ServletRequestListener.requestDestroyed().
If you use Spring, all the necessary wiring is already in place, you can simply put stuff in your request scope without worrying about cleaning them up (that happens automatically):
RequestContextHolder.getRequestAttributes().setAttribute("myAttr", myAttr, RequestAttributes.SCOPE_REQUEST);
. . .
RequestContextHolder.getRequestAttributes().getAttribute("myAttr", RequestAttributes.SCOPE_REQUEST);
Reading again the Javadoc documentation carefully:
'Each thread holds an implicit reference to its copy of a thread-local variable as long as the thread is alive and the ThreadLocal instance is accessible; after a thread goes away, all of its copies of thread-local instances are subject to garbage collection (unless other references to these copies exist).
'
There is no need to clean anything, there is an 'AND' condition for the leak to survive. So even in a web container where thread survive to the application,
as long as the webapp class is unloaded ( only beeing reference in a static class loaded in the parent class loader would prevent this and this has nothing to do with ThreadLocal but general issues with shared jars with static data ) then the second leg of the AND condition is not met anymore so the thread local copy is eligible for garbage collection.
Thread local can't be the cause of memory leaks, as far the implementation meets the documentation.
I would like to contribute my answer to this question even though it's old. I had been plagued by the same problem (gson threadlocal not getting removed from the request thread), and had even gotten comfortable restarting the server anytime it ran out of memory (which sucks big time!!).
In the context of a java web app that is set to dev mode (in that the server is set to bounce every time it senses a change in the code, and possibly also running in debug mode), I quickly learned that threadlocals can be awesome and sometime be a pain. I was using a threadlocal Invocation for every request. Inside the Invocation. I'd sometimes also use gson to generate my response. I would wrap the Invocation inside a 'try' block in the filter, and destroy it inside a 'finally' block.
What I observed (I have not metrics to back this up for now) is that if I made changes to several files and the server was constantly bouncing in between my changes, I'd get impatient and restart the server (tomcat to be precise) from the IDE. Most likely than not, I'd end up with an 'Out of memory' exception.
How I got around this was to include a ServletRequestListener implementation in my app, and my problem vanished. I think what was happening is that in the middle of a request, if the server would bounce several times, my threadlocals were not getting cleared up (gson included) so I'd get this warning about the threadlocals and two or three warning later, the server would crash. With the ServletResponseListener explicitly closing my threadlocals, the gson problem vanished.
I hope this makes sense and gives you an idea of how to overcome threadlocal issues. Always close them around their point of usage. In the ServletRequestListener, test each threadlocal wrapper, and if it still has a valid reference to some object, destroy it at that point.
I should also point out that make it a habit to wrap a threadlocal as a static variable inside a class. That way you can be guaranteed that by destroying it in the ServeltRequestListener, you won't have to worry about other instances of the same class hanging around.
#lyaffe's answer is the best possible for Java 6. There are a few issues that this answer resolves using what is available in Java 8.
#lyaffe's answer was written for Java 6 before MethodHandle became available. It suffers from performance penalties due to reflection. If used as below, MethodHandle provides zero overhead access to fields and methods.
#lyaffe's answer also goes through the ThreadLocalMap.table explicitly and is prone to bugs. There is a method ThreadLocalMap.expungeStaleEntries() now available that does the same thing.
The code below has 3 initialization methods to minimize the cost of invoking expungeStaleEntries().
private static final MethodHandle s_getThreadLocals = initThreadLocals();
private static final MethodHandle s_expungeStaleEntries = initExpungeStaleEntries();
private static final ThreadLocal<Object> s_threadLocals = ThreadLocal.withInitial(() -> getThreadLocals());
public static void expungeThreadLocalMap()
{
Object threadLocals;
threadLocals = s_threadLocals.get();
try
{
s_expungeStaleEntries.invoke(threadLocals);
}
catch (Throwable e)
{
throw new IllegalStateException(e);
}
}
private static Object getThreadLocals()
{
ThreadLocal<Object> local;
Object result;
Thread thread;
local = new ThreadLocal<>();
local.set(local); // Force ThreadLocal to initialize Thread.threadLocals
thread = Thread.currentThread();
try
{
result = s_getThreadLocals.invoke(thread);
}
catch (Throwable e)
{
throw new IllegalStateException(e);
}
return(result);
}
private static MethodHandle initThreadLocals()
{
MethodHandle result;
Field field;
try
{
field = Thread.class.getDeclaredField("threadLocals");
field.setAccessible(true);
result = MethodHandles.
lookup().
unreflectGetter(field);
result = Preconditions.verifyNotNull(result, "result is null");
}
catch (NoSuchFieldException | SecurityException | IllegalAccessException e)
{
throw new ExceptionInInitializerError(e);
}
return(result);
}
private static MethodHandle initExpungeStaleEntries()
{
MethodHandle result;
Class<?> clazz;
Method method;
Object threadLocals;
threadLocals = getThreadLocals();
clazz = threadLocals.getClass();
try
{
method = clazz.getDeclaredMethod("expungeStaleEntries");
method.setAccessible(true);
result = MethodHandles.
lookup().
unreflect(method);
}
catch (NoSuchMethodException | SecurityException | IllegalAccessException e)
{
throw new ExceptionInInitializerError(e);
}
return(result);
}
The JVM would automatically clean-up all the reference-less objects that are within the ThreadLocal object.
Another way to clean up those objects (say for example, these objects could be all the thread unsafe objects that exist around) is to put them inside some Object Holder class, which basically holds it and you can override the finalize method to clean the object that reside within it. Again it depends on the Garbage Collector and its policies, when it would invoke the finalize method.
Here is a code sample:
public class MyObjectHolder {
private MyObject myObject;
public MyObjectHolder(MyObject myObj) {
myObject = myObj;
}
public MyObject getMyObject() {
return myObject;
}
protected void finalize() throws Throwable {
myObject.cleanItUp();
}
}
public class SomeOtherClass {
static ThreadLocal<MyObjectHolder> threadLocal = new ThreadLocal<MyObjectHolder>();
.
.
.
}
final ThreadLocal<T> old = backend;
// try to clean by reflect
try {
// BGN copy from apache ThreadUtils#getAllThreads
ThreadGroup systemGroup = Thread.currentThread().getThreadGroup();
while (systemGroup.getParent() != null) {
systemGroup = systemGroup.getParent();
}
int count = systemGroup.activeCount();
Thread[] threads;
do {
threads = new Thread[count + (count / 2) + 1]; //slightly grow the array size
count = systemGroup.enumerate(threads, true);
//return value of enumerate() must be strictly less than the array size according to javadoc
} while (count >= threads.length);
// END
// remove by reflect
final Field threadLocalsField = Thread.class.getDeclaredField("threadLocals");
threadLocalsField.setAccessible(true);
Class<?> threadLocalMapClass = Class.forName("java.lang.ThreadLocal$ThreadLocalMap");
Method removeMethod = threadLocalMapClass.getDeclaredMethod("remove", ThreadLocal.class);
removeMethod.setAccessible(true);
for (int i = 0; i < count; i++) {
final Object threadLocalMap = threadLocalsField.get(threads[i]);
if (threadLocalMap != null) {
removeMethod.invoke(threadLocalMap, old);
}
}
}
catch (Exception e) {
throw new ThreadLocalAttention(e);
}
Related
Goal: To know, as I fork off a thread, which processor it's going to land on. Is that possible? Regardless of whether the underlying approach is valid, is there a good answer to that narrow question? Thanks.
(Right now I need to make a copy of one of our classes for each thread, write to it in that thread and merge them all later. Using a synchronized approach is not possible because my Java expert boss thinks it's a bad idea, and after a lot of discussion I agree. If I knew which processor each thread would land on, I would only need to make as many copies of that class as there are processors.)
We use Apache Spark to get our jobs spread across a cluster, but in our application is makes sense to run one big executor and then do some multi-threading of our own out on each machine in the cluster.
I could save a lot of deep copying if I could know which processor a thread is being sent to, is that possible? I threw in our code but it's probably more of a conceptual question:
When I get down to the "do task" part of compute(), can I know which processor it's running on?
public class TholdExecutor extends RecursiveTask<TholdDropEvaluation> {
final static Logger logger = LoggerFactory.getLogger(TholdExecutor.class);
private List<TholdDropResult> partitionOfN = new ArrayList<>();
private int coreCount;
private int desiredPartitionSize; // will be updated by whatever is passed into the constructor per-chromosome
private TholdDropEvaluation localDropEvaluation; // this DropEvaluation
private TholdDropResult mSubI_DR;
public TholdExecutor(List<TholdDropResult> subsetOfN, int cores, int partSize, TholdDropEvaluation passedDropEvaluation, TholdDropResult mDrCopy) {
partitionOfN = subsetOfN;
coreCount = cores;
desiredPartitionSize = partSize;
// the TholdDropEvaluation needs to be a copy for each thread? It can't be the same one passed to threads ... so ...
TholdDropEvaluation localDropEvaluation = makeDECopy(passedDropEvaluation); // THIS NEEDS TO BE A DEEP COPY OF THE DROP EVAL!!! NOT THE ORIGINAL!!
// we never modify the TholdDropResult that is passed in, we just need to read it all on the same JVM/worker, so
mSubI_DR = mDrCopy; // this is purely a reference and can point to the passed in value (by reference, right?)
}
// this makes a deep copy of the TholdDropEvaluation for each thread, we copy the SharingRun's startIndex and endIndex only,
// as LEG events will be calculated during the subsequent dropComparison. The constructor for TholdDropEvaluation must set
// LEG events to zero.
private void makeDECopy(TholdDropEvaluation passedDropEvaluation) {
TholdDropEvaluation tholdDropEvaluation = new TholdDropEvaluation();
// iterate through the SharingRuns in the SharingRunList from the TholdDropEval that was passed in
for (SharingRun sr : passedDropEvaluation.getSharingRunList()) {
SharingRun ourSharingRun = new SharingRun();
ourSharingRun.startIndex = sr.startIndex;
ourSharingRun.endIndex = sr.endIndex;
tholdDropEvaluation.addSharingRun(ourSharingRun);
}
return tholdDropEvaluation
}
#Override
protected TholdDropEvaluation compute() {
int simsToDo = partitionOfN.size();
UUID tag = UUID.randomUUID();
long computeStartTime = System.nanoTime();
if (simsToDo <= desiredPartitionSize) {
logger.debug("IN MULTI-THREAD compute() --- UUID {}:Evaluating partitionOfN sublist length", tag, simsToDo);
// job within size limit, do the task and return the completed TholdDropEvaluation
// iterate through each TholdDropResult in the sub-partition and do the dropComparison to the refernce mSubI_DR,
// writing to the copy of the DropEval in tholdDropEvaluation
for (TholdDropResult currentResult : partitionOfN) {
mSubI_DR.dropComparison(currentResult, localDropEvaluation);
}
} else {
// job too large, subdivide and call this recursively
int half = simsToDo / 2;
logger.info("Splitting UUID = {}, half is {} and simsToDo is {}", tag, half, simsToDo );
TholdExecutor nextExec = new TholdExecutor(partitionOfN.subList(0, half), coreCount, desiredPartitionSize, tholdDropEvaluation, mSubI_DR);
TholdExecutor futureExec = new TholdExecutor(partitionOfN.subList(half, simsToDo), coreCount, desiredPartitionSize, tholdDropEvaluation, mSubI_DR);
nextExec.fork();
TholdDropEvaluation futureEval = futureExec.compute();
TholdDropEvaluation nextEval = nextExec.join();
tholdDropEvaluation.merge(futureEval);
tholdDropEvaluation.merge(nextEval);
}
logger.info("{} Compute time is {} ns",tag, System.nanoTime() - computeStartTime);
// NOTE: this was inside the else block in Rob's example, but don't we want it outside the block so it's returned
// whether
return tholdDropEvaluation;
}
}
Even if you could figure out where a thread would run initially there's no reason to assume it would live on that processor/core for the rest of its life. In all probability for any task big enough to be worth the cost of spawning a thread it won't, so you'd need to control where it ran completely to offer that level of assurance.
As far as I know there's no standard mechanism for controlling mappings from threads to processor cores inside Java. Typically that's known as "thread affinity" or "processor affinity". On Windows and Linux for example you can control that using:
Windows: SetThreadAffinityMask
Linux: sched_setaffinity or pthread_setaffinity_np
so in theory you could write some C and JNI code that allowed you to abstract this enough on the Java hosts you cared about to make it work.
That feels like the wrong solution to the real problem you seem to be facing, because you end up withdrawing options from the OS scheduler, which potentially doesn't allow it to make the smartest scheduling decisions causing total runtime to increase. Unless you're pushing an unusual workload and modelling/querying processor information/topology down to the level of NUMA and shared caches it ought to do a better job of figuring out where to run threads for most workloads than you could. Your JVM typically runs a large number of additional threads besides just the ones you explicitly create from after main() gets called. Additionally I wouldn't like to promise anything about what the JVM you run today (or even tomorrow) might decide to do on its own about thread affinity.
Having said that it seems like the underlying problem is that you want to have one instance of an object per thread. Typically that's much easier than predicting where a thread will run and then manually figuring out a mapping between N processors and M threads at any point in time. Usually you'd use "thread local storage" (TLS) to solve this problem.
Most languages provide this concept in one form or another. In Java this is provided via the ThreadLocal class. There's an example in the linked document given:
public class ThreadId {
// Atomic integer containing the next thread ID to be assigned
private static final AtomicInteger nextId = new AtomicInteger(0);
// Thread local variable containing each thread's ID
private static final ThreadLocal<Integer> threadId =
new ThreadLocal<Integer>() {
#Override protected Integer initialValue() {
return nextId.getAndIncrement();
}
};
// Returns the current thread's unique ID, assigning it if necessary
public static int get() {
return threadId.get();
}
}
Essentially there are two things you care about:
When you call get() it returns the value (Object) belonging to the current thread
If you call get in a thread which currently has nothing it will call initialValue() you implement, which allows you to construct or obtain a new object.
So in your scenario you'd probably want to deep copy the initial version of some local state from a read-only global version.
One final point of note: if your goal is to divide and conquer; do some work on lots of threads and then merge all their results to one answer the merging part is often known as a reduction. In that case you might be looking for MapReduce which is probably the most well known form of parallelism using reductions.
I'm using a WeakHashMap concurrently. I want to achieve fine-grained locking based on an Integer parameter; if thread A needs to modify a resource identified by Integer a and thread B does the same for resource identified by Integer b, then they need not to be synchronized. However, if there are two threads using the same resource, say thread C is also using a resource identified by Integer a, then of course thread A and C need to synchronize on the same Lock.
When there are no more threads that need the resource with ID X then the Lock in the Map for key=X can be removed. However, another thread can come in at that moment and try to use the lock in the Map for ID=X, so we need global synchronization when adding/removing the lock. (This would be the only place where every thread must synchronize, regardless of the Integer parameter) But, a thread cannot know when to remove the lock, because it doesn't know it is the last thread using the lock.
That's why I'm using a WeakHashMap: when the ID is no longer used, the key-value pair can be removed when the GC wants it.
To make sure I have a strong reference to the key of an already existing entry, and exactly that object reference that forms the key of the mapping, I need to iterate the keySet of the map:
synchronized (mrLocks){
// ... do other stuff
for (Integer entryKey : mrLocks.keySet()) {
if (entryKey.equals(id)) {
key = entryKey;
break;
}
}
// if key==null, no thread has a strong reference to the Integer
// key, so no thread is doing work on resource with id, so we can
// add a mapping (new Integer(id) => new ReentrantLock()) here as
// we are in a synchronized block. We must keep a strong reference
// to the newly created Integer, because otherwise the id-lock mapping
// may already have been removed by the time we start using it, and
// then other threads will not use the same Lock object for this
// resource
}
Now, can the content of the Map change while iterating it? I think not, because by calling mrLocks.keySet(), I created a strong reference to all keys for the scope of iteration. Is that correct?
As the API makes no assertions about the keySet(), I would recommend a cache usage like this:
private static Map<Integer, Reference<Integer>> lockCache = Collections.synchronizedMap(new WeakHashMap<>());
public static Object getLock(Integer i)
{
Integer monitor = null;
synchronized(lockCache) {
Reference<Integer> old = lockCache.get(i);
if (old != null)
monitor = old.get();
// if no monitor exists yet
if (monitor == null) {
/* clone i for avoiding strong references
to the map's key besides the Object returend
by this method.
*/
monitor = new Integer(i);
lockCache.remove(monitor); //just to be sure
lockCache.put(monitor, new WeakReference<>(monitor));
}
}
return monitor;
}
This way you are holding a reference to the monitor (the key itself) while locking on it and allow the GC to finalize it when not using it anymore.
Edit:
After the discussion about payload in the comments I thought about a solution with two caches:
private static Map<Integer, Reference<ReentrantLock>> lockCache = new WeakHashMap<>();
private static Map<ReentrantLock, Integer> keyCache = new WeakHashMap<>();
public static ReentrantLock getLock(Integer i)
{
ReentrantLock lock = null;
synchronized(lockCache) {
Reference<ReentrantLock> old = lockCache.get(i);
if (old != null)
lock = old.get();
// if no lock exists or got cleared from keyCache already but not from lockCache yet
if (lock == null || !keyCache.containsKey(lock)) {
/* clone i for avoiding strong references
to the map's key besides the Object returend
by this method.
*/
Integer cacheKey = new Integer(i);
lock = new ReentrantLock();
lockCache.remove(cacheKey); // just to be sure
lockCache.put(cacheKey, new WeakReference<>(lock));
keyCache.put(lock, cacheKey);
}
}
return lock;
}
As long as a strong reference to the payload (the lock) exists, the strong reference to the mapped integer in keyCache avoids the removal of the payload from the lockCache cache.
I'm trying to understand if there is a thread-safety issue inside of Struts2 ScopeInterceptor class (/org/apache/struts2/interceptor/ScopeInterceptor.java), here's the code in question:
private static Map locks = new IdentityHashMap();
static final void lock(Object o, ActionInvocation invocation) throws Exception {
synchronized (o) {
int count = 3;
Object previous = null;
while ((previous = locks.get(o)) != null) {
if (previous == invocation) {
return;
}
if (count-- <= 0) {
locks.remove(o);
o.notify();
throw new StrutsException("Deadlock in session lock");
}
o.wait(10000);
}
;
locks.put(o, invocation);
}
}
static final void unlock(Object o) {
synchronized (o) {
locks.remove(o);
o.notify();
}
}
I have a Websphere application showing 45 stalled threads, high cpu usage. 33 threads are stalled at "locks.remove(o)" inside of "unlock" method. The other 12 threads are stalled inside of "locks.get(o)" inside of "lock" method.
It seems to me that the usage of IdentityHashMap is thread-unsafe. Could simply wrapping IdentityHashMap with Collections.synchronizedMap() solve this problem?:
private static Map locks = Collections.synchronizedMap(new IdentityHashMap());
static final void lock(Object o, ActionInvocation invocation) throws Exception {
synchronized (o) {
int count = 3;
Object previous = null;
while ((previous = locks.get(o)) != null) {
if (previous == invocation) {
return;
}
if (count-- <= 0) {
locks.remove(o);
o.notify();
throw new StrutsException("Deadlock in session lock");
}
o.wait(10000);
}
;
locks.put(o, invocation);
}
}
static final void unlock(Object o) {
synchronized (o) {
locks.remove(o);
o.notify();
}
}
It seems to me that the author tried to "fix" IdentityHashMap's synchronization problem by using synchronized code blocks, however that doesn't protect against multiple threads if the Object "o" is a thread-specific object. And, since the code blocks within lock and unlock are separate, then IdentityHashMap will (and does!) get called simultaneously by more than one thread (as per our Java core evidence).
Is the Collections.synchronizedMap() wrapper the correct fix, or am I missing something?
I believe you are right, and there appears to be a thread safety issue. The developer is attempting to be thread safe by synchronizing on the object "o", but it looks like this object is actually the session object rather than something that is scoped more widely. I believe the change needs to be to synchronize on the locks object.
No real answers, but hopefully some useful information:
The IdentityHashMap documentation (http://docs.oracle.com/javase/7/docs/api/java/util/IdentityHashMap.html) states:
Note that this implementation is not synchronized. If multiple threads access an identity hash map concurrently, and at least one of
the threads modifies the map structurally, it must be synchronized
externally. (A structural modification is any operation that adds or
deletes one or more mappings; merely changing the value associated
with a key that an instance already contains is not a structural
modification.) This is typically accomplished by synchronizing on some
object that naturally encapsulates the map. If no such object exists,
the map should be "wrapped" using the Collections.synchronizedMap
method. This is best done at creation time, to prevent accidental
unsynchronized access to the map:
Map m = Collections.synchronizedMap(new IdentityHashMap(...));
So the Collections.synchronizedMap strategy sounds right, but this page (http://docs.oracle.com/javase/tutorial/essential/concurrency/locksync.html) makes me wonder about whether it will work since the methods are static:
Locks In Synchronized Methods
When a thread invokes a synchronized method, it automatically acquires
the intrinsic lock for that method's object and releases it when the
method returns. The lock release occurs even if the return was caused
by an uncaught exception.
You might wonder what happens when a static synchronized method is
invoked, since a static method is associated with a class, not an
object. In this case, the thread acquires the intrinsic lock for the
Class object associated with the class. Thus access to class's static
fields is controlled by a lock that's distinct from the lock for any
instance of the class.
Since those are static methods (even though the fields are not static), it's hard to tell if the Collections.synchronizedMap wrapper will really work to prevent a deadlock... My answer is that I have no answer!
Yes I think so.
If you accessing lock(Object o, ActionInvocation invocation) with different os you are modify the IdentityHashMap simultaneously with different monitors for the different threads. This makes it possible for different threads to call IdentityHashMap simultaneously.
This can be solved by synchronizing the IdentityHashMap.
I m wondering when do we need to use the threadlocal variable?, I have a code that runs multiple threads, each one read some files on S3, I wish to keep track of how many lines read out of the files altogether, here is my code:
final AtomicInteger logLineCounter = new AtomicInteger(0);
for(File f : files) {
calls.add(_exec.submit(new Callable<Void>() {
#Override
public Void call() throws Exception {
readNWrite(f, logLineCounter);
return null;
}
}));
}
for (Future<Void> f : calls) {
try {
f.get();
} catch (Exception e) {
//
}
}
LOGGER.info("Total number of lines: " + logLineCounter);
...
private void readNWrite(File f, AtomicInteger counter) {
Iterator<Activity> it = _dataReader.read(file);
int lineCnt = 0;
if (it != null && it.hasNext()) {
while(it.hasNext()) {
lineCnt++;
// write to temp file here
}
counter.getAndAdd(lineCnt);
}
}
my question is do I need to make the lineCnt in the readNWrite() method to be threadlocal?
No you don't need to use ThreadLocal here - your code looks perfectly fine:
lineCnt is a local variable which is therefore not shared across thread => it is thread safe
counter.getAndAdd(lineCnt); is an atomic and thread safe operation
If you are interested, there are several posts on SO about the use of ThreadLocal, such as this one.
lineCnt is already "thread local" since it's on the stack. Use ThreadLocal only when you need thread-local copies of instance member variables.
You dont need to make lineCnt a ThreadLocal explicitly. lineCnt is a local variable to the thread. It is not accessible by any other thread.
You can get more information about ThreadLocal here
ThreadLocal from javadoc
These variables differ from their normal counterparts in that each
thread that accesses one (via its get or set method) has its own,
independently initialized copy of the variable. ThreadLocal instances
are typically private static fields in classes that wish to associate
state with a thread
- A Thread has a Stack, Register, and Program Counter.
- lineCnt is already into the ThreadLocal.
- lineCnt is a personal copy of the instance variable lineCnt for this thread, and its not visible to any other thread .
The following method belongs to an object A that implements Runnable. It's called asynchronously by other method from the object A and by code inside the run method (so, it's called from other thread, with a period of 5 seconds).
Could I end up with file creation exceptions?
If i make the method synchronized... the lock is always acquired over the object A ?
The fact that one of the callers is at the run() method confuses me :S
Thanks for your inputs.
private void saveMap(ConcurrentMap<String, String> map) {
ObjectOutputStream obj = null;
try {
obj = new ObjectOutputStream(new FileOutputStream("map.txt"));
obj.writeObject(map);
} catch (IOException ex) {
Logger.getLogger(MessagesFileManager.class.getName()).log(Level.SEVERE, null, ex);
} finally {
try {
obj.close();
} catch (IOException ex) {
Logger.getLogger(MessagesFileManager.class.getName()).log(Level.SEVERE, null, ex);
}
}
notifyActionListeners();
}
Synchronized instance methods use the this object as the lock and prevent simultaneous execution of all synchronized instance methods (even other ones) from different threads.
To answer your question regarding requirements for synchronization, the answer is basically yes because you have multiple threads accessing the same method, so output may collide.
As a design comment, I would make your saveMap method static, because it doesn't access any fields (it's stateless), and it more strongly indicates that output to the file is not dependent on the instance, so it's more obvious that file output may collide with other instances.
Edited:
Here's the code for what I'm suggesting:
private static synchronized void saveMap(Map<String, String> map) {
...
}
FYI, static synchronized methods use the class object (ie MyClass.class), which is a singleton, as the lock object.
It's called asynchronously by other method from the object A and by code inside the run method (so, it's called from other thread, with a period of 5 seconds).
Given that saveMap is called from multiple threads, without synchronization you cannot guarantee that two threads won't try to write to the same file concurrently. This will cause an incorrectly-formatted file when it happens.
The simplest solution is to make the method synchronized.
private synchronized void saveMap(ConcurrentMap<String, String> map) { ... }
If the map is large enough, this may cause unresponsiveness in your program. Another option is to write to a temporary file (a new file each time it's called) and then use synchronization while swapping the new file over map.txt by renaming and deleting.
private void saveMap(ConcurrentMap<String, String> map) {
File file = ... original code to write to a temporary file ...
if (file != null) {
synchronized(this) {
... move file over map.txt ...
}
notifyActionListeners();
}
}
Keep in mind that swapping two files won't be an atomic operation. Any external program or thread from the same program may catch the short time that map.txt doesn't exist. I was unable to find an atomic file-swap method in Java, but maybe with some searching you will.