For learning purposes, I have tried to implement a thread-safe queue data structure with a producer/consumer chain. Also for learning purposes, I have not used the wait/notify mechanism:
SyncQueue :
package syncpc;
/**
* Created by Administrator on 01/07/2009.
*/
public class SyncQueue {
private int val = 0;
private boolean set = false;
boolean isSet() {
return set;
}
synchronized public void enqueue(int val) {
this.val = val;
set = true;
}
synchronized public int dequeue() {
set = false;
return val;
}
}
Consumer :
package syncpc;
/**
* Created by Administrator on 01/07/2009.
*/
public class Consumer implements Runnable {
SyncQueue queue;
public Consumer(SyncQueue queue, String name) {
this.queue = queue;
new Thread(this, name).start();
}
public void run() {
while(true) {
if(queue.isSet()) {
System.out.println(queue.dequeue());
}
}
}
}
Producer :
package syncpc;
import java.util.Random;
/**
* Created by Administrator on 01/07/2009.
*/
public class Producer implements Runnable {
SyncQueue queue;
public Producer(SyncQueue queue, String name) {
this.queue = queue;
new Thread(this, name).start();
}
public void run() {
Random r = new Random();
while(true) {
if(!queue.isSet()) {
queue.enqueue(r.nextInt() % 100);
}
}
}
}
Main :
import syncpc.*;
/**
* Created by Administrator on 27/07/2015.
*/
public class Program {
public static void main(String[] args) {
SyncQueue queue = new SyncQueue();
new Producer(queue, "PROCUDER");
new Consumer(queue, "CONSUMER");
}
}
The problem is that if the isSet method is not synchronized, I get output like this:
97,
55
and the program then keeps running without printing any more values. If isSet is synchronized, the program works correctly.
I don't understand why. There is no deadlock, and isSet only reads the set instance variable without modifying it, so there should be no race condition.
set needs to be volatile:
private volatile boolean set = false;
This ensures that all readers see the updated value once a write completes. Otherwise they may keep seeing a stale cached value. This is discussed in more detail in this article on concurrency, which also provides examples of different patterns that use volatile.
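As a rough illustration of the volatile fix, here is a minimal sketch of a one-slot hand-off between exactly one producer and one consumer. The names VolatileSlot and VolatileSlotDemo are made up for this example and are not from the question. Note one deliberate difference from the original dequeue: val is read before set is cleared, because without synchronized methods the producer could otherwise overwrite val first.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical one-slot queue; safe only for ONE producer and ONE consumer.
class VolatileSlot {
    private int val = 0;
    // volatile: a write by one thread is visible to later reads by other threads
    private volatile boolean set = false;

    boolean isSet() {
        return set;
    }

    void enqueue(int val) {
        this.val = val; // plain write, published by the volatile write below
        set = true;
    }

    int dequeue() {
        int result = val; // read the value before releasing the slot
        set = false;
        return result;
    }
}

class VolatileSlotDemo {
    // Hands the values 0..count-1 from a producer thread to the calling thread.
    static List<Integer> transfer(int count) {
        VolatileSlot slot = new VolatileSlot();
        Thread producer = new Thread(() -> {
            for (int i = 0; i < count; i++) {
                while (slot.isSet()) {
                    Thread.yield(); // busy-wait until the consumer empties the slot
                }
                slot.enqueue(i);
            }
        }, "PRODUCER");
        producer.setDaemon(true);
        producer.start();

        List<Integer> received = new ArrayList<>();
        while (received.size() < count) {
            if (slot.isSet()) {
                received.add(slot.dequeue());
            }
        }
        return received;
    }
}
```

With more than one consumer or producer this sketch is no longer safe, because checking isSet and then acting on it are two separate steps.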
Now the reason that your code works with synchronized is probably best explained with an example. synchronized methods can be written as follows (i.e., they are equivalent to the following representation):
public class SyncQueue {
private int val = 0;
private boolean set = false;
boolean isSet() {
synchronized(this) {
return set;
}
}
public void enqueue(int val) {
synchronized(this) {
this.val = val;
set = true;
}
}
public int dequeue() {
synchronized(this) {
set = false;
return val;
}
}
}
Here, the instance itself is used as a lock, which means that only one thread can hold that lock at a time. As a result, any thread will always get the updated value: only one thread can be writing the value, and a thread that wants to read set won't be able to execute isSet until the other thread releases the lock on this, at which point the value of set will have been updated.
If you want to understand concurrency in Java properly, you should really read Java Concurrency in Practice (I think there's a free PDF floating around somewhere as well). I'm still going through this book because there are still many things that I do not understand or am wrong about.
As matt forsythe commented, you will run into issues when you have multiple consumers. This is because they could both check isSet() and find that there is a value to dequeue, which means that they will both attempt to dequeue that same value. It comes down to the fact that what you really want is for the "check and dequeue if set" operation to be effectively atomic, but it is not so the way you have coded it. This is because the same thread that initially called isSet may not necessarily be the same thread that then calls dequeue. So the operation as a whole is not atomic which means that you would have to synchronize the entire operation.
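One way to make "check and dequeue if set" atomic is to move the check inside the synchronized method, so that the test and the update share a single lock acquisition. The sketch below is only an illustration; the class name AtomicSlot and the methods tryEnqueue and tryDequeue are hypothetical and do not appear in the question.

```java
import java.util.OptionalInt;

// Hypothetical variant of SyncQueue where the check and the action are one atomic step.
class AtomicSlot {
    private int val = 0;
    private boolean set = false;

    // Stores the value only if the slot is empty; returns false otherwise.
    synchronized boolean tryEnqueue(int val) {
        if (set) {
            return false;
        }
        this.val = val;
        set = true;
        return true;
    }

    // Removes and returns the value if one is present; empty otherwise.
    synchronized OptionalInt tryDequeue() {
        if (!set) {
            return OptionalInt.empty();
        }
        set = false;
        return OptionalInt.of(val);
    }
}
```

A consumer would then loop on tryDequeue() and print the value only when the result is present, so two consumers can never both take the same value.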
The problem you have is visibility (or rather, the lack of it).
Without any instructions to the contrary, the JVM will assume that the value assigned to a variable in one thread need not be visible to the other threads. It may be made visible some time later (when it's convenient to do so), or maybe not ever. The rules governing what should be made visible and when are defined by the Java Memory Model and they're summed up here. (They may be a bit dry and scary at first, but it's absolutely crucial to understand them.)
So even though the producer sets set to true, the consumer will continue to see it as false. How can you publish a new value?
Mark the field as volatile. This works well for primitive values like boolean, with references you have to be a bit more careful.
synchronized provides not just mutual exclusion but also guarantees that any values set in it will be visible to anyone entering a synchronized block that uses the same object. (This is why everything works if you declare the isSet() method synchronized.)
Use a thread-safe library class, like the Atomic* classes of java.util.concurrent.
In your case volatile is probably the best solution because you're only updating a boolean, so atomicity of the update is guaranteed by default.
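For comparison, the third option from the list above could look roughly like the following sketch, which uses java.util.concurrent.atomic.AtomicBoolean for the flag. The class name AtomicFlagDemo and the method workerSeesFlag are invented for this example.

```java
import java.util.concurrent.atomic.AtomicBoolean;

class AtomicFlagDemo {
    // Returns true if the worker thread observed the flag change and finished in time.
    static boolean workerSeesFlag() {
        AtomicBoolean set = new AtomicBoolean(false);
        Thread worker = new Thread(() -> {
            // get() has the same visibility guarantees as reading a volatile field
            while (!set.get()) {
                Thread.yield();
            }
        });
        worker.setDaemon(true);
        worker.start();

        set.set(true); // published to the worker like a volatile write

        try {
            worker.join(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !worker.isAlive();
    }
}
```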
As @matt forsythe pointed out, there is also a TOCTTOU issue with your code, because another thread can run between the call to isSet() and the call to enqueue()/dequeue().
When you get stuck on a threading issue, the first step is to make sure that both threads are actually running. (I know they are here, as there are no locks that could create a deadlock.)
For that, you could have added a print statement in the enqueue method as well. That would confirm that both the enqueue and dequeue threads are running.
The second step is to recognize that "set" is the shared resource, and to check whether its value toggles in the way the code needs in order to run as desired.
I think that with this kind of reasoning and enough logging, you can find the issues in the problem.
For a recent library I'm writing, I wrote a thread which loops indefinitely. In this loop, I start with a conditional statement checking a property on the threaded object. However it seems that whatever initial value the property has, will be what it returns even after being updated.
Unless I do some kind of interruption such as Thread.sleep or a print statement.
I'm not really sure how to ask the question unfortunately. Otherwise I would be looking in the Java documentation. I have boiled down the code to a minimal example that explains the problem in simple terms.
public class App {
public static void main(String[] args) {
App app = new App();
}
class Test implements Runnable {
public boolean flag = false;
public void run() {
while(true) {
// try {
// Thread.sleep(1);
// } catch (InterruptedException e) {}
if (this.flag) {
System.out.println("True");
}
}
}
}
public App() {
Test t = new Test();
Thread thread = new Thread(t);
System.out.println("Starting thread");
thread.start();
try {
Thread.sleep(1000);
} catch (InterruptedException e) {}
t.flag = true;
System.out.println("New flag value: " + t.flag);
}
}
Now, I would presume that after we change the value of the flag property on the running thread, we would immediately see masses of 'True' spitting out to the terminal. However, we don't.
If I un-comment the Thread.sleep lines inside the thread loop, the program works as expected and we see the many lines of 'True' being printed after we change the value in the App object. As an addition, any print method in place of the Thread.sleep also works, but some simple assignment code does not. I assume this is because it is pulled out as un-used code at compile time.
So, my question is really: Why do I have to use some kind of interruption to get the thread to check conditions correctly?
So, my question is really: Why do I have to use some kind of interruption to get the thread to check conditions correctly?
Well you don't have to. There are at least two ways to implement this particular example without using "interruption".
If you declare flag to be volatile, then it will work.
It will also work if you declare flag to be private, write synchronized getter and setter methods, and use those for all accesses.
public class App {
public static void main(String[] args) {
App app = new App();
}
class Test implements Runnable {
private boolean flag = false;
public synchronized boolean getFlag() {
return this.flag;
}
public synchronized void setFlag(boolean flag) {
this.flag = flag;
}
public void run() {
while(true) {
if (this.getFlag()) { // Must use the getter here too!
System.out.println("True");
}
}
}
}
public App() {
Test t = new Test();
Thread thread = new Thread(t);
System.out.println("Starting thread");
thread.start();
try {
Thread.sleep(1000);
} catch (InterruptedException e) {}
t.setFlag(true);
System.out.println("New flag value: " + t.getFlag());
}
}
But why do you need to do this?
Because unless you use either volatile or synchronized (and you use synchronized correctly), one thread is not guaranteed to see memory changes made by another thread.
In your example, the child thread does not see the up-to-date value of flag. (It is not that the conditions themselves are incorrect or "don't work". They are actually getting stale inputs. This is "garbage in, garbage out".)
The Java Language Specification sets out precisely the conditions under which one thread is guaranteed to see (previous) writes made by another thread. This part of the spec is called the Java Memory Model, and it is in JLS 17.4. There is an easier to understand explanation in Java Concurrency in Practice by Brian Goetz et al.
Note that the unexpected behavior could be due to the JIT deciding to keep the flag in a register. It could also be that the JIT compiler has decided it does not need to force memory cache write-through, etcetera. (The JIT compiler doesn't want to force write-through on every memory write to every field. That would be a major performance hit on multi-core systems ... which most modern machines are.)
The Java interruption mechanism is yet another way to deal with this. You don't need any additional synchronization because the interruption methods take care of that internally. In addition, interruption will work when the thread you are trying to interrupt is currently waiting or blocked on an interruptible operation; e.g. in an Object::wait call.
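To make the interruption approach concrete, here is a small sketch in which the worker loop polls its own interrupt status instead of a shared field. The class name InterruptDemo and the method stopByInterrupt are hypothetical names chosen for this example.

```java
class InterruptDemo {
    // Returns true if the worker noticed the interrupt and terminated in time.
    static boolean stopByInterrupt() {
        Thread worker = new Thread(() -> {
            // isInterrupted() reads this thread's interrupt status; no shared flag field is needed
            while (!Thread.currentThread().isInterrupted()) {
                Thread.yield();
            }
        });
        worker.setDaemon(true);
        worker.start();

        worker.interrupt(); // sets the worker's interrupt status

        try {
            worker.join(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !worker.isAlive();
    }
}
```

If the worker were blocked in Thread.sleep or Object.wait instead of spinning, the same interrupt() call would wake it with an InterruptedException.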
Because the variable is not modified in that thread, the JVM is free to effectively optimize the check away. To force an actual check, use the volatile keyword:
public volatile boolean flag = false;
I have a scenario where I have to maintain a Map which can be populated by multiple threads, each modifying their respective List (unique identifier/key being the thread name), and when the list size for a thread exceeds a fixed batch size, we have to persist the records to the database.
Aggregator class
private volatile ConcurrentHashMap<String, List<T>> instrumentMap = new ConcurrentHashMap<String, List<T>>();
private ReentrantLock lock ;
public void addAll(List<T> entityList, String threadName) {
try {
lock.lock();
List<T> instrumentList = instrumentMap.get(threadName);
if(instrumentList == null) {
instrumentList = new ArrayList<T>(batchSize);
instrumentMap.put(threadName, instrumentList);
}
if(instrumentList.size() >= batchSize -1){
instrumentList.addAll(entityList);
recordSaver.persist(instrumentList);
instrumentList.clear();
} else {
instrumentList.addAll(entityList);
}
} finally {
lock.unlock();
}
}
There is one more separate thread that runs every 2 minutes (using the same lock) to persist all the records in the Map (to make sure we have something persisted every 2 minutes and the map does not get too big):
if(//Some condition) {
Thread.sleep(//2 minutes);
aggregator.getLock().lock();
List<T> instrumentList = instrumentMap.values().stream().flatMap(x->x.stream()).collect(Collectors.toList());
if(instrumentList.size() > 0) {
saver.persist(instrumentList);
instrumentMap .values().parallelStream().forEach(x -> x.clear());
aggregator.getLock().unlock();
}
}
This solution works fine in almost every scenario that we tested, except that sometimes some of the records go missing, i.e. they are not persisted at all, although they were added to the Map correctly.
My questions are:
What is the problem with this code?
Is ConcurrentHashMap not the best solution here?
Does the List that is used with the ConcurrentHashMap have an issue?
Should I use the compute method of ConcurrentHashMap here (no need I think, as ReentrantLock is already doing the same job)?
The answer provided by @Slaw in the comments did the trick. We were letting the instrumentList instance escape in a non-synchronized way, i.e. operations were happening on the list without any synchronization. Passing a copy of the list to the downstream methods fixed it.
Following line of code is the one where this issue was happening
recordSaver.persist(instrumentList);
instrumentList.clear();
Here we are allowing the instrumentList instance to escape in a non-synchronized way: it is passed to another class (recordSaver.persist) to be processed there, but we also clear the list on the very next line (in the Aggregator class), and all of this happens without synchronization. The state of the list can't be predicted in the record saver... a really stupid mistake.
We fixed the issue by passing a cloned copy of instrumentList to the recordSaver.persist(...) method. This way, instrumentList.clear() has no effect on the list available in recordSaver for further operations.
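The copy-before-persist fix can be sketched as follows. Since the real recordSaver was never posted, KeepingSaver below is a purely hypothetical stand-in that simply holds on to the list it receives, and CopyBeforePersist and flush are likewise illustrative names.

```java
import java.util.ArrayList;
import java.util.List;

class CopyBeforePersist {
    // Hypothetical stand-in for recordSaver: it keeps a reference to the batch it is given,
    // as a saver that processes the batch later (or on another thread) would.
    static class KeepingSaver {
        List<String> lastBatch;

        void persist(List<String> batch) {
            lastBatch = batch;
        }
    }

    // Hands a snapshot of the buffer to the saver, then clears the buffer.
    static List<String> flush(List<String> buffer, KeepingSaver saver) {
        saver.persist(new ArrayList<>(buffer)); // the saver gets its own copy
        buffer.clear();                         // no longer affects the saver's copy
        return saver.lastBatch;
    }
}
```

Without the copy, the clear() call would empty the very list the saver is still holding.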
I see that you are using ConcurrentHashMap's parallelStream within a lock. I am not knowledgeable about Java 8+ stream support, but a quick search shows that:
ConcurrentHashMap is a complex data structure that has had concurrency bugs in the past
Parallel streams must abide by complex and poorly documented usage restrictions
You are modifying your data within a parallel stream
Based on that information (and my gut-driven concurrency bugs detector™), I would guess that removing the call to parallelStream might improve the robustness of your code. In addition, as mentioned by @Slaw, you should use an ordinary HashMap in place of ConcurrentHashMap if all instrumentMap usage is already guarded by the lock.
Of course, since you didn't post the code of recordSaver, it is possible that it has bugs too (and not necessarily concurrency-related ones). In particular, you should make sure that the code that reads records from persistent storage, which is the code you are using to detect the loss of records, is safe, correct, and properly synchronized with the rest of your system (preferably by using a robust, industry-standard SQL database).
It looks like this was an attempt at optimization where it was not needed. In that case, less is more and simpler is better. In the code below, only two concepts for concurrency are used: synchronized to ensure a shared list is properly updated and final to ensure all threads see the same value.
import java.util.ArrayList;
import java.util.List;
public class Aggregator<T> implements Runnable {
private final List<T> instruments = new ArrayList<>();
private final RecordSaver recordSaver;
private final int batchSize;
public Aggregator(RecordSaver recordSaver, int batchSize) {
super();
this.recordSaver = recordSaver;
this.batchSize = batchSize;
}
public synchronized void addAll(List<T> moreInstruments) {
instruments.addAll(moreInstruments);
if (instruments.size() >= batchSize) {
storeInstruments();
}
}
public synchronized void storeInstruments() {
if (instruments.size() > 0) {
// in case recordSaver works async
// recordSaver.persist(new ArrayList<T>(instruments));
// else just:
recordSaver.persist(instruments);
instruments.clear();
}
}
@Override
public void run() {
while (true) {
try { Thread.sleep(1L); } catch (Exception ignored) {
break;
}
storeInstruments();
}
}
class RecordSaver {
void persist(List<?> l) {}
}
}
I was given a question by my friend and was asked to explain why the program could get hung in infinite loop.
public class Test {
private static boolean flag;
private static int count;
private static class ReaderThread extends Thread {
public void run() {
while (!flag)
Thread.yield();
System.out.println(count);
}
}
public static void main(String[] args) {
new ReaderThread().start();
count = 1;
flag = true;
}
}
I was sure that it cannot happen. But it did actually happen one time (out of probably 50 times).
I am not able to explain this behavior. Is there any catch that I am missing?
From the book Java Concurrency in Practice (this example seems to be taken from the book itself):
When the reads and writes occur in different threads, there is no guarantee that the reading thread will see a value written by another thread on a timely basis, or even at all because threads might cache these values.
In order to ensure visibility of memory writes across threads, you must use synchronization or declare the variable volatile.
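Applied to the example from the question, a minimal sketch of the volatile fix might look like this. The class name VisibleFlag and the method readAfterFlag are invented here; the reader thread is also joined so that the value it saw can be returned. Because count is written before the volatile write to flag, a reader that sees flag as true is also guaranteed to see count as 1.

```java
class VisibleFlag {
    private static volatile boolean flag; // volatile makes the write visible to the reader
    private static int count;             // published through the volatile write to flag

    // Returns the value of count observed by the reader thread after it saw flag == true.
    static int readAfterFlag() {
        int[] seen = new int[1];
        Thread reader = new Thread(() -> {
            while (!flag) {
                Thread.yield();
            }
            seen[0] = count;
        });
        reader.start();

        count = 1;   // ordinary write, ordered before the volatile write below
        flag = true; // volatile write

        try {
            reader.join(); // join() makes the reader's write to seen[0] visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }
}
```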
I'm just a non-developer playing to be a developer, so my question may be extremely simple!
I'm just testing Java multi-threading stuff, this is not real code. I wonder how to make two member variables update at the same time in Java, in case we want them both in sync. As an example:
public class Testing
{
private Map<String, Boolean> itemToStatus = new ConcurrentHashMap<>();
private Set<String> items = ConcurrentHashMap.newKeySet();
public static void main(String[] args)
{
(new Testing()).start("ABC");
}
public void start(String name) {
if (name.equals("ABC")) {
itemToStatus.put(name, true);
items.add(name);
}
}
}
In that scenario (imagine multi-threaded, of course) I want to be able to guarantee that any reads of items and itemToStatus always return the same.
So, if the code is in the line itemToStatus.put(name, true), and other thread asks items.contains(name), it will return false. On the other hand, if that other thread asks itemToStatus.containsKey(name); it will return true. And I don't want that, I want them both to give the same value, if that makes sense?
How can I make those two changes atomic? Would this work?
if (name.equals("ABC")) {
synchronized(this) {
itemToStatus.put(name, true);
items.add(name);
}
}
Still, I don't see why that would work. I think that's the case where you need a lock or something?
Cheers!
Just synchronizing the writes won't work. You would also need to synchronize (on the same object) the read access to the items and itemToStatus collections. That way, no thread could be reading anything while another thread is in the process of updating the two collections. Note that synchronizing in this way means you don't need ConcurrentHashMap or its concurrent key set; plain old HashMap and HashSet will work because you're providing your own synchronization.
For example:
public void start(String name) {
if (name.equals("ABC")) {
synchronized (this) {
itemToStatus.put(name, true);
items.add(name);
}
}
}
public synchronized boolean containsItem(String name) {
return items.contains(name);
}
public synchronized boolean containsStatus(String name) {
return itemToStatus.containsKey(name);
}
That will guarantee that the value returned by containsItem would also have been returned by containsStatus if that call had been made instead. Of course, if you want the return values to be consistent over time (as in first calling containsItem() and then containsStatus()), you would need higher-level synchronization.
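One possible form of such higher-level synchronization is a single method that performs both lookups under one lock acquisition, so both answers describe the same moment in time. This is only a sketch; ItemRegistry, register and isRegistered are hypothetical names, and plain HashMap and HashSet are used because every access goes through synchronized methods.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class ItemRegistry {
    private final Map<String, Boolean> itemToStatus = new HashMap<>();
    private final Set<String> items = new HashSet<>();

    // Both collections are updated while holding the same lock.
    synchronized void register(String name) {
        itemToStatus.put(name, true);
        items.add(name);
    }

    // Both lookups happen under one lock acquisition, so no update can slip in between them.
    synchronized boolean isRegistered(String name) {
        return items.contains(name) && itemToStatus.containsKey(name);
    }
}
```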
The short answer is yes: by synchronizing the code block, as you did in your last code snippet, you made the class thread-safe because that code block is the only one that reads or modifies the status of the class (represented by the two instance variables).
The meaning of synchronized(this) is that you use the instance of the object (this) as a lock: when a thread enters that code block it acquires the lock, preventing other threads from entering the same code block until the thread releases the lock by exiting the block.
This is an implementation of readers writers, i.e. many readers can read but only one writer can write at any one time. Does this work as expected?
public class ReadersWriters extends Thread{
static int num_readers = 0;
static int writing = 0;
public void read_start() throws InterruptedException {
synchronized(this.getClass()) {
while(writing == 1) wait();
num_readers++;
}
}
public void read_end() {
synchronized(this.getClass()) {
if(--num_readers == 0) notifyAll();
}
}
public void write_start() throws InterruptedException{
synchronized(this.getClass()) {
while(num_readers > 0) wait();
writing = 1;
}
}
public void write_end() {
this.getClass().notifyAll();
}
}
Also is this implementation any different from declaring each method
public static synchronized read_start()
for example?
Thanks
No - you're implicitly calling this.wait(), despite not having synchronized on this, but instead on the class. Likewise you're calling this.notifyAll() in read_end. My suggestions:
Don't extend Thread - you're not specializing the thread at all.
Don't use static variables like this from instance members; it makes it look like there's state on a per-object basis, but actually there isn't. Personally I'd just use instance variables.
Don't use underscores in names - the conventional Java names would be numReaders, readEnd (or better, endRead) etc.
Don't synchronize on either this or the class if you can help it. Personally I prefer to have a private final Object variable to lock on (and wait etc). That way you know that only your code can be synchronizing on it, which makes it easier to reason about.
You never set writing to 0. Any reason for using an integer instead of a boolean in the first place?
Of course, it's better to use the classes in the framework for this if at all possible - but I'm hoping you're really writing this to understand threading better.
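Putting the suggestions above together, a corrected version might look roughly like the sketch below. The names ReadWriteGate and ReadWriteGateDemo are made up, and one change goes beyond the list: startWrite also waits while another writer is active, which the original code did not do. The demo class exists only to exercise the gate with several writer threads.

```java
class ReadWriteGate {
    private final Object lock = new Object(); // private lock: only this class can synchronize on it
    private int numReaders = 0;
    private boolean writing = false;

    void startRead() throws InterruptedException {
        synchronized (lock) {
            while (writing) {
                lock.wait(); // wait and notify are called on the object we synchronize on
            }
            numReaders++;
        }
    }

    void endRead() {
        synchronized (lock) {
            if (--numReaders == 0) {
                lock.notifyAll();
            }
        }
    }

    void startWrite() throws InterruptedException {
        synchronized (lock) {
            while (writing || numReaders > 0) {
                lock.wait();
            }
            writing = true;
        }
    }

    void endWrite() {
        synchronized (lock) {
            writing = false; // the original never reset this
            lock.notifyAll();
        }
    }
}

class ReadWriteGateDemo {
    private static int counter;

    // Several writers increment a plain int under the gate; returns the final total.
    static int run(int writers, int perWriter) {
        ReadWriteGate gate = new ReadWriteGate();
        counter = 0;
        Thread[] threads = new Thread[writers];
        for (int i = 0; i < writers; i++) {
            threads[i] = new Thread(() -> {
                try {
                    for (int j = 0; j < perWriter; j++) {
                        gate.startWrite();
                        try {
                            counter++;
                        } finally {
                            gate.endWrite();
                        }
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter;
    }
}
```

This sketch gives no fairness guarantee: a steady stream of readers can starve writers.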
You can achieve your goal in a much simpler way by using
java.util.concurrent.locks.ReentrantReadWriteLock
Just acquire java.util.concurrent.locks.ReentrantReadWriteLock.ReadLock when you start reading and java.util.concurrent.locks.ReentrantReadWriteLock.WriteLock when you start writing.
This class is intended for exactly that: allowing multiple readers that are mutually exclusive with a single writer.
Your particular implementation of read_start is not equivalent to simply declaring the method synchronized. As Jon Skeet noted, you need to call notify (and wait) on the object you are synchronizing on. You cannot use an unrelated object (here: the class) for this. And using the synchronized modifier on a method does not make the method implicitly call wait or anything like that.
By the way, there is an implementation of read/write locks that ships with the core JDK: java.util.concurrent.locks.ReentrantReadWriteLock. Using that one, your code might look like the following instead:
class Resource {
private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
private final Lock rlock = lock.readLock();
private final Lock wlock = lock.writeLock();
void read() { ... /* caller has to hold the read lock */ ... }
void write() { ... /* caller has to hold the write lock */ ... }
Lock readLock() { return rlock; }
Lock writeLock() { return wlock; }
}
Usage
final Resource r = ...;
r.readLock().lock();
try {
r.read();
} finally {
r.readLock().unlock();
}
and similar for the write operation.
The example code synchronizes on this.getClass(), which will return the same Class object for multiple instances of ReadersWriters in the same class loader. If multiple instances of ReadersWriters exist, even though you have multiple threads, there will be contention for this shared lock. This would be similar to adding the static keyword to a private lock field (as Jon Skeet suggested) and would likely lead to worse performance than synchronizing on this or a private lock object. More specifically, one thread which is reading would be blocking another thread which is writing, and this is likely undesirable.