Thread safe Map of Queues
The above question has a design implementation VERY similar to my own. I understand why it's not thread safe, but the accepted "pattern" is throwing me for a loop. I don't quite get how to implement it or what it does in relationship to the question.
boolean send = false;
System.out.println("MailCenter Size Start: " + mailBox.keySet().size());
if (parsedInput.size() == 3) {
for (String s : writers.keySet()) {
if (!s.equals(parsedInput.get(1))) {
send = true;
}
}
if (send) {
mailMessage.add(noAnsiName + ": " + parsedInput.get(2));
mailBox.putIfAbsent(parsedInput.get(1), mailMessage);
System.out.println("Current mail message is: " + mailMessage.peek());
out.println("SERVER You have sent mail to " + parsedInput.get(1) + ".");
}
System.out.println("MailCenter Size Middle: " + mailBox.keySet().size());
} else {
int loop = 0;
for (Map.Entry entry : mailBox.entrySet()) {
System.out.println(entry.getKey() + ":\t" + entry.getValue());
System.out.println("*** LOOP STATUS *** " + loop);
loop++;
}
}
I don't quite get how to implement it or what it does in relationship
to the question.
The question shows that a ConcurrentHashMap is being used. However, the variable map is declared as a Map, so the putIfAbsent() method referenced by the accepted answer is not visible (this applies before Java 8, which added putIfAbsent() to the Map interface itself).
Map<String, ConcurrentLinkedQueue<String>> map = new ConcurrentHashMap<>();
So, in order for the answer to work, the above must be changed to declare map as a ConcurrentMap.
Now, that's a good step toward thread safety. However, just because we're using a concurrent map implementation doesn't mean a get() and subsequent put() on the map are an atomic unit of work. In other words, another thread can still change the state of the map after the current thread calls get(), but before it calls put().
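The answer recommends putIfAbsent() because it ensures atomicity: it performs the equivalent of a containsKey() check and a put() call as one indivisible operation, so no other thread can slip in between the two. If you then also use the return value correctly (it is the value already mapped to the key, or null if your value was stored), you will have solid concurrent behavior.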
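To make "use the return value correctly" concrete, here is a minimal sketch. The class name MailBoxes and the methods deliver() and pending() are made up for illustration and do not come from the question. The point is that the message must be added to whichever queue actually ended up in the map, not blindly to the one you just created.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ConcurrentMap;

// Hypothetical class for illustration; not from the original question.
class MailBoxes {
    private final ConcurrentMap<String, ConcurrentLinkedQueue<String>> map = new ConcurrentHashMap<>();

    // Adds a message to the recipient's queue, creating the queue atomically if needed.
    void deliver(String recipient, String message) {
        ConcurrentLinkedQueue<String> fresh = new ConcurrentLinkedQueue<>();
        // putIfAbsent returns the queue already mapped to the key, or null if ours was installed.
        ConcurrentLinkedQueue<String> existing = map.putIfAbsent(recipient, fresh);
        ConcurrentLinkedQueue<String> queue = (existing != null) ? existing : fresh;
        queue.add(message);
    }

    // Number of messages currently queued for the recipient (0 if there is no queue yet).
    int pending(String recipient) {
        ConcurrentLinkedQueue<String> queue = map.get(recipient);
        return queue == null ? 0 : queue.size();
    }

    public static void main(String[] args) {
        MailBoxes boxes = new MailBoxes();
        boxes.deliver("bob", "alice: hello");
        boxes.deliver("bob", "carol: lunch?");
        System.out.println(boxes.pending("bob")); // both messages land in the same queue
    }
}
```

On Java 8 or later, map.computeIfAbsent(recipient, k -> new ConcurrentLinkedQueue<>()).add(message) expresses the same idea in one line.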
Related
I have a collection of 20 items. I loop over the items and make API calls to get data; based on the data returned, I have to update the database. This requirement is simple, and I am able to accomplish it in plain Java.
Now, for performance, I am learning RxJava. I went through many articles on the internet and found that people recommend the async-http-client library for async HTTP calls. However, that library appears to be out of date, and its maintainer is planning to hand it over to someone else; the one provided in the RxJava library also seems to date from 2014. Since I am new to RxJava, can you please help me with the right approach?
I am currently getting all the data and converting it to an observable like this:
Observable<ENV> envs= Observable.fromIterable(allEnvs);
I would also like to know whether the above code is fine, or whether I should construct the observable as follows. This snippet is in Kotlin, and I will have to write it in Java.
val createObserver = Observable.create(ObservableOnSubscribe<String> { emitter ->
emitter.onNext("Hello World")
emitter.onComplete()
})
Kindly help me choose the best approach.
Imagine that the HTTP call is represented by the class below:
public class HttpCall implements Callable<String> {
private final int i;
private HttpCall(int i) {
this.i = i;
}
@Override
public String call() {
try {
Thread.sleep(2000);
} catch (InterruptedException e) {
e.printStackTrace();
}
return "Something for : " + i;
}
}
It waits 2 seconds and then returns a string (the HTTP call result).
To combine the items resulting from the different HTTP calls, we can use the merge operator. Before that, we need to turn each Callable into an Observable with the fromCallable operator.
void sequentially() {
List<Observable<String>> httpRequests = IntStream.range(0, 20)
.mapToObj(HttpCall::new)
.map(Observable::fromCallable)
.collect(Collectors.toList());
Observable.merge(httpRequests)
.timestamp(TimeUnit.SECONDS)
.subscribe(e -> System.out.println("Elapsed time : " + e.time() + " -- " + e.value() + ". Executed on thread : " + Thread.currentThread().getName()));
}
Because all the requests are executed on the same thread, the order is maintained :
Elapsed time : 1602122218 -- Something for : 0. Executed on thread : main
Elapsed time : 1602122220 -- Something for : 1. Executed on thread : main
Elapsed time : 1602122222 -- Something for : 2. Executed on thread : main
...
As you can see the items are separated by 2 sec.
To run each request on its own thread, we need to tell Rx to use a separate thread for each call. Easy-peasy: just switch to one of the provided schedulers. The IO scheduler is what we need here (as it's an IO operation).
void parallel() {
List<Observable<String>> httpRequests = IntStream.range(0, 20)
.mapToObj(HttpCall::new)
.map(httpCall -> Observable.fromCallable(httpCall)
.subscribeOn(Schedulers.io())
) // take a thread from the IO pool
.collect(Collectors.toList());
Observable.merge(httpRequests)
.timestamp(TimeUnit.SECONDS)
.subscribe(e -> System.out.println("Elapsed time : " + e.time() + " -- " + e.value() + ". Executed on thread : " + Thread.currentThread().getName()));
}
This time the order is not guaranteed, and the items are produced at almost the same time:
Elapsed time : 1602123707 -- Something for : 2. Executed on thread : RxCachedThreadScheduler-3
Elapsed time : 1602123707 -- Something for : 0. Executed on thread : RxCachedThreadScheduler-1
Elapsed time : 1602123707 -- Something for : 1. Executed on thread : RxCachedThreadScheduler-1
...
The code could be shortened to:
Observable.range(0, 20)
.map(HttpCall::new)
.flatMap(httpCall -> Observable.fromCallable(httpCall).subscribeOn(Schedulers.io()))
.timestamp(TimeUnit.SECONDS)
.subscribe(e -> System.out.println("Elapsed time : " + e.time() + " -- " + e.value() + ". Executed on thread : " + Thread.currentThread().getName()));
merge uses flatMap behind the scenes.
I have 4 threads: 2 of them update a ConcurrentHashMap and 2 of them read from it. The code is as follows:
private static ConcurrentHashMap<String, String> myHashMap = new ConcurrentHashMap<>();
private static final Object lock = new Object();
Thread 1 and Thread 2's run method (key and value are strings):
synchronized (lock) {
if (!myHashMap.containsKey(key)) {
myHashMap.put(key, value);
} else {
String value = myHashMap.get(key);
// do something with the value
myHashMap.put(key, value);
}
}
Thread 3 and Thread 4's run method does the printing:
for (Entry<String, String> entry : myHashMap.entrySet()) {
String key = entry.getKey();
String value = entry.getValue();
System.out.println("key, " + key + " value " + value);
}
Is there any issue with the above usage of ConcurrentHashMap?
Because when I read the Javadoc and search the web, I found the following claim:
This class is fully interoperable with Hashtable in programs that rely on its thread safety but not on its synchronization details. (Note - I understand the print thread result might not be the most recent result, but that is ok as long as the update thread does things correctly.)
Some websites also claim that the same Iterator cannot be used by two or more threads. So I am wondering whether the print code above uses the same Iterator in the two threads, and why the same Iterator cannot be used in two different threads.
As for the requirement, I want concurrent reads without blocking, which is why I chose ConcurrentHashMap.
Instead of the if/else block, you can use the putIfAbsent method of ConcurrentHashMap for the insert branch; for the update branch, atomic methods such as compute() or merge() cover the read-modify-write step. Second, you should not need external locking with a ConcurrentHashMap.
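As a rough sketch of what dropping the external lock could look like: the "do something with the value" step is not shown in the question, so string concatenation is used here purely as a placeholder, and the class name Updates and the method update() are hypothetical. ConcurrentHashMap.merge() (Java 8+) performs the insert-or-update atomically for a given key.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical class; appending stands in for "do something with the value".
class Updates {
    static final ConcurrentHashMap<String, String> myHashMap = new ConcurrentHashMap<>();

    // Inserts value if the key is absent, otherwise atomically replaces it with old + value.
    static void update(String key, String value) {
        myHashMap.merge(key, value, (oldValue, newValue) -> oldValue + newValue);
    }

    public static void main(String[] args) throws InterruptedException {
        Runnable writer = () -> {
            for (int i = 0; i < 1000; i++) {
                update("k", "x");
            }
        };
        Thread t1 = new Thread(writer);
        Thread t2 = new Thread(writer);
        t1.start();
        t2.start();
        // Reading while the writers run is safe: this for-each loop obtains its own
        // weakly consistent iterator, which never throws ConcurrentModificationException.
        for (Map.Entry<String, String> entry : myHashMap.entrySet()) {
            System.out.println("key " + entry.getKey() + " length " + entry.getValue().length());
        }
        t1.join();
        t2.join();
        System.out.println(myHashMap.get("k").length()); // 2000: no update was lost
    }
}
```

On the Iterator question: each for-each loop over entrySet() creates a fresh iterator, so two printing threads running the loop from the question do not share one. Sharing a single Iterator object between threads is a problem because the iterator itself keeps unsynchronized cursor state.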
Say SomeOtherService in one Verticle uses UserService in a different Verticle; the communication happens over the Event Bus.
To represent it:
class SomeOtherService {
final UserService userService = new UserService();
// Mutable state
final Map<String, Single<String>> cache = new HashMap<>(); // Not synchronized?
public Single<String> getUserSessionInfo(String id) {
// Seems it is not safe!
return cache.computeIfAbsent(id, _id -> {
log("could not find " + id + " in cache. Connecting to userService...");
return userService.getUserSessionInfo(id); // uses generated proxy to send msg to the event bus to call it
}
);
}
}
// Somewhere in another verticle/micro-service on another machine.
class UserService {
public Single<String> getUserSessionInfo(String id) {
return Single.fromCallable( () -> {
waitForOneSecond();
log("getUserSessionInfo for " + id);
if (id.equals("1"))
return "one";
if (id.equals("2"))
return "two";
else throw new Exception("could not"); // is it legal?
}
);
}
}
And the client code, where we subscribe and decide on the scheduler:
final Observable<String> obs = Observable.from(new String[] {"1", "1"});
// Emulating sequential call of 'getUserSessionInfo' to fork in separate scheduler A
obs.flatMap(id -> {
log("flatMap"); // on main thread
return someOtherService.getUserSessionInfo(id)
.subscribeOn(schedulerA) // Forking. will thread starvation happen? (since we have only 10 threads in the pool)
.toObservable();
}
).subscribe(
x -> log("next: " + x)
);
The question is: how good is the solution of using a HashMap for the cache (since it is the shared state here) via the computeIfAbsent method?
Even though we are using the Event Loop and the Event Bus, that would not save us from shared state and possible concurrency issues, assuming that a long operation (like getUserSessionInfo(id)) happens on a separate scheduler/thread?
Should I use ReplaySubject instead to implement caching? What are the best practices for Vert.x + RxJava?
It seems that as long as cache.computeIfAbsent runs on the Event Loop, it is safe because it is sequential?
Sorry, that's a lot of questions. I guess I can narrow it down to: what is the best practice for implementing a cache for service calls in Vert.x and RxJava?
The whole example is here:
I think I found my answer here: http://blog.danlew.net/2015/06/22/loading-data-from-multiple-sources-with-rxjava/
Observable<Data> source = Observable.concat(memory, diskWithCache, networkWithSave).first();
I save the result by calling map.put(..) explicitly instead of using computeIfAbsent, and as long as I'm on the event loop, it is safe to use an unsynchronized cache map.
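If the cache might also be reached from threads outside the event loop, one alternative is to let a ConcurrentHashMap do the synchronization. The sketch below is only an illustration under stated assumptions: it uses the JDK's CompletableFuture as a stand-in for Single, and the class name SessionCache, the fetch() method, and the remoteCalls counter are all made up and are not Vert.x or RxJava API.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in: CompletableFuture plays the role of Single, fetch() the remote call.
class SessionCache {
    final AtomicInteger remoteCalls = new AtomicInteger();
    private final ConcurrentHashMap<String, CompletableFuture<String>> cache = new ConcurrentHashMap<>();

    CompletableFuture<String> getUserSessionInfo(String id) {
        // On ConcurrentHashMap the mapping function runs at most once per absent key,
        // even when several threads ask for the same id at the same time.
        return cache.computeIfAbsent(id, key -> fetch(key));
    }

    // Returns immediately; the (simulated) remote work happens asynchronously.
    private CompletableFuture<String> fetch(String id) {
        remoteCalls.incrementAndGet();
        return CompletableFuture.supplyAsync(() -> "session-" + id);
    }

    public static void main(String[] args) {
        SessionCache cache = new SessionCache();
        System.out.println(cache.getUserSessionInfo("1").join());
        System.out.println(cache.getUserSessionInfo("1").join());
        System.out.println("remote calls: " + cache.remoteCalls.get());
    }
}
```

Note that this caches the future itself, so a failed call would stay cached as well; real code would probably want to evict failed entries.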
I'm working on a multithreaded program where each thread calculates the GCD for two numbers, stores the numbers and GCD into a TreeMap, and prints out the TreeMap after all the threads finish. What kind of method should I use to make sure that only one thread is storing the data at the same time, and how do I use the last thread to print the TreeMap when it is ready to print?
for (int i = 0; i < myList.size(); ++i) {
for (int j = i + 1; j < myList.size(); ++j) {
modulus1 = myList.get(i);
modulus2 = myList.get(j);
pool.execute(new ThreadProcessRunnable(modulus1, modulus2, myMap));
}
}
public void run() {
ThreadProcess process = null;
try {
// Only one thread should execute the following code
for (Map.Entry<BigInteger, ArrayList<BigInteger>> entry : myMap.entrySet()) {
System.out.println("key ->" + entry.getKey() + ", value->" + entry.getValue());
}
} catch (Exception e) {
System.err.println("Exception ERROR");
}
}
You must use a synchronized (myMap) {...} block in places where you need to guarantee single-thread access to the map.
As for printing the result by the last thread, you can use a boolean flag as a signal of completeness and check it every time. Don't forget to make it volatile to let each thread see its value changes.
UPD: Brian Goetz's "Java Concurrency in Practice" is strongly recommended reading.
// Only one thread should execute the following code
synchronized (myMap) {
for (Map.Entry<BigInteger, ArrayList<BigInteger>> entry : myMap.entrySet()) {
System.out.println("key ->" + entry.getKey() + ", value->" + entry.getValue());
}
}
You can use Collections.synchronizedMap to make it thread safe. And use thread.join to make sure printing is done only when all threads are dead.
Edit: Do the printing in Main thread. Just before printing, call join on all threads.
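A minimal sketch of that suggestion follows. Because the question submits its work to a pool rather than starting threads directly, shutdown() plus awaitTermination() is used here as the pool's counterpart of calling join on every thread. The class name GcdTable, the "a,b" string key, and the pool size of 4 are assumptions made only for this example.

```java
import java.math.BigInteger;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical class; the key is encoded as "a,b" purely for illustration.
class GcdTable {
    static Map<String, BigInteger> compute(List<BigInteger> numbers) {
        // The synchronized wrapper lets only one thread at a time execute put().
        Map<String, BigInteger> results = Collections.synchronizedMap(new TreeMap<String, BigInteger>());
        ExecutorService pool = Executors.newFixedThreadPool(4);
        for (int i = 0; i < numbers.size(); i++) {
            for (int j = i + 1; j < numbers.size(); j++) {
                BigInteger a = numbers.get(i);
                BigInteger b = numbers.get(j);
                pool.execute(() -> results.put(a + "," + b, a.gcd(b)));
            }
        }
        pool.shutdown(); // accept no new tasks; already submitted ones still run
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return results;
    }

    public static void main(String[] args) {
        List<BigInteger> numbers = Arrays.asList(
                BigInteger.valueOf(12), BigInteger.valueOf(18), BigInteger.valueOf(35));
        // All workers are done here, so the main thread can iterate without further locking.
        for (Map.Entry<String, BigInteger> entry : compute(numbers).entrySet()) {
            System.out.println("key -> " + entry.getKey() + ", value -> " + entry.getValue());
        }
    }
}
```

Printing from the main thread after the pool has terminated avoids having to work out which worker is "the last one".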
Let us say that one Thread is executing inside a Synchronized Function in Java and another Thread wants to access the same method but it will have to wait till the first Thread completes.
How can the second Thread know which Thread holds the lock on the object?
I would want to print the details of the first Thread and possibly from where the first Thread was initiated.
If you are using java.util.concurrent.locks.ReentrantLock then a subclass can call getOwner.
Alternatively you can use JMX. Iterate through threads to find the java.lang.management.ThreadInfo with appropriate getLockedMonitors() or getLockedSynchronizers().
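For the ReentrantLock route, a minimal sketch could look like this. getOwner() is a protected method of ReentrantLock, which is why a subclass is needed; the subclass name OwnerAwareLock and its owner() method are made up for this example.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical subclass; getOwner() is protected in ReentrantLock, so it is re-exposed here.
class OwnerAwareLock extends ReentrantLock {
    // Returns the thread currently holding the lock, or null if it is free.
    // The result is only a best-effort snapshot: the owner may change right after the call.
    Thread owner() {
        return getOwner();
    }

    public static void main(String[] args) {
        OwnerAwareLock lock = new OwnerAwareLock();
        lock.lock();
        try {
            System.out.println("held by: " + lock.owner().getName());
        } finally {
            lock.unlock();
        }
        System.out.println("owner after unlock: " + lock.owner()); // null
    }
}
```

A waiting thread could call owner() before blocking on lock() and log the holder's name, or its stack trace via Thread.getStackTrace(), for diagnostics.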
It is a little tricky, almost what Tom Hawtin wrote, but you must explicitly request the monitor info when getting the ThreadInfo in dumpAllThreads.
Something like:
Object lock = ...
ThreadMXBean mx = ManagementFactory.getThreadMXBean();
ThreadInfo[] allInfo = mx.dumpAllThreads(true, false);
for (ThreadInfo threadInfo : allInfo) {
MonitorInfo[] monitors = threadInfo.getLockedMonitors();
for (MonitorInfo monitorInfo : monitors) {
if (monitorInfo.getIdentityHashCode() == System.identityHashCode(lock)) {
StackTraceElement[] stackTrace = threadInfo.getStackTrace();
// use the information from threadInfo
}
}
}
Is this for diagnostic purposes, or is it for functionality you want to use as part of your application? If it's for diagnostics, then the various verbose logging solutions in the other answers here are probably enough to get you going. If you want to do this as part of functionality, then you really should use something more robust and flexible than the synchronized keyword, such as the ReentrantLock wizardry mentioned by @Tom.
I believe it is not possible to do that. However you can do something similar with some extra coding:
public void myFunction() {
System.out.println("" + Thread.currentThread() + " entering sync @ myFunction");
synchronized(this) {
System.out.println("" + Thread.currentThread() + " entered sync @ myFunction");
...
System.out.println("" + Thread.currentThread() + " leaving sync @ myFunction");
}
System.out.println("" + Thread.currentThread() + " left sync @ myFunction");
}