Say SomeOtherService in one Verticle uses UserService in a different Verticle, and the communication happens over the Event Bus.
To represent it:
class SomeOtherService {
    final UserService userService = new UserService();

    // Mutable state
    final Map<String, Single<String>> cache = new HashMap<>(); // not synchronized?

    public Single<String> getUserSessionInfo(String id) {
        // It seems this is not safe!:
        return cache.computeIfAbsent(id, _id -> {
            log("could not find " + id + " in cache. Connecting to userService...");
            // uses the generated proxy to send a message over the event bus
            return userService.getUserSessionInfo(id);
        });
    }
}
// Somewhere in another verticle/micro-service on another machine.
class UserService {
    public Single<String> getUserSessionInfo(String id) {
        return Single.fromCallable(() -> {
            waitForOneSecond();
            log("getUserSessionInfo for " + id);
            if (id.equals("1"))
                return "one";
            if (id.equals("2"))
                return "two";
            else throw new Exception("could not"); // is it legal to throw here?
        });
    }
}
And the client code, where we subscribe and decide on the scheduler:
final Observable<String> obs = Observable.from(new String[] {"1", "1"});
// Emulating sequential calls of 'getUserSessionInfo', forked onto separate scheduler A
obs.flatMap(id -> {
    log("flatMap"); // on the main thread
    return someOtherService.getUserSessionInfo(id)
        .subscribeOn(schedulerA) // Forking. Will thread starvation happen? (since we only have 10 threads in the pool)
        .toObservable();
}).subscribe(
    x -> log("next: " + x)
);
The question is: how good a solution is it to use a HashMap for the cache (since it is the shared state here) via the computeIfAbsent method?
Even though we are using the Event Loop and Event Bus, that alone does not save us from shared state and possible concurrency issues, assuming that the long operation (like getUserSessionInfo(id)) happens on a separate scheduler/thread?
Should I use a ReplaySubject instead to implement caching? What are the best practices for Vert.x + RxJava?
It seems that as long as cache.computeIfAbsent runs on the event loop it is safe, because it is sequential?
Sorry... a lot of questions. I guess I can narrow it down to: what are the best practices for implementing a cache for service calls in Vert.x and RxJava?
The whole example is here:
I think I found my answer here: http://blog.danlew.net/2015/06/22/loading-data-from-multiple-sources-with-rxjava/ -
Observable<Data> source = Observable.concat(memory, diskWithCache, networkWithSave).first();
and when I save it with an explicit map.put(..) instead of using computeIfAbsent,
and as long as I stay on the event loop, I'm safe using an unsynchronized cache map.
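To make that concrete, here is a minimal sketch of what such a cache could look like, assuming every call goes through the verticle's event-loop thread. The names SomeOtherService, UserService and log come from the example above; the toObservable().cache().toSingle() chain is just one way to memoize the result (depending on the RxJava 1.x version, Single may or may not offer cache() directly), so treat this as an illustration rather than a definitive recipe:
class SomeOtherService {
    final UserService userService = new UserService();
    // Plain HashMap: safe only because every access happens on the same event-loop thread
    final Map<String, Single<String>> cache = new HashMap<>();

    public Single<String> getUserSessionInfo(String id) {
        Single<String> cached = cache.get(id);
        if (cached != null) {
            return cached;
        }
        // cache() replays the result to later subscribers instead of sending a new
        // request over the event bus for every subscription
        Single<String> fresh = userService.getUserSessionInfo(id)
                .toObservable()
                .cache()
                .toSingle();
        cache.put(id, fresh); // explicit put, still on the event loop
        return fresh;
    }
}
The crucial assumption is that getUserSessionInfo is only ever invoked from the event loop; if callers can reach it from other schedulers, the HashMap would have to be replaced (e.g. by a ConcurrentHashMap) or the call dispatched back onto the Vert.x context.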
Related
I have a Mono<String> object in reactor. How can I get a string value from this object?
I know I can do something like below :
For Mono<String> userName, I can do,
userName.map(System.out::println).
But this will directly print the value. I want to store this string value into another variable so that I can pass that variable around to some other functions. How can I extract this value?
To answer the question directly in its simplest form - you use Mono.block().
But you almost certainly shouldn't, as this blocks the thread, defeating the point of using reactor in the first place. You should, instead, call subscribe() and then provide a consumer. The consumer will be called asynchronously when the Mono emits a value, with that value as a parameter.
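As a minimal illustration of the two options (userName is the Mono from the question; whether blocking is acceptable depends entirely on the calling context):
// Blocking: only reasonable outside a reactive pipeline (tests, batch jobs, a plain main())
String value = userName.block();   // waits until the Mono emits, then returns the value
System.out.println("blocked for: " + value);

// Non-blocking: the consumer runs when the value is emitted, possibly on another thread
userName.subscribe(v -> System.out.println("received: " + v));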
There's nothing inherently stopping you just assigning the value to a field of course:
mono.subscribe(v -> this.value = v);
...but this has very limited usefulness in practice, since you won't know when that field will be initialised.
The more normal way is to either call the other methods in your subscriber all in one go:
mono.subscribe(v -> {
oneMethodThatNeedsTheValue(v);
anotherMethodThatNeedsTheValue(v);
});
...or to use Mono.cache() and pass it around:
class Test {
void useMonoVal(Mono<String> mono) {
mono.subscribe(s -> System.out.println("I need to see " + s));
}
void anotherMethod(Mono<String> mono) {
mono.subscribe(s -> System.out.println("I need to talk to " + s));
}
public static void main(String[] args) {
Mono<String> myMono = Mono.just("Bob").cache();
Test t = new Test();
t.useMonoVal(myMono);
t.anotherMethod(myMono);
}
}
(The cache() method ensures that the Mono is only evaluated once and then cached for all future subscribers, which is irrelevant when using the just() factory of course, but is just there for the sake of a complete example.)
To expand, the whole point of using a reactive paradigm (and therefore reactor, and by extension its Mono and Flux objects) is that it enables you to code in a non-blocking way, meaning that the current thread of execution isn't "held up" waiting for the mono to emit a value.
Side note: I'm not sure it's directly relevant to the question, but you can't do a.map(System.out::println); - you probably mean a.subscribe(System.out::println);.
You have to subscribe before you can get the string out of the Mono object.
Have a look at this for more detail: How to get String from Mono<String> in reactive Java.
public class MonoT {
    static String x = null;

    public static void main(String[] args) {
        Mono<String> username = Mono.just("Naren");
        // Mono.just() emits synchronously, so x is already set when it is printed below;
        // with an asynchronous Mono this would print null.
        username.subscribe(v -> {
            x = v;
        });
        System.out.println(x);
    }
}
I have a class in which I populate a map, liveSocketsByDatacenter, from a single background thread every 30 seconds inside the updateLiveSockets() method. I also have a method getNextSocket(), called by multiple reader threads, which uses the same map to find an available live socket.
public class SocketManager {
private static final Random random = new Random();
private final ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
private final AtomicReference<Map<Datacenters, List<SocketHolder>>> liveSocketsByDatacenter =
new AtomicReference<>(Collections.unmodifiableMap(new HashMap<>()));
private final ZContext ctx = new ZContext();
// Lazy Loaded Singleton Pattern
private static class Holder {
private static final SocketManager instance = new SocketManager();
}
public static SocketManager getInstance() {
return Holder.instance;
}
private SocketManager() {
connectToZMQSockets();
scheduler.scheduleAtFixedRate(new Runnable() {
public void run() {
updateLiveSockets();
}
}, 30, 30, TimeUnit.SECONDS);
}
// during startup, making a connection and populate once
private void connectToZMQSockets() {
Map<Datacenters, ImmutableList<String>> socketsByDatacenter = Utils.SERVERS;
// The map in which I put all the live sockets
Map<Datacenters, List<SocketHolder>> updatedLiveSocketsByDatacenter = new HashMap<>();
for (Map.Entry<Datacenters, ImmutableList<String>> entry : socketsByDatacenter.entrySet()) {
List<SocketHolder> addedColoSockets = connect(entry.getKey(), entry.getValue(), ZMQ.PUSH);
updatedLiveSocketsByDatacenter.put(entry.getKey(),
Collections.unmodifiableList(addedColoSockets));
}
// Update the map content
this.liveSocketsByDatacenter.set(Collections.unmodifiableMap(updatedLiveSocketsByDatacenter));
}
private List<SocketHolder> connect(Datacenters colo, List<String> addresses, int socketType) {
List<SocketHolder> socketList = new ArrayList<>();
for (String address : addresses) {
try {
Socket client = ctx.createSocket(socketType);
// Set random identity to make tracing easier
String identity = String.format("%04X-%04X", random.nextInt(), random.nextInt());
client.setIdentity(identity.getBytes(ZMQ.CHARSET));
client.setTCPKeepAlive(1);
client.setSendTimeOut(7);
client.setLinger(0);
client.connect(address);
SocketHolder zmq = new SocketHolder(client, ctx, address, true);
socketList.add(zmq);
} catch (Exception ex) {
// log error
}
}
return socketList;
}
// this method will be called by multiple threads to get the next live socket
// is there any concurrency or thread safety issue or race condition here?
public Optional<SocketHolder> getNextSocket() {
// For the sake of consistency make sure to use the same map instance
// in the whole implementation of my method by getting my entries
// from the local variable instead of the member variable
Map<Datacenters, List<SocketHolder>> liveSocketsByDatacenter =
this.liveSocketsByDatacenter.get();
Optional<SocketHolder> liveSocket = Optional.absent();
List<Datacenters> dcs = Datacenters.getOrderedDatacenters();
for (Datacenters dc : dcs) {
liveSocket = getLiveSocketX(liveSocketsByDatacenter.get(dc));
if (liveSocket.isPresent()) {
break;
}
}
return liveSocket;
}
// is there any concurrency or thread safety issue or race condition here?
private Optional<SocketHolder> getLiveSocketX(final List<SocketHolder> endpoints) {
if (!CollectionUtils.isEmpty(endpoints)) {
// The list of live sockets
List<SocketHolder> liveOnly = new ArrayList<>(endpoints.size());
for (SocketHolder obj : endpoints) {
if (obj.isLive()) {
liveOnly.add(obj);
}
}
if (!liveOnly.isEmpty()) {
// The list is not empty, so we shuffle it and return the first element
Collections.shuffle(liveOnly);
return Optional.of(liveOnly.get(0));
}
}
return Optional.absent();
}
// Added the synchronized modifier to prevent concurrent modification:
// to build the new map we first need to get the old one, so both must be
// done atomically to prevent consistency issues
private synchronized void updateLiveSockets() {
Map<Datacenters, ImmutableList<String>> socketsByDatacenter = Utils.SERVERS;
// Initialize my new map with the current map content
Map<Datacenters, List<SocketHolder>> liveSocketsByDatacenter =
new HashMap<>(this.liveSocketsByDatacenter.get());
for (Entry<Datacenters, ImmutableList<String>> entry : socketsByDatacenter.entrySet()) {
List<SocketHolder> liveSockets = liveSocketsByDatacenter.get(entry.getKey());
List<SocketHolder> liveUpdatedSockets = new ArrayList<>();
for (SocketHolder liveSocket : liveSockets) { // LINE A
Socket socket = liveSocket.getSocket();
String endpoint = liveSocket.getEndpoint();
Map<byte[], byte[]> holder = populateMap();
Message message = new Message(holder, Partition.COMMAND);
boolean status = SendToSocket.getInstance().execute(message.getAdd(), holder, socket);
boolean isLive = (status) ? true : false;
// is there any problem the way I am using `SocketHolder` class?
SocketHolder zmq = new SocketHolder(socket, liveSocket.getContext(), endpoint, isLive);
liveUpdatedSockets.add(zmq);
}
liveSocketsByDatacenter.put(entry.getKey(),
Collections.unmodifiableList(liveUpdatedSockets));
}
this.liveSocketsByDatacenter.set(Collections.unmodifiableMap(liveSocketsByDatacenter));
}
}
As you can see in my class:
From a single background thread that runs every 30 seconds, I populate the liveSocketsByDatacenter map with all the live sockets in the updateLiveSockets() method.
And then from multiple threads I call the getNextSocket() method, which uses the liveSocketsByDatacenter map to find an available live socket.
My code is working without any issues, and I wanted to see whether there is a better or more efficient way to write this. I would also like an opinion on thread-safety issues or race conditions; so far I haven't seen any, but I could be wrong.
I am mostly worried about the updateLiveSockets() and getLiveSocketX() methods. At LINE A I iterate over liveSockets, which is a List of SocketHolder, then make a new SocketHolder object and add it to another new list. Is this OK here?
Note: SocketHolder is an immutable class.
Neither code B nor code C is thread-safe.
Code B
When you are iterating over the endpoints list to copy it, nothing prevents another thread from modifying it, i.e. from adding and/or removing elements.
Code C
Assuming endpoints is not null, you are doing three calls to the list object: isEmpty, size, and get. There are several problems from a concurrency perspective:
Based on the argument's type, List<SocketHolder>, there is no guarantee that these methods make internal changes to the list visible to other threads (memory visibility), let alone protect against race conditions (if the list is modified while your thread executes one of these methods).
Let's suppose that the endpoints list provides the guarantee described just before - e.g. it has been wrapped with Collections.synchronizedList(). In this case, thread safety is still missing because between each of the calls to isEmpty, size, and get, the list can be modified while your thread executes the getLiveSocketX method. This could make your code use an outdated state of the list. For instance, you could use a size returned by endpoints.size() which is no longer valid because an element has been added to or removed from the list.
Edit - after code update
In the code you provided, it seems at first sight that:
You are indeed not concurrently modifying the endpoints list we were discussing before in the method getLiveSocketX, because the method updateLiveSockets creates a new list, liveUpdatedSockets, which you populate from the existing liveSockets.
You use an AtomicReference to keep a map from Datacenters to the lists of sockets of interest. The consequence of this AtomicReference is to force memory visibility from this map down to all the lists and their elements. This means that, by side effect, you are protected from memory inconsistencies between your "producer" and "consumer" threads (executing updateLiveSockets and getLiveSocket respectively). You are still exposed to race conditions, though - imagine updateLiveSockets and getLiveSocket running at the same time. Consider a socket S whose status has just switched from alive to closed. updateLiveSockets will see the status of socket S as non-alive and create a new SocketHolder accordingly. However, getLiveSocket, which is running at the exact same time, will see an outdated state of S - since it will still use the list of sockets which updateLiveSockets is re-creating.
The synchronized keyword used on the method updateLiveSockets does not provide you any guarantee here, because no other part of the code is also synchronized.
To summarize, I would say:
The code of getLiveSocketX as it is written is not inherently thread-safe;
However, the way you copy the lists prevents concurrent modifications, and you benefit from a side effect of the AtomicReference: the minimal memory-visibility guarantee one needs for getNextSocket to see consistent lists of sockets after they have been generated by another thread;
You are still exposed to the race condition described in (2), but this may be fine depending on the semantics you want the getLiveSocket and getNextSocket methods to have - you may accept that a socket returned by getLiveSocket is occasionally unavailable and add a retry mechanism.
All of that being said, I would thoroughly review and refactor the code to exhibit a more readable and explicit thread-safe producer/consumer pattern. Extra care should be taken with the use of the AtomicReference and the single synchronized method, which seem to me to be improperly used - although in the end the AtomicReference does help you, as discussed before.
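As a rough illustration of that last point, the producer could simply rebuild and publish a fresh immutable snapshot on every tick. Here is a sketch reusing the names from the question, with a hypothetical ping() health check standing in for the SendToSocket call; because the scheduler is single-threaded there is only one writer, so the synchronized modifier adds nothing, and the essential property is that a fully built snapshot is published in a single atomic set():
// Producer: runs on the single scheduler thread every 30 seconds
private void updateLiveSockets() {
    Map<Datacenters, List<SocketHolder>> snapshot = new HashMap<>();
    for (Map.Entry<Datacenters, ImmutableList<String>> entry : Utils.SERVERS.entrySet()) {
        List<SocketHolder> checked = new ArrayList<>();
        for (SocketHolder holder : this.liveSocketsByDatacenter.get()
                .getOrDefault(entry.getKey(), Collections.emptyList())) {
            boolean isLive = ping(holder); // hypothetical health check
            checked.add(new SocketHolder(holder.getSocket(), holder.getContext(),
                    holder.getEndpoint(), isLive));
        }
        snapshot.put(entry.getKey(), Collections.unmodifiableList(checked));
    }
    // Single atomic publication of an immutable snapshot; readers never see a partial update
    this.liveSocketsByDatacenter.set(Collections.unmodifiableMap(snapshot));
}
Readers keep calling liveSocketsByDatacenter.get() once per operation, exactly as getNextSocket already does, and work against that immutable snapshot.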
I am just learning and trying to apply CompletableFuture to my problem statement. I have a list of items I am iterating over.
Prop is a class with only two attributes, prop1 and prop2, plus their respective getters and setters.
List<Prop> result = new ArrayList<>();
for ( Item item : items ) {
item.load();
Prop temp = new Prop();
// once the item is loaded, get its properties
temp.setProp1(item.getProp1());
temp.setProp2(item.getProp2());
result.add(temp);
}
return result;
However, item.load() here is a blocking call. So, I was thinking to use CompletableFuture something like below -
for (Item item : items) {
    CompletableFuture<Prop> prop = CompletableFuture.supplyAsync(() -> {
        try {
            item.load();
            return item;
        } catch (Exception e) {
            logger.error("Error");
            return null;
        }
    }).thenApply(item1 -> {
        try {
            Prop temp = new Prop();
            // once the item is loaded, get its properties
            temp.setProp1(item1.getProp1());
            temp.setProp2(item1.getProp2());
            return temp;
        } catch (Exception e) {
            logger.error("Error");
            return null;
        }
    });
}
But I am not sure how I can wait for all the items to be loaded and then aggregate and return their result.
I may be completely wrong in the way I am using CompletableFuture since this is my first attempt. Please pardon any mistakes. Thanks in advance for any help.
There are two issues with your approach of using CompletableFuture.
First, you say item.load() is a blocking call, so the CompletableFuture’s default executor is not suitable for it, as it tries to achieve a level of parallelism matching the number of CPU cores. You could solve this by passing a different Executor to CompletableFuture’s asynchronous methods, but your load() method doesn’t return a value that your subsequent operations rely on. So the use of CompletableFuture complicates the design without a benefit.
You can perform the load() invocations asynchronously and wait for their completion just using an ExecutorService, followed by the loop as-is (without the already performed load() operation, of course):
ExecutorService es = Executors.newCachedThreadPool();
es.invokeAll(items.stream()
.map(i -> Executors.callable(i::load))
.collect(Collectors.toList()));
es.shutdown();
List<Prop> result = new ArrayList<>();
for(Item item : items) {
Prop temp = new Prop();
// once the item is loaded, get its properties
temp.setProp1(item.getProp1());
temp.setProp2(item.getProp2());
result.add(temp);
}
return result;
You can control the level of parallelism through the choice of the executor, e.g. you could use a Executors.newFixedThreadPool(numberOfThreads) instead of the unbounded thread pool.
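If you do want to stay with CompletableFuture (for example to keep the per-item mapping to Prop inside the asynchronous pipeline), a sketch of that variant could look like the following; the dedicated executor is the important part, and allOf(...).join() is what waits for every item. The names items, Item, Prop and numberOfThreads are taken from the discussion above; everything else is illustrative:
ExecutorService pool = Executors.newFixedThreadPool(numberOfThreads);

List<CompletableFuture<Prop>> futures = items.stream()
    .map(item -> CompletableFuture.supplyAsync(() -> {
        item.load(); // the blocking call runs on the dedicated pool, not the common pool
        Prop temp = new Prop();
        temp.setProp1(item.getProp1());
        temp.setProp2(item.getProp2());
        return temp;
    }, pool))
    .collect(Collectors.toList());

// Wait for all loads to finish, then collect the results
CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
List<Prop> result = futures.stream()
    .map(CompletableFuture::join) // no longer blocks, every future is already complete
    .collect(Collectors.toList());
pool.shutdown();
Note that, unlike the original try/catch, a failing load() surfaces here as a CompletionException from join(), so error handling still has to be decided.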
I'm trying to delete a batch of couchbase documents in rapid fashion according to some constraint (or update the document if the constraint isn't satisfied). Each deletion is dubbed a "parcel" according to my terminology.
When executing, I run into very strange behavior - the thread in charge of this task works as expected for a few iterations (at best). After this "grace period", Couchbase gets "stuck" and the Observable doesn't call any of its Subscriber's methods (onNext, onCompleted, onError) within the defined period of 30 seconds.
When the latch timeout occurs (see the implementation below), the method returns but the Observable keeps executing (I noticed this because it kept printing debug messages while I was stopped at a breakpoint outside the scope of this method).
I suspect Couchbase is stuck because, after a few seconds, many Observables are left in some kind of "ghost" state - alive and reporting to their Subscribers, which in turn have nothing to do because the method in which they were created has already finished - eventually leading to java.lang.OutOfMemoryError: GC overhead limit exceeded.
I don't know if what I claim here makes sense, but I can't think of another reason for this behavior.
How should I properly terminate an Observable upon timeout? Should I? Is there another way around this?
public List<InfoParcel> upsertParcels(final Collection<InfoParcel> parcels) {
final CountDownLatch latch = new CountDownLatch(parcels.size());
final List<JsonDocument> docRetList = new LinkedList<JsonDocument>();
Observable<JsonDocument> obs = Observable
.from(parcels)
.flatMap(parcel ->
Observable.defer(() ->
{
return bucket.async().get(parcel.key).firstOrDefault(null);
})
.map(doc -> {
// In-memory manipulation of the document
return updateDocs(doc, parcel);
})
.flatMap(doc -> {
boolean shouldDelete = ... // Decide by inner logic
if (shouldDelete) {
if (doc.cas() == 0) {
return Observable.just(doc);
}
return bucket.async().remove(doc);
}
return (doc.cas() == 0 ? bucket.async().insert(doc) : bucket.async().replace(doc));
})
);
obs.subscribe(new Subscriber<JsonDocument>() {
@Override
public void onNext(JsonDocument doc) {
docRetList.add(doc);
latch.countDown();
}
@Override
public void onCompleted() {
// Due to a bug in RxJava, onError() / retryWhen() does not intercept exceptions thrown from within the map/flatMap methods.
// Therefore, we need to recalculate the "conflicted" parcels and send them for update again.
while(latch.getCount() > 0) {
latch.countDown();
}
}
@Override
public void onError(Throwable e) {
// Same reason as above
while (latch.getCount() > 0) {
latch.countDown();
}
}
});
latch.await(30, TimeUnit.SECONDS);
// Recalculating remaining failed parcels and returning them for another cycle of this method (there's a loop outside)
}
I think this is indeed due to the fact that using a countdown latch doesn't signal the source that the flow of data processing should stop.
You could lean more on RxJava by using toList().timeout(30, TimeUnit.SECONDS).toBlocking().single() instead of collecting into an (unsynchronized and thus unsafe) external list and using the CountDownLatch.
This will block until a List of your documents is returned.
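Applied to the method above, a sketch of that approach could look like the following (obs is the pipeline already built in upsertParcels; the timeout value and the decision to fall back to an empty list on failure are illustrative assumptions, not part of the original answer):
List<JsonDocument> docRetList;
try {
    docRetList = obs
        .toList()                      // collect every emitted document into a single List
        .timeout(30, TimeUnit.SECONDS) // fail the stream if it does not complete in time
        .toBlocking()
        .single();                     // block the calling thread until the list (or an error) arrives
} catch (RuntimeException e) {
    // e.g. a TimeoutException wrapped by RxJava: decide here whether to retry or give up
    docRetList = Collections.emptyList();
}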
When you create your Couchbase environment in code, set computationPoolSize to something large. When the Couchbase client runs out of threads while using the async API, it just stops working and won't ever call the callback.
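For reference, a sketch of what that tuning might look like, assuming the Couchbase Java SDK 2.x environment builder (the pool size of 16 is purely illustrative):
CouchbaseEnvironment env = DefaultCouchbaseEnvironment.builder()
        .computationPoolSize(16) // illustrative value; size it to your async workload
        .build();
Cluster cluster = CouchbaseCluster.create(env, "127.0.0.1");
Bucket bucket = cluster.openBucket("myBucket");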
I have the following issue after trying to run my web application on a Linux server.
When running on Windows, everything works perfectly (simplified version: call the send() method, wait for the JMS response on a synchronizer object, send the response to the client).
When started on the Linux server (same JVM version - 1.7, bytecode compiled for Java 1.5), I get a response only for the first message, and the following error in the log for the rest of the messages:
synchronizer is null /*my_generated_message_id*/
It looks like the JMS message listener thread cannot see new entries (created in the JMS sender thread) in the synchronizers map, but I don't understand why...
Synchronizers Map definition:
public final Map<String, ReqRespSynchro<Map>> synchronizers
= Collections.synchronizedMap(new HashMap<String, ReqRespSynchro<Map>>());
Sending JMS request with active response awaiting:
@Override
public Map send(Map<String,Object> params) {
String msgIdent = ""/*my_generated_message_id*/;
Map response = null;
ReqRespSynchro<Map> synchronizer = synchronizers.get(msgIdent);
if (synchronizer == null) {
synchronizer = new ReqRespSynchro<Map>();
synchronizers.put(msgIdent , synchronizer);
}
synchronized(synchronizer) {
try {
sender.send(params);
} catch (Exception ex) {
log.error("send error", ex);
}
synchronizer.initSendSequence();
int iter = 1;
try {
while (!synchronizer.isSet() && iter > 0) {
synchronizer.wait(this.waitTimeout);
iter--;
}
} catch (Exception ex) {
log.error("send error 2", ex);
return null;
} finally {
response = (synchronizers.remove(msgIdent )).getRespObject();
}
}
return response;
}
JMS onMessage response processing (separate thread):
public void onMessage(Message msg) {
ObjectMessage om = (ObjectMessage) msg; // assuming an ObjectMessage arrives here
Map<String,Object> response = (Map<String,Object>) om.getObject();
String msgIdent = response.getMyMsgID(); ///*my_generated_message_id*/
ReqRespSynchro<Map> synchronizer = synchronizers.get(msgIdent);
if (synchronizer != null) {
synchronized (synchronizer) {
synchronizer.setRespObject(response);
synchronizer.notify();
}
} else {
log.error("synchronizer is null " + msgIdent);
}
}
Synchronizer class:
public class ReqRespSynchro<E> {
private E obj = null;
public synchronized void setRespObject(E obj) {
this.obj = obj;
}
public synchronized void initSendSequence() {
this.obj = null;
}
public synchronized boolean isSet() {
return this.obj != null;
}
public synchronized E getRespObject() {
E ret = null;
ret = obj;
return ret;
}
}
Your code bears the “check-then-act” anti-pattern.
ReqRespSynchro<Map> synchronizer = synchronizers.get(msgIdent);
if (synchronizer == null) {
synchronizer = new ReqRespSynchro<Map>();
synchronizers.put(msgIdent , synchronizer);
}
Here, you first check whether synchronizers contains a particular mapping, then you act by putting a new mapping when it is not present; but by the time you act, there is no guarantee that the condition you checked still holds.
While the map returned by Collections.synchronizedMap guarantees thread-safe put and get methods, it does not (and can’t) guarantee that there won’t be an update between subsequent invocations of get and put.
So if two threads execute the code above, it is possible that one thread puts a new value while the other has already performed the get operation but not yet the put operation, and will therefore proceed with putting a new value, overwriting the existing one. The two threads will then use different ReqRespSynchro instances, and other threads will get either of these from the map.
The correct use would be to synchronize the entire compound operation:
synchronized(synchronizers) {
ReqRespSynchro<Map> synchronizer = synchronizers.get(msgIdent);
if (synchronizer == null) {
synchronizer = new ReqRespSynchro<Map>();
synchronizers.put(msgIdent , synchronizer);
}
}
It’s a common mistake to think that by wrapping a map or collection into a synchronized one, every thread-safety issue is solved. You still have to think about access patterns and guard compound operations manually, so sometimes you’re better off using manual locking only and resisting the temptation of the easy-to-use synchronized wrappers.
But note that ConcurrentMap was added to the Java API to address this usage pattern (amongst others).
Change the map declaration to
public final ConcurrentHashMap<String, ReqRespSynchro<Map>> synchronizers
= new ConcurrentHashMap<>();
This map provides thread-safe put and get methods, but also methods that allow you to avoid the “check-then-act” anti-pattern for updates.
Using the ConcurrentMap under Java 8 is especially easy:
ReqRespSynchro<Map> synchronizer = synchronizers
.computeIfAbsent(msgIdent, key -> new ReqRespSynchro<>());
The invocation of computeIfAbsent will get the existing ReqRespSynchro<Map>, if there is one; otherwise the provided function will be executed to compute a value which will get stored, all with an atomicity guarantee. The places where you simply get an existing instance need no change.
The pre-Java 8 code is a bit more convoluted:
ReqRespSynchro<Map> synchronizer = synchronizers.get(msgIdent);
if (synchronizer == null) {
synchronizer = new ReqRespSynchro<>();
ReqRespSynchro<Map> concurrent = synchronizers.putIfAbsent(msgIdent , synchronizer);
if(concurrent!=null) synchronizer = concurrent;
}
Here, we can’t perform the operation atomically, but we are able to detect if a concurrent update happened in-between. In this case, putIfAbsent will not modify the map but return the value already contained in the map. So if we encounter such a situation, all we have to do is to use that existing one instead of the one we attempted to put.
This could happen if the waitTimeout in your send() method is too short. You only have one iteration set for the waiting cycle, so the msgIdent entry may be removed from the map in the finally block of send() before it can be read in onMessage(): the wait timeout expires, the iteration counter is decremented, the thread exits the cycle and removes the entry from the map.
Even if waitTimeout is long enough you may experience a so-called spurious wakeup:
A thread can also wake up without being notified, interrupted, or timing out, a so-called spurious wakeup. While this will rarely occur in practice, applications must guard against it by testing for the condition that should have caused the thread to be awakened, and continuing to wait if the condition is not satisfied.
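In practice that means the wait should be guarded by the condition itself rather than by a fixed iteration count. Here is a sketch of such a loop, reusing the existing isSet() check and waitTimeout from the question, with a deadline replacing the iter counter:
synchronized (synchronizer) {
    long deadline = System.currentTimeMillis() + waitTimeout;
    try {
        while (!synchronizer.isSet()) {
            long remaining = deadline - System.currentTimeMillis();
            if (remaining <= 0) {
                break; // real timeout: give up and let the caller handle the missing response
            }
            synchronizer.wait(remaining); // a spurious wakeup simply re-enters the loop and re-checks the condition
        }
    } catch (InterruptedException ex) {
        Thread.currentThread().interrupt(); // preserve the interrupt status and fall through
    }
}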
By the way, why don't you send the response back via JMS without this cryptic synchronization? Here is an example for the ActiveMQ message broker: How should I implement request response with JMS?