We are currently using Hazelcast 3.1.5.
I have a simple distributed locking mechanism that is supposed to provide thread safety across multiple JVM nodes. The code is pretty simple:
private static HazelcastInstance hInst = getHazelcastInstance();
private IMap<String, Integer> mapOfLocks = null;
...
...
mapOfLocks = hInst.getMap("mapOfLocks");
if (mapOfLocks.get(name) == null) {
mapOfLocks.put(name,1);
mapOfLocks.lock(name);
}
else {
mapOfLocks.put(name,mapOfLocks.get(name)+1);
}
...
<STUFF HAPPENS HERE>
mapOfLocks.unlock(name);
..
}
Earlier, I called HazelcastInstance.getLock() directly and things seemed to work; at least, we never saw anything out of place when multiple JVMs were involved.
Recently, I was asked to investigate a database deadlock in this block, and after weeks of investigation and log analysis, I was able to determine that it was caused by multiple threads being able to acquire the lock for the same key. Before the first thread can commit, a second thread manages to get another lock, at which point the second thread is blocked by the database lock held by the first thread.
Is there an outstanding bug against Hazelcast's implementation of distributed locks, or should I be doing anything differently with my configuration?
Also, my configuration has multicast disabled and tcp-ip enabled.
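Here is how you can use an IMap as a lock container.
You don't need an entry for the name to be present in the map in order to lock it. Note that the code in the question calls lock() only when no entry exists yet, so a second thread that finds the entry skips the lock entirely. Lock unconditionally and unlock in a finally block: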
HazelcastInstance instance = Hazelcast.newHazelcastInstance();
IMap<Object, Object> lockMap = instance.getMap("lockMap");
lockMap.lock(name);
try {
//do some work
} finally {
lockMap.unlock(name);
}
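The same lock-then-try/finally shape can be tried out without a cluster. The following is a minimal single-JVM sketch using only JDK classes (ReentrantLock and ConcurrentHashMap), not the Hazelcast API. KeyedLockDemo, withLock, and the key "order-42" are made-up names for illustration. The point it shows is that every caller acquires the lock for the key, whether or not the key has been seen before.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

class KeyedLockDemo {
    // One lock per key; computeIfAbsent creates it atomically on first use.
    private static final ConcurrentHashMap<String, ReentrantLock> LOCKS = new ConcurrentHashMap<>();
    // Deliberately a plain int: it is protected only by the keyed lock.
    private static int counter = 0;

    static void withLock(String name, Runnable work) {
        ReentrantLock lock = LOCKS.computeIfAbsent(name, k -> new ReentrantLock());
        lock.lock(); // always acquire, even if the key already existed
        try {
            work.run();
        } finally {
            lock.unlock(); // always release, even if work throws
        }
    }

    static int runDemo(int threads, int iterations) {
        withLock("order-42", () -> counter = 0);
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.submit(() -> {
                for (int i = 0; i < iterations; i++) {
                    withLock("order-42", () -> counter++);
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        // Read under the same lock so the final value is safely visible.
        int[] result = new int[1];
        withLock("order-42", () -> result[0] = counter);
        return result[0];
    }

    public static void main(String[] args) {
        System.out.println(runDemo(8, 10_000));
    }
}
```

If the lock were taken only on the first access to a key, as in the question, later threads would run the increment unprotected and updates could be lost.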
I'm exposing a legacy web app through GraphQL, but this web app uses ThreadLocals (among others, those of Apache Shiro).
Since graphql-java seems to use the fork-join pool for concurrency, I worry about how far I need to go to ensure that my ThreadLocals still work, and work safely.
Reading the documentation and the source, it seems a large part of the concurrency is achieved by DataFetchers that return CompletableFutures. I can't tell for sure whether that's the only source of concurrency (I think not), or whether the DataFetchers themselves are invoked from the fork-join pool.
So would it be safe to wrap my DataFetchers in a delegate that sets and clears the ThreadLocals? Or does that still carry the risk of being preempted and continued on another thread in the fork-join pool? Something like:
static class WrappedDataFetcher implements DataFetcher<Object> {
private DataFetcher<?> realDataFetcher;
WrappedDataFetcher(DataFetcher<?> realDataFetcher) {
this.realDataFetcher = realDataFetcher;
}
@Override
public Object get(DataFetchingEnvironment dataFetchingEnvironment) throws Exception {
try {
setThreadLocalsFromRequestOrContext(dataFetchingEnvironment);
return realDataFetcher.get(dataFetchingEnvironment);
} finally {
clearThreadLocals();
}
}
}
Or would I need to explicitly run my DataFetchers in a Threadpool like:
static class WrappedDataFetcherThreadPool implements DataFetcher<Object> {
private DataFetcher<?> wrappedDataFetcher;
private ThreadPoolExecutor executor;
WrappedDataFetcherThreadPool(DataFetcher<?> realDataFetcher, ThreadPoolExecutor executor) {
// Wrap in Wrapper from previous example to ensure threadlocals in the executor
this.wrappedDataFetcher = new WrappedDataFetcher(realDataFetcher);
this.executor = executor;
}
@Override
public Object get(DataFetchingEnvironment dataFetchingEnvironment) throws Exception {
Future<?> future = executor.submit(() -> wrappedDataFetcher.get(dataFetchingEnvironment));
return future.get(); //for simplicity / clarity of the question
}
}
I think the second one solves my problem but it feels like overkill and I worry about performance. But I think the first risks preemption.
If there is a better way to handle this I would love to hear it as well.
Note: this is not about the async nature of GraphQL (I hope to leverage that as well) but about the possible side effect of running multiple requests WITH ThreadLocals that might get mixed up between requests due to the fork-join pool.
As far as I know, graphql-java does not use its own thread pool and relies on the application to provide one. It achieves concurrency through future callbacks. Say this is the current state of the application:
Thread T_1 with thread local storage TLS_1 executing data fetcher DF_1.
The graphql-java engine attaches a synchronous callback to the future returned by DF_1. If the data fetcher does not return a future, the engine wraps the result in a completed future and then attaches the synchronous callback. Because the callback is synchronous, the thread that completes the future runs the callback. If any thread other than T_1 completes the future, TLS_1 is lost (unless it is copied over to the executing thread). One example is a non-blocking HTTP I/O library that uses an I/O thread to complete the response future.
Here is a link where the authors comment further on thread behavior in the graphql-java library:
https://spectrum.chat/graphql-java/general/how-to-supply-custom-executor-service-for-data-fetchers-to-run-on~29caa730-9114-4883-ab4a-e9700f225f93
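To make the "copied over to the executing thread" idea concrete, here is a minimal sketch that uses only the JDK, not graphql-java or Shiro. CURRENT_USER, withCurrentUser, and the value "alice" are hypothetical stand-ins for request-scoped state. The wrapper captures the caller's ThreadLocal value when the task is created and installs it on whichever thread ends up running the task.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

class ThreadLocalPropagationDemo {
    // Hypothetical request-scoped state, e.g. the authenticated user.
    static final ThreadLocal<String> CURRENT_USER = new ThreadLocal<>();

    // Captures the caller's value now; restores the worker's old value afterwards.
    static <T> Supplier<T> withCurrentUser(Supplier<T> task) {
        final String captured = CURRENT_USER.get();
        return () -> {
            String previous = CURRENT_USER.get();
            CURRENT_USER.set(captured);
            try {
                return task.get();
            } finally {
                if (previous == null) {
                    CURRENT_USER.remove();
                } else {
                    CURRENT_USER.set(previous);
                }
            }
        };
    }

    // Returns { value seen without the wrapper, value seen with the wrapper }.
    static String[] runDemo() {
        ExecutorService ioPool = Executors.newSingleThreadExecutor();
        try {
            CURRENT_USER.set("alice");
            String unwrapped = CompletableFuture
                    .supplyAsync(() -> CURRENT_USER.get(), ioPool).join();
            String wrapped = CompletableFuture
                    .supplyAsync(withCurrentUser(() -> CURRENT_USER.get()), ioPool).join();
            return new String[] { String.valueOf(unwrapped), wrapped };
        } finally {
            CURRENT_USER.remove();
            ioPool.shutdown();
        }
    }

    public static void main(String[] args) {
        String[] seen = runDemo();
        System.out.println("without wrapper: " + seen[0]);
        System.out.println("with wrapper:    " + seen[1]);
    }
}
```

Without the wrapper, the worker thread sees no value at all, which is the "TLS_1 is lost" case described above.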
I am looking for a rule engine for my web application and I found Easy Rules. However, its FAQ section states a limitation on thread safety.
Is a web container considered a multi-threaded environment? Each HTTP request is probably processed by a worker thread created by the application server.
How does thread safety come into play?
How do I deal with thread safety?
If you run Easy Rules in a multi-threaded environment, you should take into account the following considerations:
Easy Rules engine holds a set of rules, it is not thread safe.
By design, rules in Easy Rules encapsulate the business object model they operate on, so they are not thread safe neither.
Do not try to make everything synchronized or locked down!
Easy Rules engine is a very lightweight object and you can create an instance per thread, this is by far the easiest way to avoid thread safety problems
http://www.easyrules.org/get-involved/faq.html
http://www.easyrules.org/tutorials/shop-tutorial.html
Based on this example, how will multi-threading affect the rule engine?
public class AgeRule extends BasicRule {
private static final int ADULT_AGE = 18;
private Person person;
public AgeRule(Person person) {
super("AgeRule",
"Check if person's age is > 18 and marks the person as adult", 1);
this.person = person;
}
@Override
public boolean evaluate() {
return person.getAge() > ADULT_AGE;
}
@Override
public void execute() {
person.setAdult(true);
System.out.printf("Person %s has been marked as adult",
person.getName());
}
}
public class AlcoholRule extends BasicRule {
private Person person;
public AlcoholRule(Person person) {
super("AlcoholRule",
"Children are not allowed to buy alcohol",
2);
this.person = person;
}
@Condition
public boolean evaluate() {
return !person.isAdult();
}
@Action
public void execute(){
System.out.printf("Shop: Sorry %s, you are not allowed to buy alcohol",
person.getName());
}
}
public class Launcher {
public void someMethod() {
//create a person instance
Person tom = new Person("Tom", 14);
System.out.println("Tom: Hi! can I have some Vodka please?");
//create a rules engine
RulesEngine rulesEngine = aNewRulesEngine()
.named("shop rules engine")
.build();
//register rules
rulesEngine.registerRule(new AgeRule(tom));
rulesEngine.registerRule(new AlcoholRule(tom));
//fire rules
rulesEngine.fireRules();
}
}
Yes, a web application is multithreaded. As you expect, there is a pool of threads maintained by the server. When the server socket gets an incoming request on the port it's listening to, it delegates the request to a thread from the pool. Typically the request is executed on that thread until the response is completed.
If you try to create a single rules engine and let multiple threads access it, then either
the rules engine data is corrupted as a result of being manipulated by multiple threads (because data structures not meant to be threadsafe can perform operations in multiple steps that can be interfered with by other threads as they're accessing and changing the same data), or
you use locking to make sure only one thread at a time can use the rules engine, avoiding having your shared object get corrupted, but in the process creating a bottleneck. All of your requests will need to wait for the rules engine to be available and only one thread at a time can make progress.
It's much better to give each request its own copy of the rules engine, so it doesn't get corrupted and there is no need for locking. The ideal situation for threads is for each to be able to execute independently without needing to contend for shared resources.
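As a rough illustration of "one engine per request", here is a sketch that does not use the Easy Rules API at all. TinyRulesEngine is a made-up stand-in for a lightweight, non-thread-safe engine, and handleRequest plays the role of the code a worker thread runs for one HTTP request. Because each call builds its own engine and its own Person, nothing is shared between threads and no locking is needed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class PerRequestEngineDemo {
    static class Person {
        final String name;
        final int age;
        boolean adult;
        Person(String name, int age) { this.name = name; this.age = age; }
    }

    // Hypothetical minimal engine: a plain ArrayList, so NOT thread safe.
    static class TinyRulesEngine {
        private final List<Runnable> rules = new ArrayList<>();
        void registerRule(Runnable rule) { rules.add(rule); }
        void fireRules() { for (Runnable rule : rules) rule.run(); }
    }

    // Runs on a worker thread; creates everything it needs locally.
    static String handleRequest(String name, int age) {
        Person person = new Person(name, age);
        StringBuilder reply = new StringBuilder();
        TinyRulesEngine engine = new TinyRulesEngine();
        engine.registerRule(() -> { if (person.age > 18) person.adult = true; });
        engine.registerRule(() -> reply.append(person.adult ? "served" : "refused"));
        engine.fireRules();
        return person.name + ":" + reply;
    }

    static List<String> runDemo(int requests) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (int i = 0; i < requests; i++) {
                final String name = "p" + i;
                final int age = (i % 2 == 0) ? 14 : 30;
                Callable<String> request = () -> handleRequest(name, age);
                futures.add(pool.submit(request));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) results.add(future.get());
            return results;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(runDemo(4));
    }
}
```

The cost is one small object graph per request, which is cheap compared with serializing every request behind a single lock.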
In my service code, I am trying to create or update a Person domain object:
@Transactional
def someServiceMethod(some params....) {
try{
def person = Person.findByEmail(nperson.email.toLowerCase())
if (!person) {
person = new Person()
person.properties = nperson.properties
} else {
// update the person parameters (first/last name)
person.firstName = nperson.firstName
person.lastName = nperson.lastName
person.phone = nperson.phone
}
if (person.validate()) {
person.save(flush: true)
//... rest of code
}
// rest of other code....
} catch(e) {
log.error("Unknown error: ${e.getMessage()}", e)
e.printStackTrace()
return(null)
}
}
The above code OCCASIONALLY throws the following exception when saving a Person whose email already exists:
Hibernate operation: could not execute statement; SQL [n/a]; Duplicate entry 'someemail@gmail.com' for key 'email_UNIQUE'; nested exception is com.mysql.jdbc.exceptions.jdbc4.MySQLIntegrityConstraintViolationException: Duplicate entry 'someemail@gmail.com' for key 'email_UNIQUE'
This is very strange, because I first look the person up by email, so save() should update the existing record instead of creating a new one.
Why is this happening?
EDIT:
I am on grails 2.4.5 and Hibernate plugin in BuildConfig is:
runtime ':hibernate4:4.3.8.1'
EDIT2:
My application runs on multiple servers, so a synchronized block won't work.
If this is a concurrency issue, here is what we do in such a case. We have a lot of concurrent background processes that work on the same tables. Any such operation goes in a synchronized block, so the code may look like:
class SomeService {
static transactional = false //service cannot be transactional
private Object someLock = new Object() //synchronized block on some object must be used
def someConcurrentSafeMethod(){
synchronized(someLock){
def person = Person.findByEmail(nperson.email.toLowerCase())
...
person.save(flush: true) // flush is very important, must be done in synchronized block
}
}
}
There are a few important points to make this work (from our experience, not official):
The service cannot be transactional. If it is, the transaction is committed after the method returns, so synchronization inside the method is not enough. Programmatic transactions may be another way.
A synchronized method is not enough: synchronized def someConcurrentSafeMethod() will not work, probably because the service is wrapped in a proxy.
The session MUST be flushed inside the synchronized block.
Every object that will be saved should be read inside the synchronized block; if you pass it in from an external method, you may run into an optimistic locking failure.
UPDATED
Because the application is deployed on a distributed system, the above will not solve the issue here (though it may still help others). After the discussion we had on Slack, here is a summary of potential approaches:
Pessimistic locking of updated objects, and locking the whole table for inserts (if possible).
Moving the 'dangerous' database-related methods to a single server behind some API such as REST, and calling it from the other deployments (using the synchronized approach from above).
Retrying the save: if the operation fails, catch the exception and try again. This is supported by integration libraries such as Spring Integration or Apache Camel and is one of the enterprise patterns. See request-handler-advice-chain for Spring Integration as an example.
Queueing the operations, for example with a JMS server.
If anyone has more ideas, please share them.
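The catch-and-retry option can be sketched in a few lines of plain Java. This is only an illustration under assumptions: RetryDemo and retry are made-up names, the operation is an arbitrary Supplier, and it retries on any RuntimeException. In a real service you would probably retry only on the duplicate-key or optimistic-locking exception, and run each attempt in a fresh transaction so that the lookup sees the row the other server just inserted.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

class RetryDemo {
    // Runs the operation up to maxAttempts times, pausing a little longer after each failure.
    static <T> T retry(int maxAttempts, long backoffMillis, Supplier<T> operation) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return operation.get();
            } catch (RuntimeException e) {
                last = e; // remember the failure and try again if attempts remain
                if (attempt < maxAttempts) {
                    pause(backoffMillis * attempt);
                }
            }
        }
        throw last; // all attempts failed: surface the last error
    }

    private static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting to retry", e);
        }
    }

    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger();
        // Simulated save that hits a duplicate-key error on the first two attempts.
        String outcome = retry(5, 10, () -> {
            if (calls.incrementAndGet() < 3) {
                throw new IllegalStateException("Duplicate entry (simulated)");
            }
            return "saved";
        });
        System.out.println(outcome + " after " + calls.get() + " attempts");
    }
}
```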
I am using Apache TomEE 1.5.2 JAX-RS, pretty much out of the box, with the predefined HSQLDB.
The following is simplified code. I have a REST-style interface for receiving signals:
@Stateless
@Path("signal")
public class SignalEndpoint {
@Inject
private SignalStore store;
@POST
public void post() {
store.createSignal();
}
}
Receiving a signal triggers a lot of stuff. The store will create an entity, then fire an asynchronous event.
public class SignalStore {
@PersistenceContext
private EntityManager em;
@EJB
private EventDispatcher dispatcher;
@Inject
private Event<SignalEntity> created;
public void createSignal() {
SignalEntity entity = new SignalEntity();
em.persist(entity);
dispatcher.fire(created, entity);
}
}
The dispatcher is very simple, and merely exists to make the event handling asynchronous.
@Stateless
public class EventDispatcher {
@Asynchronous
public <T> void fire(Event<T> event, T parameter) {
event.fire(parameter);
}
}
Receiving the event is something else, which derives data from the signal, stores it, and fires another asynchronous event:
@Stateless
public class DerivedDataCreator {
@PersistenceContext
private EntityManager em;
@EJB
private EventDispatcher dispatcher;
@Inject
private Event<DerivedDataEntity> created;
@Asynchronous
public void onSignalEntityCreated(@Observes SignalEntity signalEntity) {
DerivedDataEntity entity = new DerivedDataEntity(signalEntity);
em.persist(entity);
dispatcher.fire(created, entity);
}
}
Reacting to that is even a third layer of entity creation.
To summarize, I have a REST call, which synchronously creates a SignalEntity, which asynchronously triggers the creation of a DerivedDataEntity, which asynchronously triggers the creation of a third type of entity. It all works perfectly, and the storage processes are beautifully decoupled.
Except when I programmatically trigger a lot of signals (e.g. 1000) in a for-loop. Depending on my AsynchronousPool size, after quickly processing a number of signals equal to about half of that size, the application completely freezes for up to several minutes. Then it resumes, processes about the same number of signals quite fast, and freezes again.
I have been playing around with AsynchronousPool settings for the last half hour. Setting it to 2000, for instance, easily lets all my signals be processed at once, without any freezes. But the system isn't sane after that, either. Triggering another 1000 signals resulted in them being created all right, but the creation of derived data never happened.
Now I am completely at a loss as to what to do. I can of course get rid of all those asynchronous events and implement some sort of queue myself, but I always thought the point of an EE container was to relieve me of such tedium. Asynchronous EJB events should already bring their own queue mechanism. One which should not freeze as soon as the queue is too full.
Any ideas?
UPDATE:
I have now tried it with 1.6.0-SNAPSHOT. It behaves a little bit differently: It still doesn't work, but I do get an exception:
Aug 01, 2013 3:12:31 PM org.apache.openejb.core.transaction.EjbTransactionUtil handleSystemException
SEVERE: EjbTransactionUtil.handleSystemException: fail to allocate internal resource to execute the target task
javax.ejb.EJBException: fail to allocate internal resource to execute the target task
at org.apache.openejb.async.AsynchronousPool.invoke(AsynchronousPool.java:81)
at org.apache.openejb.core.ivm.EjbObjectProxyHandler.businessMethod(EjbObjectProxyHandler.java:240)
at org.apache.openejb.core.ivm.EjbObjectProxyHandler._invoke(EjbObjectProxyHandler.java:86)
at org.apache.openejb.core.ivm.BaseEjbProxyHandler.invoke(BaseEjbProxyHandler.java:303)
at <<... my code ...>>
...
Caused by: java.util.concurrent.RejectedExecutionException: Timeout waiting for executor slot: waited 30 seconds
at org.apache.openejb.util.executor.OfferRejectedExecutionHandler.rejectedExecution(OfferRejectedExecutionHandler.java:55)
at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:821)
at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1372)
at java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:132)
at org.apache.openejb.async.AsynchronousPool.invoke(AsynchronousPool.java:75)
... 38 more
It is as though TomEE would not do ANY queueing of operations. If no thread is free to process in the moment of the call, tough luck. Surely, this cannot be intended..?
UPDATE 2:
Okay, I seem to have stumbled upon a semi-solution: Setting the AsynchronousPool.QueueSize property to maxint solves the freeze. But questions remain: Why is the QueueSize so limited in the first place, and, more worryingly: Why would this block the entire application? If the queue is full, it blocks, but as soon as a task is taken from it, another should pop in, right? The queue appears to be blocked until it is completely empty again.
UPDATE 3:
For anyone who wants to have a go: http://github.com/JanDoerrenhaus/tomeefreezetestcase
UPDATE 4:
As it turns out, increasing the queue size does NOT solve the problem, it merely delays it. The problem remains the same: too many asynchronous operations at once, and TomEE chokes so badly that it cannot even undeploy the application on termination anymore.
So far, my diagnosis is that the task cleanup does not work properly. My tasks are all very small and fast (see the test case on github). I was already afraid that it would be OpenJPA or HSQLDB slowing down on too many concurrent calls, but I commented out all em.persist calls, and the problem remained the same. So if my tasks are quite small and fast, but still manage to block TomEE so badly that it could not get any further task in after 30 seconds (javax.ejb.EJBException: fail to allocate internal resource to execute the target task), I would imagine that completed tasks linger, clogging up the pipe, so to speak.
How could I resolve this issue?
Basically, BlockingQueues use locks to ensure data consistency and avoid data loss, so in a highly concurrent environment the queue will reject a lot of tasks (your case).
On trunk, you can plug in a RejectedExecutionHandler implementation that retries offering the task. One implementation could be:
new RejectedExecutionHandler() {
@Override
public void rejectedExecution(final Runnable r, final ThreadPoolExecutor executor) {
for (int i = 0; i < 10; i++) {
if (executor.getQueue().offer(r)) {
return;
}
try {
Thread.sleep(50);
} catch (final InterruptedException e) {
Thread.currentThread().interrupt(); // keep the interrupt flag and stop retrying
break;
}
}
throw new RejectedExecutionException();
}
}
It works even better with a random sleep (between a min and a max).
The idea is basically: if the queue is full, wait a short time to reduce the concurrency.
This is configurable through WEB-INF/application.properties; see https://issues.apache.org/jira/browse/TOMEE-1012
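The random-sleep variant could look roughly like the sketch below. It uses only java.util.concurrent and is not TomEE's own handler. JitterRetryHandler and its constructor parameters are made-up names, and the pool and queue sizes in the demo are arbitrary. The handler re-offers the rejected task to the executor's queue, sleeping a random interval between attempts.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.RejectedExecutionHandler;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class JitterRetryHandler implements RejectedExecutionHandler {
    private final int maxAttempts;
    private final long minSleepMillis;
    private final long maxSleepMillis;

    JitterRetryHandler(int maxAttempts, long minSleepMillis, long maxSleepMillis) {
        if (maxAttempts < 1 || minSleepMillis < 0 || maxSleepMillis < minSleepMillis) {
            throw new IllegalArgumentException("invalid retry settings");
        }
        this.maxAttempts = maxAttempts;
        this.minSleepMillis = minSleepMillis;
        this.maxSleepMillis = maxSleepMillis;
    }

    @Override
    public void rejectedExecution(Runnable task, ThreadPoolExecutor executor) {
        for (int i = 0; i < maxAttempts && !executor.isShutdown(); i++) {
            if (executor.getQueue().offer(task)) {
                return; // a slot opened up and the task is queued
            }
            try {
                // Random pause so competing submitters do not all retry at the same instant.
                Thread.sleep(ThreadLocalRandom.current().nextLong(minSleepMillis, maxSleepMillis + 1));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        throw new RejectedExecutionException("queue still full after retries");
    }

    // Submits trivial tasks to a deliberately tiny pool and returns how many ran.
    static int runDemo(int tasks) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(2, 2, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(4), new JitterRetryHandler(200, 1, 5));
        AtomicInteger done = new AtomicInteger();
        for (int i = 0; i < tasks; i++) {
            pool.execute(done::incrementAndGet);
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("pool did not drain in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return done.get();
    }

    public static void main(String[] args) {
        System.out.println(runDemo(500));
    }
}
```

Note that the submitting thread is the one that sleeps, so this trades throughput on the caller's side for fewer rejected tasks.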
I have implemented an Actor system using Akka and its Java API UntypedActor. In it, one actor (type A) starts other actors (type B) dynamically on demand, using getContext().actorOf(...);. Those B actors will do some computation which A doesn't really care about anymore. But I'm wondering: is it necessary to clean up those actors of type B when they have finished? If so, how?
By having B actors call getContext().stop(getSelf()) when they're done?
By having B actors call getSelf().tell(Actors.poisonPill()); when they're done? [this is what I'm using now].
By doing nothing?
By ...?
The docs are not clear on this, or I have overlooked it. I have some basic knowledge of Scala, but the Akka sources aren't exactly entry-level stuff...
What you are describing are single-purpose actors created per “request” (defined in the context of A), which handle a sequence of events and then are done, right? That is absolutely fine, and you are right to shut those down: if you don’t, they will accumulate over time and you run into a memory leak. The best way to do this is the first of the possibilities you mention (most direct), but the second is also okay.
A bit of background: actors are registered within their parent in order to be identifiable (needed e.g. in remoting, but also in other places), and this registration keeps them from being garbage collected. On the other hand, each parent has a right to access the children it created, so automatic termination (i.e. by Akka) would make no sense; explicit shutdown in user code is required instead.
In addition to Roland Kuhn's answer, rather than create a new actor for every request, you could create a predefined set of actors that share the same dispatcher, or you can use a router that distributes requests to a pool of actors.
The Balancing Pool Router, for example, allows you to have a fixed set of actors of a particular type share the same mailbox:
akka.actor.deployment {
/parent/router9 {
router = balancing-pool
nr-of-instances = 5
}
}
Read the documentation on dispatchers and on routing for further detail.
I was profiling (with VisualVM) one of the sample cluster applications from the Akka documentation, and I see garbage collection cleaning up the per-request actors during every GC. I am unable to fully understand the recommendation to explicitly kill the actor after use. My ActorSystem and actors are managed by the Spring IoC container, and I use the Spring extension's indirect actor producer to create actors. The "aggregator" actor is garbage collected on every GC; I monitored the number of instances in VisualVM.
@Component
@Scope(ConfigurableBeanFactory.SCOPE_PROTOTYPE)
public class StatsService extends AbstractActor {
private final LoggingAdapter log = Logging.getLogger(getContext().getSystem(), this);
@Autowired
private ActorSystem actorSystem;
private ActorRef workerRouter;
@Override
public void preStart() throws Exception {
System.out.println("Creating Router" + this.getClass().getCanonicalName());
workerRouter = getContext().actorOf(SPRING_PRO.get(actorSystem)
.props("statsWorker").withRouter(new FromConfig()), "workerRouter");
super.preStart();
}
@Override
public Receive createReceive() {
return receiveBuilder()
.match(StatsJob.class, job -> !job.getText().isEmpty(), job -> {
final String[] words = job.getText().split(" ");
final ActorRef replyTo = sender();
final ActorRef aggregator = getContext().actorOf(SPRING_PRO.get(actorSystem)
.props("statsAggregator", words.length, replyTo));
for (final String word : words) {
workerRouter.tell(new ConsistentHashableEnvelope(word, word),
aggregator);
}
})
.build();
}
}
Actors by default do not consume much memory. If the application intends to use the B actors later on, you can keep them alive. If not, you can shut them down with a PoisonPill. As long as your actors are not holding resources, leaving an actor running should be fine.