I have a fairly complex application (a Spring + Angular command-and-control-center application meant to be used by police and other emergency-center operators).
The main component of the application (let's call it the backbone, a Spring web app) communicates with various other applications and hardware. Most of that communication is done with RabbitMQ messages (let's call them telegrams, or TMs for short).
When the backbone receives one of these TMs, a new thread is created in which one or more methods are executed.
The problem is that the backbone can receive two or more TMs almost simultaneously, and since they are processed in different threads, they may not finish in the order they arrived, so wrong information is presented to the user.
I usually handle problems like this with Redis. I have a Redis lock that basically looks like this:
distributedRedisLocker.lock(() -> {
executeSomeMethod();
}, howLongIsLockKept, howLongDoWeWaitForItToFinish);
But in this case I would like to avoid Redis. Is there another Java/Spring-based solution to this?
It does not need to behave exactly like my Redis lock; all I want is for TMs to be processed in the order they arrive, and for a TM that fails somewhere during method execution not to block the next one forever.
Reposting as an answer.
One way to solve your problem is to avoid concurrency. You could use Executors.newSingleThreadExecutor(), which uses only one thread:
ExecutorService executor = Executors.newSingleThreadExecutor();
and then
executor.execute(...);
or
executor.submit(...);
This avoids the race: if some TM A is added to the executor's queue before some TM B, then A will be fully executed before B starts.
Also, no explicit locks are involved (apart from the implicit ones that may exist inside the executor implementation, but those are encapsulated and will not stay hanging forever in case of an error).
One subtle point: if two TMs arrive at virtually the same time, it's impossible to predict which one will be enqueued first.
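For the failure concern, a minimal sketch (assuming a hypothetical Telegram type, handleTelegram() method and logger, none of which come from the question) could look like this: each submitted task catches its own exceptions, so a failing TM is logged and dropped instead of blocking the TMs queued behind it.

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

private final ExecutorService tmExecutor = Executors.newSingleThreadExecutor();

// called by the RabbitMQ listener for every incoming TM
public void onTelegram(Telegram tm) { // Telegram is a placeholder type
    tmExecutor.execute(() -> {
        try {
            handleTelegram(tm); // hypothetical processing method
        } catch (Exception e) {
            // log and move on: one failing TM must not block the next one
            logger.error("TM processing failed, continuing with next TM", e);
        }
    });
}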
I have a use case where I want to make a POST request from my service (Service A) to another Java service (Service B). I don't care about the response, but I am concerned about not changing the existing latencies of the Service A functionality that executes the call.
So basically, I have a set of checks in Service A which run in parallel using a thread pool. An ExecutorService instance is used to spawn the threads. Each check has a CompletableFuture associated with it; these CompletableFuture objects are submitted to the ExecutorService, which then provides threads for execution. The thread pool has a limited size.
For one of the checks that I perform in parallel, I also want to additionally perform a delegation, which a new class will handle by executing a POST request to Service B. If I do the delegation on the same thread as the one used to perform the check, then because of the limited size of the thread pool, the overall performance of the parallel checks will be affected, ultimately affecting my service.
Is there a way to restructure this or make the calls to the new service in an efficient manner? I'm aware that Kotlin has something called coroutines; I'm not sure if Java has something similar that could be useful.
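One common pattern for this kind of fire-and-forget delegation (a sketch only; postToServiceB() and the pool size are assumptions, not names from the question) is to give the POST its own small, bounded executor, so it never competes with the pool that runs the checks:

import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// separate, bounded pool reserved for delegation calls only
private final ExecutorService delegationPool = Executors.newFixedThreadPool(4);

void delegateToServiceB(Object payload) {
    CompletableFuture
        .runAsync(() -> postToServiceB(payload), delegationPool) // hypothetical POST call
        .exceptionally(ex -> {
            // the response does not matter, but failures should not be silent
            System.err.println("Delegation to Service B failed: " + ex);
            return null;
        });
}

The check thread returns as soon as the task is handed to delegationPool, so no coroutine-style construct is needed for this case.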
My project does not use Akka as of now; it runs as a single process and does too many things.
As a first step, we are trying to put scaffolding in place so that we get fault tolerance. For example:
ProjectSupervisor
|
ExistingProcessActor
where ExistingProcessActor would run as a forever-running job, and if it gets killed by some exception, ProjectSupervisor will restart it.
I wasn't too sure about this, so I asked on the user group and received some interesting advice:
[Me] def run: Unit = LogReaderDisruptor.main(Array()) is a method that is supposed to run forever, and it also requires some setup (that is available on the client's machine or in the test environment).
[Advice] By that do you mean that main() never returns? If so, then you're blocking the actor and wasting an entire thread. Spawn this off on a dedicated dispatcher instead (see the dispatchers and futures docs).
I went through the documentation but did not understand what I need to do.
Does that mean the following: ExistingProcessActor starts a new Future, giving it a custom dispatcher of type PinnedDispatcher?
If so, then in case of a failure, how would ProjectSupervisor be notified? It is monitoring ExistingProcessActor, not the Future (a Future is not an actor).
I am confused; kindly guide me.
You don't need to use a Future in order to have the work done on a dedicated dispatcher. This can be done by spawning an actor and assigning that actor to the dedicated dispatcher. A simple example from the Akka docs:
val myActor =
context.actorOf(Props[MyActor].withDispatcher("my-dispatcher"), "myactor1")
http://doc.akka.io/docs/akka/snapshot/scala/dispatchers.html
You could do this within your project supervisor when creating the ExistingProcessActor, putting that actor on a pinned dispatcher, and then your supervision strategy will continue to work as you want.
I need to use the memcached Java API in my Scala/Akka code. This API gives you both synchronous and asynchronous methods; the asynchronous ones return java.util.concurrent.Future. There was already a question here about dealing with Java futures in Scala: How do I wrap a java.util.concurrent.Future in an Akka Future?. However, in my case I have two options:
Use the synchronous API and wrap the blocking code in a Future marked as blocking:
Future {
blocking {
cache.get(key) //synchronous blocking call
}
}
Use the asynchronous Java API and poll the Java Future every n ms to check whether it has completed (as described in one of the answers to the question linked above).
Which one is better? I am leaning towards the first option because polling can dramatically impact response times. Shouldn't the blocking { } block prevent blocking of the whole pool?
I always go with the first option, but I do it in a slightly different way: I don't use the blocking feature (actually, I had not thought about it yet). Instead, I provide a custom execution context to the Future that wraps the synchronous blocking call. It basically looks like this:
val ecForBlockingMemcachedStuff = ExecutionContext.fromExecutorService(Executors.newFixedThreadPool(100)) // whatever number you think is appropriate
// I create a separate ExecutionContext for each blocking client/resource/API I use
Future {
cache.get(key) //synchronous blocking call
}(ecForBlockingMemcachedStuff) // or mark the execution context implicit. I like to mention it explicitly.
So all the blocking calls use a dedicated execution context (i.e. thread pool), which keeps them separate from your main execution context, responsible for the non-blocking work.
This approach is also explained in an online training video for Play/Akka provided by Typesafe. There is a video in lesson 4 about how to handle blocking calls, presented by Nilanjan Raychaudhuri (I hope I spelled that correctly), a well-known author of Scala books.
Update: I had a discussion with Nilanjan on Twitter. He explained the difference between the approach with blocking and a custom ExecutionContext. The blocking feature just creates a special ExecutionContext that takes a naive approach to the question of how many threads you will need: it spawns a new thread whenever all the existing threads in the pool are busy. So it is actually an uncontrolled ExecutionContext; it could create lots of threads and lead to problems such as an out-of-memory error. The solution with the custom execution context is better because it makes this problem obvious. Nilanjan also added that you need to consider circuit breaking for the case where this pool gets overloaded with requests.
TLDR: Yeah, blocking calls suck. Use a custom/dedicated ExecutionContext for blocking calls. Also consider circuit breaking.
The Akka documentation provides a few suggestions on how to deal with blocking calls:
In some cases it is unavoidable to do blocking operations, i.e. to put a thread to sleep for an indeterminate time, waiting for an external event to occur. Examples are legacy RDBMS drivers or messaging APIs, and the underlying reason is typically that (network) I/O occurs under the covers. When facing this, you may be tempted to just wrap the blocking call inside a Future and work with that instead, but this strategy is too simple: you are quite likely to find bottlenecks or run out of memory or threads when the application runs under increased load.

The non-exhaustive list of adequate solutions to the “blocking problem” includes the following suggestions:

- Do the blocking call within an actor (or a set of actors managed by a router), making sure to configure a thread pool which is either dedicated for this purpose or sufficiently sized.
- Do the blocking call within a Future, ensuring an upper bound on the number of such calls at any point in time (submitting an unbounded number of tasks of this nature will exhaust your memory or thread limits).
- Do the blocking call within a Future, providing a thread pool with an upper limit on the number of threads which is appropriate for the hardware on which the application runs.
- Dedicate a single thread to manage a set of blocking resources (e.g. a NIO selector driving multiple channels) and dispatch events as they occur as actor messages.

The first possibility is especially well-suited for resources which are single-threaded in nature, like database handles which traditionally can only execute one outstanding query at a time and use internal synchronization to ensure this. A common pattern is to create a router for N actors, each of which wraps a single DB connection and handles queries as sent to the router. The number N must then be tuned for maximum throughput, which will vary depending on which DBMS is deployed on what hardware.
I have a very complex system (100+ threads) which needs to send email without blocking. My solution was to implement a class called EmailQueueSender, which is started at the beginning of execution and has a ScheduledExecutorService that looks at an internal queue every 500 ms and, if size() > 0, empties it.
Meanwhile, there is a synchronized static method called addEmailToQueue(String[]) which accepts an email (body, subject, etc.) as an array. The system does work, and my other threads can move on after adding their email to the queue without blocking, or even worrying whether the email was successfully sent. It just seems a little messy, or hackish; every programmer gets that feeling in their stomach when they know they're doing something wrong or there's a better way. That said, can someone slap me on the wrist and suggest a more efficient way to accomplish this?
Thanks!
http://java.sun.com/j2se/1.5.0/docs/api/java/util/concurrent/ThreadPoolExecutor.html
This class alone will probably handle most of what you need. Just put the sending code in a Runnable and add it with the execute method.
The getQueue method lets you retrieve the current list of waiting items, so you can save it when restarting the sender service without losing emails.
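A minimal sketch of that idea (the class shape, pool sizes and send() stub are illustrative, and it uses modern Java lambdas for brevity):

import java.util.List;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class EmailSender {
    private final ThreadPoolExecutor pool = new ThreadPoolExecutor(
            2, 2, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<Runnable>());

    // callers hand off an email and return immediately
    void queueEmail(String[] email) {
        pool.execute(() -> send(email));
    }

    // on shutdown, grab whatever has not been sent yet so it can be persisted
    List<Runnable> stop() {
        return pool.shutdownNow();
    }

    private void send(String[] email) { /* actual SMTP code goes here */ }
}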
If you are using Java 6, then you can make heavy use of the primitives in the java.util.concurrent package.
Having a separate thread that handles the actual sending is completely normal. Instead of polling a queue, I would rather use a BlockingQueue, as you can use a blocking take() instead of busy-waiting.
If you are interested in whether the e-mail was successfully sent, your append method could return a Future so that you can pass the return value on once you have sent the message.
Instead of having an array of Strings, I would recommend creating an (almost trivial) Java class to hold the values. Object creation is cheap these days.
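A sketch combining both suggestions (class and method names are illustrative, and it uses modern Java lambdas for brevity): a trivial Email class instead of a String[], and a consumer thread that blocks on take() rather than polling every 500 ms.

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class Email {
    final String to, subject, body;
    Email(String to, String subject, String body) {
        this.to = to; this.subject = subject; this.body = body;
    }
}

class EmailQueueSender {
    private final BlockingQueue<Email> queue = new LinkedBlockingQueue<>();

    // consumer: take() sleeps until an email arrives, no 500 ms polling
    void start() {
        Thread sender = new Thread(() -> {
            try {
                while (true) {
                    send(queue.take());
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // stop when interrupted
            }
        });
        sender.setDaemon(true);
        sender.start();
    }

    // producers just add their email and move on
    void addEmailToQueue(Email email) {
        queue.add(email);
    }

    private void send(Email email) { /* actual SMTP code goes here */ }
}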
I'm not sure if this would work for your application, but it sounds like it would. A ThreadPoolExecutor (an ExecutorService implementation) can take a BlockingQueue as an argument, and you can simply add new tasks to the queue. When you are done, you simply terminate the ThreadPoolExecutor:
private BlockingQueue<Runnable> queue;
...
ThreadPoolExecutor executor = new ThreadPoolExecutor(10, 10, 1000L,
        TimeUnit.MILLISECONDS, this.queue);
You can keep a count of all the tasks added to the queue. When you think you are done (the queue is empty, perhaps?), simply compare this count to the executor's:
if (issuedTasks == executor.getCompletedTaskCount()) {
    executor.shutdown();
}
If the two match, you are done. Another way to terminate the pool is to wait a second at a time in a loop:
try {
    while (!executor.awaitTermination(1000, TimeUnit.MILLISECONDS));
} catch (InterruptedException e) {
    // log the exception...
}
There might be a full-blown mail package out there already, but I would probably start with Spring's support for email and job scheduling. Fire a new job for each email to be sent, and let the executor worry about the timing and about how many jobs run at once. No queuing involved.
Underneath the framework, Spring uses JavaMail for the email part and lets you choose between ThreadPoolExecutor (as mentioned by @Lorenzo) and Quartz. Quartz is better in my opinion, because you can even set it up to fire your jobs at fixed points in time, like cron jobs (e.g. at midnight). The advantage of using Spring is that it greatly simplifies working with these packages, so your job is even easier.
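For the email part itself, Spring's JavaMail support comes down to a few lines (a sketch; the host and addresses are placeholders):

import org.springframework.mail.SimpleMailMessage;
import org.springframework.mail.javamail.JavaMailSenderImpl;

JavaMailSenderImpl mailSender = new JavaMailSenderImpl();
mailSender.setHost("smtp.example.com"); // your SMTP server

SimpleMailMessage msg = new SimpleMailMessage();
msg.setTo("controller@example.com");
msg.setSubject("Job finished");
msg.setText("The scheduled job completed.");
mailSender.send(msg);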
There are many packages and tools that will help with this, but the generic name for cases like this, extensively studied in computer science, is the producer-consumer problem. There are various well-known solutions for it which could be considered design patterns.
This question is about the fallout of using SingleThreadExecutor (JDK 1.6). Related questions have been asked and answered in this forum before, but I believe the situation I am facing is a bit different.
Various components of the application (let's call them C1, C2, C3, etc.) generate (outbound) messages, mostly in response to (inbound) messages they receive from other components. These outbound messages are kept in queues, usually ArrayBlockingQueue instances; fairly standard practice, perhaps. However, the outbound messages must be processed in the order they were added. I guess a SingleThreadExecutor is the obvious answer here. We end up with a 1:1 situation: one SingleThreadExecutor per queue (each queue dedicated to messages emanating from one component).
Now, the number of components (C1, C2, C3, ...) is unknown at any given moment. They come into existence depending on the needs of the users (and are eventually disposed of, too). We are talking about 200-300 such components at peak load. Following the 1:1 design principle stated above, we would be arranging for 200 SingleThreadExecutors. This is the source of my query here.
I am uncomfortable with the thought of having to create so many SingleThreadExecutors. I would rather try to use a pool of SingleThreadExecutors, if that makes sense and is plausible (any ready-made, seen-before classes/patterns?). I have read many posts here recommending the use of SingleThreadExecutor, but what about a pool of them?
What do the learned women and men here think? I would like to be directed, corrected or simply admonished :-).
If your requirement is that the messages be processed in the order they're posted, then you want one and only one SingleThreadExecutor. If you have multiple executors, messages will be processed out of order across the set of executors.
If messages need only be processed in the order that they're received for a single producer, then it makes sense to have one executor per producer. If you try pooling executors, then you're going to have to put a lot of work into ensuring affinity between producer and executor.
Since you indicate that your producers will have defined lifetimes, one thing that you have to ensure is that you properly shut down your executors when they're done.
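A sketch of that lifecycle (the map and method names are illustrative): one single-thread executor per component, created and shut down together with the component itself.

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

private final Map<String, ExecutorService> executors = new ConcurrentHashMap<>();

void onComponentCreated(String componentId) {
    executors.put(componentId, Executors.newSingleThreadExecutor());
}

void onOutboundMessage(String componentId, Runnable processMessage) {
    executors.get(componentId).execute(processMessage); // FIFO per component
}

void onComponentDisposed(String componentId) {
    ExecutorService ex = executors.remove(componentId);
    if (ex != null) {
        ex.shutdown(); // drains already-queued messages, then lets the thread die
    }
}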
Messaging and batch jobs are problems that have been solved time and time again. I suggest not attempting to solve them again. Instead, look into Quartz, which maintains thread pools, persists tasks in a database, etc. Or, maybe even better, look into JMS/ActiveMQ. But at the very least, look into Quartz if you have not already. Oh, and Spring makes working with Quartz so much easier...
I don't see any problem there. Essentially you have independent queues, and each has to be drained sequentially; one thread for each is a natural design. Anything else you can come up with amounts to essentially the same thing. As an example, when Java NIO first came out, frameworks were written trying to take advantage of it and get away from the thread-per-request model. In the end, some authors admitted that to provide a good programming model they were just reimplementing threading all over again.
It's impossible to say whether 300 or even 3000 threads will cause any issues without knowing more about your application. I strongly recommend profiling your application before adding more complexity.
The first thing to check is that the number of concurrently running threads is not much higher than the number of cores available to run them. The more active threads you have, the more time is wasted managing them (context switches are expensive) and the less work gets done.
The easiest way to limit the number of running threads is to use a semaphore: acquire it before starting work and release it after the work is done.
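A sketch of the semaphore idea (the permit count is an assumption, sized roughly to the core count):

import java.util.concurrent.Semaphore;

private final Semaphore permits =
        new Semaphore(Runtime.getRuntime().availableProcessors());

void runThrottled(Runnable work) throws InterruptedException {
    permits.acquire();     // blocks while too many threads are already working
    try {
        work.run();
    } finally {
        permits.release(); // always release, even if the work throws
    }
}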
Unfortunately, limiting the number of running threads may not be enough. While it may help, the overhead may still be too great if the time spent per context switch is a major part of the total cost of one unit of work. In this scenario, the most efficient approach is often to have a fixed number of queues: each component gets a queue from the global pool of queues when it initializes, using an algorithm such as round-robin for queue selection.
If you are in one of those unfortunate cases where the most obvious solutions do not work, I would start with something relatively simple: one thread pool, one concurrent queue, a lock, a list of queues, and a temporary queue for each thread in the pool.
Posting work to the queue is simple: add the payload and the identity of the producer.
Processing is relatively straightforward as well:
1. Take the next item from the queue.
2. Acquire the lock. While holding it, check whether any other thread is running a task for the same producer. If not, register this thread by adding a temporary queue to the list of queues; otherwise, add the task to the producer's existing temporary queue. Then release the lock.
3. Depending on whether the current thread was registered to run tasks, either run the task or poll for the next item and start over.
4. After running a task, acquire the lock again and check whether there is more work in the temporary queue. If not, remove the queue from the list; otherwise, take the next task. Release the lock, and again either run the task or start over.
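A sketch of that scheme (all names are illustrative; this is one way to interpret the description, not a drop-in implementation): a shared pool plus a per-producer queue, so that tasks for the same producer run one at a time and in order.

import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class SerialDispatcher {
    private final ExecutorService pool = Executors.newFixedThreadPool(8);
    // guarded by 'this': maps a producer to its queue of waiting tasks
    private final Map<String, Queue<Runnable>> waiting = new HashMap<>();

    synchronized void submit(String producer, Runnable task) {
        Queue<Runnable> q = waiting.get(producer);
        if (q == null) {
            // nothing in flight for this producer: register it and start draining
            waiting.put(producer, new ArrayDeque<>());
            pool.execute(() -> drain(producer, task));
        } else {
            q.add(task); // a task is already in flight: queue behind it
        }
    }

    private void drain(String producer, Runnable first) {
        Runnable task = first;
        while (task != null) {
            try {
                task.run();
            } catch (RuntimeException e) {
                // log and continue: one bad task must not stall the producer
            }
            synchronized (this) {
                task = waiting.get(producer).poll();
                if (task == null) {
                    waiting.remove(producer); // done until the next submit
                }
            }
        }
    }
}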