I have a database which contains e-mails to be sent. I'm using multiple threads to send out these e-mails. The approach I'm using is that each thread will query the database, get N e-mails in memory and mark those as being sent. Another thread will see those N e-mails as marked and move on and fetch the next N entries.
This isn't working: before thread1 can mark its entries as being sent, thread2 queries for e-mails, and so both threads end up getting the same set of e-mails.
Each thread has its own connection to the database. Is that the root cause of this behaviour? Should I be just sharing one connection object across all the threads?
Or is there any better approach that I could use?
My recommendation is to have a single thread take care of querying the database, placing the retrieved emails in a thread-safe queue (e.g. an ArrayBlockingQueue, which has the advantage of being bounded); you can then have any number of threads removing and processing emails from this queue. The synchronization overhead on the ArrayBlockingQueue is fairly lightweight, and this way you don't need to use database transactions or anything like that.
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

class EmailChunk {
    Email[] emails;
}

// only instantiate one of these
class DatabaseThread implements Runnable {
    final BlockingQueue<EmailChunk> emailQueue;

    public DatabaseThread(BlockingQueue<EmailChunk> emailQueue) {
        this.emailQueue = emailQueue;
    }

    public void run() {
        EmailChunk newChunk = null; // query database, create email chunk
        try {
            // add newChunk to the queue, waiting up to 30 seconds if it's full
            emailQueue.offer(newChunk, 30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}

// instantiate as many of these as makes sense
class EmailThread implements Runnable {
    final BlockingQueue<EmailChunk> emailQueue;

    public EmailThread(BlockingQueue<EmailChunk> emailQueue) {
        this.emailQueue = emailQueue;
    }

    public void run() {
        try {
            // take the next chunk from the queue, waiting up to 30 seconds if it's empty
            EmailChunk nextChunk = emailQueue.poll(30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}

class Main {
    static final int queueSize = 5;

    public static void main(String[] args) {
        BlockingQueue<EmailChunk> emailQueue = new ArrayBlockingQueue<>(queueSize);
        // instantiate DatabaseThread and EmailThread objects with this queue
    }
}
You need a shared method (or a shared lock) to control the concurrency. Synchronize the statements that get the e-mails and mark them, then send the e-mails outside the synchronized block. Something like this:
public void processMails() {
    List<String> mails;
    synchronized (this) {
        mails = getMails();
        markMails(mails);
    }
    sendMails(mails);
}
This method could live in your DAO facade, which all threads access.
EDIT:
If you have multiple instances of the DAO class:
public void processMails() {
    List<String> mails;
    synchronized (DAO.class) {
        mails = getMails();
        markMails(mails);
    }
    sendMails(mails);
}
Another alternative:
private static final Object LOCK = new Object();

public void processMails() {
    List<String> mails;
    synchronized (LOCK) {
        mails = getMails();
        markMails(mails);
    }
    sendMails(mails);
}
Related
My application starts a couple of clients which communicate with Steam. There are two types of tasks I can ask clients to perform. For the first type I don't care about blocking, for example asking a client about its friends. For the second type, I can submit only one task to a client at a time, and I need to wait until the client finishes it asynchronously. I'm not sure whether there is already a design pattern for this, but you can see what I have tried so far: when I ask for a task of the second type, I remove the client from the queue and return it after the task is done. I don't know if this is a good solution, because I could 'lose' clients if I do something wrong.
@Component
public class SteamClientWrapper {

    private Queue<DotaClientImpl> clients = new LinkedList<>();
    private final Object clientLock = new Object();

    public SteamClientWrapper() {
    }

    @PostConstruct
    public void init() {
        // starting clients here: clients.add();
    }

    public DotaClientImpl getClient() {
        return getClient(false);
    }

    public DotaClientImpl getClient(boolean freeLast) {
        synchronized (clients) {
            if (!clients.isEmpty()) {
                return freeLast ? clients.poll() : clients.peek();
            }
        }
        return null;
    }

    public void postClient(DotaClientImpl client) {
        if (client == null) {
            return;
        }
        synchronized (clientLock) {
            clients.offer(client);
            clientLock.notify();
        }
    }

    public void doSomethingBlocking() {
        DotaClientImpl client = getClient(true);
        client.doSomething();
    }
}
Sounds like you could use Spring's ThreadPoolTaskExecutor to do that.
An Executor is basically what you tried to do: store tasks in a queue and process the next one as soon as the previous has finished.
Often this is used to run tasks in parallel, but it can also reduce overhead for serial processing.
A sample doing it this way can be found at
https://dzone.com/articles/spring-and-threads-taskexecutor
To ensure only one client task runs at a time, simply set the configuration to
executor.setCorePoolSize(1);
executor.setMaxPoolSize(1);
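For illustration, here's a minimal sketch of what that configuration could look like (the queue capacity and the task bodies are placeholders, not from the original post):

import org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor;

public class SingleWorkerExample {
    public static void main(String[] args) {
        ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
        executor.setCorePoolSize(1);      // exactly one worker thread
        executor.setMaxPoolSize(1);
        executor.setQueueCapacity(100);   // further tasks queue up behind it
        executor.initialize();

        // submitted tasks run strictly one after another on the single thread
        for (int i = 0; i < 5; i++) {
            final int taskId = i;
            executor.execute(() -> System.out.println("blocking client task " + taskId));
        }
        executor.shutdown();
    }
}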
Need help with Java multiple threading
I have a case as below:
There are many records. Each record has about 250 fields, and each field needs to be validated against a predefined rule.
So I defined a class, FieldInfo, to represent each field:
public class FieldInfo {
    private String name;
    private String value;
    private String error_code;
    private String error_message;
    // getters and setters omitted
}
a class Record to represent a record:
public class Record {
    List<FieldInfo> fields;
    // getter and setter omitted here
}
and the rule interface and class:
public interface BusinessRule {
    // validating one field needs some other fields' values in the same record,
    // so the list of all fields for a given record is passed in as a parameter
    public FieldInfo validate(List<FieldInfo> fields);
}
public class FieldName_Rule implements BusinessRule {
    public FieldInfo validate(List<FieldInfo> fields) {
        // will:
        // 1. pick up the fields required for validating this target field, including the target field itself
        // 2. perform validation logics A, B, C...
        // note: all rules only read data from a database, no update/insert operations
        return null; // placeholder
    }
}
Users can submit 5000 records or more at a time for processing, and the performance requirement is high. I was thinking of using multiple threads for the submitted records (say 5000, so one thread handles several records), and within each thread, forking more threads per record to run the rules.
But unfortunately, such nested multi-threading always dies in my case.
Here are some key parts from the above solution:
public class BusinessRuleService {

    @Autowired
    private ValidationHandler handler;

    public String process(String xmlRequest) {
        List<Record> records = XmlConverter.unmarshall(xmlRequest).toList();
        ExecutorService es = Executors.newFixedThreadPool(100);
        List<CompletableFuture<Integer>> futures =
            records.stream().map(r -> CompletableFuture.supplyAsync(() -> handler.invoke(r), es)).collect(Collectors.toList());
        List<Integer> result = futures.stream().map(CompletableFuture::join).collect(Collectors.toList());
        System.out.printf("total records %d processed.%n", result.size());
        es.shutdown();
        return XmlConverter.marshallObject(records);
    }
}
@Component
public class ValidationHandlerImpl implements ValidationHandler {

    @Autowired
    private List<BusinessRule> rules;

    @Override
    public int invoke(Record record) {
        ExecutorService es = Executors.newFixedThreadPool(250);
        List<CompletableFuture<FieldInfo>> futures =
            rules.stream().map(r -> CompletableFuture.supplyAsync(() -> r.validate(record.getFields()), es)).collect(Collectors.toList());
        List<FieldInfo> result = futures.stream().map(CompletableFuture::join).collect(Collectors.toList());
        System.out.printf("total records %d processed.%n", result.size());
        es.shutdown();
        return 0;
    }
}
The workflow is:
User submits a list of records in an XML string format. One of the application endpoints launches the process method of a BusinessRuleService object. The process method uses CompletableFuture to compose tasks and submits them to an ExecutorService with a thread pool of size 100. Each task in the CompletableFuture list then invokes a ValidationHandler object. The ValidationHandler object composes another set of CompletableFuture tasks and submits them to a second ExecutorService whose pool size equals the size of the rule list.
Is the above solution proper?
Note: my current solution processes the submitted records in sequence, with the 250 rules processed in parallel for each record. With this solution, it takes more than 2 hours for 5000 records. Such poor performance is not acceptable to the business.
I am very new to concurrent/multi-threading programming.
Any kind of help is much appreciated!
This is a well known "single producer - multiple consumers" pattern. The classic solution is to create a BlockingQueue<Record> queue, and put records there at the pace of their reading. On the other end of the queue, a number of working threads read records from the queue and process them (in our case, validate the fields):
class ValidatingThread extends Thread {
    private final BlockingQueue<Record> queue;
    private final FieldName_Rule validator = new FieldName_Rule();

    public ValidatingThread(BlockingQueue<Record> queue) {
        this.queue = queue;
    }

    public void run() {
        try {
            while (true) {
                Record record = queue.take();           // blocks until a record is available
                validator.validate(record.getFields());
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();         // stop when interrupted
        }
    }
}
The optimal number of threads equals Runtime.getRuntime().availableProcessors().
Start them all at the beginning, and do not use "embedded multi-threading".
How to stop the threads after all the records are processed is left as a learning exercise.
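For illustration, here's a rough sketch of how the pieces might be wired together; it shows only the startup side (how the records are obtained is assumed, and stopping the workers is left out, as noted above):

import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

class ValidationMain {
    // `records` stands for the unmarshalled input from the question
    static void runValidation(List<Record> records) throws InterruptedException {
        BlockingQueue<Record> queue = new ArrayBlockingQueue<>(1000);

        // one consumer thread per core, as suggested above
        int workers = Runtime.getRuntime().availableProcessors();
        for (int i = 0; i < workers; i++) {
            new ValidatingThread(queue).start();
        }

        // single producer: feed records into the queue as they are read
        for (Record record : records) {
            queue.put(record);   // blocks when the queue is full
        }
    }
}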
Is it possible to modify the runnable object after it has been submitted to the executor service (single thread with unbounded queue)?
For example:
public class Test {

    @Autowired
    private Runner userRunner;

    @Autowired
    private ExecutorService executorService;

    public void init() {
        for (int i = 0; i < 100; ++i) {
            userRunner.add("Temp" + i);
            Future runnerFuture = executorService.submit(userRunner);
        }
    }
}
public class Runner implements Runnable {

    private List<String> users = new ArrayList<>();

    public void add(String user) {
        users.add(user);
    }

    public void run() {
        /* Something here to do with users */
    }
}
As you can see in the above example, if we submit a runnable object and also modify its contents inside the loop, will the 1st submit to the executor service use the newly added users? Consider that the run method is doing something really intensive and subsequent submits are queued.
if we submit a runnable object and modify the contents of the object too inside the loop, will the 1st submit to executor service use the newly added users.
Only if the users ArrayList is properly synchronized. What you are doing is modifying the users field from two different threads, which can cause exceptions and other unpredictable results. Synchronization ensures mutual exclusion, so multiple threads aren't changing the ArrayList at the same time unexpectedly, as well as memory synchronization, which ensures that one thread's modifications are seen by the other.
What you could do is to add synchronization to your example:
public void add(String user) {
    synchronized (users) {
        users.add(user);
    }
}
...
public void run() {
    synchronized (users) {
        /* Something here to do with users */
    }
}
Another option would be to synchronize the list:
// you can't use this if you are iterating on this list (for, etc.)
private List<String> users = Collections.synchronizedList(new ArrayList<>());
However, you'll need to manually synchronize if you are using a for loop on the list or otherwise iterating across it.
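For example, a small sketch of that manual synchronization (per the Collections.synchronizedList documentation, you must hold the list's own lock while iterating):

List<String> users = Collections.synchronizedList(new ArrayList<>());

// individual add() calls are already thread-safe, but iteration is not:
synchronized (users) {
    for (String user : users) {
        // process user
    }
}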
The cleanest, most straightforward approach would be to call cancel on the Future, then submit a new task with the updated user list. Otherwise not only do you face visibility issues from tampering with the list across threads, but there's no way to know if you're modifying a task that's already running.
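A sketch of that idea, assuming a hypothetical Runner variant that takes an immutable snapshot of the users in its constructor (not part of the original code):

// keep a handle on the queued task
Future<?> pending = executorService.submit(new Runner(snapshotOfUsers));

// later, when the user list has changed:
if (pending.cancel(false)) {   // false: don't interrupt it if it's already running
    pending = executorService.submit(new Runner(updatedSnapshotOfUsers));
}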
I have a class with the following method:
public class Test {
    private List l1;

    public void send() {
        for (<type> x : l1) {
            // send to receivers and put a log in DB
        }
    }
}
This Test class is used by different threads which will fill the variable 'l1' with their own data and send them to receivers.
If I have to synchronize this to send data sequentially, so that receivers get one full frame of data every time (without jumbling of data from different threads), should I synchronize on 'l1' or on the class Test?
I read the tutorials and samples but I still have this question.
You should synchronize on the object that represents your shared state (l1 in this case); you must ensure that every insert/read operation is synchronized,
so you must have a synchronized(l1) {...} block around add (and remove) calls and one while sending:
public void send() {
    synchronized (l1) {
        for (<type> x : l1) {
            // send to receivers and put a log in DB
        }
    }
}
Depending on your requirements, you can also implement something more complex, like:
public void send() {
    List l2;
    synchronized (l1) {
        l2 = new ArrayList(l1);
        // clear l1?
    }
    for (<type> x : l2) {
        // send to receivers and put a log in DB
    }
}
and allow a greater degree of concurrency.
I have a Java Thread like the following:
public class MyThread extends Thread {
    MyService service;
    String id;

    public MyThread(String id) {
        this.id = id;
    }

    public void run() {
        User user = service.getUser(id);
    }
}
I have about 300 ids, and every couple of seconds I fire up threads to make a call for each of the ids, e.g.
for (String id : ids) {
    MyThread thread = new MyThread(id);
    thread.start();
}
Now, I would like to collect the results from each threads, and do a batch insert to the database, instead of making 300 database inserts every 2 seconds.
Any idea how I can accomplish this?
The canonical approach is to use a Callable and an ExecutorService. Submitting a Callable to an ExecutorService returns a (typesafe) Future from which you can get the result.
class TaskAsCallable implements Callable<Result> {
    @Override
    public Result call() {
        return new Result(); // this is where the work is done
    }
}
ExecutorService executor = Executors.newFixedThreadPool(300);
Future<Result> task = executor.submit(new TaskAsCallable());
Result result = task.get(); // this blocks until result is ready
In your case, you probably want to use invokeAll which returns a List of Futures, or create that list yourself as you add tasks to the executor. To collect results, simply call get on each one.
If you want to collect all of the results before doing the database update, you can use the invokeAll method. This takes care of the bookkeeping that would be required if you submit tasks one at a time, like daveb suggests.
private static final ExecutorService workers = Executors.newCachedThreadPool();
...
Collection<Callable<User>> tasks = new ArrayList<Callable<User>>();
for (final String id : ids) {
    tasks.add(new Callable<User>() {
        public User call() throws Exception {
            return svc.getUser(id);
        }
    });
}
/* invokeAll blocks until all service requests complete,
 * or a max of 10 seconds. */
List<Future<User>> results = workers.invokeAll(tasks, 10, TimeUnit.SECONDS);
for (Future<User> f : results) {
    User user = f.get();
    /* Add user to batch update. */
    ...
}
/* Commit batch. */
...
Store your result in your object. When it completes, have it drop itself into a synchronized collection (a synchronized queue comes to mind).
When you wish to collect your results to submit, grab everything from the queue and read the results from the objects. You might even have each object know how to "post" its own results to the database; this way different classes can be submitted and all handled with the exact same tiny, elegant loop.
There are lots of tools in the JDK to help with this, but it is really easy once you start thinking of your thread as a true object and not just a bunch of crap around a "run" method. Once you start thinking of objects this way programming becomes much simpler and more satisfying.
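As a rough sketch of that idea (MyService and User are the types from the question; the post/drainTo helper names here are hypothetical, not an existing API):

import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

public class FetchUserTask implements Runnable {
    private static final Queue<FetchUserTask> FINISHED = new ConcurrentLinkedQueue<>();

    private final MyService service;
    private final String id;
    private volatile User result;          // the task object holds its own result

    public FetchUserTask(MyService service, String id) {
        this.service = service;
        this.id = id;
    }

    @Override
    public void run() {
        result = service.getUser(id);      // do the work
        FINISHED.add(this);                // then drop ourselves into the shared queue
    }

    // each object knows how to "post" its own result
    public void post(List<User> batch) {
        batch.add(result);
    }

    // the collecting thread grabs everything that has finished so far
    public static void drainTo(List<User> batch) {
        FetchUserTask task;
        while ((task = FINISHED.poll()) != null) {
            task.post(batch);
        }
    }
}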
In Java 8 there is a better way of doing this using CompletableFuture. Say we have a class that gets an id from the database; for simplicity we can just return a number, as below:
static class GenerateNumber implements Supplier<Integer> {
    private final int number;

    GenerateNumber(int number) {
        this.number = number;
    }

    @Override
    public Integer get() {
        try {
            TimeUnit.SECONDS.sleep(1);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        return this.number;
    }
}
Now we can add the result to a concurrent collection once the result of each future is ready:
Collection<Integer> results = new ConcurrentLinkedQueue<>();
int tasks = 10;
CompletableFuture<?>[] allFutures = new CompletableFuture[tasks];
for (int i = 0; i < tasks; i++) {
    int temp = i;
    CompletableFuture<Integer> future = CompletableFuture.supplyAsync(() -> new GenerateNumber(temp).get(), executor);
    allFutures[i] = future.thenAccept(results::add);
}
Now we can add a callback for when all the futures are ready:
CompletableFuture.allOf(allFutures).thenAccept(c->{
System.out.println(results); // do something with result
});
You need to store the result in something like a singleton, and it has to be properly synchronized.
This is not the best advice, though, as it is not a good idea to handle raw Threads.
You could create a queue or list which you pass to the threads you create; the threads add their results to it, and it gets emptied by a consumer which performs the batch insert.
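A minimal sketch of that, assuming the User type from the question and a batch-insert method that you would supply:

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class ResultCollector {
    private final BlockingQueue<User> results = new LinkedBlockingQueue<>();

    // each worker thread hands over its result instead of writing to the DB itself
    void offer(User user) {
        results.add(user);
    }

    // run this loop on a single consumer thread
    void consumeLoop() throws InterruptedException {
        List<User> batch = new ArrayList<>();
        while (true) {
            batch.add(results.take());   // wait for at least one result
            results.drainTo(batch);      // grab anything else already queued
            insertBatch(batch);          // one DB round trip for the whole batch
            batch.clear();
        }
    }

    private void insertBatch(List<User> batch) {
        // batch JDBC insert goes here
    }
}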
The simplest approach is to pass an object to each thread (one object per thread) that will contain the result later. The main thread should keep a reference to each result object. When all threads are joined, you can use the results.
public class TopClass {
    List<User> users = new ArrayList<User>();

    void addUser(User user) {
        synchronized (users) {
            users.add(user);
        }
    }

    void store() throws SQLException {
        // storing code goes here
    }

    class MyThread extends Thread {
        MyService service;
        String id;

        public MyThread(String id) {
            this.id = id;
        }

        public void run() {
            User user = service.getUser(id);
            addUser(user);
        }
    }
}
You could make a class which extends Observable. Your thread can then call a method on the Observable class which notifies any classes registered with it by calling Observable.notifyObservers(Object).
The observing class would implement Observer and register itself with the Observable. It would then implement an update(Observable, Object) method that gets called when Observable.notifyObservers(Object) is called.
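For illustration, a bare-bones sketch using java.util.Observable and Observer (these classes are deprecated since Java 9, but they are the API referred to here; the class names below are made up):

import java.util.Observable;
import java.util.Observer;

// the worker-facing side: threads call publish(user) with their result
class UserNotifier extends Observable {
    void publish(Object user) {
        setChanged();             // mark that there is something new
        notifyObservers(user);    // invokes update() on every registered Observer
    }
}

// the collecting side: registers itself and receives each result
class UserCollector implements Observer {
    @Override
    public void update(Observable source, Object user) {
        // collect the user here for the batch insert
    }
}

// wiring:
// UserNotifier notifier = new UserNotifier();
// notifier.addObserver(new UserCollector());
// worker threads then call notifier.publish(user);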