Call to MongoRepository in an ExecutorService fails to complete


I am running a Spring Boot application and have multiple threads calling a MongoRepository. This, however, leads to weird timeout behavior.
This is my MongoRepository:
public interface EquipmentRepository extends MongoRepository<Equipment, String> {
Optional<Equipment> findByEquipmentSerialNumber(String equipmentSerialNumber);
}
This is a reduced version of my code highlighting the problem
ExecutorService taskExecutor = Executors.newFixedThreadPool(4);
taskExecutor.execute(() -> {
LOG.info("Executing query...");
Optional<Equipment> equipment = equipmentRepository.findByEquipmentSerialNumber("21133"); // guaranteed to be found
LOG.info("Query done: {}", equipment.get().getEquipmentSerialNumber());
});
taskExecutor.shutdown();
LOG.info("taskExecutor shut down");
try {
taskExecutor.awaitTermination(30, TimeUnit.SECONDS);
LOG.info("taskExecutor done");
} catch (InterruptedException e) {
System.out.println("Error");
}
The output produced looks like this:
taskExecutor shut down
Executing query...
<30 second pause>
taskExecutor done
Query done: 21133
If I increase the timeout of awaitTermination() the pause increases accordingly. So somehow my code inside the execute() lambda is "paused" and only continues after the timeout is reached.
If I remove the call to equipmentRepository, everything works as expected and there is no 30-second pause.
What is keeping my code from completing without reaching the timeout?

It looks like the Mongo repository waits for the 'main' thread before performing the query (which is very strange, though).
This is not an answer to your specific question, but it may solve your problem: Spring Data can execute repository queries asynchronously (see the Spring Data docs).

Related

Executor service forgets about queued tasks

Working with Java 11 and Spring 2.1.6.RELEASE.
I'm experiencing an issue: if I send a few records to the topic that this Kafka consumer consumes from, everything works as planned. However, if I produce a lot of records (a hundred or so), the executor queues the processing but never actually does it. Am I using the executor wrong? I don't think it's a Kafka issue. Is there a way to query the executor to debug this?
@Configuration
public class ExecutorServiceConfig {
@Bean
public ExecutorService createExecutorService() {
return Executors.newFixedThreadPool(10);
}
}
@KafkaListener(topics = "${kafka.consumer.topic.name}",
groupId = "${spring.kafka.consumer.group-id}")
public void consume(PayrollDto message) {
log.info("Consumed message for processing:" + message); // this log is hit for all records
executor.execute(new ConsumerExecutor(message));
}
private class ConsumerExecutor implements Runnable {
PayrollDto message;
public ConsumerExecutor(PayrollDto message) {
this.message = message;
}
@Override
public void run() {
log.info("Beginning processing for payroll:" + this.message); // this log is hit for only some records
processPayrollList(this.message);
log.info("Finished processing for payroll:" + this.message);
}
}

It looks like you are using pure Java SE ExecutorService classes rather than Spring-specific TaskExecutor classes.
There is not enough information to diagnose this properly. (You haven't provided any clear evidence that the tasks have been "forgotten". Your reported evidence is that they are not executed. "Forgotten tasks" is only one of a number of possible explanations.)
The only explanations that I can think of are:
1. Your processPayrollList method is not terminating in some circumstances. It could be deadlocking, going into an infinite loop, waiting forever on some external service and so on. If enough (i.e. 10) tasks failed to terminate, then you would run out of threads in the pool, and no more tasks would be processed. That is consistent with your evidence.
2. Something in your application is replacing executor with a different ExecutorService object.
3. Something in your application is removing tasks from the queue without executing them.
4. A build or deployment "process" issue; e.g. the code you are running is different to the code you are looking at. (It happens.)
5. An unreported bug in the Java 11 class library.
Of these, (1) is the most likely (IMO). Explanations (2) and (3) involve application code that I assume you would have mentioned in the question. I would treat (5) as implausible ... unless you can provide some clear evidence in the form of a minimal reproducible example.
Am I using the executor wrong?
It doesn't look like it from the code you have shown us.
Is there a way to query the executor to debug this?
You could take a thread stack dump (e.g. using the jstack command) and look at the status of the threads in the pool.
You could also cast executor to ThreadPoolExecutor and use that API to look at the queue length, the number of active threads and so on.
Note that this is not due to the ExecutorService being shut down. If that happened, you would get RejectedExecutionException in calls to execute.
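As a rough sketch of the ThreadPoolExecutor suggestion above: the class and method names here (PoolInspector, describe, demo) are made up for illustration, and the demo simulates "stuck" tasks with a latch rather than using the asker's processPayrollList. It assumes the executor really is a ThreadPoolExecutor, which is the case for Executors.newFixedThreadPool but not for every factory method.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadPoolExecutor;

class PoolInspector {

    /** Summarises the pool state, or says so if the executor is not a ThreadPoolExecutor. */
    static String describe(ExecutorService executor) {
        if (!(executor instanceof ThreadPoolExecutor)) {
            return "not a ThreadPoolExecutor: " + executor.getClass().getName();
        }
        ThreadPoolExecutor pool = (ThreadPoolExecutor) executor;
        return "poolSize=" + pool.getPoolSize()
                + " active=" + pool.getActiveCount()            // approximate value
                + " queued=" + pool.getQueue().size()
                + " completed=" + pool.getCompletedTaskCount(); // approximate value
    }

    /** Demo: both workers of a 2-thread pool are stuck, so the other 3 tasks stay queued. */
    static String demo() {
        ExecutorService executor = Executors.newFixedThreadPool(2);
        CountDownLatch bothBusy = new CountDownLatch(2);
        CountDownLatch release = new CountDownLatch(1);
        try {
            for (int i = 0; i < 5; i++) {
                executor.execute(() -> {
                    bothBusy.countDown();
                    try {
                        release.await(); // simulates a task that does not terminate
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            bothBusy.await(); // wait until both worker threads are occupied
            return describe(executor);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "interrupted";
        } finally {
            release.countDown(); // let the stuck tasks finish
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

If logging something like describe(executor) from the listener shows the active count pinned at the pool size while the queue keeps growing, that points towards explanation (1).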

How to wait for Redis cache to cache the information

I am using spring-data-redis and trying to write a JUnit test for my caching logic. The test case works only sporadically. I guess that if the caching logic completes before the second method call, it works; otherwise it fails. If someone has faced a similar issue, I would like to understand how they made it work. As of now, I am using Thread.sleep() but am looking for an alternative.
@Test
public void getUserById() {
User user = new User("name", "1234");
when(userRepository.findbyId("1234")).thenReturn(Optional.ofNullable(user));
// first method call
User user1 = userService.findbyId("1234");
assertThat(user.getName()).isEqualTo(user1.getName());
assertThat(user.getId()).isEqualTo(user1.getId());
// sleeping the thread so to provide caching aspect sufficient time
// to cache the information
try {
Thread.sleep(1000);
} catch (InterruptedException e) {
e.printStackTrace();
}
// second method call, expecting cache to work.
User userCache = userService.findbyId("1234");
verify(userRepository, never()).findbyId("1234");
assertThat(user.getName()).isEqualTo(userCache.getName());
assertThat(user.getId()).isEqualTo(userCache.getId());
}

Timing issues where you have to wait a short amount of time are really common in distributed systems. To avoid waiting too long for a test assertion, there is a little tool called Awaitility.
With it you can wait much more cleverly: it re-evaluates an assertion at set intervals until it passes or a given timeout is reached (and much more).
Regarding your example, try this:
Awaitility.await()
.pollInterval(1, TimeUnit.SECONDS)
.atMost(10, TimeUnit.SECONDS)
.untilAsserted(() -> {
User user1 = userService.findbyId("1234");
assertThat(user1.getName()).isEqualTo(user.getName());
});
Regarding the other part of your question, in an integration test you could actually perform some kind of prewarming of your Redis instance or if you have a containerized integration test (e. g. Docker) you could fire some first requests on it or wait for a certain condition before starting with your tests.
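To make the polling idea concrete without any library, here is a minimal plain-Java sketch. It is not how Awaitility is implemented; Poller and awaitCondition are hypothetical names, and the helper only checks a boolean condition rather than retrying assertions.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

class Poller {

    /**
     * Re-checks the condition every pollMillis until it holds or timeoutMillis has elapsed.
     * Returns true if the condition became true in time, false on timeout or interruption.
     */
    static boolean awaitCondition(BooleanSupplier condition, long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false; // timed out
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return false;
            }
        }
    }
}
```

Compared with a fixed Thread.sleep(1000), the test returns as soon as the condition holds and only waits for the full timeout in the failure case.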

The actual issue was not with the thread wait time. For the Redis cache to work, a separate thread needs to be spawned. For my service test, I tested it via a separate test case.
@Test
public void getUserById() {
User user = new User("name", "1234");
when(userRepository.findbyId("1234")).thenReturn(Optional.ofNullable(user));
// first method call
User user1 = userService.findbyId("1234");
assertThat(user.getName()).isEqualTo(user1.getName());
assertThat(user.getId()).isEqualTo(user1.getId());
}
// Ensure this test case is executed after getUserById. I used
// @FixMethodOrder(MethodSorters.NAME_ASCENDING)
@Test
public void getUserById_cache() {
User user1 = userService.findbyId("1234");
Mockito.verify(userRepository, never()).findbyId("1234");
assertThat(user.getName()).isEqualTo(user1.getName());
assertThat(user.getId()).isEqualTo(user1.getId());
}

How to be sure that a @Scheduled task terminates?

Inside a Spring web application I have a scheduled task that is called every five minutes.
@Scheduled(fixedDelay = 300000)
public void importDataTask()
{
importData(); //db calls, file manipulations, etc..
}
Usually the task runs smoothly for days, but sometimes the example method importData() does not terminate, so importDataTask() is never called again and everything is blocked until I restart the application.
The question is: is there a feasible method to be sure that a method will not be indefinitely blocked (maybe waiting for a resource, or something else)?

The question is: is there a feasible method to be sure that a method
will not be indefinitely blocked (maybe waiting for a resource, or
something else)?
If the processing cannot be relied on to finish within a regular interval, you should maybe not rely on a fixed delay alone but use two conditions: the delay has elapsed and the last execution is done.
You could schedule a task which checks whether both conditions are met and, if so, runs the important processing. Otherwise, it waits for the next schedule.
In this way, the scheduler itself is never blocked. You may have to wait for some time if the task exceeds the fixed delay. If that is a problem because the fixed delay is often exceeded, you should probably not use a fixed delay, or you should increase it noticeably so that this becomes less common.
Here is an example (written without an editor, sorry for any mistakes):
private boolean isLastImportDataTaskFinished;
@Scheduled(fixedDelay = 300000)
public void importDataTaskManager(){
if (isLastImportDataTaskFinished()){
new Thread(new ImportantDataProcessing()).start();
}
else{
// log the problem if you want
}
}
private boolean isLastImportDataTaskFinished(){
// to retrieve this information, you can do as you want : use a variable
// in this class or a data in database,file...
// here a simple implementation
return isLastImportDataTaskFinished;
}
Runnable class :
public class ImportantDataProcessing implements Runnable{
public void run(){
importData(); //db calls, file manipulations, etc..
}
}
Comment:
But if I run it as a thread how can I kill it if I find it's exceeding
the time limit since I don't have any reference to it (in the idea of
using a second task to determine the stuck state)?
You can use an ExecutorService (there is a question about it here: How to timeout a thread).
Here is a very simple example:
ExecutorService executor = Executors.newSingleThreadExecutor();
Future<?> future = executor.submit(new ImportantDataProcessing());
try {
future.get(100, TimeUnit.SECONDS);
}
catch (InterruptedException e) {
e.printStackTrace();
}
catch (ExecutionException e) {
e.printStackTrace();
}
catch (TimeoutException e) {
// the task exceeded the time limit: interrupt it (other exceptions should be handled too :)
future.cancel(true);
e.printStackTrace();
}
executor.shutdown();
If ImportantDataProcessing returns interesting information, you can use a Callable instead of a Runnable so that the Future is typed with the result.
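A minimal sketch of that Callable variant, under some assumptions: TimedImport, importTask and runWithTimeout are hypothetical names, the import is simulated with Thread.sleep, and -1 is just an arbitrary marker for "timed out". Note that cancel(true) only helps if the task reacts to interruption, as the sleep does here.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class TimedImport {

    /** Hypothetical stand-in for importData(): works for workMillis, then returns a row count. */
    static Callable<Integer> importTask(long workMillis) {
        return () -> {
            Thread.sleep(workMillis); // interruptible, so cancel(true) can stop it
            return 42;
        };
    }

    /** Returns the task's result, or -1 if it did not finish within timeoutMillis. */
    static int runWithTimeout(long workMillis, long timeoutMillis) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        Future<Integer> future = executor.submit(importTask(workMillis));
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the worker thread
            return -1;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            future.cancel(true);
            return -1;
        } catch (ExecutionException e) {
            throw new IllegalStateException("import failed", e.getCause());
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println(runWithTimeout(10, 2000));  // finishes in time
        System.out.println(runWithTimeout(5000, 100)); // exceeds the limit and is cancelled
    }
}
```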

Firstly, sure. There are many feasible ways to be notified when the process is blocked, such as logs, messages or emails embedded in your code.
Secondly, it depends on whether you want it to block or not. If blocking is not your intention, a new thread or a timeout may be your choice.

ExecutorService runnable never hits try when an Exception occurs

I am trying to use a CompletableFuture<T> to respond to a LWJGL OpenGL context being created. This is done by calling the open method on LWJGLGameWindow. Here is the concerning code:
@Override
public CompletableFuture<?> open() {
CompletableFuture<Void> future = new CompletableFuture<>();
scheduledExecutorService.schedule(() -> {
future.completeExceptionally(new TimeoutException("Could not establish contact with LWJGL"));
}, 2000, TimeUnit.MILLISECONDS);
scheduledExecutorService.execute(() -> {
try {
display.setDisplayMode(new DisplayMode(defaultWidth, defaultHeight));
display.create();
future.complete(null);
} catch (LWJGLException e) {
future.completeExceptionally(e);
}
});
return future;
}
The idea is to defer the creation of a display on a scheduled executor service. This is set up to be a single threaded scheduled executor service, because OpenGL contexts are thread-bound. If it takes too long to connect to LWJGL, then the returned future will break out of itself early.
The problem is that in unit tests, this works absolutely swimmingly. However, when I try and debug the program, any call to any of the display methods results in a real exception being thrown by lwjgl (because my library for lwjgl is not linked. This is still thrown as a LwjglException, though). For some reason, this exception is not picked up from the try-catch in this code here, and instead the exception is swallowed; the future never gets completed exceptionally.
So somewhere along the line, my exception is being swallowed in this code.
NB: display is simply an interface around LWJGL's Display - no fancy magic going on there. scheduledExecutorService is a single-threaded scheduled executor.
I also appreciate that .submit() and schedule on scheduledExecutorService both return Future<T> but this lacks the composition I would like to use from CompletableFuture<T>. I'd like to be able to keep using that if at all possible.

The code actually works exactly as it should. The real problem is that the error I was expecting, java.lang.UnsatisfiedLinkError, is not an Exception but an Error. Amending the code to catch Throwable solves the issue.
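For illustration, here is a small self-contained sketch of the difference. It does not use LWJGL: createDisplay is a hypothetical stand-in that simply throws an UnsatisfiedLinkError, and CatchThrowableDemo and futureFails are made-up names. Because a scheduled executor wraps submitted tasks in a future, the uncaught Error is captured there silently, which matches the "swallowed" behaviour described in the question.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class CatchThrowableDemo {

    /** Hypothetical stand-in for display.create() when the native library is missing. */
    static void createDisplay() {
        throw new UnsatisfiedLinkError("simulated: native library not linked");
    }

    /** Returns true if the CompletableFuture was completed exceptionally by the task. */
    static boolean futureFails(boolean catchThrowable) {
        ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
        CompletableFuture<Void> future = new CompletableFuture<>();

        Runnable narrow = () -> {
            try {
                createDisplay();
                future.complete(null);
            } catch (Exception e) { // does NOT match UnsatisfiedLinkError, which is an Error
                future.completeExceptionally(e);
            }
        };
        Runnable wide = () -> {
            try {
                createDisplay();
                future.complete(null);
            } catch (Throwable t) { // matches both Exception and Error
                future.completeExceptionally(t);
            }
        };

        executor.execute(catchThrowable ? wide : narrow);
        try {
            future.get(500, TimeUnit.MILLISECONDS);
            return false; // completed normally
        } catch (ExecutionException e) {
            return true;  // completed exceptionally
        } catch (TimeoutException e) {
            return false; // never completed: the Error escaped the catch block
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            executor.shutdownNow();
        }
    }
}
```

Catching Throwable is usually discouraged, so it may be worth narrowing the clause to the specific errors you expect (for example LinkageError) where that is enough.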

Best practice for interrupting threads that take longer than a threshold

I am using the Java ExecutorService framework to submit callable tasks for execution.
These tasks communicate with a web service and a web service timeout of 5 mins is applied.
However, I've seen that in some cases the timeout is ignored and the thread 'hangs' on an API call. Hence, I want to cancel all tasks that take longer than, say, 5 mins.
Currently, I have a list of futures and I iterate through them and call future.get until all tasks are complete. Now, I've seen that the overloaded future.get method takes a timeout and throws a TimeoutException when the task doesn't complete in that window. So I thought of an approach where I do a future.get() with a timeout and, in case of a TimeoutException, call future.cancel(true) to make sure that the task is interrupted.
My main questions
1. Is the get with a timeout the best way to solve this issue?
2. Is there the possibility that I'm waiting with the get call on a task that hasn't yet been placed on the thread pool (isn't an active worker)? In that case I may be terminating a thread that, when it starts, may actually complete within the required time limit?
Any suggestions would be deeply appreciated.

Is the get with a timeout the best way to solve this issue?
This will not suffice. For instance, if your task is not designed to respond to interruption, it will keep on running or simply stay blocked.
Is there the possibility that I'm waiting with the get call on a task that hasn't yet been placed on the thread pool (isn't an active worker)? In that case I may be terminating a thread that, when it starts, may actually complete within the required time limit?
Yes. You might end up cancelling a task which has never been scheduled to run if your thread pool is not configured properly.
The following code snippet shows one way to make your task responsive to interruption when it contains non-interruptible blocking. It also avoids cancelling tasks which have not yet been scheduled to run. The idea is to override the interrupt method and stop the running work by, say, closing sockets, database connections etc. This code is not perfect; you need to adapt it to your requirements, handle exceptions etc.
class LongRunningTask extends Thread {
private Socket socket;
private volatile AtomicBoolean atomicBoolean;
public LongRunningTask() {
atomicBoolean = new AtomicBoolean(false);
}
@Override
public void interrupt() {
try {
//clean up any resources, close connections etc.
socket.close();
} catch(Throwable e) {
} finally {
atomicBoolean.compareAndSet(true, false);
//set the interupt status of executing thread.
super.interrupt();
}
}
public boolean isRunning() {
return atomicBoolean.get();
}
@Override
public void run() {
atomicBoolean.compareAndSet(false, true);
//any long running task that might hang..for instance
try {
socket = new Socket("0.0.0.0", 5000);
socket.getInputStream().read();
} catch (UnknownHostException e) {
} catch (IOException e) {
} finally {
}
}
}
//your task caller thread
//map of futures and tasks
Map<Future, LongRunningTask> map = new HashMap<Future, LongRunningTask>();
ArrayList<Future> list = new ArrayList<Future>();
int noOfSubmittedTasks = 0;
for(int i = 0; i < 6; i++) {
LongRunningTask task = new LongRunningTask();
Future f = execService.submit(task);
map.put(f, task);
list.add(f);
noOfSubmittedTasks++;
}
while(noOfSubmittedTasks > 0) {
for(int i=0;i < list.size();i++) {
Future f = list.get(i);
LongRunningTask task = map.get(f);
if (task.isRunning()) {
/*
* This ensures that you process only those tasks which are run once
*/
try {
f.get(5, TimeUnit.MINUTES);
noOfSubmittedTasks--;
} catch (InterruptedException e) {
} catch (ExecutionException e) {
} catch (TimeoutException e) {
//this will call the overridden interrupt method
f.cancel(true);
noOfSubmittedTasks--;
}
}
}
}
execService.shutdown();

Is the get with a timeout the best way to solve this issue?
Yes, it is perfectly fine to call get(timeout) on a Future object. If the task that the future points to has already been executed, it returns immediately. If the task is yet to be executed or is being executed, it waits until the timeout; this is good practice.
Is there the possibility that I'm waiting with the get call on a task
that hasn't yet been placed on the thread pool (isn't an active worker)
You get a Future object only when you place a task on the thread pool, so it is not possible to call get() on a task without placing it on the thread pool. Yes, there is a possibility that the task has not yet been taken by a free worker.

The approach that you are talking about is OK. But most importantly, before setting a threshold on the timeout you need to know the right thread pool size and timeout for your environment. Do some stress testing, which will reveal whether the number of worker threads you configured for the thread pool is adequate. This may even let you reduce the timeout value. So I feel this test is the most important part.
A timeout on get is perfectly fine, but you should also cancel the task if it throws TimeoutException. And if you do the above test properly and set your thread pool size and timeout to ideal values, then you may not even need to cancel tasks externally (but you can keep this as a backup). And yes, sometimes when cancelling you may end up cancelling a task which has not yet been picked up by the executor.
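The last point can be demonstrated with a small sketch. The names (CancelQueuedDemo, queuedTaskSkippedAfterCancel) are hypothetical, and the single worker is kept busy with a latch so that the second task is guaranteed to still be queued when get(timeout) gives up.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicBoolean;

class CancelQueuedDemo {

    /** Returns true if a task cancelled while still queued was never run afterwards. */
    static boolean queuedTaskSkippedAfterCancel() {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        CountDownLatch release = new CountDownLatch(1);
        AtomicBoolean secondRan = new AtomicBoolean(false);
        try {
            // First task occupies the only worker thread until released.
            executor.submit(() -> {
                release.await();
                return null;
            });
            // Second task has to wait in the queue.
            Future<?> queued = executor.submit(() -> secondRan.set(true));
            try {
                queued.get(100, TimeUnit.MILLISECONDS);
            } catch (TimeoutException e) {
                // The timeout clock started at get(), not when the task started running.
                queued.cancel(true);
            }
            release.countDown(); // free the worker; it now reaches the cancelled task
            executor.shutdown();
            executor.awaitTermination(5, TimeUnit.SECONDS);
            return queued.isCancelled() && !secondRan.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } catch (ExecutionException e) {
            return false;
        } finally {
            release.countDown();
            executor.shutdownNow();
        }
    }
}
```

So with get(timeout) the time limit covers queueing time as well as execution time, which is one more reason to size the pool sensibly.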

You can of course cancel a Task by using
task.cancel(true)
It is perfectly legal. But this will interrupt the thread if it is "RUNNING".
If the thread is waiting to acquire an intrinsic lock then the "interruption" request has no effect other than setting the thread's interrupted status. In this case you cannot do anything to stop it. For the interruption to happen, the thread should come out from the "blocked" state by acquiring the lock it was waiting for (which may take more than 5 mins). This is a limitation of using "intrinsic locking".
However you can use explicit lock classes to solve this problem. You can use "lockInterruptibly" method of the "Lock" interface to achieve this. "lockInterruptibly" will allow the thread to try to acquire a lock while remaining responsive to the interruption. Here is a small example to achieve that:
public void workWithExplicitLock()throws InterruptedException{
Lock lock = new ReentrantLock();
lock.lockInterruptibly();
try {
// work with shared object state
} finally {
lock.unlock();
}
}
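To see the effect, here is a small sketch in which one thread holds the lock and a second thread, blocked in lockInterruptibly(), is interrupted. InterruptibleLockDemo and waiterRespondsToInterrupt are hypothetical names chosen for this example; unlike the snippet above, the lock is shared between threads, which is what you would need in practice.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

class InterruptibleLockDemo {

    /** Returns true if a thread waiting in lockInterruptibly() reacted to interrupt(). */
    static boolean waiterRespondsToInterrupt() {
        Lock lock = new ReentrantLock();
        AtomicBoolean wasInterrupted = new AtomicBoolean(false);
        CountDownLatch waiterStarted = new CountDownLatch(1);

        lock.lock(); // the calling thread holds the lock, so the waiter cannot acquire it
        try {
            Thread waiter = new Thread(() -> {
                waiterStarted.countDown();
                try {
                    lock.lockInterruptibly();
                    try {
                        // work with shared object state
                    } finally {
                        lock.unlock();
                    }
                } catch (InterruptedException e) {
                    wasInterrupted.set(true); // gave up waiting for the lock
                }
            });
            waiter.start();
            waiterStarted.await();
            waiter.interrupt();  // with a synchronized block this would not unblock the waiter
            waiter.join(5000);
            return wasInterrupted.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            lock.unlock();
        }
    }
}
```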
