I'm writing a scheduled task in Thorntail that will run for a long time (approx. 30 minutes). However, it appears that Thorntail limits the execution time to 30 seconds.
My code looks like this (I've removed code that I believe is irrelevant):
@Singleton
public class ReportJobProcessor {
@Schedule(hour = "*", minute = "*/30", persistent = false)
public void processJobs() {
// Acquire a list of jobs
jobs.forEach(this::processJob);
}
private void processJob(ReportJob job) {
// A long running process
}
}
After 30 seconds, I see the following in my logs:
2019-10-01 16:15:14,097 INFO [org.jboss.as.ejb3.timer] (EJB default - 2) WFLYEJB0021: Timer: [id=... timedObjectId=... auto-timer?:true persistent?:false timerService=org.jboss.as.ejb3.timerservice.TimerServiceImpl@42478b98 initialExpiration=null intervalDuration(in milli sec)=0 nextExpiration=Tue Oct 01 16:20:00 CEST 2019 timerState=IN_TIMEOUT info=null] will be retried
Another 30 seconds later, an exception is thrown because the job still didn't complete.
I have no idea how to increase the timeout, and googling my issue returns nothing helpful.
How can I increase the timeout beyond 30 seconds?
I suggest taking a slightly different approach.
Have the scheduled task hand each job to an asynchronously running stateless session bean (SLSB), here called ReportJobExecutor. The timer method then returns as soon as the jobs are distributed, so it does not time out. The number of SLSBs that can run simultaneously is adjustable in project-defaults.yml; the default count is 16, IIRC. This is a very basic example, but it demonstrates Java EE execution with a predefined bean pool invoked from an EJB timer. A more complicated alternative would be to pool the executors manually, which would let you control their lifecycle (e.g. killing them after a specified time).
@Singleton
public class ReportJobProcessor {
@Inject ReportJobExecutor reportJobExecutor;
@Schedule(hour = "*", minute = "*/30", persistent = false)
public void processJobs() {
// Acquire a list of jobs
jobs.forEach(job -> reportJobExecutor.run(job));
}
}
@Stateless
@Asynchronous
public class ReportJobExecutor {
public void run(ReportJob job) {
//do whatever with job
}
}
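The hand-off idea can also be tried outside a container. The following is a minimal sketch using plain java.util.concurrent instead of EJB, so the names DispatchSketch and dispatch and the pool size of 4 are illustrative assumptions, not Thorntail or Java EE API. It only shows that the dispatching code returns before any job has finished, which is what keeps the timer callback short.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class DispatchSketch {
    /**
     * Hands jobCount jobs to a worker pool and reports
     * {jobs finished when the hand-off ended, jobs finished after the pool drained}.
     */
    static int[] dispatch(int jobCount) {
        ExecutorService pool = Executors.newFixedThreadPool(4); // hypothetical pool size
        CountDownLatch release = new CountDownLatch(1);         // keeps the jobs "long running"
        AtomicInteger finished = new AtomicInteger();

        // Stand-in for the timer method: it only distributes the jobs.
        for (int i = 0; i < jobCount; i++) {
            pool.execute(() -> {
                try {
                    release.await(); // simulated long-running work
                    finished.incrementAndGet();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        int finishedAtHandOff = finished.get(); // the "timer method" would return here

        release.countDown(); // now let the jobs complete
        pool.shutdown();
        try {
            pool.awaitTermination(30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new int[] { finishedAtHandOff, finished.get() };
    }

    public static void main(String[] args) {
        int[] result = dispatch(10);
        System.out.println("finished at hand-off: " + result[0] + ", finished in total: " + result[1]);
    }
}
```

In the EJB version the container plays the role of the pool, and the @Asynchronous bean call corresponds to pool.execute here.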
Idea #2:
Another approach would be to use the Java Batch Processing API (JSR 352). Unfortunately, I am not familiar with this API.
// This query returns 0.45 million records, which are stored in the list.
List<Employee> empList=result.getQuery(query);
I iterate over the employee list, set a property on each employee, and finally call a service method to save the employee object.
Processing the records sequentially takes a lot of time because of the volume of records, so I want to use threads. I am new to Groovy and have implemented only simple examples.
How can I use threads for the logic below in Groovy?
for (Employee employee : empList) {
employee.setQuantity(8);
employeeService.save(employee);
}
There are frameworks for this (GPars comes to mind), and the Java executors framework is a better abstraction than raw threads. But if we want to keep things really primitive, you can split your list into batches and run each batch on a separate thread using something like:
def employeeService = new EmployeeService()
def empList = (1..400000).collect { new Employee() }
def batchSize = 10000
def workerThreads = empList.collate(batchSize).withIndex().collect { List<Employee> batch, int index ->
Thread.start("worker-thread-${index}") {
println "worker ${index} starting"
batch.each { Employee e ->
e.quantity = 8
employeeService.save(e)
}
println "worker ${index} completed"
}
}
println "main thread waiting for workers to finish"
workerThreads*.join()
println "workers finished, exiting..."
class Employee {
int quantity
}
class EmployeeService {
def save(Employee e) {
Thread.sleep(1)
}
}
which, when run, prints:
─➤ groovy solution.groovy
worker 7 starting
worker 11 starting
worker 5 starting
worker 13 starting
worker 17 starting
worker 16 starting
worker 2 starting
worker 18 starting
worker 6 starting
worker 15 starting
worker 12 starting
worker 14 starting
worker 1 starting
worker 4 starting
worker 10 starting
worker 8 starting
worker 9 starting
worker 3 starting
worker 0 starting
worker 20 starting
worker 21 starting
worker 19 starting
worker 22 starting
worker 24 starting
worker 23 starting
worker 25 starting
worker 26 starting
worker 27 starting
worker 28 starting
worker 29 starting
worker 30 starting
worker 31 starting
worker 32 starting
worker 33 starting
worker 34 starting
worker 35 starting
worker 36 starting
worker 37 starting
worker 38 starting
worker 39 starting
main thread waiting for workers to finish
worker 0 completed
worker 16 completed
worker 20 completed
worker 1 completed
worker 3 completed
worker 14 completed
worker 7 completed
worker 12 completed
worker 24 completed
worker 10 completed
worker 6 completed
worker 19 completed
worker 33 completed
worker 27 completed
worker 28 completed
worker 35 completed
worker 17 completed
worker 25 completed
worker 38 completed
worker 4 completed
worker 8 completed
worker 13 completed
worker 9 completed
worker 39 completed
worker 15 completed
worker 36 completed
worker 37 completed
worker 18 completed
worker 30 completed
worker 23 completed
worker 11 completed
worker 32 completed
worker 2 completed
worker 29 completed
worker 26 completed
worker 5 completed
worker 22 completed
worker 31 completed
worker 21 completed
worker 34 completed
workers finished, exiting...
List.collate splits the list of employees into chunks (List<Employee>) of size batchSize. withIndex is just there so that each batch also gets an index (i.e. just a number 0, 1, 2, 3...) for debuggability and tracing.
Since we are starting threads, we need to wait for them to complete. workerThreads*.join() does essentially the same thing as:
workerThreads.each { t -> t.join() }
but with a more concise syntax. Thread.join() is a Java construct for waiting for a thread to complete.
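For readers coming from Java, the same collate-then-join pattern can be sketched as follows. This is an illustrative translation, not part of the Groovy solution: the class BatchThreads and its methods collate and forEachInBatches are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

class BatchThreads {
    /** Splits items into consecutive chunks of at most batchSize elements. */
    static <T> List<List<T>> collate(List<T> items, int batchSize) {
        List<List<T>> batches = new ArrayList<>();
        for (int from = 0; from < items.size(); from += batchSize) {
            batches.add(items.subList(from, Math.min(from + batchSize, items.size())));
        }
        return batches;
    }

    /** Applies action to every item, one thread per batch, and waits for all threads. */
    static <T> void forEachInBatches(List<T> items, int batchSize, Consumer<T> action) {
        List<Thread> workers = new ArrayList<>();
        for (List<T> batch : collate(items, batchSize)) {
            Thread worker = new Thread(() -> batch.forEach(action), "worker-thread-" + workers.size());
            worker.start();
            workers.add(worker);
        }
        // Equivalent of workerThreads*.join(): block until every worker is done.
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }
}
```

A call such as BatchThreads.forEachInBatches(empList, 10000, e -> employeeService.save(e)) would mirror the Groovy script, assuming the service is safe to call from several threads.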
Use the database, not Java
As commented by cfrick, in real work you would be using SQL to do a mass update of rows. In contrast, looping object by object in Java to update row by row in the database would be inordinately slow compared to a simple UPDATE… in SQL.
But for the sake of exploration, we will ignore this fact, and proceed with your Question.
Trying virtual threads with Project Loom
The correct Answer by Matias Bjarland inspired me to try similar code using the Project Loom technology coming to Java. Project Loom brings virtual threads (fibers) for faster concurrency with simpler coding.
Project Loom is still in the experimental stage, but is seeking feedback from the Java community. Special builds of early-access Java 16 with Project Loom technology built-in are available now for the Linux/Mac/Windows OSes.
My code here uses Java syntax, as I do not know Groovy.
I want to try code similar to the other Answer: a simple Employee with a single member field, quantity, and an EmployeeService offering a save method that simulates writing to a database by merely sleeping for a full second.
One major feature of Project Loom is that blocking a thread, and switching to work on another thread, becomes very cheap. So many of the tricks and techniques used in Java code to avoid expensive blocking become unnecessary, and the batching seen in the other Answer should not be needed with virtual threads. The code below therefore simply loops over all half-million Employee objects and creates a new Runnable object for each one. As each of the half-million Runnable objects is instantiated, it is submitted to an executor service.
We run this code twice, once with each of two kinds of executor service. One is the conventional kind backed by platform/kernel threads, used for many years in Java before Project Loom: specifically, an executor service backed by a fixed thread pool. The other is the new executor service offered in Project Loom for virtual threads.
Executors.newFixedThreadPool( int countThreads )
Executors.newVirtualThreadExecutor()
Code
package work.basil.example;
import java.time.Duration;
import java.time.Instant;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
public class HalfMillion
{
public static void main ( String[] args )
{
HalfMillion app = new HalfMillion();
app.demo();
}
private void demo ( )
{
System.out.println( "java.runtime.version " + System.getProperty( "java.runtime.version" ) );
System.out.println( "INFO - `demo` method starting. " + Instant.now() );
// Populate data.
List < Employee > employees = IntStream.rangeClosed( 1 , 500_000 ).mapToObj( i -> new Employee() ).collect( Collectors.toList() );
// Submit task (updating field in each object) to an executor service.
long start = System.nanoTime();
EmployeeService employeeService = new EmployeeService();
try (
//ExecutorService executorService = Executors.newFixedThreadPool( 5 ) ; // 5 of 6 real cores, no hyper-threading.
ExecutorService executorService = Executors.newVirtualThreadExecutor() ;
)
{
employees
.stream()
.forEach(
employee -> {
executorService.submit(
new Runnable()
{
@Override
public void run ( )
{
employee.quantity = 8;
employeeService.save( employee );
}
}
);
}
);
}
// With Project Loom, the code blocks here until all submitted tasks have finished.
Duration duration = Duration.ofNanos( System.nanoTime() - start );
// Report.
System.out.println( "INFO - Done running demo for " + employees.size() + " employees taking " + duration + " to finish at " + Instant.now() );
}
class Employee
{
int quantity;
@Override
public String toString ( )
{
return "Employee{ " +
"quantity=" + quantity +
" }";
}
}
class EmployeeService
{
public void save ( Employee employee )
{
//System.out.println( "TRACE - An `EmployeeService` is doing `save` on an employee." );
try {Thread.sleep( Duration.ofSeconds( 1 ) );} catch ( InterruptedException e ) {e.printStackTrace();}
}
}
}
Results
I ran that code on a Mac mini (2018) with 3 GHz Intel Core i5 processor having 6 real cores and no hyper-threading, with 32 GB 2667 MHz DDR4 memory, and running macOS Mojave 10.14.6.
Using the new virtual threads of Project Loom
Using Executors.newVirtualThreadExecutor() takes under 5 seconds.
java.runtime.version 16-loom+9-316
INFO - `demo` method starting. 2020-12-21T09:20:36.273351Z
INFO - Done running demo for 500000 employees taking PT4.517136095S to finish at 2020-12-21T09:20:40.885315Z
If I enabled the println line within the save method, it took 15 seconds.
Using a fixed pool of 5 conventional platform/kernel threads
Using Executors.newFixedThreadPool( 5 ) takes … well, much longer. Over a day instead of seconds: nearly 28 hours.
java.runtime.version 16-loom+9-316
INFO - `demo` method starting. 2020-12-21T09:32:07.173561Z
INFO - Done running demo for 500000 employees taking PT27H58M18.930703698S to finish at 2020-12-22T13:30:28.813345Z
Conclusion
Well I’m not sure I can draw a conclusion here.
The results for the conventional thread pool make sense. Remember that each conventional Java thread maps to a kernel thread in the host OS. If save sleeps for one second per employee object, the pool's 5 threads spend nearly all their time sleeping, so at most 5 employees are processed per second. The total duration should therefore be at least 500,000 / 5 = 100,000 seconds, or about 27.8 hours, which is consistent with the measured 27 hours 58 minutes.
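The lower bound for the fixed pool is simple arithmetic and can be checked with java.time. In this small sketch the class PoolEstimate and the method lowerBound are hypothetical names; the inputs are the figures from the run above.

```java
import java.time.Duration;

class PoolEstimate {
    /** Minimum wall time when each task blocks for perTask and only `threads` tasks run at once. */
    static Duration lowerBound(long tasks, int threads, Duration perTask) {
        long rounds = (tasks + threads - 1) / threads; // ceiling of tasks / threads
        return perTask.multipliedBy(rounds);
    }

    public static void main(String[] args) {
        // 500,000 employees, 5 pool threads, 1 second of sleep per save.
        System.out.println(lowerBound(500_000, 5, Duration.ofSeconds(1))); // prints PT27H46M40S
    }
}
```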
The results for virtual threads on Project Loom are not believable. The command to sleep the current thread seems to be ignored when using virtual threads. But I am not certain; perhaps my five physical cores on this Mac were each able to have about a hundred thousand threads sleeping simultaneously?
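One way to probe that question on a small scale, without a Loom build, is to give every sleeping task its own thread and time the whole run. In the sketch below, ordinary platform threads from Executors.newCachedThreadPool() stand in for virtual threads; that substitution is an assumption made so the code runs on any JDK, and the class name OverlappingSleeps is hypothetical. It does not reproduce the Loom measurement. It only illustrates that sleeps on separate threads overlap, so the elapsed time is close to one sleep rather than the sum of all sleeps.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class OverlappingSleeps {
    /** Runs `tasks` tasks that each sleep for sleepMillis and returns the elapsed wall time in ms. */
    static long elapsedMillis(int tasks, long sleepMillis) {
        ExecutorService pool = Executors.newCachedThreadPool(); // one thread per concurrent task
        long start = System.nanoTime();
        for (int i = 0; i < tasks; i++) {
            pool.execute(() -> {
                try {
                    Thread.sleep(sleepMillis); // simulated blocking save
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    public static void main(String[] args) {
        // 100 tasks x 200 ms would take 20 s if the sleeps ran one after another.
        System.out.println("elapsed ms: " + elapsedMillis(100, 200));
    }
}
```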
Please post criticisms if you find fault with my code or approach. I am not an expert on threading and concurrency.
I have a very strange situation that I can't get my head around. I have defined a thread pool and use it like this:
ExecutorService fixedThreadPool = Executors.newFixedThreadPool(5);
....some code.....
logger.info("Event:{}, message:[{}]", Event.MESSAGE.name(), message);
fixedThreadPool.submit(new Runnable() {
@Override
public void run() {
...some code...
}
});
logger.info("Submitted: Event:{}, message:[{}]", Event.MESSAGE.name(), message);
Now here is my output for the log messages
2017-07-25 20:44:41,020 [New I/O worker #1] XXXXXXXXX.XXXXXXXServiceImpl - Event:MESSAGE, message:[{"delegateTaskId":"_5ejQ7gtTXyfh6qnPrUeJg","sync":true,"accountId":"kmpySmUISimoRrJL6NL73w"}]
2017-07-25 20:45:42,356 [New I/O worker #1] XXXXXXXXX.XXXXXXXServiceImpl - Submitted: Event:MESSAGE, message:[{"delegateTaskId":"_5ejQ7gtTXyfh6qnPrUeJg","sync":true,"accountId":"kmpySmUISimoRrJL6NL73w"}]
See the timestamps of the two messages. Although I expect submitting to the thread pool's queue to be immediate, almost a minute passes between the two messages. I have tried to eliminate other possibilities, such as GC pauses (the GC logs show no major GC with pauses), load patterns, etc.
At the time I see this, there is no load on the system and CPU use is minimal. It's running on an Amazon EC2 t2.large box, and I can see that there is not much CPU usage.
I read the Javadocs and googled around, but I couldn't find anything helpful. This is very puzzling. Any pointer is greatly appreciated.
------EDIT-----
I added the time to the log message to rule out a logging issue. The updated code is:
logger.info("Event:{}, time:{}, message:[{}]", Event.MESSAGE.name(), new Date(), message);
fixedThreadPool.submit(new Runnable() {
@Override
public void run() {
...some code...
}
});
logger.info("Submitted: Event:{}, time:{}, message:[{}]", Event.MESSAGE.name(), new Date(), message);
Here is the output
Event:MESSAGE, time:Wed Jul 26 17:50:18 UTC 2017, message:[{"delegateTaskId":"pN7UzXfzSWajjJY33LbM1A","sync":true,"accountId":"kmpySmUISimoRrJL6NL73w"}]
Submitted: Event:MESSAGE, time:Wed Jul 26 17:51:19 UTC 2017, message:[{"delegateTaskId":"pN7UzXfzSWajjJY33LbM1A","sync":true,"accountId":"kmpySmUISimoRrJL6NL73w"}]
As you can see, submitting the task to the thread pool takes almost a minute.
I've tested the following code, also on an AWS EC2 t2.large instance:
import java.time.Instant;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
public class RaghvendraSinghTest
{
public static void main(String[] args)
throws Exception
{
ExecutorService fixedThreadPool = Executors.newFixedThreadPool(5);
System.out.printf("[%s] Before fixedThreadPool.submit()%n", Instant.now());
fixedThreadPool.submit(new Runnable() {
@Override
public void run()
{
System.out.printf("[%s] In run()%n", Instant.now());
}
});
System.out.printf("[%s] After fixedThreadPool.submit()%n", Instant.now());
fixedThreadPool.shutdown();
fixedThreadPool.awaitTermination(30, TimeUnit.SECONDS);
System.out.printf("[%s] After fixedThreadPool.shutdown()%n", Instant.now());
}
}
Running the code results in the following output:
[2017-07-26T20:11:56.730Z] Before fixedThreadPool.submit()
[2017-07-26T20:11:56.803Z] After fixedThreadPool.submit()
[2017-07-26T20:11:56.803Z] In run()
[2017-07-26T20:11:56.804Z] After fixedThreadPool.shutdown()
Which shows that the entire run of the program takes less than 75ms. One thing about your question that stands out to me is the name of your thread - "New I/O worker #1" - which indicates to me that there are multiple ExecutorServices at play here.
If you run the code I have included - and just the code I have included - do you see results similar to mine? If you do (and I suspect that you will), you should include enough code so that we can replicate your problem. Otherwise this certainly appears to be specific to your environment.
I figured out the issue. We have the following in our logback.xml
<appender name="SYSLOG-TLS" class="software.wings.logging.CloudBeesSyslogAppender">
<layout class="ch.qos.logback.classic.PatternLayout">
<pattern>%date{ISO8601} %boldGreen(${process_id}) %boldCyan(${version}) %green([%thread]) %highlight(%-5level) %cyan(%logger) - %msg %n</pattern>
</layout>
<host>XXXXXXXX</host>
<port>XXXXXXXX</port>
<programName>XXXXXXXXX</programName>
<key>XXXXXXXXXXX</key>
<threshold>TRACE</threshold>
</appender>
With this configuration, each logger.info call posted the log to LogDNA. The way our system was configured, posting to the LogDNA server was synchronous and sometimes took up to 60 seconds, so our tasks were timing out.
Now I need to figure out why these LogDNA log posting calls are synchronous.
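The general remedy for a slow synchronous sink is to hand messages to a background thread so that the caller does not wait. The following is a minimal sketch of that idea in plain Java. AsyncSink and its methods are hypothetical names; it is not the Logback or LogDNA API, and the slow remote call is simulated with Thread.sleep.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

class AsyncSink {
    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    private final Consumer<String> slowSink;

    AsyncSink(Consumer<String> slowSink) {
        this.slowSink = slowSink;
    }

    /** Returns immediately; the slow sink is called later on the worker thread. */
    void log(String message) {
        worker.execute(() -> slowSink.accept(message));
    }

    /** Stops accepting messages and waits for the queued ones to be delivered. */
    boolean close(long timeout, TimeUnit unit) {
        worker.shutdown();
        try {
            return worker.awaitTermination(timeout, unit);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    /** Returns {millis the caller spent in log(), messages delivered}. */
    static long[] demo(int messages, long sinkDelayMillis) {
        AtomicInteger delivered = new AtomicInteger();
        AsyncSink sink = new AsyncSink(message -> {
            try {
                Thread.sleep(sinkDelayMillis); // simulated slow remote call
                delivered.incrementAndGet();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        long start = System.nanoTime();
        for (int i = 0; i < messages; i++) {
            sink.log("message " + i);
        }
        long callerMillis = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        sink.close(30, TimeUnit.SECONDS);
        return new long[] { callerMillis, delivered.get() };
    }
}
```

This sketch uses an unbounded queue, which a production setup would normally bound. Logback also ships an AsyncAppender that wraps other appenders; whether it suits this custom syslog appender would need to be checked.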
I'm trying to find a way to use a ThreadPoolExecutor in the following scenario:
I have a separate thread producing and submitting tasks on the thread pool
a task submission is synchronous and will block until the task can be started by the ThreadPoolExecutor
at any given time, only a fixed number of tasks can be executing in parallel. An unbounded number of tasks running at the same time may result in memory exhaustion.
before submitting a task, the producer thread always checks that some maximum build time has not been exceeded since the first task was submitted. If it has been exceeded, the thread shuts down, but any task currently running on the thread pool runs to completion before the application terminates.
when the producer thread terminates, there should be no unstarted task on the queue of the thread pool.
To give more context, I currently just submit all tasks at once and cancel all the futures returned by ExecutorService.submit after the max build time has expired. I ignore all resulting CancellationExceptions since they are expected. The problem is that the behaviour of Future.cancel(false) is odd and inadequate for my use case:
it prevents any unstarted task from running (good)
it does not interrupt currently running tasks and lets them run to completion (good)
however, it ignores any exception thrown by the currently running tasks and instead throws a CancellationException for which Exception.getCause() is null. Therefore, I can't distinguish a task that was canceled before running from a task that continued running after the max build time and failed with an exception! That's unfortunate, because in this case I would like to propagate the exception and report it to some error handling mechanism.
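The last point can be reproduced in isolation. The sketch below (the class name CancelHidesFailure is hypothetical) cancels a task that is already running, lets it fail afterwards, and then inspects what Future.get reports.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CancellationException;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

class CancelHidesFailure {
    /** Describes what Future.get() reports for a task that failed after cancel(false). */
    static String outcomeAfterCancel() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        Callable<Void> failingTask = () -> {
            started.countDown();
            release.await(); // keep running until after the cancel call
            throw new IllegalStateException("task failed after cancel");
        };
        Future<Void> future = pool.submit(failingTask);
        try {
            started.await();      // make sure the task is already running
            future.cancel(false); // does not interrupt the running task
            release.countDown();  // the task now continues and throws
            pool.shutdown();
            pool.awaitTermination(10, TimeUnit.SECONDS);
            future.get();
            return "completed normally";
        } catch (CancellationException e) {
            // The task's own exception is not attached to the CancellationException.
            return "CancellationException, cause=" + e.getCause();
        } catch (ExecutionException | InterruptedException e) {
            return "unexpected: " + e;
        }
    }

    public static void main(String[] args) {
        System.out.println(outcomeAfterCancel()); // prints CancellationException, cause=null
    }
}
```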
I looked into the different blocking queues Java has to offer and found this: https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/SynchronousQueue.html. That seemed ideal at first, but then looking at https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/ThreadPoolExecutor.html, it does not seem to play with ThreadPoolExecutor in the way I want it to:
Direct handoffs. A good default choice for a work queue is a
SynchronousQueue that hands off tasks to threads without otherwise
holding them. Here, an attempt to queue a task will fail if no threads
are immediately available to run it, so a new thread will be
constructed. This policy avoids lockups when handling sets of requests
that might have internal dependencies. Direct handoffs generally
require unbounded maximumPoolSizes to avoid rejection of new submitted
tasks. This in turn admits the possibility of unbounded thread growth
when commands continue to arrive on average faster than they can be
processed.
What would be ideal is that the consumer (= the pool) blocks on SynchronousQueue.poll and the producer (= task producer thread) blocks on SynchronousQueue.put.
Any idea how I can implement the scenario I described without writing any complex scheduling logic (which ThreadPoolExecutor should encapsulate for me)?
I believe that you're on the right path... all you have to do is use a SynchronousQueue in conjunction with a RejectedExecutionHandler, using the following constructor ... That way you can define a thread pool with a fixed maximum size (limiting your resource usage) and define a fallback mechanism to reschedule those tasks that cannot be processed (because the pool was full)... Example:
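import java.util.concurrent.RejectedExecutionHandler;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
public class Experiment {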
public static final long HANDLER_SLEEP_TIME = 4000;
public static final int MAX_POOL_SIZE = 1;
public static void main(String[] args) throws InterruptedException {
SynchronousQueue<Runnable> queue;
RejectedExecutionHandler handler;
ThreadPoolExecutor pool;
Runnable runA, runB;
queue = new SynchronousQueue<>();
handler = new RejectedExecutionHandler() {
@Override
public void rejectedExecution(Runnable r, ThreadPoolExecutor executor) {
try {
System.out.println("Handler invoked! Thread: " + Thread.currentThread().getName());
Thread.sleep(HANDLER_SLEEP_TIME); // this lets runnable A finish
executor.submit(r); // reschedule
} catch (InterruptedException ex) {
throw new RuntimeException("Handler Exception!", ex);
}
}
};
pool = new ThreadPoolExecutor(1, MAX_POOL_SIZE, 10, TimeUnit.SECONDS, queue, handler);
runA = new Runnable() {
@Override
public void run() {
try {
Thread.sleep(3000);
System.out.println("hello, I'm runnable A");
} catch (Exception ex) {
throw new RuntimeException("RunnableA", ex);
}
}
};
runB = new Runnable() {
@Override
public void run() {
System.out.println("hello, I'm runnable B");
}
};
pool.submit(runA);
pool.submit(runB);
pool.shutdown();
}
}
NOTE: the implementation of the RejectedExecutionHandler is up to you! I only suggest a sleep as a blocking mechanism, but here you can use more complex logic, such as asking the thread pool whether it has free threads. If not, then sleep; if yes, then submit the task again...
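As one possible variation on the note above, the handler can block on the queue itself instead of sleeping for a fixed time. The sketch below is an assumption-laden illustration rather than the code from this answer: the class BlockingHandoff and the constant BLOCK_UNTIL_TAKEN are hypothetical names. With a SynchronousQueue, put returns only once a worker thread takes the task, so the submitting thread waits exactly as long as the pool is full.

```java
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.RejectedExecutionHandler;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class BlockingHandoff {
    /** Makes the submitting thread wait until a worker takes the rejected task. */
    static final RejectedExecutionHandler BLOCK_UNTIL_TAKEN = (task, executor) -> {
        if (executor.isShutdown()) {
            throw new RejectedExecutionException("executor has been shut down");
        }
        try {
            executor.getQueue().put(task); // blocks until a worker is free
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RejectedExecutionException("interrupted while waiting for a worker", e);
        }
    };

    /** Runs taskCount short tasks on poolSize threads and returns how many completed. */
    static int runTasks(int poolSize, int taskCount) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                poolSize, poolSize, 0L, TimeUnit.SECONDS,
                new SynchronousQueue<>(), BLOCK_UNTIL_TAKEN);
        AtomicInteger completed = new AtomicInteger();
        for (int i = 0; i < taskCount; i++) {
            pool.execute(() -> {
                try {
                    Thread.sleep(20); // simulated work
                    completed.incrementAndGet();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        pool.shutdown(); // already started tasks still run to completion
        try {
            pool.awaitTermination(30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return completed.get();
    }
}
```

The sketch assumes a single producer thread. With several producers, or a shutdown racing with a blocked put, extra care would be needed.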
I found an alternative to the option proposed by @Carlitos Way. It consists of adding tasks directly to the queue using BlockingQueue.offer. The only reason I could not make it work at first, and had to post this question, is that I did not know that a ThreadPoolExecutor starts without any threads by default. The threads are created lazily using a thread factory and may be removed/repopulated depending on the core and max sizes of the pool and the number of tasks being submitted concurrently.
Since the thread creation was lazy, my attempts to block on the call to offer failed because SynchronousQueue.offer returns immediately if no thread is waiting to take an element from the queue. Conversely, SynchronousQueue.put blocks until another thread takes the item from the queue, which will never happen if the thread pool is empty.
Therefore, the workaround is to force the thread pool to create the core threads eagerly using ThreadPoolExecutor.prestartAllCoreThreads. My problem then becomes fairly trivial. I made a simplified version of my real use-case:
import java.util.Random;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.atomic.AtomicLong;
import static java.util.concurrent.TimeUnit.MILLISECONDS;
import static java.util.concurrent.TimeUnit.SECONDS;
public class SimplifiedBuildScheduler {
private static final int MAX_POOL_SIZE = 10;
private static final Random random = new Random();
private static final AtomicLong nextTaskId = new AtomicLong(0);
public static void main(String[] args) throws InterruptedException {
SynchronousQueue<Runnable> queue = new SynchronousQueue<>();
// this is a soft requirement in my system, not a real-time guarantee. See the complete semantics in my question.
long maxBuildTimeInMillis = 50;
// this timeout must be small compared to maxBuildTimeInMillis in order to accurately match the maximum build time
long taskSubmissionTimeoutInMillis = 1;
ThreadPoolExecutor pool = new ThreadPoolExecutor(MAX_POOL_SIZE, MAX_POOL_SIZE, 0, SECONDS, queue);
pool.prestartAllCoreThreads();
Runnable nextTask = makeTask(maxBuildTimeInMillis);
long millisAtStart = System.currentTimeMillis();
while (maxBuildTimeInMillis > System.currentTimeMillis() - millisAtStart) {
boolean submitted = queue.offer(nextTask, taskSubmissionTimeoutInMillis, MILLISECONDS);
if (submitted) {
nextTask = makeTask(maxBuildTimeInMillis);
} else {
System.out.println("Task " + nextTaskId.get() + " was not submitted. " + "It will be rescheduled unless " +
"the max build time has expired");
}
}
System.out.println("Max build time has expired. Stop submitting new tasks and running existing tasks to completion");
pool.shutdown();
pool.awaitTermination(9999999, SECONDS);
}
private static Runnable makeTask(long maxBuildTimeInMillis) {
long sleepTimeInMillis = randomSleepTime(maxBuildTimeInMillis);
long taskId = nextTaskId.getAndIncrement();
return () -> {
try {
System.out.println("Task " + taskId + " sleeping for " + sleepTimeInMillis + " ms");
Thread.sleep(sleepTimeInMillis);
System.out.println("Task " + taskId + " completed !");
} catch (InterruptedException ex) {
throw new RuntimeException(ex);
}
};
}
private static int randomSleepTime(long maxBuildTimeInMillis) {
// voluntarily make it possible that a task finishes after the max build time is expired
return 1 + random.nextInt(2 * Math.toIntExact(maxBuildTimeInMillis));
}
}
An example of output is the following:
Task 1 was not submitted. It will be rescheduled unless the max build time has expired
Task 0 sleeping for 23 ms
Task 1 sleeping for 26 ms
Task 2 sleeping for 6 ms
Task 3 sleeping for 9 ms
Task 4 sleeping for 75 ms
Task 5 sleeping for 35 ms
Task 6 sleeping for 81 ms
Task 8 was not submitted. It will be rescheduled unless the max build time has expired
Task 8 was not submitted. It will be rescheduled unless the max build time has expired
Task 7 sleeping for 86 ms
Task 8 sleeping for 47 ms
Task 9 sleeping for 40 ms
Task 11 was not submitted. It will be rescheduled unless the max build time has expired
Task 2 completed !
Task 10 sleeping for 76 ms
Task 12 was not submitted. It will be rescheduled unless the max build time has expired
Task 3 completed !
Task 11 sleeping for 31 ms
Task 13 was not submitted. It will be rescheduled unless the max build time has expired
Task 13 was not submitted. It will be rescheduled unless the max build time has expired
Task 13 was not submitted. It will be rescheduled unless the max build time has expired
Task 13 was not submitted. It will be rescheduled unless the max build time has expired
Task 13 was not submitted. It will be rescheduled unless the max build time has expired
Task 13 was not submitted. It will be rescheduled unless the max build time has expired
Task 0 completed !
Task 12 sleeping for 7 ms
Task 14 was not submitted. It will be rescheduled unless the max build time has expired
Task 14 was not submitted. It will be rescheduled unless the max build time has expired
Task 1 completed !
Task 13 sleeping for 40 ms
Task 15 was not submitted. It will be rescheduled unless the max build time has expired
Task 12 completed !
Task 14 sleeping for 93 ms
Task 16 was not submitted. It will be rescheduled unless the max build time has expired
Task 16 was not submitted. It will be rescheduled unless the max build time has expired
Task 16 was not submitted. It will be rescheduled unless the max build time has expired
Task 5 completed !
Task 15 sleeping for 20 ms
Task 17 was not submitted. It will be rescheduled unless the max build time has expired
Task 17 was not submitted. It will be rescheduled unless the max build time has expired
Task 11 completed !
Task 16 sleeping for 27 ms
Task 18 was not submitted. It will be rescheduled unless the max build time has expired
Task 18 was not submitted. It will be rescheduled unless the max build time has expired
Task 9 completed !
Task 17 sleeping for 95 ms
Task 19 was not submitted. It will be rescheduled unless the max build time has expired
Max build time has expired. Stop submitting new tasks and running existing tasks to completion
Task 8 completed !
Task 15 completed !
Task 13 completed !
Task 16 completed !
Task 4 completed !
Task 6 completed !
Task 10 completed !
Task 7 completed !
Task 14 completed !
Task 17 completed !
You'll notice, for example, that task 19 was not rescheduled because the max build time expired before the scheduler could attempt to offer it to the queue a second time. You can also see that all the ongoing tasks that started before the max build time expired do run to completion.
Note: As noted in my comments in the code, the max build time is a soft requirement, which means that it might not be met exactly, and my solution indeed allows a task to be submitted even after the max build time has expired. This can happen if the call to offer starts just before the max build time expires and finishes after. To reduce the odds of this happening, it is important that the timeout used in the call to offer is much smaller than the max build time. In the real system, the thread pool is usually busy with no idle thread, so the probability of this race condition occurring is extremely small, and it has no bad consequences for the system when it does happen, since the max build time is a best-effort attempt to meet an overall running time, not an exact and rigid constraint.
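The effect of lazy thread creation on SynchronousQueue.offer, described earlier in this answer, can be isolated in a few lines. This is a minimal sketch; the class PrestartSketch and its method names are hypothetical.

```java
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class PrestartSketch {
    /** With no thread waiting on the queue, the untimed offer fails at once. */
    static boolean offerWithoutConsumer() {
        SynchronousQueue<Runnable> queue = new SynchronousQueue<>();
        return queue.offer(() -> { });
    }

    /** After prestarting the core thread, a worker is waiting and a timed offer succeeds. */
    static boolean offerAfterPrestart() {
        SynchronousQueue<Runnable> queue = new SynchronousQueue<>();
        ThreadPoolExecutor pool = new ThreadPoolExecutor(1, 1, 0L, TimeUnit.SECONDS, queue);
        pool.prestartAllCoreThreads(); // without this call the pool has no threads yet
        try {
            // The timeout gives the freshly started worker time to reach the queue.
            return queue.offer(() -> { }, 5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("offer without consumer: " + offerWithoutConsumer());
        System.out.println("offer after prestart: " + offerAfterPrestart());
    }
}
```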
I tried to set up a sample Play Framework (version 2.2.2) Java application to test its performance on some simple use case scenarios I had in mind. This is what I did:
Play controller
I wrote a basic Application controller to test the performance of a custom library I wanted to use in both sync and async scenarios:
public class Application extends Controller {
public static JsonNode transform(Request request) {
// this method reads a json from request, applies some transformation and returns a new JsonNode
}
public static Result syncTest() {
JsonNode node = transform(request());
if(node.has("error")) {
return badRequest(node);
} else {
return ok(node);
}
}
public static Promise<Result> asyncTest() {
final Request request = request();
Promise<JsonNode> promise = Promise.promise(
new Function0<JsonNode>() {
public JsonNode apply() {
return transform(request);
}
});
return promise.map(new Function<JsonNode, Result> () {
public Result apply(JsonNode node) {
if(node.has("error")) {
return badRequest(node);
} else {
return ok(node);
}
}
});
}
}
I run this service on a virtual machine on Azure with two 2.0 GHz cores and 3.4 GB of RAM.
Testing
I used wrk from a different machine to perform tests on both sync and async routes. These are the commands and the results I got:
./wrk -s post.lua -d30s -c100 -t10 --latency http://my.proxy.net:8080/syncTest
Running 30s test #
10 threads and 100 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 84.98ms 48.13ms 410.73ms 68.95%
Req/Sec 121.23 18.90 181.00 73.67%
Latency Distribution
50% 81.36ms
75% 112.51ms
90% 144.44ms
99% 231.99ms
36362 requests in 30.03s, 10.99MB read
Requests/sec: 1210.80
Transfer/sec: 374.83KB
./wrk -s post.lua -d30s -c100 -t10 --latency http://my.proxy.net:8080/asyncTest
Running 30s test #
10 threads and 100 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 82.07ms 36.55ms 257.93ms 70.53%
Req/Sec 122.44 15.39 161.00 73.24%
Latency Distribution
50% 80.26ms
75% 102.37ms
90% 127.14ms
99% 187.17ms
36668 requests in 30.02s, 11.09MB read
Requests/sec: 1221.62
Transfer/sec: 378.18KB
./wrk -s post.lua -d30s -c1000 -t10 --latency http://my.proxy.net:8080/syncTest
Running 30s test #
10 threads and 1000 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 842.98ms 617.40ms 4.18s 59.56%
Req/Sec 118.02 16.82 174.00 77.50%
Latency Distribution
50% 837.67ms
75% 1.14s
90% 1.71s
99% 2.51s
35326 requests in 30.01s, 10.68MB read
Socket errors: connect 0, read 27, write 0, timeout 181
Requests/sec: 1176.97
Transfer/sec: 364.35KB
./wrk -s post.lua -d30s -c1000 -t10 --latency http://my.proxy.net:8080/asyncTest
Running 30s test #
10 threads and 1000 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 5.98s 4.53s 17.97s 72.66%
Req/Sec 21.32 10.45 37.00 59.74%
Latency Distribution
50% 4.86s
75% 8.30s
90% 12.89s
99% 17.10s
6361 requests in 30.08s, 1.92MB read
Socket errors: connect 0, read 0, write 0, timeout 8410
Requests/sec: 211.47
Transfer/sec: 65.46KB
During all tests, both CPUs of the server machine were at 100%. Later, I repeated these experiments but modified the Promises I was creating to run on a different execution context than the default. In this case, the sync and async methods performed very similarly.
Questions
Why is it that, when using 10 threads with 100 connections, both methods have similar latency and requests per second?
Why is it that, with 1000 connections, the async method seems to have worse performance than the sync method or, in the case of a separate execution context, similar performance to the sync method?
Is it because the transform method is not really CPU intensive, because I implemented the async version wrong, or because I have completely misunderstood how this is supposed to work?
Thanks in advance!
I am trying to get a simple example of the Quartz scheduler working in JBoss Seam 2.2.0.GA. Everything works fine using the RAMJobStore setting, but changing the store from
org.quartz.jobStore.class = org.quartz.simpl.RAMJobStore
to
org.quartz.jobStore.class org.quartz.impl.jdbcjobstore.JobStoreCMT
org.quartz.jobStore.driverDelegateClass org.quartz.impl.jdbcjobstore.PostgreSQLDelegate
org.quartz.jobStore.useProperties false
org.quartz.jobStore.dataSource quartzDatasource
## FIXME Should be a different datasource for the non managed connection.
org.quartz.jobStore.nonManagedTXDataSource quartzDatasource
org.quartz.jobStore.tablePrefix qrtz_
org.quartz.dataSource.quartzDatasource.jndiURL java:/quartzDatasource
allows the scheduler to start up, but whereas the job was previously being triggered and run at the correct interval, now it is not run at all. There is also nothing persisted to the quartz database.
I am aware that the nonManagedTXDataSource shouldn't be the same as the managed datasource, but I am having issues with the datasource being unable to be found by Quartz, even though there is a message earlier on reporting it being bound successfully (this is probably about to be asked in a separate question). Using the same datasource allows the service to start up without errors.
My components.xml file has the following:
<event type="org.jboss.seam.postInitialization">
<action execute="#{asyncResultMapper.scheduleTimer}"/>
</event>
<async:quartz-dispatcher/>
and ASyncResultMapper has the following:
@In
ScheduleProcessor processor;
private String text = "ahoy";
private QuartzTriggerHandle quartzTriggerHandle;
public void scheduleTimer() {
String cronString = "* * * * * ?";
quartzTriggerHandle = processor.createQuartzTimer(new Date(), cronString, text);
}
and ScheduleProcessor is as follows:
@Name("processor")
@AutoCreate
@Startup
@Scope(ScopeType.APPLICATION)
public class ScheduleProcessor {
@Asynchronous
public QuartzTriggerHandle createQuartzTimer(@Expiration Date when, @IntervalCron String interval, String text) {
process(when, interval, text);
return null;
}
private void process(Date when, String interval, String text) {
System.out.println("when = " + when);
System.out.println("interval = " + interval);
System.out.println("text = " + text);
}
}
The logs show the service starting but nothing about the job:
INFO [QuartzScheduler] Quartz Scheduler v.1.5.2 created.
INFO [JobStoreCMT] Using db table-based data access locking (synchronization).
INFO [JobStoreCMT] Removed 0 Volatile Trigger(s).
INFO [JobStoreCMT] Removed 0 Volatile Job(s).
INFO [JobStoreCMT] JobStoreCMT initialized.
INFO [JobStoreCMT] Freed 0 triggers from 'acquired' / 'blocked' state.
INFO [JobStoreCMT] Recovering 0 jobs that were in-progress at the time of the last shut-down.
INFO [JobStoreCMT] Recovery complete.
INFO [JobStoreCMT] Removed 0 'complete' triggers.
INFO [JobStoreCMT] Removed 0 stale fired job entries.
INFO [QuartzScheduler] Scheduler FlibScheduler$_NON_CLUSTERED started.
I'm sure it's probably something trivial I've missed, but I can't find a solution in the forums anywhere.
Managed to solve this for myself in the end. The JobStoreCMT version failing to start and trigger jobs was caused by a combination of a missing @Transactional (thanks tair) and, more importantly, a need to upgrade Quartz. Once Quartz was upgraded to 1.8.5, the error messages became a lot more useful.