I'm not quite sure whether this is more of an Openbravo issue or more of a Quartz issue, but we have some manual processes that run on schedules via Openbravo ProcessRequest objects (OB v2.50MP24), but it seems that the processes are running twice, at the exact same time. Openbravo extends the Quartz platform for their scheduling. I've tried to resolve this issue on my own by ensuring that my process classes extend this class:
import java.util.List;
import org.openbravo.dal.service.OBDal;
import org.openbravo.model.ad.ui.ProcessRequest;
import org.openbravo.scheduling.ProcessBundle;
import org.openbravo.service.db.DalBaseProcess;
public abstract class RBDDalProcess extends DalBaseProcess {
#Override
protected void doExecute(ProcessBundle bundle) throws Exception {
org.quartz.Scheduler sched = org.openbravo.scheduling.OBScheduler
.getInstance().getScheduler();
int runCount = 0;
synchronized (sched) {
List<org.quartz.JobExecutionContext> currentlyExecutingJobs = (List<org.quartz.JobExecutionContext>) sched
.getCurrentlyExecutingJobs();
for (org.quartz.JobExecutionContext jec : currentlyExecutingJobs) {
ProcessRequest processRequest = OBDal.getInstance().get(
ProcessRequest.class, jec.getJobDetail().getName());
if (processRequest == null)
continue;
String processClass = processRequest.getProcess()
.getJavaClassName();
if (bundle.getProcessClass().getCanonicalName()
.equals(processClass)) {
runCount++;
}
}
}
if (runCount > 1) {
System.out.println("Process "
+ bundle.getProcessClass().getSimpleName()
+ " is already running. Cancelling.");
return;
}
doRun(bundle);
}
protected abstract void doRun(ProcessBundle bundle);
}
This worked fine when I tested by requesting the process to run immediately twice at the same time. One of them cancelled. However, it's not working on the scheduled processes. I have S.o.p's set up to log when the processes start, and looking at the logs shows each line of the output twice, each line one right after the other.
I have a sneaking suspicion that it's because the processes are either running in two completely different threads that don't know about each others' processes, however, I'm not sure how to verify my suspicions or, if I am correct, what to do about it. I've already verified that there is only one instance of each of the ProcessRequest objects stored in the database.
Has anyone else experienced this, know why they might be running twice, or know what I can do to prevent them from simultaneously running?
The most common reasons for a double Job execution are the following:
EDITED:
Your application is deployed in a clustered environment and you have not configured Quartz to run in a cluster environment.
Your application is deployed more than once. There are many cases where the application is deployed twice especially in Tomcat server. As a consequence the QuartzInitializerListener is invoked twice and the Jobs are executed twice. In case you use Tomcat server and you are defining contexts explicitly in server.xml, you should turn off automatic application deployment or specify deployIgnore. Both the autoDeploy set to true and the context element existence in server.xml, have as a consequence the twice deployment of the application. Set autoDeploy to false or remove the context element from the server.xml.
Your application has been redeployed without unscheduling the current processes.
I hope this helps you.
Quartz uses a thread pool for the jobs execution. So as you suspect, the RBDDalProcess will probably have separate instances a in separate thread and the counter check will fail.
One thing you can do is list the jobs registered in the Scheduler (you can get the Scheduler using the OB API as: OBScheduler.getScheduler()):
// enumerate each job group
for(String group: sched.getJobGroupNames()) {
// enumerate each job in group
for(JobKey jobKey : sched.getJobKeys(groupEquals(group))) {
System.out.println("Found job identified by: " + jobKey);
}
}
If you see the same job added twice, check out org.quartz.spi.JobFactory and the org.quartz.Scheduler.setJobFactory method for controlling jobs instantiations.
Also make sure you have only one entry for this process in the 'Report and Process' table in Openbravo.
I have used DalBaseProcess in Openbravo 3.0 and I cannot confirm this behavior you're describing. Having this in mind it would be probably a good idea to checkout the reported bugs for Openbravov2.50MP24 and Quartz or post a thread in Openbravo Forge forums with your problem.
Related
Suppose you have Spark + Standalone cluster manager. You opened spark session with some configs and want to launch SomeSparkJob 40 times in parallel with different arguments.
Questions
How to set reties amount on job failures?
How to restart jobs programmatically on failure? This could be useful if jobs failure due lack of resources. Than I can launch one by one all jobs that require extra resources.
How to restart spark application on job failure? This could be useful if job lack resources even when it's launched simultaneously. Than to change cores, CPU etc configs I need to relaunch application in Standalone cluster manager.
My workarounds
1) I pretty sure the 1st point is possible, since it's possible at spark local mode. I just don't know how to do that in standalone mode.
2-3) It's possible to hand listener on spark context like spark.sparkContext().addSparkListener(new SparkListener() {. But seems SparkListener lacks failure callbacks.
Also there is a bunch of methods with very poor documentation. I've never used them, but perhaps they could help to solve my problem.
spark.sparkContext().dagScheduler().runJob();
spark.sparkContext().runJob()
spark.sparkContext().submitJob()
spark.sparkContext().taskScheduler().submitTasks();
spark.sparkContext().dagScheduler().handleJobCancellation();
spark.sparkContext().statusTracker()
You can use SparkLauncher and control the flow.
import org.apache.spark.launcher.SparkLauncher;
public class MyLauncher {
public static void main(String[] args) throws Exception {
Process spark = new SparkLauncher()
.setAppResource("/my/app.jar")
.setMainClass("my.spark.app.Main")
.setMaster("local")
.setConf(SparkLauncher.DRIVER_MEMORY, "2g")
.launch();
spark.waitFor();
}
}
See API for more details.
Since it creates process you can check the Process status and retry e.g. try following:
public boolean isAlive()
If Process is not live start again, see API for more details.
Hoping this gives high level idea of how we can achieve what you mentioned in your question. There could be more ways to do same thing but thought to share this approach.
Cheers !
check your spark.sql.broadcastTimeout and spark.broadcast.blockSize properties, try to increase them .
I have a spark steaming program with the following structure deployed in yarn-client mode with 4 executors.
ListStream.foreachRDD(listJavaRDD -> {
listJavaRDD.foreachPartition(tuple2Iterator -> {
while (tuple2Iterator.hasNext()) {
//Program logic
}
//Program logic
}
//Program logic
return null;
});
At some random points some tasks do not return from executor to spark driver even after program logic is completely executed in executor. (I have verified this by examining the executor logs). The steaming job continues without any issue once I kill the particular job.
The issue is related to the record size or the nature of record as well.
I have not been able to reproduce this particular issue identify the root cause.I would like to hear if anyone has experienced a similar issue or any possible causes.
I'm currently developing some web services in Java (& JPA with MySQL connection) that are being triggered by an SAP System.
To simplify my problem I'm referring the two crucial entities as BlogEntry and Comment. A BlogEntry can have multiple Comments. A Comment always belongs to exactly one BlogEntry.
So I have three Services (which I can't and don't want to redefine, since they're defined by the WSDL I exported from SAP and used parallel to communicate with other Systems): CreateBlogEntry, CreateComment, CreateCommentForUpcomingBlogEntry
They are being properly triggered and there's absolutely no problem with CreateBlogEntry or CreateComment when they're called seperately.
But: The service CreateCommentForUpcomingBlogEntry sends the Comment and a "foreign key" to identify the "upcoming" BlogEntry. Internally it also calls CreateBlogEntry to create the actual BlogEntry. These two services are - due to their asynchronous nature - concurring.
So I have two options:
create a dummy BlogEntry and connect the Comment to it & update the BlogEntry, once CreateBlogEntry "arrives"
wait for CreateBlogEntry and connect the Comment afterwards to the new BlogEntry
Currently I'm trying the former but once both services are fully executed, I end up with two BlogEntries. One of them only has the ID delivered by CreateCommentForUpcomingBlogEntry but it is properly connected to the Comment (more the other way round). The other BlogEntry has all the other information (such as postDate or body), but the Comment isn't connected to it.
Here's the code snippet of the service implementation CreateCommentForUpcomingBlogEntry:
#EJB
private BlogEntryFacade blogEntryFacade;
#EJB
private CommentFacade commentFacade;
...
List<BlogEntry> blogEntries = blogEntryFacade.findById(request.getComment().getBlogEntryId().getValue());
BlogEntry persistBlogEntry;
if (blogEntries.isEmpty()) {
persistBlogEntry = new BlogEntry();
persistBlogEntry.setId(request.getComment().getBlogEntryId().getValue());
blogEntryFacade.create(persistBlogEntry);
} else {
persistBlogEntry = blogEntries.get(0);
}
Comment persistComment = new Comment();
persistComment.setId(request.getComment().getID().getValue());
persistComment.setBody(request.getComment().getBody().getValue());
/*
set other properties
*/
persistComment.setBlogEntry(persistBlogEntry);
commentFacade.create(persistComment);
...
And here's the code snippet of the implementation CreateBlogEntry:
#EJB
private BlogEntryFacade blogEntryFacade;
...
List<BlogEntry> blogEntries = blogEntryFacade.findById(request.getBlogEntry().getId().getValue());
BlogEntry persistBlogEntry;
Boolean update = false;
if (blogEntries.isEmpty()) {
persistBlogEntry = new BlogEntry();
} else {
persistBlogEntry = blogEntries.get(0);
update = true;
}
persistBlogEntry.setId(request.getBlogEntry().getId().getValue());
persistBlogEntry.setBody(request.getBlogEntry().getBody().getValue());
/*
set other properties
*/
if (update) {
blogEntryFacade.edit(persistBlogEntry);
} else {
blogEntryFacade.create(persistBlogEntry);
}
...
This is some fiddling that fails to make things happen as supposed.
Sadly I haven't found a method to synchronize these simultaneous service calls. I could let the CreateCommentForUpcomingBlogEntry sleep for a few seconds but I don't think that's the proper way to do it.
Can I force each instance of my facades and their respective EntityManagers to reload their datasets? Can I put my requests in some sort of queue that is being emptied based on certain conditions?
So: What's the best pracice to make it wait for the BlogEntry to exist?
Thanks in advance,
David
Info:
GlassFish Server 3.1.2
EclipseLink, version: Eclipse Persistence Services - 2.3.2.v20111125-r10461
If you are sure you are getting a CreateBlogEntry call, queue the CreateCommentForUpcomingBlogEntry calls and dequeue and process them once you receive the CreateBlogEntry call.
Since you are on an application server, for queues, you can probably use JMS queues that autoflush to storage or use the DB cache engine (Ehcache ?), in case you receive a lot of calls or want to provide a recovery mechanism across restarts.
We have an application deployed in a clustered environment. Every 5 minutes our application sends a ping operation to all other applications connected to it. We have used non-persistent Quartz scheduler in order to do this work.
The problem is that in a clustered environment only one node is doing this activity(ping operation). Are there any references or any sample code for this? (This is a plain servlet application.)
Since all nodes are working in a cluster, every job runs on just a single machine (most idle one). This is the reason you use clustering. But you want all machines to run given job independently, not being aware of other cluster nodes. Basically, you don't need Quartz (cluster) at all!
Enough is to use ScheduledExecutorService.html#scheduleAtFixedRate():
final ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(1);
final Runnable pinger = new Runnable() {
public void run() {
//send PING
}
};
scheduler.scheduleAtFixedRate(pinger, 5, 5, MINUTES);
Just run this code on every machine and use Quartz where you need it.
I use org.eclipse.core.runtime.jobs.Job to execute stored procedure which deletes data and to update user interface according to the new data. Thus it is important that this job will be completed even if user closes eclipse application.
final Job execStoredProcJob = new Job(taskName) {
protected IStatus run(IProgressMonitor monitor) {
monitor.beginTask(taskName,
// execute stored procedure
// update user interface
monitor.done();
return Status.OK_STATUS;
}
};
execStoredProcJob.schedule();
When I close eclipse app while the Job still running it seems to kill the Job. How to complete the job after user has closed eclipse app? Is it possible?
I think you might want to take a look at scheduling rules
http://www.eclipse.org/articles/Article-Concurrency/jobs-api.html
execStoredProcJob.setRule([Workspace root]);
execStoredProcJob.schedule();
[Workspace root] can be attained like project.getWorkspace().getRoot() if you have a reference to your project.
That will block all jobs that require the same rule. The shutdown job is one of them..
It's also possible to:
IWorkspace myWorkspace = org.eclipse.core.resources.ResourcesPlugin.getWorkspace();
Then use:
myWorkspace.getRoot();
An alternative to scheduling rules is to add code in WorkbenchAdvisor.preShutdown() that will join any outstanding job you have...