WebSphere application server scheduler or java scheduler for insert


I am working on an application deployed on WebSphere Application Server 8.0. The application inserts records into a table and obtains its data source through a JNDI lookup.
I need to create a batch job that reads data from that table and inserts it into another table at a fixed interval. It will be deployed on the same WAS server and use the same JNDI lookup for the data source.
I read online that the WebSphere Application Server scheduler is one option, and that it is implemented using EJBs (session beans).
I also read about the JDK's ScheduledThreadPoolExecutor. I could create a WAR containing a ScheduledThreadPoolExecutor implementation and deploy it on WAS.
I tried to compare the two in terms of usage, complexity, performance and maintainability, but could not find a clear answer.
Please help me decide which approach is better for scheduling insert batch jobs, and why. If the WAS scheduler is better, please provide a link explaining how to create and deploy one.
Thanks!

Some major differences between the WAS Scheduler and the Java SE ScheduledThreadPoolExecutor are that the WAS Scheduler is transactional (task execution can roll back or commit), persistent (tasks are stored in a database), and can coordinate across members of a cluster (tasks can be scheduled from any member but run on only one member).
ScheduledThreadPoolExecutor is a much lighter-weight approach because it has none of these capabilities and does all of its scheduling within a single JVM. Task executions neither roll back nor retry, and tasks are not kept externally in a database in case the server goes down.
Note that WebSphere Application Server also has the CommonJ TimerManager (and AlarmManager via WorkManager), which are more similar to ScheduledThreadPoolExecutor if that is what you want. In that case, the application server still manages the threads and ensures that the context of the scheduling thread is available on the thread of execution. Hope this helps with your decision.
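For comparison, here is a minimal sketch of the lightweight Java SE option, assuming a single JVM and no persistence. The class name IntervalCopyJob and the copyTask callback are hypothetical; the real task would read from the source table and insert into the target table. Threads created this way are not managed by the application server, so whether a JNDI lookup works from them is something to verify on your server.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch: runs a copy task at a fixed interval inside one JVM. */
class IntervalCopyJob {
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();
    private final AtomicInteger runs = new AtomicInteger();
    private final Runnable copyTask;

    IntervalCopyJob(Runnable copyTask) {
        this.copyTask = copyTask;
    }

    /** Starts the job; each run begins intervalMillis after the previous one ends. */
    void start(long intervalMillis) {
        scheduler.scheduleWithFixedDelay(() -> {
            try {
                copyTask.run();
            } catch (RuntimeException e) {
                // An exception escaping the task would cancel all later runs,
                // so it is caught and logged here. Nothing is rolled back or retried.
                System.err.println("copy failed: " + e);
            } finally {
                runs.incrementAndGet();
            }
        }, 0, intervalMillis, TimeUnit.MILLISECONDS);
    }

    /** Number of runs finished so far, whether they succeeded or failed. */
    int completedRuns() {
        return runs.get();
    }

    /** Stops scheduling and waits briefly; returns true if the executor terminated. */
    boolean stop() {
        scheduler.shutdown();
        try {
            return scheduler.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        IntervalCopyJob job = new IntervalCopyJob(
                () -> System.out.println("copy rows from source table to target table"));
        job.start(200);
        Thread.sleep(700);
        System.out.println("stopped cleanly: " + job.stop());
    }
}
```

If the server goes down, the schedule is simply lost and has to be started again on deployment, which is the trade-off described above.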

Related

How can I coordinate a single ejb timer deployed to multiple servers within my WebLogic cluster?

So, I have a web client and an EJB timer, deployed separately.
The workflow is as follows:
1) User accesses client.
2) User requests an action to take place which is known to be long-running, so we write the request to run this process in a database table.
3) TimerOne is checking this table every few seconds to see if there are any waiting tasks, so it finds the user's request and runs the task.
My problem is that in some environments in which our application is run, we are taking advantage of server clustering. When we do this, both the client and the EJB timer are deployed to each server in the cluster.
It is okay for the client to be deployed to multiple servers, as it helps with workload; however, having the timer run on multiple servers is an issue. When the user requests for a long-running task to be run, both timers grab the task at the same time from the database and start running it. As the long-running jobs usually write to the database, this scenario leads to collisions, among other issues.
My goal is to be able to deploy my EJB timer to both servers, but for there to be some state maintained across the cluster which can be used by the timers to decide whether they should pick up the task or if one of the other instances has already picked it up.
I tried using the database for this and tried file storage, but these are either too slow, or I could not come up with a bullet-proof workflow for synchronization.
Does anyone know of a good way to handle this problem? Is it even possible?
The solution should be able to run on a clustered WebLogic domain, a non-clustered WebLogic domain, a clustered Glassfish domain, and a non-clustered Glassfish domain.
I am open to changing the way this is done, if there is another, more elegant solution.
Thanks for any ideas!

Yes, this is possible with clustered timers or a WebLogic Singleton Service (and has been asked a number of times here already). See the following:
Clustered timers:
https://blogs.oracle.com/muraliveligeti/entry/ejb_timer_ejb
http://shaoxiongyang.blogspot.com/2010/10/how-to-use-ejb-3-timer-in-weblogic-10.html
http://java.sys-con.com/node/43944
Singleton Services:
https://blogs.oracle.com/jamesbayer/entry/a_simple_job_scheduler_example
http://developsimpler.blogspot.com/2012/03/weblogic-clusters-and-singleton-service.html

"I am open to changing the way this is done, if there is another, more elegant solution."
I know that your question is about an EJB Timer, but keep the following in mind:
In my opinion, your requirement calls for asynchronous processing.
In earlier Java EE versions, one way to meet this kind of requirement was JMS, which lets you send a message that is processed later by a business-layer component. Another possibility was the one you described, using an EJB Timer. I think both were workarounds that filled a gap in the EE specification.
Since Java EE 6, you can define asynchronous services, which let you make asynchronous calls without resorting to features that were designed for other purposes.

Executing external services using EJB Timer Service

I have a scenario to ask regarding utilizing the EJB Timer Service.
Use case as follows:
The system should be able to schedule a task that polls our Subversion repository for file changes since a particular timestamp.
The idea is that whenever the scheduled task runs, it executes a command against a particular svn repository.
For this particular purpose, I will not call any external process but will use the 'pure' Java SVNKit library http://svnkit.com/
My only concern is this:
Is it a good idea to use the EJB Timer Service to execute tasks that call external processes? My approach uses pure Java, but other scenarios might call a batch file, command line or external executable directly from the timer logic.
I worry about the effects of server memory use/performance etc.
Is this a good idea?
The other option I am considering is to create a 'desktop' application on the server, using a client technology such as SWT or Swing, that does the polling and contains the logic. But this means I would need to manage two applications: the 'desktop' app that polls and the 'web' user interface that I will create in Glassfish.
I am leaning towards doing everything in the app server of my choice, which is Glassfish.
I have used EJB Timers before, but only for calls against the database, never to call an external service. Since this scenario came up, I am asking here to gather thoughts from those who have experience doing this.
Any thoughts?

In theory, EJBs aren't supposed to depend on external I/O since it interferes with the container/server's management of bean instances, threads, etc.
In practice, this should work if you take precautions. For example:
isolate the function to its own EJB (i.e., a stateless session bean that only handles these timers) to avoid instance pooling issues
use timeouts while waiting for commands, so that a hung process cannot tie up all server threads
ensure that you don't schedule timers so that you have multiple OS commands run simultaneously
Keep in mind that EJB 3.0 timers are persistent (vs EJB 3.1 timers, which have the option of being non-persistent), which means:
They can run on any server in a cluster. If you have multiple machines in your cluster, you need to ensure that they are all capable of running the command.
They survive server restarts. If you schedule a timer to run but the server crashes before it can, it will run when the server restarts. This can cause particular problems for interval timers (all missed timers will fire repeatedly) and if you don't carefully manage existing timers (you can easily create redundant timers).
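The timeout precaution mentioned above can be sketched in plain Java SE as follows. This is only an illustration under assumptions: the class BoundedCall and its callWithTimeout method are hypothetical names, and creating your own executor is generally discouraged inside an EJB, so in a real bean you would prefer a timeout offered by the blocking call itself (for example Process.waitFor(long, TimeUnit) for an OS command) or a container-managed facility.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper: bounds how long the caller waits for a blocking call. */
class BoundedCall {

    /** Runs the call and returns its result, or fallback if it fails or times out. */
    static <T> T callWithTimeout(Callable<T> call, long timeoutMillis, T fallback) {
        ExecutorService worker = Executors.newSingleThreadExecutor();
        Future<T> future = worker.submit(call);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Interrupt the stuck call so the caller is not blocked indefinitely.
            future.cancel(true);
            return fallback;
        } catch (ExecutionException e) {
            // The call itself threw an exception.
            return fallback;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return fallback;
        } finally {
            worker.shutdownNow();
        }
    }

    public static void main(String[] args) {
        String fast = callWithTimeout(() -> "done", 1000, "timed out");
        String slow = callWithTimeout(() -> {
            Thread.sleep(5000); // stands in for a hung external command
            return "done";
        }, 100, "timed out");
        System.out.println(fast + " / " + slow);
    }
}
```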

Using Quartz in a clustered environment

I'm looking to use the Quartz scheduler in my application because I have a clustered environment and want to guarantee that only one instance of my job runs each hour. My question is: do I have to use a JDBC job store or some other "outside" storage of job data to guarantee that only one instance in my cluster runs the job at any given hour, or is there more magic to Quartz than I am aware of?

Yes, you need to use the JDBC-JobStore, or else the TerracottaJobStore to enable a mechanism for the nodes to communicate with each other (in the one case they communicate in the db tables, in the other via the Terracotta networking features).
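As a rough illustration of the JDBC-JobStore route, the sketch below builds the clustering-related Quartz properties in Java. The class name QuartzClusterConfig, the scheduler name and the data source name "myDS" are assumptions for the example; the data source definition itself and the Quartz tables are not shown. In a real application these properties would typically live in quartz.properties or be passed to Quartz's StdSchedulerFactory.

```java
import java.util.Properties;

/** Hypothetical sketch: Quartz settings that enable clustering over a JDBC job store. */
class QuartzClusterConfig {

    static Properties clusteredJobStore(String schedulerName, String dataSourceName) {
        Properties p = new Properties();
        // Every node in the cluster uses the same scheduler name...
        p.setProperty("org.quartz.scheduler.instanceName", schedulerName);
        // ...but needs a unique instance id; AUTO lets Quartz generate one.
        p.setProperty("org.quartz.scheduler.instanceId", "AUTO");
        // Store jobs and triggers in database tables shared by all nodes.
        p.setProperty("org.quartz.jobStore.class",
                "org.quartz.impl.jdbcjobstore.JobStoreTX");
        p.setProperty("org.quartz.jobStore.driverDelegateClass",
                "org.quartz.impl.jdbcjobstore.StdJDBCDelegate");
        // Name of a data source configured elsewhere (assumed for this example).
        p.setProperty("org.quartz.jobStore.dataSource", dataSourceName);
        // Turn on clustering and set how often each node checks in (milliseconds).
        p.setProperty("org.quartz.jobStore.isClustered", "true");
        p.setProperty("org.quartz.jobStore.clusterCheckinInterval", "20000");
        return p;
    }

    public static void main(String[] args) {
        clusteredJobStore("HourlyJobScheduler", "myDS")
                .forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```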

Need help with java web app design to perform background tasks

I have a local web app that is installed on a desktop PC, and it needs to regularly sync with a remote server through web services.
I have a "transactions" table that stores transactions that have been processed locally and need to be sent to the remote server. This table also contains transactions that have been retrieved from the remote server through a web service call (they were processed remotely) and need to be performed locally. The transactions are performed in time order to ensure they are processed in the right order.
An example of the type of transactions are "loans" and "returns" of items from a store, for example a video rental store. For example something may have been loaned locally and returned remotely or vice versa, or any sequence of loan/return events.
There is also other information that is retrieved from the remote server to update the local records.
When the user performs the tasks locally, I update the local db in real time and add the transaction to the table for background processing with the remote server.
What is the best approach for processing the background tasks? I have tried using a Thread that is created in an HttpSessionListener and calling interrupt() when the session is removed, but I don't think this is the safest approach. I have also tried using a session attribute as a locking mechanism, but this isn't the best approach either.
I was also wondering how to know when a thread has completed its run, so as to avoid launching another thread at the same time, or whether a thread has died before completing.
I have come across another suggestion, using the Quartz scheduler, but I haven't read up on it in detail yet. I am going to purchase a copy of Java Concurrency in Practice, but I wanted some ideas for the best approach before I get stuck into it.
BTW I'm not using a web app framework.
Thanks.

Safest would be to create an application-wide thread pool managed by the container. How to do that depends on the container. If your container doesn't support it (e.g. Tomcat) or you want to be container-independent, the basic approach is to implement ServletContextListener, create the thread pool on startup using the ExecutorService API introduced in Java 1.5, and shut the pool down on shutdown. If you aren't on Java 1.5 yet or want more abstraction, you can also use Spring's TaskExecutor.
There was once a Java EE proposal for concurrency utilities, but it did not make it into Java EE 6.
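A minimal sketch of the container-independent approach, kept free of servlet imports so it runs on its own. The class BackgroundSync and its methods are hypothetical; the assumption is that start() would be called from ServletContextListener.contextInitialized and stop() from contextDestroyed. It also shows one way to tell whether the previous run has finished, using the Future returned by submit.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: one application-wide worker that never overlaps runs. */
class BackgroundSync {
    private ExecutorService pool;
    private Future<?> lastRun;

    /** Assumed to be called once on application startup. */
    synchronized void start() {
        pool = Executors.newSingleThreadExecutor();
    }

    /** Submits the task unless the previous one is still running. */
    synchronized boolean submitIfIdle(Runnable task) {
        if (!isIdle()) {
            return false;
        }
        lastRun = pool.submit(task);
        return true;
    }

    /** True if no run was started yet or the last run has completed or failed. */
    synchronized boolean isIdle() {
        return lastRun == null || lastRun.isDone();
    }

    /** Assumed to be called once on application shutdown. */
    boolean stop() {
        ExecutorService current;
        synchronized (this) {
            current = pool;
        }
        if (current == null) {
            return true;
        }
        current.shutdownNow(); // interrupts a run that is still in progress
        try {
            return current.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        BackgroundSync sync = new BackgroundSync();
        sync.start();
        System.out.println("submitted: "
                + sync.submitIfIdle(() -> System.out.println("syncing transactions")));
        Thread.sleep(200);
        System.out.println("idle again: " + sync.isIdle());
        System.out.println("stopped cleanly: " + sync.stop());
    }
}
```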
Related questions:
What is the recommend way of spawning threads from a servlet?
Background timer task in a JSP web application

It's better to go with the Quartz scheduling framework, because it has most scheduling-related features. It can store jobs in a database, handles concurrency, etc.
Please try this solution:
Create a table that stores a flag ('Y' or 'N', default 'N') mapped to some identifying field.
Schedule a job for each return at the time the loan is made; the job executes only if the flag is 'Y'.
On return, change the flag to 'N', which then fires the process you wanted to run.

scheduling tasks on JBoss with clustering

I need to be able to run some scheduled tasks (reports) for an EJB application running on JBoss 4.2.
In my initial implementation I am using a servlet in an associated WAR to read some configuration from a properties file and then reset the scheduled tasks using the Timer Service API. This works but it seems a bit awkward to have the initialization off in a web project. Also I'm not sure if this will work as expected when the app is deployed in a clustered environment.
What is the best practice for accomplishing this type of task? Should I be using something other than the Timer Service, and is there a better way to initialize the timers when the server starts?

Maybe have a look at Quartz Scheduler. Quoting its website:
Quartz is a full-featured, open source job scheduling system that can be integrated with, or used along side virtually any J2EE or J2SE application - from the smallest stand-alone application to the largest e-commerce system. Quartz can be used to create simple or complex schedules for executing tens, hundreds, or even tens-of-thousands of jobs; jobs whose tasks are defined as standard Java components or EJBs. The Quartz Scheduler includes many enterprise-class features, such as JTA transactions and clustering.
I've used it in the past to trigger EJB jobs and the whole solution was working very well, with very good scalability. To use it with EJB, you'll need to use the JobStoreCMT to store scheduling information (job, triggers and calendars). To tune resources for jobs execution, have a look at the Configure ThreadPool Settings doc. Then, just let the EJB client do its job to load balance requests over the different instances if EJBs are deployed on a cluster.
Quartz itself can also be clustered to get both high availability and scalability through fail-over and load balancing if required.
Regarding the properties file you mentioned, I'm not sure of what kind of data you need to read exactly but, without a servlet, if you need to read something, you'll have to read it from the database.
