The system consists of an admin console and a cluster of working servers. Application state is stored in a database. From the admin console, a user can add new jobs, monitor running jobs, and so on. Working servers fetch jobs from the database and process them.
Some configuration is stored in the database, too. The configuration is loaded on each working server, and most of it is cached, since configuration does not change frequently.
The admin can change the configuration from the admin console, and the change is stored in the database. What would be the best way to push such changes to the working servers?
My ideas so far:
Add triggers on the configuration table for insert/update/delete that bump a timestamp in an auxiliary table. Each working server checks this auxiliary table for changes before reading from its cache.
CONS: still hits the database on every check.
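To make the trade-off concrete, here is a minimal sketch of this idea in Java. The class name and the two query stand-ins are hypothetical, assuming the aux table exposes a single version/timestamp row:

```java
import java.util.Map;

// Sketch of the aux-table approach: each worker keeps the config cache plus
// the version it was loaded at, and compares that version against the aux
// table before every read. Names are illustrative, not from the post.
public class ConfigCache {
    private volatile Map<String, String> cache = Map.of();
    private volatile long cachedVersion = -1;

    // Stand-in for "SELECT last_modified FROM config_version" on the aux table.
    private final java.util.function.LongSupplier dbVersion;
    // Stand-in for reloading the whole configuration table.
    private final java.util.function.Supplier<Map<String, String>> dbLoad;

    public ConfigCache(java.util.function.LongSupplier dbVersion,
                       java.util.function.Supplier<Map<String, String>> dbLoad) {
        this.dbVersion = dbVersion;
        this.dbLoad = dbLoad;
    }

    public String get(String key) {
        long v = dbVersion.getAsLong();   // cheap single-row query, but still a DB hit
        if (v != cachedVersion) {         // stale: reload everything once
            cache = dbLoad.get();
            cachedVersion = v;
        }
        return cache.get(key);
    }
}
```

The full reload only happens when the version actually changed, but the version query itself runs on every access, which is exactly the CON noted above.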
Send a request from the admin console to every working server announcing that the configuration has changed and must be re-read from the database on the next call.
CONS: introduces HTTP communication between the admin console and the servers, a new layer that didn't exist so far, and it's questionable how reliable that would be.
Any experience on this subject?
The first approach seems more like a quick hack. Checking the auxiliary table's timestamp before every cache read almost defeats the purpose of a cache.
The second option seems to be the better one. It could be implemented as a simple task in the workers' task queue that tells them to refresh their own configuration caches.
The main issue when changing the configuration of multiple services is their dependencies.
If a parent service started running with a configuration incompatible with its child services, you might get a crash or undefined behaviour. To avoid this, the update must be synchronised.
One way is to let the parent service update its own configuration and then issue configuration-update commands to its child services. Once all child services are updated, the parent resumes processing. The advantage of this approach is that a simple management console only has to instruct the parent services about configuration changes.
The other way is to let the management console handle the dependent services itself. It sends parent services a command to pause execution, update their configuration, and wait for a resume command. In the meantime it updates all child services, then instructs the parents to resume. This makes service dependencies more flexible and decouples their configuration from their implementation, but it requires a more advanced administration tool.
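A minimal sketch of this second approach, with an assumed ManagedService interface (the method names are illustrative, not a real API):

```java
import java.util.List;

// Hypothetical handle to a running service; real services would expose
// equivalent pause/reload/resume operations over RPC or a task queue.
interface ManagedService {
    void pause();
    void reloadConfig();
    void resume();
}

// The management console owns the dependency order: parents are paused
// first so no work flows to children mid-update.
public class ConfigRollout {
    public static void rollOut(List<ManagedService> parents, List<ManagedService> children) {
        parents.forEach(ManagedService::pause);
        children.forEach(ManagedService::reloadConfig); // children pick up new config while idle
        parents.forEach(ManagedService::reloadConfig);
        parents.forEach(ManagedService::resume);        // safe: every dependency is consistent now
    }
}
```

The point of the ordering is that a parent never processes work while any of its children still runs the old configuration.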
The Twelve-Factor App site suggests an app should 'Run admin/management tasks as one-off processes'.
What does that mean in the context of a Java/Spring Boot application? Can I get an example?
https://12factor.net/admin-processes
The site does not suggest this. It says that developers may want to do this, and that if they do so, they should apply the same standards as other code:
One-off admin processes should be run in an identical environment as the regular long-running processes of the app. They run against a release, using the same codebase and config as any process run against that release. Admin code must ship with application code to avoid synchronization issues.
As an example from my application: Users may send invitations, to which the recipient must respond within 7 days or the invitation expires. This is implemented by having a timestamp on the invitation and executing a database query equivalent to DELETE FROM Invitations WHERE expiration < NOW().
Now, we could have someone log in to the database and execute this query periodically. Instead, however, this "cleanup" operation is built into the application at a URL like /internal/admin/cleanInvitations, and that endpoint is executed by an external cron job. The scheduling is outside the main application, but all database configuration, connectivity, and logic are contained within it alongside our main business logic.
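Assuming a plain-JDK setup, the endpoint could look roughly like the sketch below. The HttpServer wiring and the expireInvitations() stub are illustrative, not the answer's actual code; a real app would run the DELETE through its normal data-access layer:

```java
import com.sun.net.httpserver.HttpServer;
import java.io.OutputStream;
import java.net.InetSocketAddress;

// Minimal sketch of the cleanup endpoint described above, using the JDK's
// built-in HttpServer. An external cron job triggers it; all DB logic
// lives inside the application.
public class AdminEndpoints {
    // Stub for: DELETE FROM Invitations WHERE expiration < NOW()
    static int expireInvitations() {
        return 0; // rows deleted; real code would execute the SQL here
    }

    public static HttpServer start(int port) throws Exception {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/internal/admin/cleanInvitations", exchange -> {
            int deleted = expireInvitations();
            byte[] body = ("deleted " + deleted + "\n").getBytes();
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream os = exchange.getResponseBody()) { os.write(body); }
        });
        server.start(); // cron then runs: curl http://host:port/internal/admin/cleanInvitations
        return server;
    }
}
```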
The problem statement:
Example: I have a table called "STUDENT" with 10 rows, and one of the rows has the name "Jack". When my server starts up, I load the database table into an in-memory cache, so my application holds the value "Jack" and uses it everywhere.
Now an external source changes my "STUDENT" table, renaming "Jack" to "Prabhu Jack". I want the updated information in my application as soon as possible, without reloading or refreshing the application. I don't want to run a constant monitoring thread to update my application. Is there a Hibernate feature, or any other feasible solution, that achieves this?
What you describe is the classic choice between pulling and pushing updates.
Pull
This approach relies on the application using a background thread or task system that periodically polls a resource for the desired information. It is the application's responsibility to perform this polling.
To use a pull mechanism together with caching in Hibernate, you'd want your Hibernate query results stored in an L2 cache implementation such as ehcache.
Your ehcache configuration specifies the storage capacity and expiration details, and you simply query for the student data wherever you need it. The L2 cache, which lives on the application-server side, is consulted first; the database is only hit when the L2 cache entry has expired.
The downside is that you need to choose a reasonable time-to-live for the L2 cache so that a query refreshes it soon enough after the rows change. Depending on the frequency of change and usage, a 5-minute window may be sufficient.
Using the L2 cache avoids a hand-rolled background poll thread and lets you specify a reasonable refresh interval, all within the Hibernate framework backed by a cache implementation.
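Stripped of the Hibernate and ehcache specifics, the TTL mechanics described above amount to something like this sketch (class and names are illustrative):

```java
import java.util.function.Supplier;

// The pull/TTL behaviour reduced to plain Java: the cached value is served
// until the time-to-live elapses, after which the next read falls through
// to the database query exactly once.
public class TtlCache<V> {
    private final Supplier<V> loader;   // stand-in for the Hibernate query
    private final long ttlMillis;       // e.g. 5 minutes in the answer's example
    private V value;
    private long loadedAt = Long.MIN_VALUE;

    public TtlCache(Supplier<V> loader, long ttlMillis) {
        this.loader = loader;
        this.ttlMillis = ttlMillis;
    }

    public synchronized V get() {
        long now = System.currentTimeMillis();
        if (now - loadedAt >= ttlMillis) { // expired (or never loaded): hit the DB
            value = loader.get();
            loadedAt = now;
        }
        return value;
    }
}
```

This is what an L2 provider does for you, with eviction policies and capacity limits on top.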
Push
This approach relies on the place where a change occurs being able to notify interested parties that something changed, allowing each interested party to react.
To use a push mechanism, your application needs to expose a way to be told that a change occurred, and preferably what the change actually was. When the external source modifies the table in question, that operation must raise an event and notify the interested parties.
One way to architect this is to use a JMS broker: the external source submits a JMS message to a queue, and your application subscribes to that queue and reads each message when it's sent.
Another solution would be to couple the place where the external source manipulates the data tightly with your application such that the external source doesn't just manipulate the data in question, but also sends a JSON request to your application, allowing it to update its internal cache immediately.
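Since a real JMS setup needs a broker, here is the push pattern sketched with a BlockingQueue standing in for the JMS queue; the class name and message format are assumptions:

```java
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;

// Push pattern: the external source offers a change message onto the queue,
// and this consumer applies it to the in-memory cache immediately. A real
// deployment would replace the queue with a JMS broker such as ActiveMQ.
public class StudentCacheUpdater implements Runnable {
    private final BlockingQueue<String[]> queue;            // messages: {id, newName}
    private final Map<String, String> cache = new ConcurrentHashMap<>();

    public StudentCacheUpdater(BlockingQueue<String[]> queue) {
        this.queue = queue;
    }

    public void put(String id, String name) { cache.put(id, name); }
    public String name(String id) { return cache.get(id); }

    @Override public void run() {
        try {
            while (true) {
                String[] msg = queue.take();   // blocks, like a JMS MessageListener
                cache.put(msg[0], msg[1]);     // apply the change pushed by the source
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // shut down cleanly
        }
    }
}
```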
Conclusion
Using a push approach can require introducing additional middleware components if you want to decouple the external source from your application cleanly. But it comes with the benefit that consistency between the database and your application's cache is reached in near real time, and it requires no further database queries for those rows after startup.
Using a pull situation doesn't require anything more than what you're likely already using in your application, other than maybe using a supported L2 cache provider rather than some homegrown solution. However, the eventual consistency between the database and your application's cache is completely dependent on your TTL configuration for that entity's cache. But be aware that this solution will continue to query the database to refresh the cache once your TTL has expired.
We have a requirement to run many async background processes that access DBs, Kafka queues, etc. As of now, we are using Spring Batch with Tomcat (exploded WAR) for this. However, we are facing certain issues that I'm unable to solve with Spring Batch. I was thinking of other frameworks to use, but couldn't find any that solves all my problems.
It would be great to know if there exists a framework which solves the following problems:
Since Spring Batch runs inside one Tomcat container (1 java process), any small update in any job/step will result in restarting the Tomcat server. This results in hard-stopping of all running jobs, resulting in incomplete/stale data.
WHAT I WANT: Bundle all the jars and run each job as a separate process. The framework should store the PID and should be able to manage (stop/force-kill) the job on demand. This way, when we want to update a JAR, the existing process won't be hindered (however, we should be able to stop the existing process from UI), and no other job (running or not) will also be touched.
I have looked at hot-update of JARs in Tomcat, but I'm skeptical whether to use such a mechanism in production.
Sub-question: Will OSGI integrate with Spring Batch? If so, is it possible to run each job as a separate container with all JARs embedded in it?
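The "one process per job, tracked by PID" idea can be sketched with ProcessBuilder (Process.pid() needs Java 9+). The launcher class and the command line are illustrative; a real launcher would run "java -jar job.jar" with the job's own classpath:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Each job runs as its own OS process, so updating a JAR never disturbs
// jobs already in flight. The PID map lets a UI stop/force-kill on demand.
public class JobLauncher {
    private final Map<Long, Process> running = new ConcurrentHashMap<>();

    public long launch(String... command) throws Exception {
        Process p = new ProcessBuilder(command).start();
        running.put(p.pid(), p);        // keep the PID so the UI can manage it
        return p.pid();
    }

    public void kill(long pid) {        // stop/force-kill on demand
        Process p = running.remove(pid);
        if (p != null) p.destroyForcibly();
    }

    public boolean isRunning(long pid) {
        Process p = running.get(pid);
        return p != null && p.isAlive();
    }
}
```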
Spring Batch doesn't have a master-slave architecture.
WHAT I WANT: There should be a master, where the list of jobs is specified. There should be slave machines (workers), declared to the master in a configuration file. A scheduler in the master should, when a job needs to start, assign it to a slave (possibly load-balanced, but not necessarily), and the slave should update the DB. The master should be able to send data to and receive data from the slaves (start/stop/kill any job, report on running jobs, etc.) so that it can be displayed in a UI.
This way, in case I have a high load, I should be able to just add machines into the cluster and modify the master configuration file and the load should get balanced right away.
Spring Batch doesn't have a built-in alerting mechanism in case of job stall/failure.
WHAT I WANT: I should be able to set up alerts for jobs in case of failure. If necessary, a job should have a timeout so that it can notify the user (probably via email) or be force-stopped when it crosses a specified threshold.
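The timeout-plus-alert requirement can be sketched with an ExecutorService and a Future; the class name and the onTimeout hook are assumptions about where the email/notification would go:

```java
import java.util.concurrent.*;

// If the job exceeds its threshold it is cancelled and an alert hook fires;
// a real system would send the email inside onTimeout.
public class TimeoutRunner {
    public static boolean runWithTimeout(ExecutorService pool, Runnable job,
                                         long timeoutMs, Runnable onTimeout) {
        Future<?> f = pool.submit(job);
        try {
            f.get(timeoutMs, TimeUnit.MILLISECONDS);
            return true;                       // finished in time
        } catch (TimeoutException e) {
            f.cancel(true);                    // force-stop the job
            onTimeout.run();                   // alert: email/notify the user
            return false;
        } catch (Exception e) {
            onTimeout.run();                   // job failed: alert as well
            return false;
        }
    }
}
```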
Maybe Vert.x can do the trick.
Since Spring Batch runs inside one Tomcat container (1 java process), any small update in any job/step will result in restarting the Tomcat server. This results in hard-stopping of all running jobs, resulting in incomplete/stale data.
Vert.x lets you build microservices. Each Vert.x instance can communicate with the other instances. If you stop one, the others keep working, provided they don't depend on it (e.g. if you stop the master, the slaves will fail).
Vert.x is not an application server.
There's no monolithic Vert.x instance into which you deploy applications.
You just run your apps wherever you want to.
Spring batch doesn't have a master-slave architecture
Since Vert.x is event-driven, you can easily create a master-slave architecture: for example, handle the HTTP requests in one Vert.x instance and dispatch them among several other instances depending on the nature of the request.
Spring batch doesn't have an in-built alerting mechanism in case of job stall/failure.
In Vert.x, you can set a timeout for each message and handle failures.
Sending with timeouts
When sending a message with a reply handler you can specify a timeout in the DeliveryOptions.
If a reply is not received within that time, the reply handler will be called with a failure.
The default timeout is 30 seconds.
Send Failures
Message sends can fail for other reasons, including:
There are no handlers available to send the message to
The recipient has explicitly failed the message using fail
In all cases the reply handler will be called with the specific failure.
EDIT: There are other frameworks for building microservices in Java. Dropwizard is one of them, but I can't say much more about it.
In my Java-based application, I need a job that reads data from a set of tables and inserts it into another table. In my first design, I created an Oracle job and scheduled it to run the process frequently.
Unfortunately, when the job fails, there is not enough information available about the root cause of the failure. In addition, deploying the system across many instances has made the work harder.
As an alternative, I am trying to move the job into my application server as a WebLogic job. Is this a good design or not?
Having moved my jobs into the application server, I have found the following advantages:
Tracking job failures is easier.
Non-DBA users can easily read the application-server logs and fix the issues. (Many users do not have access to the DB in production.)
The logic of the job has moved from my data-access layer into my business-logic layer, which fits better with common design patterns.
I have a local web app that is installed on a desktop PC, and it needs to regularly sync with a remote server through web services.
I have a "transactions" table that stores transactions processed locally that need to be sent to the remote server. The same table also holds transactions retrieved from the remote server (already processed remotely) that need to be performed locally; these are fetched via a web-service call. The transactions are performed in time order to ensure they are processed in the right sequence.
An example of the type of transactions are "loans" and "returns" of items from a store, for example a video rental store. For example something may have been loaned locally and returned remotely or vice versa, or any sequence of loan/return events.
There is also other information that is retrieved from the remote server to update the local records.
When the user performs the tasks locally, I update the local db in real time and add the transaction to the table for background processing with the remote server.
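The time-ordered merge of local and remote transactions described above could be sketched like this (the Txn record and its fields are illustrative, not the app's actual schema):

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Local and remote transactions are merged and applied strictly by
// timestamp, so a remote return never runs before the local loan it
// depends on.
public class TransactionProcessor {
    public record Txn(Instant at, String type, String itemId) {}

    public static List<Txn> inOrder(List<Txn> local, List<Txn> remote) {
        List<Txn> all = new ArrayList<>(local);
        all.addAll(remote);
        all.sort(Comparator.comparing(Txn::at)); // strict time order
        return all;
    }
}
```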
What is the best approach for processing the background tasks? I have tried using a thread created in an HttpSessionListener and calling interrupt() when the session is removed, but I don't think that is the safest approach. I have also tried using a session attribute as a locking mechanism, but that isn't the best approach either.
I was also wondering how you know when a thread has completed its run, so as to avoid launching another thread at the same time, or whether a thread has died before completing.
I have come across another suggestion: the Quartz scheduler. I haven't read up on that approach in detail yet. I am going to purchase a copy of Java Concurrency in Practice, but I wanted some ideas on the best approach before I get stuck into it.
BTW I'm not using a web app framework.
Thanks.
Safest would be to create an application-wide thread pool managed by the container; how to do that depends on the container used. If your container doesn't support it (e.g. Tomcat), or you want to be container-independent, the basic approach is to implement ServletContextListener, create the thread pool on startup using the ExecutorService API introduced in Java 1.5, and shut the pool down on context shutdown. If you aren't on Java 1.5 yet, or want more abstraction, you can also use Spring's TaskExecutor.
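The pool-management half of that approach, minus the servlet plumbing, might look like this; contextInitialized() would call start() and contextDestroyed() would call stop() (class and method names are illustrative):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Application-wide pool created on startup and drained on shutdown, so no
// worker thread outlives the web application.
public class BackgroundPool {
    private ExecutorService pool;

    public void start() {                       // on application startup
        pool = Executors.newFixedThreadPool(4);
    }

    public ExecutorService pool() { return pool; }

    public void stop() {                        // on application shutdown
        pool.shutdown();                        // accept no new tasks
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                pool.shutdownNow();             // force-stop stragglers
            }
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt();
        }
    }
}
```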
There was once a Java EE proposal for concurrency utilities, but it has not yet made it into Java EE 6.
Related questions:
What is the recommend way of spawning threads from a servlet?
Background timer task in a JSP web application
It's better to go with the Quartz scheduling framework, because it has most of the features related to scheduling: it can store jobs in a database, handle concurrency, and so on.
Please try this solution:
Create a table that stores a flag such as 'Y' or 'N' mapped to some identifiable field, with a default value of 'N'.
When granting the loan itself, schedule a job for the corresponding return; the job executes only if the flag is 'Y'.
On return, change the flag to 'Y', which then fires the process you wanted to run.