Admin Process for 12-factor in Java

The 12-Factor blog suggests an app should 'Run admin/management tasks as one-off processes'.
What does that mean in the context of a Java/Spring Boot application? Can I get an example?
https://12factor.net/admin-processes

The site does not suggest this. It says that developers may want to do this, and that if they do so, they should apply the same standards as other code:
One-off admin processes should be run in an identical environment as the regular long-running processes of the app. They run against a release, using the same codebase and config as any process run against that release. Admin code must ship with application code to avoid synchronization issues.
As an example from my application: Users may send invitations, to which the recipient must respond within 7 days or the invitation expires. This is implemented by having a timestamp on the invitation and executing a database query equivalent to DELETE FROM Invitations WHERE expiration < NOW().
Now, we could have someone log in to the database and execute this query periodically. Instead, however, this "cleanup" operation is built into the application at a URL like /internal/admin/cleanInvitations, and that endpoint is executed by an external cron job. The scheduling is outside the main application, but all database configuration, connectivity, and logic are contained within it alongside our main business logic.
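To make that concrete, here's a rough sketch of what such an endpoint might look like in a Spring Boot app (the class name, URL, and use of JdbcTemplate are illustrative assumptions, not taken from the original application):

```java
import java.time.Instant;

import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RestController;

// Hypothetical admin endpoint: the cleanup logic ships with the application
// code and uses the same datasource configuration, while the *scheduling*
// stays outside (an external cron job calls this URL).
@RestController
public class InvitationCleanupController {

    private final JdbcTemplate jdbcTemplate;

    public InvitationCleanupController(JdbcTemplate jdbcTemplate) {
        this.jdbcTemplate = jdbcTemplate;
    }

    @PostMapping("/internal/admin/cleanInvitations")
    public String cleanInvitations() {
        // The same query an admin would otherwise run by hand against the DB.
        int deleted = jdbcTemplate.update(
                "DELETE FROM Invitations WHERE expiration < ?",
                java.sql.Timestamp.from(Instant.now()));
        return "Deleted " + deleted + " expired invitations";
    }
}
```

The external cron job then only needs to issue an HTTP request to that path on whatever schedule you choose; all database configuration, connectivity, and logic remain inside the application.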

Related

Google Cloud scheduler Java job containerized with Selenium

I've got some Java code that performs interactions with web pages, using Selenium.
Now I'd like to have this code executed every hour, and I thought it was a great occasion to discover the cloud world.
I've created an account on Google Cloud.
Because my app needs a driver to use Selenium (geckodriver for Firefox), I'll have to create a Docker image with everything it needs inside it.
Among the Google Cloud services there is "Cloud Scheduler", which allows me to run code whenever I want.
But here are my questions:
What kind of target should I configure (HTTP, Pub/Sub, HTTP App Engine)?
Because I'm not using Google Cloud Functions, my container will always be up, which doesn't seem like a great idea for pricing reasons. I would have liked to have my container up only for the duration of the execution.
I was also thinking of using the Quarkus framework to wrap my application, since I've seen it was made for the cloud and is very quick to start. Is that the best option for me?
I'd be very glad if someone can help me see this a little more clearly. I'm not a total beginner; I've worked as a Java/JavaScript developer for 5 years now and have dockerized some applications, but the cloud is a big piece and it's not easy to know where to start.
So you:
are using docker images
run your workload occasionally
aren't willing to use Cloud Functions
==> Cloud Run is your best bet. Here is the Google Cloud Run quickstart: https://cloud.google.com/run/docs/quickstarts/prebuilt-deploy
Keep in mind that your containerised application needs to be listening for HTTP requests, so take a look at the Cloud Run container runtime contract.
Finally, you can indeed trigger Cloud Run from Cloud Scheduler; here is detailed documentation on how to do it: https://cloud.google.com/run/docs/triggering/using-scheduler
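To illustrate that contract, here is a minimal sketch in plain Java (no framework; the /run path and handler body are assumptions): the container must listen on the port given in the PORT environment variable, and the Selenium work would be triggered inside the request handler when Cloud Scheduler calls the service.

```java
import java.io.OutputStream;
import java.net.InetSocketAddress;

import com.sun.net.httpserver.HttpServer;

// Minimal sketch of a container that satisfies Cloud Run's runtime contract:
// listen on $PORT and do the work when an HTTP request arrives.
public class SeleniumJobServer {

    public static void main(String[] args) throws Exception {
        int port = Integer.parseInt(System.getenv().getOrDefault("PORT", "8080"));
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);

        server.createContext("/run", exchange -> {
            // Placeholder: this is where the Selenium interactions would run.
            byte[] body = "job executed".getBytes();
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream os = exchange.getResponseBody()) {
                os.write(body);
            }
        });

        server.start();
        System.out.println("Listening on port " + port);
    }
}
```

Because Cloud Run scales to zero between requests, the container is only billed while a request (here, one scheduled run) is being handled.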
As @MBHAPhoenix says, Cloud Run is your best option. You can then trigger the job from Cloud Scheduler. We have this exact scenario currently running for one of our projects, but our container is Python. We wrote an article about it here.
You should note that to trigger your Cloud Run job from Cloud Scheduler, you'll have to secure it. This means you won't be able to just type the URL into a web browser. A service account will be responsible for running the Cloud Run job, and you'll then need to grant your Cloud Scheduler service access to this service account so it can invoke the Cloud Run job. I've been meaning to put up a post about the exact steps for doing this (will try to get it done this weekend).
In terms of cost, we have this snippet from our article
...Cloud Run only runs when it receives an HTTP request. It plays dead and comes alive to execute your code when an HTTP request comes in. When it is done executing the request, it goes 'dead' again till the next request comes in. This means you're not paying for time spent idling i.e. when it is not doing anything.....

Recurrent actions via OpenShift

I have an application (Spring Boot + Hibernate + Postgres) which executes an ETL process. The application is deployed in OpenShift and scaled to n > 1, so it always has more than one replica. But if every replica launched its own ETL against the same database, the data wouldn't be consistent.
Therefore, I think the process should be launched by something external.
I picture the solution as an API method, say doEtl(), which could be called by a Kubernetes (OpenShift) scheduler or some other Kubernetes (OpenShift) tool. However, I can't figure out how to google it. I tried searching for 'kubernetes custom schedule', but the results only explain how scheduling works or how to write a custom scheduler for auto-scaling.
Can someone advise me whether this is generally possible, and if so, how to google it or what it is called?
You might be looking for the CronJob object, which is available in Kubernetes/OpenShift and can be used to regularly execute a certain action.
For OpenShift, you can find more information in the documentation: https://docs.openshift.com/container-platform/4.3/nodes/jobs/nodes-nodes-jobs.html
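On the application side, the 'doEtl()' idea could look roughly like this in Spring Boot (the endpoint path and the EtlService type are illustrative placeholders): the CronJob then only needs to call this URL on a schedule, and the OpenShift service routes that single request to one replica.

```java
import org.springframework.http.HttpStatus;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.ResponseStatus;
import org.springframework.web.bind.annotation.RestController;

// Placeholder for the existing ETL logic.
interface EtlService {
    void run();
}

// Hypothetical trigger endpoint: the replicas no longer start the ETL on
// their own; a Kubernetes/OpenShift CronJob calls this URL on a schedule,
// and only the replica that receives the request runs the ETL.
@RestController
public class EtlController {

    private final EtlService etlService;

    public EtlController(EtlService etlService) {
        this.etlService = etlService;
    }

    @PostMapping("/internal/etl")
    @ResponseStatus(HttpStatus.ACCEPTED)
    public void doEtl() {
        etlService.run();
    }
}
```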

Invoking Java standalone program in servlet or any other J2EE technologies

Here's what I need: I have a UI where a user can upload a file and extract a report based on the uploaded data. Since there is a huge amount of data to be extracted, once the user uploads the data I would like control to pass out of the servlet, so that the user doesn't have to wait on the same page, and over to a standalone Java program, making it possible for the user to work on something else. Once control passes to the standalone program, it would invoke back-end stored procedures, build an extract from them, and place it at a file path on the server.
The user, however, has the ability from the UI to check whether the extract is ready for them to download.
So the question here is: what is the best practice or possibility for achieving this? Please let me know your valuable comments.
Thanks!
If you're running in a Java EE environment I would suggest having the servlet dispatch the task to a JMS queue and use a message driven bean to do the (async) processing.
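As a rough sketch of that pattern, assuming a JMS 2.0 capable container (the queue name, bean names, and the helper for storing the upload are all invented for illustration):

```java
import java.io.IOException;

import javax.annotation.Resource;
import javax.ejb.MessageDriven;
import javax.jms.ConnectionFactory;
import javax.jms.JMSContext;
import javax.jms.JMSException;
import javax.jms.Message;
import javax.jms.MessageListener;
import javax.jms.Queue;
import javax.jms.TextMessage;
import javax.servlet.ServletException;
import javax.servlet.annotation.WebServlet;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Upload servlet: hands the heavy extract off to a JMS queue and returns
// immediately, so the user is not stuck waiting on the page.
@WebServlet("/upload")
public class UploadServlet extends HttpServlet {

    @Resource(mappedName = "jms/ExtractQueue")   // hypothetical queue name
    private Queue extractQueue;

    @Resource
    private ConnectionFactory connectionFactory;

    @Override
    protected void doPost(HttpServletRequest req, HttpServletResponse resp)
            throws ServletException, IOException {
        String uploadedFilePath = saveUploadedFile(req);   // assumed helper
        try (JMSContext context = connectionFactory.createContext()) {
            // JMS 2.0 simplified API; sends a TextMessage carrying the file path
            context.createProducer().send(extractQueue, uploadedFilePath);
        }
        resp.sendRedirect("extract-status.jsp");           // page that polls for completion
    }

    private String saveUploadedFile(HttpServletRequest req) {
        // placeholder: persist the upload and return its server-side path
        return "/data/uploads/latest.csv";
    }
}

// In a separate source file: the message-driven bean does the slow work
// asynchronously, outside the request thread.
@MessageDriven(mappedName = "jms/ExtractQueue")
public class ExtractProcessorBean implements MessageListener {

    @Override
    public void onMessage(Message message) {
        try {
            String filePath = ((TextMessage) message).getText();
            // invoke the back-end stored procedures, build the extract,
            // write it to the agreed file path, and record completion
            // (e.g. a status row the UI can poll for "ready to download")
        } catch (JMSException e) {
            throw new RuntimeException(e);
        }
    }
}
```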
As others suggest, it would be fairly trivial to have the upload servlet redirect the user to some ajax-enabled page that polls the backend for job completion.
If you're not in an EE environment, you could create a standalone (thread-pooled) application to consume from the queue and provide signalling, e.g. through the database (I assume the result goes in a DB anyway). The Spring framework provides very capable and extensive facilities for binding it all together.
But really, there are several free/open source EE containers available, from light weight up to enterprise, so there's no need to build the necessary stuff yourself.
Cheers,
It's very easy.
Have one thread in your servlet class.
Run the thread (the thread will extract the data, etc.).
After starting the thread, redirect the user to a page with auto-refresh or something similar to show how much of the extraction is done. (You mentioned that you have a way to find that out.)
If you can't use message-driven beans, you could have your servlet upload the data to a location on the filesystem and record a row in a DB table to say there's a job to be processed.
Then you have your standalone program polling for jobs, processing the data, and updating the DB row on completion (including reasons for failure, etc.).
Finally, you can poll the status of the job from the UI using an ajax request.
This allows the user to build up a queue of data jobs to be processed while they're doing something else.
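A bare-bones sketch of that standalone poller using plain JDBC (the table, columns, connection details, and 30-second interval are all assumptions for illustration):

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

// Standalone job processor: polls a jobs table, processes pending entries,
// and records success/failure so the UI can report status via an ajax call.
public class ExtractJobPoller {

    private static final String JDBC_URL = "jdbc:postgresql://localhost/reports"; // assumed

    public static void main(String[] args) throws Exception {
        while (true) {
            try (Connection con = DriverManager.getConnection(JDBC_URL, "app", "secret");
                 PreparedStatement ps = con.prepareStatement(
                         "SELECT id, file_path FROM extract_jobs WHERE status = 'PENDING'");
                 ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    long id = rs.getLong("id");
                    String filePath = rs.getString("file_path");
                    String newStatus = process(filePath) ? "DONE" : "FAILED";
                    try (PreparedStatement upd = con.prepareStatement(
                            "UPDATE extract_jobs SET status = ? WHERE id = ?")) {
                        upd.setString(1, newStatus);
                        upd.setLong(2, id);
                        upd.executeUpdate();
                    }
                }
            }
            Thread.sleep(30_000); // poll every 30 seconds
        }
    }

    private static boolean process(String filePath) {
        // placeholder: call the stored procedures and write the extract file
        return true;
    }
}
```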

Pragmatic way to push configuration changes to servers

The system contains an admin console and a cluster of working servers. Application state is stored in the database. From the admin console a user can add new jobs, monitor running jobs, etc. The working servers fetch jobs from the DB and process them.
Now, some configuration is stored in the database too. The configuration is also loaded on each working server and most of it is cached, as the configuration does not change frequently.
The admin is able to change the configuration (from the admin console). The change is stored in the database. What would be the best way to push changes to the working servers?
My ideas so far:
add triggers on the configuration table on update/delete/insert and update a timestamp in some aux table. Each working server checks this aux table for changes before accessing the cache.
CONS: still accessing the DB.
send a request from the admin console to all working servers saying that the configuration has changed and must be re-read from the DB on the next call.
CONS: introduces HTTP communication between the admin console and the servers (a new layer that didn't exist so far), and it's questionable how reliable that would be.
Any experience on this subject?
The first approach seems more like a quick hack. Checking the aux table timestamps almost defeats the purpose of a cache.
The second option seems to be the better one. This could be implemented as a simple task in the workers' task queue that updates their own configuration caches.
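As an illustration of that second option (the class and table names are hypothetical): each working server keeps its cached configuration behind a small holder that can be marked stale, and the "refresh" task posted by the admin console simply flips that flag so the next access reloads from the database.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical cached-configuration holder on each working server.
// A "configuration changed" task from the admin console calls markStale();
// the next read then reloads the values from the database.
public class ConfigurationCache {

    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final AtomicBoolean stale = new AtomicBoolean(true);

    public String get(String key) {
        if (stale.compareAndSet(true, false)) {
            reloadFromDatabase();
        }
        return cache.get(key);
    }

    /** Invoked when the admin console signals a configuration change. */
    public void markStale() {
        stale.set(true);
    }

    private void reloadFromDatabase() {
        // placeholder: SELECT key, value FROM configuration and repopulate the map
    }
}
```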
The main issue when changing the configuration of multiple services is their dependencies.
If a parent service started running with configuration incompatible with its child services, you might get a crash or undefined behaviour. To avoid this, synchronisation must be implemented.
One way is to let the parent service update its configuration and then issue configuration-update commands to its child services. After all child services are updated, the parent resumes processing. The advantage of this approach is that a simple management console only needs to instruct the parent services about configuration changes.
The other way is to let the management console handle the dependent services. It would send a command to the parent services to pause execution, update their config, and wait for a resume command. In the meantime it would update all child services and then instruct the parents to resume. This way service dependencies are more flexible and their configuration is decoupled from their implementation. It does require a more advanced administration tool.

Need help with java web app design to perform background tasks

I have a local web app that is installed on a desktop PC, and it needs to regularly sync with a remote server through web services.
I have a "transactions" table that stores transactions that have been processed locally and need to be sent to the remote server, and this table also contains transactions that have retrieved from the remote server (that have been processed remotely) and need to be peformed locally (they have been retrieved using a web service call)... The transactions are performed in time order to ensure they are processed in the right order.
An example of the type of transactions are "loans" and "returns" of items from a store, for example a video rental store. For example something may have been loaned locally and returned remotely or vice versa, or any sequence of loan/return events.
There is also other information that is retrieved from the remote server to update the local records.
When the user performs the tasks locally, I update the local db in real time and add the transaction to the table for background processing with the remote server.
What is the best approach for processing the background tasks? I have tried using a Thread that is created in an HttpSessionListener and calling interrupt() when the session is removed, but I don't think that is the safest approach. I have also tried using a session attribute as a locking mechanism, but that isn't the best approach either.
I was also wondering how you know when a thread has completed its run, so as to avoid launching another thread at the same time, or whether a thread has died before completing.
I have come across another suggestion, the Quartz scheduler, but I haven't read up on that approach in detail yet. I am going to purchase a copy of Java Concurrency in Practice, but I wanted some ideas about the best approach before I get stuck into it.
BTW I'm not using a web app framework.
Thanks.
The safest approach would be to create an application-wide thread pool which is managed by the container. How to do that depends on the container used. If your container doesn't support it (e.g. Tomcat) or you want to be container-independent, then the basic approach would be to implement ServletContextListener, create the thread pool on startup using the ExecutorService API available since Java 1.5, and shut the thread pool down on shutdown. If you aren't on Java 1.5 yet or want more abstraction, you can also use Spring's TaskExecutor.
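A minimal sketch of that listener approach (the class and attribute names are illustrative):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

import javax.servlet.ServletContextEvent;
import javax.servlet.ServletContextListener;

// Application-wide thread pool tied to the web app's lifecycle instead of to
// an HTTP session: created on startup, shut down cleanly on undeploy.
public class BackgroundSyncListener implements ServletContextListener {

    private ExecutorService executor;

    @Override
    public void contextInitialized(ServletContextEvent event) {
        executor = Executors.newSingleThreadExecutor();
        // expose the pool so servlets can submit sync tasks to it
        event.getServletContext().setAttribute("syncExecutor", executor);
    }

    @Override
    public void contextDestroyed(ServletContextEvent event) {
        executor.shutdown();
        try {
            executor.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

Register the listener in web.xml (or with @WebListener on Servlet 3.0), then fetch the executor from the ServletContext and submit tasks to it; a single-threaded executor also addresses the "don't launch two at once" concern, since queued tasks run one after another.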
There was once a Java EE proposal for concurrency utilities, but it has not made it into Java EE 6.
Related questions:
What is the recommend way of spawning threads from a servlet?
Background timer task in a JSP web application
It's better to go with the Quartz scheduling framework, because it has most of the features related to scheduling: it can store jobs in a database, handle concurrency, etc.
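A small sketch of what the recurring sync could look like with the Quartz 2.x API (the job name and five-minute interval are arbitrary choices):

```java
import org.quartz.DisallowConcurrentExecution;
import org.quartz.Job;
import org.quartz.JobBuilder;
import org.quartz.JobDetail;
import org.quartz.JobExecutionContext;
import org.quartz.Scheduler;
import org.quartz.SimpleScheduleBuilder;
import org.quartz.Trigger;
import org.quartz.TriggerBuilder;
import org.quartz.impl.StdSchedulerFactory;

// Quartz job that syncs pending transactions with the remote server.
// @DisallowConcurrentExecution prevents two runs of the same job overlapping.
@DisallowConcurrentExecution
public class SyncTransactionsJob implements Job {
    @Override
    public void execute(JobExecutionContext context) {
        // placeholder: read the local transactions table, call the web
        // services, and apply remote transactions locally in time order
    }
}

// Somewhere at application startup:
class SchedulerBootstrap {
    public static void main(String[] args) throws Exception {
        Scheduler scheduler = StdSchedulerFactory.getDefaultScheduler();

        JobDetail job = JobBuilder.newJob(SyncTransactionsJob.class)
                .withIdentity("syncTransactions")
                .build();

        Trigger trigger = TriggerBuilder.newTrigger()
                .withSchedule(SimpleScheduleBuilder.simpleSchedule()
                        .withIntervalInMinutes(5)   // arbitrary interval
                        .repeatForever())
                .build();

        scheduler.scheduleJob(job, trigger);
        scheduler.start();
    }
}
```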
Please try this solution:
Create a table which stores a flag like 'Y' or 'N' mapped to some identifiable field, with a default value of 'N'.
Schedule a job for each return at the time the loan itself is made, which executes only if the flag is 'Y'.
On return, change the flag to 'N', which then triggers the process you wanted to run.
