I have a Spring Boot app which has a scheduler that inserts data into a remote database at 2 a.m. every day.
@Scheduled(cron = "0 0 2 * * ?")
public void reportDataToDB() {
    // code omitted
}
The problem is, the app runs on multiple machines, so the database would receive multiple duplicate insertions of data.
What is the idiomatic way to solve this?
We solved such a problem by using a central scheduler. In our case we use Rundeck, which calls a URL on our service (going through the load balancer), which then executes the task (in our case, data cleanup). This way you can make sure that the logic is only executed on one instance of the service.
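As a minimal sketch of that setup (the endpoint path and class name are made up for illustration): drop the @Scheduled annotation and expose the task over HTTP, so only the single instance the load balancer routes the call to executes it.

import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class ReportController {

    // Rundeck (or any central scheduler) calls this URL through the load
    // balancer, so exactly one instance receives the request and runs the task.
    @PostMapping("/internal/report-data")
    public ResponseEntity<Void> reportDataToDB() {
        // code omitted - same logic as the former @Scheduled method
        return ResponseEntity.ok().build();
    }
}

Rundeck then only needs a job with the 2 a.m. cron that POSTs to this path through the load balancer.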
I am trying to update existing code that sends mdprovider requests to the metadata service to update or publish the metadata of an unpublished model, using parallel threads. My model has 1000 query subjects, and initially we validated them sequentially; it takes almost 4 hours to complete. Now I am trying to run it in 3 parallel threads, with the aim of bringing the time down.
I have used ExecutorService, created a fixed thread pool of 3, and submitted the tasks.
ExecutorService exec = Executors.newFixedThreadPool(3);
exec.submit(task);
Inside the run() method I connect to Cognos, log on, and call updateMetadata():
MetadataService_PortType mdService;

public void run() {
    cognosConnect();
    if (namespace.length() > 0) {
        login(namespace, userName, password);
    }
    // xml = the transaction XML is built here
    // calls the method below
    boolean testdblResult = validateQS(xml);
}

Boolean validateQS(String actionXml) {
    // actionXml: transaction XML to test a query subject
    // Cognos SDK method
    return mdService.updateMetadata(actionXml);
}
This executes successfully. But the problem is that although the 3 threads send requests to the Cognos SDK method mdService.updateMetadata() in parallel, the responses come back from the method sequentially. For example, say at the 10th second it sends requests for 3 query subject validations in parallel; the responses for those 3 query subjects then arrive at the 15th, 20th, and 24th second, one after another.
Is this the expected behaviour of Cognos? Does mdService.updateMetadata(actionXml) internally execute sequentially, or is there any other way to achieve parallelism here? I couldn't find much information in the SDK documentation.
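One thing I am going to rule out is whether the three threads sharing the single mdService stub (and with it a single session/connection) is what serializes the calls on the client side. A sketch of that experiment, under the assumption that cognosConnect() can be reworked to return a dedicated stub per call:

// assumption: cognosConnect() is changed to return a fresh, dedicated stub
private final ThreadLocal<MetadataService_PortType> mdServicePerThread =
        ThreadLocal.withInitial(() -> {
            MetadataService_PortType stub = cognosConnect();
            if (namespace.length() > 0) {
                login(namespace, userName, password);
            }
            return stub;
        });

Boolean validateQS(String actionXml) {
    // each thread now sends its request over its own connection
    return mdServicePerThread.get().updateMetadata(actionXml);
}

If the responses still arrive spaced out over separate connections, the serialization is happening on the Cognos server itself (e.g. a lock on the model during metadata updates), and client-side threading won't bring the time down.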
I have two scheduled component classes, each uploading files. I created an email-sending method for each of them in order to send myself a reminder email in case any upload exception happens.
The flow is like this:
Scheduler One --- if exception during uploading ---> sending a email after exception
Scheduler Two --- if exception during uploading ---> sending a email after exception
Now I want to upgrade it to:
Scheduler One + Scheduler Two --- if exception ---> sending one email after both schedulers
How can I do that?
Your use case sounds really odd. Schedulers run independently, so if you want to share information (that an exception was thrown) between the two, you have to store this information somewhere: an entry in a database, or a global variable held during runtime.
I would, however, suggest that you merge both of your schedulers into one. If they are not independent, why divide the code? It saves you from creating these hacks where the schedulers need to be connected.
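A minimal sketch of the merged variant (UploaderOne, UploaderTwo, and ReminderMailer are made-up names standing in for your two upload routines and your mail code):

import java.util.ArrayList;
import java.util.List;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;

@Component
public class CombinedUploadScheduler {

    private final UploaderOne uploaderOne;   // hypothetical collaborator
    private final UploaderTwo uploaderTwo;   // hypothetical collaborator
    private final ReminderMailer mailer;     // hypothetical collaborator

    public CombinedUploadScheduler(UploaderOne one, UploaderTwo two, ReminderMailer mailer) {
        this.uploaderOne = one;
        this.uploaderTwo = two;
        this.mailer = mailer;
    }

    @Scheduled(cron = "0 0 1 * * ?")
    public void uploadBoth() {
        List<Exception> failures = new ArrayList<>();
        try { uploaderOne.upload(); } catch (Exception e) { failures.add(e); }
        try { uploaderTwo.upload(); } catch (Exception e) { failures.add(e); }
        // one mail at the end, covering both uploads
        if (!failures.isEmpty()) {
            mailer.sendReminder(failures);
        }
    }
}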
I have push notifications being sent to Android and iOS applications through Spring Boot every day at 8 a.m. Europe/Paris.
If I run multiple instances, the notifications will be sent multiple times. I am thinking of recording the sent notifications in the database and checking them before sending, but I am worried it would still run multiple times. This is what I am doing:
@Component
public class ScheduledTasks {

    private static final Logger log = LoggerFactory.getLogger(ScheduledTasks.class);
    private static final SimpleDateFormat dateFormat = new SimpleDateFormat("HH:mm:ss");

    @Autowired
    private ExpoPushTokenRepository expoPushTokenRepository;

    @Autowired
    private ExpoPushNotificationService expoPushNotificationService;

    @Autowired
    private MessageSource messageSource;

    // TODO: if instances > 1, this will run multiple times; save the sent
    // notifications to the database and prevent multiple sending.
    @Scheduled(cron = "${cron.promotions.notification}", zone = "Europe/Paris")
    public void sendNewPromotionsNotification() {
        List<ExpoPushToken> expoPushTokenList = expoPushTokenRepository.findAll();
        ArrayList<NotifyRequest> notifyRequestList = new ArrayList<>();
        for (ExpoPushToken expoPushToken : expoPushTokenList) {
            NotifyRequest notifyRequest = new NotifyRequest(
                    expoPushToken.getToken(),
                    "This is a test title",
                    "This is a test subtitle",
                    "This is a test body"
            );
            notifyRequestList.add(notifyRequest);
        }
        expoPushNotificationService.sendPushNotificationToList(notifyRequestList);
        log.info("{} Sent push notifications to {} users",
                dateFormat.format(new Date()), expoPushTokenList.size());
    }
}
Does anybody have an idea on how I can prevent that safely?
Quartz would be my mostly database-agnostic solution for the task at hand, but was ruled out, so we are not going to discuss it.
The solution we are going to explore instead makes the following assumptions:
Postgres >= 9.5 is used (because we are going to use SKIP LOCKED, which was introduced in PostgreSQL 9.5).
It is okay to run a native query.
Under these conditions, we can retrieve batches of notifications from multiple running instances of the application with the following query:
SELECT * FROM expo_push_token FOR UPDATE SKIP LOCKED LIMIT 100;
This will retrieve and lock up to 100 entries from the table expo_push_token. If two instances of the application execute this query simultaneously, the received results will be disjoint. 100 is just some sample value. We may want to fine-tune this value for our use case. The locks stay active until the current transaction ends.
After an instance has fetched a batch of notifications, it also has to delete the entries it locked from the table, or otherwise mark them as processed (if we go down this route, we have to modify the query above to filter out already-processed entries), and close the current transaction to release the locks. Each instance of the application would then repeat this query until it returns zero entries.
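A minimal sketch of this first approach, assuming a Spring-managed EntityManager and a processed flag column (the column, class, and method names are illustrative):

import java.util.List;
import javax.persistence.EntityManager;
import javax.persistence.PersistenceContext;
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

@Service
public class NotificationBatchService {

    @PersistenceContext
    private EntityManager em;

    // Claims one batch inside its own transaction: rows locked by other
    // instances are skipped, and the locks are released on commit.
    @Transactional
    @SuppressWarnings("unchecked")
    public List<ExpoPushToken> claimBatch() {
        List<ExpoPushToken> batch = em.createNativeQuery(
                "SELECT * FROM expo_push_token WHERE processed = false "
                        + "FOR UPDATE SKIP LOCKED LIMIT 100", ExpoPushToken.class)
                .getResultList();
        for (ExpoPushToken token : batch) {
            token.setProcessed(true); // assumes a 'processed' flag on the entity
        }
        return batch; // send the notifications only after this commit
    }
}

Each instance would call claimBatch() in a loop, send the claimed batch after the transaction has committed, and stop once it gets back an empty list.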
There is also an alternative approach: an instance first fetches a batch of notifications to send, keeps the transaction to the database open (thus continuing to hold the locks), sends out its notifications, and then deletes/updates the entries and closes the transaction.
The two solutions have different strengths/weaknesses:
The first solution keeps the transaction short. But if the application crashes in the middle of sending out notifications, the part of its batch that was not sent out is lost in this run.
The second solution keeps the transaction open, possibly for a long time. If it crashes in the middle of sending out notifications, all its entries will be unlocked and its batch will be re-processed, possibly resulting in some notifications being sent out twice.
For this solution to work, we also need some kind of job that fills table expo_push_token with the data we need. This job should run beforehand, i.e. its execution should not overlap with the notification sending process.
I have a service which runs using an executor in Java. The main method of that service is as follows:
public void method() {
    // will get some records from the database
    List<Record> records = fetchRecords();
    // process the records one by one
    for (Record record : records) {
        process(record);
    }
}
For example, I have 100 records in my database; after 49 records were processed, I stopped my server. When I restart the server, it runs from the beginning again, i.e. from the 1st record.
Is there any way to start the service from the 50th record?
Possible solution:
Whenever the server starts, check how many records were processed in the previous run by looking into the database (by maintaining a flag). Once I find those records, I can skip them.
Is there any alternative to this, or any framework in Java which can handle it in a proper manner? Please correct me if my solution is not correct or if a better solution is available.
NOTE: Here we don't require any transaction management.
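A minimal sketch of that flag-based resume (the repository, flag column, and method names are illustrative, assuming a Spring Data style repository):

public void method() {
    // fetch only the records that were not processed in a previous run
    List<Record> records = repository.findByProcessedFalse();
    for (Record record : records) {
        process(record);
        // mark each record as done right away, so a restart resumes here
        record.setProcessed(true);
        repository.save(record);
    }
}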
I'm currently developing some web services in Java (& JPA with MySQL connection) that are being triggered by an SAP System.
To simplify my problem I'm referring the two crucial entities as BlogEntry and Comment. A BlogEntry can have multiple Comments. A Comment always belongs to exactly one BlogEntry.
So I have three Services (which I can't and don't want to redefine, since they're defined by the WSDL I exported from SAP and used parallel to communicate with other Systems): CreateBlogEntry, CreateComment, CreateCommentForUpcomingBlogEntry
They are being properly triggered and there's absolutely no problem with CreateBlogEntry or CreateComment when they're called separately.
But: the service CreateCommentForUpcomingBlogEntry sends the Comment and a "foreign key" to identify the "upcoming" BlogEntry. Internally it also calls CreateBlogEntry to create the actual BlogEntry. These two services are - due to their asynchronous nature - competing with each other.
So I have two options:
create a dummy BlogEntry and connect the Comment to it & update the BlogEntry, once CreateBlogEntry "arrives"
wait for CreateBlogEntry and connect the Comment afterwards to the new BlogEntry
Currently I'm trying the former but once both services are fully executed, I end up with two BlogEntries. One of them only has the ID delivered by CreateCommentForUpcomingBlogEntry but it is properly connected to the Comment (more the other way round). The other BlogEntry has all the other information (such as postDate or body), but the Comment isn't connected to it.
Here's the code snippet of the service implementation CreateCommentForUpcomingBlogEntry:
@EJB
private BlogEntryFacade blogEntryFacade;

@EJB
private CommentFacade commentFacade;

...

List<BlogEntry> blogEntries = blogEntryFacade.findById(request.getComment().getBlogEntryId().getValue());
BlogEntry persistBlogEntry;
if (blogEntries.isEmpty()) {
    persistBlogEntry = new BlogEntry();
    persistBlogEntry.setId(request.getComment().getBlogEntryId().getValue());
    blogEntryFacade.create(persistBlogEntry);
} else {
    persistBlogEntry = blogEntries.get(0);
}
Comment persistComment = new Comment();
persistComment.setId(request.getComment().getID().getValue());
persistComment.setBody(request.getComment().getBody().getValue());
/*
 * set other properties
 */
persistComment.setBlogEntry(persistBlogEntry);
commentFacade.create(persistComment);

...
And here's the code snippet of the implementation CreateBlogEntry:
@EJB
private BlogEntryFacade blogEntryFacade;

...

List<BlogEntry> blogEntries = blogEntryFacade.findById(request.getBlogEntry().getId().getValue());
BlogEntry persistBlogEntry;
boolean update = false;
if (blogEntries.isEmpty()) {
    persistBlogEntry = new BlogEntry();
} else {
    persistBlogEntry = blogEntries.get(0);
    update = true;
}
persistBlogEntry.setId(request.getBlogEntry().getId().getValue());
persistBlogEntry.setBody(request.getBlogEntry().getBody().getValue());
/*
 * set other properties
 */
if (update) {
    blogEntryFacade.edit(persistBlogEntry);
} else {
    blogEntryFacade.create(persistBlogEntry);
}

...
This is some fiddling that fails to make things happen as intended.
Sadly I haven't found a method to synchronize these simultaneous service calls. I could let the CreateCommentForUpcomingBlogEntry sleep for a few seconds but I don't think that's the proper way to do it.
Can I force each instance of my facades and their respective EntityManagers to reload their datasets? Can I put my requests in some sort of queue that is being emptied based on certain conditions?
So: what's the best practice to make it wait for the BlogEntry to exist?
Thanks in advance,
David
Info:
GlassFish Server 3.1.2
EclipseLink, version: Eclipse Persistence Services - 2.3.2.v20111125-r10461
If you are sure you are getting a CreateBlogEntry call, queue the CreateCommentForUpcomingBlogEntry calls and dequeue and process them once you receive the CreateBlogEntry call.
Since you are on an application server, for queues you can probably use JMS queues that autoflush to storage, or use a DB cache engine (Ehcache?), in case you receive a lot of calls or want to provide a recovery mechanism across restarts.
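A minimal in-memory sketch of that queueing idea (PendingCommentBuffer and its method names are made up; a JMS queue or a persistent table would replace the map if you need recovery across restarts):

import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ConcurrentMap;
import javax.ejb.Singleton;

@Singleton
public class PendingCommentBuffer {

    // comments waiting for their BlogEntry, keyed by the blog entry id
    private final ConcurrentMap<String, Queue<Comment>> pending =
            new ConcurrentHashMap<String, Queue<Comment>>();

    // called by CreateCommentForUpcomingBlogEntry when the BlogEntry does not exist yet
    public void park(String blogEntryId, Comment comment) {
        Queue<Comment> queue = pending.get(blogEntryId);
        if (queue == null) {
            Queue<Comment> fresh = new ConcurrentLinkedQueue<Comment>();
            queue = pending.putIfAbsent(blogEntryId, fresh);
            if (queue == null) {
                queue = fresh;
            }
        }
        queue.add(comment);
    }

    // called by CreateBlogEntry right after the new BlogEntry has been persisted
    public void flush(String blogEntryId, BlogEntry entry, CommentFacade commentFacade) {
        Queue<Comment> queued = pending.remove(blogEntryId);
        if (queued == null) {
            return;
        }
        for (Comment comment : queued) {
            comment.setBlogEntry(entry);
            commentFacade.create(comment);
        }
    }
}

CreateCommentForUpcomingBlogEntry would call park() when findById() comes back empty, and CreateBlogEntry would call flush() right after persisting the entry. Note that the in-memory variant loses parked comments on a crash, which is exactly where the JMS/persistent variant earns its keep.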