Avoiding multiple calls when consuming a web service - java

I have a task where a user consumes XML from a third party. The XML feed is only updated once a day. The XML is stored in a database and returned to the user when requested. If the XML is not in the database, then it is retrieved from the third party, stored in the database and returned to the user. All subsequent requests will simply read the XML from the database.
Now my question. Say it takes 10 seconds for the request to the third party to return. In this period, there are multiple server calls for the same data. I don't want each of these to fire off requests to the third party and I don't want the user to receive nothing or an error. They should probably wait for the first request to complete at which point the XML would be available. This is a relatively simple problem but I want to know what the best way of catering for it is.
Do I just use a simple flag to control requests, or maybe something like a semaphore? Are there better solutions based on the stack I intend to use, which is the Play framework with a Cassandra backend? Is there something I could do with callbacks or triggers?
By the way, I need to lazy load the data when the first request comes in. So, in this task it isn't an option to get the data in a separate process or when the app starts...
Thanks

All you need to do is create a separate component that is responsible for getting the XML from the third party and saving it to the database.
In your code, the various threads "fetch" the XML from this component.
This component returns the XML from the database if it exists. If it does not exist, it uses a ReentrantLock to synchronize.
Each thread calls tryLock() and only one of them succeeds. The rest fall through to lock() and block there. When the lock is released the other threads are unblocked, but by then the XML has already been fetched from the third party and stored in the database by the thread that first acquired the lock. So the other threads just return the XML from the DB.
Example code (this is just pseudocode to get you started. You should handle exceptions etc., but the main skeleton can be used. Do NOT forget to unlock in a finally block so that your code does not block indefinitely):
public String getXML() {
    String xml = getXMLFromDatabase();
    if (xml == null) {
        if (globalLock.tryLock()) {
            try {
                // Re-check: another thread may have stored the XML since our first read.
                xml = getXMLFromDatabase();
                if (xml == null) {
                    xml = getXMLFromThirdParty();
                    storeXMLToDatabase(xml);
                }
            } finally {
                globalLock.unlock(); // ok! got XML and stored in DB. Wake up others!
            }
        } else {
            // Another thread got the lock and will do the query. Just wait on the lock!
            globalLock.lock();
            try {
                // Woken up, but the XML is already fetched.
                xml = getXMLFromDatabase();
            } finally {
                globalLock.unlock();
            }
        }
    }
    return xml;
}
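To try the idea out without a real database or third party, here is a minimal self-contained sketch. It is only an illustration under assumptions: the class name XmlFeedCache, the in-memory "database" and the simulated slow fetch are all made up, and it uses a plain lock() with a second check under the lock rather than tryLock().

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-ins: the "database" is an in-memory reference and the
// "third party" is a slow method that counts how often it is called.
final class XmlFeedCache {
    private final ReentrantLock globalLock = new ReentrantLock();
    private final AtomicReference<String> database = new AtomicReference<>();
    private final AtomicInteger thirdPartyCalls = new AtomicInteger();

    String getXML() {
        String xml = database.get();
        if (xml != null) {
            return xml; // fast path: the XML is already stored
        }
        globalLock.lock(); // every caller that missed queues up here
        try {
            xml = database.get(); // re-check now that we hold the lock
            if (xml == null) {
                xml = fetchFromThirdParty();
                database.set(xml);
            }
            return xml;
        } finally {
            globalLock.unlock(); // always release, even if the fetch throws
        }
    }

    int thirdPartyCallCount() {
        return thirdPartyCalls.get();
    }

    private String fetchFromThirdParty() {
        thirdPartyCalls.incrementAndGet();
        try {
            Thread.sleep(200); // simulate a slow remote call
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "<feed/>";
    }
}
```

If several threads call getXML() at the same time, they should all receive the same XML while the simulated third party is contacted only once.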

Related

How to continuously receive and parse the JSON from REST API in spring boot

A remote server keeps producing data in JSON format. It exposes a REST API at
http://192.168.1.101:8000/v1/status, and I want to collect the data continuously in Spring Boot. Here is a possible JSON response from the REST API:
{
"run-status": 0,
"opr-mode": 0,
"ready": false,
"not-ready-reason": 1,
"alarms":["ps", "prm-switch"]
}
I want to keep collecting from the REST API, or subscribe to it so that whenever there is new JSON I collect it.
There are two main approaches to achieving what you are looking for:
Polling - If this service already exists and you do not have control
over the code, then this might be your only option. You repeatedly
poll the given URL to check whether the data has changed.
In Spring, you can use the @Scheduled annotation to poll
at any given frequency (using a cron expression or fixed delays).
https://www.baeldung.com/spring-scheduled-tasks explains in detail
how to create scheduled tasks.
Webhooks - If you have control over the server code, you can use
webhooks to notify subscribers about the availability of data. It is a
callback mechanism where the caller receives a notification about
data changes on the server, and the subscriber can then call the server to
fetch the data immediately.
More about polling and webhooks can be found at this URL: https://dzone.com/articles/webhooks-vs-polling-youre-better-than-this-1
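For the polling approach, here is a rough sketch in plain Java using ScheduledExecutorService instead of Spring's @Scheduled, so that it runs without any framework. The class name StatusPoller and the Supplier standing in for the HTTP call are assumptions for illustration; in a real application the supplier would perform a GET on the status URL.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical poller: "source" stands in for the HTTP GET against the REST API.
final class StatusPoller {
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();
    private final List<String> collected = new CopyOnWriteArrayList<>();
    private volatile String last;

    void start(Supplier<String> source, long delayMillis) {
        scheduler.scheduleWithFixedDelay(() -> {
            try {
                String body = source.get();
                // Keep a response only when it differs from the previous one.
                if (body != null && !body.equals(last)) {
                    last = body;
                    collected.add(body);
                }
            } catch (RuntimeException e) {
                // Swallow the error: an uncaught exception would cancel all future runs.
            }
        }, 0, delayMillis, TimeUnit.MILLISECONDS);
    }

    List<String> collected() {
        return collected;
    }

    void stop() {
        scheduler.shutdownNow();
    }
}
```

A fixed delay (rather than a fixed rate) is used here so that a slow response never causes polls to pile up.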
Write a "while" loop that calls your function and then sleeps (if needed) for the time you want.
Or just while (true) {}

Hold thread in spring rest request for long-polling

As the title says, in our project we need one thread to notify another, or to trigger a method on it. This is part of a long-polling implementation. Below I describe and show my implementation.
The requirements are:
UserX sends a request from the client to the server (poll action) as soon as he receives the response to the previous one. The service runs a Spring async method in which the thread immediately checks a cache for a sign that there is new data in the database. I know a cache is usually used for methods where a specific input yields a specific output. That is not the case here: I use the cache to reduce database calls, and the output of my method is always different. The cache simply stores a notification telling me whether I should check the database. The check runs in a while loop, which ends when the thread finds a notification in the cache or when the time expires.
Assume that the UserX thread (poll action) is currently in the while loop, checking the cache.
At that moment UserY (push action) sends some data to the server. The data is stored in the database in a separate thread, and the userId of the recipient is stored in the cache.
So when UserX checks the cache he finds the recipient id (which equals his own id in this case), breaks out of the loop and fetches the data.
In my implementation I use the Google Guava cache, which supports manual writes.
private static Cache<Long, Long> cache = CacheBuilder.newBuilder()
.maximumSize(100)
.expireAfterWrite(5, TimeUnit.MINUTES)
.build();
In the create method I store the id of the user who should read the data.
public void create(Data data) {
    dataRepository.save(data);
    // Guava's Cache uses put() and rejects null values, so store the id as the value too.
    cache.put(data.getRecipient(), data.getRecipient());
    System.out.println("SAVED " + data.getRecipient() + " in " + Thread.currentThread().getName());
}
And here is the method that polls for data:
@Async
public CompletableFuture<List<Data>> pollData(Long previousMessageId, Long userId) throws InterruptedException {
    // check the db first; if there is new data there is no need to enter the loop and wait
    List<Data> data = findRecent(previousMessageId, userId);
    // data not found, so wait in the loop for some time
    if (data.size() == 0) {
        short c = 0;
        while (c < 100) {
            // check whether new data has been added; if yes, break out of the loop
            if (cache.getIfPresent(userId) != null) {
                break;
            }
            c++;
            Thread.sleep(1000);
            System.out.println("SEQUENCE: " + c + " in " + Thread.currentThread().getName());
        }
        // check the database at the end of the loop or after breaking out of it
        data = findRecent(previousMessageId, userId);
    }
    // clear the notification for that recipient and return the result
    cache.invalidate(userId);
    return CompletableFuture.completedFuture(data);
}
After UserX gets the response he sends the poll request again, and the whole process repeats.
Can you tell me whether this design for long polling in Java (Spring) is correct, or whether a better way exists? The key point is that when a user sends a poll request, the request should be held for some time waiting for new data instead of responding immediately. The solution shown above works, but the question is whether it will also work for many users (1000+). I am worried because the paused threads could slow down other requests once no threads are available in the pool. Thanks in advance for your effort.
Look at WebSockets. Spring has supported them since version 4. They don't require the client to initiate polling; instead the server pushes data to the client in real time.
Check the links below:
https://spring.io/guides/gs/messaging-stomp-websocket/
http://www.baeldung.com/websockets-spring
Note - web sockets open a persistent connection between client and server and thus may use more resources when there is a large number of users. So, if you do not need real-time updates and can live with some delay, polling might be a better approach. Also, not all browsers support web sockets.
Web Sockets vs Interval Polling
Longpolling vs Websockets
In what situations would AJAX long/short polling be preferred over HTML5 WebSockets?
With your current approach, if you are concerned about the large number of threads running on the server for many users, you can instead trigger each poll from the front end. That way only short-lived request threads are started from the UI to look for an update in the cache. If there is an update, another call can be made to retrieve the data. However, don't hit the server every second as you are doing now, otherwise you will have high CPU utilization and user request threads may also suffer. You should tune your timing.
Instead of hitting the cache 100 times at 1-second intervals, you can apply a smarter algorithm by analyzing the pattern of cache/DB updates over a period of time.
Knowing the pattern, you can poll with exponential backoff so that you hit the cache when an update is most likely. This way you hit the cache less frequently and more accurately.
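As a rough illustration of the backoff part only (it does not analyze any update pattern), the delay between polls could be doubled after every empty poll and reset once data is found. The class name PollBackoff and the base and maximum delays are assumptions made up for this sketch.

```java
// Hypothetical helper: computes how long to wait before the next poll.
final class PollBackoff {
    private final long baseMillis;
    private final long maxMillis;
    private int emptyPolls = 0;

    PollBackoff(long baseMillis, long maxMillis) {
        this.baseMillis = baseMillis;
        this.maxMillis = maxMillis;
    }

    // Call after each poll with the outcome; returns the delay before the next poll.
    long nextDelayMillis(boolean foundData) {
        if (foundData) {
            emptyPolls = 0; // activity seen: go back to the shortest delay
            return baseMillis;
        }
        int shift = Math.min(emptyPolls, 20); // bound the shift to avoid overflow
        emptyPolls++;
        return Math.min(baseMillis << shift, maxMillis); // double each time, capped
    }
}
```

With a base of 1000 ms and a cap of 30000 ms, consecutive empty polls would wait 1000, 2000, 4000, 8000, 16000 and then 30000 ms.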

How to send emails from a Java EE Batch Job

I have a requirement to process a large list of users daily and send them email and SMS notifications based on some scenario. I am using the Java EE batch processing model for this. My job XML is as follows:
<step id="sendNotification">
<chunk item-count="10" retry-limit="3">
<reader ref="myItemReader"></reader>
<processor ref="myItemProcessor"></processor>
<writer ref="myItemWriter"></writer>
<retryable-exception-classes>
<include class="java.lang.IllegalArgumentException"/>
</retryable-exception-classes>
</chunk>
</step>
MyItemReader's open method reads all users from the database, and readItem() reads one user at a time using a list iterator. In myItemProcessor, the actual email notification is sent to the user, and then the users are persisted to the database in the myItemWriter class for that chunk.
import java.io.Serializable;
import java.util.Iterator;
import java.util.List;
import javax.batch.api.chunk.AbstractItemReader;
import javax.inject.Inject;
import javax.inject.Named;

@Named
public class MyItemReader extends AbstractItemReader {
    private Iterator<User> iterator = null;
    private User lastUser;
    @Inject
    private MyService service;

    @Override
    public void open(Serializable checkpoint) throws Exception {
        super.open(checkpoint);
        List<User> users = service.getUsers();
        iterator = users.iterator();
        if (checkpoint != null) {
            User checkpointUser = (User) checkpoint;
            System.out.println("Checkpoint Found: " + checkpointUser.getUserId());
            // advance the iterator past the user recorded in the checkpoint
            while (iterator.hasNext() && !iterator.next().getUserId().equals(checkpointUser.getUserId())) {
                System.out.println("skipping already read users ... ");
            }
        }
    }

    @Override
    public Object readItem() throws Exception {
        User user = null;
        if (iterator.hasNext()) {
            user = iterator.next();
            lastUser = user;
        }
        return user; // null signals the end of the input
    }

    @Override
    public Serializable checkpointInfo() throws Exception {
        return lastUser;
    }
}
My problem is that the checkpoint stores the last record that was processed in the previous chunk. If I have a chunk with the next 10 users and an exception is thrown in myItemProcessor for the 5th user, then on retry the whole chunk is executed and all 10 users are processed again. I don't want the notification to be sent again to the users that were already processed.
Is there a way to handle this? How should this be done efficiently?
Any help would be highly appreciated.
Thanks.
I'm going to build on the comments from @cheng. My credit to him here, and hopefully my answer provides additional value in organizing and presenting the options usefully.
Answer: Queue up messages for another MDB to get dispatched to send emails
Background:
As @cheng pointed out, a failure means the entire transaction is rolled back, and the checkpoint doesn't advance.
So how to deal with the fact that your chunk has sent emails to some users but not all? (You might say it rolled back but with "side effects".)
So we could restate your question then as: How to send email from a batch chunk step?
Well, assuming you had a way to send emails through a transactional API (implementing XAResource, etc.) you could use that API.
Assuming you don't, I would do a transactional write to a JMS queue, and then send the emails with a separate MDB (as @cheng suggested in one of his comments).
Suggested Alternative: Use ItemWriter to send messages to JMS queue, then use separate MDB to actually send the emails
With this approach you still gain efficiency by batching the processing and the updates to your DB (you were only sending the emails one at a time anyway), and you can benefit from simple checkpointing and restart without having to write complicated error handling.
This is also likely to be reusable as a pattern across batch jobs and outside of batch even.
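To make the transactional behaviour concrete, here is a small in-memory model of it. This is not JMS: the class TransactionalEmailQueue and its methods are invented for illustration, and a real setup would use a transacted JMS queue with an MDB as the consumer. The point it shows is that messages written during a chunk only become visible to the email sender when the chunk commits.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical in-memory model of a transacted queue (NOT the JMS API).
final class TransactionalEmailQueue {
    private final Queue<String> delivered = new ConcurrentLinkedQueue<>(); // visible to the consumer
    private final List<String> staged = new ArrayList<>();                 // pending in the current chunk

    // Called by the writer for each user in the chunk.
    void send(String userId) {
        staged.add(userId);
    }

    // Chunk transaction committed: the staged messages become deliverable.
    void commit() {
        delivered.addAll(staged);
        staged.clear();
    }

    // Chunk transaction rolled back: the staged messages are discarded.
    void rollback() {
        staged.clear();
    }

    // Called by the consumer (the MDB's role); returns null when nothing is waiting.
    String poll() {
        return delivered.poll();
    }
}
```

If a chunk fails and is retried, the consumer never sees the messages from the failed attempt, so no user is emailed twice because of the retry.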
Other alternatives
Some other ideas that I don't think are as good, listed for the sake of discussion:
Add batch application logic tracking users emailed (with ItemProcessListener)
You could build your own list of either/both successful/failed emails using the ItemProcessListener methods: afterProcess and onProcessError.
On restart, then, you could know which users had been emailed in the current chunk, which we are re-positioned to since the entire chunk rolled back, even though some emails have already been sent.
This certainly complicates your batch logic, and you also have to persist this success or failure list somehow. Plus this approach is probably highly specific to this job (as opposed to queuing up for an MDB to process).
But it's simpler in that you have a single batch job without the need for a messaging provider and a separate app component.
If you go this route you might want to use a combination of both a skippable and a "no-rollback" retryable exception.
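A minimal sketch of the tracking idea, independent of the batch API: keep a record of users already emailed and skip them when the chunk is retried. The class NotificationLedger is hypothetical, and the set is held in memory here, whereas a real job would have to persist it as noted above.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

// Hypothetical ledger of users who have already been emailed.
final class NotificationLedger {
    private final Set<String> emailed = ConcurrentHashMap.newKeySet();

    // Sends via "sender" unless this user was already emailed.
    // Returns true if an email was sent by this call, false if it was skipped.
    boolean sendOnce(String userId, Consumer<String> sender) {
        if (emailed.contains(userId)) {
            return false; // already notified in an earlier attempt
        }
        sender.accept(userId); // may throw; then the user is NOT recorded
        emailed.add(userId);   // record only after a successful send
        return true;
    }
}
```

When a chunk is retried after a failure, the users emailed before the failure are skipped and only the remaining ones are sent.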
single-item chunk
If you define your chunk with item-count="1", then you avoid complicated checkpointing and error handling code. You sacrifice efficiency though, so this would only make sense if the other aspects of batch were very compelling: e.g. scheduling and management of jobs through a common interface, or the ability to restart at the failing step within a job.
If you were to go this route, you might want to consider defining socket and timeout exceptions as "no-rollback" exceptions (using ) since there's nothing to be gained from rolling back, and you might want to retry on a network timeout issue.
Since you specifically mentioned efficiency, I'm guessing this is a bad fit for you.
use a Transaction Synchronization
This could work perhaps, but the batch API doesn't especially make this easy, and you still could have a case where the chunk completes but one or more email sends fail.
Your current item processor is doing something outside the chunk transaction scope, which has caused the application state to be out of sync. If your requirement is to send out emails only after all items in a chunk have successfully completed, then you can move the emailing part to an ItemWriteListener.afterWrite(items).

To get database updates using servlets or jsp

What I want is to get database updates.
That is, if any change occurs in the database or a new record is inserted, the user should be notified.
Up to now, what I have implemented uses jQuery as shown below:
$(document).ready(function() {
var updateInterval = setInterval(function() {
$('#chat').load('Db.jsp?elect=<%=emesg%>');
},1000);
});
It worked fine for me, but my teacher told me that it's not a good way to do it and recommended using Comet or long polling instead.
Can anyone give me examples of getting database updates using Comet or long polling
in servlets/JSP? I'm using Tomcat as the server.
Just taking a shot in the dark since I don't know your exact environment... You could have the database trigger fire a call to a servlet each time a row is committed which would then run some code that looked like the following:
Get the script sessions that are active for the page that we want to update. This eliminates the need to check every reverse ajax script session that is running on the site. Once we have the script sessions we can use the second code block to take some data and update a table on the client side. All that the second code section does is send javascript to the client to be executed via the reverse ajax connection that is open.
String page = ServerContextFactory.get().getContextPath() + "/reverseajax/clock.html";
Browser.withPage(page, new Runnable() {
public void run() {
Util.setValue("clockDisplay", output);
}
});
// Creates a new Person bean.
Person person = new Person(true);
// Creates a multi-dimensional array, containing a row and the rows column data.
String[][] data = {
{person.getId(), person.getName(), person.getAddress(), person.getAge()+"", person.isSuperhero()+""}
};
// Call DWR's util which adds rows into a table. peopleTable is the id of the tbody and
// data contains the row/column data.
Util.addRows("peopleTable", data);
Note that both of the above sections of code are pulled straight from the documentation examples at http://directwebremoting.org/dwr-demo/. These are only simple examples of how reverse ajax can send data to the client, but your exact situation seems to be more dependent on how you receive the notification than how you update the client screen.
Without some type of database notification to the java code I think you will have to poll the system at set intervals. You could make the system a little more efficient even when polling by verifying that there are reverse ajax script sessions active for the page before polling the database for info.

How to synchronize concurring Web Service calls in Java

I'm currently developing some web services in Java (& JPA with MySQL connection) that are being triggered by an SAP System.
To simplify my problem I'm referring the two crucial entities as BlogEntry and Comment. A BlogEntry can have multiple Comments. A Comment always belongs to exactly one BlogEntry.
So I have three Services (which I can't and don't want to redefine, since they're defined by the WSDL I exported from SAP and used parallel to communicate with other Systems): CreateBlogEntry, CreateComment, CreateCommentForUpcomingBlogEntry
They are being properly triggered and there's absolutely no problem with CreateBlogEntry or CreateComment when they're called separately.
But: The service CreateCommentForUpcomingBlogEntry sends the Comment and a "foreign key" to identify the "upcoming" BlogEntry. Internally it also calls CreateBlogEntry to create the actual BlogEntry. These two services run concurrently, due to their asynchronous nature.
So I have two options:
create a dummy BlogEntry and connect the Comment to it & update the BlogEntry, once CreateBlogEntry "arrives"
wait for CreateBlogEntry and connect the Comment afterwards to the new BlogEntry
Currently I'm trying the former but once both services are fully executed, I end up with two BlogEntries. One of them only has the ID delivered by CreateCommentForUpcomingBlogEntry but it is properly connected to the Comment (more the other way round). The other BlogEntry has all the other information (such as postDate or body), but the Comment isn't connected to it.
Here's the code snippet of the service implementation CreateCommentForUpcomingBlogEntry:
@EJB
private BlogEntryFacade blogEntryFacade;
@EJB
private CommentFacade commentFacade;
...
List<BlogEntry> blogEntries = blogEntryFacade.findById(request.getComment().getBlogEntryId().getValue());
BlogEntry persistBlogEntry;
if (blogEntries.isEmpty()) {
persistBlogEntry = new BlogEntry();
persistBlogEntry.setId(request.getComment().getBlogEntryId().getValue());
blogEntryFacade.create(persistBlogEntry);
} else {
persistBlogEntry = blogEntries.get(0);
}
Comment persistComment = new Comment();
persistComment.setId(request.getComment().getID().getValue());
persistComment.setBody(request.getComment().getBody().getValue());
/*
set other properties
*/
persistComment.setBlogEntry(persistBlogEntry);
commentFacade.create(persistComment);
...
And here's the code snippet of the implementation CreateBlogEntry:
@EJB
private BlogEntryFacade blogEntryFacade;
...
List<BlogEntry> blogEntries = blogEntryFacade.findById(request.getBlogEntry().getId().getValue());
BlogEntry persistBlogEntry;
Boolean update = false;
if (blogEntries.isEmpty()) {
persistBlogEntry = new BlogEntry();
} else {
persistBlogEntry = blogEntries.get(0);
update = true;
}
persistBlogEntry.setId(request.getBlogEntry().getId().getValue());
persistBlogEntry.setBody(request.getBlogEntry().getBody().getValue());
/*
set other properties
*/
if (update) {
blogEntryFacade.edit(persistBlogEntry);
} else {
blogEntryFacade.create(persistBlogEntry);
}
...
This is some fiddling that fails to make things happen as supposed.
Sadly I haven't found a method to synchronize these simultaneous service calls. I could let the CreateCommentForUpcomingBlogEntry sleep for a few seconds but I don't think that's the proper way to do it.
Can I force each instance of my facades and their respective EntityManagers to reload their datasets? Can I put my requests in some sort of queue that is being emptied based on certain conditions?
So: What's the best practice to make it wait for the BlogEntry to exist?
Thanks in advance,
David
Info:
GlassFish Server 3.1.2
EclipseLink, version: Eclipse Persistence Services - 2.3.2.v20111125-r10461
If you are sure you are getting a CreateBlogEntry call, queue the CreateCommentForUpcomingBlogEntry calls and dequeue and process them once you receive the CreateBlogEntry call.
Since you are on an application server, for queues, you can probably use JMS queues that autoflush to storage or use the DB cache engine (Ehcache ?), in case you receive a lot of calls or want to provide a recovery mechanism across restarts.
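The queue-and-dequeue idea can be sketched in memory as follows. Everything here is a simplification for illustration: PendingComments and its method names are hypothetical, ids and comments are plain strings, and a real implementation would use a durable queue (such as JMS) so that pending comments survive a restart.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory holder for comments whose BlogEntry has not arrived yet.
final class PendingComments {
    private final Set<String> existingEntries = new HashSet<>();
    private final Map<String, List<String>> waiting = new HashMap<>();
    private final Map<String, List<String>> attached = new HashMap<>();

    // A comment arrives: attach it if the entry exists, otherwise queue it.
    synchronized void onCreateComment(String blogEntryId, String comment) {
        Map<String, List<String>> target =
                existingEntries.contains(blogEntryId) ? attached : waiting;
        target.computeIfAbsent(blogEntryId, k -> new ArrayList<>()).add(comment);
    }

    // The entry arrives: record it and drain any comments queued for it.
    synchronized void onCreateBlogEntry(String blogEntryId) {
        existingEntries.add(blogEntryId);
        List<String> queued = waiting.remove(blogEntryId);
        if (queued != null) {
            attached.computeIfAbsent(blogEntryId, k -> new ArrayList<>()).addAll(queued);
        }
    }

    // Comments currently connected to the given entry (a copy).
    synchronized List<String> commentsOf(String blogEntryId) {
        return new ArrayList<>(attached.getOrDefault(blogEntryId, List.of()));
    }
}
```

Because both handlers synchronize on the same object, it does not matter which of the two calls arrives first: no dummy BlogEntry is needed and the comment ends up on the single real entry.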
