I'm currently developing some web services in Java (using JPA with a MySQL connection) that are triggered by an SAP system.
To simplify my problem, I'm referring to the two crucial entities as BlogEntry and Comment. A BlogEntry can have multiple Comments. A Comment always belongs to exactly one BlogEntry.
So I have three services (which I can't and don't want to redefine, since they're defined by the WSDL I exported from SAP and used in parallel to communicate with other systems): CreateBlogEntry, CreateComment, CreateCommentForUpcomingBlogEntry
They are being triggered properly, and there's absolutely no problem with CreateBlogEntry or CreateComment when they're called separately.
But: the service CreateCommentForUpcomingBlogEntry sends the Comment and a "foreign key" to identify the "upcoming" BlogEntry. Internally it also calls CreateBlogEntry to create the actual BlogEntry. Due to their asynchronous nature, these two service calls race each other.
So I have two options:
create a dummy BlogEntry, connect the Comment to it, and update the BlogEntry once CreateBlogEntry "arrives"
wait for CreateBlogEntry and connect the Comment to the new BlogEntry afterwards
Currently I'm trying the former, but once both services have fully executed, I end up with two BlogEntries. One of them has only the ID delivered by CreateCommentForUpcomingBlogEntry, but the Comment is properly connected to it (or rather, the other way round: the Comment points to it). The other BlogEntry has all the other information (such as postDate or body), but the Comment isn't connected to it.
Here's the code snippet of the service implementation CreateCommentForUpcomingBlogEntry:
@EJB
private BlogEntryFacade blogEntryFacade;
@EJB
private CommentFacade commentFacade;
...
// Look up the referenced BlogEntry; if it hasn't arrived yet, create a stub carrying only its ID.
List<BlogEntry> blogEntries = blogEntryFacade.findById(request.getComment().getBlogEntryId().getValue());
BlogEntry persistBlogEntry;
if (blogEntries.isEmpty()) {
    persistBlogEntry = new BlogEntry();
    persistBlogEntry.setId(request.getComment().getBlogEntryId().getValue());
    blogEntryFacade.create(persistBlogEntry);
} else {
    persistBlogEntry = blogEntries.get(0);
}
// Create the Comment and attach it to whichever BlogEntry was found or just created.
Comment persistComment = new Comment();
persistComment.setId(request.getComment().getID().getValue());
persistComment.setBody(request.getComment().getBody().getValue());
/*
set other properties
*/
persistComment.setBlogEntry(persistBlogEntry);
commentFacade.create(persistComment);
...
And here's the code snippet of the implementation CreateBlogEntry:
@EJB
private BlogEntryFacade blogEntryFacade;
...
// Look up the BlogEntry by ID: update the stub if one already exists, otherwise create it.
List<BlogEntry> blogEntries = blogEntryFacade.findById(request.getBlogEntry().getId().getValue());
BlogEntry persistBlogEntry;
boolean update = false;
if (blogEntries.isEmpty()) {
    persistBlogEntry = new BlogEntry();
} else {
    persistBlogEntry = blogEntries.get(0);
    update = true;
}
persistBlogEntry.setId(request.getBlogEntry().getId().getValue());
persistBlogEntry.setBody(request.getBlogEntry().getBody().getValue());
/*
set other properties
*/
if (update) {
    blogEntryFacade.edit(persistBlogEntry);
} else {
    blogEntryFacade.create(persistBlogEntry);
}
...
This is a workaround, and it fails to make things happen the way they're supposed to.
Sadly, I haven't found a way to synchronize these simultaneous service calls. I could let CreateCommentForUpcomingBlogEntry sleep for a few seconds, but I don't think that's the proper way to do it.
Can I force each instance of my facades and their respective EntityManagers to reload their data sets? Can I put my requests in some sort of queue that is emptied based on certain conditions?
So: what's the best practice to make the Comment wait for the BlogEntry to exist?
Thanks in advance,
David
Info:
GlassFish Server 3.1.2
EclipseLink, version: Eclipse Persistence Services - 2.3.2.v20111125-r10461
If you are sure you are getting a CreateBlogEntry call, queue the CreateCommentForUpcomingBlogEntry calls and dequeue and process them once you receive the CreateBlogEntry call.
Since you are on an application server, you can probably use JMS queues that autoflush to storage, or a DB/cache engine (Ehcache?), in case you receive a lot of calls or want to provide a recovery mechanism across restarts.
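If you go the JMS route, here is a minimal sketch of what the queuing could look like with the JMS 1.1 API available on GlassFish 3.1.2. The queue and connection-factory JNDI names, the blogEntryId selector property and the park/dequeueFor helpers are illustrative assumptions, not part of the original services: CreateCommentForUpcomingBlogEntry would call park(...) instead of creating a dummy BlogEntry, and CreateBlogEntry would call dequeueFor(...) right after persisting the real entry.
```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

import javax.annotation.Resource;
import javax.ejb.Stateless;
import javax.jms.*;

// Hypothetical helper that parks Comments whose BlogEntry hasn't arrived yet and
// hands them back once CreateBlogEntry shows up. All JNDI names are illustrative.
@Stateless
public class PendingCommentQueue {

    @Resource(lookup = "jms/PendingCommentsFactory")
    private ConnectionFactory connectionFactory;

    @Resource(lookup = "jms/PendingComments")
    private Queue pendingComments;

    /** Called from CreateCommentForUpcomingBlogEntry when the BlogEntry doesn't exist yet. */
    public void park(Serializable commentDto, String blogEntryId) throws JMSException {
        Connection connection = connectionFactory.createConnection();
        try {
            Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
            ObjectMessage message = session.createObjectMessage(commentDto);
            // Property used as a message selector so CreateBlogEntry only dequeues "its" comments.
            message.setStringProperty("blogEntryId", blogEntryId);
            session.createProducer(pendingComments).send(message);
        } finally {
            connection.close();
        }
    }

    /** Called from CreateBlogEntry right after the real BlogEntry has been persisted. */
    public List<Serializable> dequeueFor(String blogEntryId) throws JMSException {
        List<Serializable> parked = new ArrayList<Serializable>();
        Connection connection = connectionFactory.createConnection();
        try {
            connection.start(); // required before any receive call
            Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
            MessageConsumer consumer =
                    session.createConsumer(pendingComments, "blogEntryId = '" + blogEntryId + "'");
            Message pending;
            while ((pending = consumer.receiveNoWait()) != null) {
                parked.add(((ObjectMessage) pending).getObject());
            }
        } finally {
            connection.close();
        }
        return parked;
    }
}
```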
Related
I am trying to use spring-kafka 1.3.x (1.3.3 and 1.3.4). What is not clear is whether there is a safe way to consume messages in batch without skipping a message (or set of messages) when an exception occurs, e.g. a network outage. My preference is also to leverage the container's capabilities as much as possible, so as to remain within the Spring framework rather than trying to create a custom framework for dealing with this challenge.
I am setting the following properties onto a ConcurrentMessageListenerContainer :
.setAckOnError(false);
.setAckMode(AckMode.MANUAL);
I am also setting the following kafka specific consumer properties:
enable.auto.commit=false
auto.offset.reset=earliest
If I set a RetryTemplate, I get a ClassCastException, since retry only works for non-batch consumers. The documentation states that retry is not available for batch listeners, so this may be OK.
I then setup a consumer such as this one:
```java
@KafkaListener(containerFactory = "containerFactory",
        groupId = "myGroup",
        topics = "myTopic")
public void onMessage(@Payload List<Entries> batchedData,
        @Header(required = false,
                value = KafkaHeaders.OFFSET) List<Long> offsets,
        Acknowledgment ack) {
    log.info("Working on: {}", offsets);
    int x = 1;
    if (x == 1) {
        log.info("Failure on: {}", offsets);
        throw new RuntimeException("mock failure");
    }
    // do nothing else for now
    // unreachable code
    ack.acknowledge();
}
```
When I send a message into the system to trigger the mock exception above, the only visible action is that the listener reports the exception.
When I send another (new) message into the system, the container consumes the new message. The old message is skipped, since the offset has advanced to the next offset.
Since I have asked the container not to acknowledge (directly or indirectly), and since there are no other properties I can see that would tell the container not to advance, I am confused about why the container does advance.
I noticed that for a similar situation the recommendation is to upgrade to 2.1.x and use the container-stop capability that was added to the ContainerAwareErrorHandler there.
But what if you are stuck on 1.3.x for the time being: is there a way, or a property I'm missing, to ensure the container does not advance to the next message or batch of messages?
I can see an option to build a custom framework around the consumer to achieve the desired effect, but are there other options that are simpler and more Spring-friendly?
Thoughts?
From @garyrussell (spring-kafka GitHub project):
The offset has not been committed but the broker won't send the data again. You have to re-seek the topics/partitions.
2.1 provides the SeekToCurrentBatchErrorHandler which will re-seek automatically for you.
2.0 Added consumer-aware listeners, giving you access to the consumer (for seeking) in the listener.
With 1.3.x you have to implement ConsumerSeekAware and perform the seeks yourself (in the listener after catching the exception). Save off the ConsumerSeekCallback in a ThreadLocal.
You will need to add the partitions to your method signature; then seek to the lowest offset in the list for each partition.
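For 1.3.x, a rough sketch of that workaround, based on the batch listener from the question (the process(...) helper, the header lists and the error handling here are illustrative assumptions, not a drop-in implementation):
```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import org.apache.kafka.common.TopicPartition;
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.kafka.listener.ConsumerSeekAware;
import org.springframework.kafka.support.KafkaHeaders;
import org.springframework.messaging.handler.annotation.Header;
import org.springframework.messaging.handler.annotation.Payload;

// Sketch of the 1.3.x approach: implement ConsumerSeekAware, stash the callback in a
// ThreadLocal, and on failure seek every partition of the batch back to its lowest offset
// so the next poll redelivers the whole batch. "Entries" is the question's payload type.
public class BatchSeekingListener implements ConsumerSeekAware {

    private static final ThreadLocal<ConsumerSeekAware.ConsumerSeekCallback> seekCallback =
            new ThreadLocal<>();

    @Override
    public void registerSeekCallback(ConsumerSeekCallback callback) {
        seekCallback.set(callback); // called once per consumer thread
    }

    @Override
    public void onPartitionsAssigned(Map<TopicPartition, Long> assignments, ConsumerSeekCallback callback) {
        // not needed for this sketch
    }

    @Override
    public void onIdleContainer(Map<TopicPartition, Long> assignments, ConsumerSeekCallback callback) {
        // not needed for this sketch
    }

    @KafkaListener(containerFactory = "containerFactory", groupId = "myGroup", topics = "myTopic")
    public void onMessage(@Payload List<Entries> batchedData,
            @Header(KafkaHeaders.RECEIVED_TOPIC) List<String> topics,
            @Header(KafkaHeaders.RECEIVED_PARTITION_ID) List<Integer> partitions,
            @Header(KafkaHeaders.OFFSET) List<Long> offsets) {
        try {
            process(batchedData); // the actual work
        } catch (RuntimeException e) {
            // Determine the lowest failed offset per partition and re-seek to it.
            Map<TopicPartition, Long> lowest = new HashMap<>();
            for (int i = 0; i < offsets.size(); i++) {
                TopicPartition tp = new TopicPartition(topics.get(i), partitions.get(i));
                lowest.merge(tp, offsets.get(i), Math::min);
            }
            lowest.forEach((tp, offset) -> seekCallback.get().seek(tp.topic(), tp.partition(), offset));
            throw e; // rethrow so nothing gets acknowledged/committed
        }
    }

    private void process(List<Entries> batch) { /* ... */ }
}
```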
I'd like to listen on a websocket using akka streams. That is, I'd like to treat it as nothing but a Source.
However, all official examples treat the websocket connection as a Flow.
My current approach is to use webSocketClientFlow in combination with a Source.maybe. This eventually results in the upstream failing with a TcpIdleTimeoutException when no new messages are sent down the stream.
Therefore, my question is twofold:
Is there a way (which I obviously missed) to treat a websocket as just a Source?
If using the Flow is the only option, how does one handle the TcpIdleTimeoutException properly? The exception cannot be handled by providing a stream supervision strategy. Restarting the source using a RestartSource doesn't help either, because the source is not the problem.
Update
So I tried two different approaches, setting the idle timeout to 1 second for convenience
application.conf
akka.http.client.idle-timeout = 1s
Using keepAlive (as suggested by Stefano)
Source.<Message>maybe()
.keepAlive(Duration.apply(1, "second"), () -> (Message) TextMessage.create("keepalive"))
.viaMat(Http.get(system).webSocketClientFlow(WebSocketRequest.create(websocketUri)), Keep.right())
{ ... }
When doing this, the Upstream still fails with a TcpIdleTimeoutException.
Using RestartFlow
However, I found out about this approach, using a RestartFlow:
final Flow<Message, Message, NotUsed> restartWebsocketFlow = RestartFlow.withBackoff(
        Duration.apply(3, TimeUnit.SECONDS),
        Duration.apply(30, TimeUnit.SECONDS),
        0.2,
        () -> createWebsocketFlow(system, websocketUri)
);

Source.<Message>maybe()
    .viaMat(restartWebsocketFlow, Keep.right()) // One can treat this part of the resulting graph as a `Source<Message, NotUsed>`
{ ... }

(...)

private Flow<Message, Message, CompletionStage<WebSocketUpgradeResponse>> createWebsocketFlow(final ActorSystem system, final String websocketUri) {
    return Http.get(system).webSocketClientFlow(WebSocketRequest.create(websocketUri));
}
This works in that I can treat the websocket as a Source (although artificially, as explained by Stefano) and keep the TCP connection alive by restarting the webSocketClientFlow whenever an exception occurs.
This doesn't feel like the optimal solution though.
No. WebSocket is a bidirectional channel, and Akka-HTTP therefore models it as a Flow. If in your specific case you care only about one side of the channel, it's up to you to form a Flow with a "muted" side, by using either Flow.fromSinkAndSource(Sink.ignore, mySource) or Flow.fromSinkAndSource(mySink, Source.maybe), depending on the case.
as per the documentation:
Inactive WebSocket connections will be dropped according to the idle-timeout settings. In case you need to keep inactive connections alive, you can either tweak your idle-timeout or inject ‘keep-alive’ messages regularly.
There is an ad-hoc combinator to inject keep-alive messages, see the example below and this Akka cookbook recipe. NB: this should happen on the client side.
src.keepAlive(1.second, () => TextMessage.Strict("ping"))
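Putting the two suggestions together for the Java DSL used in the question, a listen-only client could look roughly like this. The URI, the ping interval and the printing Sink are placeholders; this is a sketch, not the only way to wire it up:
```java
import java.util.concurrent.CompletionStage;
import java.util.concurrent.TimeUnit;

import akka.Done;
import akka.NotUsed;
import akka.actor.ActorSystem;
import akka.http.javadsl.Http;
import akka.http.javadsl.model.ws.Message;
import akka.http.javadsl.model.ws.TextMessage;
import akka.http.javadsl.model.ws.WebSocketRequest;
import akka.stream.ActorMaterializer;
import akka.stream.javadsl.Flow;
import akka.stream.javadsl.Sink;
import akka.stream.javadsl.Source;
import scala.concurrent.duration.Duration;

// Listen-only WebSocket client: the sending side is "muted" with Source.maybe() plus
// keep-alive pings, the receiving side is an ordinary Sink.
public class ListenOnlyWebSocket {
    public static void main(String[] args) {
        final ActorSystem system = ActorSystem.create();
        final ActorMaterializer materializer = ActorMaterializer.create(system);

        final Sink<Message, CompletionStage<Done>> receiveSide =
                Sink.foreach(message -> System.out.println("Received: " + message));

        // Never emits "real" messages, but injects pings so the idle timeout never fires.
        final Source<Message, ?> mutedSendSide = Source.<Message>maybe()
                .keepAlive(Duration.apply(10, TimeUnit.SECONDS),
                        () -> (Message) TextMessage.create("keepalive"));

        final Flow<Message, Message, NotUsed> clientSide =
                Flow.fromSinkAndSource(receiveSide, mutedSendSide);

        // Joining the bidirectional WebSocket flow with the muted client side effectively
        // turns the connection into a Source of incoming messages.
        Http.get(system)
                .webSocketClientFlow(WebSocketRequest.create("ws://example.com/feed"))
                .join(clientSide)
                .run(materializer);
    }
}
```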
I hope I understand your question correctly. Are you looking for asSourceOf?
path("measurements") {
entity(asSourceOf[Measurement]) { measurements =>
// measurement has type Source[Measurement, NotUsed]
...
}
}
I have a requirement to process a large number of users daily to send them email and SMS notifications based on some scenario. I am using the Java EE batch processing model for this. My job XML is as follows:
<step id="sendNotification">
<chunk item-count="10" retry-limit="3">
<reader ref="myItemReader"></reader>
<processor ref="myItemProcessor"></processor>
<writer ref="myItemWriter"></writer>
<retryable-exception-classes>
<include class="java.lang.IllegalArgumentException"/>
</retryable-exception-classes>
</chunk>
</step>
MyItemReader's open method reads all users from the database, and readItem() reads one user at a time using a list iterator. In MyItemProcessor, the actual email notification is sent to the user, and then the users of that chunk are persisted to the database in the MyItemWriter class.
@Named
public class MyItemReader extends AbstractItemReader {

    private Iterator<User> iterator = null;
    private User lastUser;

    @Inject
    private MyService service;

    @Override
    public void open(Serializable checkpoint) throws Exception {
        super.open(checkpoint);
        List<User> users = service.getUsers();
        iterator = users.iterator();
        if (checkpoint != null) {
            User checkpointUser = (User) checkpoint;
            System.out.println("Checkpoint Found: " + checkpointUser.getUserId());
            // Skip forward to the last user processed before the restart.
            while (iterator.hasNext() && !iterator.next().getUserId().equals(checkpointUser.getUserId())) {
                System.out.println("skipping already read users ... ");
            }
        }
    }

    @Override
    public Object readItem() throws Exception {
        User user = null;
        if (iterator.hasNext()) {
            user = iterator.next();
            lastUser = user;
        }
        return user;
    }

    @Override
    public Serializable checkpointInfo() throws Exception {
        return lastUser;
    }
}
My problem is that the checkpoint stores the last record that was executed in the previous chunk. If I have a chunk with the next 10 users and an exception is thrown in MyItemProcessor for the 5th user, then on retry the whole chunk will be re-executed and all 10 users will be processed again. I don't want a notification to be sent again to the already processed users.
Is there a way to handle this? How should this be done efficiently?
Any help would be highly appreciated.
Thanks.
I'm going to build on the comments from @cheng. Credit to him here; hopefully my answer provides additional value in organizing and presenting the options usefully.
Answer: Queue up messages for another MDB to get dispatched to send emails
Background:
As @cheng pointed out, a failure means the entire transaction is rolled back, and the checkpoint doesn't advance.
So how to deal with the fact that your chunk has sent emails to some users but not all? (You might say it rolled back but with "side effects".)
So we could restate your question then as: How to send email from a batch chunk step?
Well, assuming you had a way to send emails through a transactional API (implementing XAResource, etc.), you could use that API.
Assuming you don't, I would do a transactional write to a JMS queue, and then send the emails with a separate MDB (as @cheng suggested in one of his comments).
Suggested Alternative: Use ItemWriter to send messages to JMS queue, then use separate MDB to actually send the emails
With this approach you still gain efficiency by batching the processing and the updates to your DB (you were only sending the emails one at a time anyway), and you can benefit from simple checkpointing and restart without having to write complicated error handling.
This is also likely to be reusable as a pattern across batch jobs and outside of batch even.
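A hedged sketch of such a writer is below. The JNDI names, the assumption that User is Serializable, and sending one message per user are illustrative choices; a separate MDB listening on the same queue would do the actual email/SMS sending.
```java
import java.io.Serializable;
import java.util.List;

import javax.annotation.Resource;
import javax.batch.api.chunk.AbstractItemWriter;
import javax.inject.Named;
import javax.jms.*;

// Sketch: instead of sending emails in the processor, enqueue one JMS message per user.
// With an XA-capable JMS provider the sends join the chunk transaction, so a rollback
// also "unsends" the pending notifications.
@Named
public class MyItemWriter extends AbstractItemWriter {

    @Resource(lookup = "jms/NotificationFactory")
    private ConnectionFactory connectionFactory;

    @Resource(lookup = "jms/NotificationQueue")
    private Queue notificationQueue;

    @Override
    public void writeItems(List<Object> items) throws Exception {
        Connection connection = connectionFactory.createConnection();
        try {
            Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
            MessageProducer producer = session.createProducer(notificationQueue);
            for (Object item : items) {
                User user = (User) item;
                // ... persist the user via JPA here, as before ...
                producer.send(session.createObjectMessage((Serializable) user));
            }
        } finally {
            connection.close();
        }
    }
}
```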
Other alternatives
Some other ideas that I don't think are as good, listed for the sake of discussion:
Add batch application logic tracking users emailed (with ItemProcessListener)
You could build your own list of either/both successful/failed emails using the ItemProcessListener methods: afterProcess and onProcessError.
On restart, then, you would know which users in the current chunk had already been emailed; the chunk is re-positioned to its beginning since the entire chunk rolled back, even though some emails have already been sent.
This certainly complicates your batch logic, and you also have to persist this success or failure list somehow. Plus this approach is probably highly specific to this job (as opposed to queuing up for an MDB to process).
But it's simpler in that you have a single batch job without the need for a messaging provider and a separate app component.
If you go this route you might want to use a combination of both a skippable and a "no-rollback" retryable exception.
single-item chunk
If you define your chunk with item-count="1", then you avoid complicated checkpointing and error-handling code. You sacrifice efficiency, though, so this would only make sense if the other aspects of batch were very compelling: e.g., scheduling and management of jobs through a common interface, or the ability to restart at the failing step within a job.
If you were to go this route, you might want to consider defining socket and timeout exceptions as "no-rollback" exceptions (using `<no-rollback-exception-classes>`), since there's nothing to be gained from rolling back, and you might want to retry on a network timeout issue.
Since you specifically mentioned efficiency, I'm guessing this is a bad fit for you.
use a Transaction Synchronization
This could work perhaps, but the batch API doesn't especially make this easy, and you still could have a case where the chunk completes but one or more email sends fail.
Your current item processor is doing something outside the chunk transaction scope, which has caused the application state to be out of sync. If your requirement is to send out emails only after all items in a chunk have successfully completed, then you can move the emailing part to an ItemWriteListener.afterWrite(items) callback.
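A minimal sketch of that listener, assuming a hypothetical EmailService and that the listener is registered on the step via the job XML's <listeners> element:
```java
import java.util.List;

import javax.batch.api.chunk.listener.ItemWriteListener;
import javax.inject.Inject;
import javax.inject.Named;

// Sketch: send the notifications only after the writer has completed for the chunk,
// i.e. after the chunk's database work succeeded. "EmailService" is hypothetical.
@Named
public class NotificationWriteListener implements ItemWriteListener {

    @Inject
    private EmailService emailService;

    @Override
    public void beforeWrite(List<Object> items) throws Exception {
        // nothing to do before the write
    }

    @Override
    public void afterWrite(List<Object> items) throws Exception {
        // Only reached when the writer completed without error for this chunk.
        for (Object item : items) {
            emailService.sendNotification((User) item);
        }
    }

    @Override
    public void onWriteError(List<Object> items, Exception ex) throws Exception {
        // No emails for a failed chunk; the chunk will be retried or the job will fail.
    }
}
```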
I do not want to block threads in my application, so I am wondering: are calls to the Google Datastore async? For example, the docs show something like this to retrieve an entity:
// Key employeeKey = ...;
LookupRequest request = LookupRequest.newBuilder().addKey(employeeKey).build();
LookupResponse response = datastore.lookup(request);
if (response.getMissingCount() == 1) {
    throw new RuntimeException("entity not found");
}
Entity employee = response.getFound(0).getEntity();
This does not look like an async call to me, so is it possible to make async calls to the Datastore in Java? I noticed App Engine has some libraries for async calls in its Java API, but I am not using App Engine; I will be calling the Datastore from my own instances. Also, if there is an async library, can I test it against my local server? (For example, with App Engine's async library I could not find a way to point it at my local server, since the library can't read my environment variables.)
In your shoes, I'd give a try to Spotify's open-source Asynchronous Google Datastore Client -- I have not personally tried it, but it appears to meet all of your requirements, including being able to test on your local server. Please give it a try and let us all know how well it meets your needs, so we can all benefit and learn -- thanks!
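If you'd rather not adopt another client immediately, a stopgap is to push the blocking lookup onto your own executor with a CompletableFuture. This is not true asynchronous I/O (a worker thread still blocks per request); the Datastore/LookupRequest types are the ones from the snippet in the question, with their imports omitted here:
```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Stopgap sketch: run the blocking lookup on a dedicated pool so the calling thread
// isn't blocked. The pool size is arbitrary; unlike a genuinely async client, this
// still ties up one worker thread per in-flight request.
public class AsyncLookup {

    private final ExecutorService datastorePool = Executors.newFixedThreadPool(16);

    public CompletableFuture<Entity> lookupAsync(final Datastore datastore, final Key employeeKey) {
        return CompletableFuture.supplyAsync(() -> {
            LookupRequest request = LookupRequest.newBuilder().addKey(employeeKey).build();
            final LookupResponse response;
            try {
                response = datastore.lookup(request);
            } catch (Exception e) { // covers a checked exception, if the client declares one
                throw new RuntimeException("lookup failed", e);
            }
            if (response.getMissingCount() == 1) {
                throw new RuntimeException("entity not found");
            }
            return response.getFound(0).getEntity();
        }, datastorePool);
    }
}
```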
Other than bare-bones autogen stuff, I cannot find any documentation or samples that show how to actually use the ReplicatedRepositoryBuilder in Carbonado. In particular, the coding pattern to wire up on both the master and replica process sides.
I am using BDB-JE 5.0.58 with Carbonado 1.2.3. Here is the typical doc I found.
Questions, assuming I have a master producing process and a second client replica process which continually reads from the one-way replicated repository side:
are both processes supposed to have an instance of ReplicatedRepositoryBuilder with roles somehow reversed?
can either side execute resync(), even though I only want that driven from the master side?
how is replication being done, given these are simple library databases? Is there a listener under the covers on each end, doing what amounts to an in-process thread replaying changes? Is this going on at both ends?
is the replica client side supposed to do Repository actions through ReplicatedRepositoryBuilder.getReplicaRepositoryBuilder().build(), or just a normal BDBRepositoryBuilder.build()? I don't see how it is supposed to get a replication-friendly replica Repository handle to work from.
I would be ecstatic with just a simple Java code example of both the producer and consumer sides, as they would look sitting in different processes on localhost, showing a make-change => resync => other-side-sees-it sequence.
// master producing process
RepositoryBuilder masterBuilder = new BDBRepositoryBuilder(); // ... set some props
RepositoryBuilder replicaBuilder = new BDBRepositoryBuilder(); // ... set some props
ReplicatedRepositoryBuilder builder = new ReplicatedRepositoryBuilder();
builder.setMaster(true);
builder.setMasterRepositoryBuilder(masterBuilder);
builder.setReplicaRepositoryBuilder(replicaBuilder);
Repository repository = builder.build(); // master CRUD and query done via this handle
... do some CRUD to 'repository' ...
ResyncCapability capability = repository.getCapability(ResyncCapability.class);
capability.resync(MyStorableRecord.class, 1.0, null);
// client reader process
// ?