Design for scalable periodic queue message batching


We currently have a distributed setup where events are published to SQS, and an application running on multiple hosts drains messages from the queue, transforms them, and transmits them to interested parties. I have a use case where the receiving endpoint has scalability concerns with the message volume, so we would like to batch these messages periodically (say every 15 minutes) in the application before sending them.
The incoming message rate is around 200 messages per second and each message is no more than 10 KB. The system need not be real time, although that would be good to have, and order is not important (it's okay if a batch containing older messages gets sent first).
One approach that I can think of is maintaining an embedded database within the application (on each host) that batches the events, plus another thread that runs periodically and clears the data.
Another approach could be to create timestamped buckets in a distributed key-value store (S3, DynamoDB, etc.), where we write each message to the correct bucket based on the message's timestamp and periodically clear the buckets.
We can run into several issues here: since the messages may arrive out of order, a bucket might already have been cleared (this can be solved by having a default bucket, though), and we would need to decide accurately when to clear a bucket.
The way I see it, at least two components would be required: one that batches into temporary storage and another that clears it.
Any feedback on the above approaches would help. Also, this looks like a common problem; are there any existing solutions that I can leverage?
Thanks
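For scale: 200 messages per second over 15 minutes is 200 × 900 = 180,000 messages, or at most about 1.8 GB at 10 KB each, spread across all hosts. A minimal sketch of the "batch in the application, flush periodically" idea could look like the following. The class name PeriodicBatcher, the sink callback, and maxBatchSize are assumptions made for illustration; the buffer is purely in memory, so it is not a substitute for the embedded database mentioned above if messages must survive a crash.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Hypothetical in-memory batcher: buffers messages and hands them downstream in batches. */
class PeriodicBatcher<T> {
    private final BlockingQueue<T> buffer = new LinkedBlockingQueue<>();
    private final Consumer<List<T>> sink; // assumed downstream sender
    private final int maxBatchSize;

    PeriodicBatcher(Consumer<List<T>> sink, int maxBatchSize) {
        this.sink = sink;
        this.maxBatchSize = maxBatchSize;
    }

    /** Called by the queue-draining threads for every transformed message. */
    void add(T message) {
        buffer.add(message);
    }

    /** Drains everything buffered so far and sends it in chunks of at most maxBatchSize. */
    void flush() {
        List<T> batch = new ArrayList<>();
        while (buffer.drainTo(batch, maxBatchSize) > 0) {
            sink.accept(batch);
            batch = new ArrayList<>();
        }
    }

    /** Schedules flush() at a fixed period, e.g. start(15, TimeUnit.MINUTES). */
    ScheduledExecutorService start(long period, TimeUnit unit) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(this::flush, period, period, unit);
        return scheduler;
    }
}
```

Two caveats with this sketch: scheduleAtFixedRate stops rescheduling a task once it throws, so a real sink should catch and handle its own errors, and messages should only be deleted from the source queue after the batch containing them has been sent.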

Related

Replacing a scheduled task with Spring Events

In my Spring Boot app, customers can submit files. Each customer's files are merged together by a scheduled task that runs every minute. The fact that the merging is performed by a scheduler has a number of drawbacks, e.g. it's difficult to write end-to-end tests, because in the test you have to wait for the scheduler to run before retrieving the result of the merge.
Because of this, I would like to use an event-based approach instead, i.e.
Customer submits a file
An event is published that contains this customer's ID
The merging service listens for these events and performs a merge operation for the customer in the event object
This would have the advantage of triggering the merge operation immediately after there is a file available to merge.
However, there are a number of problems with this approach which I would like some help with
Concurrency
The merging is a reasonably expensive operation. It can take up to 20 seconds, depending on how many files are involved. Therefore the merging will have to happen asynchronously, i.e. not as part of the same thread which publishes the merge event. Also, I don't want to perform multiple merge operations for the same customer concurrently in order to avoid the following scenario
Customer1 saves file2 triggering a merge operation2 for file1 and file2
A very short time later, customer1 saves file3 triggering merge operation3 for file1, file2, and file3
Merge operation3 completes saving merge-file3
Merge operation2 completes overwriting merge-file3 with merge-file2
To avoid this, I plan to process merge operations for the same customer in sequence using locks in the event listener, e.g.
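import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;
import org.springframework.context.ApplicationListener;
import org.springframework.stereotype.Component;

@Component
public class MergeEventListener implements ApplicationListener<MergeEvent> {
    // One lock per customer so merges for the same customer run one at a time
    private final ConcurrentMap<String, Lock> customerLocks = new ConcurrentHashMap<>();

    @Override
    public void onApplicationEvent(MergeEvent event) {
        var customerId = event.getCustomerId();
        var customerLock = customerLocks.computeIfAbsent(customerId, key -> new ReentrantLock());
        customerLock.lock();
        try {
            mergeFileForCustomer(customerId);
        } finally {
            // Release the lock even if the merge throws
            customerLock.unlock();
        }
    }
    private void mergeFileForCustomer(String customerId) {
        // implementation omitted
    }
}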
Fault-Tolerance
How do I recover if for example the application shuts down in the middle of a merge operation or an error occurs during a merge operation?
One of the advantages of the scheduled approach is that it contains an implicit retry mechanism, because every time it runs it looks for customers with unmerged files.
Summary
I suspect my proposed solution may be re-implementing (badly) an existing technology for this type of problem, e.g. JMS. Is my proposed solution advisable, or should I use something like JMS instead? The application is hosted on Azure, so I can use any services it offers.
If my solution is advisable, how should I deal with fault-tolerance?

Regarding the concurrency part, I think the approach with locks would work fine, if the number of files submitted per customer (on a given timeframe) is small enough.
You can eventually monitor over time the number of threads waiting for the lock to see if there is a lot of contention. If there is, then maybe you can accumulate a number of merge events (on a specific timeframe) and then run a parallel merge operation, which in fact leads to a solution similar to the one with the scheduler.
In terms of fault-tolerance, an approach based on a message queue would work (haven't worked with JMS but I see it's an implementation of a message-queue).
I would go with a cloud-based message queue (SQS, for example) simply for reliability. The approach would be:
Push merge events into the queue
The merging service scans one event at a time and it starts the merge job
When the merge job is finished, the message is removed from the queue
That way, if something goes wrong during the merge process, the message stays in the queue and it will be read again when the app is restarted.
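The "remove the message only after the merge succeeds" behaviour can be sketched without any broker at all. MergeQueue below is a hypothetical in-memory stand-in, not the SQS API; a real queue service would add visibility timeouts and redelivery on top of the same idea, and this sketch assumes a single consumer thread.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.function.Consumer;

/** Hypothetical stand-in for a broker queue holding customer ids that need a merge. */
class MergeQueue {
    private final Queue<String> pending = new ConcurrentLinkedQueue<>();

    void push(String customerId) {
        pending.add(customerId);
    }

    int size() {
        return pending.size();
    }

    /**
     * Looks at the head message, runs the merge job, and removes the message only
     * if the job returned normally. Returns true if a message was processed.
     * Assumes a single consumer thread calls this method.
     */
    boolean processNext(Consumer<String> mergeJob) {
        String customerId = pending.peek();
        if (customerId == null) {
            return false; // nothing to do
        }
        try {
            mergeJob.accept(customerId);
        } catch (RuntimeException e) {
            return false; // message stays queued and will be read again
        }
        pending.poll(); // merge finished, so the message can go
        return true;
    }
}
```

If the process dies between the merge finishing and the message being removed, the same message is processed again after restart, so the merge itself should be safe to repeat.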

My thoughts around this matter after some considerations.
I restricted possible solutions to what's available from Azure managed services, according to specifications from OP.
Azure Blob Storage Function Trigger
Because this issue is about storing files, let's start by exploring Blob Storage with a trigger function that fires on file creation. According to the docs, Azure Functions can run up to 230 seconds and have a default retry count of 5.
However, this solution requires that files from a single customer arrive in a manner that will not cause concurrency issues, so let's leave this solution for now.
Azure Queue Storage
Does not guarantee first-in-first-out (FIFO) ordered delivery, hence it does not meet the requirements.
Storage queues and Service Bus queues - compared and contrasted: https://learn.microsoft.com/en-us/azure/service-bus-messaging/service-bus-azure-and-service-bus-queues-compared-contrasted
Azure Service Bus
Azure Service Bus is a FIFO queue, and seems to meet the requirements.
https://learn.microsoft.com/en-us/azure/service-bus-messaging/service-bus-azure-and-service-bus-queues-compared-contrasted#compare-storage-queues-and-service-bus-queues
From doc above, we see that large files are not suited as message payload. To solve this, files may be stored in Azure Blob Storage, and message will contain info where to find the file.
With Azure Service Bus and Azure Blob Storage selected, let's discuss implementation caveats.
Queue Producer
On AWS, the solution for the producer side would have been like this:
Dedicated end-point provides pre-signed URL to customer app
Customer app uploads file to S3
Lambda triggered by S3 object creation inserts message to queue
Unfortunately, Azure doesn't have a pre-signed URL equivalent yet (it has Shared Access Signatures, which are not the same), so file uploads must go through an endpoint which in turn stores the file in Azure Blob Storage. Since a file upload endpoint is required anyway, it seems appropriate to let it also be responsible for inserting messages into the queue.
Queue Consumer
Because file merging takes a significant amount of time (~20 secs), it should be possible to scale out the consumer side. With multiple consumers, we'll have to make sure that a single customer is processed by no more than one consumer instance.
This can be solved by using message sessions: https://learn.microsoft.com/en-us/azure/service-bus-messaging/message-sessions
In order to achieve fault tolerance, the consumer should use peek-lock (as opposed to receive-and-delete) during the file merge and mark the message as completed when the file merge is completed. When the message is marked as completed, the consumer may be responsible for removing superfluous files in Blob Storage.
Possible problems with both existing solution and future solution
If customer A starts uploading a huge file #1 and immediately after that starts uploading a small file #2, the upload of file #2 may be completed before file #1 and cause an out-of-order situation.
I assume that this is an issue that is solved in existing solution by using some kind of locking mechanism or file name convention.

Spring Boot with Kafka can solve your fault-tolerance problem.
Kafka supports the producer-consumer model. Have the customer events posted to Kafka by a producer.
Configure Kafka with replication so that no events are lost.
Use consumers that invoke the merging service for each event.
Once the consumer has read the event for a customerId and completed the merge, commit the offset.
If a failure occurs partway through the merge, the offset is not committed, so the event is read again when the application restarts.
If the merging service can detect a duplicate event from its data, reprocessing the same message should not cause any issue. Committing the offset only after processing gives at-least-once delivery rather than single delivery, so duplicate detection is the safety check for an event that was fully processed but whose offset commit failed.
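The duplicate check can be sketched independently of Kafka. In the hypothetical IdempotentMergeHandler below, eventId is an assumed unique identifier carried by each event, and the in-memory set stands in for what would need to be durable storage (a database table, for instance) in a real deployment.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

/** Hypothetical wrapper that makes a merge call safe to invoke twice for the same event. */
class IdempotentMergeHandler {
    // Assumption: in production this would be durable, not an in-memory set
    private final Set<String> processedEventIds = ConcurrentHashMap.newKeySet();
    private final Consumer<String> mergeService;

    IdempotentMergeHandler(Consumer<String> mergeService) {
        this.mergeService = mergeService;
    }

    /** Returns true if the merge ran, false if the event was already processed. */
    boolean handle(String eventId, String customerId) {
        if (!processedEventIds.add(eventId)) {
            return false; // redelivered event, skip it
        }
        try {
            mergeService.accept(customerId);
            return true;
        } catch (RuntimeException e) {
            // Forget the id so a later redelivery can retry the merge
            processedEventIds.remove(eventId);
            throw e;
        }
    }
}
```

The consumer would call handle(...) for every record and commit the offset afterwards, whether the call returned true or false.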

First, an event-based approach is correct for this scenario. You should use an external broker for pub-sub event messages.
Note that, by default, Spring event publishing is synchronous.
Suppose you have the following services:
App Service
Merge Service
CDC Service (change data capture)
Broker Service (Kafka, RabbitMQ, ...)
The main flow, based on the "Outbox Pattern":
App Service saves the event message to the Outbox message table
CDC Service watches the Outbox table and publishes event messages from it to the Broker Service
Merge Service subscribes to the Broker Service and receives the event messages (messages are ordered)
Merge Service performs the merge action
You can use the Eventuate library for this flow.
Furthermore, you can apply DDD to your architecture, using the Axon framework for the CQRS pattern to publish domain events and process them.
Refer to:
Outbox pattern: https://microservices.io/patterns/data/transactional-outbox.html

It really sounds like you may do with a Stream or an ETL tool for the job. When you are developing an app, and you have some prioritisation/queuing/batching requirement, it is easy to see how you can build a solution with a Cron + SQL Database, with maybe a queue to decouple doing work from producing work.
This may very well be the easiest thing to build as you have a lot of granularity and control to this approach. If you believe that you can in fact meet your requirements this way fairly quickly with low risk, you can do so.
There are software components which are more tailored to these tasks, but they do have some learning curves, and depend on what PaaS or cloud you may be using. You'll get monitoring, scalability, availability, and resiliency out of the box. An open-source or cloud service will take the burden of management off your hands.
What to use will also depend on your priorities and requirements. If you want to go the ETL approach, which is great at banking up jobs, you might want to use something like a Glue t. If you want prioritization functionality, you may want to use multiple queues; it really depends. You'll also want to monitor with a dashboard to see what wait time you should have for your merge, regardless of the approach.

Mechanism for reading all messages from a queue in one request

I need a solution for the following scenario which is similar to a queue:
I want to write messages to a queue continuously. My message is very big, containing a lot of data so I do want to make as few requests as possible.
So my queue will contain a lot of messages at some point.
My Consumer will read from the queue every 1 hour. (not whenever a new message is written) and it will read all the messages from the queue.
The problem is that I need a way to read ALL the messages from the queue using only one call (I also want the consumer to make as few requests to the queue as possible).
A close solution would be ActiveMQ but the problem is that you can only read one message at a time and I need to read them all in one request.
So my question is.. Would there be other ways of doing this more efficiently? The actual thing that I need is to persist in some way messages created continuously by some application and then consume them (also delete them) by the same application all at once, every 1 hour.
The reason I thought a queue would be fit is because as the messages are consumed they are also deleted but I need to consume them all at once.

I think there's some important things to keep in mind as you're searching for a solution:
In what way do you need to be "more efficient" (e.g. time, monetary cost, computing resources, etc.)?
It's incredibly hard to prove that there are, in fact, no other "more efficient" ways to solve a particular problem, as that would require one to test all possible solutions. What you really need to know is, given your specific use-case, what solution is good enough. This, of course, requires knowing specifically what kind of performance numbers you need and the constraints on acquiring those numbers (e.g. time, monetary cost, computing resources, etc.).
Modern message broker clients (e.g. those shipped with either ActiveMQ 5.x or ActiveMQ Artemis) don't make a network round-trip for every message they consume as that would be extremely inefficient. Rather, they fetch blocks of messages in configurable sizes (e.g. prefetchSize for ActiveMQ 5.x, and consumerWindowSize for ActiveMQ Artemis). Those messages are stored locally in a buffer of sorts and fed to the client application when the relevant API calls are made to receive a message.
Making "as few requests as possible" is rarely a way to increase performance. Modern message brokers scale well with concurrent consumers. Consuming all the messages with a single consumer drastically limits the message throughput as compared to spinning up multiple threads which each have their own consumer. Rather than limiting the number of consumer requests you should almost certainly be maximizing them until you reach a point of diminishing returns.
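To illustrate the prefetch point, here is a toy model rather than the ActiveMQ client itself: FakeBroker and PrefetchingConsumer are made-up classes, and the prefetchSize field only mirrors the idea behind the real setting. The application still calls receive() once per message, but the number of round trips to the broker is governed by the block size.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical broker that hands out messages in blocks and counts round trips. */
class FakeBroker {
    private final Deque<String> stored = new ArrayDeque<>();
    int roundTrips = 0;

    void publish(String message) {
        stored.addLast(message);
    }

    /** One simulated network round trip returning up to max messages. */
    List<String> fetch(int max) {
        roundTrips++;
        List<String> block = new ArrayList<>();
        while (block.size() < max && !stored.isEmpty()) {
            block.add(stored.pollFirst());
        }
        return block;
    }
}

/** Hypothetical client that serves receive() calls from a locally buffered block. */
class PrefetchingConsumer {
    private final FakeBroker broker;
    private final int prefetchSize;
    private final Deque<String> localBuffer = new ArrayDeque<>();

    PrefetchingConsumer(FakeBroker broker, int prefetchSize) {
        this.broker = broker;
        this.prefetchSize = prefetchSize;
    }

    /** Returns the next message, or null when the broker has none left. */
    String receive() {
        if (localBuffer.isEmpty()) {
            localBuffer.addAll(broker.fetch(prefetchSize));
        }
        return localBuffer.pollFirst();
    }
}
```

With ten published messages and a prefetch size of five, ten receive() calls cost two round trips in this model.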

Large number of single threaded task queues

At our company we have a server that is distributed across a few instances. The server handles user requests. Requests from different users can be processed in parallel. Requests from the same user must be executed strictly sequentially, but they can arrive at different instances due to load balancing. Currently we use Redis-based distributed locks, but this is error-prone and requires more work on concurrency than on business logic.
What I want is something like this (more like a concept):
Distinct queue for each user
Queue is named after user id
Each requests identified by request id
Imagine two requests from the same user arriving at two different instances concurrently:
1) Each instance puts its request id into this user's queue.
2) Additionally, they both store their request ids locally.
3) Then some broker takes the request id from the top of "some_user_queue" and moves it into "some_user_queue_processing".
4) Both instances listen on "some_user_queue_processing". They peek into it and see whether it holds the request id they stored locally. If yes, they do the processing. If not, they ignore it and wait.
5) When the work is done, the server deletes this id from "some_user_queue_processing".
6) Then step 3 again.
And all of this happens concurrently for a lot (thousands of them) of different users (and their queues).
Now, I know this sounds a lot like actors, but:
We need solution requiring as small changes as possible to make fast transition from locks. Akka will force us to rewrite almost everything from scratch.
We need production ready solution. Quasar sounds good, but is not production ready yet (more correctly, their Galaxy cluster).
Management at my work is very conservative; they simply don't want another dependency which we'll need to support. But we already use Redis (for distributed locks), so I thought maybe it could help with this too.
Thanks

The best solution that matches the description of your problem is Redis Cluster.
Basically, the cluster solves your concurrency problem, in the following way:
Two (or more) requests from the same user, will always go to the same instance, assuming that you use the user-id as a key and the request as a value. The value must be actually a list of requests. When you receive one, you will append it to that list. In other words, that is your queue of requests (a single one for every user).
This matching is made possible by the design of the cluster implementation, which is based on a range of hash slots spread over all the instances.
When a set command is executed, the cluster performs a hashing operation on the key, which yields the hash slot to write to; that slot is located on a specific instance. The cluster finds the instance that owns the right range and then performs the write.
Likewise, when a get is performed, the cluster follows the same procedure: it finds the instance that contains the key and then gets the value.
The transition from locks is very easy to perform, because you only need to have the instances ready (with the cluster-enabled directive set to "yes") and then run the cluster-create command from the redis-trib.rb script.
I've worked last summer with the cluster in a production environment and it behaved very well.
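For reference, the key-to-slot mapping is CRC16 of the key modulo 16384, where the CRC is the XMODEM variant, and only the part between the first "{" and the following "}" is hashed when that part is non-empty. A small sketch of that calculation follows; the class name RedisHashSlot and the example key names are just illustrations.

```java
import java.nio.charset.StandardCharsets;

/** Sketch of the Redis Cluster key-to-slot calculation (CRC16/XMODEM mod 16384). */
class RedisHashSlot {
    static final int SLOT_COUNT = 16384;

    /** Bitwise CRC16 with polynomial 0x1021 and initial value 0. */
    static int crc16(byte[] data) {
        int crc = 0;
        for (byte b : data) {
            crc ^= (b & 0xFF) << 8;
            for (int i = 0; i < 8; i++) {
                crc = ((crc & 0x8000) != 0) ? ((crc << 1) ^ 0x1021) : (crc << 1);
                crc &= 0xFFFF;
            }
        }
        return crc;
    }

    /** Applies the hash-tag rule, then maps the key to one of the 16384 slots. */
    static int slotFor(String key) {
        int open = key.indexOf('{');
        if (open >= 0) {
            int close = key.indexOf('}', open + 1);
            if (close > open + 1) {
                // Non-empty tag: only the text between the braces is hashed
                key = key.substring(open + 1, close);
            }
        }
        return crc16(key.getBytes(StandardCharsets.UTF_8)) % SLOT_COUNT;
    }
}
```

So keys such as "{user42}:queue" and "{user42}:processing" land in the same slot, and therefore on the same instance, which is what keeps one user's data together.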

Parallelism and Failover of a Sequential Data

Good time guys!
We have a pretty straightforward adapter application: once every 30 seconds it reads records from a database (which it can't write to) of one system, converts each of these records into an internal format, performs filtering, enrichment, ..., and finally transforms the resulting, let's say, entities into an XML format and sends them via JMS to another system. Nothing new.
Let's add some spice here: records in the database are sequential (their identifiers are generated by a sequence), and when it is time to read a new bunch of records, we get a last-processed-sequence-number (which is stored in our internal database and updated each time the next record is processed, i.e. sent to JMS) and start reading from that record (+1).
The problem is that our customers gave us an NFR: processing of a bunch of read records must not last longer than 30 seconds. Since there are a lot of steps in the workflow (some of them pretty long-running), and it is possible to get a pretty big number of records, and since we process them one by one, it can take more than 30 seconds.
Because of all the above I want to ask 2 questions:
1) Is there an approach to parallel processing of sequential data, maybe with one or several intermediate storages, or the Disruptor pattern, or something CQRS-like, or notification-based, or ... that would work in such a system?
2) A general one. I need to store a last-processed-number and send an entity to JMS. If I save the number to the database and then some problem arises with JMS, on restart my adapter will think that it successfully sent the entity, which is not true, and it will never be received. If I send an entity and then try to save the number to the database and get an exception, on restart reprocessing will be performed, which will lead to duplicates in JMS. I'm not sure whether XA transactions will help here, or some kind of last resource gambit...
Could somebody, please, share experience or ideas?
Thanks in advance!

1) 30 seconds is a long time and you can do a lot in that time esp with more than one CPU. Without specifics I can only say it is likely you can make it faster if you profile it and use more CPUs.
2) You can update the database before you send and listen to the JMS queue yourself to see it was received by the broker.

Dimitry - I don't know the details of your problem, so I'm just going to make a set of assumptions. I hope it will at least trigger an idea that leads to the solution.
Here goes:
Grab your list of items to process.
Store the last id (and maybe the starting id)
Process each item on a different thread (suggest using Tasks).
Record any failed item in a local failed queue.
When you grab the next bunch, ensure you process the failed queue first.
Have a way of determining a max number of retries and a way of moving/marking it as permanently failed.
Not sure if that was what you were after. NServiceBus has a retry process where the gap between each retry gets longer up to a point, then it is marked as failed.
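A rough sketch of the failed-queue part of that list, under some assumptions: RetryingProcessor, its worker callback (returning true on success) and maxRetries are names invented here, items are processed sequentially instead of on separate threads to keep the example short, and backoffMillis is one simple way to get a growing gap between retries, not how NServiceBus computes it.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical processor that retries failed items first and gives up after maxRetries. */
class RetryingProcessor {
    private static final class Attempt {
        final long itemId;
        final int failures;
        Attempt(long itemId, int failures) {
            this.itemId = itemId;
            this.failures = failures;
        }
    }

    private final Deque<Attempt> failedQueue = new ArrayDeque<>();
    private final List<Long> permanentlyFailed = new ArrayList<>();
    private final int maxRetries;
    private final Predicate<Long> worker; // assumed to return true on success

    RetryingProcessor(int maxRetries, Predicate<Long> worker) {
        this.maxRetries = maxRetries;
        this.worker = worker;
    }

    /** Processes previously failed items first, then the new bunch. */
    void processBunch(List<Long> newIds) {
        Deque<Attempt> work = new ArrayDeque<>(failedQueue);
        failedQueue.clear();
        for (long id : newIds) {
            work.addLast(new Attempt(id, 0));
        }
        for (Attempt attempt : work) {
            if (worker.test(attempt.itemId)) {
                continue; // success
            }
            int failures = attempt.failures + 1;
            if (failures > maxRetries) {
                permanentlyFailed.add(attempt.itemId); // retries exhausted
            } else {
                failedQueue.addLast(new Attempt(attempt.itemId, failures));
            }
        }
    }

    int pendingRetries() { return failedQueue.size(); }

    List<Long> permanentlyFailed() { return permanentlyFailed; }

    /** Delay before a given retry: doubles each time, up to capMillis. */
    static long backoffMillis(int retry, long baseMillis, long capMillis) {
        return Math.min(baseMillis << Math.min(retry, 20), capMillis);
    }
}
```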

Folks, finally we ended up with the following solution. We implemented a kind of the Actor Model. The idea is the following.
There are two main (internal) database tables in our application; let's call them READ_DATA_INFO, which contains the last-read-record-number of the 'source' external system, and DUMPED_DATA, which stores metadata about each record read from the source system. This is how it all works: every n seconds (a configurable property) a service bus reads the last processed identifier of the source system and sends a request to the source system to get new records from it. If there are new records, they are wrapped in a DumpRecordBunchMessage message and sent to a DumpActor class. This class begins a transaction which comprises two operations: update the last-read-record-number (the READ_DATA_INFO table) and save metadata about each record (the DUMPED_DATA table). Each dumped record gets the 'NEW' status; when a record is successfully processed, it gets the 'COMPLETED' status, otherwise the 'FAILED' status. If the transaction commits successfully, each of those records is wrapped in a RecordMessage message class and sent to the next processing actor; otherwise those records are just skipped, and they will be reread after the next n seconds.
There are three interesting points:
Application disaster recovery. What if our application is stopped in the middle of processing? No problem: at application startup (in a @PostConstruct-marked method) we find all the records with the 'NEW' status in the DUMPED_DATA table and, with the help of the stored metadata, restore them from the source system.
Parallel processing. After all records are successfully dumped, they become independent, which means that they can be processed in parallel. We introduced several mechanisms for parallelism and load balancing. The simplest one is a round-robin approach. Each processing actor consists of a parent actor (load balancer) and a configurable set of its child actors (workers). When a new message arrives in the parent actor's queue, it dispatches it to the next worker.
Duplicate record prevention. This is the most interesting one. Let's assume that we read data every 5 seconds. If there is an actor with a long-running operation, it is possible to have several attempts to read from the source system's database starting from the same last-read-record number. Thus there would potentially be a lot of duplicate records dumped and processed. In order to prevent this we added a CAS-like check of DumpActor's messages: if the last-read-record from a message is equal to the one from the DUMPED_DATA table, this message should be processed (no messages were processed before it); otherwise this message is rejected. Rather simple, but powerful.
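The CAS-like check can be illustrated with an in-memory stand-in. In the setup described above the number lives in a database table, so the real check would be a conditional update inside the DumpActor transaction; DumpGuard and tryAdvance are hypothetical names used only for this sketch.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical guard: a bunch is accepted only if it starts from the stored last-read number. */
class DumpGuard {
    private final AtomicLong lastReadRecord;

    DumpGuard(long initialLastReadRecord) {
        this.lastReadRecord = new AtomicLong(initialLastReadRecord);
    }

    /**
     * expectedLastRead is the number the bunch was read from; newLastRead is the
     * highest record number in the bunch. Returns false for a stale (duplicate) bunch.
     */
    boolean tryAdvance(long expectedLastRead, long newLastRead) {
        return lastReadRecord.compareAndSet(expectedLastRead, newLastRead);
    }

    long lastReadRecord() {
        return lastReadRecord.get();
    }
}
```

Two reads that both started from record 100 would then resolve as follows: the first bunch to arrive moves the number forward, and the second is rejected because 100 is no longer the stored value.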
I hope this overview will help somebody. Have a good time!

Concurrent email processing (without spamming)

I have a scenario where I need to process a csv file that contains some simulation data from a device. Each line is an output representing the device state at a point in time. On processing each line, specific columns are checked for variance / anomalies.
If there are anomalies, an email has to be sent to a bunch of folks with the detected anomaly. However, to avoid spamming them (the csv can occasionally be several hundred thousand lines) I have to maintain a threshold of X seconds, i.e. if a mail was sent for the same anomaly from the same condition (from the same device being simulated) less than X seconds ago, I must skip sending the mail.
Currently the solution I use seems clumsy to me, where
1) I save the mail-message and device id with anomaly detection time.
2) Create one "alert" per email-id with a create-time-stamp, sent-time-stamp, message-id (from step 1) and device-id with status as "NEW".
3) Before sending each mail I do a database read to see if the last email with status as 'SENT' has a time stamp that exceeds the threshold to ignore. ( now - sent-time-stamp > threshold)
If yes, then I get all the alerts using the message-id and send them out and update all their status to SENT- else just ignore.
I started off with a thread pool executor and realized halfway through that the read-send condition can fail once there are multiple threads trying to send out emails and update the sent-time-stamp. So for now I have set the thread pool size to 1, which defeats the purpose of an executor. (I don't have row-level locking as I use Mongo as the backing db.) The backing datastore has to be a NoSQL store, as the fields can vary drastically and the data will not fit on one machine's disk as more simulations get piped in.
The application is distributed - so a csv file can be picked by any random node to process and notify.
Would Akka be a good candidate for this kind of process ? Any insights or lessons from prior experience implementing this are welcome (I have to stick with JVM).
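Within a single JVM, the read-then-send race can be avoided by making the threshold check and the timestamp update one atomic step per device and anomaly. The sketch below uses ConcurrentHashMap.compute for that; AlertThrottle and tryAcquire are made-up names, and the clock is passed in as a parameter so the behaviour is easy to test. Across several nodes the same check would have to be an atomic conditional update in the shared datastore instead, which this sketch does not cover.

```java
import java.time.Duration;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical per-JVM throttle: at most one alert per device and anomaly per threshold window. */
class AlertThrottle {
    private final ConcurrentMap<String, Long> lastSentMillis = new ConcurrentHashMap<>();
    private final long thresholdMillis;

    AlertThrottle(Duration threshold) {
        this.thresholdMillis = threshold.toMillis();
    }

    /**
     * Returns true if the caller should send the mail now. The check and the
     * timestamp update happen atomically for each key inside compute().
     */
    boolean tryAcquire(String deviceId, String anomaly, long nowMillis) {
        String key = deviceId + "|" + anomaly;
        boolean[] allowed = {false};
        lastSentMillis.compute(key, (k, last) -> {
            if (last == null || nowMillis - last >= thresholdMillis) {
                allowed[0] = true;
                return nowMillis; // record this send
            }
            return last; // too soon, keep the previous timestamp
        });
        return allowed[0];
    }
}
```

Worker threads in a pool larger than 1 could each call tryAcquire(deviceId, anomaly, System.currentTimeMillis()) and send only when it returns true.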

You can use distributed Akka as a replacement (see a good tutorial here http://www.addthis.com/blog/2013/04/16/building-a-distributed-system-with-akka-remote-actors/#.U-HWzvmSzy4) but why? Just tweak what already works:
1) Remove the Executor entirely; it's not needed here. Send emails one by one (I suppose you're not trying to send millions of mail messages at once, right?)
2) Cleanup database for old messages on application start to resolve problems with disk space.

Akka could help you with the distribution if you use Akka Cluster. That gives you a dynamic peer-to-peer cluster on your nodes, very nice if you need it. Apart from that, Akka is message-based, which sounds like a good match for modelling your domain.
However, be aware that Akka is based on the actor programming model, which is great but really different from multi-threaded programs in Java. So there is a learning curve. If you need a quick solution, it will probably not be the best match. If you are willing to put some time into this and learn what Akka is about, it could be a good match.
