the following is from the documentation (akka):
Delivery guarantees
Stream refs utilise normal actor messaging for their trainsport, and therefore provide the same level of basic delivery guarantees. Stream refs do extend the semantics somewhat, through demand re-delivery and sequence fault detection. In other words:
messages are sent over actor remoting
which relies on TCP (classic remoting or Artery TCP) or Aeron UDP for basic redelivery mechanisms
messages are guaranteed to to be in-order
messages can be lost, however:
a dropped demand signal will be re-delivered automatically (similar to system messages)
a dropped element signal will cause the stream to fail
(link -> https://doc.akka.io/docs/akka/current/stream/stream-refs.html)
After reading this, i am curious. Does akka stream provide guaranteed delivery then.
For eg. A bunch of actors store events in journal that feeds a stream that batches messages (with flow of lets say 1 second for max 1000 messages) to the other actor. Does this guarantee delivery?
Also, as a side question. If the system messages re-delivers droped messages automaticly, does this mean that the event stream guarantees delivery?
StreamRefs do not currently (Akka 2.6.1) implement any reliability other than a sequence numbering for the element and demand re-signalling:
If a gap or out of order delivery is detected from the element sequence number the streams are failed. There is no redelivery of elements.
If a demand request from the receiving side is lost it is resent after a timeout.
The receiving side has a buffer and in case of stream failure all elements in it will be lost in addition to any elements in flight before the sending side sees the failure signal from the receiving side (which goes across network so is not immediate).
Related
I have below configuration for rabbitmq
prefetchCount:1
ack-mode:auto.
I have one exchange and one queue is attached to that exchange and one consumer is attached to that queue. As per my understanding below steps will be happening if queue has multiple messages.
Queue write data on a channel.
As ack-mode is auto,as soon as queue writes message on channel,message is removed from queue.
Message comes to consumer,consumer start performing on that data.
As Queue has got acknowledgement for previous message.Queue writes next data on Channel.
Now,my doubt is,Suppose consumer is not finished with previous data yet.What will happen with that next data queue has written in channel?
Also,suppose prefetchCount is 10 and I have just once consumer attached to queue,where these 10 messages will reside?
The scenario you have described is one that is mentioned in the documentation for RabbitMQ, and elaborated in this blog post. Specifically, if you set a sufficiently large prefetch count, and have a relatively small publish rate, your RabbitMQ server turns into a fancy network switch. When acknowledgement mode is set to automatic, prefetch limiting is effectively disabled, as there are never unacknowledged messages. With automatic acknowledgement, the message is acknowledged as soon as it is delivered. This is the same as having an arbitrarily large prefetch count.
With prefetch >1, the messages are stored within a buffer in the client library. The exact data structure will depend upon the client library used, but to my knowledge, all implementations store the messages in RAM. Further, with automatic acknowledgements, you have no way of knowing when a specific consumer actually read and processed a message.
So, there are a few takeaways here:
Prefetch limit is irrelevant with automatic acknowledgements, as there are never any unacknowledged messages, thus
Automatic acknowledgements don't make much sense when using a consumer
Sufficiently-large prefetch when auto-ack is off, or any use of autoack = on will result in the message broker not doing any queuing, and instead doing routing only.
Now, here's a little bit of expert opinion. I find the whole notion of a message broker that "pushes" messages out to be a little backwards, and for this very reason- it's difficult to configure properly, and it is unclear what the benefit is. A queue system is a natural fit for a pull-based system. The processor can ask the broker for the next message when it is done processing the current message. This approach will ensure that load is balanced naturally and the messages don't get lost when processors disconnect or get knocked out.
Therefore, my recommendation is to drop the use of consumers altogether and switch over to using basic.get.
I am trying to understand the best use of RabbitMQ to satisfy the following problem.
As context I'm not concerned with performance in this use case (my peak TPS for this flow is 2 TPS) but I am concerned about resilience.
I have RabbitMQ installed in a cluster and ignoring dead letter queues the basic flow is I have a service receive a request, creates a persistent message which it queues, in a transaction, to a durable queue (at this point I'm happy the request is secured to disk). I then have another process listening for a message, which it reads (not using auto ack), does a bunch of stuff, writes a new message to a different exchange queue in a transaction (again now happy this message is secured to disk). Assuming the transaction completes successfully it manually acks the message back to the original consumer.
At this point my only failure scenario is is I have a failure between the commit of the transaction to write to my second queue and the return of the ack. This will lead to a message being potentially processed twice. Is there anything else I can do to plug this gap or do I have to figure out a way of handling duplicate messages.
As a final bit of context the services are written in java so using the java client libs.
Paul Fitz.
First of all, I suggest you to look a this guide here which has a lot of valid information on your topic.
From the RabbitMQ guide:
At the Producer
When using confirms, producers recovering from a channel or connection
failure should retransmit any messages for which an acknowledgement
has not been received from the broker. There is a possibility of
message duplication here, because the broker might have sent a
confirmation that never reached the producer (due to network failures,
etc). Therefore consumer applications will need to perform
deduplication or handle incoming messages in an idempotent manner.
At the Consumer
In the event of network failure (or a node crashing), messages can be
duplicated, and consumers must be prepared to handle them. If
possible, the simplest way to handle this is to ensure that your
consumers handle messages in an idempotent way rather than explicitly
deal with deduplication.
So, the point is that is not possibile in any way at all to guarantee that this "failure" scenario of yours will not happen. You will always have to deal with network failure, disk failure, put something here failure etc.
What you have to do here is to lean on the messaging architecture and implement if possibile "idempotency" of your messages (which means that even if you process the message twice is not going to happen anything wrong, check this).
If you can't than you should provide some kind of "processed message" list (for example you can use a guid inside every message) and check this list every time you receive a message; you can simply discard them in this case.
To be more "theorical", this post from Brave New Geek is very interesting:
Within the context of a distributed system, you cannot have
exactly-once message delivery.
Hope it helps :)
I am adding two JMS messages in the same destination sequentially. Will both of these messages be received in the same order in which I have added them or is there a chance for reverse ordering, that is, which ever the message is received first in the destination will be retrieved first.
I am adding into a destination as:
producer.send(Msg1);
producer.send(Msg2);
Msg1 and Msg2 will be added sequentially in all the cases (like network failures and latency. etc.)?
Message ordering is not guaranteed (and not mandated by the specification) and Total JMS Message ordering explains the details of why. Also see the Stack Overflow post How to handle order of messages in JMS?.
As per JMS2 specs
JMS defines that messages sent by a session to a destination must be received
in the order in which they were sent. This defines a partial ordering
constraint on a session’s input message stream.
JMS does not define order of message receipt across destinations or across
a destination’s messages sent from multiple sessions. This aspect of a
session’s input message stream order is timing-dependent. It is not under
application control.
Also
Although clients loosely view the messages they produce within a session
as forming a serial stream of sent messages, the total ordering of this stream
is not significant. The only ordering that is visible to receiving clients is
the order of messages a session sends to a particular destination.
Several things can affect this order like message priority,
persistent/non persistent etc.
So to answer your question messages will be received in the same order they were sent with above information. The order in which messages are delivered to the server however will be constrained by limitations like message priority, persistent/non persistent etc.
Is there a message queue implementation that allows breaking up work into 'batches' by inserting 'message barriers' into the message stream? Let me clarify. No messages after a message barrier should be delivered to any consumers of the queue, until all messages before the barrier are consumed. Sort of like a synchronization point. I'd also prefer if all consumers received notification when they reached a barrier.
Anything like this out there?
I am not aware of existing, widely-available implementations, but if you'll allow me I'd propose a very simple, generic implementation using a proxy, where:
producers write to the proxy queue/topic
the proxy forwards to the original queue/topic until a barrier message is read by the proxy, at which point:
the proxy may notify topic subscribers of the barrier by forwarding the barrier message to the original topic, or
the proxy may notify queue subscribers of the barrier by:
periodically publishing barrier messages until the barrier has been cleared; this does not guarantee that all consumers will receive exactly one notification, although all will eventually clear the barrier (some may receive 0 notifications, others >1 notifications -- all depending on the type of scheduler used to distribute messages to consumers e.g. if non-roundrobin)
using a dedicated topic to notify each consumer exactly once per barrier
the proxy stops forwarding any messages from the proxy queue until the barrier has been cleared, that is, until the original queue has emptied and/or all consumers have acknowledged all queue/topic messages (if any) leading up to the barrier
the proxy resumes forwarding
UPDATE
Thanking Miklos for pointing out that under JMS the framework does not provide acknowledgements for asynchronous deliveries (what is referred to as "acknowledgements" in JMS are purely a consumer side concept and are not proxiable as-such.)
So, under JMS, the existing implementation (to be adapted for barriers) may already provide application-level acknowledgements via an "acknowledgement queue" (as opposed to the original queue -- which would be a "request queue".) The consumers would have to acknowledge execution of requests by sending acknowledgement messages to the proxy acknowledgement queue; the proxy would use the acknowledgement messages to track when the barrier has been cleared, after having also forwarded the acknowledgement messages to the producer.
If the existing implementation (to be adapted for barriers) does not already provide application-level acknowledgements via an "acknowledgement queue", then you could either:
have the proxy use the QueueBrowser, provided that:
you are dealing with queueus not events, that
you want to synchronize on delivery not acknowledgement of execution, and
it is OK to synchronize on first delivery, even if the request was actually aborted and has to be re-delivered (even after the barrier has been cleared.) I think Miklos already pointed this problem out IIRC.
otherwise, add an acknowledgment queue consumed by the proxy, and adapt the consumers to write acknowledgements to it (essentially the JMS scenario above, except it is not necessary for the proxy to forward acknowledgement messages to the producer unless your producer needs the functionality.)
You could achieve this using a topic for the 'Barrier Message' and a queue for the 'batched items' which are consumed with selective receivers.
Publishing the Barrier Message to a topic ensures that all consumers receive their own copy of the Barrier Message.
Each consumer will need two subscriptions:
To the Barrier Topic
A selective receiver against the batch queue, using selection criteria defined by the Barrier Message.
The Barrier Message will need to contain a batch key that must be applied to the queue consumers selection criteria.
e.g. batchId = n
or JMSMessageID < 100
or JMSTimestamp < xxx
Whenever a barrier message is received,
the current queue consumer must be closed
the queue selection criteria must be modified using the content of the Barrier Message
a new selective consumer must be started using the modified selection criteria
If you are going to use a custom batch key for the selection criteria such as 'batchId' above, then the assumption is that all message producers are capable of setting that JMS property or else a proxy will have to consume the messages set the property and republish to the queue where the selective consumers are listening.
For more info on selective receivers see these links:
http://java.sun.com/j2ee/1.4/docs/api/javax/jms/Message.html
http://java.sun.com/j2ee/sdk_1.3/techdocs/api/javax/jms/QueueSession.html#createReceiver(javax.jms.Queue,%20java.lang.String)
Say I load messages in a queue from multiple nodes.
Then, one or many nodes are pulling messages from the queue.
Is it possible (or is this normal usage?) that the queue guarantees to not hand out a message to more than one server/node?
And does that server/node have to tell the queue it has completed the operation and the queue and delete the message?
A message queuing system that did not guarantee to hand out a given message to just one recipient would not be worth the using. Some message queue systems have transactional controls. In that case, if a message is collected by one receiver as part of a transaction, but the receiver does not then commit the transaction (and the message queue can identify that the original recipient is no longer available), then it would be reissued. However, the message would not be made available to two processes concurrently.
What messaging/queuing technology are you using ? AMQP can certainly guarantee this behaviour (amongst many others, including pub/sub models)
If you want this in Java - then a JMS compliant messaging system will do what you want - and most messaging systems have a JMS client. You can Use Spring's JmsTemplate for real ease of use too.
With JMS - a message from a Queue will only be consumed by one and only one client - and once it is consumed (acknowledged) - it will be removed from the messaging system. Also when you publish a message using JMS - if its persistent - it will be sent synchronously, and the send() method won't return until the message is stored on the broker's disk - this is important - if you don't want to run the risk of loosing messages in the event of failure.