If I have many queues, each with a unique ID, would a Hashtable of Queues be the way to go? I know it sounds strange to ask, but I'm wondering if there might be a better way to optimize this.
Sorry for the lack of information. I'm basically storing queues of messages which are identified by the client id.
The client will request to get messages from the server.
In the case when the ack does not reach the server, the message still remains in the queue until the client makes another attempt to get the oldest message.
The idea is to retain all the messages if the client fails to ack and to retrieve all messages in a FIFO manner.
The question doesn't provide any detail on what you want to do with this. And this is very important, because the usage pattern is critical in determining which data structure is going to be most efficient for your use case.
But I'd say that in the absence of other details, a Hashtable of Queues sounds like a sensible choice, with the Hashtable using the ID as the key and the corresponding Queue as the value.
Then the following operations will both be O(1) with very low overhead:
Add an item to the queue with a given ID
Pull the first item from the Queue with a given ID
Which is probably the usage pattern you are going to need in most cases.
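For illustration, a minimal sketch of that structure (the class and method names, and String messages, are made up; HashMap and ArrayDeque are used for brevity, so if several threads touch it you would want the synchronized Hashtable mentioned above or a ConcurrentHashMap instead). It also keeps the oldest message in place until an explicit ack, matching the clarified requirement:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

class MessageStore {
    // One FIFO queue of messages per client ID.
    private final Map<String, Deque<String>> queues = new HashMap<>();

    // O(1): append a message to the queue for the given client ID.
    void add(String clientId, String message) {
        queues.computeIfAbsent(clientId, id -> new ArrayDeque<>()).addLast(message);
    }

    // O(1): look at the oldest message without removing it,
    // so it stays in the queue until the client acks it.
    String peekOldest(String clientId) {
        Deque<String> q = queues.get(clientId);
        return (q == null) ? null : q.peekFirst();
    }

    // O(1): remove the oldest message once the ack arrives.
    String ack(String clientId) {
        Deque<String> q = queues.get(clientId);
        return (q == null) ? null : q.pollFirst();
    }
}
```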
Since Java is statically typed, I would definitely use a Hashtable... (that is, if we are talking about optimization).
I am creating multiple reliable topics in Hazelcast. As I understand from the documentation, it is backed by a ringbuffer. How can I configure the ringbuffer for a topic to suit my needs?
I want to persist only top 100 messages for one topic and entire history for another.
You can configure the ring buffers backing reliable topics by using the prefix _hz_rb_ in front of your reliable topic's name. For instance, assume that you have a reliable topic with the name myReliableTopic. The ring buffer backing this reliable topic will be named _hz_rb_myReliableTopic. So, you can configure it as below:
<ringbuffer name="_hz_rb_myReliableTopic">
<capacity>100</capacity>
</ringbuffer>
You can access this prefix using the RingbufferService.TOPIC_RB_PREFIX static field in the application.
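For completeness, the same thing can be done programmatically; a minimal sketch, assuming Hazelcast 3.x package names and reusing the topic name and capacity from the XML above:

```java
import com.hazelcast.config.Config;
import com.hazelcast.config.RingbufferConfig;
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.core.ITopic;

public class ReliableTopicSetup {
    public static void main(String[] args) {
        Config config = new Config();
        // Configure the ring buffer that backs the reliable topic "myReliableTopic"
        // by using the internal _hz_rb_ prefix, mirroring the XML above.
        config.addRingBufferConfig(
                new RingbufferConfig("_hz_rb_myReliableTopic").setCapacity(100));

        HazelcastInstance hz = Hazelcast.newHazelcastInstance(config);

        // The topic itself is still looked up by its plain name.
        ITopic<String> topic = hz.getReliableTopic("myReliableTopic");
        topic.publish("hello");
    }
}
```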
Please note that this prefix is not part of the public API, so it is not guaranteed to remain unchanged in future releases.
Keeping the same name for the ReliableTopic and the RingBuffer does not work. When getting the RingBuffer object, prefixing the name with _hz_rb_ as Alparslan Avci said earlier is the only solution. The Hazelcast documentation says otherwise, but it does not work. Looks like a bug in Hazelcast.
I have a Storm topology that creates many Spouts and Bolts. They will obviously be spread out over various systems/nodes, each with its own JVM.
I understand that Storm will automatically manage the network communications so that the tuples emitted by the Spout will reach the Bolts on a different JVM.
What I don't understand is how I can maintain a few variables that keep track of things.
I want one variable that counts the number of tuples that have been processed by all instances of Bolt-A, another variable counting for Bolt-B, and so on.
I also need a variable that acts as a flag so that I'll know when the Spouts have no more data to emit, so that the Bolts can start writing to SQL.
I considered using Redis, but wanted to know if that is the best way or is there any other way? Any code samples available anywhere? I Google-searched, but couldn't find much useful info.
First of all, there is no way to share a variable between tasks in Storm.
Instead of directly sharing the flag, you can define your own 'control' message and send it to the Bolts to signal that there are no more messages for the Spout to emit.
Sharing state with Redis is one of the possible options (you need to implement your own logic), but the flag value could flicker, so you may want to take care of that.
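A rough sketch of the control-message idea (Storm 1.x package names; the "control" stream name, component names, and writeToSql() are made up for illustration). The control stream should be wired with allGrouping so that every bolt instance sees the signal:

```java
import java.util.Map;
import org.apache.storm.task.OutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseRichBolt;
import org.apache.storm.tuple.Tuple;

// Topology wiring (elsewhere), so each bolt instance receives the control tuple:
//   builder.setBolt("boltA", new CountingBolt(), 4)
//          .shuffleGrouping("spout")            // normal data stream
//          .allGrouping("spout", "control");    // end-of-data signal
public class CountingBolt extends BaseRichBolt {
    private OutputCollector collector;
    private long processed;

    @Override
    public void prepare(Map conf, TopologyContext context, OutputCollector collector) {
        this.collector = collector;
    }

    @Override
    public void execute(Tuple input) {
        if ("control".equals(input.getSourceStreamId())) {
            // The spout announced it has nothing left to emit:
            // flush this instance's accumulated results to SQL.
            writeToSql(processed);
        } else {
            processed++;   // normal tuple processing
        }
        collector.ack(input);
    }

    private void writeToSql(long count) {
        // hypothetical: persist the per-instance count
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        // no output streams
    }
}
```

On the spout side, the control stream would be declared with declarer.declareStream("control", ...) and the spout would emit a single tuple on it once its data source is exhausted.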
You should be able to get the number of tuples emitted and transferred per component and also per instance of each component from the Storm UI. There is even a REST API to retrieve the values.
For the first requirement you may use the Metrics API (http://storm.apache.org/releases/0.10.1/Metrics.html).
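A rough sketch of registering a per-instance counter with the Metrics API (Storm 1.x package names; the metric name and the 60-second reporting interval are arbitrary choices):

```java
import java.util.Map;
import org.apache.storm.metric.api.CountMetric;
import org.apache.storm.task.OutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseRichBolt;
import org.apache.storm.tuple.Tuple;

public class MeteredBolt extends BaseRichBolt {
    private transient CountMetric processedCount;
    private OutputCollector collector;

    @Override
    public void prepare(Map conf, TopologyContext context, OutputCollector collector) {
        this.collector = collector;
        processedCount = new CountMetric();
        // Report the running count to any registered metrics consumer every 60 seconds.
        context.registerMetric("processed-count", processedCount, 60);
    }

    @Override
    public void execute(Tuple input) {
        processedCount.incr();   // count every tuple this instance processes
        // ... actual processing ...
        collector.ack(input);
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) { }
}
```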
For the second requirement, why not send a "flush" tuple similar to the timer tuple?
I have several similar systems which are authoritative for different parts of my data, but there's no way I can tell just from my "keys" which system owns which entities.
I'm working to build this system on top of AMQP (RabbitMQ), and it seems like the best way to handle this would be:
Create a Fanout exchange, called thingInfo, and have all of my other systems bind their own anonymous queues to that exchange.
Send a message out to the exchange: {"thingId": "123abc"}, and set a reply_to queue.
Wait for a single one of the remote hosts to reply to my message, or for some timeout to occur.
Is this the best way to go about solving this sort of problem? Or is there a better way to structure what I'm looking for? This feels mostly like the RPC example from the RabbitMQ docs, except I feel like using a broadcast exchange complicates things.
I think I'm basically trying to emulate the model described for MCollective's Message Flow, but, while I think MCollective generally expects more than one response, in this case, I would expect/require precisely one or, preferably, a clear "nope, don't have it, go fish" response from "everyone" (if it's really possible to even know that in this sort of architecture?).
Perhaps another model that mostly fits is "Scatter-Gather"? It seems there's support for this in Spring Integration.
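For reference, a minimal sketch of the flow I have in mind using the RabbitMQ Java client (amqp-client 5.x assumed; the host, the 5-second timeout, and polling with basicGet instead of a consumer callback are just placeholders for brevity):

```java
import com.rabbitmq.client.AMQP;
import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;
import com.rabbitmq.client.GetResponse;
import java.nio.charset.StandardCharsets;
import java.util.UUID;

public class ThingInfoQuery {
    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("localhost");   // assumption: broker runs locally
        try (Connection conn = factory.newConnection();
             Channel channel = conn.createChannel()) {

            channel.exchangeDeclare("thingInfo", "fanout");

            // Anonymous, exclusive, auto-delete queue for the single expected reply.
            String replyQueue = channel.queueDeclare().getQueue();
            String correlationId = UUID.randomUUID().toString();

            AMQP.BasicProperties props = new AMQP.BasicProperties.Builder()
                    .replyTo(replyQueue)
                    .correlationId(correlationId)
                    .build();
            channel.basicPublish("thingInfo", "", props,
                    "{\"thingId\": \"123abc\"}".getBytes(StandardCharsets.UTF_8));

            // Wait for one matching reply, giving up after roughly 5 seconds.
            long deadline = System.currentTimeMillis() + 5000;
            while (System.currentTimeMillis() < deadline) {
                GetResponse response = channel.basicGet(replyQueue, true);
                if (response != null
                        && correlationId.equals(response.getProps().getCorrelationId())) {
                    System.out.println(new String(response.getBody(), StandardCharsets.UTF_8));
                    return;
                }
                Thread.sleep(100);
            }
            System.out.println("No owner responded before the timeout");
        }
    }
}
```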
It's a reasonable architecture (have the uninterested consumers simply ignore the message).
If there's some way to extract the pertinent data that the consumers use to decide interest into headers, then you can gain some efficiency by using a topic exchange instead of a fanout.
In either case, it gets tricky if more than one consumer might reply.
As you say, you can use a timeout if zero consumers reply, but if you think that might be frequent, you may be better off using arbitrary two-way messaging and doing the reply correlation in your code rather than using request/reply and tying up a thread waiting for a reply that will never come, and timing out.
This could also deal with the multi-reply case.
Could someone explain the Broker pattern to me in plain english? Possibly in terms of Java or a real life analogy.
Try to imagine that 10 people have messages they need to deliver. Another 10 people are expecting messages from the previous group. In an open environment, each person in the first group would have to deliver their message to the recipient manually, so each person has to visit at least one member of the second group. This is inefficient and chaotic.
With a broker, there is a control class (in this case the postman) that receives all the messages from group one. The broker then organizes the messages by destination and performs any operations needed, before visiting each recipient once to deliver all of their messages. This is far more efficient.
In software design, this lets remote and heterogeneous classes communicate with each other easily. The control class exposes an interface that all incoming messages interact with, so all sorts of messages can be sent and interpreted correctly. Keep in mind this is not very scalable, so it loses effectiveness for larger systems.
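To make the analogy concrete, here is a bare-bones sketch (all class and method names are made up): senders only hand messages to the broker, which groups them by destination and then visits each recipient once.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// A recipient from "group two" in the analogy.
interface Recipient {
    void receive(String message);
}

class Broker {
    private final Map<String, Recipient> recipients = new HashMap<>();
    private final Map<String, List<String>> outbox = new HashMap<>();

    void register(String address, Recipient recipient) {
        recipients.put(address, recipient);
    }

    // Senders never talk to recipients directly; they only hand messages to the broker.
    void send(String address, String message) {
        outbox.computeIfAbsent(address, a -> new ArrayList<>()).add(message);
    }

    // The "postman" visits each recipient once, delivering everything queued for it.
    void deliverAll() {
        outbox.forEach((address, messages) -> {
            Recipient r = recipients.get(address);
            if (r != null) {
                messages.forEach(r::receive);
            }
        });
        outbox.clear();
    }
}
```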
Hope this helped!
I have an existing protocol I'd like to write a Java client for. The protocol consists of messages that have a header containing the message type and message length, followed by the announced number of bytes, which is the payload.
I'm having some trouble modeling it, since creating a class for each message type seems a bit excessive to me (that would turn out to be 20+ classes just to represent the messages that go over the wire), so I was thinking about alternative models, but I can't come up with one that works.
I don't want to do anything fancy with the messages, aside from notifying via publish/subscribe when a message comes in and, in some instances, replying.
Any pointers as to where to look?
A class for each message type is the natural OO way to model this. The fact that there are 20 classes should not put you off. (Depending on the relationship between the messages, you can probably implement common features in superclasses.)
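For example, a minimal sketch of that approach (the header layout, type codes, and message names are invented for illustration): a common superclass, a couple of concrete message classes, and a decoder that reads the header and builds the right subclass.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Assumed header layout: 1-byte type, then a 4-byte big-endian payload length.
abstract class Message {
    abstract byte type();
}

class LoginMessage extends Message {      // hypothetical message type
    final String user;
    LoginMessage(String user) { this.user = user; }
    byte type() { return 0x01; }
}

class DataMessage extends Message {       // hypothetical message type
    final byte[] payload;
    DataMessage(byte[] payload) { this.payload = payload; }
    byte type() { return 0x02; }
}

// Reads one message off the wire and dispatches to the matching class.
class MessageDecoder {
    Message decode(DataInputStream in) throws IOException {
        byte type = in.readByte();
        int length = in.readInt();
        byte[] payload = new byte[length];
        in.readFully(payload);
        switch (type) {
            case 0x01: return new LoginMessage(new String(payload, StandardCharsets.UTF_8));
            case 0x02: return new DataMessage(payload);
            default:   throw new IOException("Unknown message type: " + type);
        }
    }
}
```

The decoded messages can then be handed to your publish/subscribe layer, and subscribers can switch on the concrete class (or type code) to decide whether to reply.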
My advice is to not worry too much about efficiency to start with. Just focus on getting clean APIs that provide the required functionality. Once you've got things working, profile the code and see if the protocol classes are a significant bottleneck. If they are ... then think about how to make the code more efficient.