Advice on MoM and large messages - Java

I'm designing a system that will use JMS and some messaging software (I'm leaning towards ActiveMQ) as middleware. There will be fewer than 100 agents, each pushing at most 5,000 messages per day through the queue.
The payload per message will be around 100 bytes. I expect roughly half of the messages (2,500) to cluster around midnight, with the other half spread fairly evenly over the day.
The figures given above are all on the higher end of what I expect. (Yeah, I'll probably eat that statement in the near future.)
There is one type of message where the payload will be considerably larger, say in the range of 5-50 MB.
These messages will only be sent a few times per day from each agent.
My questions are:
Will this cause me problems in any way, or is it perfectly normal to send larger amounts of data through a message queue?
For example, will it reduce throughput (smaller messages queuing up) while dealing with the larger messages?
Or will the message queue choke on larger messages?
Or should I approach this in a different way, say by sending the location of the data through JMS and letting the end receiver pick up the data elsewhere?
(I was hoping to avoid a special case, due to coupling, security issues, and extra configuration.)
I'm completely new to the practical details of JMS, so just tell me if I need to provide more details.
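To make the "send the location" alternative concrete, I mean something roughly like this (just a sketch; the property name and the idea of a shared store are illustrative, not part of my actual design):

    // Sketch: send a small reference message instead of the 5-50 MB payload.
    // The consumer fetches the real data from wherever "dataLocation" points.
    import javax.jms.JMSException;
    import javax.jms.MessageProducer;
    import javax.jms.Session;
    import javax.jms.TextMessage;

    public class ReferenceProducer {
        public void sendReference(Session session, MessageProducer producer,
                                  String dataLocation, long sizeBytes) throws JMSException {
            TextMessage msg = session.createTextMessage(dataLocation);
            msg.setLongProperty("payloadSize", sizeBytes); // illustrative property
            producer.send(msg);
        }
    }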
Edited:
I accepted Andres's truly awesome answer. Keep posting advice and opinions; I will keep upvoting everything useful.

Larger messages will definitely have an impact, but the sizes you mention here (5-50 MB) should be manageable by any decent JMS server.
However, consider the following. While a particular message is being processed, the entire message is read into memory. So if 100 agents each send a 50 MB message to a different queue at around the same time, or at different times but the messages take a long time to dequeue, you could run into a situation where you are trying to hold 5000 MB worth of messages in memory. I have run into similar problems with 4 MB messages on ActiveMQ in the past, although more messages were being sent than the figures mentioned here. If the messages are all sent to the same (persistent) queue, this should not be a problem, as only the message being processed needs to be in memory.
So it depends on your setup. If the theoretical upper limit of 5000 MB is manageable for you (keeping the roughly 2 GB limit of a 32-bit JVM in mind), then go ahead; however, this approach clearly does not scale very well, so I wouldn't suggest it. If everything is sent to one persistent queue, it would probably be fine, but I would recommend putting a prototype under load first to make sure. Processing might be slow, but not necessarily slower than if the data is fetched by some other mechanism. Either way, I would definitely recommend sending the smaller messages to separate destinations where they can be processed in parallel with the larger messages.
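To make the "separate destinations" suggestion concrete, here is a minimal sketch in plain JMS; the queue names and the size threshold are assumptions, not recommendations:

    import javax.jms.BytesMessage;
    import javax.jms.JMSException;
    import javax.jms.MessageProducer;
    import javax.jms.Session;

    public class SizeAwareSender {
        // Illustrative threshold; pick one that separates your 100-byte
        // messages from the 5-50 MB ones.
        private static final int LARGE_THRESHOLD = 1024 * 1024;

        private final Session session;
        private final MessageProducer smallProducer;
        private final MessageProducer largeProducer;

        public SizeAwareSender(Session session) throws JMSException {
            this.session = session;
            // Separate queues so small messages don't wait behind large ones.
            this.smallProducer = session.createProducer(session.createQueue("agent.small"));
            this.largeProducer = session.createProducer(session.createQueue("agent.large"));
        }

        public void send(byte[] payload) throws JMSException {
            BytesMessage msg = session.createBytesMessage();
            msg.writeBytes(payload);
            (payload.length > LARGE_THRESHOLD ? largeProducer : smallProducer).send(msg);
        }
    }

The consumers of the two queues can then run independently, so a 50 MB message never blocks the small-message path.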

We are running a similar scenario with a higher volume of messages. We did it along the lines of Andres's proposal, using different queues for the large number of smaller messages (which are still ~3-5 MB in our scenario) and the few big messages that are around 50-150 MB.
In addition to the memory problems already cited, we also encountered general performance issues on the message broker when processing a huge number of large persistent messages. These messages have to be persisted to the filesystem somehow, and we ran into bottlenecks on that side.

Of course the message size has an impact on throughput (in msgs/sec): the larger the messages, the lower the throughput.

Related

Thread per connection vs one thread for all connections in Java

I have two different types of server and client working at the moment, and I am trying to decide which one would be better for an MMO server, or at least a small MMO-like server with at least 100 players at a time.
My first server uses a thread-per-connection model and sends objects over the socket using ObjectOutputStream.
My second server uses Java NIO with only one thread for all the connections, using select to loop through them. This server also uses ObjectOutputStream to send data.
My question is: which would be the better approach for an MMO server? And if the single-threaded model is better, how is sending an object over the socket channel affected? Could a partial read mean I don't get the full object?
Each object being sent just contains, for example, an int and two floats for the player id and position.
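One way to sidestep the partial-read concern is to skip ObjectOutputStream and use a fixed-size binary frame, since each update is just one int and two floats (12 bytes). A minimal sketch, assuming the int is the player id and the floats are the position:

    import java.io.IOException;
    import java.nio.ByteBuffer;
    import java.nio.channels.SocketChannel;

    public class PositionCodec {
        public static final int FRAME_SIZE = Integer.BYTES + 2 * Float.BYTES; // 12 bytes

        // Encode one update as a fixed-size frame.
        public static ByteBuffer encode(int playerId, float x, float y) {
            ByteBuffer buf = ByteBuffer.allocate(FRAME_SIZE);
            buf.putInt(playerId).putFloat(x).putFloat(y);
            buf.flip();
            return buf;
        }

        // Non-blocking read: returns true only once a whole frame has arrived.
        // The caller keeps the same buffer across select() wakeups, so partial
        // reads simply accumulate until the frame is complete.
        public static boolean readFrame(SocketChannel channel, ByteBuffer buf) throws IOException {
            channel.read(buf); // may read anywhere from 0 to FRAME_SIZE bytes
            if (buf.position() < FRAME_SIZE) {
                return false;  // partial read: wait for the next wakeup
            }
            buf.flip();        // buffer now holds exactly one frame to decode
            return true;
        }
    }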
I will relate this question to why MMOs use UDP over TCP: UDP promises fast delivery, whereas TCP promises guaranteed delivery.
A similar analogy applies to a single-threaded vs. a multi-threaded model. Regardless of which you choose, your overall CPU cycles remain the same, i.e. the server can process only so much information per second.
Let's see what happens in each of these scenarios.
1. Single-Threaded Model:
In this model, your own implementation or the underlying library will end up creating a pipeline where requests queue up. At minimum load, the queue remains virtually empty and execution is effectively real-time, but a lot of CPU may be wasted. At maximum load, a long queue builds up and latency grows with load, but delivery is guaranteed and CPU utilization is optimal. Typically a slow client will slow everybody else down.
2. Multi-Threaded Model:
In this model, depending on how your own implementation or the underlying library implements multi-threading, requests will start executing in parallel. The catch with MT is that it's easy to get fooled. For example, java.util.concurrent.ThreadPoolExecutor doesn't actually do any parallel processing beyond its core pool size unless you set the queue size to a low value. Once parallel processing starts, at minimum load your execution will be very fast, CPU utilization will be optimal, and game performance will be great. At maximum load, however, your RAM usage will be high while CPU utilization will still be optimal. Typically you'll need thread interrupts to prevent a slow client from hogging all the threads, which will mean glitchy performance for that slow client. Additionally, as you start exhausting your thread pool and resources, requests will either get queued or simply dropped, leading to glitchy performance.
In gaming, performance matters more than stability, so there is no question that you should use MT wherever you can; however, tuning your thread parameters to complement your server resources will decide whether it's a boon or a complete bane.
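To illustrate the ThreadPoolExecutor caveat mentioned above: the pool only grows beyond its core size once its work queue is full, so an unbounded queue keeps you at the core thread count no matter the load. A sketch with a deliberately small bounded queue (all sizes are illustrative):

    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.ThreadPoolExecutor;
    import java.util.concurrent.TimeUnit;

    public class GameExecutors {
        public static ExecutorService create() {
            // With an unbounded queue the pool never grows past corePoolSize.
            // A small bounded queue forces extra threads to spawn under load,
            // up to maximumPoolSize; beyond that, CallerRunsPolicy applies
            // backpressure instead of silently dropping tasks.
            return new ThreadPoolExecutor(
                    4,                              // corePoolSize
                    32,                             // maximumPoolSize
                    60L, TimeUnit.SECONDS,          // idle timeout for extra threads
                    new ArrayBlockingQueue<>(16),   // deliberately small queue
                    new ThreadPoolExecutor.CallerRunsPolicy());
        }
    }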

Design: A Java Application with high throughput

I have a scenario, in which
A HUGE input file with a specific format, delimited with \n, has to be read; it has almost 20 million records.
Each record has to be read and processed by sending it to a server in a specific format.
=====================
I am thinking about how to design it:
- Read the file (NIO) and split it into chunks of records.
- The thread that reads the file puts those chunks onto a JMS queue.
- Create n threads representing the n servers (to which the data is to be sent); these n threads, running in parallel, each pick up one chunk at a time and process it by sending requests to the server.
Can you say whether the above is fine, or whether you see any flaws? :) It would also be great if you could suggest a better way or better technologies to do this.
Thank you!
Updated: I wrote a program to read that file with its 20M records. Using Apache Commons IO (a file iterator), I read the file in chunks (10 lines at a time), and it read the file in 1.2 seconds. How good is this? Should I think of going to NIO? (When I added a log statement to print the chunks, it took almost 26 seconds!)
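(Roughly what that timing test looked like; a sketch, with the file name made up:)

    import java.io.File;
    import org.apache.commons.io.FileUtils;
    import org.apache.commons.io.LineIterator;

    public class ReadTimer {
        public static void main(String[] args) throws Exception {
            long start = System.currentTimeMillis();
            LineIterator it = FileUtils.lineIterator(new File("records.txt"), "UTF-8");
            try {
                long count = 0;
                while (it.hasNext()) {
                    it.nextLine(); // collect 10 lines into a chunk here
                    count++;
                }
                System.out.println(count + " lines in "
                        + (System.currentTimeMillis() - start) + " ms");
            } finally {
                LineIterator.closeQuietly(it);
            }
        }
    }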
20 million records isn't actually that many, so first off I would try just processing it normally; you may find the performance is fine.
After that you will need to measure things.
You need to read from the disk sequentially for good speed, so that part must be single-threaded.
You don't want the disk reads waiting for the network, or the network waiting for the disk reads, so dropping the data into a queue is a good idea. You will probably want a chunk size larger than one line for optimum performance, though. Measure the performance at different chunk sizes to see.
You may find that network sending is already faster than disk reading. If so, you are done; if not, you can spin up more threads reading from the queue and test with them.
So your tuning factors are:
chunk size
number of threads.
Make sure you measure performance over a decent-sized amount of data for various combinations to find the one that works best for your circumstances.
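A minimal sketch of that pipeline with an in-JVM BlockingQueue; the chunk size, queue capacity, and thread count are exactly the tuning knobs above, and sendToServer is a placeholder for the real network call:

    import java.io.BufferedReader;
    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Paths;
    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.BlockingQueue;

    public class ChunkPipeline {
        static final int CHUNK_SIZE = 1000;        // tuning factor 1: measure!
        static final int CONSUMER_THREADS = 4;     // tuning factor 2: measure!
        static final List<String> POISON = new ArrayList<>(); // end-of-input marker

        public static void run(String path) throws Exception {
            BlockingQueue<List<String>> queue = new ArrayBlockingQueue<>(100);

            // One reader thread: keeps the disk access sequential.
            Thread reader = new Thread(() -> {
                try (BufferedReader in = Files.newBufferedReader(Paths.get(path))) {
                    List<String> chunk = new ArrayList<>(CHUNK_SIZE);
                    String line;
                    while ((line = in.readLine()) != null) {
                        chunk.add(line);
                        if (chunk.size() == CHUNK_SIZE) {
                            queue.put(chunk);
                            chunk = new ArrayList<>(CHUNK_SIZE);
                        }
                    }
                    if (!chunk.isEmpty()) queue.put(chunk);
                    for (int i = 0; i < CONSUMER_THREADS; i++) queue.put(POISON);
                } catch (IOException | InterruptedException e) {
                    throw new RuntimeException(e);
                }
            });
            reader.start();

            // N consumer threads sending chunks to the server in parallel.
            for (int i = 0; i < CONSUMER_THREADS; i++) {
                new Thread(() -> {
                    try {
                        List<String> chunk;
                        while ((chunk = queue.take()) != POISON) {
                            sendToServer(chunk);
                        }
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                }).start();
            }
        }

        static void sendToServer(List<String> chunk) { /* placeholder network call */ }
    }

The bounded queue also gives you backpressure: if the network is the bottleneck, the reader blocks on put() instead of filling the heap.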
I believe you could batch the records instead of sending them one at a time. You could avoid unnecessary network hops, given the volume of data that needs to be processed by the server.

Design pattern for accumulate / flush messages

Currently we have a publish/consume service where the consumer writes the messages it receives to AWS S3. We are writing more than 100,000,000 objects per month.
However, we can group these messages based on some rules in order to save some money.
These rules can be something like:
If we have received 10 messages from User 1, group them and write them to S3.
If we have received fewer than 10 messages from User 1 and more than 5 seconds have elapsed since the last message, flush to S3.
If the "internal" queue is bigger than N, start to flush.
What we don't want is to eat up our memory. Because of that, I am looking for the best approach from a design-patterns perspective, taking into consideration that we are talking about a highly loaded system, so we don't have infinite memory resources.
Thanks!
Well, based on your further explanation in the comments, there are related algorithms called leaky bucket and token bucket. Their primary purpose is slightly different, but you might consider using a modification: in particular, you could view the "droplets leaking out of the bucket" as a regular commit that flushes all of a single user's messages to S3 in one batch.
So, a modification more or less like this (please read the description of the algorithms first):
You have a bucket per user (you can easily afford this, as you have only about 300 users)
Each bucket is filled by the messages coming from that user
You regularly let each bucket leak (flush all the messages, or just a limited batch of messages)
I guess this more or less follows what your original requirement might have been.
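A minimal sketch of those per-user buckets, covering the first two rules above; the thresholds and writeGroupToS3 are placeholders:

    import java.util.ArrayList;
    import java.util.List;
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    public class MessageAccumulator {
        static final int MAX_PER_USER = 10;   // rule 1: flush at 10 messages
        static final long MAX_AGE_MS = 5_000; // rule 2: flush after 5 s of silence

        static final class Bucket {
            final List<String> messages = new ArrayList<>();
            long lastMessageAt;
        }

        private final Map<String, Bucket> buckets = new ConcurrentHashMap<>();

        public void onMessage(String userId, String message) {
            Bucket b = buckets.computeIfAbsent(userId, k -> new Bucket());
            synchronized (b) {
                b.messages.add(message);
                b.lastMessageAt = System.currentTimeMillis();
                if (b.messages.size() >= MAX_PER_USER) {
                    flush(userId, b); // rule 1
                }
            }
        }

        // Call this periodically from a scheduler thread; this is the "leak".
        public void flushStale() {
            long now = System.currentTimeMillis();
            buckets.forEach((userId, b) -> {
                synchronized (b) {
                    if (!b.messages.isEmpty() && now - b.lastMessageAt > MAX_AGE_MS) {
                        flush(userId, b); // rule 2
                    }
                }
            });
        }

        private void flush(String userId, Bucket b) {
            writeGroupToS3(userId, new ArrayList<>(b.messages)); // placeholder
            b.messages.clear();
        }

        private void writeGroupToS3(String userId, List<String> group) {
            /* assumed: one grouped S3 write */
        }
    }

Rule 3 (a global size bound) would hook in as a counter across all buckets that triggers early flushing. Memory stays bounded because each bucket holds at most MAX_PER_USER messages and there are only ~300 users.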

Java distributed list or queue for online production algorithm

I am currently involved in a project to move our search algorithm, which is based on Dijkstra/A*, to a production system. Basically, the algorithm receives a request and searches until the optimal solution is found, which usually takes a few seconds. The problem is that the prototype version of our algorithm relies on the JDK priority queue (which is basically a binary heap), and this consumes a large amount of memory during the search. So one of the big problems is how to handle the scalability of the system if we want to put the algorithm into production and handle multiple requests concurrently. We are trying to figure out the best option, and the ideas flying through our minds are:
The most trivial approach is to create a new instance of the algorithm each time a request is received, but it does not look like an efficient way to solve the problem (we would require a large amount of RAM for each instance).
Use some kind of persistent and efficient store/database and move part of the queue's elements there when the queue grows too big. This can alleviate the memory problems, but new problems arise, like keeping the order between the elements in the in-memory queue and the elements in the store.
Delegate the task of handling the queue to a big framework like Hazelcast. Each instance of the algorithm could use a distributed queue with Hazelcast. The problem is that Hazelcast does not have any kind of sorted queue, so we would have to handle the ordering explicitly from outside the queue, which is a big performance issue.
We are also considering ActiveMQ, although the framework is not designed for this sort of problem. ActiveMQ's priority queue supports only 9 distinct priorities, which is not enough for our problem, as we sort the elements in the queue by a float value (effectively infinitely many priorities).
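For reference, the frontier we keep in memory during each search is essentially the following (the Node fields are illustrative):

    import java.util.Comparator;
    import java.util.PriorityQueue;

    public class Frontier {
        // Illustrative search node: the cost is a float, so priorities are
        // effectively continuous and cannot be mapped onto ActiveMQ's
        // 9 discrete priority levels.
        static final class Node {
            final long stateId;
            final float fCost; // g + h in A*
            Node(long stateId, float fCost) { this.stateId = stateId; this.fCost = fCost; }
        }

        // JDK binary heap ordered by the float cost. Every queued node lives
        // on the heap, which is where the memory pressure comes from.
        private final PriorityQueue<Node> open =
                new PriorityQueue<>(Comparator.comparingDouble((Node n) -> n.fCost));

        public void push(Node n) { open.add(n); }
        public Node popBest()    { return open.poll(); }
        public int size()        { return open.size(); }
    }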
We are completely lost in this architecture design problem. Any advice is welcome.

Achieving consistent response times in GAE?

When running load tests against my app I am seeing very consistent response times. Once there is a constant level of load on GAE, the mean response times get smaller and smaller. But I want to have the same consistency in other apps that receive far fewer requests per second. In those I never need to support more than ~3 requests/second.
Reading the docs makes me think that increasing the number of minimum idle instances should result in more consistent response times. But even then, clients will still see higher response times every time GAE's scheduler thinks more instances are required. I am looking for a setup where users do not see those initial slow requests.
When I increase the number of minimum idle instances to 1, I want GAE to use only the one resident instance. As load increases, it should bring up and warm up new (dynamic) instances, and only once they are warmed up should GAE send requests to them. But judging from the response times, it seems as if client requests arrive at dynamic instances while they are still being brought up. As a result, those requests take a long time (up to 30 seconds).
Could this happen if my warmup code is incomplete?
Could the first calls on the dynamic instances be so slow because they involve code paths that have not been warmed up yet?
I do not experience this problem during load tests or when enough people are using the app. But my testing environment is practically unusable by clients when nobody is using the app yet, e.g. in the morning.
Thanks!
Some generic thoughts:
A 30-second startup time for instances seems like a lot. We do a lot of initialization (including database hits), and we have around 5 seconds of overhead.
Warmup requests aren't guaranteed. If all instances are busy, and the scheduler believes that a request will be answered faster by starting a new instance instead of queuing it on a busy one, it will do so without wasting time on a warmup request.
I don't think this is an issue of a cold code path (though I don't know Java's HotSpot in detail); it's probably the (mem)cache which needs to fill first.
I don't know what you meant by "incomplete warmup code"; just check your logs for requests to /_ah/warmup. If there are any, warmup requests are enabled and working.
Increasing the number of idle instances beyond one probably won't help here.
Sadly, there aren't any generic tricks to avoid that, but you could try to:
defer initialization code (doing only the absolute required minimum at instance startup)
start a backend that keeps the (mem)cache hot
If you don't mind the costs (and don't need automatic scaling for your low-volume application), you could even have all requests served by always-on backends.
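On the Java runtime, a warmup hook is just a servlet mapped to /_ah/warmup (with warmup requests enabled in appengine-web.xml); a sketch, where the two prime* methods are placeholders for your own initialization:

    import java.io.IOException;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    // Map this to /_ah/warmup in web.xml. GAE calls it before routing user
    // traffic to a freshly started instance (when warmup requests are enabled).
    public class WarmupServlet extends HttpServlet {
        @Override
        protected void doGet(HttpServletRequest req, HttpServletResponse resp)
                throws IOException {
            primeCaches();      // placeholder: fill memcache / local caches
            primeConnections(); // placeholder: datastore or other expensive setup
            resp.setStatus(HttpServletResponse.SC_OK);
        }

        private void primeCaches()      { /* application-specific */ }
        private void primeConnections() { /* application-specific */ }
    }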
