Multiple POST requests eat RES memory (htop) - java

I'll try to be brief and simple. I have a program that does:
Consumes JSON string messages from a RabbitMQ server.
Extracts the important fields and puts them into another JSON document (different structure). I'm using Jackson's POJO model for parsing.
Sends them to a server through an HTTP POST request, using HttpURLConnection (one connection per message) and DataOutputStream#write(byte[]).
I have 3 consumers (3 threads, each one consuming a different queue) performing these 3 steps, and it works fine. However, I have a very high volume of received messages (500 messages in one hour) and it seems that the requests are eating the resident memory slowly but progressively. The heap is stable and is cleaned regularly by the garbage collector.
The problem may be in:
My RabbitMQ consumers: they are limited to 1 unacknowledged message at a time (qos=1), so I don't think the problem is here.
The HttpURLConnection object: I think this is the problem, because if we only consume the data (without posting it) the memory stays stable.
With a profiler I can see that the heap grows until it reaches a peak and then the garbage collector cleans it (see the attached image).
How can I control the resident memory used? Is there anything I can do to control massive POST requests?
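One common cause of steadily growing resident memory when opening one HttpURLConnection per message is never reading or closing the response stream, which keeps native socket buffers alive and prevents connection reuse. The sketch below is not the asker's code, just an illustration under those assumptions (class name and URL are hypothetical): the body is streamed with a fixed length, the response is drained, and everything is closed.

    import java.io.IOException;
    import java.io.InputStream;
    import java.io.OutputStream;
    import java.net.HttpURLConnection;
    import java.net.URL;
    import java.nio.charset.StandardCharsets;

    public class JsonPoster {
        public static int post(String targetUrl, String json) throws IOException {
            byte[] body = json.getBytes(StandardCharsets.UTF_8);
            HttpURLConnection conn = (HttpURLConnection) new URL(targetUrl).openConnection();
            try {
                conn.setRequestMethod("POST");
                conn.setDoOutput(true);
                conn.setFixedLengthStreamingMode(body.length); // avoid buffering the whole body
                conn.setRequestProperty("Content-Type", "application/json");
                try (OutputStream out = conn.getOutputStream()) {
                    out.write(body);
                }
                int status = conn.getResponseCode();
                // Drain the response (or error) stream so the connection can be reused or released.
                try (InputStream in = status < 400 ? conn.getInputStream() : conn.getErrorStream()) {
                    if (in != null) {
                        byte[] buf = new byte[4096];
                        while (in.read(buf) != -1) { /* discard */ }
                    }
                }
                return status;
            } finally {
                conn.disconnect();
            }
        }
    }

Fully reading and closing the response stream is what allows the JVM to return the underlying socket to its keep-alive pool instead of holding it open.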

Related

Limiting memory usage with large body requests

I'm running a Vert.x Java web server to handle large body requests. In order to avoid memory overflow, I'm using the Vert.x backpressure mechanism with a Pump and an implementation of the WriteStream interface, which works properly to pause/resume the socket.
Changing the TCP receive buffer size (HttpServerOptions) is a good way to slow down the growth of virtual memory when the server receives a large request, but it cannot limit it.
So I need the writeQueueFull method to return true when the Vert.x input buffer size reaches a given threshold. The problem is that I haven't found a way to monitor the amount of memory Vert.x uses at runtime.
I could simply look at the JVM memory usage (Runtime.getRuntime().totalMemory() - Runtime.getRuntime().freeMemory()) to make the pause/resume decision, but it's not very precise. Is there another way?
Thanks
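One option, instead of reading JVM totals, is to count the bytes your own WriteStream has accepted but not yet handed off, and base writeQueueFull on that counter. Below is a rough sketch, not a complete WriteStream implementation (exact method signatures differ between Vert.x versions, and the 8 MB threshold is an arbitrary assumption):

    import io.vertx.core.buffer.Buffer;
    import java.util.concurrent.atomic.AtomicLong;

    public class CountingSink {
        private final AtomicLong pendingBytes = new AtomicLong();
        private volatile long maxPendingBytes = 8 * 1024 * 1024; // assumed 8 MB threshold

        // Called from your WriteStream.write(...) implementation.
        public void accept(Buffer chunk) {
            pendingBytes.addAndGet(chunk.length());
            // ... hand the chunk to the downstream consumer ...
        }

        // Called once the downstream consumer has actually processed a chunk.
        public void completed(int length) {
            pendingBytes.addAndGet(-length);
        }

        // Back WriteStream.writeQueueFull() with this; the Pump pauses the
        // ReadStream while it returns true.
        public boolean writeQueueFull() {
            return pendingBytes.get() >= maxPendingBytes;
        }

        public void setWriteQueueMaxSize(long maxSize) {
            this.maxPendingBytes = maxSize;
        }
    }

The Pump pauses the upstream ReadStream whenever writeQueueFull() returns true and resumes it through the drainHandler, so the counter effectively bounds how much of the request body sits in memory at once.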

How do I limit the GRPC send queue?

When you call onNext() on a streaming response in gRPC, the message is queued for transmission. This queue is allocated in direct buffer memory rather than on the heap, so the resulting java.lang.OutOfMemoryError: Direct buffer memory will not generate a useful heap dump. This can be simulated by creating a
message Chunk {
    bytes data = 1;
}
and sending many small chunks into a stream whose receiving end may not be as quick; this will eventually trigger the error. The proper fix would be to make sure the server does not do anything stupid like sending many small chunks, but this can still be a DoS vector that could shut down a service.
My question is: in gRPC, is there a setting on the server side to limit the amount queued and block further onNext() calls until the queue has drained, with a timeout to cancel the operation when the transfer takes too long? That way it won't shut down the whole service, just the gRPC call.
I am thinking the answer would be somewhere in this GitHub issue, though it seems like a lot of code for something so fundamental.
The local send buffer size on the server is hard-coded to 32 KB. You can use ServerCallStreamObserver.isReady() or the onReadyHandler to block and achieve flow control (and also to time out how long you wait).
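A sketch of that pattern, assuming grpc-java's stub API; the generic type stands in for the Chunk message above, and the iterator is just a placeholder for whatever produces the outgoing messages:

    import io.grpc.stub.ServerCallStreamObserver;
    import io.grpc.stub.StreamObserver;
    import java.util.Iterator;

    public final class FlowControlledSender {
        // Push messages only while the transport's send buffer has room;
        // gRPC calls the onReadyHandler again once the buffer drains.
        public static <T> void streamWithBackpressure(StreamObserver<T> responseObserver,
                                                      Iterator<T> source) {
            ServerCallStreamObserver<T> serverObserver =
                    (ServerCallStreamObserver<T>) responseObserver;
            Runnable drain = () -> {
                while (serverObserver.isReady() && source.hasNext()) {
                    serverObserver.onNext(source.next());
                }
                if (!source.hasNext()) {
                    serverObserver.onCompleted();
                }
            };
            serverObserver.setOnReadyHandler(drain);
            drain.run(); // send the first batch
        }
    }

For the timeout part, one approach is to schedule a task when isReady() goes false and call serverObserver.onError(...) if the onReadyHandler has not fired within the allowed time, cancelling just that call rather than the whole service.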

Understanding IgniteDataStreamer: ordering and buffering

I'm using IgniteDataStreamer with allowOverwrite to load continuous data.
Question 1.
From javadoc:
Note that streamer will stream data concurrently by multiple internal threads, so the data may get to remote nodes in different order from which it was added to the streamer.
Reordering is not acceptable in my case. Will setting perNodeParallelOperations to 1 guarantee that the order of addData calls is preserved? There are a number of caches being loaded simultaneously with IgniteDataStreamer, so the Ignite server node threads will all be utilized anyway.
Question 2.
My streaming application could hang for a couple of seconds due to a GC pause. I want to avoid a cache-loading pause at those moments and keep the average cache write speed high. Is it possible to configure IgniteDataStreamer to keep a (bounded) queue of incoming batches on the server node, to be consumed while the streaming (client) app hangs? Per question 1, the queue should be consumed sequentially. It's OK to use some heap for it.
Question 3.
perNodeBufferSize javadoc:
This setting controls the size of internal per-node buffer before buffered data is sent to remote node
According to the javadoc, data transfer is triggered by tryFlush / flush / autoFlush, so how does that correlate with the perNodeBufferSize limit? Would flush be ignored if there are fewer than perNodeBufferSize messages buffered (I hope not)?
I don't recommend trying to avoid reordering in DataStreamer, but if you absolutely need to do that, you will also need to set the data streamer pool size to 1 on the server nodes. If it's larger, the data is split into stripes and is not sent sequentially.
DataStreamer is designed for throughput, not latency. So there's not much you can do here. Increasing perThreadBufferSize, perhaps?
Data transfer is automatically started when perThreadBufferSize is reached for any stripe.
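A sketch of those settings, assuming Ignite 2.x (where IgniteConfiguration exposes setDataStreamerThreadPoolSize) and a hypothetical cache called "myCache"; the buffer size and flush frequency are arbitrary:

    import org.apache.ignite.Ignite;
    import org.apache.ignite.IgniteDataStreamer;
    import org.apache.ignite.Ignition;
    import org.apache.ignite.configuration.IgniteConfiguration;

    public class OrderedStreamingSketch {

        // Server-node side: a single data streamer stripe, per the answer above.
        static IgniteConfiguration serverConfig() {
            return new IgniteConfiguration().setDataStreamerThreadPoolSize(1);
        }

        // Client side: one in-flight batch per node so batches are applied in order.
        public static void main(String[] args) {
            try (Ignite ignite = Ignition.start()) {
                ignite.getOrCreateCache("myCache");
                try (IgniteDataStreamer<Long, String> streamer = ignite.dataStreamer("myCache")) {
                    streamer.allowOverwrite(true);
                    streamer.perNodeParallelOperations(1);
                    streamer.perNodeBufferSize(512);    // entries buffered before a batch is sent
                    streamer.autoFlushFrequency(1000);  // ms; periodically flushes partial buffers
                    for (long i = 0; i < 10_000; i++) {
                        streamer.addData(i, "value-" + i);
                    }
                    streamer.flush(); // not ignored for partially filled buffers
                }
            }
        }
    }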

Thread per connection vs one thread for all connections in java

I have two different types of server and client working at the moment, and I am trying to decide which one would be better for an MMO server, or at least a small MMO-like server with at least 100 players at a time.
My first server uses a thread-per-connection model and sends objects over the socket using ObjectOutputStream.
My second server uses Java NIO with only one thread for all the connections, using select to loop through them. This server also uses ObjectOutputStream to send data.
My question is: which would be the better approach for an MMO server? And if the single-threaded model is better, how would sending an object over the socket channel be affected? Could it be read only partway, so that the full object is not received?
Each object being sent just contains, for example, an int and 2 floats for the position and player id.
I will relate this question to why MMOs use UDP over TCP: UDP promises fast delivery, whereas TCP promises guaranteed delivery.
A similar analogy can be applied to a single-threaded vs a multi-threaded model. Regardless of which you choose, your overall CPU cycles remain the same, i.e. the server can process only so much information per second.
Let's see what happens in each of these scenarios.
1. Single-Threaded Model:
In this model, your own implementation or the underlying library will end up creating a pipeline where the requests start queuing. At minimum load, the queue will remain virtually empty and execution will be real-time, although a lot of CPU may be wasted. At maximum load, there will be a long queue and execution latency will grow with increasing load; however, delivery will be guaranteed and CPU utilization will be optimal. Typically, a slow client will slow everybody else down.
2. Multi-Threaded Model:
In this model, depending on how your own implementation or the underlying library implements multi-threading, requests will start executing in parallel. The catch with MT is that it's easy to get fooled. For example, java.util.concurrent.ThreadPoolExecutor doesn't actually do any parallel processing beyond its core pool size unless you set the queue size to a low value (see the sketch below). Once parallel processing starts happening, at minimum load your execution will be very fast, CPU utilization will be optimal, and game performance will be great. However, at maximum load your RAM usage will be high while CPU utilization will still be optimal. Typically you'll need thread interrupts to avoid a slow client hogging all the threads, which will mean glitchy performance for that slow client. Additionally, as you start exhausting your thread pool and resources, threads will either get queued or simply get dropped, leading to glitchy performance.
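A small demonstration of that ThreadPoolExecutor behavior (pool sizes and task counts are arbitrary): extra threads beyond the core size are created only when the work queue is full, so an unbounded queue keeps the pool at corePoolSize no matter the load.

    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.LinkedBlockingQueue;
    import java.util.concurrent.ThreadPoolExecutor;
    import java.util.concurrent.TimeUnit;

    public class PoolGrowthDemo {
        public static void main(String[] args) {
            // Never grows past 2 threads: the unbounded queue is never "full".
            ThreadPoolExecutor unbounded = new ThreadPoolExecutor(
                    2, 16, 60, TimeUnit.SECONDS, new LinkedBlockingQueue<>());

            // Can grow past the core size once the 10-slot queue fills up.
            ThreadPoolExecutor bounded = new ThreadPoolExecutor(
                    2, 16, 60, TimeUnit.SECONDS, new ArrayBlockingQueue<>(10));

            Runnable slowTask = () -> {
                try { Thread.sleep(1000); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
            };
            for (int i = 0; i < 20; i++) {
                unbounded.execute(slowTask);
                bounded.execute(slowTask);
            }
            System.out.println("unbounded pool size: " + unbounded.getPoolSize()); // stays at 2
            System.out.println("bounded pool size:   " + bounded.getPoolSize());   // grows to 10 here
            unbounded.shutdown();
            bounded.shutdown();
        }
    }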
In gaming, performance matters more than stability, so there is no question that you should use MT wherever you can; however, tuning your thread parameters to complement your server resources will decide whether it's a boon or a complete bane.
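On the other part of the question (partial reads with a single-threaded NIO server): a non-blocking SocketChannel read can indeed return only part of a message, so the usual approach is to keep a per-connection buffer and decode only once a complete message has arrived. A minimal sketch for the fixed 12-byte message from the question (int player id plus 2 floats; the field order is an assumption):

    import java.io.IOException;
    import java.nio.ByteBuffer;
    import java.nio.channels.SocketChannel;

    public class PositionReader {
        private static final int MESSAGE_SIZE = Integer.BYTES + 2 * Float.BYTES; // 12 bytes
        private final ByteBuffer buffer = ByteBuffer.allocate(MESSAGE_SIZE);

        // Call this whenever the selector reports the channel as readable.
        public void onReadable(SocketChannel channel) throws IOException {
            if (channel.read(buffer) == -1) {
                channel.close();
                return;
            }
            if (buffer.hasRemaining()) {
                return; // partial message; wait for the next readable event
            }
            buffer.flip();
            int playerId = buffer.getInt();
            float x = buffer.getFloat();
            float y = buffer.getFloat();
            buffer.clear();
            handlePosition(playerId, x, y);
        }

        private void handlePosition(int playerId, float x, float y) {
            // game logic goes here
        }
    }

ObjectOutputStream does not fit well with non-blocking channels for exactly this reason; a small fixed-size or length-prefixed binary format is much easier to reassemble from partial reads.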

Advice on MoM and large messages

I'm designing a system that will use JMS and some messaging software (I'm leaning towards ActiveMQ) as middleware. There will be fewer than 100 agents, each pushing at most 5000 messages per day through the queue.
The payload will be around 100 bytes per message. I expect roughly half (2500) of the messages to cluster around midnight and the other half to be somewhat evenly distributed during the day.
The figures given above are all on the higher end of what I expect. (Yeah, I'll probably eat that statement in the near future.)
There is one type of message where the payload will be considerably larger, say in the range of 5-50mb.
These messages will only be sent a few times per day from each agent.
My questions are:
Will this cause me problems in any way or is it perfectly normal to send larger amounts of data through a message queue?
For example, will it reduce throughput (smaller messages queuing up) while dealing with the larger messages?
Or will the message queue choke on larger messages?
Or should I approach this in a different way, say sending the location of the data through JMS and letting the end receiver pick up the data elsewhere?
(I was hoping not to have a special case due to coupling, security issues, and extra configuration).
I'm completely new to the practical details of jms, so just tell me if I need to provide more details.
Edited:
I accepted Andres' truly awesome answer. Keep posting advice and opinions; I will keep upvoting everything useful.
Larger messages will definitely have an impact, but the sizes you mention here (5-50 MB) should be manageable by any decent JMS server.
However, consider the following. While processing a particular message, the entire message is read into memory. So if 100 agents each send a 50MB message to a different queue at around the same time, or at different times but the messages take long to dequeue, you could run into a situation where you are trying to put 5000MB worth of messages into memory. I have run into similar problems with 4MB messages with ActiveMQ in the past, however there were more messages being sent than the figures mentioned here. If the messages are all sent to the same (persistent) queue, this should not be a problem, as only the message being processed needs to be in memory.
So it depends on your setup. If the theoretical upper limit of 5000 MB is manageable for you (and keep the 32-bit JVM limit of 2000 MB in mind), then go ahead; however, this approach clearly does not scale very well, so I wouldn't suggest it. If everything is sent to one persistent queue, it would probably be fine, but I would recommend putting a prototype under load first to make sure. Processing might be slow, but not necessarily slower than if the data were fetched by some other mechanism. Either way, I would definitely recommend sending the smaller messages to separate destinations where they can be processed in parallel with the larger messages.
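A sketch of that last suggestion, assuming ActiveMQ and two hypothetical queue names; the producer simply routes by payload size so the big transfers never sit in front of the small ones (the 1 MB threshold is arbitrary):

    import javax.jms.BytesMessage;
    import javax.jms.Connection;
    import javax.jms.MessageProducer;
    import javax.jms.Session;
    import org.apache.activemq.ActiveMQConnectionFactory;

    public class SizeAwareSender {
        private static final int LARGE_THRESHOLD = 1024 * 1024; // 1 MB, arbitrary

        public static void send(byte[] payload) throws Exception {
            ActiveMQConnectionFactory factory =
                    new ActiveMQConnectionFactory("tcp://localhost:61616");
            Connection connection = factory.createConnection();
            try {
                connection.start();
                Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
                // Route to a separate queue so large messages don't block small ones.
                String queueName = payload.length >= LARGE_THRESHOLD ? "LARGE.MESSAGES" : "SMALL.MESSAGES";
                MessageProducer producer = session.createProducer(session.createQueue(queueName));
                BytesMessage message = session.createBytesMessage();
                message.writeBytes(payload);
                producer.send(message);
            } finally {
                connection.close();
            }
        }
    }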
We are running a similar scenario with a higher number of messages. We did it similarly to Andres' proposal, using different queues for the large volume of smaller messages (which are still ~3-5 MB in our scenario) and the few big messages that are around 50-150 MB.
In addition to the memory problems already cited, we also encountered general performance issues on the message broker when processing a huge number of large persistent messages. This is caused by the need to persist these messages to the filesystem somehow; we ran into bottlenecks on that side.
Of course the message size has an impact on the throughput (in msgs/sec): the larger the messages, the lower the throughput.
