I am not getting the performance that I am expecting from using Apache Ignite DataGrid. I have tried a few configuration changes but at this point don't know how to investigate performance bottlenecks, and am looking for expert help.
I am using Apache Ignite to cache a byte array via a wrapper class I call ByteArray. My test code attempts to benchmark the cache performance by issuing multiple puts and then multiple gets from another process. I've tried running the get process on the same node and on different nodes. I have also created a baseline using a Java HashMap as the cache, which performs much better (roughly 10,000x for puts).
Right now, on the same node, I get the following:
HashMap cache, same node: put 2600 MB/s; get 300 MB/s
Ignite same node cache: put 0.4 MB/s; get 2.0 MB/s
Ignite cache, 2 nodes: put 0.3 MB/s; get 0.7 MB/s
I ran these in replicated mode but I see similar results for partitioned mode. I run multiple iterations of the test and average the timing. My nodes have 25GB memory and my test consumes ~1GB. I have configured the VM to use 10GB max.
First of all, comparing Ignite performance with HashMap doesn't make a lot of sense. Ignite is a distributed and scalable system, while HashMap is not even thread-safe.
The fact that you used HashMap as a baseline makes me think that your test is single-threaded. If you query Ignite from multiple threads/clients, I'm sure you will see much better throughput.
Also keep in mind that working with Ignite implies sending data across the network, so there's a chance you're limited by network speed.
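To illustrate, here is a minimal sketch of what a multi-threaded put benchmark could look like (the cache name, thread count, and payload size are placeholders, not taken from your test):

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.TimeUnit;
    import org.apache.ignite.Ignite;
    import org.apache.ignite.IgniteCache;
    import org.apache.ignite.Ignition;

    public class MultiThreadedPutBenchmark {
        public static void main(String[] args) throws Exception {
            Ignite ignite = Ignition.start();                 // or Ignition.start("your-config.xml")
            IgniteCache<Integer, byte[]> cache = ignite.getOrCreateCache("bytes");

            int threads = 16;                                 // placeholder: tune to your hardware
            int entriesPerThread = 10_000;
            byte[] payload = new byte[1024];                  // 1 KB value per entry

            ExecutorService pool = Executors.newFixedThreadPool(threads);
            long start = System.nanoTime();
            for (int t = 0; t < threads; t++) {
                final int offset = t * entriesPerThread;
                pool.submit(() -> {
                    for (int i = 0; i < entriesPerThread; i++)
                        cache.put(offset + i, payload);       // each thread writes a disjoint key range
                });
            }
            pool.shutdown();
            pool.awaitTermination(10, TimeUnit.MINUTES);

            double seconds = (System.nanoTime() - start) / 1e9;
            double mb = (double) threads * entriesPerThread * payload.length / (1024 * 1024);
            System.out.printf("put throughput: %.1f MB/s%n", mb / seconds);
            ignite.close();
        }
    }

For bulk loading, an IgniteDataStreamer (ignite.dataStreamer(cacheName)) will usually beat individual puts by an even wider margin.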
Related
In our project we are currently (still) using Apache Ignite 2.8.1. We are facing OOMs on the server nodes when multiple clients simultaneously fetch the content of a specific cache. So far, our theory is that the data is stored only off-heap, so each client request moves a copy of the data onto the heap (i.e. heap >= number_of_clients * size_of_cache). We expected to mitigate this by setting onHeapEnabled = 'True' for the given cache, since, according to our understanding, only one copy of the data should then exist on the heap and it should therefore no longer explode.
Are our assumptions in general correct?
Aren't the server nodes using some kind of byte stream internally when sending the data back to clients? In that case it would be even more surprising that the heap still explodes with on-heap storage activated.
We are aware that scaling out the server nodes / providing more heap would be a solution, but we would be interested in a more resource-efficient one.
The OOM is most likely caused by Ignite's internal per-client metrics and metadata. These can exhaust the heap when multiple clients frequently fetch data from caches (especially data of non-trivial size, since the metrics internally hold references to it) and there are connectivity problems with those clients, either because the clients are slow (due to things like JVM pauses) or because the server configuration/thread pools aren't sufficient to handle them.
Therefore, setting onHeapEnabled = 'True' is not going to address the OOM; if anything, it will only make it worse.
Instead, I would suggest that you enable Near Cache for this specific cache that you mention along with configuring things like nearStartSize & nearEvictionPolicy on the client nodes. That will solve your issue.
Note that near caches are fully transactional and are updated or invalidated automatically whenever the data changes on the server nodes, as described in the docs.
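As a rough sketch (the cache name and size limits are placeholders, assuming a running client-mode Ignite instance in the variable ignite):

    import org.apache.ignite.Ignite;
    import org.apache.ignite.IgniteCache;
    import org.apache.ignite.cache.eviction.lru.LruEvictionPolicy;
    import org.apache.ignite.configuration.NearCacheConfiguration;

    // Client-side near cache for an existing server-side cache.
    NearCacheConfiguration<Integer, byte[]> nearCfg = new NearCacheConfiguration<>();
    nearCfg.setNearStartSize(100_000);                              // initial size of the near map
    nearCfg.setNearEvictionPolicy(new LruEvictionPolicy<>(50_000)); // bound the entries kept on-heap near the client

    IgniteCache<Integer, byte[]> cache = ignite.createNearCache("myCache", nearCfg);

The eviction policy is what keeps the client heap bounded, so size it according to how much of the cache each client actually needs hot.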
Thanks
We are evaluating Apache Ignite for our product. In our scenario we may have 10,000 caches, so I ran a trial with the Yardstick benchmark framework. I found that when the number of caches climbs to 8192, the Ignite server starts misbehaving: the test is expected to finish after 1 minute, since that is the duration I set in the configuration, but it kept running for more than 10 minutes and I had to kill it.
If I set the cache count to 4096, the test finishes in 1 minute as expected.
So the question: does Apache Ignite support 10,000 caches?
One cache will use around 20 MB of heap for its data structures (per node). Multiply that by 10,000 and you get 200 GB right there. In practice, Java will not work with that much heap.
Why do you need 10,000 caches anyway? Please consider at least using Cache Groups, as sketched below. The best approach would be to have a few caches and route between them.
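A minimal sketch of what grouping could look like (the cache and group names are placeholders):

    import org.apache.ignite.configuration.CacheConfiguration;

    // Caches in the same group share internal per-partition data structures,
    // which cuts down the per-cache overhead mentioned above.
    CacheConfiguration<Long, byte[]> cfg1 = new CacheConfiguration<>("cache-1");
    cfg1.setGroupName("sharedGroup");

    CacheConfiguration<Long, byte[]> cfg2 = new CacheConfiguration<>("cache-2");
    cfg2.setGroupName("sharedGroup");

    ignite.getOrCreateCache(cfg1);
    ignite.getOrCreateCache(cfg2);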
Recently I started using Hibernate Search indexing and I'm trying to find a stable solution for a production environment. The setup: in a WildFly 10 AS I index through a Hibernate OGM PersistenceContext, which automatically adds data to the index (Infinispan file cache store).
The problem is that I have an MDB consuming data from a JMS queue, and within one call of onMessage (one queue entry contains around 1 million entities, i.e. big requests) I need to persist those ~1 million entities and publish them to another AMQP queue via a stateless EJB.
While persisting and publishing, I noticed that after a certain amount of time a major GC can no longer free memory, and once the old gen is full, eden fills up as well and the rate of persisting and publishing messages degrades sharply.
My thought is that the onMessage method needs a transaction, and until it finishes it keeps all the data (index or persisted entities) in memory so that a rollback remains possible, which prevents the old gen from being cleaned.
I have attached some monitoring screenshots. You can see that as soon as both memory spaces (old gen and eden) are full and trying to empty, there is a sharp drop in the rate of publishing messages to the other queue (I create entities one by one from the list that arrives in the JMS request, persist them, and publish them in a for loop to a RabbitMQ queue). Is there any way to keep the index always on disk with Infinispan, if that is the cause? I already tried a minimal eviction threshold, a small chunk size, etc., without success. I also tried different GC algorithms but always ended up in the same situation. Maybe another Infinispan persistent file store implementation? I use single-file-cache-store for now and used soft-index cache store before. Any suggestions or thoughts?
Thanks
Hibernate Search 5.6.1, Infinispan 8.2.4, Hibernate OGM 5.1, Wildfly 10
[Screenshots: VisualGC from VisualVM, VisualVM, RabbitMQ, JMS threads, Hibernate Search sync thread]
The latest version of Infinispan (9.2) is able to store data "off heap", so the short answer is yes, it's possible. But consider the big picture before choosing to do that; not all scenarios benefit from off-heap storage, as it depends on a number of factors.
Infinispan is, by design, meant to buffer the hottest data in memory, by default "on heap", as that helps overall performance when you're dealing with plain Java objects, since you can then skip the (de)serialization overhead. You need to tune your heap size to accommodate the load you are planning for; it cannot do that automatically. The easiest strategy is to observe it under a similar load with tools like the ones you used, starting with a very generous heap size, and then trim it down to a reasonable size you know will work for your load.
So first verify that the heap isn't simply too small for peak operation before suspecting a leak or unbounded growth. If there actually is a leak, you might first want to try upgrading, as those versions are quite old and a lot of issues have been fixed since.
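If you do decide to try off-heap, a minimal sketch of the programmatic configuration in Infinispan 9.x could look like the following (the cache name and size bound are placeholders; the same settings can also be expressed in the XML cache configuration):

    import org.infinispan.configuration.cache.Configuration;
    import org.infinispan.configuration.cache.ConfigurationBuilder;
    import org.infinispan.configuration.cache.StorageType;
    import org.infinispan.eviction.EvictionType;
    import org.infinispan.manager.DefaultCacheManager;

    // Off-heap storage with a bounded memory footprint (values are placeholders).
    Configuration cfg = new ConfigurationBuilder()
        .memory()
            .storageType(StorageType.OFF_HEAP)   // keep entries outside the Java heap
            .evictionType(EvictionType.MEMORY)   // bound by bytes used rather than entry count
            .size(1_000_000_000L)                // roughly 1 GB before eviction kicks in
        .build();

    DefaultCacheManager cacheManager = new DefaultCacheManager();
    cacheManager.defineConfiguration("entities", cfg);

Keep in mind this only moves the data out of the heap; it does not remove the (de)serialization cost mentioned above.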
I developed my own plugin for Neo4j in order to speed up node insertion, mainly because I needed to insert nodes and relationships only if they didn't already exist, which can be too slow through the REST API.
If I call my plugin 100 times, inserting roughly 100 nodes and 100 relationships each time, each call takes approximately 350 ms. Each call inserts different nodes, in order to rule out locking as a cause.
However, if I parallelize my calls (2, 3, 4... at a time), the response time degrades in proportion to the degree of parallelism: it takes 750 ms to insert my 200 objects when I make 2 calls at a time, 1000 ms when I make 3, etc.
I'm calling my plugin from a .NET MVC controller, using HttpWebRequest. I set maxConnection to 10000, and I can see all the TCP connections being opened.
I investigated this a little and it seems very wrong; I must have done something wrong, either in my Neo4j configuration or in my plugin code. Using VisualVM I found that the threads launched by Neo4j to handle my calls work sequentially. See the linked picture.
http://i.imgur.com/vPWofTh.png
My configuration:
Windows 8, 2 cores
8 GB of RAM
Neo4j 2.0 M03 installed as a service with no config tuning
Hope someone will be able to help me. As it is, I will be unable to use Neo4j in production, where there will be tens of concurrent calls, which cannot be done sequentially.
Neo4j is transactional. Every commit triggers an I/O operation on the filesystem which needs to run in a synchronized block; this explains the picture you've attached. Therefore it's best practice to run writes single-threaded. Any pre-processing beforehand can of course benefit from parallelization.
In general for maximum performance go with the stable version (1.9.2 as of today). Early milestone builds are not optimized yet, so you might get a wrong picture.
Another thing to consider is the transaction size used in your plugin. 10k to 50k operations in a single transaction should give the best results (see the sketch below). If your transactions are very small, the transactional overhead is significant; with huge transactions, you need lots of memory.
Write performance is heavily driven by the performance of the underlying IO subsystem. If possible use fast SSD drives; even better, stripe them.
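A minimal sketch of that kind of batching inside an embedded plugin (the batch size and node-creation logic are schematic; your "create only if absent" checks go where indicated):

    import java.util.List;
    import java.util.Map;
    import org.neo4j.graphdb.GraphDatabaseService;
    import org.neo4j.graphdb.Node;
    import org.neo4j.graphdb.Transaction;

    public class BatchInserter {
        private static final int BATCH_SIZE = 20_000;   // 10k-50k operations per transaction

        public void insert(GraphDatabaseService db, List<Map<String, Object>> rows) {
            for (int from = 0; from < rows.size(); from += BATCH_SIZE) {
                int to = Math.min(from + BATCH_SIZE, rows.size());
                Transaction tx = db.beginTx();
                try {
                    for (Map<String, Object> row : rows.subList(from, to)) {
                        Node node = db.createNode();     // schematic: look up / create-if-absent here
                        for (Map.Entry<String, Object> e : row.entrySet())
                            node.setProperty(e.getKey(), e.getValue());
                    }
                    tx.success();                        // commit the whole batch at once
                } finally {
                    tx.finish();                         // tx.close() in later Neo4j versions
                }
            }
        }
    }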
As the subject reads: is it important to get dedicated hardware to run a Hadoop cluster, rather than VMs? If yes, what is an acceptable network latency? Is Gigabit Ethernet required? I would like to leverage Hadoop to speed up an ETL process. In trying to do so, I set up a few VMs (512 MB-1 GB RAM, 1 core per VM of a dual-core 2.2 GHz CPU) which are about 500 miles apart, with a network latency of 10-25 ms on 100 Mbps Ethernet. With 3-4 such VMs as nodes, I am unable to match single-machine performance for my ETL process. So I thought I would ask this question here for more insight.
It greatly depends on your tasks, but, generally, it's all important - including network latency, bandwidth, and CPU load/availability.
I can picture a few scenarios where network bandwidth would not be very important - for example, if you've already loaded your data into HDFS, i.e. it's cleanly distributed across all the nodes, and you're going to do a complex computation on that data in the mappers, with no reducers at all or with only a tiny fraction of the data going to reducers. For example, if you're going to count the number of lines in text files, the mappers would read multi-gigabyte files and push only one simple number to the reducers - the number of lines. The reducers would sum up these numbers and push a single answer to the output. Virtually nothing is transferred across the network, so there is no effect on performance.
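For reference, a minimal sketch of that line-counting job in the standard Hadoop MapReduce API (class names here are illustrative):

    import java.io.IOException;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;

    public class LineCount {

        // Each mapper counts its own split locally and emits a single number in cleanup(),
        // so almost nothing travels over the network to the reducer.
        public static class CountMapper extends Mapper<LongWritable, Text, NullWritable, LongWritable> {
            private long lines = 0;

            @Override
            protected void map(LongWritable key, Text value, Context ctx) {
                lines++;
            }

            @Override
            protected void cleanup(Context ctx) throws IOException, InterruptedException {
                ctx.write(NullWritable.get(), new LongWritable(lines));
            }
        }

        // A single reducer sums the per-mapper counts and writes one line of output.
        public static class SumReducer extends Reducer<NullWritable, LongWritable, NullWritable, LongWritable> {
            @Override
            protected void reduce(NullWritable key, Iterable<LongWritable> counts, Context ctx)
                    throws IOException, InterruptedException {
                long total = 0;
                for (LongWritable c : counts)
                    total += c.get();
                ctx.write(NullWritable.get(), new LongWritable(total));
            }
        }
    }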
However, in real life you'd encounter such tasks rather rarely. Usually there is some grouping going on between the mappers and the reducers, so most of the per-group computation is performed by the reducers - i.e. the reducers have to receive all the data from the mappers, which usually hits the network heavily.
If you tell me more about your tasks, I can give a more detailed estimate of what hardware you'd want to use and what the weak points of the current solution are.
Dedicated hardware is always important.
Your VMs definitely don't have enough RAM; network latency will matter, but 100 Mbps is probably enough with 3-4 nodes.