I have been looking for high-performance file storage solution to be used for persisting soap messages in Java EE environment.
We are currently using a CLOB table on Oracle RMDBS, but it is very expensive to scale. While oracle works well for storing the related metadata, it doesn't perform too well with the message content. Insert on a table with a CLOB gives roughly 1000% worse performance than one without it (This was measured by comparing performance of VARCHAR2(4000)-insert to CLOB-insert when in row storage has been disabled for CLOB)
Persisting messages on file system is one option, but I have some serious doubts how an average file systems would perform storing millions of files per day. Considering we have to keep those files for several months, it just doesn't sound right.
I know there are several open source key-value databases (jackrabbit, mongodb to name few) that might be up for the task, but I just can't find time to evaluate them all. I would also like to hear about performance of open source RMDBS.
Considering that volume of transmitted messages is ever increasing, priority is on low latency and high performance. We do not require clustering or transactionality and (minor) data loss on system failure is acceptable.
Requirements:
Must be able to maintain rate of at least 100persisted messages/sec when message size is 8kilobytes
Must be able to store at least 100million messages
Must support deletion of persisted messages by age
Must support persisting while deletion is in progress
Must support retrieval of message by id
Help is appreciated
Here is nice comparison between MongoDB and SQL Server (I believe Oracle will have similar performance). You can see from charts that Mongo can handle 20 000 inserts per second. Mongo has also query language based on JSON which can do almost everything like regular SQL and it has Sharded Clusters and Replica sets which can handle all neccesary backups and failover (some basic info here).
Also, if you are interested in digging little bit deeper, 10 gen has an online course starting in two weeks awarded with a certificate.
You can try the following products:
HBase
MongoDB
Cassandra
Solr 4.0 (only)
These are the guys that I have any experience. There are a lot of other good products that can do what you want in the market.
Some observations: none of them have this "delete by age" feature out-of-the-box, as far as I know it. But it should be really simple to implement it. Easier in MogoDB I must assume.
If you will try Solr, you should stick with versions 4.X as these are the only ones with support to near realtime commits, and it will affect your "delete and insert" requirement.
All of them have great performance, but I did not run a benchmark with your requirement. If I were you I would make my own benchmarks.
Oracle11g has the data deduplication featured introduced. This feature will improve the performance of the oracle database with clob.
This is what I've discovered so far. I will try to update this answer after evaluating each product.
I started my experiments using MongoDB, which on paper looked like a viable option. Here's a summary of my findings:
Written in C++
Replication (replicaset) requires 3 nodes for high availability
One of the nodes is elected as a master - only the master can write
Scaling out is done by sharding (partitioning)
Each shard is essentially a replicaset - therefore sharded environment requires atleast 6 nodes for high availability
mongod instance consumes all available memory - virtualization should be used for resource partitioning (if you intend to run application server on same hardware)
Master re-election may take up to 1minute
Document collections (tables) use exclusive lock during write operation
Java API is exceptionally easy to use and includes a virtual filesystem called GridFS
Single node write performance on test system was ~20000 inserts/sec for 1kbyte document
Single node read performance was ~20000 read/sec for 1kbyte document
The fact that MongoDB would require 6 nodes on a two data center configuration, made me look further for more cost-efficient solutions.
Apache Cassandra:
Written in Java
Replication requires 3 nodes for high availability
Database survives network partitioning
Replication algorithm has been designed for multiple data centers
All nodes are writable
Scaling out can be done by adding more nodes (up to a certain limit)
Cassandra may require JVM garbage collection tuning
Java API is not the easiest to work with
Single node write performance was ~7000 inserts/sec for 1kbyte document
Single node read performance was ~7000 reads/sec for 1kbyte document
While Cassandra was slower in a single node configuration, write performance on a high availability configuration would match MongoDB's performance. The ability perform writes on every node (even during network partitioning) is very welcome addition for logging.
Couchbase:
Unfortunately I was unable to test Couchbase.
For now we'll keep using Oracle SecureFiles. Would we run out of resources on Oracle, both Cassandra and MongoDB seem like viable alternatives.
Related
In my use case the data is relatively small (~1000.000 Strings), but i have to access as fast as possible (every nano sec counts), from a multithreaded environment (implemented in pure Java)
Currently I'm using redis (in localhost) and I'm basically happy with it, but i want to know if there is some better alternative, since redis has all the network stuff, and is not designed for multithred stuff. The persistence is also very low priority for my use case.
I want to run in the same machine (no networking at all)
I want to be as fast as possible
Relativity small data (my current Redis instance is about 20MB max in memory)
i don't want to :
use other solution than NoSql database.
There are lots of great NoSQL databases that function as a key-value store. Each have unique capabilities.
Redis is great in a single server and is dead easy to install and use. But Redis becomes difficult to shard and manage when your data outgrows beyond a single server.
Thumbtack Technologies (of NYC) published two white papers comparing performance and reliability of MongoDB, Cassandra and Aerospike. The papers are very objective, the benchmarks where done using the YCSB benchmarking tool and were conducted on the same hardware.
Which one to used depends on what you need.
MongoDB is a feature rich key-value store with lots of nice programmer features. It offers queries on secondary indexes and is a very good document store. It's a In-memory database so all you data must fit into RAM. Mongo can be clustered and I have heard that it becomes tricky to manage if you have a big cluster.
CouchBase is great for storing large amounts of data and a portion of that data is cached in RAM. So its very quick if the value you are after is in the cache working set. This is great if your use case mostly works with hot data and accesses cold data less often.
Cassandra is really good for a 'write heavy' use case. Its easy to use and is a good programmer experience. It is written in Java and periodically pauses while it does GC, so you need to tune you GC parameters.
Aerospike is good for storing large amounts of data in a small number of servers. It boasts single digit millisecond (or better) latencies, high availability and high reliability, and it is probably (IMHO) the easiest to maintain and scale. It is multi-code aware, NUMA node aware and has a self-healing zero touch cluster technology. It's great for "real-time" use cases where access to any record needs to be fast and predictable. Aerospike is my favorite.
Cassandra, CouchBase, MongoDB and Aerospike all have an "analytics" capability, and which one you choose depends on the use case and your performance envelope.
You have 1 million strings?
That's a tiny amount of data. If you want speed than nothing will be faster than just using in-memory data structure inside your application code itself. Just store all the data in a file, load up into a list on program startup then serialize back to the file if you need to save it.
Avoid all the overhead of running and interacting with a database - especially you don't care about persistence.
A simple flat file with each line being a separate string will take about 100ms to read and parse.
I have a Java application deployed on a cluster of JBoss AS 5.1 which requires a lot (> 3 GB) of data to be cached.
Right now the server cluster has just 2 nodes (separate machines).
Here are specific requirements:
Both nodes should not require data to be loaded into cache (i.e., there should either be replication or cache should reside on a separate server)
The data should never expire.
Both of the above requirements are REALLY important for the application. I'd be thankful if the suggestion would be made keeping both of these in mind.
I should also add a third requirement:
ease of use
The application was initially using Hashmap. I tried replacing the hashmap with JBoss Cache 3.2.1 for its replication and thread safety features. But i'm not really happy with JBoss Cache performance. Also when i load the data in the cache the 8 Gig of RAM is almost entirely being used (most of it is used by the cache entries).
I'd like to hear the experience of people who have handled such kind of caching scenario themselves. Thanks for your time in advance.
You can try out using GigaSpaces XAP datagrid is a replicated cache. It is very highly performant.
http://www.gigaspaces.com/datagrid
If you want a cache that provides a Java HashMap interface and can easily support gigabytes of cache data, with no expiry, then check out Oracle Coherence. This would use the Coherence "distributed cache" option (which is the default configuration). For more info, see: http://coherence.oracle.com/
Elastic. Just add nodes. Auto-discovery. Auto-load-balancing. No data loss. No interruption. Every time you add a node, you get more data capacity and more throughput.
Use both RAM and flash. Transparently. Easily handle 10s or even 100s of gigabytes per Coherence node (e.g. up to a TB or more per physical server).
Automatic high availability (HA). Kill a process, no data loss. Kill a server, no data loss.
Datacenter continuous availability (CA). Kill a data center, no data loss.
For the sake of full disclosure, I work at Oracle. The opinions and views expressed in this post are my own, and do not necessarily reflect the opinions or views of my employer.
I have 'solved' this problem before (work code, can't show you)... but, I can tell you this much:
with large volumes, a large amount of memory is used in overhead in HashMaps.
you can save a lot of memory by replacing java.util.* classes with smart uses of arrays.
every time you have memory allocations you also have to scan/collect that memory in the GC, so saving memory also improves performance.
Wherever you can, use arrays....
Edit: Apparently the concept of Hash Maps has been forgotten.... Has the Java implementation of HashMap made people believe it is the only way? A structured set of arrays, with a hash function, and a binary sort.... all basic structures... http://en.wikipedia.org/wiki/Hash_table
One array to add keys to. A parallel array to store the values in, and an int-based hash table to make a fast lookup in to the key array...
Computer Science - maybe second year ;-)
Edit again: I used to core concepts I have described in the JDOM project here: https://github.com/hunterhacker/jdom/blob/master/core/src/java/org/jdom2/StringBin.java
I am doing POC on Posgtresql replication. I am using latest version of postgresql i.e. 9.1. There are multiple replication solutions avaliable in the market (PGCluster, Pgpool-II, Slony-I). Postgresql also provide in-built replication solutions (Streaming replication, Warm Standby and hot standby). I am confused which solution is best for the financial application for which I am doing POC. The application will write around 160 million records with row size of 2.5 KB in database. My questions is for following scenarios which replication solution will be suitable:
If I would require replication for backup purpose only
If I would require to scale the reads
If I would require High Avaliability and Consistency
Also It will be very helpful if you can share the perfomance or experience with postgresql replication solutions.
The short answer is "whatever your problem is, there is a solution."
Let's look at just a few of he main ones.
Slony-I is a replication solution that allows you to scale reads across part or all of your database. This is designed so that you could take part of your database and replicate it into your DMZ for, say, customer reports. On the other hand, this flexibility begets complexity, and while Slony lets you replicate only part of your database, Slony lets you replicate only part of your database...... Also Slony's flexibility doesn't stop there. It allows you to replicate across different versions of Pgsql, therefore making sure you have zero downtime for read queries during major upgrades.
Postgres-XC is really the successor in spirit to PGCluster. It offers Teradata-style clustering for PostgreSQL. If you really need to scale reads and writes, this is the solution for you but again that adds complexity too.
The built-in replication solutions are simplest, allow you to scale for purposes of taking backups and taking writes. It ensures high availability and consistency but a major upgrade requires downtime of all nodes.
So the thing is you need to figure out exactly what you want and then look for help in selecting the right tool for the job. I would recommend asking on the pgsql-general email lists when you get to that point.
I am building an application that includes a feature to bulk tag millions of records, more or less interactively. The user interaction is very similar to Gmail where users can tag individual emails, or bulk tag large amounts of emails. I also need quick read access to these tag memberships as well, and where the read pattern is more or less random.
Right now we're using Mysql and inserting one row for every tag-document pair. Writing millions of rows to Mysql takes a while (high I/O), even with bulk insertions and heavy optimization. We need this to be an interactive process, not a batch process.
For the data that we're storing and reading, consistency and availability of the data are not as important as performance and scalability. So in the event of system failure while the writes are occurring, I can deal with some data loss. However, the data definitely needs to be persisted to secondary storage at some point.
So, to sum up, here are the requirements:
Low latency bulk writes of potentially tens of millions of records
Data needs to be persisted in some way
Low latency random reads
Durable writes not required
Eventual consistency is okay
Here are some solutions I've looked at:
Write behind caches (Terracotta, Gigaspaces, Coherence) where records are written to memory and drained to the database asynchronously. These scare me a little because they appear to add a certain amount of complexity to the app that I'd want to avoid.
Highly scalable key-value stores, like MongoDB, HBase, Tokyo Tyrant
If you have the budget to use Coherence for this, I highly recommend doing so. There is direct support for write-behind, eventual consistency behavior in Coherence and it is very survivable to both a database outage and Coherence cluster node outages (if you use >= 3 Coherence nodes on separate JVMs, preferably on separate hosts). I have implemented this for doing high-volume CRM for a Fortune 100 company's e-commerce site and it works fantastically.
One of the best aspects of this architecture is that you write your Java application code as if none of the write-behind behavior were taking place, and then plug in the Coherence topology and configuration that makes it happen. If you need to change the behavior or topology of Coherence later, no change in your application is required. I know there are probably a handful of reasonable ways to do this, but this behavior is directly supported in Coherence rather than having to invent or hand-roll a way of doing it.
To make a really fine point - your worry about adding application complexity is a good one. With Coherence, you simply write updates to the cache (or if you're using Hibernate it can be the L2 cache provider). Depending upon your Coherence configuration and topology, you have the option to deploy your application to use write-behind, distributed, caches. So, your application is no more complex (and, frankly unaware) due to the features of the cache.
Finally, I implemented the solution mentioned above from 2005-2007 when Coherence was made by Tangosol and they had the best possible support. I'm not sure how things are now under Oracle - hopefully still good.
I've worked on a large project that used asyncrhonous writes althoguh in that case it was just hand-written using background threads. You could also implement something like that by offloading the db write process to a JMS queue.
One thing that will certainly speed up db writes is to do them in batches. JDBC batch updates can be orders of magnitude faster than individual writes, and if you're doing them asynchronously you can just write them 500 at a time.
Depending on how your data is organized perhaps you would be able to use sharding,
if the read latency isn't low enough you can also try to add caching. Memcache is one popular solution.
Berkeley DB has a very high performance disk-based hash table that supports transactions, and integrates with a Java EE environment if you need that. If you're able to model the data as key/value pairs, this can be a very scalable solution.
http://www.oracle.com/technology/products/berkeley-db/je/index.html
(Note: oracle bought berkeley db about 5-10 years ago; the original product has been around for 15-20 years).
I need to store records into a persistant storage and retrieve it on demand. The requirement is as follows:
Extremely fast retrieval and insertion
Each record will have a unique key. This key will be used to retrieve the record
The data stored should be persistent i.e. should be available upon JVM restart
A separate process would move stale records to RDBMS once a day
What do you guys think? I cannot use standard database because of latency issues. Memory databases like HSQLDB/ H2 have performace contraints. Moreover the records are simple string objects and do not qualify for SQL. I am thinking of some kind of flat file based solution. Any ideas? Any open source project? I am sure, there must be someone who has solved this problem before.
There are lot of diverse tools and methods, but I think none of them can shine in all of the requirements.
For low latency, you can only rely on in-memory data access - disks are physically too slow (and SSDs too). If data does not fit in the memory of a single machine, we have to distribute our data to more nodes summing up enough memory.
For persistency, we have to write our data to disk after all. Supposing optimal organization
this can be done as background activity, not affecting latency.
However for reliability (failover, HA or whatever), disk operations can not be totally independent of the access methods: we have to wait for the disks when modifying data to make shure our operation will not disappear. Concurrency also adds some complexity and latency.
Data model is not restricting here: most of the methods support access based on a unique key.
We have to decide,
if data fits in the memory of one machine, or we have to find distributed solutions,
if concurrency is an issue, or there are no parallel operations,
if reliability is strict, we can not loose modifications, or we can live with the fact that an unplanned crash would result in data loss.
Solutions might be
self implemented data structures using standard java library, files etc. may not be the best solution, because reliability and low latency require clever implementations and lots of testing,
Traditional RDBMS s have flexible data model, durable, atomic and isolated operations, caching etc. - they actually know too much, and are mostly hard to distribute. That's why they are too slow, if you can not turn off the unwanted features, which is usually the case.
NoSQL and key-value stores are good alternatives. These terms are quite vague, and cover lots of tools. Examples are
BerkeleyDB or Kyoto Cabinet as one-machine persistent key-value stores (using B-trees): can be used if the data set is small enough to fit in the memory of one machine.
Project Voldemort as a distributed key-value store: uses BerkeleyDB java edition inside, simple and distributed,
ScalienDB as a distributed key-value store: reliable, but not too slow for writes either.
MemcacheDB, Redis other caching databases with persistency,
popular NoSQL systems like Cassandra, CouchDB, HBase etc: used mainly for big data.
A list of NoSQL tools can be found eg. here.
Voldemort's performance tests report sub-millisecond response times, and these can be achieved quite easily, however we have to be careful with the hardware too (like the network properties mentioned above).
Have a look at LinkedIn's Voldemort.
If all the data fits in memory, MySQL can run in memory instead of from disk (MySQL Cluster, Hybrid Storage). It can then handle storing itself to disk for you.
What about something like CouchDB?
I would use a BlockingQueue for that. Simple, and built into Java.
I do something similar using realtime data from Chicago Merchantile Exchange.
The data is sent to one place for realtime use... and to another place (via TCP),
using a BlockingQueue (Producer/Consumer) to persist the data to a database (Oracle,H2).
The Consumer uses a time delayed commit to avoid fdisk sync issues in the database.
(H2 type databases are asyncronous commit by default and avoid that issue)
I log the persisting in the Consumer to keep track of the queue size to be sure
it is able to keep up with the Producer. Works pretty good for me.
MySQL with shards may be a good idea. However, it depends on what is the data volume, transactions per second and latency you need.
In memory databases are also a good idea. In fact MySQL provides memory-based tables as well.
Would a Tuple space / JavaSpace work? Also check out other enterprise data fabrics like Oracle Coherence and Gemstone.
MapDB provides highly performant HashMaps/TreeMaps that are persisted to disk. Its a single library that you can embed in your Java program.
Have you actually proved that using an out-of-process SQL database like MySQL or SQL Server is too slow, or is this an assumption?
You could use a SQL database approach in conjunction with an in-memory cache to ensure that retrievals do not hit the database at all. Despite the fact that the records are plaintext I would still advise using SQL over a flat file solution (e.g. using a text column in your table schema) as the RDBMS will perform optimisations that a file system cannot (e.g. caching recently accessed pages, etc).
However, without more information about your access patterns, expected throughput, etc. I can't provide much more in the way of suggestions.
If you are looking for a simple key-value store and don't need complex sql querying, Berkeley DB might be worth a look.
Another alternative is Tokyo Cabinet, a modern DBM implementation.
How bad would it be if you lose a couple of entries in case of a crash?
If it isn't that bad the following approach might work for you:
Create flat files for each entry, name of file equals id. Possible one file for a not so big number of consecutive entries.
Make sure your controller has a good cache and/or use one of the existing caches implemented in Java.
Talk to a file system expert how to make this really fast
It is simple and it might be fast.
Of course you lose transactions including the ACID principles.
Sub millisecond r/w means you cannot depend on disk, and you have to be careful about network latency. Just forget about standard SQL based solutions, main-memory or not. In a ms, you cannot get more than 100 KByte over a GBit network. Ask a telecom engineer, they are used to solving these kind of problems.
How much does it matter if you lose a record or two? Where are they coming from? Do you have a transactional relationship with the source?
If you have serious reliability requirements then I think you may need to be prepared to pay some DB Overhead.
Perhaps you could separate the persistence problem from the in-memory problem. Use a pup-sub approach. One subscriber look after in-memory, the other persisting the data ready for subsequent startup?
Distributed cahcing products such as WebSphere eXtreme Scale (no Java EE dependency) might be relevent if you can buy rather than build.
Chronicle Map is a ConcurrentMap implementation which stores keys and values off-heap, in a memory-mapped file. So you have persistence on JVM restart.
ChronicleMap.get() is consistently faster than 1 us, sometimes as fast as 100 ns / operation. It's the fastest solution in the class.
Will all the records and keys you need fit in memory at once? If so, you could just use a HashMap<String,String>, since it's Serializable.