JCS Cache shutdown, guaranteed persistence to disk - java

I am using JCS for caching. Right now I am using the disk cache to temporarily store all the data. The problem is that with JCS, the keys are written to disk only if the cache is shut down properly.
I am using the disk usage pattern UPDATE, which tells JCS to write data to disk immediately without keeping it in memory. But we do not maintain our own list of the keys of the objects in the cache, so I use group cache access to get the keys from the cache and then iterate through them to get the results.
So now I am in a situation where I have to shut the cache down properly, i.e. only after all the data has been written to disk by the indexed disk cache. The complication is that the indexed disk cache uses a background thread to write to disk, and that thread does not report anything about its status.
So I cannot guarantee to my front-end implementation that the indexed disk cache has actually written the data to disk. Is there a way to handle this? Right now I just sleep for some arbitrary time (say 10 seconds) before shutting the cache down, which is obviously a poor way of doing it.
Edit: I am facing this issue with the memory cache as well, but there a sleep of one second is usually enough for 500 MB of data. The disk cache case is a little different.

It could be that your objects are still held in memory, waiting to be written to disk. If you need objects written to disk immediately during execution, set MaxObjects to 0 in your cache configuration:
jcs.region.<yourRegion>.cacheattributes.MaxObjects=0
jcs.region.<yourRegion>.cacheattributes.DiskUsagePattern=UPDATE
I know you are already aware of UPDATE; I'm adding it again for reference.
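For reference, here is a minimal sketch of shutting the cache down explicitly instead of relying on a JVM shutdown hook; this gives the indexed disk cache a chance to drain its write queue and persist its key file. It assumes the older org.apache.jcs 1.x API, and the region name "yourRegion" is just a placeholder; I believe newer Apache Commons JCS releases also expose an equivalent JCS.shutdown() helper.

import org.apache.jcs.JCS;
import org.apache.jcs.engine.control.CompositeCacheManager;

public class CacheShutdownExample {
    public static void main(String[] args) throws Exception {
        JCS cache = JCS.getInstance("yourRegion");
        cache.put("key-1", "value-1");

        // Dispose all regions explicitly so the indexed disk cache can
        // finish its pending writes and save its key data to disk.
        CompositeCacheManager.getInstance().shutDown();
    }
}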

Related

java fastest concurrent random file R/W method for SSDs without memory swap

I have a Linux box with 32 GB of RAM and a set of 4 SSDs in a RAID 0 config that maxes out at about 1 GB/s of throughput (random 4k reads), and I am trying to determine the best way of accessing files on them randomly and concurrently from Java. The two main approaches I have seen so far are RandomAccessFile and mapped direct byte buffers.
Here's where it gets tricky, though. I have my own in-memory cache for objects, so any access to objects stored in a file should go through to disk, not paged memory (I have disabled swap space on my Linux box to prevent this). Whilst mapped direct memory buffers are supposedly the fastest, they rely on swapping, which is bad because: A) I am using all the free memory for the object cache; using MappedByteBuffers instead would incur a massive serialization overhead, which is exactly what the object cache is there to prevent (and my program is already CPU limited). B) With MappedByteBuffers the OS handles when data is written to disk; I need to control this myself, i.e. when I write(byte[]) it should go straight out to disk instantly, to prevent data corruption in case of power failure, since I am not using ACID transactions.
On the other hand I need massive concurrency, i.e. I need to read and write multiple locations in the same file at the same time (using offset/range locks to prevent data corruption). I'm not sure how I can do this without MappedByteBuffers; I could always just queue the reads/writes, but I'm not sure how badly that would affect my throughput.
Finally, I cannot have a situation where I am creating new byte[] objects for reads or writes, because I perform almost 100,000 read/write operations per second; allocating and garbage collecting all those objects would kill my program, which is time sensitive and already CPU limited. Reusing byte[] objects is fine, though.
Please do not suggest any DB software, as I have tried most of them and they add too much complexity and CPU overhead.
Has anybody had this kind of dilemma?
Whilst mapped direct memory buffers are supposedly the fastest they rely on swapping
No, not if you have enough RAM. The mapping associates pages in memory with pages on disk. Unless the OS decides that it needs to recover RAM, the pages won't be swapped out. And if you are running short of RAM, all that disabling swap does is cause a fatal error rather than a performance degradation.
I am using all the free memory for the object cache
Unless your objects are extremely long-lived, this is a bad idea because the garbage collector will have to do a lot of work when it runs. You'll often find that a smaller cache results in higher overall throughput.
with mappedbytebuffers the OS handles the details of when data is written to disk, I need to control this myself, ie. when I write(byte[]) it goes straight out to disk instantly
Actually, it doesn't, unless you've mounted your filesystem with the sync option. And then you still run the risk of data loss from a failed drive (especially in RAID 0).
I'm not sure how I can do this without mappedbytebuffers
A RandomAccessFile will do this. However, you'll be paying for at least a kernel context switch on every write (and if you have the filesystem mounted for synchronous writes, each of those writes will involve a disk round-trip).
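As a rough sketch of that route (illustrative only), you can take the FileChannel from a RandomAccessFile and use positional reads and writes; they don't touch the channel's file pointer, so multiple threads can hit different offsets of the same file concurrently, and you can reuse ByteBuffers to avoid allocation. Opening the file in "rwd" mode asks for synchronous content writes, which approximates the "straight to disk" requirement.

import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

public class PositionalIo {
    private final FileChannel channel;

    public PositionalIo(String path) throws Exception {
        // "rwd" requests that updates to the file content be written synchronously.
        this.channel = new RandomAccessFile(path, "rwd").getChannel();
    }

    // Positional reads/writes ignore the channel's file pointer, so several
    // threads can call these concurrently on the same channel.
    public int readAt(long offset, ByteBuffer reusableBuffer) throws Exception {
        reusableBuffer.clear();
        int n = channel.read(reusableBuffer, offset);
        reusableBuffer.flip();
        return n;
    }

    public int writeAt(long offset, ByteBuffer reusableBuffer) throws Exception {
        return channel.write(reusableBuffer, offset);
    }
}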
I am not using ACID transactions
Then I guess the data isn't really that valuable. So stop worrying about the possibility that someone will trip over a power cord.
Your objections to mapped byte buffers don't hold up. Your mapped files will be distinct from your object cache, and though they take address space they don't consume RAM. You can also sync your mapped byte buffers whenever you want (at the cost of some performance). Moreover, random access files end up using the same apparatus under the covers, so you can't save any performance there.
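To illustrate the "sync whenever you want" point, a minimal sketch (file name and sizes are made up): MappedByteBuffer.force() flushes that mapping's dirty pages to the storage device on demand.

import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

public class MappedSyncExample {
    public static void main(String[] args) throws Exception {
        try (FileChannel ch = new RandomAccessFile("data.bin", "rw").getChannel()) {
            MappedByteBuffer map = ch.map(FileChannel.MapMode.READ_WRITE, 0, 4096);
            map.putLong(0, 42L);
            // Explicitly push this mapping's dirty pages out to the device,
            // instead of leaving the timing entirely to the OS.
            map.force();
        }
    }
}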
If mapped byte buffers aren't getting you the performance you need, you might have to bypass the filesystem and write directly to raw partitions (which is what DBMSs do). To do that, you probably need to write C++ code for your data handling and access it through JNI.

Key/Value store extremely slow on SSD

What I am sure of:
I am working with Java/Eclipse on Linux and trying to store a very large number of key/value pairs of 16/32 bytes respectively on disk. Keys are fully random, generated with SecureRandom.
The speed is constant at ~50000 inserts/sec until it reaches ~1 million entries.
Once this limit is reached, the java process oscillates every 1-2 seconds from 0% CPU to 100%, from 150MB of memory to 400MB, and from 10 inserts/sec to 100.
I tried with both Berkeley DB and Kyoto Cabinet and with both Btrees and Hashtables. Same results.
What might contribute:
It's writing to an SSD.
For every insert there are on average 1.5 reads; reads and writes alternate constantly.
I suspect the nice 50,000/sec rate holds until some cache/buffer limit is reached. Then the big slowdown might be due to the SSD not handling mixed reads and writes well, as suggested in this question: Low-latency Key-Value Store for SSD.
Question is:
Where might this extreme slowdown come from? It can't be entirely the SSD's fault. Lots of people happily use SSDs for high-speed DB workloads, and I'm sure they mix reads and writes a lot.
Thanks.
Edit: I've made sure to remove any memory limit, and the Java process always has room to allocate more memory.
Edit: Removing the reads and doing inserts only does not change the problem.
Last edit: For the record, with hash tables the problem seems related to the initial number of buckets. In Kyoto Cabinet that number cannot be changed after creation and defaults to ~1 million, so it is better to get it right at creation time (1 to 4 times the maximum number of records to store). BDB is designed to grow the number of buckets progressively, but since that is resource consuming, it is better to predefine the number in advance.
Your problem might be related to the strong durability guarantees of the databases you are using.
Basically, for any database that is ACID-compliant, at least one fsync() call per database commit will be necessary. This has to happen in order to guarantee durability (otherwise, updates could be lost in case of a system failure), but also to guarantee internal consistency of the database on disk. The database API will not return from the insert operation before the completion of the fsync() call.
fsync() can be a very heavy-weight operation on many operating systems and disk hardware, even on SSDs. (An exception to that would be battery- or capacitor-backed enterprise SSDs - they can treat a cache flush operation basically as a no-op to avoid exactly the delay you are probably experiencing.)
A solution would be to do all your stores inside one big transaction. I don't know about Berkeley DB, but for SQLite, performance can be greatly improved that way.
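As a rough illustration of the one-big-transaction idea (this assumes the xerial sqlite-jdbc driver is on the classpath, and the kv table is made up), disabling auto-commit and committing once amortizes the fsync() cost over the whole batch instead of paying it per insert:

import java.security.SecureRandom;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.Statement;

public class BatchedInserts {
    public static void main(String[] args) throws Exception {
        SecureRandom rnd = new SecureRandom();
        try (Connection conn = DriverManager.getConnection("jdbc:sqlite:store.db")) {
            try (Statement st = conn.createStatement()) {
                st.execute("CREATE TABLE IF NOT EXISTS kv (k BLOB PRIMARY KEY, v BLOB)");
            }
            conn.setAutoCommit(false); // one transaction for the whole batch
            try (PreparedStatement ps =
                     conn.prepareStatement("INSERT INTO kv (k, v) VALUES (?, ?)")) {
                for (int i = 0; i < 100_000; i++) {
                    byte[] key = new byte[16];   // 16-byte random key, as in the question
                    byte[] value = new byte[32]; // 32-byte value
                    rnd.nextBytes(key);
                    rnd.nextBytes(value);
                    ps.setBytes(1, key);
                    ps.setBytes(2, value);
                    ps.addBatch();
                }
                ps.executeBatch();
            }
            conn.commit(); // a single durable commit instead of 100,000 fsync-backed ones
        }
    }
}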
To figure out if that is your problem at all, you could try to watch your database writing process with strace and look for frequent fsync() calls (more than a few every second would be a pretty strong hint).
Update:
If you are absolutely sure that you don't require durability, you can try the answer from Optimizing Put Performance in Berkeley DB; if you do, you should look into the TDS (transactional data storage) feature of Berkeley DB.

Caching large amounts of spatial data in Java - is it feasible?

I have run into a situation in which I would like to store an in-memory cache of spatial data which is not immediately needed, and is not loaded from disk, but generated algorithmically. Because the data is accessed spatially, data would be deleted from the cache based on irrelevance factors and the distance from the location of the most recent read operation. The problem is that Java's garbage collection does not seem to integrate well with this system. I would like to use the spatial knowledge of the data to enable it to be garbage-collected by the JVM. Is there a way to mark these cache objects as garbage-collectible? If the JVM encounters an out-of-memory exception, is there a way to catch that exception and delete the cache objects to free up memory?
Or is this the wrong way to do things?
Is there a way to mark these cache objects as garbage-collectible?
The simplest way is to store
some data with strong references, e.g. in a LinkedHashMap, possibly as an LRU cache.
data which you would like to retain if possible in a SoftReference-based cache (see the sketch below). These will not be cleaned up immediately, but will be cleaned up before an OOME.
data which can be discarded with little cost in a WeakHashMap. This data is available only until the next GC runs.
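A minimal sketch of the middle option (names are illustrative, not a drop-in implementation): a map whose values are SoftReferences, so cached entries survive until the JVM actually comes under memory pressure.

import java.lang.ref.SoftReference;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class SoftCache<K, V> {
    // Values are held softly, so the GC may reclaim them under memory
    // pressure before an OutOfMemoryError would be thrown.
    private final Map<K, SoftReference<V>> map = new ConcurrentHashMap<>();

    public void put(K key, V value) {
        map.put(key, new SoftReference<>(value));
    }

    public V get(K key) {
        SoftReference<V> ref = map.get(key);
        V value = (ref == null) ? null : ref.get();
        if (value == null) {
            map.remove(key); // entry was collected (or never present); caller regenerates it
        }
        return value;
    }
}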
If the JVM encounters an out-of-memory exception, is there a way to catch that exception and delete the cache objects to free up memory?
You can do this, but it's not ideal, as the error can be thrown anywhere, in just about any thread.

memcached and performance

I might be asking a very basic question, but I could not find a clear answer by googling, so I am putting it here.
Memcached caches information in a separate process. Thus, getting the cached information requires inter-process communication (which in Java generally means serialization). That means that to fetch a cached object, we generally need to get a serialized object and transport it over the network.
Both serialization and network communication are costly operations. If memcached needs both of these (generally speaking; there may be cases where network communication is not required), then how is memcached fast? Isn't replication a better solution?
Or is this a tradeoff of distribution/platform independence/scalability vs. performance?
You are right that looking something up in a shared cache (like memcached) is slower than looking it up in a local cache (which is what I think you mean by "replication").
However, the advantage of a shared cache is that it is shared, which means each user of the cache has access to more cache than if the memory was used for a local cache.
Consider an application with a 50 GB database, with ten app servers, each dedicating 1 GB of memory to caching. If you used local caches, then each machine would have 1 GB of cache, equal to 2% of the total database size. If you used a shared cache, then you have 10 GB of cache, equal to 20% of the total database size. Cache hits would be somewhat faster with the local caches, but the cache hit rate would be much higher with the shared cache. Since cache misses are astronomically more expensive than either kind of cache hit, slightly slower hits are a price worth paying to reduce the number of misses.
Now, the exact tradeoff does depend on the exact ratio of the costs of a local hit, a shared hit, and a miss, and also on the distribution of accesses over the database. For example, if all the accesses were to a set of 'hot' records that were under 1 GB in size, then the local caches would give a 100% hit rate, and would be just as good as a shared cache. Less extreme distributions could still tilt the balance.
In practice, the optimum configuration will usually (IMHO!) be to have a small but very fast local cache for the hottest data, then a larger and slower cache for the long tail. You will probably recognise that as the shape of other cache hierarchies: consider the way that processors have small, fast L1 caches for each core, then slower L2/L3 caches shared between all the cores on a single die, then perhaps yet slower off-chip caches shared by all the dies in a system (do any current processors actually use off-chip caches?).
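A small sketch of that hierarchy (purely illustrative; the shared tier is abstracted as a lookup function, which in practice would wrap a memcached client call): a tiny local LRU cache answers the hottest keys, and everything else falls through to the shared cache.

import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

public class TwoLevelCache<K, V> {
    private final Map<K, V> local;                 // small, fast, per-process
    private final Function<K, V> sharedLookup;     // e.g. wraps a memcached get

    public TwoLevelCache(int localCapacity, Function<K, V> sharedLookup) {
        this.sharedLookup = sharedLookup;
        this.local = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > localCapacity;     // evict the least recently used entry
            }
        };
    }

    public synchronized V get(K key) {
        V value = local.get(key);
        if (value == null) {
            value = sharedLookup.apply(key);       // slower shared tier (may itself miss)
            if (value != null) {
                local.put(key, value);
            }
        }
        return value;
    }
}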
You are neglecting the cost of disk I/O in your consideration, which is generally going to be the slowest part of any process, and is the main driver IMO for utilizing in-memory caching like memcached.
Memory caches use RAM over the network. Replication uses both RAM and persistent disk storage to fetch data. Their purposes are very different.
If you're only thinking of using memcached to store easily obtainable data, such as a 1-1 mapping of table records, you're going to have a bad time.
On the other hand, if your data is the entire result set of a complex SQL query that may even overflow the SQL memory pool (and need to be temporarily written to disk to be fetched), you're going to see a big speed-up.
The previous example mentions needing to write data to disk for a read operation: yes, that happens if the result set is too big for memory (imagine a CROSS JOIN), which means you both read from and write to that drive (thrashing comes to mind).
In a highly optimized application written in C, for example, you may have a total processing time of 1 microsecond, yet have to wait for networking and/or serialization/deserialization (marshalling/unmarshalling) for much longer than the application's execution time itself. That's when you'll begin to feel the limitations of memory caching over the network.

Using a concurrent hashmap to reduce memory usage with threadpool?

I'm working with a program that runs lengthy SQL queries and stores the processed results in a HashMap. Currently, to get around the slow execution time of each of the 20-200 queries, I am using a fixed thread pool and a custom callable to do the searching. As a result, each callable is creating a local copy of the data which it then returns to the main program to be included in the report.
I've noticed that 100 query reports, which used to run without issue, now cause me to run out of memory. My speculation is that because these callables are creating their own copy of the data, I'm doubling memory usage when I join them into another large HashMap. I realize I could try to coax the garbage collector to run by attempting to reduce the scope of the callable's table, but that level of restructuring is not really what I want to do if it's possible to avoid.
Could I improve memory usage by replacing the callables with runnables that instead of storing the data, write it to a concurrent HashMap? Or does it sound like I have some other problem here?
Don't create copies of the data; just pass references around, ensuring thread safety where needed. If you still run out of memory without copying the data, consider increasing the maximum heap available to the application.
The drawback of not copying the data is that thread safety is harder to achieve, though.
Do you really need all 100-200 reports at the same time?
Maybe it's worth limiting the 1st level of caching to just 50 reports and introducing a 2nd level based on a WeakHashMap?
When the 1st level exceeds its size, the least recently used entries are pushed down to the 2nd level, which depends on the amount of available memory (by virtue of the WeakHashMap).
Then, to look up a report, you first query the 1st level; if the value is not there, you query the 2nd level; and if it is not there either, the report was reclaimed by the GC when memory ran low, and you have to query the DB again for that report.
Do the results of the queries depend on other query results? If not, then have each thread write its results into a shared ConcurrentHashMap as it produces them, like you are implying. Do you really need to ask whether creating several unnecessary copies of the data is causing your program to run out of memory? That should be almost obvious.
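A minimal sketch of that approach (query IDs and the runQuery placeholder are made up): the tasks write their rows straight into one shared ConcurrentHashMap instead of each building a private copy that gets merged later.

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class SharedResultMap {
    public static void main(String[] args) throws InterruptedException {
        // One shared, thread-safe map for all workers.
        Map<String, String> results = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(8);

        for (int i = 0; i < 100; i++) {
            final int queryId = i;
            pool.submit(() -> {
                // runQuery() stands in for the real, slow SQL work.
                results.put("query-" + queryId, runQuery(queryId));
            });
        }

        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.HOURS);
        System.out.println("collected " + results.size() + " results");
    }

    private static String runQuery(int id) {
        return "result-" + id; // placeholder result
    }
}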
