I'm about to improve the efficiency of a cache-heavy system, which has the following properties/architecture:
The system has two components: a single-instance backend and multiple frontend instances spread across remote data centers.
The backend generates data and writes it to a relational database that is replicated to multiple data centers.
The frontends handle client requests (common web traffic) by reading data from the database and serving it. Data is stored in a local cache for an hour before it expires and has to be retrieved again.
(The cache’s eviction policy is LRU based).
I want to mention that there are two issues with the implementation above:
It turns out that many of the database accesses are redundant because the underlying data didn't actually change.
On the other hand, a change isn't reflected until the cache TTL elapses, causing staleness issues.
Can you advise on a solution that fixes both of these problems?
Should the solution change if the data were stored in a NoSQL DB like Cassandra rather than in a classic relational database?
Unfortunately, there is no silver bullet here. There are two obvious variants:
Keep a long TTL, or cache forever, and invalidate the cached data when it is updated. This can get quite complex and error prone.
Simply lower the TTL to get faster updates. The low-TTL approach is IMHO the KISS approach. We go as low as 27 seconds. A cache with such a low TTL does not get many hits during normal operation, but it helps a lot when a flash crowd hits your application.
If your database is powerful enough and has acceptable latency, approach 2 is the simplest one.
If your database does not have acceptable latency, or your application does multiple sequential reads from the database per web request, then you can use a cache that provides refresh-ahead or background refresh. This means the cache refreshes the entries automatically, and there is no additional latency except for the first read. However, this approach comes with the downside of increased database load.
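To make the background-refresh idea concrete, here is a minimal sketch using the Caffeine library (my assumption; any cache with refresh-ahead works similarly). loadFromDatabase stands in for your real DB read, and the TTL/refresh values are just examples:

    import java.util.concurrent.TimeUnit;
    import com.github.benmanes.caffeine.cache.Caffeine;
    import com.github.benmanes.caffeine.cache.LoadingCache;

    // Reads only block on the database for the very first load of a key.
    // Once an entry is older than the refresh interval, the stale value keeps
    // being served while the reload happens in the background.
    LoadingCache<String, String> cache = Caffeine.newBuilder()
            .maximumSize(100_000)                    // LRU-style size bound
            .expireAfterWrite(1, TimeUnit.HOURS)     // hard TTL, as in the question
            .refreshAfterWrite(27, TimeUnit.SECONDS) // background refresh interval
            .build(key -> loadFromDatabase(key));    // loadFromDatabase = your DB read

    String value = cache.get("some-key");

Note that this only hides the latency from the client; the database still sees the refresh traffic, which is the downside mentioned above.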
Cassandra may not support the same access strategies as the classic database. Changing to Cassandra will affect your caching as well, e.g. if you also cache query results. However, the high-level concept stays the same. Your data access layer may change to an asynchronous or reactive pattern, since Cassandra has support for that.
If you want to do invalidation (solution 1) with Cassandra, you can get information from the database about which data has been updated; see CASSANDRA-8844. You may get similar information from "classical" SQL databases, but that is a vendor-specific feature.
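As a rough illustration of solution 1 on the frontend side, independent of whether the change notifications come from CASSANDRA-8844/CDC, a trigger, or the backend publishing an event when it writes, the cache itself only needs a read-through get and an invalidate hook. The class and method names below are invented for the sketch:

    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.ConcurrentMap;

    // Entries live until explicitly invalidated; reads go to the DB only on a miss.
    public class InvalidatingCache {
        private final ConcurrentMap<String, String> entries = new ConcurrentHashMap<>();

        public String get(String key) {
            // read-through: hit the database only when the key is absent
            return entries.computeIfAbsent(key, this::loadFromDatabase);
        }

        // Call this from whatever listens for change notifications.
        public void invalidate(String key) {
            entries.remove(key);
        }

        private String loadFromDatabase(String key) {
            return "..."; // placeholder for the real DB read
        }
    }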
Related
Java Caching frameworks for storing huge data.
Context: We are developing a RESTful service using Jersey 2.6 and will deploy it on WAS 8.5. This service needs to serve more than 10 million requests per day.
We need to implement a cache to store more than 300k objects (the data will come from the DB). And we need some way to update the cache on a daily basis.
Is this approach of caching 300k objects and updating them on a daily basis recommended?
Are there any Java frameworks which support this kind of functionality?
Your question is too general to get a clear answer. You need to describe what problem you are trying to solve.
Are you concerned about response times?
Are you trying to protect your DB from doing heavy lifting?
Are you expecting to have to scale out and want to be sure that you can deal with future loads?
Additionally some more contextual information would be useful, especially:
How dynamic is your data compared to your requests?
What percentage of your data population will be requested on average per day? (How many of the 3 lakh objects will be enquired upon at least once per day? If you don't know, provide your best guess).
Your figures of 3 lakh (300k) data points and 10M requests mean that you are expecting to hit each object on average 33 times a day, which indicates that you are more concerned about back end DB load than about your responses being right up to date.
In my experience there are a lot of fairly primitive solutions which will work much better than going for a heavyweight distributed systems such as Mongo, Cassandra or Coherence.
My first response would be: Keep it simple - 300k objects is not too much to store in an internal hash table which you flush once a day and populate on first request.
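A rough sketch of that "flush once a day, populate on first request" idea, assuming nothing more than the JDK (the class name and the loadFromDb placeholder are invented for the example):

    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.ConcurrentMap;
    import java.util.concurrent.Executors;
    import java.util.concurrent.TimeUnit;

    // 300k objects in a plain in-process map, cleared once a day and
    // repopulated lazily on the first request for each key.
    public class DailyCache {
        private final ConcurrentMap<String, Object> map = new ConcurrentHashMap<>();

        public DailyCache() {
            // one housekeeping thread that empties the map every 24 hours
            Executors.newSingleThreadScheduledExecutor()
                     .scheduleAtFixedRate(map::clear, 1, 1, TimeUnit.DAYS);
        }

        public Object get(String key) {
            return map.computeIfAbsent(key, this::loadFromDb);
        }

        private Object loadFromDb(String key) {
            return "..."; // placeholder for the real DB lookup
        }
    }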
If you need to scale horizontally, I would suggest Memcached (via spymemcached) with a one-day cache time, which you populate when you don't find an existing entry.
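With spymemcached, the populate-on-miss pattern might look roughly like this (host, port, key and TTL are assumptions for the example, and loadFromDb is your own lookup):

    import java.net.InetSocketAddress;
    import net.spy.memcached.MemcachedClient;

    // Create one client for the whole application, not per request.
    // (The constructor throws IOException.)
    MemcachedClient memcached =
            new MemcachedClient(new InetSocketAddress("cachehost", 11211));

    Object value = memcached.get("object:42");
    if (value == null) {
        value = loadFromDb("object:42");          // your DB lookup
        memcached.set("object:42", 86400, value); // 86400 seconds = 1 day TTL
    }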
I would NOT go for something like Cassandra or Mongo unless you have real compelling reasons to require a persistent store. Rationale: Purging can become really onerous, especially if your data is fast moving. For example: Cassandra does not really know how to delete, but instead "tombstones" deleted entries, which means that your data store will simply grow and grow until you create a strategy for purging.
The first question is whether caching must be distributed at all. Remember that a cache holds data you have already seen; whether it is worth pushing that around the cluster on the chance it might be of use somewhere else... well, why?
Distributed cache systems: Redis, in-memory Cassandra, in-memory MongoDB.
Local RocksDB (it lets you store byte[] -> byte[]) on SSDs makes a fine local cache layer. You might also add a distributed layer on top of it. It is usually better than something off the shelf, and it should also be easy to implement.
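For illustration, a minimal local byte[] -> byte[] cache on top of the RocksDB Java bindings might look like this (path and class name are made up; eviction/TTL handling is left out):

    import org.rocksdb.Options;
    import org.rocksdb.RocksDB;
    import org.rocksdb.RocksDBException;

    // A byte[] -> byte[] cache backed by a local SSD.
    public class LocalDiskCache implements AutoCloseable {
        static { RocksDB.loadLibrary(); }

        private final RocksDB db;

        public LocalDiskCache(String path) throws RocksDBException {
            this.db = RocksDB.open(new Options().setCreateIfMissing(true), path);
        }

        public byte[] get(byte[] key) throws RocksDBException {
            return db.get(key); // null on a miss
        }

        public void put(byte[] key, byte[] value) throws RocksDBException {
            db.put(key, value);
        }

        @Override
        public void close() {
            db.close();
        }
    }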
10 million requests per day isn't much. Even if they all arrive within 10 hours, that is about 1 million per hour, i.e. 1,000,000 / 60 / 60 ≈ 280 requests per second (and still under 3,000 per second even if everything hit within a single hour). Depending on the effort you put in, you can usually handle that with an efficient frontend and an efficient backend. We can do 40k pages per second per core, and with 24 cores... you know the math. Data in memory, no caching done...
For the caching provider I suggest Coherence. I am using Coherence at my company, and it is very robust and synchronized over multiple clusters.
For the other point about how to handle the cache, it depends on the nature of your application. Based on my experience with caching, I've decided to update the cache in the following scenarios:
1. Grid paging
2. Browsing
and decided to clear the cache and reload the data again:
Edit item
Add new item
Delete item
I decided this because maintaining the cache is an overkill headache that will blow up in your face when you handle things like statistics and nested hierarchies.
Hope this helped you.
Yes, there are, for example: Coherence and Hazelcast. Both are distributed caches.
http://java.dzone.com/articles/sneak-peek-jcache-api-jsr-107
In general you should cache what you are actually using, and the cache should always be in sync, not refreshed daily. You place the recently used objects in the cache, and you use a read/write-through cache to your DB.
If you have the money, the best one is Coherence (its reputation is proven by big financial companies).
Hazelcast is another distributed in-memory cache you can use; it is one level below Coherence based on performance metrics.
You could try Ehcache. It can be used as a query cache or even as a Hibernate second-level cache.
You can configure how long entities should be stored in cache before they are invalidated.
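For example, with Ehcache 3 a one-hour TTL can be declared roughly like this (cache name, types and sizes are just for illustration):

    import java.time.Duration;
    import org.ehcache.Cache;
    import org.ehcache.CacheManager;
    import org.ehcache.config.builders.CacheConfigurationBuilder;
    import org.ehcache.config.builders.CacheManagerBuilder;
    import org.ehcache.config.builders.ExpiryPolicyBuilder;
    import org.ehcache.config.builders.ResourcePoolsBuilder;

    // Entries expire one hour after they are written, then get reloaded on demand.
    CacheManager cacheManager = CacheManagerBuilder.newCacheManagerBuilder()
            .withCache("entities",
                    CacheConfigurationBuilder
                            .newCacheConfigurationBuilder(Long.class, String.class,
                                    ResourcePoolsBuilder.heap(10_000))
                            .withExpiry(ExpiryPolicyBuilder
                                    .timeToLiveExpiration(Duration.ofHours(1))))
            .build(true);

    Cache<Long, String> entities = cacheManager.getCache("entities", Long.class, String.class);
    entities.put(42L, "value loaded from the DB");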
If you already have WebSphere ND 8.5.5, you may take a look at WebSphere eXtreme Scale, which is provided with it. It is a distributed, partitioned caching solution that integrates with WebSphere. See the WebSphere eXtreme Scale overview for more details.
See the new JCache standard (JSR 107 in the Java Community Process). This API is implemented by Coherence and other caching implementations (ehcache etc.), and also has a small reference implementation that you can use for basic use cases.
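Roughly, the JCache API looks like this regardless of which provider (Coherence, Ehcache, Hazelcast, the reference implementation, ...) sits behind it; the cache name, TTL and the loadFromDb helper are assumptions for the example:

    import java.util.concurrent.TimeUnit;
    import javax.cache.Cache;
    import javax.cache.CacheManager;
    import javax.cache.Caching;
    import javax.cache.configuration.MutableConfiguration;
    import javax.cache.expiry.CreatedExpiryPolicy;
    import javax.cache.expiry.Duration;

    // JSR-107: same API no matter which provider is on the classpath.
    CacheManager manager = Caching.getCachingProvider().getCacheManager();

    MutableConfiguration<String, Object> config = new MutableConfiguration<String, Object>()
            .setExpiryPolicyFactory(CreatedExpiryPolicy.factoryOf(new Duration(TimeUnit.DAYS, 1)))
            .setStoreByValue(false);

    Cache<String, Object> cache = manager.createCache("objects", config);
    cache.put("object:42", loadFromDb("object:42")); // loadFromDb is your own lookup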
Yes, any of the Java caching frameworks should be able to help you. Coherence (note: I work with Coherence at Oracle) for example can definitely handle 3,00,000 items easily (I assume you are from India if you use lakh!), but I suggest only using Coherence if you are deploying this on more than one server.
I want to synchronize the cached data between two servers. Both servers share the same database, but for better performance I have cached the data into a HashMap at startup.
So I want to synchronize the cached data without restarting the servers. (Both servers start at the same time.)
Please suggest the best and most efficient way to do this.
Instead of trying to synchronize the cached data between two server instances, why not centralize the caching instead using something like memcached/couchbase or redis? Using distributed caching with something like ehcache is far more complicated and error prone IMO vs centralizing the cached data using a caching server like those mentioned.
As an addendum to my original answer, when deciding what caching approach to use (in memory, centralized), one thing to take into account is the volatility of the data that is being cached.
If the data is stored in the DB, but does not change after the servers load it, then you don't even need synchronization between the servers. Just let them each load this static data into memory from the source and then go about their merry ways doing whatever it is they do. The data won't be changing, so no need to introduce a complicated pattern for keeping the data in sync between the servers.
If there is indeed a level of volatility in the data (say you are caching looked-up entity data from the DB in order to save hits to the DB), then I still think centralized caching is a better approach than in-memory distributed and synchronized caching. You just need to make sure that you use an appropriate expiration on the cached data to allow natural refresh of the data from time to time. Also, you might want to just drop the cached data from the centralized store when in the update path for a particular entity and then let it be reloaded into the cache from the DB on the next request for that data. This is IMO better than trying to do a true write-through cache where you write to the underlying store as well as the cache. The DB itself might make tweaks to the data (via defaulting unsupplied values for example), and your cached data in that case might not match what's in the DB.
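To illustrate that pattern (read through with an expiration, and simply drop the key in the update path), here is a small sketch against Redis via Jedis. The key scheme, TTL and the two DB helpers are invented for the example; in real code you would use a JedisPool rather than a single connection:

    import redis.clients.jedis.Jedis;

    // Centralized cache: read through with a TTL, delete the key on update
    // so the next read reloads the fresh value from the DB.
    public class UserCache {
        private static final int TTL_SECONDS = 3600;
        private final Jedis redis = new Jedis("cachehost", 6379);

        public String getUser(String id) {
            String key = "user:" + id;
            String cached = redis.get(key);
            if (cached != null) {
                return cached;                      // cache hit, no DB access
            }
            String fromDb = loadUserFromDb(id);
            redis.setex(key, TTL_SECONDS, fromDb);  // natural refresh via expiration
            return fromDb;
        }

        public void updateUser(String id, String newValue) {
            writeUserToDb(id, newValue);   // update the system of record first
            redis.del("user:" + id);       // drop the entry; next read reloads it
        }

        private String loadUserFromDb(String id) { return "..."; } // placeholder
        private void writeUserToDb(String id, String value) { }    // placeholder
    }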
EDIT:
A question was asked in the comments about the advantages of a centralized cache (I'm guessing as compared to something like an in-memory distributed cache). I'll provide my opinion on that, but first a standard disclaimer. Centralized caching is not a cure-all. It aims to solve specific issues related to in-jvm-memory caching. Before evaluating whether or not to switch to it, you should understand what your problems are first and see if they fit with the benefits of centralized caching. Centralized caching is an architectural change and it can come with issues/caveats of its own. Don't switch to it simply because someone says it's better than what you are doing. Make sure the reason fits the problem.
Okay, now onto my opinion for what kinds of problems centralized caching can solve vs in-jvm-memory (and possibly distributed) caching. I'm going to list two things although I'm sure there are a few more. My two big ones are: Overall Memory Footprint and Data Synchronization Issues.
Let's start with Overall Memory Footprint. Say you are doing standard entity caching to protect your relational DB from undue stress. Let's also say that you have a lot of data to cache in order to really protect your DB; say in the range of many GBs. If you are doing in-jvm-memory caching, and you say had 10 app server boxes, you would need to get that additional memory ($$$) times 10 for each of the boxes that would need to be doing the caching in jvm memory. In addition, you would then have to allocate a larger heap to your JVM in order to accommodate the cached data. I'm of the opinion that the JVM heap should be small and streamlined in order to ease the garbage collection burden. If you have large chunks of Old Gen that can't be collected then you're going to stress your garbage collector when it goes into a full GC and tries to reap something back from that bloated Old Gen space. You want to avoid long GC2 pause times, and bloating your Old Gen is not going to help with that. Plus, if your memory requirement is above a certain threshold, and you happened to be running 32 bit machines for your app layer, you'll have to upgrade to 64 bit machines and that can be another prohibitive cost.
Now if you decided to centralize the cached data instead (using something like Redis or Memcached), you could significantly reduce the overall memory footprint of the cached data because you could have it on a couple of boxes instead of all of the app server boxes in the app layer. You probably want to use a clustered approach (both technologies support it) and at least two servers to give you high availability and avoid a single point of failure in your caching layer (more on that in a sec). By having only a couple of machines to support the needed memory requirement for caching, you can save some considerable $$. Also, you can tune the app boxes and the cache boxes differently now as they are serving distinct purposes. The app boxes can be tuned for high throughput and low heap and the cache boxes can be tuned for large memory. And having smaller heaps will definitely help out with overall throughput of the app layer boxes.
Now one quick point for centralized caching in general. You should set up your application in such a way that it can survive without the cache in case it goes completely down for a period of time. In traditional entity caching, this means that when the cache goes completely unavailable, you just are hitting your DB directly for every request. Not awesome, but also not the end of the world.
Okay, now for Data Synchronization Issues. With distributed in-jvm-memory caching, you need to keep the cache in sync. A change to cached data in one node needs to replicate to the other nodes and be synced into their cached data. This approach is a little scary in that if for some reason (network failure for example) one of the nodes falls out of sync, then when a request goes to that node, the data the user sees will not be accurate against what's currently in the DB. Even worse, if they make another request and that hits a different node, they will see different data and that will be confusing to the user. By centralizing the data, you eliminate this issue. Now, one could then argue that the centralized cache needs concurrency control around updates to the same cached data key. If two concurrent updates come in for the same key, how do you make sure the two updates don't stomp on each other? My thought here is to not even worry about this; when an update happens, drop the item from the cache (and write directly through to the DB) and let it be reloaded on the next read. It's safer and easier this way. If you don't want to do that, then you can use CAS (Check-And-Set) functionality instead for optimistic concurrency control if you really want to update both the cache and the DB on updates.
So to summarize, you can save money and better tune your app layer machines if you centralize the data they cache. You also can get better accuracy of that data as you have less data synchronization issues to deal with. I hope this helps.
First, do try to forget about premature optimization. Do you really need the cache? 99% of the time you do not. In that case your solution is to remove the redundant code.
If, however, you do need it, try not to reinvent the wheel. There are perfectly good ready-to-use libraries, for example Ehcache, which has a distributed mode.
Use Hazelcast. It allows data synchronization between servers using a multicast protocol. It's easy to use, and it supports locking and other features.
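For example, the most basic Hazelcast usage is just a shared map; each server creates (or joins) the cluster and sees the same entries (the map name and keys below are made up):

    import java.util.Map;
    import com.hazelcast.core.Hazelcast;
    import com.hazelcast.core.HazelcastInstance;

    // Each server starts (or joins) the cluster; the map is shared between them,
    // so a put on one server is visible to the other.
    HazelcastInstance hz = Hazelcast.newHazelcastInstance();
    Map<String, String> sharedCache = hz.getMap("app-cache");

    sharedCache.put("config:featureX", "enabled");
    String value = sharedCache.get("config:featureX");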
What are the different ways to cache web application data, for an application developed using Java and a NoSQL database? Databases also provide caching; are they the only, and always the best, option to go with for caching?
How else can I cache my users' data in the application? The application contains very user-specific data, as in a social network. Are there some simple rules of thumb for what types of things should be cached?
Can I also cache my data on the application server using Java?
If you want a rule of thumb, here's what Michael Jackson (not that Michael Jackson) said:
The First Rule of Program Optimization: Don't do it.
The Second Rule of Program Optimization (for experts only!): Don't do it yet.
The ancient tradition is that you don't optimise until you've profiled - that is, until you have hard evidence as to what actually needs to be optimised. Cacheing is a kind of optimisation; it is very likely to be important for your app, but until you are able to put your app under load and look at what objects are taking a long time to obtain (loading from the database or whatever), you won't know what needs cacheing. It really doesn't matter how smart you are, or what advice you get here - until you do that, you will not know what needs to be cached.
As for things you can cache, it's anything, but I suppose you can classify it into three groups:
Things that have come fresh from the database. These are easy to cache, because at the point at which you go to the database, you have the identifying information you'd need for a cache key (primary key, query parameters, etc). By cacheing them, you save the time taken to get them from the database - this involves IO, so it is likely to be quite large.
Things that have been produced by computation in the domain model (news feeds in a social app, perhaps). These may be trickier to cache, because more contextual information goes into producing them; you might have to refactor your code to create a single point where the required information is all to hand, so you can apply cacheing to it. Or you might find that this exists already. Cacheing these will save all the database access needed to obtain the information that goes into making them, as well as all the computation; the time taken for computation may or may not be a significant addition to the time taken for IO. Invalidating cached things of this kind is likely to be much harder than pure database objects.
Things that are being sent to the browser - pages, or fragments of pages. These can be quite easy to cache, because in a properly-designed application, they're uniquely identified by either the URL, or the combination of URL and user. Cacheing these will save all the computation in your app; it can even avoid servicing requests, because it can be done by a reverse proxy sitting in front of your app server. Two problems. Firstly, it uses a huge amount of memory: the page rendered from a few kilobytes of objects could be tens or hundreds of kilobytes in size (my Facebook homepage is 50 kB). That means you have to save a vast amount of computation to make it a better deal than cacheing at the database or domain model layers, and there just isn't that much computation between the domain model and the HTML in a sensibly-designed application. Secondly, invalidation is even harder than in the domain model, and is likely to happen prohibitively often - anything which changes the page or the fragment needs to invalidate the cache.
Finally, the actual mechanism: start with something simple and in-process, like a map with limited size and a least-recently-used eviction policy. That's simple but effective. Something out-of-process like EHCache is more complicated, but has two advantages: you can share caches between multiple processes (helpful if you have a cluster, which you probably will at some point), and you can store data where the garbage collector won't see it, which might save some CPU time (might - this is too big a subject to get into here).
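For reference, the "simple and in-process" option is only a few lines with a LinkedHashMap in access order (the capacity and wrapper class are illustrative; synchronize around it if multiple threads share it):

    import java.util.LinkedHashMap;
    import java.util.Map;

    // A size-bounded map with least-recently-used eviction.
    public class LruCache<K, V> extends LinkedHashMap<K, V> {
        private final int maxEntries;

        public LruCache(int maxEntries) {
            super(16, 0.75f, true); // accessOrder = true gives LRU ordering
            this.maxEntries = maxEntries;
        }

        @Override
        protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
            return size() > maxEntries; // evict the oldest entry once over capacity
        }
    }

Usage is then just new LruCache<String, Object>(10_000) wherever you need it.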
But I reiterate my first point: don't cache until you know what needs to be cached, and once you do, be mindful of the limitations on the benefits of cacheing, and try to keep your cacheing strategy as simple as possible (but no simpler, of course).
I'll assume you're building a relatively typical web application that:
has a single server used for persistence
has multiple web servers
ties authenticated users to a single server via sticky sessions through a load balancer
Now, with that stated, to answer some of your questions: most persistence layers, database or NoSQL, likely have some sort of caching built in, such that if you execute the same simple query repeatedly (e.g. retrieval by primary key) they are able to cache the result. However, the more complex the query, the less likely the persistence layer can cache it. In addition, if there's only one server for persistence (i.e. no sharding, or write master/read slaves) it quickly becomes the bottleneck. So the application-level caching you want to do should usually occur on the web servers to reduce load on the database.
As far as what should be cached, the heuristic is items frequently accessed and/or expensive to generate (in terms of database/web server processing/memory). Typical candidates are the home page and any other landing page of a site - often the best approach for these is generating a static file and serving that. The next pieces depend on your application, but typically the most effective strategy is caching as close to the final result as possible - often the HTML being served. For your social network this might be a list of featured updates or some such.
As far as user sessions are concerned, these are definitely a good candidate for caching. In this case you can probably get a lot of mileage out of judicious use of the web server's session scope (assuming a JSP server). This data lives in memory and is a good place to keep user-specific information that is shown on every page once a user authenticates (e.g. first and last name).
Now the final thing to consider is cache invalidation, which really is the hard part of all this (naming things is the other hard problem in computer science). In this case using something like memcached or ehcache as others have mentioned is the right approach. ehcache can easily run in process with your Java application and does a good job of expiring things, with policies for least recently used and least frequently used, and it allows you to use both memory and disk for caching. What you'll need to think about are the situations where you need to expire something from the cache ahead of this schedule because the data has changed. In that case you need to work through those dependencies in your application's architecture so that it reads/writes to the cache as appropriate.
I am building an application that includes a feature to bulk tag millions of records, more or less interactively. The user interaction is very similar to Gmail where users can tag individual emails, or bulk tag large amounts of emails. I also need quick read access to these tag memberships as well, and where the read pattern is more or less random.
Right now we're using Mysql and inserting one row for every tag-document pair. Writing millions of rows to Mysql takes a while (high I/O), even with bulk insertions and heavy optimization. We need this to be an interactive process, not a batch process.
For the data that we're storing and reading, consistency and availability of the data are not as important as performance and scalability. So in the event of system failure while the writes are occurring, I can deal with some data loss. However, the data definitely needs to be persisted to secondary storage at some point.
So, to sum up, here are the requirements:
Low latency bulk writes of potentially tens of millions of records
Data needs to be persisted in some way
Low latency random reads
Durable writes not required
Eventual consistency is okay
Here are some solutions I've looked at:
Write behind caches (Terracotta, Gigaspaces, Coherence) where records are written to memory and drained to the database asynchronously. These scare me a little because they appear to add a certain amount of complexity to the app that I'd want to avoid.
Highly scalable key-value stores, like MongoDB, HBase, Tokyo Tyrant
If you have the budget to use Coherence for this, I highly recommend doing so. There is direct support for write-behind, eventual consistency behavior in Coherence and it is very survivable to both a database outage and Coherence cluster node outages (if you use >= 3 Coherence nodes on separate JVMs, preferably on separate hosts). I have implemented this for doing high-volume CRM for a Fortune 100 company's e-commerce site and it works fantastically.
One of the best aspects of this architecture is that you write your Java application code as if none of the write-behind behavior were taking place, and then plug in the Coherence topology and configuration that makes it happen. If you need to change the behavior or topology of Coherence later, no change in your application is required. I know there are probably a handful of reasonable ways to do this, but this behavior is directly supported in Coherence rather than having to invent or hand-roll a way of doing it.
To make a really fine point - your worry about adding application complexity is a good one. With Coherence, you simply write updates to the cache (or if you're using Hibernate it can be the L2 cache provider). Depending upon your Coherence configuration and topology, you have the option to deploy your application to use write-behind, distributed caches. So, your application is no more complex due to the features of the cache (and is, frankly, unaware of them).
Finally, I implemented the solution mentioned above from 2005-2007 when Coherence was made by Tangosol and they had the best possible support. I'm not sure how things are now under Oracle - hopefully still good.
I've worked on a large project that used asynchronous writes, although in that case it was just hand-written using background threads. You could also implement something like that by offloading the DB write process to a JMS queue.
One thing that will certainly speed up db writes is to do them in batches. JDBC batch updates can be orders of magnitude faster than individual writes, and if you're doing them asynchronously you can just write them 500 at a time.
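A sketch of the batched write, assuming plain JDBC and an invented document_tags table:

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;
    import java.util.List;

    // Batching the tag inserts: one round trip per 500 rows instead of one per row.
    void insertTags(Connection conn, List<long[]> tagDocPairs) throws SQLException {
        String sql = "INSERT INTO document_tags (tag_id, document_id) VALUES (?, ?)";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            int count = 0;
            for (long[] pair : tagDocPairs) {
                ps.setLong(1, pair[0]);
                ps.setLong(2, pair[1]);
                ps.addBatch();
                if (++count % 500 == 0) {
                    ps.executeBatch(); // flush every 500 rows
                }
            }
            ps.executeBatch(); // flush the remainder
        }
    }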
Depending on how your data is organized, perhaps you would be able to use sharding.
If the read latency isn't low enough, you can also try to add caching. Memcached is one popular solution.
Berkeley DB has a very high performance disk-based hash table that supports transactions, and integrates with a Java EE environment if you need that. If you're able to model the data as key/value pairs, this can be a very scalable solution.
http://www.oracle.com/technology/products/berkeley-db/je/index.html
(Note: Oracle bought Berkeley DB about 5-10 years ago; the original product has been around for 15-20 years.)
I need to store records into a persistant storage and retrieve it on demand. The requirement is as follows:
Extremely fast retrieval and insertion
Each record will have a unique key. This key will be used to retrieve the record
The data stored should be persistent i.e. should be available upon JVM restart
A separate process would move stale records to RDBMS once a day
What do you guys think? I cannot use a standard database because of latency issues. In-memory databases like HSQLDB/H2 have performance constraints. Moreover, the records are simple string objects and do not really call for SQL. I am thinking of some kind of flat-file-based solution. Any ideas? Any open source projects? I am sure there must be someone who has solved this problem before.
There are a lot of diverse tools and methods, but I think none of them shines across all of these requirements.
For low latency, you can only rely on in-memory data access - disks are physically too slow (and SSDs too). If the data does not fit in the memory of a single machine, we have to distribute it across more nodes that together provide enough memory.
For persistency, we have to write our data to disk after all. Assuming an optimal organization, this can be done as a background activity, without affecting latency.
However, for reliability (failover, HA or whatever), disk operations cannot be totally independent of the access methods: we have to wait for the disks when modifying data to make sure our operation will not disappear. Concurrency also adds some complexity and latency.
Data model is not restricting here: most of the methods support access based on a unique key.
We have to decide,
if data fits in the memory of one machine, or we have to find distributed solutions,
if concurrency is an issue, or there are no parallel operations,
if reliability is strict and we cannot lose modifications, or we can live with the fact that an unplanned crash would result in data loss.
Solutions might be
Self-implemented data structures using the standard Java library, files, etc. may not be the best solution, because reliability and low latency require clever implementations and lots of testing,
Traditional RDBMSs have a flexible data model, durable, atomic and isolated operations, caching, etc. - they actually know too much and are mostly hard to distribute. That's why they are too slow if you cannot turn off the unwanted features, which is usually the case.
NoSQL and key-value stores are good alternatives. These terms are quite vague, and cover lots of tools. Examples are
BerkeleyDB or Kyoto Cabinet as one-machine persistent key-value stores (using B-trees): can be used if the data set is small enough to fit in the memory of one machine.
Project Voldemort as a distributed key-value store: uses BerkeleyDB java edition inside, simple and distributed,
ScalienDB as a distributed key-value store: reliable, but not too slow for writes either.
MemcacheDB, Redis and other caching databases with persistency,
popular NoSQL systems like Cassandra, CouchDB, HBase etc: used mainly for big data.
A list of NoSQL tools can be found e.g. here.
Voldemort's performance tests report sub-millisecond response times, and these can be achieved quite easily, however we have to be careful with the hardware too (like the network properties mentioned above).
Have a look at LinkedIn's Voldemort.
If all the data fits in memory, MySQL can run in memory instead of from disk (MySQL Cluster, Hybrid Storage). It can then handle storing itself to disk for you.
What about something like CouchDB?
I would use a BlockingQueue for that. Simple, and built into Java.
I do something similar using realtime data from the Chicago Mercantile Exchange. The data is sent to one place for realtime use... and to another place (via TCP), using a BlockingQueue (Producer/Consumer) to persist the data to a database (Oracle, H2). The Consumer uses a time-delayed commit to avoid disk sync issues in the database. (H2-type databases use asynchronous commit by default and avoid that issue.) I log the persisting in the Consumer to keep track of the queue size, to be sure it is able to keep up with the Producer. Works pretty well for me.
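Stripped down, that Producer/Consumer arrangement looks roughly like this (queue size, batch size and the writeBatchToDb placeholder are assumptions for the sketch):

    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.LinkedBlockingQueue;

    // Producer threads offer records; a single consumer drains them and persists
    // them in batches, so slow disk/DB writes never block the realtime path.
    public class AsyncPersister {
        private final BlockingQueue<String> queue = new LinkedBlockingQueue<>(100_000);

        public void offer(String record) {
            queue.offer(record); // drops the record if the queue is full; use put() to block instead
        }

        public void startConsumer() {
            Thread consumer = new Thread(() -> {
                List<String> batch = new ArrayList<>();
                try {
                    while (true) {
                        batch.add(queue.take());   // wait for at least one record
                        queue.drainTo(batch, 499); // grab up to 500 in total
                        writeBatchToDb(batch);     // e.g. a JDBC batch insert
                        batch.clear();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            consumer.setDaemon(true);
            consumer.start();
        }

        private void writeBatchToDb(List<String> batch) { /* placeholder */ }
    }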
MySQL with shards may be a good idea. However, it depends on the data volume, transactions per second, and latency you need.
In memory databases are also a good idea. In fact MySQL provides memory-based tables as well.
Would a Tuple space / JavaSpace work? Also check out other enterprise data fabrics like Oracle Coherence and Gemstone.
MapDB provides highly performant HashMaps/TreeMaps that are persisted to disk. It's a single library that you can embed in your Java program.
Have you actually proved that using an out-of-process SQL database like MySQL or SQL Server is too slow, or is this an assumption?
You could use a SQL database approach in conjunction with an in-memory cache to ensure that retrievals do not hit the database at all. Despite the fact that the records are plaintext I would still advise using SQL over a flat file solution (e.g. using a text column in your table schema) as the RDBMS will perform optimisations that a file system cannot (e.g. caching recently accessed pages, etc).
However, without more information about your access patterns, expected throughput, etc. I can't provide much more in the way of suggestions.
If you are looking for a simple key-value store and don't need complex sql querying, Berkeley DB might be worth a look.
Another alternative is Tokyo Cabinet, a modern DBM implementation.
How bad would it be if you lose a couple of entries in case of a crash?
If it isn't that bad the following approach might work for you:
Create a flat file for each entry, with the file name equal to the id. Possibly one file for a smallish number of consecutive entries.
Make sure your controller has a good cache and/or use one of the existing caches implemented in Java.
Talk to a file system expert about how to make this really fast.
It is simple and it might be fast.
Of course you lose transactions, including the ACID guarantees.
Sub-millisecond reads/writes mean you cannot depend on disk, and you have to be careful about network latency. Just forget about standard SQL-based solutions, main-memory or not. In a millisecond you cannot get much more than 100 KByte over a GBit network. Ask a telecom engineer, they are used to solving these kinds of problems.
How much does it matter if you lose a record or two? Where are they coming from? Do you have a transactional relationship with the source?
If you have serious reliability requirements then I think you may need to be prepared to pay some DB Overhead.
Perhaps you could separate the persistence problem from the in-memory problem. Use a pub-sub approach: one subscriber looks after the in-memory side, the other persists the data ready for a subsequent startup.
Distributed caching products such as WebSphere eXtreme Scale (no Java EE dependency) might be relevant if you can buy rather than build.
Chronicle Map is a ConcurrentMap implementation which stores keys and values off-heap, in a memory-mapped file. So you have persistence on JVM restart.
ChronicleMap.get() is consistently faster than 1 us, sometimes as fast as 100 ns / operation. It's the fastest solution in its class.
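A minimal persisted Chronicle Map, for illustration (the file name, entry count and the average key/value hints are assumptions you must size for your own data):

    import java.io.File;
    import java.io.IOException;
    import net.openhft.chronicle.map.ChronicleMap;

    public class PersistentRecords {
        public static void main(String[] args) throws IOException {
            // Off-heap, memory-mapped map: entries live in the mapped file,
            // so they survive a JVM restart. Sizing hints are required up front.
            ChronicleMap<String, String> records = ChronicleMap
                    .of(String.class, String.class)
                    .name("records")
                    .entries(1_000_000)
                    .averageKey("record-key-000000")
                    .averageValue("a value of roughly typical length")
                    .createPersistedTo(new File("records.dat"));

            records.put("record-key-000001", "value-1");
            System.out.println(records.get("record-key-000001"));
            records.close();
        }
    }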
Will all the records and keys you need fit in memory at once? If so, you could just use a HashMap<String,String>, since it's Serializable.
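If you go that route, persistence across restarts can be as simple as serializing the map to a file on shutdown and reading it back on startup (the file name is arbitrary):

    import java.io.FileInputStream;
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.io.ObjectInputStream;
    import java.io.ObjectOutputStream;
    import java.util.HashMap;

    // Write the whole map to disk on shutdown.
    void save(HashMap<String, String> map) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream("records.ser"))) {
            out.writeObject(map);
        }
    }

    // Read it back on startup.
    @SuppressWarnings("unchecked")
    HashMap<String, String> load() throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new FileInputStream("records.ser"))) {
            return (HashMap<String, String>) in.readObject();
        }
    }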