Efficient way to read key-value pairs for frequent lookups

I'm storing a few properties (key-value pairs) in a hierarchical database (JCR). As part of the business logic, I have to look up these key-value pairs very frequently, and each lookup calls a method that goes and retrieves the persisted value.
I'm working with a CMS called AEM, and all these key-value pairs are authored using a component and stored as JCR properties. At present I have an OSGi service that goes to the node and retrieves the value corresponding to the key, and this method gets invoked many times. Instead of making repeated calls to the service method to retrieve these values, can you suggest a more efficient way to do this? OSGi auto-wiring?

First of all, I would suggest you think twice about whether you really need to eliminate (or reduce) the reading of node properties. Do you have performance issues because of these reads, or is there another important reason?
If you still want to go ahead, I would suggest the following setup:
You have a cache component, which holds a map with all the key-value pairs.
You have a listener, which listens for changes to the node that contains this data and invalidates the cache on such an event (so the cache is rebuilt the next time it is accessed).
There is a great variety of cache implementations, or you can use a simple map for this.
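
The cache-plus-listener setup above could be sketched roughly as follows. This is a minimal illustration, not AEM-specific code: `PropertyCache` and its `loader` are hypothetical names, and the loader stands in for whatever existing service method reads the properties from the JCR node. The change listener (for example a JCR observation listener) is assumed to exist elsewhere and only needs to call `invalidate()`.

```java
import java.util.Map;
import java.util.function.Supplier;

/** Keeps all key-value pairs in memory and reloads them lazily after an invalidation. */
final class PropertyCache {
    // Hypothetical loader: in AEM this would read the properties from the JCR node.
    private final Supplier<Map<String, String>> loader;
    private final Object lock = new Object();
    private volatile Map<String, String> snapshot; // null means "needs (re)loading"

    PropertyCache(Supplier<Map<String, String>> loader) {
        this.loader = loader;
    }

    /** Returns the cached value; loads the whole map first if the cache is empty. */
    String get(String key) {
        Map<String, String> current = snapshot;
        if (current == null) {
            synchronized (lock) {
                current = snapshot;
                if (current == null) {
                    current = Map.copyOf(loader.get()); // immutable copy of the loaded data
                    snapshot = current;
                }
            }
        }
        return current.get(key);
    }

    /** Called by the change listener; the next get() rebuilds the map. */
    void invalidate() {
        synchronized (lock) {
            snapshot = null;
        }
    }
}
```

With this shape, frequent lookups hit only the in-memory map, and the repository is read once after each change. Note that `Map.copyOf` needs Java 10 or later and rejects null keys and values.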

Related

How to store different values with same key in memcache?

My application runs on App Engine, and different parts of it need to store different types of values in memcache under the same key. Two classes in the application cache values with Link objects as keys: one stores a Boolean and the other an ArrayList. When both need to work with the same Link, there is a conflict. To avoid this I came up with 3 options:
Use Strings as keys. Convert Link objects to strings and prefix them with the name of the class that uses them when putting values into and getting values from memcache.
Use the namespace feature of memcache to distinguish keys. However, namespaces are typically used for multitenancy, so it doesn't seem right.
Create wrapper classes for the key in different parts of the application. But this adds complexity.
I am planning to use the first option since it is the simplest one. What are my other options? Are there any best practices out there? This is the first time I am using memcache, and I am not sure which path to take.
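
For illustration, the first option (class-name prefix plus the Link's string form) might look like the sketch below. `CacheKeys` and `keyFor` are made-up names, and it assumes a Link can be turned into a stable, unique string.

```java
/** Builds memcache keys that are unique per owning class. */
final class CacheKeys {
    private CacheKeys() {}

    /** Prefixes the link's string form with the fully qualified name of the caching class. */
    static String keyFor(Class<?> owner, String linkAsString) {
        return owner.getName() + ":" + linkAsString;
    }
}
```

Two classes caching data for the same link then produce different keys, so their Boolean and ArrayList values no longer collide.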

Spring MVC: How do I store an application scoped key-value map (considering thread-safety)?

My application handles an HTML form submitted with the POST method, and the webapp should generate a static file (xls) from the data the user entered.
I'd like to generate a static link for the user, i.e. /download/{uuid}. This URI should return the generated file so that the user can share or bookmark the link (such a link could be destroyed after some time, maybe a few days).
My webapp doesn't use any database, and I'd like to avoid introducing one just for a single table of key-value data.
The question is how to implement this approach in Spring MVC with thread safety in mind.
Should I create a singleton-scoped Spring bean with synchronized methods for adding to and reading from a Map of uuid/file path?
Please tell me the best way to solve this problem.

If you are going to use an in-memory data structure, then a singleton-scoped object would be one approach. You could create a custom bean, or you could simply create a synchronized HashMap (by wrapping it using Collections.synchronizedMap), or a ConcurrentHashMap instance.
But the problem with that approach is twofold:
It doesn't scale. If you have too many users, or the key-value data is too large, then you can end up using too much memory.
The key-value data will be lost when your server is (hard) restarted.
I think you should consider a database, or alternatively consider implementing persistent sessions and storing the key-value data as session state.
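
If you do go with the in-memory option anyway, a ConcurrentHashMap-backed object could look something like this. It is only a sketch: `DownloadRegistry` is a hypothetical class name, and the Spring wiring is left out (registered as a bean, it would be a singleton by default).

```java
import java.nio.file.Path;
import java.util.Optional;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Thread-safe in-memory registry mapping download ids to generated files. */
final class DownloadRegistry {
    private final ConcurrentMap<UUID, Path> files = new ConcurrentHashMap<>();

    /** Registers a generated file and returns the id to embed in /download/{uuid}. */
    UUID register(Path file) {
        UUID id = UUID.randomUUID();
        files.put(id, file);
        return id;
    }

    /** Looks up the file for an id; empty if the id is unknown or already removed. */
    Optional<Path> find(UUID id) {
        return Optional.ofNullable(files.get(id));
    }

    /** Drops an entry, for example when the link expires. */
    void remove(UUID id) {
        files.remove(id);
    }
}
```

No synchronized methods are needed here because each operation is a single atomic call on the ConcurrentHashMap. Both drawbacks listed above still apply.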

Thinking outside the box, there is one solution that requires no storage and is still thread safe. Instead of creating the file and then generating the static link (and putting their relation in a map, database or other key-value storage), you create the resulting link first, and then you use the link to generate the name of the file using some kind of transformation method. Later, when the user requests the same file, you use the same transformation method to regenerate the name of the file, and thus you need no storage at all! The simplest implementation of the transformation method is of course to use the URL as the file name (be aware of URL encoding/decoding), but you can make it as elaborate as you want.
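
A rough sketch of such a transformation, assuming the link carries a UUID and the files live in one directory. `DownloadPaths`, `fileFor` and the `.xls` naming scheme are illustrative choices, not anything prescribed above.

```java
import java.nio.file.Path;
import java.util.UUID;

/** Derives the file location from the id in the URL, so no lookup table is needed. */
final class DownloadPaths {
    private final Path baseDir;

    DownloadPaths(Path baseDir) {
        this.baseDir = baseDir;
    }

    /**
     * Used both when writing the file and when serving it.
     * Parsing the id as a UUID rejects input such as "../secret"
     * (IllegalArgumentException), which keeps the path inside baseDir.
     */
    Path fileFor(String idFromUrl) {
        UUID id = UUID.fromString(idFromUrl);
        return baseDir.resolve(id + ".xls");
    }
}
```

Because the mapping is a pure function of the id, it is thread safe without any locking, and it survives server restarts as long as the files themselves do.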

Java Servlet -> Static Graph in Hashmap with non-static Attributes

I've got a large graph which is processed in a Java Servlet for routing purposes. The graph has 100k+ nodes, so I can't reload it for every new call. At the moment the graph is loaded once from my database into RAM and referenced in a HashMap.
When I start the servlet (creating a new instance) I need to find the start node in the graph by id. For that I use the HashMap.
That all works fine.
My problem is that within my routing task I need to change certain attributes in the graph, e.g. the travelled distance. These attributes of course need to be individual for each created instance. At the moment I handle that by resetting all "non-static" attributes when creating a new instance.
That creates two problems:
A) the instances are not thread safe
B) the resetting is very time consuming, taking up to 10 times longer than the actual routing.
So what I need is a static HashMap for all instances of my Servlet. This HashMap needs to contain all nodes of my network. These nodes need to have static attributes like id, coordinates, neighbour nodes etc., but also non-static attributes like travelled distance.
How can I do that?
Thanks for reading and sharing ideas

Your problem can be described as a model built at runtime and instantiated for every execution of your service.
When you say "static", I presume you mean "constant". The variable attributes are really specific to each execution, not to each Servlet instance. During an execution you should build a separate structure with variable attributes that parallels the constant one. Each node in the variable structure references a single node in the constant structure. The variable structure is built gradually and on demand, as a need for each node arises. The structure is discarded at the end of the execution.
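
As a sketch of that idea (all class names here are invented for illustration, and the initial distance of infinity is just an assumption borrowed from typical shortest-path code):

```java
import java.util.HashMap;
import java.util.Map;

/** Constant part of a node: loaded once, shared by all executions, never mutated. */
final class GraphNode {
    final long id;

    GraphNode(long id) {
        this.id = id;
    }
}

/** Variable part of a node for one routing run; references the constant node. */
final class NodeState {
    final GraphNode node;
    double travelledDistance = Double.POSITIVE_INFINITY; // assumed initial value

    NodeState(GraphNode node) {
        this.node = node;
    }
}

/** One instance per execution; simply dropped when the run ends. */
final class RoutingRun {
    private final Map<Long, GraphNode> graph;                    // shared, read-only
    private final Map<Long, NodeState> states = new HashMap<>(); // private to this run

    RoutingRun(Map<Long, GraphNode> graph) {
        this.graph = graph;
    }

    /** Creates the variable state on demand, the first time a node is touched. */
    NodeState stateOf(long nodeId) {
        GraphNode node = graph.get(nodeId);
        if (node == null) {
            throw new IllegalArgumentException("unknown node " + nodeId);
        }
        return states.computeIfAbsent(nodeId, id -> new NodeState(node));
    }
}
```

Nothing has to be reset between runs, because a new `RoutingRun` starts with an empty state map, and concurrent runs never write to shared objects.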

I'd advise keeping the "main graph" in RAM as a singleton, as Marko Topolnik advised, but I'd keep a Map of only the changed nodes for each session, without the hierarchy, just storing them by ID (if applicable).
When a session ends, you only have to discard the map in the session, and that's all.
When a new session begins, just create a new Map instance...
You could also pool these maps if that is critical, but avoid premature optimization, as it causes far more problems than it avoids.
If you need to access a node, fetch it from the original Map, then check whether it exists in the "session local" map, and merge the data from the two if it is found. (Or, if you store the full node in the "session local" map and not just the changed attributes, use the changed node from that map.)
Also, be careful: there are many places here that can introduce memory leaks...

Java based memcached client, optimization of putting data inside memcache

I have, say, a list of 1000 beans which I need to share among different projects. I use memcache for this purpose. Currently, a loop runs over the complete list and each bean is stored in memcache with a unique memcache id. I was wondering whether, instead of putting each bean into memcache independently, I could put all the beans in a hashmap (keyed by the same ids used for storing the beans in memcache) and then put this hashmap into memcache.
Will this give me any significant improvement over putting each bean into memcached individually? Or will it cause trouble because of the large size of the object?
Any help is appreciated.

It won't get you any particular benefit, and it will probably be slower on load: serialization is serialization, and adding a hashmap wrapper around it just increases the amount of data that needs to be deserialized and populated. For retrievals, assuming that most lookups are for a single discrete key (the one you would use in your hashmap), you'll have a much slower retrieval time because you'll be pulling down the whole graph just to get at one of its discrete members.
Of course, if the data is entirely static and you're only using memcached to populate values in various JVMs, you can do it that way and just hold on to the hashmap in a static field... but then you're multiplying your memory consumption by the number of nodes in the cluster.

I did some optimization work in spymemcached that helps it do the right thing when doing the wire encoding.
This may or may not help you with your application. In general, just measure when you have performance questions about your app.

Update cached data in a hashtable

In order to minimize the number of database queries I need some sort of cache to store pairs of data. My approach now is a hashtable (with Strings as keys, Integers as value). But I want to be able to detect updates in the database and replace the values in my "cache". What I'm looking for is something that makes my stored pairs invalid after a preset timespan, perhaps 10-15 minutes. How would I implement that? Is there something in the standard Java package I can use?

I would use an existing solution (there are many cache frameworks).
Ehcache is great; it can reset the values after a given timespan, and I bet it can do much more (it's the only one I've used).

You can either use an existing solution (see the previous reply),
or, if you want a challenge, write your own simple cache class (not recommended for a production project, but it's a great learning experience).
You will need at least 3 members:
The cached data, stored as a hashtable object
The next cache expiration date
The cache expiration interval, set via the constructor
Then simply have public data getter methods, which check the cache expiration status:
if not expired, call the hashtable's accessors;
if expired, first call a "data load" method (the same one the constructor calls to pre-populate the cache) and then call the hashtable's accessors.
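
A minimal sketch of such a class, under a few assumptions: the names are made up, the loader stands in for your database query, and the clock is injected only so that expiry can be tested (in real use you would pass `System::currentTimeMillis`).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Whole-cache expiration: all entries are reloaded once the interval has passed. */
final class ExpiringCache<K, V> {
    private final Supplier<Map<K, V>> loader; // the "data load" step, e.g. a DB query
    private final long intervalMillis;        // cache expiration interval
    private final LongSupplier clock;         // current time in milliseconds
    private Map<K, V> data;                   // the cached data
    private long expiresAt;                   // next cache expiration time

    ExpiringCache(Supplier<Map<K, V>> loader, long intervalMillis, LongSupplier clock) {
        this.loader = loader;
        this.intervalMillis = intervalMillis;
        this.clock = clock;
        reload(); // pre-populate
    }

    /** Reloads first if the cache has expired, then delegates to the map. */
    synchronized V get(K key) {
        if (clock.getAsLong() >= expiresAt) {
            reload();
        }
        return data.get(key);
    }

    private void reload() {
        data = new HashMap<>(loader.get()); // copy so later loader-side changes don't leak in
        expiresAt = clock.getAsLong() + intervalMillis;
    }
}
```

For the 10-15 minute window in the question, the interval would be something like `15 * 60 * 1000L`.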
For an even cooler cache class (I have implemented one in Perl at my job), there is additional functionality you can implement:
Individual per-key cache expiration (coupled with overall total cache expiration)
Auto, semi-auto, and single-shot data reload (e.g., reload the entire cache at once; reload a batch of data defined by some predefined query; or reload individual data elements piecemeal). The last approach is very useful when your cache has many hits on the same exact keys: that way you don't need to reload the universe every time the 3 keys that are always accessed expire.

You could use a caching framework like OSCache, Ehcache, JBoss Cache, JCS... If you're looking for something that follows a "standard", choose a framework that supports the JCache standard interface (javax.cache), aka JSR-107.
For simple needs like the ones you describe, I'd look at Ehcache or OSCache (I'm not saying they are basic, but they are simple to start with); both support time-based expiration.
If I had to choose one solution, I'd recommend Ehcache, which is my preference, especially now that it has joined Terracotta. And just for the record, Ehcache provides a preview implementation of JSR-107 via the net.sf.ehcache.jcache package.
