There are 20 production servers. Whenever the team makes a config change, I am asked to reload the configuration or restart the services to refresh the cache stored in a HashMap.
When transactions hit a server, it picks the configuration values from the map to process them, instead of hitting the DB for every transaction.
I use the following code to connect to each server. I have a couple of questions about this approach:
1) Is this logic fine, or will storing large data in memory cause performance degradation?
2) Is there a better approach you could suggest?
// url points at each server's reload endpoint (java.net.URL / java.net.HttpURLConnection)
httpurlcon = (HttpURLConnection) url.openConnection();
httpurlcon.setDoOutput(true); // we send a request body with the POST
httpurlcon.setRequestMethod("POST");
httpurlcon.setRequestProperty("Content-Type", "application/x-www-form-urlencoded");
httpurlcon.connect();
int responseCode = httpurlcon.getResponseCode(); // verify each server acknowledged the reload
httpurlcon.disconnect();
If you are implementing a cache on the server side using just a HashMap, it might be better to use an LRU (Least Recently Used) cache, which is easy to implement in Java with a LinkedHashMap.
Where the plain HashMap approach could take up too much memory and eventually cause an OutOfMemoryError, an LRU cache evicts elements to stay at a fixed size, keeping the most recently used entries. Also, with a plain HashMap you will need to synchronize all operations on the data structure to ensure thread safety.
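A minimal sketch of such an LRU cache (the class name and size limit are illustrative; wrap it with Collections.synchronizedMap for concurrent access):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// A bounded LRU cache: LinkedHashMap in access-order mode lets us evict
// the least recently used entry once the size limit is exceeded.
public class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    public LruCache(int maxEntries) {
        // accessOrder = true reorders entries on every get(), not just put()
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // called after each put(); returning true drops the eldest entry
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        LruCache<String, String> cache = new LruCache<>(2);
        cache.put("a", "1");
        cache.put("b", "2");
        cache.get("a");      // touch "a", so "b" becomes the eldest entry
        cache.put("c", "3"); // exceeds the limit and evicts "b"
        System.out.println(cache.keySet()); // [a, c]
    }
}
```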
It is also worth tuning some parameters of the database itself. In MySQL, for example, you can tune parameters of InnoDB (MySQL's storage engine) such as the buffer pool size or the query cache. InnoDB's buffer pool is itself an LRU cache, and if the database has a server to itself, you can make the buffer pool quite large and increase performance, since the data will be cached in memory.
Related
I am currently developing an application using Spring MVC 4 and Hibernate 4. I have implemented the Hibernate second-level cache for performance improvement. If I use Redis, an in-memory data structure store used as a database, cache, etc., performance will improve, but will it be a drastic change?
You may see drastic differences if you cache what is worth caching and avoid caching data that should not be cached at all. As beauty is in the eye of the beholder, so it is with performance. Here are several aspects to keep in mind when using Hibernate as a second-level cache provider:
No custom serialization - memory intensive
If you use second-level caching, you cannot use fast serialization frameworks such as Kryo and have to stick with Java serialization, which performs poorly.
On top of this, each entity type gets a separate region, and within each region there is an entry for every key of every entity.
In terms of memory, this is inefficient.
Lacks the ability to store and distribute rich objects
Most modern caches also provide compute-grid functionality; having your objects fragmented into many small pieces reduces your ability to execute distributed tasks with guaranteed data co-location. This depends somewhat on the grid provider, but for many it would be a limitation.
Sub-optimal performance
Depending on how much performance you need and what type of application you have, the Hibernate second-level cache might be a good or a bad choice: good in that it is plug and play ("kind of"), bad because you will never squeeze out the performance you could have gained. Also, designing rich models means more upfront work and more OOP.
Limited querying capabilities on the cache itself
This depends on the cache provider, but some providers are really not good at JOINs or WHERE clauses on anything other than the ID. If you try to build an in-memory index for a query on Hazelcast, for example, you will see what I mean.
Yes, if you use Redis, it will improve your performance.
No, it will not be a drastic change. :)
https://memorynotfound.com/spring-redis-application-configuration-example/
http://www.baeldung.com/spring-data-redis-tutorial
The links above will help you find a way to integrate Redis with your project.
It depends on the load.
If you have 1000 or more requests per second and you are low on RAM, then yes, use Redis nodes on other machines to take some of the load. It will greatly relieve your RAM and improve request speed.
But if that's not your situation, don't use it.
Remember that you can adopt this approach later, once you see what your RAM and database connection pool usage look like.
Your question was already discussed here. Check this link: Application cache v.s. hibernate second level cache, which to use?
This was the most accepted answer, which I agree with:
It really depends on your application querying model and the traffic demands.

Using Redis/Hazelcast may yield the best performance, since there won't be any round-trip to the DB anymore, but you end up having normalized data in the DB and a denormalized copy in your cache, which puts pressure on your cache update policies. So you gain the best performance at the cost of implementing the cache update whenever the persisted data changes.

Using the 2nd-level cache is easier to set up, but it only stores entities by id. There is also a query cache, storing ids returned by a given query. So the 2nd-level cache is a two-step process that you need to fine-tune to get the best performance. When you execute projection queries, the 2nd-level object cache won't help you, since it only operates on entity loads. The main advantage of the 2nd-level cache is that it's easier to keep in sync whenever data changes, especially if all your data is persisted by Hibernate.

So, if you need the ultimate performance and you don't mind implementing cache update logic that ensures a minimum eventual-consistency window, go with an external cache.

If you only need to cache entities (which usually don't change that frequently) and you mostly access them through Hibernate entity loading, then the 2nd-level cache can help you.
Hope it helps!
I have a simple country-states HashMap: a static final unmodifiable concurrent HashMap.
Now we have implemented a memcached cache in our application.
My question is: is it beneficial to get the values from the cache instead of such a simple map?
What benefits will I get, or not get, if I move this map to the cache?
This really depends on the size of the data and how much memory you've allocated for your JVM.
For simple data like the states of a country, which amount to a few hundred entries, a simple HashMap suffices; using memcached is overkill and in fact slower.
If it's a large amount of data that grows (typically tens or hundreds of MBs or larger) and requires frequent access, memcached (or another distributed cache) would be better than purely in-process storage.
A HashMap will be much faster because it lives in the JVM's own memory and lookups happen via a reference; a lookup from memcached requires extra work (a network round-trip and deserialization).
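For reference data that small, a plain static unmodifiable map built once at class load is essentially free to read; a sketch (the class name and sample entries are illustrative, real data would be loaded once from the DB):

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Small, fixed reference data: built once at class load, read-only afterwards,
// so it is safe to share between threads with no synchronization.
public final class CountryStates {
    private static final Map<String, String> STATE_NAMES;
    static {
        Map<String, String> m = new HashMap<>();
        m.put("CA", "California"); // sample entries for illustration
        m.put("TX", "Texas");
        STATE_NAMES = Collections.unmodifiableMap(m);
    }

    private CountryStates() {}

    public static String nameOf(String code) {
        return STATE_NAMES.get(code); // plain in-memory lookup, no network hop
    }
}
```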
If your application is hosted on only one server, then you don't need the distributed features of memcached, and a HashMap will be very fast.
But that is not the case for most web applications: in ~99% of cases you host them on multiple servers and want distributed caching, and memcached is best in such cases.
This may be a dumb question, but I don't even know what to google.
I have a server which fetches some data from the DB and caches it; whenever a request involves this data, it is fetched from the cache instead of from the DB, thereby reducing the time taken to serve the request.
This cache can be modified, i.e. a key may be added to it, deleted, or updated.
Any change which occurs in the cache also happens in the DB.
The problem: due to heavy traffic we now want to add a load balancer in front of my server. Say I add one more server. Then the two servers will have two different caches. If something gets added to the first server's cache, how should I tell the second server's cache to refresh?
If you ultimately decide to move the cache outside your main webserver process, then you could also take a look at consistent hashing. This would be an alternative to a replicated cache.
The problem with replicated caches is that they scale inversely with the number of nodes participating in the cache, i.e. their performance degrades as you add nodes. They work fine when there is a small number of nodes. If data is to be replicated between N nodes (or you need to send eviction messages to N nodes), then every write requires 1 write to the cache on the originating node and N-1 writes to the other nodes.
In consistent hashing, you instead define a hashing function, which takes the key of the data you want to store or retrieve as input, and it returns the id of the server in the cluster which is responsible for caching the data for that key. So each caching server is responsible for a fraction of the overall keys, the client can determine which server will contain the sought data without any lookup, and data and eviction messages do not need to be replicated between caching servers.
The "consistent" part of consistent hashing, refers to how your hashing function handles new servers being added to or removed from the cluster: some re-distribution of keys between servers is required, but the function is designed to minimize the amount of such disruption.
In practice, you do not actually need a dedicated caching cluster, as your caches could run in-process in your web servers; each web server being able to determine the other webserver which should store cache data for a key.
Consistent hashing is used at large scale. It might be overkill for you at this stage. But just be aware of the scalability bottleneck inherent in O(N) messaging architectures. A replicated cache is possibly a good idea to start with.
EDIT: Take a look at Infinispan, a distributed cache which indeed uses consistent hashing out of box.
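A toy consistent-hash ring along these lines can be built on a TreeMap (class and server names are illustrative; real implementations like Infinispan's add replication and better hash spreading):

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.SortedMap;
import java.util.TreeMap;

// Minimal consistent-hash ring: each server is placed at several points
// ("virtual nodes") on a ring; a key maps to the first server clockwise.
public class ConsistentHashRing {
    private final TreeMap<Long, String> ring = new TreeMap<>();
    private final int virtualNodes;

    public ConsistentHashRing(int virtualNodes) {
        this.virtualNodes = virtualNodes;
    }

    public void addServer(String server) {
        // virtual nodes smooth out the key distribution between servers
        for (int i = 0; i < virtualNodes; i++) ring.put(hash(server + "#" + i), server);
    }

    public void removeServer(String server) {
        // only the keys that hashed to this server's points get redistributed
        for (int i = 0; i < virtualNodes; i++) ring.remove(hash(server + "#" + i));
    }

    /** Returns the server responsible for caching this key. */
    public String serverFor(String key) {
        SortedMap<Long, String> tail = ring.tailMap(hash(key));
        // wrap around to the start of the ring if we ran past the last point
        return tail.isEmpty() ? ring.firstEntry().getValue() : tail.get(tail.firstKey());
    }

    private static long hash(String s) {
        try {
            byte[] d = MessageDigest.getInstance("MD5").digest(s.getBytes(StandardCharsets.UTF_8));
            // take 4 bytes of the digest as an unsigned 32-bit ring position
            return ((long) (d[0] & 0xFF) << 24) | ((d[1] & 0xFF) << 16)
                 | ((d[2] & 0xFF) << 8) | (d[3] & 0xFF);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Each web server can hold one of these with the full server list; every node then agrees on which node owns which key without any lookup traffic.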
Any way you like ;) If you have no idea, I suggest you look at or use ehcache or Hazelcast. They may not be the best solutions for you, but they are some of the most widely used. (And CV++ ;) I suggest you understand what they do first.
In Java Web Application, i would like to know if it is a proper (or "standard"?) way that all the essential data such as the config data, message data, code maintenance data, dropdown option data and etc (assuming all data will not updated frequently) are loaded as a "static" variables from database when the server startup.Or is it more preferred way to retrieve data by querying db per request?
Thanks for all your advice here.
It is perfectly valid to pull out all the data that is not going to be modified during the application life-cycle and keep it in memory, as a singleton or something similar.
This is a good idea because it saves DB hits and retrieval is faster. A lot of environment-specific settings and other data can also be pulled once and kept in an immutable HashMap for future requests.
In a common web app you generally do not have so many config/option objects that they eat up a lot of memory and cause an OOM. But if you have a table with hundreds of thousands of config rows, it is better to pull objects as and when requested. And if you do want to keep them in memory, think about putting them in a key-value store like MemcacheD.
We used the DB to store config values and Ehcache to avoid a lot of DB hits. This way you don't need to worry about memory consumption (it will use whatever memory you allow it).
Ehcache is one of many available DB cache solutions and can be configured on top of JPA etc.
You can configure Ehcache (or many other cache providers) to treat the tables as read-only, in which case it will only go to the DB when explicitly told to invalidate the cache. This performs pretty well. The overhead becomes visible when reads occur very frequently (like 100/sec), but usually storing the config value in a local variable, avoiding reads inside loops, and passing it through the method stack during the invocation mitigates this well enough.
Storing values in a singleton as Java objects performs best, but if you want to modify them without an application restart, it becomes a little involved.
Here is a simple way to achieve dynamic configuration with Java objects:
private volatile ImmutableMap<String,Object> param_value
Basically you'll have to start thinking about multi-threaded access and memory issues (though it's quite unlikely that you'll run out of memory because of configuration values, unless you store binary data as config values, etc.).
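Fleshed out, that volatile-field pattern looks like this (ConfigHolder is an illustrative name, and a plain JDK unmodifiable map stands in for Guava's ImmutableMap):

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Dynamic configuration holder: readers always see one consistent, immutable
// snapshot; a reload builds a fresh map and publishes it in a single volatile write.
public class ConfigHolder {
    // volatile guarantees readers see the fully constructed replacement map
    private static volatile Map<String, Object> paramValue = Collections.emptyMap();

    public static Object get(String key) {
        return paramValue.get(key); // no locking needed on the read path
    }

    public static void reload(Map<String, Object> freshValues) {
        // copy, freeze, then swap the reference atomically
        paramValue = Collections.unmodifiableMap(new HashMap<>(freshValues));
    }
}
```

Readers never block, and a half-populated map is never visible, because the map is only published after it is fully built.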
In essence, I'd recommend using the DB and some cache provider unless that part of code really needs high-performance.
I have a Java servlet that retrieves data from a MySQL database. To minimize round-trips to the database, the data is retrieved only once, in the init() method, and placed into a HashMap<> (i.e. cached in memory).
For now, this HashMap is a member of the servlet class. I need not only to store this data but also to update some values (counters, in fact) in the cached objects of the map's value class. And there is a Timer (or cron task) scheduled to dump these counters to the DB.
So, after googling, I found three options for storing the cached data:
1) as now, as a member of the servlet class (but servlets can be taken out of service and put back into service by the container at will, and then the data will be lost);
2) in the ServletContext (am I right that it is recommended to store only small amounts of data here?);
3) in a JNDI resource.
Which is the most preferred way?
Put it in the ServletContext, but use a ConcurrentHashMap to avoid concurrency issues.
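Since the question mentions updating counters from request threads, a sketch of such a ConcurrentHashMap-based counter cache (the class name is illustrative); the single instance would be stored in the ServletContext via setAttribute() and the Timer task would read it back for the periodic DB dump:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Thread-safe counter cache: many request threads can bump counters without
// external synchronization; a scheduled task can later dump them to the DB.
public class CounterCache {
    private final ConcurrentHashMap<String, LongAdder> counters = new ConcurrentHashMap<>();

    public void increment(String key) {
        // computeIfAbsent creates the adder atomically on first use of a key
        counters.computeIfAbsent(key, k -> new LongAdder()).increment();
    }

    public long valueOf(String key) {
        LongAdder adder = counters.get(key);
        return adder == null ? 0 : adder.sum();
    }

    /** Snapshot for the scheduled DB dump. */
    public Map<String, Long> snapshot() {
        Map<String, Long> copy = new ConcurrentHashMap<>();
        counters.forEach((k, v) -> copy.put(k, v.sum()));
        return copy;
    }
}
```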
From those 3 options, the best is to store it in the application scope. I.e. use ServletContext#setAttribute(). You'd like to use a ServletContextListener for this. In normal servlets you can access the ServletContext by the inherited getServletContext() method. In JSP you can access it by ${attributename}.
If the data grows so large that it eats too much of Java's memory, then you should consider a 4th option: use a cache manager.
The most obvious way would be to use something like Ehcache and store the data in it. Ehcache is a cache manager that works much like a hash map, except that it can be tweaked to hold things in memory, move them to disk, flush them, or even write them to a database via a plugin, etc. It depends on whether the objects are serializable and whether your app can cope without the data (i.e. make another round-trip if necessary), but I would trust a cache manager to do a better job than a hand-rolled solution.
If your cache can become large and you access it often, it is reasonable to use a dedicated caching solution. For example, Ehcache is a good candidate and is easily integrated with Spring applications, too. The documentation is here.
Also check this overview of open-source caching solutions for Java.