I am trying to find the usual design/approach for "static/global" data access/storage in a web app; I'm using Struts 2. Background: I have a number of tables I want to display in my web app.
Problem 1.
The tables will only change and be updated once a day on the server; I don't want to access a database or load a file for every request to view a table.
I would prefer to load the tables into some global memory/cache once (a day), and have each request get the table from there rather than access a database.
I imagine this is a common scenario with an established approach, but I can't find it at the moment.
For Struts 2, is the ActionContext the right place for this data?
If so, any link to a tutorial would be really appreciated.
Problem 2.
The tables were stored in an XML file that I unmarshalled with JAXB to get the table objects, and from those the lists for the tables.
For a small application this was OK, but for the web app I think it's hacky to store the XML as a resource and read in the file via the servlet context and parse it, or is it?
I realise I may be told to store the tables in a database, accessing them with a DAO and using Hibernate to get the objects.
I am just curious what the usual approach is with data already stored in an XML file, given I will have new XML files daily.
Apologies if the questions are basic; I have a large amount of books/reference material, but it's just taking me time to get the higher-level design answers.
Not having really looked at the caching options, I would fetch the data from the DB myself, but only after an interval has passed.
Usually you work within the Action scope; the next level up is the Session, and the most global is the Application. A simple way to test this is to create an Action class which implements ApplicationAware. Then you can get the values put there from any JSP/action... anywhere you can get to the ActionContext (which is almost anywhere). See: http://struts.apache.org/2.0.14/docs/what-is-the-actioncontext.html
Anyway, I would implement a basic interceptor which checks whether new data should be available and has not already been loaded, then loads the new data (the user triggering this interceptor may not need the new data, so doing this in a new thread would be a good idea).
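The once-a-day refresh guard could be sketched roughly like this (class and method names here are illustrative, not from Struts; the real version would delegate the reload to a background thread):

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.AtomicReference;

// Illustrative sketch: reload shared data only after a refresh interval has passed.
public class IntervalCache {
    private static final long REFRESH_MS = 24 * 60 * 60 * 1000L; // once a day
    private static final AtomicLong lastLoaded = new AtomicLong(0);
    private static final AtomicReference<List<String>> tables =
            new AtomicReference<>(Collections.emptyList());

    public static List<String> getTables() {
        long now = System.currentTimeMillis();
        long last = lastLoaded.get();
        // Only the thread that wins the CAS performs the (possibly slow) reload.
        if (now - last > REFRESH_MS && lastLoaded.compareAndSet(last, now)) {
            tables.set(loadFromDatabase()); // in practice, run in a new thread
        }
        return tables.get();
    }

    // Stand-in for the real database/XML load.
    private static List<String> loadFromDatabase() {
        return Arrays.asList("table1", "table2");
    }
}
```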
This method increases the complexity, as you are responsible for managing some data structures and making them co-operate with the ORM.
I've done this to load data from tables which will never need to be loaded again, and that data stands on its own (I don't need to find relationships between it and other tables). This is quick and dirty; Steven's solution is far more robust and would probably pay you back at a later date when further performance is a requirement.
This isn't really specific to Struts2 at all. You definitely do not want to try storing this information in the ActionContext -- that's a per-request object.
You should look into a caching framework like EHCache or something similar. If you use Hibernate for your persistence, Hibernate has options for caching data so that it does not need to hit the database on every request. (Hibernate can also use EHCache for its second-level cache).
As mentioned earlier, the best approach would be using EHCache or some other trusted cache manager.
Another approach is to use a factory to access the information. For instance, something to the effect of:
public class MyCache {
    private static MyCache cache = new MyCache();

    public static MyCache getCache() {
        return cache;
    }

    // (data members)

    private MyCache() {
        // (update data members)
    }

    public synchronized SomeType getXXX() {
        ...
    }

    public synchronized void setXXX(SomeType data) {
        ...
    }
}
You need to make sure you synchronize all your reads and writes to make sure you don't have race conditions while updating the cache.
synchronized (MyCache.getCache()) {
    MyCache.getCache().getXXX();
    MyCache.getCache().getTwo();
...
}
etc.
Again, it is better to use EHCache or something else turn-key, since this is likely to be fickle without a good understanding of the mechanisms. This sort of cache also has performance issues, since it only allows ONE thread to read/write to the cache at a time. (Possible ways to speed it up are thread locals and read/write locks, but that sort of thing is already built into many of the established cache managers.)
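The read/write-lock variant mentioned above could be sketched as follows (class and field names are illustrative). Readers proceed concurrently; only writers take exclusive access:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Illustrative cache using a ReadWriteLock: many concurrent readers, exclusive writers.
public class RwCache {
    private static final RwCache INSTANCE = new RwCache();
    private final Map<String, Object> data = new HashMap<>();
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    private RwCache() {}

    public static RwCache getInstance() { return INSTANCE; }

    public Object get(String key) {
        lock.readLock().lock();          // readers do not block each other
        try { return data.get(key); }
        finally { lock.readLock().unlock(); }
    }

    public void put(String key, Object value) {
        lock.writeLock().lock();         // writers get exclusive access
        try { data.put(key, value); }
        finally { lock.writeLock().unlock(); }
    }
}
```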
So, partly out of frustration with SQL syntax, I decided to try to implement my own Database. I don't need to perform very complex operations - only need to do row lookups and addition of new rows. I have two data structures, User and Circle. These are then put into Java's List and the final Database object looks like this (note that it implements Serializable):
public class Database implements Serializable {
private static final long serialVersionUID = 5790642843089065198L;
List<User> users;
List<Circle> circles;
public Database() {
users = new ArrayList<User>();
circles = new ArrayList<Circle>();
}
}
Whenever I update the object, I also use ObjectOutputStream to "save" the object to a file. Whenever I read from the database, I use ObjectInputStream to "get" the object from a file. I also have a DatabaseHelper class which extends Thread. This class is rather long, but to put it simply, it initializes the Database object as a static variable. My question is not about a specific problem I am having; in fact, I have confirmed that my code works as I expect it to. The database is saved permanently when the program exits or even fails. I am also able to bring up a number of clients which all have an independent connection to the Database, but are also able to see each other's commits.
The problem I am having has to do with design. Whenever I open up a Thread, the whole Database is read (it's updated only if a commit is made). How do enterprise databases work, say, when you need to do a row lookup? Is the whole table read into memory from a file?
This may be a better question for cs.stackexchange.com, but any guidance is appreciated.
A common approach in databases is to use memory-mapped files. This gives you the convenience of having all the data in memory almost immediately, without having to wait for the data to actually load.
In Java this means mapping your files off heap and bringing the data on heap as needed. Once you write the data off heap, it will be asynchronously saved by the OS.
I have a SharedHashMap which is designed for GC-free serialization, concurrent access across processes/threads, and lazy persistence. Using memory-mapped files means you can read a key/value by touching/reading only a very small number of pages; the rest of the data doesn't need to be in memory.
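A minimal illustration of memory-mapped file access in Java (the file path and mapping size are arbitrary): the mapping behaves like an in-memory array, and the OS pages data in and out lazily.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

// Illustrative sketch: map a file into memory and read/write it like an array.
public class MappedExample {
    public static int writeAndRead(String path) {
        try (RandomAccessFile file = new RandomAccessFile(path, "rw");
             FileChannel channel = file.getChannel()) {
            // Map the first 1 KB of the file; pages are loaded on demand.
            MappedByteBuffer buf = channel.map(FileChannel.MapMode.READ_WRITE, 0, 1024);
            buf.putInt(0, 12345);   // write lands in the page cache, flushed by the OS
            return buf.getInt(0);   // read touches only the mapped page
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }
}
```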
Implementing a database on your own is not a good idea, as even elementary grouping/map-reduce features will cost you time you would certainly rather spend on the business logic you're about to develop.
Java offers many possibilities for easy data access; certainly the most advanced are JPA (the Java Persistence API; an official, generalized API for accessing nearly any database system without the need to write raw SQL queries) and Hibernate.
You may want to use one of these, as they implement exactly what you want (object serialization/hydration), are fast, reliable and use standard RDBMS in the background.
My application handles an HTML form with a POST method, and the webapp should generate a static file (XLS) from the user-entered data.
I'd like to generate a static link for the user, i.e. /download/{uuid}. This URI should return the statically generated file so that the user can share or bookmark the link (such a link could be destroyed after some time, maybe some days).
My webapp doesn't use any db, and I'd like to avoid using a db only for one table of key-value data.
The question is how to implement this approach in Spring MVC, considering thread safety.
Should I create a Spring bean with singleton scope and synchronized methods for adding/reading a Map of uuid/file-path entries?
Please, tell me the best way to solve this problem.
If you are going to use an in-memory data structure, then a singleton-scoped object would be one approach. You could create a custom bean, or you could simply create a synchronized HashMap (by wrapping it using Collections.synchronizedMap) or a ConcurrentHashMap instance.
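The ConcurrentHashMap route could look roughly like this (class and method names are illustrative): the map handles concurrent reads and writes without external locking.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative registry mapping a download UUID to a generated file's path.
public class DownloadRegistry {
    // ConcurrentHashMap is thread safe without synchronized blocks.
    private static final Map<String, String> LINKS = new ConcurrentHashMap<>();

    public static String register(String filePath) {
        String id = UUID.randomUUID().toString();
        LINKS.put(id, filePath);
        return id;                      // becomes the /download/{uuid} part
    }

    public static String lookup(String id) {
        return LINKS.get(id);           // null if expired or unknown
    }
}
```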
But the problem with that approach is twofold:
It doesn't scale. If you have too many users, or the key-value data is too large, you can end up using too much memory.
The key-value data will be lost when your server is (hard) restarted.
I think you should consider a database, or alternatively consider implementing persistent sessions and storing the key-value data as session state.
Thinking outside the box, there is one solution that requires no storage and yet is thread safe. Instead of creating the file and then generating the static link (and putting their relation in a map, database or other key-value storage), you create the resulting link first, and then you use the link to generate the name of the file using some kind of transformation method. When the user requests the same file later on, you use the same transformation method to re-generate the name of the file, and thus you need no storage at all! The simplest implementation of the transformation method is of course to use the URL as the file name (be aware of URL encoding/decoding), but you can make it as difficult as you want.
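One possible transformation method (my own illustration, not the answerer's code) is to hash the link, so the same link always maps to the same file name:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Illustrative transformation method: derive the stored file name from the
// link itself, so no uuid-to-file map or database is needed at all.
public class LinkToFileName {
    public static String fileNameFor(String link) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] hash = md.digest(link.getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder();
            for (byte b : hash) sb.append(String.format("%02x", b));
            return sb + ".xls";  // the same link always yields the same name
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // SHA-256 is always available
        }
    }
}
```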
I need to "preload" some data from a database on servlet startup.
So I thought to create some cache e.g. using a HashMap or some similar synchronized version.
I also need to update the cache on database update changes.
So I thought to add some kind of "listener".
My question is: is this somehow available, or do I have to implement it myself?
If yes, what design pattern would be the best approach here?
Update:
No JPA or ORM used. But Spring is available
Yes, of course you can implement that.
I'll draw a small architecture and then I'll explain it to you:
First of all, you can learn about Mappers here and TDGs here.
A mapper has a method called cacheAll() which delegates to the TDG's method cacheAll(), whose mission in turn is to get all rows from a table in the db (the rows you want to cache in the cache object).
So basically, first you have to create a listener implementing ServletContextListener, which means it's a listener for the whole servlet context; inside its contextInitialized you have to call mp.fill(Mapper.cacheAll()), so it is something like this (this is general code; of course write it better and optimize it):
public class myServletContextListener implements ServletContextListener {

    @Override
    public void contextInitialized(ServletContextEvent sce) {
        mp.fill(Mapper.cacheAll());
    }

    // ...
}
Don't forget to add your listener in web.xml:
<listener>
    <listener-class>myServletContextListener</listener-class>
</listener>
So what this will do, on startup of the server, is cache all records into a hashmap mp in a cache object.
As for updating cache based on database change, you will have to use observer pattern.
UPDATE
I forgot to mention: about the cache object, I assume you want it accessible for all users of your app, so you should code it as a singleton (singleton pattern), with code like this:
public class cacheObject
{
    private static Map cMap;
    private static cacheObject cObject;

    private cacheObject()
    {
        cMap = Mapper.cacheAll();
    }

    public static synchronized cacheObject getInstance()
    {
        if (cObject == null) {
            cObject = new cacheObject();
        }
        return cObject;
    }
}
Also, if the data that you want to cache can be changed by users, make it a ThreadLocal singleton.
You may find your needs served best by Guava here. The wiki article on Caches is probably most relevant to you, but the exact approach here would depend heavily on the conditions for database update changes. If you want to refresh the whole cache on database update changes -- or at least invalidate old entries -- you might just call Cache.invalidateAll() whenever a database update occurs. If you're willing to let the cache be only slightly behind the times, using CacheBuilder.refreshAfterWrite(long, TimeUnit) might work well for you.
HashMap and its thread-safe variant ConcurrentHashMap are already available.
There are also caching solutions available, such as Ehcache, which provide advanced support like eviction policies and many more things.
As for the design pattern, read into the Observer design pattern.
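The observer pattern suggested above could be sketched like this (names are illustrative): the cache registers a listener, and whatever code performs the database update fires a notification.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Illustrative observer sketch: listeners are notified whenever a database
// write goes through the application's data-access layer.
public class CacheInvalidation {
    public interface UpdateListener { void onTableChanged(String table); }

    // CopyOnWriteArrayList allows safe iteration while listeners are added.
    private static final List<UpdateListener> LISTENERS = new CopyOnWriteArrayList<>();

    public static void addListener(UpdateListener l) { LISTENERS.add(l); }

    // Called by the code path that performs the database update.
    public static void fireTableChanged(String table) {
        for (UpdateListener l : LISTENERS) l.onTableChanged(table);
    }
}
```

A cache would register a listener that re-runs cacheAll() (or evicts the stale entries) when its table changes.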
I actually had a production-level project where I needed to do something like this. My solution was (and this is just MY solution) to load the object (your "data") into memory at servlet start-up. This decision was made because the object was large enough that pulling it from the database made client requests sluggish, AND I had a small number of concurrent users. Any request that would change the object's data in the database would also change the in-memory object. You would of course need to use a synchronized object to do this if you are working with a lot of users. If the amount of data is not huge, then you could always pull from the database every time the user requests information about the data.
Good Luck.
I'm new to Google App Engine, and I've spent the last few days building an app using GAE's Memcache to store data. Based on my initial findings, it appears as though GAE's Memcache is NOT global?
Let me explain further. I'm aware that different requests to GAE can potentially be served by different instances (in fact this appears to happen quite often). It is for this reason, that I'm using Memcache to store some shared data, as opposed to a static Map. I thought (perhaps incorrectly) that this was the point of using a distributed cache so that data could be accessed by any node.
Another definite possibility is that I'm doing something wrong. I've tried both JCache and the low-level Memcache API (I'm writing Java, not Python). This is what I'm doing to retrieve the cache:
MemcacheService cache = MemcacheServiceFactory.getMemcacheService();
After deployment, this is what I examine (via my application logs):
The initial request is served by a particular node, and data is stored into the cache retrieved above.
The next few requests retrieve this same cache and the data is there.
When a new node gets spawned to serve a request (from the logs I know when this happens because GAE logs the fact that "This request caused a new process to be started for your application .."), the cache is retrieved and is EMPTY!!
Now I also know that there is no guarantee of how long data will stay in Memcache, but from my findings it appears the data is gone the moment a different instance tries to access the cache. This seems to go against the whole concept of a distributed global cache, no?
Hopefully someone can clarify exactly how this SHOULD behave. If Memcache is NOT supposed to be global and every server instance has its own copy, then why even use Memcache? I could simply use a static HashMap (which I initially did, until I realized it wouldn't be global due to different instances serving my requests).
Help?
Yes, Memcache is shared across all instances of your app.
I found the issue and got it working. I was initially using the JCache API and couldn't get it to work, so I switched over to the low-level Memcache API but forgot to remove the old JCache code. So the two implementations were stepping on each other.
I'm not sure why the JCache implementation didn't work so I'll share the code:
try {
    if (CacheManager.getInstance().getCache(CACHE_GEO_CLIENTS) == null) {
        Cache cache = CacheManager.getInstance().getCacheFactory().createCache(Collections.emptyMap());
        cache.put(CACHE_GEO_CLIENTS, new HashMap<String, String>());
        CacheManager.getInstance().registerCache(CACHE_GEO_CLIENTS, cache);
    }
} catch (CacheException e) {
    log.severe("Exception while creating cache: " + e);
}
This block of code is inside a private constructor for a singleton called CacheService. This singleton serves as a Cache facade. Note that since requests can be served by different nodes, each node will have this Singleton instance. So when the Singleton is constructed for the first and only time, it'll check to see if my cache is available. If not, it'll create it. This should technically happen only once since Memcache is global yeah? The other somewhat odd thing I'm doing here is creating a single cache entry of type HashMap to store my actual values. I'm doing this because I need to enumerate through all keys and that's something that I can't do with Memcache natively.
What am I doing wrong here?
Jerry, there are two issues I see with the code you posted above:
1) You are using the javax.cache version of the API. According to Google, this has been deprecated:
http://groups.google.com/group/google-appengine-java/browse_thread/thread/5820852b63a7e673/9b47f475b81fb40e?pli=1
Instead, it is intended that we use the net.sf.jsr107 library until the JSR is finalized.
I don't know that using the old API will cause a specific issue, but still could be trouble.
2) I don't see how you are putting and getting from the cache, but the put statement you have is a bit strange:
cache.put(CACHE_GEO_CLIENTS, new HashMap());
It looks like you are putting a second cache inside the main cache.
I have very similar code, but I'm putting and getting individual objects into the cache, not Maps, keyed by a unique ID. And it is working fine for me across multiple instances on GAE.
-John
I have a Java servlet that retrieves data from a MySQL database. In order to minimize round trips to the database, it is retrieved only once, in the init() method, and placed into a HashMap<> (i.e. cached in memory).
For now, this HashMap is a member of the servlet class. I need not only to store this data but also to update some values (counters, in fact) in the cached objects of the underlying hashmap value class. And there is a Timer (or cron task) to schedule dumping these counters to the DB.
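The counter-plus-scheduled-dump setup looks roughly like this (a simplified sketch; names are illustrative, and the real dump would call the DAO):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Illustrative counter cache: in-memory counters flushed to the DB on a schedule.
public class CounterCache {
    private static final ConcurrentMap<String, AtomicLong> COUNTERS = new ConcurrentHashMap<>();

    public static long increment(String key) {
        // computeIfAbsent creates the counter atomically on first use
        return COUNTERS.computeIfAbsent(key, k -> new AtomicLong()).incrementAndGet();
    }

    public static long get(String key) {
        AtomicLong c = COUNTERS.get(key);
        return c == null ? 0 : c.get();
    }

    // Would be started from init(); dumpToDatabase stands in for the real DAO call.
    public static ScheduledExecutorService startFlusher(Runnable dumpToDatabase) {
        ScheduledExecutorService ses = Executors.newSingleThreadScheduledExecutor();
        ses.scheduleAtFixedRate(dumpToDatabase, 1, 1, TimeUnit.MINUTES);
        return ses;
    }
}
```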
So, after googling, I found 3 options for storing the cached data:
1) as now, as a member of the servlet class (but servlets can be taken out of service and put back into service by the container at will, and then the data will be lost);
2) in the ServletContext (am I right that it is recommended to store only small amounts of data here?);
3) in a JNDI resource.
What is the most preferred way?
Put it in the ServletContext, but use a ConcurrentHashMap to avoid concurrency issues.
From those 3 options, the best is to store it in the application scope. I.e. use ServletContext#setAttribute(). You'd like to use a ServletContextListener for this. In normal servlets you can access the ServletContext by the inherited getServletContext() method. In JSP you can access it by ${attributename}.
If the data is getting excessive large that it eats too much of Java's memory, then you should consider a 4th option: use a cache manager.
The most obvious way would be to use something like Ehcache and store the data in that. Ehcache is a cache manager that works much like a hash map, except the cache manager can be tweaked to hold things in memory, move them to disk, flush them, even write them into a database via a plugin, etc. It depends on whether the objects are serializable, and whether your app can cope without the data (i.e. make another round trip if necessary), but I would trust a cache manager to do a better job of it than a hand-rolled solution.
If your cache can become large enough and you access it often, it will be reasonable to utilize some caching solution. For example, Ehcache is a good candidate and is easily integrated with Spring applications too. Documentation is here.
Also check this overview of open-source caching solutions for Java.