Is there a way to populate a Map once from the DB data (through a Mongo repository) and reuse it when required from multiple classes, instead of hitting the database through the repository each time?
As per your comment, what you are looking for is a caching mechanism. Caches are components that keep data in memory, as opposed to files, databases or other media, so as to allow fast retrieval of information (at the cost of a higher memory footprint).
There are various tutorials online, but caches usually all have the following behaviour:
1. They are key-value pair structures.
2. Each entity living in the cache also has a Time To Live (TTL), that is, how long it will be considered valid.
You can implement this in the repository layer, so the cache mechanism will be transparent to the rest of your application (but you might want to consider exposing functionality that allows clearing/invalidating part or all of the cache).
So basically, when a query comes to your repository layer, check in the cache. If it exists in there, check the time to live. If it is still valid, return that.
If the key does not exist or the TTL has expired, you fetch the data from the database and add/overwrite it in the cache. Keep in mind that when you update the data model yourself, you should also invalidate the affected cache entries so that fresh data is pulled from the DB on the next call.
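As a concrete illustration, here is a minimal sketch of such a TTL cache that the repository layer could wrap around its lookups (class and method names are made up for this example; the loader function stands in for the actual repository/DB call):
import java.time.Duration;
import java.time.Instant;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** A tiny key-value cache where each entry carries its own time-to-live. */
public class SimpleTtlCache<K, V> {

    private static final class Entry<T> {
        final T value;
        final Instant expiresAt;
        Entry(T value, Instant expiresAt) { this.value = value; this.expiresAt = expiresAt; }
    }

    private final Map<K, Entry<V>> entries = new ConcurrentHashMap<>();
    private final Duration ttl;

    public SimpleTtlCache(Duration ttl) { this.ttl = ttl; }

    /** Returns the cached value if present and still valid, otherwise loads it (e.g. from the DB) and caches it. */
    public V get(K key, Function<K, V> loader) {
        Entry<V> entry = entries.get(key);
        if (entry == null || Instant.now().isAfter(entry.expiresAt)) {
            V fresh = loader.apply(key);   // e.g. key -> repository.findById(key)
            entries.put(key, new Entry<>(fresh, Instant.now().plus(ttl)));
            return fresh;
        }
        return entry.value;
    }

    /** Lets callers invalidate a single entry, e.g. after they update the data model themselves. */
    public void invalidate(K key) { entries.remove(key); }

    /** Clears the whole cache. */
    public void invalidateAll() { entries.clear(); }
}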
You can declare the map field as public static, which would allow application-wide access via ClassLoadingData.mapField.
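For illustration, a minimal sketch of that approach (the key and value types are placeholders):
import java.util.HashMap;
import java.util.Map;

public class ClassLoadingData {
    // Populated once (e.g. at application startup) and then read from anywhere
    // via ClassLoadingData.mapField.get(...).
    public static Map<String, String> mapField = new HashMap<>();
}
Bear in mind that a shared mutable static map like this gives you no control over refresh or concurrent access on its own.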
I think a better solution, if I understood the problem correctly, would be a memoized function, that is, a function that stores the result of its call. Here is a sketch of how this could be done (note this does not handle possible synchronization problems in a multi-threaded environment):
class ClassLoadingData {
    private static Map<KeyType, ValueType> memoizedData = new HashMap<>();

    public Map<KeyType, ValueType> getMyData() {
        if (memoizedData.isEmpty()) { // you can use a more complex check to handle data refresh
            populateData();
        }
        return memoizedData;
    }

    private void populateData() {
        // do your query, and put the result into memoizedData
    }
}
Premise: I suggest you use an object-relational mapping tool like Hibernate in your Java project to map the object-oriented domain model to a relational database and let the tool handle the cache mechanism implicitly. Hibernate specifically implements a multi-level caching scheme (take a look at the following link for more information: https://www.tutorialspoint.com/hibernate/hibernate_caching.htm )
Regardless of my suggestion in the premise, you can also manually create a singleton class that will be used by every class in the project that needs to interact with the DB:
public class MongoDBConnector {
private static final Logger LOGGER = LoggerFactory.getLogger(MongoDBConnector.class);
private static MongoDBConnector instance;
//Cache period in seconds
public static int DB_ELEMENTS_CACHE_PERIOD = 30;
//Latest cache update time
private DateTime latestUpdateTime;
//The cache data layer from DB
private Map<KType,VType> elements;
private MongoDBConnector() {
}
public static synchronized MongoDBConnector getInstance() {
if (instance == null) {
instance = new MongoDBConnector();
}
return instance;
}
}
Here you can then define a load method that updates the map with the values stored in the DB, and also a write method that writes values to the DB, with the following characteristics:
1- These methods should be synchronized in order to avoid issues if multiple calls are performed concurrently.
2- The load method should apply a cache-period logic (maybe with a configurable period) to avoid loading the data from the DB on every method call.
Example: Suppose your cache period is 30s. This means that if 10 reads are performed from different points of the code within 30s, you will load data from the DB only on the first call, while the others will read from the cached map, improving performance.
Note: The greater the cache period, the better the performance of your code, but if an insertion is performed externally (from another tool or manually) you'll create an inconsistency between the DB and the cache. So choose the best value for you.
public synchronized Map<KType, VType> getElements() throws ConnectorException {
    final DateTime currentTime = new DateTime();
    if (latestUpdateTime == null || (Seconds.secondsBetween(latestUpdateTime, currentTime).getSeconds() > DB_ELEMENTS_CACHE_PERIOD)) {
        LOGGER.debug("Cache is expired. Reading values from DB");
        //Read from DB and assign the result to the elements map
        //....
        latestUpdateTime = currentTime;
    }
    return elements;
}
3- The store method should automatically update the cache if the insert is performed correctly, regardless of whether the cache period has expired:
public synchronized void storeElement(final VType object) throws ConnectorException {
    //Insert the object into the DB ( throws a ConnectorException if the insert fails )
    //...
    //Update the cache regardless of the cache period
    loadElementsIgnoreCachePeriod();
}
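A possible sketch of the loadElementsIgnoreCachePeriod() helper referenced above (the actual Mongo query is left as a placeholder, consistent with the previous snippets):
private synchronized void loadElementsIgnoreCachePeriod() throws ConnectorException {
    LOGGER.debug("Refreshing the cache from DB, ignoring the cache period");
    //Read all elements from the DB and assign the result to the elements map
    //...
    latestUpdateTime = new DateTime();
}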
Then you can get elements from any point in your code as follows:
Map<KType, VType> liveElements = MongoDBConnector.getInstance().getElements();
I'm pretty much new to Ignite and have a question about the responsibility of client and server nodes. As far as I understood from the documentation, client nodes are very small machines, so it's not their purpose to perform heavy cache operations. For instance, I need to load data from some persistence store, perform some heavy cache-related computations and put the resulting data into the cache. It looks like this:
I.
//This is on a client node
public class Loader {
    private DataSource dataSource;

    @IgniteInstanceResource
    private Ignite ignite;

    public void load() {
        String key;
        String value;
        //retrieve key and value from the dataSource
        IgniteDataStreamer<String, String> streamer = ignite.dataStreamer("cache");
        String result;
        //process value
        streamer.addData(key, result); //<---------1
    }
}
The question is about //1. Is it the client node's responsibility to process the loaded data and put it into the cache? I actually intend to do the following: create a task for each loaded String key and String value, and perform all evaluation and cache-related operations on a server node. Like the following:
II.
public class LoaderJob extends ComputeJobAdapter {
    private String key;
    private String value;

    @Override
    public Object execute() {
        //perform all computation and putting into cache here
        //and return Tuple2(key, result);
    }
}

public class LoaderTask extends ComputeTaskSplitAdapter<Void, Void> {
    //...
    public Void reduce(List<ComputeJobResult> results) throws IgniteException {
        results.stream().forEach(result -> {
            Tuple2<String, String> jobResult = result.getData();
            ignite.dataStreamer("cache").addData(jobResult._1, jobResult._2);
        });
        return null;
    }
}
In the second case, the client only loads data from the persistence store and then publishes tasks to the servers.
What is the common way of doing things like that?
It depends on the amount of data and the computational complexity. In the case of a big amount of data you can load the data right from a server, without using a client.
The simplest approach is to use the DataStreamer directly; you only need to add the loading of the data from your persistent store and do your calculations before handing the data to the streamer.
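For instance, a minimal sketch along those lines (the cache name and the generated keys/values are placeholders; the loop body is where your store would be read and your calculations performed):
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteDataStreamer;
import org.apache.ignite.Ignition;

public class StreamerExample {
    public static void main(String[] args) {
        // Start (or connect to) an Ignite node.
        try (Ignite ignite = Ignition.start()) {
            // Make sure the target cache exists.
            ignite.getOrCreateCache("cache");
            // The streamer buffers entries and ships them in batches to the nodes that own the keys.
            try (IgniteDataStreamer<String, String> streamer = ignite.dataStreamer("cache")) {
                for (int i = 0; i < 1_000; i++) {   // replace with rows read from your persistent store
                    String key = "key-" + i;
                    String value = "value-" + i;    // do your calculations here before streaming
                    streamer.addData(key, value);
                }
            } // closing the streamer flushes the remaining buffered entries
        }
    }
}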
Also, it depends on other things, like the client configuration (CPU, RAM, network) and the connection between the client and the server nodes. If the client has a good configuration, for example comparable to a server, and it's in the same network as the server nodes, then it's not a problem to do the loading and computations on the client and only then stream the data to the cache.
Creating a dedicated job for some data yourself is a bad idea. The streamer already does something like this internally (data is buffered and sent to the specific node where it will be stored).
client nodes are very small machines, so it's not their purpose to perform some heavy cache operations
This is not a true statement. You can give the client JVM enough resources to load the data.
You should create one data streamer on the client side and load the data from this machine. Also, the streamer instance is thread-safe, so you can load data from several threads simultaneously.
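A sketch of that multi-threaded loading with a single shared streamer (the thread count, cache name and generated entries are arbitrary placeholders):
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteDataStreamer;
import org.apache.ignite.Ignition;

public class ParallelLoader {
    public static void main(String[] args) throws InterruptedException {
        try (Ignite ignite = Ignition.start()) {
            ignite.getOrCreateCache("cache");
            ExecutorService pool = Executors.newFixedThreadPool(4);
            // One streamer instance shared by all loading threads; addData() is thread-safe.
            try (IgniteDataStreamer<String, String> streamer = ignite.dataStreamer("cache")) {
                for (int t = 0; t < 4; t++) {
                    final int part = t;
                    pool.submit(() -> {
                        // Each thread reads its own slice of the persistent store (placeholder loop).
                        for (int i = part * 1_000; i < (part + 1) * 1_000; i++) {
                            streamer.addData("key-" + i, "value-" + i);
                        }
                    });
                }
                pool.shutdown();
                pool.awaitTermination(10, TimeUnit.MINUTES);
            } // closing the streamer flushes the remaining buffered entries
        }
    }
}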
IgniteDataStreamer is the fastest way to load data into a cache. So, the first case is valid.
I think the second case makes sense if the data is gathered from the persistence store on the server nodes and the client sends only the parameters for the loading.
Hello. In my web application I maintain the list of URLs authorized for a user in a HashMap, compare the requested URL against it and respond as per the authorization. This Map has the Role as key and the URLs as value, in the form of a List. My problem is: where should I keep this Map?
In the Session: it may hold hundreds of URLs, and that can increase the burden on the session.
In a Cache built at application loading: the URLs may get modified on the fly, and then I would need to resync it by restarting the server.
In a Cache that updates periodically: an application-level cache that is refreshed periodically.
I require a well-optimized approach that serves this purpose; please help me with the same.
I prefer to make it a singleton class and have a thread that updates it periodically. The thread will maintain the state of the cache; this thread is started when you get the first instance of the cache.
public class CacheSingleton {
    private static CacheSingleton instance = null;

    private HashMap<String, Role> authMap;

    protected CacheSingleton() {
        // Exists only to defeat instantiation.
        // Start the thread that maintains your map here.
    }

    // synchronized so that concurrent first calls cannot create two instances
    public static synchronized CacheSingleton getInstance() {
        if (instance == null) {
            instance = new CacheSingleton();
        }
        return instance;
    }

    // Add your cache logic here,
    // e.g. getRole(), checkURL(), etc.
}
Wherever you are in your code, you can then get the cached data:
CacheSingleton.getInstance().yourMethod();
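A minimal sketch of the periodic refresh thread mentioned above, assuming a hypothetical reloadAuthMap() method on the singleton that re-reads the role-to-URL mapping from your store (the 5-minute interval is arbitrary):
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class CacheRefresher {
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    // Started once, e.g. right after the singleton is first created.
    public void start() {
        scheduler.scheduleAtFixedRate(
                () -> CacheSingleton.getInstance().reloadAuthMap(), // hypothetical refresh method
                5, 5, TimeUnit.MINUTES);
    }

    public void stop() {
        scheduler.shutdown();
    }
}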
I had the problem that every time I retrieved a collection from the GWT RequestFactory, the "findEntity()" method was called for every entity in that collection, and this "findEntity()" method calls the SQL database.
I found out that this happens because RequestFactory checks the "liveness" of every entity in the "ServiceLayerDecorator.isLive()" method (also described here: requestfactory and findEntity method in GWT).
So I provided my own RequestFactoryServlet:
public class MyCustomRequestFactoryServlet extends RequestFactoryServlet {
public MyCustomRequestFactoryServlet() {
super(new DefaultExceptionHandler(), new MyCustomServiceLayerDecorator());
}
}
And my own ServiceLayerDecorator:
public class MyCustomServiceLayerDecorator extends ServiceLayerDecorator {
    /**
     * This check normally does a lookup against the DB for every element in a collection,
     * therefore it is overridden.
     */
    @Override
    public boolean isLive(Object domainObject) {
        return true;
    }
}
This works so far and I don't get this massive amount of queries against the database.
Now I am wondering if I will get some other issues with that? Or is there a better way to solve this?
RequestFactory expects a session-per-request pattern with the session guaranteeing a single instance per entity (i.e. using a cache).
The proper fix is to have isLive hit that cache, not the database. If you use JPA or JDO, they should do that for you for free. What matters is what "the request" thinks about it (if you issued a delete request, isLive should return false), not really what's exactly stored in the DB, taking into account what other users could have done concurrently.
That being said, isLive is only used for driving EntityProxyChange events on the client side, so if you don't use them, it shouldn't cause any problem unconditionally returning true like you do.
Example from SpringSource:
@Cacheable(value = "vets")
public Collection<Vet> findVets() throws DataAccessException {
    return vetRepository.findAll();
}
How does findVets() work exactly?
The first time it is called, it takes the data from the vetRepository and saves the result in the cache. But what happens if a new vet is inserted into the database - does the cache update (out-of-the-box behavior)? If not, can we configure it to update?
EDIT:
But what happens if the DB is updated from an external source (e.g. an application which uses the same DB) ?
@CachePut("vets")
public void save(Vet vet) {..}
You have to tell the cache that an object is stale. If data changes without going through your service methods then, of course, you will have a problem. You can, however, clear the whole cache with
@CacheEvict(value = "vets", allEntries = true)
public void clearCache() {..}
It depends on the caching provider though. If another app updates the database without notifying your app but it uses the same cache, then the other app would probably update the cache too.
It would not happen automatically, and there is no way for the cache to know that the data was introduced externally.
Check @CacheEvict, which will help you invalidate the cache entry in case of any change to the underlying collections.
@CacheEvict(value = "vet", allEntries = true)
public void saveVet() {
    // Intentionally blank
}
allEntries
Whether all the entries inside the cache(s) are removed.
By default, only the value under the associated key is removed. Note that setting this parameter to true and specifying a key is not allowed.
You can also use @CachePut on the method which creates the new entry. The return type has to be the same as in your @Cacheable method.
@CachePut(value = "vets")
public Collection<Vet> updateVets() throws DataAccessException {
    return vetRepository.findAll();
}
In my opinion an external service has to call the same methods.
In my application I have to simulate various situations for analysis, and thus insert a (very) large number of rows into a database. (We're talking about a very large amount of data...several billion rows.)
Model
@Entity
public class Case extends Model {
    public String url;
}
Job
public class Simulator extends Job {
    public void doJob() {
        for (int i = 0; i < n; i++) { // n: the number of rows to insert (the loop bound was cut off in the original snippet)
            // Some stuff
            new Case(someString).save();
        }
    }
}
After half an hour, there is still nothing in the database. But debug traces show Play inserts some stuff. I suspect it is some kind of cache.
I've tried about everything:
Model.em().flush();
Changes nothing.
Model.em().getTransaction().commit();
throws TransactionRequiredException occured : no transaction is in progress
Model.em().setFlushMode(FlushModeType.COMMIT);
Model.em().setFlushMode(FlushModeType.AUTO);
Changes nothing.
I've also tried @NoTransaction annotations everywhere:
Class & functions in Controller
Class Case
Overriding save method in Model
Class & functions of my Job
Getting quite desperate. Every kind of advice is welcome.
EDIT: After a little research, the first row appears in the database. The associated ID is about 550,000. That means about half a million rows are somewhere in between my application and the database.
Try
em.getTransaction().begin();
em.persist(model);
em.getTransaction().commit();
You can't commit a transaction before you begin it.
As per the documentation, the job should have its own transaction enabled, as Play requests do, so that's not the issue. Try doing this:
for (int i = 0; i < n; i++) { // n: the number of rows to insert (the loop bound was cut off in the original snippet)
    // Some stuff
    Case tmp = new Case(someString);
    tmp = JPA.em().merge(tmp);
    tmp.save();
}
The idea is that you add the newly created object to the EntityManager explicitly before saving, making sure the object is part of the "dirty objects" that will be persisted.
You need to instruct Play! when it should run your job by annotating your class with one of these annotations: @OnApplicationStart, @Every or @On.
Please check the Play! documentation on jobs.
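For example, a sketch assuming Play 1.x job annotations (the interval is arbitrary; use @OnApplicationStart to run once at startup, or @On with a CRON expression for a schedule):
import play.jobs.Every;
import play.jobs.Job;

// Runs automatically every hour once the application is started.
@Every("1h")
public class Simulator extends Job {
    @Override
    public void doJob() {
        // simulation / insertion logic goes here
    }
}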