Caching a function with refresh rate - java

I have the following scenario. Below is a minimalist version of what I am trying to do
in a simple Spring Boot REST API controller and service.
void func() {
    String lv = vService.getlv("1.2.1");
    String mv = vService.getMv("1.3.1");
}
@Service
public class VService {

    public String getlv(String version) {
        JsonArray lvVersions = makeHTTPGETLv();
        String result = getVersion(lvVersions, version);
        return result;
    }

    public String getMv(String version) {
        JsonArray mvVersions = makeHTTPGETMv();
        String result = getVersion(mvVersions, version);
        return result;
    }

    private String getVersion(JsonArray versions, String version) {
        Map<String, String> versionMap = new HashMap<>();
        for (JsonElement je : versions) {
            JsonObject jo = je.getAsJsonObject();
            // some logic
            versionMap.put(jo.get("ver").getAsString(), jo.get("rver").getAsString());
        }
        return versionMap.get(version);
    }
}
The getVersion method builds a map internally every time it is called.
The getlv and getMv methods both invoke getVersion.
In the call getVersion(lvVersions, version), the lvVersions value is always the same:
it is a JSON array containing the mapping between versions.
So each time getlv is called we make a GET request (makeHTTPGETLv) and then search for the corresponding mapping of the given version.
Because this data is essentially static, it could be cached.
The repeated GET calls to makeHTTPGETLv could be avoided, since the call always returns the same JSON array (though it may change every few days), and the map built inside getVersion is likewise unchanging.
Since the JsonArray, which is the response of a GET request, can change over time, there should be logic to refresh the cache every x minutes, where x could be 5 minutes or 60 minutes.
How do I achieve this in Spring Boot?
Also, to avoid the cost of the first call, I could load the cache eagerly at startup.
Steps Taken:
I tried using the @Cacheable annotation on top of the functions. It didn't have any effect; it seems I will need some more logic there. I could extract a separate function that builds the map currently built inside getVersion and use that map for every call. But this map needs to be refreshed periodically (every 5 minutes, 60 minutes, or even once a day), which requires a GET call and rebuilding the map. That reduces the problem to updating the map every x minutes, because then I could fetch the corresponding version directly from the map and avoid the GET call and response parsing on every request.
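For reference, a minimal sketch of the declarative route, assuming @EnableCaching and @EnableScheduling are present on a configuration class (a missing @EnableCaching, or calling the cached method from within the same class so the proxy is bypassed, is a common reason @Cacheable appears to have no effect). The cache name lvVersions and the eviction method are illustrative, not part of the original code:
import com.google.gson.JsonArray;
import org.springframework.cache.annotation.CacheEvict;
import org.springframework.cache.annotation.Cacheable;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Service;

@Service
public class VService {

    // Caches the result per "version" argument; Spring Boot auto-configures a simple
    // ConcurrentMap-based cache manager once @EnableCaching is present.
    @Cacheable("lvVersions")
    public String getlv(String version) {
        JsonArray lvVersions = makeHTTPGETLv();
        return getVersion(lvVersions, version);
    }

    // Clears the cache periodically so the next getlv() call performs a fresh GET;
    // 30 minutes here is just an example refresh period.
    @CacheEvict(value = "lvVersions", allEntries = true)
    @Scheduled(fixedRate = 30 * 60 * 1000)
    public void evictLvVersions() {
        // no body needed; the annotations do the work
    }

    // makeHTTPGETLv() and getVersion() as in the question ...
}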
A similar, reduced/alternate problem:
class Test {
    private Map<String, String> map;

    private void buildMap() {
    }
}
The buildMap function updates the map attribute. Is there a way to have buildMap called every 30 minutes so that the map stays up to date? That sounds like a cron job. I'm not sure whether caching helps here or whether a cron job is the right tool, and how to achieve that in Spring Boot.
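A minimal sketch of that scheduled-rebuild approach, assuming @EnableScheduling is enabled; the class and method names are illustrative. Swapping in a freshly built map behind a volatile field keeps readers safe while the rebuild runs, and @PostConstruct gives the eager first load mentioned above:
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import javax.annotation.PostConstruct; // jakarta.annotation.PostConstruct on newer Boot versions
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;

@Component
class VersionMapHolder {

    // Replaced wholesale on every rebuild; volatile makes the swap visible to all threads.
    private volatile Map<String, String> versionMap = Collections.emptyMap();

    // Build eagerly at startup (@PostConstruct), then rebuild every 30 minutes (@Scheduled).
    @PostConstruct
    @Scheduled(fixedRate = 30 * 60 * 1000, initialDelay = 30 * 60 * 1000)
    public void buildMap() {
        Map<String, String> fresh = new HashMap<>();
        // ... make the GET request and fill "fresh" from the JSON array ...
        versionMap = fresh;
    }

    public String getVersion(String version) {
        return versionMap.get(version);
    }
}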

Related

Running Async Threading for Bulk API Calls in Spring Boot

I am trying to call a lot of APIs that need a lot of time to finish, so I tried to use threading to make it faster. I'm using Spring Boot, and the internet suggested using @Async, but there's a problem: after testing it, I realized that @Async only applies per async method call. I don't know if I was mistaken about this, but overall I've created a controller that queues up a lot of @Async method calls (up to 50) and then runs CompletableFuture.allOf(apiCallsList.toArray(new CompletableFuture[0])).join().
Is there a way to make this simpler? Because I don't know how to set the taskExecutor to limit the thread pool if it is written like this.
Example code:
// get 50 person objects into the personPool
if (personPool.size() == 50) {
    List<CompletableFuture<Map<String, String>>> results = new ArrayList<>();
    for (Person person : personPool) {
        // add each async API method call to a list
        results.add(service.asyncApiProcess(service.callApi(person.getName())));
    }
    // run all async methods in parallel
    CompletableFuture.allOf(results.toArray(new CompletableFuture[0])).join();
    for (CompletableFuture<Map<String, String>> result : results) {
        // write to xlsx
    }
    personPool.clear(); // clears the personPool for the next 50 entries
}
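One way to bound the pool, sketched under the assumption that the async calls live in a Spring-managed service: declare a ThreadPoolTaskExecutor bean and point @Async at it by name (the bean name apiExecutor and the pool sizes are illustrative):
import java.util.concurrent.Executor;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.scheduling.annotation.EnableAsync;
import org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor;

@Configuration
@EnableAsync
public class AsyncConfig {

    // At most 10 API calls run concurrently; the rest wait in the queue.
    @Bean("apiExecutor")
    public Executor apiExecutor() {
        ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
        executor.setCorePoolSize(10);
        executor.setMaxPoolSize(10);
        executor.setQueueCapacity(100);
        executor.setThreadNamePrefix("api-");
        executor.initialize();
        return executor;
    }
}
The async service method would then be annotated with @Async("apiExecutor"), so all 50 submitted calls share this bounded pool instead of the default executor.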

Flux.zip returns same results every time, instead of running query for each "zipped" tuple

I'm relatively new to Webflux and am trying to zip two objects together in order to return a tuple, which is then being used to build an object in the following step.
I'm doing it like so:
// Will shorten my code quite a bit.
// I'm including only the pieces that are involved in my problem: the "Flux.zip" call.

// This is a repository that is used in my "problem" code. It is simply an
// interface which extends ReactiveCrudRepository from Spring Data.
MyRepository repo;

// wiring in my repository...
public MyClass(MyRepository repo) {
    this.repo = repo;
}

// Below is later in my reactive chain.
// Starting halfway down the chain, we have a Flux of objA
// (flatMapMany returning Flux<ObjectA>).
// Problem code below...
// Some context: I am zipping ObjectA with a reference to an object
// I am saving. I am saving an object from earlier, which is stored in an
// AtomicReference<ObjectB>.
.flatMap(obj ->
    Flux.zip(Mono.just(obj), repo.save(atomicReferenceFromEarlier.get()))
        // Below, when calling "getId()" it logs the SAME ID FOR EACH OBJECT,
        // even though I want it to return EACH OBJECT'S ID THAT WAS SAVED.
        .map(myTuple2 -> log("I've saved this object {} ::", myTuple2.getT2().getId())))
// Further processing....
So my ultimate issue is that the "second" parameter I'm zipping, repo.save(someAtomicReferencedObject.get()), is the same for every "zipped" tuple.
In the following step I'm logging something like "I'm now building object", just to see which object was returned for each event, but the "second" object in my tuple is always the same.
How can I zip and ensure that the "save" call to the repo returns a unique object for each event in the flux?
However, when I check the database, I really have saved unique entities for each event in my flux. So the save is happening as expected, but the Mono the repo returns is the same one for each tuple.
Please let me know if I should clarify anything that is unclear. Thank you in advance for any and all help.

Cron Scheduler with WebClient

I am working with Spring Boot. I am trying to send data from one database to the other.
First, I did this by making a GET request to fetch the data from the first database and then a POST through WebClient to send the data to the other database. It worked!
But when I tried to do it with a cron scheduler using the @Scheduled annotation, it does not post the data to the database. The function itself runs fine (I verified this by printing from it, and the data was fine too), but the WebClient never posts the data.
The Cron class is:
@Component
public class NodeCronScheduler {

    @Autowired
    GraphService graphService;

    @Scheduled(cron = "*/10 * * * * *")
    public void createAllNodesFiveSeconds() {
        graphService.saveAlltoGraph("Product");
    }
}
The saveAlltoGraph function takes all the tuples from a Product table and sends a POST request to the API of the graph database, which creates nodes from the tuples.
Here is the function:
public Mono<Statements> saveAlltoGraph(String label) {
    JpaRepository currentRepository = repositoryService.getRepository(label);
    List<Model> allModels = currentRepository.findAll();
    Statements statements = statementService.createAllNodes(allModels, label);
    //System.out.println(statements);
    return webClientService.sendStatement(statements);
}
First, the label "Product" is used to get the JpaRepository for that table. Then we fetch all the tuples of that table into a list and create objects from them (we can use a serializer to get the JSON).
Here is the sendStatement function:
public Mono<Statements> sendStatement(Statements statements) {
    System.out.println(statements);
    return webClient.post()
            .uri("http://localhost:7474/db/data/transaction/commit")
            .body(Mono.just(statements), Statements.class)
            .retrieve()
            .bodyToMono(Statements.class);
}
Everything is working when we call this saveAlltoGraph using a get request mapping, but not working with the scheduler.
I tried adding .block() and .subscribe() to that, and things started working with the cron scheduler.
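That matches how reactive types behave: when a controller returns the Mono, Spring WebFlux subscribes to it, but a @Scheduled method returns it into the void, so nothing subscribes and the WebClient request never executes. A minimal sketch of the scheduler triggering the call itself (the logging inside subscribe is illustrative):
@Scheduled(cron = "*/10 * * * * *")
public void createAllNodesFiveSeconds() {
    graphService.saveAlltoGraph("Product")
            // Subscribe (or block()) so the reactive pipeline actually runs;
            // without a subscriber the POST is never sent.
            .subscribe(
                    statements -> System.out.println("Posted: " + statements),
                    error -> error.printStackTrace());
}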

How to create a reusable Map

Is there a way to populate a Map once from the DB (through a Mongo repository) and reuse it when required from multiple classes, instead of hitting the database through the repository each time?
As per your comment, what you are looking for is a caching mechanism. Caches are components that keep data in memory, as opposed to files, databases or other mediums, so as to allow fast retrieval of information (at the cost of a higher memory footprint).
There are probably various tutorials online, but usually all caches have the following behaviour:
1. They are key-value pair structures.
2. Each entry living in the cache also has a Time To Live (TTL), that is, how long it will be considered valid.
You can implement this in the repository layer, so the cache mechanism will be transparent to the rest of your application (but you might want to consider exposing functionality that allows you to clear/invalidate part or all of the cache).
So basically, when a query comes to your repository layer, check the cache. If the key exists in there, check its time to live. If it is still valid, return the cached value.
If the key does not exist or the TTL has expired, fetch from the DB and add/overwrite the data in the cache. Keep in mind that when you update the data model yourself, you must also invalidate the cache accordingly so that fresh data will be pulled from the DB on the next call.
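As a minimal illustration of those two points (a key-value structure plus a TTL check on read), here is a sketch not tied to any particular store; the loader function stands in for the repository call, and all names are illustrative:
import java.time.Duration;
import java.time.Instant;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

class TtlCache<K, V> {

    private static final class Entry<V> {
        final V value;
        final Instant loadedAt;
        Entry(V value, Instant loadedAt) { this.value = value; this.loadedAt = loadedAt; }
    }

    private final Map<K, Entry<V>> entries = new ConcurrentHashMap<>();
    private final Duration ttl;
    private final Function<K, V> loader; // e.g. a repository lookup

    TtlCache(Duration ttl, Function<K, V> loader) {
        this.ttl = ttl;
        this.loader = loader;
    }

    V get(K key) {
        Entry<V> entry = entries.get(key);
        if (entry == null || entry.loadedAt.plus(ttl).isBefore(Instant.now())) {
            entry = new Entry<>(loader.apply(key), Instant.now()); // miss or expired: reload
            entries.put(key, entry);
        }
        return entry.value;
    }

    void invalidate(K key) {
        entries.remove(key); // call this after writing through to the DB
    }
}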
You could declare the map field as public static, which would allow application-wide access via ClassLoadingData.mapField.
I think a better solution, if I understood the problem, would be a memoized function, that is, a function that stores the result of its call. Here is a sketch of how this could be done (note that it does not handle possible synchronization problems in a multi-threaded environment):
class ClassLoadingData {

    private static Map<KeyType, ValueType> memoizedData = new HashMap<>();

    public Map<KeyType, ValueType> getMyData() {
        if (memoizedData.isEmpty()) { // you can use a more complex check to handle data refresh
            populateData();
        }
        return memoizedData;
    }

    private void populateData() {
        // do your query, and assign the result to memoizedData
    }
}
Premise: I suggest you use an object-relational mapping tool like Hibernate in your Java project to map the object-oriented domain model to a relational database and let the tool handle the cache mechanism implicitly. Hibernate specifically implements a multi-level caching scheme (take a look at the following link for more information: https://www.tutorialspoint.com/hibernate/hibernate_caching.htm).
Regardless of my suggestion in the premise, you can also manually create a singleton class that will be used by every class in the project that interacts with the DB:
public class MongoDBConnector {

    private static final Logger LOGGER = LoggerFactory.getLogger(MongoDBConnector.class);

    private static MongoDBConnector instance;

    // Cache period in seconds
    public static int DB_ELEMENTS_CACHE_PERIOD = 30;

    // Latest cache update time
    private DateTime latestUpdateTime;

    // The cached data layer from the DB
    private Map<KType, VType> elements;

    private MongoDBConnector() {
    }

    public static synchronized MongoDBConnector getInstance() {
        if (instance == null) {
            instance = new MongoDBConnector();
        }
        return instance;
    }
}
Here you can then define a load method that updates the map with values stored in the DB, and also a write method that writes values to the DB, with the following characteristics:
1. These methods should be synchronized in order to avoid issues if multiple calls are performed.
2. The load method should apply a cache-period logic (maybe with a configurable period) to avoid loading the data from the DB on every method call.
Example: suppose your cache period is 30 s. This means that if 10 reads are performed from different points of the code within 30 s, you will load data from the DB only on the first call, while the others will read from the cached map, improving performance.
Note: the greater the cache period, the better the performance of your code, but if an insertion is performed on the DB externally (from another tool or manually) the cache will become inconsistent. So choose the best value for your case.
public synchronized Map<KType, VType> getElements() throws ConnectorException {
    final DateTime currentTime = new DateTime();
    if (latestUpdateTime == null || Seconds.secondsBetween(latestUpdateTime, currentTime).getSeconds() > DB_ELEMENTS_CACHE_PERIOD) {
        LOGGER.debug("Cache is expired. Reading values from DB");
        // Read from DB and update the cache
        // ...
        latestUpdateTime = currentTime;
    }
    return elements;
}
3. The store method should automatically update the cache if the insert is performed correctly, regardless of whether the cache period has expired:
public synchronized void storeElement(final VType object) throws ConnectorException {
    // Insert object into the DB (throws a ConnectorException if the insert fails)
    // ...
    // Update the cache regardless of the cache period
    loadElementsIgnoreCachePeriod();
}
Then you can get the elements from every point in your code as follows:
Map<KType, VType> liveElements = MongoDBConnector.getInstance().getElements();

Using Spring @Cacheable with @PostFilter

I'm attempting to use both the @Cacheable and @PostFilter annotations in Spring. The desired behavior is that the application will cache the full, unfiltered list of Segments (it's a very small and very frequently referenced list, so performance is the goal), but that a User will only have access to certain Segments based on their roles.
I started out with both @Cacheable and @PostFilter on a single method, and when that wasn't working I broke them out into two separate classes so I could have one annotation on each method. However, it behaves the same either way: when User A hits the service for the first time they get their correct filtered list, then when User B hits the service next they get NO results, because the cache is only storing User A's filtered results and User B does not have access to any of them. (So the PostFilter still runs, but the cache seems to be storing the filtered list, not the full list.)
So here's the relevant code:
configuration:
@Configuration
@EnableCaching
@EnableGlobalMethodSecurity(prePostEnabled = true)
public class BcmsSecurityAutoConfiguration {

    @Bean
    public CacheManager cacheManager() {
        SimpleCacheManager cacheManager = new SimpleCacheManager();
        cacheManager.setCaches(Arrays.asList(
                new ConcurrentMapCache("bcmsSegRoles"),
                new ConcurrentMapCache("bcmsSegments")
        ));
        return cacheManager;
    }
}
Service:
@Service
public class ScopeService {

    private final ScopeRepository scopeRepository;

    public ScopeService(final ScopeRepository scopeRepository) {
        this.scopeRepository = scopeRepository;
    }

    // Filters the list of segments based on User Roles. A User will have one role for each segment
    // they have access to, and then it's just a simple equality check between the role and the Segment model.
    @PostFilter(value = "@bcmsSecurityService.canAccessSegment( principal, filterObject )")
    public List<BusinessSegment> getSegments() {
        List<BusinessSegment> segments = scopeRepository.getSegments();
        return segments; // Debugging shows 4 results for User A (post-filtered to 1), and 1 result for User B (post-filtered to 0)
    }
}
Repository:
@Repository
public class ScopeRepository {

    private final ScopeDao scopeDao; // This is a MyBatis interface.

    public ScopeRepository(final ScopeDao scopeDao) {
        this.scopeDao = scopeDao;
    }

    @Cacheable(value = "bcmsSegments")
    public List<BusinessSegment> getSegments() {
        List<BusinessSegment> segments = scopeDao.getSegments(); // Simple SELECT * FROM TABLE; works as expected.
        return segments; // Shows 4 results for User A; breakpoint not hit for User B because the cache takes over.
    }
}
Does anyone know why the cache seems to be storing the result of the Service method after the filter runs, rather than storing the full result set at the Repository level as I expect? Or do you have another way to achieve my desired behavior?
Bonus points if you know how I could gracefully achieve both caching and filtering on the same method in the Service. I only built the superfluous Repository because I thought splitting the methods would resolve the caching problem.
It turns out that the contents of Spring caches are mutable, and the @PostFilter annotation modifies the returned list in place; it does not filter into a new one.
So when @PostFilter ran after my Service method call above, it was actually removing items from the list stored in the cache, so the second request only had one result to start with, and the third would have zero.
My solution was to modify the Service to return new ArrayList<>(scopeRepo.getSegments()); so that @PostFilter wasn't changing the cached list.
(Note that this is not a deep clone, of course, so if someone modified a Segment model upstream from the Service, the model in the cache would likely change as well. So this may not be the best solution, but it works for my use case.)
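For clarity, the adjusted Service method would look roughly like this; the copy is what keeps @PostFilter from mutating the list instance held by the ConcurrentMapCache:
@PostFilter(value = "@bcmsSecurityService.canAccessSegment( principal, filterObject )")
public List<BusinessSegment> getSegments() {
    // Copy the cached list so @PostFilter removes elements from the copy,
    // not from the list stored in the cache.
    return new ArrayList<>(scopeRepository.getSegments());
}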
I can't believe Spring Caches are mutable...
