Cache implementation on DAO with custom refresh and eviction (Java)

In my application, I have a scenario where I have to refresh the cache every 24 hours.
I'm expecting database downtime, so I need to implement a use case that refreshes the cache after 24 hours only if the database is up and running.
I'm using Spring with Ehcache and have implemented a simple cache that refreshes every 24 hours, but I am unable to get my head around how to retain the cached data during database downtime.

Conceptually you could split the scheduling and cache eviction into two modules and only clear your cache if a certain condition (in this case, the database's healthcheck returning true) is met:
SomeCachedService.java:
class SomeCachedService {

    @Autowired
    private YourDao dao;

    @Cacheable("your-cache")
    public YourData getData() {
        return dao.queryForData();
    }

    @CacheEvict("your-cache")
    public void evictCache() {
        // no body needed
    }
}
CacheMonitor.java:
class CacheMonitor {

    @Autowired
    private SomeCachedService service;

    @Autowired
    private YourDao dao;

    // annotation values must be compile-time constants, so spell out 24 hours in millis
    @Scheduled(fixedDelay = 24 * 60 * 60 * 1000L)
    public void conditionallyClearCache() {
        if (dao.isDatabaseUp()) {
            service.evictCache();
        }
    }
}
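For the @Scheduled job and the cache annotations to be processed at all, scheduling and caching support must be enabled somewhere in the configuration. A minimal sketch, assuming a Spring Boot application (the class name is illustrative):
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.cache.annotation.EnableCaching;
import org.springframework.scheduling.annotation.EnableScheduling;

@SpringBootApplication
@EnableCaching    // activates @Cacheable / @CacheEvict processing
@EnableScheduling // activates @Scheduled processing
public class Application {

    public static void main(String[] args) {
        SpringApplication.run(Application.class, args);
    }
}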
Ehcache also allows you to create a custom eviction algorithm but the documentation doesn't seem too helpful in this case.

Related

Spring Boot + Spring Data: Concurrent/parallel save/insert into databases

A small question regarding Spring Boot and Spring Data: how can I save a POJO into many databases concurrently, in parallel, please?
I have a very simple Spring Boot application which does nothing but expose a REST endpoint to save a POJO:
@RestController
public class SaveController {

    @Autowired
    MyElasticRepository myElasticRepository;
    @Autowired
    MyMongoRepository myMongoRepository;
    @Autowired
    MyAARepository myAARepository;
    // @Autowired MyBBRepository, MyCCRepository, ... MyYYRepository
    @Autowired
    MyZZRepository myZZRepository;

    @GetMapping("/saveSequential")
    public String saveSequential(@RequestBody MyPojo myPojo) {
        MyPojo myPojoFromElastic = myElasticRepository.save(myPojo);
        MyPojo myPojoFromMongo = myMongoRepository.save(myPojo);
        MyPojo myPojoFromAA = myAARepository.save(myPojo);
        // myBBRepository.save(myPojo), myCCRepository.save(myPojo), ... myYYRepository.save(myPojo)
        MyPojo myPojoFromZZ = myZZRepository.save(myPojo);
        return ...;
    }
}
However, the POJO needs to be saved in many databases; by many, imagine a good dozen different databases.
As of now, as you can see from the code, the POJO is saved in each of the databases sequentially. I timed the application, as well as monitoring the DBs, and the inserts come one after another.
Hypothetically, if one save takes one second and I have 20 DBs, the REST endpoint takes 20-ish seconds to complete.
Since each operation is not dependent on any of the others, i.e. saving the POJO in Mongo has no dependency on the data saved in Oracle, etc., I would like to optimize performance by doing the operations in parallel.
I.e., if each save takes one second and I have 20 DBs, parallelizing the saves should still take something like one-ish second. (I am exaggerating.)
For the sake of the question, let us imagine the machine doing the save has many cores, is a very good machine, etc.
What I tried:
I tried using the @Async annotation on the repository, such as:
@Async
@Repository
public interface MyElasticRepository extends ElasticsearchRepository<MyPojo, String> {
}
But unfortunately, timing the endpoint, it still takes the full sequential time.
May I ask how to achieve this parallel, concurrent save, please?
If possible, I would like to leverage existing features of the Spring Framework, and not have to rewrite boilerplate concurrency code.
Thank you!
I think you are best off creating a service layer with an async method.
import java.util.concurrent.CompletableFuture;

import org.springframework.data.repository.CrudRepository;
import org.springframework.scheduling.annotation.Async;
import org.springframework.stereotype.Service;

@Service
public class MyPojoPersister {

    // CrudRepository (rather than the marker interface Repository) is what
    // declares save(). The call runs on Spring's async executor and the
    // caller gets the future back immediately.
    @Async
    public CompletableFuture<MyPojo> savePojo(CrudRepository<MyPojo, String> repo, MyPojo pojo) {
        return CompletableFuture.completedFuture(repo.save(pojo));
    }
}
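Note that the async method has to live in its own Spring bean: @Async is applied through a proxy, so a call to an @Async method from within the same class would bypass the proxy and run synchronously.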
Then your controller would look something like this:
@RestController
public class SaveController {

    @Autowired
    MyElasticRepository myElasticRepository;
    @Autowired
    MyMongoRepository myMongoRepository;
    @Autowired
    MyAARepository myAARepository;
    // @Autowired MyBBRepository, MyCCRepository, ... MyYYRepository
    @Autowired
    MyZZRepository myZZRepository;

    @Autowired
    MyPojoPersister myPojoPersister;

    @GetMapping("/saveSequential")
    public String saveSequential(@RequestBody MyPojo myPojo) {
        // kick off all saves; each @Async call returns a future immediately
        var futureList = Stream.of(myElasticRepository, myMongoRepository, myAARepository, myZZRepository)
                .map(repo -> myPojoPersister.savePojo(repo, myPojo))
                .collect(Collectors.toList());
        // block until every save has completed
        CompletableFuture.allOf(futureList.toArray(new CompletableFuture[futureList.size()])).join();
        var someString = futureList.stream()
                .map(CompletableFuture::join)
                .map(MyPojo::getId)
                .collect(Collectors.joining(","));
        return someString;
    }
}
I assumed that you want to return a comma-separated list of the ids of the pojos, since they would presumably be different for each repo. But do whatever you need to with the values of the futures.
Don't forget to enable asynchronicity!
@SpringBootApplication
@EnableAsync
public class MyPojoApplication {

    public static void main(String[] args) {
        SpringApplication.run(MyPojoApplication.class, args);
    }
}
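Depending on the Spring version, @Async methods otherwise fall back to a default executor. To control how many saves actually run in parallel, you can define your own executor bean; a minimal sketch (the bean name and pool sizes are illustrative):
import java.util.concurrent.Executor;

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor;

@Configuration
public class AsyncConfig {

    // referenced explicitly with @Async("saveExecutor"), or picked up as the
    // default when it is the only TaskExecutor bean in the context
    @Bean(name = "saveExecutor")
    public Executor saveExecutor() {
        ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
        executor.setCorePoolSize(20); // roughly one thread per database
        executor.setMaxPoolSize(20);
        executor.setThreadNamePrefix("save-");
        executor.initialize();
        return executor;
    }
}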

Implement caching framework (Ehcache) to cache the LookupCode and Location dropdown values during server startup

I want to load both the LookupCode and Location data from the database into cache memory using Spring Ehcache when the application starts, i.e. when the server starts, before any other method is called. In future, a few more dropdowns will be added, so there should be a common method to cache whatever data comes in, based on the criteria of the dropdown data.
There are an Entity, a Repository and a Service already written for LookupCode and Location.
I have written the below to implement the caching framework:
ehcache.xml
<cache name="LookupCodeRepository.getDropdownValues"/>
<cache name="LocationRepository.getDropdownValues"/>
application.properties
spring.jpa.properties.hibernate.cache.use_second_level_cache=false
spring.jpa.properties.hibernate.cache.use_query_cache=false
spring.jpa.properties.hibernate.cache.region.factory_class=org.hibernate.cache.ehcache.EhCacheRegionFactory
spring.jpa.properties.hibernate.cache.provider_class=org.hibernate.cache.EhCacheProvider
spring.jpa.properties.hibernate.cache.use_structured_entries=true
spring.jpa.properties.hibernate.cache.region_prefix=
spring.jpa.properties.hibernate.cache.provider_configuration_file_resource_path=ehcache.xml
and I am using the hibernate-ehcache jar in pom.xml.
WebConfig.java
@Configuration
public class WebConfig implements ServletContextInitializer {

    @Autowired
    CustomCache cache;

    @Override
    public void onStartup(ServletContext servletContext) throws ServletException {
        cache.loadCache();
    }
}
CustomCache.java
@Component // must be a bean so it can be autowired into WebConfig
public class CustomCache {

    @Autowired
    private LookupCodeService lkupSer;
    @Autowired
    private LocationService locSer;

    public void loadCache() {
        List<LookupCode> lkup = lkupSer.getDropdownValues();
        List<Location> locat = locSer.getDropdownValues();
    }
}
So here, in the loadCache() method, instead of calling each individual service it should be automatic: whatever service is created should automatically be cached. So there should be a common method to cache whatever data comes in, based on the criteria of the dropdown data.
How do I implement that?
The services you want to work with have a common method. Define an interface for that method:
interface ProvidesDropdownValues<T> {
    List<T> getDropdownValues();
}
Now you can do:
@Service
class DropdownValuesService {

    @Autowired
    ApplicationContext context;

    @Cacheable("dropdownValues") // a cache name is required; "dropdownValues" is illustrative
    public List<?> getDropdownValues(String beanName) {
        ProvidesDropdownValues<?> bean = (ProvidesDropdownValues<?>) context.getBean(beanName);
        return bean.getDropdownValues();
    }
}
If your services don't have bean names you could work with class names instead.
For load on startup you could do:
class StartupWarmupService {

    @Autowired
    ApplicationContext context;
    @Autowired
    DropdownValuesService dropDowns;

    @PostConstruct
    void startup() {
        for (String n : context.getBeanNamesForType(ProvidesDropdownValues.class)) {
            dropDowns.getDropdownValues(n);
        }
    }
}
I suggest that the warmup code only runs in the production application. That is why it makes sense to keep it separate from the general caching logic. For testing a single service you don't want to load everything, and startup times for developers should be fast.
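One way to express that, assuming Spring profiles are in use (the profile name "production" is illustrative), is to guard the warmup bean with @Profile so it simply does not exist outside production:
import javax.annotation.PostConstruct;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.ApplicationContext;
import org.springframework.context.annotation.Profile;
import org.springframework.stereotype.Service;

@Service
@Profile("production") // only instantiated when the "production" profile is active
class StartupWarmupService {

    @Autowired
    ApplicationContext context;
    @Autowired
    DropdownValuesService dropDowns;

    // warm every dropdown cache as soon as this bean is constructed
    @PostConstruct
    void startup() {
        for (String n : context.getBeanNamesForType(ProvidesDropdownValues.class)) {
            dropDowns.getDropdownValues(n);
        }
    }
}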
Disclaimer: I am not a heavy Spring user, so details may be wrong but the basic approach should work out.

How to cache data during application startup in a Spring Boot application

I have a Spring Boot application connecting to a SQL Server database. I need some help with using caching in my application. I have a CodeCategory table which holds lists of codes for many code types. This table is loaded every month, and the data changes only once a month.
I want to cache this entire table when the application starts. Any subsequent call to the table should get values from this cache instead of calling the database.
For example,
List<CodeCategory> findAll();
I want the above DB query value to be cached during application startup. A DB call like List<CodeCategory> findByCodeValue(String code) should then fetch its result from the already-cached data instead of calling the database.
Please let me know how this can be achieved using Spring Boot and Ehcache.
As pointed out, it takes some time for Ehcache to set up, and it doesn't work completely with @PostConstruct. In that case, make use of ApplicationStartedEvent to load the cache.
GitHub Repo: spring-ehcache-demo
@Service
class CodeCategoryService {

    @Autowired
    private CodeCategoryRepository repo;

    @EventListener(classes = ApplicationStartedEvent.class)
    public void listenToStart(ApplicationStartedEvent event) {
        this.repo.findByCodeValue("100");
    }
}

interface CodeCategoryRepository extends JpaRepository<CodeCategory, Long> {

    @Cacheable(value = "codeValues")
    List<CodeCategory> findByCodeValue(String code);
}
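None of this takes effect unless caching support is switched on. A minimal sketch, assuming Spring Boot with a cache provider on the classpath (the configuration class name is illustrative):
import org.springframework.cache.annotation.EnableCaching;
import org.springframework.context.annotation.Configuration;

@Configuration
@EnableCaching // without this, @Cacheable on the repository is silently ignored
public class CacheConfig {
}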
Note: There are multiple ways as pointed by others. You can choose as per your needs.
My way is to define a generic cache handler:
@FunctionalInterface
public interface GenericCacheHandler {
    List<CodeCategory> getCodes();
}
And its implementation is as below:
@Component
@EnableScheduling // Important
public class GenericCacheHandlerImpl implements GenericCacheHandler {

    @Autowired
    private CodeRepository codeRepo;

    private List<CodeCategory> codes = new ArrayList<>();

    @PostConstruct
    private void initializeBudgetState() {
        List<CodeCategory> codeList = codeRepo.findAll();
        // Any customization goes here
        codes = codeList;
    }

    @Override
    public List<CodeCategory> getCodes() {
        return codes;
    }
}
Call it in the service layer as below:
@Service
public class CodeServiceImpl implements CodeService {

    @Autowired
    private GenericCacheHandler genericCacheHandler;

    @Override
    public List<CodeCategory> anyMethod() {
        return genericCacheHandler.getCodes();
    }
}
Use second-level Hibernate caching to cache all the required DB queries.
For caching at application start-up, we can use @PostConstruct in any of the service classes.
The syntax will be:
@Service
public class AnyService {

    @PostConstruct
    public void init() {
        // call any method
    }
}
Use the CommandLineRunner interface.
Basically, you can create a Spring @Component and implement the CommandLineRunner interface. You will have to override its run method, which is called at the start of the app.
@Component
public class DatabaseLoader implements CommandLineRunner {

    @Override
    public void run(String... args) {
        // Any code here gets called at the start of the app.
    }
}
This approach is mostly used to bootstrap the application with some initial data.

Caching lookups on application startup doesn't work

I am using Spring Boot 1.5.9 on Tomcat 9.0.2 and I am trying to cache lookups using Spring's @Cacheable, scheduling a cache refresh job that runs on application startup and repeats every 24 hours, as follows:
@Component
public class RefreshCacheJob {

    private static final Logger logger = LoggerFactory.getLogger(RefreshCacheJob.class);

    @Autowired
    private CacheService cacheService;

    @Scheduled(fixedRate = 3600000 * 24, initialDelay = 0)
    public void refreshCache() {
        try {
            cacheService.refreshAllCaches();
        } catch (Exception e) {
            logger.error("Exception in RefreshCacheJob", e);
        }
    }
}
and the cache service is as follows:
@Service
public class CacheService {

    private static final Logger logger = LoggerFactory.getLogger(CacheService.class);

    @Autowired
    private CouponTypeRepository couponTypeRepository;

    @CacheEvict(cacheNames = Constants.CACHE_NAME_COUPONS_TYPES, allEntries = true)
    public void clearCouponsTypesCache() {}

    public void refreshAllCaches() {
        clearCouponsTypesCache();
        List<CouponType> couponTypeList = couponTypeRepository.getCoupons();
        logger.info("######### couponTypeList: " + couponTypeList.size());
    }
}
the repository code:
public interface CouponTypeRepository extends JpaRepository<CouponType, BigInteger> {

    @Query("from CouponType where active=true and expiryDate > CURRENT_DATE order by priority")
    @Cacheable(cacheNames = Constants.CACHE_NAME_COUPONS_TYPES)
    List<CouponType> getCoupons();
}
Later in my web service, I try to get the lookup as follows:
@GET
@Produces(MediaType.APPLICATION_JSON + ";charset=utf-8")
@Path("/getCoupons")
@ApiOperation(value = "")
public ServiceResponse getCoupons(@HeaderParam("token") String token, @HeaderParam("lang") String lang) throws Exception {
    try {
        List<CouponType> couponsList = couponTypeRepository.getCoupons();
        logger.info("###### couponsList: " + couponsList.size());
        return new ServiceResponse(ErrorCodeEnum.SUCCESS_CODE, couponsList, errorCodeRepository, lang);
    } catch (Exception e) {
        logger.error("Exception in getCoupons webservice: ", e);
        return new ServiceResponse(ErrorCodeEnum.SYSTEM_ERROR_CODE, errorCodeRepository, lang);
    }
}
On the first call the web service gets the lookup from the database, and only subsequent calls get it from the cache, whereas it should come from the cache on the first call too, since the scheduled job has already populated it.
Why am I seeing this behavior, and how can I fix it?
The issue was fixed after upgrading to Tomcat 9.0.4
While it's not affecting the scheduled task per se, when refreshAllCaches() is invoked in the CacheService, the @CacheEvict on clearCouponsTypesCache() is bypassed, since it's invoked from the same class (see this answer). This leads to the cache not being purged before
List<CouponType> couponTypeList = couponTypeRepository.getCoupons();
is invoked. This means the @Cacheable getCoupons() method will not query the database, but will instead return values from the cache.
As a result, the scheduled cache refresh action does its work properly only once, when the cache is empty. After that it's useless.
The @CacheEvict annotation should be moved to the refreshAllCaches() method with the beforeInvocation = true parameter added, so the cache is purged before being populated, not after.
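A sketch of that fix, reusing the CacheService from the question:
@Service
public class CacheService {

    @Autowired
    private CouponTypeRepository couponTypeRepository;

    // beforeInvocation = true purges the cache before the method body runs,
    // so the @Cacheable repository call below repopulates it with fresh data
    @CacheEvict(cacheNames = Constants.CACHE_NAME_COUPONS_TYPES,
                allEntries = true, beforeInvocation = true)
    public void refreshAllCaches() {
        couponTypeRepository.getCoupons();
    }
}
Since refreshAllCaches() is called from RefreshCacheJob, a different bean, the call crosses the proxy boundary and the eviction actually fires.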
Also, when using Spring 4 / Spring Boot 1.x, these bugs should be taken into consideration:
https://github.com/spring-projects/spring-boot/issues/8331
https://jira.spring.io/browse/SPR-15271
While these bugs don't seem to affect this specific program, it might be a good idea to keep the @Cacheable annotation separate from the JpaRepository interface until migrating to Spring 5 / Spring Boot 2.x.
@CacheEvict won't be invoked when called from within the same service. This is because Spring creates a proxy around the service, and only calls from "outside" go through the cache proxy.
The solution is to either add
@CacheEvict(cacheNames = Constants.CACHE_NAME_COUPONS_TYPES, allEntries = true)
to refreshAllCaches() too, or to move refreshAllCaches() into a new service that calls ICacheService.clearCouponsTypesCache().

Spring cannot propagate transaction to ForkJoin's RecursiveAction

I am trying to implement a multi-threaded solution so I can parallelize my business logic that includes reading and writing to a database.
Technology stack: Spring 4.0.2, Hibernate 4.3.8
Here is some code to discuss:
Configuration
@Configuration
public class PartitionersConfig {

    @Bean
    public ForkJoinPoolFactoryBean forkJoinPoolFactoryBean() {
        final ForkJoinPoolFactoryBean poolFactory = new ForkJoinPoolFactoryBean();
        return poolFactory;
    }
}
Service
@Service
@Transactional
public class MyService {

    @Autowired
    private OtherService otherService;
    @Autowired
    private ForkJoinPool forkJoinPool;
    @Autowired
    private MyDao myDao;

    public void performPartitionedActionOnIds() {
        final ArrayList<UUID> ids = otherService.getIds();
        MyIdsPartitioner task = new MyIdsPartitioner(ids, myDao, 0, ids.size() - 1);
        forkJoinPool.invoke(task);
    }
}
Repository / DAO
@Repository
@Transactional(propagation = Propagation.MANDATORY)
public class MyDao {

    public MyData getData(List<UUID> list) {
        // ...
    }
}
RecursiveAction
public class MyIdsPartitioner extends RecursiveAction {

    private static final long serialVersionUID = 1L;
    private static final int THRESHOLD = 100;

    private ArrayList<UUID> ids;
    private int fromIndex;
    private int toIndex;
    private MyDao myDao;

    public MyIdsPartitioner(ArrayList<UUID> ids, MyDao myDao, int fromIndex, int toIndex) {
        this.ids = ids;
        this.fromIndex = fromIndex;
        this.toIndex = toIndex;
        this.myDao = myDao;
    }

    @Override
    protected void compute() {
        if (computationSetIsSmallEnough()) {
            computeDirectly();
        } else {
            int leftToIndex = fromIndex + (toIndex - fromIndex) / 2;
            MyIdsPartitioner leftPartitioner = new MyIdsPartitioner(ids, myDao, fromIndex, leftToIndex);
            MyIdsPartitioner rightPartitioner = new MyIdsPartitioner(ids, myDao, leftToIndex + 1, toIndex);
            invokeAll(leftPartitioner, rightPartitioner);
        }
    }

    private boolean computationSetIsSmallEnough() {
        return (toIndex - fromIndex) < THRESHOLD;
    }

    private void computeDirectly() {
        // toIndex is inclusive here, while subList's second argument is exclusive
        final List<UUID> subList = ids.subList(fromIndex, toIndex + 1);
        final MyData myData = myDao.getData(subList);
        modifyTheData(myData);
    }

    private void modifyTheData(MyData myData) {
        // ...
        // write to DB
    }
}
After executing this I get:
No existing transaction found for transaction marked with propagation 'mandatory'
I understood that this is perfectly normal, since the transaction doesn't propagate through different threads. So one solution is to create a transaction manually in every thread, as proposed in another similar question; a sketch of that approach is shown below. But this was not satisfying enough for me, so I kept searching.
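A minimal sketch of that manual approach using Spring's TransactionTemplate, assuming a PlatformTransactionManager can be handed to the worker (all names other than the Spring classes are illustrative):
import java.util.List;
import java.util.UUID;

import org.springframework.transaction.PlatformTransactionManager;
import org.springframework.transaction.support.TransactionTemplate;

public class TransactionalChunkWorker {

    private final PlatformTransactionManager txManager;
    private final MyDao myDao;

    public TransactionalChunkWorker(PlatformTransactionManager txManager, MyDao myDao) {
        this.txManager = txManager;
        this.myDao = myDao;
    }

    // each worker thread opens its own transaction: a transaction (and the
    // JDBC connection underneath it) is bound to a single thread
    public void processChunk(List<UUID> chunk) {
        TransactionTemplate tx = new TransactionTemplate(txManager);
        tx.execute(status -> {
            MyData data = myDao.getData(chunk); // MANDATORY propagation is now satisfied
            // ... modify and write back within this thread's transaction
            return null;
        });
    }
}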
In Spring's forum I found a discussion on the topic. One paragraph I find very interesting:
"I can imagine one could manually propagate the transaction context to another thread, but I don't think you should really try it. Transactions are bound to single threads with a reason - the basic underlying resource - jdbc connection - is not threadsafe. Using one single connection in multiple threads would break fundamental jdbc request/response contracts and it would be a small wonder if it would work in more then trivial examples."
So the first question arises: is it worth it to parallelize the reading/writing to the database, and can this really hurt DB consistency?
If the quote above is not true, which I doubt, is there a way to achieve the following:
make MyIdsPartitioner Spring-managed - with @Scope("prototype") - pass the needed arguments for the recursive calls to it, and that way leave the transaction management to Spring?
After further reading I managed to solve my problem. Kind of (as I see it now, there wasn't a problem in the first place).
Since the reading I do from the DB is in chunks, and I am sure the results won't get edited during that time, I can do it outside a transaction.
The writing is also safe in my case, since all values I write are unique and no constraint violations can occur. So I removed the transaction from there too.
By "I removed the transaction" I mean I just overrode the method's propagation mode in my DAO, like:
@Repository
@Transactional(propagation = Propagation.MANDATORY)
public class MyDao {

    @Transactional(propagation = Propagation.SUPPORTS)
    public MyData getData(List<UUID> list) {
        // ...
    }
}
Or if you decide you need the transaction for some reason then you can still leave the transaction management to Spring by setting the propagation to REQUIRED.
So the solution turns out to be much much simpler than I thought.
And to answer my other questions:
Is it worth it to parallelize the reading/writing to the database, and can this really hurt DB consistency?
Yes, it's worth it. And as long as you have a transaction per thread, you are fine.
Is there a way to make MyIdsPartitioner Spring-managed - with @Scope("prototype") - pass the needed arguments for the recursive calls to it, and that way leave the transaction management to Spring?
Yes, there is a way, by using a pool (another Stack Overflow question). Or you can define your bean as @Scope(value = "prototype", proxyMode = ScopedProxyMode.TARGET_CLASS), but then it won't work if you need to set parameters on it, since every usage of the instance will give you a new instance. E.g.:
@Autowired
MyIdsPartitioner partitioner;

public void someMethod() {
    ...
    partitioner.setIds(someIds);
    partitioner.setFromIndex(fromIndex);
    partitioner.setToIndex(toIndex);
    ...
}
This will create three instances, and you won't be able to use the object usefully, since each setter call goes to a different instance and the fields won't all be set on any one of them.
So in short: there is a way, but I didn't need to go for it in the first place.
This should be possible with Atomikos (http://www.atomikos.com), optionally with nested transactions.
If you do this, take care to avoid deadlocks when multiple threads of the same root transaction write to the same tables in the database.
