Association OneToMany not retrieved with Repository on a scheduled task - java

I have a service with a scheduled task, and I want to load an object from the database. It has EAGER associations, so the find method should load it completely.
@Service
public class CustomTask {
    @Autowired
    CustomRepository customRepository;

    @Scheduled(fixedRate = 1000)
    public void action() {
        customRepository.find(1);
    }
}
But here it doesn't work: the associations are null.
Inside a Spring Boot controller, however, the same repository method works perfectly.
Do you know how I can get my whole object in this scheduled method of a service?

The scheduled task first runs right when the app starts, at a moment when the environment may not be fully initialized.
With an initial delay on the task, I can access my whole object:
@Scheduled(initialDelay = 10000, fixedRate = 1000)
NB: this is more a workaround than a fix.

Related

Rollback changes done to a MariaDB database by a spring test without @Transactional

I have a Spring service that does something like this:
@Service
public class MyService {
    @Transactional(propagation = Propagation.NEVER)
    public void doStuff(UUID id) {
        // call an external service, via http for example, can be long
        // update the database, with a transactionTemplate for example
    }
}
The Propagation.NEVER indicates we must not have an active transaction when the method is called because we don't want to block a connection to the database while waiting for an answer from the external service.
Now, how can I properly test this and then roll back the database? @Transactional on the test won't work: there will be an exception because of Propagation.NEVER.
@SpringBootTest
@Transactional
public class MyServiceTest {
    @Autowired
    private MyService myService;

    @Test
    public void testDoStuff() {
        putMyTestDataInDb();
        myService.doStuff(); // <- fails: no transaction should be active
        assertThat(myData).isTheWayIExpectedItToBe();
    }
}
I can remove the @Transactional, but then my database is not in a consistent state for the next test.
For now my solution is to truncate all tables of my database after each test in an @AfterEach JUnit callback, but this is a bit clunky and gets quite slow when the database has more than a few tables.
Here comes my question: how can I roll back the changes done to my database without truncating or using @Transactional?
The database I'm testing against is MariaDB with Testcontainers, so a solution that works only with MariaDB/MySQL would be enough for me. But something more general would be great!
(Another example where I would like to be able to not use @Transactional on the test: sometimes I want to test that transaction boundaries are correctly placed in the code, and not hit lazy loading exceptions at runtime because I forgot a @Transactional somewhere in the production code.)
Some other details, if that helps:
I use JPA with Hibernate
The database is created with Liquibase when the application context starts
Other ideas I've played with:
@DirtiesContext: this is a lot slower, creating a new context is a lot more expensive than just truncating all tables in my database
MariaDB SAVEPOINT: dead end, it's just a way to go back to a state of the database INSIDE a transaction. This would be the ideal solution IMO if it could work globally
Trying to fiddle with connections, issuing START TRANSACTION statements natively on the datasource before the test and ROLLBACK after the tests: really dirty, could not make it work
Personal opinion: @Transactional + @SpringBootTest is (in a way) the same anti-pattern as spring.jpa.open-in-view. Yes, it's easy to get things working at first and having the automatic rollback is nice, but it loses you a lot of flexibility and control over your transactions. Anything that requires manual transaction management becomes very hard to test that way.
We recently had a very similar case and in the end we decided to bite the bullet and use @DirtiesContext instead. Yeah, tests take 30 more minutes to run, but as an added benefit the tested services behave the exact same way as in production and the tests are more likely to catch any transaction issues.
But before we did the switch, we considered using the following workaround:
Create an interface and a service similar to the following:
interface TransactionService
{
    void runWithoutTransaction(Runnable runnable);
}

@Service
public class RealTransactionService implements TransactionService
{
    @Transactional(propagation = Propagation.NEVER)
    public void runWithoutTransaction(Runnable runnable)
    {
        runnable.run();
    }
}
In your other service, wrap the external HTTP calls with the runWithoutTransaction method, e.g.:
@Service
public class MyService
{
    @Autowired
    private TransactionService transactionService;

    public void doStuff(UUID id)
    {
        transactionService.runWithoutTransaction(() -> {
            // call an external service
        });
    }
}
That way your production code will perform the Propagation.NEVER check, and for the tests you can replace the TransactionService with a different implementation that doesn't have the @Transactional annotation, e.g.:
@Service
@Primary
public class FakeTransactionService implements TransactionService
{
    // No annotation here
    public void runWithoutTransaction(Runnable runnable)
    {
        runnable.run();
    }
}
This is not limited to Propagation.NEVER. Other propagation types can be implemented in the same way:
@Transactional(propagation = Propagation.REQUIRES_NEW)
public void runWithNewTransaction(Runnable runnable)
{
    runnable.run();
}
And finally - the Runnable parameter can be replaced with a Function/Consumer/Supplier if the method needs to return and/or accept a value.
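As a rough sketch of that last point, a value-returning variant could take a Supplier. The names ValueTransactionService, callWithoutTransaction and PassThroughTransactionService are made up for illustration, and the implementation shown is the annotation-free test flavour (the production one would carry @Transactional(propagation = Propagation.NEVER) on the method):

```java
import java.util.function.Supplier;

// Hypothetical variant of the TransactionService idea, returning a value.
interface ValueTransactionService {
    <T> T callWithoutTransaction(Supplier<T> action);
}

// Pass-through implementation, analogous to FakeTransactionService above.
// A production implementation would put
// @Transactional(propagation = Propagation.NEVER) on this method.
class PassThroughTransactionService implements ValueTransactionService {
    @Override
    public <T> T callWithoutTransaction(Supplier<T> action) {
        // Simply run the action and hand its result back to the caller.
        return action.get();
    }
}
```

The caller would then write something like `String body = transactionService.callWithoutTransaction(() -> callExternalService(id));`, where callExternalService is again a placeholder.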
This is a bit of a wild idea, but if you are using a MySQL database, then maybe switch to Dolt for tests?
Dolt is a SQL database that you can fork, clone, branch, merge, push and pull just like a git repository.
You can wrap it as a Testcontainers container, load the necessary data on start and then, at the start of each test, run dolt reset.

Spring Boot with @Transactional Quartz Job causing memory leak

When combining Spring Boot (2.4.5) with the spring-boot-starter-quartz dependency, I ran into a memory leak when I tried to force Job.execute(JobExecutionContext context) to be @Transactional, as I need to do some lazy loading in there. The example I use to demonstrate the problem does not do any loading, but it still creates the leak.
I followed the Baeldung guide on scheduling with Spring and Quartz. The result of my implementation is the following class, for which I create a trigger and schedule it somewhere else in the application (where exactly is irrelevant).
@Slf4j
@Component
public class FakeQuartzJob implements Job { // org.quartz.Job is an interface
    @Override
    @Transactional
    public void execute(JobExecutionContext context) {
        log.info("I was called");
    }

    public static Trigger buildJobTrigger(JobDetail jobDetail) {
        return TriggerBuilder.newTrigger()
                .forJob(jobDetail)
                .withIdentity(UUID.randomUUID().toString(), "fake-job-trigger")
                .startAt(Date.from(Instant.now().plusSeconds(1)))
                .withSchedule(SimpleScheduleBuilder.simpleSchedule()
                        .withMisfireHandlingInstructionFireNow())
                .build();
    }

    public static JobDetail buildJobDetail() {
        JobDataMap jobDataMap = new JobDataMap();
        jobDataMap.put("someRandomData", new FakeDataClass());
        return JobBuilder.newJob(FakeQuartzJob.class)
                .withIdentity(UUID.randomUUID().toString(), "fake-job-detail")
                .usingJobData(jobDataMap)
                .build();
    }
}
This results in transactional proxies being created for the FakeQuartzJob.execute method (expected with @Transactional), but when doing some heap investigation I found out that each proxy is published into the Spring container's bean registry and is never removed from it.
These proxies are never garbage collected, so they keep eating heap memory until the limit is reached and the application goes OOM.
I tested different behaviors/combinations of this Spring/Quartz cooperation regarding this problem:
There's no problem with @Autowired-ing service/repository beans into the job
There's no problem with @Autowired-ing and calling Job.execute if there's no @Transactional annotation
There is a problem using @Transactional in Job.execute(JobExecutionContext context)
What I expect and desperately need is the possibility to do Hibernate lazy loading of entity relationships when executing a Quartz job, but for that I need transaction support via @Transactional.
I do understand that I am scheduling this job while it is annotated with @Component, so when the class is constructed it becomes Spring-managed, but I do that to allow for autowiring.
What I absolutely do not understand is why Spring does not free up the proxy classes and keeps holding references to them (the class proxy is referenced by the method proxy).
Spring & Quartz question: Is there a way to allow for both autowiring and a @Transactional call when executing the job?
Technical question: What could be the reason this is happening?

How to schedule executing method for certain time during runtime

Let's say I have a REST API whose first argument is the time at which to execute a method and whose second argument is the name of that method in a class. What is the best way to invoke this method at a certain time (just once) in a Spring Boot application?
First, enable scheduling in your spring-boot application:
@SpringBootApplication
@EnableScheduling
public class Application {
// ...
Then, inject the TaskScheduler bean and schedule the task programmatically every time the user invokes the REST method:
@Component // must be a Spring bean for @Autowired to take effect
public class MyScheduler {
    @Autowired
    private TaskScheduler scheduler;

    public void scheduleNewCall(Date dateTime) {
        scheduler.schedule(this::scheduledMethod, dateTime);
    }

    public void scheduledMethod() {
        // method that you wish to run
    }
}
However, you should also think about limiting the number of calls to this method, otherwise a malicious user could schedule a lot of tasks and overflow the task pool.
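One way to picture such a limit, sketched without Spring: a JDK ScheduledExecutorService stands in for the TaskScheduler here, and BoundedOneShotScheduler, scheduleAt and maxPending are invented names. A Semaphore caps how many one-shot tasks may be pending at once, and a request beyond the cap is rejected (here by returning null):

```java
import java.time.Duration;
import java.time.Instant;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: runs each task once at a given instant and
// refuses new tasks while maxPending tasks are still waiting or running.
class BoundedOneShotScheduler {
    private final ScheduledExecutorService executor = Executors.newScheduledThreadPool(1);
    private final Semaphore slots;

    BoundedOneShotScheduler(int maxPending) {
        this.slots = new Semaphore(maxPending);
    }

    // Returns the future of the scheduled task, or null if the limit is reached.
    ScheduledFuture<?> scheduleAt(Runnable task, Instant runAt) {
        if (!slots.tryAcquire()) {
            return null; // too many pending tasks: reject instead of queueing
        }
        Runnable guarded = () -> {
            try {
                task.run();
            } finally {
                slots.release(); // free the slot once the task has run
            }
        };
        // A time in the past simply means "run as soon as possible".
        long delayMillis = Math.max(0, Duration.between(Instant.now(), runAt).toMillis());
        try {
            return executor.schedule(guarded, delayMillis, TimeUnit.MILLISECONDS);
        } catch (RejectedExecutionException e) {
            slots.release(); // executor already shut down: give the slot back
            throw e;
        }
    }

    void shutdown() {
        executor.shutdown();
    }
}
```

In a REST controller, a null result would typically be translated into an error response. Note that this sketch does not give the slot back when a returned future is cancelled before it runs.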

java scheduler spring vs quartz

Currently I am building a standalone Spring program in order to learn new methods and architectures.
Over the last few days I tried to learn about schedulers. I had never used them before, so I read some articles covering the different possible approaches. Two of them are especially interesting: Spring's native @Scheduled and Quartz.
From what I read, Spring's scheduler is a little smaller than Quartz and much more basic. And Quartz is not easy to use with Spring (because of autowiring and components).
My problem now is that there is one thing I do not understand:
From my understanding, both approaches create parallel threads in order to run the jobs asynchronously. But what if I have a Spring @Service in my main application that holds a HashMap with some information? The data is updated and changed through user interaction. The schedulers run in parallel, and a scheduled job now wants to use this HashMap from the main application as well. Is this even possible?
Or do I misunderstand something? There is also the @Async annotation, and I did not understand the difference, because a scheduler itself is already parallel to the main program, isn't it?
(Summing up, two questions:
Can a job that is executed every five seconds, implemented with a scheduler, use a HashMap from a service inside the main program (with Spring's @Scheduled and/or Quartz)?
Why is there an @Async annotation? Isn't a scheduler already parallel to the main process?
)
I have to make a few assumptions about which version of Spring you're using but as you're in the process of learning, I would assume that you're using spring-boot or a fairly new version, so please excuse if the annotations don't match your version of Spring. This said, to answer your two questions the best I can:
can a job that is executed every five seconds, implemented with a scheduler, use a HashMap out of a service inside the main program? (in spring @Scheduled and/or in Quartz?)
Yes, absolutely! The easiest way is to make sure that the hashmap in question is declared as static. To access the hashmap from the scheduled job, simply either autowire your service class or create a static get function for the hashmap.
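One caveat when the map is written by user-interaction threads and read by the scheduler thread: a plain HashMap is not thread-safe. A small framework-free sketch of the sharing, assuming a ConcurrentHashMap instead (SharedStateService, recordHit and hitsFor are hypothetical names, and a JDK ScheduledExecutorService plays the role of the Spring scheduler):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the @Service that owns the shared data.
class SharedStateService {
    // ConcurrentHashMap tolerates concurrent reads and writes from different threads.
    private final Map<String, Integer> hits = new ConcurrentHashMap<>();

    // Called from the "user interaction" threads.
    void recordHit(String key) {
        hits.merge(key, 1, Integer::sum); // atomic read-modify-write
    }

    // Called from the scheduled job.
    int hitsFor(String key) {
        return hits.getOrDefault(key, 0);
    }
}

class SharedStateDemo {
    // Writes from the calling thread while a scheduled job reads the same service.
    static int run() throws InterruptedException {
        SharedStateService service = new SharedStateService();
        AtomicInteger lastSeenByJob = new AtomicInteger();
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        // The job runs on the scheduler thread but uses the same service instance.
        Runnable job = () -> lastSeenByJob.set(service.hitsFor("page"));
        scheduler.scheduleWithFixedDelay(job, 0, 20, TimeUnit.MILLISECONDS);
        for (int i = 0; i < 5; i++) {
            service.recordHit("page"); // the calling thread plays the user
            Thread.sleep(10);
        }
        scheduler.shutdown();
        return service.hitsFor("page");
    }
}
```

In Spring the same effect comes from injecting the singleton service into the class that holds the scheduled method, so both sides work on one instance.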
Here is an example of a recent Vaadin project where I needed a scheduled message sent to a set of subscribers.
SchedulerConfig.class
@Configuration
@EnableAsync
@EnableScheduling
public class SchedulerConfig {
    @Scheduled(fixedDelay = 5000)
    public void refreshVaadinUIs() {
        Broadcaster.broadcast(
            new BroadcastMessage(
                BroadcastMessageType.AUTO_REFRESH_LIST
            )
        );
    }
}
Broadcaster.class
public class Broadcaster implements Serializable {
    private static final long serialVersionUID = 3540459607283346649L;
    private static ExecutorService executorService = Executors.newSingleThreadExecutor();
    private static LinkedList<BroadcastListener> listeners = new LinkedList<BroadcastListener>();

    public interface BroadcastListener {
        void receiveBroadcast(BroadcastMessage message);
    }

    public static synchronized void register(BroadcastListener listener) {
        listeners.add(listener);
    }

    public static synchronized void unregister(BroadcastListener listener) {
        listeners.remove(listener);
    }

    public static synchronized void broadcast(final BroadcastMessage message) {
        for (final BroadcastListener listener : listeners)
            executorService.execute(new Runnable() {
                @Override
                public void run() {
                    listener.receiveBroadcast(message);
                }
            });
    }
}
Why is there an @Async annotation? Isn't a scheduler already parallel to the main process?
Yes, the scheduler runs in its own thread, but what happens to the scheduler on long-running tasks (e.g. a SOAP call to a remote server that takes a very long time to complete)?
The @Async annotation isn't required for scheduling, but if you have a long-running function being invoked by the scheduler, it becomes quite important.
This annotation takes a specific task and asks Spring's TaskExecutor to execute it on its own thread instead of the current thread. The @Async annotation causes the function to return immediately, while the actual execution is carried out later by the TaskExecutor.
That said, without the @EnableAsync or @Async annotation, the functions you call will hold up the TaskScheduler, as they are executed on the same thread. A long-running operation would keep the scheduler from executing any other scheduled functions until it returns.
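The effect can be reproduced with plain JDK executors. This is only an analogy under stated assumptions: a single-threaded ScheduledExecutorService plays the scheduler, a cached thread pool plays the TaskExecutor, and HandOffDemo is an invented name. With handOff set to false the long task occupies the scheduler thread and the short task has to wait. With handOff set to true the scheduler thread only submits the long task and is free again at once:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

class HandOffDemo {
    // Returns true if the short task ran while the long task was still blocked.
    static boolean shortTaskRunsWhileLongTaskBlocked(boolean handOff) throws InterruptedException {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        ExecutorService workers = Executors.newCachedThreadPool();
        CountDownLatch releaseLongTask = new CountDownLatch(1);
        CountDownLatch shortTaskDone = new CountDownLatch(1);

        // Simulates a slow remote call: blocks until the latch is released.
        Runnable longTask = () -> {
            try {
                releaseLongTask.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        // Either run the long task on the scheduler thread itself,
        // or only submit it to the worker pool (roughly what @Async does).
        Runnable firstJob = handOff ? () -> workers.execute(longTask) : longTask;
        Runnable shortTask = shortTaskDone::countDown;

        try {
            scheduler.schedule(firstJob, 0, TimeUnit.MILLISECONDS);
            scheduler.schedule(shortTask, 10, TimeUnit.MILLISECONDS);
            // Wait a little for the short task while the long one is still blocked.
            return shortTaskDone.await(500, TimeUnit.MILLISECONDS);
        } finally {
            releaseLongTask.countDown();
            scheduler.shutdown();
            workers.shutdown();
        }
    }
}
```

Calling the method with false is expected to return false (the single scheduler thread is stuck in the long task), and calling it with true is expected to return true.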
I would suggest reading Spring's documentation on Task Execution and Scheduling. It provides a great explanation of the TaskScheduler and TaskExecutor in Spring.

Should I pass a managed entity to a method that requires a new transaction?

My application loads a list of entities that should be processed. This happens in a class that uses a scheduler:
@Component
class TaskScheduler {
    @Autowired
    private TaskRepository taskRepository;
    @Autowired
    private HandlingService handlingService;

    @Scheduled(fixedRate = 15000)
    @Transactional
    public void triggerTransactionStatusChangeHandling() {
        taskRepository.findByStatus(Status.OPEN).stream()
                .forEach(handlingService::handle);
    }
}
My HandlingService processes each task in isolation, using REQUIRES_NEW as the propagation level.
@Component
class HandlingService {
    @Transactional(propagation = Propagation.REQUIRES_NEW)
    public void handle(Task task) {
        try {
            processTask(task); // here the actual processing would take place
            task.setStatus(Status.PROCCESED);
        } catch (RuntimeException e) {
            task.setStatus(Status.ERROR);
        }
    }
}
The code works only because I started the parent transaction in the TaskScheduler class. If I remove the @Transactional annotation, the entities are not managed anymore and the update to the task entity is not propagated to the DB. I don't find it natural to make the scheduled method transactional.
From what I see, I have two options:
1. Keep the code as it is today.
Maybe it's just me and this is a correct approach.
This variant has the fewest trips to the database.
2. Remove the @Transactional annotation from the scheduler, pass the id of the task and reload the task entity in the HandlingService.
@Component
class HandlingService {
    @Autowired
    private TaskRepository taskRepository;

    @Transactional(propagation = Propagation.REQUIRES_NEW)
    public void handle(Long taskId) {
        Task task = taskRepository.findOne(taskId);
        try {
            processTask(task); // here the actual processing would take place
            task.setStatus(Status.PROCCESED);
        } catch (RuntimeException e) {
            task.setStatus(Status.ERROR);
        }
    }
}
It has more trips to the database (one extra query per element)
It can be executed using @Async
Can you please offer your opinion on which is the correct way of tackling this kind of problem, or maybe suggest another approach that I don't know about?
If your intention is to process each task in a separate transaction, then your first approach actually does not work because everything is committed at the end of the scheduler transaction.
The reason for that is that in the nested transactions Task instances are basically detached entities (Sessions started in the nested transactions are not aware of those instances). At the end of the scheduler transaction Hibernate performs dirty check on the managed instances and synchronizes changes with the database.
This approach is also very risky, because there may be troubles if you try to access an uninitialized proxy on a Task instance in the nested transaction. And there may be troubles if you change the Task object graph in the nested transaction by adding to it some other entity instance loaded in the nested transaction (because that instance will now be detached when the control returns to the scheduler transaction).
On the other hand, your second approach is correct and straightforward and helps avoid all of the above pitfalls. Only, I would read the ids and commit the transaction (there is no need to keep it suspended while the tasks are being processed). The easiest way to achieve it is to remove the Transactional annotation from the scheduler and make the repository method transactional (if it isn't transactional already).
If (and only if) the performance of the second approach is an issue, as you already mentioned you could go with asynchronous processing or even parallelize the processing to some degree. Also, you may want to take a look at extended sessions (conversations), maybe you could find it suitable for your use case.
The current code processes the task in the nested transaction, but updates the status of the task in the outer transaction (because the Task object is managed by the outer transaction). Because these are different transactions, it is possible that one succeeds while the other fails, leaving the database in an inconsistent state. In particular, with this code, completed tasks remain in status open if processing another task throws an exception, or the server is restarted before all tasks have been processed.
As your example shows, passing managed entities to another transaction makes it ambiguous which transaction should update these entities, and is therefore best avoided. Instead, you should be passing ids (or detached entities), and avoid unnecessary nesting of transactions.
Assuming that processTask(task); is a method in the HandlingService class (same as the handle(task) method), then removing @Transactional in HandlingService won't work because of the natural behavior of Spring's dynamic proxy.
Quoting from spring.io forum:
When Spring loads your TestService it wrap it with a proxy. If you call a method of the TestService outside of the TestService, the proxy will be invoke instead and your transaction will be managed correctly. However if you call your transactional method in a method in the same object, you will not invoke the proxy but directly the target of the proxy and you won't execute the code wrapped around your service to manage transaction.
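The behavior described in that quote can be reproduced with a plain JDK dynamic proxy. This is a simplified sketch, not Spring's actual proxy machinery: TaskHandler, PlainTaskHandler and SelfInvocationDemo are invented names, and the invocation handler merely records which calls pass through it, at the place where a transaction interceptor would begin and commit a transaction:

```java
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

interface TaskHandler {
    void handleAll();
    void handleOne();
}

class PlainTaskHandler implements TaskHandler {
    @Override
    public void handleAll() {
        handleOne(); // self-invocation: goes to "this", not through the proxy
    }

    @Override
    public void handleOne() {
        // the actual work would happen here
    }
}

class SelfInvocationDemo {
    // Returns the names of the method calls that went through the proxy.
    static List<String> interceptedCalls() {
        List<String> intercepted = new ArrayList<>();
        TaskHandler target = new PlainTaskHandler();
        TaskHandler proxy = (TaskHandler) Proxy.newProxyInstance(
                TaskHandler.class.getClassLoader(),
                new Class<?>[] { TaskHandler.class },
                (self, method, args) -> {
                    // A transaction interceptor would start its work here.
                    intercepted.add(method.getName());
                    return method.invoke(target, args);
                });
        proxy.handleAll(); // external call: intercepted
        return intercepted;
    }
}
```

Only handleAll ends up in the returned list. The inner handleOne call never reaches the handler, which is why an annotation on a self-invoked method has no effect with proxy-based interception.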
There is an SO thread about this topic, and here are some articles about it:
http://tutorials.jenkov.com/java-reflection/dynamic-proxies.html
http://tutorials.jenkov.com/java-persistence/advanced-connection-and-transaction-demarcation-and-propagation.html
http://blog.jhades.org/how-does-spring-transactional-really-work/
If you really don't like adding the @Transactional annotation to your @Scheduled method, you could get the transaction from the EntityManager and manage it programmatically, for example:
EntityTransaction tx = entityManager.getTransaction(); // getTransaction() returns an EntityTransaction
tx.begin();
try {
    processTask(task);
    task.setStatus(Status.PROCCESED);
    tx.commit();
} catch (Exception e) {
    tx.rollback();
}
But I doubt that you will go this way (well, I won't). In the end,
Can you please offer your opinion on which is the correct way of tackling this kind of problems
There's no single correct way in this case. My personal opinion is that an annotation (for example, @Transactional) is just a marker, and you need an annotation processor (Spring, in this case) to make @Transactional work. An annotation has no effect at all without its processor.
I would worry more about, for example, why task.setStatus(Status.PROCESSED) lives outside of processTask(task) if it looks like it belongs to the same operation, etc.
HTH.
