Is it acceptable to introduce volatile state into singleton beans? - java

I want to create proper stateless services, i.e., beans with singleton scope, but sometimes state creeps in. Current candidates in the application I work on are
caching data
keeping Futures for scheduled tasks around to be able to cancel them on demand later
A simplified example service for the Futures could be:
class SchedulingService {

    @Autowired
    TaskScheduler scheduler;

    Map<String, ScheduledFuture<?>> scheduledEvents = new HashMap<>();

    public void scheduleTask(String id, MyTask task) {
        if (scheduledEvents.containsKey(id)) {
            scheduledEvents.remove(id).cancel(false);
        }
        persistTask(task);
        scheduledEvents.put(id, scheduler.schedule(task, task.createTrigger()));
    }

    void persistTask(MyTask task) { /* persist logic here */ }
}
I'm sure requirements like these will pop up all the time. Since the data that should be kept in memory doesn't have to be persisted because it is just derived information from data in the database, is it acceptable to keep state this way? And if not, is there a better way of doing this?
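One thing worth noting about the example either way: singleton beans are shared across all request threads, so the plain HashMap above can be corrupted by concurrent scheduleTask calls. Below is a minimal thread-safe sketch of the same service, assuming MyTask is a Runnable as implied by the TaskScheduler usage:

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ScheduledFuture;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.scheduling.TaskScheduler;

class SchedulingService {

    @Autowired
    TaskScheduler scheduler;

    // ConcurrentHashMap so concurrent scheduleTask calls cannot corrupt the map
    private final ConcurrentMap<String, ScheduledFuture<?>> scheduledEvents = new ConcurrentHashMap<>();

    public void scheduleTask(String id, MyTask task) {
        persistTask(task);
        // compute() runs atomically per key: cancel any previous schedule
        // for this id before registering the new one
        scheduledEvents.compute(id, (key, previous) -> {
            if (previous != null) {
                previous.cancel(false);
            }
            return scheduler.schedule(task, task.createTrigger());
        });
    }

    void persistTask(MyTask task) { /* persist logic here */ }
}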

Related

Dependency injection of IHttpContextAccessor vs passing parameter up the method chain

Our application calls many external APIs which take a session token of the current user as input. So what we currently do is, in a controller, get the session token for the user and pass it into a service, which in turn might call another service or some API client. To give an idea, we end up with something like this (the example is .NET, but something similar is, I think, possible in Java):
public IActionResult DoSomething(string something)
{
    this.someService.DoSomethingForUser(this.HttpContext.SessionToken, something);
    return View();
}
And then we have
public class SomeService
{
    private readonly IApiClient apiClient;

    public SomeService(IApiClient apiClient)
    {
        this.apiClient = apiClient;
    }

    public void DoSomethingForUser(string sessionToken, string something)
    {
        this.apiClient.DoSomethingForUser(sessionToken, something);
    }
}
It can also happen that in SomeService another service is injected which in turn calls the IApiClient instead of SomeService calling IApiClient directly, basically adding another "layer".
We had a discussion with the team about whether, instead of passing the session token, it wouldn't be better to inject it using DI, so you get something like this:
public IActionResult DoSomething(string something)
{
    this.someService.DoSomethingForUser(something);
    return View();
}
And then we have
public class SomeService
{
    private readonly IUserService userService;
    private readonly IApiClient apiClient;

    public SomeService(IUserService userService, IApiClient apiClient)
    {
        this.userService = userService;
        this.apiClient = apiClient;
    }

    public void DoSomethingForUser(string something)
    {
        this.apiClient.DoSomethingForUser(userService.SessionToken, something);
    }
}
The IUserService would have an IHttpContextAccessor injected:
public class UserService : IUserService
{
    private readonly IHttpContextAccessor httpContextAccessor;

    public UserService(IHttpContextAccessor httpContextAccessor)
    {
        this.httpContextAccessor = httpContextAccessor;
    }

    public string SessionToken => httpContextAccessor.HttpContext.SessionToken;
}
The benefits of this pattern are, I think, pretty clear. Especially with many services, it keeps the code "cleaner" and you end up with less boilerplate code to pass a token around.
Still, I don't like it. To me the downsides of this pattern outweigh its benefits:
I like that passing the token in the methods is explicit. It is clear that the service needs some sort of authentication token for it to function. I'm not sure if you can call it a side effect, but the fact that a session token is magically injected three layers deep is impossible to tell just by reading the code.
Unit testing is a bit more tedious if you have to mock the IUserService.
You run into problems when calling this on another thread, e.g. calling SomeService from another thread. Although these problems can be mitigated by injecting another concrete type of IUserService which gets the token from somewhere else, it feels like a chore.
To me it strongly feels like an anti-pattern, but apart from the arguments above it is mostly a feeling. There was a lot of discussion and not everybody was convinced that it was a bad idea. Therefore, my question is: is it an anti-pattern, or is it perfectly valid? What are some strong arguments for and against it, hopefully leaving little debate about whether this pattern is indeed either perfectly valid or something to avoid.
I would say the main point is to enable your desired separation of concerns. I think it is a good question if expressed in those terms. As Kit says, different people may prefer different solutions.
REQUEST SCOPED OBJECTS
These occur quite naturally in APIs. Consider the following example, where a UI calls an Orders API, then the Orders API forwards the JWT to an upstream Billing API. A unique Request ID is also sent, in case the flow experiences a temporary problem. If the flow is retried, the Request ID can be used by APIs to prevent data duplication. Yet business logic should not need to know about either the Request ID or the JWT.
BUSINESS LOGIC CLASS DESIGN
I would start by designing my logic classes with my desired inputs, then work out the DI later. In my example the OrderService class might use claims to get the user identity and also for authorization. But I would not want it to know about HTTP level concerns:
public class OrderService
{
    private readonly IBillingApiClient billingClient;
    private readonly ClaimsPrincipal user;

    public OrderService(IBillingApiClient billingClient, ClaimsPrincipal user)
    {
        this.billingClient = billingClient;
        this.user = user;
    }

    public async Task CreateOrder(OrderInput data)
    {
        this.Authorize();

        // build the order from the input (helper omitted in this sketch)
        var order = this.BuildOrder(data);
        await this.billingClient.CreateInvoice(order);
    }
}
DI SETUP
To enable my preferred business logic, I would write a little DI plumbing, so that I could inject request scoped dependencies in my preferred way. First, when the app starts, I would create a small middleware class. This will run early in the HTTP request pipeline:
private void ConfigureApiMiddleware(IApplicationBuilder api)
{
    api.UseMiddleware<ClientContextMiddleware>();
}
In the middleware class I would then create a ClientContext object from runtime data. The OrderService class will run later, after next() is called:
public class ClientContextMiddleware
{
    private readonly RequestDelegate next;

    public ClientContextMiddleware(RequestDelegate next)
    {
        this.next = next;
    }

    public async Task Invoke(HttpContext context)
    {
        var jwt = ReadJwt(context.Request);
        var requestId = ReadRequestId(context.Request);

        var holder = context.RequestServices.GetService<ClientContextHolder>();
        holder.ClientContext = new ClientContext(jwt, requestId);

        await this.next(context);
    }
}
In my DI composition at application startup I would express that the API client should be created when it is first referenced. In the HTTP request pipeline, the OrderService request scoped object will be constructed after the middleware has run. The below lambda will then be invoked:
private void RegisterDependencies(IServiceCollection services)
{
    services.AddScoped<IApiClient>(
        ctx =>
        {
            var holder = ctx.GetService<ClientContextHolder>();
            return new ApiClient(holder.ClientContext);
        });

    services.AddScoped<ClientContextHolder>();
}
The holder object is just due to a technology limitation. The MS stack does not allow you to create new request scoped injectable objects at runtime, so you have to update an existing one. In a previous .NET tech stack, the concept of child container per request was made available to developers, so the holder object was not needed.
ASYNC AWAIT
Request scoped objects are stored against the HTTP request object, which is the correct behaviour when using async/await. The current thread ID may switch, e.g. from 4 to 6, after the call to the Billing API.
If the OrderService class has a transient scope, it could get recreated when the flow resumes on thread 6. If this is the case, then resolution will continue to work.
SUMMARY
Designing inputs first, then writing some support code if needed is a good approach I think, and it is also useful to know the DI techniques. Personally I think natural request scoped objects that need to be created at runtime should be usable in DI. Some people may prefer a different approach though.
In .NET, which is the area where I am an expert, this is not an anti-pattern; on the contrary, it is the model many adopt. But it is not a model I would follow, for the following reasons:
it is not clear to whoever reads and uses the code where the token comes from, which goes against clean code
you load important information into a place that is frequently accessed by the framework itself (in the case of .NET Core)
your classes end up referencing a large object carrying a lot of unnecessary information, when you could have created a cleaner model that costs less memory and allocation time; I say this because the IHttpContextAccessor carries all the information relevant to your request
Here is how I would take care of readability (clean code) and improve performance:
I would create a middleware or filter in my MVC flow where I would do the authentication part, and create a class like:
public class TokenAuthenticationValues
{
    public string TokenClient { get; set; }
    public string TokenValue { get; set; }
}
Of course, my class is just an example, but in my middleware I would load its token values after calling the necessary APIs (this model needs an interface, and it needs to be registered as AddScoped() in the case of .NET).
That way I could use it in my methods by simply injecting ITokenAuthenticationValues in the constructor, and I would have clear, clean information loaded in memory for the entire request.
If the token needs to change in the middle of the request, any class can access it and change its value.
And I would have less unused memory allocated in my classes, since unlike the IHttpContextAccessor contract, ITokenAuthenticationValues only holds relevant information.
Hope this helps

Spring: PESSIMISTIC_READ/WRITE not working

I have two servers connected to the same database. Both have scheduled jobs; I don't really care which one runs them, as long as only one does. So the idea was to keep a key-value pair in the DB, and whichever app reads the value as 0 first gets to run the scheduled job.
Ideally this would work as so:
App A and App B run the scheduled job at the same time.
App A accesses the DB first and locks the table for reading & writing.
App A sets the value to 1 and releases the lock.
App A starts working on the scheduled job.
App B reads the value 1 with its DB request and does not run the scheduled job.
I have a config table where I keep status on my locks.
config:
name: VARCHAR(55)
value: VARCHAR(55)
The repository:
@Repository
public interface ConfigRepository extends CrudRepository<Config, Long> {

    @Lock(LockModeType.PESSIMISTIC_READ)
    Config findOneByName(String name);

    @Lock(LockModeType.PESSIMISTIC_WRITE)
    <S extends Config> S save(S entity);
}
The service:
@Service
public class ConfigService {

    @Autowired
    private ConfigRepository configRepository;

    @Transactional
    public void unlock(ConfigEnum lockable) {
        Config lock = configRepository.findOneByName(lockable.getSetting());
        lock.setValue("0");
        configRepository.save(lock);
    }

    @Transactional
    public void lock(ConfigEnum lockable) {
        Config lock = configRepository.findOneByName(lockable.getSetting());
        lock.setValue("1");
        configRepository.save(lock);
    }

    @Transactional
    public boolean isLocked(ConfigEnum lockable) {
        Config lock = configRepository.findOneByName(lockable.getSetting());
        return lock.getValue().equals("1");
    }
}
The Scheduler:
@Component
public class JobScheduler {

    @Autowired
    private ConfigService configService;

    @Autowired
    private JobService jobService;

    @Async
    @Scheduled(cron = "0 0 1 * * *")
    @Transactional
    public void run() {
        if (!configService.isLocked(ConfigEnum.CNF_JOB)) {
            configService.lock(ConfigEnum.CNF_JOB);
            jobService.run();
            configService.unlock(ConfigEnum.CNF_JOB);
        }
    }
}
However, I have noticed that the scheduled jobs still run at the same time on both apps. At times one will throw a deadlock, but it appears that Spring retries the transaction if it hits one. By that time the other app seems to have finished, so this one begins the same job again (not sure).
The tasks are not so short that the lock could be established, the table updated, the task run, and the lock released before the other app checks. I would like to keep this really simple, without involving additional libraries like Quartz or ShedLock.
I think your transactions are too short. You don't start a transaction in the run method, but each ConfigService method is transactional. Most likely each method gets a new transaction and commits when done. A commit will release the lock, so there is a race condition between isLocked and lock.
Combine isLocked and lock:
@Transactional
public boolean tryLock(ConfigEnum lockable) {
    Config lock = configRepository.findOneByName(lockable.getSetting());
    if ("1".equals(lock.getValue())) {
        return false;
    }
    lock.setValue("1");
    configRepository.save(lock);
    return true;
}
This checks and writes in the same transaction and should work.
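For illustration, the scheduler would then use it roughly like this (a sketch assuming the same ConfigService and JobService wiring as in the question, with unlock in a finally block so a failed job does not leave the lock set):

@Scheduled(cron = "0 0 1 * * *")
public void run() {
    // only the app that wins the lock gets to run the job
    if (configService.tryLock(ConfigEnum.CNF_JOB)) {
        try {
            jobService.run();
        } finally {
            configService.unlock(ConfigEnum.CNF_JOB);
        }
    }
}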
As a side note, this is a dangerous method. What happens if the node that holds the lock dies? There are many possible solutions. One is to lock a specific record and keep that lock throughout the job; the other node cannot proceed, and if the first one dies the lock is released. Another is to use a timestamp instead of 1, and require the owner to refresh the timestamp on a regular basis. Or you could introduce something like ZooKeeper.
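A minimal sketch of the timestamp idea, assuming the value column is repurposed to hold a lease expiry in epoch millis (the names and lease handling here are illustrative, not from the question):

@Transactional
public boolean tryAcquireLease(ConfigEnum lockable, long leaseMillis) {
    Config lock = configRepository.findOneByName(lockable.getSetting());
    long now = System.currentTimeMillis();
    long expiry = Long.parseLong(lock.getValue());
    if (expiry > now) {
        // another node holds a live lease
        return false;
    }
    // previous holder finished or died; take over the lease
    lock.setValue(String.valueOf(now + leaseMillis));
    configRepository.save(lock);
    return true;
}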

How can I run a concurrent queue of tasks using Rx?

I've found a lot of examples of this and don't know which implementation is the 'right' one.
Basically I've got an object (let's call it NBAManager) and there's a method public Completable generateGame() on this object. The idea is that the generateGame method gets called a lot of times and I want to generate games in a sequential way: I was thinking about a concurrent queue. I came up with the following design: I'd create a singleton instance of NBAService, a service for NBAManager, and the body of generateGame() will look like this:
public Completable generateGame(RequestInfo info) {
    return service.generateGame(info);
}
So basically I'll pass up that Completable result. And inside that NBAService object I'll have a queue of requests (a concurrent one, because I want to be able to poll() and add(request) if generateGame() is called while NBAManager is processing one of the earlier requests). I got stuck on this:
What's the right way to write such a job queue in an Rx way? There are so many examples of it. Could you point me to a good implementation?
How do I handle the logic of the queue execution? I believe we have to execute right away if there's only one job, and if there are many we just add the new one and that's it. How can I control this without a Runnable? I was thinking about using subjects.
Thanks!
There are multiple ways to implement this; you can choose how much RxJava should be involved. The least involved version can use a single-threaded ExecutorService as the "queue" and a CompletableSubject for the delayed completion:
class NBAService {

    static ExecutorService exec = Executors.newSingleThreadExecutor();

    public static Completable generateGame(RequestInfo info) {
        CompletableSubject result = CompletableSubject.create();
        exec.submit(() -> {
            // do something with the RequestInfo instance
            f(info).subscribe(result);
        });
        return result;
    }
}
A more involved solution would be if you wanted to trigger the execution when the Completable is subscribed to. In this case, you can go with create() and subscribeOn():
class NBAService {

    public static Completable generateGame(RequestInfo info) {
        return Completable.create(emitter -> {
            // do something with the RequestInfo instance
            emitter.setDisposable(
                f(info).subscribe(emitter::onComplete, emitter::onError)
            );
        })
        .subscribeOn(Schedulers.single());
    }
}
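A small usage sketch (assuming a RequestInfo instance called info): the practical difference between the two variants is when the work is enqueued.

Completable game = NBAService.generateGame(info);

// With the subject-based variant, the task is already queued on the
// executor at this point; subscribing only observes its completion.
// With the create()/subscribeOn() variant, this subscribe() call is
// what actually schedules the work onto Schedulers.single().
game.subscribe(
    () -> System.out.println("game generated"),
    error -> error.printStackTrace()
);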

How will multi-threading affect the Easy Rules engine?

I am looking for a rules engine for my web application and I found Easy Rules. However, the FAQ section mentions a limitation regarding thread safety.
Is a web container considered a multi-threaded environment? Each HTTP request is presumably processed by a worker thread created by the application server.
How does thread safety come into play?
How to deal with thread safety?
If you run Easy Rules in a multi-threaded environment, you should take into account the following considerations:
Easy Rules engine holds a set of rules, it is not thread safe.
By design, rules in Easy Rules encapsulate the business object model they operate on, so they are not thread safe either.
Do not try to make everything synchronized or locked down!
Easy Rules engine is a very lightweight object and you can create an instance per thread, this is by far the easiest way to avoid thread safety problems
http://www.easyrules.org/get-involved/faq.html
http://www.easyrules.org/tutorials/shop-tutorial.html
Based on this example, how will multi-threading affect the rules engine?
public class AgeRule extends BasicRule {

    private static final int ADULT_AGE = 18;
    private Person person;

    public AgeRule(Person person) {
        super("AgeRule",
              "Check if person's age is > 18 and marks the person as adult", 1);
        this.person = person;
    }

    @Override
    public boolean evaluate() {
        return person.getAge() > ADULT_AGE;
    }

    @Override
    public void execute() {
        person.setAdult(true);
        System.out.printf("Person %s has been marked as adult", person.getName());
    }
}
public class AlcoholRule extends BasicRule {

    private Person person;

    public AlcoholRule(Person person) {
        super("AlcoholRule",
              "Children are not allowed to buy alcohol", 2);
        this.person = person;
    }

    @Condition
    public boolean evaluate() {
        return !person.isAdult();
    }

    @Action
    public void execute() {
        System.out.printf("Shop: Sorry %s, you are not allowed to buy alcohol",
                person.getName());
    }
}
public class Launcher {

    public void someMethod() {
        // create a person instance
        Person tom = new Person("Tom", 14);
        System.out.println("Tom: Hi! can I have some Vodka please?");

        // create a rules engine
        RulesEngine rulesEngine = aNewRulesEngine()
                .named("shop rules engine")
                .build();

        // register rules
        rulesEngine.registerRule(new AgeRule(tom));
        rulesEngine.registerRule(new AlcoholRule(tom));

        // fire rules
        rulesEngine.fireRules();
    }
}
Yes, a web application is multi-threaded. As you expect, there is a pool of threads maintained by the server. When the server socket gets an incoming request on the port it's listening to, it delegates the request to a thread from the pool. Typically the request is executed on that thread until the response is completed.
If you try to create a single rules engine and let multiple threads access it, then either
the rules engine data is corrupted as a result of being manipulated by multiple threads (because data structures not meant to be threadsafe can perform operations in multiple steps that can be interfered with by other threads as they're accessing and changing the same data), or
you use locking to make sure only one thread at a time can use the rules engine, avoiding having your shared object get corrupted, but in the process creating a bottleneck. All of your requests will need to wait for the rules engine to be available and only one thread at a time can make progress.
It's much better to give each request its own copy of the rules engine, so it doesn't get corrupted and there is no need for locking. The ideal situation for threads is for each to be able to execute independently without needing to contend for shared resources.
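A minimal sketch of that per-request approach, reusing the Person, AgeRule, and AlcoholRule classes from the question (the handler class itself is hypothetical):

public class ShopRequestHandler {

    // called once per request, on whatever worker thread the server assigns;
    // everything created here is confined to that thread, so nothing is shared
    public void handleRequest(Person person) {
        RulesEngine rulesEngine = aNewRulesEngine()
                .named("shop rules engine")
                .build();

        rulesEngine.registerRule(new AgeRule(person));
        rulesEngine.registerRule(new AlcoholRule(person));

        rulesEngine.fireRules();
    }
}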

TomEE chokes on too many #Asynchronous operations

I am using Apache TomEE 1.5.2 JAX-RS, pretty much out of the box, with the predefined HSQLDB.
The following is simplified code. I have a REST-style interface for receiving signals:
@Stateless
@Path("signal")
public class SignalEndpoint {

    @Inject
    private SignalStore store;

    @POST
    public void post() {
        store.createSignal();
    }
}
Receiving a signal triggers a lot of stuff. The store will create an entity, then fire an asynchronous event.
public class SignalStore {

    @PersistenceContext
    private EntityManager em;

    @EJB
    private EventDispatcher dispatcher;

    @Inject
    private Event<SignalEntity> created;

    public void createSignal() {
        SignalEntity entity = new SignalEntity();
        em.persist(entity);
        dispatcher.fire(created, entity);
    }
}
The dispatcher is very simple, and merely exists to make the event handling asynchronous.
@Stateless
public class EventDispatcher {

    @Asynchronous
    public <T> void fire(Event<T> event, T parameter) {
        event.fire(parameter);
    }
}
Receiving the event is something else, which derives data from the signal, stores it, and fires another asynchronous event:
@Stateless
public class DerivedDataCreator {

    @PersistenceContext
    private EntityManager em;

    @EJB
    private EventDispatcher dispatcher;

    @Inject
    private Event<DerivedDataEntity> created;

    @Asynchronous
    public void onSignalEntityCreated(@Observes SignalEntity signalEntity) {
        DerivedDataEntity entity = new DerivedDataEntity(signalEntity);
        em.persist(entity);
        dispatcher.fire(created, entity);
    }
}
Reacting to that is even a third layer of entity creation.
To summarize, I have a REST call, which synchronously creates a SignalEntity, which asynchronously triggers the creation of a DerivedDataEntity, which asynchronously triggers the creation of a third type of entity. It all works perfectly, and the storage processes are beautifully decoupled.
Except for when I programmatically trigger a lot (e.g. 1000) of signals in a for-loop. Depending on my AsynchronousPool size, after processing (quite fast) about half that number of signals, the application completely freezes for up to some minutes. Then it resumes, to process about the same amount of signals, quite fast, before freezing again.
I have been playing around with the AsynchronousPool settings for the last half hour. Setting it to 2000, for instance, will easily make all my signals be processed at once, without any freezes. But the system isn't sane either after that: triggering another 1000 signals resulted in them being created all right, but the entire creation of derived data never happened.
Now I am completely at a loss as to what to do. I can of course get rid of all those asynchronous events and implement some sort of queue myself, but I always thought the point of an EE container was to relieve me of such tedium. Asynchronous EJB events should already bring their own queue mechanism. One which should not freeze as soon as the queue is too full.
Any ideas?
UPDATE:
I have now tried it with 1.6.0-SNAPSHOT. It behaves a little bit differently: It still doesn't work, but I do get an exception:
Aug 01, 2013 3:12:31 PM org.apache.openejb.core.transaction.EjbTransactionUtil handleSystemException
SEVERE: EjbTransactionUtil.handleSystemException: fail to allocate internal resource to execute the target task
javax.ejb.EJBException: fail to allocate internal resource to execute the target task
at org.apache.openejb.async.AsynchronousPool.invoke(AsynchronousPool.java:81)
at org.apache.openejb.core.ivm.EjbObjectProxyHandler.businessMethod(EjbObjectProxyHandler.java:240)
at org.apache.openejb.core.ivm.EjbObjectProxyHandler._invoke(EjbObjectProxyHandler.java:86)
at org.apache.openejb.core.ivm.BaseEjbProxyHandler.invoke(BaseEjbProxyHandler.java:303)
at <<... my code ...>>
...
Caused by: java.util.concurrent.RejectedExecutionException: Timeout waiting for executor slot: waited 30 seconds
at org.apache.openejb.util.executor.OfferRejectedExecutionHandler.rejectedExecution(OfferRejectedExecutionHandler.java:55)
at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:821)
at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1372)
at java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:132)
at org.apache.openejb.async.AsynchronousPool.invoke(AsynchronousPool.java:75)
... 38 more
It is as though TomEE does not do ANY queueing of operations. If no thread is free to process at the moment of the call, tough luck. Surely this cannot be intended?
UPDATE 2:
Okay, I seem to have stumbled upon a semi-solution: Setting the AsynchronousPool.QueueSize property to maxint solves the freeze. But questions remain: Why is the QueueSize so limited in the first place, and, more worryingly: Why would this block the entire application? If the queue is full, it blocks, but as soon as a task is taken from it, another should pop in, right? The queue appears to be blocked until it is completely empty again.
UPDATE 3:
For anyone who wants to have a go: http://github.com/JanDoerrenhaus/tomeefreezetestcase
UPDATE 4:
As it turns out, increasing the queue size does NOT solve the problem, it merely delays it. The problem remains the same: too many asynchronous operations at once, and TomEE chokes so badly that it cannot even undeploy the application on termination anymore.
So far, my diagnosis is that the task cleanup does not work properly. My tasks are all very small and fast (see the test case on GitHub). I was already afraid that it was OpenJPA or HSQLDB slowing down on too many concurrent calls, but I commented out all em.persist calls and the problem remained the same. So if my tasks are quite small and fast, but still manage to block TomEE so badly that it cannot accept any further task after 30 seconds (javax.ejb.EJBException: fail to allocate internal resource to execute the target task), I would imagine that completed tasks linger, clogging up the pipe, so to speak.
How could I resolve this issue?
Basically, BlockingQueues use locks to ensure data consistency and avoid data loss, so in a highly concurrent environment they will reject a lot of tasks (your case).
You can play (on trunk) with the RejectedExecutionHandler implementation to retry offering the task. One implementation can be:
new RejectedExecutionHandler() {
    @Override
    public void rejectedExecution(final Runnable r, final ThreadPoolExecutor executor) {
        for (int i = 0; i < 10; i++) {
            if (executor.getQueue().offer(r)) {
                return;
            }
            try {
                Thread.sleep(50);
            } catch (final InterruptedException e) {
                // no-op
            }
        }
        throw new RejectedExecutionException();
    }
}
It even works better with a random sleep (between a min and a max).
The idea is basically: if the queue is full, wait some short time to reduce the concurrency.
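For illustration, here is a hedged sketch of the randomized variant mentioned above (the 20-200 ms bounds are arbitrary, not values from TomEE):

new RejectedExecutionHandler() {
    private final java.util.Random random = new java.util.Random();

    @Override
    public void rejectedExecution(final Runnable r, final ThreadPoolExecutor executor) {
        for (int i = 0; i < 10; i++) {
            if (executor.getQueue().offer(r)) {
                return;
            }
            try {
                // random backoff between 20 and 200 ms to spread the retries
                // of competing producers apart
                Thread.sleep(20 + random.nextInt(180));
            } catch (final InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        throw new RejectedExecutionException();
    }
}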
This is configurable through WEB-INF/application.properties; see https://issues.apache.org/jira/browse/TOMEE-1012
