Immutability and Reloadable Configs


Please note: Although this question mentions Java, I think it's an OOP/concurrency problem at heart and can probably be answered by anyone with significant programming experience.
So I'm building a ConfigurationLoader that will read Configuration POJO from a remote service and make it available via an API. A few things:
As soon as the ConfigurationLoader is asked for the Configuration the first time, a background thread (worker) will ping the remote service every, say, 30 seconds, for updates and then apply those updates to the Configuration instance; and
If the Configuration is modified, the background worker will be notified of the change and will push the "new" Configuration to the remote service;
Both the ConfigurationLoader and the Configuration must be thread-safe
So when I think "thread safety" the first thing I think of is immutability, which leads me towards excellent projects like Immutables. The problem is that Configuration can't be immutable because we need to be able to change it on the client-side and then let the loader ripple those changes back to the server.
My next thought was to try and make both ConfigurationLoader and Configuration singletons, but the problem there is that the ConfigurationLoader takes a lot of arguments to instantiate it, and as this excellent answer points out, a singleton that takes arguments in construction is not a true singleton.
// Groovy pseudo-code
class Configuration {
    // Immutable? Singleton? Other?
}
class ConfigurationLoader {
    // private fields
    private int fizz
    private boolean buzz
    Configuration configuration
    ConfigurationLoader(int fizz, boolean buzz, Foo foo, List<Bar> bars) {
        this.fizz = fizz
        this.buzz = buzz
        // etc.
    }
    Configuration loadConfiguration() {
        if (configuration == null) {
            // Create background worker that will constantly update
            // 'configuration'
        }
        return configuration
    }
}
What are my options here? How do I create both the loader and the config to be thread-safe, where the config is changeable by the client-side (on-demand) or asynchronously by a background worker thread?

The problem is that Configuration can't be immutable because we need to be able to change it
It can still be immutable, you just create a new one for every change ("copy-on-write").
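A minimal sketch of the copy-on-write idea, under some assumptions: the Configuration here is reduced to a string map, and the names with, ConfigurationHolder and update are hypothetical, not part of the question's code. Each change produces a new immutable instance, and a shared AtomicReference is swapped to point at it.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical immutable configuration: every "change" returns a new instance.
final class Configuration {
    private final Map<String, String> values;

    Configuration(Map<String, String> values) {
        // Defensive copy so later changes to the caller's map cannot leak in.
        this.values = Collections.unmodifiableMap(new HashMap<>(values));
    }

    String get(String key) {
        return values.get(key);
    }

    // Copy-on-write: the receiver is left untouched.
    Configuration with(String key, String value) {
        Map<String, String> copy = new HashMap<>(values);
        copy.put(key, value);
        return new Configuration(copy);
    }
}

// Hypothetical holder shared by client code and the background worker.
final class ConfigurationHolder {
    private final AtomicReference<Configuration> current;

    ConfigurationHolder(Configuration initial) {
        current = new AtomicReference<>(initial);
    }

    Configuration get() {
        return current.get();
    }

    // Atomically replaces the current snapshot with a modified copy.
    Configuration update(String key, String value) {
        return current.updateAndGet(c -> c.with(key, value));
    }
}
```

Anyone holding an older Configuration keeps a consistent snapshot; only callers that ask the holder again see the change.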
What are my options here?
First thing you'll have to think about: How do you want to react to configuration changes in concurrently running tasks? Basically, you have three options:
Ignore configuration change until the task is done
E.g. some directory your code writes files to: finish writing the current file to the current target dir, put new files in the new dir. Writing some bytes into /new/path/somefile won't be a good idea if you never created that file. Your best option for this is probably an immutable Configuration object that you store in a field of your task instance (i.e. at task creation; in that case you can also make that field final for clarity). This usually works best if your code is designed as a collection of isolated small tasks.
Pros: Config never changes within a single task, so this is simple to get thread-safe and easy to test.
Cons: Config updates never make it to already running tasks.
Make your tasks check for config changes
E.g. your task regularly sends some data to an email address. Have a central storage for your config (like in your pseudo-code) and re-fetch it from your task code at some interval (e.g. between collecting data and sending the mail). This usually works best for long-running/permanent tasks.
Pros: Config can change during a task run, but still somewhat simple to get safe - just make sure you have some memory barrier in place for reading the config (make your private configuration field volatile, use an AtomicReference, guard it with a lock, whatever).
Cons: Task code will be harder to test than first option. Config values may still be outdated between checks.
Signal config changes to your tasks
Basically option two, but the other way around. Whenever config changes, interrupt your tasks, handle the interrupt as a "config needs updating" message, continue/restart with new config.
Pros: Config values are never outdated.
Cons: This is the hardest to get right. Some might even argue that you cannot get this right, because interruption should only be used for task abortion. There are only very minor benefits (if any) over the second option if you place your task's update checks at the right spots. Don't do this if you don't have a good reason to.
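The first two options can be sketched side by side. This is only an illustration: the class names SnapshotTask and PollingTask are made up, and the "config" is reduced to a single string held in an AtomicReference, which supplies the memory barrier mentioned above.

```java
import java.util.concurrent.atomic.AtomicReference;

// Option 1: the task captures the config value once, at creation, and never looks again.
final class SnapshotTask {
    private final String targetDir; // final: fixed for the task's lifetime

    SnapshotTask(AtomicReference<String> sharedTargetDir) {
        this.targetDir = sharedTargetDir.get();
    }

    String pathFor(String fileName) {
        return targetDir + "/" + fileName;
    }
}

// Option 2: the task re-reads the shared reference at a safe point in each step.
final class PollingTask {
    private final AtomicReference<String> sharedRecipient;

    PollingTask(AtomicReference<String> sharedRecipient) {
        this.sharedRecipient = sharedRecipient;
    }

    String sendReport(String data) {
        // Re-fetch between collecting the data and "sending" it.
        String recipient = sharedRecipient.get();
        return "to " + recipient + ": " + data;
    }
}
```

A SnapshotTask created before an update keeps using the old directory, while a PollingTask picks up the new recipient on its next call to sendReport.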

You need a singleton to pull this off, but your singleton isn't the immutable thing. It's the thread-safe thing. Make your singleton (Configuration) contain a simple Properties object or something similar and protect access to it with synchronization. Your ConfigurationLoader somehow knows of this Configuration singleton and sets, under synchronization, new instances of the Properties object when it detects a change.
I'm pretty sure Apache Commons Configuration does something like this.
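Roughly, that could look like the sketch below. SharedConfiguration and replace are hypothetical names, and this is not how Apache Commons Configuration is actually implemented; it only shows a synchronized singleton whose Properties snapshot the loader can swap.

```java
import java.util.Properties;

// Hypothetical thread-safe singleton wrapping a replaceable Properties snapshot.
final class SharedConfiguration {
    private static final SharedConfiguration INSTANCE = new SharedConfiguration();

    private Properties properties = new Properties(); // guarded by "this"

    private SharedConfiguration() {
    }

    static SharedConfiguration getInstance() {
        return INSTANCE;
    }

    synchronized String get(String key) {
        return properties.getProperty(key);
    }

    // Called by the loader when it detects a change: swaps in a private copy,
    // so later edits to the caller's Properties object do not show through.
    synchronized void replace(Properties fresh) {
        Properties copy = new Properties();
        copy.putAll(fresh);
        properties = copy;
    }
}
```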

Related

OptimisticLockingException with Camunda Service Task

We're seeing OptimisticLockingExceptions in a Camunda process with the following Scenario:
The process consists of one UserTask followed by one Gateway and one ServiceTask. The UserTask executes
runtimeService.setVariable(execId, "object", out);
taskService.complete(taskId);
The following ServiceTask uses "object" as input variable (does not modify it) and, upon completion throws said OptimisticLockingException. My problem seems to originate from the fact, that taskService.complete() immediately executes the ServiceTask, prior to flushing the variables set in the UserTask.
I've had another, related issue, which occurred when, in one UserTask, I executed runtimeService.setVariable(Map<String, Boolean>) and tried to access the members of the Map as transition guards in a gateway following that UserTask.
I've found the following article: http://forums.activiti.org/content/urgenterror-updated-another-transaction-concurrently which seems somehow related to my issue. However, I'm not clear on the question whether this is (un)wanted behaviour and how I can access a DelegateExecution-Object from a UserTask.

After a long and cumbersome search, we think we have nailed down two issues with Camunda which, added together, lead to the exception from the original question.
Camunda uses equals on serialized objects (represented by byte arrays) to determine whether process variables have to be written back to the database. This happens even when variables are only read and not set. Because equals on arrays is defined by reference identity, a serializable object is never considered "equal" if it has been serialized more than once. We have found that a single runtimeService.setVariable() leads to four DB updates at the time of completeTask() (one for setVariable itself, the other three for various Camunda-internal validation actions). We think this is a bug and will file a bug report with Camunda.
Obviously there are two ways to set variables. One way is to use runtimeService.setVariable(), the other is to use delegateTask/delegateExecution.setVariable(). There is some flaw when using both ways at the same time. While we cannot simplify our setup to a simple unit-test, we have identified several components which have to be involved for the Exception to occur:
2.1 We are using a TaskListener to set up some context variables at the start of tasks. This task listener used runtimeService.setVariable() instead of delegateTask.setVariable(). After we changed that, the exception vanished.
2.2 We used (and still use) runtimeService.setVariable() during Task-Execution. After we switched to completeTask(Variables) and omitted the runtimeService.setVariable() calls, the Exception vanished as well. However, this isn't a permanent solution as we have to store process variables during task execution.
2.3 The exception occurred only in combination with process variables being read or written the delegate<X>.getVariable() way (either by our code or implicitly in the Camunda implementation of JUEL parsing with gateways and service tasks, or completeTask(HashMap)).
Thanks a lot for all your input.

You could consider using an asynchronous continuation on the service task. This will make sure that the service task is executed inside a new transaction / command context.
Consider reading the camunda documentation on transactions and asynchronous continuations.
The DelegateExecution object is meant for providing service task (JavaDelegate) implementations access to process instance variables. It is not meant to be used from a User Task.

Is it bad practise to utilize many threads? (through SwingWorkers)

My Java (Swing) application creates a new SwingWorker object when it needs to (e.g) download data from the Internet and do something at the same time (think display a loader). However, monitoring the threads created, this can quickly reach ~100 threads.
Is this bad practice? If yes; what's the proper way to do it? Doesn't the GC automatically clean up unused threads?

Yes it is a bad practice when you put no upper bound on the number of threads (or generally resources).
In this case you better use a thread pool which contains at most a specific number of threads (say for example 25). You can either create them all at startup, or create them lazily on demand.
Implement a simple request manager system for the pool, which gives to the requesters the resources (or in case of running out of resources, queues them or simply denies them).
In this way, cleaning them in the end will also be easy and obvious.
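As a sketch of the bounded-pool idea, the JDK's ExecutorService already provides the queueing described above, so a hand-written request manager may not be needed. BoundedDownloads and runAll are hypothetical names, and the jobs stand in for real downloads.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class BoundedDownloads {
    // Runs all jobs on at most maxThreads threads; extra jobs wait in the pool's queue.
    static List<String> runAll(List<Callable<String>> jobs, int maxThreads) {
        ExecutorService pool = Executors.newFixedThreadPool(maxThreads);
        try {
            List<String> results = new ArrayList<>();
            // invokeAll returns the futures in the same order as the submitted jobs.
            for (Future<String> f : pool.invokeAll(jobs)) {
                results.add(f.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for jobs", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a job failed", e.getCause());
        } finally {
            pool.shutdown(); // lets the worker threads exit once idle
        }
    }
}
```

Because SwingWorker implements Runnable, workers can also be handed to a long-lived pool of this kind via pool.execute(worker) instead of calling worker.execute() on each one.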

Java update program configuration

I have a multithreaded program that loads its configuration on startup. The configuration is then handed down to the threads via their constructors.
But now I want to load one new config instance regularly, and pass it to the threads.
One idea would be to make the reference to the config instance in the thread class volatile. Then, when an updated config instance is available, call an update(Config c) method.
Is this the way to go? I will have terrible performance because every time the thread needs some setting it has to do all that volatile checking stuff.
Some better suggestions? Best practice? Maybe don't make it volatile and hope that the processor fetches that new object from main memory from time to time?

You could encapsulate all of your configuration values in a single immutable object, and when the configuration changes create a new instance of the object, and pass it to the threads, through listeners or explicit calling. The object has no volatile fields, only the object itself could be written in a volatile variable (or an AtomicReference).
The direct volatile approach with no other synchronization mechanisms is dangerous: you could read a halfway-rewritten configuration.
In any case, the performance impact on your application is likely to be negligible. Think about optimization later, if you find this is really a bottleneck.

Actually, are you sure that will have terrible performance?
If volatile is used mostly for reading, its performance is not that bad. I'd recommend trying volatile first, measuring the performance degradation, and reworking only if it is significant.
If you are really worried about rapid volatile reads, then in your thread's run method you could check for a timeout: if 60 seconds have passed since the last config read, re-read it. The logic is reversed from update(Config c) to
if (moreThan60SecondsPassed)
{
    localConfig = configHolder.getConfig();
}
Also, if you use a non-volatile field, you won't get a half-read config. The danger is that some threads might never see the updated value (no happens-before relationship).
By the way, did you consider recreating the threads on config update? In that case you could still pass the config through the constructor. It depends on how often you want to update the configuration.

You can use observer pattern to notify the threads of new config via listeners.
There is no way you can avoid the volatile checking stuff. It may be expensive (do some performance tests), but your program will run correctly.
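A small sketch of the listener approach, assuming the config is an immutable value (a plain String here for brevity). ConfigPublisher and ConfigWorker are hypothetical names; the volatile field is what makes an update written on the publisher's thread visible to the worker's own thread.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

// Hypothetical publisher: workers register a callback and are told about each new config.
final class ConfigPublisher<C> {
    // CopyOnWriteArrayList allows subscribing while a publish is in progress.
    private final List<Consumer<C>> listeners = new CopyOnWriteArrayList<>();

    void subscribe(Consumer<C> listener) {
        listeners.add(listener);
    }

    void publish(C newConfig) {
        for (Consumer<C> listener : listeners) {
            listener.accept(newConfig);
        }
    }
}

// A worker that keeps the latest config in a volatile field.
final class ConfigWorker {
    private volatile String config;

    ConfigWorker(String initial) {
        this.config = initial;
    }

    // Called on the publisher's thread.
    void update(String newConfig) {
        this.config = newConfig;
    }

    // Read on the worker's own thread whenever a setting is needed.
    String currentConfig() {
        return config;
    }
}
```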

You may want to look at Commons Configuration:
A common issue with file-based configurations is to handle the reloading of the data file when it changes. This is especially important if you have long running applications and do not want to restart them when a configuration file was updated. Commons Configuration has the concept of so called reloading strategies that can be associated with a file-based configuration. Such a strategy monitors a configuration file and is able to detect changes. A default reloading strategy is FileChangedReloadingStrategy. It can be set on a file-based configuration as follows.
ManagedReloadingStrategy is an alternative to automatic reloading. It allows to hot-reload properties on a running application but only when requested by admin. The refresh() method will force a reload of the configuration source.

Java logging across multiple threads

We have a system that uses threading so that it can concurrently handle different bits of functionality in parallel. We would like to find a way to tie all log entries for a particular "transaction" together. Normally, one might use 'threadName' to gather these together, but clearly that fails in a multithreaded situation.
Short of passing a 'transaction key' down through every method call, I can't see a way to tie these together. And passing a key into every single method is just ugly.
Also, we're kind of tied to Java logging, as our system is built on a modified version of it. So, I would be interested in other platforms for examples of what we might try, but switching platforms is highly unlikely.
Does anyone have any suggestions?
Thanks,
Peter
EDIT: Unfortunately, I don't have control over the creation of the threads as that's all handled by a workflow package. Otherwise, the idea of caching the ID once for each thread (on ThreadLocal maybe?) then setting that on the new threads as they are created is a good idea. I may try that anyway.

You could consider creating a globally-accessible Map that maps a Thread's name to its current transaction ID. Upon beginning a new task, generate a GUID for that transaction and have the Thread register itself in the Map. Do the same for any Threads it spawns to perform the same task. Then, when you need to log something, you can simply lookup the transaction ID from the global Map, based on the current Thread's name. (A bit kludgy, but should work)
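One way this might look, as a sketch only: TransactionRegistry and its methods are hypothetical names, and the approach assumes thread names are unique and that entries are removed when a task ends (otherwise pooled threads would report stale IDs).

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

final class TransactionRegistry {
    private static final Map<String, String> BY_THREAD_NAME = new ConcurrentHashMap<>();

    // Starts a new transaction for the calling thread and returns its ID.
    static String begin() {
        String id = UUID.randomUUID().toString();
        BY_THREAD_NAME.put(Thread.currentThread().getName(), id);
        return id;
    }

    // Lets a spawned thread register itself under its parent's transaction.
    static void join(String transactionId) {
        BY_THREAD_NAME.put(Thread.currentThread().getName(), transactionId);
    }

    // Looked up by the logging code; null if the thread is not in a transaction.
    static String current() {
        return BY_THREAD_NAME.get(Thread.currentThread().getName());
    }

    static void end() {
        BY_THREAD_NAME.remove(Thread.currentThread().getName());
    }
}
```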

This is a perfect example for AspectJ crosscuts. If you know the methods that are being called you can put interceptors on them and bind dynamically.
This article will give you several options http://www.ibm.com/developerworks/java/library/j-logging/

However, since you mentioned that your transaction spans more than one thread, take a look at how log4j copes with binding additional information to the current thread with its MDC and NDC classes. It uses ThreadLocal, as you were advised before, but the interesting thing is how log4j injects the data into log messages.
//In the code:
MDC.put("RemoteAddress", req.getRemoteAddr());
//In the configuration file, add the following:
%X{RemoteAddress}
Details:
http://onjava.com/pub/a/onjava/2002/08/07/log4j.html?page=3
http://wiki.apache.org/logging-log4j/NDCvsMDC
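Since the question is tied to java.util.logging, the same idea can be approximated there. The sketch below is an assumption-laden illustration, not log4j's implementation: TxContext and TxFormatter are hypothetical names, and it only works when the handler formats records on the thread that logged them.

```java
import java.util.logging.Formatter;
import java.util.logging.LogRecord;

// Hypothetical per-thread context, similar in spirit to log4j's MDC.
final class TxContext {
    private static final ThreadLocal<String> ID = new ThreadLocal<>();

    static void set(String id) {
        ID.set(id);
    }

    static String get() {
        return ID.get();
    }

    static void clear() {
        ID.remove();
    }
}

// Formatter that prefixes every message with the calling thread's transaction ID.
final class TxFormatter extends Formatter {
    @Override
    public String format(LogRecord record) {
        String id = TxContext.get();
        return "[" + (id == null ? "-" : id) + "] "
                + record.getLevel() + " "
                + formatMessage(record)
                + System.lineSeparator();
    }
}
```

The formatter would be installed with handler.setFormatter(new TxFormatter()); application code then only calls TxContext.set(...) once per thread instead of passing the key through every method.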

How about naming your threads to include the transaction ID? Quick and Dirty, admittedly, but it should work (until you need the thread name for something else or you start reusing threads in a thread pool).

If you are logging, then you must have some kind of logger object. You should have a separate instance in each thread.
Add a method to it called setID(String id).
When it is initialized in your thread, set a unique ID using the method.
Prepend the set ID to each log entry.

A couple people have suggested answers that have the newly spawned thread somehow knowing what the transaction ID is. Unless I'm missing something, in order to get this ID into the newly spawned thread, I would have to pass it all the way down the line into the method that spawns the thread, which I'd rather not do.
I don't think you need to pass it down, but rather the code responsible for handing work to these threads needs to have the transactionID to pass. Wouldn't the work-assigner have this already?

How can I identify which Java Applet context I'm running in without passing an ID?

I'm part of a team that develops a pretty big Swing Java Applet. Most of our code is legacy and there are tons of singleton references. We've bunched all of them into a single "Application Context" singleton. What we now need is to create some way to separate the shared context (shared across all applets currently showing) and the non-shared context (specific to each applet currently showing).
However, we don't have an ID at each of the locations that call to the singleton, nor do we want to propagate the ID to all locations. What's the easiest way to identify in which applet context we're running? (I've tried messing with classloaders, thread groups, thread ids... so far I could find nothing that will enable me to ID the origin of the call).

Singletons are evil, what do you expect? ;)
Perhaps the most comprehensive approach would be to load the bulk of the applet in a different class loader (use java.net.URLClassLoader.newInstance). Then use a WeakHashMap to associate class loader with an applet. If you could split most of the code into a common class loader (as a parent of each per-applet class loader) and into the normal applet codebase, that would be faster but more work.
Other hacks:
If you have access to any component, you can use Component.getParent repeatedly or SwingUtilities.getRoot.
If you are in a per-applet instance thread, then you can set up a ThreadLocal.
From the EDT, you can read the current event from the queue (java.awt.EventQueue.getCurrentEvent()), and possibly find a component from that. Alternatively, push an EventQueue with an overridden dispatchEvent method.

If I understand you correctly, the idea is to get a different "singleton" object for each caller object or "context".
One thing you can do is to create a thread-local global variable where you write the ID of the current context. (This can be done with AOP.) Then in the singleton getter, the context ID is fetched from the thread-local to use as a key to the correct "singleton" instance for the calling context.
Regarding AOP there should be no problem using it in applets since, depending on your point-cuts, the advices are woven at compile time and a JAR is added to the runtime dependencies. Hence, no special evidence of AOP should remain at run time.
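The thread-local lookup described in this answer might be sketched as follows. The enter and exit methods are hypothetical hooks that an aspect (or hand-written code at each entry point) would call, and the context ID is assumed to be a plain string.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical per-context "singleton" lookup keyed by a thread-local context ID.
final class ApplicationContext {
    private static final ThreadLocal<String> CURRENT_ID = new ThreadLocal<>();
    private static final Map<String, ApplicationContext> INSTANCES = new ConcurrentHashMap<>();

    final String id;

    private ApplicationContext(String id) {
        this.id = id;
    }

    // Binds the calling thread to a context before legacy code runs.
    static void enter(String contextId) {
        CURRENT_ID.set(contextId);
    }

    static void exit() {
        CURRENT_ID.remove();
    }

    // Legacy call sites keep calling getInstance() without passing an ID.
    static ApplicationContext getInstance() {
        String id = CURRENT_ID.get();
        if (id == null) {
            throw new IllegalStateException("no context bound to this thread");
        }
        return INSTANCES.computeIfAbsent(id, ApplicationContext::new);
    }
}
```

On a thread that serves several contexts, the ID would have to be set and cleared around each unit of work rather than once per thread.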

@Hugo regarding threadlocal:
I thought about that solution. However, from experiments I found two problems with that approach:
Shared threads (server connections, etc.) are problematic. This can be solved, though, by paying special attention to these threads (they're all under my control and are pretty much isolated from the legacy code).
The EDT thread is shared across all applets. I failed to find a way to force the creation of a new EDT thread for each applet. This means that the threadlocal for the EDT would be shared across the applets. This one I have no idea how to solve. Suggestions?
