Behavior of singletons in task queues on app-engine


What happens to my static variables when App Engine spins up new instances? More specifically, I am using a Task Queue that can have 40 instances/threads. Within the servlet in question, I am using a singleton, as in
public class WorkerThread extends HttpServlet {
@Override
protected void doPost(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException {
..
MySingleton single = MySingleton.getInstance();
..
}
...
}
Here is how the singleton is created
public class MySingleton {
public static MySingleton getInstance() {
return MySingletonHolder.INSTANCE;
}
private static class MySingletonHolder {
public static final MySingleton INSTANCE = new MySingleton();
}
private MySingleton() {
}
..
}
I have the following questions:
Since this is a Task Queue, do I have to worry about App-Engine starting new instances to scale with high demands?
Does it matter if the singleton is an inner class of the WorkerThread class or another class that the WorkerThread class is accessing?
Are Task Queues instance independent? I believe they are but am not sure.
I hope the question is clear. I want there to be only one instance of my singleton across all instances. Please ask for clarification if the question is not clear.
UPDATE
Following is my exact use case
public class SingletonProductIndexWriter {
private static final Logger LOG = Logger.getLogger(SingletonProductIndexWriter.class.getName());
public static IndexWriter getSingleIndexWriter() {
return IndexWriterHolder.INDEX_WRITER;
}
private static class IndexWriterHolder {
static PorterAnalyzer analyzer = new PorterAnalyzer();
static GaeDirectory index = new GaeDirectory(LuceneWorker.PRODUCTS);// create product index
static IndexWriterConfig config = GaeLuceneUtil.getIndexWriterConfig(LuceneWorker.LUCENE_VERSION, analyzer);
public static final IndexWriter INDEX_WRITER = getIndexWriter();
private static IndexWriter getIndexWriter() {
try {
LOG.info("Create single index writer for workers");
return new IndexWriter(index, config);
}catch(IOException e){
return null;
}
}
}
}
called as
IndexWriter writer = SingletonProductIndexWriter.getSingleIndexWriter();
For pertinent details see Stack Overflow thread: Worker threads cause Lucene LockObtainFailedException

Push and Pull queues are both processed by standard instances (you can target a module/frontend/backend in your queue.xml). Instances will scale up to meet the needs of both your normal traffic as well as your queues.
Singletons (in the classic sense portrayed here) are only unique to the classloader they are loaded by - they definitely are not unique across instances of your application on appengine - no state will be shared. Using this pattern you will have one singleton per appengine instance.
If you need to share state, you will need to use the datastore/cloud sql/something.
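To make the scope concrete, here is a minimal sketch (PerJvmSingleton and SingletonScopeDemo are hypothetical names, not from the question). It only shows the single-JVM half of the claim: every thread inside one JVM observes the same holder-idiom instance. Each additional App Engine instance is a separate JVM that would build its own copy, which a single-process example can describe but not demonstrate.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;
import java.util.concurrent.CountDownLatch;

// Hypothetical stand-in for MySingleton, using the same holder idiom as the question.
final class PerJvmSingleton {
    private PerJvmSingleton() { }

    private static final class Holder {
        static final PerJvmSingleton INSTANCE = new PerJvmSingleton();
    }

    static PerJvmSingleton getInstance() {
        return Holder.INSTANCE;
    }
}

final class SingletonScopeDemo {
    // Starts `threads` threads in this JVM and counts the distinct instances they observe.
    static int distinctInstancesSeenBy(int threads) {
        Set<PerJvmSingleton> seen = Collections.synchronizedSet(
                Collections.newSetFromMap(new IdentityHashMap<PerJvmSingleton, Boolean>()));
        CountDownLatch done = new CountDownLatch(threads);
        for (int i = 0; i < threads; i++) {
            new Thread(() -> {
                seen.add(PerJvmSingleton.getInstance());
                done.countDown();
            }).start();
        }
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen.size();
    }
}
```

Calling SingletonScopeDemo.distinctInstancesSeenBy(40) returns 1 within one JVM. Anything that must be unique across instances (such as a Lucene index lock) has to live in shared storage instead.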

Related

What is the best practise for using a service in java threads?

I'm writing an application that will launch several threads - the amount varies per execution, but in general more than 5 and less than 100 - each of which will repeatedly read from a Mongo database.
public class MyThread implements Runnable {
private MyMongoClient myMongoClient = MyMongoClient.getInstance();
public MyThread() {
}
@Override
public void run() {
Document myDocument = myMongoClient.getDocumentById("id");
Document newDocument = new Document("id", "newId");
myMongoClient.writeDocument(newDocument);
}
}
I have an existing singleton service class to query and update Mongo, and would like any advice on patterns to follow for using it in the threads?
public class MyMongoClient {
private static MyMongoClient INSTANCE = new MyMongoClient();
private MongoCollection<Document> myCollection;
private MyMongoClient() {
try (MongoClient mongoClient = new MongoClient(host)) {
MongoDatabase db = mongoClient.getDatabase("myDatabase");
myCollection = db.getCollection("myCollection");
}
}
public static MyMongoClient getInstance() {
return INSTANCE;
}
public Document getDocumentById(String id) {
// Implementation
}
public void writeDocument(Document document) {
// Implementation
}
}
As shown, each thread will read from existing entries, but not update any of them, and will write new entries using the same service
Should each thread use the same instance of the service, or should I rewrite the service so that each thread has its own instance?

You're going to get an error because you close your MongoClient in that constructor. MongoClient has a built in connection pool so there's no reason to create more than one. Create just the one and share it amongst your threads.
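The failure mode can be reproduced without the MongoDB driver. The sketch below uses a hypothetical FakeClient (not the real MongoClient API) to show that try-with-resources closes the client the moment the constructor's try block exits, and contrasts it with a service that keeps one client open and shares it.

```java
// Hypothetical stand-in for a pooled client such as MongoClient; not the real driver API.
final class FakeClient implements AutoCloseable {
    private volatile boolean closed;

    String query(String id) {
        if (closed) {
            throw new IllegalStateException("client already closed");
        }
        return "document:" + id;
    }

    boolean isClosed() {
        return closed;
    }

    @Override
    public void close() {
        closed = true;
    }
}

// Mirrors the question: the client is closed as soon as the try block exits.
final class BrokenService {
    FakeClient client;

    BrokenService() {
        try (FakeClient c = new FakeClient()) {
            client = c; // the reference escapes, but close() still runs
        }
    }
}

// Keeps one client open for the lifetime of the service and shares it between threads.
final class SharedService {
    private static final SharedService INSTANCE = new SharedService();
    private final FakeClient client = new FakeClient();

    private SharedService() { }

    static SharedService getInstance() {
        return INSTANCE;
    }

    String getDocumentById(String id) {
        return client.query(id);
    }
}
```

With the real driver the same shape applies: hold the client in a field, and close it once when the application shuts down rather than in the constructor.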

You can use ThreadPoolExecutor. It handles everything; you just submit tasks to the pool.
In my sample, keepAliveTime is 10 seconds; the right value depends on your requirements.
ExecutorService threadPoolExecutor = new ThreadPoolExecutor(
5,
100,
10,
TimeUnit.SECONDS,
new LinkedBlockingQueue<Runnable>()
);
See the Oracle Tutorial on the Executors framework.
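One caveat about the configuration above: ThreadPoolExecutor only grows beyond its core size when the work queue rejects a task, and an unbounded LinkedBlockingQueue never does. The following sketch (PoolSizingDemo is a hypothetical name) submits more tasks than the core size and reports the largest number of threads the pool ever created.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class PoolSizingDemo {
    // Submits `tasks` blocking tasks and reports the largest pool size reached.
    static int largestPoolSize(int core, int max, int tasks) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                core, max, 10, TimeUnit.SECONDS, new LinkedBlockingQueue<Runnable>());
        CountDownLatch release = new CountDownLatch(1);
        try {
            for (int i = 0; i < tasks; i++) {
                pool.execute(() -> {
                    try {
                        release.await(); // hold the thread so later tasks must queue
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            return pool.getLargestPoolSize();
        } finally {
            release.countDown(); // let the blocked tasks finish
            pool.shutdown();
        }
    }
}
```

With core 5, max 100 and 50 queued tasks, the pool stays at 5 threads. If up to 100 concurrent workers are really wanted, one option is to set the core size to 100, or to use a bounded queue so the pool can grow.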

Java : Synchronization of code

I have the piece of code shown below.
Our application runs on 5 web servers behind a load balancer, all connecting to one Memcache instance.
I guess that this synchronization works only within a single instance.
Please let me know how I can synchronize this piece of code when 5 web servers are trying to access the Memcache.
public class Memcache {
private MemcachedClient memclient = null;
private static Memcache instance = null;
public static Memcache getInstance() {
if (instance == null) {
try {
synchronized (Memcache.class) {
instance = new Memcache();
}
} catch (IOException e) {
throw new RuntimeException(e);
}
}
return instance;
}
private Memcache() throws IOException {
MemcachedClientBuilder builder = new XMemcachedClientBuilder();
memclient = builder.build();
}
}

Why not initialize it like this?
private static Memcache instance = new Memcache();
Bear in mind that the synchronization you tried to achieve here is problematic,
as two threads might both pass the if (instance == null) check (a context switch might happen right after that line) and each create its own instance.
So you can consider the double-checked locking pattern,
but in some versions of Java there is a problem with it.
The link I provided has information about the problem, and
at this link you can read about a singleton with the volatile keyword.
I would still go for the option I suggested above.
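For reference, a minimal sketch of double-checked locking with volatile is shown here. LazyCache is a hypothetical class, not the poster's Memcache, and the CONSTRUCTIONS counter exists only so the behaviour can be observed. The volatile field is what makes the idiom safe under the Java 5 and later memory model; without it, another thread could see a partially constructed object.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class LazyCache {
    // Counts constructor calls; for illustration only.
    static final AtomicInteger CONSTRUCTIONS = new AtomicInteger();

    // volatile guarantees that a fully constructed object is published to other threads.
    private static volatile LazyCache instance;

    private LazyCache() {
        CONSTRUCTIONS.incrementAndGet();
    }

    static LazyCache getInstance() {
        LazyCache local = instance;        // first check, no lock taken
        if (local == null) {
            synchronized (LazyCache.class) {
                local = instance;          // second check, under the lock
                if (local == null) {
                    local = new LazyCache();
                    instance = local;
                }
            }
        }
        return local;
    }
}
```

Note that this, like any in-process lock, only coordinates threads inside one JVM; it does nothing to coordinate the 5 web servers with each other.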

You can use the lazily initialized holder class pattern to get thread-safe access to the instance. Because the Memcache is created in a static initializer, it doesn't need additional synchronization constructs. The first call to getInstance() by any thread causes MemcacheHolder to be loaded and initialized, which makes the Memcache instance available to the calling code.
public class MemcacheFactory{
private static class MemcacheHolder {
public static Memcache instance = new Memcache();
}
public static Memcache getInstance() {
return MemcacheFactory.MemcacheHolder.instance;
}
}
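One detail to watch: the Memcache constructor in the question declares IOException, and a field initializer cannot throw a checked exception. A hedged variant of the holder that wraps the exception might look like this (Cache and CacheFactory are hypothetical stand-ins, not the poster's classes):

```java
import java.io.IOException;

// Hypothetical stand-in for Memcache; the constructor may throw a checked exception.
final class Cache {
    Cache() throws IOException {
        // connect to the cache server here
    }
}

final class CacheFactory {
    private CacheFactory() { }

    private static final class Holder {
        static final Cache INSTANCE = create();

        private static Cache create() {
            try {
                return new Cache();
            } catch (IOException e) {
                // Thrown during class initialization, so the first getInstance()
                // call sees it wrapped in an ExceptionInInitializerError.
                throw new IllegalStateException("could not create cache client", e);
            }
        }
    }

    // Holder is initialized on first use; the JVM makes that initialization thread-safe.
    static Cache getInstance() {
        return Holder.INSTANCE;
    }
}
```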

How to have a shared context per top-level process/thread without using InheritableThreadLocal?

I'd like to see if there's a good pattern for sharing a context across all classes and subthreads of a top-level thread without using InheritableThreadLocal.
I've got several top-level processes that each run in their own thread. These top-level processes often spawn temporary subthreads.
I want each top-level process to have and manage its own database connection.
I do not want to pass around the database connection from class to class and from thread to subthread (my associate calls this the "community bicycle" pattern). These are big top-level processes and it would mean editing probably hundreds of method signatures to pass around this database connection.
Right now I call a singleton to get the database connection manager. The singleton uses InheritableThreadLocal so that each top-level process has its own version of it. While I know some people have problems with singletons, it means I can just say DBConnector.getDBConnection(args) (to paraphrase) whenever I need the correctly managed connection. I am not tied to this method if I can find a better and yet still-clean solution.
For various reasons InheritableThreadLocal is proving to be tricky. (See this question.)
Does anyone have a suggestion to handle this kind of thing that doesn't require either InheritableThreadLocal or passing around some context object all over the place?
Thanks for any help!
Update: I've managed to solve the immediate problem (see the linked question) but I'd still like to hear about other possible approaches. forty-two's suggestion below is good and does work (thanks!), but see the comments for why it's problematic. If people vote for jtahlborn's answer and tell me that I'm being obsessive for wanting to avoid passing around my database connection then I will relent, select that as my answer, and revise my world-view.

I haven't tested this, but the idea is to create a customized ThreadPoolExecutor that knows how to get the context object and uses #beforeExecute() to transfer the context object to the thread that is going to execute the task. To be a nice citizen, you should also clear the context object in #afterExecute(), but I leave that as an exercise.
public class XyzThreadPoolExecutor extends ThreadPoolExecutor {
public XyzThreadPoolExecutor() {
super(3, 3, 100, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<Runnable>(), new MyThreadFactory());
}
@Override
public void execute(Runnable command) {
/*
* get the context object from the calling thread
*/
Object context = null;
super.execute(new MyRunnable(context, command));
}
@Override
protected void beforeExecute(Thread t, Runnable r) {
((MyRunnable)r).updateThreadLocal((MyThread) t);
super.beforeExecute(t, r);
}
private static class MyThreadFactory implements ThreadFactory {
@Override
public Thread newThread(Runnable r) {
return new MyThread(r);
}
}
private class MyRunnable implements Runnable {
private final Object context;
private final Runnable delegate;
public MyRunnable(Object context, Runnable delegate) {
super();
this.context = context;
this.delegate = delegate;
}
void updateThreadLocal(MyThread thread) {
thread.setContext(context);
}
@Override
public void run() {
delegate.run();
}
}
private static class MyThread extends Thread {
public MyThread(Runnable target) {
super(target);
}
public void setContext(Object context) {
// set the context object here using thread local
}
}
}

the "community bicycle" solution (as you call it) is actually much better than the global (or pseudo global) singleton that you are currently using. it makes the code testable and it makes it very easy to choose which classes use which context. if done well, you don't need to add the context object to every method signature. you generally ensure that all the "major" classes have a reference to the current context, and that any "minor" classes have access to the relevant "major" class. one-off methods which may need access to the context will need their method signatures updated, but most classes should have the context available through a member variable.
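A small sketch of that arrangement, with hypothetical class names (ProcessDbContext, ReportJob, RowFormatter) that are not from the question: the "major" class takes the context once in its constructor, and a "minor" helper reaches it through its owner, so no method signature carries the context.

```java
// Hypothetical context type; in the question this would wrap the managed database connection.
final class ProcessDbContext {
    private final String name;

    ProcessDbContext(String name) {
        this.name = name;
    }

    String describe() {
        return "connection for " + name;
    }
}

// A "major" class receives the context once, through its constructor.
final class ReportJob {
    private final ProcessDbContext context;

    ReportJob(ProcessDbContext context) {
        this.context = context;
    }

    ProcessDbContext context() {
        return context;
    }

    String run() {
        // The "minor" helper is handed its owner, not the context itself.
        return new RowFormatter(this).format("row-1");
    }
}

// A "minor" class reaches the context through the major class that owns it.
final class RowFormatter {
    private final ReportJob owner;

    RowFormatter(ReportJob owner) {
        this.owner = owner;
    }

    String format(String row) {
        return row + " via " + owner.context().describe();
    }
}
```

In a test, a ReportJob can be built with a fake ProcessDbContext, which is the testability benefit described above.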

As a ThreadLocal is essentially a Map keyed on your thread, couldn't you implement a Map keyed on your thread name? All you then need is an effective naming strategy that meets your requirements.
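A rough sketch of what that could look like, under an assumed naming convention that is not from the question: a top-level thread is named after its process, and each subthread is named "process/anything". NamedContexts and its methods are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class NamedContexts {
    private static final Map<String, Object> CONTEXTS = new ConcurrentHashMap<>();

    // Assumed convention: the process name is everything before the first '/'.
    static String processOf(Thread t) {
        String name = t.getName();
        int slash = name.indexOf('/');
        return slash < 0 ? name : name.substring(0, slash);
    }

    static void register(String process, Object context) {
        CONTEXTS.put(process, context);
    }

    static void unregister(String process) {
        CONTEXTS.remove(process); // must be called explicitly, or the entry leaks
    }

    // Looks up the context for whichever process the current thread is named after.
    static Object current() {
        Object ctx = CONTEXTS.get(processOf(Thread.currentThread()));
        if (ctx == null) {
            throw new IllegalStateException("no context for " + Thread.currentThread().getName());
        }
        return ctx;
    }

    // Helper for demonstration: runs current() on a new thread with the given name.
    static Object lookUpFrom(String threadName) {
        Object[] result = new Object[1];
        Thread t = new Thread(() -> result[0] = current(), threadName);
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return result[0];
    }
}
```

The trade-off is that correctness now depends on every thread being named properly, and thread names are neither unique nor enforced by the JVM.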

As a Lisper, I very much agree with your worldview and would consider it a shame if you were to revise it. :-)
If it were me, I would simply use a ThreadGroup for each top-level process, and associate each connection with the group the caller is running in. If using in conjunction with thread pools, just ensure the pools use threads in the correct thread group (for instance, by having a pool per thread group).
Example implementation:
public class CachedConnection {
/* Whatever */
}
public class ProcessContext extends ThreadGroup {
private static final Map<ProcessContext, Map<Class, Object>> contexts = new WeakHashMap<ProcessContext, Map<Class, Object>>();
public static <T> T getContext(Class<T> cls) {
ProcessContext tg = currentContext();
Map<Class, Object> ctx;
synchronized(contexts) {
if((ctx = contexts.get(tg)) == null)
contexts.put(tg, ctx = new HashMap<Class, Object>());
}
synchronized(ctx) {
Object cur = ctx.get(cls);
if(cur != null)
return(cls.cast(cur));
T new_t;
try {
new_t = cls.newInstance();
} catch(Exception e) {
throw(new RuntimeException(e));
}
ctx.put(cls, new_t);
return(new_t);
}
}
public static ProcessContext currentContext() {
ThreadGroup tg = Thread.currentThread().getThreadGroup();
while(true) {
if(tg instanceof ProcessContext)
return((ProcessContext)tg);
tg = tg.getParent();
if(tg == null)
throw(new IllegalStateException("Not running in a ProcessContext"));
}
}
}
If you then simply make sure to run all your threads in a proper ProcessContext, you can get a CachedConnection anywhere by calling ProcessContext.getContext(CachedConnection.class).
Of course, as mentioned above, you would have to make sure that any other threads you may delegate work to also run in the correct ProcessContext, but I'm pretty sure that problem is inherent in your description -- you would obviously need to specify somehow which one of multiple contexts your delegation workers run in. If anything, it could be conceivable to modify ProcessContext as follows:
public class ProcessContext extends ThreadGroup {
/* getContext() as above */
private static final ThreadLocal<ProcessContext> tempctx = new ThreadLocal<ProcessContext>();
public static ProcessContext currentContext() {
if(tempctx.get() != null)
return(tempctx.get());
ThreadGroup tg = Thread.currentThread().getThreadGroup();
while(true) {
if(tg instanceof ProcessContext)
return((ProcessContext)tg);
tg = tg.getParent();
if(tg == null)
throw(new IllegalStateException("Not running in a ProcessContext"));
}
}
public class RunnableInContext implements Runnable {
private final Runnable delegate;
public RunnableInContext(Runnable delegate) {this.delegate = delegate;}
public void run() {
ProcessContext old = tempctx.get();
tempctx.set(ProcessContext.this);
try {
delegate.run();
} finally {
tempctx.set(old);
}
}
}
public static Runnable wrapInContext(Runnable delegate) {
return(currentContext().new RunnableInContext(delegate));
}
}
That way, you could use ProcessContext.wrapInContext() to pass a Runnable which, when run, inherits its context from where it was created.
(Note that I haven't actually tried the above code, so it may well be full of typos.)

I would not support your world-view, or jtahlborn's idea, even on the grounds that it is more testable.
Paraphrasing first, what I have understood from your problem statement is this:
There are 3 or 4 top-level processes (each basically running in a thread of its own), and the connection object is what differs between them.
You need some basic characteristics of the Connection to be set up once.
The child threads in no way change the Connection object passed to them from the top-level threads.
Here is what I propose: you do need the one-time set-up of your Connection, but then in each of your top-level processes you 1) do further processing of that Connection, 2) keep an InheritableThreadLocal (so the child threads of your top-level thread will have the modified connection object), and 3) pass these thread-implementing classes (MyThread1, MyThread2, MyThread3, ... MyThread4) to the Executor. (This is different from your other linked question; if you need some gating there, a Semaphore is a better approach.)
The reason I said it is no less testable than jtahlborn's approach is that in that case, too, you still need to provide a mocked Connection object. The same applies here. Plus, conceptually, passing the object and keeping the object in a ThreadLocal are one and the same (InheritableThreadLocal is a map that gets passed along by Java's built-in mechanism; nothing bad here, I believe).
EDIT: I did take into account that it's a closed system and that there are no "free" threads tampering with the connection.

Why do I need the Singleton pattern in this multithreaded application?

I recently had a problem with two threads sticking in deadlock because they weren't monitoring the same object the way I thought they were. As it turned out, implementing the Singleton pattern solved the problem. But why?
I only instantiated one instance of the class of which the object was a private property, so I expected it to be effectively singleton anyway.
For the sake of completeness of the question, here is also some code illustrating the difference:
Before the Singleton pattern was implemented:
class Worker {
private BlockingQueue q = new LinkedBlockingQueue();
public void consume(String s) {
// Called by thread 1.
// Waits until there is anything in the queue, then consumes it
}
public void produce(String s) {
// Called by thread 2.
// Puts an object in the queue.
}
// Actually implements Runnable, so there's a run() method here too...
}
The threads were started like this:
Worker w = new Worker();
new Thread(w).start();
// Producer also implements Runnable. It calls produce on its worker.
Producer p = new Producer(w);
new Thread(p).start();
Now, when I examined the queues that were actually used in produce() and consume(), System.identityHashCode(q) gave different results in the different threads.
With the singleton pattern:
class Worker {
private static BlockingQueue q;
private BlockingQueue getQueue() {
if(q == null) {
q = new LinkedBlockingQueue();
}
return q;
}
// The rest is unchanged...
}
Suddenly, it works. Why is this pattern necessary here?

The problem is that you are creating a new Worker() inside the Server constructor. You have this:
public Server(Worker worker) {
this.clients = new ArrayList<ClientHandle>();
this.worker = new Worker(); // This is the problem.
// Don't do this in the Server constructor.
this.worker = new Worker();
// Instead do this:
this.worker = worker;

Based on the pseudo code you've posted, it is not actually the singleton pattern that made the difference, but simply the use of static. In your first example, the queue is not declared static, so each instance of Worker is going to instantiate its own individual LinkedBlockingQueue. When you declare it static in the second example, the queue is created at the class level and shared among all instances.
Based on the code you posted in your other question, the error is right here on the last line:
public Server(Worker worker) {
this.clients = new ArrayList<ClientHandle>();
this.worker = new Worker();
So your statement
I only instantiated one instance of the class of which the object was
a private property, so I expected it to be effectively singleton
anyway.
is inaccurate. You're instantiating a new Worker in every new Server, not reusing the one passed in.
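The difference can be seen in a few lines. In this sketch (InstanceQueueWorker, StaticQueueWorker and ServerSketch are hypothetical names), two workers with an instance field own two different queues, two workers with a static field share one, and a server that keeps the worker it was given shares that worker's queue without needing static at all.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

final class InstanceQueueWorker {
    // One queue per Worker object: two workers never see each other's items.
    final BlockingQueue<String> q = new LinkedBlockingQueue<>();
}

final class StaticQueueWorker {
    // One queue per class: every worker shares it, which hid the real bug.
    static final BlockingQueue<String> Q = new LinkedBlockingQueue<>();

    BlockingQueue<String> queue() {
        return Q;
    }
}

final class ServerSketch {
    final InstanceQueueWorker worker;

    ServerSketch(InstanceQueueWorker worker) {
        this.worker = worker; // reuse the caller's worker instead of creating a new one
    }
}
```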

Propagating ThreadLocal to a new Thread fetched from a ExecutorService

I'm running a process in a separate thread with a timeout, using an ExecutorService and a Future (example code here) (the thread "spawning" takes place in an AOP aspect).
Now, the main thread is a Resteasy request. Resteasy uses one or more ThreadLocal variables to store some context information that I need to retrieve at some point in my Rest method call. The problem is that, since the Rest method call is running in a new thread, the ThreadLocal variables are lost.
What would be the best way to "propagate" whatever ThreadLocal variable is used by Resteasy to the new thread? It seems that Resteasy uses more than one ThreadLocal variable to keep track of context information and I would like to "blindly" transfer all the information to the new thread.
I have looked at subclassing ThreadPoolExecutor and using the beforeExecute method to pass the current thread to the pool, but I couldn't find a way to pass the ThreadLocal variables to the pool.
Any suggestion?
Thanks

The set of ThreadLocal instances associated with a thread is held in private members of each Thread. Your only chance to enumerate these is to do some reflection on the Thread; this way, you can override the access restrictions on the thread's fields.
Once you have the set of ThreadLocal instances, you could copy them into the background threads using the beforeExecute() and afterExecute() hooks of ThreadPoolExecutor, or by creating a Runnable wrapper for your tasks that intercepts the run() call to set and unset the necessary ThreadLocal instances. Actually, the latter technique might work better, since it would give you a convenient place to store the ThreadLocal values at the time the task is queued.
Update: Here's a more concrete illustration of the second approach. Contrary to my original description, all that is stored in the wrapper is the calling thread, which is interrogated when the task is executed.
static Runnable wrap(Runnable task)
{
Thread caller = Thread.currentThread();
return () -> {
Iterable<ThreadLocal<?>> vars = copy(caller);
try {
task.run();
}
finally {
for (ThreadLocal<?> var : vars)
var.remove();
}
};
}
/**
* For each {@code ThreadLocal} in the specified thread, copy the thread's
* value to the current thread.
*
* @param caller the calling thread
* @return all of the {@code ThreadLocal} instances that are set on current thread
*/
private static Collection<ThreadLocal<?>> copy(Thread caller)
{
/* Use a nasty bunch of reflection to do this. */
throw new UnsupportedOperationException();
}

Based on @erickson's answer I wrote this code. It works for inheritableThreadLocals. It builds the list of inheritableThreadLocals using the same method that is used in the Thread constructor. Of course, I use reflection to do this. I also override the executor class.
public class MyThreadPoolExecutor extends ThreadPoolExecutor
{
@Override
public void execute(Runnable command)
{
super.execute(new Wrapped(command, Thread.currentThread()));
}
}
Wrapper:
private class Wrapped implements Runnable
{
private final Runnable task;
private final Thread caller;
public Wrapped(Runnable task, Thread caller)
{
this.task = task;
this.caller = caller;
}
public void run()
{
Iterable<ThreadLocal<?>> vars = null;
try
{
vars = copy(caller);
}
catch (Exception e)
{
throw new RuntimeException("error when copying thread locals", e);
}
try {
task.run();
}
finally {
for (ThreadLocal<?> var : vars)
var.remove();
}
}
}
copy method:
public static Iterable<ThreadLocal<?>> copy(Thread caller) throws Exception
{
List<ThreadLocal<?>> threadLocals = new ArrayList<>();
Field field = Thread.class.getDeclaredField("inheritableThreadLocals");
field.setAccessible(true);
Object map = field.get(caller);
Field table = Class.forName("java.lang.ThreadLocal$ThreadLocalMap").getDeclaredField("table");
table.setAccessible(true);
Method method = ThreadLocal.class
.getDeclaredMethod("createInheritedMap", Class.forName("java.lang.ThreadLocal$ThreadLocalMap"));
method.setAccessible(true);
Object o = method.invoke(null, map);
Field field2 = Thread.class.getDeclaredField("inheritableThreadLocals");
field2.setAccessible(true);
field2.set(Thread.currentThread(), o);
Object tbl = table.get(o);
int length = Array.getLength(tbl);
for (int i = 0; i < length; i++)
{
Object entry = Array.get(tbl, i);
Object value = null;
if (entry != null)
{
Method referentField = Class.forName("java.lang.ThreadLocal$ThreadLocalMap$Entry").getMethod(
"get");
referentField.setAccessible(true);
value = referentField.invoke(entry);
threadLocals.add((ThreadLocal<?>) value);
}
}
return threadLocals;
}

As I understand your problem, you can have a look at InheritableThreadLocal, which is meant to pass thread-local values from a parent thread to the child threads it creates.
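A caveat worth checking before relying on this with an ExecutorService: the value is copied when the child Thread object is created, not when a task is submitted. The sketch below (InheritanceDemo is a hypothetical name) shows a fresh thread seeing the parent's value, and a reused pool thread still seeing the value that was current when the pool created it.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class InheritanceDemo {
    static final InheritableThreadLocal<String> CONTEXT = new InheritableThreadLocal<>();

    // Returns what a freshly created child thread sees.
    static String seenByNewThread(String value) {
        CONTEXT.set(value);
        String[] seen = new String[1];
        Thread child = new Thread(() -> seen[0] = CONTEXT.get());
        child.start();
        try {
            child.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            CONTEXT.remove();
        }
        return seen[0];
    }

    // Returns what a reused pool thread sees after the caller changes the value.
    static String seenByReusedPoolThread(String first, String second) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        Runnable warmUp = () -> { };
        Callable<String> read = () -> CONTEXT.get();
        try {
            CONTEXT.set(first);
            pool.submit(warmUp).get();      // the worker thread is created here and inherits `first`
            CONTEXT.set(second);
            return pool.submit(read).get(); // same worker thread, so it still holds `first`
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            CONTEXT.remove();
            pool.shutdown();
        }
    }
}
```

So InheritableThreadLocal fits one-thread-per-task designs, but with a pool the tasks may run with a stale value. It also only helps for variables that were declared as InheritableThreadLocal in the first place.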

I don't like the reflection approach. An alternative solution would be to implement an executor wrapper and pass the object directly as a ThreadLocal context to all child threads, propagating the parent context.
public class PropagatedObject {
private ThreadLocal<ConcurrentHashMap<AbsorbedObjectType, Object>> data = new ThreadLocal<>();
//put, set, merge methods, etc
}
==>
public class ObjectAwareExecutor extends AbstractExecutorService {
private final ExecutorService delegate;
private final PropagatedObject objectAbsorber;
public ObjectAwareExecutor(ExecutorService delegate, PropagatedObject objectAbsorber){
this.delegate = delegate;
this.objectAbsorber = objectAbsorber;
}
@Override
public void execute(final Runnable command) {
final ConcurrentHashMap<String, Object> parentContext = objectAbsorber.get();
delegate.execute(() -> {
try{
objectAbsorber.set(parentContext);
command.run();
}finally {
parentContext.putAll(objectAbsorber.get());
objectAbsorber.clean();
}
});
objectAbsorber.merge(parentContext);
}

Here is an example that passes the current LocaleContext of the parent thread to a child thread spawned by CompletableFuture (by default it uses the ForkJoinPool).
Just define everything you want to do in the child thread inside a Runnable block. When the CompletableFuture executes the Runnable block, it is the child thread that is in control, and voila, you have the parent's ThreadLocal value set in the child's ThreadLocal.
The limitation here is that the entire ThreadLocal state is not copied over; only the LocaleContext is copied. Since a thread's ThreadLocal values are private to the thread they belong to, using reflection to get them and set them in the child is rather hacky and might lead to memory leaks or a performance hit.
So if you know which values you are interested in from the ThreadLocal, this solution is much cleaner.
public void parentClassMethod(Request request) {
LocaleContext currentLocale = LocaleContextHolder.getLocaleContext();
executeInChildThread(() -> {
LocaleContextHolder.setLocaleContext(currentLocale);
//Do whatever else you wanna do
});
//Continue stuff you want to do with parent thread
}
private void executeInChildThread(Runnable runnable) {
try {
CompletableFuture.runAsync(runnable)
.get();
} catch (Exception e) {
LOGGER.error("something is wrong");
}
}
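The same capture-then-set idea can be packaged as a reusable wrapper. This sketch uses a plain, hypothetical ThreadLocal named LOCALE in place of Spring's LocaleContextHolder so that it runs with the JDK alone; it also restores the worker thread's previous value afterwards, which matters when pool threads are reused.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicReference;

final class ContextPropagation {
    // Hypothetical application-owned ThreadLocal standing in for LocaleContextHolder.
    static final ThreadLocal<String> LOCALE = new ThreadLocal<>();

    // Captures the caller's value now and installs it in whichever thread runs the task.
    static Runnable propagating(Runnable task) {
        String captured = LOCALE.get();
        return () -> {
            String previous = LOCALE.get();
            LOCALE.set(captured);
            try {
                task.run();
            } finally {
                // Put the worker thread back the way it was found.
                if (previous == null) {
                    LOCALE.remove();
                } else {
                    LOCALE.set(previous);
                }
            }
        };
    }

    // Demonstration helper: returns the value an async task observes.
    static String localeSeenByAsyncTask(String callerLocale) {
        LOCALE.set(callerLocale);
        AtomicReference<String> seen = new AtomicReference<>();
        try {
            CompletableFuture.runAsync(propagating(() -> seen.set(LOCALE.get()))).join();
        } finally {
            LOCALE.remove();
        }
        return seen.get();
    }
}
```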

If you look at ThreadLocal code you can see:
public T get() {
Thread t = Thread.currentThread();
...
}
The current thread cannot be overridden.
Possible solutions:
Look at the Java 7 fork/join mechanism (but I think it's a bad way).
Look at the endorsed mechanism to override the ThreadLocal class in your JVM.
Try to rewrite RESTEasy (you can use the refactoring tools in your IDE to replace all ThreadLocal usage; it looks easy).
