Servlet 3 spec and ThreadLocal


As far as I know, the Servlet 3 spec introduces an asynchronous processing feature. Among other things, this means that the same thread can and will be reused to process other, concurrent HTTP requests. This isn't revolutionary, at least for people who have worked with NIO before.
Anyway, this leads to another important thing: no ThreadLocal variables as a temporary storage for the request data. Because if the same thread suddenly becomes the carrier thread to a different HTTP request, request-local data will be exposed to another request.
All of that is my pure speculation based on reading articles, I haven't got time to play with any Servlet 3 implementations (Tomcat 7, GlassFish 3.0.X, etc.).
So, the questions:
Am I correct to assume that ThreadLocal will cease to be a convenient hack to keep the request data?
Has anybody played with any of Servlet 3 implementations and tried using ThreadLocals to prove the above?
Apart from storing data inside HTTP Session, are there any other similar easy-to-reach hacks you could possibly advise?
EDIT: don't get me wrong. I completely understand the dangers and ThreadLocal being a hack. In fact, I always advise against using it in similar context. However, believe it or not, thread context has been used far more frequently than you probably imagine. A good example would be Spring's OpenSessionInViewFilter which, according to its Javadoc:
This filter makes Hibernate Sessions
available via the current thread,
which will be autodetected by
transaction managers.
This isn't strictly ThreadLocal (haven't checked the source) but already sounds alarming. I can think of more similar scenarios, and the abundance of web frameworks makes this much more likely.
Briefly speaking, many people have built their sand castles on top of this hack, with or without awareness. Therefore Stephen's answer is understandable but not quite what I'm after. I would like to get a confirmation whether anyone has actually tried and was able to reproduce failing behaviour so this question could be used as a reference point to others trapped by the same problem.

Async processing shouldn't bother you unless you explicitly ask for it.
For example, a request can't be made async if the servlet or any of the filters in the request's filter chain is not marked with <async-supported>true</async-supported>. Therefore, you can still use regular practices for regular requests.
Of course, if you actually need async processing, you need to use appropriate practices. Basically, when a request is processed asynchronously, its processing is broken into parts. These parts don't share thread-local state; you can still use thread-local state inside each of those parts, but you have to manage the state manually between the parts.
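A minimal sketch of that manual hand-off, assuming plain java.util.concurrent executors as a stand-in for the container's worker threads (no servlet API is involved; AsyncHandOff, CURRENT_USER and runInTwoParts are hypothetical names). Part one binds a value to its own thread and captures it explicitly; part two runs on a different thread, sees nothing until the captured value is re-bound, and cleans up afterwards:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class AsyncHandOff {
    // Hypothetical per-request value kept in a ThreadLocal.
    static final ThreadLocal<String> CURRENT_USER = new ThreadLocal<>();

    /** Returns {value seen before re-binding, value seen after re-binding} in part two. */
    static String[] runInTwoParts(String user) {
        ExecutorService partOne = Executors.newSingleThreadExecutor();
        ExecutorService partTwo = Executors.newSingleThreadExecutor();
        try {
            // Part one: bind the value, then capture it before the thread is released.
            String captured = partOne.submit(() -> {
                CURRENT_USER.set(user);
                try {
                    return CURRENT_USER.get();
                } finally {
                    CURRENT_USER.remove();
                }
            }).get();

            // Part two runs on another thread: the ThreadLocal is empty there
            // until the captured value is re-bound by hand.
            return partTwo.submit(() -> {
                String before = CURRENT_USER.get(); // nothing bound on this thread yet
                CURRENT_USER.set(captured);
                try {
                    return new String[] { before, CURRENT_USER.get() };
                } finally {
                    CURRENT_USER.remove();
                }
            }).get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            partOne.shutdown();
            partTwo.shutdown();
        }
    }
}
```

In a real container the captured value would more likely travel as a request attribute than as a local variable, but the shape is the same: bind, use, remove within each part, and carry the state across explicitly.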

(Caveat: I've not read the Servlet 3 spec in detail, so I cannot say for sure that the spec says what you think it does. I'm just assuming that it does ...)
Am I correct to assume that ThreadLocal will cease to be a convenient hack to keep the request data?
Using ThreadLocal was always a poor approach, because you always ran the risk that information would leak when a worker thread finished one request and started on another one. Storing stuff as attributes in the ServletRequest object was always a better idea.
Now you've simply got another reason to do it the "right" way.
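The leak described above is easy to reproduce outside any container. The following is only a sketch, assuming a single-thread executor as a stand-in for a pooled worker thread (ThreadLocalLeakDemo and its members are hypothetical names, and nothing here says how a particular servlet container behaves):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class ThreadLocalLeakDemo {
    static final ThreadLocal<String> REQUEST_DATA = new ThreadLocal<>();

    /** Runs two "requests" on one pooled thread and returns what the second one sees. */
    static String secondRequestSees(boolean cleanUp) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            // First "request": stores its data and optionally forgets to clean up.
            pool.submit(() -> {
                REQUEST_DATA.set("secret-of-request-1");
                if (cleanUp) {
                    REQUEST_DATA.remove();
                }
            }).get();
            // Second "request": served by the same worker thread.
            return pool.submit(() -> REQUEST_DATA.get()).get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Calling secondRequestSees(false) returns the first request's value, while secondRequestSees(true) returns null, which is why a missing remove() is enough to expose one request's data to the next.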
Has anybody played with any of Servlet 3 implementations and tried using ThreadLocals to prove the above?
That's not the right approach. It only tells you about the particular behaviour of a particular implementation under the particular circumstances of your test. You cannot generalize.
The correct approach is to assume that it will sometimes happen if the spec says it can ... and design your webapp to take account of it.
(Fear not! Apparently, in this case, this does not happen by default. Your webapp has to explicitly enable the async processing feature. If your code is infested with thread locals, you would be advised not to do this ...)
Apart from storing data inside HTTP Session, are there any other similar easy-to-reach hacks you could possibly advise.
Nope. The only right answer is storing request-specific data in the ServletRequest or ServletResponse object. Even storing it in the HTTP Session can be wrong, since there can be multiple requests active at the same time for a given session.

NOTE: Hacks follow. Use with caution, or really just don't use.
So long as you continue to understand which thread your code is executing in, there's no reason you can't use a ThreadLocal safely.
    try {
        tl.set(value);
        doStuffUsingThreadLocal();
    } finally {
        tl.remove();
    }
It's not as if your call stack is switched out randomly. Heck, if there are ThreadLocal values you want to set deep in the call stack and then use further out, you can hack that too:
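    import java.util.HashSet;
    import java.util.Set;
    public class Nasty {
        // Per-thread registry of the ThreadLocals that were set on this thread.
        static ThreadLocal<Set<ThreadLocal<?>>> cleanMe =
            new ThreadLocal<Set<ThreadLocal<?>>>() {
                @Override
                protected Set<ThreadLocal<?>> initialValue() {
                    return new HashSet<ThreadLocal<?>>();
                }
            };
        static void register(ThreadLocal<?> toClean) {
            cleanMe.get().add(toClean);
        }
        // Removes every registered ThreadLocal for the current thread.
        static void cleanup() {
            Set<ThreadLocal<?>> registered = cleanMe.get();
            for (ThreadLocal<?> tl : registered)
                tl.remove();
            registered.clear();
        }
    }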
Then you register your ThreadLocals as you set them, and call cleanup() in a finally clause somewhere. This is all shameful wankery that you probably shouldn't do. I'm sorry I wrote it but it's too late :/

I'm still wondering why people use the rotten javax.servlet API to actually implement their servlets. What I do:
I have a base class HttpRequestHandler which has private fields for request, response and a handle() method that can throw Exception plus some utility methods to get/set parameters, attributes, etc. I rarely need more than 5-10% of the servlet API, so this isn't as much work as it sounds.
In the servlet handler, I create an instance of this class and then forget about the servlet API.
I can extend this handler class and add all the fields and data that I need for the job. No huge parameter lists, no thread local hacking, no worries about concurrency.
I have a utility class for unit tests that creates a HttpRequestHandler with mock implementations of request and response. This way, I don't need a servlet environment to test my code.
This solves all my problems because I can get the DB session and other things in the init() method or I can insert a factory between the servlet and the real handler to do more complex things.
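A rough sketch of what such a handler might look like. Everything in it is an assumption for illustration: the real class wraps the servlet request and response, whereas this stand-in uses a parameter map and a StringBuilder so that it runs without a servlet container, and GreetingHandler is a made-up subclass:

```java
import java.util.HashMap;
import java.util.Map;

// One instance per request, so per-request state lives in ordinary fields
// instead of thread locals.
abstract class HttpRequestHandler {
    private final Map<String, String> parameters;            // stand-in for the request
    private final StringBuilder body = new StringBuilder();  // stand-in for the response

    HttpRequestHandler(Map<String, String> parameters) {
        this.parameters = new HashMap<>(parameters);
    }

    protected String param(String name) { return parameters.get(name); }
    protected void write(String text) { body.append(text); }
    String responseBody() { return body.toString(); }

    abstract void handle() throws Exception;
}

class GreetingHandler extends HttpRequestHandler {
    private String userName; // per-request field, never shared between requests

    GreetingHandler(Map<String, String> parameters) { super(parameters); }

    @Override
    void handle() {
        userName = param("user");
        write("Hello, " + userName);
    }
}
```

The servlet would create a fresh handler for each request and call handle(); a unit test can do the same with a hand-built parameter map.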

You are psychic! (+1 for that)
My aim is ... to get a proof this has stopped working in Servlet 3.0 container
Here is the proof that you were asking for.
Incidentally, it is using the exact same OEMIV filter that you mentioned in your question and, guess what, it breaks async servlet processing!
Edit: Here is another proof.

One solution is to not use ThreadLocal but rather use a singleton that contains a static array of the objects you want to make global. Each object would contain a "threadName" field that you set. You first set the current thread's name (in doGet, doPost) to some random unique value (like a UUID), then store it as part of the object that contains the data you want stored in the singleton. Then whenever some part of your code needs to access the data, it simply goes through the array, checks for the object with the threadName of the currently running thread, and retrieves it. You'll need to add some cleanup code to remove the object from the array when the HTTP request completes.

Related

What is the best variable scope I should use in servlet operation

I am working on a workflow management system.
I have a separate Java class that contains the logic methods. One of them is:
public static int get_nxt_stg(int current_stg, int action)
{
}
The class also defines the static variables cur_stg and nxt_stg, which the servlet uses when it calls this method.
When multiple users log in and perform actions, these variables do not hold the proper values. It seems as if they are shared between all user requests.
What is the best way to use a variable in a servlet so that it remains specific to one request?

You should not use static in such a way. If you need to share state, consider using the singleton pattern, but try to avoid static. Unwise use of "static" can turn into a nightmare (for example regarding unit testing).
In addition: it seems that you are a beginner with the Java language, but creating servlets is definitely an "advanced" Java topic. I really recommend that you learn more about Java as preparation for working on servlets. Otherwise the users of your server might have many unpleasant experiences ...
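To make the point about static state concrete, here is a minimal sketch in which the stage values are parameters and local variables rather than static fields, so every call (and therefore every request) gets its own copy. The transition rule itself is invented purely for illustration, as are the names WorkflowLogic and nextStage:

```java
class WorkflowLogic {
    // Hypothetical rule: action 1 ("approve") advances one stage,
    // any other action sends the item back to stage 0.
    static int nextStage(int currentStage, int action) {
        int next = (action == 1) ? currentStage + 1 : 0; // local: one copy per call
        return next;
    }
}
```

Inside doGet or doPost the servlet would read the current stage from the request, call WorkflowLogic.nextStage(...), and keep the result in a local variable or a request attribute. Static fields, by contrast, exist once per class and are therefore visible to every concurrent request.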

What you are doing is wrong. You should use servlets only for reading request parameters and sending responses. What you are trying to do should be implemented in the business layer of your application; if you have it implemented with EJBs, your problem can easily be solved with a stateful EJB.

Is it a bad practice to use a ThreadLocal Object for storing web request metadata?

I am working on a J2EE webapp divided into several modules. I have some metadata, such as user name and preferences, that I would like to access from everywhere in the app. I may also want to gather data similar to logging information, but specific to a request, and store it with that metadata so that I could optionally send it back to the user as debug information.
Aside from passing a generic context object through every method, from the upper presentation classes down to the lower-level DAOs, or using AOP, the only solution that came to mind was a thread-local "Context" object (very similar to a session, by the way), plus a filter that binds it on each incoming request and unbinds it on the response.
But this feels a little hacky, since it breaks several patterns and could make testing and debugging complicated, so I wanted to ask whether, in your experience, it is OK to proceed like this.

ThreadLocal is a hack to make up for bad design and/or architecture. It's a terrible practice:
It's a pool of one or more global variables and global variables in any language are bad practice (there's a whole set of problems associated with global variables - search it on the net)
It may lead to memory leaks in any J2EE container that manages its threads, if you don't handle it well.
What's even worse practice is to use the ThreadLocal in the various layers.
Data communicated from one layer to another should be passed using Transfer Objects (a standard pattern).
It's hard to think of a good justification for using ThreadLocal. Perhaps if you need to communicate some values between 2 layers that have a third/middle layer between them, and you don't have the means to make changes to that middle layer. But if that's the case, I would look for a better middle layer.
In any case, if you store the values at one specific point in the code and retrieve them at another single point, then it may be excusable; otherwise you just never know what side effects any executing method may have on the values in the ThreadLocal.

Personally I prefer passing a context object, as the fact that the same thread is used for processing is an artifact of the implementation, and you shouldn't rely on such artifacts. The moment you want to use other threads, you'll hit a wall.
If those states are encapsulated in a Context object, I think that's clean enough.
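A small sketch of the explicit-context style, with hypothetical class names (RequestContext, GreetingService, GreetingDao) and an invented greeting rule. The context is immutable and is handed from layer to layer as an ordinary argument, so nothing depends on which thread runs the code:

```java
// Hypothetical immutable carrier for per-request metadata.
final class RequestContext {
    final String userName;
    final String locale;

    RequestContext(String userName, String locale) {
        this.userName = userName;
        this.locale = locale;
    }
}

class GreetingDao {
    // The lowest layer receives the context as a parameter, not via a thread local.
    String greetingFor(RequestContext ctx) {
        String salutation = "fr".equals(ctx.locale) ? "Bonjour" : "Hello";
        return salutation + ", " + ctx.userName;
    }
}

class GreetingService {
    private final GreetingDao dao = new GreetingDao();

    String greet(RequestContext ctx) {
        return dao.greetingFor(ctx); // passed straight through the middle layer
    }
}
```

The cost is one extra parameter on each method; in return, the code behaves the same on any thread and can be tested by constructing a RequestContext directly.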

When it comes to testing, the best tool is dependency injection. It allows you to inject fake dependencies into the object under test.
And all dependency injection frameworks (Spring, CDI, Guice) have the concept of a scope (where request is one of these scopes). Under the hood, beans stored in the request scope are indeed associated with a ThreadLocal variable, but this is all done by the dependency injection framework.
What I would do is thus to use a DI framework, which would make request-scope objects available anywhere, but without having to look them up, which would break testability. Just inject a request-scoped object where you want to use it, and the DI framework will retrieve it for you.

You must know that a servlet container can / will re-use threads for requests so if you do use ThreadLocals, you'll need to clean up after yourself once the request is finished (perhaps using a filter)

If you are the only developer in the project and you think you gain something: just do it! Because it is your time. But, be prepared to revert the decision and reorganize the code base later, as should be always the case.
Let's say there are ten developers on the project. Everybody might like to have their own thread-local variable to pass on parameters like currency, locale, roles; maybe it even becomes a HashMap...
I think in the end, not everything that is feasible should be done. Complexity will strike back at you...

ThreadLocal can lead to a memory leak if we do not clear it manually (by calling remove() or setting it to null) once it is out of scope.

How is threading done for Google Cloud Endpoints handlers in Java?

I'm noticing some strange behavior in my app that smells like a lack of thread-safety. I'm working on reproducing it, but in the meantime I wanted to ensure I'm making the right assumptions about how the class that contains my endpoint handlers is used from a threading perspective. Most of what happens is opaque to me, because I'm not the one instantiating the class in the first place. To state the obvious, it must be some black magic in Endpoints.
MY ASSUMPTION
An instance of the class that holds my endpoint handlers is created for every single request that comes into my app. Based upon that assumption, it's ok for that class to have non-thread-safe objects that get used by my handlers.
MY FEAR
The instances of Endpoint handler classes are reused across requests.
So, which is it? Regardless of the answer, I think it would make sense for me to remove the ambiguity in my app and assume the worst, because I don't think I have any control over how Endpoints behaves. In my case, I'm creating a JDO/DataNucleus PersistenceManager (not thread-safe) when constructing the class housing my endpoint handlers. I should probably just create it in each handler as a local, or use a ThreadLocal.
I can probably also fashion a test to prove one or the other. I'll post back an answer to my own question if I do.

Should heavy-use objects be created on application/session level in ColdFusion?

I'm running ColdFusion 8/MySQL 5.0.88.
My application's main feature is a search function which, on submit, triggers an AJAX request calling a CFC method. The method assembles the HTML, gzips it and returns the gzipped HTML as the AJAX response.
This is the gzip part:
<cfscript>
var result="";
var text=createObject("java","java.lang.String").init(arguments[1]);
var dataStream=createObject("java","java.io.ByteArrayOutputStream").init();
var compressDataStream=createObject("java","java.util.zip.GZIPOutputStream").init(dataStream);
compressDataStream.write(text.getBytes());
compressDataStream.finish();
compressDataStream.close();
</cfscript>
I am a little reluctant regarding the use of cfobject here, especially since this script will be called over and over again by every user.
Question:
Would it increase performance if I created the object at the application or session level, or at least checked for the existence of the object before re-creating it? What's the best way to handle this?

If your use of objects is like what's in the code snippet in the question, I'd not put anything into any scope longer-lived than request. The reasons being:
The objects you are instantiating are not re-usable (Strings are immutable, and the output streams don't look re-usable either)
Even if they were re-usable, the objects in question aren't thread-safe. They can't be shared between concurrent requests, so application scope isn't appropriate and actually session scope probably isn't safe either as concurrent requests for the same session can easily occur.
The objects you're using there are probably very low overhead to create, so there'd be little benefit to trying to cache them, if you could.
If you have objects that are really resource intensive, then caching and pooling them can make sense (e.g. Database Connections), but it's considerable effort to get right, so you need to be sure that you need it first.
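For reference, the same per-call pattern written directly in Java might look like the sketch below. It is not the question's CFC code: GzipPerRequest is a hypothetical class, UTF-8 is an assumed encoding, and gunzip is included only so the round trip can be checked. Every stream is a local variable created for one call and discarded afterwards, so nothing is shared between concurrent requests:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class GzipPerRequest {
    // Compresses one string; the streams live only for the duration of the call.
    static byte[] gzip(String text) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(buffer)) {
            gz.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The GZIP stream is closed (and therefore finished) before the bytes are read.
        return buffer.toByteArray();
    }

    // Inverse operation, used here only to verify the result.
    static String gunzip(byte[] compressed) {
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = gz.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```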

Get the HttpServletRequest (request) object from Java code

I need to get hold of the request object in Java code. I can't pass this object down to my code for certain reasons. Is there any way I can say something like: getCurrentHTTPServletRequest?
It is safe for me to assume that I am in a Servlet Context.

Well you should pass it down if you need it. Anything else you do is going to be ugly, basically.
You could use a ThreadLocal variable - basically set the context for that particular thread when you get the request, and then fetch it later on. That will work so long as you only need to get at the request within the thread that's processing it - and so long as you don't do any funky asynchronous request handling. It's brittle though, for precisely those reasons.
However, I would strongly advise you to be explicit about your dependencies instead. Either pass the servlet request down, or just the bits that you need.

Assuming you're not able to pass the request object down the call stack, some kind of sharing mechanism becomes necessary. That is not ideal, but sometimes unavoidable.
Spring provides the RequestContextFilter for just this purpose. It uses ThreadLocal, and allows the code to fetch the current request via RequestContextHolder. Note that this filter does not require you to use any other part of Spring:
Servlet 2.3 Filter that exposes the
request to the current thread, through
both LocaleContextHolder and
RequestContextHolder. To be registered
as filter in web.xml.
This filter is mainly for use with
third-party servlets, e.g. the JSF
FacesServlet. Within Spring's own web
support, DispatcherServlet's
processing is perfectly sufficient.
If you're going to use ThreadLocal, then better to use an existing, working solution, rather than risk bugs creeping in, which ThreadLocal code is prone to.

Jon Skeet said practically everything, but one clarification to his advice "just the bits that you need" - if you need your request parameters passed down, but you don't need a dependency on HttpServletRequest, pass request.getParameterMap().
And extending a bit on the ThreadLocal option - you can have a Filter which handles all incoming requests, and sets the request in a
public final static ThreadLocal<HttpServletRequest> httpServletRequestTL =
new ThreadLocal<HttpServletRequest>();
Because you are setting it on each request (careful with the filter mapping), you won't have to worry about the servlet-container thread pool - you will always have the current request.
P.S. this is the logic behind the spring utility proposed by skaffman - I join him recommending the stable component, rather than making your own.

There is no servlet API to do this. However, Tomcat does provide an API call to do this,
HttpServletRequest request = (HttpServletRequest)org.apache.catalina.core.ApplicationFilterChain.getLastServicedRequest();
This will get the last request passed to a servlet for servicing from the current thread.
For this to work, Tomcat must be in "Strict Servlet Compliance" mode. If it is not, you need to enable it by setting this system property (for example as a -D JVM argument):
org.apache.catalina.STRICT_SERVLET_COMPLIANCE=true

Assuming the top-level servlet really is taboo for some crazy business-related reason, there is still the option of defining a servlet Filter to pre-view the request and stuff it into a ThreadLocal, assuming that the web.xml is not also sacrosanct.
But I agree with Jon Skeet in that this would be very ugly. I'd code this up and then try to find a different job. :)
Actually, given the fact that a filter can totally wrest away control from the receiving servlet, you could use this technique to divert the code to a servlet of your own, do whatever you want, and THEN run the other, "official" servlet... or anything else along those lines. Some of those solutions would even allow you to deal correctly and robustly with your request data.
