Java Filters Performance Question

I have two questions. The first: do filters add a lot of overhead to a request? We have a filter that is mapped to the URL pattern /*, which means it also runs on every image request. I think this is bad for performance, but my co-workers think it doesn't matter if the filter runs 5 or 6 times per request, because the filter only has a couple of if statements.
The second: is there a way to have the filter run once per request, ignoring the image requests?
Thanks, Doug

Measuring is knowing. If the filter is well written, I'd say the overhead is negligible. But if it, for example, grabs the session regardless of whether one has been created (so that a session may be created unnecessarily), then it may have a noticeable impact on performance and/or memory usage, because creating sessions isn't cheap per se and sessions are kept in the server's memory for much longer than requests.
You may want to replace the url-pattern of /* by *.jsp or to move the restricted pages to a specific folder, e.g. /secured, /private, /pages, etc and alter the url-pattern accordingly to /secured/*, /private/*, /pages/*, etc and put all the static content in a different place, e.g. /static. This way the filter won't be invoked for static content anymore.
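If changing the url-pattern is not an option, another possibility is to let the filter itself bail out early for static content. A minimal sketch of such a check, assuming a hypothetical helper named StaticResourceCheck, an assumed /static/ prefix and an assumed list of extensions; none of these names come from the original filter:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/** Hypothetical helper: decides whether a request path points at static content. */
final class StaticResourceCheck {
    // Assumed list of static extensions; adjust to what the application actually serves.
    private static final Set<String> STATIC_EXTENSIONS = new HashSet<>(
            Arrays.asList("png", "jpg", "jpeg", "gif", "css", "js", "ico"));

    private StaticResourceCheck() {}

    static boolean isStatic(String path) {
        if (path == null) {
            return false;
        }
        // Assumed convention: everything under /static/ is static content.
        if (path.startsWith("/static/")) {
            return true;
        }
        int slash = path.lastIndexOf('/');
        int dot = path.lastIndexOf('.');
        // No extension in the last path segment, or the path ends with a dot.
        if (dot <= slash || dot == path.length() - 1) {
            return false;
        }
        String extension = path.substring(dot + 1).toLowerCase(Locale.ROOT);
        return STATIC_EXTENSIONS.contains(extension);
    }
}
```

Inside doFilter, the filter could call this with the request path (for example the value of HttpServletRequest.getServletPath()) and, when it returns true, pass the request straight on with chain.doFilter(request, response) before doing any other work.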

First, I agree with the Profile-first approach.
Second, as far as I know, it depends. Web servers use the same technique to invoke a specific servlet (or JSP) as they use for filters.
If the filter is filtering a static resource (e.g. a jpg file), it's a bit of a waste.
If the filter is filtering a dynamic resource (e.g. a servlet), it's negligible.
(Most of the Java web frameworks like struts and Jboss-seam are using filters heavily..)

It is almost never useful to speculate about the performance implications of code without first profiling it. Unless the code in the filters performs operations you know to be slow, measure first before optimising.
Remember that even though, when you are writing a servlet, it may seem like the only thing that happens is the code in your doGet() or doPost() methods, a lot of other things happen before your servlet or filter code gets invoked. The servlet container processes the HTTP request, bundles it up in Java objects, and does all sorts of other processing before it hands over to your code.
If your servlet filters really are only a couple of if statements operating on data that is cheap to get (such as the request itself), it is unlikely this is going to be an issue for you.

Related

JSP and MVC Best Practices

I am new to JSP programming and am writing a web app for a family member. As I study, I hear a lot about how JSP's are supposed to be used for presentation and servlets are for business logic. My question is basically about how far that goes and when my use of JSTL would be bad practice. Here's an example: I have a login page for my app, and I am using c:if's with custom functions connected to my java classes to process the form. Would that be considered poor MVC practice or, since I'm only referencing my logic code from EL, is this a legitimate use of JSP's?

Your question is largely about best practices, which invites a lot of opinion and debate, and that is usually frowned upon in this forum. In general, the JSP is the "V"iew in MVC and should be used to present the data provided by the "M"odel, which would be your Java code. The "C"ontroller is often scattered between the M and the V (inviting more debate, sorry).
Any logic you put in your JSP that is beyond looking at the data given you and deciding how to present it, moves it towards the Model. Your login page should just collect the credentials and present them to the Model, which should in turn respond with "Invalid" and re-request the credentials (or fail completely) or if valid, move on to the next page.
In practice, IMHO, you should not put a lot, if any, code that manipulates the data except for formatting it - creating table entries, wrapping with links, etc. You should not (IMHO) query databases, perform calculations, etc., in the JSP - let the Model do that.
As duffymo stated, JSPs are old, but they are still valid. I would suggest that you also consider AngularJs (ng) (after reading about the controversy of V1 v. V2).

JSP is an outdated technology and very little software still uses it. But if you want to use it, I would suggest following the Oracle coding standards for it. This page should give you a clear idea of what you should and shouldn't do with it.

The best practice with JSP - is not to use JSP at all. I’ll try to explain why and be clear.
First I have to explain something that does not have a connection to JSP at all, but it will help you to understand exact problems with JSP technology.
In functional programming there is a term: pure function. It means that the function does not have side effects. Additionally, such a function guarantees that for each invocation with the same input it ALWAYS returns the same output.
In OOP, functions are not pure. They may have side effects, which makes our life more complicated. But what is important is that these side effects can happen only WITHIN your function. You can debug it. More or less, it is UNDER YOUR CONTROL.
Let’s imagine our functionality written in JSP as a function f with input I and output O:
O f(I)
The first problem with JSP is that it DOES have side effects, AND such side effects can happen not only inside your function f but can also affect it from outside. A simple example: you use Tiles, and your JSP page is used as a component in a Tiles template. Another component of this template calls getOutputStream() and writes to that output stream. But an application can call either getOutputStream() or getWriter() on any given response; it is not allowed to do both. JSP engines use getWriter(), so you cannot call getOutputStream(). And your JSP page, which works fine on its own, now fails with:
java.lang.IllegalStateException: getOutputStream() has already been called for this response
With a function, you get explicit input parameters. The input is always clear. Additionally, you can use constants or, if your function has side effects, call another service to get data for processing. But that always happens WITHIN your function and is more or less under control. With JSP pages you do not have such control at all. Input data can be put into the session by different servlets or web components, and input data can be put into request scope by a servlet with a lot of if statements. You must first investigate the logic of that servlet. It is additional complexity that is not obvious when you create a "Hello World!" program, but that really drives you crazy when you maintain such pages, written several years ago.
I think you have already read that mixing output and logic is not a good idea. JSP allows people to do that, because it is "so convenient".
You cannot test the logic within your JSP pages, or at least it is much more difficult.
You could say that correct usage of JSP technology and applying best practices resolves most of these issues. Yes, agreed. But that will never get rid of its inherent drawbacks and complexity. You always have to check whether developers really followed best practices or not. There are better, much better technologies these days.
Note: the only use case where I would personally use it is localisation. You do not have to get all messages from the server, and you do not want to ask the server for localized strings one by one. You do want to get a batch of values that will be used on your web form, for instance. With JSP + JS you can do that very easily and explicitly.

Should heavy-use objects be created on application/session level in Coldfusion?

I'm running Coldfusion8/MySQL 5.0.88.
My applications main feature is a search function, which on submit triggers an AJAX request calling a cfc-method. The method assembles the HTML, gzips it and returns gzipped HTML as Ajax response.
This is the gzip part:
<cfscript>
var result="";
var text=createObject("java","java.lang.String").init(arguments[1]);
var dataStream=createObject("java","java.io.ByteArrayOutputStream").init();
var compressDataStream=createObject("java","java.util.zip.GZIPOutputStream").init(dataStream);
compressDataStream.write(text.getBytes());
compressDataStream.finish();
compressDataStream.close();
</cfscript>
I am a little reluctant regarding the use of cfobject here, especially since this script will be called over and over again by every user.
Question:
Would it increase performance if I create the object on the application or session level or at least check for the existence of the object before re-creating it. What's the best way to handle this?

If your use of objects is like what's in the code snippet in the question, I'd not put anything into any scope longer-lived than request. The reasons being:
The objects you are instantiating are not re-usable (Strings are immutable, and the output streams don't look re-usable either)
Even if they were re-usable, the objects in question aren't thread-safe. They can't be shared between concurrent requests, so application scope isn't appropriate and actually session scope probably isn't safe either as concurrent requests for the same session can easily occur.
The objects you're using there are probably very low overhead to create, so there'd be little benefit to trying to cache them, if you could.
If you have objects that are really resource intensive, then caching and pooling them can make sense (e.g. Database Connections), but it's considerable effort to get right, so you need to be sure that you need it first.
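For comparison, the same gzip step written directly in Java might look like the sketch below. The class name GzipSketch and the choice of UTF-8 are assumptions (the snippet in the question uses the platform default charset). The point it illustrates is that every call builds its own short-lived streams, so there is nothing worth keeping in application or session scope:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Hypothetical helper: gzips a string using fresh, per-call stream objects. */
final class GzipSketch {
    private GzipSketch() {}

    static byte[] gzip(String text) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        // Closing the GZIP stream writes the trailer; the streams are then discarded.
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(text.getBytes(StandardCharsets.UTF_8)); // UTF-8 is an assumption
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    /** Inverse of gzip(), included only so the round trip can be checked. */
    static String gunzip(byte[] compressed) {
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int read;
            while ((read = in.read(chunk)) != -1) {
                out.write(chunk, 0, read);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because nothing is shared between calls, concurrent requests cannot interfere with each other here.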

Is there a way to limit the number of AJAX calls in the browser that remain open?

I have a software design question on what's the best way to handle a client javascript program that relies on multiple (but mostly consecutive, not simultaneous), short-lived AJAX calls to the server as a response to user interaction [in my particular case, it will be a facebook-GAE/J app, but I believe the question is relevant to any client(browser)/server design].
First, I asked this question: What is the life span of an ajax call? Based on BalusC's answer (I encourage you to read it there), the short answer is "that's up to the browser". So, right now I do not really have control over what happens after the server has sent the response.
If the main use for an AJAX call is to retrieve data just once from the server, is it possible to manually destroy it? Would xhr1.abort() do that?
Or, the best choice is leave it like that? Would manually closing each connection (if even possible) add too much overhead to each call?
Is it possible to manually set the limit per domain?
And last (but not least!), should I really worry about this? What would be a number of calls large enough to start delaying the browser (especially some IE browsers with the leak bug that BalusC mentioned in the other question)? Please bear in mind that this is my first javascript/java servlets project.
Thank you in advance

The usage paradigm for XHR is that you don't have to worry about what happens to the object -- the browser's engine takes care of that behind the scenes for you. So I don't see any point in attempting to "improve" things manually. Browser developers are certainly aware that 99.9999% of JS programmers do not do that, so they have not only taken it into account but probably optimized for that scenario as well.
You should not worry about it unless and until you have a concrete problem in your hands.
As for limiting the number of AJAX calls per domain (either concurrent outstanding calls, or total calls made, or any other metric you might be interested in), the solution would be the venerable CS classic: add another layer of abstraction.
In this case, the extra layer of abstraction would be a function through which all AJAX calls are routed; you can then implement logic that tracks the progress of each call (per domain if you want) and rejects or postpones incoming calls based on that state. It won't be easy to get right, but it's certainly doable.
However, I suggest also not worrying about this unless and until you have a concrete problem in your hands. :)
Update:
Browsers do enforce their own limits on concurrent AJAX calls; there's a very good question about that here: How many concurrent AJAX (XmlHttpRequest) requests are allowed in popular browsers?
Also, as T. J. Crowder mentions in the comments: make sure you are not keeping references to XHR objects when you are done with them, so that they can be garbage collected -- otherwise, you are creating a resource leak yourself.
Second update:
There is a good blog post about reusing XHR here -- it's actually the start of a chain of relevant posts. On the down side, it's dated and it doesn't come to any practical conclusion. But it covers the mechanics of reusing XHR well.

If the main use for an AJAX call is to retrieve data just once from the server, is it possible to manually destroy it? Would xhr1.abort() do that?
It only aborts the running request. It does not close the connection.
Or, the best choice is leave it like that? Would manually closing each connection (if even possible) add too much overhead to each call?
Not possible. It's the browser's responsibility.
Is it possible to manually set the limit per domain?
Not possible from the server side. This is a browser-specific setting. The best you could do is to ask the end user, in some page dialog, to change the setting if that has not been done yet. But in the end this makes no sense, certainly not if the end user does not understand the rationale behind it at all.
And last (but not least!), should I really worry about this? What would be a number of calls large enough to start delaying the browser (especially some IE browsers with the leak bug that BalusC mentioned in the other question)? Please bear in mind that this is my first javascript/java servlets project.
Yes, you should certainly worry about browser-specific bugs. You want your application to work without issues, don't you? Why wouldn't you just use an existing ajax library like jQuery? It has already handled all the nasty bugs and details under the covers for you (and there are many more of those than just the MSIE memory leak). Just call $.ajax(), $.get(), $.post() or $.getJSON() and that's it. I wouldn't attempt to reinvent the XHR handling wheel when you're fairly new to the material. You can find some jQuery-Servlet communication examples in this answer.

Servlet 3 spec and ThreadLocal

As far as I know, Servlet 3 spec introduces asynchronous processing feature. Among other things, this will mean that the same thread can and will be reused for processing another, concurrent, HTTP request(s). This isn't revolutionary, at least for people who worked with NIO before.
Anyway, this leads to another important thing: no ThreadLocal variables as a temporary storage for the request data. Because if the same thread suddenly becomes the carrier thread to a different HTTP request, request-local data will be exposed to another request.
All of that is my pure speculation based on reading articles, I haven't got time to play with any Servlet 3 implementations (Tomcat 7, GlassFish 3.0.X, etc.).
So, the questions:
Am I correct to assume that ThreadLocal will cease to be a convenient hack to keep the request data?
Has anybody played with any of Servlet 3 implementations and tried using ThreadLocals to prove the above?
Apart from storing data inside HTTP Session, are there any other similar easy-to-reach hacks you could possibly advise?
EDIT: don't get me wrong. I completely understand the dangers and ThreadLocal being a hack. In fact, I always advise against using it in similar context. However, believe it or not, thread context has been used far more frequently than you probably imagine. A good example would be Spring's OpenSessionInViewFilter which, according to its Javadoc:
This filter makes Hibernate Sessions
available via the current thread,
which will be autodetected by
transaction managers.
This isn't strictly ThreadLocal (haven't checked the source) but already sounds alarming. I can think of more similar scenarios, and the abundance of web frameworks makes this much more likely.
Briefly speaking, many people have built their sand castles on top of this hack, with or without awareness. Therefore Stephen's answer is understandable but not quite what I'm after. I would like to get a confirmation whether anyone has actually tried and was able to reproduce failing behaviour so this question could be used as a reference point to others trapped by the same problem.

Async processing shouldn't bother you unless you explicitly ask for it.
For example, a request can't be made async unless the servlet and every filter in the request's filter chain are marked with <async-supported>true</async-supported>. Therefore, you can still use regular practices for regular requests.
Of course, if you actually need async processing, you need to use appropriate practices. Basically, when a request is processed asynchronously, its processing is broken into parts. These parts don't share thread-local state. You can still use thread-local state inside each of those parts, but you have to carry the state from one part to the next manually.
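To illustrate the last point without a servlet container, here is a minimal sketch that uses a plain ExecutorService to stand in for the thread that runs a later part of the request. The class name AsyncPartsSketch and the CURRENT_USER thread local are made up for the example; a real container's async dispatch is more involved:

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical demo: two "parts" of one request running on different threads. */
final class AsyncPartsSketch {
    private static final ThreadLocal<String> CURRENT_USER = new ThreadLocal<>();

    private AsyncPartsSketch() {}

    /** Returns { value part 2 sees implicitly, value part 2 sees after manual hand-over }. */
    static String[] run(String user) {
        ExecutorService otherThread = Executors.newSingleThreadExecutor();
        try {
            CURRENT_USER.set(user);                       // part 1, on the calling thread
            final String handedOver = CURRENT_USER.get(); // state captured explicitly
            Future<String[]> part2 = otherThread.submit(() -> {
                String implicit = CURRENT_USER.get();     // different thread: nothing there
                CURRENT_USER.set(handedOver);             // re-establish state for this part
                try {
                    return new String[] { implicit, CURRENT_USER.get() };
                } finally {
                    CURRENT_USER.remove();                // clean up before the thread is reused
                }
            });
            return part2.get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            CURRENT_USER.remove();
            otherThread.shutdown();
        }
    }
}
```

The first element of the result is null because the second part runs on another thread; the second element is the user name, but only because it was passed along by hand.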

(Caveat: I've not read the Servlet 3 spec in detail, so I cannot say for sure that the spec says what you think it does. I'm just assuming that it does ...)
Am I correct to assume that ThreadLocal will cease to be a convenient hack to keep the request data?
Using ThreadLocal was always a poor approach, because you always ran the risk that information would leak when a worker thread finished one request and started on another one. Storing stuff as attributes in the ServletRequest object was always a better idea.
Now you've simply got another reason to do it the "right" way.
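The leak described above can be reproduced outside a container. The following is only a sketch: a single-threaded ExecutorService plays the role of a pooled worker thread, and the names ThreadLocalLeakSketch and REQUEST_DATA are invented for the example:

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical demo: a reused worker thread exposes one request's data to the next. */
final class ThreadLocalLeakSketch {
    private static final ThreadLocal<String> REQUEST_DATA = new ThreadLocal<>();

    private ThreadLocalLeakSketch() {}

    /** Runs two "requests" on the same worker thread and returns what the second one sees. */
    static String seenByNextRequest(boolean cleanUp) {
        ExecutorService worker = Executors.newSingleThreadExecutor();
        try {
            Runnable firstRequest = () -> {
                REQUEST_DATA.set("request-1 data");
                if (cleanUp) {
                    REQUEST_DATA.remove(); // what a finally block should do
                }
            };
            Callable<String> secondRequest = REQUEST_DATA::get;
            worker.submit(firstRequest).get();         // wait until request 1 is done
            return worker.submit(secondRequest).get(); // same thread handles request 2
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            worker.shutdown();
        }
    }
}
```

Without the cleanup, the second request reads the first request's value; with it, the second request reads null.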
Has anybody played with any of Servlet 3 implementations and tried using ThreadLocals to prove the above?
That's not the right approach. It only tells you about the particular behaviour of a particular implementation under the particular circumstances of your test. You cannot generalize.
The correct approach is to assume that it will sometimes happen if the spec says it can ... and design your webapp to take account of it.
(Fear not! Apparently, in this case, this does not happen by default. Your webapp has to explicitly enable the async processing feature. If your code is infested with thread locals, you would be advised not to do this ...)
Apart from storing data inside HTTP Session, are there any other similar easy-to-reach hacks you could possibly advise.
Nope. The only right answer is storing request-specific data in the ServletRequest or ServletResponse object. Even storing it in the HTTP Session can be wrong, since there can be multiple requests active at the same time for a given session.

NOTE: Hacks follow. Use with caution, or really just don't use.
So long as you continue to understand which thread your code is executing in, there's no reason you can't use a ThreadLocal safely.
try {
    tl.set(value);
    doStuffUsingThreadLocal();
} finally {
    tl.remove();
}
It's not as if your call stack is switched out randomly. Heck, if there are ThreadLocal values you want to set deep in the call stack and then use further out, you can hack that too:
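import java.util.HashSet;
import java.util.Set;
public class Nasty {
    // Per-thread registry of the ThreadLocals that were set during this call.
    static ThreadLocal<Set<ThreadLocal<?>>> cleanMe =
        new ThreadLocal<Set<ThreadLocal<?>>>() {
            @Override
            protected Set<ThreadLocal<?>> initialValue() {
                return new HashSet<ThreadLocal<?>>();
            }
        };
    static void register(ThreadLocal<?> toClean) {
        cleanMe.get().add(toClean);
    }
    // Clears every registered ThreadLocal for the current thread, then the registry itself.
    static void cleanup() {
        Set<ThreadLocal<?>> registered = cleanMe.get();
        for (ThreadLocal<?> tl : registered)
            tl.remove();
        registered.clear();
    }
}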
Then you register your ThreadLocals as you set them, and clean up in a finally clause somewhere. This is all shameful wankery that you probably shouldn't do. I'm sorry I wrote it but it's too late :/

I'm still wondering why people use the rotten javax.servlet API to actually implement their servlets. What I do:
I have a base class HttpRequestHandler which has private fields for request, response and a handle() method that can throw Exception plus some utility methods to get/set parameters, attributes, etc. I rarely need more than 5-10% of the servlet API, so this isn't as much work as it sounds.
In the servlet handler, I create an instance of this class and then forget about the servlet API.
I can extend this handler class and add all the fields and data that I need for the job. No huge parameter lists, no thread local hacking, no worries about concurrency.
I have a utility class for unit tests that creates a HttpRequestHandler with mock implementations of request and response. This way, I don't need a servlet environment to test my code.
This solves all my problems because I can get the DB session and other things in the init() method or I can insert a factory between the servlet and the real handler to do more complex things.

You are psychic ! (+1 for that)
My aim is ... to get a proof this has stopped working in Servlet 3.0 container
Here is the proof that you were asking for.
Incidentally, it is using the exact same OEMIV filter that you mentioned in your question and, guess what, it breaks Async servlet processing !
Edit: Here is another proof.

One solution is to not use ThreadLocal but rather use a singleton that contains a static array of the objects you want to make global. Each object would contain a "threadName" field that you set. You first set the current thread's name (in doGet, doPost) to some random unique value (like a UUID), then store it as part of the object that contains the data you want stored in the singleton. Then, whenever some part of your code needs to access the data, it simply goes through the array, checks for the object whose threadName matches the currently running thread, and retrieves that object. You'll need to add some cleanup code to remove the object from the array when the HTTP request completes.

Get the HttpServletRequest (request) object from Java code

I need to get hold of the request object in Java code. I can't pass this object down to my code for certain reasons. Is there any way I can say something like: getCurrentHTTPServletRequest?
It is safe for me to assume that I am in a Servlet Context.

Well you should pass it down if you need it. Anything else you do is going to be ugly, basically.
You could use a ThreadLocal variable - basically set the context for that particular thread when you get the request, and then fetch it later on. That will work so long as you only need to get at the request within the thread that's processing it - and so long as you don't do any funky asynchronous request handling. It's brittle though, for precisely those reasons.
However, I would strongly advise you to be explicit about your dependencies instead. Either pass the servlet request down, or just the bits that you need.

Assuming you're not able to pass the request object down the call stack, then some kind of sharing mechanism becomes necessary, which is not ideal, but sometimes necessary.
Spring provides the RequestContextFilter for just this purpose. It uses ThreadLocal, and allows the code to fetch the current request via RequestContextHolder. Note that this filter does not require you to use any other part of Spring:
Servlet 2.3 Filter that exposes the
request to the current thread, through
both LocaleContextHolder and
RequestContextHolder. To be registered
as filter in web.xml.
This filter is mainly for use with
third-party servlets, e.g. the JSF
FacesServlet. Within Spring's own web
support, DispatcherServlet's
processing is perfectly sufficient.
If you're going to use ThreadLocal, then better to use an existing, working solution, rather than risk bugs creeping in, which ThreadLocal code is prone to.

Jon Skeet said practically everything, but one clarification to his advice "just the bits that you need" - if you need your request parameters passed down, but you don't need a dependency on HttpServletRequest, pass request.getParameterMap().
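As a rough sketch of what "just the bits that you need" can look like: the code further down the stack accepts the parameter map (a Map<String, String[]>, which is what request.getParameterMap() returns in Servlet 3.0) and never sees the servlet API. The class and method names here are made up for illustration:

```java
import java.util.Map;

/** Hypothetical helper that depends only on the parameter map, not on HttpServletRequest. */
final class RequestParams {
    private RequestParams() {}

    /** Returns the first value of the named parameter, or the fallback if it is absent. */
    static String firstValue(Map<String, String[]> params, String name, String fallback) {
        String[] values = params.get(name);
        return (values == null || values.length == 0) ? fallback : values[0];
    }
}
```

The servlet would call something like RequestParams.firstValue(request.getParameterMap(), "q", "") and hand the result on, which also makes the downstream code testable with a plain HashMap.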
And extending a bit on the ThreadLocal option - you can have a Filter which handles all incoming requests, and sets the request in a
public final static ThreadLocal<HttpServletRequest> httpServletRequestTL =
new ThreadLocal<HttpServletRequest>();
Because you are setting it on each request (careful with the filter mapping), you won't have to worry about the servlet-container thread pool - you will always have the current request.
P.S. this is the logic behind the spring utility proposed by skaffman - I join him recommending the stable component, rather than making your own.

There is no servlet API to do this. However, Tomcat does provide an API call to do this,
HttpServletRequest request = (HttpServletRequest)org.apache.catalina.core.ApplicationFilterChain.getLastServicedRequest();
This will get the last request passed to a servlet for servicing from the current thread.
For this to work, Tomcat must be in "Strict Servlet Compliance" mode. If it is not, you need to enable it by adding this JVM system property (passed to the JVM with -D):
org.apache.catalina.STRICT_SERVLET_COMPLIANCE=true

Assuming the top-level servlet really is taboo for some crazy business-related reason, there is still the option of defining a ServletFilter to pre-view the request and stuff it into a ThreadLocal. Assuming that the web.xml is not also sacrosanct.
But I agree with Jon Skeet in that this would be very ugly. I'd code this up and then try to find a different job. :)
Actually, given the fact that a filter can totally wrest away control from the receiving servlet, you could use this technique to divert the code to a servlet of your own, do whatever you want, and THEN run the other, "official" servlet... or anything else along those lines. Some of those solutions would even allow you to deal correctly and robustly with your request data.
