Session management in Thrift - java

I can't seem to find any documentation on how session management is supposed to be done in Thrift's RPC framework.
I know I can do a
TServer.setServerEventHandler(myEventHandler);
and observe calls to createContext (called when a connection is established) and processContext (called before every method call). Still, I somehow have to get whatever session state I maintain in those callbacks into the handler itself.
So how can I access session information in my handlers?

Not sure if there isn't also a way to use the ServerEventHandler approach mentioned in my question, but here's how I was able to create one handler per connection.
Rather than providing a singleton processor instance (containing a single handler instance) once, like this:
X.Processor<XHandler> processor = new X.Processor<XHandler>(new XHandler());
TServer server = new TSimpleServer(new TServer.Args(serverTransport)
        .processor(processor));
I instead create and provide a TProcessorFactory:
TProcessorFactory processorFactory = new TProcessorFactory(null)
{
    @Override
    public TProcessor getProcessor(TTransport trans)
    {
        // A fresh handler per connection: its instance fields
        // become per-connection session state.
        return new X.Processor<XHandler>(new XHandler());
    }
};
TServer server = new TSimpleServer(new TServer.Args(serverTransport)
        .processorFactory(processorFactory));
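With the factory approach, each connection gets its own handler, so plain instance fields work as session state. A minimal sketch (the class and method names here are made up; a real handler would implement the generated X.Iface of your service):

```java
// One instance of this handler is created per connection by the
// TProcessorFactory above, so instance fields are per-connection state.
public class XHandler /* implements X.Iface */ {
    private String currentUser;  // session state, private to this connection

    public void login(String user) {
        this.currentUser = user;
    }

    public String whoAmI() {
        return currentUser == null ? "anonymous" : currentUser;
    }
}
```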

There is no such thing as built-in session management in Thrift. Remember, Thrift is meant to be a lightweight RPC and serialization framework. Managing sessions is considered outside its scope, at least one layer above Thrift.
I'm not sure whether the events approach will work, but maybe it does - I never tried it that way.
My recommendation would be to include the session ID (or whatever equivalent you use) into each call. That's how we do it, and it works quite well.
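A rough sketch of what that can look like on the server side (all names here are illustrative, not Thrift API): the client calls an openSession() method once, then passes the returned ID into every subsequent call, and the server resolves it to its state:

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Server-side session registry; the session ID travels inside each RPC call.
public class SessionRegistry {
    private final Map<String, Map<String, Object>> sessions = new ConcurrentHashMap<>();

    // Called by a login()/openSession() RPC method.
    public String openSession() {
        String id = UUID.randomUUID().toString();
        sessions.put(id, new ConcurrentHashMap<>());
        return id;
    }

    // Called at the top of every other RPC method.
    public Map<String, Object> require(String sessionId) {
        Map<String, Object> state = sessions.get(sessionId);
        if (state == null) {
            throw new IllegalArgumentException("unknown or expired session: " + sessionId);
        }
        return state;
    }

    public void closeSession(String sessionId) {
        sessions.remove(sessionId);
    }
}
```

In a real service you would also expire idle sessions, but the core idea is just an explicit ID argument plus a server-side map.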
Although quite handy, the "one handler per connection" model does not scale very far by design (the same is true for the "Threaded" servers, btw). Imagine any multiple of your choice of 1000 connections hammering your service in parallel. If there is the slightest chance of that scenario becoming reality, you should think of a different solution, because the approach you are about to use will most likely not scale.
Actually, there are some plans to integrate a kind of "header" data, but that stuff is not in the Apache Thrift code base yet.

Related

Drools singleton StatefulKnowledgeSession as a web service

I am working with Drools 5.6.0 and I’m ready to upgrade to 6.0 so this issue is relevant for both versions.
I have googled a lot about using Drools in a multithreaded environment and I am still unsure how to proceed. In the following scenario I’m trying to find a way to use a singleton StatefulKnowledgeSession pre-initialized with a large number of static facts as business logic for a web service.
I would like to know if there is a best practice for the scenario further described below.
I create a StatefulKnowledgeSession singleton when the server starts.
Right at initialization I insert over 100,000 facts into the StatefulKnowledgeSession. I call these "static facts" since they will never be modified by the rules. Static facts act more like a set of big lookup tables.
Now the rule engine is placed into a web service (Tomcat). The web service receives a Request object which will be inserted into the KnowledgeSession.
After fireAllRules() I expect the KnowledgeSession to calculate an output object which is to be returned as web service Response.
The calculation of the Response makes use of the static facts. The rules create a lot of temporary objects which are inserted into the working memory using insertLogical(). This makes sure that all garbage will be removed from working memory as soon as I retract the original Request object at the end of the web service call.
Now the question is: how do I make this work in a multithreaded server?
As far as possible I would like to use only one StatefulKnowledgeSession instance (a singleton) because the static facts are BIG and it could become a memory issue.
I cannot use StatelessKnowledgeSessions freshly created at the beginning of each web service call because inserting all the static facts would take too long.
I am aware that StatefulKnowledgeSession is not thread-safe. Also, the partitioning option is not supported any more.
However, different WorkingMemoryEntryPoints / EntryPoints can be used from different threads. I could use a pool of EntryPoints and each web service call would use one instance from the pool for inserting the web service request.
This also means that I would need to multiply my rules (?), each using one particular EntryPoint - or at least the first rule, which matches web service Request objects:
rule "entry rule for WORKER-1" // rule to be duplicated for entry points WORKER-2, WORKER-3, ...
when
    $req : Request() from entry-point "WORKER-1"
    $stat : StaticFact( attr == $req.getAttr() )
then
    insertLogical( new SomeTemporaryStuff( $req ) );
end

rule "subsequent rule"
when
    $tmp : SomeTemporaryStuff()
then
    // ...go on with the calculation and create a Response at some point...
end
Subsequent rules create temporary objects in the working memory, and at this point I'm really afraid of messing something up if I were bombarding the engine with dozens of concurrent Requests.
I could also start the KnowledgeSession in "fireUntilHalt" mode, but in that case I don't know how I could get a synchronous response from the rule engine to return as the web service Response.
I would not use multiple entry points. Queue the requests to the thread running the session. If you want to utilize multiple cores, run several services or service threads.
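One simple way to queue requests to the thread running the session is a single-thread executor: callers submit work and block on the returned Future, which also gives the web service its synchronous response. A sketch with the Drools session stubbed out as a plain function (the class name and the Function stand-in are mine, not Drools API):

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;

// All access to the (non-thread-safe) rule session is funneled through a
// single thread; web service threads queue requests and block for the answer.
public class SessionGateway<Req, Resp> {
    private final ExecutorService sessionThread = Executors.newSingleThreadExecutor();
    private final Function<Req, Resp> session; // stands in for insert + fireAllRules + retract

    public SessionGateway(Function<Req, Resp> session) {
        this.session = session;
    }

    // Queue the request and block until the session thread produces a response.
    public Resp call(Req request) {
        try {
            return sessionThread.submit(() -> session.apply(request)).get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
    }

    public void shutdown() {
        sessionThread.shutdown();
    }
}
```

Because only one thread ever touches the session, no rule or entry point needs to be duplicated.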
For your 100k facts, check carefully how their fields are represented. It's possible that String.intern() can provide considerable savings. Other objects can - since it is all static - be shared as well. Typically, in this sort of scenario, some extra overhead during element construction pays off later, e.g. through less GC overhead.
(Otherwise this is a very nice summary, almost a "howto" for running this scenario. +1)
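The String.intern() saving mentioned above is easy to demonstrate in isolation: equal-but-distinct String objects collapse to a single canonical instance, so 100k facts that share a handful of attribute values end up storing each value only once.

```java
public class InternDemo {
    public static void main(String[] args) {
        // Two distinct String objects with identical contents...
        String a = new String("WAREHOUSE-1");
        String b = new String("WAREHOUSE-1");
        System.out.println(a == b);                    // false: two heap objects
        System.out.println(a.intern() == b.intern());  // true: one canonical copy
    }
}
```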
For what sounds like a similar system, I just use 'synchronized' on the service method. The service being a Spring bean. There aren't loads of users and responses are fast, so queuing is rare and minimal.
Depending on the number of concurrent clients which might be invoking the service and how long each request takes to get a response, you could also create a small pool of services (memory permitting).
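A sketch of such a small pool in plain Java (no pooling library; the class name is made up): the expensive service instances are built once up front, and callers block until one is free, so at most `size` requests run concurrently.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Supplier;

// Fixed pool of pre-built (expensive) service instances.
public class ServicePool<S> {
    private final BlockingQueue<S> idle;

    public ServicePool(int size, Supplier<S> factory) {
        idle = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) {
            idle.add(factory.get());   // pay the initialization cost once, up front
        }
    }

    // Blocks until an instance is free; callers must release() in a finally block.
    public S borrow() {
        try {
            return idle.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for a service", e);
        }
    }

    public void release(S service) {
        idle.add(service);
    }
}
```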
I know this is somewhat old...but I'm posting an answer hoping the information helps somebody in a similar situation using Drools 6.x.
Two things I've learned about Drools over the past few days:
I had been caching the creation of the KnowledgeBase, because I create the DRL objects at runtime - once created, I cache it (using Google Guava). But I've learned that creating a StatefulKnowledgeSession (using newStatefulKnowledgeSession()) is fast (enough) in a single-threaded environment...but once you go multi-threaded (where I create a new StatefulKnowledgeSession per request), you can see that session creation takes longer and longer (as if the new-session creation tasks were being serialized), as confirmed (sort of) in this nabble forum thread
The Knowledge* classes have been deprecated in favor of the newer Kie* classes in version 6.0 (which is annoying, since 99% of examples still use the older classes) ... so KnowledgeBase is replaced by KieContainer and StatefulKnowledgeSession is replaced by KieSession. While optimizing my code, I upgraded from the 5.x classes to the 6.x classes.
In my case (I use Drools 6.x in a REST service), I ended up pooling the Drools sessions (where the instances are reused) using Apache Commons Pooling as suggested in that same nabble thread...I can't use a Singleton because the REST service needs to be fast and I don't want other requests to be potentially blocked if one request takes longer...and so far that seems to work for me.

How to approach JMX Client polling

Recently I dove into the world of JMX, trying to instrument our applications and expose some operations through a custom JMXClient. The work of figuring out how to instrument the classes without having to change much about our existing code is already done. I accomplished this using a DynamicMBean implementation. Specifically, I created a set of annotations which we decorate our classes with. Then, when objects are created (or initialized, if they are used as static classes), we register them with our MBeanServer through a static class that builds a DynamicMBean for the class and registers it. This has worked out beautifully when we just use JConsole or VisualVM. We can execute operations and view the state of fields as we should be able to. My question is more geared toward creating a semi-realtime JMXClient like JConsole.
The biggest problem I'm facing here is how to make the JMXClient report the state of fields in as close to realtime as I can reasonably get, without having to modify the instrumented libraries to push notifications (e.g. in a setter method of some class, set the field, then fire off a JMX notification). We want the classes to be all but entirely unaware they are being instrumented. If you check out JConsole while inspecting an attribute, there is a refresh button at the bottom of the screen that refreshes the attribute values. The value it displays is the value retrieved when the attribute was loaded into the view, and it won't ever change without using the refresh button. I want this to happen on its own.
I have written a small UI which shows some data about connection states, and a few fields on some instrumented classes. In order to make those values reflect the current state, I have a thread which spins in the background. Every second or so the thread attempts to get the current values of the fields I'm interested in, and the UI gets updated as a result. I don't really like this solution very much, as it's tricky to write the logic that updates the underlying models. And even trickier to update the UI in a way that doesn't cause strange bugs (using Swing).
I could also write an additional section of the JMXAgent on our application side, with a single thread that runs through the list of DynamicMBeans that have been registered, determines if the values of their attributes have changed, then pushes notifications. This would move the notification logic out of the instrumented libraries, but still puts more load on the applications :(.
I'm just wondering if any of you have been in this position with JMX, or something else, and can guide me in the right direction for a design methodology for the JMXClient or really any other advice that could make this solution more elegant than the one I have.
Any suggestions you guys have would be appreciated.
If you don't want to change the entities then something is going to have to poll them. Either your JMXAgent or the JMX client is going to have to request the beans every so often. There is no way for you to get around this performance hit although since you are calling a bunch of gets, I don't think it's going to be very expensive. Certainly your JMXAgent would be better than the JMX client polling all of the time. But if the client is polling all of the beans anyway then the cost may be exactly the same.
You would not need to do the polling if the objects could call the agent to say that they have been changed or if they supported some sort of isDirty() method.
In our systems, we have a metrics system that the various components use. Each of the classes increments its own metric, and it is the metrics that are wired into a persister. You can request the metric values using JMX, or persist them to disk or the wire. By using a Metric type, there is separation between the entity doing the counting and the entities that need access to all of the metric values.
By going to a registered-Metric-object model, your GUI could then query the MetricRegistrar for all of the metrics and display them via JMX, HTML, or whatever. So your entities would just do metric.increment() or metric.set(...) and the GUI would query the metric whenever it needed the value.
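A bare-bones sketch of that separation (the class names are mine, not from any particular metrics library): entities only ever touch their own metric, while readers iterate over the registrar.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Entities only ever increment their own Metric; readers (GUI, JMX exporter,
// persister) go through the registrar to enumerate everything.
public class MetricRegistrar {
    private static final Map<String, LongAdder> metrics = new ConcurrentHashMap<>();

    // Each component looks up (or lazily creates) its counter once.
    public static LongAdder metric(String name) {
        return metrics.computeIfAbsent(name, n -> new LongAdder());
    }

    // What a GUI / JMX / HTML exporter would iterate over.
    public static Map<String, LongAdder> all() {
        return metrics;
    }
}
```

Usage on the entity side is then just `MetricRegistrar.metric("requests").increment()`.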
Hope something here helps.
Being efficient here means staying inside the mbean server that contains the beans you're looking at. What you want is a way to convert the mbeans that don't know how to issue notifications into mbeans that do.
For watching numeric and string attributes, you can use the standard mbeans in the monitor package. Instantiate those in the mbean server that contains the beans you actually want to watch, and then set the properties appropriately. You can do this without adding code to the target because the monitor package is standard in the JVM. The monitor beans will watch the objects you select for changes and will emit change notifications only when actual changes are observed. Use setGranularityPeriod to tell the monitor beans how often to look at the target.
Once the monitor beans are in place, just register for the MonitorNotifications that will be created upon change.
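For example (the object names, attribute, and thresholds below are made up), a GaugeMonitor from the standard javax.management.monitor package can watch a numeric attribute of an existing MBean and emit notifications only when it crosses a threshold:

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;
import javax.management.monitor.GaugeMonitor;

public class MonitorSetup {
    // A plain standard MBean with a numeric attribute to watch.
    public interface QueueStatsMBean { int getDepth(); }

    public static class QueueStats implements QueueStatsMBean {
        volatile int depth;
        @Override public int getDepth() { return depth; }
    }

    public static GaugeMonitor watchDepth() throws Exception {
        MBeanServer mbs = ManagementFactory.getPlatformMBeanServer();
        ObjectName target = new ObjectName("demo:type=QueueStats");
        mbs.registerMBean(new QueueStats(), target);

        GaugeMonitor monitor = new GaugeMonitor();
        monitor.addObservedObject(target);
        monitor.setObservedAttribute("Depth");
        monitor.setGranularityPeriod(1000);  // poll the target once a second
        monitor.setThresholds(100, 10);      // notify when crossing high/low
        monitor.setNotifyHigh(true);
        monitor.setNotifyLow(true);
        mbs.registerMBean(monitor, new ObjectName("demo:type=DepthMonitor"));
        monitor.start();
        return monitor;
    }
}
```

A client then subscribes to the monitor's notifications instead of polling every attribute itself.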
Not a solution per se, but you can simplify your polling-to-event-translator JMXAgent implementation using Spring Integration. It has something called a JMX Attribute Polling Channel, which seems to fulfill your need. Example here

Share a JDBC connection pool between many classes

I have a c3p0 pool encapsulated in a class that I use to execute SQL statements.
It's initialized this way:
public PooledQueryExecutor(String url, Properties connectionProperties) throws DALException {
    try {
        dataSource = new ComboPooledDataSource();
        dataSource.setDriverClass(DRIVER);
        dataSource.setJdbcUrl(url);
        dataSource.setProperties(connectionProperties);
    } catch (PropertyVetoException ve) {
        throw new DALException(ve);
    }
}
Then - inside the same class - I use some methods to perform basic tasks:
public CachedRowSet executeSelect(String sql) throws DALException {
    // Get a connection, execute the SQL, return the rows that match,
    // then return resources to the pool
}
The "question" is:
I have a lot of different classes that represent network packets that I receive. Most classes need this PooledQueryExecutor to perform DB operations, but some don't. Do I pass the PooledQueryExecutor to the constructor of the classes that need it (80% of the packets), or do I make the PooledQueryExecutor a singleton? Or maybe "something else"? I also thought of using a ThreadLocal to avoid polluting my constructors, but I don't think that's a good idea, is it?
EDIT: It's not a web application, and no dependency injection framework is currently used.
Thank you for your time!
I assume you are not using any DI framework? If this is the case you have a few choices:
Pass the PooledQueryExecutor to the constructor of the classes that require it. This is actually pretty good from a testing and architecture perspective.
Let the classes requiring JDBC implement a simple interface like:
interface PooledQueryExecutorAware {
    void setPooledQueryExecutor(PooledQueryExecutor executor);
}
Then you can even iterate over classes, discover which of them implement this interface, and inject the PooledQueryExecutor where needed. You are one step from rediscovering DI here, but never mind.
A similar approach would be to create an abstract base class that requires a PooledQueryExecutor as a dependency and has a protected final field holding it.
Let every class be aware of the PooledQueryExecutor - not recommended, unnecessary coupling.
A singleton is the worst thing you can do: untestable and hard-to-understand code. Please, don't.
ThreadLocal? Forget about it. Remember, explicitness is king.
Also have a look at JdbcTemplate. It is part of Spring, but you can include just spring-jdbc.jar and a few others without using the whole framework. I think it can easily replace your PooledQueryExecutor.
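For what it's worth, the constructor-injection option from the first bullet can look like this (PooledQueryExecutor stubbed out here; the packet class names are made up):

```java
// Stub standing in for the real PooledQueryExecutor from the question.
class PooledQueryExecutor { }

// The ~80% of packet classes that need the DB take it via the constructor;
// the dependency is explicit and easy to replace with a mock in tests.
class LoginPacket {
    private final PooledQueryExecutor db;

    LoginPacket(PooledQueryExecutor db) {
        this.db = db;
    }

    boolean usesDatabase() {
        return db != null;  // the real class would call db.executeSelect(...)
    }
}

// The ~20% that don't need it simply never see it.
class PingPacket {
    boolean usesDatabase() {
        return false;
    }
}
```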
I prefer setting (injecting) the pool. That way you can decide if some instances should get different pools, you can wrap the pool(s) with loggers, etc., and it is clear which classes need the pool.
I also think this makes the code easier to test.
A common way to do this in e.g. web servers is to have JNDI up and running and then get datasources from JNDI. Simple JNDI implementations exist for standalone applications.
If you use dependency injection in your program, then this is an excellent example of a resource that can be injected where needed. This means either a Java EE 6 or a Spring/Guice/CDI enabled application.
The objects represent network packets? What are they using database connections for? This doesn't feel right to me. Would you like to talk a bit more about what you're trying to do and what your architecture is?
I would lean towards making the packet objects a pure model of the domain - they can hold state, and they should define behaviour, but they shouldn't be in the business of grubbing around in databases. The database access should be in a different part of the system, a storage layer or subsystem. It's quite possible that key objects in that layer will be singletons (or, at least, that they will have a single instance, even if they're not formal singletons), and they will be able to quite naturally contain a single global connection pool.
There is then the question of how the domain objects interact with the storage subsystem. Indeed, you have essentially the same question there as you asked about the connection pools. In general, I look to have some sort of controller objects orchestrating the program's work; they are well placed to talk to objects in different layers, and to introduce them to one another.
Concrete examples would make this explanation easier. Tell us about your packets!

Servlet 3 spec and ThreadLocal

As far as I know, the Servlet 3 spec introduces an asynchronous processing feature. Among other things, this means that the same thread can and will be reused for processing other, concurrent HTTP requests. This isn't revolutionary, at least for people who have worked with NIO before.
Anyway, this leads to another important thing: no ThreadLocal variables as temporary storage for request data. Because if the same thread suddenly becomes the carrier thread for a different HTTP request, request-local data will be exposed to another request.
All of this is pure speculation on my part, based on reading articles - I haven't had time to play with any Servlet 3 implementations (Tomcat 7, GlassFish 3.0.X, etc.).
So, the questions:
Am I correct to assume that ThreadLocal will cease to be a convenient hack to keep the request data?
Has anybody played with any of Servlet 3 implementations and tried using ThreadLocals to prove the above?
Apart from storing data inside HTTP Session, are there any other similar easy-to-reach hacks you could possibly advise?
EDIT: don't get me wrong. I completely understand the dangers and ThreadLocal being a hack. In fact, I always advise against using it in similar context. However, believe it or not, thread context has been used far more frequently than you probably imagine. A good example would be Spring's OpenSessionInViewFilter which, according to its Javadoc:
This filter makes Hibernate Sessions available via the current thread, which will be autodetected by transaction managers.
This isn't strictly ThreadLocal (I haven't checked the source), but it already sounds alarming. I can think of more similar scenarios, and the abundance of web frameworks makes this much more likely.
Briefly speaking, many people have built their sand castles on top of this hack, with or without awareness. Therefore Stephen's answer is understandable but not quite what I'm after. I would like to get confirmation whether anyone has actually tried this and was able to reproduce the failing behaviour, so this question could serve as a reference point for others trapped by the same problem.
Async processing shouldn't bother you unless you explicitly ask for it.
For example, a request can't be made async if the servlet, or any of the filters in the request's filter chain, is not marked with <async-supported>true</async-supported>. Therefore, you can still use regular practices for regular requests.
Of course, if you actually need async processing, you need to use appropriate practices. Basically, when a request is processed asynchronously, its processing is broken into parts. These parts don't share thread-local state; however, you can still use thread-local state inside each of those parts, though you have to manage the state manually between them.
(Caveat: I've not read the Servlet 3 spec in detail, so I cannot say for sure that the spec says what you think it does. I'm just assuming that it does ...)
Am I correct to assume that ThreadLocal will cease to be a convenient hack to keep the request data?
Using ThreadLocal was always a poor approach, because you always ran the risk that information would leak when a worker thread finished one request and started on another one. Storing stuff as attributes in the ServletRequest object was always a better idea.
Now you've simply got another reason to do it the "right" way.
Has anybody played with any of Servlet 3 implementations and tried using ThreadLocals to prove the above?
That's not the right approach. It only tells you about the particular behaviour of a particular implementation under the particular circumstances of your test. You cannot generalize.
The correct approach is to assume that it will sometimes happen if the spec says it can ... and design your webapp to take account of it.
(Fear not! Apparently, in this case, this does not happen by default. Your webapp has to explicitly enable the async processing feature. If your code is infested with thread locals, you would be advised not to do this ...)
Apart from storing data inside HTTP Session, are there any other similar easy-to-reach hacks you could possibly advise.
Nope. The only right answer is storing request-specific data in the ServletRequest or ServletResponse object. Even storing it in the HTTP Session can be wrong, since there can be multiple requests active at the same time for a given session.
NOTE: Hacks follow. Use with caution, or really just don't use.
So long as you continue to understand which thread your code is executing in, there's no reason you can't use a ThreadLocal safely.
try {
    tl.set(value);
    doStuffUsingThreadLocal();
} finally {
    tl.remove();
}
It's not as if your call stack is switched out randomly. Heck, if there are ThreadLocal values you want to set deep in the call stack and then use further out, you can hack that too:
public class Nasty {
    static ThreadLocal<Set<ThreadLocal<?>>> cleanMe =
        new ThreadLocal<Set<ThreadLocal<?>>>() {
            protected Set<ThreadLocal<?>> initialValue() {
                return new HashSet<ThreadLocal<?>>();
            }
        };

    static void register(ThreadLocal<?> toClean) {
        cleanMe.get().add(toClean);
    }

    static void cleanup() {
        for (ThreadLocal<?> tl : cleanMe.get())
            tl.remove();
        cleanMe.remove();  // drop the registry itself as well
    }
}
Then you register your ThreadLocals as you set them, and clean up in a finally clause somewhere. This is all shameful wankery that you probably shouldn't do. I'm sorry I wrote it, but it's too late :/
I'm still wondering why people use the rotten javax.servlet API to actually implement their servlets. What I do:
I have a base class HttpRequestHandler which has private fields for request and response, a handle() method that can throw Exception, plus some utility methods to get/set parameters, attributes, etc. I rarely need more than 5-10% of the servlet API, so this isn't as much work as it sounds.
In the servlet handler, I create an instance of this class and then forget about the servlet API.
I can extend this handler class and add all the fields and data that I need for the job. No huge parameter lists, no thread local hacking, no worries about concurrency.
I have a utility class for unit tests that creates a HttpRequestHandler with mock implementations of request and response. This way, I don't need a servlet environment to test my code.
This solves all my problems because I can get the DB session and other things in the init() method or I can insert a factory between the servlet and the real handler to do more complex things.
You are psychic! (+1 for that)
My aim is ... to get proof that this has stopped working in a Servlet 3.0 container
Here is the proof that you were asking for.
Incidentally, it is using the exact same OEMIV filter that you mentioned in your question and, guess what, it breaks async servlet processing!
Edit: Here is another proof.
One solution is to not use ThreadLocal but rather a singleton that contains a static array of the objects you want to make global. Each stored object would contain a "threadName" field. In doGet/doPost you first set the current thread's name to some random unique value (like a UUID), then store it as part of the object that holds the data you want kept in the singleton. Whenever some part of your code needs the data, it simply goes through the array, finds the object whose threadName matches the currently running thread, and retrieves it. You'll need to add some cleanup code to remove the object from the array when the HTTP request completes.
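A sketch of that idea, using a map keyed by a per-request UUID instead of scanning an array (all names are made up; the close() call belongs in a finally block at the end of the request):

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Request-scoped storage keyed by an explicit per-request ID
// instead of by the current thread.
public class RequestContexts {
    private static final Map<String, Map<String, Object>> contexts = new ConcurrentHashMap<>();

    // Call at the top of doGet/doPost; pass the returned ID along explicitly.
    public static String open() {
        String id = UUID.randomUUID().toString();
        contexts.put(id, new ConcurrentHashMap<>());
        return id;
    }

    public static Map<String, Object> get(String id) {
        return contexts.get(id);
    }

    // Call in a finally block when the request completes, or entries leak.
    public static void close(String id) {
        contexts.remove(id);
    }
}
```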

How to expose an EJB as a webservice that will later let me keep client compatibility when ejb changes?

Lots of frameworks let me expose an ejb as a webservice.
But then, 2 months after publishing the initial service, I need to change the EJB or some part of its interface. I still have clients that need to access the old interface, so I obviously need to have 2 webservices with different signatures.
Anyone have any suggestions on how I can do this, preferably letting the framework do the grunt work of creating wrappers and copying logic (unless there's an even smarter way).
I can choose webservice framework on basis of this, so suggestions are welcome.
Edit: I know my change is going to break compatibility, and I am fully aware that I will need two services with different namespaces at the same time. But how can I do it in a simple manner?
I don't think you need any additional frameworks to do this. Java EE lets you directly expose the EJB as a web service (since EJB 2.1; see the example for J2EE 1.4), but with EE 5 it's even simpler:
@WebService
@SOAPBinding(style = Style.RPC)
public interface ILegacyService extends IOtherLegacyService {
    // the interface methods
    ...
}

@Stateless
@Local(ILegacyService.class)
@WebService(endpointInterface = "...ILegacyService", ...)
public class LegacyServiceImpl implements ILegacyService {
    // implementation of ILegacyService
}
Depending on your application server, you should be able to provide ILegacyService at any location that fits. As jezell said, you should try to put changes that do not change the contract directly into this interface. If you have additional changes, you may just provide another implementation with a different interface. Common logic can be pulled up into a superclass of LegacyServiceImpl.
I'm not an EJB guy, but I can tell you how this is generally handled in the web service world. If you have a non-breaking change to the contract (for instance, adding an optional property), then you can simply update the contract and consumers should be fine.
If you have a breaking change to a contract, then the way to handle it is to create a new service with a new namespace for its types. For instance, if your first service had a namespace of:
http://myservice.com/2006
Your new one might have:
http://myservice.com/2009
Expose this contract to new consumers.
How you handle the old contract is up to you. You might direct all requests to an old server and let clients choose when to upgrade to the new servers. If you can use some amount of logic to upgrade the requests to the format that the new service expects, then you can rip out the old service's logic and replace it with calls to the new one. Or you might just deprecate it altogether and fail all calls to the old service.
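The "upgrade the requests" option can be a thin adapter in front of the new service. A sketch with made-up message classes for the two contract versions:

```java
// Hypothetical message classes for the two contract namespaces.
class OldRequest {
    String customerId;
}

class NewRequest {
    String customerId;
    String channel;  // field added in the new contract
}

class RequestUpgrader {
    // Map a request from the old contract onto the new one,
    // filling fields the old contract didn't have with defaults.
    NewRequest upgrade(OldRequest old) {
        NewRequest req = new NewRequest();
        req.customerId = old.customerId;
        req.channel = "legacy";  // default for callers of the old contract
        return req;
    }
}
```

The old endpoint then just upgrades and delegates, so all business logic lives behind the new contract only.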
PS: This is much easier to handle if you create message class objects rather than reusing domain entities.
OK, here goes:
It seems like dozer.sourceforge.net is an acceptable starting point for doing the grunt work of copying data between two parallel structures. I suppose a lot of web frameworks can generate client proxies that can be re-used in a server context to maintain compatibility.
