Apache Thrift reusing connection in different thread - java

I'm creating a program in Java in which a client can send read/write operations to a node, which forwards the read/write request to a supernode. The supernode uses a Thrift THsHaServer.
There are multiple nodes, so the supernode should be able to handle concurrent operations. Writing is no problem, but I'm having a problem with reading.
Each time a node connects to the server, a thread will take its read/write request and put it in a queue (LinkedBlockingQueue since this is a critical section that needs to be locked).
The server has a separate pool of worker threads that will process each request in the queue concurrently.
My problem is that after I get a specific file, I need to pass it back to the connecting node (using the same Thrift connection). However, I'm not sure how to do that since the requests are handled by separate worker threads. Here's the order of steps.
// node calls this supernode method (via Thrift RPC)
Request connect(Request req) {
    queue.put(req);
    return req;
}
// Inside the Worker Thread class (which is inside the supernode)
public void run() {
    try {
        while (true) {
            Request req = queue.take();
            processRequest(req);
        }
    } catch (InterruptedException ie) {
        // just terminate
    }
}
Basically, I'm trying to figure out how I can send something back over the same Thrift socket from inside processRequest.
Any help would be much appreciated!

My current solution is to use a separate 'completion queue' to which the workers add completed requests; the incoming call then has to poll that queue, which of course is not efficient. I assume there is some way to make the main thread wait while the worker thread is processing and then signal it to continue, but then I'm not sure how to pass back the computed result (i.e. the contents of the retrieved file).
I would first ask myself why I need two threads where one seems to be enough.
Aside from that, my approach would be something like this (rough sketch):
The request data, the response data and a waitable event object are all wrapped together in a "work package". By waitable event object I mean whatever the Java equivalent of that is, technically.
That work package object is shared between threads. It is put into the queue to be processed by the worker, and also the calling thread keeps holding a reference to it.
The worker grabs the request data from the work package object, processes it and attaches the resulting data to that work package object.
Once the worker is done, it signals the event object mentioned above.
The calling thread waits for that event to happen. Since we use an event object, this works completely without polling. Once the event is signaled, the thread pulls the result data from the completed work package and returns it to the client.
Of course additional code may be added to cover all edge cases, such as necessary timeouts etc.
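That sketch can be expressed in Java using a CompletableFuture as the waitable event object. Names like WorkPackage and the string payloads below are illustrative assumptions, not part of the original code:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;

// Hypothetical work package: request data plus a CompletableFuture
// that plays the role of the waitable event object.
class WorkPackage {
    final String requestData;
    final CompletableFuture<String> result = new CompletableFuture<>();

    WorkPackage(String requestData) {
        this.requestData = requestData;
    }
}

public class WorkQueueDemo {
    static final BlockingQueue<WorkPackage> queue = new ArrayBlockingQueue<>(16);

    public static void main(String[] args) throws Exception {
        // Worker thread: takes packages, processes them, then signals
        // completion by completing the future.
        Thread worker = new Thread(() -> {
            try {
                while (true) {
                    WorkPackage wp = queue.take();
                    wp.result.complete("processed:" + wp.requestData);
                }
            } catch (InterruptedException ie) {
                // just terminate
            }
        });
        worker.setDaemon(true);
        worker.start();

        // Calling thread (e.g. the Thrift handler): enqueue the package,
        // keep a reference to it, and block until the worker signals.
        WorkPackage wp = new WorkPackage("file.txt");
        queue.put(wp);
        String response = wp.result.get(); // waits without polling
        System.out.println(response);      // prints "processed:file.txt"
    }
}
```

Because the calling thread blocks on `get()`, the Thrift handler can return the result over the same connection once the worker is done.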

Related

Returning synchronous message from service, but then doing asynchronous processing - concern about hanging threads?

Essentially I've written a service in Java that will do initial synchronous processing (a couple simple calls to other web services). Then, after that processing is done, I return an acknowledgement message to the caller, saying I've verified their request and there is now downstream processing happening in the background asynchronously.
In a nutshell, what I'm concerned about is the complexity of the async processing. The sum of those async calls can take up to 2-3 minutes depending on certain parameters sent. My thought here is: what if there's a lot of traffic at once hitting my service, and there are a bunch of hanging threads in the background, doing a large chunk of processing. Will there be bad data as a result? (like one request getting mixed in with a previous request etc)
The code follows this structure:
Validation of headers and params in body
Synchronous processing
Return acknowledgement message to the caller
Asynchronous processing
For #4, I've simply made a new thread and call a method that does all the async processing within it. Like:
new Thread()
{
    @Override
    public void run()
    {
        try {
            makeDownstreamCalls(arg1, arg2, arg3, arg4);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}.start();
I'm basically wondering about unintended consequences of lots of traffic hitting my service. An example I'm thinking about: a thread executing downstream calls for request A, and then another request comes in, and a new thread has to be made to execute downstream calls for request B. How is request B handled in this situation, and what happens to request A, which is still in-progress? Will the async calls in request A just terminate in this case? Or can each distinct request, and thread, execute in parallel just fine and complete, without any strange consequences?
Well, the answer depends on your code, of which you posted a small part, so my answer contains some guesswork. I'll assume that we're talking about some sort of multi-threaded server which accepts client requests, and that those request come to some handleRequest() method which performs the 4 steps you've mentioned. I'll also assume that the requests aren't related in any way and don't affect each other (so for instance, the code doesn't do something like "if a thread already exists from a previous request then don't create a new thread" or anything like that).
If that's the case, then your handleRequest() method can be simultaneously invoked by different server threads concurrently. And each will execute the four steps you've outlined. If two requests happen simultaneously, then a server thread will execute your handler for request A, and a different one will execute it for B at the same time. If during the processing of a request, a new thread is created, then one will be created for A, another for B. That way, you'll end up with two threads performing makeDownstreamCalls(), one with A's parameters one with B's.
In practice, that's probably a pretty bad idea. The more threads your program will create, the more context-switching the OS has to do. You really don't want the number of requests to increase the number of threads in your application endlessly. Modern OSes are capable of handling hundreds or even thousands of threads (as long as they're bound by IO, not CPU), but it comes at a cost. You might want to consider using a Java executor with a limited number of threads to avoid crushing your process or even OS.
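A minimal sketch of that suggestion, replacing raw `new Thread()` with a fixed-size pool. The pool size of 8 and the `makeDownstreamCalls` stand-in are assumptions for illustration:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class BoundedAsyncDemo {
    // A fixed pool caps the number of concurrent downstream-call threads,
    // no matter how many requests arrive.
    private static final ExecutorService pool = Executors.newFixedThreadPool(8);

    // Hypothetical stand-in for the real makeDownstreamCalls(...)
    static void makeDownstreamCalls(String requestId) {
        System.out.println("processing " + requestId);
    }

    public static void main(String[] args) throws Exception {
        // Each request submits its own task; the arguments are captured
        // per task, so request A and request B cannot see each other's data.
        pool.submit(() -> makeDownstreamCalls("A"));
        pool.submit(() -> makeDownstreamCalls("B"));

        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
    }
}
```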
If there's too much load, you can't expect your application to handle all of it. Process what you can within the limits of the application and reject further requests; deliberately rejecting requests you cannot handle is known as "load shedding". Accepting more requests when you're already fully loaded means that your application crashes and none of the requests get processed.

running two threads simultaneously inside a stateless agent and guaranting communication between them

I am developing a stateless agent in Java that takes information from one server and transfers it to a client; that is, the agent sits between a client and a server. So I am thinking of running two threads simultaneously on the agent: one thread (thread1) runs a ServerSocket and gets requests from the client, while another thread (thread2) communicates with the server. The problem is synchronizing the two threads. I am thinking of having thread1 continuously ask thread2 for new information; if thread2 has nothing new, it will not answer. What is the best way to synchronize them? Should I use a global variable (a flag)? Can I save information when I have a stateless agent?
I think you should modify your app into async model.
Your app needs:
- an entry point to accept incoming connections -> a good example is an async servlet (or one dedicated thread).
- a ThreadPoolExecutor that provides a fixed number of workers and a blocking queue (use the constructor that takes a BlockingQueue).
The workflow:
Accept the incoming request.
Wrap the incoming request into a (Runnable) task.
Put the task into the blocking queue.
If the ThreadPoolExecutor has a free worker, it starts processing the task.
An advantage of such a model is that each request is handled by exactly one thread, so there is no need to manually synchronize anything.
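A rough sketch of that setup; the pool size, queue capacity, and rejection policy are illustrative choices, not requirements:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class AgentPoolDemo {
    public static void main(String[] args) throws Exception {
        // A fixed number of workers plus a bounded blocking queue, as the
        // answer describes; AbortPolicy rejects work when both are full.
        ThreadPoolExecutor executor = new ThreadPoolExecutor(
                4, 4,                       // core and max pool size
                0L, TimeUnit.MILLISECONDS,  // keep-alive for idle threads
                new ArrayBlockingQueue<>(100),
                new ThreadPoolExecutor.AbortPolicy());

        // Wrap an incoming request into a Runnable task and enqueue it;
        // a free worker picks it up and handles it on one thread.
        executor.execute(() -> System.out.println("handling request"));

        executor.shutdown();
        executor.awaitTermination(5, TimeUnit.SECONDS);
    }
}
```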

How to get pass a process ID between threads in Java?

I'm building a paired client/server application for a research project. The server-side application is a java binary that has a main loop and a ServerSocket object. When a user with the client-side application rings up the server, I instantiate a new object of type ClientSession, and give it the socket over which communications will occur. Then I run it as a new thread so that the central thread of the server application can go back to waiting for calls.
Each ClientSession thread then does some processing based on requests from the client program. It might accept a string and throw back a small SQLite .db file in response, or accept a different string and send back a Java serialized object containing a list of files and file sizes. In all cases the ClientSession thread is short-lived, and closes when the socket is closed.
And so far this is working fine, but today I have a new challenge to run, of all things, a perl script as per a client's request. The script basically wraps some low-level Unix OS functions that are useful to my problem domain. If a client makes a request to start up a copy of this script, I can't just launch it in the ClientSession thread. The perl script needs to persist after the lifetime of a call (perhaps for days or weeks running in a loop with a sleep timer).
What I'd like to do is set up a separate thread on my server application and have it wrap this perl script.
Runtime.getRuntime().exec("perl script.pl");
This will start it up, but right now it'll die with the ClientSession thread.
My thought would be something like declare a PerlThread object, require it to know the UserID of who's requesting the new thread as part of its constructor, have it implement the Runnable interface, then just do a thread.start() on a new instance of it. But before I can get there, I have to do some message passing between a child thread and the main thread of my server application. So that setup brings me to my question:
How do I pass a message back to my main server program from a child thread and tell it that a Client requested a perl wrapper thread to start?
You should look at ExecutorService for a better technique for running jobs on separate threads. It's much more powerful and convenient than Thread.start() or Runnable.run().
You can look at the Callable construct to see how to get results back from jobs which run on other threads.
You could pass the information you want to get back from the task when it completes into the constructor for your domain specific callable, and when it completes, you can get it back.
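For instance, a sketch of such a domain-specific Callable; the StartScriptJob name and its payload are hypothetical, made up to keep the example self-contained:

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class CallableDemo {
    // Hypothetical domain-specific callable: it carries the user id in
    // through its constructor and returns a result when the job completes.
    static class StartScriptJob implements Callable<String> {
        private final String userId;

        StartScriptJob(String userId) {
            this.userId = userId;
        }

        @Override
        public String call() {
            // ... start the long-lived process here ...
            return "started for " + userId;
        }
    }

    public static void main(String[] args) throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor();

        // submit() returns a Future; get() blocks until the job is done
        // and hands back the result from the other thread.
        Future<String> future = executor.submit(new StartScriptJob("alice"));
        System.out.println(future.get()); // prints "started for alice"

        executor.shutdown();
    }
}
```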
How do I pass a message back to my main server program from a child thread and tell it that a Client requested a perl wrapper thread to start?
I would store an object somewhere that wraps the UserId as well as the associated Process object. You could store it in a static volatile field if there is just one of them. Something like:
public class ClientHandler {
    ...
    private static volatile UserProcess userProcess;
    ...
    // client handler
    if (userProcess != null) {
        // return an error: the perl process is already started
    } else {
        // start the process
        Process process = Runtime.getRuntime().exec("perl script.pl");
        userProcess = new UserProcess(userId, process);
    }
Then the main program can find out if any clients have the perl script running by looking at ClientHandler.userProcess.
If you can have multiple processes running on the client then I'd wrap some sort of ProcessManager class around a collection of these UserProcess objects, with methods like startProcess(...), listProcesses(), etc. Then the main class would ask the ProcessManager what processes are running, which users started which processes, and so on.
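A possible shape for that ProcessManager; the injected process factory and method names are assumptions made so the sketch stays self-contained and testable:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical sketch of the ProcessManager idea: a thread-safe registry
// of UserProcess entries, keyed by user id, that the main program can query.
public class ProcessManager {
    // Simple wrapper around the user id and the running Process.
    public static class UserProcess {
        final String userId;
        final Process process;

        UserProcess(String userId, Process process) {
            this.userId = userId;
            this.process = process;
        }
    }

    private final Map<String, UserProcess> processes = new ConcurrentHashMap<>();

    // Start the script for a user unless one is already registered. The
    // factory (e.g. () -> Runtime.getRuntime().exec(...)) is injected so
    // the manager itself stays easy to test. Note the check-then-act is
    // only best-effort; putIfAbsent makes the registration itself atomic.
    public boolean startProcess(String userId, Supplier<Process> factory) {
        if (processes.containsKey(userId)) {
            return false; // already running for this user
        }
        UserProcess created = new UserProcess(userId, factory.get());
        return processes.putIfAbsent(userId, created) == null;
    }

    public Iterable<String> listProcesses() {
        return processes.keySet();
    }

    public static void main(String[] args) {
        ProcessManager mgr = new ProcessManager();
        // A null "process" stands in for the real exec() call in this demo.
        System.out.println(mgr.startProcess("alice", () -> null)); // true
        System.out.println(mgr.startProcess("alice", () -> null)); // false
    }
}
```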
The UserProcess class is just a wrapper around the UserId and Process fields.

Java Servlet - Observer Pattern causing null Response object

I have a Java HttpServlet. This servlet contains a set of objects that make use of the observer pattern in order to return data through the servlet's Response object. Here is a simplified version of my doGet() method in the HttpServlet:
protected void doGet(final HttpServletRequest request, final HttpServletResponse response) {
    MyProcess process = new MyProcess();
    // The following method spawns a few threads, so I use a listener to receive a completion event.
    process.performAsynchronousMethod(request, new MyListener() {
        public void processComplete(Data data) {
            response.getWriter().print(data.toString());
        }
    });
}
As the example shows, I have a process that I execute, which spawns a variety of threads in order to produce a final dataset. This process can take anywhere from seconds to a minute. My problem is, it appears that as the doGet() method completes, the response object becomes null. When processComplete() is called, the response object will be null - thus preventing me from writing any data out.
It appears as if the servlet is closing the connection as soon as the asynchronous method is called.
Is there a better way to implement this type of servlet when using the observer pattern for asynchronous tasks? Should I do this in another way?
The servlet response will be sent back to the client when the doGet method terminates, it won't wait for your asynchronous call to finish as well. You will need to find a way to block until all your asynchronous tasks have completed, and only then allow the doGet() method to return.
Something else to watch out for is that you have no guarantee that the threads will write to the response writer in series, you may find that the various print operations overlap and the output will be garbled (this may not matter to you, depending on what the data is, and how it will be used)
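One way to block is a CountDownLatch; here performAsynchronousMethod is a simplified stand-in for the real MyProcess API, so treat this as a sketch of the pattern rather than the actual servlet code:

```java
import java.util.concurrent.CountDownLatch;

public class BlockingListenerDemo {
    interface MyListener {
        void processComplete(String data);
    }

    // Hypothetical stand-in for process.performAsynchronousMethod(...):
    // it spawns a thread and calls the listener when the work is done.
    static void performAsynchronousMethod(String request, MyListener listener) {
        new Thread(() -> listener.processComplete("result for " + request)).start();
    }

    public static void main(String[] args) throws InterruptedException {
        // In the servlet, doGet would block on this latch so the response
        // stays open until processComplete has written its data.
        CountDownLatch done = new CountDownLatch(1);
        StringBuilder body = new StringBuilder();

        performAsynchronousMethod("req-1", data -> {
            body.append(data); // write to the response here
            done.countDown();  // then release doGet
        });

        done.await();             // doGet returns only after this
        System.out.println(body); // prints "result for req-1"
    }
}
```

The `await()`/`countDown()` pair also gives the happens-before guarantee that the listener's write to the response is visible to the request thread.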
You could try asynchronous servlets, available since spec version 3.0; not all web servers support them, only fairly modern ones. But it means the server will hold the socket connection open for the whole duration. So you should know how many clients could be connected simultaneously; not every hardware/operating-system setup can handle a large number of open connections.
And the web client will wait, and could time out. You should also consider the situation where the socket connection is dropped and the client never gets the result (e.g. some proxy servers break long-running connections), so you should allow a "resume" operation.

Java: Communicating data from one thread to another

I am working on creating a chat client based on UDP. The main structure is that there is a server where clients register, and where a client can also request to form a connection with another client that is registered with the server. The clients are structured as follows, using pseudocode:
public UDPClient() {
// Create datagram socket
// Execute RECEIVE thread using datagram socket above
// Execute SEND thread using datagram socket above
}
The idea is to have the send and receive executing on separate threads so I don't get blocked I/O on the receive. Both of these threads have loops within their run methods that allow you to continually send and receive messages. The problem I have is this. If a message comes in on the RECEIVE thread that changes how my SEND should be executing, how do I communicate this to the SEND thread? Do I have to shoot a datagram off to myself or can I communicate this in the code somehow?
Assuming both threads have no reference to each other, create a third singleton class, which both the read and send threads (classes) reference, that has a volatile member field to store the state data you want shared, with synchronized access.
The volatile keyword, combined with synchronized access, guarantees that a change made to the field by one thread will be seen by another thread. Without this, changes may not be visible, due to the Java memory model specification.
Edited:
Following "separation of concerns" design guideline, it would be better to not have the read/send threads know about each other and to use a third class to orchestrate their activities/behaviour. Add methods to your read/send classes to stop(), start() etc and call these from the other class.
Using a separate class would also allow:
Behaviour control by other means, for example a "stop sending" button on an admin web page
Allowing multiple threads of each type, yet still having proper control through a central point, perhaps using a pool of such threads (without a separate class you would have a many-to-many nightmare, and lots of code that has nothing to do with the job at hand, i.e. sending and receiving)
Easier testing of your worker classes, because they do less and are more focused
porting/embedding them stand-alone for other uses
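A minimal sketch of such a shared-state class; the sendEnabled flag is an invented example of "how my SEND should be executing", and the threads here only simulate the RECEIVE/SEND pair:

```java
// Hypothetical shared-state object: both the RECEIVE and SEND threads
// hold a reference to this one instance.
public class SharedState {
    // volatile: a write by the RECEIVE thread is guaranteed to be
    // visible to the SEND thread.
    private volatile boolean sendEnabled = true;

    public void setSendEnabled(boolean enabled) {
        sendEnabled = enabled;
    }

    public boolean isSendEnabled() {
        return sendEnabled;
    }

    public static void main(String[] args) throws InterruptedException {
        SharedState state = new SharedState();

        // RECEIVE thread: a message arrives that should pause sending.
        Thread receive = new Thread(() -> state.setSendEnabled(false));
        receive.start();
        receive.join();

        // The SEND thread would check this flag before each send(packet).
        System.out.println(state.isSendEnabled()); // prints "false"
    }
}
```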
Your SEND thread should have a public (accessible) method, synchronized if possible, that you can call from your RECEIVE thread. You could use this method to set a boolean flag, a String message, etc., which the SEND thread should always read before it calls .send(yourPacket);.
Alternatively, keep a member variable that your RECEIVE thread writes to, and change the SEND thread's behaviour based on that variable.
