How to make a Thrift client for multiple threads?

I have a working Thrift client in the below snippet.
TTransport transport = new THttpClient("http://localhost:8080/api/");
TProtocol protocol = new TBinaryProtocol(transport);
TMultiplexedProtocol mp = new TMultiplexedProtocol(protocol, "UserService");
UserService.Client userServiceClient = new UserService.Client(mp);
System.out.println(userServiceClient.getUserById(100));
When I run the client in a multi-threaded environment:
threads[i] = new Thread(new Runnable() {
@Override
public void run() {
System.out.println(userServiceClient.getUserById(someId));
}
});
I get an exception: out of sequence response
org.apache.thrift.TApplicationException: getUserById failed: out of sequence response
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:76)
I guess the reason is that the Thrift-generated client is not thread-safe.
But if I want multiple clients to call the same method getUserById() simultaneously, how can I do that?

Thrift clients are not designed to be shared across threads. If you need multiple client threads, set up one Thrift client per thread.
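One possible way to get "one client per thread" without passing clients around by hand is a ThreadLocal. The sketch below is only an illustration: FakeUserClient is a hypothetical stand-in for the generated UserService.Client, and in real code the supplier would have to open a separate transport and protocol for each thread.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class PerThreadClientDemo {

    // Hypothetical stand-in for the generated, non-thread-safe UserService.Client.
    static final class FakeUserClient {
        String getUserById(int id) {
            return "user-" + id;
        }
    }

    // Each thread that calls CLIENT.get() receives its own lazily created instance.
    // In real code the supplier would build a fresh transport, protocol and client.
    static final ThreadLocal<FakeUserClient> CLIENT =
            ThreadLocal.withInitial(FakeUserClient::new);

    // Starts the given number of threads and returns how many distinct client
    // instances they ended up using.
    static int distinctClientsUsed(int threadCount) {
        Set<FakeUserClient> seen = ConcurrentHashMap.newKeySet();
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            final int someId = 100 + i;
            threads[i] = new Thread(() -> {
                FakeUserClient client = CLIENT.get();
                seen.add(client);
                client.getUserById(someId);
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println("distinct clients: " + distinctClientsUsed(4));
    }
}
```

With a thread pool the same idea applies: each pool thread keeps its own client for as long as the thread lives, so the number of open connections equals the pool size.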
But if I want multiple clients to call the same method getUserById() simultaneously, how can I do that?
We don't know much about the context, so I have to guess a bit. If the issue is that there are a lot of such calls coming in at a time, a possible solution could be to group calls to save roundtrip time:
service wtf {
list<string> getUsersById( 1 : list<i32> userIds)
}
That's just a rough idea. Maybe you want to return list<user_data_struct> instead. For practical reasons I would also recommend wrapping the returned list in a struct, so the whole thing becomes extensible.

Related

RMI Service run similar to sockets

So if I have a socket server, I can accept each socket and pass it to an executor:
while(true){
Socket conn = socketServ.accept();
Runnable task = new Runnable() {
@Override
public void run() {
try{
server.executor(conn);
} catch(IOException e){
}
}
};
exec1.execute(task);
}
Doing this lets my server handle each connection on my own threads without blocking the accepting thread. Because I also have a reference to that socket, called "conn", I can successfully return messages as well.
Now I have an RMI interface, which basically lets me call methods back and forth.
for example if I had this method:
public MusicServerResponseImpl CreatePlayerlist(String Name, UserObjectImpl uo) throws RemoteException {
MusicServerResponseImpl res = new MusicServerResponseImpl();
return res;
}
This returns a serializable object. My concern is that when this method gets called, I think it is going to be called on the main thread of the server, and thus will block that thread and reduce parallelism.
What I think is the solution is to have every RMI method also create a task for an executor, to speed up execution. The issue I see, however, is that unlike the socket case, where I have an object to send information back to, I am unsure how I would return a response from the RMI method without somehow having to block the thread.
Does that make sense? Basically, I am asking how I can execute RMI methods in parallel while still being able to return results!
Thanks for the help!

Does that make sense?
No. Concurrent calls are natively supported.
See this documentation page and look for the property named maxConnectionThreads.
You could also have tested your assumptions by, for example, printing the current thread name in your server code, and trying to execute concurrent calls and see what happens.

Convert a for loop to a Multi-threaded chunk

I have the following for loop in a function that I intend to parallelize, but I am not sure whether the overhead of multiple threads will outweigh the benefit of concurrency.
All I need is to send different log files to their corresponding receivers. For the time being, let's say the number of receivers won't be more than 10. Instead of sending the log files back to back, is it more efficient to send them all in parallel?
for(int i=0; i < receiversList.size(); i++)
{
String receiverURL = serverURL + receiversList.get(i);
HttpPost method = new HttpPost(receiverURL);
String logPath = logFilesPath + logFilesList.get(i);
messagesList = readMsg(logPath);
for (String message : messagesList) {
StringEntity entity = new StringEntity(message);
log.info("Sending message:");
log.info(message + "\n");
method.setEntity(entity);
if (receiverURL.startsWith("https")) {
processAuthentication(method, username, password);
}
httpClient.execute(method).getEntity().getContent().close();
}
Thread.sleep(500); // Waiting time for the message to be sent
}
Also, please tell me how I can make it parallel if that is going to work. Should I do it manually or use ExecutorService?

All I need is to send different log files to corresponding receivers. For the time being lets say number of receivers won't be more than 10. Instead of sending log files back to back, is it more efficient if I send them all parallel?
There are a lot of questions to be asked before we can determine if doing this in parallel will buy you anything. You mentioned "receivers" but are you really talking about different receiving servers on different web addresses or are all threads sending their log files to the same server? If it is the latter then chances are you will get very little improvement in speed with concurrency. A single thread should be able to fill the network pipeline just fine.
Also, you probably would get no speed up if the messages are small. Only large messages would take any time and give you any true savings if they were sent in parallel.
I'm most familiar with the ExecutorService classes. You could do something like:
ExecutorService threadPool = Executors.newFixedThreadPool(10);
...
threadPool.submit(new Runnable() {
// you could create your own Runnable class if each one needs its own httpClient
public void run() {
StringEntity entity = new StringEntity(message);
...
// we assume that the client is some sort of pooling client
httpClient.execute(method).getEntity().getContent().close();
}
});
This works well if you want to queue up these messages and send them in a background thread so that they do not slow down your program. You could submit the messages to the threadPool and keep on moving. Or you could put them in a BlockingQueue<String> and have a thread taking from the BlockingQueue and calling httpClient.execute(...).
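The BlockingQueue variant could look roughly like the sketch below. It is a minimal illustration rather than a drop-in implementation: the class name, the STOP sentinel and the send() method are assumptions, and send() only records the message where real code would call httpClient.execute(...).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class BackgroundSenderDemo {

    // Sentinel that tells the worker to stop; compared by identity, not content.
    private static final String STOP = new String("STOP");

    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    private final List<String> sent = Collections.synchronizedList(new ArrayList<>());
    private final Thread worker = new Thread(this::drainQueue, "log-sender");

    BackgroundSenderDemo() {
        worker.start();
    }

    // Called by the main program; returns immediately because the queue is unbounded.
    void enqueue(String message) {
        queue.add(message);
    }

    // Worker loop: blocks in take() until a message or the sentinel arrives.
    private void drainQueue() {
        try {
            for (;;) {
                String message = queue.take();
                if (message == STOP) {
                    return;
                }
                send(message);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Hypothetical placeholder for httpClient.execute(...); it only records the message.
    private void send(String message) {
        sent.add(message);
    }

    // Queues the sentinel behind all pending messages, waits for the worker to
    // finish, and returns everything that was sent, in order.
    List<String> shutdownAndGetSent() {
        queue.add(STOP);
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new ArrayList<>(sent);
    }

    public static void main(String[] args) {
        BackgroundSenderDemo sender = new BackgroundSenderDemo();
        sender.enqueue("first log line");
        sender.enqueue("second log line");
        System.out.println(sender.shutdownAndGetSent());
    }
}
```

A single worker keeps the messages in order. Several workers taking from the same queue would send faster but give up that ordering.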
More implementation details from this good ExecutorService tutorial.
Lastly, how about putting all of your messages into one entity and splitting them on the server? That would be the most efficient, although you might not control the server handler code.

Hello. ExecutorService is certainly an option. You have four ways to do it in Java:
Using threads directly (exposes too many details, so it is easy to make mistakes).
ExecutorService, as you have already mentioned. It has been available since Java 5.
Here is a tutorial demonstrating ExecutorService: http://tutorials.jenkov.com/java-util-concurrent/executorservice.html
The Fork/Join framework, available since Java 7.
Parallel streams, available since Java 8. Below is a solution using parallel streams.
Going for a higher-level API will spare you some errors you might otherwise make.
// requires: import java.util.stream.IntStream;
// iterate over indices, because receiversList and logFilesList are matched by position
IntStream.range(0, receiversList.size()).parallel().forEach(i -> {
String receiverURL = serverURL + receiversList.get(i);
String logPath = logFilesPath + logFilesList.get(i);
for (String message : readMsg(logPath)) {
try {
// a fresh HttpPost per message avoids sharing a mutable request between threads
HttpPost method = new HttpPost(receiverURL);
method.setEntity(new StringEntity(message));
log.info("Sending message:");
log.info(message + "\n");
if (receiverURL.startsWith("https")) {
processAuthentication(method, username, password);
}
httpClient.execute(method).getEntity().getContent().close();
} catch (IOException e) {
throw new UncheckedIOException(e);
}
}
});

Linking two Threads in a Client-Server Socket program - Java

I create threads of class A and each sends a serialized object to a Server using ObjectOutputStream.
The Server creates new Threads B for each socket connection (whenever a new A client connects)
B will call a synchronized method on a Shared Resource Mutex which causes it (B) to wait() until some internal condition in the Mutex is true.
In this case, how can A know that B is currently waiting?
Hope this description is clear.
Class Arrangement:
A1--------->B1-------->| |
A2--------->B2-------->| Mutex |
A3--------->B3-------->| |
EDIT:
It is a requirement to use wait(), notify() or notifyAll(), since this is for an academic project where concurrency is tested.

Normally A would read on the socket, which would "block" (i.e. not return, hang up) until some data was sent back by B. It doesn't need to be written to deal with the waiting status of B. It just reads and that inherently involves waiting for something to read.
Update: So you want A's user interface to stay responsive. By far the best way to do that is to take advantage of the user interface library's event queue system. All GUI frameworks have a central event loop that dispatches events to handlers (button click, mouse move, timer, etc.). There is usually a way for a background thread to post something to that event queue so that it will be executed on the main UI thread. The details will depend on the framework you're using.
For example, in Swing, a background thread can do this:
SwingUtilities.invokeAndWait(someRunnableObject);
So suppose you define this interface:
public interface ServerReplyHandler {
void handleReply(Object reply);
}
Then make a nice API for your GUI code to use when it wants to submit a request to the server:
public class Communications {
public static void callServer(Object inputs, ServerReplyHandler handler) { /* implementation discussed below */ }
}
So your client code can call the server like this:
showWaitMessage();
Communications.callServer(myInputs, new ServerReplyHandler() {
public void handleReply(Object myOutputs) {
hideWaitMessage();
// do something with myOutputs...
}
});
To implement the above API, you'd have a thread-safe queue of request objects, which store the inputs object and the handler for each request. And a background thread which just does nothing but pull requests from the queue, send the serialised inputs to the server, read back the reply and deserialise it, and then do this:
final ServerReplyHandler currentHandler = ...
final Object currentReply = ...
SwingUtilities.invokeAndWait(new Runnable() {
public void run() {
currentHandler.handleReply(currentReply);
}
});
So as soon as the background thread has read back the reply, it passes it back into the main UI thread via a callback.
This is exactly how browsers do asynchronous communication from JS code. If you're familiar with jQuery, the above Communications.callServer method is the same pattern as:
showWaitMessage();
$.get('http://...', function(reply) {
hideWaitMessage();
// do something with 'reply'
});
The only difference in this case is that you are writing the whole communication stack by hand.
Update 2
You asked:
You mean I can pass "new ObjectOutputStream().writeObject(obj)" as
"myInputs" in Communications.callServer?
If all information is passed as serialised objects, you can build the serialisation into callServer. The calling code just passes some object that supports serialisation. The implementation of callServer would serialise that object into a byte[] and post that to the work queue. The background thread would pop it from the queue and send the bytes to the server.
Note that this avoids serialising the object on the background thread. The advantage of this is that all background thread activity is separated from the UI code. The UI code can be completely unaware that you're using threads for communication.
Re: wait and notify, etc. You don't need to write your own code to use those. Use one of the standard implementations of the BlockingQueue interface. In this case you could use LinkedBlockingQueue with the default constructor so it can accept an unlimited number of items. That means that submitting to the queue will always happen without blocking. So:
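private static class Request {
public byte[] send;
public ServerReplyHandler handler;
}
// default constructor gives an unbounded queue, so put() never blocks
private static final BlockingQueue<Request> requestQueue = new LinkedBlockingQueue<Request>();
public static void callServer(Object inputs, ServerReplyHandler handler)
throws IOException, InterruptedException {
ByteArrayOutputStream byteStream = new ByteArrayOutputStream();
ObjectOutputStream objectStream = new ObjectOutputStream(byteStream);
objectStream.writeObject(inputs);
objectStream.flush(); // push buffered bytes into byteStream before copying them
Request r = new Request();
r.send = byteStream.toByteArray();
r.handler = handler;
requestQueue.put(r);
}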
Meanwhile the background worker thread is doing this:
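for (;;) {
Request r = requestQueue.take();
if (r == shutdown) {
break;
}
// connect to server, send r.send bytes to it
// read back the response as a byte array:
final byte[] response = ...
final ServerReplyHandler currentHandler = r.handler;
SwingUtilities.invokeAndWait(new Runnable() {
public void run() {
try {
currentHandler.handleReply(
new ObjectInputStream(
new ByteArrayInputStream(response)
).readObject()
);
} catch (IOException | ClassNotFoundException e) {
// deserialisation failed; report it however suits your UI
e.printStackTrace();
}
}
});
}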
The shutdown variable is just:
private static Request shutdown = new Request();
i.e. it's a dummy request used as a special signal. This allows you to have another public static method to allow the UI to ask the background thread to quit (would presumably clear the queue before putting shutdown on it).
Note the essentials of the pattern: UI objects are never accessed on the background thread. They are only manipulated from the UI thread. There is a clear separation of ownership. Data is passed between threads as byte arrays.
You could start multiple workers if you wanted to support more than one request happening simultaneously.

CORBA unclear stuff

I've recently begun working on my first CORBA project. I think I've got the basics; however, there are some things that still elude me. One of these is how CORBA handles several calls on the same object.
Suppose I have a client that registers itself with the server and can then receive work. The server sends work at random times.
Are all these calls handled on the same thread? This would mean that while the client is working, it cannot receive anything. In this case, how could I give it multithreaded behavior?
Or, on the other hand, is a thread spawned for every call received? In this case, do I need to protect the common data that can be accessed on each call? What would be good practice for doing so?
The other thing I'd like to do is create several workers and have them receive work, but in my implementation only one worker is active.
The code is below:
public static void main(String[] args)
{
try
{
connectWithServer(args);
createWorkers();
// wait for invocations from clients
orb.run();
}
catch (Exception e)
{
System.out.println("ERROR : " + e) ;
e.printStackTrace(System.out);
}
}
static public void connectWithServer(String[] args)throws Exception
{
orb = ORB.init(args, null);
// get reference to rootpoa & activate the POAManager
rootpoa = POAHelper.narrow(orb.resolve_initial_references("RootPOA"));
rootpoa.the_POAManager().activate();
// get the root naming context
org.omg.CORBA.Object objRef = orb.resolve_initial_references("NameService");
// Use NamingContextExt instead of NamingContext. This is
// part of the Interoperable naming Service.
NamingContextExt ncRef = NamingContextExtHelper.narrow(objRef);
// resolve the Object Reference in Naming
taskBagImpl = TaskBagHelper.narrow(ncRef.resolve_str(SERVER_NAME));
System.out.println(TAG + " Obtained a handle on server object: " + taskBagImpl);
}
public static void createWorkers() throws Exception
{
for(int i = 0; i < nrOfWorkers; i++)
{
WorkerImpl w = new WorkerImpl();
rootpoa.activate_object((Servant) w);
Worker ref = WorkerHelper.narrow(rootpoa.servant_to_reference(w));
w.setRef(ref);
taskBagImpl.registerWorker(w.getId(), ref);
}
}

Threading options are not specified in the CORBA standard. The only configuration possible with respect to threading is the POA policy ThreadingPolicy. Possible values are either ORB_CTRL_MODEL or SINGLE_THREAD_MODEL. The former specifies nothing about threading, and the ORB implementation decides which threading model to use. The latter guarantees that every request that an object receives (within the same POA) is serialized, so no re-entrancy or multi-threading capabilities have to be implemented in the servant.
CORBA implementors, however, took notice of this limitation and implemented some standard default policies, that have to be configured by other means (maybe program options via ORB.init() or configuration files). Usually, you can find three different policies (once you select ORB_CTRL_MODEL):
Thread per request: Spawns a new thread each request.
Thread per client: Spawns a new thread for each different client.
Thread pool: The ORB pre-allocates some pool of threads and uses them to serve all requests.
Others are possible, but those tend to be the common ground. Of course, any of them will force you to use some kind of locking strategy to support concurrent clients.
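As an illustration of such a locking strategy, here is a minimal sketch in plain Java. It contains no CORBA code at all: WorkerState is a hypothetical stand-in for the state held by a servant such as WorkerImpl, and a thread pool plays the role of the ORB dispatching concurrent requests.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class SharedStateServantDemo {

    // Hypothetical stand-in for a servant such as WorkerImpl; it is not a real
    // CORBA servant, it only models state that several request threads touch.
    static final class WorkerState {
        private final Object lock = new Object();
        private int tasksReceived = 0;

        // Would be called by the ORB on whichever thread serves the request.
        void receiveWork() {
            synchronized (lock) {
                tasksReceived++;
            }
        }

        int tasksReceived() {
            synchronized (lock) {
                return tasksReceived;
            }
        }
    }

    // Simulates 'requests' concurrent invocations arriving through a thread pool.
    static int simulateRequests(int requests, int poolSize) {
        WorkerState state = new WorkerState();
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        for (int i = 0; i < requests; i++) {
            pool.execute(state::receiveWork);
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("requests did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return state.tasksReceived();
    }

    public static void main(String[] args) {
        System.out.println(simulateRequests(10_000, 8));
    }
}
```

Without the synchronized blocks, concurrent increments could be lost and the final count could come out lower than the number of requests.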

See this Java IDL FAQ:
What is the thread model supported by the CORBA implementation in this release?

How do I make an async call to Hive in Java?

I would like to execute a Hive query on the server in an asynchronous manner. The Hive query will likely take a long time to complete, so I would prefer not to block on the call. I am currently using Thrift to make a blocking call (blocks on client.execute()), but I have not seen an example of how to make a non-blocking call. Here is the blocking code:
TSocket transport = new TSocket("hive.example.com", 10000);
transport.setTimeout(999999999);
TBinaryProtocol protocol = new TBinaryProtocol(transport);
Client client = new ThriftHive.Client(protocol);
transport.open();
client.execute(hql); // Omitted HQL
List<String> rows;
while ((rows = client.fetchN(1000)) != null) {
for (String row : rows) {
// Do stuff with row
}
}
transport.close();
The code above is missing try/catch blocks to keep it short.
Does anyone have any ideas how to do an async call? Can Hive/Thrift support it? Is there a better way?
Thanks!

AFAIK, at the time of writing Thrift does not generate asynchronous clients. The reason as explained in this link here (search text for "asynchronous") is that Thrift was designed for the data centre where latency is assumed to be low.
Unfortunately as you know the latency experienced between call and result is not always caused by the network, but by the logic being performed! We have this problem calling into the Cassandra database from a Java application server where we want to limit total threads.
Summary: for now all you can do is make sure you have sufficient resources to handle the required numbers of blocked concurrent threads and wait for a more efficient implementation.

It is now possible to make an asynchronous call in a Java thrift client after this patch was put in:
https://issues.apache.org/jira/browse/THRIFT-768
Generate the async java client using the new thrift and initialize your client as follows:
TNonblockingTransport transport = new TNonblockingSocket("127.0.0.1", 9160);
TAsyncClientManager clientManager = new TAsyncClientManager();
TProtocolFactory protocolFactory = new TBinaryProtocol.Factory();
Hive.AsyncClient client = new Hive.AsyncClient(protocolFactory, clientManager, transport);
Now you can execute methods on this client as you would on a synchronous interface. The only change is that all methods take an additional parameter of a callback.

I know nothing about Hive, but as a last resort, you can use Java's concurrency library:
Callable<SomeResult> c = new Callable<SomeResult>(){public SomeResult call(){
// your Hive code here
}};
Future<SomeResult> result = executorService.submit(c);
// when you need the result, this will block
result.get();
Or, if you do not need to wait for the result, use Runnable instead of Callable.

After talking to the Hive mailing list, Hive does not support async calls using Thrift.

I don't know about Hive in particular, but any blocking call can be turned into an async call by spawning a new thread and using a callback. You could look at java.util.concurrent.FutureTask, which has been designed to allow easy handling of such asynchronous operations.
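A minimal sketch of that idea, assuming nothing about Hive itself: callAsync wraps any blocking Callable in a FutureTask whose done() hook invokes a callback. The names callAsync, slowQuery and runDemo are made up for the example, and slowQuery merely sleeps where a real program would call something like client.execute(hql).

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.FutureTask;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;

class AsyncCallDemo {

    // Runs blockingCall on a new thread and hands its result to callback once done.
    static <T> FutureTask<T> callAsync(Callable<T> blockingCall, Consumer<T> callback) {
        FutureTask<T> task = new FutureTask<T>(blockingCall) {
            @Override
            protected void done() {
                // done() runs on the worker thread after the call completes or fails
                try {
                    callback.accept(get());
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } catch (ExecutionException e) {
                    // the blocking call threw; the callback is not invoked
                    e.getCause().printStackTrace();
                }
            }
        };
        new Thread(task, "blocking-call").start();
        return task;
    }

    // Hypothetical stand-in for a slow blocking call such as client.execute(hql).
    static String slowQuery() throws InterruptedException {
        Thread.sleep(50);
        return "rows";
    }

    // Fires the call, waits for the callback, and returns what it received.
    static String runDemo() {
        AtomicReference<String> received = new AtomicReference<>();
        CountDownLatch callbackRan = new CountDownLatch(1);
        callAsync(AsyncCallDemo::slowQuery, result -> {
            received.set(result);
            callbackRan.countDown();
        });
        // The caller is free to do other work here while the query runs.
        try {
            callbackRan.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return received.get();
    }

    public static void main(String[] args) {
        System.out.println(runDemo());
    }
}
```

The callback runs on the worker thread, so anything it touches must be safe to use from there. The thread still blocks for the duration of the call, so this frees the caller rather than reducing the number of threads needed.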

We fire off asynchronous calls to AWS Elastic MapReduce. AWS MapReduce can run hadoop/hive jobs on Amazon's cloud with a call to the AWS MapReduce web services.
You can also monitor the status of your jobs and grab the results off S3 once the job is completed.
Since the calls to the web services are asynchronous in nature, we never block our other operations. We continue to monitor the status of our jobs in a separate thread and grab the results when the job is complete.
