I am currently studying how Java RMI works but I do not understand a certain aspect.
In a non-distributed multithreaded environment, if methods on the same object are called simultaneously from different threads, each of them will be executed on the respective thread's stack (accessing shared data is not a part of my question).
In a distributed system, a client process calls methods on the stub and the actual call is executed on the stack of the process that created the remote object, so how are simultaneous calls to a method handled? In other words, what happens at the, let's say, server thread when there are two (or more) requests to execute the same method on that thread?
I thought of this question as I want to compare this to what I am used to - the executions being on different stacks.
how are simultaneous calls to a method handled?
It isn't specified. It is carefully stated in the RMI specification: "The RMI runtime makes no guarantees with respect to mapping remote object invocations to threads."
The occult meaning of this is that you can't assume the server is single-threaded.
In other words what happens at the lets say server thread when there are two (or more) requests to execute the same method on that thread?
There can't be two or more requests to execute the method on the same thread. The question doesn't make sense. You've posited a unique 'lets say server thread' that doesn't actually exist.
There can however be two or more requests to execute the method arising from two or more concurrent clients, or two or more concurrent threads in a single client, or both, and because of the wording of the RMI Specification you can't assume a single-threaded despatching model at the server.
In the Oracle/Sun implementation it is indeed multi-threaded, ditto the IBM implementation. I'm not aware of any RMI implementation that isn't multi-threaded, and any such implementation would be basically useless.
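To make the "different stacks" picture concrete, here is a minimal sketch that does not use RMI at all. It only imitates what a multi-threaded dispatcher does: several threads call the same method on one shared object. The names `Counter` and `ConcurrentDispatchDemo` are made up for illustration, and one thread per call is an assumption of the sketch, not something the RMI specification promises.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a remote object implementation; no RMI is involved.
class Counter {
    private final AtomicInteger calls = new AtomicInteger();
    private final Set<String> threadsSeen = ConcurrentHashMap.newKeySet();

    // Each invocation runs on the calling thread's own stack. Only the two
    // fields above are shared, so they use thread-safe types.
    int increment() {
        threadsSeen.add(Thread.currentThread().getName());
        return calls.incrementAndGet();
    }

    int calls() { return calls.get(); }
    int distinctThreads() { return threadsSeen.size(); }
}

class ConcurrentDispatchDemo {
    // Imitates a multi-threaded dispatcher: one thread per incoming call
    // (an assumption of this sketch).
    static Counter dispatch(int clients) {
        Counter target = new Counter();
        CountDownLatch start = new CountDownLatch(1);
        Thread[] workers = new Thread[clients];
        for (int i = 0; i < clients; i++) {
            workers[i] = new Thread(() -> {
                try {
                    start.await();          // release all callers at once
                    target.increment();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }, "dispatch-" + i);
            workers[i].start();
        }
        start.countDown();
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return target;
    }

    public static void main(String[] args) {
        Counter c = dispatch(8);
        System.out.println(c.calls() + " calls on " + c.distinctThreads() + " threads");
    }
}
```

Every call lands on the same `Counter`, but each one runs on its own thread, which is why the remote object has to be thread-safe.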
Related
I just had a discussion with a colleague who asked me why i would do a static Http request like this:
HttpClient.doGet(HashMap<String,String> Parameters);
instead of invoking an object of the class via default constructor and use a nonstatic method like this:
new HttpClient().doGet(HashMap<String,String> Parameters)
If assuming that the implementation of the method doGet only uses the parameters of the function without any member variables, would the static implementation be problematic in any way, e.g. thread safety?
It depends on what you mean by problematic, but going off just your given example, the answer is no, the static method call is not problematic, and is arguably better, since no object needs to be instantiated.
You mentioned thread safety, so I will touch on that. You only need to be concerned with thread safety if there is "mutable shared state" involved. Mutable is the key word here. For example, if multiple threads were sharing the same instance of HttpClient, and that HttpClient was keeping track of some state by mutating one or more of its member variables, then that definitely has the potential to be problematic.
... but also, every HTTP request has to go out on a network, to a physical computer someplace else, then return, "at least many milliseconds later." So, there's really no point in "multi-threading" that chore. A single thread can be given the responsibility for sending out parallel I/O requests to the remote hosts, receiving requests from the rest of your code by means of some thread-safe queue and returning the responses in like manner on another queue (or queues).
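A small sketch of the difference, using made-up classes rather than any real HTTP client: `QueryStrings.build` touches only its argument and local variables, while `CountingClient` mutates a field and therefore needs synchronization once an instance is shared. Both class names are hypothetical, and the query string is not URL-encoded, to keep the example short.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical helper, not a real HTTP client.
class QueryStrings {
    // Stateless: reads only its argument and locals, so concurrent calls
    // cannot interfere with each other.
    static String build(Map<String, String> parameters) {
        return new TreeMap<>(parameters).entrySet().stream()
                .map(e -> e.getKey() + "=" + e.getValue())
                .collect(Collectors.joining("&"));
    }
}

// Hypothetical client that keeps mutable state in a member variable.
class CountingClient {
    private int requests; // shared by every thread that uses this instance

    // Without synchronized, concurrent increments could be lost.
    synchronized String buildAndCount(Map<String, String> parameters) {
        requests++;
        return QueryStrings.build(parameters);
    }

    synchronized int requests() {
        return requests;
    }
}
```

The static method is safe because it has nothing to share, not because it is static. A static method that mutated a static field would have the same problem as `CountingClient`.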
It is wasteful to associate "a thread" with "a request." A very small pool of workers can be consuming the responses that come off of that reply-queue.
(And of course, there are plenty of existing Java open-source frameworks that implement all of this very-familiar plumbing for you.)
I am implementing something like a database where data manipulation statements (inserts, updates and deletes) get evaluated. Some statements can execute concurrently and others cannot (I compute that). I like the ease of use and convenience of RMI, however I need to have a deeper understanding of the RMI service implementation w.r.t multithreading. For example,
Can the multithreading be controlled in any way?
Is a thread created for each remote call (on server side) or are thread pools used?
More generally, using RMI, how can I ensure that some rmi calls wait for other calls to terminate?
Is there another non-RMI approach, with the same convenience and efficiency that would work better for this?
If I want multi-threading, should I just create threads myself in the server-side code? The concern is that if the RMI service creates multiple threads, then I would be adding additional unnecessary threads.
If, for example, a thread is created on each call, then I can use the java join method to order the statement execution. On the other hand, if thread pools are used then the join method won't work (since the threads don't terminate).
Overview
There seems to be a few questions within this post, so I will attempt to walk you through each portion in some detail.
Question 1 - Can the multi-threading be controlled in any way?
Yes! Your implementation of the multi-threading can be whatever you want it to be. An RMI implementation is only the communication between separate JVMs, with enough abstraction to feel like they exist on one JVM; it is only the communication layer and has no effect on how you do multi-threading.
Question 2 - Is a thread created for each remote call (on the server side) or are thread-pools used?
See the documentation here. The short answer is that it is not guaranteed either way: a call may or may not run on a separate thread.
A method dispatched by the RMI runtime to a remote object implementation may or may not execute in a separate thread. The RMI runtime makes no guarantees with respect to mapping remote object invocations to threads. Since remote method invocation on the same remote object may execute concurrently, a remote object implementation needs to make sure its implementation is thread-safe.
Whether RMI uses thread pools depends on the implementation, but as a developer using RMI this should be of no concern, as it is encapsulated in the RMI connection layer.
Question 3 - Using RMI, how can I ensure that some RMI calls wait for other calls to terminate?
This is a rather vague question, but I think what you're asking is how to properly block when synchronizing in RMI. This comes down to the design of your application. Let's take the scenario where you are trying to access the database and you must synchronize DB access. If the client invokes access through RMI, it will invoke the remote server's method that holds all the synchronization, and thus wait for a lock if it must. Therefore, the client will be waiting for its turn to access the DB via the server. So, with your current scenario, you want the synchronization of the DB to live on the server side.
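As a rough illustration of keeping the synchronization on the server side, here is a sketch of a server object in which some operations may overlap and others run exclusively. It assumes a read/write split is a good enough model for "statements that can run concurrently and statements that cannot", which may not match your rules. `GuardedStore` is a hypothetical name, and the RMI plumbing (a `Remote` interface, exporting, binding) is left out.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical server-side object. In an RMI setup its methods would be
// declared in an interface extending java.rmi.Remote.
class GuardedStore {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final Map<String, String> rows = new HashMap<>();

    // Any number of callers may hold the read lock at the same time.
    String select(String key) {
        lock.readLock().lock();
        try {
            return rows.get(key);
        } finally {
            lock.readLock().unlock();
        }
    }

    // The write lock is exclusive: this call waits until all readers and
    // any other writer have finished, and blocks them while it runs.
    void update(String key, String value) {
        lock.writeLock().lock();
        try {
            rows.put(key, value);
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

A remote caller blocked inside `update` simply sees its remote call take longer to return; the waiting happens on whichever server thread the RMI runtime used for that call.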
Question 4 - Is there another non-RMI approach, with the same convenience and efficiency that would work better for this?
Absolutely. Below is a brief list of communication implementations that could be utilized for communication.
1) RESTful
2) RMI
3) Sockets
4) gRPC
My recommendation is to use RESTful, as it is the most straightforward and has plenty of implementations and documentation on the internet. Efficiency seems to be quite a high concern for you, but your operations are only manipulating a DB in a standard manner. Therefore, I believe a RESTful implementation would provide more than enough efficiency.
Think of it like this: you have N clients, a load balancer, and M servers. There is no persistent connection between clients and servers, which reduces complexity and computation. As N grows, the load balancer creates more server instances and allocates the load appropriately. Note that the requests between clients and servers are actually quite small, as each carries only a payload and a request type. The servers receive the requests and compute the operations as normal and in parallel. Optimization can be done on the server side via thread pools or frameworks such as Spring.
What you are asking for is a way to coordinate the execution of tasks according to their dependencies. The fact that your tasks make RMI calls is insignificant. We can imagine purely computational tasks that do not access remote machines and still depend on each other, for example by passing values computed in one task as parameters to other tasks.
The coordination of dependent tasks is the central problem of asynchronous programming. The support for asynchronous programming in the JDK is not complete, but it is sufficient for your problem. You need two things: CompletableFuture and an Executor. Note that an RMI call blocks the thread it runs on, so using an Executor with a limited number of threads can lead to a specific kind of deadlock, named "thread starvation", in which the computation cannot move on because all available threads are blocked. So use an executor with an unlimited number of threads; the simplest is one that creates a new thread for each task:
Executor newThreadExecutor = (Runnable r)->new Thread(r).start();
Then, for each RMI call (or any other task), declare the task method. If the task does not depend on other tasks, then the method should have no parameters. If the task depends on the result(s) produced by other task(s), then declare one or two parameters (a greater number of parameters is not supported by CompletableFuture directly). Suppose we have:
String m0a() {return "ok";} // no args
Integer m0b() {return 1;} // no args
Double m2(String arg, Integer arg2) {return arg2/2.0;} // 2 args
Suppose we want to compute the following result:
String r0a = m0a();
Integer r0b = m0b();
Double r2 = m2(r0a, r0b);
but asynchronously, so that the calls to m0a and m0b are executed in parallel, and the call to m2 starts as soon as both m0a and m0b have finished.
Then, wrap each task method in an instance of CompletableFuture. Depending on the signature of the task method, different methods of CompletableFuture are used:
CompletableFuture<String> t0a = CompletableFuture.supplyAsync(this::m0a, newThreadExecutor);
CompletableFuture<Integer> t0b = CompletableFuture.supplyAsync(this::m0b, newThreadExecutor);
CompletableFuture<Double> t2 = t0a.thenCombineAsync(t0b, this::m2, newThreadExecutor);
The tasks start executing right after declaration; no call to a special start method is required.
To get the final result from the last task t2, the get() method of the Future interface can be used:
Double res = t2.get();
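Put together, the fragments above can be assembled into one small program. This is only a sketch: the class name `DependentTasks` is made up, the task methods are made static so they can be referenced without an instance, and `join()` is used instead of `get()` purely to avoid the checked exceptions that `get()` declares.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;

class DependentTasks {
    // One new thread per task, so a blocked task never starves the others.
    static final Executor newThreadExecutor = r -> new Thread(r).start();

    static String m0a() { return "ok"; }                               // no args
    static Integer m0b() { return 1; }                                 // no args
    static Double m2(String arg, Integer arg2) { return arg2 / 2.0; }  // 2 args

    static double run() {
        // t0a and t0b start immediately and run in parallel.
        CompletableFuture<String> t0a =
                CompletableFuture.supplyAsync(DependentTasks::m0a, newThreadExecutor);
        CompletableFuture<Integer> t0b =
                CompletableFuture.supplyAsync(DependentTasks::m0b, newThreadExecutor);
        // t2 starts only once both inputs are available.
        CompletableFuture<Double> t2 =
                t0a.thenCombineAsync(t0b, DependentTasks::m2, newThreadExecutor);
        // join() blocks like get() but throws only unchecked exceptions.
        return t2.join();
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints 0.5
    }
}
```

In the RMI case, the bodies of `m0a`, `m0b` and `m2` would be the places where the remote calls are made.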
I have been using Java RMI for a while now, but I couldn't figure out whether the RMI remote stubs (on the server side) are singletons. The reason I ask is:
Let's assume that one of the RMI implementation methods lower down in the chain of calls is a synchronized method. If for some reason the logic in the synchronized method is messed up (or hangs), future RMI calls (from the client) will hang too while trying to get access to that synchronized method. This will hold true only if the RMI stubs are singletons. If a new object is created on the server side at every remote call from the client, this won't be a problem, because then the methods are being called on a different object and the synchronized method won't be an issue anymore.
Long story short: I am trying to understand how the JVM internally maintains RMI remote objects on the server side and whether they are singletons. I tried many different javadocs but they don't explicitly mention this anywhere.
Any and all help is appreciated !
EDIT
Based on some questions and comments, I am refining the question: my real question is, does RMI on the server side happen to keep some kind of an object pool based on the one object you export and register? Can you bind more than one object of the same type with the same name (somewhat simulating an object pool where RMI can give me any of the objects that I registered), or, in order to have multiple instances of the same object, will I have to register them with different names?
First of all, the "stub" is a client-side concept, there are no stubs on the server.
As for the remote objects themselves, the RMI system doesn't instantiate the objects for you, it's up to you to create instances and export them. You create one instance of the object, export that object, and bind it in the registry under a particular name. All calls on client stubs obtained from that same name in the registry will ultimately end up at the same object on the server.
Can you bind more than one object of the same type with the same name (somewhat simulating an object pool where RMI can give me any of the objects that I registered)
No, you can only bind one object in the registry under a given name. But the object you bind could itself be a proxy to your own object pool, for example using the Spring AOP CommonsPoolTargetSource mechanism.
RMI is based on the proxy design pattern.
See what it says here:
A RMI Server is an application that creates a number of remote objects. An RMI Server is responsible for:
Creating an instance of the remote object (e.g. CarImpl instance = new CarImpl());
Exporting the remote object;
Binding the instance of the remote object to the RMI registry.
Stubs are not singletons, but your question is really about the server-side objects. They are not singletons either, unless you implement them that way yourself. RMI doesn't do anything about that whatsoever.
EDIT Based on some questions and comments, I am refining the question: my real question is, does RMI on the server side happen to keep some kind of an object pool based on what one object you export and register?
No.
Can you bind more than one object of the same type with the same name
No.
I will have to register them with different names
You don't have to register them at all. You need one singleton remote object bound into the Registry: consider that as a factory method for further remote objects, which are returned as results from its remote methods. For example, a remote Login object is bound in the Registry and has a single login() method that returns a remote session object, a new one per login, with its own API.
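A sketch of that factory arrangement follows. The interface and class names (`Login`, `Session`, `LoginImpl`, `SessionImpl`) are made up for illustration, and the exporting and binding steps are only described in comments so that the example runs without a registry or network access.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical per-client remote interface.
interface Session extends Remote {
    String whoAmI() throws RemoteException;
}

// Hypothetical factory interface: the only object bound in the Registry.
interface Login extends Remote {
    Session login(String user) throws RemoteException;
}

class SessionImpl implements Session {
    private final String user;
    private final int id;

    SessionImpl(String user, int id) {
        this.user = user;
        this.id = id;
    }

    @Override
    public String whoAmI() {
        return user + "#" + id;
    }
}

class LoginImpl implements Login {
    private final AtomicInteger nextId = new AtomicInteger();

    // Returns a new session object per login. On a real server the session
    // would be exported before being returned (for example with
    // UnicastRemoteObject.exportObject(session, 0)) so that the client
    // receives a stub rather than a serialized copy.
    @Override
    public SessionImpl login(String user) {
        return new SessionImpl(user, nextId.incrementAndGet());
    }
}
```

On a real server, one `LoginImpl` instance would be exported and bound once under a single name. The sessions it hands out never appear in the Registry at all.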
From the Java docs:
http://docs.oracle.com/javase/7/docs/platform/rmi/spec/rmi-arch3.html
A method dispatched by the RMI runtime to a remote object implementation may or may not execute in a separate thread. The RMI runtime makes no guarantees with respect to mapping remote object invocations to threads. Since remote method invocation on the same remote object may execute concurrently, a remote object implementation needs to make sure its implementation is thread-safe.
Yes, the server side method is synchronized. The implementation is platform-specific. You cannot assume anything else about threading. And you certainly cannot assume whether or not the remote object is a singleton.
Also, it might be useful to look at Remote Object Activation:
http://docstore.mik.ua/orelly/java-ent/jenut/ch03_06.htm
http://docs.oracle.com/javase/7/docs/api/java/rmi/activation/package-summary.html
I have a web application that retrieves a (large) list of results from the database, then needs to pare down the list by looking at each result, and throwing out "invalid" ones. The parameters that make a result "invalid" are dynamic, and we cannot pass the work on to the database.
So, one idea is to create a thread pool and ExecutorService and check these results concurrently. But I keep seeing people saying "Oh, the spec prohibits spawning threads in a servlet" or "that's just a bad idea".
So, my question: what am I supposed to do? I'm in a servlet 2.5 container, so all the asynchronous goodies that are part of the 3.0 spec are unavailable to me. Writing a separate service that I communicate with via JMS seems like overkill.
Looking for expert advice here.
Jason
Nonsense.
The JEE spec has lots of "should nots" and "thou shalt nots". The Servlet spec, on the other hand, has none of that. The Servlet spec is much more wild west. It really doesn't dive into the actual operational aspects like the JEE spec does.
I've yet to see a JEE container (either a pure servlet container ala Tomcat/Jetty, or full boat ala Glassfish/JBoss) that actually prevented me from firing off a thread on my own. WebSphere might, it's supposed to be rather notorious, but I've not used WebSphere.
If the concept of creating unruly, self-managed threads makes you itch, then the full JEE containers internally have a formal "WorkManager" that can be used to peel threads off of. They just all expose them in different ways. That's the more "by the book-ish" mechanism for getting a thread.
But, frankly, I wouldn't bother. You'll likely have more success using the Executors out of the standard class library. If you saturate your system with too many threads and everything gets out of hand, well, that's on you. Don't Do That(tm).
As to whether an async solution is even appropriate, I'll punt on that. It's not clear from your post whether it is or not. But your question was about threads and Servlets.
Just Do It. Be aware it "may not be portable", do it right (use an Executor), take responsibility for it, and the container won't be the wiser, nor care.
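For the "use an Executor" part, a minimal sketch of checking a list of results concurrently might look like the following. `ConcurrentFilter` is a made-up name. The pool is created per call only to keep the example self-contained; in a servlet you would more likely create one pool at startup and share it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Predicate;

class ConcurrentFilter {
    // Keeps the items for which isValid returns true, preserving input order.
    static <T> List<T> filter(List<T> items, Predicate<T> isValid, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Boolean>> checks = new ArrayList<>();
            for (T item : items) {
                checks.add(() -> isValid.test(item));
            }
            // invokeAll blocks until every check has finished and returns
            // the futures in the same order as the submitted tasks.
            List<Future<Boolean>> verdicts = pool.invokeAll(checks);
            List<T> kept = new ArrayList<>();
            for (int i = 0; i < items.size(); i++) {
                if (verdicts.get(i).get()) {
                    kept.add(items.get(i));
                }
            }
            return kept;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

For example, `ConcurrentFilter.filter(List.of(1, 2, 3, 4, 5, 6), n -> n % 2 == 0, 3)` keeps the even numbers. Whether this is faster than a plain loop depends entirely on how expensive each check is.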
Doesn't look like concurrency will help you much here. Unless it's very expensive to check each entry, making that check concurrent won't speed things up. Your bottleneck is passing the result set through the database connection, and you couldn't multithread that even if you weren't working on a servlet.
There's nothing to stop you from hitting some ThreadPool from your Servlet; the challenge comes in getting the results. If the Servlet invocation is expecting some result from your submission of a Task to the ThreadPool, you will end up blocking, waiting for the ThreadPool work to finish, so that you can compose a response to the doGet/doPut invocation.
If, on the other hand, you devise your service such that a doPut, for example, submits a Task to a ThreadPool but gets back a "handle" or some other unique identifier of the Task and returns that to the client, then the client can "poll" the handle through some doGet API to see if the task is done. When the task is done, the client can get the results.
It's completely fine and appropriate. I have done a great deal of work with Servlets that use thread pools on different containers without any problems whatsoever.
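The handle-and-poll idea can be sketched without any servlet API at all. `TaskRegistry` below is a hypothetical helper: `submit` is what a doPut handler would call and `poll` is what a doGet handler would call. The pool size of 4 is arbitrary, and task results are assumed to be non-null.

```java
import java.util.Map;
import java.util.Optional;
import java.util.UUID;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Supplier;

// Hypothetical helper; R is the (non-null) result type of the tasks.
class TaskRegistry<R> {
    private final ExecutorService pool = Executors.newFixedThreadPool(4);
    private final Map<String, Future<R>> tasks = new ConcurrentHashMap<>();

    // Called from the submitting request: returns immediately with a handle.
    String submit(Supplier<R> work) {
        String handle = UUID.randomUUID().toString();
        Callable<R> call = work::get;
        tasks.put(handle, pool.submit(call));
        return handle;
    }

    // Called from the polling request: empty while the task is still running.
    Optional<R> poll(String handle) {
        Future<R> future = tasks.get(handle);
        if (future == null) {
            throw new IllegalArgumentException("unknown handle: " + handle);
        }
        if (!future.isDone()) {
            return Optional.empty();
        }
        try {
            return Optional.of(future.get());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
    }

    // Call when the application shuts down so the pool threads can exit.
    void shutdown() {
        pool.shutdown();
    }
}
```

A real service would also need to evict finished entries from the map at some point, which this sketch does not attempt.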
EJB containers (like JBoss) tend to warn against spawning threads, but this is because EJB guarantees that an instance of a Bean is only called by one thread, and some of the facilities rely on this and thus you could mess that up by using your own threads. In Servlet there is no such reliance and hence nothing you can mess up this way.
Even in EJB containers, you can use thread pools and be fine as long as you don't interact (like call) with EJB facilities from your own threads.
The thing to watch out for with servlet/threads is that member variables of the servlet need to be thread safe.
Technically nothing stops you from using a thread pool in your servlet to do some post-processing, but you could shoot yourself in the foot if you create a static thread pool with, say, 20 threads and 50 clients access your servlet concurrently, because 30 clients will be waiting (depending on how long your post-processing takes).
Does RMI handle multiple clients by itself? I.e.:
Is it possible for multiple clients to use a server function at the same time?
If no, how can I do such a thing?
If yes, how does it work? Does it make a new thread for each call? If one client blocks the function, what happens with the next client? Etc.
yes
how it works? does it make a new thread for each call? if one clients blocks the function what would happen with the next client? etc.
It creates a thread for each client connection.
If one client calls a synchronized method or one which blocks other calls, calls made by other threads will block until that call releases the resource.
It sounds like you already worked out the answers, do you have a more specific doubt?
Yes, RMI does handle multiple clients, but you must make your server thread-safe. RMI will dispatch multiple threads into a single server object if multiple clients simultaneously make method calls on it, so if your server isn't thread-safe, your application will fail.