I have the following situation:
I have 2 JVM processes (really 2 Java processes running separately, not 2 threads) running on a local machine. Let's call them ProcessA and ProcessB.
I want them to communicate (exchange data) with one another (e.g. ProcessA sends a message to ProcessB to do something).
For now, I work around this by writing a temporary file that both processes periodically scan for messages. I don't think this solution is very good.
What would be a better alternative to achieve what I want?
Multiple options for IPC:
Socket-Based (Bare-Bones) Networking
not necessarily hard, but:
might be verbose for not much,
might offer more surface for bugs, as you write more code.
you could rely on existing frameworks, like Netty
RMI
Technically, that's also network communication, but that's transparent for you.
Fully-fledged Message Passing Architectures
usually built on either RMI or network communications as well, but with support for complicated conversations and workflows
might be too heavy-weight for something simple
frameworks like ActiveMQ or JBoss Messaging
Java Management Extensions (JMX)
more meant for JVM management and monitoring, but could help to implement what you want if you mostly want to have one process query another for data, or send it some request for an action, if they aren't too complex
also works over RMI (amongst other possible protocols)
not so simple to wrap your head around at first, but actually rather simple to use
File-sharing / File-locking
that's what you're doing right now
it's doable, but comes with a lot of problems to handle
Signals
You can simply send signals to your other process
However, it's fairly limited and requires you to implement a translation layer (it is doable, but more a crazy idea to toy with than anything serious).
Without more details, a bare-bone network-based IPC approach seems the best, as it's the:
most extensible (in terms of adding new features and workflows to your app)
most lightweight (in terms of memory footprint for your app)
simplest (in terms of design)
most educational (in terms of learning how to implement IPC). You mentioned "socket is hard" in a comment; it really is not, and it is something worth working on.
That being said, based on your example (simply requesting the other process to do an action), JMX could also be good enough for you.
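A minimal sketch of the bare-bones socket option, assuming both processes agree on a loopback port and on a one-command, one-reply exchange. The class name, the command string and the reply format are made up for illustration, and both sides run in one JVM here only so the example is self-contained; in your case serveOnce would live in ProcessB and send in ProcessA.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

final class LoopbackIpcSketch {

    /** ProcessB side: accept one connection, read one command, send one reply. */
    static void serveOnce(ServerSocket server) throws IOException {
        try (Socket peer = server.accept();
             DataInputStream in = new DataInputStream(peer.getInputStream());
             DataOutputStream out = new DataOutputStream(peer.getOutputStream())) {
            String command = in.readUTF();
            out.writeUTF("done: " + command); // hypothetical reply format
            out.flush();
        }
    }

    /** ProcessA side: connect, send a command, block until the reply arrives. */
    static String send(int port, String command) throws IOException {
        try (Socket socket = new Socket(InetAddress.getLoopbackAddress(), port);
             DataOutputStream out = new DataOutputStream(socket.getOutputStream());
             DataInputStream in = new DataInputStream(socket.getInputStream())) {
            out.writeUTF(command);
            out.flush();
            return in.readUTF();
        }
    }

    /** Runs both sides in one JVM purely to keep the sketch self-contained. */
    static String demo(String command) throws Exception {
        // Port 0 lets the OS pick a free port; binding to loopback keeps it local.
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            Thread processB = new Thread(() -> {
                try {
                    serveOnce(server);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            }, "process-b");
            processB.start();
            String reply = send(server.getLocalPort(), command);
            processB.join();
            return reply;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo("rebuild-index"));
    }
}
```

Two real processes would need a fixed, agreed port (or some way to hand the port over), which the sketch sidesteps by asking the OS for a free one.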
I've added a library on GitHub called Mappedbus (http://github.com/caplogic/mappedbus) which enables two (or many more) Java processes/JVMs to communicate by exchanging messages. The library uses a memory-mapped file and makes use of fetch-and-add and volatile reads/writes to synchronize the different readers and writers. I've measured the throughput between two processes using this library at 40 million messages/s, with an average latency of 25 ns for reading/writing a single message.
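To illustrate only the underlying mechanism, here is a minimal sketch of sharing bytes through a memory-mapped file with plain java.nio. This is not the Mappedbus API: the class name, file name and 64-byte size are made up, and the two mappings are opened in one JVM to stand in for two processes. It also does none of the fetch-and-add or volatile synchronization described above, so treat it as a starting point rather than a safe message bus.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class MappedFileSketch {
    static final int SIZE = 64; // hypothetical size of the shared region, in bytes

    /** Maps the first SIZE bytes of the file; the mapping outlives the channel. */
    static MappedByteBuffer map(Path file) throws IOException {
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            return channel.map(FileChannel.MapMode.READ_WRITE, 0, SIZE);
        }
    }

    /** Writes through one mapping and reads the value back through a second one. */
    static long demo(long value) throws IOException {
        Path file = Files.createTempFile("ipc-sketch", ".bin");
        file.toFile().deleteOnExit();
        MappedByteBuffer writerView = map(file); // would live in the writing process
        MappedByteBuffer readerView = map(file); // would live in the reading process
        writerView.putLong(0, value);
        return readerView.getLong(0);
    }

    public static void main(String[] args) throws IOException {
        System.out.println(demo(42L));
    }
}
```

Between real processes the reader also needs a way to know when a value is complete and visible, which is exactly the part a library like the one above takes care of.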
What you are looking for is inter-process communication. Java provides a simple IPC framework in the form of Java RMI API. There are several other mechanisms for inter-process communication such as pipes, sockets, message queues (these are all concepts, obviously, so there are frameworks that implement these).
I think in your case Java RMI or a simple custom socket implementation should suffice.
Sockets with DataInputStream/DataOutputStream to send data back and forth (ObjectInputStream/ObjectOutputStream if you want to send whole Java objects). This is easier than using a disk file, and much easier than Netty.
I tend to use JGroups to form local clusters between processes. It works for nodes (aka processes) on the same machine, within the same JVM or even across different servers.
Once you understand the basics it is easy to work with, and the option to run two or more processes in the same JVM makes those processes easy to test.
The overhead and latency are minimal if both are on the same machine (usually only a TCP round trip of roughly 100 ns or more per action).
A socket may be a better choice, I think.
Back in 2004 I implemented code that does this job with sockets. Since then I have searched many times for a better solution, because the socket approach triggers firewalls and worries my clients. I have not found a better one so far. The client must serialize the data and send it; the server must receive and deserialize it.
It is easy.
Related
I suppose this is not possible, but I am looking for the best way to separate the different layers of my service while still being able to access the layers quickly, without the overhead of IPC/RMI.
The main programming language I am using is Java, but I can use C++ if required.
What we have right now is a server that hosts the database and access control, and we use RMI for consumers to request data. This is slow and doesn't scale very well.
We need performance and scalability, which we don't have at the moment.
What we are thinking of is a layered architecture with the database at the base and access control on top of it, along with a notification bus to notify clients of changes in the database.
The main problem is the communication overhead, which we want to avoid or minimize.
Is there any magic thread that can run in two contexts (switching context) and share information that way? I know the short answer would be no, but what are the options?
Update
We are currently using Java RMI.
Our base layer will provide an API that can be used to create plugins that run on top of it. So it's not a fixed set of collectors/consumers. We can have 5-6 collectors running and the same number of consumers.
We can have up to 1000 consumers.
My first suggestion is that you should buy a book (or find an online tutorial) on building scalable applications, because you seem to be pretty lost.
Sharing a thread between processes doesn't make sense at any level - it is meaningless, but you can share the data that the thread accesses, which is probably what you want.
The fastest method will be C-based IPC (e.g., shared memory, semaphores, etc.; see shmget). You say you want to avoid the overhead of IPC, but really, it isn't going to get any faster than that.
But why do you want multiple processes? If you are worried about the overhead of communicating between processes, just have your threads in one process? There is no reason your different layers have to be in different processes.
But anyway, I am not convinced that your original statement that RMI is slow and doesn't scale is completely correct. If it is not scaling, you are probably not using the right framework. Maybe your issue is that you only have one RMI endpoint on the server. Have you considered a J2EE system with stateless session beans?
Without knowing about your requirements, it is hard to say.
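To make the "threads in one process" suggestion concrete, here is a minimal sketch in which two layers run as threads of the same JVM and hand work to each other through queues, so nothing is serialized or sent over a socket. The layer names, the queue sizes and the "row for ..." reply are invented for the example.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

final class InProcessLayersSketch {

    /** Sends one request from the calling (consumer) thread to a data-layer thread. */
    static String demo(String request) throws InterruptedException {
        BlockingQueue<String> requests = new ArrayBlockingQueue<>(16);
        BlockingQueue<String> replies = new ArrayBlockingQueue<>(16);

        // Hypothetical "data layer": takes one request and answers it.
        Thread dataLayer = new Thread(() -> {
            try {
                String query = requests.take();
                replies.put("row for " + query);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "data-layer");
        dataLayer.start();

        // Hypothetical "consumer layer": only object references change hands.
        requests.put(request);
        String reply = replies.take();
        dataLayer.join();
        return reply;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(demo("user 7"));
    }
}
```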
In general it is not possible to share a thread between two processes, because of how operating systems are designed. The problem of sharing data between two or more processes is usually solved by sharing files, sharing a database, exchanging messages (which in turn can be synchronous or asynchronous), having the processes communicate via pipes (say, on Linux), or even sharing memory. Your scenario description is not very precise; you need to describe all the processes, how information is supposed to flow, what triggers the information flow, etc.
Most likely you need a high-performance messaging library; https://github.com/real-logic/Aeron/ is one. But to get a precise answer you would need to describe better exactly what overhead you want to minimize.
If your goal is to notify users, you should consider publish/subscribe messaging (pub/sub). There are many middleware vendors out there that provide this architecture though most are expensive in production scenarios. For open source, check out http://redis.io/topics/pubsub. (No affiliation.)
My question: What approach could/should I take to communicate between two or more JVM instances that are running locally?
Some description of the problem:
I am developing a system for a project that requires separate JVM instances to isolate certain tasks from each other entirely.
While running, the 'parent' JVM will create 'child' JVMs that it expects to execute and then return results to it (in the format of relatively simple POJO classes, or perhaps structured XML data). These results should not be transferred using the System.err/System.out/System.in pipes, as the child may already use these as part of its running.
If a child JVM does not respond with results within a certain time, the parent JVM should be able to signal to the child to cease processing, or to kill the child process. Otherwise, the child JVM should exit normally at the end of completing its task.
Research so far:
I am aware there are a number of technologies that may be of use e.g....
Using Java's RMI library
Using sockets to transfer objects
Using distribution libraries such as Cajo, Hessian
...but am interested in hearing what approaches others may consider before pursuing one of these options, or any others.
Thanks for any help or advice on this!
Edits:
Quantity of data to transfer- relatively small, it will mostly be just a handful of POJOs containing strings that will represent the result of the child executing. If any solution would be inefficient on larger amounts of information, this is unlikely to be a problem in my system. The amount being transferred should be pretty static and so this does not have to be scalable.
Latency of transfer- not a critical concern in this case, although if any 'polling' of results is needed this should be able to be fairly frequent without significant overheads, so I can maintain a responsive GUI on top of this at a later time (e.g. progress bar)
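For reference, the launch, wait and kill part of the requirement above can be sketched with ProcessBuilder alone, independent of how the results are transferred. This is a minimal sketch assuming Java 11 or later; the class and method names are made up, the child's output is simply discarded here, and the two-second grace period before a forced kill is an arbitrary choice.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.List;
import java.util.OptionalInt;
import java.util.concurrent.TimeUnit;

final class ChildJvmSketch {

    /** Path of the java launcher belonging to the JVM that runs this code. */
    static String javaLauncher() {
        return Path.of(System.getProperty("java.home"), "bin", "java").toString();
    }

    /** Returns the child's exit code, or empty if it had to be stopped after the timeout. */
    static OptionalInt runWithTimeout(List<String> command, long timeout, TimeUnit unit)
            throws IOException, InterruptedException {
        Process child = new ProcessBuilder(command)
                .redirectErrorStream(true)                       // merge stderr into stdout
                .redirectOutput(ProcessBuilder.Redirect.DISCARD) // sketch: ignore the output
                .start();
        if (child.waitFor(timeout, unit)) {
            return OptionalInt.of(child.exitValue());
        }
        child.destroy();                                  // ask the child to stop
        if (!child.waitFor(2, TimeUnit.SECONDS)) {        // arbitrary grace period
            child.destroyForcibly().waitFor();            // then kill it
        }
        return OptionalInt.empty();
    }

    public static void main(String[] args) throws Exception {
        // A real parent would pass the child's main class and classpath instead.
        List<String> command = List.of(javaLauncher(), "-version");
        System.out.println(runWithTimeout(command, 60, TimeUnit.SECONDS));
    }
}
```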
Not directly an answer to your question, but a suggestion of an alternative.
Have you considered OSGi?
It lets you run Java projects in complete isolation from each other, within the SAME JVM.
The beauty of it is that communication between projects is very easy with services (see Core Specifications PDF page 123). This way there is no "serialization" of any sort being done, as the data and calls are all in the same JVM.
Furthermore, all your quality-of-service requirements (response time etc.) go away; you only have to worry about whether the service is UP or DOWN at the time you want to use it. And for that there is a really nice specification called Declarative Services (see Enterprise Spec PDF page 141).
Sorry for the off-topic answer, but I thought some other people might consider this as an alternative.
Update
To answer your question about security, I have never considered such a scenario. I don't believe there is a way to enforce "memory" usage within OSGi.
However, there is a way of communicating outside of the JVM between different OSGi runtimes. It is called Remote Services (see Enterprise Spec PDF, page 7). They also have a nice discussion there of the factors to take into consideration when doing something like that (see 13.1 Fallacies).
I think the folks at Apache Felix (an implementation of OSGi) have an implementation of this with iPOJO (their wrapper to make using services easier), called Distributed Services with iPOJO. I've never used this, so ignore me if I am wrong.
I'd use KryoNet with local sockets since it specialises heavily in serialisation and is quite lightweight (you also get Remote Method Invocation! I'm using it right now), but disable the socket disconnection timeout.
RMI basically works on the principle that you have a remote type and that the remote type implements an interface. This interface is shared. On your local machine, you bind the interface via the RMI library to code 'injected' in-memory from the RMI library, the result being that you have something that satisfies the interface but is able to communicate with the remote object.
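The description above can be sketched with the JDK's own java.rmi classes. This is a minimal sketch, not a production setup: the interface, the binding name "tasks" and port 45099 are made up, server and client run in one JVM only so the example is self-contained, and the hostname property is set to keep everything on loopback.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;

final class RmiSketch {
    static final int REGISTRY_PORT = 45099; // hypothetical free port

    /** The shared interface: both JVMs need it on their classpath. */
    public interface TaskService extends Remote {
        String run(String task) throws RemoteException;
    }

    /** The remote type, living in the "server" process. */
    static final class TaskServiceImpl implements TaskService {
        @Override
        public String run(String task) {
            return "finished " + task;
        }
    }

    static String demo(String task) throws Exception {
        // Assumption for the sketch: advertise loopback so stubs never leave the machine.
        System.setProperty("java.rmi.server.hostname", "127.0.0.1");
        TaskServiceImpl impl = new TaskServiceImpl();
        // Server side: export the object (port 0 = any free port) and publish its stub.
        TaskService stub = (TaskService) UnicastRemoteObject.exportObject(impl, 0);
        try {
            Registry registry = LocateRegistry.createRegistry(REGISTRY_PORT);
            try {
                registry.rebind("tasks", stub);
                // Client side: look the stub up by name and call it like a local object.
                Registry remoteRegistry = LocateRegistry.getRegistry("127.0.0.1", REGISTRY_PORT);
                TaskService remote = (TaskService) remoteRegistry.lookup("tasks");
                return remote.run(task);
            } finally {
                UnicastRemoteObject.unexportObject(registry, true);
            }
        } finally {
            UnicastRemoteObject.unexportObject(impl, true); // lets the JVM exit
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo("nightly-report"));
    }
}
```

The object returned by lookup is the generated stub: it satisfies TaskService, but each call is forwarded over the network to the exported TaskServiceImpl.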
Akka is another option, as are other Java actor frameworks; it provides communication and other goodies derived from the actor model.
If you can't use stdin/stdout, then I'd go with sockets. You need some sort of serialization layer on top of the sockets (as you would with stdin/stdout), and RMI is a very easy-to-use and pretty effective layer of that kind.
If you used RMI and found the performance wasn't good enough, I'd switch to some more efficient serializer; there are plenty of options.
I wouldn't go anywhere near web services or XML. That seems like a complete waste of time, likely to take more effort and deliver less performance than RMI.
Not many people seem to like RMI any longer.
Options:
Web Services. e.g. http://cxf.apache.org
JMX. Now, this is really a means of using RMI under the table, but it would work.
Other IPC protocols; you cited Hessian
Roll-your-own using sockets, or even shared memory. (Open a mapped file in the parent, open it again in the child. You'd still need something for synchronization.)
Examples of note are Apache ant (which forks all sorts of Jvms for one purpose or another), Apache maven, and the open source variant of the Tanukisoft daemonization kit.
Personally, I'm very facile with web services, so that's the hammer with which I tend to turn things into nails. A typical JAX-WS+JAX-B or JAX-RS+JAX-B service is very little code with CXF, and it manages all the data serialization and deserialization for me.
It was mentioned above, but I wanted to expand a bit on the JMX suggestion. We are actually doing pretty much exactly what you are planning to do (from what I can glean from your various comments). We landed on JMX for a variety of reasons, a few of which I'll mention here.
For one thing, JMX is all about management, so in general it is a perfect fit for what you want to do (especially if you already plan on having JMX services for other management tasks). Any effort you put into JMX interfaces will do double duty as APIs you can call using Java management tools like jvisualvm.
This leads to my next point, which is the most relevant to what you want. The new Attach API in JDK 6 and above is very sweet. It enables you to dynamically discover and communicate with running JVMs. This allows, for example, your "controller" process to crash, restart and re-find all the existing worker processes. This is the makings of a very robust system. It was mentioned above that JMX is basically RMI under the hood; however, unlike using RMI directly, you don't need to manage all the connection details (e.g. dealing with unique ports, discoverability, etc.).
The Attach API is a bit of a hidden gem in the JDK, as it isn't very well documented. When I was poking into this stuff initially, I didn't know the name of the API, so figuring out how the "magic" in jvisualvm and jconsole worked was very difficult. Finally, I came across an article like this one, which shows how to actually use the Attach API dynamically in your own program.
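As a rough idea of what such a JMX interface looks like, here is a minimal sketch of a standard MBean that exposes one attribute and one operation. The Worker names and the "example:type=Worker" object name are invented, and the call goes through the local platform MBeanServer only to keep the example self-contained; a controller in another JVM would make the same invoke and getAttribute calls on an MBeanServerConnection obtained through a JMX connector, which is not shown here.

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

final class JmxWorkerSketch {

    /** Management interface; the name must be the class name plus "MBean". */
    public interface WorkerMBean {
        int getCompletedTasks();      // exposed as the attribute "CompletedTasks"
        String runTask(String name);  // exposed as the operation "runTask"
    }

    /** Hypothetical worker that a controller process wants to drive. */
    public static final class Worker implements WorkerMBean {
        private int completed;

        @Override
        public synchronized int getCompletedTasks() {
            return completed;
        }

        @Override
        public synchronized String runTask(String name) {
            completed++;
            return "done: " + name;
        }
    }

    static String demo(String task) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        ObjectName name = new ObjectName("example:type=Worker"); // made-up object name
        server.registerMBean(new Worker(), name);
        try {
            // The same two calls exist on MBeanServerConnection for remote callers.
            Object reply = server.invoke(name, "runTask",
                    new Object[] {task}, new String[] {String.class.getName()});
            Object count = server.getAttribute(name, "CompletedTasks");
            return reply + " / completed=" + count;
        } finally {
            server.unregisterMBean(name);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo("reindex"));
    }
}
```

Once registered like this, the same attribute and operation also show up in tools such as jconsole or jvisualvm, which is the "double duty" mentioned above.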
Although it's designed for potentially remote communication between JVMs, I think you'll find that Netty works extremely well between local JVM instances as well.
It's probably the most performant / robust / widely supported library of its type for Java.
A lot is discussed above. But be it sockets, RMI or JMS, there is a lot of dirty work involved.
I would rather advise Akka. It is an actor-based model in which actors communicate with each other using messages.
The beauty is that the actors can be in the same JVM or in another one (with very little config), and Akka takes care of the rest for you. I haven't seen a cleaner way of doing this :)
Try out jGroups if the data to be communicated is not huge.
How about http://code.google.com/p/protobuf/
It is lightweight.
As you mentioned, you can obviously send the objects over the network, but that is costly, not to mention the cost of starting up a separate JVM.
Another approach if you just want to separate your different worlds inside one JVM is to load the classes with different classloaders. ClassA#CL1!=ClassA#CL2 if they are loaded by CL1 and CL2 as sibling classloaders.
To enable communications between classA#CL1 and classA#CL2 you could have three classloaders.
CL1 that loads process1
CL2 that loads process2 (same classes as in CL1)
CL3 that loads communication classes (POJOs and Service).
Now you let CL3 be the parent classloader of CL1 and CL2.
In classes loaded by CL3 you can have lightweight send/receive functionality (send(Pojo)/receive(Pojo)) that passes the POJOs between classes in CL1 and classes in CL2.
In CL3 you expose a static service that lets implementations from CL1 and CL2 register to send and receive the POJOs.
In Delphi, I am trying to call a function from an external Java program. Is there any way to do it?
The standard process to call native code is via JNI. A search on JNI and Delphi will reveal multiple pages that detail how this is done, like this and this
Which is more desirable, setting up some out-of-process server (Peter already detailed that, so I skipped it) or using JNI to call a library, depends on how often (and how close to real time) you need this to be, and on the allowable installation/configuration complexity.
If it is a running Java application you will need to expose access to that function. There are a myriad of solutions possible.
If it is only 1 function or very limited functionality, then listening on the humble socket or named pipe is a solution which is currently undervalued and kind of forgotten.
The next step of integration I would look at is asynchronous message passing. It is easy to embed an ActiveMQ server or similar, or to start it in a separate process. This has a number of advantages: the requests are easily synchronized in the Java process by simply using one listening thread, and the behavior is well defined when either the Java program or the Delphi one is not available. It is very easy to manage and you get the instrumentation for free.
An embedded Jetty web server with a servlet that does your bidding is an easy, reliable solution. Again, a lot of the complexity is handled by using ubiquitous and standard protocols.
Then there are the synchronous RPC methods like COM, Corba, SOAP which I personally find much too complex, error-prone and maintenance unfriendly to use for ad-hoc communication between processes. If you want to build a complete infrastructure of stuff talking to each other it might be worth it, but not to get 2 programs talking.
I can create multiple threads for supporting multi-client feature in socket programming; that's working fine. But if 10,000 clients want to be connected, my server cannot create so many threads.
How can I manage the threads so that I can listen to all these clients simultaneously?
Also, if in this case the server wants to send something to a particular client, then how is it possible?
You should investigate Java's NIO ("New I/O") library for non-blocking network programming. NIO was designed to solve precisely the server scalability problem you are facing!
Introductory article about NIO: Building Highly Scalable Servers with Java NIO
Excerpts from O'Reilly's Java NIO book
Highly scalable socket programming in Java requires the selectable channels provided in the "New I/O", or NIO packages. By using non-blocking IO, a single thread can service many sockets, tending only to those sockets that are ready.
One of the more scalable open-source NIO applications is the Grizzly component of the Glassfish application server. Jean-Francois Arcand has written a number of informative, in-depth blog posts about his work on the project, and covers many subtle pitfalls in writing this kind of software with NIO.
If the concept of non-blocking IO is new to you, using existing software like Grizzly, or at least using it as a starting point for your adaptation, might be very helpful.
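If you want to see the bare mechanism before picking a framework, here is a minimal sketch of a single-threaded selector loop that echoes bytes back to whichever sockets are ready. The class name, the 1024-byte buffer and the echo behaviour are made up for the example; real servers also need per-connection buffers and OP_WRITE handling, which is where the pitfalls mentioned above start.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;
import java.util.concurrent.atomic.AtomicBoolean;

final class SelectorEchoSketch {

    /** One thread, one selector, any number of client sockets. */
    static void serve(Selector selector, ServerSocketChannel server, AtomicBoolean running)
            throws IOException {
        server.configureBlocking(false);
        server.register(selector, SelectionKey.OP_ACCEPT);
        ByteBuffer buffer = ByteBuffer.allocate(1024);
        while (running.get()) {
            selector.select(); // blocks until a channel is ready or wakeup() is called
            Iterator<SelectionKey> ready = selector.selectedKeys().iterator();
            while (ready.hasNext()) {
                SelectionKey key = ready.next();
                ready.remove();
                if (key.isAcceptable()) {
                    SocketChannel client = server.accept();
                    if (client != null) {
                        client.configureBlocking(false);
                        client.register(selector, SelectionKey.OP_READ);
                    }
                } else if (key.isReadable()) {
                    SocketChannel client = (SocketChannel) key.channel();
                    buffer.clear();
                    if (client.read(buffer) < 0) {
                        client.close(); // peer hung up; closing also cancels the key
                        continue;
                    }
                    buffer.flip();
                    while (buffer.hasRemaining()) {
                        client.write(buffer); // sketch only: spins if the send buffer is full
                    }
                }
            }
        }
        for (SelectionKey key : selector.keys()) { // close any clients still connected
            if (key.channel() != server) key.channel().close();
        }
    }

    /** Starts the loop, sends one message from a blocking client, returns the echo. */
    static String demo(String message) throws Exception {
        AtomicBoolean running = new AtomicBoolean(true);
        try (Selector selector = Selector.open();
             ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            InetSocketAddress address = (InetSocketAddress) server.getLocalAddress();
            Thread loop = new Thread(() -> {
                try {
                    serve(selector, server, running);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            }, "selector-loop");
            loop.start();
            try (SocketChannel client = SocketChannel.open(address)) {
                byte[] payload = message.getBytes(StandardCharsets.UTF_8);
                client.write(ByteBuffer.wrap(payload));
                ByteBuffer reply = ByteBuffer.allocate(payload.length);
                while (reply.hasRemaining() && client.read(reply) >= 0) {
                    // keep reading until the whole echo has arrived
                }
                return new String(reply.array(), 0, reply.position(), StandardCharsets.UTF_8);
            } finally {
                running.set(false);
                selector.wakeup(); // unblock select() so the loop can see the flag
                loop.join();
            }
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo("hello"));
    }
}
```

To send something to one particular client, the server keeps a reference to that client's SocketChannel (for example in a map keyed by a client id) and writes to it from the loop.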
The benefits of NIO are debatable. See Paul Tyma's blog entries here and here.
A thread-per-connection threading model (Blocking Socket I/O) will not scale too well. Here's an introduction to Java NIO which will allow you to use non-blocking socket calls in java:
http://today.java.net/cs/user/print/a/350
As the article states, there are plenty of frameworks available so you don't have to roll your own.
As previously mentioned, 10,000 clients is not easy. For Java, NIO (possibly augmented with a separate thread pool to handle each request without blocking the NIO thread) is the usual way to handle a large number of clients.
As mentioned, depending on implementation, threads might actually scale, but it depends a lot on how much interaction there is between client connections. Massive threads are more likely to work if there is little synchronization between the threads.
That said, NIO is notoriously difficult to get 100% right the first time you implement it.
I'd recommend either trying out, or at least looking at the source for the Naga NIO lib at naga.googlecode.com. The codebase for the lib is small compared to most other NIO frameworks. You should be able to quickly implement a test to see if you can get 10.000 clients up and running.
(The Naga source also happens to be free to modify or copy without attributing the original author)
This is not a simple question, but for a very in depth (sorry, not in java though) answer see this: http://www.kegel.com/c10k.html
EDIT
Even with nio, this is still a difficult problem. 10000 connections is a tremendous resource burden on the machine, even if you are using non-blocking sockets. This is why large web sites have server farms and load balancers.
Why don't you process only a certain number of requests at a time?
Let's say you want to process a maximum of 50 requests at a time (to avoid creating too many threads).
You create a thread pool of 50 threads.
You put all the requests in a queue (accept connections, keep sockets open), and each thread, when it is done, gets the next request and processes it.
This should scale more easily.
Also, if the need arises, it will be easier to do load balancing, since you could share your queues between multiple servers.
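The idea above can be sketched with a fixed thread pool, whose internal work queue plays the role of the request queue. This is a minimal sketch: the one-line request/reply protocol and the class name are made up, and the demo clients run in the same JVM only to make it self-contained.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class PooledServerSketch {
    static final int WORKERS = 50; // at most 50 requests are processed at a time

    /** Handles one client: read a line, answer with a line, close the socket. */
    static void handle(Socket socket) {
        try (Socket s = socket;
             BufferedReader in = new BufferedReader(new InputStreamReader(s.getInputStream()));
             PrintWriter out = new PrintWriter(s.getOutputStream(), true)) {
            out.println("handled: " + in.readLine());
        } catch (IOException e) {
            System.err.println("client failed: " + e.getMessage());
        }
    }

    /** Accepts connections and queues them; extra clients wait in the pool's queue. */
    static void acceptLoop(ServerSocket server, ExecutorService pool) {
        try {
            while (true) {
                Socket client = server.accept();
                pool.execute(() -> handle(client));
            }
        } catch (IOException closed) {
            // accept() fails once the server socket is closed: normal shutdown
        }
    }

    /** Starts the server, sends the given number of requests, returns the replies. */
    static List<String> demo(int clients) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(WORKERS);
        try (ServerSocket server = new ServerSocket(0, 50, InetAddress.getLoopbackAddress())) {
            new Thread(() -> acceptLoop(server, pool), "acceptor").start();
            List<String> replies = new ArrayList<>();
            for (int i = 0; i < clients; i++) {
                try (Socket c = new Socket(InetAddress.getLoopbackAddress(), server.getLocalPort());
                     PrintWriter out = new PrintWriter(c.getOutputStream(), true);
                     BufferedReader in = new BufferedReader(new InputStreamReader(c.getInputStream()))) {
                    out.println("request " + i);
                    replies.add(in.readLine());
                }
            }
            return replies;
        } finally {
            pool.shutdown(); // runs after the server socket has been closed
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo(3));
    }
}
```

Note that a worker is tied up for as long as its client stays connected, so this fits short request/reply exchanges better than 10,000 long-lived connections.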
Personally I would rather create a custom non-blocking I/O setup, for example using one thread to accept clients and one other thread to process them (checking if any input is available and writing data to the output if necessary).
You'll have to figure out why your application is failing at 10,000 threads.
Is there a hard limit to the number of threads in the JVM or the OS? If so, can it be lifted?
Are you running out of memory? Try configuring a smaller stack size per thread, and/or add more memory to the server.
Something else? Fix it.
Only once you have determined the source of the problem will you be able to fix it. In theory 10,000 threads should be OK but at that level of concurrency it requires some extra tuning of the JVM and operating system if you want it to work out.
You can also consider NIO but I think it can work fine with threads as well.
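As a small illustration of the stack-size suggestion: the four-argument Thread constructor takes a requested stack size in bytes, which the JVM treats as a hint and may round up or ignore (the JVM-wide setting is -Xss). A minimal sketch, with a made-up class name and thread names, that starts a batch of idle "client" threads and counts how many are alive at once:

```java
import java.util.concurrent.CountDownLatch;

final class SmallStackThreadsSketch {

    /** Starts the given number of idle threads and returns how many were alive together. */
    static int demo(int threads, long stackBytes) throws InterruptedException {
        CountDownLatch allStarted = new CountDownLatch(threads);
        CountDownLatch release = new CountDownLatch(1);
        Thread[] clients = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            Runnable idleClient = () -> {
                allStarted.countDown();
                try {
                    release.await(); // stands in for a client blocked on socket I/O
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            };
            // stackBytes is only a hint; the platform may enforce its own minimum.
            clients[i] = new Thread(null, idleClient, "client-" + i, stackBytes);
            clients[i].start();
        }
        allStarted.await();
        int alive = 0;
        for (Thread t : clients) {
            if (t.isAlive()) alive++;
        }
        release.countDown();
        for (Thread t : clients) {
            t.join();
        }
        return alive;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(demo(200, 256 * 1024));
    }
}
```

Raising the thread count in a test like this until it fails is one way to find out which of the limits listed above you are actually hitting.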
At what point is it better to switch from java.net to java.nio? .net (not the Microsoft entity) is easier to understand and more familiar, while nio is scalable, and comes with some extra nifty features.
Specifically, I need to make a choice for this situation: we have one control center managing hardware at several remote sites, each site with one computer managing multiple hardware units (a transceiver, TNC, and rotator). My idea was to write a server app on each machine that acts as a gateway from the control center to the radio hardware, with one socket for each unit. From my understanding, NIO is meant for one server, many clients, but what I'm thinking of is one client, many servers.
I suppose a third option is to use MINA, but I'm not sure if that's throwing too much at a simple problem.
Each remote server will have up to 8 connections, all from the same client (to control all the hardware, and separate TX/RX sockets). The single client will want to connect to several servers at once, though. Instead of putting each server on different ports, is it possible to use channel selectors on the client side, or is it better to go multi-threaded io on the client side of things and configure the servers differently?
Actually, since the remote machines serve only to interact with other hardware, would RMI or IDL/CORBA be a better solution? Really, I just want to be able to send commands and receive telemetry from the hardware, and not have to make up some application layer protocol to do it.
Avoid NIO unless you have a good reason to use it. It's not much fun and may not be as beneficial as you would think. You may get better scalability once you are dealing with tens of thousands of connections, but at lower numbers you'll probably get better throughput with blocking IO. As always though, make your own measurements before committing to something you might regret.
Something else to consider is that if you want to use SSL, NIO makes it extremely painful.
Scalability will probably drive your choice of package. java.net will require one thread per socket. Coding it will be significantly easier. java.nio is much more efficient, but can be hairy to code around.
I would ask yourself how many connections you expect to be handling. If it's relatively few (say, < 100), I'd go with java.net.
There is almost no reason to write this kind of networking code from scratch now. Packages like netty.io will almost always get you more reliable and flexible code, with fewer lines, than a hand-crafted solution will. Also, with Netty, you can get SSL support without complicating your implementation at all. Libraries like Netty also obviate the "async vs threads" question almost entirely, give you good performance, and still let you tweak the threading model as needed.
The number of connections you're talking about tells me you should use java.net. Really, there's no reason to complexify your task with non-blocking I/O. (Unless your remote systems are underpowered, but then why are you using Java on them?)
Take a look at Apache's XML-RPC package. It's easy to use, completely hides the network stuff from you, and works over good ol' HTTP. No need to worry about protocol issues ... it'll all look like method calls to you, on both ends.
Given the small number of connections involved, java.net sounds like the right solution.
Other posters talked about using XML-RPC. This is a good choice if the volumes of data being shipped are small, however I have had bad experiences with XML-based protocols when writing inter-process communications that ship large amounts of data (e.g. large request/responses, or frequent small amounts of data). The cost of XML parsing is typically orders of magnitude higher than more optimised wire formats (e.g. ASN.1).
For low volume control applications the simplicity of XML-RPC should outweigh the performance costs. For high volume data communications it may be better to use a more efficient wire protocol.