I have 2 processes that need to communicate over the same PC and different PCs. In the local case the process communication is among different processes e.g Process A and Process B.
In the remote case it will be among 2 instances of Process A running in different PCs.
I will create them from scratch and I am wondering what is the best approach. I am aware of RMI and sockets but I was wondering for my case as described, and taking also into account that the messages exchanged are small and the number of APIs really small, if there is a standard approach/library for this.
Any suggesstions are highly welcome
Update after #EJP comments:
My interest is 1)to implement the requirement for communication in a light manner since the API exposed will be really small and the messages as well 2)use and learn a new popular framework if possible (I already know RMI and sockets)
If you are just looking for messaging frameworks, there's a bunch available out such as
RabbitMQ - http://www.rabbitmq.com/
ZeroC Ice - http://www.zeroc.com/ice.html
AMQP - http://www.amqp.org
OpenSplice DDS - http://www.prismtech.com/opensplice
But when you use a 3rd party framework, you are then adding an additional dependency to your application. If it is something very simple like your case, perhaps writing a TCP client/server would be sufficient for a client/server paradigm or if you are looking for publisher/subscriber paradigm then you can look into using UDP multicast. You just need your data class to extends Serializable if you want to be able to marshal and unmarshal your data to buffer and send it over to network using typical JAVA socket API.
I strongly suggest having a look at Thrift. From all the technologies I've used (web services, RMI, XML-RPC, Corba comes to mind) it is currently my favourite. Essentially the steps involved are:
Download the Thrift compiler.
Add the Maven dependency (make sure it is the same version as the compiler!) I currently use 0.8.0.
Write your Thrift IDL (incredibly easy, google for it as there are plenty of examples).
Compile it for Java.
Writer your server/client.
In general, you can whip together a server and a client in about 30 lines of code. In terms of speed and reliability it has never failed me before.
You might have a look at Versile Java (full disclosure: I am one of the developers), it satisfies at least your criteria #1. From the API documentation, here are some examples of writing remote-enabled objects, running a service, and connecting to a service.
If you want to learn something new then I'd look at OpenSplice. The reason is pretty simple, among the technologies suggested above is the only one that provides you with Data-Centric abstractions.
The cool thing about OpenSplice is that gives you the abstraction of a Global Data Space, yet the implementation of this global data space is fully distributed and very high performance.
Take a look at some of the slides available at http://www.slideshare.net/angelo.corsaro and I am sure you'll get in love with the technology.
Finally OpenSplice is Open Source.
Happy Hacking.
A+
JMX is a good alternative .
Example :
http://www.javalobby.org/java/forums/t49130.html
IMB JMX Example
http://alvinalexander.com/blog/post/java/source-code-java-jmx-hello-world-application
Related
I have the following situation:
I have 2 JVM processes (really 2 java processes running separately, not 2 threads) running on a local machine. Let's call them ProcessA an ProcessB.
I want them to communicate (exchange data) with one another (e.g. ProcessA sends a message to ProcessB to do something).
Now, I work around this issue by writing a temporary file and these process periodically scan this file to get message. I think this solution is not so good.
What would be a better alternative to achieve what I want?
Multiple options for IPC:
Socket-Based (Bare-Bones) Networking
not necessarily hard, but:
might be verbose for not much,
might offer more surface for bugs, as you write more code.
you could rely on existing frameworks, like Netty
RMI
Technically, that's also network communication, but that's transparent for you.
Fully-fledged Message Passing Architectures
usually built on either RMI or network communications as well, but with support for complicated conversations and workflows
might be too heavy-weight for something simple
frameworks like ActiveMQ or JBoss Messaging
Java Management Extensions (JMX)
more meant for JVM management and monitoring, but could help to implement what you want if you mostly want to have one process query another for data, or send it some request for an action, if they aren't too complex
also works over RMI (amongst other possible protocols)
not so simple to wrap your head around at first, but actually rather simple to use
File-sharing / File-locking
that's what you're doing right now
it's doable, but comes with a lot of problems to handle
Signals
You can simply send signals to your other project
However, it's fairly limited and requires you to implement a translation layer (it is doable, though, but a rather crazy idea to toy with than anything serious.
Without more details, a bare-bone network-based IPC approach seems the best, as it's the:
most extensible (in terms of adding new features and workflows to your
most lightweight (in terms of memory footprint for your app)
most simple (in terms of design)
most educative (in terms of learning how to implement IPC). (as you mentioned "socket is hard" in a comment, and it really is not and should be something you work on)
That being said, based on your example (simply requesting the other process to do an action), JMX could also be good enough for you.
I've added a library on github called Mappedbus (http://github.com/caplogic/mappedbus) which enable two (or many more) Java processes/JVMs to communicate by exchanging messages. The library uses a memory mapped file and makes use of fetch-and-add and volatile read/writes to synchronize the different readers and writers. I've measured the throughput between two processes using this library to 40 million messages/s with an average latency of 25 ns for reading/writing a single message.
What you are looking for is inter-process communication. Java provides a simple IPC framework in the form of Java RMI API. There are several other mechanisms for inter-process communication such as pipes, sockets, message queues (these are all concepts, obviously, so there are frameworks that implement these).
I think in your case Java RMI or a simple custom socket implementation should suffice.
Sockets with DataInput(Output)Stream, to send java objects back and forth. This is easier than using disk file, and much easier than Netty.
I tend to use jGroup to form local clusters between processes. It works for nodes (aka processes) on the same machine, within the same JVM or even across different servers.
Once you understand the basics it is easy working with it and having the options to actually run two or more processes in the same JVM makes it easy to test those processes easily.
The overhead and latency is minimal if both are on the same machine (usually only a TCP rountrip of about >100ns per action).
socket may be a better choice, I think.
Back in 2004 I implement code which do the job with sockets. Until then, many times I search for a better solution, because socket approach triggers firewall and my clients worry. There is no better solution until now. Client must serialize your data, send and server must receive and unserialize.
It is easy.
I need to interact with a server over TCP/IP with a basic message/response protocol (so for each request I shall receive a defined response).
In the JVM ecosystem, I think Java Socket was the tool to use 15 years ago, but I'm wondering if there is anything more suitable nowadays in the JDK? For example, with Java Sockets one still needs to manually timeout a request if no answer is received, which feels really old fashioned.
So is there anything new in the JDK or JVM universe?
No, there are much better option nowadays which allow you to implement your client/server asynchronously without additional threading.
Look at SocketChannel from nio or even better AsynchronousSocketChannel from nio2. Check this tutorial
Especially the latter option will allow you to just start the connection listener and register callback which will be called whenever new connection is requested, data arrived, data was written etc.
Also consider looking at some high level solutions like Netty. It will take care of the network core, distribute load evenly to executors. Additionally it provides clears separation of the network layer and processing layer. For the processing layer it will provide you with lot of reusable codecs for common protocols.
You can try RMI which works on top of TCP/IP but hides all the hardwork with a convenient set of APIs.
Follow this tutorial post
Well, there are really a lot of other technologies to use, for example JMS has various implementations which work out of the box.
Sockets are low-level building blocks of network communications, like wires in the electricity network of your house. Yes, they're old fashioned, yes, we likely don't want to see them, but they're there and they will stay there for a good reason.
On top of Sockets, you can e.g. pick the HTTPUrlConnection, which implements most of the HTTP protocol. Still, setting timeout policies are in your hands, which I find quite useful, and extremelly painful at the same time.
http://www.mkyong.com/java/how-to-send-http-request-getpost-in-java/
You are free to move one abstraction level above, and use a ready-made REST library, such as this: http://unirest.io/java.html
The example above connects to a server, configures a HTTP query string, perform the request (timeout, encodings, all the mess under the hood), and finally get the response in Json format in a few lines:
Unirest.post("http://httpbin.org/post")
.queryString("name", "Mark")
.field("last", "Polo")
.asJson();
Nowadays a vast amount of web services are available using REST protocol, which is a simple command-response over HTTP. If you have a chance, I'd suggest using REST, as you can easily find available client and server side implementations, and you don't need to reinvent the wheel on the command-protocol layer either.
On client side, unirest is quite convenient. On the server side, we have had really great experience in the 1.2.xx series of Play! framework. But there are thousands of these things out there, just search for "REST".
Take a look to Netty Project, "Netty is a NIO client server framework which enables quick and easy development of network applications such as protocol servers and clients. It greatly simplifies and streamlines network programming such as TCP and UDP socket server."
This framework give us a lot of capabilities that simplify the programming process, allowing a big scalability.
Is used by Twitter and a lot of big companies in the tecnology industry.
This is a nice presentation from Norman Maurer.
Our frontend is simple Jetty (might be replaced with Tomcat later on) server. Through servlets, we are providing a public HTTP API (more or less RESTful) to expose our product functionality.
In the backend, we have a Java process which does several kind of maintenance tasks. While the backend process usually does it own tasks when it's time, now and then, the frontend needs to wake-up the backend to execute a certain task in the background.
Which (N)IO library would be ideal for this task? I found Netty, Grizzly, kryonet and plain RMI. For now, I am inclined to say Netty, it seems simple to use and it is probably very reliable.
Does any of you have experience in this kind of setups? What would your choice be?
thanks!
Try to translate this document which answer to your question.
http://blog.xebia.fr/2011/11/09/java-nio-et-framework-web-haute-performance/
This society, as french famous Java EE experts, did a lot of poc of NIO servers in the context of a french challenge sponsored by VmWare (USI2011). It was about building a simple quizz app that can handle a load of 1 million connected users.
They won that challenge with great results.
Their implementation was Netty + Gemfire and they only replaced the CachedThreadPool by a MemoryAwareThreadPool.
Netty seems to offer great performances, and is well documented.
They also considered Deft, inspired by Tornado (python/facebook) but it's still a bit immature for them
Edit: here's the translated link provided in the comments
My preference is Netty. It's simple yet flexible. Very fast and the community around Netty is awesome.
The company I work for is currently evaluating CoralReactor. It is a commercial software but it has the easiest API I have ever seen for Java NIO. My personal opinion is that Netty makes things too complicated, especially if you want to go garbage-free and single-threaded, which are a requirement for many companies from the finance, advertisement and game industry.
I would decouple them by using JMS, just have some (set of) control queues your backend sits there listening on and you're done. No need to write a custom nio api here.
One sample provider is hornetq. This can be run as an in process jms broker as well, it uses Netty under the covers.
My question: What approach could/should I take to communicate between two or more JVM instances that are running locally?
Some description of the problem:
I am developing a system for a project that requires separate JVM instances to isolate certain tasks from each other entirely.
In it's running, the 'parent' JVM will create 'child' JVMs that it will expect to execute and then return results to it (in the format of relatively simple POJO classes, or perhaps structured XML data). These results should not be transferred using the SysErr/SysOut/SysIn pipes as the child may already use these as part of its running.
If a child JVM does not respond with results within a certain time, the parent JVM should be able to signal to the child to cease processing, or to kill the child process. Otherwise, the child JVM should exit normally at the end of completing its task.
Research so far:
I am aware there are a number of technologies that may be of use e.g....
Using Java's RMI library
Using sockets to transfer objects
Using distribution libraries such as Cajo, Hessian
...but am interested in hearing what approaches others may consider before pursuing one of these options, or any others.
Thanks for any help or advice on this!
Edits:
Quantity of data to transfer- relatively small, it will mostly be just a handful of POJOs containing strings that will represent the result of the child executing. If any solution would be inefficient on larger amounts of information, this is unlikely to be a problem in my system. The amount being transferred should be pretty static and so this does not have to be scalable.
Latency of transfer- not a critical concern in this case, although if any 'polling' of results is needed this should be able to be fairly frequent without significant overheads, so I can maintain a responsive GUI on top of this at a later time (e.g. progress bar)
Not directly an answer to your question, but a suggestion of an alternative.
Have you considered OSGI?
It lets you run java projects in complete isolation from each other, within the SAME jvm.
The beauty of it is that communication between projects is very easy with services (see Core Specifications PDF page 123). This way there is not "serialization" of any sort being done as the data and calls are all in the same jvm.
Furthermore all your requirements of quality of service (response time etc...) go away - you only have to worry about whether the service is UP or DOWN at the time you want to use it. And for that you have a really nice specification that does that for you called Declarative Services (See Enterprise Spec PDF page 141)
Sorry for the off-topic answer, but I thought some other people might consider this as an alternative.
Update
To answer your question about security, I have never considered such a scenario. I don't believe there is a way to enforce "memory" usage within OSGI.
However there is a way of communicating outside of JVM between different OSGI runtimes. It is called Remote Services (see Enterprise Spec PDF, page 7). They also have nice discussion there of the factors to take into consideration when doing something like that (see 13.1 Fallacies).
Folks at Apache Felix (implementation of OSGI) I think have implementation of this with iPOJO, called Distributed Services with iPOJO (their wrapper to make using services easier). I've never used this - so ignore me if I am wrong.
I'd use KryoNet with local sockets since it specialises heavily in serialisation and is quite lightweight (you also get Remote Method Invocation! I'm using it right now), but disable the socket disconnection timeout.
RMI basically works on the principle that you have a remote type and that the remote type implements an interface. This interface is shared. On your local machine, you bind the interface via the RMI library to code 'injected' in-memory from the RMI library, the result being that you have something that satisfies the interface but is able to communicate with the remote object.
akka is another option, as well as other java actor frameworks, it provides communication and other goodies derived from the actor model.
If you can't use stdin/stdout, then i'd go with sockets. You need some sort of serialization layer on top of the sockets (as you would with stdin/stdout), and RMI is a very easy to use and pretty effective such layer.
If you used RMI and found the performance wasn't good enough, i'd switch to some more efficient serializer - there are plenty of options.
I wouldn't go anywhere near web services or XML. That seems like a complete waste of time, likely take more effort and deliver less performance than RMI.
Not many people seem to like RMI any longer.
Options:
Web Services. e.g. http://cxf.apache.org
JMX. Now, this is really a means of using RMI under the table, but it would work.
Other IPC protocols; you cited Hessian
Roll-your-own using sockets, or even shared memory. (Open a mapped file in the parent, open it again in the child. You'd still need something for synchronization.)
Examples of note are Apache ant (which forks all sorts of Jvms for one purpose or another), Apache maven, and the open source variant of the Tanukisoft daemonization kit.
Personally, I'm very facile with web services, so that's the hammer which which I tend to turn things into nails. A typical JAX-WS+JAX-B or JAX-RS+JAX-B service is very little code with CXF, and manages all the data serialization and deserialization for me.
It was mentioned above, but i wanted to expand a bit on the JMX suggestion. we actually are doing pretty much exactly what you are planning to do (from what i can glean from your various comments). we landed on using jmx for a variety of reasons, a few of which i'll mention here. for one thing, jmx is all about management, so in general it is a perfect fit for what you want to do (especially if you already plan on having jmx services for other management tasks). any effort you put into jmx interfaces will do double duty as apis you can call using java management tools like jvisualvm. this leads to my next point, which is the most relevant to what you want. the new Attach API in jdk 6 and above is very sweet. it enables you to dynamically discover and communicate with running jvms. this allows, for example, for your "controller" process to crash and restart and re-find all the existing worker processes. this is the makings of a very robust system. it was mentioned above that jmx basically rmi under the hood, however, unlike using rmi directly, you don't need to manage all the connection details (e.g. dealing with unique ports, discoverability, etc). the attach api is a bit of a hidden gem in the jdk, as it isn't very well documented. when i was poking into this stuff initially, i didn't know the name of the api, so figuring how the "magic" in jvisualvm and jconsole worked was very difficult. finally, i came across an article like this one, which shows how to actually use the attach api dynamically in your own program.
Although it's designed for potentially remote communication between JVMs, I think you'll find that Netty works extremely well between local JVM instances as well.
It's probably the most performant / robust / widely supported library of its type for Java.
A lot is discussed above. But be it sockets, rmi, jms - there is a lof of dirty work involved.
I would ratter advice akka. It is a actor based model which communicate with each other using Messages.
The beauty is, the actors can be on same JVM or another (very little config) and akka takes care the rest for you. I haven't seen a more cleaner way than doing this :)
Try out jGroups if the data to be communicated is not huge.
How about http://code.google.com/p/protobuf/
It is lightweight.
As you mentioned you can obviously send the objects over the network but that is a costly thing not to mention start up a separate JVM.
Another approach if you just want to separate your different worlds inside one JVM is to load the classes with different classloaders. ClassA#CL1!=ClassA#CL2 if they are loaded by CL1 and CL2 as sibling classloaders.
To enable communications between classA#CL1 and classA#CL2 you could have three classloaders.
CL1 that loads process1
CL2 that loads process2 (same classes as in CL1)
CL3 that loads communication classes (POJOs and Service).
Now you let CL3 be the parent classloader of CL1 and CL2.
In classes loaded by CL3 you can have a light-weight communication send/receive functionality (send(Pojo)/receive(Pojo)) the POJOs between classes in CL1 and classes in CL2.
In CL3 you expose a static service that enables implementations from CL1 and CL2 register to send and receive the POJOs.
They both provide roughly the same functionality. Which one should I choose to develop my high-performance TCP server? What are the pros & cons?
Reference links:
Apache MINA (source)
Netty (source)
While MINA and Netty have similar ambitions, they are quite different in practice and you should consider your choice carefully. We were lucky in that we had lots of experience with MINA and had the time to play around with Netty. We especially liked the cleaner API and much better documentation. Performance seemed better on paper too. More importantly we knew that Trustin Lee would be on hand to answer any questions we had, and he certainly did that.
We found everything easier in Netty. Period. While we were trying to reimplement the same functionality we already had on MINA, we did so from scratch. By following the excellent documentation and examples we ended up with more functionality in much, much less code.
The Netty Pipeline worked better for us. It is somehow simpler than MINAs, where everything is a handler and it is up to you to decide whether to handle upstream events, downstream events, both or consume more low-level stuff. Gobbling bytes in "replaying" decoders was almost a pleasure. It was also very nice to be able to reconfigure the pipeline on-the-fly so easily.
But the star attraction of Netty, imho, is the ability to create pipeline handlers with a "coverage of one". You've probably read about this coverage annotation already in the documentation, but essentially it gives you state in a single line of code. With no messing around, no session maps, synchronization and stuff like that, we were simply able to declare regular variables (say, "username") and use them.
But then we hit a roadblock. We already had a multi-protocol server under MINA, in which our application protocol ran over TCP/IP, HTTP and UDP. When we switched to Netty we added SSL and HTTPS to the list in a matter of minutes! So far so good, but when it came to UDP we realised that we had slipped up. MINA was very nice to us in that we could treat UDP as a "connected" protocol. Under Netty there is no such abstraction. UDP is connectionless and Netty treats it as such. Netty exposes more of the connectionless nature of UDP at a lower level than MINA does. There are things you can do with UDP under Netty than you can't under the higher-level abstraction that MINA provides, but on which we relied.
It is not so simple to add a "connected UDP" wrapper or something. Given time constraints and on Trustin's advice that the best way to proceed was to implement our own transport provider in Netty, which would not be quick, we had to abandon Netty in the end.
So, look hard at the differences between them and quickly get to a stage where you can test any tricky functionality is working as expected. If you're satisfied that Netty will do the job, then I wouldn't hesitate to go with it over MINA. If you're moving from MINA to Netty then the same applies, but it is worth noting that the two APIs really are significantly different and you should consider a virtual rewrite for Netty - you won't regret it!
Update: Just use Netty. It is now a mature project with all the bells and whistles required for building protocol clients and servers. It has strong community with several active contributors backed by enterprises. It also has a book, 'Netty in Action'. It has been adopted by many many well-known companies and projects already. Unlike Netty, Apache MINA has been under maintenance mode since I left the project.
MINA has more out-of-the-box features at the cost of complexity and relatively poor performance. Some of those features were integrated into the core too tightly to be removed even if they are not needed by a user. In Netty, I tried to address such design issues while retaining the known strengths of MINA.
Currently, most features available in MINA are also available in Netty. In my opinion, Netty has cleaner and more documented API since Netty is the result of trying to rebuild MINA from scratch and address the known issues. If you find that an essential feature is missing, please feel free to post your suggestion to the forum. I'd be glad to address your concern.
It is also important to note that Netty has faster development cycle. Simply, check out the release date of the recent releases. Also, you should consider that MINA team will proceed to a major rewrite, MINA 3, which means they will break the API compatibility completely.
I performance tested 2 "Google Protobuffer RPC" implementations where one was based on Netty (netty-protobuf-rpc) and the other based on mina (protobuf-mina-rpc). Netty ended up being consistently faster ( +- 10% ) for all message sizes - which backs up the overall performance claim on the Netty web site. Since you want to squeeze every bit of efficiency out of your code when you use such an RPC library, i ended up writing protobuf-rpc-pro based on Netty. I have used MINA in the past, but find their documentation of the 2.0 stuff has big holes, and the breaking of API backward compatibility a big minus.
MINA and Netty were initially designed and build by the same author. That's why they are so similiar to each other.
MINA is designed at a slightly higher level with a little more features, while Netty is a little faster.
I think that there's not much difference at all, the basic concepts are the same.
In Netty site you can find some performance reports. As expected :-) they point out Netty as the framework with the best performance.
I never used Netty, but I already used MINA to implement a TCP protocol. The implementation of encoding and decoding was easy, but the implementation of the state machine was not so easy. MINA provides some classes to aid you when implementing the state machine, but I found them kind of hard to use. In the end we decided to ditch MINA and implement the protocol from scratch, and surprisingly we ended with a faster server.
I use Netty.
Twitter also chose Netty to build its new Search System and made it 3x faster.
Ref: Twitter Search is Now 3x Faster
We chose Netty over some of its other competitors, like Mina and Jetty, because it has a cleaner API, better documentation and, more importantly, because several other projects at Twitter are using this framework.
I've only ever used MINA to build a small http like server. The biggest problems I've run into with it so far:
It will hold your "request" and "response" in memory. This is only an issue because the protocol I choose to use is http. You could use your own protocol however to get around this.
No option to provide a stream off disk in case you want to serve up large files. Again can be worked around by implementing your own protocol
Nice things about it:
Can handle a lot of connections
If you choose to implement some sort of distributed work system then knowing when one of your nodes goes down and loses connection is useful for restarting the work on another node.