I wonder how is the best way to integrate Java modules developed as separate J(2)EE applications. Each of those modules exposes Java interfaces. The POJO entities (Hibernate) are being used along with those Java interfaces, there is no DTO objects. What would be the best way to integrate those modules i.e. one module calling the other module interface remotely?
I was thinking about: EJB3, Hessian, SOAP, JMS. there are pros and cons of each of the approaches.
Folks, what is your opinion or your experiences?
Having dabbled with a few of the remoting technologies and found them universally unfun I would now use Spring remoting as an abstraction from the implementation.
It allows you to concentrate on writing your functionality and let Spring handle the remote part with some config. you have the choice of several implementations (RMI, Spring's HTTP invoker, Hessian, Burlap and JMS). The abstraction means you can pick one implementation and simply swap it if your needs change.
See the SpringSource docs for more information.
The standard approach would be to use plain RMI between the various service components but this brings issues of sharing your Java interfaces and versioning changes to your domain model especially if you have lots of components using the same classes.
Are you really running each service in a separate VM? If these EJBs are always talking to each other then you're best off putting them into the same VM and avoiding any remote procedure calls as these services can use their LocalInterfaces.
The other thing that may bite you is using Hibernate POJOs. You may think that these are simple POJOs but behind the scenes Hibernate has been busy with CGLib trying to do things like allow lazy initialization. If these beans are serialzed and passed over remote boundaries then you may end up with odd Hibernate Exception getting thown. Personally I'd prefer to create simple DTOs or write the POJOs out as XML to pass between components. My colleagues would go one step further and write custom wire protocols for transferring the data for performance reasons.
Recently I have been using the MULE ESB to integrate various service components. It's quite nice as you can have a mix of RMI, sockets, web services etc without having to write most of the boiler plate code.
http://www.mulesource.org/display/COMMUNITY/Home
Why would you go with anything other than the simplest thing that works?
In your case that sounds like EJB3 or maybe JMS, depending on whether the communication needs to be synchronous or asynchronous.
EJB3 is by far these easiest being built on top of RMI with the container providing all the additional features you might need - security, transactions, etc. Presumably your POJOs are in a shared jar and therefore can simply be passed between your EJBs, although I tend towards passing value objects myself. The other benefit of EJB is, when done right, that it's the most performant (that's just my opinion btw ;-).
JMS is a little more involved, but not much and a system based on asynchronous communication affords certain niceties in terms of parallelizing tasks, etc.
The performance overhead of web-services, the inevitable extra config and additional points of failure make them, IMHO, not worth the hassle unless you've a requirement that mandates their use - I'm thinking interop with non-Java clients or providing data to external parties here.
If you need network communication between Java-only applications, Java RMI is the way to go. It has the best integration, most transparency and the least overhead.
If, however, some of your clients aren't Java-based, you should probably consider other options (Java RMI actually have an IIOP-dialect, which allows it to interact with CORBA, however - I wouldn't recommend doing this, unless it's for some legacy-code integration). Depending on your needs, webservices are probably your friend. If you are conserned with the networkload, you could go webservices over Hessian.
You literally mean remotely? As in running in a different environment with therefore different availability characteristics? With network overheads?
Assuming "yes" my first step would be to take a service approach, set aside the invocation technology for a moment. Just consider the design and meaning of your services. You know they are comparativley expensive to invoke, hence small busy interfaces tend to be a bad thing. You know that the service system might fail between invocations, so you may favour stateless services. You may need to retry requests after failure, so you may favour idempotent service designs.
Then consider availability relationships. Can your client work without the remote system. In some cases you simply can't progress if the remote system isn't available (eg. can't enable the employee if you can't get to the HR system) in other cases you can adopt a "fire-and-tell-me-later" philosophy; queue up the requests and process responses later.
Where there is an availability depdency, then simply exposing a synchronous interface seems to fit. You can do that with SLSB EJBs, if everything is Java EE, that works. I tend to generalise expecting that if my services are useful then non Java EE clients may want them too. So SOAP (or REST) tends to be useful. These days adding a web service interface to your SLSB is pretty trivial.
But my pet theory is that any sufficiently large IT system ends up needing aynch communications: you need to decouple the availability contraints. So I would tend to look for a JMS-style relationship. An MDB facade in front of your services, or SOAP/JMS is not too hard to do. Such an approach tends to highlight the failure-case design issues that were probably lurking anyway, JMS tends to make you think: "suppose I don't get an answer? suppose my answer comes late?"
I would go for SOAP.
JMS would be more efficient but you would need to code up an message driven bean for each interface.
SOAP on the other hand comes with lots of useful toolkits that will generate your message definition (WSDL) and all the neccesary handlers (client and server) when given an EJB.
With soap you can (but dont have to) deal with certificate security and secure connections over public networks. As the default protocol is HTTP over port 80 you will have minimal pain with firewalls etc. SOAP is also great for hetrogenious clients (in your case anything that isn't J2EE) with good support for most common languages on most common platforms.
Related
I have to provide an access to my semantics. Currently, it is an RDBMS but later I'll probably use additional non-RDBMS data sources (graph, hadoop etc).
The consumers of my semantics are located inside company domain/intranet but run on remote servers. Moreover, as we are in design stage, it is unclear what technology they will use for implementing their business logic (Java/C or other).
I think it would be a bad idea to provide to external modules of our SW a direct access to my Data Model over JDBC/ODBC (because i do not want to be committed to RDBMS only).
The plan is to create an API to access my semantics. The API is basically, CRUD. Current candidate is REST API using Spring.
My concern is that the access over REST might be slow.
Preferably the technology would be Java-based. However, C-base and others are welcome as well.
I wonder: what alternatives to REST shall I consider?
The only requirement except for the speed of access is that it must be easy to implement and maintain.
I'd appreciate your suggestions.
For just internal calls behind company firewall, RMI would allow to create arbitrary complex API with multiple methods that accept and return complex data structures. CORBA will also do. These technologies are part of JavaSE.
From the newer technologies, Google protocol buffers may not be bad for this task.
If you want to shield your domain model from your data model, you can just put a layer of abstraction in between, i.e. interfaces. Why do you think you need a remoting boundary? Providing a remotely accessible API over REST is completely orthogonal to your concern of eliminating coupling between your domain and data model.
The consumers of your API will always be shielded from your private domain model anyway. If you're building a java API you'll expose interfaces and with a remote and orthogonal REST interface, you'll be exposing HTTP resources.
My question: What approach could/should I take to communicate between two or more JVM instances that are running locally?
Some description of the problem:
I am developing a system for a project that requires separate JVM instances to isolate certain tasks from each other entirely.
In it's running, the 'parent' JVM will create 'child' JVMs that it will expect to execute and then return results to it (in the format of relatively simple POJO classes, or perhaps structured XML data). These results should not be transferred using the SysErr/SysOut/SysIn pipes as the child may already use these as part of its running.
If a child JVM does not respond with results within a certain time, the parent JVM should be able to signal to the child to cease processing, or to kill the child process. Otherwise, the child JVM should exit normally at the end of completing its task.
Research so far:
I am aware there are a number of technologies that may be of use e.g....
Using Java's RMI library
Using sockets to transfer objects
Using distribution libraries such as Cajo, Hessian
...but am interested in hearing what approaches others may consider before pursuing one of these options, or any others.
Thanks for any help or advice on this!
Edits:
Quantity of data to transfer- relatively small, it will mostly be just a handful of POJOs containing strings that will represent the result of the child executing. If any solution would be inefficient on larger amounts of information, this is unlikely to be a problem in my system. The amount being transferred should be pretty static and so this does not have to be scalable.
Latency of transfer- not a critical concern in this case, although if any 'polling' of results is needed this should be able to be fairly frequent without significant overheads, so I can maintain a responsive GUI on top of this at a later time (e.g. progress bar)
Not directly an answer to your question, but a suggestion of an alternative.
Have you considered OSGI?
It lets you run java projects in complete isolation from each other, within the SAME jvm.
The beauty of it is that communication between projects is very easy with services (see Core Specifications PDF page 123). This way there is not "serialization" of any sort being done as the data and calls are all in the same jvm.
Furthermore all your requirements of quality of service (response time etc...) go away - you only have to worry about whether the service is UP or DOWN at the time you want to use it. And for that you have a really nice specification that does that for you called Declarative Services (See Enterprise Spec PDF page 141)
Sorry for the off-topic answer, but I thought some other people might consider this as an alternative.
Update
To answer your question about security, I have never considered such a scenario. I don't believe there is a way to enforce "memory" usage within OSGI.
However there is a way of communicating outside of JVM between different OSGI runtimes. It is called Remote Services (see Enterprise Spec PDF, page 7). They also have nice discussion there of the factors to take into consideration when doing something like that (see 13.1 Fallacies).
Folks at Apache Felix (implementation of OSGI) I think have implementation of this with iPOJO, called Distributed Services with iPOJO (their wrapper to make using services easier). I've never used this - so ignore me if I am wrong.
I'd use KryoNet with local sockets since it specialises heavily in serialisation and is quite lightweight (you also get Remote Method Invocation! I'm using it right now), but disable the socket disconnection timeout.
RMI basically works on the principle that you have a remote type and that the remote type implements an interface. This interface is shared. On your local machine, you bind the interface via the RMI library to code 'injected' in-memory from the RMI library, the result being that you have something that satisfies the interface but is able to communicate with the remote object.
akka is another option, as well as other java actor frameworks, it provides communication and other goodies derived from the actor model.
If you can't use stdin/stdout, then i'd go with sockets. You need some sort of serialization layer on top of the sockets (as you would with stdin/stdout), and RMI is a very easy to use and pretty effective such layer.
If you used RMI and found the performance wasn't good enough, i'd switch to some more efficient serializer - there are plenty of options.
I wouldn't go anywhere near web services or XML. That seems like a complete waste of time, likely take more effort and deliver less performance than RMI.
Not many people seem to like RMI any longer.
Options:
Web Services. e.g. http://cxf.apache.org
JMX. Now, this is really a means of using RMI under the table, but it would work.
Other IPC protocols; you cited Hessian
Roll-your-own using sockets, or even shared memory. (Open a mapped file in the parent, open it again in the child. You'd still need something for synchronization.)
Examples of note are Apache ant (which forks all sorts of Jvms for one purpose or another), Apache maven, and the open source variant of the Tanukisoft daemonization kit.
Personally, I'm very facile with web services, so that's the hammer which which I tend to turn things into nails. A typical JAX-WS+JAX-B or JAX-RS+JAX-B service is very little code with CXF, and manages all the data serialization and deserialization for me.
It was mentioned above, but i wanted to expand a bit on the JMX suggestion. we actually are doing pretty much exactly what you are planning to do (from what i can glean from your various comments). we landed on using jmx for a variety of reasons, a few of which i'll mention here. for one thing, jmx is all about management, so in general it is a perfect fit for what you want to do (especially if you already plan on having jmx services for other management tasks). any effort you put into jmx interfaces will do double duty as apis you can call using java management tools like jvisualvm. this leads to my next point, which is the most relevant to what you want. the new Attach API in jdk 6 and above is very sweet. it enables you to dynamically discover and communicate with running jvms. this allows, for example, for your "controller" process to crash and restart and re-find all the existing worker processes. this is the makings of a very robust system. it was mentioned above that jmx basically rmi under the hood, however, unlike using rmi directly, you don't need to manage all the connection details (e.g. dealing with unique ports, discoverability, etc). the attach api is a bit of a hidden gem in the jdk, as it isn't very well documented. when i was poking into this stuff initially, i didn't know the name of the api, so figuring how the "magic" in jvisualvm and jconsole worked was very difficult. finally, i came across an article like this one, which shows how to actually use the attach api dynamically in your own program.
Although it's designed for potentially remote communication between JVMs, I think you'll find that Netty works extremely well between local JVM instances as well.
It's probably the most performant / robust / widely supported library of its type for Java.
A lot is discussed above. But be it sockets, rmi, jms - there is a lof of dirty work involved.
I would ratter advice akka. It is a actor based model which communicate with each other using Messages.
The beauty is, the actors can be on same JVM or another (very little config) and akka takes care the rest for you. I haven't seen a more cleaner way than doing this :)
Try out jGroups if the data to be communicated is not huge.
How about http://code.google.com/p/protobuf/
It is lightweight.
As you mentioned you can obviously send the objects over the network but that is a costly thing not to mention start up a separate JVM.
Another approach if you just want to separate your different worlds inside one JVM is to load the classes with different classloaders. ClassA#CL1!=ClassA#CL2 if they are loaded by CL1 and CL2 as sibling classloaders.
To enable communications between classA#CL1 and classA#CL2 you could have three classloaders.
CL1 that loads process1
CL2 that loads process2 (same classes as in CL1)
CL3 that loads communication classes (POJOs and Service).
Now you let CL3 be the parent classloader of CL1 and CL2.
In classes loaded by CL3 you can have a light-weight communication send/receive functionality (send(Pojo)/receive(Pojo)) the POJOs between classes in CL1 and classes in CL2.
In CL3 you expose a static service that enables implementations from CL1 and CL2 register to send and receive the POJOs.
To transfer data from one system to another, through data interface, by web services, we normally get a result set by SQL query, and format them as a web service endpoint, and allow it to be retrieved by another side.
With EJB 3.0, it seems we can replace the result set by stateless session bean. So are there any advantages over the SQL-based web services? And when should we use it?
This is a very broad question on the system architect level. I will try to answer with my best knowledge without starting a flame war (FYI, I have used both ejb and spring).
As you know, building a stable/robust software application requires many building blocks, such as logging, connection pool, etc. Usually, you can find libraries of these building blocks, but not all of them have common api, so they may require integration. In the worst case, you may have to lock into some vendors. The main idea of EJB 3 (or Java EE) is to provide a more complete set of building blocks (via API, annotation or config), so developers can start working on the core business logic right away with an industry standard API/spec/config without training on the proprietary APIs. Additionally, you can change vendor without changing your codes since API/config are really the industry standard (well, your mileage may vary a lot in the real life. hopefully, the new Java EE will fix it).
Your application may already have some of the main elements that EJB 3 already provides. However, EJB 3 promises to provide more such as ORM mapping, RMI, Load balancing, failover, transactions, dynamic redeployment, logging, system management, thread managing, resource pooling (db connection), security, caching.
As you have an working application already, you can really consider if it is worth of your efford to migrate your codes to a standard system to gain more functionality vs integrate new functionality individually. Additionally, EJB 3.0 (or Java EE) is not really the framework that you can pick. You can also look into other framework, such as Spring.
My suggestion is to really figure what your system requirements, and then pick the right technologies instead of picking up the coolest technologies first.
Good luck
I come from a background of MoM. I think I understand ESB conceptually. However, I'm not too sure about the practical differences between the two when it comes to making a choice architecturally.
Here is what I want to know
1) Any good links online which can help me in this regard.
2) Can someone tell me where it makes sense to use one over the other.
Any help would be useful.
Messaging tends to concentrate on the reliable exchange of messages around a network; using queues as a reliable load balancer and topics to implement publish and subscribe.
An ESB typically tends to add different features above and beyond messaging such as orchestration, routing, transformation and mediation.
I'd recommend reading about the Enterprise Integration Patterns which gives an overview of common patterns you'll tend to use in integration problems which are all based above a message bus (though can be used with other networking technologies too).
For example using open source; Apache ActiveMQ provides a loosely coupled reliable exchange of messages. Then you can use Apache Camel to implement the Enterprise Integration Patterns for smart routing, transformation, orchestration, working with other technologies and so forth.
I put MOM solutions and ESB solutions on two distinct planes.
I consider MOM a building block for ESB solutions. In fact, ESB solutions reach their own loose coupling and asynchronous communication capabilities, just using the paradigm offered by the specific MOM implementation.
Therefore, MOMs represent solutions for data/events distribution at customized level of QoSs (according to the specific vendor implementation), instead ESBs represent solutions providing capabilities to realize complex orchestrations in a SOA scenario (where we have multiple providers offering their services, and multiple consumers interested in consuming the services offered by the first ones).
Complex orchestrations imply communication between legacy systems, everyone of these with its own data domain representation (rules and services on specific data) and its own communication paradigm (one consumer interact with the ESB using CORBA, another one using WS, and so on).
It is clear that ESB represents a more complex architectural solution aimed to provide the abstraction of data-bus (such as the electronic buses that everyone have in his own pc), able to connect a plethora of service providers to a not well specified plethora of service consumers, hiding heterogeneity in (i) data representation and (ii) communication.
Sorry for the long post, but the concepts are complex and it is very difficult to be effective and efficient in a short statement.
An ESB is typically a layer that routes, logs, transforms, and performs other 'technical' (i.e. non-business) functions on messages. It could process messages from a messaging system (such as something JMS-based), or it could work with other types of message (such as SOAP-based web services). In that respect, it's more general than MoM.
Disclaimer: I am an IBM WebSphere consultant - although I am not contributing here in an official capacity.
ESB with web services in its true form provides Application loose coupling by sending the data through one the elements of the message.
MOM provides not only Application Loose coupling but process loose coupling along.
ESB comes with additional features supporting Governance centric approach.
Both can be used independently or together depending upon the scenario.
IBM and Oracle have SOA certifications. Since they're the leaders in the marketplace (Gartner Magic Quadrant), I would read about how they define SOA and ESBs (along with methodology and the components needed to support SOA like Governance, Registry, etc etc)
EBS is just yet another buzzword, as is SOA 2.0.
You can have a ESB system easily implemented with normal Web Services with a queue behind them.
You can have message routing and or orchestration with SOA 1.0 (Tibco, BizzTalk), one things does not stop the other really. More importantly, it is the semantics given to the messages exchanged in the system that play an important role, in this case events. Messages as events, are triggers about something that happened in your system, so the context is different.
When developing distributed applications, all written in Java by the same company, would you choose Web Services or RMI? What are the pros and cons in terms of performance, loose coupling, ease of use, ...? Would anyone choose WS? Can you build a service-oriented architecture with RMI?
I'd try to think about it this way:
Are you going for independent services running beneath each other, and those services may be accessed by non-java applications some time in the future? Then go for web services.
Do you just want to spread parts of an application (mind the singular) over several servers? Then go for RMI and you won't have to leave the Java universe to get everything working together tightly coupled.
I would choose WS.
It is unlikely that WS/RMI will be your bottleneck.
Why to close the door for other possible technologies in the future?
RMI might have problem if version of classes on client/server get out of sync.
And... I would most likely choose REST services.
If You Aren't Gonna Need It (interop with non-Java), and you probably aren't, RMI is going to be better; less code, less configuration, less bandwidth overhead.
An option if you are scared that you Are Gonna Need It is to use EJB3; it uses RMI, is very easy to setup and deploy, but also allows you to turn your calls into Web Services easily if you need them.
Whatever you do, do not create your own thing; stick to a standard.
my choices are:
standard java serialization - pros : imho offers the most performance, simple to implement (I'm using Spring to expose local interface as remote one); cons : serialization doesn't work between different jvm versions
binary serialization (for example hessian from jetty) - pros : same performance as with java serialization and works between different jvm versions
WS: only if there is a need in interoperability between different platforms java + .net , otherwise it is just too haveweight.
RMI is a great rapid-development transport, but I would advise against using it in a production environment. The serialization compatibility problem can make things awkward, you have to coordinate your deployments very carefully.
WebServices are inefficient, yes, but just through hardware at it. Alternatively, use plain, lightweight XML-over-HTTP, rather than full-fat SOAP/WSDL.