How to shield a DB? - Java

I have to provide access to my semantics. Currently it is an RDBMS, but later I'll probably use additional non-RDBMS data sources (graph, Hadoop, etc.).
The consumers of my semantics are located inside the company domain/intranet but run on remote servers. Moreover, as we are at the design stage, it is unclear what technology they will use to implement their business logic (Java, C, or other).
I think it would be a bad idea to give external modules of our software direct access to my data model over JDBC/ODBC (because I do not want to be committed to an RDBMS only).
The plan is to create an API to access my semantics. The API is basically CRUD. The current candidate is a REST API using Spring.
My concern is that access over REST might be slow.
Preferably the technology would be Java-based; however, C-based and others are welcome as well.
I wonder: what alternatives to REST should I consider?
The only requirement, besides speed of access, is that it must be easy to implement and maintain.
I'd appreciate your suggestions.

For purely internal calls behind the company firewall, RMI would let you create an arbitrarily complex API with multiple methods that accept and return complex data structures. CORBA would also do. These technologies are part of Java SE.
Among the newer technologies, Google Protocol Buffers may be a good fit for this task.
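For illustration, here is a minimal RMI sketch. The SemanticsService interface, the DataRecord value object, and the registry name are hypothetical stand-ins for your CRUD semantics, not anything from the question:

```java
import java.io.Serializable;
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;
import java.util.Collections;
import java.util.List;

// Value objects crossing the wire must be Serializable.
class DataRecord implements Serializable {
    public String id;
    public String payload;
}

// The remote API consumers code against; every method declares RemoteException.
interface SemanticsService extends Remote {
    List<DataRecord> findByQuery(String query) throws RemoteException;
    void save(DataRecord record) throws RemoteException;
}

class SemanticsServer implements SemanticsService {
    @Override
    public List<DataRecord> findByQuery(String query) {
        return Collections.emptyList(); // delegate to the data layer here
    }

    @Override
    public void save(DataRecord record) {
        // persist via the RDBMS today, a graph or Hadoop store tomorrow
    }

    public static void main(String[] args) throws Exception {
        SemanticsService stub =
                (SemanticsService) UnicastRemoteObject.exportObject(new SemanticsServer(), 0);
        Registry registry = LocateRegistry.createRegistry(1099);
        registry.rebind("semantics", stub); // clients look this name up
    }
}
```

A client then does a registry lookup and invokes the interface as if it were local, with Java serialization handling the wire format.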

If you want to shield your domain model from your data model, you can just put a layer of abstraction in between, i.e. interfaces. Why do you think you need a remoting boundary? Providing a remotely accessible API over REST is completely orthogonal to your concern of eliminating coupling between your domain and data model.
The consumers of your API will always be shielded from your private domain model anyway. If you're building a Java API, you'll expose interfaces; with a remote (and orthogonal) REST interface, you'll expose HTTP resources.
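A minimal sketch of that layer of abstraction, with hypothetical names: the domain depends only on the interface, and each data source gets its own implementation behind it.

```java
// Hypothetical domain type and repository interface.
class Customer {
    String id;
    String name;
}

interface CustomerRepository {
    Customer findById(String id);
    void save(Customer customer);
}

// RDBMS-backed implementation today; a GraphCustomerRepository could be
// dropped in beside it later without touching any domain code.
class JdbcCustomerRepository implements CustomerRepository {
    @Override
    public Customer findById(String id) {
        // SELECT via JDBC and map the row; omitted here
        return new Customer();
    }

    @Override
    public void save(Customer customer) {
        // INSERT/UPDATE via JDBC; omitted here
    }
}
```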

Related

REST API Java microservice available inside same application server

I have a small Java (Java EE) microservice that does some calculations. This microservice runs on the same application server as another application, also written in Java EE. First question: should these apps communicate with each other via a REST API or in some other way? Second question: if so, is there a way to save some time by not serializing/deserializing transfer objects? I understand that communication between two apps on different servers (or in different languages) requires serialization/deserialization, but what about the situation described above?
Should these apps communicate with each other via a REST API or in some other way?
Microservices should always communicate over the network. If they have a REST API, then use that.
If so, is there a way to save some time by not serializing/deserializing transfer objects?
If they communicate over the network, serialization is a must. In any case, serialization helps with decoupling: microservices should share data but not schema/classes. Serialization should therefore lose the schema, i.e. you could use JSON. If you share the schema (classes), you break the microservice's encapsulation, and you won't be able to replace one microservice implementation with another (one using a different technology stack, PHP with Nginx for example).
If efficiency is paramount, you could use Google's Protobuf. It's a bit of a pain (compared to JSON) but very efficient. It's also language-agnostic (or, to be more precise, it has implementations in most common languages).
You basically define a message according to the proto spec, and a special compiler generates the relevant get/set code. You use that in your code to send and receive very efficient messages.
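As a hedged sketch, assuming a hypothetical record.proto compiled with protoc's Java plugin, the generated code is used roughly like this:

```java
// Assumed proto definition (hypothetical), compiled with protoc:
//
//   syntax = "proto3";
//   option java_package = "com.example.api";
//   option java_outer_classname = "Records";
//   message RecordMsg {
//     string id = 1;
//     string payload = 2;
//   }
//
import com.example.api.Records.RecordMsg;

public class ProtoDemo {
    public static void main(String[] args) throws Exception {
        RecordMsg msg = RecordMsg.newBuilder()   // generated builder
                .setId("42")
                .setPayload("hello")
                .build();

        byte[] wire = msg.toByteArray();              // compact binary encoding
        RecordMsg parsed = RecordMsg.parseFrom(wire); // readable from any language
        System.out.println(parsed.getId());
    }
}
```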

How to design a 2/3 tier distributed application in Java?

I have been tasked with designing a distributed system that basically consists of one centrally shared database and multiple fat clients (Swing-based GUIs) that interact with this database. I basically want to manage addresses, deadlines, tasks, etc. I am currently using Java 6 SE, JPA (EclipseLink), and a MySQL database. Now I am facing some problems:
How is client 2 informed about data changes committed to the database by client 1? Is it a good idea to use an RMI approach for messaging?
I am dealing with stale data, since the JPA EntityManager caches query results. Does it make sense to broadcast "db-change" messages to all active clients so that they can refresh their locally cached entity objects?
Is there a much simpler approach to achieving these goals by using an application server like GlassFish? Or is the use of Java EE application servers only convenient for web development? (Sorry for these newbie questions, but I really didn't find clear answers in the Java EE docs, or I simply didn't get them :-/ ...)
Any help is highly appreciated - many thanks in advance!
Is there a much simpler approach to achieving these goals by using an application server like GlassFish?
That is the precise point of an application server (which is distinct from a web server) in a 3-tier setup. Certainly you can poll and/or use messaging to provide additional hooks for metadata communication (e.g. db-change events), but you would end up poorly reinventing a very well-known (and non-trivial) wheel, namely data synchronization across a distributed tier.
If you can live without caching query results in the client, and the latency of accessing the server (the second tier) for data access is acceptable, then clearly that is the way to go.
[Below is a fairly late P.S., but I happened to read this and the question again today and felt it required clarification.]
Java EE is a distributed container/component-based architecture for the enterprise tier. Putting aside the failure of a component market to emerge for J2EE (though some did try), what remains is its component-oriented architecture (COA) and its inherent support for distribution as a foundational concern. Note that the web profile (i.e. the "web server" part) of Java EE is also part of the same architecture.
So what do you get when you use one of these Java EE application servers, and how would it address your requirements and design concerns?
Two key aspects of the support for distribution offered by Java EE are (a) its distributed namespace (JNDI), and (b) its menu of offerings for connectivity across tiers: pure RMI (where you roll your own distributed RPC-based system) and Enterprise Beans, a.k.a. EJBs (remotely and locally exposed component interfaces with well-defined lookup and life-cycle semantics in distributable containers). Of the EJB flavors, in terms of connection semantics, you have messaging (JMS) and straight-up RPC.
For your case you could, for example, opt for a JMS message bus with both fat-client JMS endpoints and MessageDrivenBean EJBs. You could design a messaging domain with both topic/subscription-based destinations and straight-up queues. These can be declaratively configured to be durable or not, etc.
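A minimal sketch of one such message-driven bean (the destination name and payload handling are hypothetical); the container binds it to the JMS destination declaratively:

```java
import javax.ejb.ActivationConfigProperty;
import javax.ejb.MessageDriven;
import javax.jms.JMSException;
import javax.jms.Message;
import javax.jms.MessageListener;
import javax.jms.TextMessage;

@MessageDriven(activationConfig = {
    @ActivationConfigProperty(propertyName = "destinationType",
                              propertyValue = "javax.jms.Queue"),
    @ActivationConfigProperty(propertyName = "destinationLookup",
                              propertyValue = "jms/AddressUpdates")
})
public class AddressUpdateListener implements MessageListener {
    @Override
    public void onMessage(Message message) {
        try {
            String body = ((TextMessage) message).getText();
            // apply the change, then publish a "db-change" event on a topic
            // that the fat clients subscribe to
        } catch (JMSException e) {
            throw new RuntimeException(e);
        }
    }
}
```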
Your application server could provide the JMS provider, or you could opt for a best-of-breed one, e.g. TIBCO, per your requirements.
You would not be reinventing any of the above very non-trivial concerns. Your focus remains your domain requirements, and you have all the tools you need to build your platform within reasonable SLAs.
A viable alternative is to compose what boils down to the same exact thing, minus the COA approach of Java EE (which gets you both declarative magic and painful development ceremony), out of standalone OSS software, e.g. ØMQ for your bus, REST for remote RPC, and possibly Redis for beefing up the persistence guarantees of your messages and for coordinating (no JNDI...) your distributed balls in the air.
I personally prefer the latter, given that it is more fun for me. Also, the efficiency gained from more direct control over the distribution layer allows for scalability gains under very stringent requirements (a tiny minority of the requirements out there).
A distributed system design for the enterprise ("have been tasked") needs to consider business requirements beyond merely the application domain. That is part of the equation.
Hope this is helpful (and timely ;)
Since you are using JPA, you could benefit from its entity locking and concurrency mechanisms.
There are two main locking strategies in JPA (quoted from the Java EE 6 tutorial):
Optimistic locking:
By default, persistence providers use optimistic locking, where, before committing changes to the data, the persistence provider checks that no other transaction has modified or deleted the data since the data was read. This is accomplished by a version column in the database table, with a corresponding version attribute in the entity class. When a row is modified, the version value is incremented.
Pessimistic locking:
Pessimistic locking goes further than optimistic locking. With pessimistic locking, the persistence provider creates a transaction that obtains a long-term lock on the data until the transaction is completed, which prevents other transactions from modifying or deleting the data until the lock has ended. Pessimistic locking is a better strategy than optimistic locking when the underlying data is frequently accessed and modified by many transactions.
Choose the strategy that best fits your application's behavior and functional requirements.
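A minimal optimistic-locking sketch (entity name and fields are hypothetical): a single @Version attribute is enough for the provider to detect concurrent modification and raise an OptimisticLockException.

```java
import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.Version;

@Entity
public class Task {
    @Id
    private Long id;

    private String description;

    @Version // incremented by the persistence provider on every update
    private long version;
}
```

For the pessimistic variant you would request the lock explicitly, e.g. em.find(Task.class, id, LockModeType.PESSIMISTIC_WRITE).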
The fat clients can poll on a configured interval. This is similar to mail clients like Outlook, which poll for new e-mail messages.
Your clients conceptually connect to a "middle-tier" which contains the "business logic".
Your clients send all requests to the "middle tier", and the "middle tier" performs them. This means that if the middle tier cares about coordinating clients, it can remember which clients have "looked at" an important object and (provided the technology supports it) can push an update to the appropriate clients.
Under this scenario, clients mainly contain code to present the data, and the code they contain to accept requests mostly proxies those requests to the middle tier.
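A minimal client-side polling sketch (the interval and the middle-tier call are hypothetical): each fat client periodically asks the middle tier for changes since its last sync, much like a mail client checking for new messages.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class ChangePoller {
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();
    private volatile long lastSyncMillis = 0;

    public void start() {
        scheduler.scheduleAtFixedRate(() -> {
            // Hypothetical middle-tier call returning changes since lastSyncMillis:
            // List<Change> changes = middleTier.changesSince(lastSyncMillis);
            // changes.forEach(this::refreshLocalEntityCache);
            lastSyncMillis = System.currentTimeMillis();
        }, 0, 30, TimeUnit.SECONDS); // e.g. poll every 30 seconds
    }
}
```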

Which is more scalable? Simple CRUD webapp vs. webapp talking to a REST service

I think the title says it clearly. I am no scalability guru, but I am on the verge of creating a web application which needs to scale to large data sets and possibly many (I won't exaggerate here; let's say thousands of) concurrent users.
MongoDB is the data repository, and I am torn between writing a simple Play! webapp talking to MongoDB versus a Play! app talking to a REST service app (in Scala) which does the heavy lifting of all business logic and persistence.
Part of me thinks that having the business logic wrapped as a service is future-proof and allows deploying just the webapp on multiple nodes (scaling). I come from the Java EE stack, and Play! is a rebel among Java web frameworks; this approach assures me that I can move away from Play! if needed.
Part of me also thinks that a Play! app plus a Scala service app is additional complexity that may not be fruitful in the long run.
Any suggestions are appreciated.
NOTE: I am a newbie to Scala, MongoDB, and Play!. Pardon me if my question is silly.
Scalability is an engineering art, which means that you have lots of parameters and apply your experience to the specific values of those parameters to arrive at a solution. So general advice, without more specific data about your problem, is hard.
Having said that, from experience, some general advice:
Keep your application as clean and simple as possible. This keeps your options open. In your case, start with a simple Play app. Concentrate on clean code so you can easily rework what you have into a different architectural model (with clean code, that's simpler than you'd think :-)).
Measure, don't guess, where the bottlenecks are. It's simple enough to flood a server with requests. Use profiling, memory dumps, whatever, to pinpoint your scalability bottleneck.
Only then, with a working app in hand (which you could launch early) and data on where your scaling bottlenecks are, can you decide what to split off into (horizontally scalable) services.
At the outset, services look nice and scalable, but they often get you into an early mess: services need to communicate with each other, so you start introducing messaging, etcetera. Keep it simple, measure, optimize.
The question does not appear to be silly. For me, encapsulating your data access behind a REST layer does not directly improve the scalability of the application (not significantly, anyway; of course there is the server, which can perform HTTP caching, handle request queues, etc., but from your description your application looks small enough). You can achieve similar scalability without the REST layer. Having said that, the service layer could have an indirect impact.
First, it makes your application cleaner (UI talking directly to the DB is messy) and it makes the application far more maintainable. A REST layer can provide the middle tier that you may need in your application. Also, a correctly designed REST layer has to be resource-driven, and in my experience a resource-driven architecture is a good middle ground between ease of implementation and a highly scalable design.
So I strongly suggest that you use a service layer (REST is the way to go :) ), but scalability by itself cannot justify the decision.
Putting the service between the UI and the data source encapsulates the data source, so the UI need not know the details of how data is persisted. It also prevents the UI from reaching directly into the data source, which allows the service to authenticate, authorize, validate, bind, and perform business logic as needed.
The downside is a slight speed bump for the app.
I'd say that adding the service has a small cost and a big upside. I'd vote for that.
The answer, as usual, is: it depends.
If there is some heavy lifting involved and some business logic: yes, that is best put into its own layer, and if you add a RESTful interface to it, you can serve it up to whatever front-end technology you want.
Nowadays, people often don't bother with a separate web app layer but serve the data via AJAX directly to the client.
You might consider adding a layer if you either need to maintain a lot of user session state or have an opportunity to cache data in the presentation layer. There are more reasons why you might want a presentation layer, for example serving different presentations to different devices/clients.
Don't just add layers for complexity's sake, though.
I might add that you should try to employ the HATEOAS principle. That will ease things significantly when scaling out the solution.
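To illustrate, a minimal hypermedia sketch (types and link relations are hypothetical): each representation carries the links a client may follow next, so clients stop hard-coding URLs and the server stays free to relocate resources as it scales out.

```java
import java.util.Arrays;
import java.util.List;

class Link {
    final String rel;
    final String href;
    Link(String rel, String href) { this.rel = rel; this.href = href; }
}

// The representation returned over REST carries its own navigation.
class OrderRepresentation {
    final String id;
    final List<Link> links;
    OrderRepresentation(String id, List<Link> links) {
        this.id = id;
        this.links = links;
    }
}

class OrderResource {
    OrderRepresentation get(String id) {
        return new OrderRepresentation(id, Arrays.asList(
                new Link("self",   "/orders/" + id),
                new Link("cancel", "/orders/" + id + "/cancellation")));
    }
}
```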

In web project can we write core services layer without knowledge of UI?

I am working on a web project. We are using Flex as the UI layer. My question: we often write the core service layer separately from the web/UI layer so we can reuse the same services with a different UI layer/technology. So, practically, is it possible to reuse the same core-layer services, without any changes or additions to the API, across different kinds of UI technologies/layers? For example, the same core service layer with a UI technology that uses synchronous request/response (e.g. JSP) and with an asynchronous or event-driven UI technology (e.g. Ajax, Flex, GWT), or with multiple devices (computers, mobiles, PDAs, etc.). Personally, I feel it's very tough to write a core service layer without any knowledge of the UI. Looking for thoughts from other people.
It is possible, of course. The services layer is most often stateless, so it just offers methods to be called with parameters that come from the UI. You can imagine it as an API that you call from your UI layers. It is important not to pass anything UI-related as a parameter. For example, don't pass HttpServletRequest or HttpSession as parameters; obtain the values needed and pass those.
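A tiny sketch of the difference, with hypothetical names: the first signature couples the service to the servlet container, while the second lets any UI (JSP, Flex, GWT) extract the values and call it.

```java
// Bad: ties the service layer to one UI technology.
// OrderSummary loadOrders(javax.servlet.http.HttpServletRequest request);

class OrderSummary { /* totals, line items, ... */ }

interface OrderService {
    // Good: plain parameters that every UI layer can supply.
    OrderSummary loadOrders(String customerId, int maxResults);
}
```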
We often write the core service layer separately from the web/UI layer so we can reuse the same services with a different UI layer/technology.
The service layer should be agnostic of the UI, so that's actually very good.
For example, the same core service layer with a UI technology that uses synchronous request/response (e.g. JSP) and with an asynchronous or event-driven UI technology (e.g. Ajax, Flex, GWT), or with multiple devices (computers, mobiles, PDAs, etc.).
A business operation exposed in a service layer should theoretically not depend on who calls it or through which technology. But you seem to be facing two common symptoms:
1. different UIs actually require slightly different business operations (e.g. the amount of data returned may differ for mobile and non-mobile callers)
2. the business operations are coupled to the remoting technology (e.g. synchronous vs. asynchronous)
For 1, the question you must ask yourself is whether you can factor these slightly different operations elegantly, or, conversely, how to expose variations of the same operations elegantly. In either case, it should be possible to design a service API that is consistent and fits the needs of the various clients.
For 2, the question is how to decouple the business operations from the remoting technologies. A bridge, an adapter, or an extra mediation layer could be introduced, but that should not lead to over-engineering.
I know it's sometimes hard to decouple a business operation completely from its actual usage. Consider pagination: if the business operation is "I want data for query X", the concrete operation exposed also embeds some knowledge of the UI, namely the paging.
I can only offer general advice: fight for a clear separation between UI and business logic when it comes to data, formatting, etc., and when you can't, try to figure out the most generic API that makes sense for all clients.
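For the pagination example, one way to keep the API generic across clients (all names are hypothetical) is to make the paging an explicit, client-neutral parameter object:

```java
import java.util.List;

class PageRequest {
    final int offset;
    final int limit; // a mobile client may pass 10, a desktop client 100
    PageRequest(int offset, int limit) {
        this.offset = offset;
        this.limit = limit;
    }
}

class Page<T> {
    final List<T> items;
    final long totalCount;
    Page(List<T> items, long totalCount) {
        this.items = items;
        this.totalCount = totalCount;
    }
}

class CustomerDto { /* flattened customer data */ }

interface CustomerQueryService {
    // The same operation serves every UI; only the PageRequest values vary.
    Page<CustomerDto> findCustomers(String filter, PageRequest page);
}
```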
Personally, I feel it's very tough to write a core service layer without any knowledge of the UI.
I agree it's hard sometimes, but you seem to be doing it right, so continue this way :)
It shouldn't be difficult to write core services without any knowledge of the UI.
The core services just need to know what data is required to perform their tasks (where it comes from doesn't matter).
Once you have the core services designed, you can build several different UIs on top of them that collect the necessary data and pass it to the services, which then perform their specific duties.

What remoting approach for Java application would you recommend?

I wonder what the best way is to integrate Java modules developed as separate J(2)EE applications. Each of these modules exposes Java interfaces. POJO entities (Hibernate) are used along with those Java interfaces; there are no DTO objects. What would be the best way to integrate these modules, i.e. to have one module call another module's interface remotely?
I was thinking about EJB3, Hessian, SOAP, and JMS. There are pros and cons to each of these approaches.
Folks, what are your opinions or experiences?
Having dabbled with a few of the remoting technologies and found them universally unfun, I would now use Spring remoting as an abstraction over the implementation.
It allows you to concentrate on writing your functionality and lets Spring handle the remote part with some configuration. You have the choice of several implementations (RMI, Spring's HTTP invoker, Hessian, Burlap, and JMS). The abstraction means you can pick one implementation and simply swap it if your needs change.
See the SpringSource docs for more information.
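A minimal client-side sketch using Spring's HTTP invoker support (the service interface and URL are hypothetical): the client codes against the plain interface, and only this wiring changes if you swap to RMI or Hessian.

```java
import org.springframework.remoting.httpinvoker.HttpInvokerProxyFactoryBean;

// Hypothetical shared service interface, also known to the server.
interface OrderService {
    int countOpenOrders(String customerId);
}

public class ClientWiring {
    public static OrderService orderService() {
        HttpInvokerProxyFactoryBean proxy = new HttpInvokerProxyFactoryBean();
        proxy.setServiceUrl("http://server/remoting/OrderService"); // hypothetical URL
        proxy.setServiceInterface(OrderService.class);
        proxy.afterPropertiesSet();
        return (OrderService) proxy.getObject(); // dynamic proxy over HTTP
    }
}
```

In practice you would declare this as a bean in your Spring configuration rather than instantiate it by hand; the point is that the calling code only ever sees OrderService.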
The standard approach would be to use plain RMI between the various service components, but this brings issues of sharing your Java interfaces and versioning changes to your domain model, especially if you have lots of components using the same classes.
Are you really running each service in a separate VM? If these EJBs are always talking to each other, then you're best off putting them in the same VM and avoiding remote procedure calls entirely, since the services can use their local interfaces.
The other thing that may bite you is using Hibernate POJOs. You may think these are simple POJOs, but behind the scenes Hibernate has been busy with CGLib, doing things like enabling lazy initialization. If these beans are serialized and passed across remote boundaries, you may end up with odd Hibernate exceptions being thrown. Personally, I'd prefer to create simple DTOs or write the POJOs out as XML to pass between components. My colleagues would go one step further and write custom wire protocols for transferring the data, for performance reasons.
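A minimal DTO sketch (the Customer entity is hypothetical): copy the loaded state into a plain serializable object before it crosses the remote boundary, so no Hibernate proxies or half-initialized lazy collections travel with it.

```java
import java.io.Serializable;

// Hypothetical Hibernate entity (mapping annotations omitted).
class Customer {
    private String id;
    private String name;
    public String getId() { return id; }
    public String getName() { return name; }
}

public class CustomerDto implements Serializable {
    private final String id;
    private final String name;

    public CustomerDto(String id, String name) {
        this.id = id;
        this.name = name;
    }

    // Map only initialized state; fetch lazy associations explicitly if needed.
    public static CustomerDto fromEntity(Customer entity) {
        return new CustomerDto(entity.getId(), entity.getName());
    }

    public String getId() { return id; }
    public String getName() { return name; }
}
```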
Recently I have been using the Mule ESB to integrate various service components. It's quite nice, as you can have a mix of RMI, sockets, web services, etc. without having to write most of the boilerplate code.
http://www.mulesource.org/display/COMMUNITY/Home
Why would you go with anything other than the simplest thing that works?
In your case that sounds like EJB3, or maybe JMS, depending on whether the communication needs to be synchronous or asynchronous.
EJB3 is by far the easiest, being built on top of RMI with the container providing all the additional features you might need: security, transactions, etc. Presumably your POJOs are in a shared JAR and can therefore simply be passed between your EJBs, although I tend towards passing value objects myself. The other benefit of EJB is that, when done right, it's the most performant (that's just my opinion, btw ;-).
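A minimal EJB3 sketch (names are hypothetical): the container takes care of remoting, transactions, and security, and callers just inject or look up the remote interface.

```java
import javax.ejb.Remote;
import javax.ejb.Stateless;

@Remote
interface InventoryService {
    int stockLevel(String sku);
}

@Stateless
public class InventoryServiceBean implements InventoryService {
    @Override
    public int stockLevel(String sku) {
        return 0; // delegate to the persistence layer here
    }
}
```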
JMS is a little more involved, but not much, and a system based on asynchronous communication affords certain niceties in terms of parallelizing tasks, etc.
The performance overhead of web services, the inevitable extra configuration, and the additional points of failure make them, IMHO, not worth the hassle unless you have a requirement that mandates their use; I'm thinking of interop with non-Java clients or providing data to external parties here.
If you need network communication between Java-only applications, Java RMI is the way to go. It has the best integration, the most transparency, and the least overhead.
If, however, some of your clients aren't Java-based, you should probably consider other options. (Java RMI actually has an IIOP dialect, which allows it to interact with CORBA; however, I wouldn't recommend this unless it's for some legacy-code integration.) Depending on your needs, web services are probably your friend. If you are concerned about the network load, you could go with web services over Hessian.
Do you literally mean remotely, as in running in a different environment, with therefore different availability characteristics and network overheads?
Assuming "yes", my first step would be to take a service approach and set aside the invocation technology for a moment. Just consider the design and meaning of your services. You know they are comparatively expensive to invoke, hence small, busy interfaces tend to be a bad thing. You know that the service system might fail between invocations, so you may favour stateless services. You may need to retry requests after failure, so you may favour idempotent service designs.
Then consider availability relationships. Can your client work without the remote system? In some cases you simply can't progress if the remote system isn't available (e.g. you can't enable the employee if you can't get to the HR system); in other cases you can adopt a "fire-and-tell-me-later" philosophy: queue up the requests and process the responses later.
Where there is an availability dependency, simply exposing a synchronous interface seems to fit. You can do that with SLSB (stateless session bean) EJBs; if everything is Java EE, that works. I tend to generalise, expecting that if my services are useful then non-Java-EE clients may want them too, so SOAP (or REST) tends to be useful. These days, adding a web service interface to your SLSB is pretty trivial.
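As a sketch of how trivial that has become (names are hypothetical): annotating the stateless session bean is enough for the container to publish a WSDL that non-Java-EE clients can consume.

```java
import javax.ejb.Stateless;
import javax.jws.WebService;

@Stateless
@WebService // the container generates and publishes the WSDL
public class EmployeeService {
    // Idempotent by design: enabling twice has the same effect as once,
    // which makes retry-after-failure safe.
    public boolean enableEmployee(String employeeId) {
        return true; // call through to the HR system here
    }
}
```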
But my pet theory is that any sufficiently large IT system ends up needing asynchronous communication: you need to decouple the availability constraints. So I would tend to look for a JMS-style relationship: an MDB facade in front of your services, or SOAP over JMS, is not too hard to do. Such an approach tends to highlight the failure-case design issues that were probably lurking anyway; JMS makes you think, "Suppose I don't get an answer? Suppose my answer comes late?"
I would go for SOAP.
JMS would be more efficient, but you would need to code up a message-driven bean for each interface.
SOAP, on the other hand, comes with lots of useful toolkits that will generate your message definition (WSDL) and all the necessary handlers (client and server) when given an EJB.
With SOAP you can (but don't have to) deal with certificate security and secure connections over public networks. As the default protocol is HTTP over port 80, you will have minimal pain with firewalls, etc. SOAP is also great for heterogeneous clients (in your case, anything that isn't J2EE), with good support for most common languages on most common platforms.
