My requirement is to share a Java object across a cluster.
I am confused about whether to write an EJB and share the Java objects across the cluster, whether to use a third-party product such as Infinispan, memcached, or Terracotta, or whether to use JCache.
The constraints are:
I can't change any of my source code to be specific to one application server (such as implementing WebLogic's singleton services).
I can't offer separate builds for clustered and non-clustered environments.
Performance should not be degraded.
If I need a third-party product, it must be open source.
It needs to work on WebLogic, WebSphere, JBoss, and Tomcat.
Can anyone suggest the best option with these constraints in mind?
It depends on the use case of the objects you want to share in the cluster.
I think it really comes down to the following options, from most complex to least complex:
Distributed caching
http://www.ehcache.org
Distributed caching is good if you need to ensure that an object is accessible from a cache on every node. I have used Ehcache to distribute quite successfully; there is no need to set up a Terracotta server unless you need the scale, as you can just point instances at each other via RMI. It also works synchronously or asynchronously depending on requirements. Cache replication is also handy if nodes go down, since the cache is redundant and you don't lose anything. Good if you need to make sure that the object has been updated across all the nodes.
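For illustration, a minimal usage sketch, assuming Ehcache 2.x and an ehcache.xml on the classpath that already configures RMI replication (the cache name is made up):

```java
import net.sf.ehcache.Cache;
import net.sf.ehcache.CacheManager;
import net.sf.ehcache.Element;

public class SharedCacheExample {
    public static void main(String[] args) {
        // Assumes an ehcache.xml on the classpath that configures RMI
        // replication for the (hypothetical) cache named "sharedObjects".
        CacheManager manager = CacheManager.newInstance();
        Cache cache = manager.getCache("sharedObjects");

        // Puts are replicated to the other nodes per the configured strategy.
        cache.put(new Element("customer:42", "some serializable value"));

        // Reads are served from the local copy.
        Element element = cache.get("customer:42");
        Object value = (element != null) ? element.getObjectValue() : null;
        System.out.println(value);

        manager.shutdown();
    }
}
```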
Clustered Execution/data distribution
http://www.hazelcast.com/
Hazelcast is also a nice option, as it provides a way of executing Java classes across a cluster. This is more useful if you have an object that represents a unit of work that needs to be performed and you don't care so much where it gets executed.
It is also useful for distributed collections, i.e. a distributed map or queue, as in the sketch below.
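As a rough sketch of both uses, assuming Hazelcast 3.x (the map and executor names are made up):

```java
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.core.IExecutorService;

import java.io.Serializable;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.Future;

public class HazelcastExample {
    // A unit of work that can be shipped to any member of the cluster.
    static class CountWork implements Callable<Integer>, Serializable {
        public Integer call() {
            return 42; // do the real work here
        }
    }

    public static void main(String[] args) throws Exception {
        // Members discover each other (multicast by default) and form a cluster.
        HazelcastInstance hz = Hazelcast.newHazelcastInstance();

        // A distributed map: visible to every node in the cluster.
        Map<String, String> shared = hz.getMap("shared-objects");
        shared.put("greeting", "hello from this node");

        // Clustered execution: the callable runs on some member,
        // not necessarily this one.
        IExecutorService executor = hz.getExecutorService("work");
        Future<Integer> result = executor.submit(new CountWork());
        System.out.println(result.get());

        hz.shutdown();
    }
}
```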
Roll your own RMI/JGroups
You can write your own client/server, but I think you will start to run into issues that the bigger frameworks solve once the requirements of the objects you're dealing with get complex. Realistically, Hazelcast is really simple and should eliminate the need to roll your own.
It's not open source, but Oracle Coherence would easily solve this problem.
If you need an implementation of JCache, the only one that I'm aware of being available today is Oracle Coherence; see: http://docs.oracle.com/middleware/1213/coherence/develop-applications/jcache_part.htm
For the sake of full disclosure, I work at Oracle. The opinions and views expressed in this post are my own, and do not necessarily reflect the opinions or views of my employer.
This is just an idea; you might want to check the exact implementation.
It will degrade performance, but I don't see how that can be avoided.
It is not an easy one to implement; maybe you should consider load balancing instead of clustering.
You might consider RMI and/or a dynamic proxy:
Extract an interface from your objects.
Use RMI to access the real object (from all nodes, even the one that actually holds the object).
To retrofit RMI onto existing code, you might use a dynamic proxy (again, I'm not sure about the implementation).
A dynamic proxy can wrap any object and perform pre- and post-tasks on each method invocation; in this case it might delegate to the original object via the RMI invocation.
You will need connectivity between the nodes in order to propagate the RMI object.
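To make the dynamic-proxy step concrete, here is a minimal local sketch using the JDK's java.lang.reflect.Proxy; the Account interface is hypothetical, and a real version would forward each call over RMI instead of invoking the local target:

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

public class ProxyExample {
    // A hypothetical interface extracted from the shared object.
    interface Account {
        void deposit(long amount);
        long getBalance();
    }

    static class LocalAccount implements Account {
        private long balance;
        public void deposit(long amount) { balance += amount; }
        public long getBalance() { return balance; }
    }

    public static void main(String[] args) {
        Account real = new LocalAccount();

        // The handler can do pre- and post-work around every call; in a
        // real setup it would forward the call over RMI instead of
        // invoking the local target.
        InvocationHandler handler = (proxy, method, methodArgs) -> {
            System.out.println("invoking " + method.getName());
            return method.invoke(real, methodArgs);
        };

        Account account = (Account) Proxy.newProxyInstance(
                Account.class.getClassLoader(),
                new Class<?>[] { Account.class },
                handler);

        account.deposit(100);
        System.out.println(account.getBalance()); // prints 100
    }
}
```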
If I have a distributed Java web application deployed in a cluster, and I have, say, 10 servlets and 10 JSPs running the show, and I want to share some data, say a variable or a simple POJO, between all the threads of all the servlets on all the machines, what is the way to do it?
No framework like Spring/Struts is used; let's say I'm only using basic Servlets and JSPs. Usually we think of the ServletConfig, ServletContext, HttpSession and HttpServletRequest objects to store information which needs to be passed/shared from one component to another. ServletContext has the largest scope because it's accessible from all the servlets and JSPs in the web app. But in the case of a distributed application, I guess the ServletContext object would be created once per JVM, so even for a single web app, every machine in the cluster will have a different Java object for ServletContext, correct? So in such a scenario, what should be done to share a POJO between all the servlets on all the machines of a single web app?
If it's not possible using plain Servlets and JSPs, do any frameworks make it possible? Would appreciate any inputs. Many thanks!
In a distributed architecture, it is useful to think beyond objects and think about "services". There are several possible solutions for this but all of them would include some form of service you could access from any of your 10 nodes.
So, you could for example create an 11th machine and host an API for putting and getting objects (values/maps/etc?). That would create a shareable region between the nodes.
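As a very rough sketch of such a service, using only the JDK's built-in HttpServer (Java 9+; the port and path are made up, and a real version would need authentication, error handling, and persistence):

```java
import com.sun.net.httpserver.HttpServer;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.ConcurrentHashMap;

// A tiny HTTP key-value service every node can reach:
// GET  /store?key=k reads a value, POST /store?key=k writes the body.
public class SharedStore {
    private static final ConcurrentHashMap<String, String> DATA = new ConcurrentHashMap<>();

    public static void main(String[] args) throws Exception {
        HttpServer server = HttpServer.create(new InetSocketAddress(8080), 0);
        server.createContext("/store", exchange -> {
            String query = exchange.getRequestURI().getQuery();
            String key = (query != null) ? query.replaceFirst("^key=", "") : "";
            if ("POST".equals(exchange.getRequestMethod())) {
                String value = new String(
                        exchange.getRequestBody().readAllBytes(), StandardCharsets.UTF_8);
                DATA.put(key, value);
                exchange.sendResponseHeaders(204, -1); // no response body
            } else {
                byte[] body = DATA.getOrDefault(key, "").getBytes(StandardCharsets.UTF_8);
                exchange.sendResponseHeaders(200, body.length);
                try (OutputStream os = exchange.getResponseBody()) {
                    os.write(body);
                }
            }
            exchange.close();
        });
        server.start();
    }
}
```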
However, this opens up a whole world of possible issues if not done correctly, because you need to think about synchronization, deadlocks, dirty reads and other concurrent-processing concerns in a cross-JVM mindset.
Also, many systems synchronize their nodes via the database, but this approach is somewhat dated nowadays, with the more recent "microservices" approach favoring distributed rather than monolithic persistence.
You are using Spring already, so maybe the Spring Session project is the right choice for you: http://projects.spring.io/spring-session/. It is surely the easiest one to get running.
You can use Hazelcast, a framework like memcached but with auto-discovery for clustering. I used it for session and cache sharing on my Amazon cluster and it works like a charm:
http://hazelcast.com/use-cases/caching/
But if you want to keep it simple, you can always use memcached, as I said before:
http://memcached.org/
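For example, with the spymemcached client (host, port, and key are made up):

```java
import net.spy.memcached.MemcachedClient;
import java.net.InetSocketAddress;

public class MemcachedExample {
    public static void main(String[] args) throws Exception {
        // Connect to a memcached daemon; host/port are assumptions.
        MemcachedClient client = new MemcachedClient(
                new InetSocketAddress("cache.example.com", 11211));

        // Store a value with a one-hour expiry; every instance of the
        // application sees the same entry.
        client.set("session:abc123", 3600, "serialized session data");

        // Read it back (possibly from another JVM).
        Object value = client.get("session:abc123");
        System.out.println(value);

        client.shutdown();
    }
}
```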
Sharing things between servers is:
error prone
sometimes complicated
The most common thing to want is user session data across a load balanced cluster of servers. If someone is talking to one server, then gets load balanced to a different server, you want to keep their session going. Tomcat Clusters does this, and it's already built in.
https://tomcat.apache.org/tomcat-7.0-doc/cluster-howto.html
The last time I played with that, it was touchy; don't count on session replication always working in every servlet container, and you'll be better off. Also, session replication is crazy expensive: once you're past a few machines, the cost (in RAM) of having all session data everywhere starts to add up quickly, and you can't add more users easily anymore.
Wanting to share things between multiple JVMs is a code smell; if you can architect around it, do so. But other than clustering, you have the two normal options:
a database. Tried, true, tested; keep details that need to change there.
an in-memory store. If it gets called on every request, and/or must be really fast for whatever reason, just consider keeping it in memory; memcached is a multi-machine in-memory key-value-store that does just this.
The simplest solution, within a single JVM, is ConcurrentHashMap: https://docs.oracle.com/javase/7/docs/api/java/util/concurrent/ConcurrentHashMap.html
If you want to scale your application across JVMs, you will need something like Hazelcast: http://hazelcast.com/
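To be clear about the scope, though: a ConcurrentHashMap is shared only between the threads of a single JVM, as in this sketch:

```java
import java.util.concurrent.ConcurrentHashMap;

public class LocalSharedState {
    // Shared safely between all threads of this JVM only; every other
    // node in a cluster still has its own, independent copy.
    static final ConcurrentHashMap<String, Integer> COUNTERS = new ConcurrentHashMap<>();

    public static void main(String[] args) {
        COUNTERS.merge("requests", 1, Integer::sum); // atomic increment
        System.out.println(COUNTERS.get("requests"));
    }
}
```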
I have to create a MySQL database to be used by several applications in parallel for the first time. Up until this point, my only experience with MySQL databases has been single programs (for example web servers) querying the database.
Now I am moving into a scenario where I will have several CXF Java servlet-type programs, as well as a background server, editing and reading the same schemas.
I am using the Connector/J JDBC driver to connect to the database in all instances.
My question is this: what do I need to do to make sure that the parallel access does not become a problem? I realize that I need to use transactions where appropriate, but where I am truly lost is the management.
For example:
Do I need to close the connection every time a servlet is done with a job?
Do I need a unique user for each program accessing the database?
Do I have to do something with my Connector/J objects?
Do I have to declare my tables in a different way?
Did I miss anything, or is there something I failed to think about?
I have a pretty good idea of how to handle transactions and the SQL itself, but I am pretty lost when it comes to what I need to do when setting up my database.
You should maintain a pool of connections. Connections are really expensive to create, on the order of several hundred milliseconds, so for high-volume apps it makes sense to cache and reuse them.
For your servlets it depends on what container you are using. Something like JBoss will provide pooling as part of the container; it can be defined through the datasource definition and accessed through JNDI. Other containers like Tomcat may rely on something like C3P0.
Most of these frameworks return custom implementations of JDBC connections that implement the close() method with logic that returns the connection to the pool. You should familiarize yourself with the details of your concrete implementation to make sure you are doing things in a way that is supported (see the sketch below).
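A minimal sketch of working against such a pooled DataSource (the table and column names are made up); the key point is that close() hands the connection back to the pool:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import javax.sql.DataSource;

public class UserDao {
    private final DataSource pool; // injected, e.g. looked up via JNDI

    public UserDao(DataSource pool) {
        this.pool = pool;
    }

    public String findName(long id) throws SQLException {
        // try-with-resources calls close(), which for a pooled
        // connection returns it to the pool instead of tearing it down.
        try (Connection con = pool.getConnection();
             PreparedStatement ps = con.prepareStatement(
                     "SELECT name FROM users WHERE id = ?")) {
            ps.setLong(1, id);
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() ? rs.getString("name") : null;
            }
        }
    }
}
```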
As for the concurrency considerations, you should familiarize yourself with the concepts of optimistic/pessimistic locking and transaction isolation levels. These have trade-offs where the correct answer can only be determined given the operational context of your application.
As for the user, most applications have one user that represents the application, called the read/write user. This user should only have privileges to read and write records in the tables, indices, sequences, etc. that are associated with your application. All instances of the application will specify this user in their connection string.
If you familiarize yourself with the concepts above, you'll be about 95% of the way there.
One more thing: as pointed out in the comments, on the administration side your database engine is a huge consideration. You should familiarize yourself with the differences and the tuning/configuration options.
I'm currently working on a Java application which should have the capability to use different versions of a class at the same time (because of multi-tenancy support). I was wondering whether there is a good approach to managing this. My basic approach is to have an interface, let's say Car, and implement the different versions as CarV1, CarV2, and so on, with every version getting its own class.
My approach is kind of weird, I think, but I didn't find any literature on this topic, and I actually don't know what I should search for.
The interface idea is prudent. Combine it with a factory that can produce the required implementation instance depending on some external input, e.g. the tenant id (see the sketch below). If you don't need to support multiple tenants in the same running instance of the application, you could also use something like the JDK's ServiceLoader, which allows a file-based configuration approach.
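A minimal sketch of that factory idea (the tenant ids and Car implementations are, of course, made up):

```java
public class CarFactoryExample {
    interface Car {
        String describe();
    }

    static class CarV1 implements Car {
        public String describe() { return "car, version 1"; }
    }

    static class CarV2 implements Car {
        public String describe() { return "car, version 2"; }
    }

    // The factory hides which version a tenant gets; callers only ever
    // see the Car interface.
    static Car carForTenant(String tenantId) {
        switch (tenantId) {
            case "tenant-a": return new CarV1();
            default:         return new CarV2();
        }
    }

    public static void main(String[] args) {
        System.out.println(carForTenant("tenant-a").describe()); // version 1
        System.out.println(carForTenant("tenant-b").describe()); // version 2
    }
}
```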
If you are running in an application server, consider just firing up multiple instances, each configured for a different client. The server will then take care of the separation of instances, just fine.
Otherwise, if you really think you need multiple implementations at the same time (at runtime) in a non-Java-EE application, this is a tricky problem. Maybe you want to take a look at OSGi containers, which provide features for having multiple versions of a class. However, an approach like this adds significant complexity if you are not already familiar with it.
In theory you can handle this using multiple class loaders, as JBoss does, for example.
BUT: I would strongly advise against implementing this yourself. This is a rather complicated matter and easily gotten wrong. If you are talking about a web application, you can instead create one web-app instance per tenant. If you are working on a stand-alone app, you should check whether running one instance per tenant might be feasible.
I have several beans in my application which get updated regularly via the usual setter methods. I want to synchronize these beans with a remote application which has the same bean classes. In my case bandwidth matters, so I have to keep the number of transferred bytes as low as possible. My idea is to create deltas of the state changes and transfer them instead of the whole objects. Currently I want to write the protocol to transfer those changes myself, but I'm not bound to that and would prefer an existing solution.
Is there already a solution to this problem out there? And if not, how could I easily monitor those state changes in a generalized way? AOP?
Edit: This problem is not caching-related, even if it may seem so at first. The data must be replicated from a central server to several clients (about 4 to 10) over the Internet. The client is a standalone desktop application.
This sounds remarkably similar to JBossCache running in POJO mode.
This is a distributed, delta-based cache that breaks down Java objects into a tree structure and only transmits changes to the bits of the tree that change.
Should be a perfect fit for you.
I like your idea of creating deltas and sending them.
A simple Map could hold the delta for one object, and serialization would then give you an efficient message to send.
To reduce the number of messages, which would otherwise kill your performance, you should group the deltas for all objects and send them as a whole; you could use other collections or maps to contain them.
To monitor all changes to many beans, AOP seems like a good solution; a sketch of a non-AOP alternative follows.
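As that sketch, the standard JavaBeans property-change mechanism could collect the deltas, assuming your setters fire PropertyChangeEvents (bean ids and the actual transfer are left out):

```java
import java.beans.PropertyChangeEvent;
import java.beans.PropertyChangeListener;
import java.beans.PropertyChangeSupport;
import java.util.HashMap;
import java.util.Map;

public class DeltaExample {
    // A bean that reports its state changes.
    static class Person {
        private final PropertyChangeSupport changes = new PropertyChangeSupport(this);
        private String name;

        public void setName(String newName) {
            String old = this.name;
            this.name = newName;
            changes.firePropertyChange("name", old, newName);
        }

        public void addListener(PropertyChangeListener l) {
            changes.addPropertyChangeListener(l);
        }
    }

    // Collects property -> new value; serialize and send this map
    // instead of the whole object, then clear it.
    static class DeltaRecorder implements PropertyChangeListener {
        final Map<String, Object> delta = new HashMap<>();

        public void propertyChange(PropertyChangeEvent e) {
            delta.put(e.getPropertyName(), e.getNewValue());
        }
    }

    public static void main(String[] args) {
        Person p = new Person();
        DeltaRecorder recorder = new DeltaRecorder();
        p.addListener(recorder);

        p.setName("Alice");
        System.out.println(recorder.delta); // prints {name=Alice}
    }
}
```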
EDIT: see Skaffmann's answer.
Using an existing cache technology could be better.
Many problems could already have solutions implemented...
I'm writing an HTTP Cache library for Java, and I'm trying to use that library in the same application which is started twice. I want to be able to share the cache between those instances.
What is the best solution for this? I also want to be able to write to that same storage, and it should be available for both instances.
Now I have a memory-based index of the files available to the cache, and this is not shareable over multiple VMs. It is serialized between startups, but this won't work for a shared cache.
According to the HTTP spec, I can't just map files to URIs, as there might be variations of the same payload depending on the request. For instance, I might have a request that varies on the 'Accept-Language' header: in that case I would have a different file for each request that specifies a different language.
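For illustration, the cache key would have to incorporate the request's values for every header the response varies on, roughly like this (the method is hypothetical):

```java
import java.util.List;
import java.util.Map;

public class CacheKeys {
    // Builds a cache key from the request URI plus the request's values
    // for each header named in the response's Vary header.
    static String cacheKey(String uri,
                           Map<String, String> requestHeaders,
                           List<String> varyHeaders) {
        StringBuilder key = new StringBuilder(uri);
        for (String header : varyHeaders) {
            key.append('|').append(header.toLowerCase()).append('=')
               .append(requestHeaders.getOrDefault(header.toLowerCase(), ""));
        }
        return key.toString();
    }

    public static void main(String[] args) {
        // Two requests for the same URI with different Accept-Language
        // values map to two distinct cache entries.
        System.out.println(cacheKey("/index.html",
                Map.of("accept-language", "en"), List.of("Accept-Language")));
        System.out.println(cacheKey("/index.html",
                Map.of("accept-language", "de"), List.of("Accept-Language")));
    }
}
```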
Any ideas?
First, are you sure you want to write your own cache when there are several around? Things like:
Ehcache
JBoss Cache
memcached
The first two are written in Java and the third can be accessed from Java. The first two also handle distributed caching, which is the general case of what you are asking for, I think. When they start up, they look to connect to other members so that they maintain a consistent cache across instances, and changes to one are reflected across instances. They can be set up to connect via multicast or with specific lists of servers.
Memcached typically works in a slightly different manner in that it is running externally to the Java processes you are running, so that all Java instances that start up will be talking to a common service. You can set up memcached to work in a distributed manner, but it does so by hashing keys so that the server you want to connect to can be determined by what it is you are looking for.
Doing a true distributed cache with consistent content is very hard to do well, which is why I suggest looking at an existing library. If you want to do it yourself, it would still help to look at those listed to see how they go about it, and to consider using something like JGroups as your underlying mechanism (a rough sketch follows).
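That sketch, assuming JGroups 3.x (the cluster name and payload are made up), could broadcast cache invalidations between the instances:

```java
import org.jgroups.JChannel;
import org.jgroups.Message;
import org.jgroups.ReceiverAdapter;

public class InvalidationExample {
    public static void main(String[] args) throws Exception {
        // Default protocol stack (UDP with multicast discovery).
        JChannel channel = new JChannel();
        channel.setReceiver(new ReceiverAdapter() {
            @Override
            public void receive(Message msg) {
                // e.g. evict this key from the local cache index
                System.out.println("invalidate: " + msg.getObject());
            }
        });
        channel.connect("http-cache-cluster");

        // A null destination broadcasts to every member of the group.
        channel.send(new Message(null, "cache-key-to-invalidate"));

        Thread.sleep(1000);
        channel.close();
    }
}
```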
I think you should have a look at the WebDAV specification. It's an HTTP extension for sharing/editing/storing/versioning resources on a server. There is an implementation as an Apache module, which allows you a swift start using it.
So instead of writing your own cache-server implementation, you might be better off with a local Apache + mod_dav instance that is available to both of your applications.
Extra bonus: since WebDAV is a specified protocol, you get interoperability with lots of tools for free.