JGroups, Terracotta & Hazelcast - java

Trying to wrap my head around these 3 projects and they all seem to handle slightly different problems that arise when trying to cluster. But all the documentation for them is sort of written for developers that are already "in the know", and are difficult for a newbie like me to make good sense of.
What are the specific problems each of them are trying to solve, and how do these problems differ from one another?
How does clustering with each of them differ from clustering app servers (like JBoss's or GlassFish's built-in clustering capabilities)?
Are the problems these frameworks solve different enough to warrant their use on the same project? Or are they competitors with each other and thus have different solutions to the same/similar problem?
Thanks in advance for any insight into these curious, yet elusive frameworks!

JGroups is more about task distribution and cluster management, while Hazelcast and Terracotta are more distributed caches (data grids). There is certainly overlap between them when you compare all the functionality; you need to figure out what functionality is most important to you and perhaps easiest to implement.
Hazelcast allows clustering through either TCP-based addressing or multicasting. It supports maps, multimaps, lists, queues, and topics; for disk-based backups, you have to implement its load/store interfaces.
With EhCache, you can use JGroups, JMS or RMI replication for caches.
In short, if you're looking for a distributed data cache/grid, Hazelcast or Ehcache would be the tools to look at; if you're looking for task distribution using a library and aren't concerned with an existing data grid cache, JGroups would work for you.

It is possible to distinguish two categories of technologies: i. enablers (i.e. middleware APIs) and ii. end-to-end or ready-to-use solutions (i.e. applicative APIs).
JGroups is an enabler technology as its core implements the Group Communication Primitives like Reliable Unicast, Multicast and Broadcast which are building blocks for more complex distributed protocols like Atomic Broadcast; it falls into the category of Group Communication Toolkits (GCTs).
Hazelcast as well as Terracotta are end-to-end service technologies as they provide a rich set of services for distributed applications; they fall into the category of In-Memory Data Grids (IMDGs), also known as distributed in-memory caching solutions, which are very well suited to computing on data with low latency.
In terms of capabilities:
JGroups provides a set of primitives to enable group membership, which is a key concept in any clustering scenario where a group of joining/leaving participants/nodes has to be managed in terms of lifecycle and roles; it allows you to create a rich set of protocols based on a Protocol Kernel design, by stacking micro-protocols on the foundation APIs (as said, group membership) which rely on reliable distribution of messages over both TCP and UDP. Out of the box, JGroups does not provide any composite services: such services can be built on top of the basic capabilities provided.
Hazelcast provides a rich set of distributed data structures which can be fully replicated or sharded with an implicit replication factor; distributed List, Map, Queue and Lock are examples of basic data structures implemented using the Java Collection interfaces. Clearly, distribution and thus replication implicitly need Group Membership services, which are provided by its engine, in particular by the Cluster Manager with its Cloud Discovery SPI module. Hazelcast can achieve Group Membership management via IP multicast, TCP, or a 3rd-party cloud service (e.g. ZooKeeper). Potentially, Hazelcast might use JGroups services for node discovery and cluster management (aka the Group Membership service).
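To make "sharded with a replication factor" concrete: each key hashes to a fixed partition, and each partition has an owner plus backups spread over the members. Below is a toy, stdlib-only sketch of that idea; Hazelcast's real partitioning service is far more involved, though 271 is indeed its default partition count.

```java
import java.util.ArrayList;
import java.util.List;

// Toy illustration of partition-based sharding with backups.
// Hazelcast's actual algorithm differs; 271 is its default partition count.
public class PartitionSketch {
    static final int PARTITIONS = 271;

    // A key always maps to the same partition, regardless of cluster size.
    static int partitionOf(Object key) {
        return Math.floorMod(key.hashCode(), PARTITIONS);
    }

    // Owner and backups for a partition, given an ordered member list.
    // backupCount = 1 means two copies of the data exist in total.
    static List<String> replicasFor(int partition, List<String> members, int backupCount) {
        List<String> replicas = new ArrayList<>();
        for (int i = 0; i <= backupCount && i < members.size(); i++) {
            replicas.add(members.get((partition + i) % members.size()));
        }
        return replicas;
    }
}
```

The point of the fixed partition count is that when members join or leave, only whole partitions migrate; individual keys never need re-hashing.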
Terracotta, on the other hand, when paired with the popular Ehcache, provides a Distributed Caching service which is in turn based on a set of basic Group Membership capabilities; in terms of implementation, Terracotta Ehcache may be built on top of JGroups services, and it provides a set of APIs specific to a caching system and therefore less generic than Hazelcast's.
Considering the relationship between the two types of technology, JGroups is effectively an enabling service (i.e. a building block) for compound services (i.e. semantically rich APIs) like Hazelcast and Terracotta, which provide end-to-end, ready-to-use services for 3rd-party applications, managing all the reliable-distribution aspects behind the scenes. In short, JGroups is middleware, while Hazelcast and Terracotta are applications which may embed their own middleware implementation for clustering services.

Related

Elasticsearch cluster load balancing best practices

I would like to understand whether I need, or whether it is considered good practice, to have a load balancer as part of an Elasticsearch deployment.
As far as I understand, both the high-level REST client and the transport client of Elasticsearch can manage load balancing between the nodes. So the client needs a comma-separated endpoint list and that's it.
Is there any point in also having a load balancer in the middle?
For which cases might it be useful?
What are the pros and cons of each method?
Normally an external load balancer is not very common in an ES cluster and is not required, as Elasticsearch already does load balancing and by default all the data nodes in an ES cluster act in the coordinating role; if you want to improve performance you can have dedicated coordinating nodes as well.
If your goal is smart load balancing that improves performance, then if you are on ES 6.x or higher (turned on by default in 7.x) you get it out of the box without any external configuration, via Adaptive Replica Selection.
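For reference, this is controlled by a single dynamic cluster setting (name as per the Elasticsearch docs; it is on by default from 7.x, so you would only ever need this to re-enable it on 6.x):

```
PUT _cluster/settings
{
  "transient": {
    "cluster.routing.use_adaptive_replica_selection": true
  }
}
```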
Having another load balancer means extra configuration and another layer before your request reaches ES, so IMHO it doesn't make much sense to use one.
The answer depends on your architecture and also your requirements. Do you need a loadbalancer for high availability? Or for performance reasons/scalability? Or both?
Elasticsearch, like many other distributed systems, comes with its own protocols and semantics to distribute load across multiple nodes and to manage fail-overs.
You can use these semantics to configure nodes in such a way that a node can perform just the role of a coordinator -- effectively acting as a load balancer for heavy duty operations like search requests or bulk index requests.
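For example, a coordinating-only node is created by switching the other roles off in elasticsearch.yml (settings names as of the 6.x/7.x line; newer releases replace these with an empty node.roles list):

```
node.master: false
node.data: false
node.ingest: false
```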
Elasticsearch also has its own built-in protocol for electing a new master node in case of failures -- again effectively performing the role of a load balancer.
In general, I would recommend you to use the native capabilities to achieve your goals instead of adding more complexity by introducing another technology in front of it.
If you want a stable URL for your cluster, then configure your DNS server to reach that goal. A cloud-provider-managed cluster should already have such a feature; otherwise you can configure it with some effort.

JAX-RS 2.0 (Resteasy client) for server to server communication

This is a rather generic question, but would you use JAX-RS to communicate between two server services running on potentially two different hosts (leveraging the Resteasy client)?
Or would you stick to the more traditional EJB remote invocation?
I'm a bit worried about the following potential issues:
- maintaining a pool of HTTP connections: it will be per client and not global to the application server
- no optimisation if both services are on the same host (EJB invocations would be local in this case)
- authorisation (credentials): managed by the application itself when configuring the RestClient vs. container managed for EJB
- what else?
Any feedback?
Thanks for your help.
Most implementations of JAX-RS have a client API, so the setup should be easy if you share annotated interfaces between the two projects. Communication may be slower than with other solutions because you have to serialize/deserialize all parameters and responses, usually in formats like XML or JSON. I wouldn't worry too much about optimizing inter-process communication, as communicating with localhost is still way faster than with a remote machine. If you expect to have parts of this API public, REST would be the best option, regardless of performance.
If the communication will only be internal, and you really care about performance, you could use a more specialized framework like Protocol Buffers. JAX-RS is a Java EE standard and REST is well established, though, which might be more important than performance. For large, complex, Java EE-based systems, a common solution would be to use messaging and integration frameworks like Apache ActiveMQ and Apache Camel, which also support JAX-WS/JAX-RS frameworks like Apache CXF and should have optimizations for inter-process communication. For small applications this seems like overkill, though.
I never used EJB so I can't really compare it to other solutions. From what I've heard, the whole EJB approach is way too complex and hasn't been adopted very well in the industry. I would also worry a bit about cross-platform compatibility.
I would choose a solution that is easy to set up and not too complex. One last thing: in my experience, when you expect two applications to be running on the same machine so often that you want to optimize for it, they probably should have been combined into a single server application in the first place, or maybe one of the servers should have been an optional plugin for the other.

Achieving JMS/AMQP messaging patterns using Redis

This question arises as I came across some mentions (such as this one) about using messaging software such as ZeroMQ along with Redis, but I keep hearing about Redis itself being used as a messaging system. So, if Redis is used along with other messaging systems, does that mean Redis has some serious deficiencies when used as a messaging system by itself?
While the use of Redis for caching and pub/sub is clear to me, it is not clear if Redis can be used in place of a full-fledged messaging system such as JMS, AMQP or ZeroMQ.
Leaving aside the standards-compliance aspect and concentrating only on functionality/features, does Redis provide support for all the messaging patterns/models required of a messaging system?
The messaging patterns I am talking about are :
RPC/Request-reply (an example using ActiveMQ/JMS and another using RabbitMQ/AMQP)
Pipeline/Work queues (once and at most once consumption of each message)
Broadcast (everyone subscribed to the channel)
Multicast (filtering of messages at server based on consumers' selectors)
Any other messaging pattern?
If yes, then Redis seems to solve two (possibly more) aspects at once: Caching and Messaging.
I am looking at this in the context of a web-application backed by a Java/Java EE server.
And I am looking at this not from a proof-of-concept point-of-view but from a large-scale software development angle.
Edit1:
user:791406 asked a valid question: "Who cares if redis supports those patterns; will redis meet your SLA and QoS needs?"
I thought it is better to give this detail as part of the question instead of in the comments section.
My current needs are less to do with SLAs and QoS and more to do with choosing a tool for my job (messaging) that I can keep using even when my requirements grow (reasonably) in the future.
I am starting with simplistic requirements at first and we all know requirements tend to grow.
And NO, I am not looking for a one tool that does it all.
I just want to know if Redis fulfills the usual requirements expected of a messaging system, like ActiveMQ/RabbitMQ does. Of course, if my SLA/QoS needs are extreme/eccentric, I would need to get a special tool to satisfy them. For example, in some cases ZeroMQ could be chosen over RabbitMQ due to specific SLA requirements. I am not talking about such special requirements. I am focusing on average enterprise requirements.
I was afraid (based on my little understanding) that even though Redis could be used as a basic tool for my messaging needs of today, it might be the wrong tool for a real messaging job in the future. I have experience with messaging systems like ActiveMQ/RabbitMQ and know that they can be used for simple to (reasonably) complex messaging needs.
Edit2:
The Redis website mentions that "Redis is often used as a messaging server", but how to achieve these messaging patterns with it is not clear.
Salvatore Sanfilippo mentions that Redis users tend to use it as a database, as a messaging bus, or as a cache. To what extent it can serve as a "messaging bus" is not clear.
When I was trying to find out which messaging requirements of JMS Redis doesn't support, I came across something that Redis supports but JMS doesn't: pattern-matching subscriptions, i.e. clients may subscribe to glob-style patterns in order to receive all the messages sent to channel names matching a given pattern.
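To make the glob-style semantics concrete: a subscription to news.* receives messages published to news.tech, news.sport, and so on. A small stdlib sketch of that matching rule follows; Redis's actual matcher is implemented in C and also supports [...] character classes, which are omitted here.

```java
import java.util.regex.Pattern;

// Sketch of Redis-style glob matching for channel names:
// '*' matches any run of characters, '?' matches exactly one character.
public class ChannelGlob {
    static boolean matches(String glob, String channel) {
        StringBuilder re = new StringBuilder();
        for (char c : glob.toCharArray()) {
            if (c == '*') re.append(".*");
            else if (c == '?') re.append('.');
            else re.append(Pattern.quote(String.valueOf(c))); // literal character
        }
        return Pattern.matches(re.toString(), channel);
    }
}
```

This is the behaviour behind Redis's PSUBSCRIBE command, as opposed to the exact-name SUBSCRIBE.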
Conclusion:
I have decided to use JMS for my messaging needs and use Redis for Caching.
What are your needs?
I think the question you should be asking yourself is "what quality of messaging do I need to support my application?" before deciding on a messaging platform. Who cares if redis supports those patterns; will redis meet your SLA and QoS needs? Focus on that first, then make your technology decision based on that assessment.
Having said that, I'll offer my input on how you can make that decision...
Highly-Reliable/Persistent/Durable Messaging
Let's take an extreme case: say you're building a trading or financial application. Such applications demand strict SLAs where message persistence, reliability, exactly-once delivery and durability are paramount. Using redis as the messaging backbone for this case is probably a bad choice, for lots of reasons:
message redelivery (when the sh*t hits the fan)
message store replication when redis goes down
message transactions (redis can't do XA)
producer/subscriber fault tolerance and disconnection resilience
message order sequencing
sending messages when broker is down (store-and-forward)
single-threaded redis may become bottleneck
If your system has strict SLAs, some or all of these issues will most definitely arise, so how will you handle these limitations? You can implement custom code around redis to address some issues, but why bother when mature messaging platforms like ActiveMQ, WebSphere MQ, and WebLogic JMS offer persistence, reliability, and fault tolerance? You said you're on a Java/Java EE stack, so you're in a position to use some of the most robust messaging frameworks available, either open source or commercial. If you're doing financial transactions, then you need to consider these options.
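As an aside, the request-reply pattern from the question is, at its core, just a private reply queue plus a correlation ID, which is what JMS exposes through the JMSReplyTo and JMSCorrelationID headers. A broker-free, in-JVM sketch of the mechanic (none of the durability concerns above apply to in-process queues, which is exactly the point):

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;

// Request-reply over plain queues: each request carries a correlation ID
// and the name of a private queue the reply should be delivered to.
public class RequestReply {
    static class Message {
        final String correlationId, replyTo, body;
        Message(String correlationId, String replyTo, String body) {
            this.correlationId = correlationId; this.replyTo = replyTo; this.body = body;
        }
    }

    final BlockingQueue<Message> requests = new ArrayBlockingQueue<>(16);
    final Map<String, BlockingQueue<Message>> replyQueues = new ConcurrentHashMap<>();

    // Server side: take one request, answer it on the queue it names.
    void serveOne() throws InterruptedException {
        Message req = requests.take();
        replyQueues.get(req.replyTo)
                   .put(new Message(req.correlationId, null, "echo:" + req.body));
    }

    // Client side: register a private reply queue, publish, then block on it.
    String call(String body) throws InterruptedException {
        String id = UUID.randomUUID().toString();
        BlockingQueue<Message> replyQueue = new ArrayBlockingQueue<>(1);
        replyQueues.put(id, replyQueue);
        requests.put(new Message(id, id, body));
        Message reply = replyQueue.take();
        replyQueues.remove(id);
        return reply.body;
    }
}
```

A broker adds what this sketch deliberately lacks: persistence of the queues, redelivery on consumer failure, and delivery across process and machine boundaries.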
High Performance/Large Distributed Systems Messaging
If you're building a social network or gaming platform where you want performance over reliability, ZeroMQ is probably a good fit. It's a socket communication library wrapped in a messaging-like API. It's decentralized (no broker), very fast, and highly resilient and fault tolerant. If you need to do things like N-to-N pub/sub with broker intermediaries, flow control, message persistence, or point-to-point synchronization, ZeroMQ offers the necessary facilities and code samples to do it all with minimal code while avoiding building solutions from the ground up. It's written in C but has client libraries for nearly every popular language.
Hope that helps...

How to use HornetQ with Struts2 under GlassFish

How can I use HornetQ with Struts2 to increase the performance of my web application, which needs to retrieve 1400 records from the database?
HornetQ provides high-efficiency messaging between distributed systems: http://en.wikipedia.org/wiki/Message_oriented_middleware
It has little to no impact on how fast you can retrieve any number of records from a database. That is more a factor of the DB API used (JDBC vs. JPA), the method of caching, the implementation of the database, and hardware/network concerns when accessing it. If you needed the highest possible speed in sending and receiving many messages between two servers, that is what HornetQ is built for; but if the goal is to retrieve and send many records at once, there will probably be a negligible difference.
Integration... I would not expect any integration of Struts2 and HornetQ. You can use Spring to instantiate the service, and Spring services would be used by both HornetQ and Struts2, but neither would have any awareness of the other. This is hinted at here: https://community.jboss.org/wiki/HornetQGeneralFAQs (but you need to read between the lines from the answer to "Is HornetQ tightly coupled to JBoss Application server?" and down).

Bidirectional Java clients/servers communication

I am trying to design a distributed application in Java with a few servers and a few managers (applications monitoring and configuring the servers). Most of the traffic between the servers and the managers will be request from the managers to the servers, but the servers should be able to notify the managers when something happen. The managers don't care about each other, and the servers don't care about each other, but the managers should manage all the running servers and the servers should be managed by all the running managers.
I don't want to have every request go through a central entity.
What is the best way to implement the communication link between this?
What I thought of:
RESTful API: Unidirectional communication, so we need polling, which is less responsive and wastes a lot of resources.
Opening a Socket between each server/manager pair: Looks like the best way to me, but I feel like I would be re-inventing the wheel...
Java RMI: Looks nice but it needs a central entity (the rmiregistry). Plus, I'm stuck with Java if I use it, and it looks like it's better suited to point-to-point communication.
JMS: Same thing, it needs a central entity (JMS Provider), and it's Java only...
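For what it's worth, the socket option above is only a little JDK code, but the sketch below also shows what you would be re-inventing: there is no framing, reconnection, discovery, or security here, and the one-line-of-text-per-event protocol is an assumption made up for the example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.ServerSocket;
import java.net.Socket;

// Minimal push channel: the managed server accepts a manager connection
// and can notify it at any time, one line of text per event.
public class NotifyingServer implements AutoCloseable {
    final ServerSocket listener;

    NotifyingServer(int port) throws IOException { listener = new ServerSocket(port); }

    int port() { return listener.getLocalPort(); }

    // Server side: accept one manager and push a single event to it.
    void acceptAndNotify(String event) throws IOException {
        try (Socket manager = listener.accept();
             PrintWriter out = new PrintWriter(manager.getOutputStream(), true)) {
            out.println(event);
        }
    }

    // Manager side: connect and block until the server pushes an event.
    static String awaitEvent(int port) throws IOException {
        try (Socket s = new Socket("localhost", port);
             BufferedReader in = new BufferedReader(new InputStreamReader(s.getInputStream()))) {
            return in.readLine();
        }
    }

    public void close() throws IOException { listener.close(); }
}
```

Multiplying this by every server/manager pair, plus the error handling the sketch skips, is the wheel-reinventing the question worries about.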
Thanks!
Look at AMQP / RabbitMQ
RabbitMQ can be clustered - see http://www.rabbitmq.com/clustering.html
It has bindings for most programming languages, and it is also very fast, scalable, and stable.
The JMS broker is Java only, but the clients can be just about anything.
If you don't want a central entity you can have an embedded broker, or a broker dedicated to each server.
I would use ActiveMQ because it's one of the easiest to get started with (and it's fast).
