I would like to understand whether I need a load balancer as part of an Elasticsearch deployment, or whether having one is considered good practice.
As far as I understand, both the high-level REST client and the transport client of Elasticsearch can manage load balancing between the nodes, so the client just needs a comma-separated list of endpoints and that's it.
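For example, with the high-level REST client the list of endpoints is essentially all the configuration you give it, and it spreads requests across those nodes. A minimal sketch (the host names are placeholders):

```java
import org.apache.http.HttpHost;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;

// The client round-robins requests across every node in this list and
// temporarily skips nodes that stop responding; "es1".."es3" are made up.
RestHighLevelClient client = new RestHighLevelClient(
        RestClient.builder(
                new HttpHost("es1", 9200, "http"),
                new HttpHost("es2", 9200, "http"),
                new HttpHost("es3", 9200, "http")));
```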
Is there any point in also having a load balancer in the middle?
In which cases might it be useful?
What are the pros and cons of each approach?
Normally an external load balancer is not very common in an ES cluster and is not required, as Elasticsearch already does load balancing and, by default, every node in the cluster can act in the coordinating role. If you want to improve performance, you can add dedicated coordinating nodes as well.
If your goal is smart load balancing that improves performance, then on ES 6.x or higher you get it out of the box via adaptive replica selection (turned on by default in 7.x), without any external configuration.
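If you want to toggle it explicitly (e.g. on 6.x, where you may need to opt in), it is a single cluster setting. A minimal sketch with the high-level REST client, assuming `client` is your RestHighLevelClient from above:

```java
import org.elasticsearch.action.admin.cluster.settings.ClusterUpdateSettingsRequest;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.common.settings.Settings;

// Persistently enable adaptive replica selection (already the default on 7.x).
ClusterUpdateSettingsRequest request = new ClusterUpdateSettingsRequest();
request.persistentSettings(Settings.builder()
        .put("cluster.routing.use_adaptive_replica_selection", true)
        .build());
client.cluster().putSettings(request, RequestOptions.DEFAULT);
```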
Having another load balancer means extra configuration and another layer before your request reaches ES, so IMHO it doesn't make sense to use one.
The answer depends on your architecture and also your requirements. Do you need a load balancer for high availability? Or for performance reasons/scalability? Or both?
Elasticsearch, like many other distributed systems, comes with its own protocols and semantics for distributing load across multiple nodes and managing failovers.
You can use these semantics to configure nodes in such a way that a node performs just the role of a coordinator -- effectively acting as a load balancer for heavy-duty operations like search requests or bulk index requests.
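For illustration, a coordinating-only node is simply a node with all other roles switched off in elasticsearch.yml:

```yaml
# elasticsearch.yml for a coordinating-only node (6.x / early 7.x syntax)
node.master: false
node.data: false
node.ingest: false
# On 7.9+ the equivalent is: node.roles: []
```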
Elasticsearch also has its own built-in protocol for electing a new master node in case of failures -- again covering, natively, a failover concern you might otherwise push onto a load balancer.
In general, I would recommend using these native capabilities to achieve your goals instead of adding complexity by introducing another technology in front of the cluster.
If you want a stable URL for your cluster, configure your DNS server to achieve that. A cloud-provider-managed cluster should already have such a feature; otherwise you can set it up with some effort.
I have a cloud-native application, which is implemented using Spring Cloud Netflix.
So, in my application, I'm using Eureka service discovery to manage all instances of different services of the application. When each service instance wants to talk to another one, it uses Eureka to fetch the required information about the target service (IP and port for example).
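In Spring Cloud Netflix terms, that lookup is usually hidden behind a Ribbon-backed RestTemplate. A sketch, where inventory-service is a made-up Eureka service id:

```java
import org.springframework.cloud.client.loadbalancer.LoadBalanced;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.web.client.RestTemplate;

@Configuration
public class ClientConfig {
    // The logical service name in the URL is resolved against Eureka's
    // registry and the call is load balanced on the client side by Ribbon.
    @Bean
    @LoadBalanced
    RestTemplate restTemplate() {
        return new RestTemplate();
    }
}

// Usage elsewhere ("inventory-service" is a placeholder service id):
// String item = restTemplate.getForObject("http://inventory-service/items/42", String.class);
```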
Service orchestration can also be achieved using tools like Docker Swarm and Kubernetes, and it looks like there is some overlap between what Eureka does and what Docker Swarm and Kubernetes can do.
For example, imagine I create a service in Docker Swarm with 5 instances; Swarm ensures that those 5 instances are always up and running. Additionally, each service of the application sends a periodic heartbeat to Eureka internally, to show that it's still alive. It seems we have two layers of health checks here: one for Docker and another inside Spring Cloud itself.
Or, for example, you can expose a port for a service across the entire swarm, which eliminates some of the need for service discovery (the ports are always known). Another example is the load balancing performed by Docker's routing mesh versus the load balancing done internally by the Ribbon component or Eureka itself. In that case, adding a hardware load balancer gives us three layers of load-balancing functionality.
So, I want to know: is it rational to use these tools together? It seems that combining these technologies greatly increases the complexity of the application and may be redundant.
Thank you for reading!
If you already have the application working, then there's presumably more effort and risk in removing the Netflix components than in keeping them. There's an argument that if you could remove e.g. Eureka, then you wouldn't need to maintain it and it would be one less thing to upgrade. But that might not justify the effort, and it also depends on whether you are using it for anything that might not be fulfilled by the orchestration tool.
For example, if you're connecting to services that are not set up as load-balanced ('headless services'), then you might want Ribbon within your services. (You could do this using tools in the Spring Cloud Kubernetes incubator project or its Fabric8 equivalent.) Another situation to be mindful of is when you're connecting to external services (i.e. services outside the Kubernetes cluster): then you might want to add load balancing or rate limiting, and Ribbon/Hystrix would be an option. It will depend on how nuanced your requirements for load balancing or rate limiting are.
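For instance, pointing Ribbon at a fixed, non-discovered endpoint is just a static server list. A sketch in application.yml, where external-api is a made-up client name:

```yaml
# Ribbon client with a static server list instead of discovery.
external-api:
  ribbon:
    listOfServers: api.example.com:443
    # If Eureka is on the classpath you may also need to force the static list:
    # NIWSServerListClassName: com.netflix.loadbalancer.ConfigurationBasedServerList
```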
You've asked specifically about Netflix, but it's worth stating clearly that Spring Cloud includes other components, not just the Netflix ones, and that there are other areas of overlap where you will need to make choices.
I've focused on Kubernetes rather than Docker Swarm, partly because that's what I know best and partly because I believe that's the current direction of travel for the industry; on this you should note that Kubernetes is available within Docker EE. I guess you've read many comparison articles, but https://hackernoon.com/a-kubernetes-guide-for-docker-swarm-users-c14c8aa266cc might be particularly interesting to you.
You are correct in that it does seem redundant. From personal observation, I think each layer of that architecture should handle load balancing in its own specific way. It ends up giving you a lot more flexibility for not much more cost. If you want to take advantage of client-side load balancing and failover features, it makes sense to have Eureka. The major benefit is that if you don't want to take advantage of all of the features, you don't have to.
Load balancing at the container-orchestration level has a place for any applications or services that do not participate in your application-level service discovery (Eureka).
The hardware load balancer provides another level that allows for load balancing outside of your container orchestrator.
The specific use case I ran into was a Kubernetes cluster on AWS with Traefik, together with Eureka and Spring Cloud.
Yes, you are correct. We have a similar Spring Cloud Netflix application deployed on Oracle Cloud Platform and Predix Cloud Foundry. If you use multiple Kubernetes clusters, then you have to use Ribbon load balancing because you have multiple instances of services.
I cannot tell you which is better, Kubernetes or Docker Swarm. We use Kubernetes for service orchestration as it provides more flexibility.
I am working on setting up a two-node load-balanced Tomcat web server deployment, either using Apache or a commercial tool named Barracuda. The servers are high-end Intel servers with a powerful configuration (for the requirement under consideration) in terms of CPU, memory and internal RAID. The two options we have are either to run the two nodes active-active, meaning both Tomcats serve requests in round-robin fashion, or to use just one node and keep the other only as a standby in case the first node fails. One disadvantage of the active-active configuration is that the architecture won't be shared-nothing, as the two Tomcats would need to share the web content via a disk array.
I would highly appreciate any input regarding these architectural styles.
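For what it's worth, the two styles are only a line apart in Apache's mod_proxy_balancer. A sketch with placeholder host names:

```apache
# Active-active: both Tomcats serve requests round-robin.
<Proxy "balancer://tomcats">
    BalancerMember "http://tomcat1:8080"
    BalancerMember "http://tomcat2:8080"
    # For active-passive, mark the second member as a hot standby instead:
    # BalancerMember "http://tomcat2:8080" status=+H
    ProxySet lbmethod=byrequests
</Proxy>

ProxyPass        "/app" "balancer://tomcats/app"
ProxyPassReverse "/app" "balancer://tomcats/app"
```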
I have recently started looking into Infinispan as our caching layer, after reading through its operation modes, described below:
Embedded mode: This is when you start Infinispan within the same JVM as your applications.
Client-server mode: This is when you start a remote Infinispan instance and connect to it using a variety of different protocols.
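In code, the difference between the two modes looks roughly like this (a sketch; the cache name and host are made up):

```java
import org.infinispan.Cache;
import org.infinispan.client.hotrod.RemoteCache;
import org.infinispan.client.hotrod.RemoteCacheManager;
import org.infinispan.configuration.cache.CacheMode;
import org.infinispan.configuration.cache.ConfigurationBuilder;
import org.infinispan.configuration.global.GlobalConfigurationBuilder;
import org.infinispan.manager.DefaultCacheManager;

public class Modes {
    public static void main(String[] args) {
        // Embedded mode: the cache runs inside the service's own JVM.
        DefaultCacheManager embedded = new DefaultCacheManager(
                GlobalConfigurationBuilder.defaultClusteredBuilder().build());
        embedded.defineConfiguration("users",
                new ConfigurationBuilder().clustering()
                        .cacheMode(CacheMode.DIST_SYNC).build());
        Cache<String, String> local = embedded.getCache("users");
        local.put("123", "user-record");

        // Client-server mode: a separate Infinispan server reached over
        // the Hot Rod protocol ("cache-host" is a placeholder).
        RemoteCacheManager remote = new RemoteCacheManager(
                new org.infinispan.client.hotrod.configuration.ConfigurationBuilder()
                        .addServer().host("cache-host").port(11222)
                        .build());
        RemoteCache<String, String> users = remote.getCache("users");
        users.put("123", "user-record");
    }
}
```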
Firstly, I am confused about which of the above two modes is best suited to my application.
I have a very simple use case: client-side code calls our REST service using the service's main VIP; the call is load balanced to an individual service server where our service is deployed, which then queries the Cassandra database to retrieve data based on the user ID. The picture below should make everything clear.
Suppose, for example, a client is looking for data for userId = 123. It calls our REST service using the main VIP and is load balanced to one of our four service servers, say Service1; Service1 then queries the Cassandra database for the userId = 123 record and returns it to the client.
Now we are planning to cache the data using Infinispan, since compaction is killing our performance, so that our read performance gets a boost. That's why I started looking into Infinispan and came across the two modes mentioned above. I am not sure what the best way to use Infinispan is in our case.
Secondly, what I am expecting from the Infinispan cache is this: if I go with embedded mode, it should look something like this.
If yes, then how will the Infinispan caches interact with each other? It might be possible that at some point we will be looking for data for userIds that live in another service instance's Infinispan cache, right? So what will happen in that scenario? Will Infinispan take care of that as well? If yes, then what configuration do I need to make sure this works correctly?
Pardon my ignorance if I am missing anything. Any clear information on my two questions above would make things much clearer for me.
With regard to your second image: yes, the architecture will look exactly like that.
If yes, then how will the Infinispan caches interact with each other?
Please, take a look here: https://docs.jboss.org/author/display/ISPN/Getting+Started+Guide#GettingStartedGuide-UsingInfinispanasanembeddeddatagridinJavaSE
Infinispan manages this using the JGroups protocol, sending messages between the nodes. The cluster forms and the nodes join it; after that you get the expected behaviour of entries being replicated across the appropriate nodes.
And that brings us to your next question:
It might be possible that at some point we will be looking for data for userIds that live in another service instance's Infinispan cache, right? So what will happen in that scenario? Will Infinispan take care of that as well?
Infinispan was designed for exactly this scenario, so you don't need to worry about it at all. If you have, for example, 4 nodes and configure distribution mode with numOwners=2, your cached entries will live on exactly 2 nodes at any given moment. When you issue a GET command on a non-owner node, the entry is fetched from an owner.
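Programmatically, that distribution setup is just (a sketch):

```java
import org.infinispan.configuration.cache.CacheMode;
import org.infinispan.configuration.cache.Configuration;
import org.infinispan.configuration.cache.ConfigurationBuilder;

// Distributed cache: every entry is kept on exactly 2 of the cluster's nodes.
Configuration dist = new ConfigurationBuilder()
        .clustering().cacheMode(CacheMode.DIST_SYNC)
        .hash().numOwners(2)
        .build();
```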
You can also set the clustering mode to replication, where all nodes contain all entries. Please read more about the modes here: https://docs.jboss.org/author/display/ISPN/Clustering+modes and choose what is best for your use case.
Additionally, when you add a new node to the cluster, a state transfer takes place to synchronize/rebalance entries across the cluster. Non-blocking state transfer is already implemented, so your cluster will still be able to serve responses during the joining phase. See: https://community.jboss.org/wiki/Non-BlockingStateTransferV2
The same goes for removed or crashed nodes: an automatic rebalancing process kicks in, so, for example, entries which after a crash live on only one node will be replicated again until they live on 2 nodes, as required by the numOwners property in distribution mode.
To sum up, your cluster stays up to date, and it does not matter which node you ask for a particular entry: if that node does not contain it, the entry is fetched from an owner.
If yes, then what configuration do I need to make sure this works correctly?
The aforementioned Getting Started guide is full of examples, and you can also find sample configuration files in the Infinispan distribution: ispn/etc/config-samples/*
I would also suggest taking a look at this source: http://refcardz.dzone.com/refcardz/getting-started-infinispan where you can find even more basic and very quick configuration examples.
That source also provides decision-related information for your first question: "Should I use embedded mode or remote client-server mode?" From my point of view, using a remote cluster is the more enterprise-ready solution (see: http://howtojboss.com/2012/11/07/data-grid-why/). Your caching layer is then easily scalable, highly available and fault tolerant, and it is independent of your database and application layers because it simply sits between them.
And you may be interested in this feature as well: https://docs.jboss.org/author/display/ISPN/Cache+Loaders+and+Stores
I think the newest Infinispan release supports a special compatibility mode for users interested in accessing Infinispan in multiple ways.
Follow the link below to configure your cache environment to support both embedded and remote access:
Interoperability between Embedded and Remote Server Endpoints
I've heard the term "clustering" used for application servers like GlassFish, as well as with Terracotta, and I'm trying to understand what the word implies when used in conjunction with application servers, and when used in conjunction with Terracotta.
My understanding is:
If a GlassFish server is clustered, then it means we have multiple physical/virtual machines, each with their own JRE/JVM running separate instances of GlassFish. However, since they are clustered, they will all communicate through their admin server ("DAS"), and have the same apps deployed to all of them. They will effectively act (to the end user) as if they are a single app server - but now with load balancing, failover/redundancy and scalability added into the mix.
Terracotta is, essentially, a product that makes multiple JVMs, running on different physical/virtual machines, act as if they are a single JVM.
Thus, if my understanding is correct, the following are implied:
1) You cluster app servers when you want load balancing and failover tolerance
2) You use Terracotta when any particular JVM is too small to contain your application and you need more "horsepower"
3) Thus, technically, if you have a GlassFish cluster of, say, 5 server instances, each of those 5 instances could actually be an array/cluster of Terracotta instances; meaning each GlassFish server instance is itself living across the JVMs of multiple machines
If any of these assertions/assumptions are untrue, please correct me! If I have gone way off-base and clearly don't understand clustering and/or the very purpose of Terracotta, please point me in the right direction!
Terracotta enables you to have shared state across all your nodes (it's stateful). Basically, it creates a shared memory space between different JVMs. This is useful when nodes in a cluster all need access to the same objects.
If your application is stateless and you just need load balancing and failover, you can use a solution like JGroups. In this scenario each node just handles requests and has little idea about the other nodes. Objects in memory are not shared across nodes, and each JVM just runs on its own and has no idea about the other JVMs. This often works nicely for request/response-type applications. A web server serving content (without sessions) does this, for example.
Dealing with a stateless cluster is often simpler than dealing with a stateful cluster, because in a stateless cluster the nodes know almost nothing about each other, which results in fewer things that can go wrong.
GlassFish sits a bit in the middle of the above concepts: objects in memory within GlassFish are visible to all nodes, but the frontend (HTTP connectors) works statelessly.
So to answer your questions:
1) Yes, those are the two most obvious reasons. However, sometimes people want only failover, or only load balancing, or both. Not all clustering solutions address both problems.
2) Yes, although technically speaking Terracotta only solves the shared-memory part, not the CPU part. However, by solving the memory part it automatically solves the CPU part, since you can now just add JVMs to the shared memory space.
3) I don't know if that's practically possible, but as a thought experiment: yes.
Clustering can mean one of the following:
1) Multiple instances can be managed as one. Deploy an application to the cluster, and it is deployed to all instances in the cluster. Make a configuration change, and that change will be pushed to all nodes in the cluster. GlassFish supports this out of the box.
2) Service availability. If any one instance fails, the application is available on another instance. Without high availability enabled, any instance failure also results in session loss for any sessions being managed by that instance. GlassFish supports this out of the box.
3) High availability. If any one instance fails, the application is available on another instance, and there is no session loss because a session replica is also maintained on another instance. GlassFish supports this. You will have to choose either #2 or #3 in any one cluster.
What you are asking about, IMHO, is really #3, because it is the only real case where Terracotta - in the context of high-availability clustering - will offer value with GlassFish. GlassFish already offers built-in high availability, so there had better be a very good reason to add Terracotta to the solution, because it will complicate the deployment architecture.
The primary reason I can think of for adding Terracotta is that you may want to offload session management to a data grid and free up GlassFish to run business logic. This may be due to more frequent garbage collection or to wanting to manage more users per GlassFish instance. However, I'm not sure that Terracotta can do this seamlessly. With GlassFish's built-in HA clustering, replicating sessions is seamless (no application logic modifications). You may have to write code to put/get data from a Terracotta cache; I'll let you research that :-) Oracle GlassFish Server also integrates (seamlessly) with Coherence to solve this problem: you can separate session management into a Coherence data grid without modifying your application code.
Unless you know for a fact up front that your application must scale to a very large number of concurrent users, start with built-in HA clustering, run tests, and go from there.
Hope this helps.
Does anyone know a simple load-balancing algorithm (formula) that relates connected users, CPU load, network load and memory usage?
This will be used to compare various servers and assign a new user to the best one at that moment.
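For illustration, the kind of formula I have in mind is a weighted composite score, lower being better; the weights here are arbitrary and would need tuning:

```java
// Naive composite load score: each input normalized to 0..1, lower is better.
public class LoadScore {
    static double score(double cpu, double mem, double net, double users) {
        // Weights are arbitrary assumptions, not measured values.
        return 0.4 * cpu + 0.2 * mem + 0.2 * net + 0.2 * users;
    }
    // The new user would be assigned to the server with the smallest score.
}
```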
Thank you.
If you are using Apache Web Server to proxy the application servers, I recommend that you use mod_proxy and mod_proxy_balancer. You can find a quick introduction to mod_proxy here. It is about Jetty, but it is easily applicable to other servers.
The first thing you need to worry about with clustering is the way sessions are handled. You need to be sure that requests belonging to a session are directed to the same server (or that the session is somehow persisted and can always be retrieved). mod_proxy can do this for you.
Regarding the load balancing algorithm, see the documentation of mod_proxy_balancer. According to it, there are 3 load balancer scheduler algorithms.
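A minimal sketch tying the two points together (backend hosts are placeholders): lbmethod picks one of the three scheduler algorithms (byrequests, bytraffic, bybusyness), and stickysession covers the session-affinity concern above:

```apache
<Proxy "balancer://app">
    # route values must match each backend Tomcat's jvmRoute for stickiness
    BalancerMember "http://app1:8080" route=app1
    BalancerMember "http://app2:8080" route=app2
    ProxySet lbmethod=byrequests stickysession=JSESSIONID
</Proxy>
ProxyPass        "/" "balancer://app/"
ProxyPassReverse "/" "balancer://app/"
```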
An older solution to load balancing is mod_jk.
In general, this is not something I would implement myself, even if I had a better algorithm. It is better to use an existing solution.
Have a look at nginx. It is easy to configure, very fast, and handles load balancing between servers.
Distributed session handling is still needed; see kgiannakakis's answer for details.
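For example, a minimal nginx setup with crude IP-based session affinity (addresses are placeholders):

```nginx
upstream app_servers {
    ip_hash;                 # same client IP -> same backend (naive stickiness)
    server 10.0.0.1:8080;
    server 10.0.0.2:8080;
}

server {
    listen 80;
    location / {
        proxy_pass http://app_servers;
    }
}
```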
Have a look at haproxy. It is an extremely stable and fast HTTP/TCP load balancer that is used by some extremely high-traffic websites.
Distributed session handling is still needed; see kgiannakakis's answer for details.
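Again as a sketch (addresses are placeholders), HAProxy can pin a client to one backend with an inserted cookie, which covers sessions without distributing them:

```haproxy
frontend http-in
    bind *:80
    default_backend app_servers

backend app_servers
    balance roundrobin
    cookie SRV insert indirect nocache
    server app1 10.0.0.1:8080 cookie app1 check
    server app2 10.0.0.2:8080 cookie app2 check
```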