Load balancing server, how can I implement it? - java

I googled for load balancing, but the only thing I can find is the theory, which at the moment is the "easy" part for me. I found zero examples of how to actually implement one.
I have several questions pertaining to load balancing:
1. I have a domain (example.com) and behind it a load balancing server (let's call it A) which, according to the theory, will ask the client to close the connection with A, connect to a sub-server B, and carry on the request with B. Will the client, in a web browser, stop seeing "example.com/page.html" in the address bar and start seeing "B_ip_address/page.html" instead?
2. How can I implement a simple load balancer from scratch? My doubts concern the HTTP part. Is there some specific message or set of messages I need to send the client that will make it disconnect from me and connect to a sub-server?
3. What about protocols lower level than HTTP, such as TCP/IP: are there any standard packets to tell the client it has just connected to a load balancer and now needs to connect to xxx.xxx.xxx.xxx to carry on the request?
4. Which method is the most used: (1) the client connects to the load balancing server, which asks the client to connect directly to one of the sub-servers, or (2) the load balancing server bridges all traffic between the client and the sub-server in a transparent fashion?
So questions 2, 3 and 4 concern the load balancing protocol, and question 1 the way a domain name can be connected to a load balancer and what the underlying consequences are.

Your approach is a kind of static load balancing by redirecting the calls to another server. All following calls may use this other server, or they are sent to the load balancer again for another redirect.
An implementation depends on your system. A load balancer works best for independent requests with no session state. Otherwise you need to sync the session state between the "end" servers, or use a shared session store that provides the session state to all servers.
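If you want to experiment with the redirect variant in plain Java, a minimal sketch could look like this (built on the JDK's com.sun.net.httpserver; the backend addresses are placeholders). Note that, as asked in question 1, the client ends up seeing the sub-server's address after the redirect:

import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.net.InetSocketAddress;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Minimal redirect-based balancer: answer every request with a 307 that
// points the client at the next backend in a round-robin list.
public class RedirectBalancer {
    private static final List<String> BACKENDS = List.of(
            "http://backend1.example.com",   // placeholder
            "http://backend2.example.com");  // placeholder
    private static final AtomicInteger NEXT = new AtomicInteger();

    public static void main(String[] args) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(8080), 0);
        server.createContext("/", RedirectBalancer::redirect);
        server.start();
    }

    private static void redirect(HttpExchange exchange) throws IOException {
        String backend = BACKENDS.get(Math.floorMod(NEXT.getAndIncrement(), BACKENDS.size()));
        // 307 keeps the original method and body; the Location header tells the
        // client to repeat the request against the chosen sub-server directly.
        exchange.getResponseHeaders().add("Location", backend + exchange.getRequestURI());
        exchange.sendResponseHeaders(307, -1);
        exchange.close();
    }
}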
There is a simple and transparent solution for HTTP server load balancing: the load balancing module of an nginx server (http://nginx.org/en/docs/http/load_balancing.html). It can be used for HTTP and HTTPS requests, and it can be extended with extra servers dynamically if the load increases; you only need to edit the nginx configuration and restart the server, which can be transparent to existing connections. And nginx does not cause problems with changing domain or host names.
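For reference, the basic pattern from the linked nginx documentation looks roughly like this (the upstream server names are placeholders); nginx proxies each request to one of the listed servers, so the client always talks to example.com:

http {
    upstream backend_pool {
        server backend1.example.com;   # placeholder
        server backend2.example.com;   # placeholder
    }

    server {
        listen 80;

        location / {
            # round robin by default; the client never sees the backend address
            proxy_pass http://backend_pool;
        }
    }
}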
Other protocols need some support from the client and the server. Load balancing can be transparent if a specialized device sits between client and server; otherwise the communication protocol needs to support connection redirects.
Edit:
Load balancing can be implemented by DNS round robin too. Each DNS lookup returns the IP addresses for the same domain name in a rotated order. The client chooses an IP and connects to that server; another client may get a different IP first. The address bar shows the same name all the time.
Example:
Non-authoritative answer:
Name: www.google.com
Addresses: 2a00:1450:4001:80f::1010
173.194.116.209
173.194.116.210
173.194.116.212
173.194.116.211
173.194.116.208
Non-authoritative answer:
Name: www.google.com
Addresses: 2a00:1450:4001:80f::1010
173.194.116.210
173.194.116.212
173.194.116.211
173.194.116.208
173.194.116.209
Non-authoritative answer:
Name: www.google.com
Addresses: 2a00:1450:4001:80f::1010
173.194.116.212
173.194.116.211
173.194.116.208
173.194.116.209
173.194.116.210
The IP address range rotates. Most HTTP load balancers work as transparent load balancers, like nginx or other reverse proxy implementations. A redirecting load balancer is more of a low-tech implementation, I think.
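If you want to observe that rotation from Java, a minimal sketch (the hostname is just an example; note that the JVM caches DNS results, so repeated lookups inside one process may return the same order):

import java.net.InetAddress;

// Print every address the resolver returns for one name. With DNS round
// robin the order changes between lookups, and most clients simply use
// the first entry, which spreads connections across the servers.
public class DnsRoundRobin {
    public static void main(String[] args) throws Exception {
        for (InetAddress address : InetAddress.getAllByName("www.google.com")) {
            System.out.println(address.getHostAddress());
        }
    }
}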
TCP/IP is not an application protocol. It is the transport layer used to transfer the data of a specific communication protocol; TCP/IP itself is a protocol suite for the network components, not for the applications. You may check https://en.wikipedia.org/wiki/OSI_model .

Related

Why do we use the port number in localhost:8080? Why don't we use a port number when using www.example.com?

When I run a Spring Boot app locally, it uses localhost:8080. When it is pushed to Pivotal Cloud Foundry, it has some route like https://my-app.xyz-domain.com and we can access the URL without a port. What is happening behind the scenes?
Please help me understand.
There is a default port number for each protocol, which the browser uses if none is specified: for HTTPS it is 443, for HTTP 80, and for Telnet 23.
On Unix-like systems such as Linux, ports below 1024 are privileged and often not available to an ordinary developer account, so other ports are used, but then they have to be specified explicitly. 8080 is usually free and looks like 80.
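You can see the same convention from Java's URL class; a small sketch:

import java.net.URL;

// getPort() is -1 when the URL does not name a port explicitly;
// getDefaultPort() is the port implied by the scheme.
public class DefaultPorts {
    public static void main(String[] args) throws Exception {
        URL plain  = new URL("http://www.example.com/");
        URL secure = new URL("https://www.example.com/");
        URL dev    = new URL("http://localhost:8080/");

        System.out.println(plain.getPort()  + " -> " + plain.getDefaultPort());  // -1 -> 80
        System.out.println(secure.getPort() + " -> " + secure.getDefaultPort()); // -1 -> 443
        System.out.println(dev.getPort()    + " -> " + dev.getDefaultPort());    // 8080 -> 80
    }
}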
On CloudFoundry, your application is actually still running on localhost:8080. The reason that you can access your application through https://my-app.xyz-domain.com is that the platform handles routing the traffic from that URL to your application.
The way this works is as follows:
You deploy your application. It's run by the foundation in a container. The container is assigned a port, which it provides to the application through the $PORT env variable (this can technically change, but it's been 8080 for a long time). Your application then listens on localhost:$PORT or effectively localhost:8080.
The platform also runs Envoy in your container. It's configured to listen for incoming HTTP and HTTPS requests, and it will proxy that traffic to your application on localhost:$PORT.
Using the cf cli, you map a route to your application. This is a logical rule that tells the platform what external traffic should go to your application. A route can consist of a hostname, domain, and/or path. For example, my-cool-app.example.com or my-cool-app.example.com/foo. For a route to work, the domain must have its DNS directed to the platform.
When an end-user accesses the route that you mapped, the DNS resolves to the platform and the traffic is directed to the external load balancers (sometimes TCP/layer4, sometimes HTTPS/layer7) that sit in front of the platform. These proxies do not have knowledge of CF, they just proxy incoming traffic.
Traffic from the external load balancers is spread across the set of the platform Gorouters. The Gorouter is a second layer of proxies, but these proxies have knowledge of the platform, specifically, all of the routes that have been mapped on the platform and where those applications actually live.
When a request comes to Gorouter, it will recognize the route like my-cool-app.example.com and look up the location of the container where that app is running. Traffic from Gorouter is then proxied to the Envoy which is running in the app container. This ties into step two as the Envoy will route that traffic to your application.
All in total, incoming requests are routed like this:
Client/Browser -> External LBs -> Gorouters -> Envoy -> Application
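As a minimal sketch of step 1 (plain JDK rather than Spring Boot, just to keep it self-contained), an app that honors $PORT could look like this:

import com.sun.net.httpserver.HttpServer;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

// Bind to whatever port the platform injects via $PORT, falling back to 8080 locally.
public class PortAwareApp {
    public static void main(String[] args) throws Exception {
        int port = Integer.parseInt(System.getenv().getOrDefault("PORT", "8080"));
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/", exchange -> {
            byte[] body = "hello".getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(200, body.length);
            exchange.getResponseBody().write(body);
            exchange.close();
        });
        server.start();
        System.out.println("Listening on localhost:" + port);
    }
}

With Spring Boot the usual equivalent is something like server.port=${PORT:8080} in application.properties, or, depending on the buildpack, this may already be wired up for you.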
First, you should change the port to 80 or 443, because HTTP corresponds to 80 and HTTPS corresponds to 443. Then you should set the domain name to resolve to the current host, so that you can access the current application through the domain name. In addition, if you want to set a local domain name, the hosts file should be modified.

Java servlet - Log version of TLS that's being used in a secure connection from code?

I'd like to log the TLS version that's being used when my web app receives requests, and when it does server side HTTP requests.
I've tried enabling TLS logging at the VM level, but it's much too verbose for my liking, and I'd like to just log it at a few points in my app.
How can I do this?
Turns out that in my situation, I could never figure this out, as there was a load balancer in front of my app, and the load balancer had the SSL cert installed on it, not my nodes.
My company uses "SSL termination", which means that the SSL connection ends at the load balancer - so no nodes in our app are ever served securely (they're on an internal network of course, which is only available to the load balancer).
So if you're looking into this, first find out if your load balancer is working in this way, because if it is, you won't get any SSL info in your requests at all no matter what!
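For completeness, this is roughly what you would check on the servlet side; behind a terminating load balancer isSecure() is false and the SSL attributes come back null, which is exactly the symptom above. As far as I know, the standard Servlet attributes expose the cipher suite and session id but not the TLS protocol version itself (some containers add their own attributes for that):

import java.io.IOException;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Sketch: log what the container knows about the (possibly absent) TLS connection.
public class TlsInfoServlet extends HttpServlet {
    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws ServletException, IOException {
        Object cipherSuite = req.getAttribute("javax.servlet.request.cipher_suite");
        System.out.println("secure=" + req.isSecure()
                + " scheme=" + req.getScheme()
                + " cipherSuite=" + cipherSuite);
        resp.setStatus(204); // no body, we only wanted the log line
    }
}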

Detect protocol from java SOCKS socket

I'm developing a custom SOCKS5 server in Java. Other than the first CONNECT message, which includes the HOST and PORT, is there any way to inspect the subsequent messages to determine the protocol of the data? For example, if the application data starts with "GET /...", the request is likely HyperText Transfer Protocol (HTTP), but that is far from a complete solution. Is there a way to see if the data is, say, HTTPS, or FTP, or "NetFlix streaming", etc.?
Secondarily, if the data is HTTP or HTTPS, how would I forward the request to a dedicated HTTP proxy?
Is there any way to inspect the subsequent messages to determine the protocol of the data? ... Is there a way to see if the data is, say, HTTPS, or FTP, or "NetFlix streaming", etc.?
Basically you have destination port, destination IP address and maybe hostname (if DNS resolving is done through the SOCKS5 server too) and the payload. Based on the knowledge of well known target hosts, target ports and typical payloads you could build heuristics to guess the protocol.
You will find such heuristics in today's Intrusion Detection Systems, better firewalls, and traffic classifiers. They differ a lot in detection quality, and a determined user can often fool them. This is a very wide topic, but you might start by looking at free deep packet inspection (DPI) libraries like nDPI and read more about DPI at Wikipedia.
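As a toy illustration of such a heuristic (nowhere near real DPI), you could peek at the first bytes the client sends after the SOCKS5 handshake: a TLS ClientHello begins with a handshake record (first byte 0x16, then a 0x03 version byte), while plain HTTP begins with a method name. A sketch, assuming the relay wraps the client stream in a PushbackInputStream with a large enough buffer:

import java.io.IOException;
import java.io.PushbackInputStream;
import java.nio.charset.StandardCharsets;

// Very rough first-bytes classifier; real traffic classification needs far more.
public class ProtocolSniffer {
    public static String guess(PushbackInputStream in) throws IOException {
        byte[] peek = new byte[8];
        int n = in.read(peek);
        if (n > 0) {
            in.unread(peek, 0, n); // put the bytes back so the relay still sees them
        }
        if (n >= 2 && (peek[0] & 0xFF) == 0x16 && peek[1] == 0x03) {
            return "TLS (often HTTPS)";
        }
        String head = new String(peek, 0, Math.max(n, 0), StandardCharsets.US_ASCII);
        if (head.startsWith("GET ") || head.startsWith("POST") || head.startsWith("HEAD")
                || head.startsWith("PUT ") || head.startsWith("OPTIONS")) {
            return "HTTP";
        }
        return "unknown";
    }
}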
Secondarily, if the data is HTTP or HTTPS, how would I forward the request to a dedicated HTTP proxy?
First, change the target from the one requested by the client to the proxy. This must of course be done before any data gets transferred, which may conflict with the DPI you do on the data stream, because some connections first get data from the server (like SMTP) while others, like HTTP(S), first get data from the client. Thus you probably need to find out whether this is HTTP(S) before getting any payload, i.e. based only on the target port. For HTTPS you would then need to establish a tunnel using a CONNECT request as described in RFC 2817. For HTTP you would modify the request to include not only the path but the full URL (i.e. http://host[:port]/path).
As you can see, all of this uses heuristics which work for most but not all cases. Apart from that, this can be a very complex task depending on the quality of traffic classification you need.
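For the HTTPS case, the CONNECT handshake towards the upstream HTTP proxy could look roughly like this (proxy and target addresses are placeholders); once the proxy answers 200, the SOCKS server just relays raw bytes in both directions:

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Open a tunnel through an upstream HTTP proxy using CONNECT (RFC 2817).
public class ConnectTunnel {
    public static Socket open(String proxyHost, int proxyPort,
                              String targetHost, int targetPort) throws IOException {
        Socket proxy = new Socket(proxyHost, proxyPort);
        OutputStream out = proxy.getOutputStream();
        out.write(("CONNECT " + targetHost + ":" + targetPort + " HTTP/1.1\r\n"
                + "Host: " + targetHost + ":" + targetPort + "\r\n\r\n")
                .getBytes(StandardCharsets.US_ASCII));
        out.flush();

        // Read the proxy's response headers byte by byte (up to the blank line)
        // so that no tunneled data is accidentally buffered away.
        ByteArrayOutputStream header = new ByteArrayOutputStream();
        InputStream in = proxy.getInputStream();
        int b;
        while ((b = in.read()) != -1) {
            header.write(b);
            byte[] h = header.toByteArray();
            int n = h.length;
            if (n >= 4 && h[n - 4] == '\r' && h[n - 3] == '\n'
                    && h[n - 2] == '\r' && h[n - 1] == '\n') {
                break;
            }
        }
        String response = header.toString(StandardCharsets.US_ASCII.name());
        if (!response.startsWith("HTTP/1.1 200") && !response.startsWith("HTTP/1.0 200")) {
            proxy.close();
            throw new IOException("Proxy refused CONNECT: " + response.split("\r\n")[0]);
        }
        return proxy; // raw bytes (e.g. the client's TLS handshake) can now be relayed
    }
}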

Control the routing of load balancer from a Tomcat

I have a load balancer problem. All load balancer configuration examples I have read inspect the client data and base all load balancing routing decisions on it. I have a different problem: I need to let the application server tell the load balancer that it serves a specific URL right now.
Background:
I have around 10000 hardware devices which connect to Tomcat servers (over a binary TCP protocol). The Tomcat servers also serve HTTP to clients who would like to communicate with these devices.
I don't know when a hardware device connects (and I can't identify them on the connection), but once a device has connected, I want all HTTP requests from clients that are directed towards that device to go to that Tomcat server. The hardware devices are load balanced by round-robin DNS.
Question:
Are there any good HTTP load balancers to which the Tomcat server can say "hey, device with id xxx just connected, please redirect all traffic towards this device to me"? The HTTP requests are easy to identify: they have the id of the device in the request URL.
Any suggestions on load balancers or google queries to make would be appreciated.
Interesting problem you've got there. I've had the same problem as you, but I was using JBoss AS 7 instead of Tomcat. However, the principles are more or less the same.
We solved this issue by using Apache with mod_cluster, which allows the Tomcat or JBoss server to register with the load balancer which contexts it has available. The load balancer will determine which application server has the context and route the traffic to it.
There are lots of tutorials online for how to do this; here is one good example:
http://www.devx.com/Java/Article/48086
For the original question, I think you are not looking for a load balancer, but just a plain reverse proxy with the twist that it has to be dynamic.
Check out Apache httpd's mod_proxy with mod_rewrite. For the dynamic part, maybe your Tomcats can register their connected "refrigerators" (i.e. devices) in an SQL db; in that case use RewriteMap with dbd.
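A hypothetical sketch of that registration step, on the Tomcat side (the table name, columns, and JDBC URL are made up; the upsert syntax shown is PostgreSQL's):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// When a device connects to this Tomcat, record which backend now owns it,
// so the reverse proxy's RewriteMap (dbd) can look the route up per request.
public class DeviceRegistry {
    private final String jdbcUrl; // e.g. "jdbc:postgresql://db.example.com/routing" (placeholder)

    public DeviceRegistry(String jdbcUrl) {
        this.jdbcUrl = jdbcUrl;
    }

    public void register(String deviceId, String backendHostPort) throws SQLException {
        String sql = "INSERT INTO device_routes (device_id, backend) VALUES (?, ?) "
                + "ON CONFLICT (device_id) DO UPDATE SET backend = EXCLUDED.backend";
        try (Connection conn = DriverManager.getConnection(jdbcUrl);
             PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setString(1, deviceId);
            ps.setString(2, backendHostPort);
            ps.executeUpdate();
        }
    }
}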

Redirect and balance Java output traffic

I have a client that makes requests to different servers. Sometimes these servers reject requests from my IP, so I need to change it (I have a few public IPs). I also need to change my IP to make geolocated requests. I'm trying to build a balancing server to redirect the client traffic through different servers and to keep a log of which IPs are being rejected. This is what I have in mind:
There would be clients in different networks running different instances of the client. These instances request an output server from the balancer, and then all client traffic is redirected through these output servers. The output servers could open a socket connection to the balancer to say something like "Hey, I'm here. You can use me!". Here I have a silly activity diagram (probably full of mistakes).
Is there a simpler way to do this? Maybe I'm reinventing the wheel. If this is a good solution, is it possible to do it with Java/C#? How could I redirect the traffic?
I think you are reinventing the wheel a little; what you're describing is just a load balancer in sticky session/sticky IP mode.
There are a few open source projects that will do what you are looking for.
Personally I'd suggest the LVS project (Linux Virtual Server).
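If you do end up rolling part of this yourself in Java, selecting the outgoing public IP per connection is the easy bit: bind the socket to one of your local addresses before connecting (the addresses and timeout below are placeholders):

import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.Socket;

// Choose which of the machine's public IPs an outgoing connection uses.
public class OutboundIpPicker {
    public static Socket connectVia(String localIp, String targetHost, int targetPort)
            throws Exception {
        Socket socket = new Socket();
        socket.bind(new InetSocketAddress(InetAddress.getByName(localIp), 0)); // 0 = any free local port
        socket.connect(new InetSocketAddress(targetHost, targetPort), 10_000); // 10 s connect timeout
        return socket;
    }
}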
