I want to use HAProxy as a load balancer in front of two RabbitMQ servers, each running on a separate EC2 instance. I have configured the HAProxy server by following this reference. It works, but the problem is that messages are not published in a round-robin pattern; messages are published to only one server. Is there a different configuration for my requirement?
My configuration in /etc/haproxy/haproxy.cfg:
listen rabbitmq 0.0.0.0:5672
    mode tcp
    stats enable
    balance roundrobin
    option tcplog
    no option clitcpka
    no option srvtcpka
    server rabbit01 46.XX.XX.XX:5672 check
    server rabbit02 176.XX.XX.XX:5672 check
listen web-service *:80
    mode http
    balance roundrobin
    option httpchk HEAD / HTTP/1.0
    option httpclose
    option forwardfor
    option httpchk OPTIONS /health_check.html
    stats enable
    stats refresh 10s
    stats hide-version
    stats scope .
    stats uri /lb?stats
    stats realm LB2\ Statistics
    stats auth admin:Adm1nn
Update:
I have done some research on this and found that HAProxy round-robins the connections across the RabbitMQ servers. For example, if I open 10 connections, it will round-robin those 10 connections over my 2 RabbitMQ servers and publish the messages.
But the problem is that I want to round-robin the messages, not the connections; that should be managed by the HAProxy server. I.e., if I send 1000 messages at a time through HAProxy, then 500 messages should go to rabbit server 1 and 500 messages should go to rabbit server 2. What configuration do I have to follow to achieve this?
Update:
I have also tested with leastconn balancing, but HAProxy's behavior was unexpected. I have posted that question on serverfault.com.
Messages get published to an exchange which will route to a queue.
You probably didn't configure your queues with {"x-ha-policy": "all"}. Given that the exchange routing is working on both nodes, this is probably all you are missing.
Note: Before RabbitMQ 3.0 you would declare a queue with the x-ha-policy argument and it would be mirrored. With RabbitMQ 3.0 you instead need to apply a policy (ha-mode = all). You can set policies through the HTTP API or the management tools (rabbitmqctl, management GUI), e.g.:
rabbitmqctl set_policy -p '/' MirrorAllQueues '.+' '{"ha-mode": "all"}'
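For reference, on a pre-3.0 broker the mirroring argument went into the queue declaration itself. A minimal sketch with the RabbitMQ Java client (the host and queue name are placeholders):

import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;

import java.util.HashMap;
import java.util.Map;

public class DeclareMirroredQueue {
    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("localhost"); // placeholder broker address

        try (Connection conn = factory.newConnection();
             Channel channel = conn.createChannel()) {
            // Pre-3.0 style: ask for mirroring via the x-ha-policy argument
            // at declaration time (3.0+ ignores this and uses policies).
            Map<String, Object> queueArgs = new HashMap<>();
            queueArgs.put("x-ha-policy", "all");
            channel.queueDeclare("my-queue", true, false, false, queueArgs);
        }
    }
}

On 3.0+, the rabbitmqctl set_policy line above is what applies the mirroring instead.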
The AMQP protocol is designed to use persistent connections, meaning you won't get a new connection per AMQP message (to avoid the overhead of constantly reconnecting). This means that a load balancer such as HAProxy will not be effective in balancing out your messages - it can only help with balancing out your connections.
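To make that concrete, here is a minimal sketch with the RabbitMQ Java client (the HAProxy hostname and queue name are hypothetical) showing why every message lands on the same node:

import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;

public class PublishOverOneConnection {
    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("haproxy.example.com"); // hypothetical HAProxy front end
        factory.setPort(5672);

        // HAProxy chooses a backend once, when this TCP connection is opened...
        try (Connection conn = factory.newConnection();
             Channel channel = conn.createChannel()) {
            // ...so all 1000 messages travel over that one connection
            // and land on the same RabbitMQ node.
            for (int i = 0; i < 1000; i++) {
                channel.basicPublish("", "my-queue", null, ("msg " + i).getBytes());
            }
        }
    }
}

Opening a fresh connection per message would make HAProxy alternate backends, but connection setup is exactly the overhead the protocol is designed to avoid.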
Thus you cannot achieve your stated goal. If, however, your actual goal is to distribute messages evenly to consumers of those RabbitMQ instances, then you can use clustering as Karsten describes or you can use federation.
Federation setup:
First you need to enable the federation plugins:
rabbitmq-plugins enable rabbitmq_federation
rabbitmq-plugins enable rabbitmq_federation_management
Then, for each of your servers, log on to the RabbitMQ web UI as an administrator, go to Admin > "Federation Upstreams" > "Add a new upstream", and add the other server(s) as upstream(s).
Now you need to define a policy for each exchange/queue you want federated. Mind you, I only managed to get federation working for queues, so I would try that first. Go to Admin > "Policies" > "Add / update a policy" and add a policy that targets the queue(s) you want federated.
Remove the 'backup' from the server definitions.
A backup server is one that will be used when all others are down. Specifying all your servers as backup without using option allbackups will likely have untoward consequences.
Change the relevant portion of your config to the following:
listen rabbitmq *:5672
    mode tcp
    balance roundrobin
    stats enable
    option tcpka
    server web2 46.XX.XX.XXX:5672 check inter 5000
    server web1 176.XX.XX.XX:5672 check inter 5000
We are running a setup on production where grpc clients are talking to servers via proxy in between (image attached)
The client is written in Java and the server is written in Go. We are using round_robin as the load-balancing policy in the client. Despite this, we have observed some bizarre behaviour. When our proxy servers scale in, i.e. reduce from, say, 4 to 3, the resolver kicks in and the request load from our clients gets distributed equally to all of our proxies. But when the proxy servers scale out, i.e. increase from 4 to 8, the new proxy servers don't get any requests from the clients, which leads to a skewed distribution of request load on our proxy servers. Is there any configuration we can do to avoid this?
We tried setting the networkaddress.cache.ttl property to 60 seconds in the JVM args, but even this didn't help.
You need to cycle the sticky gRPC connections using the keepalive and keepalive timeout configuration in the gRPC client.
Please have a look at this - gRPC connection cycling
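As a sketch of those client settings with grpc-java (the target string and all the values here are assumptions, not settings from the question; the idleTimeout is an addition that forces re-resolution after the channel goes quiet):

import io.grpc.ManagedChannel;
import io.grpc.ManagedChannelBuilder;

import java.util.concurrent.TimeUnit;

public class ProxyChannelFactory {
    static ManagedChannel build() {
        return ManagedChannelBuilder
                .forTarget("dns:///proxy.example.com:443") // placeholder target
                .defaultLoadBalancingPolicy("round_robin")
                // Keepalive pings detect connections to proxies that went away.
                .keepAliveTime(30, TimeUnit.SECONDS)
                .keepAliveTimeout(10, TimeUnit.SECONDS)
                // Going idle tears down the subchannels; the next RPC triggers
                // a fresh name resolution, which can pick up new proxies.
                .idleTimeout(5, TimeUnit.MINUTES)
                .build();
    }
}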
Both round_robin and pick_first perform name resolution only once. They are intended for thin, user-facing clients (Android, desktop) that have a relatively short lifetime, so sticking to a particular (set of) backend connection(s) is not a problem there.
If your client is a server app, then you should rather be using grpclb or the newer xDS: they automatically re-resolve available backends when needed. To enable them you need to add a runtime dependency on grpc-grpclb or grpc-xds, respectively, to your client.
grpclb does not need any additional configuration or setup, but it has limited functionality. Each client process will have its own load-balancer + resolver instance. Backends are obtained via repeated DNS resolution by default.
xDS requires an external Envoy instance/service from which it obtains the available backends.
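For instance, once grpc-grpclb is on the runtime classpath, the policy can be requested by name. A minimal sketch (the target is a placeholder):

import io.grpc.ManagedChannel;
import io.grpc.ManagedChannelBuilder;

public class GrpclbChannel {
    // Requires the grpc-grpclb artifact at runtime; it registers the
    // "grpclb" policy with gRPC's load-balancer registry.
    static ManagedChannel build() {
        return ManagedChannelBuilder
                .forTarget("dns:///proxy.example.com:443") // placeholder target
                .defaultLoadBalancingPolicy("grpclb")
                .build();
    }
}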
On October 7, 2020 and January 21, 2021, Google introduced unidirectional server streaming and bidirectional WebSockets, respectively, for Cloud Run. Here are the blog posts:
https://cloud.google.com/blog/products/serverless/cloud-run-now-supports-http-grpc-server-streaming
https://cloud.google.com/blog/products/serverless/cloud-run-gets-websockets-http-2-and-grpc-bidirectional-streams
From the second link:
This means you can now build a chat app on top of Cloud Run using a
protocol like WebSockets, or design streaming APIs using gRPC.
This raises some questions:
How does it work with auto-scaling?
Say we build a chat app and we have WebSocket connections distributed across multiple instances and need to push a message to all of them. How would we do that?
Is it okay for the instances to keep state now (the WebSocket connection)? What are the consequences of this?
What I am trying to ask is: how do we build a scalable chat application with Cloud Run and the other managed tools available in Google Cloud, with features like private messages and public chat rooms?
How does it work with auto-scaling?
Each WebSocket connection will consume 1 connection out of the 250-connection capacity per container. (250 is subject to change in the future, as it had been 80 and was recently increased to 250.) This limit info is available in the Google Cloud Run limits doc. When all 250 of a container's connections are occupied, another container instance will start automatically.
Say we build a chat app and we have WebSocket connections distributed across multiple instances and need to push a message to all of them. How would we do that?
You would have to use some form of central datastore or pubsub to solve that problem. e.g. Google provides Google Cloud PubSub, or you can setup a Redis instance and use Redis' PubSub feature. There are many ways to tackle this problem.
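For illustration, here is a minimal sketch of the Redis pub/sub approach using the Jedis client (the Redis host, channel name, and relay logic are placeholder assumptions, not a prescribed design):

import redis.clients.jedis.Jedis;
import redis.clients.jedis.JedisPubSub;

public class ChatFanout {
    private static final String CHANNEL = "chat-room-1"; // placeholder channel name

    public static void main(String[] args) {
        // Subscriber: each instance holds one subscription and relays every
        // message to the WebSocket connections it is holding locally.
        // subscribe() blocks, so it gets its own thread.
        new Thread(() -> {
            try (Jedis sub = new Jedis("redis.example.internal", 6379)) { // placeholder host
                sub.subscribe(new JedisPubSub() {
                    @Override
                    public void onMessage(String channel, String message) {
                        // relay to this instance's local WebSocket clients here
                        System.out.println("relay locally: " + message);
                    }
                }, CHANNEL);
            }
        }).start();

        // Publisher: when a message arrives on one of this instance's own
        // sockets, publish it so every other instance sees it too.
        try (Jedis pub = new Jedis("redis.example.internal", 6379)) {
            pub.publish(CHANNEL, "hello from one instance");
        }
    }
}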
Is it okay for the instances to keep state now (the WebSocket connection)? What are the consequences of this?
It is generally safe to keep state in a container, but you need to make sure that the container can be terminated at any time when there isn't an active connection. Also, according to the doc, Google Cloud Run will terminate all HTTP requests (including WebSockets) after the configured request timeout, which has a default value of 5 min and can be increased to 15 min. Therefore, your WebSocket connections will likely be dropped after 15 min, and you should have logic to handle automatic reconnection. The Google Cloud Run doc explicitly talks about this limit.
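For the reconnection logic, a minimal sketch with the JDK's built-in java.net.http.WebSocket client (the URI is a placeholder; production code should add backoff and re-sync any missed messages):

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.WebSocket;
import java.util.concurrent.CompletionStage;

public class ReconnectingClient implements WebSocket.Listener {
    private static final URI SERVER = URI.create("wss://chat-abc123.a.run.app/ws"); // placeholder

    void connect() {
        HttpClient.newHttpClient()
                .newWebSocketBuilder()
                .buildAsync(SERVER, this); // async; completes once connected
    }

    @Override
    public CompletionStage<?> onClose(WebSocket ws, int statusCode, String reason) {
        // Cloud Run closed the connection (e.g. the request timeout elapsed):
        // dial again. Real code should back off and re-sync missed state.
        connect();
        return null;
    }

    @Override
    public void onError(WebSocket ws, Throwable error) {
        connect();
    }

    public static void main(String[] args) throws InterruptedException {
        new ReconnectingClient().connect();
        Thread.sleep(Long.MAX_VALUE); // keep the demo process alive
    }
}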
We are trying to connect to IBM MQ using a CCDT file and JMS configuration.
We are able to connect, but we have an issue:
Since we are using Spring to set up the connection factory with the CCDT file, it is initialized once at the start of the application. Unfortunately it picks only one queue manager at a time, i.e. it sends all the messages to the same queue manager and does not load balance.
Though I observed that if I manually set the CCDT file before every request, then it is able to load balance across the queue managers. It looks to me like the queue manager is decided whenever I set the URL to the CCDT file, which is a wrong practice. My expectation was to initialize the connection factory with the CCDT file once and have this configuration load balance on its own.
Can you help me with this?
This is the expected behavior. MQ does not load balance clients, it connection balances them. The connection is the single most time consuming API call and in the case of a mutually authenticated TLS connection can take seconds to complete. Therefore a good application design will attempt to connect once, then maintain that connection for the duration of the session. The JMS architecture and Spring framework both expect this pattern.
The way that MQ provides load distribution (again, not true balancing, but rather round-robin distribution) is that the client connects a hop away from a clustered destination queue. A message addressed to that clustered destination queue will round-robin among all the instances of that queue.
If it is a request-reply application, the thing listening for requests on these clustered queue instances addresses the reply message using the Reply-To QMgr and Reply-To Queue name from the requesting message. In this scenario the requestors can fail over QMgr to QMgr if they lose their connection. The systems of record listening on the clustered queues generally do not fail over across queue managers in order to ensure all queue instances are served and because of transaction recovery.
Short answer is that CCDT and MQ client in general are not where MQ load distribution occurs. The client should make a connection and hold it as long as possible. Client reconnect and CCDT are for connection balancing only.
Load distribution is a feature of the MQ cluster. It requires multiple instances of a clustered queue and these are normally a network hop away from the client app that puts the message.
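One way to express the connect-once-and-hold pattern in Spring is a cached connection factory around the CCDT-configured MQ factory. A hedged sketch (the queue manager group name and CCDT path are placeholders, not values from the question):

import com.ibm.mq.jms.MQConnectionFactory;
import com.ibm.msg.client.wmq.WMQConstants;
import org.springframework.jms.connection.CachingConnectionFactory;

import java.net.URI;

public class MqClientConfig {
    public static CachingConnectionFactory connectionFactory() throws Exception {
        MQConnectionFactory mqcf = new MQConnectionFactory();
        mqcf.setTransportType(WMQConstants.WMQ_CM_CLIENT);
        // A "*GROUP"-style name lets the CCDT choose any queue manager
        // in the group at connect time.
        mqcf.setQueueManager("*GATEWAY"); // placeholder group name
        mqcf.setCCDTURL(URI.create("file:///opt/mqm/ccdt/AMQCLCHL.TAB").toURL()); // placeholder path

        // The cache keeps the single connection (and its sessions) open across
        // sends, instead of reconnecting - and re-selecting a QMgr - per message.
        CachingConnectionFactory ccf = new CachingConnectionFactory(mqcf);
        ccf.setSessionCacheSize(10);
        return ccf;
    }
}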
I have a system S of IP addresses that generate and receive data via JMS queues and topics. For the time being, this system S is configured to connect to a broker with IP x.x.x.x.
I would like to use the same system S against two different brokers: in other words, I would like to use a sort of "network splitter" that receives the traffic from S and forwards it to x.x.x.x AND also to another address y.y.y.y (a second JMS broker). The system S must be completely unaware of the splitting operation. The requirement is not to use networks of JMS brokers, JMS broker proxies, or to play with broker topology in general; the reason is to stay general enough to handle protocols other than JMS in possible next steps. The JMS connections use TCP+TLS with mutual authentication.
The system S is made of both Linux and Windows servers.
I would like to do this at software level in order to cope with virtualization of both networks and servers.
What is the best strategy to achieve this? Is there specific software or a library that you would recommend, preferably in Java?
Thanks!
So I wrote a program to connect to a clustered WebLogic server behind a VIP, with 4 servers and 4 queues that are all connected (I think they call them distributed...). When I run the program from my local machine and just get JMS connections, look for messages, and disconnect, it works great. And by that I mean:
iteration #1
connects to server 1
looks for a message
disconnects
iteration #2
connects to server 2
looks for a message
disconnects
and so on.
When I run it on the server, though, the application picks one server and sticks to it. It never picks a new server, so the queues on the other servers never get worked, like with a "sticky session" setup... My OS is Win7 and the server OS is Win2008r2; the JDK is identical on both machines. How is this configured client-side? The server implementation uses Apache Procrun to run it as a service, but I haven't seen too many issues with that part...
Is there a session cookie getting written out somewhere?
Any ideas?
Thanks!
Try disabling 'Server Affinity' on the JMS connection factory. If you are using the default connection factory, define your own and disable Server Affinity.
EDIT:
Server Affinity is a server-side setting, but it controls how messages are routed to consumers after a WebLogic JMS server receives the message. The other option is to use round-robin DNS, sending to a single hostname that resolves to a different IP (managed server) so that each connection goes to a different server.
I'm pretty sure this is the setting you're looking for :)