Given the following scenario, please suggest me a way to implement memcache in my application.
Currently, I have 10 webservers in which the same application is being run and a load balancer to decide upon to which web server the request be sent.
On each webserver, I am maintaining a local cache i.e. there is some class XYZ which controls the MySQL table xyz and this class has initialize method
which will warm up the local cache.
Now, suppose the webservers are X,Y,Z. The load balancer sends a request to X and this request adds some values to db & updates the cache. Again the same request was sent by the load balancer to Y. But since server Y doesnot have the value in the cache, it hits the database.
So, given this scenario, how should I implement memcache in my application so that I could minimize db hits.
Should I have a separate memcache server and all the other 10 webservers will get the cached data from this memcacher server?
One work around (not ideal though), would be to implement sticky session on the load balancer so that request from one user always go through to the same server (for the duration of their session). This doesn't help much if a server dies or you need cached data shared between sessions (but it is easy and quick to do if your load balancer supports it).
Otherwise the better solution is to use something like memcached (or membase if your feeling adventurous). Memcached can either be deployed on each or your servers or on separate servers (use multiple servers to avoid the problem of one servers dying and taking your cache with it). Then on each of your application servers specify in your memcached client connection details for all of the memcached servers (put them in the same order on each server and use a consistent hashing algorithm in the memcached client options to determine on which server(s) the cache key will go).
In short - you now have a working memcached set-up (as long as you use it sensibly inside your application)
There are plenty of memcached tutorials out there that can help with the finer points of doing all this but hopefully my post will give you some general direction.
Related
We are running a setup on production where grpc clients are talking to servers via proxy in between (image attached)
The client is written in java and server is written in go. We are using the load balancing property as round_robin in the client. Despite this, we have observed some bizarre behaviour. When our proxy servers scale in i.e reduce from let's say 4 to 3, then resolver gets into action and the request load from our clients gets distributed equally to all of our proxies, but when the proxy servers scale out i.e increase from 4 to 8, then the new proxy servers don't get any requests from the clients which leads to a skewed distribution of request load on our proxy servers. Is there any configuration that we can do to avoid this?
We tried setting a property named networkaddress.cache.ttl to 60 seconds in the JVM ARGS but even this didn't help.
You need to cycle the sticky gRPC connections using the keepalive and keepalive timeout configuration in the gRPC client.
Please have a look at this - gRPC connection cycling
both round_robin and pick_first perform name resolution only once. They are intended for thin, user-facing clients (android, desktop) that have relatively short life-time, so sticking to a particular (set of) backend connection(s) is not a problem then.
If your client is a server app, then you should be rather be using grpclb or the newer xDS: they automatically re-resolve available backends when needed. To enable them you need to add runtime dependency in your client to grpc-grpclb or grpc-xds respectively.
grpclb does not need any additional configuration or setup, but has limited functionality. Each client process will have its own load-balancer+resolver instance. backends are obtained via repeated DNS resolution by default.
xDS requires an external envoy instance/service from which it obtains available backends.
For Grpc service client side load balancing is used.
Channel creation
ManageChannelBuilder.forTarget("host1:port,host2:port,host3:port").nameResolverFactory(new CustomNameResolverProvider()).loadBalancerFactory(RoundRobinBalancerFactory.getInstance()).usePlaintText(true).build();
Use this channel to create stub.
Problem
If one of the service [host1] goes down then whether stub will handle this scenario and not send any further request to service [host1] ?
As per documentation at https://grpc.io/blog/loadbalancing
A thick client approach means the load balancing smarts are
implemented in the client. The client is responsible for keeping track
of available servers, their workload, and the algorithms used for
choosing servers. The client typically integrates libraries that
communicate with other infrastructures such as service discovery, name
resolution, quota management, etc.
So is it the responsibility of ManagedChannel class to maintain list of active server or application code needs to maintain list of active server list and create instance of ManagedChannel every time with active server list ?
Test Result
As per test if one of the service goes down there is no impact on load balancing and all request are processed correctly.
So can it be assumed that either stub or ManagedChannel class handle active server list ?
Answer with documentation will be highly appreciated.
Load Balancers generally handle nodes going down. Even when managed by an external service, nodes can crash abruptly and Load Balancers want to avoid those nodes. So all Load Balancer implementations for gRPC I'm aware of avoid failing calls when a backend is down.
Pick First (the default), iterates through the addresses until one works. Round Robin only round robins over working connections. So what you're describing should work fine.
I will note that your approach does have one downfall: you can't change the servers while the process is running. Removing broken backends in one thing, but adding new working backends is another. If your load is ever too high, you may not be able to address the issue by adding more workers because even if you add more workers your clients won't connect to them.
After spending a few hours reading the Http Client documentation and source code I have decided that I should definitely ask for help here.
I have a load balancer server using a round-robin algorithm somewhat like this
+---> RESTServer1
client --> load balancer +---> RESTServer2
+---> RESTServer3
Our client is using HttpClient to direct requests to our load balancer server, which in turn round-robins the requests to the corresponding RESTServer.
Now, Apache HttpClient creates, by default, a pool of connections (2 per route by default). This connections are by default persistent connections since I am using Http v1.1 and my servers are emitting Connection: Keep-Alive headers.
So, the problems is that since HttpClient creates this persistent connections, then those connections are no longer subject to round-robing algorithm at the balancer level. They always hit the same server every time.
This creates two problems:
I can see that sometimes one or more of the balanced servers are overloaded with traffic, whereas one ore more of the other servers are idle; and
even if I take one of my REST servers out of the balancer, it stills receives requests while the persistent connections are alive.
Definitely this is not the intended behavior.
I suppose I could force a Connection: close header in my responses, or I could run HttpClient without a connection pool or with a NoConnectionReuseStrategy. But the documentation for HttpClient states that the idea behind the use of a pool is to improve performance by avoiding having to open a socket every time and doing all the TPC handshaking and related stuff. So, I have to conclude that the use of a connection pool is beneficial to the performance of my applications.
So my question here, is there a way to use persistent connections with a load-balancer in the way or am I forced to use non-persistent connections for this scenario?
I want the performance that comes with reusing connections, but I want them properly load-balanced. Any thoughts on how I can configure this scenario with Apache Http Client if at all possible?
Your question is perhaps more related to your load balancer configuration and the style of load balancing. There are several ways:
HTTP Redirection
LB acts as a reverse proxy
Pure packet forwarding
In scenarios 1 and 3 you do not have a chance with persistent connections. If your load balancer acts like a reverse proxy, there might be a way to achieve persistent connections with balancing. "Dumb" balancers, like SMTP or LDAP selects the target per TCP connection, not on a request basis.
For example the Apache HTTPd server with the balancer module (see http://httpd.apache.org/docs/2.2/mod/mod_proxy_balancer.html) can dispatch every request (even on persistent connections) to a different server.
Also check, that you do not receive a balancer cookie which might be session persistent so that the cause is not the persistent connection but a balancer cookie.
HTH, Mark
+1 to #mp911de answer
One can also make the scenarios 1 and 3 work reasonably well by limiting the total time to live of persistent connections to some short period time, say 15 seconds. This way connections would live long enough to get re-used during periods of activity and short enough to go away during periods of relative inactivity.
I have a Java app with more than 100 servers. Currently each server opens up connections to 7 database schemas in a relational databases (logging, this, that, the other). All schemas connect to the same DB cluster, but its all effectively one database instance.
The server managed connection pools open up a handfull of connections (1 - 5) on each database schema per instance, then double that on a redundant pool. So each server open up a minimum of 30 database connections and can grow to a maximum of several hundred per server, and again, there are more than 100 servers.
All in all, the minimum number of database connections used are 3000, and this can grow to ludricous.
Obviously this is just plain wrong. The database cluster can only effeciently handle X concurrent requests and any number of requests > X introduces unnecessary contention and slows the whole lot down. (X is unknown, but it is way smaller than the 3000 minimum concurrent connections).
I want to lower the total connections used by implementing the following strategy:
Connect to one schema only (Application-X), have 6 connections per pool maximum.
Write a layer above the pool that will switch to the schema I want. The getConnection(forSchema) function will take a parameter for the target schema (eg. logging), will get a connection that could last be pointing to any schema, and issue a schema switch SQL statement (set search_path to 'target_schema').
Please do not comment on whether this approach is right or wrong. Because 'it depends' needs to be considered, such comments will not add value.
My question is whether there is a DB pool implementation out there that already does this - allows me to have one set of connections and automatically places me at the right schema, or better yet - tracks whether a pooled connection is available for your target schema before making a decision to go ahead and switch the schema (saves a DB round trip).
I would also like to hear from anyone else who has a similar issue (real-world experience) if you solved it in a different way.
Having learned the hard way myself, the best way to stabilize the number of database connections between a web application and a database is to put a reverse proxy in front of the web application.
Here's why it works:
A slow part of a web request can be returning the data to the client. If there's a lot of data or the user is on a slow connection, the connection can remain open to the web server where the data dribbles out to the client slowly. Meanwhile, the application server continues to hold a database connection open to the backend server. While the database connection may only needed for a fraction of the transaction, it's tied up until client disconnects from the application server.
Now, consider what happens when a reverse proxy is added in front of the app server. When the app server has a response prepared, it can quickly reply back to the reverse proxy in front of it, and free up the database connection behind it. The reverse proxy can then handle slowly dribbling out responses to uses, without keeping a related database connection tied up.
Before I made this architectural change, there were a number of traffic spikes that resulted in death spirals: the database handle usage would spike to exhaustion, and things would go downhill from there.
After the change, the number of the database handles required was both far less and far more stable.
If a reverse proxy is not already part of your architecture, I recommend it as a first step to control the number of database connections you require.
I have a central load balancing server and several application servers running on Apache Tomcat. The load balancing server receives request and forwards them to the application servers in round robin fashion. If one these application servers goes down, the load balancing server should stop forwarding requests to it.
My current solution for this is to ping the application servers every few minutes and if I don't receive a response, remove them from a list of available servers. Is there a better way to monitor the status of these servers? Should I ping more often or should the application servers constantly inform the load balancing server?
Execute a null transaction on it regularly. Pinging really isn't enough, it only exercises the TCP/IP stack, and I have seen operating systems in states where TCP/IP was up but no applications and not even part of the OS stack itself. Executing a transaction exercises everything. Include the database in the null transaction.
First ensure your server isn DDOS attrack protected , if the depends on you application connection avg time edit keep alive time
Then you should study about precock mpm , i think it will give you best solution