I have a Spring Boot application that auto-discovers a downstream microservice in Consul by its serviceId.
Problem:
For some reason, some previously registered services in Consul (which are no longer running) are still returned during discovery.
So if I'm lucky, load balancing works through my RestTemplate, but sometimes I get a timeout because non-running services are returned.
Questions on best practices to handle this use case:
Is it possible to log the failing host/service and not just the timeout?
Error:
a ResourceAccessException: I/O error on GET request for http://SERVICE-NAME .... connection timeout
Is it possible to log, through the RestTemplate, which node was chosen when load balancing occurred (see the sketch below)?
Does this kind of logging make sense, or is it better to let the magic happen later, once a circuit breaker is implemented?
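To illustrate the logging in the second question, here is a minimal sketch of an interceptor registered on the same RestTemplate (the class name LoggingInterceptor is made up). Note that which URI it sees depends on its position in the interceptor chain: before the load-balancer interceptor it logs the logical service name, after it the resolved host.

```java
import java.io.IOException;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.http.HttpRequest;
import org.springframework.http.client.ClientHttpRequestExecution;
import org.springframework.http.client.ClientHttpRequestInterceptor;
import org.springframework.http.client.ClientHttpResponse;

// Hypothetical interceptor: logs the target URI of every call made through the RestTemplate,
// and logs it again (with the exception) when the call fails with an I/O error such as a timeout.
public class LoggingInterceptor implements ClientHttpRequestInterceptor {

    private static final Logger log = LoggerFactory.getLogger(LoggingInterceptor.class);

    @Override
    public ClientHttpResponse intercept(HttpRequest request, byte[] body,
                                        ClientHttpRequestExecution execution) throws IOException {
        log.debug("Calling {} {}", request.getMethod(), request.getURI());
        try {
            return execution.execute(request, body);
        } catch (IOException e) {
            // The URI shown here is whatever this interceptor sees at its position in the chain.
            log.error("Request to {} failed: {}", request.getURI(), e.getMessage());
            throw e;
        }
    }
}
```

It would be registered with something like `restTemplate.getInterceptors().add(new LoggingInterceptor());`.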
thanks!
Related
I am in charge of designing a new enterprise application that should handle tons of clients and be as fault-tolerant as possible.
In order to do that, I'm thinking about implementing different microservices that will be replicated, so the Eureka server/client solution is perfect for this.
Then, since the Eureka server could be a single point of failure, I found that it is possible to replicate it across multiple instances, which is perfect.
In order not to expose every service, I'm going to put Zuul in front as a gateway; it will use the Eureka server to find the right instance of the backend service to handle each request.
Since Zuul now becomes the single point of failure, I found that it is possible to replicate this component as well, so if one instance fails I still have the others.
At this point I need to find a way to load balance between the client applications (Android and iOS apps) and the Zuul stack, but a server-side load balancer would itself be a single point of failure, so it is useless here.
I would like to ask if there is a way to make our many clients connect to a healthy instance of the Zuul application without any single point of failure. Maybe by implementing Ribbon-like client-side load balancing in the mobile application, so that it chooses a healthy Zuul instance itself (see the sketch after the next paragraph).
Unfortunately, everything will be deployed on a "private" cluster, so I cannot use Amazon Elastic Load Balancer or any other proprietary solution.
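Just to illustrate that last idea, a very rough sketch of what "Ribbon-like" selection on the client could look like (all names here are hypothetical, and a real implementation would also need health checking and a way to refresh the instance list):

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical client-side picker: round-robins over a static list of Zuul instances
// and moves on to the next one when a call fails. Not Ribbon itself, just the idea.
public class ZuulInstancePicker {

    private final List<String> zuulBaseUrls; // e.g. ["https://zuul-1:8080", "https://zuul-2:8080"]
    private final AtomicInteger counter = new AtomicInteger();

    public ZuulInstancePicker(List<String> zuulBaseUrls) {
        this.zuulBaseUrls = zuulBaseUrls;
    }

    // Pick the next instance in round-robin order.
    public String next() {
        int index = Math.abs(counter.getAndIncrement() % zuulBaseUrls.size());
        return zuulBaseUrls.get(index);
    }

    // Try each instance at most once; the caller supplies the actual HTTP call.
    public <T> T executeWithFailover(Function<String, T> call) {
        RuntimeException last = null;
        for (int i = 0; i < zuulBaseUrls.size(); i++) {
            String baseUrl = next();
            try {
                return call.apply(baseUrl);
            } catch (RuntimeException e) {
                last = e; // instance unreachable or unhealthy, try the next one
            }
        }
        throw last;
    }
}
```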
Thanks
Do you know how (if it is possible) to reserve threads/memory for a specific endpoint in a Spring Boot microservice?
I have one microservice that accepts HTTP requests via Spring MVC, and those requests trigger HTTP calls to a third-party system, which is sometimes partially degraded and responds very slowly. I can't reduce the timeout because some calls are very slow by nature.
I have the spring-boot-actuator /health endpoint enabled and I use it as the container livenessProbe in a Kubernetes cluster. Sometimes, when the third-party system is degraded, the microservice doesn't respond to the /health endpoint and Kubernetes restarts my service.
This happens because I'm using a RestTemplate to make the HTTP calls, so I'm continuously tying up new threads, and the JVM starts to have memory problems.
I have thought about some solutions:
Implement a high availability “/health” endpoint, reserve threads, or something like that.
Use an async http client.
Implement a Circuit Breaker.
Configure custom timeouts per third-party endpoint that I'm calling (see the sketch after this list).
Create another small service (Go) and deploy it in the same pod; this service would handle the liveness probe.
Migrate/refactor the services into smaller services, maybe with another framework/language like Vert.x, Go, etc.
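For the custom-timeouts option, a minimal sketch (assuming one RestTemplate per third-party endpoint with the plain JDK request factory; bean names are made up) could look like this:

```java
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.http.client.SimpleClientHttpRequestFactory;
import org.springframework.web.client.RestTemplate;

// Hypothetical configuration: one RestTemplate per third-party endpoint,
// each with its own connect/read timeouts, so a slow endpoint cannot
// hold request threads for longer than its own budget.
@Configuration
public class ThirdPartyClientConfig {

    @Bean
    public RestTemplate fastThirdPartyClient() {
        SimpleClientHttpRequestFactory factory = new SimpleClientHttpRequestFactory();
        factory.setConnectTimeout(2_000);  // milliseconds
        factory.setReadTimeout(5_000);
        return new RestTemplate(factory);
    }

    @Bean
    public RestTemplate slowThirdPartyClient() {
        SimpleClientHttpRequestFactory factory = new SimpleClientHttpRequestFactory();
        factory.setConnectTimeout(2_000);
        factory.setReadTimeout(60_000);    // the known-slow calls get a larger budget
        return new RestTemplate(factory);
    }
}
```

With two beans of the same type you would inject them by name or with @Qualifier, one per downstream endpoint.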
What do you think?
The actuator health endpoint is very convenient with Spring Boot - almost too convenient in this context, as it does deeper health checks than you necessarily want in a liveness probe. For readiness you want deeper checks, but not for liveness. The idea is that if the Pod is overwhelmed for a bit and fails readiness, it is withdrawn from load balancing and gets a breather; but if it fails liveness, it gets restarted. So you want only minimal checks in liveness (see "Should Health Checks call other App Health Checks"). By using actuator health for both, there is no way for your busy Pods to get a breather - they get killed first. And Kubernetes periodically calls the HTTP endpoint for both probes, which contributes further to your thread usage problem (do consider the periodSeconds on the probes).
For your case you could define a liveness command rather than an HTTP probe - https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-probes/#define-a-liveness-command. The command could just check that the Java process is running (somewhat similar to your Go-based probe suggestion).
In many cases using the actuator for liveness would be fine (think of apps that hit a different constraint before running out of threads, which would be your case if you went async/non-blocking with the reactive stack). Yours is one where it can cause problems; the actuator's probing of the availability of dependencies like message brokers is another where you can get excessive restarts (in that case on first deploy).
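To make the "minimal checks in liveness" idea concrete, one option (a sketch; the /live path is just an example, not an actuator feature) is a trivial endpoint with no dependency checks, pointed to by the livenessProbe, while /health stays on readiness:

```java
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

// Minimal liveness endpoint: it only proves the JVM and the web layer are up,
// and deliberately checks no downstream dependencies.
@RestController
public class LivenessController {

    @GetMapping("/live")
    public String live() {
        return "OK";
    }
}
```

Note that this still consumes a request thread, so it addresses the "too deep checks" issue but not thread starvation; the liveness-command approach avoids that entirely.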
I have a prototype just wrapping up for this same problem: Spring Boot permits 100% of the available threads to be filled up with public network requests, leaving the /health endpoint inaccessible to the AWS load balancer, which knocks the service offline thinking it's unhealthy. There's a difference between unhealthy and busy... and health is more than a superficial check that a process is running or a port is listening - it needs to be a "deep ping" which checks that the service and all its dependencies are operable, in order to give a confident health check response back.
My approach to solving the problem is to produce two new auto-wired components: the first configures Jetty with a fixed, configurable maximum number of threads (make sure your JVM is allocated enough memory to match), and the second keeps a counter of each request as it starts and completes, throwing an exception that maps to an HTTP 429 TOO MANY REQUESTS response once the count approaches a ceiling of maxThreads - reserveThreads. Then I can set reserveThreads to whatever I want, and the /health endpoint is not bound by the request counter, ensuring that it's always able to get in.
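The counting half of that could look roughly like this (a sketch, assuming a servlet filter with Spring Boot 2-era javax.servlet; the class name and constants are made up, and the Jetty max-threads setting itself is configured separately, e.g. as in the link below):

```java
import java.io.IOException;
import java.util.concurrent.atomic.AtomicInteger;
import javax.servlet.FilterChain;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import org.springframework.stereotype.Component;
import org.springframework.web.filter.OncePerRequestFilter;

// Hypothetical filter: counts in-flight requests and sheds load with 429 once the
// count reaches maxThreads - reserveThreads, but never counts /health itself,
// so the probe can always get through.
@Component
public class ReserveThreadsFilter extends OncePerRequestFilter {

    private static final int MAX_THREADS = 200;     // keep in sync with the Jetty thread pool size
    private static final int RESERVE_THREADS = 10;  // head-room kept free for /health

    private final AtomicInteger inFlight = new AtomicInteger();

    @Override
    protected void doFilterInternal(HttpServletRequest request, HttpServletResponse response,
                                    FilterChain chain) throws ServletException, IOException {
        if (request.getRequestURI().startsWith("/health")) {
            chain.doFilter(request, response);   // health checks bypass the counter
            return;
        }
        if (inFlight.incrementAndGet() > MAX_THREADS - RESERVE_THREADS) {
            inFlight.decrementAndGet();
            response.setStatus(429);             // TOO MANY REQUESTS: busy, not unhealthy
            return;
        }
        try {
            chain.doFilter(request, response);
        } finally {
            inFlight.decrementAndGet();
        }
    }
}
```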
I was just searching around to figure out how others are solving this problem and found your question describing the same issue; so far I haven't seen anything else solid.
To configure Jetty thread settings via application properties file:
http://jdpgrailsdev.github.io/blog/2014/10/07/spring_boot_jetty_thread_pool.html
Sounds like your microservice should still respond to health checks on /health while waiting for results from that third-party service it's calling.
I'd build an async HTTP server with Vert.x-Web and run a test before modifying your existing code. Create two endpoints: the /health check and a /slow call that just sleeps for about 5 minutes before replying with "hello". Deploy that in minikube or your cluster and see if it's able to respond to health checks while sleeping on the other HTTP request.
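A hedged sketch of such a test (assuming Vert.x 3.6+-style APIs; in older 3.x versions you would pass router::accept to requestHandler). A timer is used instead of a blocking sleep so the event loop stays free:

```java
import io.vertx.core.Vertx;
import io.vertx.ext.web.Router;

// Tiny Vert.x-Web test server: /health answers immediately, /slow answers after 5 minutes
// without blocking the event loop, so /health stays responsive the whole time.
public class SlowVsHealthTest {

    public static void main(String[] args) {
        Vertx vertx = Vertx.vertx();
        Router router = Router.router(vertx);

        router.get("/health").handler(ctx -> ctx.response().end("OK"));

        router.get("/slow").handler(ctx ->
                // a timer rather than Thread.sleep(), so no event-loop thread is blocked
                vertx.setTimer(5 * 60 * 1000, id -> ctx.response().end("hello")));

        vertx.createHttpServer().requestHandler(router).listen(8080);
    }
}
```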
We are running microservices on Spring Boot (with embedded Tomcat) and Spring Cloud. That means service discovery, regular health checks, services responding to these health checks, etc. We also have Spring Boot Admin server for monitoring, and we can see that all services are running OK. Currently this runs only in a test environment...
Some of our microservices are called quite rarely (say once every two days); however, there are still regular health checks. When the REST API of these services is called after such a long idle time, the first request takes a very long time to process. This of course causes circuit breakers to open in request chains, and errors... I also see this behavior when calling different endpoints through Spring Boot Admin (Threads list, Metrics).
In summary, I have seen this behavior in calls to Spring Boot Admin metrics, threads info, and environment info, as well as in calls where the database is accessed through a Hikari data source or where a service tries to send email through an SMTP server.
My questions are:
Is it something related to the settings of the embedded server and its thread pool?
Or should I dive deep into other thread pools and connection pools touched by these requests?
Any ideas for diagnostics?
Many thanks
The problem was that there was not enough RAM to cover the whole heap of those applications ... a wrong setting applied to multiple virtual machines. Part of the heap was actually swapping. The problem disappeared when the heap and RAM sizes were fixed.
Please consider below scenario.
I have implemented an Apache load balancer using mod_jk. There are three Tomcat instances behind the Apache load balancer, all on different machines. Let's say tomcat-1 is serving a request and, before completing it, goes down due to some issue. As Tomcat clustering has been configured, the other two Tomcats will handle further requests. But how do I handle the failed request that has already been accepted by tomcat-1? Is there any solution?
To have your proxy retry your request on another node after a failure, mod_jk would need to know that the request was idempotent.
I do see that adding this knowledge of idempotency was discussed a long time ago: https://bz.apache.org/bugzilla/show_bug.cgi?id=39692
I doubt that they implemented this functionality.
I have seen other reverse proxy solutions implement an idempotency identifier. I seem to remember Weblogic having this ability. I have also seen it with certain hardware proxies.
I use spring-cloud. As far as I understand, when a Eureka client gets a list of services from the Eureka server, it uses Ribbon for load balancing.
Does the client use Hystrix to get the list of services from Eureka through the circuit breakers?
There is a gateway service called Netflix Zuul (you can also call it an edge service). Clients connect to the gateway service, which in turn queries the Eureka server to get the appropriate microservice details.
Hystrix basically provides a fault-tolerance mechanism that can be used in any microservice. Its advantage is that, if any API goes down, it gracefully handles the errors in the application.
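For example, with the javanica annotations (a sketch, assuming spring-cloud-starter-netflix-hystrix and @EnableCircuitBreaker on the application; the class, method, and service names are illustrative):

```java
import com.netflix.hystrix.contrib.javanica.annotation.HystrixCommand;
import org.springframework.stereotype.Service;
import org.springframework.web.client.RestTemplate;

// Illustrative service: the call to the downstream API is wrapped in a Hystrix command,
// so when it fails or times out the fallback is returned instead of an error.
@Service
public class GreetingClient {

    private final RestTemplate restTemplate;

    // typically a @LoadBalanced RestTemplate, so the logical service name resolves via Eureka/Ribbon
    public GreetingClient(RestTemplate restTemplate) {
        this.restTemplate = restTemplate;
    }

    @HystrixCommand(fallbackMethod = "greetingFallback")
    public String greeting() {
        return restTemplate.getForObject("http://GREETING-SERVICE/greeting", String.class);
    }

    // Invoked by Hystrix when greeting() throws or times out.
    public String greetingFallback() {
        return "greeting service is currently unavailable";
    }
}
```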
As shankarsh15 says, Hystrix actually provides resilience (e.g. fallbacks) when errors and/or timeouts occur in API calls.
I believe it's actually ribbon-loadbalancer (LoadBalancerContext.java -> getServerFromLoadBalancer()) that determines which server to call.
And this ultimately works in a similar way to calling discoveryClient.getInstances("service-name") (i.e. it gets a list of service instances, then uses round robin to pick one to use).
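Roughly the manual equivalent of that (a sketch, assuming Spring Cloud's DiscoveryClient is on the classpath; the class name and error handling are illustrative):

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import org.springframework.cloud.client.ServiceInstance;
import org.springframework.cloud.client.discovery.DiscoveryClient;
import org.springframework.stereotype.Component;

// Manual version of what Ribbon does under the hood: fetch the instances registered
// for a service and pick one of them round-robin.
@Component
public class ManualInstanceChooser {

    private final DiscoveryClient discoveryClient;
    private final AtomicInteger counter = new AtomicInteger();

    public ManualInstanceChooser(DiscoveryClient discoveryClient) {
        this.discoveryClient = discoveryClient;
    }

    public ServiceInstance choose(String serviceId) {
        List<ServiceInstance> instances = discoveryClient.getInstances(serviceId);
        if (instances.isEmpty()) {
            throw new IllegalStateException("No instances registered for " + serviceId);
        }
        int index = Math.abs(counter.getAndIncrement() % instances.size());
        return instances.get(index);   // instance.getUri() gives the address to call
    }
}
```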