I've been using Zuul in my microservices-based application for a while now, and it's been working perfectly.
Now I've been asked a question by someone that made me think the microservices world is not all rainbows and flowers after all.
The question was: "You have Zuul as the sole entry point to your app. Isn't that some sort of centralization? What if Zuul is down?"
I just wanted to gather some opinions and get an answer to that question.
Thank you.
Yes, it is the entry point, and having a single entry point in your microservice architecture brings several benefits.
Pros:
Off the top of my head:
A single point of contact for all your API consumers (apps). Running Zuul at api.yoursite.com and having consumers call it every time is better than having them call individual services (service1.yoursite.com, service2.yoursite.com) directly.
From a security point of view, you only have to deal with one server's security; you can hide all the other servers running your services inside your network and avoid exposing them on the public internet. Amazon AWS and many other providers offer this facility.
You can leverage routing benefits: with Zuul as your entry point, you can route 80% of the traffic to the old version of service1 and 20% to the new version of the same service (see the sketch after this list).
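To make the 80/20 idea concrete: plain Zuul needs a custom Ribbon rule for weighted routing, but its successor, Spring Cloud Gateway, exposes it directly through the Weight predicate. A minimal sketch (the host names and route IDs are made up for illustration):

    import org.springframework.cloud.gateway.route.RouteLocator;
    import org.springframework.cloud.gateway.route.builder.RouteLocatorBuilder;
    import org.springframework.context.annotation.Bean;
    import org.springframework.context.annotation.Configuration;

    @Configuration
    public class CanaryRoutingConfig {
        // Routes sharing the same weight group ("service1") split matching
        // traffic according to the given percentages.
        @Bean
        public RouteLocator canaryRoutes(RouteLocatorBuilder builder) {
            return builder.routes()
                    .route("service1-old", r -> r.path("/service1/**")
                            .and().weight("service1", 80)   // 80% to the old version
                            .uri("http://service1-old.internal:8080"))
                    .route("service1-new", r -> r.path("/service1/**")
                            .and().weight("service1", 20)   // 20% canary traffic
                            .uri("http://service1-new.internal:8080"))
                    .build();
        }
    }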
Cons:
Great power comes with great responsibility.
If your main Zuul server is down, your whole app is down (edit: only if you have a single Zuul instance). Please check @lahiru's answer for how to run multiple Zuul instances as a cluster using the Eureka registry configuration.
If your Zuul instance is down, you might not be able to reach the other services through the API gateway.
As a solution, I would suggest you go for a clustered setup with two Zuul instances.
For that you need two service registries (Eureka servers).
If you are planning to manage fault tolerance, you can apply Sentinel.
Your microservices become Eureka clients.
You have to register your microservices with both Eureka servers, forming a Eureka server cluster. You can refer to the diagram below as a design/architecture reference for the implementation.
Note: you have to maintain endpoints for both Zuul servers. If Zuul 1 is down, the client can request through Zuul 2 (see the failover sketch below).
This will help you achieve zero downtime (you need to make sure that your VM platform is reliable, with no unplanned downtime).
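To illustrate the note about maintaining two endpoints, here is a naive client-side failover sketch; the gateway URLs are placeholders, and a production client would add timeouts, retries with backoff, and health-check based ordering:

    import java.io.IOException;
    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;
    import java.util.List;

    public class ZuulFailoverClient {
        // Try Zuul 1 first; fall back to Zuul 2 if it is unreachable.
        private static final List<String> GATEWAYS =
                List.of("https://zuul1.yoursite.com", "https://zuul2.yoursite.com");

        private final HttpClient http = HttpClient.newHttpClient();

        public String get(String path) throws IOException, InterruptedException {
            IOException last = null;
            for (String gateway : GATEWAYS) {
                try {
                    HttpRequest request =
                            HttpRequest.newBuilder(URI.create(gateway + path)).build();
                    return http.send(request, HttpResponse.BodyHandlers.ofString()).body();
                } catch (IOException e) {
                    last = e; // this gateway is down, try the next one
                }
            }
            throw last;
        }
    }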
Related
I am in charge of designing a new enterprise application that should handle tons of clients and should be as fault-free as possible.
In order to do that, I'm thinking about implementing different microservices that will be replicated, so a Eureka server/client solution is perfect for this.
Then, since the Eureka server could be a single point of failure, I found that it is possible to replicate it across multiple instances, which is perfect.
In order to not expose every service, I'm going to put Zuul in front as a gateway; it will use the Eureka server to find the right instance of the backend service to handle each request.
Since Zuul is now the single point of failure, I found that it is possible to replicate this component as well, so if one of them fails I still have the others.
At this point I need a way to load-balance between the client applications (Android and iOS apps) and the Zuul stack, but a server-side load balancer would itself be a single point of failure, so it is useless.
I would like to ask if there is a way to make our tons of clients connect to a healthy instance of the Zuul application without having any single point of failure. Maybe by implementing Ribbon on the mobile application so that it chooses a proper healthy instance of Zuul (a rough sketch of that idea is below).
Unfortunately, everything will be deployed on a "private" cluster, so I cannot use Amazon Elastic Load Balancer or any other proprietary solution.
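To sketch the Ribbon-on-the-client idea under those constraints: Eureka exposes a REST endpoint listing registered instances, so the mobile app (or a thin shared client library) could fetch the current Zuul instances and pick one itself, with no server-side balancer in between. The Eureka host names and the ZUUL application name below are assumptions for illustration:

    import com.fasterxml.jackson.databind.JsonNode;
    import com.fasterxml.jackson.databind.ObjectMapper;
    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;
    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.ThreadLocalRandom;

    public class ZuulInstancePicker {
        // Replicated Eureka servers (made-up hosts).
        private static final List<String> EUREKAS =
                List.of("http://eureka1:8761", "http://eureka2:8761");

        public static String pickZuulBaseUrl() throws Exception {
            HttpClient http = HttpClient.newHttpClient();
            ObjectMapper mapper = new ObjectMapper();
            for (String eureka : EUREKAS) { // fall through if a Eureka node is down
                try {
                    HttpRequest req = HttpRequest
                            .newBuilder(URI.create(eureka + "/eureka/apps/ZUUL"))
                            .header("Accept", "application/json")
                            .build();
                    String body = http.send(req, HttpResponse.BodyHandlers.ofString()).body();
                    JsonNode node = mapper.readTree(body).path("application").path("instance");
                    // Eureka returns an object for one instance, an array for several.
                    Iterable<JsonNode> instances = node.isArray() ? node : List.of(node);
                    List<String> healthy = new ArrayList<>();
                    for (JsonNode i : instances) {
                        if ("UP".equals(i.path("status").asText())) {
                            healthy.add("http://" + i.path("hostName").asText()
                                    + ":" + i.path("port").path("$").asText());
                        }
                    }
                    if (!healthy.isEmpty()) {
                        return healthy.get(ThreadLocalRandom.current().nextInt(healthy.size()));
                    }
                } catch (Exception e) {
                    // this Eureka server is unreachable, try its peer
                }
            }
            throw new IllegalStateException("no healthy Zuul instance found");
        }
    }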
Thanks
Small architecture and design question please.
Question:
Should Kubernetes Ingress live together with Spring Cloud Gateway? If not, which one should be preferred?
First, with a Spring Webflux / Spring Cloud Gateway project, I managed to get route-based forwarding working. Meaning, all my clients need to know only this one Spring Cloud Gateway endpoint, and the gateway forwards to serviceA if the URL contains serviceA, to serviceB if serviceB, etc. Straightforward.
I added some more "software level" features such as dynamic configuration (to change the routes at runtime), circuit breaker, rate limiting, bulkhead, and a few others (a sketch is below). Very cool, but really, I ended up using route forwarding as the main feature.
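For reference, a trimmed-down sketch of such a route configuration in the Java DSL (service names and URIs are placeholders; the circuit-breaker filter assumes a Resilience4j starter on the classpath):

    import org.springframework.cloud.gateway.route.RouteLocator;
    import org.springframework.cloud.gateway.route.builder.RouteLocatorBuilder;
    import org.springframework.context.annotation.Bean;
    import org.springframework.context.annotation.Configuration;

    @Configuration
    public class GatewayRoutes {
        @Bean
        public RouteLocator routes(RouteLocatorBuilder builder) {
            return builder.routes()
                    // Path-based forwarding plus a per-route circuit breaker.
                    .route("serviceA", r -> r.path("/serviceA/**")
                            .filters(f -> f.circuitBreaker(c -> c.setName("serviceA-cb")
                                    .setFallbackUri("forward:/fallback")))
                            .uri("http://servicea.default.svc.cluster.local:8080"))
                    // Plain forwarding, no extra filters.
                    .route("serviceB", r -> r.path("/serviceB/**")
                            .uri("http://serviceb.default.svc.cluster.local:8080"))
                    .build();
        }
    }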
Then, a couple of weeks back, I spent time studying Kubernetes, and more precisely Kubernetes Ingress.
I learned that Kubernetes Ingress is a very cool and powerful thing. I managed to perform at least the route-based forwarding.
Meaning, clients need to know only this one Ingress endpoint, and the Ingress forwards to the underlying services within the Kubernetes cluster. As of now, it forwards everything to the Spring Cloud Gateway, which forwards to everything else (I tried, and it could have forwarded to the real business services in the first place).
And this is the moment where I am having doubts.
Did I just duplicate work? (I mean in terms of functionality; I had fun learning both.)
Should I consider an architecture where the Spring Cloud Gateway (and only it) really does the gating?
Should I consider an architecture where both the Ingress and the software gateway have full importance, configuring features in both? (Accepting the duplicated work?)
Should I remove the Spring Cloud Gateway entirely?
Thank you
In my opinion, Kubernetes Ingress must live together with Spring Cloud Gateway.
Reverse proxies have capabilities like central logging, security, caching, routing, traffic-management features, etc., but there are also things they cannot do. API gateways come into play at this point. They have all the features of reverse proxies, plus extra capabilities that reverse proxies lack. For this reason, API gateways are called enhanced reverse proxies.
So a Kubernetes Ingress acts like a reverse proxy, while Spring Cloud Gateway is an implementation of the API Gateway pattern. Following the definition above, here are a few things you can't do with Kubernetes alone (a sketch of gateway-level logic follows this list):
Can you implement API composition? No.
Can you implement JWT authentication with Kubernetes? And if so, can you carry JWTs of more than 8 KB to a downstream service with Kubernetes? No.
Can you combine all Swagger/OpenAPI documentation with Kubernetes? No.
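As a concrete illustration of logic that lives naturally in the gateway rather than in the Ingress, here is a deliberately simplified global filter that rejects requests lacking a bearer token; real JWT validation (signature and claims checks) is omitted:

    import org.springframework.cloud.gateway.filter.GatewayFilterChain;
    import org.springframework.cloud.gateway.filter.GlobalFilter;
    import org.springframework.core.Ordered;
    import org.springframework.http.HttpStatus;
    import org.springframework.stereotype.Component;
    import org.springframework.web.server.ServerWebExchange;
    import reactor.core.publisher.Mono;

    @Component
    public class JwtPresenceFilter implements GlobalFilter, Ordered {
        @Override
        public Mono<Void> filter(ServerWebExchange exchange, GatewayFilterChain chain) {
            String auth = exchange.getRequest().getHeaders().getFirst("Authorization");
            if (auth == null || !auth.startsWith("Bearer ")) {
                // No bearer token: stop here, never touch the downstream service.
                exchange.getResponse().setStatusCode(HttpStatus.UNAUTHORIZED);
                return exchange.getResponse().setComplete();
            }
            return chain.filter(exchange); // token present, continue to the route
        }

        @Override
        public int getOrder() { return -1; } // run before the routing filters
    }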
I'm having a really tough time with this one. We want to move our legacy app to a microservice application (Spring Boot, Java 8).
As per our architect, we do not need service discovery; an API gateway is enough for the service discovery and routing.
Note that currently the deployments are on on-premise servers; we will have a fixed number of nodes, and an F5 load balancer will route requests to the API gateway and then on to the microservices.
Can we survive with Spring Cloud API Gateway and no service discovery?
A short answer: yes, you can survive with Spring Cloud API Gateway and no service discovery.
But it really depends on the size of your application and the amount of traffic it will be handling.
You can start the migration to microservices without service discovery.
For internal service-to-service communication, just use real hardcoded IP addresses and ports (a trivial sketch follows).
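What that looks like in practice (the IP address and endpoint are made up; this only holds up while the set of nodes stays fixed):

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    public class OrderClient {
        // Without a registry, service locations are simply fixed constants.
        private static final String INVENTORY_URL = "http://10.0.1.12:8080";

        public static String stockFor(String sku) throws Exception {
            HttpClient http = HttpClient.newHttpClient();
            HttpRequest req = HttpRequest
                    .newBuilder(URI.create(INVENTORY_URL + "/stock/" + sku))
                    .build();
            return http.send(req, HttpResponse.BodyHandlers.ofString()).body();
        }
    }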
Regarding the API gateway doing service discovery: I may be wrong, but the gateway by itself has no clue about the location of its targets either, so the service locations have to be hardcoded in the gateway's route configuration as well.
Once you begin to feel that you need to scale out, you won't avoid using a service registry tool. If you start considering which one to take, I can suggest HashiCorp Consul.
Anyway, it's most likely that you will eventually have to inject a service discovery mechanism into your infrastructure. You can either do it from the beginning, or take care of it later if the new architecture turns out to be beneficial and there is a plan to extend it further.
If you have plans to migrate to the cloud, then you can think about Kubernetes for your infrastructure in advance. It provides a service discovery mechanism out of the box.
Kubernetes is a great platform for this, if you can opt for it.
It can handle everything from service discovery to deployment.
You just need to make a cloud-ready Docker image (preferably) and deploy it to Kubernetes. Kubernetes will map an internal endpoint to it based on your configuration, and your services will be registered with it (if I speak in terms of Spring Cloud and a Eureka server).
If there is no service-registry-backed DiscoveryClient, then you can configure spring.cloud.discovery.client.simple.instances.userservice[0].uri=http://s11:8080
You can host this userservice on a Kubernetes cluster. For further details, see the docs:
https://cloud.spring.io/spring-cloud-commons/2.2.x/reference/html/
Likewise, to have communication between services: suppose userservice wants to communicate with the password service; this is easily configured via Ribbon:
passwordservice.ribbon.listOfServers=${PASSWORDSERVICE:http://localhost:8081}
I do not see any problem with this structure.
I have a cloud-native application, which is implemented using Spring Cloud Netflix.
So, in my application, I'm using Eureka service discovery to manage all instances of different services of the application. When each service instance wants to talk to another one, it uses Eureka to fetch the required information about the target service (IP and port for example).
Service orchestration can also be achieved using tools like Docker Swarm and Kubernetes, and it looks like there is some overlap between what Eureka does and what Docker Swarm and Kubernetes can do.
For example, imagine I create a service in Docker Swarm with 5 instances. Swarm ensures that those 5 instances are always up and running. Additionally, each service of the application internally sends a periodic heartbeat to Eureka to show that it's still alive. It seems we have two layers of health check here: one for Docker and another inside Spring Cloud itself.
Or, for example, you can expose a port for a service across the entire swarm, which eliminates some of the need for service discovery (the ports are always known). Another example is the load balancing performed by Docker's routing mesh versus the load balancing happening internally via the Ribbon component or Eureka itself. In that case, adding a hardware load balancer gives us three layers of load-balancing functionality.
So I want to know: is it rational to use these tools together? It seems that combining these technologies increases the complexity of the application very much and may be redundant.
Thank you for reading!
If you already have the application working, then there's presumably more effort and risk in removing the Netflix components than in keeping them. There's an argument that if you could remove, e.g., Eureka, then you wouldn't need to maintain it and it would be one less thing to upgrade. But that might not justify the effort, and it also depends on whether you are using it for anything that the orchestration tool cannot fulfill.
For example, if you're connecting to services that are not set up as load-balanced ("headless services"), then you might want Ribbon within your services. (You could do this using tools in the Spring Cloud Kubernetes incubator project or its Fabric8 equivalent.) Another situation to be mindful of is when you're connecting to external services (i.e., services outside the Kubernetes cluster); then you might want to add load balancing or rate limiting, and Ribbon/Hystrix would be an option. It will depend on how nuanced your requirements for load balancing or rate limiting are.
You've asked specifically about Netflix, but it's worth stating clearly that Spring Cloud includes other components, not just the Netflix ones. And there are other areas of overlap where you would need to make choices.
I've focused on Kubernetes rather than Docker Swarm, partly because that's what I know best and partly because I believe that's the current direction of travel for the industry; on this, you should note that Kubernetes is available within Docker EE. I guess you've read many comparison articles, but https://hackernoon.com/a-kubernetes-guide-for-docker-swarm-users-c14c8aa266cc might be particularly interesting to you.
You are correct in that it does seem redundant. From personal observation, I think that each layer of that architecture should handle load balancing in its own specific way. It ends up giving you a lot more flexibility for not much more cost. If you want to take advantage of client-side load balancing and any failover features, it makes sense to have Eureka. The major benefit is that if you don't want to take advantage of all of the features, you don't have to.
The container orchestration level load balancing has a place for any applications or services that do not conform to your service discovery piece that resides at the application level (Eureka).
The hardware load balancer provides another level that allows for load balancing outside of your container orchestrator.
The specific use case that I ran into was on AWS for a Kubernetes cluster with Traefik and Eureka with Spring Cloud.
Yes, you are correct. We have a similar Spring Cloud Netflix application deployed on the Oracle cloud platform and Predix Cloud Foundry. If you use multiple Kubernetes clusters, then you have to use Ribbon load balancing because you have multiple instances of your services.
I cannot tell you which is better Kubernetes or Docker Swarm. We use Kubernetes for service orchestration as it provides more flexibility.
I am developing a simple REST API using Spring 3 + Spring MVC. Authentication will be done through OAuth 2.0 or basic auth with a client token using Spring Security. This is still under debate. All connections will be forced through an SSL connection.
I have been looking for information on how to implement rate limiting, but it does not seem like there is a lot of information out there. The implementation needs to be distributed, in that it works across multiple web servers.
E.g., if there are three API servers A, B, C and clients are limited to 5 requests a second, then a client that makes 6 requests like so will find the request to C rejected with an error:
A receives 3 requests \
B receives 2 requests | Executed in order, all requests from one client.
C receives 1 request /
It needs to work based on a token included in the request, as one client may be making requests on behalf of many users, and each user should be rate limited rather than the server IP address.
The setup will be multiple (2-5) web servers behind an HAProxy load balancer. There is a Cassandra backend, and memcached is used. The web servers will be running on Jetty.
One potential solution might be to write a custom Spring Security filter that extracts the token and checks how many requests have been made with it in the last X seconds. This would allow us to do some things like different rate limits for different clients.
Any suggestions on how it can be done? Is there an existing solution or will I have to write my own solution? I haven't done a lot of web site infrastructure before.
I assume the project is a request/response HTTP(S) service, and you use HAProxy as the frontend.
Maybe HAProxy can load-balance on the token; you can check that here.
Then requests with the same token will reach the same web server, and the web server can just use an in-memory cache to implement the rate limiter (a sketch follows).
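A minimal fixed-window sketch of that per-server limiter (the limit and window size are illustrative); it is only correct if HAProxy really pins each token to one web server, since the counters live in local memory:

    import java.time.Instant;
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.atomic.AtomicInteger;

    public class TokenRateLimiter {
        private static final int LIMIT = 5;           // max requests per window
        private static final long WINDOW_SECONDS = 1;

        private record Window(long start, AtomicInteger count) {}
        private final Map<String, Window> windows = new ConcurrentHashMap<>();

        // Returns true if this request is within the token's current window budget.
        public boolean allow(String token) {
            long now = Instant.now().getEpochSecond() / WINDOW_SECONDS;
            Window w = windows.compute(token, (t, old) ->
                    (old == null || old.start() != now)
                            ? new Window(now, new AtomicInteger(0))  // new window
                            : old);
            return w.count().incrementAndGet() <= LIMIT;
        }
    }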
I would avoid modifying application-level code to meet this requirement if at all possible.
I had a look through the HAProxy load-balancing documentation; nothing too obvious there, but the requirement may warrant a full investigation of ACLs.
Putting HAProxy to one side, a possible architecture is to put an Apache web server out front and use an Apache plugin to do the rate limiting. Over-the-limit requests are refused out front, and the application servers in the tier behind Apache are then insulated from rate-limit concerns, making them simpler. You could also consider serving static content from the web server.
See the answer to this question How can I implement rate limiting with Apache? (requests per second)
I hope this helps.
Rob
You could put rate limits at various points in the flow (generally, the higher up the better), and the general approach you have makes a lot of sense. One option for the implementation is to use 3scale (http://www.3scale.net): it does rate limits, analytics, key management, etc., and works either with a code plugin (the Java plugin is here: https://github.com/3scale/3scale_ws_api_for_java) or by putting something like Varnish (http://www.varnish-cache.org) in the pipeline and having that apply the rate limits.
I was also thinking about similar solutions a couple of days ago. Basically, I prefer a "centrally controlled" solution that saves the state of client requests in the distributed environment.
In my application, I use a "session_id" to identify the requesting client. I then create a servlet filter or a Spring HandlerInterceptorAdapter to intercept the request and check the "session_id" against a centrally controlled data repository, which could be memcached, Redis, Cassandra, or ZooKeeper. (A sketch with Redis is below.)
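A rough sketch of that interceptor with Redis (via the Jedis client) as the central store; the header name, key scheme, limit, and window are all assumptions for illustration:

    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;
    import org.springframework.web.servlet.handler.HandlerInterceptorAdapter;
    import redis.clients.jedis.Jedis;

    public class RateLimitInterceptor extends HandlerInterceptorAdapter {
        private static final int LIMIT = 5;          // requests per window
        private static final int WINDOW_SECONDS = 1;

        @Override
        public boolean preHandle(HttpServletRequest request, HttpServletResponse response,
                                 Object handler) throws Exception {
            String sessionId = request.getHeader("X-Session-Id");
            if (sessionId == null) {
                return true; // no token: let the security layer reject it instead
            }
            try (Jedis jedis = new Jedis("redis-host", 6379)) {
                // One counter per session per time window; INCR is atomic,
                // so the count is consistent across all web servers.
                String key = "rl:" + sessionId + ":"
                        + (System.currentTimeMillis() / 1000 / WINDOW_SECONDS);
                long count = jedis.incr(key);
                if (count == 1) {
                    jedis.expire(key, WINDOW_SECONDS * 2); // let old windows expire
                }
                if (count > LIMIT) {
                    response.sendError(429, "Too many requests");
                    return false;
                }
            }
            return true;
        }
    }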
We use Redis as the leaky-bucket backend.
Add a controller as the entrance,
use a Google Guava cache to store the token as a key with an expiry time,
then filter every request.
It is best if you implement rate limiting using Redis. For more info, please look at this rate-limiting JS example.