Light4J Total Requests Processed - java

I'm using Light-4J as microserver, sitting between my clients and a 3rd party API. Everything is setup and working, the clients are able to POST requests and responses are sent in reply.
However I want to know how many requests have been processed since the server started. Since I use Log4j to each successful API call I thought I might be able to read the number of lines in the log file. This works but is not accurate since I discovered that other processes are also writing to the file so the total is skewed.
Is there another way to get the data I require without me having to ensure that my requests have exclusive access to a log file?

light-4j supports metrics that can be pushed to influxdb or pulled by prometheus. You can enable it in your microservice service.yml or handler.yml (if you are using release 1.5.18 or later)
https://www.networknt.com/concern/metrics/
https://www.networknt.com/concern/prometheus/
If you generate the project from light-codegen, then the Influxdb metrics is wired in but disabled. You just need to install an InfluxDB instance and enabled it in your microservice.
Also, if you only need to proxy to your backend service, light-proxy might be the way to go unless you have some business logic in your microservice.

Related

Find out whose using Redis

We have one Redis for our company and multiple teams are using it. We are getting a surge of requests and nobody seems to know which application is causing it. We have only one password that goes around the whole company and our Redis is secured under a VPN so we know it's not coming from the outside.
Is there a way to know whose using Redis? Maybe we can pass in some headers with the connection from every app to identify who makes the most requests, etc.
We use Spring Data Redis for our communication.
This question is too broad since different strategies can be used here:
Use Redis MONITOR command. This is basically a built-in debugging tool that monitors all the commands executed by Redis
Use some kind of intermediate proxy. Instead of routing all the commands directly to redis - route everything to proxy that will do some processing like measuring the amounts of commands by the calling host or maybe types of commands depending what you want.
This is still only a configuration related solution so you won't need any changes at the level of applications
Since you have spring boot, you can use Micrometer / metering integration. This way you could create a counter / gauge that will get updated upon each request to Redis. If you also stream the metering data to tools like Prometheus, you'll be able to create a dashboard, say in grafana to see the whole picture. Micrometer can integrate also with other products, Prometheus/Grafana was only an example, you can chose any other solution (maybe in your organization you already have something like that).

Configuring storm cluster for production cluster

We have configured storm cluster with one nimbus server and three supervisors. Published three topologies which does different calculations as follows
Topology1 : Reads raw data from MongoDB, do some calculations and store back the result
Topology2 : Reads the result of topology1 and do some calculations and publish results to a queue
Topology3 : Consumes output of topology2 from the queue, calls a REST Service, get reply from REST service, update result in MongoDB collection, finally send an email.
As new bee to storm, looking for an expert advice on the following questions
Is there a way to externalize all configurations, for example a config.json, that can be referred by all topologies?
Currently configuration to connect MongoDB, MySql, Mq, REST urls are hard-coded in java file. It is not good practice to customize source files for each customer.
Wanted to log at each stage [Spouts and Bolts], Where to post/store log4j.xml that can be used by cluster?
Is it right to execute blocking call like REST call from a bolt?
Any help would be much appreciated.
Since each topology is just a Java program, simply pass the configuration into the Java Jar, or pass a path to a file. The topology can read the file at startup, and pass any configuration to components as it instantiates them.
Storm uses slf4j out of the box, and it should be easy to use within your topology as such. If you use the default configuration, you should be able to see logs either through the UI, or dumped to disk. If you can't find them, there are a number of guides to help, e.g. http://www.saurabhsaxena.net/how-to-find-storm-worker-log-directory/.
With storm, you have the flexibility to push concurrency out to the component level, and get multiple executors by instantiating multiple bolts. This is likely the simplest approach, and I'd advise you start there, and later introduce the complexity of an executor inside of your topology for asynchronously making HTTP calls.
See http://storm.apache.org/documentation/Understanding-the-parallelism-of-a-Storm-topology.html for the canonical overview of parallelism in storm. Start simple, and then tune as necessary, as with anything.

How to implement rate limiting based on a client token in Spring?

I am developing a simple REST API using Spring 3 + Spring MVC. Authentication will be done through OAuth 2.0 or basic auth with a client token using Spring Security. This is still under debate. All connections will be forced through an SSL connection.
I have been looking for information on how to implement rate limiting, but it does not seem like there is a lot of information out there. The implementation needs to be distributed, in that it works across multiple web servers.
Eg if there are three api servers A, B, C and clients are limited to 5 requests a second, then a client that makes 6 requests like so will find the request to C rejected with an error.
A recieves 3 requests \
B receives 2 requests | Executed in order, all requests from one client.
C receives 1 request /
It needs to work based on a token included in the request, as one client may be making requests on behalf of many users, and each user should be rate limited rather than the server IP address.
The set up will be multiple (2-5) web servers behind an HAProxy load balancer. There is a Cassandra backed, and memcached is used. The web servers will be running on Jetty.
One potential solution might be to write a custom Spring Security filter that extracts the token and checks how many requests have been made with it in the last X seconds. This would allow us to do some things like different rate limits for different clients.
Any suggestions on how it can be done? Is there an existing solution or will I have to write my own solution? I haven't done a lot of web site infrastructure before.
It needs to work based on a token included in the request, as one client may be making requests on behalf of many users, and each user should be rate limited rather than the server IP address.
The set up will be multiple (2-5) web servers behind an HAProxy load balancer. There is a Cassandra backed, and memcached is used. The web servers will be running on Jetty.
I think the project is request/response http(s) protocol. And you use HAProxy as fronted.
Maybe the HAProxy can load balancing with token, you can check from here.
Then the same token requests will reach same webserver, and webserver can just use memory cache to implement rate limiter.
I would avoid modifying application level code to meet this requirement if at all possible.
I had a look through the HAProxy LB documentation nothing too obvious there, but the requirement may warrant a full investigation of ACLs.
Putting HAProxy to one side, a possible architecture is to put an Apache WebServer out front and use an Apache plugin to do the rate limiting. Over-the-limit requests are refused out front and the application servers in the tier behind Apache are then separated from rate limit concerns making them simpler. You could also consider serving static content from the Web Server.
See the answer to this question How can I implement rate limiting with Apache? (requests per second)
I hope this helps.
Rob
You could put rate limits at various points in the flow (generally the higher up the better) and the general approach you have makes a lot of sense. One option for the implementation is to use 3scale to do it (http://www.3scale.net) - it does rate limits, analytics, key managed etc. and works either with a code plugin (the Java plugin is here: https://github.com/3scale/3scale_ws_api_for_java) which pushes or by putting something like Varnish (http://www.varnish-cache.org) in the pipeline and having that apply rate limits.
I was also thinking of the similar solutions a couple of day's ago. Basically, I prefer the "central-controlled" solution to save the state of the client request in the distributed environment.
In my application, I use a "session_id" to identify the request client. Then create a servlet filter or spring HandlerInterceptorAdapter to filter the request, then check the "session_id" with the central-controlled data repository, which could be memcached, redis, cassandra or zookeeper.
We use redis as leaky bucket backend
Add a controller as entrance
google cache that token as key with expired time
then filter every request
It is best if you implement ratelimit using REDIS. For more info please look this Rate limiting js Example.

Email fetcher using Play Framework

I need to develop an IMAP poller which pings an email server every few seconds and fetches every new email which arrives.
I've done it once for another application, but there I used an inbound mail channel from Spring Integration.
I just started "playing" with Play, and am not sure what the best way to achieve this is. I know that JavaMail already offers the possibility to fetch mails, but I am not sure how to actually package this. Should this be a separate module, a separate plugin, a service, or sth?
Should the polling functionality be implemented as a job?
NOTE: It is a web application BTW, although the description above may suggest it is not.
There are a few options to solve this:
1) Use java in a Job to poll the IMAP server at regular intervals
documentation on creating a Job is available and is pretty straight forward, just setup the job to run every minute or 5 minutes and then add the code to actually check for new emails.
http://www.playframework.org/documentation/1.2.4/jobs
If you're looking for how to check for new emails on IMAP then have a look through stack exchange there. For example, to poll gmail check out this question: Getting mail from GMail into Java application using IMAP
2) Use camel module to poll IMAP server with a custom route/processor
This is a heavyweight solution and only recommended if you want to make use of other features of Apache Camel.
The module is available here: http://www.playframework.org/modules/camel
Using camel to poll for IMAP messages is fairly easy once you get your head around how to use camel, the specific info for the IMAP route is here: http://camel.apache.org/mail.html
In my opinion you shouldn't use Play at all for this — if I understand your requirements correctly. Play is a web framework intended to handle HTTP requests. Your requirements say nothing about HTTP at all, so a large part of Play! would be useless.
You could use Play's server runtime and Job (and cron) architecture to run this, but you would be misusing the facilities of the framework for something for which they were never intended. You may also be inheriting requirements from Play that you wouldn't ever actually need for an application/service like the one you want to build (for example the Python runtime).
I think you should not use Play for this, but rather create this as a simple, straight-forward Java application using Spring. With Spring's scheduling capabilities you can just as easily implement what you want.
Naturally, when you intend to build a web front-end on top of this in the future, that would make it a completely different story.

Most Efficient Way of calling an external webservice in Java?

In one of our applications we need to call the Yahoo Soap Webservice to Get Weather and other related info.
I used the wsdl2java tool from axis1.4 and generated th required stubs and wrote a client. I use jsp's use bean to include the client bean and call methods defined in the client which call the yahoo webservice inturn.
Now the problem: When users make calls to the jsp the response time of the webservice differs greatly, like for one user it took less then 10 seconds and the other in the same network took more than a minute.
I was just wondering if Axis1.4 queues the requests even though the jsps are multithreaded.
And finally is there an efficient way of calling the webservice(Yahoo weather). Typically i get around 200 simultaneous requests from my users.
Why don't you schedule one thread to get the weather every minute or so, and expose that to the JSP, in stead of letting each JSP get its own weather report?
That's a lot more efficient for both you and Yahoo, and JSP's only need to lookup a local object (almost instantaneous) in stead of connecting to a web service.
EDIT
Some new requirements in the comments of this answer suggest a different way of choosing solutions.
It seems that not only weather, which not only doesn't change that often but is also the same for every user, is requested by web service but also other data like flight data.
The requirements for flight data retrieval are very much different than for weather data. So I think you should define a few types of (remote) data and choose a different solution
for each category.
As basis for the requirements I'd use something simple:
Users like their information promptly, they do not like waiting
The amount of data stored on the web server is finite
Remote web services have an EULA of sorts and are probably not happy with 200 concurrent requests of the same data by the same source (you)
Fast data access to users is best achieved by having the data locally, be it transient (kept in a bean) or persistent (a local database). That can be done by periodically requesting data from the remote source, and using the cached data in the JSP. That would also keep you in the clear with the third point.
A finite amount of data stored on the web service means that not everything can be cached. Data which differs per user, or large data sets which can vary over small periods of time, cannot readily be cached. It's not really a good idea to load data on all flights of all airports in the US every minute or so. That kind of requests would be better served by running a specific web service query when necessary.
The trick is now to identify when caching data is feasible. If it is feasible, do that, otherwise run the web service query in the background. That can be done by presenting the JSP now and starting the web service query in the background. The JSP can have an AJAX script which queries your web server whether the data is ready, and insert that data in the page when ready.
I'd use Google tools to monitor how long the call to the web service is taking.
There are several things going on here:
Map Java beans to XML request.
Send XML request to web service.
Unmarshall XML request on web service side.
Web service processes request
Web service marshalles XML response
Web service sends XML response to Java client
Unmarshall XML response and display on client.
You can't see inside the Yahoo web service, but do break out what you can see on the client side to see where the time is spent.
Check memory as well. If Axis is generating .class files, maybe your perm space is being consumed. Visual VM is available to you with the JDK. Attach it to the PID on your client to see what's going on in memory on your app server.
Maybe this would be a good place for an AJAX call. This will be a good solution if you can get the weather in the background while users are doing other things.
I would recommend local caching and data pooling. Instead of sending out 200 separate requests for similar/same locations run a background thread which pulls the weather for only the locations your users are interested in and caches them locally, this cache updates every minute or so. When users request their personal preferences, the requests hit the cache and refetch if the location is new or the data in the cache is stale. This way the user will have a more seamless experience and you will not hit Yahoo throttles and get denied service.

Categories