Distributed Metrics - Java

I have been working on a single-box application which uses Codahale (Dropwizard) Metrics heavily for instrumentation. We are now moving to the cloud, and I have the following questions about how to monitor metrics once the application is distributed.
Is there a metrics reporter that can write metrics data to Cassandra?
When and how does the aggregation happen if there are records per server in the database?
Can I define the time interval at which the metrics data gets saved into the database?
Are there any inbuilt frameworks that are available to achieve this?
Thanks a bunch and appreciate all your help.

I am answering your questions first, but I think you are misunderstanding how to use Metrics.
You can google this fairly easily. I don't know of any (and I also don't understand what you would do with it in Cassandra). You would normally use something like Graphite for that. In any case, a reporter implementation is very straightforward to write.
That question does not make much sense. Why would you aggregate across two different servers? They are independent, and each of your monitored instances should be standalone. Aggregation happens on the receiving side (e.g. in Graphite).
You can - see point 1: write a reporter and configure it accordingly.
Not that I know of.
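On points 1 and 3: with Codahale/Dropwizard Metrics a custom reporter is essentially a ScheduledReporter subclass, and the interval you pass to start() is your save interval. A minimal sketch, with the actual persistence call left as a placeholder (storeCount is an invented helper):

```java
import java.util.SortedMap;
import java.util.concurrent.TimeUnit;

import com.codahale.metrics.Counter;
import com.codahale.metrics.Gauge;
import com.codahale.metrics.Histogram;
import com.codahale.metrics.Meter;
import com.codahale.metrics.MetricFilter;
import com.codahale.metrics.MetricRegistry;
import com.codahale.metrics.ScheduledReporter;
import com.codahale.metrics.Timer;

// Skeleton of a custom reporter; storeCount() stands in for whatever
// persistence you choose (Cassandra, JDBC, ...).
public class MyStoreReporter extends ScheduledReporter {

    public MyStoreReporter(MetricRegistry registry) {
        super(registry, "my-store-reporter", MetricFilter.ALL,
                TimeUnit.SECONDS, TimeUnit.MILLISECONDS);
    }

    @Override
    public void report(SortedMap<String, Gauge> gauges,
                       SortedMap<String, Counter> counters,
                       SortedMap<String, Histogram> histograms,
                       SortedMap<String, Meter> meters,
                       SortedMap<String, Timer> timers) {
        counters.forEach((name, counter) -> storeCount(name, counter.getCount()));
        timers.forEach((name, timer) -> storeCount(name + ".count", timer.getCount()));
        // histograms, meters and gauges would be handled the same way
    }

    private void storeCount(String name, long value) {
        // write to your datastore here
    }
}
```

Started with new MyStoreReporter(registry).start(30, TimeUnit.SECONDS), it writes every 30 seconds - that is the knob from question 3.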
Now to metrics in general:
I think you have the wrong idea. You can monitor X servers, that is not a problem at all, but you should not aggregate on the client side (or the database side). How would that even work? Restarts zero the clients, which essentially means you would need to track the state of each of your servers for the aggregation to work. And how do you manage outages?
The way you should monitor your servers with metrics:
Create a namespace:
io.my.server.{hostname}.my.metric
Now you have X different namespaces, but they all share a common prefix. That means you have grouped them.
Send them to your preferred monitoring solution.
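A minimal sketch of that with Dropwizard/Codahale Metrics and its Graphite reporter (the Graphite address and the metric name are placeholders):

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.util.concurrent.TimeUnit;

import com.codahale.metrics.MetricFilter;
import com.codahale.metrics.MetricRegistry;
import com.codahale.metrics.graphite.Graphite;
import com.codahale.metrics.graphite.GraphiteReporter;

public class HostPrefixedReporting {

    public static void main(String[] args) throws Exception {
        MetricRegistry registry = new MetricRegistry();
        registry.meter("my.metric.request"); // example metric

        String hostname = InetAddress.getLocalHost().getHostName();
        Graphite graphite = new Graphite(new InetSocketAddress("graphite.example.com", 2003));

        // everything from this instance shows up under io.my.server.<hostname>.*
        GraphiteReporter reporter = GraphiteReporter.forRegistry(registry)
                .prefixedWith("io.my.server." + hostname)
                .convertRatesTo(TimeUnit.SECONDS)
                .convertDurationsTo(TimeUnit.MILLISECONDS)
                .filter(MetricFilter.ALL)
                .build(graphite);
        reporter.start(1, TimeUnit.MINUTES);
    }
}
```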
There are heaps out there. I do not understand why you want this to be Cassandra - what advantage do you gain from that? http://graphite.wikidot.com/ for example is a graphing solution. Your applications can automatically submit data there (the Metrics library ships a Graphite reporter in Java that you can use). See http://graphite.wikidot.com/screen-shots for how it looks.
The main point is that Graphite (and most providers) knows how to handle your namespaces. Also look at Zabbix, for example, which can do the same thing.
Aggregations
Now the aggregation happens on the receiving side. Your provider knows how to do that, and you can define rules.
For example, you could wildcard alerts like:
io.my.server.{hostname}.my.metric.count > X
Graphite (I believe) even supports operations, e.g.:
sum(io.my.server.*.my.metric.request), which would sum up ALL your hosts' requests (the * wildcard matches every hostname under the prefix)
That is where the aggregation happens. At that point your servers are again standalone (as they should be) and have no dependency on each other or on any monitoring database. They simply report their own metrics (which is what they should do), and you - as the consumer of those metrics - are responsible for defining the right alerts/aggregations/formulas on the receiving end.
Aggregating this on the server side would involve:
Discovering all other servers
Monitoring their state
Receiving/sending metrics back and forth
Synchronising what they report, etc.
That just sounds like a maintenance nightmare :) I hope that gives you some insight/ideas.
(Disclaimer: I am neither a Metrics dev nor a Graphite dev - this is just how I have done this in the past, and the approach I still use.)
Edit:
With your comment in mind, here are my two favourite approaches to what you want to achieve:
DB
You can use the DB and store timestamps, e.g. for a start message and an end message.
This is not really a metrics thing, so maybe not preferred. As per your question you could write your own reporter for that, but it would get complicated with regard to upserts/updates etc. I think option 2 is easier and has more potential.
Logs
This is, I think, what you need. Your servers independently log Start/Stop/Pause events etc. - whatever it is you want to report on. You then set up Logstash and collect those logs.
Logstash allows you to track these events over time and create metrics from them, see:
https://www.elastic.co/guide/en/logstash/current/plugins-filters-metrics.html
Or:
https://github.com/logstash-plugins/logstash-filter-elapsed
The first one uses actual metrics. The second one is a different plugin that just measures times between start/stop events.
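On the application side all that is needed are plain log lines. A minimal SLF4J sketch (class name, message wording and the taskId field are invented - they only need to be stable enough for Logstash to pair the start and end events):

```java
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Emits plain start/end log lines that Logstash can later correlate;
// "taskId" is a made-up identifier used to pair the two events.
public class MessageProcessor {

    private static final Logger LOG = LoggerFactory.getLogger(MessageProcessor.class);

    public void process(String taskId) {
        LOG.info("TASK_STARTED taskId={}", taskId);
        try {
            // ... actual work ...
        } finally {
            LOG.info("TASK_ENDED taskId={}", taskId);
        }
    }
}
```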
This is the option with the most potential because it does not rely on any particular format, data store or anything else. You even get Kibana for plotting out of the box if you use the entire ELK stack.
Say you want to measure your messages: you can just look at the logs, with no application changes involved. The solution barely touches your application (storing your reporting data manually takes up threads and processing in your application, so if you need to stay close to real time this will drag your overall performance down); it is a completely standalone solution. Later on, when you want to measure other things, you can simply extend your Logstash configuration and start collecting those metrics as well.
I hope this helps

Related

Is there a way to do "migration" on RabbitMQ queues, exchanges, bindings, etc?

I would like to know if there is any alternative way to create/change/remove exchanges, queues and bindings without depending on the framework (in my case, Spring) and its limitations.
The problem
Often I need to change the name of a routing key, queue or exchange, and these frameworks do not allow such fine-grained changes. As a consequence, the tendency is to continue with the original names of the queues/keys and even the original setup (durability, DLQ, etc.). Over time this ends up confusing the organization of the queues, because you cannot easily maintain their names and configuration, or eventually reorganize them across different exchanges.
Currently, the only way to accomplish this is to manually remove them from each environment and let the framework recreate them, or to move the messages to a temporary queue and do the same.
I would like to know if there is any alternative way to control this, something like the database migration tools Liquibase, Flyway, etc.
Drawing a parallel with the database world, letting Spring create everything in RabbitMQ currently seems to me analogous to leaving Hibernate's hbm2ddl option set to update on a production database.
You can change some things but not others - but you have to do it programmatically, not declaratively.
You can use RabbitAdmin.declareBinding() to bind a queue with a different routing key (and/or exchange), and then use removeBinding() to remove the old one.
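For example, with Spring AMQP (the queue, exchange and routing-key names below are invented):

```java
import org.springframework.amqp.core.BindingBuilder;
import org.springframework.amqp.core.Queue;
import org.springframework.amqp.core.TopicExchange;
import org.springframework.amqp.rabbit.connection.CachingConnectionFactory;
import org.springframework.amqp.rabbit.core.RabbitAdmin;

public class BindingMigration {

    public static void main(String[] args) {
        RabbitAdmin admin = new RabbitAdmin(new CachingConnectionFactory("localhost"));

        Queue queue = new Queue("orders.queue", true);          // existing durable queue
        TopicExchange exchange = new TopicExchange("orders.exchange");

        // declare the binding with the new routing key first ...
        admin.declareBinding(BindingBuilder.bind(queue).to(exchange).with("order.created.v2"));
        // ... then remove the old one, so there is no window with no binding at all
        admin.removeBinding(BindingBuilder.bind(queue).to(exchange).with("order.created"));
    }
}
```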
You cannot change queue arguments (DLQ settings etc) or durability.
You can use the shovel plugin to move messages from an old queue to a new one.

Axon Framework - is it possible to have a single tracking event processor for multiple sagas?

Let's start off by saying that I'm using version 3.1.2 of Axon Framework with tracking event processors enabled for both @EventHandlers and Sagas.
The current default behaviour for creating event processors for Sagas, as I see it, is to create a single tracking event processor per Saga. This works quite well at microservice scale, but might turn out to be a problem in big monolithic applications which may implement a lot of Sagas. Since I'm writing such an application, I want better control over the number of running threads, which in turn gives me better control over the use of the database connection pool, context switching and memory usage. Ideally, I would like to have as many tracking event processors as CPU cores, where each event processor executes multiple Sagas and/or @EventHandlers.
I have already figured out that I'm able to do this for @EventHandlers via either the @ProcessingGroup annotation or the EventHandlingConfiguration::assignHandlersMatching method, but SagaConfiguration does not seem to expose a similar API. In fact, the most specific method, SagaConfiguration::trackingSagaManager, is hardcoded to create a new TrackingEventProcessor object, which makes me think what I'm trying to achieve is currently impossible. So here's my question: is there some non-straightforward way that I'm missing which will let me execute multiple Sagas in the context of a single event processor?
I can confirm that it is (currently) not possible to have multiple Sagas managed by a single EventProcessor. That said, I'm still weighing the pros and cons of doing so, as your scenario doesn't sound too weird at first glance.
I recommend dropping a feature request on the AxonFramework GitHub page. That way we (1) document this idea/desire and (2) have a good place to discuss whether or not to implement it.
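For reference, the grouping the question already describes for plain event handlers works along these lines - a sketch assuming Axon 3.x, with an invented event and projection class; every handler annotated with the same @ProcessingGroup value ends up in the same event processor:

```java
import org.axonframework.config.ProcessingGroup;
import org.axonframework.eventhandling.EventHandler;

// Invented event type, only for illustration.
class OrderCreatedEvent {
    final String orderId;
    OrderCreatedEvent(String orderId) { this.orderId = orderId; }
}

// All @EventHandler beans sharing the "shared-projections" group are driven
// by one event processor; Sagas cannot (currently) be grouped this way.
@ProcessingGroup("shared-projections")
public class OrderSummaryProjection {

    @EventHandler
    public void on(OrderCreatedEvent event) {
        // update the read model for event.orderId here
    }
}
```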

Domain-specific crawling with different settings for each domain (e.g. speed) using Storm crawler

I discovered StormCrawler only recently, and from my past experience, studies and work with different crawlers I find this Apache Storm based project pretty robust and suitable for many use cases and scenarios.
I have read some tutorials and tested StormCrawler with a basic setup. I would like to use the crawler in my project, but there are certain things I am not sure it is capable of doing, or even whether it is suitable for such use cases.
I would like to do small and large recursive crawls of many web domains with specific speed settings and a limit on the number of fetched URLs. The crawls can be started separately at any time with different settings (different speed, ignoring robots.txt for that domain, ignoring external links).
Questions:
Is StormCrawler suitable for such a scenario?
Can I set the limit to the maximum number of pages fetched by the crawler?
Can I set the limits to the number of fetched pages for different domains?
Can I monitor the progress of the crawl for specific domains separately?
Can I change the settings dynamically without needing to upload a modified topology to Storm?
Is it possible to pause or stop crawling (for a specific domain)?
Is StormCrawler usually run as a single deployed topology?
I assume that for some of these questions the answer may lie in customizing or writing my own bolts or spouts, but I would rather avoid modifying the fetcher bolt or the main logic of the crawler, as that would mean I am developing another crawler.
Thank you.
Glad you like StormCrawler
Is StormCrawler suitable for such a scenario?
Probably but you'd need to modify/customise a few things.
Can I set the limit to the maximum number of pages fetched by the crawler?
You can currently set a limit on the depth from the seeds and have a different value per seed.
There is no mechanism for filtering globally based on the number of URLs but this could be done. It depends on what you use to store the URL status and the corresponding spout and status updater implementations. For instance, if you were using Elasticsearch for storing the URLs, you could have a URL filter check the number of URLs in the index and filter URLs (existing or not) based on that.
Can I set the limits to the number of fetched pages for different domains?
You could specialize the solution proposed above and query per domain or host for the number of URLs already known. Doing this would not require any modifications to the core elements, just a custom URL filter.
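The core of such a filter could look like the sketch below. The StormCrawler plumbing (implementing the URLFilter interface and its configuration hook) is omitted and the limit is made up; following the URL-filter convention, returning null drops the URL and returning the URL keeps it. A real implementation would query the status backend (e.g. the Elasticsearch status index) rather than counting in memory:

```java
import java.net.MalformedURLException;
import java.net.URL;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical per-domain cap: counts URLs seen per host and drops anything
// beyond the limit. Wire it into the URL filter chain for your StormCrawler
// version; the in-memory counter is only for illustration.
public class PerDomainLimitFilter {

    private static final int MAX_PER_DOMAIN = 10_000; // made-up limit
    private final Map<String, AtomicInteger> seenPerHost = new ConcurrentHashMap<>();

    public String filter(String urlToFilter) {
        try {
            String host = new URL(urlToFilter).getHost();
            int seen = seenPerHost
                    .computeIfAbsent(host, h -> new AtomicInteger())
                    .incrementAndGet();
            return seen > MAX_PER_DOMAIN ? null : urlToFilter;
        } catch (MalformedURLException e) {
            return null; // unparseable URLs are dropped
        }
    }
}
```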
Can I monitor the progress of the crawl for specific domains separately?
Again, it depends on what you use as a back end. With Elasticsearch for instance, you can use Kibana to see the URLs per domain.
Can I change the settings dynamically without needing to upload a modified topology to Storm?
No. The configuration is read when the worker tasks are started. I know of some users who wrote a custom configuration implementation backed by a DB table and got their components to read from that but this meant modifying a lot of code.
Is it possible to pause or stop crawling (for a specific domain)?
Not on a per-domain basis, but you could add an intermediate bolt to check whether a domain should be processed or not; if not, you simply fail the tuple. This depends on the status storage again. You could also add a custom filter to the ES spouts, for instance, and a field in the status index: whenever the crawl should be halted for a specific domain, you could modify the value of that field for all the URLs matching that domain.
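A sketch of such an intermediate bolt with the plain Apache Storm API. The paused-host set is hardcoded here and the usual "url"/"metadata" tuple layout is assumed; in practice the set would be refreshed from an external store, and whether failed tuples are retried depends on your spout/status configuration:

```java
import java.net.URI;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

import org.apache.storm.task.OutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseRichBolt;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Tuple;

// Hypothetical "gate" bolt placed between the spout and the fetcher:
// tuples for paused hosts are failed instead of being forwarded.
public class DomainGateBolt extends BaseRichBolt {

    private final Set<String> pausedHosts = ConcurrentHashMap.newKeySet();
    private OutputCollector collector;

    @Override
    public void prepare(Map conf, TopologyContext context, OutputCollector collector) {
        this.collector = collector;
        pausedHosts.add("example.com"); // placeholder; load from an external store instead
    }

    @Override
    public void execute(Tuple tuple) {
        String url = tuple.getStringByField("url");
        String host = URI.create(url).getHost();
        if (host != null && pausedHosts.contains(host)) {
            collector.fail(tuple); // do not fetch now
        } else {
            collector.emit(tuple, tuple.getValues()); // pass through, anchored
            collector.ack(tuple);
        }
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("url", "metadata")); // assumed field layout
    }
}
```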
Is StormCrawler usually run as a single deployed topology?
Yes, often.
I assume that for some of these questions the answer may lie in customizing or writing my own bolts or spouts, but I would rather avoid modifying the fetcher bolt or the main logic of the crawler, as that would mean I am developing another crawler.
StormCrawler is very modular, so there are always several ways of doing things ;-)
I am pretty sure you could get the behaviour you want with a single topology by modifying small, non-core parts. If changes to more essential parts of the code are needed (e.g. per-seed robots settings), then we'd probably want to add that to the code base - your contributions would be very welcome.
You have very interesting questions. I think you can discover more here:
the code: https://github.com/DigitalPebble/storm-crawler, the official site: http://stormcrawler.net/, and some answers in this presentation: http://2015.berlinbuzzwords.de/sites/2015.berlinbuzzwords.de/files/media/documents/julien_nioche-low_latency_scalable_web_crawling_on_apache_storm.pdf

Preventing Mule servers from reprocessing same information from a database

I am working on a Mule application which reads a series of database records, generates reports and posts them to a number of HTTP locations. Unfortunately, the servers are not clustered, so it is possible that more than one server could read the same records and post them multiple times, which is undesirable. Could someone suggest the simplest way to prevent all three Mule servers from reading the database, generating the reports and sending them off?
Short answer: use a cluster.
Long answer: there is no magic in this world. If you don't use a cluster, which coordinates your efforts for you, then you have to do the coordination yourself. Since the servers are not in a cluster, they have to communicate somehow to prevent duplication. A cluster is the best answer because it is designed to do exactly that; without one, you do it "manually".
There are many ways to do this. The main point is that there should be a single place responsible for coordination (may I say cluster? :). The best candidate, IMHO, is the database, since it is the one place common to all of these servers. The simplest way is to mark processed records and process only unprocessed ones; whether you do this with an extra table or an extra field is up to you.
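A sketch of the "mark processed records" idea with plain JDBC; the table and column names (reports, processed_by, id) are invented, and the key point is the atomic UPDATE that only one server can win:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Tries to claim a single report row for this server; returns true only if
// this server won the row. Whoever loses simply skips the record.
public class ReportClaimer {

    public boolean claim(Connection connection, long reportId, String serverName)
            throws SQLException {
        String sql = "UPDATE reports SET processed_by = ? "
                   + "WHERE id = ? AND processed_by IS NULL";
        try (PreparedStatement ps = connection.prepareStatement(sql)) {
            ps.setString(1, serverName);
            ps.setLong(2, reportId);
            // exactly one server sees an update count of 1 for a given id
            return ps.executeUpdate() == 1;
        }
    }
}
```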

Monitoring Changes of a Bean to build deltas?

I have several beans in my application which get updated regularly through the usual setter methods. I want to synchronize these beans with a remote application that has the same bean classes. In my case bandwidth matters, so I have to keep the number of transferred bytes as low as possible. My idea was to create deltas of the state changes and transfer those instead of the whole objects. Currently I plan to write the protocol for transferring those changes myself, but I'm not bound to that and would prefer an existing solution.
Is there already a solution for this problem out there? And if not, how could I easily monitor those state changes in a generalized way? AOP?
Edit: This problem is not caching related, even if it may seem so at first. The data must be replicated from a central server to several clients (about 4 to 10) over the internet. The client is a standalone desktop application.
This sounds remarkably similar to JBossCache running in POJO mode.
It is a distributed, delta-based cache that breaks Java objects down into a tree structure and only transmits changes to the bits of the tree that change.
Should be a perfect fit for you.
I like your idea of creating deltas and sending them.
A simple Map could hold the delta for one object, and serialization would give you the actual message to send.
To reduce the number of messages, which would otherwise kill your performance, you should group the deltas for all objects and send them as a whole, so further collections or maps could contain these.
To monitor all changes across many beans, AOP seems like a good solution.
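Short of full AOP, the standard java.beans property-change mechanism is enough to collect such a delta map; a sketch with an invented bean and property:

```java
import java.beans.PropertyChangeListener;
import java.beans.PropertyChangeSupport;
import java.io.Serializable;

// Invented bean: every setter fires a property-change event, and a listener
// collects the changed properties into a delta map that can be serialized and sent.
public class CustomerBean implements Serializable {

    private final PropertyChangeSupport changes = new PropertyChangeSupport(this);
    private String name;

    public void addPropertyChangeListener(PropertyChangeListener listener) {
        changes.addPropertyChangeListener(listener);
    }

    public void setName(String newName) {
        String old = this.name;
        this.name = newName;
        changes.firePropertyChange("name", old, newName);
    }
}
```

Usage sketch: register a listener that does delta.put(evt.getPropertyName(), evt.getNewValue()) on a Map<String, Object>, serialize and send the map periodically, then clear it; the map only ever keeps the latest value per changed property.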
EDIT : see Skaffmann's answer.
Using an existing cache technology could be better.
Many problems could already have solutions implemented...
