I have a Java application which uses the Prometheus library to collect metrics during execution.
Later I link the Prometheus server to Grafana to visualize those metrics. I was wondering if it is possible to make Grafana show a custom X axis for those metrics?
The usual X axis is in local time. Can I make it show data with timestamps in GPS/UTC time? If it is possible, what would it require? An additional metric parameter that holds the timestamps?
I declare the metric variable like this:
private static Counter someCounter = Counter.build()
        .name("someCounter_name")
        .help("information counter")
        .labelNames("SomeLabel")
        .register();
And add data like this:
someCounter.labels("test").inc();
Any help would be appreciated. Thank you.
This is something to handle in Grafana. If you look at the dashboard (not panel) settings, under General there's a Timezone drop-down that allows you to select UTC rather than browser local time.
I'm currently trying to write an exporter for Minecraft to display some metrics in our Grafana dashboard. While most metrics are working fine with the metric types Counter and Gauge, I couldn't find any documentation on how to export strings as metrics. I need those to export location data, so that we can get an overview of where our players are from and focus localization on those regions. I wasn't able to find anything about that in the official documentation, nor was I able to find anything in the GitHub repository that could help me.
Can anyone help me with that?
With kind regards
thelooter
Metrics are always numeric. But you can use labels to export string values; this is typically used to export build or version information. E.g.
version_info{version="1.23", builtOn="Windows", built_by="myUserName", gitTag="version_1.0"} 1
so you can show in Grafana which version is currently running.
But (!!!) Prometheus is not designed to handle a lot of label combinations. Prometheus creates a new time series for every unique combination of label values. This would mean that you create one series per player if you had one metric per player. (And you would still need to calculate the number of players per region.)
What you could do is define regions in your software and export a gauge for every region, representing the number of players logged in from that region:
player_count{region="Europe"} 234
player_count{region="North America"} 567
...
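With the Prometheus Java client that could look roughly like this (a sketch only; countPlayersIn() is a hypothetical helper standing in for however you count players per region):

import io.prometheus.client.Gauge;

static final Gauge playerCount = Gauge.build()
        .name("player_count")
        .help("Number of players currently logged in, by region")
        .labelNames("region")
        .register();

// call this whenever your per-region counts change; countPlayersIn() is a hypothetical helper
void updatePlayerCounts() {
    playerCount.labels("Europe").set(countPlayersIn("Europe"));
    playerCount.labels("North America").set(countPlayersIn("North America"));
}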
If you don't want to hardcode the regions in your software, you should export the locations of the players into a database and do the statistics later based on the raw data.
I'm running a pipeline, with a Kafka topic as the source and an IMap as the sink. Every time I write one, I come across the methods withIngestionTimestamps() and withoutTimestamps() and wonder how they are useful. I understand it's all about the source adding time to the event. The question is how do I get to use it? I don't see any method to fetch the timestamp from the event.
My IMap may get filled with duplicate values. Could I make use of the withIngestionTimestamps() method to determine the latest record and discard the old one?
Jet uses the event timestamps to correctly apply windowing. It must decide which event belongs to which window and when the time has come to close a window and emit its aggregated result. The timestamps are present on the events as metadata and aren't exposed to the user.
However, if you want to apply your logic that refers to the wall-clock time, you can always call System.currentTimeMillis() to check it against the timestamp explicitly stored in the IMap value. That would be equivalent to using the processing time, which is quite similar to the ingestion time that Jet applies. Ingestion time is simply the processing time valid at the source vertex of the pipeline, so applying processing time at the sink vertex is just slightly different from that, and has the same practical properties.
Jet manages the event timestamp behind the scenes; it's visible only to processors. For example, the window aggregation will use the timestamp.
If you want to see the timestamp in the code, you have to include it in your item type. You have to go without timestamps from the source, add the ingestion timestamp using a map operator and let Jet know about it:
Pipeline p = Pipeline.create();
p.drawFrom(KafkaSources.kafka(...))
 .withoutTimestamps()
 // attach the current wall-clock time to each item so it becomes part of the item type
 .map(t -> tuple2(System.currentTimeMillis(), t))
 // tell Jet to use that stored value as the event timestamp, with 2000 ms allowed lag
 .addTimestamps(Tuple2::f0, 2000)
 .drainTo(Sinks.logger());
I used an allowedLag of 2000 ms. The reason for this is that the timestamps will be added in a vertex downstream of the vertex that assigned them. Stream merging can take place there and internal skew needs to be accounted for. For example, it should account for the longest expected GC pause or network delay. See the note on the addTimestamps method.
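If the reason for wanting the timestamp is the duplicate-handling part of the question, one option (a rough sketch only, not something the answers above prescribe; the map name and the extractKey() helper are assumptions) is to replace the logger sink with a merging map sink that keeps whichever record carries the later timestamp stored in the value:

Pipeline p = Pipeline.create();
p.drawFrom(KafkaSources.kafka(...))
 .withoutTimestamps()
 .map(t -> tuple2(System.currentTimeMillis(), t))
 .drainTo(Sinks.mapWithMerging(
         "records",                                   // target IMap name (assumption)
         t -> extractKey(t.f1()),                     // extractKey() is a hypothetical key function
         t -> t,                                      // store the whole Tuple2 as the map value
         (oldV, newV) -> newV.f0() >= oldV.f0() ? newV : oldV)); // keep the record with the later timestamp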
I'm working on a setup where we run our Java services in docker containers hosted on a kubernetes platform.
I want to create a dashboard where I can monitor the heap usage of all instances of a service in Grafana. Writing metrics to statsd with the pattern
<servicename>.<containerid>.<processid>.heapspace works well; I can see all heap usages in my chart.
After a redeployment the container names change, so new values are added to the existing graph. My problem is that the old lines continue to exist at the position of the last value received, even though those containers are already dead.
Is there any simple solution for this in Grafana? Can I just say: if you didn't receive data for a metric for more than X seconds, cut off the chart line?
Update:
Upgrading to the newest Grafana version and setting "null" as the value for "Null value" under "Stacking & Null value" didn't work.
Maybe it's a problem with statsd?
I'm sending data to statsd in the form of:
felix.javaclient.machine<number>-<pid>.heap:<heapvalue>|g
Is anything wrong with this?
This can happen for two reasons: because Grafana is using the "connected" setting for null values, and/or (as is the case here) because statsd keeps sending the previously-seen value for the gauge when there are no updates in the current period.
Grafana Config
You'll want to make 2 adjustments to your graph config:
First, go to the "Display" tab and under "Stacking & Null value" change "Null value" to "null", that will cause Grafana to stop showing the lines when there is no data for a series.
Second, if you're using a legend you can go to the "Legend" tab and under "Hide series" check the "With only nulls" checkbox, that will cause items to only be displayed in the legend if they have a non-null value during the graph period.
statsd Config
The statsd documentation for gauge metrics tells us:
If the gauge is not updated at the next flush, it will send the
previous value. You can opt to send no metric at all for this gauge,
by setting config.deleteGauges
So, the grafana changes alone aren't enough in this case, because the values in graphite aren't actually null (since statsd keeps sending the last reading). If you change the statsd config to have deleteGauges: true then statsd won't send anything and graphite will contain the null values we expect.
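For reference, that's a single extra key in your statsd config.js, alongside whatever settings you already have there:

{
  // ... your existing settings ...
  deleteGauges: true
}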
Graphite Note
As a side note, a setup like this will cause your data folder to grow continuously as you create new series each time a container is launched. You'll definitely want to look into removing old series after some period of inactivity to avoid filling up the disk. If you're using graphite with whisper that can be as simple as a cron task running find /var/lib/graphite/whisper/ -name '*.wsp' -mtime +30 -delete to remove whisper files that haven't been modified in the last 30 days.
To do this, I would use
maximumAbove(transformNull(felix.javaclient.*.heap, 0), 0)
The transformNull will take any datapoint that is currently null, or unreported for that instant in time, and turn it into a 0 value.
The maximumAbove will only display the series' that have a maximum value above 0 for the selected time period.
Using maximumAbove, you can see all historical containers; if you wish to see only the currently running containers, use currentAbove instead.
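For example, mirroring the query above with the same metric path:
currentAbove(transformNull(felix.javaclient.*.heap, 0), 0)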
I have some data in my JSON.
This data updates automatically every 20 seconds.
There is a field in that JSON called power.
The value of power can be an int from 1 to 600.
I would like to create a class which would check the value of power every 20 seconds and alert the user with an in-app notification if the value goes above or below a certain threshold.
What is the best way of achieving this? Is there a library that can be used?
There are many patterns you can use.
You can opt for using a library such as Quartz: http://quartz-scheduler.org/documentation/quartz-2.x/tutorials/
or you can create a cron job on your OS that would do what Quartz does: http://kvz.io/blog/2007/07/29/schedule-tasks-on-linux-using-crontab/ (I'd go with Quartz)
Or you can do it yourself:
http://docs.oracle.com/javase/tutorial/essential/concurrency/sleep.html
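If you go the do-it-yourself route, here's a minimal sketch with a ScheduledExecutorService (the thresholds and the readPowerFromJson()/notifyUser() helpers are assumptions for illustration):

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class PowerMonitor {

    private static final int MIN_POWER = 100;   // assumed thresholds
    private static final int MAX_POWER = 500;

    private final ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();

    public void start() {
        // check the value every 20 seconds, matching the update interval of the JSON
        scheduler.scheduleAtFixedRate(this::check, 0, 20, TimeUnit.SECONDS);
    }

    private void check() {
        int power = readPowerFromJson();   // hypothetical: parse the "power" field from your JSON
        if (power < MIN_POWER || power > MAX_POWER) {
            notifyUser(power);             // hypothetical: show the in-app notification
        }
    }

    public void stop() {
        scheduler.shutdown();
    }

    private int readPowerFromJson() { /* parse your JSON here */ return 0; }
    private void notifyUser(int power) { /* show in-app notification */ }
}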
I have a database full of two different types of users (Mentors and Mentees), whereby I want the second group (Mentees) to be able to "search" for people in the first group (Mentors) who match their profile. Mentors and Mentees can both go in and change items in their profile at any point in time.
Currently, I am using Apache Mahout for the user matching (recommender.mostSimilarIDs()). The problem I'm running into is that I have to reload the user data every single time anyone searches. By itself, this doesn't take that long, but when Mahout processes the data it seems to take a very long time (14 minutes for 3000 Mentors and 3000 Mentees). After processing, matching takes mere seconds. I also get the same INFO message over and over again while it's processing ("Processed 2248 users"), while looking at the code shows that the message should only be outputted every 10000 users.
I'm using the GenericUserBasedRecommender and the GenericDataModel, along with the NearestNUserNeighborhood, AveragingPreferenceInferrer and PearsonCorrelationSimilarity. I load mentors from the database, add the mentee to the list of POJOs and convert them to a FastByIDMap to give to the DataModel.
Is there a better way to be doing this? The product owner needs the data to be current for every search.
(I'm the author.)
You shouldn't need to ask it to reload the data every time; why's that?
14 minutes sounds way, way too long to load such a small amount of data; something's wrong. You might follow up with more info at user@mahout.apache.org.
You are seeing log messages from a DataModel, which you can disable in your logging system of choice. It prints one final count. This is nothing to worry about.
I would advise you against using a PreferenceInferrer unless you absolutely know you want it. Do you actually have ratings here? I might suggest LogLikelihoodSimilarity if not.
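For reference, a rough sketch of that setup without the PreferenceInferrer, using LogLikelihoodSimilarity (the neighborhood size, the result count and the way you build the FastByIDMap from your database are assumptions):

import org.apache.mahout.cf.taste.common.TasteException;
import org.apache.mahout.cf.taste.impl.common.FastByIDMap;
import org.apache.mahout.cf.taste.impl.model.GenericDataModel;
import org.apache.mahout.cf.taste.impl.neighborhood.NearestNUserNeighborhood;
import org.apache.mahout.cf.taste.impl.recommender.GenericUserBasedRecommender;
import org.apache.mahout.cf.taste.impl.similarity.LogLikelihoodSimilarity;
import org.apache.mahout.cf.taste.model.DataModel;
import org.apache.mahout.cf.taste.model.PreferenceArray;
import org.apache.mahout.cf.taste.similarity.UserSimilarity;

long[] findSimilarMentors(FastByIDMap<PreferenceArray> preferences, long menteeId) throws TasteException {
    DataModel model = new GenericDataModel(preferences);        // built from mentors plus the searching mentee
    UserSimilarity similarity = new LogLikelihoodSimilarity(model);
    NearestNUserNeighborhood neighborhood = new NearestNUserNeighborhood(25, similarity, model); // 25 is illustrative
    GenericUserBasedRecommender recommender = new GenericUserBasedRecommender(model, neighborhood, similarity);
    return recommender.mostSimilarUserIDs(menteeId, 10);        // top 10 matches, also illustrative
}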