Micrometer/Prometheus: How do I keep a gauge value from becoming NaN?

I am trying to monitor logged-in users. I get the logged-in user info by calling an API. This is the code I have used:
public class MonitorService {
private InfoCollectionService infoService;
public MonitorService(InfoCollectionService infoService) {
this.infoService = infoService;
}
@Scheduled(fixedDelay = 5000)
public void currentLoggedInUserMonitor() {
infoService.getLoggedInUser("channel").forEach(channel -> {
Metrics.gauge("LoggedInUsers.Inchannel_" + channel.getchannelName(), channel.getgetLoggedInUser());
});
}
}
I can see the values in Prometheus. The problem is that after a few seconds the value becomes NaN. I have read that Micrometer gauges wrap their obj input in a WeakReference (hence it gets garbage collected). I don't know how to fix it. If anybody knows how to fix this, it would be great.

This is a shortcoming in Micrometer that I would like to fix eventually.
In the meantime, you need to keep the value in a map so that it is not garbage collected. Notice how we then point the gauge at the map and use a lambda to pull out the value.
public class MonitorService {
private Map<String, Integer> gaugeCache = new HashMap<>();
private InfoCollectionService infoService;
public MonitorService(InfoCollectionService infoService) {
this.infoService = infoService;
}
@Scheduled(fixedDelay = 5000)
public void currentLoggedInUserMonitor() {
infoService.getLoggedInUser("channel").forEach(channel -> {
gaugeCache.put(channel.getchannelName(), channel.getgetLoggedInUser());
Metrics.gauge("LoggedInUsers.Inchannel_" + channel.getchannelName(), gaugeCache, g -> g.get(channel.getchannelName()));
});
}
}
I would also recommend using tags for the various channels:
Metrics.gauge("loggedInUsers.inChannel", Tags.of("channel", channel.getchannelName()), gaugeCache, g -> g.get(channel.getchannelName()));
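The mechanism behind the NaN can be modelled without Micrometer. The following is a minimal sketch, assuming a toy WeakGauge class (hypothetical, not Micrometer's implementation) that holds its state object only through a WeakReference. ChannelGauges is likewise a hypothetical helper that keeps one strongly referenced AtomicInteger per channel and registers each gauge only once, instead of on every scheduled run.

```java
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.ToDoubleFunction;

// Toy stand-in (assumption) for a gauge that only holds its state object weakly.
final class WeakGauge<T> {
    private final WeakReference<T> ref;
    private final ToDoubleFunction<T> valueFunction;

    WeakGauge(T obj, ToDoubleFunction<T> valueFunction) {
        this.ref = new WeakReference<>(obj);
        this.valueFunction = valueFunction;
    }

    // Returns NaN once the state object is no longer reachable through the weak reference.
    double value() {
        T obj = ref.get();
        return obj == null ? Double.NaN : valueFunction.applyAsDouble(obj);
    }

    // Test hook that simulates the garbage collector clearing the reference.
    void simulateCollection() {
        ref.clear();
    }
}

// Keeps one strongly referenced AtomicInteger per channel and registers each gauge once.
final class ChannelGauges {
    private final Map<String, AtomicInteger> values = new ConcurrentHashMap<>();
    private final Map<String, WeakGauge<AtomicInteger>> gauges = new ConcurrentHashMap<>();

    void update(String channel, int loggedIn) {
        AtomicInteger holder = values.computeIfAbsent(channel, c -> {
            AtomicInteger created = new AtomicInteger();
            gauges.put(c, new WeakGauge<>(created, AtomicInteger::doubleValue));
            return created;
        });
        // Later updates only change the held value; the gauge keeps pointing at the same object.
        holder.set(loggedIn);
    }

    double read(String channel) {
        return gauges.get(channel).value();
    }
}
```

Because the values map keeps every AtomicInteger strongly reachable, the weak reference inside the gauge is never cleared while the owning object is alive.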


Passing results from expensive methods as they come for multiple layers

I've got code that looks similar to this:
List<String> ids = expensiveMethod();
List<String> filteredIds = cheapFilterMethod(ids);
if (!filteredIds.isEmpty()) {
List<SomeEntity> fullEntities = expensiveDatabaseCall(filteredIds);
List<SomeEntity> filteredFullEntities = anotherCheapFilterFunction(fullEntities);
if (!filteredFullEntities.isEmpty()) {
List<AnotherEntity> finalResults = stupidlyExpensiveDatabaseCall(filteredFullEntities);
relativelyCheapMethod(finalResults);
}
}
It's basically a waterfall of a couple expensive methods that, on their own, all either grab something from a database or filter previous database results. This is due to stupidlyExpensiveDatabaseCall, which needs as few leftover entities as possible, hence the exhaustive filtering.
My problem is that the other functions aren't all quite cheap either and thus they block the thread for a couple of seconds while stupidlyExpensiveDatabaseCall is waiting and doing nothing until it gets the whole batch at once.
I'd like to process the results from each method as they come in. I know I could write a thread for each individual method and have some concurrent queue working between them, but that's a load of boilerplate that I'd like to avoid. Is there a more elegant solution?
There's a post about different ways to parallelize: not only the parallelStream() way, but also running consecutive steps in parallel, linked by queues, the way you described. RxJava may suit your needs in this respect. It's a more complete variant of the rather fragmentary reactive streams API in Java 9. But I think you only really get the full benefit if you use a reactive DB API along with it.
This is the RxJava way:
public class FlowStream {
@Test
public void flowStream() {
int items = 10;
print("\nflow");
Flowable.range(0, items)
.map(this::expensiveCall)
.map(this::expensiveCall)
.forEach(i -> print("flowed %d", i));
print("\nparallel flow");
Flowable.range(0, items)
.flatMap(v ->
Flowable.just(v)
.subscribeOn(Schedulers.computation())
.map(this::expensiveCall)
)
.flatMap(v ->
Flowable.just(v)
.subscribeOn(Schedulers.computation())
.map(this::expensiveCall)
).forEach(i -> print("flowed parallel %d", i));
await(5000);
}
private Integer expensiveCall(Integer i) {
print("making %d more expensive", i);
await(Math.round(10f / (Math.abs(i) + 1)) * 50);
return i;
}
private void await(int i) {
try {
Thread.sleep(i);
} catch (InterruptedException e) {
throw new RuntimeException(e);
}
}
private void print(String pattern, Object... values) {
System.out.println(String.format(pattern, values));
}
}
The Maven dependency:
<!-- https://mvnrepository.com/artifact/io.reactivex.rxjava2/rxjava -->
<dependency>
<groupId>io.reactivex.rxjava2</groupId>
<artifactId>rxjava</artifactId>
<version>2.2.13</version>
</dependency>
You could use CompletableFuture to divide up each non-CPU-bound step. The usage is similar to the JavaScript Promise API.
public void loadEntities() {
CompletableFuture.supplyAsync(this::expensiveMethod, Executors.newCachedThreadPool())
.thenApply(this::cheapFilterMethod)
.thenApplyAsync(this::expensiveDatabaseCall)
.thenApply(this::anotherCheapFilterFunction)
.thenApplyAsync(this::stupidlyExpensiveDatabaseCall)
.thenAccept(this::relativelyCheapMethod);
}
private List<String> expensiveMethod() { ... }
private List<String> cheapFilterMethod(List<String> ids) { ... }
private List<SomeEntity> expensiveDatabaseCall(List<String> ids) { ... }
private List<SomeEntity> anotherCheapFilterFunction(List<SomeEntity> entities) { ... }
private List<AnotherEntity> stupidlyExpensiveDatabaseCall(List<SomeEntity> entities) { ... }
private void relativelyCheapMethod(List<AnotherEntity> entities) { ... }
You can also pass your own thread pool at each step if you'd like to have more control over execution.
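The chain above still hands complete lists from stage to stage. If the stages can work on single items, one chain per item lets each item move on as soon as its own previous stage has finished. This is a rough sketch under that assumption; loadEntity, keepEntity and loadDetails are hypothetical per-item stand-ins for the batch methods in the question.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.Collectors;

final class PerItemPipeline {
    // Hypothetical per-item stand-ins for the expensive and cheap steps.
    static String loadEntity(String id) { return "entity-" + id; }
    static boolean keepEntity(String entity) { return !entity.endsWith("0"); }
    static String loadDetails(String entity) { return entity + "-details"; }

    static List<String> run(List<String> ids, ExecutorService pool) {
        // Start one chain per id; an id reaches the second stage as soon as its first stage is done.
        List<CompletableFuture<String>> chains = ids.stream()
            .map(id -> CompletableFuture
                .supplyAsync(() -> loadEntity(id), pool)
                .thenApplyAsync(e -> keepEntity(e) ? loadDetails(e) : null, pool))
            .collect(Collectors.toList());
        // Wait for every chain and drop the ids that were filtered out.
        return chains.stream()
            .map(CompletableFuture::join)
            .filter(Objects::nonNull)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            System.out.println(run(Arrays.asList("1", "10", "2"), pool));
        } finally {
            pool.shutdown();
        }
    }
}
```

Whether this helps depends on the database: issuing many single-row queries can be slower than one batch query, so the per-item split is worth measuring before adopting it.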
You can use the Java 8 Stream API. It's impossible to process the results of a DB query "as they come in" because the result set comes back all at once. You'd have to change your methods to handle single entities.
expensiveMethod().parallelStream()
.filter(this::cheapFilterMethod) // Returns Boolean
.map(this::expensiveDatabaseCallSingle) // Returns SomeEntity
.filter(this::anotherCheapFilterFunction) // Returns boolean for filtered entities
.map(this::stupidlyExpensiveDatabaseCallSingle) // Returns AnotherEntity
.forEach(this::relativelyCheapMethod); // void method
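I would also suggest using an ExecutorService to manage your threads so you don't consume all resources just creating a bunch of threads. Note that a parallel stream still runs its work on the common ForkJoinPool, so the fixed pool below only takes the call that starts the stream off the calling thread: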
ExecutorService threadPool = Executors.newFixedThreadPool(8);
threadPool.submit(this::methodForParallelStream);

How to release a guava cache object

In my application, I build a Guava Cache object with CacheBuilder.newBuilder(), and now I need to dynamically adjust some of its initialization parameters.
As I can't find any rebuild method for a Guava cache, I have to build a new one.
My questions are:
How do I release the old one? I can't find any useful method for that either. I just call cache.invalidateAll() on the old one to invalidate all the keys. Is there any risk of an OOM?
As the cache may be used by multiple threads, is it necessary to declare the cache as volatile?
My code is as follows:
private volatile LoadingCache<Long, String> cache = null;
private volatile LoadingCache<Long, String> oldCache = null;
public void rebuildCache(int cacheSize, int expireSeconds) {
logger.info("rebuildCache start: cacheSize: {}, expireSeconds: {}", cacheSize, expireSeconds);
oldCache = cache;
cache = CacheBuilder.newBuilder()
.maximumSize(cacheSize)
.recordStats()
.expireAfterWrite(expireSeconds, TimeUnit.SECONDS)
.build(
new CacheLoader<Long, String>() {
@Override
public String load(Long id) {
// some codes here
}
}
);
if (oldCache != null) {
oldCache.invalidateAll();
}
logger.info("rebuildCache end");
}
public String getByCache(Long id) throws ExecutionException {
return cache.get(id);
}
You don't need to do anything special to release the old one; it'll get garbage collected like any other object. You probably should mark the cache as volatile, or even better, an AtomicReference so multiple threads don't replace the cache at the same time. That said, oldCache should be a variable inside the method, not the class.
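As a rough illustration of the AtomicReference suggestion, the sketch below swaps the whole cache in one atomic step and keeps the old instance in a local variable only. A ConcurrentHashMap stands in for the Guava LoadingCache so the example has no dependencies; SwappableCache and its method names are made up for this sketch.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Function;

// A ConcurrentHashMap stands in (assumption) for the Guava LoadingCache.
final class SwappableCache {
    private final Function<Long, String> loader;
    private final AtomicReference<Map<Long, String>> current =
        new AtomicReference<>(new ConcurrentHashMap<>());

    SwappableCache(Function<Long, String> loader) {
        this.loader = loader;
    }

    String get(Long id) {
        // Readers always see either the old or the new cache, never a half-built one.
        return current.get().computeIfAbsent(id, loader);
    }

    // Publishes a fresh cache and returns the old one, which the caller can simply drop.
    Map<Long, String> rebuild() {
        return current.getAndSet(new ConcurrentHashMap<>());
    }
}
```

Once no thread holds the map returned by rebuild(), it is ordinary garbage; no explicit release step is involved.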

Dynamically push events/values to a Flux during application run

I am trying to build a reactive pipeline using Java and Project Reactor. The use case is that the application generates flow statuses (INIT, PROCESSING, SAVED, DONE) at different levels. Each status must be emitted asynchronously to a Flux that is handled independently of the main flow. I came across this link:
Spring WebFlux (Flux): how to publish dynamically
My sample flow is something like this:
public class StatusEmitterImpl implements StatusEmitter {
private final FluxProcessor<String, String> processor;
private final FluxSink<String> sink;
public StatusEmitterImpl() {
this.processor = DirectProcessor.<String>create().serialize();
this.sink = processor.sink();
}
@Override
public Flux<String> publisher() {
return this.processor.map(x -> x);
}
public void publishStatus(String status) {
sink.next(status);
}
}
public class Try {
public static void main(String[] args) {
StatusEmitterImpl statusEmitter = new StatusEmitterImpl();
Flux.fromIterable(Arrays.asList("INIT", "DONE")).subscribe(x ->
statusEmitter.publishStatus(x));
statusEmitter.publisher().subscribe(x -> System.out.println(x));
}
}
The problem is that nothing is getting printed on the console. I cannot understand what I am missing.
DirectProcessor passes values to its registered Subscribers directly, without caching the signals. If there is no Subscriber, then the value is "forgotten". If a Subscriber comes in late, then it will only receive signals emitted after it subscribed.
That's what is happening here: because fromIterable works on an in-memory collection, it has time to push all values to the DirectProcessor, which by that time doesn't have a registered Subscriber yet.
If you invert the last two lines you should see something.
DirectProcessor is a hot publisher and doesn't buffer elements, so you should produce elements only after it has been subscribed to, like this:
public static void main(String[] args) {
StatusEmitterImpl statusEmitter = new StatusEmitterImpl();
statusEmitter.publisher().subscribe(x -> System.out.println(x));
Flux.fromIterable(Arrays.asList("INIT", "DONE")).subscribe(x -> statusEmitter.publishStatus(x));
}
Alternatively, use EmitterProcessor or UnicastProcessor instead of DirectProcessor.
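The ordering effect described in these answers can be reproduced with a toy model. HotPublisher below is a hypothetical, dependency-free class, not Reactor's DirectProcessor; it only mimics the "deliver to current subscribers, buffer nothing" behaviour.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

// Toy model (assumption) of a hot, non-buffering publisher.
final class HotPublisher<T> {
    private final List<Consumer<T>> subscribers = new CopyOnWriteArrayList<>();

    void subscribe(Consumer<T> subscriber) {
        subscribers.add(subscriber);
    }

    // Delivers to whoever is subscribed right now; with no subscribers the value is dropped.
    void next(T value) {
        for (Consumer<T> s : subscribers) {
            s.accept(value);
        }
    }
}

final class HotPublisherDemo {
    // Emits first, subscribes second: the subscriber sees nothing.
    static List<String> emitThenSubscribe() {
        HotPublisher<String> publisher = new HotPublisher<>();
        List<String> seen = new ArrayList<>();
        publisher.next("INIT");
        publisher.next("DONE");
        publisher.subscribe(seen::add);
        return seen;
    }

    // Subscribes first, emits second: the subscriber sees both values.
    static List<String> subscribeThenEmit() {
        HotPublisher<String> publisher = new HotPublisher<>();
        List<String> seen = new ArrayList<>();
        publisher.subscribe(seen::add);
        publisher.next("INIT");
        publisher.next("DONE");
        return seen;
    }
}
```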

How to reset metrics every X seconds?

I am trying to measure application and jvm level metrics on my application using DropWizard Metrics library.
Below is my metrics class which I am using across my code to increment/decrement the metrics. I am calling increment and decrement method of below class to increment and decrement metrics.
public class TestMetrics {
private final MetricRegistry metricRegistry = new MetricRegistry();
private static class Holder {
private static final TestMetrics INSTANCE = new TestMetrics();
}
public static TestMetrics getInstance() {
return Holder.INSTANCE;
}
private TestMetrics() {}
public void increment(final Names... metricsName) {
for (Names metricName : metricsName)
metricRegistry.counter(name(TestMetrics.class, metricName.value())).inc();
}
public void decrement(final Names... metricsName) {
for (Names metricName : metricsName)
metricRegistry.counter(name(TestMetrics.class, metricName.value())).dec();
}
public MetricRegistry getMetricRegistry() {
return metricRegistry;
}
public enum Names {
// some more fields here
INVALID_ID("invalid-id"), MESSAGE_DROPPED("drop-message");
private final String value;
private Names(String value) {
this.value = value;
}
public String value() {
return value;
}
};
}
And here is how I use the above TestMetrics class to increment the metrics where needed. The method below is called by multiple threads.
public void process(GenericRecord record) {
// ... some other code here
try {
String clientId = String.valueOf(record.get("clientId"));
String procId = String.valueOf(record.get("procId"));
if (Strings.isNullOrEmpty(clientId) && Strings.isNullOrEmpty(procId)
&& !NumberUtils.isNumber(clientId)) {
TestMetrics.getInstance().increment(Names.INVALID_ID,
Names.MESSAGE_DROPPED);
return;
}
// .. other code here
} catch (Exception ex) {
TestMetrics.getInstance().increment(Names.MESSAGE_DROPPED);
}
}
Now I have another class which runs every 30 seconds only (I am using Quartz framework for that) from where I want to print out all the metrics and its count. In general, I will send these metrics every 30 seconds to some other system but for now I am printing it out here. Below is how I am doing it.
public class SendMetrics implements Job {
@Override
public void execute(final JobExecutionContext ctx) throws JobExecutionException {
MetricRegistry metricsRegistry = TestMetrics.getInstance().getMetricRegistry();
Map<String, Counter> counters = metricsRegistry.getCounters();
for (Entry<String, Counter> counter : counters.entrySet()) {
System.out.println(counter.getKey());
System.out.println(counter.getValue().getCount());
}
}
}
Now my question is: I want to reset all my metric counts every 30 seconds. That is, when my execute method prints out the metrics, it should print the counts for that 30-second window only (for all the metrics), not the counts accumulated since the program started.
Is there any way to make all my metrics count only what happened in the last 30 seconds?
Posting this as an answer because it is too long for a comment:
You want to reset the counters. There is no API for this; the reasons are discussed in the linked GitHub issue. The article describes a possible workaround. You keep your counters and use them as usual, incrementing and decrementing, even though you can't reset them directly. Then you add a new Gauge whose value follows the counter you want to reset after it has been reported. The Gauge's getValue() method is called whenever the counter value is reported. After storing the current value, the method decrements the counter by that value, which effectively resets the counter to 0. So you get your report and the counter is reset as well. This is described in Step 1.
Step 2 adds a filter that prevents the actual counter from being reported, because you are now reporting through the gauge.
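The idea from Step 1 can be sketched without the metrics library. SimpleCounter and ResettingGauge below are hypothetical stand-ins for the library's counter and gauge types; only the read-then-subtract logic is the point.

```java
import java.util.concurrent.atomic.AtomicLong;

// Minimal stand-in (assumption) for a metrics counter with inc/dec/getCount.
final class SimpleCounter {
    private final AtomicLong count = new AtomicLong();

    void inc() { count.incrementAndGet(); }
    void dec(long n) { count.addAndGet(-n); }
    long getCount() { return count.get(); }
}

// Gauge whose read reports the counter's value and then subtracts that value from the counter.
final class ResettingGauge {
    private final SimpleCounter counter;

    ResettingGauge(SimpleCounter counter) {
        this.counter = counter;
    }

    long getValue() {
        long current = counter.getCount();
        // Subtract only what was read, so increments arriving in between carry over to the next interval.
        counter.dec(current);
        return current;
    }
}
```

One consequence of this design is that every read resets the counter, so the gauge should be read by exactly one reporter per interval.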

Thread-safe cache of one object in java

let's say we have a CountryList object in our application that should return the list of countries. The loading of countries is a heavy operation, so the list should be cached.
Additional requirements:
CountryList should be thread-safe
CountryList should load lazy (only on demand)
CountryList should support the invalidation of the cache
CountryList should be optimized considering that the cache will be invalidated very rarely
I came up with the following solution:
public class CountryList {
private static final Object ONE = new Integer(1);
// MapMaker is from Google Collections Library
private Map<Object, List<String>> cache = new MapMaker()
.initialCapacity(1)
.makeComputingMap(
new Function<Object, List<String>>() {
@Override
public List<String> apply(Object from) {
return loadCountryList();
}
});
private List<String> loadCountryList() {
// HEAVY OPERATION TO LOAD DATA
}
public List<String> list() {
return cache.get(ONE);
}
public void invalidateCache() {
cache.remove(ONE);
}
}
What do you think about it? Do you see something bad about it? Is there other way to do it? How can i make it better? Should i look for totally another solution in this cases?
Thanks.
Google Collections actually supplies just the thing for this: Supplier
Your code would be something like:
private Supplier<List<String>> supplier = new Supplier<List<String>>(){
public List<String> get(){
return loadCountryList();
}
};
// volatile reference so that changes are published correctly see invalidate()
private volatile Supplier<List<String>> memorized = Suppliers.memoize(supplier);
public List<String> list(){
return memorized.get();
}
public void invalidate(){
memorized = Suppliers.memoize(supplier);
}
Thank you all, especially user "gid", who gave me the idea.
My goal was to optimize the performance of the get() operation, considering that the invalidate() operation will be called very rarely.
I wrote a test class that starts 16 threads, each calling the get() operation one million times. With this class I profiled several implementations on my 2-core machine.
Testing results
Implementation            Time
no synchronisation        0,6 sec
normal synchronisation    7,5 sec
with MapMaker             26,3 sec
with Suppliers.memoize    8,2 sec
with optimized memoize    1,5 sec
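The test class itself is not shown. A minimal sketch of what such a harness might look like follows, assuming a hypothetical GetBenchmark helper that starts all threads together and measures wall-clock time. It is crude compared to a dedicated tool such as JMH and will not reproduce the numbers above.

```java
import java.util.concurrent.CountDownLatch;
import java.util.function.Supplier;

final class GetBenchmark {
    // Runs `threads` threads that each call supplier.get() `callsPerThread` times; returns elapsed milliseconds.
    static long run(Supplier<?> supplier, int threads, int callsPerThread) {
        CountDownLatch start = new CountDownLatch(1);
        CountDownLatch done = new CountDownLatch(threads);
        for (int t = 0; t < threads; t++) {
            new Thread(() -> {
                try {
                    start.await(); // hold every worker until all threads have been started
                    for (int i = 0; i < callsPerThread; i++) {
                        supplier.get();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    done.countDown();
                }
            }).start();
        }
        long begin = System.nanoTime();
        start.countDown();
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("benchmark interrupted", e);
        }
        return (System.nanoTime() - begin) / 1_000_000;
    }
}
```

With the question's setup, a call would look like GetBenchmark.run(countryList::list, 16, 1_000_000), where countryList is whichever implementation is under test.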
1) "No synchronisation" is not thread-safe, but gives us the best performance that we can compare to.
@Override
public List<String> list() {
if (cache == null) {
cache = loadCountryList();
}
return cache;
}
@Override
public void invalidateCache() {
cache = null;
}
2) "Normal synchronisation" - pretty good performance, standard no-brainer implementation
@Override
public synchronized List<String> list() {
if (cache == null) {
cache = loadCountryList();
}
return cache;
}
@Override
public synchronized void invalidateCache() {
cache = null;
}
3) "with MapMaker" - very poor performance.
See my question at the top for the code.
4) "with Suppliers.memoize" - good performance, but since it is about the same as "Normal synchronisation", we need to either optimize it or just use "Normal synchronisation".
See the answer of the user "gid" for code.
5) "with optimized memoize" - performance comparable to the "no synchronisation" implementation, but thread-safe. This is the one we need.
The cache-class itself:
(The Supplier interfaces used here is from Google Collections Library and it has just one method get(). see http://google-collections.googlecode.com/svn/trunk/javadoc/com/google/common/base/Supplier.html)
public class LazyCache<T> implements Supplier<T> {
private final Supplier<T> supplier;
private volatile Supplier<T> cache;
public LazyCache(Supplier<T> supplier) {
this.supplier = supplier;
reset();
}
private void reset() {
cache = new MemoizingSupplier<T>(supplier);
}
@Override
public T get() {
return cache.get();
}
public void invalidate() {
reset();
}
private static class MemoizingSupplier<T> implements Supplier<T> {
final Supplier<T> delegate;
volatile T value;
MemoizingSupplier(Supplier<T> delegate) {
this.delegate = delegate;
}
@Override
public T get() {
if (value == null) {
synchronized (this) {
if (value == null) {
value = delegate.get();
}
}
}
return value;
}
}
}
Example use:
public class BetterMemoizeCountryList implements ICountryList {
LazyCache<List<String>> cache = new LazyCache<List<String>>(new Supplier<List<String>>(){
@Override
public List<String> get() {
return loadCountryList();
}
});
@Override
public List<String> list(){
return cache.get();
}
@Override
public void invalidateCache(){
cache.invalidate();
}
private List<String> loadCountryList() {
// this should normally load a full list from the database,
// but just for this instance we mock it with:
return Arrays.asList("Germany", "Russia", "China");
}
}
Whenever I need to cache something, I like to use the Proxy pattern.
Doing it with this pattern offers separation of concerns. Your original
object can be concerned with lazy loading. Your proxy (or guardian) object
can be responsible for validation of the cache.
In detail:
Define an object CountryList class which is thread-safe, preferably using synchronization blocks or other semaphore locks.
Extract this class's interface into a CountryQueryable interface.
Define another object, CountryListProxy, that implements the CountryQueryable.
Only allow the CountryListProxy to be instantiated, and only allow it to be referenced
through its interface.
From here, you can insert your cache invalidation strategy into the proxy object. Save the time of the last load, and upon the next request to see the data, compare the current time to the cache time. Define a tolerance level, where, if too much time has passed, the data is reloaded.
As far as Lazy Load, refer here.
Now for some good down-home sample code:
public interface CountryQueryable {
public void operationA();
public String operationB();
}
public class CountryList implements CountryQueryable {
private boolean loaded;
public CountryList() {
loaded = false;
}
//This particular operation might be able to function without
//the extra loading.
@Override
public void operationA() {
//Do whatever.
}
//This operation may need to load the extra stuff.
@Override
public String operationB() {
if (!loaded) {
load();
loaded = true;
}
//Do whatever.
return whatever;
}
private void load() {
//Do the loading of the Lazy load here.
}
}
public class CountryListProxy implements CountryQueryable {
//In accordance with the Proxy pattern, we hide the target
//instance inside of our Proxy instance.
private CountryQueryable actualList;
//Keep track of the lazy time we cached.
private long lastCached;
//Define a tolerance time, 2000 milliseconds, before refreshing
//the cache.
private static final long TOLERANCE = 2000L;
public CountryListProxy() {
//You might even retrieve this object from a Registry.
actualList = new CountryList();
//Start at zero so the first call always refreshes; Long.MIN_VALUE
//would overflow the subtraction below and the cache would never refresh.
lastCached = 0L;
}
@Override
public synchronized void operationA() {
refreshIfStale();
actualList.operationA();
}
@Override
public synchronized String operationB() {
refreshIfStale();
return actualList.operationB();
}
//Refreshes the cache when it is older than TOLERANCE.
private void refreshIfStale() {
if ((System.currentTimeMillis() - lastCached) > TOLERANCE) {
//Refresh the cache.
lastCached = System.currentTimeMillis();
}
}
}
public class Client {
public static void main(String[] args) {
CountryQueryable queryable = new CountryListProxy();
//Do your thing.
}
}
Your needs seem pretty simple here. The use of MapMaker makes the implementation more complicated than it has to be. The whole double-checked locking idiom is tricky to get right, and only works on 1.5+. And to be honest, it's breaking one of the most important rules of programming:
Premature optimization is the root of all evil.
The double-checked locking idiom tries to avoid the cost of synchronization in the case where the cache is already loaded. But is that overhead really causing problems? Is it worth the cost of more complex code? I say assume it is not until profiling tells you otherwise.
Here's a very simple solution that requires no 3rd party code (ignoring the JCIP annotation). It does make the assumption that an empty list means the cache hasn't been loaded yet. It also prevents the contents of the country list from escaping to client code that could potentially modify the returned list. If this is not a concern for you, you could remove the call to Collections.unmodifiableList().
public class CountryList {
@GuardedBy("cache")
private final List<String> cache = new ArrayList<String>();
private List<String> loadCountryList() {
// HEAVY OPERATION TO LOAD DATA
}
public List<String> list() {
synchronized (cache) {
if( cache.isEmpty() ) {
cache.addAll(loadCountryList());
}
return Collections.unmodifiableList(cache);
}
}
public void invalidateCache() {
synchronized (cache) {
cache.clear();
}
}
}
I'm not sure what the map is for. When I need a lazy, cached object, I usually do it like this:
public class CountryList
{
private static List<Country> countryList;
public static synchronized List<Country> get()
{
if (countryList==null)
countryList=load();
return countryList;
}
private static List<Country> load()
{
... whatever ...
}
public static synchronized void forget()
{
countryList=null;
}
}
I think this is similar to what you're doing but a little simpler. If you have a need for the map and the ONE that you've simplified away for the question, okay.
If you want it thread-safe, you should synchronize the get and the forget.
What do you think about it? Do you see something bad about it?
Bleah - you are using a complex data structure, MapMaker, with several features (map access, concurrency-friendly access, deferred construction of values, etc) because of a single feature you are after (deferred creation of a single construction-expensive object).
While reusing code is a good goal, this approach adds additional overhead and complexity. In addition, it misleads future maintainers when they see a map data structure there into thinking that there's a map of keys/values in there when there is really only 1 thing (list of countries). Simplicity, readability, and clarity are key to future maintainability.
Is there other way to do it? How can i make it better? Should i look for totally another solution in this cases?
Seems like you are after lazy-loading. Look at solutions to other SO lazy-loading questions. For example, this one covers the classic double-check approach (make sure you are using Java 1.5 or later):
How to solve the "Double-Checked Locking is Broken" Declaration in Java?
Rather than just simply repeat the solution code here, I think it is useful to read the discussion about lazy loading via double-check there to grow your knowledge base. (sorry if that comes off as pompous - just trying teach to fish rather than feed blah blah blah ...)
There is a library out there (from Atlassian) with a utility class called LazyReference. LazyReference is a reference to an object that can be lazily created (on the first get). It is guaranteed to be thread-safe, and the initialization is also guaranteed to occur only once: if two threads call get() at the same time, one thread will compute and the other will block and wait.
Here is some sample code:
final LazyReference<MyObject> ref = new LazyReference<MyObject>() {
protected MyObject create() throws Exception {
// Do some useful object construction here
return new MyObject();
}
};
//thread1
MyObject myObject = ref.get();
//thread2
MyObject myObject = ref.get();
This looks ok to me (I assume MapMaker is from google collections?) Ideally you wouldn't need to use a Map because you don't really have keys but as the implementation is hidden from any callers I don't see this as a big deal.
This is way too simple to need the ComputingMap stuff. You only need a dead simple implementation where all methods are synchronized, and you should be fine. This will obviously block the first thread that gets it, and any other thread hitting it while the first thread loads the cache. The same happens again if anyone calls invalidateCache, where you should also decide whether invalidateCache should load the cache anew or just null it out and let the first subsequent get block. After that, all threads should go through nicely.
Use the Initialization on demand holder idiom
public class CountryList {
private CountryList() {}
private static class CountryListHolder {
// List is an interface, so a concrete list is needed; build or load the real list here.
static final List<Country> INSTANCE = new ArrayList<Country>();
}
public static List<Country> getInstance() {
return CountryListHolder.INSTANCE;
}
...
}
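To make the laziness of the holder idiom visible, here is a small sketch that counts how often the load runs. Countries, LOADS and load() are names invented for this example, and the country values are placeholders. Note that the idiom initializes the value exactly once per class loader, so on its own it offers no way to invalidate the cache, which was one of the question's requirements.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

final class Countries {
    // Counts how often the expensive load ran; only here so the laziness can be observed.
    static final AtomicInteger LOADS = new AtomicInteger();

    private Countries() {}

    private static final class Holder {
        // Runs once, when Holder is first initialized, under the JVM's class-initialization lock.
        static final List<String> INSTANCE = load();
    }

    static List<String> get() {
        return Holder.INSTANCE; // the first call triggers Holder's initialization
    }

    private static List<String> load() {
        LOADS.incrementAndGet();
        return Collections.unmodifiableList(Arrays.asList("Germany", "Russia", "China"));
    }
}
```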
Follow up to Mike's solution above. My comment didn't format as expected... :(
Watch out for synchronization issues in operationB, especially since load() is slow:
public String operationB() {
if (!loaded) {
load();
loaded = true;
}
//Do whatever.
return whatever;
}
You could fix it this way:
public String operationB() {
// loaded is a primitive boolean and cannot be locked on, so lock on the instance instead.
synchronized (this) {
if (!loaded) {
load();
loaded = true;
}
}
//Do whatever.
return whatever;
}
Make sure you ALWAYS synchronize on every access to the loaded variable.
