Performance impact of logging class name, method name and line number - Java

I am implementing logging in my Java application so that I can debug potential issues that might occur once the application goes into production.
Considering that in such cases one wouldn't have the luxury of an IDE or development tools (to run things in debug mode or step through code), it would be really useful to log the class name, method name and line number with each message.
I was searching the web for best practices for logging and I came across this article which says:
You should never include file name, class name and line number,
although it’s very tempting. I have even seen empty log statements
issued from the code:
log.info("");
because the programmer assumed that the line number will be a part of
the logging pattern and he knew that “If empty logging message appears
in 67th line of the file (in authenticate() method), it means that the
user is authenticated”. Besides, logging class name, method name
and/or line number has a serious performance impact.
I am trying to understand how logging the class name, method name and line number degrades performance.
Is the above true for all logging frameworks or only some of them? (The author makes a reference to Logback in the same topic.) I am interested in knowing about the performance impact of doing something like this in Log4j.

There are different concerns at play here.
First of all, what is the impact of logging a simple string? This largely depends on the infrastructure you use. I recently ran some benchmarks, and just logging a string to a file using the standard Java logging API is extremely expensive. You're going to get better results using the log4j logging infrastructure, which my tests show to be on the order of 15 or 20 times faster.
Now let's consider the file name and line number problem. Unlike C, Java doesn't have __FILE__ and __LINE__ constants that are resolved by the compiler (or by the preprocessor, in the case of C). If you want to log the file name and line number, you have two options:
You write the file name and line number yourself as constants. This may be acceptable for the file name, but the line number will change as soon as you add any line above the one where you're logging, so you would have to go back and update all the line numbers. Not really reasonable.
You use the Java APIs to get a stack trace, as mentioned here. This operation, though, is very expensive at runtime and will slow down your program.
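To make the second option concrete, here is a minimal sketch of how caller information can be pulled out of a stack trace (roughly what logging frameworks do internally when the layout asks for class, method or line; the expensive part is filling in the stack trace). The class and method names are illustrative:
public final class CallerInfo {

    private CallerInfo() {
    }

    /** Returns something like "com.example.Foo.bar(Foo.java:42)" for the direct caller. */
    public static String callerLocation() {
        // Creating the Throwable and filling in its stack trace is the costly step
        StackTraceElement[] stack = new Throwable().getStackTrace();
        // stack[0] is this method, stack[1] is whoever called us
        StackTraceElement caller = stack[1];
        return caller.getClassName() + "." + caller.getMethodName()
                + "(" + caller.getFileName() + ":" + caller.getLineNumber() + ")";
    }
}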

Related

Make my logger very effective for my Java application

I am struggling with the following problem and am asking for help.
My application has a logger module. It takes the trace level and the message (as a string).
Messages often have to be constructed from different sources and/or in different ways (e.g. sometimes using String.format prior to logging, other times using the .toString methods of different objects, etc.). Therefore, the construction of the messages cannot be generalized.
What I want is to make my logger module effective. That means the trace messages should only be constructed if the current trace level will actually log the message, and without copy-and-paste code throughout my application.
With C/C++, using macros this was very easy to achieve:
#define LOG_IT(level, message) if(level>=App.actLevel_) LOG_MSG(message);
The LOG_MSG call and the string construction were only done if the trace level enabled that message.
In Java, I can't find any similar possibility. What I want is that the logging call stays one line (no if-else copy-paste everywhere) and the string construction (an expensive operation) is only done if necessary.
The only solution I know is to surround every logger call with an if statement. But this is exactly what I avoided in the C++ app and what I want to avoid in my Java implementation.
My problem is that only Java 1.6 is available on the target system, so Supplier is not an option.
What can I do in Java? How can this C/C++ approach be achieved easily?
Firstly, I would encourage you to read this if you're thinking about implementing your own logger.
Then, I'd encourage you to look at a well-established logging API such as SLF4j. Whilst it is possible to create your own, using a pre-existing API will save you time and effort, and above all else provide you with more features and flexibility out of the box (e.g. file-based configuration, customisability; look at Mapped Diagnostic Context).
To your specific question: there isn't a simple way to do what you're trying to do. C/C++ are fundamentally different from Java in that the preprocessor allows for macros like the one you've created above. Java doesn't really have an easy-to-use equivalent, though there are projects that make use of compile-time code generation, which is probably the closest equivalent (e.g. Project Lombok, MapStruct).
The simplest way I know of to avoid expensive string building operations whilst logging is to surround the building of the string with a simple conditional:
if ( logger.isTraceEnabled() )
{
// Really expensive operation here
}
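For completeness, here is a minimal, self-contained sketch of that guard in an SLF4J-based class (class, method and message names are illustrative; older SLF4J versions run on Java 6):
import java.util.Arrays;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class TraceGuardExample {

    private static final Logger logger = LoggerFactory.getLogger(TraceGuardExample.class);

    public void process(int[] values) {
        if (logger.isTraceEnabled()) {
            // The expensive message is only built when TRACE is actually enabled
            logger.trace("Processing values: " + Arrays.toString(values));
        }
        // ... the actual work ...
    }
}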
Or, if you're using Java 8, the standard logging library accepts a java.util.function.Supplier<String> argument, which will only be evaluated if the current log level allows the message to be logged:
log.fine(()-> "Value is: " + getValue());
There is also currently a ticket open for SLF4j to implement this functionality here.
If you're really really set on implementing your own logger, the two above features are easy enough to implement yourself, but again I'd encourage you not to.
Edit: AspectJ compile-time weaving can be used to achieve something similar. It would allow you to wrap all your logging statements in a conditional check and so remove the boilerplate.
The newest logging libraries, including java.util.logging, have a second form of the logging methods that takes a Supplier<String>.
e.g. log.info(() -> "Hello"); instead of log.info("Hello");.
The get() method of the supplier is only called if the message actually has to be logged, so your string is only constructed in that case.
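A minimal, runnable sketch with java.util.logging on Java 8 or later (getValue() is an illustrative placeholder for an expensive computation):
import java.util.logging.Level;
import java.util.logging.Logger;

public class SupplierLoggingExample {

    private static final Logger log = Logger.getLogger(SupplierLoggingExample.class.getName());

    public static void main(String[] args) {
        log.setLevel(Level.INFO);
        // The supplier is never evaluated here: FINE is below the logger's INFO level
        log.fine(() -> "Value is: " + getValue());
        // The supplier is evaluated here, because INFO is enabled
        log.info(() -> "Value is: " + getValue());
    }

    private static String getValue() {
        return "expensive result";
    }
}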
I think the most important thing to understand here is that the C/C++ macro solution does not save computational effort by skipping construction of the logged message when the log level is such that the message would not be logged.
Why is that so? Simply because the macro method makes the pre-processor substitute every usage of the macro:
LOG_IT(level, message)
with the code:
if(level>=App.actLevel_) LOG_MSG(message);
Substituting whatever you passed as level and whatever you passed as message, along with the macro itself. The resulting code to be compiled will be exactly the same as if you had copied and pasted the macro code everywhere in your program. The only thing macros help you with is avoiding the actual copying and pasting, and making the code more readable and maintainable.
Sometimes they manage to do that; other times they make the code more cryptic and thus harder to maintain. In any case, macros do not provide deferred execution to save you from actually constructing the string, as Java 8's Logger class does by using lambda expressions. Java defers the execution of the body of a lambda until the last possible moment. In other words, the body of the lambda is only executed after the level check inside the logger.
To go back to your example in C/C++: you, as a developer, would probably want the code to work regardless of the log level, so you would be forced to construct a valid string message and pass it to the macro. Otherwise, at certain log levels, the program would crash! So, since the message string construction code must run before the call to the macro, you will execute it every time, regardless of the log level.
So, making the equivalent of your code is quite simple in Java 6! You just use the built-in Logger class. It provides support for logging levels automatically, so you do not need to create a custom implementation of them.
If what you are asking is how to implement deferred execution without lambdas, though, I do not think it is possible.
If you wanted real deferred execution in C/C++, you would have to make the logging code take a function pointer to a function that returns the message string, execute that function inside the if statement, and then call your macro passing not a string but a function that creates and returns the string! I believe the actual C/C++ code to do this is out of scope for this question... The key concept here is that C/C++ give you the tools for deferred execution simply because they support function pointers. Java did not offer anything comparable until Java 8.
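To make that concrete, the Java 6 equivalent of the macro's level check is a guard on the built-in Logger (a minimal sketch; as with the expanded macro, the message construction is skipped only because it sits inside the if):
import java.util.logging.Level;
import java.util.logging.Logger;

public class Java6LevelGuard {

    private static final Logger LOG = Logger.getLogger(Java6LevelGuard.class.getName());

    public void authenticate(String userName) {
        if (LOG.isLoggable(Level.FINE)) {
            // Only reached when FINE is enabled, like the body of the macro's if
            LOG.fine("Authenticating user " + userName);
        }
        // ... actual work ...
    }
}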

A table that contains View objects tags

First of all I must say I'm new to both Android dev and Java.
I'm trying to find a list of the tags that are used for logging in Android Studio.
The examples I've been researching include using:
Log.i("Info", "message");
Log.i("Values", "another message");
Log.i("Seekbar changed", "and another message");
For the past couple of hours I have tried to find a document online with a table describing the reserved tags for View objects; any help will be appreciated.
There is no fixed list of "reserved tags" one can use for logging in Android. You decide for yourself which tags you want to use and what additional information about the state of your objects or primitive types you want to display.
The Log class has six different log levels (debug, error, info, verbose, warn and wtf [What a Terrible Failure]) and corresponding (static) methods (Log.d, Log.e, Log.i, Log.v, Log.w and Log.wtf) each of which you call with two string parameters, one string parameter and one Throwable or two string parameters and one Throwable.
The most commonly used is probably the variant with two string parameters, one parameter for a tag (chosen by you) and one parameter for a message (also chosen by you). See this post for information about which level to choose.
During debugging I often use commands like this one:
Log.e(String.valueOf(myIntVariable), String.valueOf(myOtherVariable));
Let me explain the reason for using the Log class like this. I use the error level because it will give you red entries in the LogCat output (inside an IDE, e.g. Android Studio), and the same IDE will also let you filter out all logs below the error level. However, this is for debugging only; make sure to get rid of those log commands before your app enters production.
Instead of using logs the way I do, you can also use breakpoints in debug mode. I guess it is mainly a question of taste whether you prefer one or the other. Toasts would be a third option (with more boilerplate, though).
If you use logs a lot in your code, it makes sense to use real tags. Either you define a string called TAG (or something else) in your class, or you put the name of the containing method as the first parameter. This will give you a sense of the order in which your methods are being called. You can use other tags as well; they don't have to follow a specific convention (though you should have a system for them so you can make sense of it all).
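A minimal sketch of the TAG-constant pattern (the class name and messages are illustrative):
import android.app.Activity;
import android.os.Bundle;
import android.util.Log;

public class MainActivity extends Activity {

    private static final String TAG = "MainActivity";

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        Log.i(TAG, "onCreate called");
        Log.d(TAG, "savedInstanceState present: " + (savedInstanceState != null));
    }
}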

log4j/logback pass logger level as a parameter

I want to do something which seems really straightforward: just pass a lot of logging commands (maybe all, but particularly WARN and ERROR levels) through a method in a simple utility class. I want to do this in particular so that during testing I can suppress the actual output to logging by mocking the method which does this call.
But I can't find out how, with ch.qos.logback.classic.Logger, to call a single method with the Level as a parameter... obviously I could use a switch statement based on this value, but some logging frameworks have a method or two which let you pass the logging Level as a parameter. It seems a bit "primitive" not to provide this.
The method might look a bit like this:
Logger.log( Level level, String msg )
Later
Having now looked up the "XY problem", I understand the scepticism about this question. Dynamic logging is considered bad, at least in Java (possibly less so in Python)... now I know and understand that the preferred route is to configure the logging appropriately for testing.
One minor point, though, if I may: although I haven't implemented this yet with this particular project, I generally find "just" tracing the stacktrace back to the beginning of the particular Thread insufficient, and this is what logback does (with Exceptions passed at WARN or ERROR levels). I plan to implement a system for recording "snapshots" of Threads when they run new Threads... which can then be listed (right back to the start of the app's first Thread) if an error occurs. This is, if you like, another justification for using something to "handle" outgoing log calls. I suppose that if I want to implement something like this I will instead have to try to extend some elements of logback in some way.
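For reference, the switch-based pass-through I mentioned might look like the sketch below. It assumes a reasonably recent SLF4J that ships org.slf4j.event.Level (otherwise you can define your own enum); as far as I can tell, Log4j 2's Logger also exposes log(Level, String) directly:
import org.slf4j.Logger;
import org.slf4j.event.Level;

public final class LogUtil {

    private LogUtil() {
    }

    /** Pass-through that can be mocked in tests; dispatches on the requested level. */
    public static void log(Logger logger, Level level, String msg) {
        switch (level) {
            case ERROR: logger.error(msg); break;
            case WARN:  logger.warn(msg);  break;
            case INFO:  logger.info(msg);  break;
            case DEBUG: logger.debug(msg); break;
            case TRACE: logger.trace(msg); break;
        }
    }
}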

How to know when there's too much logging messages?

I came across one very good library for parsing CUE files. But when I started to read its source code, I realized that it is almost unreadable:
public void setParent(final CueSheet parent) {
FileData.logger.entering(FileData.class.getCanonicalName(), "setParent(CueSheet)", parent);
this.parent = parent;
FileData.logger.exiting(FileData.class.getCanonicalName(), "setParent(CueSheet)");
}
every method has logger.entering() and logger.exiting() messages. Isn't that too much?
There's another java library for parsing audio tags. It also had like 15 log messages for each file it read. It was annoying so I commented out every call to logger. And the library became twice as fast, because they used a lot of string concatenation for log messages.
So the question is: should I really log everything, even if it is not a large enterprise application? Because these libraries obviously don't need any logging, except for error messages. And my experience shows that loggers are a terrible tool for debugging. Why should I use them?
How do you know when there is too much logging? When you know that the logged information isn't important in the long term, such as for straightforward debug actions or bug correction, or when the application doesn't deal with much important information.
Sometimes you need to log almost everything. Is performance or the ability to analyse everything the most important aspect of an application? It really depends.
In the past I've worked on integrations with a lot of different web services, like 10 in the same app. We logged all XML requests and responses. Is this overhead? In the long term I don't think so, because we handled a lot of credit card operations and every interaction with the server had to be logged. How else would you know what happened when there was a bug?
You wouldn't believe what I've seen in some of the XML responses. I've even received XML without closing tags, from a BIG airline company. Were the "excessive logs" a bad practice? Say that to your clients when you have to prove that the error came from the other vendor.
Ideally, you use a logger that allows logging levels; log4j has fatal/error/warn/debug/info, for example. That way, if you set the level to "only show errors", you don't lose speed to the software building log messages you didn't need.
That said, it's only too much logging until you wind up needing something that would have been logged. It sounds like most of the logging that's slowing you down should be "trace" level, though; it's showing you what a profiler would have.
Most logging libraries provide a means to check whether logging is enabled before building a message:
For example:
public void foo(ComplicatedObject bar) {
Logger.getInstance(Foo.class).trace("Entering foo(" + bar + ")");
}
Could be quite costly depending on the efficiency of the bar.toString() method. However, if you instead wrap that in a check for the logging level before doing the string concatenation:
private static final Logger log = Logger.getInstance(Foo.class);

public void foo(ComplicatedObject bar) {
    if (log.isTraceEnabled()) {
        log.trace("Entering foo(" + bar + ")");
    }
}
Then the string concatenation only occurs if at least one appender for the class is set to Trace. Any complicated log message should do this to avoid unnecessary String creation.
This level of logging is canonically bad - in fact, I saw code exactly like this in the Daily WTF a few days ago.
But logging is in general a Very Good Thing.
It depends: is this code for an application or a library? For an application, loggers are useful once the code is in production. They should not be used to debug, but to help you replicate a bug. When a user tells you that your application crashed, you always want the maximum logging information.
I agree that it makes the code less readable. It even makes the application slower!
It's a totally different game for a library. You should have consistent logging at adequate levels. The library SHOULD inform the development team when an error occurs.
Logging should provide you with information that a stack trace can't in order to track down a problem. This usually means that the info is some kind of historical trace of what the program did, as opposed to what state it's in at the time of failure.
Too much historical data will be ignored. If you can safely deduce that a method was called without having to actually log its entry and exit, then it's safe to remove those logging entries.
Another bad sign is if your logging files start to use up a huge amount of disk space. You're not only sacrificing space, you're probably slowing down the app too much.
To answer the question, why should I use loggers?
Have you ever encountered a piece of software where the only error indication presented to the end user is "Fatal error occurred"? Would it not be nice to find out what caused it?
Logging is a tool that can really help you narrow down these kinds of problems in the field.
Remember, end-user systems don't have nice IDEs for debugging, and end users usually are not knowledgeable enough to run such tools. However, end users in most cases are capable of copying log configuration files (written by us, clever programmers) into a predefined location, fetching the log files and emailing them back to us (poor souls who have to parse megabytes of log output) when they encounter problems.
Having said this, logging should be highly configurable and under normal conditions produce minimal output. Also, guards should protect finer-level logging from consuming too many resources.
I think in the example you have provided, all logging should have been done at TRACE level. Also, because nothing bad can really happen between function entry and exit, it probably makes sense to have only one log statement there.
Over the years I've swayed backwards and forwards between promoting logging everything at the appropriate levels (trace, info, etc.) and thinking that any of it is a complete waste of time. In reality it depends on what is going to be useful to track down or what is required (logging can be a cheap way of maintaining an audit trail).
Personally, I tend to log entry/exit at a component or service level and then log significant points in the processing, such as a business logic decision or a call to another service/component. Of course errors are always logged, but once only and at the place they are handled (the stack trace and exception message should have sufficient info to diagnose the problem), and any service/component interface should always handle errors (even if it is just converting them into something more appropriate for the caller).
The problem with logging stuff on the off chance something goes wrong is that you end up with so much information that it is impossible to identify the issue, especially if it is running on a server, as you end up with loads of intertwined log entries. Obviously you can get around that by incorporating a request id in each entry and using some software to filter on it. Of course you also have the case where your application is distributed and/or clustered and you have multiple logs.
Nowadays I would never actually write entering/exiting trace code by hand; it just gets in the way, and it is so much easier to use something like AspectJ if it is really needed. Using AspectJ would also guarantee consistency (you can change the log format in one place rather than having to change every operation) and accuracy (in case some refactoring adds a new parameter and the developer forgets to add it to the logging).
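For illustration, a minimal entry/exit tracing aspect of that kind might look like this (an @AspectJ-style sketch; the pointcut package and logger are assumptions you would adapt):
import java.util.Arrays;

import org.aspectj.lang.ProceedingJoinPoint;
import org.aspectj.lang.annotation.Around;
import org.aspectj.lang.annotation.Aspect;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

@Aspect
public class TraceAspect {

    private static final Logger log = LoggerFactory.getLogger(TraceAspect.class);

    // Adjust the package pattern to the services/components you actually want traced
    @Around("execution(public * com.example.service..*(..))")
    public Object traceAround(ProceedingJoinPoint pjp) throws Throwable {
        if (log.isTraceEnabled()) {
            log.trace("Entering {} with args {}", pjp.getSignature(), Arrays.toString(pjp.getArgs()));
        }
        try {
            return pjp.proceed();
        } finally {
            if (log.isTraceEnabled()) {
                log.trace("Exiting {}", pjp.getSignature());
            }
        }
    }
}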
One thing I have thought about doing, or looking to see if someone has already done, is a logger that holds the entries in memory; if an error is encountered they are written, and if the operation succeeds they are simply discarded. If anyone knows of one (ideally for log4j) please let me know; alternatively, I have a few ideas on how to implement this if anyone is interested in doing one.
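A rough sketch of that idea (a per-thread buffer wrapping an ordinary log4j Logger, flushed only when an error is reported and discarded otherwise; the class and its API are illustrative, not an existing library):
import java.util.ArrayList;
import java.util.List;

import org.apache.log4j.Logger;

public class DeferredLogger {

    private final Logger delegate;
    private final ThreadLocal<List<String>> buffer = new ThreadLocal<List<String>>() {
        @Override protected List<String> initialValue() { return new ArrayList<String>(); }
    };

    public DeferredLogger(Logger delegate) {
        this.delegate = delegate;
    }

    /** Buffered; nothing is written yet. */
    public void debug(String msg) {
        buffer.get().add(msg);
    }

    /** An error flushes everything buffered so far, then logs the error itself. */
    public void error(String msg, Throwable t) {
        for (String buffered : buffer.get()) {
            delegate.debug(buffered);
        }
        buffer.get().clear();
        delegate.error(msg, t);
    }

    /** Call when the operation completes successfully; buffered entries are dropped. */
    public void discard() {
        buffer.remove();
    }
}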
This is where log levels are helpful. In general, log levels in increasing order of priority (and decreasing verbosity) are TRACE, DEBUG, INFO, WARN, ERROR, FATAL.
The developer has to make a conscious call to use the correct log level while logging in the code.
When creating a Logger instance, we have to set the correct log level, choosing it from a config (always prefer config). This decides which levels get logged. For example, if the configured log level is "INFO", anything below "INFO" (TRACE, DEBUG) won't be logged.
For instance, in the example you mentioned above, a TRACE or DEBUG level would make more sense.
At runtime in production, the configured log level should always be set to INFO.
When an issue occurs in production and the developer wants to find out the root cause, they can request a change of the log level to TRACE or DEBUG (mostly in a QA environment where they can replicate the scenario) to see what exactly is happening (the app sometimes has to be restarted for the log level change to take effect, but it is helpful).
Using log levels is a great practice, as most of the time we won't be able to attach a debugger in those landscapes. And since we skip the unnecessary file writes by choosing a higher log level, performance doesn't take a hit.
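For example, with log4j 1.x, setting the logger's level to INFO silently drops anything below it (a minimal sketch; in practice the level would come from log4j.properties or log4j.xml rather than code):
import org.apache.log4j.BasicConfigurator;
import org.apache.log4j.Level;
import org.apache.log4j.Logger;

public class LevelFilterDemo {

    public static void main(String[] args) {
        BasicConfigurator.configure();                 // simple console appender
        Logger logger = Logger.getLogger(LevelFilterDemo.class);
        logger.setLevel(Level.INFO);

        logger.trace("not written - below INFO");
        logger.debug("not written - below INFO");
        logger.info("written");
        logger.error("written");
    }
}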

How to properly handle error logs?

I tried to do several searches before posting this question. If this is a duplicate, please let me know and I will delete it.
My question revolves around the proper way to handle errors produced by our web application. We currently log everything through log4j. If an error happens, it just says "An error has occurred. The IT Department has been notified and will work to correct this as soon as possible" right on the screen. This tells the user nothing... but it also does not tell the developer anything when we try to reproduce the error. We have to go to the error log folder and try to find this error. Let me also mention that the folder is full of logs from the past week. Each time there is an error, one log file is created for that user and an email is sent to the IT staff assigned to work on errors. This email does not mention the log file name, but it is a copy of the same error text that is written in the log file.
So if Alicia has a problem at 7:15 with something, but there are 10 other errors that happen that same minute, I have to go through each log file trying to find hers.
What I was proposing to my fellow co-workers is adding an Error Log table into the database. This would write a record to the table for each error, record who it is for, the error, what page it happened on, etc. The bonus of this would be that we can return the primary key value from the table (error_log_id) and show that on the page with a message like "Error Reference Id (1337) has been logged and the proper IT staff has been notified. Please keep this reference id handy for future use". When we get the email, it would tell us the id of the error for quick reference. Or if the user is persistent, they can contact us with the id and we can find the error rather quickly.
How do you setup your error logging? By the way, our system uses Java Servlets that connect to a SQL Server database.
I answered a similar question here, but I will adapt that answer to your question.
We use a requestID for this purpose - assign a request ID to each incoming (HTTP) request at the very beginning of processing (in a filter) and then log it on every log line, so you can easily grep those logs later by that ID and find all relevant lines.
If you think it is very tedious to add that ID to every log statement, then you are not alone - Java logging frameworks have made it transparent with the use of Mapped Diagnostic Context (MDC) (at least log4j and logback have this).
The requestID can also work as a handy reference number to spit out in case of errors (as you already suggested). However, as others have commented, it is not wise to load those details into a database - better to use the file system. Or, the simplest approach is to just use the requestID - then you do not need to do anything special at the moment the error occurs. It just helps you to locate the correct logfile and search inside that file.
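A minimal sketch of the filter part, assuming SLF4J's MDC (the attribute name requestId is arbitrary, and RequestIdGenerator is a hypothetical helper sketched after the format description below). With logback, adding %X{requestId} to the pattern then prints the ID on every line:
import java.io.IOException;

import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;

import org.slf4j.MDC;

public class RequestIdFilter implements Filter {

    public void init(FilterConfig filterConfig) {
    }

    public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain)
            throws IOException, ServletException {
        MDC.put("requestId", RequestIdGenerator.next());
        try {
            chain.doFilter(request, response);
        } finally {
            MDC.remove("requestId");   // never leak the ID to the next request on this thread
        }
    }

    public void destroy() {
    }
}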
What would a requestID look like?
We use the following pattern:
<instanceName>:<currentTimeInMillis>.<counter>
It consists of the following variables:
instanceName uniquely identifies a particular JVM in a particular deployment environment.
currentTimeInMillis is quite self-explanatory. We chose to represent it in the human-readable format "yyyyMMddHHmmssSSS", so it is easy to read the request start time from it (beware: SimpleDateFormat is not thread-safe, so you need to either synchronize it or create a new one on each request).
counter is the request counter within that particular millisecond - for the rare case where you need to generate more than one request ID in the same millisecond.
As you can see, the ID format has been set up in such a way that currentTimeInMillis.counter combination is guaranteed to be unique in particular JVM and the whole ID is guaranteed to be globally unique (well, not in the true sense of "global", but it is global enough for our purposes), without the need to involve database or some other central node. Also, the use of instanceName variable gives you the possibility to limit the number of log files you later need to look in to find that request.
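A sketch of a generator for that format (this is the hypothetical RequestIdGenerator used in the filter above; the instance name is hard-coded for brevity):
import java.text.SimpleDateFormat;
import java.util.Date;

public final class RequestIdGenerator {

    private static final String INSTANCE_NAME = "app1";   // identify the JVM / host in your environment
    private static long lastMillis = -1;
    private static int counter = 0;

    private RequestIdGenerator() {
    }

    public static synchronized String next() {
        long now = System.currentTimeMillis();
        if (now == lastMillis) {
            counter++;             // more than one request in the same millisecond
        } else {
            lastMillis = now;
            counter = 0;
        }
        // SimpleDateFormat is not thread-safe, so create it inside this synchronized method
        String timestamp = new SimpleDateFormat("yyyyMMddHHmmssSSS").format(new Date(now));
        return INSTANCE_NAME + ":" + timestamp + "." + counter;
    }
}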
Then, the final question: "that is fine and dandy in single-JVM solution, but how do you scale that to several JVMs, communicating over some network protocol?"
As we use Spring Remoting for our remoting purposes, we have implemented custom RemoteInvocationFactory (that takes request ID from context and saves it to RemoteInvocation attributes) and RemoteInvocationExecutor (that takes request ID from attributes and adds it to diagnostic context in the other JVM).
Not sure how you would implement it with plain-RMI or other remoting methods.
If multiple servers are running and each server keeps its log messages locally, it is really difficult to trace them. So somebody, or a tool, should gather them and sort them in time order.
Having a central point to which all messages are sent is a good approach.
A possible solution: have your error page include a 'send email to whatever' link. When the user clicks it, the e-mail body might start with a few blank lines followed by something like:
----Please do not modify the information below this line.---
Error details
Any users complaining via this link will automatically send you the info you need and if you are reproducing the error you have quick access to the error message. You might even have a form for sending the e-mail so that the user never sees this (which may be important to some) but then you are relying on your system being at least able to send an e-mail.
Actually I find it useful to print the error details in an HTML comment on error pages like this so that I can always get at them myself.
I do agree with david above that I do not like storing this kind of information in a DB.
For the strategies of logging you can see the discussion Logging best practices.
I have used an approach like the one you're suggesting (log to a DB) in the past and it has been very helpful.
Not only can you get the error via SQL, but you can also generate reports of the most recurrent errors and attend to them first.
In the design we did, equal stack traces belong to the same record (since they originated in exactly the same place).
We had a small app that polled that DB, so we knew when a new exception was generated, instead of getting an e-mail that, lumped together with the rest from the previous weeks, would be ignored altogether.
Of course this database design was very specific to our application, and additional identification was possible: we had the software version, the build, sometimes input parameters, etc.
With time, the system administrators get to know what to do with each kind of exception and proceed accordingly.
But! Your application may not be that big anyway. You can probably get the same benefit just by parsing the log files.
I'd oppose the idea of storing error logs in a database. The logging system should be as simple as possible and not involve components that are not 100% necessary to write a log record.
Things can get pretty complex when logging to a DB - e.g. you can have trouble logging database-related errors (how do you log errors that occurred because the DB is not responding, say because of heavy load or an infrastructure failure?); another issue is the potential need for separate transactions for logging, etc.
On the other hand, having a reference ID for an error is not a bad idea, but again, it also increases the complexity of the logging system (e.g. how would you propagate the reference ID through all layers of your application when an error occurs?).
In the projects I'm involved in, the general guideline is to log errors as verbosely as possible and to include as much context information as possible (to write the logs we usually use a 'conventional' approach - log4j or similar). Usually this works well even for heavily loaded systems.
