What is the performance advantage of using log4j over System.out.println?
FYI: I know log4j has multiple appenders, debug logging, and other features that System.out.println doesn't have, that it can be configured at class level, and that it is used in larger applications.
But if I have a small application, say a single file, will log4j provide better performance than System.out.println? How does log4j work internally?
Log4j isn't designed to be faster. It was created to give you more ways to reduce the amount of logging and to control the log output. Imagine a Tomcat server that logs large numbers of Hibernate database accesses. With the log level, for example, you can stop straining the server with these info logs. But this is not a "native" performance advantage, since you can simulate it by checking a flag before each System.out.println call.
The only case I can think of in which you could perceive a performance advantage using log4j over System.out.println would be for example if doing this:
logger.debug("Logging something");
Instead of doing this:
System.out.println("Logging something");
And you then configure the logging level so that debug messages are not logged.
In that scenario, the first line will not actually log, but the second one will still write to the System.out.
In all other cases (I mean, in all cases where logging is actually done) I don't think it will be faster to use log4j performance wise.
Usually it's used for all the reasons you state, and because the performance is not that different (framework overhead is minimal compared to the time it actually takes to write things on files).
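To make the disabled-level case concrete, here is a minimal sketch. It uses java.util.logging instead of log4j so that it runs on a plain JDK (FINE plays the role of log4j's DEBUG); the class name LevelCheckDemo and the logger name are made up for the illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

// Hypothetical demo class; java.util.logging stands in for log4j here.
class LevelCheckDemo {

    /** Collects messages in memory so we can see what was actually logged. */
    static final class CollectingHandler extends Handler {
        final List<String> messages = new ArrayList<>();

        @Override public void publish(LogRecord record) {
            if (isLoggable(record)) {
                messages.add(record.getMessage());
            }
        }
        @Override public void flush() { }
        @Override public void close() { }
    }

    /** Logs one FINE (debug) and one INFO message with the logger at the given level. */
    static List<String> logAt(Level threshold) {
        Logger logger = Logger.getLogger("demo.levelcheck"); // hypothetical logger name
        logger.setUseParentHandlers(false);                  // keep the console quiet
        for (Handler old : logger.getHandlers()) {
            logger.removeHandler(old);
        }
        CollectingHandler handler = new CollectingHandler();
        logger.addHandler(handler);
        logger.setLevel(threshold);

        logger.fine("Logging something (debug)"); // dropped when threshold is INFO
        logger.info("Logging something (info)");
        return handler.messages;
    }

    public static void main(String[] args) {
        System.out.println(logAt(Level.INFO)); // only the info message
        System.out.println(logAt(Level.FINE)); // both messages
        System.out.println("System.out.println always writes, whatever the level");
    }
}
```

With the threshold at INFO, the debug call returns after a cheap level comparison and never reaches the handler, which is the only situation described above where the logger does less work than System.out.println.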
I would like to know whether anyone has observed that a large number of writes to a logfile (300k lines in 4 hours of processing) can penalize batch performance.
P1: The batch writes a lot of info to the logfile, and I wonder whether deleting or commenting out all these log writes in the source code would improve batch performance and save 15 minutes or more of execution time.
We could have a million or more lines in a full batch execution (8-12 hours).
P2: Or could the database check and the log writes be done in parallel? I don't think our source code does that.
Well, yes. Too much logging does affect performance. But the only way to know how much it affects performance would be to measure it.
P1: The batch writes a lot of info to the logfile, and I wonder whether deleting or commenting out all these log writes in the source code would improve batch performance and save 15 minutes or more of execution time.
Nobody can tell you how much time you would gain. (I'd be surprised if you gained as much as that, but I could be wrong. Measure it!!)
P2: Or could the database check and the log writes be done in parallel? I don't think our source code does that.
It is probably a bad idea to explicitly code parallel logging into your application, since it will make your code a lot more complicated. And there is a better way to get some parallelism: try using an asynchronous appender.
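The idea behind an asynchronous appender can be sketched as follows. This is not log4j's AsyncAppender, just a rough java.util.logging illustration of the same technique so that it runs on a plain JDK: the calling thread only enqueues the record, and a background thread does the slow write. The class name AsyncHandlerSketch and the demo method are invented for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;

// Hypothetical sketch of an asynchronous wrapper around any Handler.
class AsyncHandlerSketch extends Handler {
    // Marker record that tells the worker thread to stop.
    private final LogRecord poison = new LogRecord(Level.OFF, "stop");
    private final BlockingQueue<LogRecord> queue = new LinkedBlockingQueue<>();
    private final Handler delegate;
    private final Thread worker;

    AsyncHandlerSketch(Handler delegate) {
        this.delegate = delegate;
        this.worker = new Thread(this::drain, "async-log-writer");
        this.worker.setDaemon(true);
        this.worker.start();
    }

    /** Runs on the background thread: takes records off the queue and writes them. */
    private void drain() {
        try {
            while (true) {
                LogRecord record = queue.take();
                if (record == poison) {
                    return;
                }
                delegate.publish(record);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    @Override public void publish(LogRecord record) {
        if (isLoggable(record)) {
            queue.add(record); // the caller returns immediately; no I/O on this thread
        }
    }

    @Override public void flush() {
        delegate.flush();
    }

    /** Waits until everything queued so far has been written, then closes the delegate. */
    @Override public void close() {
        queue.add(poison);
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        delegate.close();
    }

    /** Pushes n records through the async wrapper and returns what the delegate received. */
    static List<String> demo(int n) {
        List<String> written = Collections.synchronizedList(new ArrayList<>());
        Handler slowTarget = new Handler() {
            @Override public void publish(LogRecord r) { written.add(r.getMessage()); }
            @Override public void flush() { }
            @Override public void close() { }
        };
        AsyncHandlerSketch async = new AsyncHandlerSketch(slowTarget);
        for (int i = 0; i < n; i++) {
            async.publish(new LogRecord(Level.INFO, "line " + i));
        }
        async.close();
        return written;
    }
}
```

The trade-off is that the queue in this sketch is unbounded, and records still in it are lost if the JVM dies before close() runs. A real asynchronous appender has to decide what to do when the writer cannot keep up.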
There are a number of things that you can do to tune logging performance without going to the extreme of ripping it all out. These include:
Switch to a different logging library. For example, log4j 2.x should be more efficient than log4j 1.2.
Don't log too much.
Log at an appropriate level, and adjust the log level depending on the circumstances.
Make sure that you are creating the log messages efficiently. For example, avoid generating complicated message strings that won't be logged due to the logging level. (In log4j 2.x, use the Logger methods that take format strings.)
Avoid expensive features in your log format / formatter. For instance logging the class / method is relatively expensive.
Try using an asynchronous log appender.
For some background on logging performance, take a look at the log4j2 Performance page.
I use log4j in my application. In development I use tons of logger.debug calls to display information for debugging. I know I can make these verbose messages go away by changing the logging level in the configuration file when deployed. My question is: will this affect performance? Even though the debug level is disabled, is the logging work still there, silently doing something? Is it better to remove all the logger.debug calls in the final deployed version if performance is important?
Modern loggers return very quickly from an inactive logging statement for this very reason.
You need to be aware of the price of constructing the string to be logged. If you use slf4j as the front end, use {} placeholders to delay this until after the level check.
Any I/O operation will affect performance. Even if you change the logging level, each time you call log.debug the logger has to decide whether to print the message or not. However, making that decision is much faster than actually writing to a file, the console, or anything else.
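The cost of building the message can be shown with a small experiment. The sketch below uses java.util.logging so it needs no extra jars; its {0} parameter plays the same role as slf4j's {} placeholder. The Expensive class is a made-up stand-in for any object whose toString() is costly, and it simply counts how often toString() is called while the debug (FINE) level is disabled.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical demo class; names are illustrative only.
class MessageCostDemo {

    /** Stand-in for an object with a costly toString(); it counts its calls. */
    static final class Expensive {
        final AtomicInteger toStringCalls = new AtomicInteger();

        @Override public String toString() {
            toStringCalls.incrementAndGet();
            return "expensive-state";
        }
    }

    /** Returns toString() call counts for: concatenation, parameter, supplier, guard. */
    static int[] countToStringCalls() {
        Logger logger = Logger.getLogger("demo.messagecost"); // hypothetical logger name
        logger.setUseParentHandlers(false);
        logger.setLevel(Level.INFO); // FINE (debug) is disabled

        Expensive concatenated = new Expensive();
        logger.fine("state=" + concatenated); // string is built before the level check

        Expensive parameter = new Expensive();
        logger.log(Level.FINE, "state={0}", parameter); // formatting is deferred

        Expensive supplied = new Expensive();
        logger.fine(() -> "state=" + supplied); // supplier is never invoked

        Expensive guarded = new Expensive();
        if (logger.isLoggable(Level.FINE)) { // explicit guard
            logger.fine("state=" + guarded);
        }

        return new int[] {
            concatenated.toStringCalls.get(),
            parameter.toStringCalls.get(),
            supplied.toStringCalls.get(),
            guarded.toStringCalls.get()
        };
    }

    public static void main(String[] args) {
        // Prints [1, 0, 0, 0]: only the concatenated form paid for toString().
        System.out.println(java.util.Arrays.toString(countToStringCalls()));
    }
}
```

So a disabled debug statement is cheap as long as the argument is not turned into a string before the logger gets a chance to check the level.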
There is an inherent tension between logging verbosity and the performance of a Java app in production. If we log very selectively, we might miss the evidence needed to debug production issues. If we log too much in production, it can hurt performance.
I was thinking along the lines of a couple of options:
Log only selected, important things
Have SSDs instead of hard disks in prod
Have logging utility that can "batch" logging statements and flush periodically
Have some utility that will hold logs in memory and then flush eventually.
What are the best approaches other than those outlined above? Are there any existing logging tools that can be used for this purpose?
Try Apache log4j with slf4j (so you can swap out log4j without many changes to your code). Use the XML configuration to specify what to log and which files to log to.
Also use rolling file appenders and buffered appenders to handle flushing and batching of logs.
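For the "hold logs in memory and flush eventually" idea, the JDK itself ships a small example of the technique: java.util.logging.MemoryHandler keeps recent records in a ring buffer and pushes them to a target handler only when a record at or above a chosen level arrives. The sketch below is one way to try it out; the class name BufferedLoggingDemo and the in-memory target are assumptions made for the demo, not a log4j configuration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.MemoryHandler;

// Hypothetical demo class showing buffered logging with the JDK's MemoryHandler.
class BufferedLoggingDemo {

    /** Returns how many records reached the target before and after the push. */
    static int[] demo() {
        List<String> written = new ArrayList<>();
        // Stand-in for a slow target such as a file; it just records what it receives.
        Handler target = new Handler() {
            @Override public void publish(LogRecord r) { written.add(r.getMessage()); }
            @Override public void flush() { }
            @Override public void close() { }
        };

        // Keep up to 100 recent records in memory; push them on WARNING or above.
        MemoryHandler buffer = new MemoryHandler(target, 100, Level.WARNING);

        for (int i = 0; i < 10; i++) {
            buffer.publish(new LogRecord(Level.INFO, "step " + i)); // buffered only
        }
        int beforePush = written.size();

        buffer.publish(new LogRecord(Level.WARNING, "something looks wrong"));
        int afterPush = written.size();

        buffer.close();
        return new int[] { beforePush, afterPush };
    }

    public static void main(String[] args) {
        int[] counts = demo();
        // Prints "0 then 11": nothing is written until the WARNING triggers a push.
        System.out.println(counts[0] + " then " + counts[1]);
    }
}
```

Note that MemoryHandler discards the oldest records once its buffer is full, so it gives you the recent context around a problem rather than a complete log. Whether that trade-off is acceptable depends on what you need the logs for.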
What's the advantage of log4j over setting System.out and System.err to output to a log file?
At a high level, the win from Log4j over manual logging is that you can decouple your logging code from what you actually want to log and where and how you want to log it. Details about logging verbosity/filtering, formatting, log location, and even log type (files, network, etc.) are handled declaratively using configuration and extensibly via custom appenders, rather than you having to code that flexibility yourself.
This is critically important because it's often hard for developers to predict how logging needs will change once their software is in production. Operations teams managing that software may need less verbose logs, may need multiple logs, may need to ship those logs to multiple servers, may need to sometimes get really verbose data for troubleshooting, etc. And it's usually impossible for operations teams, if they need to change how logging works, to convince the developer to make big code changes. This often leads to production downtime, friction between operations and development, and wasted time all around.
From the developer's point of view, Log4j insulates you from having to make code changes to support logging, and insulates you from being pestered by people who want logging changes. It enables people managing your code to scratch their own itch rather than bugging you!
Also, since Log4j is the de facto standard for Java logging, there are lots of tools available that can do cool things with Log4j, which further saves you and your operations teams from reinventing the wheel.
My favorite feature is the ability to easily write appenders that send data to non-file destinations, like SYSLOG, Splunk, etc., which makes it easy to integrate your app's custom logging into the operations management tools your IT department is already using.
Actually, you should look into the slf4j facade these days, as it allows you to use {}-placeholders for the most concise statements. You can then use the appropriate logging framework behind slf4j to handle the actual treatment of your log statements. This could be log4j or the slf4j-simple which just prints out all of INFO, WARN and ERROR, and discards the rest.
The crucial observation you need to make is that the WRITING of log statements is done when the code is written, and the DECISION of what is needed is done when the code is deployed, which may be years after the code was written and tested. System.out.println requires you to physically change your code to get rid of them, which is unacceptable in a rigid write-test-deploy cycle. IF the code changes, it must be retested. With slf4j you just enable those you want to see.
We have full logging in the test phase, and rather verbose logging in the initial period of a production deployment, after which we go down to information only. This gives us full information in a scenario where debugging a case is very rarely possible.
You might find this article I wrote interesting. The target audience is beginning Java programmers, with my intention of giving them good habits from the start. http://runjva.appspot.com/logging101/index.html
my favorites (not all)
Ability to set parameters of logging in config, without recompiling
Ability to set the way log is written (from text file to SMTP sender)
Ability to filter by severity
Levels, formatting, logging to multiple files... A logging framework (even if it's java.util.logging) is really beneficial if there's a chance anything may go wrong while your code is running.
log4j allows you to log to various resources, e.g. the event log, email, the file system, etc., while allowing your application to remain decoupled from all of these resources. Furthermore, you get to use a common interface to log to all of them without having to learn or integrate their corresponding APIs.
Log4j offers the ability to rotate your log files based on size and delete them based on quantity (logrotate), so your servers don't fill up their disks. Personally I think that is one of the more valuable features in Log4j.
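Size-based rotation with a cap on the number of files can be tried out without any extra libraries, because java.util.logging.FileHandler supports the same idea (log4j 1.x offers it through RollingFileAppender). The following is only a sketch under assumed values: the 1 KB limit, the three-file cap, the temp directory, and the class name RotationDemo are all chosen for the demo.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.logging.FileHandler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.SimpleFormatter;
import java.util.stream.Stream;

// Hypothetical demo class: rotate at roughly 1 KB and keep at most 3 files.
class RotationDemo {

    /** Writes the given number of records and returns how many log files exist afterwards. */
    static long writeAndCountFiles(int records) {
        try {
            Path dir = Files.createTempDirectory("rotation-demo");
            // %g is replaced by the generation number: app.0.log is the newest file.
            String pattern = dir.toString() + "/app.%g.log";
            FileHandler handler = new FileHandler(pattern, 1024, 3);
            handler.setFormatter(new SimpleFormatter());

            for (int i = 0; i < records; i++) {
                handler.publish(new LogRecord(Level.INFO, "record number " + i));
            }
            handler.close();

            try (Stream<Path> files = Files.list(dir)) {
                return files
                        .filter(p -> p.getFileName().toString().endsWith(".log"))
                        .count();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // However many records are written, the file count stays capped at 3.
        System.out.println(writeAndCountFiles(200));
    }
}
```

Older generations are overwritten as new ones are created, which is what keeps the disk from filling up.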
Also Log4j is popular and understood by many developers. The last three companies I've worked at have all used Log4j in most projects.
Take a look and you will understand the power of log4j :
log4j.properties I used once for a project :
# ALL < DEBUG < INFO < WARN < ERROR < FATAL < OFF
# No appenders for rootLogger
log4j.rootLogger=OFF
folder=..
prefix=
fileExtension=.log
htmlExtension=${fileExtension}.html
datestamp=yyyy-MM-dd/HH:mm:ss.SSS/zzz
layout=%d{${datestamp}} ms=%-4r [%t] %-5p %l %n%m %n%n
# myLogger logger
log4j.logger.myLogger=ALL, stdout, infoFile, infoHtml, errorFile
# stdout
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=${layout}
# infoFile
log4j.appender.infoFile=org.apache.log4j.FileAppender
log4j.appender.infoFile.File=${folder}/${prefix}_info${fileExtension}
log4j.appender.infoFile.layout=org.apache.log4j.PatternLayout
log4j.appender.infoFile.layout.ConversionPattern=${layout}
# infoHtml
log4j.appender.infoHtml=org.apache.log4j.FileAppender
log4j.appender.infoHtml.File=${folder}/${prefix}_info${htmlExtension}
log4j.appender.infoHtml.layout=org.apache.log4j.HTMLLayout
log4j.appender.infoHtml.layout.Title=Logs
log4j.appender.infoHtml.layout.LocationInfo=true
# errorFile
log4j.appender.errorFile=org.apache.log4j.FileAppender
log4j.appender.errorFile.File=${folder}/${prefix}_error${fileExtension}
log4j.appender.errorFile.layout=org.apache.log4j.PatternLayout
log4j.appender.errorFile.layout.ConversionPattern=${layout}
# APPENDERS SETTINGS
log4j.appender.stdout.Threshold = ALL
log4j.appender.infoFile.Threshold = INFO
log4j.appender.infoHtml.Threshold = INFO
log4j.appender.errorFile.Threshold = WARN
To change the variables in your java code you can do :
Loading Configuration
Log4j will automatically load the configuration if it is stored in a
file called "log4j.properties" and is present on the classpath under
"" (e.g. WEB-INF/classes/log4j.properties).
I don't like that approach and prefer to load the configuration
explicitly by calling:
PropertyConfigurator.configure( Config.ETC + "/log4j.properties" );
This way I can reload the configuration at any time as long as my
application is still running. I like to add a button to an
administrative jsp, "Reload Log4J".
Dynamic Log File Location
Many people complain that Log4j forces you to hard-code the location
where your logs will be kept. Actually, it is possible to dynamically
choose the log-file location, especially if you use the ${log.dir}
property substitution technique above. Here's how:
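String dynamicLog = // log directory somehow chosen...
// java.util.Properties has no constructor that takes a file name, so load the file explicitly
Properties p = new Properties();
try (InputStream in = new FileInputStream( Config.ETC + "/log4j.properties" )) {
    p.load( in );
}
p.put( "log.dir", dynamicLog ); // overwrite "log.dir"
PropertyConfigurator.configure( p );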
logging (Document historical business events that occur, you can check old logs)
track the application (project flow)
debugging the application (Detailed information what occurs in a method at granular level //data, value and all inside methods)
error handling (information about specific error that occur)
I'm using the Enerjy (http://www.enerjy.com/) static code analyzer tool on my Java code. It tells me that the following line:
System.err.println("Ignored that database");
is bad because it uses System.err. The exact error is: "JAVA0267 Use of System.err"
What is wrong with using System.err?
Short answer: It is considered a bad practice to use it for logging purposes.
In the old days, when there were no widely available or widely accepted logging frameworks, everyone used System.err to print error messages and stack traces to the console. This approach might be appropriate during development and local testing, but it is not appropriate for a production environment, because you might lose important error messages. Because of this, almost all static analysis tools today detect this kind of code and flag it as bad practice (or a similarly named problem).
Logging frameworks in turn provide the structured and logical way to log your events and error messages as they can store the message in various persistent locations (log file, log db, etc.).
The most obvious workaround, and one free of external dependencies, is to use the built-in Java logging framework through the java.util.logging.Logger class, since it forwards logging events to the console by default. For example:
final Logger log = Logger.getLogger(getClass().getName());
...
log.log(Level.SEVERE, "Something went wrong", theException);
(or you could just turn off that analysis option)
the descriptor of your error is:
The use of System.err may indicate residual debug or boilerplate code. Consider using a
full-featured logging package such as Apache Commons to handle error logging.
It seems that you are using System.err for logging purposes, that is suboptimal for several reasons:
it is impossible to enable logging at runtime without modifying the application binary
logging behavior cannot be controlled by editing a configuration file
probably many others
Whilst I agree with the points above about using a logging framework, I still tend to use System.err output in one place: Within shut-down hooks. This is because I discovered that when using the java.util.logging framework log statements are not always displayed if they occur in shut-down hooks. This is because the logging library presumably contains its own shutdown hook to clean up log files and other resources, and as you can't rely on the order in which shutdown hooks run, you cannot rely on java.util.logging statements working as expected.
Check out this link (the "Comments" section) for more information on this.
http://weblogs.java.net/blog/dwalend/archive/2004/05/shutdown_hooks_2.html
(Obviously the other alternative is to use a different logging framework.)
System.err is really more for debugging purposes than anything else. Proper exception handling and dealing with errors in a manner that is more user-friendly is preferred. If the user is meant to see the error, use a System.out.println instead.
If you want to keep track of those errors from a developer's standpoint, you should use a logger.
Things written to System.err are typically lost at runtime, so it is considered a better practice to use a logging framework that is more flexible about where to output the message, so it can be stored as a file and analyzed.
Output written to System.err and System.out by non-console applications is only ever seen by the developer running the code in his or her IDE, and useful information may get lost if the message is triggered in production.
System.err.println and System.out.println should not be used as a logging interface. Standard output and standard error (which System.out and System.err write to) are meant for messages from command-line tools.
System.err prints to the console. This may be suitable for a student testing their homework, but it is unsuitable for an application where these messages won't be seen (a console only retains so many lines).
A better approach would be to throw an exception holding the message that would normally be sent to the console. An alternative is to use third-party logging software, which stores these messages in a file that can be kept indefinitely.