Log4j is logging some binary information at the beginning of the file - java

There is an issue we are facing in our production environment.
The file generated using log4j has some special characters prepended at the start, before any logging begins.
This results in a binary file, which makes tools like Splunk unable to read these files since they expect plain text.
Please help me understand what the issue could be here.

According to Google, my best guess is that you are seeing GC logs (JVM Garbage Collector logs), from what I read here: https://developer.jboss.org/message/529671#529671 and here: https://developer.jboss.org/thread/148848?tstart=0&_sscc=t.
It seems that there is no real solution, except maybe using the right combination of ASCII encoding and the right locale, according to the pages linked above.
Since you said in your question that you have this problem in a production environment, I would suggest simply disabling GC logs there: you generally should not enable them in production, because GC logging has a performance/storage impact. In your JVM start options, look for something like -XX:+PrintGC or -verbose:gc.
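As an illustration, here is a hedged sketch of the kind of start-up options to check; the JAVA_OPTS variable name, paths and memory settings are placeholders:
# Hypothetical start-script excerpt: these are the GC-logging flags to look for.
JAVA_OPTS="-Xmx2g -verbose:gc -XX:+PrintGCDetails -Xloggc:/var/log/app/gc.log"
# If GC logging turns out to be the culprit, dropping those flags in production avoids it:
JAVA_OPTS="-Xmx2g"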

Related

Profiling WebSphere with hprof for CPU SAMPLES output

I'm trying to profile WebSphere using hprof over IBM stack (J9 JVM on AIX / Linux). Specifically, I'm interested in the CPU samples output from hprof, and particularly, the startup time (from the time WS is started until it is "ready for business").
The problem is, I can't get the CPU samples output in the hprof result file.
I'm using the following JVM argument to configure hprof: -Xrunhprof:cpu=samples,file=path-to-hprof.txt, which dumps the hprof output in ASCII format. According to the generated hprof output, the CPU SAMPLES output is only generated at program exit:
HEAP DUMP, SITES, CPU SAMPLES|TIME and MONITOR DUMP|TIME records are generated
at program exit.
So, for shutting down WebSphere gracefully after it successfully started, I'm using the stopServer.sh script, and expecting the CPU SAMPLES output to be present in the resulting java.hprof.txt file after shutdown completes, but it isn't.
What am I doing wrong? Is there a better method for using hprof with WebSphere and generating CPU profiling output? Any help will be much appreciated!
Edit: I'm running WebSphere version 8.0.0.11 over IBM J9 VM (build 2.6, JRE 1.6.0 20150619_253846) on RHEL 7.5.
P.S.: I also looked for a way to close WS from the management console GUI, but couldn't find any.
P.P.S.: In the meantime I'm using the very nice jvmtop tool with the --profile <pid> option, but that provides only partial insight and, as opposed to hprof, has to be attached on the fly, so some parts of the execution are lost.
Thanks to #kgibm's helpful hints, I realized I was on the right track, and went back the next day to try again. Surprisingly, this time, it worked! The hprof file was generated with the expected WebSphere CPU samples output.
I kept experimenting to figure out what I got wrong in the first place. Here's what I think has happened:
At first, I had a couple of native agents specified in the WebSphere JVM arguments. The combination of these agents caused WS to run much slower. When I killed WS, there were a few seconds between the Server server1 stop completed message being printed and hprof.txt being completely written. I believe I was too quick to view hprof.txt, before the CPU samples output was actually written.
Then, for troubleshooting this issue, I added the doe=n parameter to the hprof argument. doe stands for Dump On Exit and defaults to y. Only later did I realize that this was probably wrong, since, as quoted above, the CPU samples output is only generated at exit.
I think that these two issues together contributed to my confusion, so when I started clean, everything was OK.
Perhaps it is worth clarifying in the hprof documentation that the doe=n option conflicts with cpu=samples, and possibly with the other options that write on exit as well (I didn't see such an indication in the docs, but it's possible I missed it).
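To illustrate the difference, a hedged sketch of the two -Xrunhprof variants (the interval, depth and file path values are placeholders, not the exact arguments I used):
# doe defaults to y, so the CPU SAMPLES section is written when the JVM exits:
-Xrunhprof:cpu=samples,interval=10,depth=8,file=/tmp/java.hprof.txt
# doe=n suppresses the dump on exit, so with cpu=samples no CPU SAMPLES output ever appears:
-Xrunhprof:cpu=samples,doe=n,file=/tmp/java.hprof.txt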

Better way to locate problems in a running Java application?

We have a few Java applications (jars) running as backend server applications on localhost. These programs run inside a virtual box (RHEL 6.2).
After one of the jars had been running for 5 days, it stopped working. No exceptions were thrown (we didn't see any output of errors that would have been caught in the catch blocks). To find out what caused this, we put in some printlns and redirected output to a text file using the > operator on the command line in a shell script.
After about 4 or 5 days, we faced a situation where we could see that the jar was still running, but it wasn't outputting anything to the text file or to the database the application was supposed to write entries to.
Perhaps the text file became too large for the virtual box to handle, but basically we wanted to know this:
How are such runtime problems located in Java? In C++ we have valgrind, Purify etc, but
1. Are there such tools in Java?
2. How would you recommend we output printlns without facing the extremely-large-text-file problem? Or is there a better way to do it?
Rather than printing to System.out, how about using a tool like Log4j? Log4j allows for log file sizing, versioning and purging.
see http://logging.apache.org/log4j/1.2/
You may also want to reconsider your server architecture.
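A minimal log4j 1.2 properties sketch of such a setup (the appender name, file path and size/backup values are placeholders):
# Rolling file appender: each log file is capped at 10 MB and at most 5 backups are kept,
# so the log can never grow into a single huge text file.
log4j.rootLogger=INFO, FILE
log4j.appender.FILE=org.apache.log4j.RollingFileAppender
log4j.appender.FILE.File=/var/log/myapp/server.log
log4j.appender.FILE.MaxFileSize=10MB
log4j.appender.FILE.MaxBackupIndex=5
log4j.appender.FILE.layout=org.apache.log4j.PatternLayout
log4j.appender.FILE.layout.ConversionPattern=%d{ISO8601} %-5p [%t] %c - %m%n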
How are such runtime problems located in Java? In C++ we have valgrind, Purify etc, but 1. are there such tools in Java?
There are a lot of Java profilers available, and a few are free as well. One is called VisualVM, which comes along with the Java distribution. You can attach the profiler to your process, but profilers will only help you find certain kinds of problems, such as memory leaks, CPU-intensive tasks, etc.
How would you recommend we output println's without facing the extremely-large-textfile problem? Or is there a better way to do it?
System.out is not a good way to deal with this problem. Loggers such as Log4j provide a very robust and easy-to-use API. Log4j also provides an easy way to configure rollover of your log files, among other features.
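For instance, a minimal Java sketch (the class, logger and method names are illustrative) of routing output through Log4j instead of System.out, so a rolling-file appender configuration applies:
import org.apache.log4j.Logger;

public class BackendServer {
    private static final Logger LOG = Logger.getLogger(BackendServer.class);

    public void processRequest(String id) {
        LOG.info("Processing request " + id);
        try {
            // ... actual work ...
        } catch (Exception e) {
            // Logging the exception preserves the stack trace instead of silently swallowing it.
            LOG.error("Request " + id + " failed", e);
        }
    }
}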

How do I stop .mdmp files from being created

I have an instance of Solr, hosted with Tomcat, that recently started creating minidump files. There are no errors in any of the logs, and Solr continues to work without a hitch.
The files are approximately 14 GB and are filling up the hard drive. Is there a way to turn this off while we investigate the issue?
Generally speaking, when the JVM crashes, the content of the hs_err error log file (controlled by -XX:ErrorFile) is often enough to point to what the trouble may be.
To prevent the Oracle HotSpot JVM from generating Windows minidumps (.mdmp files), the JVM option to use on the command line is: -XX:-CreateMinidumpOnCrash
It has existed since 2011 but was very difficult to find: How to disable minidump (mdmp) files generation with Java Hotspot JVM on Windows
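For example, a hedged sketch of adding this to Tomcat's JVM options on Windows (the setenv.bat location, log path and the ErrorFile option are illustrative additions, not something the flag itself requires):
rem In CATALINA_BASE\bin\setenv.bat: suppress .mdmp creation on a crash while still
rem keeping the plain-text hs_err log for diagnosis (%%p is replaced by the process id).
set "CATALINA_OPTS=%CATALINA_OPTS% -XX:-CreateMinidumpOnCrash -XX:ErrorFile=C:\logs\hs_err_pid%%p.log"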
This article has decent information on both Linux and Windows JVM dump files. I have yet to test it myself on my current version of Java 7....
From that site:
Disabling Text dump Files
If you suspect problems with the creation of text dump files you can turn off the text dump file by using the option: -XXnoJrDump.
Disabling the Binary Crash Files
You can turn off the binary crash file by using the option: -XXdumpSize:none.
Are you using Java 7? In that case, revert to Java 5 or 6. Lucene/Solr and Java 7 don't go well together, and this could be what creates the dump files. Otherwise, if everything is working, just disable the dumping of files.
I never found a way to disable the Java minidumps on Windows. The strange part here is that everything on the server worked correctly, apart from the hard drive filling up with minidumps.
We eventually reinstalled everything, same version of Solr/Java/Tomcat, onto a Linux machine and didn't have the problem any more. I would imagine that reinstalling everything onto a Windows machine would also have fixed the problem. This was a strange one.

Tool to count stacktraces in a logfile

Is there a tool that is able to collect and count (Java) stacktraces in a large logfile, such that you get an overview which errors occur most often?
I am not aware of any automatic tool, but LogMX will give you a nice clean overview of your log file with search options.
This probably isn't the best answer, but I am going to try to answer the spirit of your question. You should try Dynatrace. It's not free and it doesn't work with log files per se, but it can get you very detailed reports of what types of exceptions are thrown, from where and when, on top of a lot of other info.
I'm not too sure if there is a tool available to evaluate log files, but you may have more success with a tool like AppDynamics. This is a monitoring tool which can be used to evaluate live application performance and can be configured to monitor exception frequency.
Good luck.
Mark.
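As a rough stopgap while evaluating the tools above, a hedged shell sketch (the log file name is a placeholder) that simply tallies exception class names by frequency:
# Extract fully-qualified names ending in Exception/Error and count occurrences.
grep -oE '([A-Za-z_$][A-Za-z0-9_$]*\.)+[A-Za-z_$][A-Za-z0-9_$]*(Exception|Error)' server.log \
  | sort | uniq -c | sort -rn | head -20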

Configuring Hadoop logging to avoid too many log files

I'm having a problem with Hadoop producing too many log files in $HADOOP_LOG_DIR/userlogs (the Ext3 filesystem allows only 32000 subdirectories), which looks like the same problem as in this question: Error in Hadoop MapReduce
My question is: does anyone know how to configure Hadoop to roll the log dir or otherwise prevent this? I'm trying to avoid just setting the "mapred.userlog.retain.hours" and/or "mapred.userlog.limit.kb" properties because I want to actually keep the log files.
I was also hoping to configure this in log4j.properties, but looking at the Hadoop 0.20.2 source, it writes directly to logfiles instead of actually using log4j. Perhaps I don't fully understand how it's using log4j.
Any suggestions or clarifications would be greatly appreciated.
Unfortunately, there isn't a configurable way to prevent that. Every task for a job gets one directory in history/userlogs, which holds the stdout, stderr, and syslog task log output files. The retain-hours setting will help keep too many of those from accumulating, but you'd have to write a good log rotation tool to auto-tar them (see the sketch after this answer).
We had this problem too when we were writing to an NFS mount, because all nodes would share the same history/userlogs directory. This means one job with 30,000 tasks would be enough to break the FS. Logging locally is really the way to go once your cluster actually starts processing a lot of data.
If you are already logging locally and still manage to process 30,000+ tasks on one machine in less than a week, then you are probably creating too many small files, causing too many mappers to spawn for each job.
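Following up on the auto-tar suggestion above, a hedged shell sketch (directory paths and the 3-day window are placeholders) that archives old task-log directories instead of deleting them:
# Tar up and remove userlogs task directories older than 3 days, keeping the archives.
LOG_DIR=/var/log/hadoop/userlogs
ARCHIVE_DIR=/var/log/hadoop/userlogs-archive
mkdir -p "$ARCHIVE_DIR"
find "$LOG_DIR" -mindepth 1 -maxdepth 1 -type d -mtime +3 | while read -r dir; do
  name=$(basename "$dir")
  tar -czf "$ARCHIVE_DIR/$name.tar.gz" -C "$LOG_DIR" "$name" && rm -rf "$dir"
done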
I had this same problem. Set the environment variable "HADOOP_ROOT_LOGGER=WARN,console" before starting Hadoop.
export HADOOP_ROOT_LOGGER="WARN,console"
hadoop jar start.jar
Configuring hadoop to use log4j and setting
log4j.appender.FILE_AP1.MaxFileSize=100MB
log4j.appender.FILE_AP1.MaxBackupIndex=10
as described on this wiki page doesn't work?
Looking at the LogLevel source code, it seems that Hadoop uses Commons Logging, and it will try to use log4j by default, or the JDK logger if log4j is not on the classpath.
By the way, it's possible to change log levels at runtime; take a look at the commands manual.
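For example, a hedged sketch of the runtime log-level command (the host, class name and level are placeholders; 50060 is the default TaskTracker HTTP port):
# Raise the logging threshold of a running daemon without restarting it.
hadoop daemonlog -setlevel tasktracker-host:50060 org.apache.hadoop.mapred.TaskTracker WARN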
According to the documentation, Hadoop uses log4j for logging. Maybe you are looking in the wrong place ...
I also ran into the same problem.... Hive produces a lot of logs, and when the node's disk is full, no more containers can be launched. In YARN, there is currently no option to disable logging. One particularly huge file is the syslog file, generating GBs of logs in a few minutes in our case.
Configuring the property yarn.nodemanager.log.retain-seconds in yarn-site.xml to a small value does not help. Setting yarn.nodemanager.log-dirs to file:///dev/null is not possible because a directory is needed. Removing the write right (chmod -r /logs) did not work either.
One solution could be a "null blackhole" directory. Check here:
https://unix.stackexchange.com/questions/9332/how-can-i-create-a-dev-null-like-blackhole-directory
Another solution that works for us is to disable logging before running the jobs. For instance, in Hive, starting the script with the following lines works:
set yarn.app.mapreduce.am.log.level=OFF;
set mapreduce.map.log.level=OFF;
set mapreduce.reduce.log.level=OFF;
