Custom log4j appender in Hadoop 2 - java

How to specify a custom log4j appender in Hadoop 2 (Amazon EMR)?
Hadoop 2 ignores my log4j.properties file that contains a custom appender, overriding it with its internal log4j.properties file. There is a flag, -Dhadoop.root.logger, that sets the logging threshold, but it does not help with a custom appender.

I know this question has been answered already, but there is a better way of doing this, and this information isn't easily available anywhere. There are actually at least two log4j.properties files that get used in Hadoop (at least for YARN). I'm using Cloudera, but it will be similar for other distributions.
Local properties file
Location: /etc/hadoop/conf/log4j.properties (on the client machines)
This is the log4j.properties file used by the normal Java process.
It affects the logging of everything that happens in the Java process but not inside YARN/MapReduce. So all your driver code, and anything that plugs MapReduce jobs together (e.g., Cascading initialization messages), will log according to the rules you specify here. This is almost never the logging properties file you care about.
As you'd expect, this file is parsed when you invoke the hadoop command, so you don't need to restart any services when you update your configuration.
If this file exists, it takes priority over the one sitting in your jar (because it's usually earlier on the classpath). If it doesn't exist, the one in your jar is used.
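To check which copy wins on a given client, you can print the classpath the hadoop command will use; the first entry containing a log4j.properties is the one log4j loads:
# Print the client classpath one entry per line
hadoop classpath | tr ':' '\n'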
Container properties file
Location: /etc/hadoop/conf/container-log4j.properties (on the data node machines)
This file decides the properties of the output from all the map and reduce tasks, and is nearly always what you want to change when you're talking about Hadoop logging.
In newer versions of Hadoop/YARN someone caught a dangerously virulent strain of logging fever, and the default configuration now ensures that single jobs can generate several hundred megs of unreadable junk, making your logs quite hard to use. I'd suggest putting something like this at the bottom of the container-log4j.properties file to get rid of most of the extremely helpful messages about how many bytes have been processed:
log4j.logger.org.apache.hadoop.mapreduce=WARN
log4j.logger.org.apache.hadoop.mapred=WARN
log4j.logger.org.apache.hadoop.yarn=WARN
log4j.logger.org.apache.hadoop.hive=WARN
log4j.security.logger=WARN
By default this file usually doesn't exist, in which case the copy found in hadoop-yarn-server-nodemanager-<version>.jar (as mentioned by uriah kremer) will be used. However, as with the other log4j.properties file, if you do create /etc/hadoop/conf/container-log4j.properties it will be used for all your YARN stuff. Which is good!
Note: no matter what you do, a copy of container-log4j.properties in your jar will not be used for these properties, because the YARN nodemanager jars are higher in the classpath. Similarly, despite what the internet tells you, -Dlog4j.configuration=PATH_TO_FILE will not alter your container logging properties, because the option doesn't get passed on to YARN when the container is initialized.

1. In order to change log4j.properties on the name node, you can edit /home/hadoop/log4j.properties.
2. In order to change log4j.properties for the container logs, you need to change it inside the YARN containers' jar, since loading that file directly from the project resources is hard-coded.
2.1 SSH to the slave (on EMR you can also simply add this as a bootstrap action, as sketched below, so you don't need to SSH to each of the nodes):
ssh to hadoop slave
2.2 Override container-log4j.properties in the jar's resources (this assumes your updated container-log4j.properties is in the current directory):
jar uf /home/hadoop/share/hadoop/yarn/hadoop-yarn-server-nodemanager-2.2.0.jar container-log4j.properties
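As a sketch, a bootstrap action along these lines would apply the override on every node as it comes up (the jar path and Hadoop version are examples and should match your distribution):
#!/bin/bash
# Hypothetical EMR bootstrap action: replace container-log4j.properties
# inside the nodemanager jar. Adjust the jar path/version to your cluster.
cd /tmp
cat > container-log4j.properties <<'EOF'
log4j.logger.org.apache.hadoop.mapreduce=WARN
log4j.logger.org.apache.hadoop.mapred=WARN
log4j.logger.org.apache.hadoop.yarn=WARN
EOF
# 'jar uf' updates (replaces) the matching entry inside the existing jar
jar uf /home/hadoop/share/hadoop/yarn/hadoop-yarn-server-nodemanager-2.2.0.jar container-log4j.properties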

Look for hadoop-config.sh in the deployment. That is the script that gets sourced before executing the hadoop command. I see the following line in hadoop-config.sh; see if modifying it helps:
HADOOP_OPTS="$HADOOP_OPTS -Dhadoop.root.logger=${HADOOP_ROOT_LOGGER:-INFO,console}"
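Because the default is written as ${HADOOP_ROOT_LOGGER:-INFO,console}, you can also override it from the environment instead of editing the script. For example (the jar and class names here are placeholders):
# One-off override of the root logger for this invocation
HADOOP_ROOT_LOGGER="DEBUG,console" hadoop jar myjob.jar com.example.MyDriver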

Related

Log4j2 read properties from an external property file

I have an application that I have to deliver as a packaged JAR, which is then run by the client in a complicated environment that makes editing environment variables or JVM arguments very cumbersome (additionally, the client is not too technical). Currently we are using some external property files for configuring the database and so on, and this has been going well so far.
I would like to allow the client to configure some aspects of Log4j2 using these property files. I can see that Log4j2 allows multiple ways of performing property substitution: https://logging.apache.org/log4j/log4j-2.1/manual/configuration.html#PropertySubstitution
I can also see that it is possible to load property bundles, but if I understand the docs correctly these bundles need to be loaded from the classpath, and I have not stumbled upon a way to define this properties file by giving its direct path (such as "load the properties file /tmp/myconfig.properties").
So ultimately: is it possible to use variables from an external .properties file that is NOT on the classpath but at a specified filesystem location? Or is there some other way to load this data from an external file? (I already noted that using environment variables or JVM arguments is out of the question in my case.)
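For reference, the classpath-bundle mechanism the question refers to looks like this in a log4j2.xml (a sketch only, not a solution to the filesystem-path problem; the bundle name "config" and key "log.dir" are hypothetical, and the bundle must be a config.properties on the classpath):
<Configuration>
  <Appenders>
    <!-- ${bundle:config:log.dir} resolves the key "log.dir" from config.properties -->
    <File name="App" fileName="${bundle:config:log.dir}/app.log">
      <PatternLayout pattern="%d %p %c - %m%n"/>
    </File>
  </Appenders>
  <Loggers>
    <Root level="info">
      <AppenderRef ref="App"/>
    </Root>
  </Loggers>
</Configuration>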

No log statement is printing using websphere 8.0

I have an application running in WAS 8.
We have a jar, commons-logging-1.1.1.jar, in WEB-INF/lib.
We have a properties file, commons-logging.properties.
the content of the file is
priority=1
org.apache.commons.logging.LogFactory=org.apache.commons.logging.impl.LogFactoryImpl
We have an org.apache.commons.logging.LogFactory file in WebContent/META-INF/services.
the content of the file is
org.apache.commons.logging.impl.Log4jFactory
The log files are created but nothing is written to them. No errors show up in the application server's logs either.
Please let me know if I am missing something.
Please note: if I keep commons-logging.properties in /opt/IBM/WebSphere/80/AppServer/profiles/AppSrv01/properties, then it works perfectly fine and writes to the log files. But as I understand it, that is not standard practice, so I can't keep the file there; I have to find some alternative.
Please help me.
If it is not a requirement to have separate log files for your web application, you can simply remove the commons-logging jar and properties from your module (war), and your log statements will write to SystemOut.log according to the log level settings in the WebSphere console (which can be changed at runtime, by the way).
If you must separate application logging, you can refer to the infocenter article that lays out the combinations of commons-logging jar location, commons-logging properties values, and application classloader settings (Parent First, Parent Last, whether commons-logging.jar is bundled, etc.) needed to achieve the desired results.

Log4j different .property files for Appenders

I was wondering if there is a way to define the appenders (file, console, etc.) in a different file than the one defining the actual logging properties.
The idea came up from a system I am developing, with the following requirement:
Different versions of the system will be deployed on the same server, and we'd rather not maintain several log4j properties files that all set the same properties and differ only in the file appenders (which let us know which version of the system recorded which log).
Thank you in advance
You can use DOMConfigurator or PropertyConfigurator to load your log4j settings from an external file. You can invoke this API multiple times during a run to load settings from different sources.
In your case, you can load the appender details alone from another property file chosen at runtime based on the version, e.g. by suffixing a version id to the file name and loading that file from your code in a generic way.
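A minimal sketch of that approach with log4j 1.x (the file paths and naming scheme are assumptions):
import org.apache.log4j.PropertyConfigurator;

public class LogSetup {
    public static void init(String version) {
        // Shared settings: levels, layouts, logger hierarchy
        PropertyConfigurator.configure("/etc/myapp/log4j-common.properties");
        // Version-specific appender definitions; a second configure()
        // call adds to and overrides the configuration loaded above
        PropertyConfigurator.configure("/etc/myapp/log4j-appenders-" + version + ".properties");
    }
}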
If each version runs in a different VM process (on different ports), you can add an argument to the virtual machine, e.g.:
-Dmysystem.version=1.0.1
If you are using the XML configuration:
<param name="file" value="/logs/system.v${mysystem.version}.log" />
Or, if you are using the properties format:
log4j.appender.ROLL.File=/logs/system.v${mysystem.version}.log
In both cases, the generated file might be:
/logs/system.v1.0.1.log
In this way, you can maintain a single configuration file while keeping the file names dynamic.

tomcat, 2 webapps, 2 log4js, but both apps log to one file

To elaborate on that, I have a Tomcat server, version 7.0.27, running Java 1.6.0_27.
I have two wars, each with its own log4j jar, also using slf4j with slf4j-log4j. Each war has its own configuration file (log4j.xml).
War 1 should use file log-1.log and war 2 should use file log-2.log, but both are logging into log-1.log.
I've checked that there are no other log4j jars in the Tomcat installation, so I'm not sure where the problem is. I've also turned off shared class loading, but that made no difference. My next step is to turn on verbose class loader logging and/or start debugging log4j, but maybe someone here knows why this happens and can save me some time. Thanks for any input.
Update:
OK, I think I got this one. The log4j XML files are fine. After running with -verbose:class I can see that log4j.jar is only getting loaded once, and from neither web application.
I'm working with Documentum. They have a runtime jar, required to use their libraries, that is an empty jar with a manifest file. The manifest points to a bunch of jars; in other words, they don't use Maven... Anyway, one of those jars happens to be the log4j found in the Documentum installation. So it seems both webapps are using that one. I think this is the problem. To be confirmed...
If you are placing Documentum's runtime jar on your top-level classpath, and that runtime jar references log4j.jar, then log4j will only load once. You don't have to use that runtime jar, though; or you can use it in just the Documentum .war, if one of the two doesn't use Documentum.
You didn't post your properties file, but I can think of some reasons:
You don't have appenders that write to the different files, i.e. you need appender1 writing to log1.log and appender2 writing to log2.log.
You have the appenders set up right, but both applications are using the same logger, so they both write to the same file.
You have two loggers, each with its own appender, but in your code you are not initializing the correct logger:
// there is no logger called com.sample, so this falls back to the root logger, whose appender writes to log1.log
Logger logger = Logger.getLogger(com.sample.MyClass.class);
If you post your properties file and your logger init code it'll be easier to help you.
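For illustration, a minimal log4j.properties along the lines of point 3 (package and file names are examples):
# Root logger catches everything not matched below and writes to log1.log
log4j.rootLogger=INFO, app1
log4j.appender.app1=org.apache.log4j.FileAppender
log4j.appender.app1.File=log1.log
log4j.appender.app1.layout=org.apache.log4j.PatternLayout
log4j.appender.app1.layout.ConversionPattern=%d %p %c - %m%n

# Classes under com.sample get their own appender and file
log4j.logger.com.sample=INFO, app2
log4j.additivity.com.sample=false
log4j.appender.app2=org.apache.log4j.FileAppender
log4j.appender.app2.File=log2.log
log4j.appender.app2.layout=org.apache.log4j.PatternLayout
log4j.appender.app2.layout.ConversionPattern=%d %p %c - %m%n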

Where on a remote workstation should I put a CSV-config file for distributed JMeter testing?

I want to do JMeter distributed testing. The manual says that first I should start jmeter-server on the remote nodes, then update jmeter.properties and run jmeter on the master node.
I did all these steps. My test plan includes working with CSV config files. If I test just from the (master) node, everything works like a charm. But when I try distributed testing, all tests fail. Some investigation showed that the remote nodes send requests without substituting the ${..}-style parameters. Requests look like:
POST data:
5|0|6|http://host.com/portal/|67D1C612DCF291DCD0F71AD15E404F37|host.ui.client.services.LoginService|login|java.lang.String/2004016611|${ADMIN_LOGIN}|1|2|3|4|3|5|5|5|6|6|1|
It's obvious that the remote jmeter-server cannot find the CSV file. Where should I put it?
P.S.: I have machines with different OSes (Windows 7 and Ubuntu 10.04).
The easiest way to resolve the multiple-OS issue is to put the CSV file in the JMeter bin directory on all test machines, and not reference any path in the CSV Data Set Config component.
Put a full path and filename into your CSV Data Set Config component, e.g. c:\loadtest\config.csv, and ensure that you put the CSV file in the place specified.
The components manual also states the following:
Relative file names are resolved with respect to the path of the active test plan.
So it should be possible to put the file in the same directory as the test plan file. This ought to work in both Linux and Windows.
Any reference to a data file assumes that the file exists at the specified path on the respective node. For example, if you keep your CSV files in C:\data, then when you execute the test plan in distributed fashion, each node (slave) will look for the data file in its own C:\data.
In effect, if you are using 10 slave machines, you need to have a C:\data folder on all 10 of those machines.
There is no need to copy the test plan itself.
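If the slaves are reachable over SSH, a small loop like this keeps the data files in sync before a run (a sketch assuming Unix-like slaves; host names and paths are placeholders):
# Copy the CSV to the same path on every slave
for host in slave1 slave2 slave3; do
  scp /loadtest/config.csv "$host:/loadtest/config.csv"
done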
EDITED because the docs reference was wrong - I got burned by my own answer :)
Old question, but I just ran into this issue and the answers here are conflicting.
Is a relative path resolved against the bin/ directory, or against the directory of the current .jmx test script?
Answer: it is the directory of the test script. From the docs:
Relative file names are resolved with respect to the path of the active test plan. Absolute file names are also supported, but note that they are unlikely to work in remote mode, unless the remote server has the same directory structure. If the same physical file is referenced in two different ways - e.g. csvdata.txt and ./csvdata.txt - then these are treated as different files. If the OS does not distinguish between upper and lower case, csvData.TXT would also be opened separately.
