Could someone give me a detailed description of the Flume command below, which runs a conf file?
bin/flume-ng agent --conf-file netcat_flume.conf --name a1
-Dflume.root.logger=INFO,console
To my knowledge:
--conf-file -> specifies the configuration file name, i.e. tells Flume which file to run.
--name -> the agent name
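(For reference, the actual netcat_flume.conf isn't shown here, but a minimal netcat-style configuration of the kind such a command points at, following the pattern in the Flume user guide, might look like this sketch:)
a1.sources = r1
a1.sinks = k1
a1.channels = c1
# netcat source listening on localhost:44444
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444
# logger sink, whose output is what INFO,console makes visible
a1.sinks.k1.type = logger
# in-memory channel wiring the source to the sink
a1.channels.c1.type = memory
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1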
But what does the option below do?
-Dflume.root.logger=INFO,console
Thanks in advance for your help.
It's the Log4j property, explained in detail below.
INFO means output only informational messages that highlight the progress of the application at a coarse-grained level.
console means write the Log4j output to the console. Other options include writing to a file or a database.
-Dflume.root.logger=INFO,console
The above setting writes coarse-grained (INFO-level) logs of the Flume execution to the console.
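For a fuller picture, the console target named by "console" is a log4j appender that Flume's conf/log4j.properties defines roughly as follows (a sketch of a typical log4j 1.x console appender; the exact pattern in your distribution may differ):
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d (%t) [%p - %l] %m%n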
The shell script flume-ng accepts the arguments and finally runs a command like:
java -Xmx20m -Dflume.root.logger=INFO,console -cp '=:/home/scy/apache-flume-1.4.0-bin/lib/*:/home/scy/apache-flume-1.4.0-bin/conf:/home/scy/jdk1.6.0_45/lib/tools.jar' -Djava.library.path= org.apache.flume.node.Application --conf-file conf/example.conf --name agent1 conf org.apache.flume.node
Let's look at the source code of org.apache.flume.node.Application.main(String[] args):
PropertiesFileConfigurationProvider configurationProvider =
    new PropertiesFileConfigurationProvider(agentName,
        configurationFile);
Here the class PropertiesFileConfigurationProvider accepts agentName and configurationFile, which are specified by "--name" and "--conf-file" respectively.
Then application.start() runs all sources, channels and sinks.
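Condensed into a small Java sketch (the class name MiniFlumeLauncher is made up, and the exact Application method names are an assumption based on the 1.4-era source quoted above):
import java.io.File;

import org.apache.flume.node.Application;
import org.apache.flume.node.PropertiesFileConfigurationProvider;

public class MiniFlumeLauncher {
    public static void main(String[] args) {
        // Values that flume-ng forwards from --name and --conf-file
        String agentName = "agent1";
        File configurationFile = new File("conf/example.conf");

        // Parses the properties file and materializes the agent's
        // sources, channels and sinks
        PropertiesFileConfigurationProvider provider =
            new PropertiesFileConfigurationProvider(agentName, configurationFile);

        Application application = new Application();
        // Hand the materialized configuration to the application, then start
        // every configured component (assumption: this mirrors what
        // Application.main() does in Flume 1.4; details differ across versions)
        application.handleConfigurationEvent(provider.getConfiguration());
        application.start();
    }
}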
As for -Dflume.root.logger=INFO,console, let's look at Flume's conf/log4j.properties:
flume.root.logger=INFO,LOGFILE
flume.root.logger is overridden by -Dflume.root.logger=INFO,console, which means all INFO-level logs go to the console instead.
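Abridged, the relevant lines in conf/log4j.properties look something like this (exact contents vary by Flume version):
# Default: INFO-level logs go to the LOGFILE appender
flume.root.logger=INFO,LOGFILE
log4j.rootLogger=${flume.root.logger}
# Passing -Dflume.root.logger=INFO,console on the java command line overrides
# the default above, so the root logger writes to the console appender instead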
I've got a Spring Boot application that I'd like to automatically generate traces for using the OpenTelemetry Java agent, and subsequently upload those traces to Google Cloud Trace.
I've added the following code to the entry point of my application for sending traces:
import io.opentelemetry.sdk.OpenTelemetrySdk;
import io.opentelemetry.sdk.trace.SdkTracerProvider;
import io.opentelemetry.sdk.trace.export.SimpleSpanProcessor;
// TraceExporter comes from the Google Cloud OpenTelemetry exporter artifact (assumed package)
import com.google.cloud.opentelemetry.trace.TraceExporter;

OpenTelemetrySdk.builder()
    .setTracerProvider(
        SdkTracerProvider.builder()
            .addSpanProcessor(
                SimpleSpanProcessor.create(TraceExporter.createWithDefaultConfiguration()))
            .build())
    .buildAndRegisterGlobal();
...and I'm running my application with the following JVM arguments:
-javaagent:path/to/opentelemetry-javaagent-all.jar \
-jar myapp.jar
...but I don't know how to connect the two.
Is there some agent configuration I can apply? Something like:
-Dotel.traces.exporter=google_cloud_trace
I ended up resolving this as follows:
1. Clone the GoogleCloudPlatform/opentelemetry-operations-java repo:
git clone git@github.com:GoogleCloudPlatform/opentelemetry-operations-java.git
2. Build the exporter-auto project:
./gradlew clean :exporter-auto:shadowJar
3. Copy the jar produced in exporter-auto/build/libs to my target project
4. Run the application with the following arguments:
-javaagent:path/to/opentelemetry-javaagent-all.jar
-Dotel.javaagent.experimental.extensions=[artifact-from-step-3].jar
-Dotel.traces.exporter=google_cloud_trace
-Dotel.metrics.exporter=none
-jar myapp.jar
Note: This setup does not require any explicit code changes in the target code base.
I'm very new to debugging code cloned from GitHub. However, so far I have done the following:
- cloned the repo to my local machine (git clone ), as well as with the "Sourcetree" software
- built the code (mvn clean install)
- imported the Maven project into an IDE (Eclipse, IntelliJ)
- after the build completed, started the application (e.g. start.sh) from the target/bin directory created by the build
- logged into the application's UI successfully
Questions:
- At this moment I am not sure what the application's main class is, or in which .java file I should set a breakpoint
- Once the breakpoint is set, how do I debug while navigating through the UI?
Can someone please give me a pointer? Thanks in advance!
E.g. I'm testing all this on the "Apache/NiFi-Registry" project.
ref: https://github.com/apache/nifi-registry
You're going to need to edit this line in the nifi-registry.sh script to enable remote debugging
run_nifi_registry_cmd="'${JAVA}' -cp '${BOOTSTRAP_CLASSPATH}' -Xms12m -Xmx24m ${BOOTSTRAP_DIR_PARAMS} org.apache.nifi.registry.bootstrap.RunNiFiRegistry $#"
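For instance (a sketch only: this uses the standard JDWP agentlib flag, the same one the Kafka snippet below builds up, and port 8000 is an arbitrary choice):
run_nifi_registry_cmd="'${JAVA}' -agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=8000 -cp '${BOOTSTRAP_CLASSPATH}' -Xms12m -Xmx24m ${BOOTSTRAP_DIR_PARAMS} org.apache.nifi.registry.bootstrap.RunNiFiRegistry $#"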
Is it just me, or is that memory footprint really small?
For example, in Kafka, there is this section of the startup script
# Set Debug options if enabled
if [ "x$KAFKA_DEBUG" != "x" ]; then
  # Use default ports
  DEFAULT_JAVA_DEBUG_PORT="5005"
  if [ -z "$JAVA_DEBUG_PORT" ]; then
    JAVA_DEBUG_PORT="$DEFAULT_JAVA_DEBUG_PORT"
  fi
  # Use the defaults if JAVA_DEBUG_OPTS was not set
  DEFAULT_JAVA_DEBUG_OPTS="-agentlib:jdwp=transport=dt_socket,server=y,suspend=${DEBUG_SUSPEND_FLAG:-n},address=$JAVA_DEBUG_PORT"
  if [ -z "$JAVA_DEBUG_OPTS" ]; then
    JAVA_DEBUG_OPTS="$DEFAULT_JAVA_DEBUG_OPTS"
  fi
  echo "Enabling Java debug options: $JAVA_DEBUG_OPTS"
  KAFKA_OPTS="$JAVA_DEBUG_OPTS $KAFKA_OPTS"
fi
Eventually it runs ${JAVA} ... ${KAFKA_OPTS}, so if you stop the Kafka server and start it with
export KAFKA_DEBUG=y; kafka-server-start ...
then you can attach a remote debugger, on port 5005 by default.
I understand you're using NiFi Registry, not Kafka, but basically, you need to add arguments to the JVM and reboot it. You can't just attach to the running Registry and walk through the source code.
Remote debugging a Java application
I'm using an Oozie environment. After successful completion of the job, I can't find the System.out.println output in the Oozie log. I googled for many hours and found this,
but without result. From the Oozie web console I got the job id "0000011-180801114827014-oozie-oozi-W", then I tried to get more information about the job using the following command:
oozie job -oozie http://localhost:11000/oozie/ -info 0000011-180801114827014-oozie-oozi-W
Then I got the externalId "16546" from the JobCompleted action, and I think the job id is 180801114827014. Finally I tried to get the log from the Java action using the following command:
yarn logs -applicationId application_180801114827014_16546
Where am I going wrong? Any suggestions?
Edit
I checked whether log aggregation was enabled, and it seems that it is.
So, where am I going wrong?
I can say from experience that stdout is not removed from any YARN action; however, the encouraged way to log information in your applications is Log4j, which goes to syslog, not stdout (or stderr).
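For illustration, a minimal sketch of that in a Java action, using Log4j 1.x (the class name here is hypothetical):
import org.apache.log4j.Logger;

public class MyOozieJavaAction {
    // Messages logged here end up in the task's syslog rather than stdout
    private static final Logger LOG = Logger.getLogger(MyOozieJavaAction.class);

    public static void main(String[] args) {
        LOG.info("Java action started with " + args.length + " argument(s)");
    }
}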
However, as your terminal says, YARN log aggregation needs to be enabled and completed before you can see the logs via the yarn logs command.
And if that command still doesn't work, go to the Oozie UI and open the job action, or go directly to the YARN UI and search for the action, then follow the logs link from there.
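As a sketch of that last route: the id that yarn logs expects has the form application_<clusterTimestamp>_<sequenceNumber>, and you can copy it from the ResourceManager UI or list it with (the id below is a placeholder, not one from this job):
yarn application -list -appStates ALL
yarn logs -applicationId <application_id_from_the_list>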
I have a custom source for my Flume (version 1.5.0) agent and I want to debug it. It's actually a custom Twitter source, from Cloudera's example here. I have a number of questions:
(1) Is it possible to remote debug the Flume source (written in Java) when I run the Flume agent?
In addition, when I run the agent, I have this option
-Dflume.root.logger=DEBUG,console
but it seems that the logger.debug calls I have in the Java source are not appearing in the terminal.
(2) How do I make my logs appear? What's missing in my Flume or logging configuration?
(3) If I'm able to make the logs appear, how do I write only my Flume source's logger.debug output to a file, excluding the Flume agent's own logs?
Thanks.
Use the following arguments for the JVM running the Flume agent, as described in http://stackoverflow.com/a/22631355/1660002.
For example, for a newer JDK (1.8 for me):
-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=6006
Then you can connect to that remote port (the address field) using IntelliJ or any other IDE's remote debugging.
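One place to put that flag for flume-ng (a sketch; in the distributions I've seen the launcher script sources conf/flume-env.sh and appends JAVA_OPTS to the java command, but treat that as an assumption for your version):
# conf/flume-env.sh
export JAVA_OPTS="$JAVA_OPTS -agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=6006"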
I'm trying to profile my application to see if I can reproduce this blog post. I added -D mapred.task.profile=true to the command line and checked in the job configuration that it took effect.
Hadoop: The Definitive Guide says the profile info will appear in the Unix directory I ran the job from. The directory I started from has a file attempt_201305011806_0042_m_000002_0.profile, which is the correct job ID, but there wasn't a mapper #2 (only 1 mapper, and it didn't fail). The profile file only contains the header info; there isn't any actual profiling data.
The Hadoop docs say the output will be in the user log directory but I can't find anything. If I go into the task logs for the mapper, there's profiling info under "profile.out logs" with legitimate info. My HDFS output dir doesn't have the profiling info at all. Shouldn't the profiling output be in HDFS somewhere?
Also, it only gives text-based output in the log but all of the tools I've found to visualize the profile assume binary hprof format. Any ideas for how I could get a binary profile or else load a text-based profile into an hprof tool?
I noticed there's a space in
-D mapred.task.profile=true
Is that a typo? If so, just remove it and see what happens. Also, you should be able to see the profiler files under the user log directory, which is usually where you ran the job from.
Also, HPROF is the default profiler for Hadoop, so check that you are not overriding it with
-Dmapred.task.profile.params
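For reference, a sketch of setting the profiling properties explicitly with the MR1-era names used above (the params value is, as far as I recall, Hadoop's documented default HPROF line from mapred-default.xml):
-Dmapred.task.profile=true \
-Dmapred.task.profile.maps=0-0 \
-Dmapred.task.profile.reduces=0-0 \
-Dmapred.task.profile.params=-agentlib:hprof=cpu=samples,heap=sites,depth=6,force=n,thread=y,verbose=n,file=%s
If you need binary rather than text output, HPROF's format=b switch is the knob to experiment with inside that params string, though whether your visualizer accepts the result is tool-dependent.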