In Java, is there any way I can print the time of execution of each log statement in the logger? My current output looks like this:
3 [main] DEBUG Main.class -305
3 [main] DEBUG Main.class -307
3 [main] DEBUG Main.class -311
I want the output to include a timestamp, like this:
3[24-2-2016 12:00:00] [main] DEBUG Main.class -305 794
3[24-2-2016 12:00:01] [main] DEBUG Main.class -307
3[24-2-2016 12:00:02] [main] DEBUG Main.class -311
You can use log4j to implement logging in your application. You can go through this tutorial on log4j.
The logs it generates look like this:
2014-07-02 20:52:39 DEBUG HelloExample:19 - This is debug : mkyong
2014-07-02 20:52:39 INFO HelloExample:23 - This is info : mkyong
2014-07-02 20:52:39 WARN HelloExample:26 - This is warn : mkyong
2014-07-02 20:52:39 ERROR HelloExample:27 - This is error : mkyong
2014-07-02 20:52:39 FATAL HelloExample:28 - This is fatal : mkyong
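For the timestamp specifically, the date/time in the output is controlled by the layout's conversion pattern: a %d conversion prints the event's time. Below is a minimal sketch, assuming log4j 1.x configured programmatically (the class and message are illustrative); the same pattern can equally go in a log4j.properties ConversionPattern.

import org.apache.log4j.ConsoleAppender;
import org.apache.log4j.Logger;
import org.apache.log4j.PatternLayout;

public class Main {
    private static final Logger LOG = Logger.getLogger(Main.class);

    public static void main(String[] args) {
        // %r = ms since startup, %d{...} = timestamp, %t = thread,
        // %p = level, %c = logger name, %L = line number, %m%n = message.
        PatternLayout layout =
                new PatternLayout("%r[%d{dd-M-yyyy HH:mm:ss}] [%t] %p %c -%L %m%n");
        Logger.getRootLogger().addAppender(new ConsoleAppender(layout));

        LOG.debug("first statement");
    }
}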
I'm trying to operate on HDFS via the Java Hadoop client. But when I call FileSystem::listFiles, the returned iterator gives me no entries.
Here is my Java code:
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.LocatedFileStatus;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.RemoteIterator;
import java.net.URI;
import java.net.URISyntaxException;
import java.io.IOException;
class HadoopTest {
    public static void main(String[] args) throws IOException, URISyntaxException {
        String url = "hdfs://10.2.206.148";
        FileSystem fs = FileSystem.get(new URI(url), new Configuration());
        System.out.println("get fs success!");
        RemoteIterator<LocatedFileStatus> iterator = fs.listFiles(new Path("/"), false);
        while (iterator.hasNext()) {
            LocatedFileStatus lfs = iterator.next();
            System.out.println(lfs.getPath().toString());
        }
        System.out.println("iteration finished");
    }
}
And here is the output:
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/admin/pengduo/hadoop_test/lib/logback-classic-1.2.3.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/admin/pengduo/hadoop_test/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [ch.qos.logback.classic.util.ContextSelectorStaticBinder]
10:49:26.019 [main] DEBUG org.apache.hadoop.util.Shell - setsid exited with exit code 0
10:49:26.064 [main] DEBUG org.apache.hadoop.metrics2.lib.MutableMetricsFactory - field org.apache.hadoop.metrics2.lib.MutableRate org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginSuccess with annotation @org.apache.hadoop.metrics2.annotation.Metric(always=false, sampleName=Ops, valueName=Time, about=, interval=10, type=DEFAULT, value=[Rate of successful kerberos logins and latency (milliseconds)])
10:49:26.069 [main] DEBUG org.apache.hadoop.metrics2.lib.MutableMetricsFactory - field org.apache.hadoop.metrics2.lib.MutableRate org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginFailure with annotation @org.apache.hadoop.metrics2.annotation.Metric(always=false, sampleName=Ops, valueName=Time, about=, interval=10, type=DEFAULT, value=[Rate of failed kerberos logins and latency (milliseconds)])
10:49:26.069 [main] DEBUG org.apache.hadoop.metrics2.lib.MutableMetricsFactory - field org.apache.hadoop.metrics2.lib.MutableRate org.apache.hadoop.security.UserGroupInformation$UgiMetrics.getGroups with annotation @org.apache.hadoop.metrics2.annotation.Metric(always=false, sampleName=Ops, valueName=Time, about=, interval=10, type=DEFAULT, value=[GetGroups])
10:49:26.070 [main] DEBUG org.apache.hadoop.metrics2.lib.MutableMetricsFactory - field private org.apache.hadoop.metrics2.lib.MutableGaugeLong org.apache.hadoop.security.UserGroupInformation$UgiMetrics.renewalFailuresTotal with annotation @org.apache.hadoop.metrics2.annotation.Metric(always=false, sampleName=Ops, valueName=Time, about=, interval=10, type=DEFAULT, value=[Renewal failures since startup])
10:49:26.070 [main] DEBUG org.apache.hadoop.metrics2.lib.MutableMetricsFactory - field private org.apache.hadoop.metrics2.lib.MutableGaugeInt org.apache.hadoop.security.UserGroupInformation$UgiMetrics.renewalFailures with annotation @org.apache.hadoop.metrics2.annotation.Metric(always=false, sampleName=Ops, valueName=Time, about=, interval=10, type=DEFAULT, value=[Renewal failures since last successful login])
10:49:26.071 [main] DEBUG org.apache.hadoop.metrics2.impl.MetricsSystemImpl - UgiMetrics, User and group related metrics
10:49:26.084 [main] DEBUG org.apache.hadoop.security.SecurityUtil - Setting hadoop.security.token.service.use_ip to true
10:49:26.096 [main] DEBUG org.apache.hadoop.security.Groups - Creating new Groups object
10:49:26.097 [main] DEBUG org.apache.hadoop.util.NativeCodeLoader - Trying to load the custom-built native-hadoop library...
10:49:26.097 [main] DEBUG org.apache.hadoop.util.NativeCodeLoader - Failed to load native-hadoop with error: java.lang.UnsatisfiedLinkError: no hadoop in java.library.path
10:49:26.097 [main] DEBUG org.apache.hadoop.util.NativeCodeLoader - java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
10:49:26.097 [main] WARN org.apache.hadoop.util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
10:49:26.098 [main] DEBUG org.apache.hadoop.util.PerformanceAdvisory - Falling back to shell based
10:49:26.098 [main] DEBUG org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback - Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping
10:49:26.153 [main] DEBUG org.apache.hadoop.security.Groups - Group mapping impl=org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback; cacheTimeout=300000; warningDeltaMs=5000
10:49:26.157 [main] DEBUG org.apache.hadoop.security.UserGroupInformation - hadoop login
10:49:26.158 [main] DEBUG org.apache.hadoop.security.UserGroupInformation - hadoop login commit
10:49:26.161 [main] DEBUG org.apache.hadoop.security.UserGroupInformation - using local user:UnixPrincipal: admin
10:49:26.161 [main] DEBUG org.apache.hadoop.security.UserGroupInformation - Using user: "UnixPrincipal: admin" with name admin
10:49:26.161 [main] DEBUG org.apache.hadoop.security.UserGroupInformation - User entry: "admin"
10:49:26.161 [main] DEBUG org.apache.hadoop.security.UserGroupInformation - UGI loginUser:admin (auth:SIMPLE)
log4j:WARN No appenders could be found for logger (org.apache.htrace.core.Tracer).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
10:49:26.201 [main] DEBUG org.apache.hadoop.fs.FileSystem - Loading filesystems
10:49:26.211 [main] DEBUG org.apache.hadoop.fs.FileSystem - file:// = class org.apache.hadoop.fs.LocalFileSystem from /home/admin/pengduo/hadoop_test/lib/hadoop-common-3.2.1.jar
10:49:26.216 [main] DEBUG org.apache.hadoop.fs.FileSystem - viewfs:// = class org.apache.hadoop.fs.viewfs.ViewFileSystem from /home/admin/pengduo/hadoop_test/lib/hadoop-common-3.2.1.jar
10:49:26.218 [main] DEBUG org.apache.hadoop.fs.FileSystem - har:// = class org.apache.hadoop.fs.HarFileSystem from /home/admin/pengduo/hadoop_test/lib/hadoop-common-3.2.1.jar
10:49:26.219 [main] DEBUG org.apache.hadoop.fs.FileSystem - http:// = class org.apache.hadoop.fs.http.HttpFileSystem from /home/admin/pengduo/hadoop_test/lib/hadoop-common-3.2.1.jar
10:49:26.219 [main] DEBUG org.apache.hadoop.fs.FileSystem - https:// = class org.apache.hadoop.fs.http.HttpsFileSystem from /home/admin/pengduo/hadoop_test/lib/hadoop-common-3.2.1.jar
10:49:26.226 [main] DEBUG org.apache.hadoop.fs.FileSystem - hdfs:// = class org.apache.hadoop.hdfs.DistributedFileSystem from /home/admin/pengduo/hadoop_test/lib/hadoop-hdfs-client-3.2.1.jar
10:49:26.233 [main] DEBUG org.apache.hadoop.fs.FileSystem - webhdfs:// = class org.apache.hadoop.hdfs.web.WebHdfsFileSystem from /home/admin/pengduo/hadoop_test/lib/hadoop-hdfs-client-3.2.1.jar
10:49:26.234 [main] DEBUG org.apache.hadoop.fs.FileSystem - swebhdfs:// = class org.apache.hadoop.hdfs.web.SWebHdfsFileSystem from /home/admin/pengduo/hadoop_test/lib/hadoop-hdfs-client-3.2.1.jar
10:49:26.234 [main] DEBUG org.apache.hadoop.fs.FileSystem - Looking for FS supporting hdfs
10:49:26.234 [main] DEBUG org.apache.hadoop.fs.FileSystem - looking for configuration option fs.hdfs.impl
10:49:26.251 [main] DEBUG org.apache.hadoop.fs.FileSystem - Looking in service filesystems for implementation class
10:49:26.251 [main] DEBUG org.apache.hadoop.fs.FileSystem - FS for hdfs is class org.apache.hadoop.hdfs.DistributedFileSystem
10:49:26.282 [main] DEBUG org.apache.hadoop.hdfs.client.impl.DfsClientConf - dfs.client.use.legacy.blockreader.local = false
10:49:26.282 [main] DEBUG org.apache.hadoop.hdfs.client.impl.DfsClientConf - dfs.client.read.shortcircuit = false
10:49:26.282 [main] DEBUG org.apache.hadoop.hdfs.client.impl.DfsClientConf - dfs.client.domain.socket.data.traffic = false
10:49:26.282 [main] DEBUG org.apache.hadoop.hdfs.client.impl.DfsClientConf - dfs.domain.socket.path =
10:49:26.291 [main] DEBUG org.apache.hadoop.hdfs.DFSClient - Sets dfs.client.block.write.replace-datanode-on-failure.min-replication to 0
10:49:26.297 [main] DEBUG org.apache.hadoop.io.retry.RetryUtils - multipleLinearRandomRetry = null
10:49:26.312 [main] DEBUG org.apache.hadoop.ipc.Server - rpcKind=RPC_PROTOCOL_BUFFER, rpcRequestWrapperClass=class org.apache.hadoop.ipc.ProtobufRpcEngine$RpcProtobufRequest, rpcInvoker=org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker@7c729a55
10:49:26.322 [main] DEBUG org.apache.hadoop.ipc.Client - getting client out of cache: org.apache.hadoop.ipc.Client@222545dc
10:49:26.587 [main] DEBUG org.apache.hadoop.util.PerformanceAdvisory - Both short-circuit local reads and UNIX domain socket are disabled.
10:49:26.593 [main] DEBUG org.apache.hadoop.hdfs.protocol.datatransfer.sasl.DataTransferSaslUtil - DataTransferProtocol not using SaslPropertiesResolver, no QOP found in configuration for dfs.data.transfer.protection
get fs success!
10:49:26.629 [main] DEBUG org.apache.hadoop.ipc.Client - The ping interval is 60000 ms.
10:49:26.631 [main] DEBUG org.apache.hadoop.ipc.Client - Connecting to /10.2.206.148:8020
10:49:26.658 [IPC Client (1923598304) connection to /10.2.206.148:8020 from admin] DEBUG org.apache.hadoop.ipc.Client - IPC Client (1923598304) connection to /10.2.206.148:8020 from admin: starting, having connections 1
10:49:26.660 [IPC Parameter Sending Thread #0] DEBUG org.apache.hadoop.ipc.Client - IPC Client (1923598304) connection to /10.2.206.148:8020 from admin sending #0 org.apache.hadoop.hdfs.protocol.ClientProtocol.getListing
10:49:26.666 [IPC Client (1923598304) connection to /10.2.206.148:8020 from admin] DEBUG org.apache.hadoop.ipc.Client - IPC Client (1923598304) connection to /10.2.206.148:8020 from admin got value #0
10:49:26.666 [main] DEBUG org.apache.hadoop.ipc.ProtobufRpcEngine - Call: getListing took 52ms
iteration finished
10:49:26.695 [shutdown-hook-0] DEBUG org.apache.hadoop.ipc.Client - stopping client from cache: org.apache.hadoop.ipc.Client@222545dc
10:49:26.695 [shutdown-hook-0] DEBUG org.apache.hadoop.ipc.Client - removing client from cache: org.apache.hadoop.ipc.Client@222545dc
10:49:26.695 [shutdown-hook-0] DEBUG org.apache.hadoop.ipc.Client - stopping actual client because no more references remain: org.apache.hadoop.ipc.Client@222545dc
10:49:26.695 [shutdown-hook-0] DEBUG org.apache.hadoop.ipc.Client - Stopping client
10:49:26.696 [IPC Client (1923598304) connection to /10.2.206.148:8020 from admin] DEBUG org.apache.hadoop.ipc.Client - IPC Client (1923598304) connection to /10.2.206.148:8020 from admin: closed
10:49:26.696 [IPC Client (1923598304) connection to /10.2.206.148:8020 from admin] DEBUG org.apache.hadoop.ipc.Client - IPC Client (1923598304) connection to /10.2.206.148:8020 from admin: stopped, remaining connections 0
10:49:26.797 [Thread-4] DEBUG org.apache.hadoop.util.ShutdownHookManager - Completed shutdown in 0.102 seconds; Timeouts: 0
10:49:26.808 [Thread-4] DEBUG org.apache.hadoop.util.ShutdownHookManager - ShutdownHookManger completed shutdown.
Note that we get the file system successfully, and the iteration executes with no errors.
However, listing the same directory with the hadoop fs command works fine:
$ $HADOOP_HOME/bin/hadoop fs -ls hdfs://10.2.206.148/
Warning: fs.defaultFs is not set when running "ls" command.
Found 4 items
drwxr-x--x - hadoop hadoop 0 2020-09-21 20:29 hdfs://10.2.206.148/apps
drwxr-x--x - hadoop hadoop 0 2021-07-08 10:44 hdfs://10.2.206.148/spark-history
drwxrwxrwt - root hadoop 0 2021-07-08 10:43 hdfs://10.2.206.148/tmp
drwxr-x--t - hadoop hadoop 0 2020-11-20 11:31 hdfs://10.2.206.148/user
I have set HADOOP_HOME appropriately.
My Hadoop library versions are all 3.2.1:
$ ll hadoop-*
-rw-r--r-- 1 admin admin 60258 Jul 8 10:42 hadoop-annotations-3.2.1.jar
-rw-r--r-- 1 admin admin 139109 Jul 8 10:42 hadoop-auth-3.2.1.jar
-rw-r--r-- 1 admin admin 44163 Jul 8 10:42 hadoop-client-3.2.1.jar
-rw-r--r-- 1 admin admin 4137520 Jul 8 10:42 hadoop-common-3.2.1.jar
-rw-r--r-- 1 admin admin 5959246 Jul 8 10:42 hadoop-hdfs-3.2.1.jar
-rw-r--r-- 1 admin admin 5094412 Jul 8 10:42 hadoop-hdfs-client-3.2.1.jar
-rw-r--r-- 1 admin admin 805845 Jul 8 10:42 hadoop-mapreduce-client-common-3.2.1.jar
-rw-r--r-- 1 admin admin 1657002 Jul 8 10:42 hadoop-mapreduce-client-core-3.2.1.jar
-rw-r--r-- 1 admin admin 85900 Jul 8 10:42 hadoop-mapreduce-client-jobclient-3.2.1.jar
-rw-r--r-- 1 admin admin 3287723 Jul 8 10:42 hadoop-yarn-api-3.2.1.jar
-rw-r--r-- 1 admin admin 322882 Jul 8 10:42 hadoop-yarn-client-3.2.1.jar
-rw-r--r-- 1 admin admin 2919779 Jul 8 10:42 hadoop-yarn-common-3.2.1.jar
I'm confused why the Java Hadoop client behaves differently from the Hadoop CLI, and how to make my Java program behave correctly. Can anyone help me? Many thanks!
I have figured this out myself. The problem is that I used FileSystem::listFiles. This method lists only the files (not the directories) under the given path, and I have only 4 directories there. To list all entries, both files and directories, under the given path, I should use FileSystem::listLocatedStatus instead of FileSystem::listFiles.
// this will list only the files but not the directories under "/"
// RemoteIterator<LocatedFileStatus> iterator = fs.listFiles(new Path("/"), false);
// this will list all entries including the files and the directories
RemoteIterator<LocatedFileStatus> iterator = fs.listLocatedStatus(new Path("/"));
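For completeness, here is a minimal sketch building on the snippet above that labels each entry, so files and directories are distinguishable (it assumes the fs instance from the question):

// Assumes the FileSystem instance `fs` created in the question above.
RemoteIterator<LocatedFileStatus> it = fs.listLocatedStatus(new Path("/"));
while (it.hasNext()) {
    // LocatedFileStatus extends FileStatus, so isDirectory() is available.
    LocatedFileStatus status = it.next();
    String kind = status.isDirectory() ? "dir " : "file";
    System.out.println(kind + " " + status.getPath());
}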
I'm using this guide to set up a standalone Spark master. But whenever I run the "sbin\start-master.sh" command, a new CMD window pops up and then disappears immediately, and no URL is printed.
I've done the initial setup and set all the environment variables properly, but it still doesn't seem to work. How can I fix this?
Edit: I also tried this example using IntelliJ. It shows the following without any other output:
[Thread-0] INFO org.eclipse.jetty.util.log - Logging initialized #568ms
[Thread-0] INFO spark.webserver.JettySparkServer - == Spark has ignited ...
[Thread-0] INFO spark.webserver.JettySparkServer - >> Listening on 0.0.0.0:4567
[Thread-0] INFO org.eclipse.jetty.server.Server - jetty-9.3.2.v20150730
[Thread-0] INFO org.eclipse.jetty.server.ServerConnector - Started ServerConnector#e92ccd{HTTP/1.1,[http/1.1]}{0.0.0.0:4567}
[Thread-0] INFO org.eclipse.jetty.server.Server - Started #942ms
Here are some of the messages from my IDEA log file. It appears that it is not able to connect to some Maven service on localhost. This used to work just fine for me with IDEA 14.
classworlds-2.5.1.jar" org.jetbrains.idea.maven.server.RemoteMavenServer
2016-01-27 20:03:44,054 [ 318261] INFO - ution.rmi.RemoteProcessSupport - Port/ID: 44435/Maven32ServerImplaa933e0f
2016-01-27 20:04:59,912 [ 394119] WARN - ution.rmi.RemoteProcessSupport - The cook failed to start due to java.net.ConnectException: Operation timed out
2016-01-27 20:05:44,776 [ 438983] WARN - ution.rmi.RemoteProcessSupport - java.rmi.NotBoundException: _DEAD_HAND_
2016-01-27 20:05:44,776 [ 438983] WARN - ution.rmi.RemoteProcessSupport - at sun.rmi.registry.RegistryImpl.lookup(RegistryImpl.java:166)
2016-01-27 20:05:44,776 [ 438983] WARN - ution.rmi.RemoteProcessSupport - at com.intellij.execution.rmi.RemoteServer.start(RemoteServer.java:88)
2016-01-27 20:05:44,776 [ 438983] WARN - ution.rmi.RemoteProcessSupport - at org.jetbrains.idea.maven.server.RemoteMavenServer.main(RemoteMavenServer.java:22)
2016-01-27 20:06:15,111 [ 469318] WARN - #org.jetbrains.idea.maven - Cannot open index /Users/xxxxx/Library/Caches/IdeaIC15/Maven/Indices/Index0
org.jetbrains.idea.maven.indices.MavenIndexException: Cannot open index /Users/xxxxx/Library/Caches/IdeaIC15/Maven/Indices/Index0
Update - 1
Found this bug report -> https://youtrack.jetbrains.com/issue/IDEA-147234, which says we should upgrade Maven to at least 3.3.3. I did, but I am still seeing the same problem.
Looks like it's related to /etc/hosts, as @MoizRaja mentioned. I suggest trying to comment out everything in /etc/hosts and leave just:
127.0.0.1 localhost
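If you want to verify the change before restarting IDEA, here is a quick sketch that checks "localhost" resolves to the loopback address (the Maven server connection above appears to go through localhost, so a bad /etc/hosts entry would break it):

import java.net.InetAddress;

// Sanity check: "localhost" should resolve to a loopback address (127.0.0.1)
// after editing /etc/hosts.
public class LocalhostCheck {
    public static void main(String[] args) throws Exception {
        InetAddress addr = InetAddress.getByName("localhost");
        System.out.println("localhost -> " + addr.getHostAddress());
        System.out.println("loopback?  " + addr.isLoopbackAddress());
    }
}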
I'm trying to use JsonLoader from the elephant-bird-pig package.
My script is simple:
register elephant-bird-pig-4.5.jar
register elephant-bird-hadoop-compat-4.5.jar
A = load '1_record_2.json' USING com.twitter.elephantbird.pig.load.JsonLoader('-nestedLoad');
DUMP A;
And I get an error:
2014-09-30 16:15:32,439 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
2014-09-30 16:15:32,447 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: UNKNOWN
2014-09-30 16:15:32,448 [main] INFO org.apache.pig.newplan.logical.optimizer.LogicalPlanOptimizer - {RULES_ENABLED=[AddForEach, ColumnMapKeyPrune, GroupByConstParallelSetter, LimitOptimizer, LoadTypeCastInserter, MergeFilter, MergeForEach, NewPartitionFilterOptimizer, PartitionFilterOptimizer, PushDownForEachFlatten, PushUpFilter, SplitFilter, StreamTypeCastInserter], RULES_DISABLED=[FilterLogicExpressionSimplifier]}
2014-09-30 16:15:32,449 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
2014-09-30 16:15:32,450 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2014-09-30 16:15:32,450 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
2014-09-30 16:15:32,464 [main] INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at hadoop1/10.242.8.131:8050
2014-09-30 16:15:32,466 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job
2014-09-30 16:15:32,466 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2014-09-30 16:15:32,467 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2998: Unhandled internal error. com/twitter/elephantbird/util/HadoopCompat
Details at logfile: pig_1412081068149.log
I don't know what is missing. Can you please suggest something?
File pig_1412081068149.log contains:
Pig Stack Trace
---------------
ERROR 2998: Unhandled internal error. com/twitter/elephantbird/util/HadoopCompat
java.lang.NoClassDefFoundError: com/twitter/elephantbird/util/HadoopCompat
at com.twitter.elephantbird.pig.load.LzoBaseLoadFunc.setLocation(LzoBaseLoadFunc.java:93)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.getJob(JobControlCompiler.java:477)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.compile(JobControlCompiler.java:298)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:191)
at org.apache.pig.PigServer.launchPlan(PigServer.java:1324)
at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1309)
at org.apache.pig.PigServer.storeEx(PigServer.java:980)
at org.apache.pig.PigServer.store(PigServer.java:944)
at org.apache.pig.PigServer.openIterator(PigServer.java:857)
at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:774) etc...
Which class is missing (java.lang.NoClassDefFoundError)? What other libraries should I add?
Thanks,
pawel
I've checked that the registered libraries exist and that the classes are inside them.
Everything looked fine, but I was still getting this error.
So I left the Pig shell and opened it once again, and now the same script works fine.
You missed registering two important jars:
register elephant-bird-core-4.5.jar
register json-simple-1.1.1.jar
Add these two to your pig script and everything should work fine.
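If you want to confirm from Java which class is actually missing before re-running the script, a minimal probe like the one below can help. Note that it only checks the client JVM's classpath (java -cp), which is separate from Pig's REGISTER list, so treat it as a rough diagnostic; the class name is taken from the stack trace above.

// Probe the classpath for the class named in the NoClassDefFoundError.
public class ClasspathProbe {
    public static void main(String[] args) {
        String name = "com.twitter.elephantbird.util.HadoopCompat";
        try {
            Class.forName(name);
            System.out.println(name + " is on the classpath");
        } catch (ClassNotFoundException e) {
            System.out.println(name + " is missing; add elephant-bird-core to the classpath");
        }
    }
}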
I'm having problems with a particular UDF jar and have so far been unable to figure out where to begin. I have a way of testing the jar from the command line, and it works. But if I REGISTER the jar in a Pig script, Pig fails to create a jar for the job. I can register other jars without trouble, and this jar was working until a few days ago. Here is the output when running the Pig script:
[michael#hadoop01 logitech-correlation]$ pig -f MatchWithClassifier.pig -param date=20130301 -param siteId=0
2013-05-10 11:20:30,523 [main] INFO org.apache.pig.Main - Apache Pig version 0.10.0-cdh4.1.2 (rexported) compiled Nov 01 2012, 18:38:58
2013-05-10 11:20:30,524 [main] INFO org.apache.pig.Main - Logging error messages to: /home/michael/correlation/pig_1368210030521.log
2013-05-10 11:20:30,981 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://hadoop01/
2013-05-10 11:20:31,346 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to map-reduce job tracker at: hadoop01.dev.terapeak.com:8021
2013-05-10 11:20:32,143 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: FILTER
2013-05-10 11:20:32,390 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
2013-05-10 11:20:32,422 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2013-05-10 11:20:32,422 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
2013-05-10 11:20:32,508 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job
2013-05-10 11:20:32,518 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2013-05-10 11:20:32,522 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - creating jar file Job5623238576559565298.jar
2013-05-10 11:20:36,398 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2017: Internal error creating job configuration.
Details at logfile: /home/michael/correlation/pig_1368210030521.log
The stack trace is below:
Pig Stack Trace
---------------
ERROR 2017: Internal error creating job configuration.
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobCreationException: ERROR 2017: Internal error creating job configuration.
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.getJob(JobControlCompiler.java:727)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.compile(JobControlCompiler.java:259)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:180)
at org.apache.pig.PigServer.launchPlan(PigServer.java:1275)
at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1260)
at org.apache.pig.PigServer.execute(PigServer.java:1250)
at org.apache.pig.PigServer.executeBatch(PigServer.java:362)
at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:132)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:193)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:165)
at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
at org.apache.pig.Main.run(Main.java:430)
at org.apache.pig.Main.main(Main.java:111)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:616)
at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
Caused by: java.util.zip.ZipException: invalid distance too far back
at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:164)
at java.util.zip.ZipInputStream.read(ZipInputStream.java:163)
at java.util.jar.JarInputStream.read(JarInputStream.java:194)
at java.io.FilterInputStream.read(FilterInputStream.java:107)
at org.apache.pig.impl.util.JarManager.addStream(JarManager.java:242)
at org.apache.pig.impl.util.JarManager.mergeJar(JarManager.java:216)
at org.apache.pig.impl.util.JarManager.mergeJar(JarManager.java:206)
at org.apache.pig.impl.util.JarManager.createJar(JarManager.java:126)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.getJob(JobControlCompiler.java:411)
... 17 more
================================================================================
Based on this, I would think the problem stems from the "java.util.zip.ZipException: invalid distance too far back" exception. Is Pig having some issue reading the jar?
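One way to test that theory outside of Pig is to decompress every entry of the suspect jar yourself; a damaged jar fails with the same ZipException. Below is a minimal sketch, roughly mirroring what Pig's JarManager does when merging registered jars (the jar path is passed as a command-line argument, not a known value):

import java.io.IOException;
import java.io.InputStream;
import java.util.Enumeration;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Read every entry of a jar to force full decompression; corrupt entries
// surface as ZipException/IOException here, just as they do inside Pig.
public class JarIntegrityCheck {
    public static void main(String[] args) throws IOException {
        try (JarFile jar = new JarFile(args[0])) {
            Enumeration<JarEntry> entries = jar.entries();
            byte[] buf = new byte[8192];
            while (entries.hasMoreElements()) {
                JarEntry entry = entries.nextElement();
                try (InputStream in = jar.getInputStream(entry)) {
                    while (in.read(buf) != -1) {
                        // drain the entry; we only care whether inflation succeeds
                    }
                } catch (IOException e) {
                    System.out.println("Corrupt entry " + entry.getName() + ": " + e);
                }
            }
            System.out.println("scan complete");
        }
    }
}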