HDFS copyToLocalFile throws java.io.IOException: Mkdirs failed to create file - java

I am trying to copy a file from HDFS to the local Linux file system using the Hadoop FileSystem class.
I have permission to create folders in the destination path; I verified this with the mkdir command.
The shell command hadoop fs -copyToLocal hdfsFilePath localFilePath also works.
I am running this on a YARN cluster.
I tried the approaches below (code follows the error log), but I keep getting java.io.IOException: Mkdirs failed to create file:/home/user.
Error log:
16/01/14 01:09:36 ERROR util.FileUtil:
java.io.IOException: Mkdirs failed to create /home/user (exists=false, cwd=file:/hdfs4/yarn/nm/usercache/user/appcache/application_1452126203792_8862/container_e2457_1452126203792_8862_01_000001)
at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:442)
at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:428)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:908)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:889)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:786)
at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:365)
at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:338)
at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:289)
at org.apache.hadoop.fs.FileSystem.copyToLocalFile(FileSystem.java:1970)
at org.apache.hadoop.fs.FileSystem.copyToLocalFile(FileSystem.java:1939)
at org.apache.hadoop.fs.FileSystem.copyToLocalFile(FileSystem.java:1915)
at com.batch.util.FileUtil.copyToLocalFileSystem(FileUtil.java:66)
at com.batch.dao.impl.DaoImpl.writeFile(DaoImpl.java:108)
at com.batch.JobDriver.runJob(JobDriver.java:79)
at com.batch.JobDriver.main(JobDriver.java:54)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:480)
Note that I am passing localFilePath as /home/user/test, yet the error complains it failed to create file:/home/user, the parent directory. The two calls I tried:
fs.copyToLocalFile(hdfsFilePath, localFilePath);
fs.copyToLocalFile(false, hdfsFilePath, localFilePath, true);
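(For reference: the second call is the four-argument overload copyToLocalFile(boolean delSrc, Path src, Path dst, boolean useRawLocalFileSystem). Passing true as the last argument makes Hadoop use RawLocalFileSystem instead of the ChecksumFileSystem visible at the top of the stack trace, so no .crc files are written.)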

I faced the same thing this week. The problem was that I was deploying the job in cluster mode, so the machine where the driver actually ran did not have that directory. Is it possible you are deploying the job in cluster mode? If so, try deploying it in client mode (the output directory still has to exist, though).
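If you need to stay in cluster mode, here is a minimal sketch of a workaround (not from the original answer; paths mirror the ones in the question): pre-create the local parent directory on whichever node the driver lands on before copying.

import java.io.File;
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CopyToLocal {
    public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf); // HDFS, assuming fs.defaultFS points at it

        Path hdfsFilePath = new Path("/data/output/part-00000"); // hypothetical source
        Path localFilePath = new Path("/home/user/test");        // destination from the question

        // Pre-create /home/user so the local mkdirs inside the copy cannot fail,
        // even when the driver runs inside a YARN container with a different cwd.
        new File(localFilePath.toString()).getParentFile().mkdirs();

        // delSrc=false keeps the HDFS copy; useRawLocalFileSystem=true skips
        // ChecksumFileSystem and its .crc sidecar files.
        fs.copyToLocalFile(false, hdfsFilePath, localFilePath, true);
    }
}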

For anyone looking for this exact error, but maybe not from YARN:
I had this exact error when trying to run org.apache.hadoop.fs.FileSystem.copyToLocalFile on my local (Mac) machine, with local FS configured using the job.local.dir attribute.
This was the exception:
java.io.IOException: Mkdirs failed to create file:/User/yossiv/algo-resources/AWS/QuerySearchEngine.blacklistVersionFile (exists=false, cwd=file:/Users/yossiv/git/c2s-algo)
at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:456)
at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:441)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:928)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:909)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:806)
at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:368)
at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:341)
at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:292)
at org.apache.hadoop.fs.FileSystem.copyToLocalFile(FileSystem.java:2066)
at org.apache.hadoop.fs.FileSystem.copyToLocalFile(FileSystem.java:2035)
at org.apache.hadoop.fs.FileSystem.copyToLocalFile(FileSystem.java:2011)
What fixed it was to change job.local.dir to be under the current directory, which is listed in the exception text after cwd=; in my case that's /Users/yossiv/git/c2s-algo.
I broke my head over this for two days; hope this helps someone.
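A minimal sketch of that fix, assuming the cwd from the exception above (the subdirectory names are hypothetical):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class LocalDirFix {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Keep job.local.dir under the process cwd reported after "cwd=".
        conf.set("job.local.dir", "/Users/yossiv/git/c2s-algo/algo-resources");
        FileSystem fs = FileSystem.get(conf);
        fs.copyToLocalFile(new Path("/some/hdfs/file"),                     // hypothetical source
                new Path("/Users/yossiv/git/c2s-algo/algo-resources/out")); // under cwd
    }
}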

Related

Cannot open server.jar file

So I was trying to set up a 1.17.1 Minecraft server on my Mac. I couldn't open my 1.17.1_server.jar with Java 8, so I downloaded Java 16.0.2.
Unfortunately, every time I opened the 1.17.1_server.jar file, I got
"The Java JAR file "1.17._server.jar" could not be launched.".
I first thought that it was because the file was being launched by Java 8 instead of 16.
So I went into the terminal and ran: <path to java> -jar 1.17.1_server.jar
I then got this: Error: Unable to access jarfile 1.17.1_server.jar
Finally I tried to put the full path of the jar file in the command.
So I ran: <path to java> -jar <path to server>
and got this:
[main/ERROR]: Failed to load properties from file: server.properties
[15:57:35] [main/WARN]: Failed to load eula.txt
[15:57:35] [main/INFO]: You need to agree to the EULA in order to run the server. Go to eula.txt for more info.
So why do I have to agree to the EULA if I've never launched it? Does it think that it has already been launched?
As stated in the error message
[15:57:35] [main/INFO]: You need to agree to the EULA in order to run the server. Go to eula.txt for more info.
you have to open eula.txt and agree to the EULA. The server writes eula.txt on its very first start attempt, which is why the file already exists even though the server never finished launching.
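Concretely (assuming a standard Mojang server eula.txt), change the line
eula=false
to
eula=true
and start the server again.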
Okay, found the solution: my eula.txt and server.properties files were in my user folder, not sure why (I never moved them).

Can't create a Hadoop sequence file on a local file system

I found this example of how to write to a local file system, but it throws this exception:
Exception in thread "main" java.io.IOException: (null) entry in command string: null chmod 0644 C:\temp\test.seq
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:770)
at org.apache.hadoop.util.Shell.execCommand(Shell.java:866)
at org.apache.hadoop.util.Shell.execCommand(Shell.java:849)
at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:733)
at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.<init>(RawLocalFileSystem.java:225)
at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.<init>(RawLocalFileSystem.java:209)
at org.apache.hadoop.fs.RawLocalFileSystem.createOutputStreamWithMode(RawLocalFileSystem.java:307)
at org.apache.hadoop.fs.RawLocalFileSystem.create(RawLocalFileSystem.java:296)
at org.apache.hadoop.fs.RawLocalFileSystem.create(RawLocalFileSystem.java:328)
at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSOutputSummer.<init>(ChecksumFileSystem.java:398)
at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:461)
at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:440)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:911)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:892)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:789)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:778)
at org.apache.hadoop.io.SequenceFile$Writer.<init>(SequenceFile.java:1168)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source)
at java.lang.reflect.Constructor.newInstance(Unknown Source)
I'm running this on a Windows 10 box. I even tried using the msys Git Bash shell, thinking that might help the JVM simulate a chmod operation, but it didn't change anything. Any suggestions on how to do this on Windows?
I faced this error too, and it was resolved after the following steps. (Note: I am using Spark 2.0.2 and Hadoop 2.7.)
Verify whether you are getting "java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries." You can check by running the spark-shell command.
I got the above-mentioned error because I had not added HADOOP_HOME to the environment variables. After adding HADOOP_HOME (in my case the same as SPARK_HOME), the issue was resolved.
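If you'd rather not set environment variables, a hedged alternative: Hadoop's Shell utility also honors the hadoop.home.dir JVM system property, so setting it at the very top of your driver should have the same effect (the path below is hypothetical):

// Must run before the first Hadoop/Spark filesystem call;
// winutils.exe is expected under this directory's bin\ folder.
System.setProperty("hadoop.home.dir", "C:\\hadoop-2.7.3");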
Running a Hadoop program using only jars on Windows requires a few steps beyond just referencing the jars.
Credit to Professor Lu at University of Helsinki for posting a Hadoop on Windows guide for his students.
Here is a rundown of steps I had to take using Windows 10 and Hadoop 2.7.3:
Download and extract Hadoop binaries to somewhere like C:\hadoop-2.7.3.
Download patch files from https://github.com/srccodes/hadoop-common-2.2.0-bin/archive/master.zip and extract them to your %HADOOP_HOME%\bin directory.
Set a HADOOP_HOME environment variable. For example, C:\hadoop-2.7.3.
Download the Hadoop source code, copy hadoop-common-project\hadoop-common\src\main\java\org\apache\hadoop\io\nativeio\NativeIO.java to your project, and modify line 609 from
return access0(path, desiredAccess.accessRight());
to
return true;
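For context, a hedged sketch of what that edit looks like: in Hadoop 2.7.3 the call sits inside NativeIO.java's Windows.access() method (signature reproduced from memory, so verify against your source copy):

public static boolean access(String path, AccessRight desiredAccess)
    throws IOException {
  // Original line 609 delegated to the native winutils-backed check:
  // return access0(path, desiredAccess.accessRight());
  // Returning true skips the Windows permission probe entirely.
  return true;
}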
One other solution is as follows.
In the Project Structure dialog (IntelliJ), under SDKs, ensure there is no other version of Hadoop referenced. In my case I had been running Spark earlier, and it was referring to Hadoop JARs which caused access issues. Once I removed them and ran the MR job, it ran fine.

Running a script in Apache Storm

Is there a way to simply run a Python script in Apache Storm?
I'm trying to figure out how to use Storm to run scripts but am having trouble. It seems like I need to create a Java program to call the script and use it as a bolt, but I simply want to send a very basic Python script to Storm to see if it is possible.
I read that the following command is helpful for submitting topologies to Storm, but I am having trouble understanding the syntax and whether I can submit arbitrary Python code or whether it needs a specific structure.
Can someone clarify whether or not I can submit any Python script to Storm and, if so, what the following line of code means?
storm shell resources/ python topology.py arg1 arg2
When I try to submit a basic Python script using the above command, I get the following output:
956 [main] INFO backtype.storm.StormSubmitter - Uploading topology jar stormshell8691441.jar to assigned location: /home/scix3/apache/storm/data/nimbus/inbox/stormjar-ae0739f9-7c93-4f00-a02b-c4eceba3b005.jar
966 [main] INFO backtype.storm.StormSubmitter - Successfully uploaded topology jar to assigned location: /home/scix3/apache/storm/data/nimbus/inbox/stormjar-ae0739f9-7c93-4f00-a02b-c4eceba3b005.jar
Exception in thread "main" java.io.IOException: Cannot run program "simple.py" (in directory "."): error=2, No such file or directory
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1047)
at java.lang.Runtime.exec(Runtime.java:617)
at org.apache.commons.exec.launcher.Java13CommandLauncher.exec(Java13CommandLauncher.java:58)
at org.apache.commons.exec.DefaultExecutor.launch(DefaultExecutor.java:254)
at org.apache.commons.exec.DefaultExecutor.executeInternal(DefaultExecutor.java:319)
at org.apache.commons.exec.DefaultExecutor.execute(DefaultExecutor.java:160)
at org.apache.commons.exec.DefaultExecutor.execute(DefaultExecutor.java:147)
at backtype.storm.util$exec_command_BANG_.invoke(util.clj:386)
at backtype.storm.command.shell_submission$_main.doInvoke(shell_submission.clj:29)
at clojure.lang.RestFn.applyTo(RestFn.java:139)
at backtype.storm.command.shell_submission.main(Unknown Source)
Caused by: java.io.IOException: error=2, No such file or directory
at java.lang.UNIXProcess.forkAndExec(Native Method)
at java.lang.UNIXProcess.<init>(UNIXProcess.java:186)
at java.lang.ProcessImpl.start(ProcessImpl.java:130)
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1028)
... 10 more
The exact command I'm using (possibly incorrect) is storm shell resources/ simple.py
simple.py is merely a print 'Hello, world' script.
I'm using Storm version 0.9.4.
Yes, you can run Python on Storm. In fact you can run code written in just about any language on a Storm cluster; it's just a matter of implementing the multilang API.
However, there are some requirements for that to work, and as far as I can tell those requirements are not spelled out in the Storm documentation. The fastest path to getting up and running is to take the splitsentence.py example from the Storm source and build on it.
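To make those requirements concrete, here is a hedged sketch of the usual pattern, modeled on the word-count example in storm-starter for Storm 0.9.x: the Python script is not submitted on its own. A thin Java shell component declares it, the script lives under resources/ inside the topology jar, and it must speak the multilang protocol (e.g. via the storm.py helper module):

import backtype.storm.task.ShellBolt;
import backtype.storm.topology.IRichBolt;
import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.tuple.Fields;
import java.util.Map;

// Java wrapper that launches the Python script as a multilang subprocess.
public class SplitSentence extends ShellBolt implements IRichBolt {
    public SplitSentence() {
        super("python", "splitsentence.py"); // script packaged under resources/
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("word"));
    }

    @Override
    public Map<String, Object> getComponentConfiguration() {
        return null;
    }
}

As far as I can tell, this is also why the trace above ends in Cannot run program "simple.py": storm shell simply executes the command you hand it as a local process after uploading resources/, so a bare script that is not executable and multilang-aware fails immediately.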
Try pyleus (https://github.com/Yelp/pyleus) or streamparse (https://github.com/Parsely/streamparse); I recommend pyleus as it is simpler.

Running MapReduce job written in Java through my PHP web page

My PHP server is hosted on the JobTracker machine, and I am trying to run a MapReduce job from my web page by exec-ing the hadoop jar command on the command line,
but I get no response and the job does not start.
However, if I run a command that lists HDFS using the same approach, it runs fine. Please guide me.
The following command returns nothing and the job does not run:
exec("HADOOP_DIR/bin/hadoop jar /usr/local/MapReduce.jar Mapreduce [input Path] [output Path]");
But if I do this:
exec("HADOOP_DIR/bin/hadoop dfs -ls /user/hadoop");
It is running fine.
I solved this problem by changing the PHP server user to hduser (a user which has permission to write files in HDFS). Without this change, only the commands which read from HDFS were working, not the ones which need to create files or write to HDFS.
When I tried to run the command for creating a directory in HDFS through my PHP script, I got the following error in my PHP server logs (/var/log/apache2/error.log):
mkdir: org.apache.hadoop.security.AccessControlException: Permission denied: user=www-data, access=WRITE, inode="hduser":hduser:supergroup:rwxr-xr-x
And on running the jar command to trigger the MapReduce program, I got the following error:
Exception in thread "main" java.io.IOException: Permission denied
at java.io.UnixFileSystem.createFileExclusively(Native Method)
at java.io.File.createTempFile(File.java:1879)
at org.apache.hadoop.util.RunJar.main(RunJar.java:115)
Then what I did was change the User directive in /etc/apache2/apache2.conf to my Hadoop user (hduser) and restart the server, and everything worked fine.
I should reference the "Execute hadoop jar from PHP Server fails. Permission denied" post, which helped me a lot in solving this problem. I hope this post helps others too.

Running a simple MapReduce job

I am running a simple MapReduce job and am getting the following error:
Exception in thread "main" java.io.IOException: Error opening job jar: Test.jar
at org.apache.hadoop.util.RunJar.main(RunJar.java:90)
Caused by: java.util.zip.ZipException: error in opening zip file
at java.util.zip.ZipFile.open(Native Method)
at java.util.zip.ZipFile.<init>(ZipFile.java:114)
at java.util.jar.JarFile.<init>(JarFile.java:133)
at java.util.jar.JarFile.<init>(JarFile.java:70)
at org.apache.hadoop.util.RunJar.main(RunJar.java:88)
Some details of the problem:
My Hadoop version is 0.20.
I have set new JobConf(Statecount.class), where Statecount.class is the class from which I am running this job. What do I have to do to resolve this error?
Can anyone help me?
Thanks.
Check that the Hadoop user (usually 'hadoop') has permission to read this file.
Sometimes Hadoop needs certain files to be on HDFS and not on your local file system.
Are you trying to run a jar named Test.jar via the Java program RunJar?
If so, please remember that any local path used can only be on the name node.
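Since the root cause is the java.util.zip.ZipException, one quick hedged sanity check (not from the original answers) is to open the jar with java.util.zip.ZipFile yourself; if this small program throws, the jar is corrupt or was not copied intact to the machine you submit from:

import java.io.IOException;
import java.util.zip.ZipFile;

public class JarCheck {
    public static void main(String[] args) throws IOException {
        // Throws the same ZipException RunJar hits if the file is
        // truncated, an HTML error page, or otherwise not a valid zip.
        try (ZipFile jar = new ZipFile(args[0])) {
            System.out.println(args[0] + " is a valid jar with "
                    + jar.size() + " entries");
        }
    }
}

Run it as java JarCheck Test.jar on the same machine and with the same path you pass to hadoop jar.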
