Exceptions in accessing HDFS file system in Java

I tried accessing an HDFS file with the Java API using the following code:
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public static void main(String[] args) {
    Configuration conf = new Configuration();
    conf.addResource(new Path("/etc/hadoop/conf/core-site.xml"));
    conf.addResource(new Path("/etc/hadoop/conf/hdfs-site.xml"));
    try {
        Path path = new Path("hdfs://mycluster/user/mock/test.txt");
        FileSystem fs = FileSystem.get(path.toUri(), conf);
        if (fs.exists(path)) {
            FSDataInputStream inputStream = fs.open(path);
            // Process input stream ...
        } else {
            System.out.println("File does not exist");
        }
    } catch (IOException e) {
        System.out.println(e.getMessage());
    }
}
An exception occurred at FileSystem.get(path.toUri(), conf), saying "Couldn't create proxy provider class org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider", which is caused by "java.lang.NoClassDefFoundError: Could not initialize class org.apache.hadoop.security.Credentials".
I did not find much information about the error. Is the issue due to using the wrong API (org.apache.hadoop.hdfs instead of org.apache.hadoop.fs)?

1) Do you have the hadoop-hdfs-.jar available in your classpath?
2) How are you downloading the dependencies? Maven/manual/other?
3) Could you please provide the stack trace?
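Regarding question 1, one way to see whether the Hadoop classes are on the classpath at all, and which jar they are actually loaded from when several versions are present, is a small check like the one below. This is only a hypothetical diagnostic written for this answer (the class name is a placeholder, not from the original post):

import java.security.CodeSource;

public class ClasspathCheck {
    public static void main(String[] args) {
        try {
            // The NoClassDefFoundError above points at this class; print where it is loaded from.
            Class<?> c = Class.forName("org.apache.hadoop.security.Credentials");
            CodeSource src = c.getProtectionDomain().getCodeSource();
            System.out.println(src != null ? src.getLocation() : "loaded from the bootstrap classpath");
        } catch (ClassNotFoundException e) {
            System.out.println("hadoop-common is not on the classpath at all");
        }
    }
}

If the class resolves but the error persists, a "Could not initialize class" NoClassDefFoundError usually means the class's static initializer already failed earlier, which in Hadoop setups often comes down to mixed jar versions on the classpath.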

Related

Typo in word "hdfs" gives me: "java.io.IOException: No FileSystem for scheme: hdfs". Using FileSystem lib over hadoop 2.7.7

While using FileSystem.get(URI.create("hdfs://localhost:9000/"), configuration), when I try to run the code it gives me the IOException:
java.io.IOException: No FileSystem for scheme: hdfs
at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2658)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2665)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:93)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2701)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2683)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:372)
at com.oracle.hadoop.client.Test.main(Test.java:53)
I have already tried different ways of making the call to HDFS. I am using the libraries for Hadoop 2.7.7.
Here is my current code:
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.*;
import org.apache.hadoop.io.IOUtils;
import org.apache.log4j.BasicConfigurator;

public class Test {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        String uri = "hdfs://localhost:9000/";   // URI of the file/directory to read
        InputStream in = null;
        try {
            FileSystem fs = FileSystem.get(URI.create(uri), conf);
            in = fs.open(new Path(uri));
            IOUtils.copyBytes(in, System.out, 4096, false);
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            IOUtils.closeStream(in);
        }
    }
}
Actually, I just added this Maven dependency, http://mvnrepository.com/artifact/org.apache.hadoop/hadoop-hdfs/2.7.7, to my pom.xml and the problem was solved.
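If the dependency is present but "No FileSystem for scheme: hdfs" still appears (typically when building a shaded/fat jar that discards the META-INF/services entries from hadoop-hdfs), a commonly used workaround is to map the scheme to its implementation class explicitly. A minimal sketch, assuming Hadoop 2.7.x on the classpath; the class name and the file path inside it are placeholders:

import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

public class HdfsSchemeCheck {
    public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();
        // Map the "hdfs" scheme to its implementation class explicitly, in case the
        // service-loader file from hadoop-hdfs was lost during packaging.
        conf.set("fs.hdfs.impl", org.apache.hadoop.hdfs.DistributedFileSystem.class.getName());
        try (FileSystem fs = FileSystem.get(URI.create("hdfs://localhost:9000/"), conf)) {
            InputStream in = fs.open(new Path("/user/test/sample.txt")); // hypothetical file
            IOUtils.copyBytes(in, System.out, 4096, true);
        }
    }
}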

Exception in thread "main" java.lang.IllegalArgumentException: Wrong FS: expected: file:///

I am trying to implement the copyFromLocal command using Java; below is my code.
package com.hadoop;

import java.io.IOException;
import java.net.URI;
import java.net.URISyntaxException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CopyFromLocal {
    public static void main(String[] args) throws IOException, URISyntaxException {
        Configuration conf = new Configuration();
        conf.addResource(new Path("/usr/hdp/2.3.0.0-2557/hadoop/conf/core-site.xml"));
        conf.addResource(new Path("/usr/hdp/2.3.0.0-2557/hadoop/conf/mapred-site.xml"));
        conf.addResource(new Path("/usr/hdp/2.3.0.0-2557/hadoop/conf/hdfs-site.xml"));
        FileSystem fs = FileSystem.get(conf);
        Path sourcePath = new Path("/root/sample.txt");
        Path destPath = new Path("hdfs://sandbox.hortonworks.com:8020/user/Deepthy");
        if (!(fs.exists(destPath))) {
            System.out.println("No such destination exists: " + destPath);
            return;
        }
        fs.copyFromLocalFile(sourcePath, destPath);
    }
}
I get the following exception:
Exception in thread "main" java.lang.IllegalArgumentException: Wrong FS: hdfs://sandbox.hortonworks.com:8020/user/Deepthy, expected: file:///
at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:305)
at org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:47)
at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:357)
at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:245)
at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:643)
at com.amal.hadoop.CopyFromLocal.main(CopyFromLocal.java:27)
I added these jars to the classpath:
hadoop-0.20.1-core.jar
commons-logging-1.1.3.jar
Kindly suggest where I'm going wrong.
Change the configuration as below:
conf.set("fs.default.name", "hdfs://sandbox.hortonworks.com:8020");
Give a relative path for your destination destPath, like:
Path destPath = new Path("/user/Deepthy");
This will fix the issue.
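Putting both suggestions together, the copy logic from the question would look roughly like the sketch below (same sandbox host and paths as above; this is the answer's idea spelled out, not a tested program):

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CopyFromLocal {
    public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();
        // Point the default file system at the cluster instead of the local file system.
        conf.set("fs.default.name", "hdfs://sandbox.hortonworks.com:8020");
        FileSystem fs = FileSystem.get(conf);

        Path sourcePath = new Path("/root/sample.txt");
        Path destPath = new Path("/user/Deepthy");   // relative to the default FS configured above
        if (!fs.exists(destPath)) {
            System.out.println("No such destination exists: " + destPath);
            return;
        }
        fs.copyFromLocalFile(sourcePath, destPath);
    }
}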

Can a Jar File be updated programmatically without rewriting the whole file?

It is possible to update individual files in a JAR file using the jar command as follows:
jar uf TicTacToe.jar images/new.gif
Is there a way to do this programmatically?
I have to rewrite the entire jar file if I use JarOutputStream, so I was wondering if there was a similar "random access" way to do this. Given that it can be done using the jar tool, I had expected there to be a similar way to do it programmatically.
It is possible to update just parts of the JAR file using the Zip File System Provider available since Java 7:
import java.net.URI;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.HashMap;
import java.util.Map;
public class ZipFSPUser {
    public static void main(String[] args) throws Throwable {
        Map<String, String> env = new HashMap<>();
        env.put("create", "true");
        // locate file system by using the syntax
        // defined in java.net.JarURLConnection
        URI uri = URI.create("jar:file:/codeSamples/zipfs/zipfstest.zip");
        try (FileSystem zipfs = FileSystems.newFileSystem(uri, env)) {
            Path externalTxtFile = Paths.get("/codeSamples/zipfs/SomeTextFile.txt");
            Path pathInZipfile = zipfs.getPath("/SomeTextFile.txt");
            // copy a file into the zip file
            Files.copy(externalTxtFile, pathInZipfile,
                    StandardCopyOption.REPLACE_EXISTING);
        }
    }
}
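The same zip file system can also be used to add a brand-new entry instead of replacing an existing one. A small sketch reusing the archive URI from the example above (the entry name and content here are made up for illustration):

import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collections;

public class ZipFSAddEntry {
    public static void main(String[] args) throws Exception {
        // Same archive URI as in the example above.
        URI uri = URI.create("jar:file:/codeSamples/zipfs/zipfstest.zip");
        try (FileSystem zipfs = FileSystems.newFileSystem(uri, Collections.singletonMap("create", "true"))) {
            // Hypothetical new entry; parent directories inside the archive are created as needed.
            Path newEntry = zipfs.getPath("/notes/README.txt");
            Files.createDirectories(newEntry.getParent());
            Files.write(newEntry, "added without rewriting the whole archive".getBytes(StandardCharsets.UTF_8));
        }
    }
}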
Yes, if you use this open-source library (https://truevfs.java.net) you can modify it in this way as well:
// The imports below assume TrueVFS's client API (the truevfs-access module).
import java.io.File;
import java.io.IOException;
import java.io.Writer;
import net.java.truevfs.access.TFile;
import net.java.truevfs.access.TFileWriter;

public static void main(String[] args) throws IOException {
    // A TFile path may point inside an archive and is treated like an ordinary file.
    File entry = new TFile("c:/tru6413/server/lib/nxps.jar/dir/second.txt");
    Writer writer = new TFileWriter(entry);
    try {
        writer.write(" this is writing into a file inside an archive");
    } finally {
        writer.close();
    }
}

How to create a directory on a Hadoop machine running on a VM using the Hadoop API

I have a Java client program that creates a directory, but when I execute the program it creates the directory on my local machine, even though I have configured fs.defaultFS with the VM's URL, matching core-site.xml.
Here is the sample program that creates the directory:
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class Mkdir {
    public static void main(String[] ar) throws IOException {
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://testing:8020");
        FileSystem fileSystem = FileSystem.get(conf);
        Path path = new Path("/user/newuser");
        fileSystem.mkdirs(path);
        fileSystem.close();
    }
}
Add these two files in your code:
Configuration conf = new Configuration();
conf.addResource(new Path("/home/user17/BigData/hadoop/core-site.xml"));
conf.addResource(new Path("/home/user17/BigData/hadoop/hdfs-site.xml"));
FileSystem fileSystem = FileSystem.get(conf);
Give the paths according to your system.
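Putting the answer together with the original program, a complete sketch could look like this (the configuration-file locations are the ones from the answer; adjust them to your installation):

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class Mkdir {
    public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();
        // Load the cluster's client configuration so the program talks to HDFS,
        // not the local file system (paths are examples; adjust to your system).
        conf.addResource(new Path("/home/user17/BigData/hadoop/core-site.xml"));
        conf.addResource(new Path("/home/user17/BigData/hadoop/hdfs-site.xml"));

        FileSystem fileSystem = FileSystem.get(conf);
        Path path = new Path("/user/newuser");
        System.out.println("mkdirs returned: " + fileSystem.mkdirs(path));
        fileSystem.close();
    }
}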

Error in reading a txt file on HDFS and copying/writing the content of it into a newly created file on LOCAL filesystem

I am trying to read a file on HDFS and copy its content into a newly created local file using the following Java program. FYI, I have installed a Hadoop single-node cluster on my machine.
HdfsCli.java
package com;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
public class HdfsCli {
    public void readFile(String file) throws IOException {
        Configuration conf = new Configuration();
        String hadoopConfPath = "/opt/hadoop/etc/hadoop/";
        conf.addResource(new Path(hadoopConfPath + "core-site.xml"));
        conf.addResource(new Path(hadoopConfPath + "hdfs-site.xml"));
        conf.addResource(new Path(hadoopConfPath + "mapred-site.xml"));
        FileSystem fileSystem = FileSystem.get(conf);
        // For the join type of queries, output file in the HDFS has 'r' in it.
        // String type="r";
        Path path = new Path(file);
        if (!fileSystem.exists(path)) {
            System.out.println("File " + file + " does not exist");
            return;
        }
        FSDataInputStream in = fileSystem.open(path);
        String filename = file.substring(file.lastIndexOf('/') + 1,
                file.length());
        OutputStream out = new BufferedOutputStream(new FileOutputStream(
                new File("/home/DAS_Pig/" + filename)));
        byte[] b = new byte[1024];
        int numBytes = 0;
        while ((numBytes = in.read(b)) > 0) {
            out.write(b, 0, numBytes);
        }
        conf.clear();
        in.close();
        out.close();
        fileSystem.close();
    }

    public static void main(String[] args) throws IOException {
        HdfsCli hc = new HdfsCli();
        hc.readFile("hdfs://localhost:9000//DasData//salaries.txt");
        System.out.println("Successfully Done!");
    }
}
However, when I run this code, I get the following error:
Exception in thread "main" org.apache.hadoop.ipc.RemoteException: Server IPC version 9 cannot communicate with client version 4
at org.apache.hadoop.ipc.Client.call(Client.java:1066)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
at com.sun.proxy.$Proxy1.getProtocolVersion(Unknown Source)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:396)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:379)
at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:119)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:238)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:203)
at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:89)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1386)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1404)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:123)
at com.HDFSClient.readFile(HDFSClient.java:22)
at com.HdfsCli.main(HdfsCli.java:57)
I am a newbie in Hadoop development. Can anyone guide me in resolving this?
Thank you!
Server and client versions are different: "Server IPC version 9" corresponds to a Hadoop 2.x server, while "client version 4" comes from much older (0.20.x/1.x era) client jars. You have to upgrade the client classpath libraries to match the server version.
Delete all the current Hadoop jar dependencies included in your project, download a newer version of Hadoop, and configure the build path of your project with the new jars.
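After updating the classpath, a quick way to confirm which Hadoop version the client code actually picks up is to print the client's own version string. A small diagnostic sketch (not part of the original answer; the class name is a placeholder):

import org.apache.hadoop.util.VersionInfo;

public class ClientVersion {
    public static void main(String[] args) {
        // Prints the version of the Hadoop client libraries found on the classpath;
        // it should match (or be wire-compatible with) the version running on the cluster.
        System.out.println("Hadoop client version: " + VersionInfo.getVersion());
        System.out.println("Built from revision:   " + VersionInfo.getRevision());
    }
}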
