"chmod" hadoop path from Java - java

Basically, I am automating some shell commands (these commands include Hadoop shell commands) using Java code. Currently I run the following commands in bash:
hadoop fs -mkdir path/to/folder
hadoop fs -chmod a+w path/to/folder
Everything works fine. Now, when trying to use Java code to perform the same actions:
org.apache.hadoop.fs.FileSystem.mkdirs(new Path("path/to/folder"), new FsPermission(FsAction.ALL, FsAction.ALL, FsAction.ALL))
Unfortunately, this method:
public void setPermission(Path p, FsPermission permission) throws IOException
{
}
is not implemented (i.e. its body is empty) in Hadoop 2.6.0 through 2.8.0.
My question: how can I add read/write permissions to a Hadoop path using Java code?

First of all, you might want to cross-check the results of your analysis. If you look here, for example, you will find that FileSystem is actually an abstract class. So it wouldn't surprise me if the specific subclass that actually gets instantiated at some point overrides that empty setPermission() method, based on the underlying OS for example.
In any case, there is a simple but ugly workaround: use ProcessBuilder and run
hadoop fs -chmod a+w path/to/folder
from within Java, and write down:
// TODO: check with next version of hadoop if fs.FileSystem.setPermission() is now implemented
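A minimal sketch of that workaround, assuming the hadoop executable is on the PATH (the class name and error handling are illustrative, not part of the original answer):
import java.io.IOException;

public class HadoopChmodWorkaround {
    public static void main(String[] args) throws IOException, InterruptedException {
        // TODO: check with the next version of hadoop whether fs.FileSystem.setPermission() is implemented
        ProcessBuilder pb = new ProcessBuilder("hadoop", "fs", "-chmod", "a+w", "path/to/folder");
        pb.inheritIO();                        // forward the hadoop CLI's stdout/stderr
        Process process = pb.start();
        int exitCode = process.waitFor();      // wait for the shell command to finish
        if (exitCode != 0) {
            throw new IOException("hadoop fs -chmod failed with exit code " + exitCode);
        }
    }
}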

Related

URI is not hierarchical. How to get File Path using getResourceAsStream

private void generateDATFiles() throws Exception {
    File shellScriptPath = new File(this.getClass().getResource("/Vorlagen/Simulation/test.sh").toURI());
    ProcessBuilder pb = new ProcessBuilder(shellScriptPath.getAbsolutePath());
    Process p = pb.start();
}
So I have a shell script that I want to execute. The problem is that I need the file path, and while I can get it using getResource, I get an error saying that my URI is not hierarchical. I found out that I need to use getResourceAsStream to avoid the error, but my question is: how can I get the file path using getResourceAsStream?
Unfortunately it won't be an easy thing to do. If you pack the .sh script together with the rest of the program in a single .jar, it won't work. You can only access it as a resource stream and not as a URI (even though in development mode you can get an actual URI). That's because the .sh, the class files and everything else are actually in the same file as far as the file system is concerned (the .jar).
It's not so much a Java limitation as an OS one. If the .sh is bundled in the jar/war/any other archive, you cannot run it from Java code. (Actually, you cannot do it from the command prompt either.)
To solve it, you can get the input stream and write its contents to a temporary file (you can use Java's createTempFile functionality) and then execute that one. Or you can extract the .sh file from the jar (zip) and execute it.
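A minimal sketch of the temp-file approach (the resource path is taken from the question; the class name is illustrative and a POSIX file system is assumed for the permission call):
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.attribute.PosixFilePermission;
import java.util.EnumSet;

public class RunBundledScript {
    public static void main(String[] args) throws Exception {
        // copy the bundled script out of the jar into a temporary file
        Path tempScript = Files.createTempFile("test", ".sh");
        try (InputStream in = RunBundledScript.class.getResourceAsStream("/Vorlagen/Simulation/test.sh")) {
            Files.copy(in, tempScript, StandardCopyOption.REPLACE_EXISTING);
        }
        // make the copy executable, then run it
        Files.setPosixFilePermissions(tempScript, EnumSet.of(
                PosixFilePermission.OWNER_READ,
                PosixFilePermission.OWNER_WRITE,
                PosixFilePermission.OWNER_EXECUTE));
        Process p = new ProcessBuilder(tempScript.toAbsolutePath().toString()).inheritIO().start();
        p.waitFor();
    }
}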
Try it this way:
class J {
    public static void main(String[] args) {
        System.out.println(J.class.getResourceAsStream("/file.txt"));
    }
}

Is there a way for a Java app to gain root permissions?

When running Files.walk(Paths.get("/var/")).count() as an unprivileged user, the execution might throw an exception as there are folders inside /var/ that need root permission to be traversed.
I am not looking for a way to execute a bash command as root (e.g. sudo find /var), using Process, etc.
I just want to make sure Files.walk(Paths.get("/var/")).count() does not throw an AccessDeniedException:
Exception in thread "restartedMain" java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0
at sun.reflect.NativeMethodAccessorImpl.invoke
at sun.reflect.DelegatingMethodAccessorImpl.invoke
at java.lang.reflect.Method.invoke
at org.springframework.boot.devtools.restart.RestartLauncher.run
Caused by: java.io.UncheckedIOException: java.nio.file.AccessDeniedException: /var/cache/httpd
at java.nio.file.FileTreeIterator.fetchNextIfNeeded
at java.nio.file.FileTreeIterator.hasNext
at java.util.Iterator.forEachRemaining
at java.util.Spliterators$IteratorSpliterator.forEachRemaining
at java.util.stream.AbstractPipeline.copyInto
at java.util.stream.AbstractPipeline.wrapAndCopyInto
at java.util.stream.ReduceOps$ReduceOp.evaluateSequential
at java.util.stream.AbstractPipeline.evaluate
at java.util.stream.LongPipeline.reduce
at java.util.stream.LongPipeline.sum
at java.util.stream.ReferencePipeline.count
at com.example.DemoApplication.main
... 5 more
Caused by: java.nio.file.AccessDeniedException: /var/cache/httpd
at sun.nio.fs.UnixException.translateToIOException
at sun.nio.fs.UnixException.rethrowAsIOException
at sun.nio.fs.UnixException.rethrowAsIOException
at sun.nio.fs.UnixFileSystemProvider.newDirectoryStream
at java.nio.file.Files.newDirectoryStream
at java.nio.file.FileTreeWalker.visit
at java.nio.file.FileTreeWalker.next
at java.nio.file.FileTreeIterator.fetchNextIfNeeded
This is just an example. Using filter(...), it is possible to work around the exception, but this example can be expanded to other use cases too.
So, in short: is it possible at all for CLI, JavaFX, etc. apps to gain root permissions after they have been launched from the command line via something like java -jar app.jar?
If what you actually want is to skip the paths you have no access to, there are two approaches:
Streams
The answer to this question explains how to obtain a stream of all the files of a subtree that you can access.
FileVisitor
Using a FileVisitor adds a lot of code, but grants you much more flexibility when walking directory trees. To solve the same problem you can replace Files.walk() with:
Files.walkFileTree(Path start, FileVisitor<? super Path> visitor);
extending SimpleFileVisitor (to count the files) and overriding some methods.
You can:
Override the visitFileFailed method to handle the case where you cannot access a file for some reason (Lukasz_Plawny's advice);
(optional) Override the preVisitDirectory method, checking for permissions before accessing the directory: if you can't access it, you can simply skip its subtree (keep in mind that you may be able to access a directory, but not all of its files);
e.g. 1
@Override
public FileVisitResult visitFileFailed(Path file, IOException exc) {
    // you can log the exception 'exc'
    return FileVisitResult.SKIP_SUBTREE;
}
e.g. 2
@Override
public FileVisitResult preVisitDirectory(Path dir, BasicFileAttributes attrs) {
    if (!Files.isReadable(dir)) {
        return FileVisitResult.SKIP_SUBTREE;
    }
    return FileVisitResult.CONTINUE;
}
FileVisitor docs
FileVisitor tutorial
Hope it helps.
Just a few completely untested ideas:
1) Run your app with root privileges to begin with:
sudo java -jar myapp.jar
2) Let your app start a launcher-class that requests root permissions and then continues running the rest of your app:
java -jar myapp.jar
This in turn does execute a shell command, but only an xterm that prompts for the root password and then continues running a Java program with root permissions:
xterm -e "sudo sh -c 'java -jar /tmp/myrootapp.jar'"
or perhaps something nicer-looking using gksudo. Mind the ' and ".
Maybe myapp.jar extracts itself into a temporary directory. myapp.jar contains myrootapp.jar, so it can launch it as described above. /tmp should of course be retrieved from within Java, and should preferably be a directory with a random name that only the user running myapp.jar has access to, in order to prevent myrootapp.jar injection.
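An equally untested sketch of what such a launcher class might do, reusing the xterm command above (the class name and jar locations are placeholders; the extraction step is elided):
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class Launcher {
    public static void main(String[] args) throws IOException, InterruptedException {
        // a private temporary directory (random name, only accessible to the current user)
        Path tempDir = Files.createTempDirectory("myapp");
        Path rootJar = tempDir.resolve("myrootapp.jar");
        // ... extract myrootapp.jar from the running jar into rootJar here ...

        // prompt for the root password in an xterm and run the extracted jar as root
        Process p = new ProcessBuilder(
                "xterm", "-e", "sudo sh -c 'java -jar " + rootJar + "'")
                .inheritIO()
                .start();
        p.waitFor();
    }
}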
Cross-platform
You mentioned /var/ yourself, so I assumed you were on some sort of Linux. If this is supposed to work cross-platform, e.g. on Macintosh or Microsoft Windows too, you need to do some sort of system identification first. Then you can apply StrategyPattern in code to handle the various ways of letting myrootapp.jar obtain root or administrator permissions.
There is no easy way to change permissions; Java is not good at these tasks. There are only tricks, such as checking permissions on start and trying to change them via su/sudo and then restarting the application, or using Java-gnome. Please read a bit more here: Java: Ask root privileges on Ubuntu

Finding total file descriptors throws exception

I'm trying to find the total number of open file descriptors and found that the Sigar API allows getting that information. However, when I try the following:
Sigar sigar = new Sigar();
sigar.getProcFd(<pid>);
with <pid> replaced by an actual process id, it throws the following exception:
org.hyperic.sigar.SigarNotImplementedException: This method has not been implemented on this platform
at org.hyperic.sigar.SigarNotImplementedException.<clinit>(SigarNotImplementedException.java:28)
at org.hyperic.sigar.ProcFd.gather(Native Method)
at org.hyperic.sigar.ProcFd.fetch(ProcFd.java:30)
at org.hyperic.sigar.Sigar.getProcFd(Sigar.java:531)
From the exception it's clear that the native method gather() hasn't been implemented/isn't available on my OS (Mac OS X). How do I fix this? I tried adding "libsigar-universal64-macosx.dylib" to the classpath, but with no luck.
Also, I tried creating ProcFd like below instead of getting it from sigar:
ProcFd proc = new ProcFd();
System.out.println("Total FD: " + proc.getTotal());
In this case the output is always 0. Based on the API doc, it looks like it should provide the total number of open file descriptors (http://cpansearch.perl.org/src/DOUGM/hyperic-sigar-1.6.3-src/docs/javadoc/org/hyperic/sigar/ProcFd.html). I'm not sure if it returns 0 for the same reason as above, i.e. the missing implementation for my OS. Is that correct?
Also, why is it that a ProcFd obtained via sigar.getProcFd() throws the above-mentioned exception, but one created via ProcFd proc = new ProcFd() does not, yet proc.getTotal() always returns 0?
I ended up using lsof in a shell script instead of the Sigar library. I never got this to work on Mac; I tried on Linux and it worked without any issues.
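For reference, a minimal sketch of the lsof route driven from Java (the pid and class name are placeholders; the output parsing is deliberately simple and counts every lsof entry, not only plain file descriptors):
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

public class OpenFdCounter {
    public static void main(String[] args) throws IOException, InterruptedException {
        String pid = "12345";   // placeholder: the process id to inspect
        Process p = new ProcessBuilder("lsof", "-p", pid).start();
        long count = 0;
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(p.getInputStream()))) {
            // skip the header line, count the remaining entries
            String line = reader.readLine();
            while ((line = reader.readLine()) != null) {
                count++;
            }
        }
        p.waitFor();
        System.out.println("Open file descriptors: " + count);
    }
}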
The answer is in the documentation (http://cpansearch.perl.org/src/DOUGM/hyperic-sigar-1.6.3-src/docs/javadoc/org/hyperic/sigar/ProcFd.html), and as per your finding: OSX is not supported.
getTotal
public long getTotal()
Get the Total number of open file descriptors.
Supported Platforms: AIX, HPUX, Linux, Solaris, Win32.
System equivalent commands:
AIX: lsof
Darwin: lsof
FreeBSD: lsof
HPUX: lsof
Linux: lsof
Solaris: lsof
Win32:
Returns:
Total number of open file descriptors

InvalidInputException When loading file into Hbase MapReduce

I am very new to Hadoop and MapReduce. To get started, I ran the Word Count program, which executed well. But when I tried loading a csv file into an HTable, following [Csv File][1], it threw the following error, which I do not understand:
12/09/07 05:47:31 ERROR security.UserGroupInformation: PriviledgedActionException as:hduser cause:org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: hdfs://HadoopMaster:54310/user/hduser/csvtable
This error is really killing my time; can anyone please help me with this exception?
[1]: http://salsahpc.indiana.edu/ScienceCloud/hbase_hands_on_1.htm#shell_exercises
The problem of your job resolving the path to hdfs://HadoopMaster:54310/user/hduser/csvtable instead of csvtable can be fixed as follows:
1) Add your HBase jars to the Hadoop classpath, because your MapReduce job is not configured with the HBase jars by default.
2) Go to hadoop-env.sh, edit HADOOP_CLASSPATH, and add all your HBase jars to it. Hopefully it will work now.
Your job is attempting to read an input file from:
hdfs://HadoopMaster:54310/user/hduser/csvtable
You should verify that this file exists on HDFS using the Hadoop shell tools:
hadoop fs -ls /user/hduser/csvtable
My guess is that your file hasn't been loaded onto HDFS.
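If it is indeed missing, the file first has to be uploaded to HDFS. A minimal sketch using the Hadoop Java API (the class name and the local/HDFS paths are placeholders, and the cluster configuration is assumed to be on the classpath):
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class UploadToHdfs {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();   // picks up core-site.xml from the classpath
        FileSystem fs = FileSystem.get(conf);
        // copy the local csv file to the input path the job expects
        fs.copyFromLocalFile(new Path("/local/path/csvtable"), new Path("/user/hduser/csvtable"));
        fs.close();
    }
}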

How to use Sqoop in Java Program?

I know how to use Sqoop through the command line, but I don't know how to call a Sqoop command from a Java program.
Can anyone give a code example?
You can run Sqoop from inside your Java code by including the Sqoop jar in your classpath and calling the Sqoop.runTool() method. You have to create the required parameters for Sqoop programmatically, as if they were command-line arguments (e.g. --connect etc.).
Please pay attention to the following:
Make sure that the sqoop tool name (e.g. import/export etc.) is the first parameter.
Pay attention to classpath ordering - The execution might fail because sqoop requires version X of a library and you use a different version. Ensure that the libraries that sqoop requires are not overshadowed by your own dependencies. I've encountered such a problem with commons-io (sqoop requires v1.4) and had a NoSuchMethod exception since I was using commons-io v1.2.
Each argument needs to be on a separate array element. For example, "--connect jdbc:mysql:..." should be passed as two separate elements in the array, not one.
The Sqoop parser knows how to accept double-quoted parameters, so use double quotes if you need to (I suggest always). The only exception is the fields-terminated-by parameter, which expects a single character, so don't double-quote it.
I'd suggest splitting the command-line-arguments creation logic and the actual execution so your logic can be tested properly without actually running the tool.
It would be better to use the --hadoop-home parameter, in order to prevent dependency on the environment.
The advantage of Sqoop.runTool() as opposed to Sqoop.main() is the fact that runTool() returns the error code of the execution.
Hope that helps.
final int ret = Sqoop.runTool(new String[] { ... });
if (ret != 0) {
    throw new RuntimeException("Sqoop failed - return code " + Integer.toString(ret));
}
RL
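To illustrate the points above, the argument array might be built like this (a sketch; the connection details, table name and target directory are placeholders):
// Each option and its value go into separate array elements; the tool name comes first.
String[] sqoopArgs = new String[] {
    "import",
    "--connect", "jdbc:mysql://HOSTNAME:PORT/DATABASE_NAME",
    "--username", "USERNAME",
    "--password", "PASSWORD",
    "--table", "TABLE_NAME",
    "--target-dir", "/user/hduser/TABLE_NAME"
};
final int ret = Sqoop.runTool(sqoopArgs);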
Below is sample code for using Sqoop in a Java program to import data from MySQL to HDFS/HBase. Make sure you have the Sqoop jar in your classpath:
SqoopOptions options = new SqoopOptions();
options.setConnectString("jdbc:mysql://HOSTNAME:PORT/DATABASE_NAME");
//options.setTableName("TABLE_NAME");
//options.setWhereClause("id>10"); // this where clause works when importing whole table, ie when setTableName() is used
options.setUsername("USERNAME");
options.setPassword("PASSWORD");
//options.setDirectMode(true); // Make sure the direct mode is off when importing data to HBase
options.setNumMappers(8); // Default value is 4
options.setSqlQuery("SELECT * FROM user_logs WHERE $CONDITIONS limit 10");
options.setSplitByCol("log_id");
// HBase options
options.setHBaseTable("HBASE_TABLE_NAME");
options.setHBaseColFamily("colFamily");
options.setCreateHBaseTable(true); // Create HBase table, if it does not exist
options.setHBaseRowKeyColumn("log_id");
int ret = new ImportTool().run(options);
As suggested by Harel, we can use the return value of the run() method for error handling. Hope this helps.
There is a trick which worked out pretty well for me: via SSH, you can execute the Sqoop command directly. All you have to use is an SSH Java library.
This is independent of Java. You just need to include an SSH library, and Sqoop must be installed on the remote system where you want to perform the import. Connect to that system via SSH and execute the commands which will export data from MySQL to Hive.
You have to follow these steps:
Download the sshxcute Java library: https://code.google.com/p/sshxcute/
and add it to the build path of your Java project, which contains the following Java code:
import net.neoremind.sshxcute.core.SSHExec;
import net.neoremind.sshxcute.core.ConnBean;
import net.neoremind.sshxcute.task.CustomTask;
import net.neoremind.sshxcute.task.impl.ExecCommand;

public class TestSSH {
    public static void main(String args[]) throws Exception {
        // Initialize a ConnBean object; the parameter list is IP, username, password
        ConnBean cb = new ConnBean("192.168.56.102", "root", "hadoop");
        // Put the ConnBean instance as parameter for SSHExec static method getInstance(ConnBean) to retrieve a singleton SSHExec instance
        SSHExec ssh = SSHExec.getInstance(cb);
        // Connect to server
        ssh.connect();
        CustomTask sampleTask1 = new ExecCommand("echo $SSH_CLIENT"); // print the client IP by which you connected to the ssh server on the Horton Sandbox
        System.out.println(ssh.exec(sampleTask1));
        CustomTask sampleTask2 = new ExecCommand("sqoop import --connect jdbc:mysql://192.168.56.101:3316/mysql_db_name --username=mysql_user --password=mysql_pwd --table mysql_table_name --hive-import -m 1 -- --schema default");
        ssh.exec(sampleTask2);
        ssh.disconnect();
    }
}
If you know the location of the executable and the command-line arguments, you can use a ProcessBuilder; this then runs as a separate Process that Java can monitor for completion and exit code.
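A minimal sketch of that approach (untested; the sqoop binary location, connection parameters and class name are placeholders):
import java.io.IOException;

public class SqoopProcessExample {
    public static void main(String[] args) throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(
                "/usr/bin/sqoop", "import",
                "--connect", "jdbc:mysql://HOSTNAME:PORT/DATABASE_NAME",
                "--username", "USERNAME",
                "--password", "PASSWORD",
                "--table", "TABLE_NAME");
        pb.inheritIO();                       // forward sqoop's output to this process
        Process p = pb.start();
        int exitCode = p.waitFor();           // block until sqoop finishes
        if (exitCode != 0) {
            throw new RuntimeException("sqoop failed with exit code " + exitCode);
        }
    }
}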
Please follow the code given by Vikas; it worked for me. Include these jar files in the classpath and import these packages:
import com.cloudera.sqoop.SqoopOptions;
import com.cloudera.sqoop.tool.ImportTool;
Ref Libraries
sqoop-1.4.4.jar /sqoop
ojdbc6.jar /sqoop/lib (for oracle)
commons-logging-1.1.1.jar hadoop/lib
hadoop-core-1.2.1.jar /hadoop
commons-cli-1.2.jar hadoop/lib
commons-io-2.1.jar hadoop/lib
commons-configuration-1.6.jar hadoop/lib
commons-lang-2.4.jar hadoop/lib
jackson-core-asl-1.8.8.jar hadoop/lib
jackson-mapper-asl-1.8.8.jar hadoop/lib
commons-httpclient-3.0.1.jar hadoop/lib
JRE system library
1. resources.jar jdk/jre/lib
2.rt.jar jdk/jre/lib
3. jsse.jar jdk/jre/lib
4. jce.jar jdk/jre/lib
5. charsets.jar jdk/jre/lib
6. jfr.jar jdk/jre/lib
7. dnsns.jar jdk/jre/lib/ext
8. sunec.jar jdk/jre/lib/ext
9. zipfs.jar jdk/jre/lib/ext
10. sunpkcs11.jar jdk/jre/lib/ext
11. localedata.jar jdk/jre/lib/ext
12. sunjce_provider.jar jdk/jre/lib/ext
Sometimes you get errors if your Eclipse project uses JDK 1.6 while the libraries you add were built with JDK 1.7; in that case, configure the JRE when creating the project in Eclipse.
Vikas, if I want to put the imported files into Hive, should I use options.parameter("--hive-import")?
