About transferring files in HDFS - Java

I need to transfer files from one HDFS folder to another HDFS folder in Java code.
May I ask, is there an API we can call to transfer files among HDFS paths?
Also I'd like to ask, is there any way to invoke a MapReduce job from Java code? Of course, this Java code is not running in HDFS.
Thank you very much and have a great weekend!

May I ask, is there an API we can call to transfer files among HDFS paths?
Use the o.a.h.hdfs.DistributedFileSystem#rename method to move a file from one folder in HDFS to another. The method is overloaded, and one of the overloads takes Options.Rename as a parameter.
FYI ... I haven't checked the code, but I think that a rename only involves changes to the namespace and not any actual block movement.
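A minimal sketch of such a move via the generic FileSystem API (which DistributedFileSystem implements); the fs.defaultFS value and paths here are illustrative:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Assumes core-site.xml/hdfs-site.xml are on the classpath, or set fs.defaultFS explicitly.
Configuration conf = new Configuration();
conf.set("fs.defaultFS", "hdfs://namenode:8020");

FileSystem fs = FileSystem.get(conf);
// rename() moves the file by updating the namespace; it returns false if the move fails.
boolean moved = fs.rename(new Path("/data/incoming/part-00000"),
                          new Path("/data/archive/part-00000"));
fs.close();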
Also I'd like to ask, is there any way to invoke a MapReduce job from Java code? Of course, this Java code is not running in HDFS.
Hadoop is written in Java, so there should be a way :) Use the o.a.h.mapreduce.Job#submit and o.a.h.mapreduce.Job#waitForCompletion methods.
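A minimal sketch of configuring and launching a job from a plain Java client, assuming the Hadoop client jars and cluster configuration are on the classpath; the mapper/reducer class names and paths are illustrative:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

Configuration conf = new Configuration();
Job job = Job.getInstance(conf, "word-count");
job.setJarByClass(WordCountMapper.class);     // hypothetical mapper class
job.setMapperClass(WordCountMapper.class);
job.setReducerClass(WordCountReducer.class);  // hypothetical reducer class
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(IntWritable.class);
FileInputFormat.addInputPath(job, new Path("/data/in"));
FileOutputFormat.setOutputPath(job, new Path("/data/out"));
boolean ok = job.waitForCompletion(true);     // or job.submit() to return immediately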

Related

Embedding model files to jar in Tensorflow Java

I want to embed a pre-trained model in JAR file, and later use it in prediction using Tensorflow's Java API. My Tensorflow version is 1.12.0.
I have tensorflow model exported in Python using tf.saved_model.simple_save. This saves the PB file and variables to an export directory. I can successfully load this model in Java using SavedModelBundle.load and run predictions as long as the export directory is locally available.
Now I want to use my predictor in a restricted/remote environment as a custom function for Aster Analytics database. This database takes a jar file, and runs a user defined function during SQL calls.
My problem is that SavedModelBundle loads from a local directory and does not utilize file streams or similar methods to read resources in a JAR file. As this database has access restrictions, I can not create a permanent local directory and move the exported files there. Similarly, I can not call Tensorflow Serving or any other RPC/REST calls to an external server.
I don't want to create a temp directory and copy exported directory there at each function call (as it may leave left-over directories, and there could be simultaneous calls creating race conditions etc.)
Is there any efficient way to deliver the model in the JAR file and then read it? I'm hoping there is a way to use SavedModelBundle to read JAR resources. Alternatively, I'm hoping that I can read the model's graph using a file stream and then initialize variables using files included in the JAR. However, I do not know how to do either.
I appreciate any suggestions and/or directions.
Thanks in advance.
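For reference, a minimal sketch of the local-directory loading described above, assuming TensorFlow Java 1.12; the export path and tensor names are illustrative:

import org.tensorflow.SavedModelBundle;
import org.tensorflow.Tensor;

// Works only while the export directory is available on the local file system.
try (SavedModelBundle bundle = SavedModelBundle.load("/path/to/export_dir", "serve");
     Tensor<?> input = Tensor.create(new float[][] {{1f, 2f, 3f}})) {
    try (Tensor<?> output = bundle.session().runner()
            .feed("input_tensor", input)
            .fetch("output_tensor")
            .run()
            .get(0)) {
        // copy the prediction out of 'output' here
    }
}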

Writing mapreduce output directly onto a webpage

I have a MapReduce job which writes its output to a file in HDFS. But instead of writing it to HDFS, I want the output to be written directly to a webpage. I have created a web project in Eclipse and written the driver, mapper and reducer classes in it. When I ran it with the Tomcat server, it didn't work.
So how can the output be displayed on a webpage?
If you are using the MapR distribution, you can write the output of your MapReduce job to the regular file system (not HDFS), but fixing your issue will require more info.
HDFS (by itself) is not really designed for low-latency random reads/writes. A few options you do have, however, are WebHDFS / HttpFS, which expose a REST API to HDFS: http://archive.cloudera.com/cdh4/cdh/4/hadoop-2.0.0-cdh4.6.0/hadoop-project-dist/hadoop-hdfs/WebHDFS.html and http://hadoop.apache.org/docs/r2.4.1/hadoop-hdfs-httpfs/. You could have the webserver pull whatever file you want and serve it on the webpage. I don't think this is a very good solution, however.
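A rough sketch of a web tier pulling a result file over WebHDFS; the host, port and path are illustrative, and the OPEN operation answers with a redirect to a datanode:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

URL url = new URL("http://namenode:50070/webhdfs/v1/user/out/part-r-00000?op=OPEN");
HttpURLConnection conn = (HttpURLConnection) url.openConnection();
try (BufferedReader in = new BufferedReader(new InputStreamReader(conn.getInputStream(), "UTF-8"))) {
    String line;
    while ((line = in.readLine()) != null) {
        // write each line into the servlet/JSP response instead of printing it
        System.out.println(line);
    }
}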
A better solution might be to have MapReduce output to HBase (http://hbase.apache.org/) and have your webserver pull from HBase. It is far better suited for low-latency random read / writes.
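A minimal sketch of the web tier reading one result row back out of HBase; the table, column family, qualifier and row key are illustrative:

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

try (Connection conn = ConnectionFactory.createConnection(HBaseConfiguration.create());
     Table table = conn.getTable(TableName.valueOf("mr_results"))) {
    Result row = table.get(new Get(Bytes.toBytes("some-row-key")));
    String value = Bytes.toString(row.getValue(Bytes.toBytes("cf"), Bytes.toBytes("count")));
    // hand 'value' to the servlet/JSP response here
}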

Java encryption on-the-fly by virtual drive

I have made an encryption system and I'm looking for a way to integrate it with the file system on Windows (and, if possible, also on Linux). I don't want to start a debate about whether it is needed, whether it already exists, etc.
I was hoping to find a way to mount a virtual disk drive that can access the files in decrypted form, encrypting and decrypting on the fly using my software. It is currently written in Java, but if needed I can port it to C++.
I have found one way to do it, which is to run a Java SSH server and use another piece of software to mount it in Windows, but it doesn't work well: there are constant crashes, or it sometimes just doesn't mount the drive.
I need this because I want to access the files using an IDE and other programs without copying them, as copying decreases security and doubles the disk space used.
Has anyone found a way to do this, preferably in Java?
Is there some kind of API for it (all I need is list files, get parent, read file, write file)?
Or is there a good Java lib that works with another program to do it?
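For what it's worth, the four operations listed above boil down to a tiny interface. This is purely a hypothetical sketch of the needed surface, with invented names, not an existing mounting API:

import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.List;

// Hypothetical: names are invented for illustration only.
public interface EncryptedVolume {
    List<String> listFiles(String path) throws IOException;  // list entries in a directory
    String getParent(String path);                            // parent directory of an entry
    InputStream readFile(String path) throws IOException;     // decrypts on the fly while reading
    OutputStream writeFile(String path) throws IOException;   // encrypts on the fly while writing
}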

how to write into a text file in Java

I am doing a project in Java, and in it I need to add to and modify a text file at runtime; the file is packaged in the JAR.
I am using class.getResourceAsStream(filename); with this method I can read that file from the classpath.
I want to write into the same text file.
What is a possible solution for this?
If I can't update the text file in the JAR, what other solution is there?
I appreciate any help.
The easiest solution here is to not put the file in the jar. It sounds like you are putting files in your jar so that your user only needs to worry about one file that contains everything related to the program. This is an artificial constraint and just adds headaches.
There is a simple solution that still allows you to distribute just the jar file. At start-up, attempt to read the file from the file system. If you don't find it, use default values that are encoded in your program. Then, when changes are made, write them to the file system.
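A minimal sketch of that approach, assuming a plain text file; the file name and default lines are illustrative:

import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

Path dataFile = Paths.get("myapp-data.txt");
List<String> lines;
if (Files.exists(dataFile)) {
    lines = new ArrayList<>(Files.readAllLines(dataFile, StandardCharsets.UTF_8));
} else {
    // defaults baked into the program, used until the file exists
    lines = new ArrayList<>(Arrays.asList("default line 1", "default line 2"));
}
// ... modify 'lines' at runtime, then persist the changes ...
Files.write(dataFile, lines, StandardCharsets.UTF_8);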
In general, you can't update a file that you located using getResourceAsStream. It might be a file in a JAR/ZIP file ... and writing it would entail rewriting the entire JAR file. It might be a remote file served up by a URL classloader.
For your sanity (and good practice), you should not attempt to update files that you access via the classpath. If you need to, read the file out of the JAR file (or whatever), copy it into the regular file system, and then update the copy.
I'm not saying that it is impossible to do this in all cases. Indeed, in most normal cases you can do it with some effort. However, this is not supported, and there are no standard APIs for doing this.
Furthermore, attempts to update resources are liable to cause anomalies in the classloader. For example, I'd expect resources in JAR files to not update (from the perspective of the application) until the application restarted. But resources in exploded JAR files probably would update ... though new resources might not show up.
Finally, there are cases where updating a resource is impossible:
When the user doesn't have write access to the application's installation directory. This is typical for a properly administered UNIX / Linux machine.
When the JAR file is fetched from a remote server, you are likely not to be able to write the updates back.
When you are using an arbitrary custom classloader, you've got no way of knowing where the actual bytes of an updated resource should be stored, and no way of storing them.
All JAR rewriting techniques in Java look similar: open the JAR file, read all of its contents, and write a new JAR file containing the unmodified contents plus the modifications you wished to make. Such techniques are not advisable for a JAR file on the classpath, much less a JAR file you're running from.
If you decide you must do it this way, Java World has a few articles:
Modifying Archives, Part 1
Modifying Archives, Part 2
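If you really must go down that road, the technique described above looks roughly like this with java.util.zip; the file and entry names are illustrative, and again this is not advisable for the jar you are running from:

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

try (ZipInputStream in = new ZipInputStream(new FileInputStream("app.jar"));
     ZipOutputStream out = new ZipOutputStream(new FileOutputStream("app-updated.jar"))) {
    byte[] buffer = new byte[8192];
    ZipEntry entry;
    while ((entry = in.getNextEntry()) != null) {
        out.putNextEntry(new ZipEntry(entry.getName()));
        if (entry.getName().equals("config/data.txt")) {
            out.write("new contents\n".getBytes("UTF-8"));   // the one modified entry
        } else {
            int read;
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);                   // copy the entry unchanged
            }
        }
        out.closeEntry();
    }
}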
A good solution that avoids the need to put your items into a Jar file is to read (if present) a properties file out of a hidden subdirectory in the user's home directory. The logic looks a bit like this:
// Names of the hidden directory and the properties file are illustrative.
File hiddenDir = new File(System.getProperty("user.home"), ".myapplication");
File propsFile = new File(hiddenDir, "app.properties");
if (!hiddenDir.exists()) {
    hiddenDir.mkdirs();
    Properties defaults = new Properties();
    defaults.setProperty("someKey", "someDefaultValue");
    try (OutputStream out = new FileOutputStream(propsFile)) {
        defaults.store(out, "Default settings");   // write the default properties file
    }
}
Properties appProps = new Properties();
try (InputStream in = new FileInputStream(propsFile)) {
    appProps.load(in);
}
// ... after the appProps have changed ...
try (OutputStream out = new FileOutputStream(propsFile)) {
    appProps.store(out, "Do not modify this file");
}
Look at java.util.Properties, and keep in mind that it has two different load and store formats (key=value based and XML based). Pick the one that suits you best.
If I can't update the text file in the JAR, what other solution is there?
Store the information in any of:
Cookies
The server
Deploy the applet using 1.6.0_10+, launch it using JWS and use the PersistenceService to store the information. Here is my demo of the PersistenceService.
Also, if your users will agree to a trusted applet (which seems overkill for this), you might write the information to a sub-directory of user.home.
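A rough sketch of the PersistenceService approach mentioned above, assuming the app is launched via JWS; the file name and contents are illustrative, and checked exceptions are left to the enclosing method:

import java.io.OutputStream;
import java.net.URL;
import javax.jnlp.BasicService;
import javax.jnlp.FileContents;
import javax.jnlp.PersistenceService;
import javax.jnlp.ServiceManager;

BasicService bs = (BasicService) ServiceManager.lookup("javax.jnlp.BasicService");
PersistenceService ps = (PersistenceService) ServiceManager.lookup("javax.jnlp.PersistenceService");
URL settings = new URL(bs.getCodeBase(), "settings.txt");
try {
    ps.create(settings, 4096);                 // reserve up to 4 KB the first time
} catch (java.io.IOException alreadyExists) {
    // already created on a previous run
}
FileContents contents = ps.get(settings);
try (OutputStream out = contents.getOutputStream(true)) {  // true = overwrite
    out.write("some=value".getBytes("UTF-8"));
}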

How to move files from one drive to another drive in java

I am trying to move a file using Java from one folder to another folder; however, the folders are on different hard drives, so the renameTo method fails. I only need this feature to work on Linux...
-jason
Moving files between different file systems requires you to copy them. The vanilla java.io.File API doesn't have a method to do that, so you'll have to do it yourself (e.g. by using FileInputStream / FileOutputStream).
Also check out this thread.
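A minimal sketch of the copy-then-delete approach with streams; the paths are illustrative, and on Java 7+ java.nio.file.Files.move can typically do the same in one call:

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

File source = new File("/mnt/drive1/data.bin");
File target = new File("/mnt/drive2/data.bin");
try (InputStream in = new FileInputStream(source);
     OutputStream out = new FileOutputStream(target)) {
    byte[] buffer = new byte[8192];
    int read;
    while ((read = in.read(buffer)) != -1) {
        out.write(buffer, 0, read);
    }
}
if (!source.delete()) {
    throw new IOException("Copied " + source + " but could not delete the original");
}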
If you want to move files between different file systems: copy and delete.
Apache Commons IO FileUtils
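For example (FileUtils.moveFile falls back to copy-and-delete when a rename across file systems is not possible; the paths are illustrative):

import java.io.File;
import org.apache.commons.io.FileUtils;

FileUtils.moveFile(new File("/mnt/drive1/data.bin"), new File("/mnt/drive2/data.bin"));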
