Java File Object for a File in ADLS Gen2

I have a tool that works for on-premises data upload. Basically, it reads a file from the local system (on-premises: Linux or Windows) and sends it over to a location.
It makes use of the Java File class, e.g. new File("/dir/file.txt").
I want to use the same code for input files on ADLS Gen2. I would be running the code on Azure Databricks, and I am stuck getting a File object for files in ADLS Gen2. I am using the wasbs protocol to build the File object, but it comes back as null because Java does not recognize the directory structure.

If this tool uses local file access, you can still use it by mounting the ADLS container into DBFS at some location, such as /mnt/<mount-point>, and then accessing that mount through the local path /dbfs/<mount-point> (e.g. /dbfs/mnt/...).
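A minimal sketch of what that looks like from Java, assuming the container has already been mounted (for example with dbutils.fs.mount in a notebook) at the hypothetical mount point /mnt/adls:

import java.io.File;

public class AdlsLocalRead {
    public static void main(String[] args) {
        // Databricks exposes DBFS mounts on the driver's local filesystem
        // under /dbfs, so the plain java.io.File API works against that path.
        File f = new File("/dbfs/mnt/adls/dir/file.txt"); // hypothetical path
        System.out.println("exists=" + f.exists()
                + (f.exists() ? ", size=" + f.length() : ""));
    }
}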

Related

Kafka Configure PKCS12 `ssl.keystore.location=user.p12` without access to local file system

I can successfully connect to an SSL-secured Kafka cluster with the following client properties:
security.protocol=SSL
ssl.truststore.type=PKCS12
ssl.truststore.location=ca.p12
ssl.truststore.password=<redacted>
ssl.keystore.type=PKCS12
ssl.keystore.location=user.p12
ssl.keystore.password=<redacted>
However, I’m writing a Java app that runs in a managed cloud environment where I don’t have access to the file system, so I can’t just give it a local file path to the .p12 files.
Are there any other alternatives, like loading from S3, from memory, or from a JVM classpath resource?
Specifically, this is a Flink app running on Amazon's Kinesis Analytics managed Flink cluster service.
Sure, you can download whatever you want from wherever you want before you hand a Properties object to a KafkaConsumer. However, the user running the Java process will need some access to the local filesystem in order to save the downloaded files.
I think packaging the files as part of your application JAR makes more sense; however, I don't know an easy way to refer to a classpath resource as if it were a regular filesystem path (one workaround is sketched below). If the code runs in a YARN cluster, you can also try the yarn.provided.lib.dirs option when submitting.
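A minimal sketch of that workaround, assuming the keystore is bundled in the JAR as the hypothetical classpath resource /user.p12: copy it to a temp file at startup, then point ssl.keystore.location at the temp path. This still needs a writable temp directory, which most managed environments do provide.

import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class KeystoreFromClasspath {
    // Copies a classpath resource to a temp file Kafka can read from disk.
    public static Path extract(String resource) throws Exception {
        Path tmp = Files.createTempFile("keystore", ".p12");
        tmp.toFile().deleteOnExit();
        try (InputStream in = KeystoreFromClasspath.class.getResourceAsStream(resource)) {
            if (in == null) {
                throw new IllegalArgumentException("resource not found: " + resource);
            }
            Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING);
        }
        return tmp;
    }
}

// Usage:
// props.put("ssl.keystore.location", KeystoreFromClasspath.extract("/user.p12").toString());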
I used a workaround temporarily: upload your certificates to a file share and have your application, during initialization, download the certificates from the file share and save them to a location of your choice, like /home/site/ca.p12. Then the Kafka properties should read:
...
ssl.truststore.location=/home/site/ca.p12
...
Here are a few lines of code to help you download and save your certificate:
import com.microsoft.azure.storage.CloudStorageAccount;
import com.microsoft.azure.storage.file.*;

// Parse the storage connection string and create the Azure Files client.
CloudStorageAccount storageAccount = CloudStorageAccount.parse(connectionString);
CloudFileClient fileClient = storageAccount.createCloudFileClient();
// Get a reference to the file share.
CloudFileShare share = fileClient.getShareReference("[SHARENAME]");
// Get a reference to the root directory of the share.
CloudFileDirectory rootDir = share.getRootDirectoryReference();
// Get a reference to the directory containing the certificate.
CloudFileDirectory containerDir = rootDir.getDirectoryReference("[DIRECTORY]");
CloudFile file = containerDir.getFileReference("[FILENAME]");
// Download the certificate to the path the Kafka properties point at.
file.downloadToFile("/home/site/ca.p12");

How to consult a file in Prolog from MongoDB or Java

I'm implementing a web application in Java, and in one part of it I create a SWI-Prolog file. I know about the consult command in Prolog, but this command needs an absolute path. Because I use MongoDB to store my results, I would like to store that file in Mongo and not create a directory on my project filesystem just to consult the file. Is it possible to consult the file in Prolog without having the actual *.pl file in the root directory?
The argument of consult/1 does not need to be an absolute path. It is just relative to the working directory of Prolog, which you can get using ?- pwd.
That still requires you to save the file. That, too, is not necessary if you can transfer the data through some other means (e.g., networking). If you can somehow get a Prolog stream to the data, you can use load_files/2 with the stream(In) option to load the program file.

How to tell if a directory on a Linux machine has an external file system mounted to it in Java?

I have three separate Linux servers that mount and share the same single file system under a directory called /efs.
I have a Java application that uses this file system and needs to verify that the file system has been mounted correctly (or else it would unknowingly write to /efs on the local machine instead of the shared storage). How would I detect at run time, from my application, that the file system has been mounted to the directory?
Sorry if this is a duplicate question. I really did try to find information on this but I couldn't find a clear answer.
I see a few approaches to this problem:
If the mounted filesystem at /efs is different from the root filesystem, you can compare them using Files.getFileStore(path).type(). The FileStore for / being the same as the one for /efs would then be a clear indicator that the /efs mount is missing (see the sketch after this list). This assumes JDK >= 7.
Read and parse /proc/mounts to see which file systems are mounted and with which options. This is the same data source Java's FileStore API uses under the hood, so you might take the parsing logic directly from the JDK sources. This would be independent of the Java version.
Have a file on the mounted filesystem which is never present on the root filesystem, which can then be checked for simply with Files.exists(path) or new File(path).exists(). This would be independent of the Java version and even independent of Linux.
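A minimal sketch of the first approach, assuming JDK >= 7: compare the FileStore name and type of /efs (the path is taken from the question) against those of the root filesystem.

import java.nio.file.FileStore;
import java.nio.file.Files;
import java.nio.file.Paths;

public class MountCheck {
    // Returns true if dir appears to live on a different filesystem than /.
    static boolean isMounted(String dir) throws Exception {
        FileStore root = Files.getFileStore(Paths.get("/"));
        FileStore target = Files.getFileStore(Paths.get(dir));
        // The same device name and type as / strongly suggests nothing is mounted there.
        return !(target.name().equals(root.name()) && target.type().equals(root.type()));
    }

    public static void main(String[] args) throws Exception {
        System.out.println("/efs mounted: " + isMounted("/efs"));
    }
}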
Answering my own question: it just occurred to me to create a file in the mounted file system and simply check that the file exists, avoiding any OS-dependent code.

How to refer to a file system in CloudBees?

I'm new to CloudBees. I just opened an account and am trying to upload a .JAR file which basically downloads a file to a location specified by the user (through Java command-line arguments). So far I have only run the .JAR locally, referring to my local file system to save the file. If I deploy my .JAR file via the CloudBees SDK, where can I save the downloaded file (and then process it)?
NOTE: I know this is not a new requirement in Java; if we deploy the jar on a UNIX/Windows OS, we can refer to the file system relative to the home directory.
Edit#1:
I've seen a couple of discussions about the same topic.
link#1
link#2
Everywhere they are talking about the ephemeral (temporary) file system which we can access through System.getProperty("java.io.tempDir"). But I'm getting null when I access java.io.tempDir. Did anyone manage to save files temporarily using this tempDir?
You can upload a jar with the Java stack, specifying the class and classpath (http://developer.cloudbees.com/bin/view/RUN/Java+Container).
Our filesystem, however, is not persistent, so if you are talking about saving a file from within your application, you could save it under this path:
System.getProperty("java.io.tmpdir")
but it will be gone when your application hibernates, scales up/down, or is redeployed. (Note the exact property name, java.io.tmpdir, not java.io.tempDir; that is why the lookup in the question returned null.)
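A minimal sketch of saving a downloaded file there (the file name and contents are hypothetical):

import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class TmpDirSave {
    public static void main(String[] args) throws Exception {
        // Resolve the ephemeral temp directory provided by the JVM.
        Path tmpDir = Paths.get(System.getProperty("java.io.tmpdir"));
        Path target = tmpDir.resolve("downloaded.dat"); // hypothetical name
        Files.write(target, "example contents".getBytes());
        System.out.println("saved to " + target);
    }
}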
If you want a persistent way to store files/images, you can use Amazon S3 from your CloudBees application: uploading your files there will ensure their persistence.
You can find an example of how to do that in this clickstart:
http://developer-blog.cloudbees.com/search?q=amazon+s3
More information here: https://wiki.cloudbees.com/bin/view/RUN/File+system+access
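For reference, a minimal sketch of that S3 upload, assuming the AWS SDK for Java (v1) is on the classpath and credentials come from the default environment chain; the bucket and key names are hypothetical:

import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import java.io.File;

public class S3Persist {
    public static void main(String[] args) {
        // Builds a client from the default credential/region provider chain.
        AmazonS3 s3 = AmazonS3ClientBuilder.defaultClient();
        // Upload the file saved in the ephemeral temp directory to durable storage.
        s3.putObject("my-app-bucket", "uploads/downloaded.dat",
                new File(System.getProperty("java.io.tmpdir"), "downloaded.dat"));
    }
}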

How Selenium can test if it has read access to a file

Our test-app runs on multiple Virtual Machines through Selenium Remote Control.
The app sits on a test controller server.
The test app is used to test a third-party online application.
How can I test whether, on a certain VM, Selenium RC has read access to a file or folder?
Is there anything like file.canRead(filepath) for Selenium too?
Before you respond:
File's canRead(filepath) will only test whether the file is readable from the test controller server; it says nothing about whether it is readable on the VM where the actual browsers are opening (testing) the third-party online application.
Basically, I want to upload a file to the third-party online application through Selenium.
Before doing the upload, I want to make sure that the file is available for upload (on the VMs).
A solution would be to create a download link in the application and then attempt to download the file via Selenium. That way, you get a user-representative experience.
If you want to be really fancy, have the application create a file with the current date, and then let the test download the file (a simple text file) and check whether it contains the date. Then you test the application writing a file and the user reading it, which covers access rights as well.
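A minimal sketch of the verification half of that round trip, assuming the test has already downloaded the marker file to the hypothetical path /tmp/marker.txt and the application writes the date in ISO yyyy-MM-dd form:

import java.nio.file.Files;
import java.nio.file.Paths;
import java.time.LocalDate;

public class DownloadCheck {
    public static void main(String[] args) throws Exception {
        // Read the downloaded marker file and check it contains today's date.
        String contents = new String(Files.readAllBytes(Paths.get("/tmp/marker.txt")));
        boolean ok = contents.contains(LocalDate.now().toString());
        System.out.println("round-trip read check passed: " + ok);
    }
}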
Which scripting language are you using? Assuming that your file to upload resides under the "./data" directory, in Java you can check it as follows:
// If the file name is known:
File file = new File("./data/myfile.ext");
boolean canUpload = file.exists() && file.canRead();
String fileToUpload = file.getCanonicalPath(); // file name with full path

// If the file name is not known, list the folder instead:
File dir = new File("Folder_Location"); // folder path
boolean canUpload = dir.listFiles()[index].canRead();

Note: for the latest downloaded file use

int size = dir.listFiles().length - 1;
boolean canUpload = dir.listFiles()[size].canRead();

(bearing in mind that listFiles() does not guarantee any ordering, so sorting by File.lastModified() would be more reliable for "latest").
