java.nio.file.Path for URLs? - java

Java 7 ships with a default Path implementation for local files. Is there a Path implementation for URLs?
For example, I should be able to copy a remote resource using the following code:
Path remote = Paths.get(new URI("http://www.example.com/foo/bar.html"));
Path local = Paths.get(new URI("/bar.html"));
Files.copy(remote, local);
Currently, this throws java.nio.file.FileSystemNotFoundException: Provider "http" not installed. I could probably implement this myself but I'd rather not reinvent the wheel.

It seems like what you're really trying to do is accomplish what FTP does - copy files from one place to another. I would suggest you find better ways to do this with existing FTP code libraries.
URIs are not file system paths, so you can't treat them as such. They are addresses/resource locators that, when you visit them with your browser (or another client that handles them), trigger some action as defined by the server behind them. There's no standard for what that server does, hence the flexibility of web services. Therefore, if your server is going to accept HTTP requests in this manner to facilitate file copies, you're going to have to roll your own, and pass the file data in a POST request.
To say it another way: (1) don't treat URIs like they are file system paths - they aren't; (2) find an FTP library to copy files; and/or (3) if you really want to build a web service that does this, abstract the details of the file copying behind a POST request. If you do #3, understand that what you're building is pretty much custom, and that it will probably only work on the subset of sites that follow your particular design (i.e. the ones you build yourself). There's no standard set of parameters or "file copying via POST" command that I'm aware of that you can leverage to make this "just work" - you're going to have to match your HTTP request up with the web service on the server side.

You can do:
URI uri = new URI("http://www.example.com/foo/bar.html");
try (InputStream is = uri.toURL().openStream()) {
    // ...
}
It will work for http, https and file out of the box, and probably for a few more protocols.
For relative URIs, you have to resolve them first:
URI relative = new URI("bar.html");
URI base = new URI("http://www.example.com/foo/");
URI absolute = base.resolve(relative);
System.out.println(absolute); // prints "http://www.example.com/foo/bar.html"
Now you can call toURL().openStream() on the absolute URI.
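If the goal is the copy from the original question, a minimal sketch that combines the two ideas is below (the local target path is just a placeholder); Files.copy(InputStream, Path, CopyOption...) streams the response body straight to a file:
URI remote = new URI("http://www.example.com/foo/bar.html");
Path local = Paths.get("bar.html");
try (InputStream is = remote.toURL().openStream()) {
    // stream the remote resource directly into the local file
    Files.copy(is, local, StandardCopyOption.REPLACE_EXISTING);
}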

I had to deal with a similar problem and wrote a little method to solve this, you can find below.
It works fine for concatenating a base URL and a relative suffix. Be careful not to pass an absolute suffix, because of how URI.resolve() treats absolute references.
private URI resolveURI(String root, String suffix) {
    try {
        // the trailing slash makes resolve() treat "root" as a directory
        return new URI(root + "/").resolve(suffix);
    } catch (URISyntaxException e) {
        // log the error and fall back to null (or rethrow, as appropriate)
        return null;
    }
}
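For example, a hypothetical call to the method above:
URI uri = resolveURI("http://www.example.com/foo", "bar.html");
// uri is now http://www.example.com/foo/bar.html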

Related

Why can I get a valid url from getClass().getResource(), but the url that is returned creates a file that doesn't exist

I am trying to load some data into an AWS lambda and am using getClass().getResource() to do so. This returns a nice URL that in logs seemingly prints out a plausible url; however, when I try and make a file based on that path, I get a file that when I call .exists() returns false.
If I run the code below, the first print statement gives "exists: false".
Meanwhile, the second print statement gives something along the lines of "test path: /file:/var/task/lib/MyLambda-1.0.jar!/com/my/package/folders/file.end".
File test = new File(cFile);
System.out.println("exists: " + test.exists());
System.out.println("test path: " + test.getAbsolutePath());
Not sure why this would be. If Java finds a file, then I would assume that the file exists...
Short answer: don't assume that the "path" of a URL is a file system pathname.
I am trying to load some data into an AWS lambda and am using getClass().getResource() to do so. This returns a nice URL that in logs seemingly prints out a plausible url;
Yes. (It would be nice if you showed us what the original URL looks like ... though I can guess.)
However, when I try and make a file based on that path, I get a file that when I call .exists() returns false.
OK, unless the URL has the protocol "file:", I would NOT expect that to work.
The path in a URL is a path that is intended for the protocol handler to resolve. The idea is that you use URL::openStream to open a stream to the resource named by the URL and then read it. The protocol handler takes care of interpreting the path (etc) and setting up the stream.
For a "file:" URL, the protocol handler will resolve the path in the file system, and provide you a stream to read the file.
For a "http:" URL, the protocol handler establishes a connection to the server, sends a GET request, and returns you a stream to read the response body.
For a "jar:" URL, the protocol handler opens the JAR file, finds the entry within the JAR file, and hands you a stream to read it.
And so on.
If you look at these, it is only in the "file:" case that there is a reasonable expectation that treating the path component of the URL as a file system pathname could work.
Looking at the pathname in your question:
file:/var/task/lib/MyLambda-1.0.jar!/com/my/package/folders/file.end
I surmise that the original URL was:
jar:file:/var/task/lib/MyLambda-1.0.jar!/com/my/package/folders/file.end
So what that says to the "jar:" protocol handler is:
Find the resource identified by the URL "file:/var/task/lib/MyLambda-1.0.jar"
Open it as a JAR file stream
Find the entry "/com/my/package/folders/file.end" in the JAR file's namespace
Open a stream to read that entry's content.
The JAR file protocol handler knows how to do that. But (clearly) the File class doesn't ... because that "path" is not a file system pathname.
How you solve this depends on what you really need.
If you just need a stream to read the resource, use getClass().getResourceAsStream(...) instead.
If it must be a file in the file system, you may have to get hold of the stream (see above), copy it to a temporary file, and use a File for the temporary file.
If you are doing this because you want to write to the "file", I would suggest that you give up on that idea. It is a bad idea for an application to try to update its own resources, and in some cases it simply won't / cannot work.
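A minimal sketch of the temp-file option, reusing the resource path from the question:
Path tempFile = Files.createTempFile("file", ".end");
try (InputStream in = getClass().getResourceAsStream("/com/my/package/folders/file.end")) {
    // copy the classpath resource out to a real file in the file system
    Files.copy(in, tempFile, StandardCopyOption.REPLACE_EXISTING);
}
File asFile = tempFile.toFile(); // now safe to hand to APIs that insist on a java.io.File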
Regarding your File test = new File(cFile): is cFile constructed correctly, with a proper path? Maybe the last print statement is just showing the incorrect path you built, and in reality there is no file there. Have you checked manually?

Generate URI from String provided by user via command line argument

This is such a simple question, I'm sure the answer is out there and I'm simply not searching with the proper lingo. I'm new to Java, using Java 8, and want to learn how to properly handle this, rather than rigging it together.
The application takes in arguments via command line.
$ MyApp /home/user/thefiletheywant.me
I have tried the following:
// Missing Scheme, I know I can just force ("file:" + args[0]) but is that proper?
URI fileIn = new URI(args[0]);
// I've learned this is the same thing as above
URI fileIn = URI.create(args[0]);
I've seen examples that take the string, check with File.separator to verify it is "/" and if not, replace it, then simply tack "file:" on the front. Which, again, seems sloppy.
What if the user added "http:"?
What if the user specifies a full path or a path relative to the directory they are currently in?
Do the built-in functions verify the path is proper? I'm aware of file.isFile() and file.exists(), which I can check myself easily enough.
If I knew exactly where the file was every time, of course the URI.create would be fine. But for future education, I want to know how to properly handle this very simple scenario. Please forgive me if in my searches I've simply somehow missed what I suspect is an easy solution.
You could just create a File object in Java, which handles OS-specific separators for you (Windows uses a backslash, for instance), check if it exists, and use the handy toURI() method on it to create a valid URI object.
File myFile = new File(args[0]);
URI fileUri = null;
if (myFile.exists()) {
    fileUri = myFile.toURI();
}
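If you are on Java 7 or later, a java.nio.file sketch of the same idea also works:
Path path = Paths.get(args[0]);   // accepts relative or absolute command-line paths
URI fileUri = null;
if (Files.exists(path)) {
    fileUri = path.toUri();       // always yields an absolute "file:" URI
}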

How to get list of files from URL

I have a URL http://......../somefolder/ and I want to get the names of all the files inside this folder. I have tried the code below but it throws an error.
URL url = new URL("http://.............../pages/");
File f = new File(url.getFile());
String[] list = f.list();
for (String x : list)
{
    System.out.println(x);
}
Error :-Exception in thread "main" java.lang.NullPointerException
at Directory.main(Directory.java:25)
It's not possible to do it like this.
HTTP has no concept of a "folder". The thing you see when you open that URL is just another web page, which happens to have a bunch of links to other pages. It's not special in any way as far as HTTP is concerned (and therefore HTTP clients, like the one built into Java).
That's not to say it's completely impossible. You might be able to get the file list another way.
Edit: The reason your code doesn't work is that it does something completely nonsensical. url.getFile() will return something like "/......./pages/", and then you pass that into the File constructor - which gives you a File representing the path /....../pages/ (or C:\......\pages\ on Windows). f.list() sees that that path doesn't exist on your computer, and returns null. There is no way to get a File that points to a URL, just like there's no way to get an int with the value 5.11.
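For example, if the server happens to expose an auto-generated directory index page at that URL, one fragile workaround is to scrape the links out of the returned HTML, e.g. with jsoup (a sketch; it assumes such an index page actually exists):
Document doc = Jsoup.connect("http://.............../pages/").get();
for (Element link : doc.select("a[href]")) {
    // each href is a candidate file name; filter out parent-directory and sorting links as needed
    System.out.println(link.attr("href"));
}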

Java URL problem

A webpage contains a link to an executable (i.e. if you click on the link, the browser will download the file onto your local machine).
Is there any way to achieve the same functionality with Java?
Thank you
Yes there is.
Here a simple example:
You can have a JSF (Java Server Faces) page with a supporting backing bean that contains a method annotated with @PostConstruct. This means that any action (for example, downloading) will occur when the page is created.
There is already a question very similar already, have a look at: Invoke JSF managed bean action on page load
You can use Java's, URL class to download a file, but it requires a little work. You will need to do the following:
Create the URL object pointing at the file
Call openStream() to get an InputStream
Open the file you want to write to (a FileOutputStream)
Read from the InputStream and write to the file, until there is no more data left to read
Close the input and output streams
It doesn't really matter what type of file you are downloading (the fact that it's an executable file is irrelevant) since the process is the same for any type of file.
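A minimal sketch of those steps (the URL and output filename are placeholders):
URL url = new URL("http://www.example.com/downloads/setup.exe");
try (InputStream in = url.openStream();
     OutputStream out = new FileOutputStream("setup.exe")) {
    byte[] buffer = new byte[8192];
    int bytesRead;
    // read from the URL stream and write to the file until there is no more data left
    while ((bytesRead = in.read(buffer)) != -1) {
        out.write(buffer, 0, bytesRead);
    }
} // try-with-resources closes both streams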
Update: It sounds like what you actually want is to plug the URL of a webpage into the Java app, and have the Java app find the link in the page and then download that link. If that is the case, the wording of your question is very unclear, but here are the basic steps I would use:
First, use steps 1 and 2 above to get an InputStream for the page
Use something like TagSoup or jsoup to parse the HTML
Find the <a> element that you want and extract its href attribute to get the URL of the file you need to download (if it's a relative URL instead of absolute, you will need to resolve that URL against the URL of the original page)
Use the steps above to download that URL
Here's a slight shortcut, based on jsoup (which I've never used before, I'm just writing this from snippets stolen from their webpage). I've left out a lot of error checking, but hey, I usually charge for this:
Document doc = Jsoup.connect(pageUrl).get();
Element aElement = doc.getElementsByTag("a").first(); // Obviously you may need to refine this
String newUrl = aElement.attr("abs:href"); // This is a piece of jsoup magic that ensures that the destination URL is absolute
// assert newUrl != null
URL fileUrl = new URL(newUrl);
String destPath = fileUrl.getPath();
int lastSlash = destPath.lastIndexOf('/');
if (lastSlash != -1) {
    destPath = destPath.substring(lastSlash + 1); // keep just the final path segment as the filename
}
// Assert that this is really a valid filename
// Now just download fileUrl and save it to destPath
The proper way to determine what the destination filename should be (unless you hardcode it) is actually to look at the Content-Disposition header, and take the bit after filename=. In that case, you can't use openStream() on the URL; you will need to use openConnection() instead, to get a URLConnection. Then you can use getInputStream() to get your InputStream and getHeaderField("Content-Disposition") to read the response header and figure out your filename. In case that header is missing or malformed, you should fall back to using the method above to determine the destination filename.
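A rough sketch of that approach (the header parsing here is deliberately naive; real Content-Disposition values can be quoted or carry extra parameters):
URLConnection conn = fileUrl.openConnection();
String disposition = conn.getHeaderField("Content-Disposition");
String fileName = null;
if (disposition != null && disposition.contains("filename=")) {
    // take everything after "filename=" and strip any surrounding quotes
    fileName = disposition.substring(disposition.indexOf("filename=") + "filename=".length())
                          .replace("\"", "").trim();
}
if (fileName == null || fileName.isEmpty()) {
    fileName = destPath; // fall back to the last path segment, as computed above
}
try (InputStream in = conn.getInputStream()) {
    Files.copy(in, Paths.get(fileName), StandardCopyOption.REPLACE_EXISTING);
}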
You can do this using Apache Commons IO FileUtils:
http://commons.apache.org/io/apidocs/org/apache/commons/io/FileUtils.html#copyURLToFile(java.net.URL, java.io.File)
Edit:
I was able to successfully download a zip file from the SourceForge site (it is not empty). I did something like this:
import java.io.File;
import java.net.URL;

import org.apache.commons.io.FileUtils;

public class Test
{
    public static void main(String[] args)
    {
        try {
            URL url = new URL("http://sourceforge.net/projects/gallery/files/gallery3/3.0.2/gallery-3.0.2.zip/download");
            FileUtils.copyURLToFile(url, new File("test.zip"));
        } catch (Exception e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }
    }
}
I was able to successfully download tomcat.exe too:
URL url = new URL("http://archive.apache.org/dist/tomcat/tomcat-6/v6.0.16/bin/apache-tomcat-6.0.16.exe");

Any way to specify absolute paths in FTP URLs?

I'm using the Java URL and URLConnection classes to upload a file to a server using FTP. I don't need to do anything other than simply upload the file, so I'd like to avoid any external libraries, and I'm wary of using the unsupported sun.net.ftp class.
Is there any way to use absolute paths in the FTP connection string? I'd like to put my files in something like "/ftptransfers/..." but the FTP path is relative to the user home directory.
Sample upload code:
URL url = new URL("ftp://username:password#host/file.txt");
URLConnection uc = url.openConnection();
uc.setDoOutput(true);
OutputStream out = uc.getOutputStream();
out.write("THIS DATA WILL BE WRITTEN TO FILE".getBytes());
out.close();
I did actually find out there is a semi-standard way to do it that worked for me.
Short answer: replace the leading slash with "%2F"
Long answer: per the "A FTP URL Format" document:
For example, the URL "ftp://myname@host.dom/%2Fetc/motd" is
interpreted by FTP-ing to "host.dom", logging in as "myname"
(prompting for a password if it is asked for), and then executing
"CWD /etc" and then "RETR motd".
This has a different meaning from
"ftp://myname#host.dom/etc/motd" which would "CWD etc" and then
"RETR motd"; the initial "CWD" might be executed relative to the
default directory for "myname".
On the other hand,
"ftp://myname#host.dom//etc/motd", would "CWD " with a null
argument, then "CWD etc", and then "RETR motd".
I think that your best bet is to use the Apache Commons FTP component, and do a 'cd' after you make the connection.
You can always write a wrapper so that the URL can be specified in the format above, if you so wish.
-ace
