Java URL problem - java

A webpage contains a link to an executable (i.e. if we click on the link, the browser will download the file to the local machine).
Is there any way to achieve the same functionality with Java?
Thank you

Yes there is.
Here is a simple example:
You can have a JSF (JavaServer Faces) page with a supporting backing bean that contains a method annotated with @PostConstruct. This means that any action (for example, starting a download) will occur when the bean is constructed as the page is created.
There is already a very similar question; have a look at: Invoke JSF managed bean action on page load

You can use Java's URL class to download a file, but it requires a little work. You will need to do the following:
1. Create the URL object pointing at the file
2. Call openStream() to get an InputStream
3. Open the file you want to write to (a FileOutputStream)
4. Read from the InputStream and write to the file, until there is no more data left to read
5. Close the input and output streams
It doesn't really matter what type of file you are downloading (the fact that it's an executable file is irrelevant) since the process is the same for any type of file.
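The steps above can be sketched roughly as follows. This is only a minimal illustration, not a hardened downloader: the class name UrlDownloader and the method download are made up for this example, and there are no timeouts, redirect handling or retries.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URL;

// Hypothetical helper class; the name is an assumption made for this sketch.
class UrlDownloader {

    /** Copies everything readable from the URL into destPath and returns the byte count. */
    static long download(URL url, String destPath) throws IOException {
        long total = 0;
        // try-with-resources closes both streams, even if the copy fails part-way
        try (InputStream in = url.openStream();
             OutputStream out = new FileOutputStream(destPath)) {
            byte[] buffer = new byte[8192];
            int read;
            // read() returns -1 once there is no more data left to read
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
                total += read;
            }
        }
        return total;
    }
}
```

Because openStream() works for any protocol the JVM has a handler for, the same method can be tried against a file: URL before pointing it at a real server.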
Update: It sounds like what you actually want is to plug the URL of a webpage into the Java app, and have the Java app find the link in the page and then download that link. If that is the case, the wording of your question is very unclear, but here are the basic steps I would use:
1. First, use steps 1 and 2 above to get an InputStream for the page
2. Use something like TagSoup or jsoup to parse the HTML
3. Find the <a> element that you want and extract its href attribute to get the URL of the file you need to download (if it's a relative URL instead of absolute, you will need to resolve that URL against the URL of the original page)
4. Use the steps above to download that URL
Here's a slight shortcut, based on jsoup (which I've never used before, I'm just writing this from snippets stolen from their webpage). I've left out a lot of error checking, but hey, I usually charge for this:
Document doc = Jsoup.connect(pageUrl).get();
Element aElement = doc.getElementsByTag("a").first(); // Obviously you may need to refine this
String newUrl = aElement.attr("abs:href"); // This is a piece of jsoup magic that ensures that the destination URL is absolute
// assert !newUrl.isEmpty() (jsoup returns an empty string when there is no usable href)
URL fileUrl = new URL(newUrl);
String destPath = fileUrl.getPath();
int lastSlash = destPath.lastIndexOf('/');
if (lastSlash != -1) {
    destPath = destPath.substring(lastSlash + 1); // + 1 drops the slash itself, leaving only the file name
}
// Assert that this is really a valid filename
// Now just download fileUrl and save it to destPath
The proper way to determine what the destination filename should be (unless you hardcode it) is actually to look for the Content-Disposition header, and look for the bit after filename=. In that case, you can't use openStream() on the URL; you will need to use openConnection() instead, to get a URLConnection. Then you can use getInputStream() to get your InputStream and getHeaderField("Content-Disposition") to read the response header and figure out your filename (getRequestProperty() only returns headers of the outgoing request, so it will not see it). In case that header is missing or malformed, you should then fall back to using the method above to determine the destination filename.
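As a rough sketch of the filename logic described above: the class DownloadNames and its methods are hypothetical names, and the header parsing is deliberately naive (it ignores the RFC 6266 filename*= form and quoted values that contain semicolons), so treat it as a starting point rather than a complete parser.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLConnection;
import java.util.Locale;

// Hypothetical helper; all names here are assumptions made for this sketch.
class DownloadNames {

    /** Returns the value after filename= in a Content-Disposition header, or null if absent. */
    static String fromContentDisposition(String header) {
        if (header == null) {
            return null;
        }
        for (String part : header.split(";")) {
            String p = part.trim();
            if (p.toLowerCase(Locale.ROOT).startsWith("filename=")) {
                String name = p.substring("filename=".length()).trim();
                // Strip surrounding double quotes, if any
                if (name.length() >= 2 && name.startsWith("\"") && name.endsWith("\"")) {
                    name = name.substring(1, name.length() - 1);
                }
                return name.isEmpty() ? null : name;
            }
        }
        return null;
    }

    /** Fallback: the last segment of the URL path. */
    static String fromUrlPath(String urlPath) {
        return urlPath.substring(urlPath.lastIndexOf('/') + 1);
    }

    /** Prefers the response header and falls back to the URL path. */
    static String chooseFileName(URL url) throws IOException {
        URLConnection conn = url.openConnection();
        String name = fromContentDisposition(conn.getHeaderField("Content-Disposition"));
        return name != null ? name : fromUrlPath(url.getPath());
    }
}
```

Either way, the resulting name comes from the server, so it should still be checked (for example for path separators) before it is used as a local filename.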

You can do this using Apache Commons IO FileUtils:
http://commons.apache.org/io/apidocs/org/apache/commons/io/FileUtils.html#copyURLToFile(java.net.URL, java.io.File)
Edit:
I was able to successfully download a zip file from the SourceForge site (it is not empty). I did something like this:
import java.io.File;
import java.net.URL;
import org.apache.commons.io.FileUtils;
public class Test
{
public static void main(String args[])
{
try {
URL url = new URL("http://sourceforge.net/projects/gallery/files/gallery3/3.0.2/gallery-3.0.2.zip/download");
FileUtils.copyURLToFile(url, new File("test.zip"));
} catch (Exception e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
}
I was able to successfully download tomcat.exe too:
URL url = new URL("http://archive.apache.org/dist/tomcat/tomcat-6/v6.0.16/bin/apache-tomcat-6.0.16.exe");

Related

Cannot find protocol 'resource' in URL constructor

I've implemented a class to read from RSS 2.0 and Atom 1.0 feeds. I want to write some unit tests in order to verify functionality. Here is the feed reader section of my code:
private String readFeed(final String url) throws IOException
{
final StringBuilder builder = new StringBuilder();
final URL feedUrl = new URL(url);
final BufferedReader in = new BufferedReader(
new InputStreamReader(feedUrl.openStream()));
String input;
while ((input = in.readLine()) != null)
{
builder.append(input);
}
in.close();
return builder.toString();
}
After some research, I figured the best way to test would be to have a sample feed as an XML file in my project resources directory.
I've created an example file "resources/rss2-0.xml"
I'm sending in the following value to the readFeed function, "resource:///rss2-0.xml",
and I keep receiving java.net.MalformedURLException: unknown protocol: resource
This is my first time using a URL pathway to load from a resource. From what I can tell, resource seems like it should be a valid protocol. Anyone have any ideas what I may be doing wrong or other ways to go about this?
If you want to deal with paths on your local file system, the Path class is best suited for this task.
An object that may be used to locate a file in a file system. It will
typically represent a system dependent file path.
You can use it like so :
Path path = FileSystems.getDefault().getPath("/resources/rss2-0.xml");
BufferedReader reader = Files.newBufferedReader(path, StandardCharsets.UTF_8);
If you really want to deal with a URL, the protocol you're looking for is simply "file". So it would be file:///rss2-0.xml instead of resource:///rss2-0.xml, or, to be exact, file:/resources/rss2-0.xml.
Note that in your case you will indeed have to deal with URLs sooner or later, but when working on local tests, using the Path class will save you trouble. If you want another alternative, try the URI class. Since a URI is an identifier (see the difference between URI and URL), it can identify either a URL or a Path, and may serve as a bridge between your production code, which will ultimately deal with URLs, and your test code, where the Path class is best put to use.
For example :
public interface FeedReader {
String readFeed(final URI uri);
}
And 2 implementations, one for testing locally :
public class LocalFeedReader implements FeedReader {
@Override
public String readFeed(final URI uri) {
// URI -> Path
// then dealing with Path to target local rss2-0.xml file
}
}
And one for production code :
public class WebFeedReader implements FeedReader {
@Override
public String readFeed(final URI uri) {
// URI -> URL
// then dealing with URL to target real resources
}
}
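To make the "URI -> Path" comment a little more concrete, here is one possible way the local implementation could be filled in. It is only a sketch: the class name LocalFeedReaderSketch is made up, the interface is repeated so the example is self-contained, and wrapping the IOException in an UncheckedIOException is just one choice for a method that declares no checked exceptions.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Repeated from the answer above so that this sketch compiles on its own.
interface FeedReader {
    String readFeed(final URI uri);
}

// Hypothetical local implementation; the class name is an assumption.
class LocalFeedReaderSketch implements FeedReader {
    @Override
    public String readFeed(final URI uri) {
        // URI -> Path: this only works for URIs with the "file" scheme
        Path path = Paths.get(uri);
        try {
            return new String(Files.readAllBytes(path), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In a test, the URI can be obtained from a Path with path.toUri(), which yields a file: URI for the sample feed.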
The java docs say that only http, https, file, and jar are "guaranteed" to exist on the search path for protocol handlers. Others only "may" be supported.
http://docs.oracle.com/javase/8/docs/api/java/net/URL.html#URL-java.lang.String-java.lang.String-int-java.lang.String-
It looks like if you want a custom handler that isn't supported in your java distribution, you'll have to create one.
http://mjremijan.blogspot.com/2012/02/create-your-own-java-url-handlers.html
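For illustration, a custom handler for a "resource" protocol could look something like the sketch below. The class name ResourceUrlHandler and the factory method resourceUrl are invented for this example, and mapping the URL path to a classpath resource via ClassLoader.getSystemResource is an assumption about what "resource" should mean.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLStreamHandler;

// Hypothetical handler that maps resource:///name to a classpath resource.
class ResourceUrlHandler extends URLStreamHandler {

    @Override
    protected URLConnection openConnection(URL u) throws IOException {
        String name = u.getPath();
        if (name.startsWith("/")) {
            name = name.substring(1); // class loader resource names have no leading slash
        }
        URL resource = ClassLoader.getSystemResource(name);
        if (resource == null) {
            throw new FileNotFoundException("No classpath resource named " + name);
        }
        return resource.openConnection();
    }

    /** Builds a URL that uses this handler, without registering it globally. */
    static URL resourceUrl(String spec) throws MalformedURLException {
        return new URL(null, spec, new ResourceUrlHandler());
    }
}
```

Note that this only helps for URL objects created through the three-argument constructor. The one-argument new URL(String) used in readFeed would still throw MalformedURLException unless the handler is registered globally, for example through URL.setURLStreamHandlerFactory.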

Downloading file from dropbox in java

I'm writing a Swing application, and I'm sure I'll think of more to add to it later, so I would like a way to download the file from Dropbox if it's new. I've tried a lot of different things, but all they give me is the page's HTML. Anyone know how to do this? I sure don't.
In my opinion, the Dropbox API is far too complicated for what you need.
It's actually extremely simple to download a file from dropbox.
The first step is to put the file that you want to download somewhere inside your dropbox's Public Folder.
Next you want to right click that file and choose "copy public link." You can do this from the web interface or even right there in your computer-sync-folder-thing. This will give you a unique download url for the file.
Next, use this code:
String url="https://dl.dropboxusercontent.com/u/73386806/Prune%20Juice/Prune%20Juice.exe";
String filename="PruneJuice.exe";
try{
URL download=new URL(url);
ReadableByteChannel rbc=Channels.newChannel(download.openStream());
FileOutputStream fileOut = new FileOutputStream(filename);
fileOut.getChannel().transferFrom(rbc, 0, Long.MAX_VALUE); // a limit of 1 << 24 would cut the download off at 16 MiB
fileOut.flush();
fileOut.close();
rbc.close();
}catch(Exception e){ e.printStackTrace(); }
Of course, change the value of the url string to your own download url, and the value of filename to whatever you want to save the file as.
Now, if this fails, you may need to change the url from https:// to http://, but either way it should still work. Dropbox is cool like that.

Test if a file is an image file

I am using some file IO and want to know if there is a method to check if a file is an image?
This works pretty well for me. Hope I could help
import javax.activation.MimetypesFileTypeMap;
import java.io.File;
class Untitled {
public static void main(String[] args) {
String filepath = "/the/file/path/image.jpg";
File f = new File(filepath);
String mimetype= new MimetypesFileTypeMap().getContentType(f);
String type = mimetype.split("/")[0];
if(type.equals("image"))
System.out.println("It's an image");
else
System.out.println("It's NOT an image");
}
}
if (ImageIO.read(inputStream) == null) { // inputStream: your own input stream
// not an image: no registered ImageReader could decode it
}
And also there is an answer: How to check a uploaded file whether it is a image or other file?
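A runnable version of the ImageIO check might look like this. The class name ImageCheck and method name isImage are made up for the sketch; it relies on the documented behaviour that ImageIO.read returns null when no registered ImageReader can decode the stream, so it only recognises formats the running JVM has readers for.

```java
import java.io.IOException;
import java.io.InputStream;
import javax.imageio.ImageIO;

// Hypothetical helper; the names are assumptions made for this sketch.
class ImageCheck {

    /** Returns true if ImageIO can decode the stream with one of its registered readers. */
    static boolean isImage(InputStream in) throws IOException {
        // Decodes the whole image, so this is not cheap for large files
        return ImageIO.read(in) != null;
    }
}
```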
In Java 7, there is the java.nio.file.Files.probeContentType() method. On Windows, this uses the file extension and the registry (it does not probe the file content). You can then check the first part of the MIME type, i.e. whether it is of the form image/<X>.
You may try something like this:
String pathname = "abc\\xyz.png";
File file=new File(pathname);
String mimetype = Files.probeContentType(file.toPath());
//mimetype should be something like "image/png"
if (mimetype != null && mimetype.split("/")[0].equals("image")) {
System.out.println("it is an image");
}
You may try something like this:
import javax.activation.MimetypesFileTypeMap;
File myFile = new File("/the/file/path/image.png"); // placeholder path, use your own file
String mimeType = new MimetypesFileTypeMap().getContentType(myFile);
// mimeType should now be something like "image/png"
if(mimeType.substring(0,5).equalsIgnoreCase("image")){
// it's an image
}
This should work, although it doesn't seem to be the most elegant version.
There are a variety of ways to do this; see other answers and the links to related questions. (The Java 7 approach seems the most attractive to me, because it uses platform specific conventions by default, and you can supply your own scheme for file type determination.)
However, I'd just like to point out that no mechanism is entirely infallible:
Methods that rely on the file suffix will be tricked if the suffix is non-standard or wrong.
Methods that rely on file attributes (e.g. in the file system) will be tricked if the file has an incorrect content type attribute or none at all.
Methods that rely on looking at the file signature can be tricked by binary files which just happen to have the same signature bytes.
Even simply attempting to read the file as an image can be tricked if you are unlucky ... depending on the image format(s) that you try.
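To make the signature-based approach from the list above concrete, here is a small sketch that compares the first bytes of a file against a few well-known image signatures (PNG, JPEG and GIF only). The class and method names are invented for this example and, as noted above, a non-image file that happens to start with the same bytes would still pass.

```java
import java.util.Arrays;

// Hypothetical helper; the names and the choice of formats are assumptions.
class ImageSignatures {
    private static final byte[] PNG  = {(byte) 0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A};
    private static final byte[] JPEG = {(byte) 0xFF, (byte) 0xD8, (byte) 0xFF};
    private static final byte[] GIF  = {'G', 'I', 'F', '8'};

    /** True if data begins with exactly the bytes in signature. */
    static boolean startsWith(byte[] data, byte[] signature) {
        if (data.length < signature.length) {
            return false;
        }
        return Arrays.equals(Arrays.copyOf(data, signature.length), signature);
    }

    /** Checks the leading bytes of a file against the three signatures above. */
    static boolean looksLikeImage(byte[] header) {
        return startsWith(header, PNG) || startsWith(header, JPEG) || startsWith(header, GIF);
    }
}
```

Only the first eight bytes of the file need to be read to call looksLikeImage.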
Other answers suggest loading the full image into memory (ImageIO.read) or using standard JDK methods (MimetypesFileTypeMap and Files.probeContentType).
The first way is not efficient if you do not need the decoded image and all you really want is to test whether the file is an image (and maybe to save its content type, to set it in the Content-Type response header when the image is served later).
The built-in JDK ways usually just test the file extension and do not really give you a result that you can trust.
The way that works for me is to use Apache Tika library.
private final Tika tika = new Tika();
private MimeType detectImageContentType(InputStream inputStream, String fileExtension) {
Assert.notNull(inputStream, "InputStream must not be null");
String fileName = fileExtension != null ? "image." + fileExtension : "image";
MimeType detectedContentType = MimeType.valueOf(tika.detect(inputStream, fileName));
log.trace("Detected image content type: {}", detectedContentType);
if (!validMimeTypes.contains(detectedContentType)) {
throw new InvalidImageContentTypeException(detectedContentType);
}
return detectedContentType;
}
The type detection is based on the content of the given document stream and the name of the document. Only a limited number of bytes are read from the stream.
I pass fileExtension just as a hint for Tika. It works without it, but according to the documentation it helps detection in some cases.
The main advantage of this method compared to ImageIO.read is that Tika doesn't read the full file into memory, only the first bytes.
The main advantage compared to the JDK's MimetypesFileTypeMap and Files.probeContentType is that Tika really reads the first bytes of the file, while the JDK only checks the file extension in the current implementation.
TLDR
If you plan to do something with read image (like resize/crop/rotate it), then use ImageIO.read from Krystian's answer.
If you just want to check (and maybe store) real Content-Type, then use Tika (this answer).
If you work in the trusted environment and you are 100% sure that file extension is correct, then use Files.probeContentType from prunge's Answer.
Here's my code based on the answer using tika.
private static final Tika TIKA = new Tika();
public boolean isImageMimeType(File src) {
try (FileInputStream fis = new FileInputStream(src)) {
String mime = TIKA.detect(fis, src.getName());
return mime.contains("/")
&& mime.split("/")[0].equalsIgnoreCase("image");
} catch (IOException e) {
throw new RuntimeException(e);
}
}

java.nio.file.Path for URLs?

Java 7 ships with a default Path implementation for local files. Is there a Path implementation for URLs?
For example, I should be able to copy a remote resource using the following code:
Path remote = Paths.get(new URI("http://www.example.com/foo/bar.html"));
Path local = Paths.get(new URI("/bar.html"));
Files.copy(remote, local);
Currently, this throws java.nio.file.FileSystemNotFoundException: Provider "http" not installed. I could probably implement this myself but I'd rather not reinvent the wheel.
It seems like what you're really trying to do is accomplish what FTP does - copy files from one place to another. I would suggest you find better ways to do this with existing FTP code libraries.
URIs are not file system paths, so you can't treat them as such. They are addresses/resource locators that, when you go there with your browser (or another client that handles them), trigger some action as defined by the server that's behind them. There's no standard for what that server does, hence the flexibility of web services. Therefore, if your server is going to accept HTTP requests in this manner to facilitate file copies, you're going to have to roll your own, and pass the file data into a POST request.
To say it another way, (1) don't treat URIs like they are file system paths - they aren't, (2) find an FTP library to copy files, and/or (3) if you really want to build a web service that does this, abstract the details of the file copying via a POST request. If you do #3, understand that what you're building is pretty close to custom, and that it will probably only work on a subset of sites that follow your particular design (i.e. the ones you build yourself). There's no standard set of parameters or "file copying" via POST command that I'm aware of that you can leverage to make this "just work" - you're going to have to match up your HTTP request with the web service on the server side.
You can do:
URI uri = new URI("http://www.example.com/foo/bar.html");
try (InputStream is = uri.toURL().openStream()) {
// ...
}
It will work for http, https and file out of the box, probably for few more.
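Building on that snippet, the copy from the question could be approximated like this. It is a sketch only: UriCopy and copy are made-up names, and nothing here gives you a real Path for the remote resource; it simply streams the content into a local file.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper; the names are assumptions made for this sketch.
class UriCopy {

    /** Streams the content behind source into target and returns the number of bytes copied. */
    static long copy(URI source, Path target) throws IOException {
        try (InputStream is = source.toURL().openStream()) {
            // Files.copy(InputStream, Path, ...) reads the stream until it is exhausted
            return Files.copy(is, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }
}
```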
For relative URIs, you have to resolve them first:
URI relative = new URI("bar.html");
URI base = new URI("http://www.example.com/foo/");
URI absolute = base.resolve(relative);
System.out.println(absolute); // prints "http://www.example.com/foo/bar.html"
Now you can call toURL().openStream() on the absolute URI.
I had to deal with a similar problem and wrote a little method to solve it, which you can find below.
It works fine for concatenating a URL and relative suffixes. Be careful not to pass an absolute suffix, because of the way URI.resolve behaves.
private URI resolveURI(String root, String suffix) {
    try {
        return new URI(root + "/").resolve(suffix);
    } catch (URISyntaxException e) {
        // log
        return null; // the caller has to handle an invalid root
    }
}
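The warning about absolute suffixes can be seen in a small demonstration. ResolveDemo is a made-up class for this illustration and uses URI.create instead of the constructor purely to avoid the checked exception.

```java
import java.net.URI;

// Hypothetical demo class showing how URI.resolve treats different suffixes.
class ResolveDemo {

    static String resolve(String root, String suffix) {
        return URI.create(root + "/").resolve(suffix).toString();
    }

    public static void main(String[] args) {
        // Relative suffix: appended below the root
        System.out.println(resolve("http://www.example.com/foo", "bar.html"));
        // prints "http://www.example.com/foo/bar.html"

        // Suffix with a leading slash: replaces the whole path of the root
        System.out.println(resolve("http://www.example.com/foo", "/bar.html"));
        // prints "http://www.example.com/bar.html"

        // Suffix that is itself an absolute URI: the root is ignored entirely
        System.out.println(resolve("http://www.example.com/foo", "http://other.example.org/x"));
        // prints "http://other.example.org/x"
    }
}
```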

Locating a file in a network disk in a Servlet

I created an ImageServlet to serve videos that are stored outside the scope of my web application.
All of my videos are on an intranet location that can be reached from any computer in the intranet:
String path = "\\myip\storage\ogg\VX-276.ogg"
In my application, when I write it as a URL, it can't be displayed!
If I try to open it with Chrome, it automatically changes it to file://myip/storage/ogg/VX-276.ogg and the file is displayed.
I also tried file:////odelyay_test64/storage/ogg/,
but Java converts the string to file:\myip\storage\ogg\VX-276.ogg, which does not exist!
What is the correct way to refer to it?
EDITED
I created a small test:
String path = "file://myip/storage/ogg/VX-276.ogg";
File file = new File(path);
if (file.exists())
System.out.println("exists");
else {
System.out.println("missing" + file.getPath());
}
and I get:
missing file:\myip\storage\ogg\VX-276.ogg
As you can see the slashes are being switched
As per your previous question, you're referencing the resource in an HTML <video> tag. All URLs in the HTML source code must be http:// URLs (or at least be relative to an http:// URL), because most browsers refuse to load resources from file:// URLs when the HTML page itself has been requested over http://. You just need to let the URL point to the servlet. If the servlet's doGet() method gets hit, then the URL is fine and you should not change it.
Your concrete problem is in the way you open and read the desired file in the servlet. You need to ensure that the path in File file = new File(path) points to a valid location before you open a FileInputStream on it.
String path = "file://myip/storage/ogg/VX-276.ogg";
File file = new File(path);
// ...
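As a side illustration of why such a path can fail: new File(String) treats its argument as a plain path name and does not interpret a file: prefix, whereas a file: URI has to be converted explicitly. The sketch below uses made-up names (FileVersusUri and its two methods) and only covers local file: URIs; how a URI with a host part such as file://myip/... is handled is platform-specific and not covered here.

```java
import java.io.File;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper contrasting the two interpretations of the same text.
class FileVersusUri {

    /** Treats the text as a plain path name, which is what new File(String) does. */
    static boolean existsAsPathName(String text) {
        return new File(text).exists();
    }

    /** Interprets the text as a file: URI and converts it to a Path first. */
    static boolean existsAsFileUri(String text) {
        return Files.exists(Paths.get(URI.create(text)));
    }
}
```

For an existing local file, passing its file: URI string to existsAsPathName is expected to report false, while existsAsFileUri finds it.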
If the servlet code is well written, so that it doesn't suppress/swallow exceptions, and you have read the server logs, then you should have seen an IOException such as FileNotFoundException along with a self-explanatory message in the server logs whenever reading the file fails. Go read the server logs.
Update as per the comments, it turns out that you're using Windows and thus file:// on a network disk isn't going to work for Java without mapping it on a drive letter. You need to map //myip on a drive letter first, for example X:.
String path = "X:/storage/ogg/VX-276.ogg";
File file = new File(path);
// ...
In the end I used the Apache Commons VFS library, and my code looks like this:
public static void main(String[] args) {
FileSystemManager fsManager = null;
String path = "\\\\myip\\storage\\ogg\\VX-276.ogg";
try {
fsManager = VFS.getManager();
FileObject basePath;
basePath = fsManager.resolveFile("file:" + path);
if (basePath.exists())
System.out.println("exists");
else {
System.out.println("missing" + basePath.getURL());
}
} catch (FileSystemException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
This way, I don't need to map a network drive for each user of the system, and I don't depend on the operating system!
