Convert URL to normal Windows filename in Java

Is there a way to convert this:
/C:/Users/David/Dropbox/My%20Programs/Java/Test/bin/myJar.jar
into this?:
C:\Users\David\Dropbox\My Programs\Java\Test\bin\myJar.jar
I am using the following code, which will return the full path of the .JAR archive, or the /bin directory.
fullPath = new String(MainInterface.class.getProtectionDomain()
.getCodeSource().getLocation().getPath());
The problem is that getLocation() returns a URL and I need a normal Windows filename.
I have tried adding the following after getLocation():
toString() and toExternalForm() both return:
file:/C:/Users/David/Dropbox/My%20Programs/Java/Test/bin/
getPath() returns:
/C:/Users/David/Dropbox/My%20Programs/Java/Test/bin/
Note the %20 which should be converted to space.
Is there a quick and easy way of doing this?

The current recommendation (with JDK 1.7+) is to convert URL → URI → Path. So to convert a URL to File, you would say Paths.get(url.toURI()).toFile(). If you can’t use JDK 1.7 yet, I would recommend new File(URI.getSchemeSpecificPart()).
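As a rough sketch of the URL → URI → Path chain, applied to the code-source lookup from the question: the class name JarLocation and its two helpers are made up for illustration, and the sketch assumes Java 7 or later and an application class whose code source is not null.

```java
import java.net.URISyntaxException;
import java.net.URL;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper class; the names are illustrative, not from the question.
final class JarLocation {

    // Converts a file: URL to a Path by going through URI, which undoes
    // percent-encoding such as %20 and yields the platform's own separators.
    static Path toPath(URL url) {
        try {
            return Paths.get(url.toURI());
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Not a valid URI: " + url, e);
        }
    }

    // Location of the JAR (or class directory) a class was loaded from.
    // Assumption: getCodeSource() is not null, which holds for ordinary
    // application classes but not for classes from the bootstrap loader.
    static Path locationOf(Class<?> type) {
        URL location = type.getProtectionDomain().getCodeSource().getLocation();
        return toPath(location);
    }

    public static void main(String[] args) {
        System.out.println(locationOf(JarLocation.class));
    }
}
```

Call toString() or toFile() on the resulting Path if a String or File is needed.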
Converting file → URI: First I’ll show you some examples of what URIs you are likely to get in Java.
Path                      | -classpath URLClassLoader | File.toURI()                | Path.toUri()
C:\Program Files          | file:/C:/Program%20Files/ | file:/C:/Program%20Files/   | file:///C:/Program%20Files/
C:\main.c++               | file:/C:/main.c++         | file:/C:/main.c++           | file:///C:/main.c++
\\VBOXSVR\Downloads       | file://VBOXSVR/Downloads/ | file:////VBOXSVR/Downloads/ | file://VBOXSVR/Downloads/
C:\Résume.txt             | file:/C:/R%c3%a9sume.txt  | file:/C:/Résume.txt         | file:///C:/Résume.txt
\\?\C:\Windows (non-path) | file://%3f/C:/Windows/    | file:////%3F/C:/Windows/    | InvalidPathException
Some observations about these URIs:
The URI specifications are RFC 1738: URL, superseded by RFC 2396: URI, superseded by RFC 3986: URI. (The WHATWG also has a URI spec, but it does not specify how file URIs should be interpreted.) Any reserved characters within the path are percent-quoted, and non-ascii characters in a URI are percent-quoted when you call URI.toASCIIString().
File.toURI() is worse than Path.toUri() because File.toURI() returns an unusual non-RFC 1738 URI (gives file:/ instead of file:///) and does not format URIs for UNC paths according to Microsoft’s preferred format. None of these UNC URIs work in Firefox though (Firefox requires file://///).
Path is stricter than File; you cannot construct an invalid Path from a “\\.\” prefix. “These prefixes are not used as part of the path itself,” but they can be passed to Win32 APIs.
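To see the file:/ versus file:/// difference on your own machine, a small sketch like the following may help. UriForms and the sample name "Program Files" are placeholders, and the exact output depends on the platform and the working directory.

```java
import java.io.File;
import java.net.URI;
import java.nio.file.Paths;

// Hypothetical demo class comparing the two conversions side by side.
final class UriForms {

    // java.io.File route: produces the single-slash form (file:/...).
    static URI viaFile(String path) {
        return new File(path).toURI();
    }

    // java.nio.file.Path route: produces the triple-slash form (file:///...).
    static URI viaPath(String path) {
        return Paths.get(path).toUri();
    }

    public static void main(String[] args) {
        // Placeholder path; pass a real one as the first argument.
        String path = args.length > 0 ? args[0] : "Program Files";
        URI fromFile = viaFile(path);
        URI fromPath = viaPath(path);
        System.out.println("File.toURI(): " + fromFile);
        System.out.println("Path.toUri(): " + fromPath);
        // toASCIIString() additionally percent-encodes non-ASCII characters.
        System.out.println("ASCII form:   " + fromPath.toASCIIString());
    }
}
```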
Converting URI → file: Let’s try converting the preceding examples to files:
URI                         | new File(URI)            | Paths.get(URI)           | new File(URI.getSchemeSpecificPart())
file:///C:/Program%20Files  | C:\Program Files         | C:\Program Files         | C:\Program Files
file:/C:/Program%20Files    | C:\Program Files         | C:\Program Files         | C:\Program Files
file:///C:/main.c++         | C:\main.c++              | C:\main.c++              | C:\main.c++
file://VBOXSVR/Downloads/   | IllegalArgumentException | \\VBOXSVR\Downloads\     | \\VBOXSVR\Downloads
file:////VBOXSVR/Downloads/ | \\VBOXSVR\Downloads      | \\VBOXSVR\Downloads\     | \\VBOXSVR\Downloads
file://///VBOXSVR/Downloads | \\VBOXSVR\Downloads      | \\VBOXSVR\Downloads\     | \\VBOXSVR\Downloads
file://%3f/C:/Windows/      | IllegalArgumentException | IllegalArgumentException | \\?\C:\Windows
file:////%3F/C:/Windows/    | \\?\C:\Windows           | InvalidPathException     | \\?\C:\Windows
Again, using Paths.get(URI) is preferred over new File(URI), because Path is able to handle the UNC URI and reject invalid paths with the \\?\ prefix. But if you can’t use Java 1.7, say new File(URI.getSchemeSpecificPart()) instead.
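For the pre-1.7 fallback, a minimal sketch could look like this. LegacyUriToFile is an invented name, the scheme check is an extra safeguard rather than part of the recommendation, and the sketch assumes the URI has no query part (the scheme-specific part would include it).

```java
import java.io.File;
import java.net.URI;

// Hypothetical helper for environments without java.nio.file.
final class LegacyUriToFile {

    // getSchemeSpecificPart() returns everything after "file:" with
    // percent-escapes already decoded, so %20 becomes a space and "+" stays.
    static File toFile(URI uri) {
        if (!"file".equalsIgnoreCase(uri.getScheme())) {
            throw new IllegalArgumentException("Not a file URI: " + uri);
        }
        return new File(uri.getSchemeSpecificPart());
    }

    public static void main(String[] args) {
        URI uri = URI.create("file:/C:/Program%20Files/main.c++");
        // Prints: /C:/Program Files/main.c++
        System.out.println(uri.getSchemeSpecificPart());
        // The File form depends on the platform's separator conventions.
        System.out.println(toFile(uri).getPath());
    }
}
```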
By the way, do not use URLDecoder to decode a file URL. For files containing “+” such as “file:///C:/main.c++”, URLDecoder will turn it into “C:\main.c  ”! URLDecoder is only for parsing application/x-www-form-urlencoded HTML form submissions within a URI’s query (param=value&param=value), not for unquoting a URI’s path.
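The “+” problem is easy to reproduce. The sketch below (PlusSignDemo is an invented name, and the path is a sample) decodes the same raw path once with URLDecoder and once through java.net.URI.

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URLDecoder;

// Hypothetical demo class contrasting form decoding with URI path decoding.
final class PlusSignDemo {

    // Form decoding: treats "+" as an encoded space.
    static String viaUrlDecoder(String rawPath) {
        try {
            return URLDecoder.decode(rawPath, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    // URI decoding: only percent-escapes are decoded, "+" is left alone.
    static String viaUri(String rawPath) {
        return URI.create("file://" + rawPath).getPath();
    }

    public static void main(String[] args) {
        String raw = "/C:/My%20Programs/main.c++";
        System.out.println("[" + viaUrlDecoder(raw) + "]"); // [/C:/My Programs/main.c  ]
        System.out.println("[" + viaUri(raw) + "]");        // [/C:/My Programs/main.c++]
    }
}
```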
2014-09: edited to add examples.

String path = "/c:/foo%20bar/baz.jpg";
path = URLDecoder.decode(path, "utf-8");
path = new File(path).getPath();
System.out.println(path); // prints: c:\foo bar\baz.jpg

The current answers seem fishy to me.
java.net.URL.getFile
turns a file URL such as this
java.net.URL = file:/C:/some/resource.txt
into this
java.lang.String = /C:/some/resource.txt
so you can use this constructor
new File(url.getFile)
to give you the Windows path
java.io.File = C:\some\resource.txt
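One caveat worth checking before relying on getFile: as far as I can tell it does not decode percent-escapes, so it only works when the path has no spaces or other escaped characters. A small sketch, with GetFileDemo and the sample URL being made up for illustration:

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;

// Hypothetical demo class; shows what happens to %20 on each route.
final class GetFileDemo {

    static URL fileUrl(String spec) {
        try {
            return URI.create(spec).toURL();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException("Bad URL: " + spec, e);
        }
    }

    // Route from the answer above: the escape ends up in the file name.
    static String parentNameViaGetFile(URL url) {
        return new File(url.getFile()).getParentFile().getName();
    }

    // Route through URI: the escape is decoded first.
    static String parentNameViaUri(URL url) {
        try {
            return new File(url.toURI()).getParentFile().getName();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Bad URI: " + url, e);
        }
    }

    public static void main(String[] args) {
        URL url = fileUrl("file:/C:/some%20dir/resource.txt");
        System.out.println(url.getFile());             // /C:/some%20dir/resource.txt
        System.out.println(parentNameViaGetFile(url)); // some%20dir
        System.out.println(parentNameViaUri(url));     // some dir
    }
}
```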

As was mentioned, getLocation() returns a URL. File can easily convert a URI to a path, so for me the simplest way is just to use:
File fullPath = new File(MainInterface.class.getProtectionDomain().
getCodeSource().getLocation().toURI());
Of course if you really need String, just modify to:
String fullPath = new File(MainInterface.class.getProtectionDomain().
getCodeSource().getLocation().toURI()).toString();
You don't need URLDecoder at all.

The following code is what you need:
String path = URLDecoder.decode("/C:/Users/David/Dropbox/My%20Programs/Java/Test/bin/", "UTF-8");
System.out.println(new File(path).getPath());

Hello confused people from the future. There is a nuance to the file path configuration here. The path you are setting for TESSDATA_PREFIX is used internally by the C++ tesseract program, not by the Java wrapper. This means that if you're using Windows you will need to remove the leading slash and replace all other forward slashes with backslashes. A very hacky workaround looks like this:
URL pathUrl = this.getClass().getResource(TESS_DATA_PATH);
String pathStr = pathUrl.getPath();
// hack to get around windows using \ instead of /
if (SystemUtils.IS_OS_WINDOWS) {
pathStr = pathStr.substring(1);
pathStr = pathStr.replaceAll("/", "\\\\");
}

Related

How to convert network path to URL in Java

I have literally searched the whole internet for this question but I have not found an answer. I have a file on the network and I want to create an iText image from it, and for that I have to convert its path to a URL. The problem is that when I use path.toURI().toURL() it prepends my project path to the URL, so that my URL ends up starting with C:/, which will not work.
Is there a way to just convert a string to file URL in java?
I have tried this:
String paths = "‪\\\\DESKTOP-A11F076\\Users\\Benson Korir\\Desktop\\walgotech\\passport.jpg";
String first = "file:" + paths.replaceAll("\\\\", "//").replaceAll("////", "//");
String second = "file://desktop-a11f076//Users//Benson Korir//Desktop//walgotech//passport.jpg";
System.out.println(first);
System.out.println(second);
I copied the second string directly from the browser and it works fine. The funny thing is that these two strings print the same thing, but the first string causes an error when I use it here:
Image image1 = Image.getInstance(second);
I am getting the error below:
java.io.FileNotFoundException: ‪\DESKTOP-A11F076\Users\Benson Korir\Desktop\walgotech\passport.jpg (The system cannot find the path specified)
If I got your requirement correctly, your path is a UNC file name, and that is the short form of an SMB path, with DESKTOP-A11F076 being the remote machine, and \Users\Benson Korir\Desktop\walgotech\passport.jpg being the path to the file on that machine.
If I am correct with that assumption, my understanding is that your URL has to look like this: smb://DESKTOP-A11F076/Users/Benson Korir/Desktop/walgotech/passport.jpg.
As far as I remember, a java.io.File object is capable of handling a UNC file name (this article implies that, too), but when translating it to a URI, it tries to make it absolute first, and that is where it fails in your case.
I usually avoid working on Windows whenever possible, therefore I have no environment to verify that.
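If a file: URL is what is needed rather than smb:, one option is to build the URI by hand so the machine name becomes the URI host. This is only a sketch: UncToUri is an invented name, it assumes the string starts with exactly two backslashes and no stray characters before them, and I have not verified that iText accepts the result.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper: \\host\share\dir\file -> file://host/share/dir/file
final class UncToUri {

    static URI toFileUri(String unc) {
        if (!unc.startsWith("\\\\")) {
            throw new IllegalArgumentException("Not a UNC path: " + unc);
        }
        String rest = unc.substring(2);
        int firstSeparator = rest.indexOf('\\');
        if (firstSeparator <= 0) {
            throw new IllegalArgumentException("Missing share name: " + unc);
        }
        String host = rest.substring(0, firstSeparator);
        String path = rest.substring(firstSeparator).replace('\\', '/');
        try {
            // The multi-argument constructor percent-encodes spaces in the path.
            return new URI("file", host, path, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Cannot convert: " + unc, e);
        }
    }

    public static void main(String[] args) {
        String unc = "\\\\DESKTOP-A11F076\\Users\\Benson Korir\\Desktop\\walgotech\\passport.jpg";
        // Prints: file://DESKTOP-A11F076/Users/Benson%20Korir/Desktop/walgotech/passport.jpg
        System.out.println(toFileUri(unc));
    }
}
```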

ClassLoader.getResource returns odd path (maybe)?

When loading an asset such as a text file from the resources folder, the most common approach is to use ClassLoader to get the path:
String path = getClass().getClassLoader().getResource("file.txt").getPath();
You can then use any of the many readers that java has to read the content of that file. But for some reason, Paths.get(path) is not happy with the path:
byte[] content = Files.readAllBytes(Paths.get(path)); // throws java.nio.file.InvalidPathException when executed
ClassLoader.getResource(...).getPath() is returning:
/D:/Projects/myapp/build/resources/main/file.txt
Paths.get() doesn't like it. Apparently the ':' after /D is an 'Illegal char'. (Note that the path seems correct, the file is actually there)
Which one is causing the problem? Is ClassLoader.getResource() returning an invalid path or is Paths.get() acting up over nothing?
Some time later
It seems that there are multiple different formats for paths in java. The various frameworks don't appear to completely agree on what is right and what is wrong, therefore there are various discrepancies between the paths that they create and accept.
In this example, Paths.get() was in fact not expecting the leading slash in the path:
/D:/Projects/myapp/build/resources/main/vertex.vs.glsl <- EVIL
D:/Projects/myapp/build/resources/main/vertex.vs.glsl <- OK
I suppose that the question now is: How do I sanitise file paths returned by ClassLoader.getResource() for use with Paths.get() properly? Are there any other differences between their two file path formats?
"the most common approach" is not necessarily the best :)
Take care which path you mean: ClassLoader.getResource() returns a URL, which can have a path component. However, this is not necessarily a valid file-path.
Note, that there is also a method Paths.get(URI) which takes a URI as parameter
The first slash in /D:/Projects/myapp/build/resources/main/file.txt just means, that this is an absolute path: see Class.getResource
I recommend that you simply use ClassLoader.getResourceAsStream() when you want to read a file.
Update to answer comment: "So why does Paths.get() not accept the absolute path?"
Paths.get() does accept absolute paths.
But you must pass a valid (file-)path - and in your case you pass the URL-path directly (which is not a valid file-path).
When you call: getClass().getClassLoader().getResource("file.txt") it returns a URL: file:/D:/Projects/myapp/build/resources/main/file.txt
this URL consists of the scheme file:
and the valid (absolute URL)path: /D:/Projects/myapp/build/resources/main/file.txt
you try to use this URL-path directly as a file-path, which is wrong
thus the Paths.get(String,..) method throws an InvalidPathException
To convert the URL path to a valid file-path you could use the Paths.get(URI) method like so:
URL fileUrl = getClass().getClassLoader().getResource("file.txt");
Path filePath = Paths.get(fileUrl.toURI());
// now you have a valid file-path: D:/Projects/myapp/build/resources/main/file.txt
Please, have a look at the result of getClass().getClassLoader().getResource("file.txt"). It's a URL. With getPath() then you just retrieve the path part of that URL, ignoring protocol and server part. Opening the path part as a file might work under certain circumstances (in the easy cases where file syntax and URL path syntax match), but don't do it in production code.
Why? When you leave your IDE and deliver your application as JAR or WAR, the resources will reside inside a ZIP-compressed file, and there will be no file "file.txt" that you can open, there's only an entry in a JAR or WAR file.
As @TmTron pointed out, I also recommend using ClassLoader.getResourceAsStream(). That will work in all cases.
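A minimal sketch of the stream-based approach, for reference. ResourceReader is an invented helper, and it uses Class.getResourceAsStream (names are relative to the class unless they start with "/") rather than the ClassLoader variant; the manual copy loop is there so it also compiles on Java 8.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical helper that reads a classpath resource without touching file paths.
final class ResourceReader {

    static byte[] read(Class<?> anchor, String name) {
        try (InputStream in = anchor.getResourceAsStream(name)) {
            if (in == null) {
                throw new IllegalArgumentException("Resource not found: " + name);
            }
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[8192];
            int count;
            while ((count = in.read(buffer)) != -1) {
                out.write(buffer, 0, count);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Reads this class's own bytecode as a stand-in for "file.txt".
        byte[] bytes = read(ResourceReader.class, "ResourceReader.class");
        System.out.println(bytes.length + " bytes read");
    }
}
```

For a text resource, something like new String(ResourceReader.read(MyApp.class, "/file.txt"), StandardCharsets.UTF_8) would give the content, where MyApp stands for any class on the same classpath.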

Java.io.file constructor to deal with UNC file path

When I try to use a JAR file on a UNC path, I run into a problem: the java.io.File constructor always converts the UNC file path to a local path.
For example, I try
String dirStr = "file:\\\\dir1\dir2\file.jar!Myclass";
File ff = new File(dirStr);
System.out.println(ff.toString());
I'll get output like: file:\dir1\dir2\file.jar!Myclass. But what I expect to get is file:\\dir1\dir2\file.jar!MyClass.
I tried adding more slashes to dirStr, but that doesn't work, because java.io.File calls a method that removes duplicate slashes.
I also tried using a URI to create ff, but then the output is \dir1\dir2\file.jar!Myclass, which cannot be used to open the JAR file. I think the string must start with the file: protocol so that the part ending with ! in \dir1\dir2\file.jar!Myclass can be parsed.
Is there any way to make new File() produce a pathname for ff like file:\\dir1\dir2\file.jar!MyClass?
Since your input directory string is a UNC path, I think you should use Java's URI.
Example code:
URI uri = new URI(dirStr);
System.out.println(uri.toString()); // If you want to get the path as URI
File ff = new File(uri.getPath()); // If you want to access the file.
Another, better way is to use Path:
URI uri = new URI(dirStr);
Path path = Paths.get(uri); // Or directly Path path = Paths.get(dirStr);
File ff = path.toFile(); // << your file here
path.toUri(); // << your uri path here
The constructor File(String) takes a path, not a URL. Remove the file: part and use two backslashes for every one in the actual filename, to satisfy the compiler's escaping rules. Or use the correct number of forward slashes.
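If the goal is a URL that points inside the JAR, one possible approach is to keep the File as a plain path and build the jar: URL string from it. The sketch assumes the jar:<url>!/<entry> syntax documented for java.net.JarURLConnection; JarUrlSpec, the UNC path, and the entry name are placeholders.

```java
import java.io.File;

// Hypothetical helper that keeps the file path and the jar: URL separate.
final class JarUrlSpec {

    // Builds "jar:<file URI>!/<entry>" from a plain file path.
    static String forEntry(File jar, String entry) {
        return "jar:" + jar.toURI() + "!/" + entry;
    }

    public static void main(String[] args) {
        // Each backslash of the UNC path is doubled to satisfy Java's escaping rules.
        File jar = new File("\\\\dir1\\dir2\\file.jar");
        System.out.println(jar.getPath());
        System.out.println(forEntry(jar, "MyClass.class"));
    }
}
```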

filesystem.getPath() returns wrong path

This problem is driving me crazy. I have a file I would like to reach in my src/main/resources folder and I am trying to obtain the path via:
FileSystem fileSystem = FileSystems.getDefault();
Path path = fileSystem.getPath(AnalysisEngine.class.getResource("/models/10_NB_7dev_2.model").getFile());
However, I keep getting the following error:
Illegal char <:> at index 2: /C:/Users/...(the path is here)/models/10_NB_7dev_2.model
As you can see, the path returned has '/' before C:, which ruins everything. What is the reason and how could this be fixed? Is there an alternative with java.io package?
I am using Windows 8 - 64 bit OS, if it helps.
The URL returned by Class#getResource(String) contains a preceding /.
/C:/Users/...(the path is here)/models/10_NB_7dev_2.model
That's just how URLs work. Then the FileSystem tries to parse that, but it makes no sense to it that there is a : character in the mix, so it throws an exception. In other words, getPath() is trying to create a path, not a url. You cannot have a : character in a Windows (possibly linux as well) path, unless it is directly following the Drive name as the first two characters of the path string.
The solution here is not to use the path of a classpath resource. A classpath resource might not come from the filesystem directly, it might be inside a jar.
...(the path is here)/models/10_NB_7dev.model
in your code you put:
("/models/10_NB_7dev_2.model").
Did you mean to include the _2?
If you are not worried about using the default filesystem (e.g. if you aren't using an in-memory filesystem for testing) then you can do:
URI uri = AnalysisEngine.class.getResource("/models/10_NB_7dev_2.model").toURI();
Path path = Paths.get(uri);
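When a library insists on a real file but the resource may live inside a JAR, one possible pattern is to use the path directly for file: URLs and copy everything else to a temporary file. This is a sketch under those assumptions; ResourcePaths and materialize are invented names.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URISyntaxException;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

// Hypothetical helper that always hands back a Path on the default filesystem.
final class ResourcePaths {

    static Path materialize(URL resource) {
        try {
            if ("file".equals(resource.getProtocol())) {
                // Already on disk: convert URL -> URI -> Path.
                return Paths.get(resource.toURI());
            }
            // Inside a JAR (or similar): copy the bytes to a temporary file.
            Path copy = Files.createTempFile("resource-", ".tmp");
            copy.toFile().deleteOnExit();
            try (InputStream in = resource.openStream()) {
                Files.copy(in, copy, StandardCopyOption.REPLACE_EXISTING);
            }
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Bad resource URL: " + resource, e);
        }
    }

    public static void main(String[] args) {
        // Uses this class's own bytecode as a stand-in for the model file.
        URL resource = ResourcePaths.class.getResource("ResourcePaths.class");
        System.out.println(materialize(resource));
    }
}
```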

Convert URL to AbsolutePath

Is there any easy way to convert a URL that contains two-byte characters into an absolute path?
The reason I ask is I am trying to find resources like this:
URL url=getClass().getResource("/getresources/test.txt");
String path=url.toString();
File f=new File(path);
The program can't find the file. I know the path contains '%20' for all spaces, which I could convert, but my real problem is that I'm using a Japanese OS, and when the program's JAR file is in a directory with Japanese text (for example デスクトップ) I get the URL-encoding of the directory name,
like this:
%e3%83%87%e3%82%b9%e3%82%af%e3%83%88%e3%83%83%e3%83%97
I think I could get the UTF-8 byte codes and convert this into the proper characters to find the file, but I'm wondering if there is an easier way to do this. Any help would be greatly appreciated.
URL url = getClass().getResource("/getresources/test.txt");
File f = new File(url.toURI());
If you were interested in getting Path from URL, you can do:
Path p = Paths.get(url.toURI());
File has a constructor taking an argument of type java.net.URI for this case:
File f = new File(url.toURI());
Another option for those who use Java 11 or later:
Path path = Path.of(url.toURI());
or as a string:
String path = Path.of(url.toURI()).toString();
Both methods above throw a URISyntaxException that can be safely ignored if the URL is guaranteed to be a file URL.
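To check that the URI route really decodes the Japanese directory name, a small sketch with the escape sequence from the question can be used. EncodedDirDemo is an invented name, and the code sticks to ASCII with \u escapes so it compiles regardless of source encoding.

```java
import java.io.File;
import java.net.URI;

// Hypothetical demo class; decodes the percent-encoded UTF-8 directory name.
final class EncodedDirDemo {

    static File toFile(String fileUri) {
        return new File(URI.create(fileUri));
    }

    public static void main(String[] args) {
        // Percent-encoded UTF-8 bytes of the directory name from the question.
        String encoded = "%e3%83%87%e3%82%b9%e3%82%af%e3%83%88%e3%83%83%e3%83%97";
        File file = toFile("file:/" + encoded + "/getresources/test.txt");
        String dirName = file.getParentFile().getParentFile().getName();
        // Six katakana characters instead of 54 characters of escapes.
        System.out.println(dirName.length()); // 6
        System.out.println(dirName.equals("\u30c7\u30b9\u30af\u30c8\u30c3\u30d7")); // true
    }
}
```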
