get file list from glob without specifying base directory - java

There are previous questions about checking if a file matches a glob pattern (here is one). However, I would like to get a list of files that match a glob pattern without having to specify the base directory to search. I need to accept both relative and absolute directories (I resolve the relative ones to a specified directory), and it needs to be cross-platform compatible.
Given a string such as "C:/users/foo/*", "/user/foo/*.txt" or "dir/*.txt", how do I get the list of matching paths?

Yes, you'll need a programmatic way to find out if a glob pattern is absolute. This can be done as follows:
for (String glob : new String[] { "../path/*.txt", "c:/../path/*.txt", "/../path/*.txt" }) {
    System.out.println(glob + ": is " + (new File(glob).isAbsolute() ? "absolute" : "relative"));
}
On Windows this will output
../path/*.txt: is relative
c:/../path/*.txt: is absolute
/../path/*.txt: is relative
On unix the last is absolute. If you know the glob pattern is relative, prepend the special directory to it. After that you'll have an absolute path for all glob patterns and can use that to specify it for the search.
EDIT 1
As per your comment, you can do the following. You can also mix and match nio and io. Note that java.io.File.isAbsolute() only checks the FORMAT of the path, in a platform-specific manner, to determine whether it is in absolute or relative form; it does not check whether the file actually exists.
String baseDir = "c:/BaseDir/";
for (String glob : new String[] { "../path/*.txt", "c:/../path/*.txt", "/../path/*.txt" }) {
    File file = new File(glob);
    if (!file.isAbsolute()) {
        // Relative pattern: resolve it against the base directory
        file = new File(baseDir, glob);
    }
    System.out.println(file.getPath() + ": is " + (file.isAbsolute() ? "absolute" : "relative"));
}
On Windows, this will print
c:\BaseDir\..\path\*.txt: is absolute
c:\..\path\*.txt: is absolute
c:\BaseDir\..\path\*.txt: is absolute
You will still have to do the globbing yourself, or use one of the methods described in the post you mentioned (How to find files that match a wildcard string in Java?).
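One way to do the globbing step with java.nio is sketched below. This is a minimal sketch and not a complete glob implementation. It assumes the pattern uses "/" as the separator and that everything before the first wildcard character is a literal directory prefix. The class name GlobLister and the method names are made up for illustration. The literal prefix is resolved against the base directory (an absolute prefix simply wins), and the remainder is handed to a PathMatcher that is tested against paths relative to that prefix.

```java
import java.io.IOException;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class GlobLister {

    /** Lists paths matching the glob; relative globs are resolved against baseDir. */
    static List<Path> list(String glob, Path baseDir) throws IOException {
        // Split at the last '/' before the first wildcard (assumption: '/' separators).
        int firstMeta = firstWildcard(glob);
        int cut = firstMeta < 0 ? glob.lastIndexOf('/') : glob.lastIndexOf('/', firstMeta);
        String prefix = cut < 0 ? "" : glob.substring(0, cut + 1);
        String pattern = glob.substring(cut + 1);

        // resolve() returns the prefix itself when it is already absolute.
        Path root = baseDir.resolve(prefix).normalize();
        if (!Files.isDirectory(root)) {
            return Collections.emptyList();
        }

        PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + pattern);
        try (Stream<Path> walk = Files.walk(root)) {
            // Match the part of each path below the literal prefix.
            return walk.filter(p -> matcher.matches(root.relativize(p)))
                       .sorted()
                       .collect(Collectors.toList());
        }
    }

    /** Index of the first glob metacharacter, or -1 if there is none. */
    private static int firstWildcard(String glob) {
        for (int i = 0; i < glob.length(); i++) {
            if ("*?[{".indexOf(glob.charAt(i)) >= 0) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical usage: first argument is the glob, base directory is the CWD.
        String glob = args.length > 0 ? args[0] : "*.txt";
        for (Path p : list(glob, Paths.get("").toAbsolutePath())) {
            System.out.println(p);
        }
    }
}
```

The whole subtree under the literal prefix is walked, which is simple but can be slow for a prefix such as "/"; limiting the depth when the pattern contains no "**" would be a reasonable refinement.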

Related

list the filenames without showing the parent path

I am listing all the files names in a given directory( recursively). That includes showing the file names in sub-directories also.
File file = new File(FILE_PATH);
// Recursively search for all the resource files.
Collection files = FileUtils.listFiles(file, TrueFileFilter.INSTANCE, TrueFileFilter.INSTANCE);
for (Iterator iterator = files.iterator(); iterator.hasNext();)
{
    File fileIter = (File) iterator.next();
    System.out.println("File = " + fileIter.getPath());
}
Where file is the parent directory ("C:\Users\sd\Desktop\sdsd").
The code above works fine and lists all the files in that directory and its subdirectories, like
C:\Users\sd\Desktop\sdsd\TagCategory\healthoutcomes_queries\Neurological.txt
but I want to show only (the path inside of the parent path)
TagCategory\healthoutcomes_queries\Neurological.txt
How can I do that?
Use Path.relativize()
Constructs a relative path between this path and a given path.
Relativization is the inverse of resolution. This method attempts to
construct a relative path that when resolved against this path, yields
a path that locates the same file as the given path. For example, on
UNIX, if this path is "/a/b" and the given path is "/a/b/c/d" then the
resulting relative path would be "c/d".
So you just need to create a relative path from the parent path by invoking parentPath.relativize(filePath) and do it for each file :
Path parentPath = Paths.get(FILE_PATH);
for (Iterator<File> iterator = files.iterator(); iterator.hasNext();) {
    Path filePath = iterator.next().toPath();
    Path relativePath = parentPath.relativize(filePath);
    System.out.println("File = " + relativePath);
}
Note that you should use a generic collection to avoid casts : Collection<File>, and the modern idiom for looping through iterators using the "enhanced for loop" is cleaner to read:
for (File file : files) {
    System.out.println("File = " + parentPath.relativize(file.toPath()));
}
Just add substring:
fileIter.getPath().substring(file.getPath().length() + 1)
You can use the substring command to get that value as per the below..
If that parent directory is going to remain the same length then it would simply be
fileIter.getPath().substring(25);
This returns everything after the first 25 characters of the string. If you also wanted to omit the .txt extension, for example, you can specify where the substring ends; the line below drops the last four characters (the dot plus the three-letter extension).
fileIter.getPath().substring(25, fileIter.getPath().length() - 4);
For more details on the substring method see https://beginnersbook.com/2013/12/java-string-substring-method-example/
What's the point of complicating your code by using old third-party libraries? Just use plain Java: it does exactly the same thing as your multi-line method:
Path root = Paths.get(FILE_PATH);
Files.walk(root).forEach(path -> System.out.println("File = " + root.relativize(path)));
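Note that Files.walk also visits directories, including the root itself, whereas the question's listing shows only files. A slightly fuller sketch of the plain-Java approach follows. It assumes only regular files are wanted, and it closes the stream, which the Files.walk documentation asks for. The class and method names are invented for this example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class RelativeListing {

    /** Returns every regular file under root, expressed relative to root. */
    static List<Path> relativeFiles(Path root) throws IOException {
        // try-with-resources closes the directory handles held by the stream
        try (Stream<Path> paths = Files.walk(root)) {
            return paths.filter(Files::isRegularFile) // skip directories, including root
                        .map(root::relativize)        // strip the parent path
                        .sorted()
                        .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical usage: pass the parent directory as the first argument.
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        for (Path relative : relativeFiles(root)) {
            System.out.println("File = " + relative);
        }
    }
}
```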

How to get the file path of a file

If I already have an existing file and I want to know its path using only its name, how can I do this?
I have the following code, but it prints the name of the file even when the file does not exist:
PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:**.{java,class}");
Path filename = Paths.get("Go,mvDep.java");
if (matcher.matches(filename)) {
System.out.println(filename);
}
Thank you for your help!
I think the core of your confusion is that a Path does not necessarily represent a file on your computer. It is simply a Java object that represents a conceptual object in a file system. In the same way that you could construct a new Person("John", "Smith") without actually knowing anyone named 'John Smith', you can construct a Path regardless of whether or not a file exists at the given location.
Once you have a Path there are a number of things you can do with it, including check if it exists via Files.exists(), or create it with Files.createFile(). Generally speaking, the Files class lets you inspect and work with the actual file system objects a Path represents.
The intent of a PathMatcher is similarly disconnected from the actual file system; it exists to determine if a given Path fits the PathMatcher's pattern - it's basically a Path-specific regular expression engine.
So what your code is actually doing is:
Creating a glob that will match any path which ends in .java or .class (regardless of whether such a path exists anywhere).
Constructing a relative Path to a file called Go,mvDep.java. Implicitly this path is relative to the current working directory, but you could pass it to Path.resolve() to create a new Path referring to a file in a different location.
Checking if the path Go,mvDep.java matches your glob, which it does since it ends in .java, so it prints the path.
It sounds like what you actually want is to find an existing file with the name Go,mvDep.java. If so, you want to use Files.find() to search a directory and return a stream of the files that match a BiPredicate<Path, BasicFileAttributes> matcher you define. Your matcher might look something like this:
new BiPredicate<Path, BasicFileAttributes>() {
    public boolean test(Path path, BasicFileAttributes attributes) {
        return matcher.matches(path);
    }
}
Or in Lambda syntax simply:
(p, a) -> matcher.matches(p)
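Put together, a search for an existing file by name might look like the sketch below. It is only an illustration: the class name FindByName, the method name, and the choice to compare the file name exactly (rather than reuse the glob matcher) are assumptions, and the starting directory is whatever the caller supplies.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class FindByName {

    /** Searches startDir recursively for regular files with exactly this name. */
    static List<Path> findByName(Path startDir, String fileName) throws IOException {
        try (Stream<Path> hits = Files.find(
                startDir,
                Integer.MAX_VALUE, // no depth limit
                (path, attrs) -> attrs.isRegularFile()
                        && path.getFileName().toString().equals(fileName))) {
            // Only paths that exist on disk are ever passed to the predicate.
            return hits.map(Path::toAbsolutePath).collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Search the current working directory for the file from the question.
        for (Path hit : findByName(Paths.get("."), "Go,mvDep.java")) {
            System.out.println(hit);
        }
    }
}
```

Unlike the PathMatcher-only version, this prints nothing when no such file exists, because Files.find only reports entries it actually encounters while walking the directory tree.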

How to get the absolute path of a file when given a relative or absolute path and the absolute path it is relative to

Let's say I have an absolute 'base' path:
/home/someone/dir1/dir2/
The user can pass me a new path, that can either be absolute or relative to base path, so the following would both be valid:
..
/home/someone/dir1/
How do I get java to give me the correct absolute path ie for both these cases:
/home/someone/dir1/
and do this in a platform-independent way?
I tried the following:
File resolvedFile = new File((new File(basePath).toURI().resolve(new File(newPath).toURI())));
However, where newPath was relative, new File(newPath).toURI() resolves it automatically against the current working directory, rather than against the basePath I want to supply.
Any thoughts?
Many thanks!
Answering my own question..
Seems like it can be done in java 7 using Path:
Path p1 = Paths.get("/home/joe/foo");
// Result is /home/joe/foo/bar
System.out.format("%s%n", p1.resolve("bar"));
Since I can't get java 7 for my mac 10.5.8, I'm going with something like (NB NOT THOROUGHLY TESTED!):
static String getAbsolutePath(String basePath, String relativeOrAbsolutePath) throws IOException {
    boolean isAbsolute = false;
    File relativeOrAbsoluteFile = new File(relativeOrAbsolutePath);
    if (relativeOrAbsoluteFile.isAbsolute()) {
        isAbsolute = true;
    }
    if (isAbsolute) {
        // Already absolute: return it unchanged
        return relativeOrAbsolutePath;
    }
    else {
        // Relative: resolve against the supplied base path
        File absoluteFile = new File(basePath, relativeOrAbsolutePath);
        return absoluteFile.toString();
    }
}
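For readers who do have Java 7 or later, the same logic can be sketched with Path.resolve, which already returns the argument unchanged when it is absolute. The normalize() call is an addition here to collapse ".." so that both inputs from the question give the same result. The class and method names are illustrative only, and the result is absolute only if the base path is.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

class ResolveAgainstBase {

    /** Resolves a relative or absolute path against basePath and collapses "." and "..". */
    static Path toAbsolute(String basePath, String relativeOrAbsolutePath) {
        Path base = Paths.get(basePath);
        // resolve() keeps an absolute argument as-is and appends a relative one to base
        return base.resolve(relativeOrAbsolutePath).normalize();
    }

    public static void main(String[] args) {
        String base = "/home/someone/dir1/dir2/";
        // On a Unix-like system both lines print /home/someone/dir1
        System.out.println(toAbsolute(base, ".."));
        System.out.println(toAbsolute(base, "/home/someone/dir1/"));
    }
}
```

normalize() is purely lexical and does not touch the file system, so unlike getCanonicalPath() it does not follow symbolic links.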
Take a look at File#getCanonicalPath
From the JavaDocs:
Returns the canonical pathname string of this abstract pathname. A
canonical pathname is both absolute and unique. The precise definition
of canonical form is system-dependent. This method first converts this
pathname to absolute form if necessary, as if by invoking the
getAbsolutePath() method, and then maps it to its unique form in a
system-dependent way. This typically involves removing redundant names
such as "." and ".." from the pathname, resolving symbolic links (on
UNIX platforms), and converting drive letters to a standard case (on
Microsoft Windows platforms).
Every pathname that denotes an existing file or directory has a unique
canonical form. Every pathname that denotes a nonexistent file or
directory also has a unique canonical form. The canonical form of the
pathname of a nonexistent file or directory may be different from the
canonical form of the same pathname after the file or directory is
created. Similarly, the canonical form of the pathname of an existing
file or directory may be different from the canonical form of the same
pathname after the file or directory is deleted.
Try this in your code:
System.setProperty("user.dir", "your_base_path");
Not sure if this works outside of my setup (windows platform, JRE 1.6.x)
but the following worked like a trick:
File path = new File(relativeOrAbsoluteGoldpath);
String absolutePath = path.getCanonicalPath();
where relativeOrAbsoluteGoldpath is an arbitrary path name that may or may not be relative.

Java can't get the path of a file that exists in the current directory

If a file exists in the same directory where a Java application is running and I create a File object for that file the Java File methods for the path of the file include the filename as well. Code and output are below.
If this was a bug in the JDK version I'm using someone would surely have seen it by now.
Why do File.getAbsolutePath() and File.getCanonicalPath() include the file name? The Javadocs indicate that the directory name should be returned.
import java.io.File;
import java.io.IOException;

public class DirectoryFromFile {

    private void getDirectoryOfFile(String fileName) throws IOException {
        File f = new File(fileName);
        System.out.println("exists(): " + f.exists());
        System.out.println("getPath(): " + f.getPath());
        System.out.println("getAbsolutePath(): " + f.getAbsolutePath());
        System.out.println("getParent(): " + f.getParent());
        System.out.println("getCanonicalPath(): " + f.getCanonicalPath());
        System.out.println("getAbsoluteFile().getCanonicalPath(): " + f.getAbsoluteFile().getCanonicalPath());
        String dirname = f.getCanonicalPath();
        System.out.println("dirname: " + dirname);
        File dir = new File(dirname);
        System.out.println("dir: " + dir.getAbsolutePath());
        if (dirname.endsWith(fileName))
            dirname = dirname.substring(0, dirname.length() - fileName.length());
        System.out.println("dirname: " + dirname);
        File dir2 = new File(dirname);
        System.out.println("dir2: " + dir2.getAbsolutePath());
    }

    public static void main(String[] args) {
        DirectoryFromFile dff = new DirectoryFromFile();
        try {
            dff.getDirectoryOfFile("test.txt");
        } catch (IOException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }
    }
}
Here's the output:
exists(): true
getPath(): test.txt
getAbsolutePath(): C:\dean\src\java\directorytest\directory.from.file\test.txt
getParent(): null
getCanonicalPath(): C:\dean\src\java\directorytest\directory.from.file\test.txt
getAbsoluteFile().getCanonicalPath(): C:\dean\src\java\directorytest\directory.from.file\test.txt
dirname: C:\dean\src\java\directorytest\directory.from.file\test.txt
dir: C:\dean\src\java\directorytest\directory.from.file\test.txt
dirname: C:\dean\src\java\directorytest\directory.from.file\
dir2: C:\dean\src\java\directorytest\directory.from.file
So far the only way I've found to get the directory in this case is to manually parse off the file name.
Does the File class have a way to get the directory name in this case (where a File that exists in the current directory is created without specifying a directory)?
Why do File.getAbsolutePath() and File.getCanonicalPath() include the
file name? The Javadocs indicate that the directory name should be
returned.
No, they don't. If you'd care to point out why you think they do, someone can probably identify the mistake in your reasoning. Also, if you specify exactly what you'd like to see for output given some particular input, we can help you out there, too. Your question title seems strange, too, since your problem seems to be that it is returning the full path to a file.
Edit: I think I understand the source of your confusion. A File represents a file system path in a platform-agnostic way. It can be a path to a file or to a directory. It also always represents the same path, though not necessarily the same absolute path. This is a very fine distinction but a very important one. A File object representing a relative path is always relative. Given a File representing a relative path, you can get the current corresponding absolute path using getAbsolutePath(). This doesn't, however, alter the fact that the File represents a relative path. Further invocations of getAbsolutePath() on the same File object may return different values. Consider, for example:
// A relative file
File foo = new File("foo.txt");
// Resolve relative file against CWD
System.out.println(foo.getAbsolutePath());
// Output: D:\dev\projects\testbed\foo.txt
System.setProperty("user.dir", "C:\\somewhere");
// Resolve relative file against new CWD
System.out.println(foo.getAbsolutePath());
// Output: C:\somewhere\foo.txt
// Get an absolute file
File absoluteFoo = foo.getAbsoluteFile();
// Show absolute path
System.out.println(absoluteFoo.getAbsolutePath());
// Output: C:\somewhere\foo.txt
System.setProperty("user.dir", "D:\\somewhere-else");
// An absolute path doesn't change when the CWD changes
System.out.println(absoluteFoo.getAbsolutePath());
// Output: C:\somewhere\foo.txt
It should be clear now that the path a File represents is only that: a path. Further, a path can be composed of zero or more parts, and calling getParent() on any File gives back the path of that File with the last path element removed unless there isn't a "last path element" to remove. Thus the expected result of new File("foo").getParent() is null since the relative path "foo" has no parent.
From the example and explanation above, you should be able to see that the way to get the containing directory when you've created relative-path File object is with
String absoluteParentDirPath = someRelativeFile.getAbsoluteFile().getParent();
with the caveat that the "absolute path" depends on your environment at the time.
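As a small self-contained sketch of that one-liner (the class and method names are made up for this example), the relative File from the question reports no parent of its own, while its absolute form does:

```java
import java.io.File;

class ParentOfRelativeFile {

    /** Returns the directory containing the file, even for a bare relative name. */
    static String containingDirectory(File file) {
        // Make the path absolute first so there is a parent component to return
        return file.getAbsoluteFile().getParent();
    }

    public static void main(String[] args) {
        File relative = new File("test.txt");
        // The relative path "test.txt" has no parent component, so this prints null
        System.out.println("getParent(): " + relative.getParent());
        // Resolved against the current working directory at the time of the call
        System.out.println("containing directory: " + containingDirectory(relative));
    }
}
```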
Additional note: Since File is Serializable, you could write a relative-path file to disk or send it across a network. That File, when deserialized in another JVM, will still represent a relative path and will be resolved against whatever the current working directory of that JVM happens to be.
The behaviour is expected. The documentation does not mention that the filename is not included.
Perhaps you are confused by the difference between getAbsolutePath() and getAbsoluteFile(). It's that the latter returns a File instance.
I'm not sure why you think the Javadoc says that it returns the directory name.
Here is the Javadoc --
An abstract representation of file and directory pathnames.
User interfaces and operating systems use system-dependent pathname strings to name files and directories. This class presents an abstract, system-independent view of hierarchical pathnames. An abstract pathname has two components:
An optional system-dependent prefix string, such as a disk-drive specifier, "/" for the UNIX root directory, or "\\" for a Microsoft Windows UNC pathname, and
A sequence of zero or more string names.
The first name in an abstract pathname may be a directory name or, in the case of Microsoft Windows UNC pathnames, a hostname. Each subsequent name in an abstract pathname denotes a directory; the last name may denote either a directory or a file. The empty abstract pathname has no prefix and an empty name sequence.
http://download.oracle.com/javase/6/docs/api/java/io/File.html#getAbsolutePath%28%29
Returns the absolute pathname string of this abstract pathname.
In addition to the existing answers regarding getAbsolutePath and getCanonicalPath, please also note that File.getParent() does not look up the "parent directory" in the file system. It is purely lexical: it returns the pathname the File was created with, minus its last name, or null if nothing is left.
For example, if the File objects are created like this:
File dir = new File("/path/to/a/directory");
File f1 = new File(dir, "x.txt");
File f2 = new File(dir, "../another/y.txt");
File f3 = new File("z.txt");
f1 would refer to /path/to/a/directory/x.txt; its parent is /path/to/a/directory, the same path as dir.
f2 would refer to /path/to/a/directory/../another/y.txt; its canonical path would be /path/to/a/another/y.txt, but its parent is /path/to/a/directory/../another, because the ".." is not collapsed.
f3 would refer to z.txt in the current directory. Its pathname has no parent component, so f3.getParent() or f3.getParentFile() would return null.
path is the full path
if you only want the directory you need to call file.getParent()

What's the difference between getPath(), getAbsolutePath(), and getCanonicalPath() in Java?

What's the difference between getPath(), getAbsolutePath(), and getCanonicalPath() in Java?
And when do I use each one?
Consider these filenames:
C:\temp\file.txt - This is a path, an absolute path, and a canonical path.
.\file.txt - This is a path. It's neither an absolute path nor a canonical path.
C:\temp\myapp\bin\..\..\file.txt - This is a path and an absolute path. It's not a canonical path.
A canonical path is always an absolute path.
Converting from a path to a canonical path makes it absolute (usually tack on the current working directory so e.g. ./file.txt becomes c:/temp/file.txt). The canonical path of a file just "purifies" the path, removing and resolving stuff like ..\ and resolving symlinks (on unixes).
Also note the following example with nio.Paths:
String canonical_path_string = "C:\\Windows\\System32\\";
String absolute_path_string = "C:\\Windows\\System32\\drivers\\..\\";
System.out.println(Paths.get(canonical_path_string).getParent());
System.out.println(Paths.get(absolute_path_string).getParent());
While both paths refer to the same location, the output will be quite different:
C:\Windows
C:\Windows\System32\drivers
The best way I have found to get a feel for things like this is to try them out:
import java.io.File;

public class PathTesting {
    public static void main(String [] args) {
        File f = new File("test/.././file.txt");
        System.out.println(f.getPath());
        System.out.println(f.getAbsolutePath());
        try {
            System.out.println(f.getCanonicalPath());
        }
        catch(Exception e) {}
    }
}
Your output will be something like:
test\..\.\file.txt
C:\projects\sandbox\trunk\test\..\.\file.txt
C:\projects\sandbox\trunk\file.txt
So, getPath() gives you the path based on the File object, which may or may not be relative; getAbsolutePath() gives you an absolute path to the file; and getCanonicalPath() gives you the unique absolute path to the file. Notice that there are a huge number of absolute paths that point to the same file, but only one canonical path.
When to use each? Depends on what you're trying to accomplish, but if you were trying to see if two Files are pointing at the same file on disk, you could compare their canonical paths. Just one example.
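A minimal sketch of that comparison is shown here; the class and method names are invented for illustration. getCanonicalPath() can throw an IOException, so the helper declares it rather than swallowing it.

```java
import java.io.File;
import java.io.IOException;

class SameFileCheck {

    /** True if both File objects have the same canonical path. */
    static boolean sameCanonicalPath(File a, File b) throws IOException {
        // Canonicalization resolves ".", ".." and, on most platforms, symbolic links
        return a.getCanonicalPath().equals(b.getCanonicalPath());
    }

    public static void main(String[] args) throws IOException {
        File direct = new File("file.txt");
        File roundabout = new File("test/.././file.txt");
        System.out.println("getPath() equal:  " + direct.getPath().equals(roundabout.getPath()));
        System.out.println("canonical equal:  " + sameCanonicalPath(direct, roundabout));
    }
}
```

For files that exist, java.nio.file.Files.isSameFile(Path, Path) is an alternative that asks the file system directly instead of comparing strings.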
In short:
getPath() gets the path string that the File object was constructed with, which may be relative to the current directory.
getAbsolutePath() gets the path string after resolving it against the current directory if it's relative, resulting in a fully qualified path.
getCanonicalPath() gets the path string after resolving any relative path against current directory, and removes any relative pathing (. and ..), and any file system links to return a path which the file system considers the canonical means to reference the file system object to which it points.
Also, each of these has a File equivalent which returns the corresponding File object.
Note that IMO, Java got the implementation of an "absolute" path wrong; it really should remove any relative path elements in an absolute path. The canonical form would then remove any FS links or junctions in the path.
getPath() returns the path used to create the File object. This return value is not changed based on the location it is run (results below are for windows, separators are obviously different elsewhere)
File f1 = new File("/some/path");
String path = f1.getPath(); // will return "\some\path"
File dir = new File("/basedir");
File f2 = new File(dir, "/some/path");
path = f2.getPath(); // will return "\basedir\some\path"
File f3 = new File("./some/path");
path = f3.getPath(); // will return ".\some\path"
getAbsolutePath() will resolve the path based on the execution location or drive. So if run from c:\test:
path = f1.getAbsolutePath(); // will return "c:\some\path"
path = f2.getAbsolutePath(); // will return "c:\basedir\some\path"
path = f3.getAbsolutePath(); // will return "c:\test\.\some\path"
getCanonicalPath() is system dependent. It will resolve the unique location the path represents. So if you have any "."s in the path they will typically be removed.
As to when to use them. It depends on what you are trying to achieve. getPath() is useful for portability. getAbsolutePath() is useful to find the file system location, and getCanonicalPath() is particularly useful to check if two files are the same.
The big thing to get your head around is that the File class tries to represent a view of what Sun like to call "hierarchical pathnames" (basically a path like c:/foo.txt or /usr/muggins). This is why you create files in terms of paths. The operations you are describing are all operations upon this "pathname".
getPath() fetches the path that the File was created with (../foo.txt)
getAbsolutePath() fetches the path that the File was created with, but includes information about the current directory if the path is relative (/usr/bobstuff/../foo.txt)
getCanonicalPath() attempts to fetch a unique representation of the absolute path to the file. This eliminates indirection from ".." and "." references (/usr/foo.txt).
Note I say attempts - in forming a Canonical Path, the VM can throw an IOException. This usually occurs because it is performing some filesystem operations, any one of which could fail.
I find I rarely have need to use getCanonicalPath() but, if given a File with a filename that is in DOS 8.3 format on Windows, such as the java.io.tmpdir System property returns, then this method will return the "full" filename.
