Find files/directories recursively in a folder in Java - java

I want to get all subfolders of a certain folder (not only the direct subfolders but also all nested ones, i.e. recursively) whose names match a certain regex. This should be as efficient as possible.
The regex is entered by the user. For example, they enter "hello", and then all subfolders of a folder (myFolder) whose names match the regex ".* hello .*" should be listed. More specifically, the paths of the found folders should be returned.
I have already done some research and found nio.file, but I am not sure whether nio.file is faster than io.file in this case. I'm new to this whole topic, so excuse me if I say something incorrect.
I use Java 11.
For example, I also found a function like this and it works:
Stream<Path> s = Files.find(
    Paths.get("myFolder"),
    Integer.MAX_VALUE,
    (path, basicFileAttributes) -> {
        File f = path.toFile();
        return f.getName().matches(".*'myRegexHere'.*") && f.isDirectory() && !f.isHidden();
    });
But I'm not sure whether there is a more efficient function or approach. I am open to suggestions and ideas.
EDIT:
Is there a way to exclude certain folders from the search?
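One possible way to exclude folders is to walk the tree with Files.walkFileTree and return FileVisitResult.SKIP_SUBTREE for the directories you do not want to enter. The following is only a sketch under assumptions: the class name DirectoryFinder, the method findDirectories and the excludedNames parameter are made up for illustration, exclusion is done by plain directory name, and the hidden-folder check from the question is left out. It uses Matcher.find() rather than wrapping the user's regex in ".*", and it compiles the pattern once instead of once per directory.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of any library.
class DirectoryFinder {

    // Collects every directory at or below root whose name contains a match
    // for userRegex, without descending into directories named in excludedNames.
    static List<Path> findDirectories(Path root, String userRegex, Set<String> excludedNames)
            throws IOException {
        Pattern pattern = Pattern.compile(userRegex); // compiled once for the whole walk
        List<Path> matches = new ArrayList<>();

        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult preVisitDirectory(Path dir, BasicFileAttributes attrs) {
                Path fileName = dir.getFileName(); // null for a file system root
                String name = fileName == null ? "" : fileName.toString();
                if (excludedNames.contains(name)) {
                    // Neither this directory nor anything below it is visited.
                    return FileVisitResult.SKIP_SUBTREE;
                }
                if (pattern.matcher(name).find()) {
                    matches.add(dir);
                }
                return FileVisitResult.CONTINUE;
            }
        });
        return matches;
    }
}
```

A call could look like DirectoryFinder.findDirectories(Paths.get("myFolder"), "hello", Set.of("node_modules")). Note that the default SimpleFileVisitor rethrows the exception when an entry cannot be read, so a real implementation may also want to override visitFileFailed.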

Related

Java set of path pretty output in console

Please advise on a good way in Java to pretty-print a Set of java.nio.file.Path to the console.
For example:
Path:
/src/test/resources/TestFolder/Wave.java
/src/test/resources/TestFolder
/src/test/resources/TestFolder/Mello.java
/src/test/resources/TestFolder/TestFolder2/Dave2.java
/src/test/resources/TestFolder/TestFolder2/Hello2.java
/src/test/resources/TestFolder/TestFolder2
And expected result:
TestFolder
  Wave.java
  Mello.java
  TestFolder2
    Dave2.java
    Hello2.java
There is no built-in API call that would do this. Fortunately, Java is a programming language, and you're a programmer. Let's program it! :)
The tools you need:
relativize, or getFileName
You can use the relativize call to produce paths that are relative to a 'root point'. For example:
Paths.get("/src/test/resources").relativize(Paths.get("/src/test/resources/TestFolder/Mello.java"))
turns into a path representing: TestFolder/Mello.java.
However, perhaps you want each entry in your output to be just a single file name; in that case, the getFileName() call strips out all path elements except the last one: Paths.get("/src/test/resources/TestFolder/TestFolder2/Hello2.java").getFileName() produces a path with just Hello2.java (if you need it as a string, just call toString() on the path object).
StringBuilder
The StringBuilder class can be used to produce a longer string piece by piece.
repeat
If you have an int representing your 'nesting level', then in your example you want a number of spaces in front of the file name equal to some multiple of that level. You can use the repeat call to turn a number into a string containing that number of spaces: String prefix = " ".repeat(5); produces a string containing 5 spaces.
NB: This is a somewhat newer API (String.repeat was added in Java 11); if you're on an older version of Java and this call does not work, you'd have to write it yourself. It's just a single for loop.
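For older Java versions, the "single for loop" could look roughly like this. The class name Indent and the method name repeat are made up for illustration.

```java
// Hypothetical stand-in for String.repeat on Java versions before 11.
class Indent {

    // Returns count copies of unit glued together, e.g. repeat(" ", 3) gives three spaces.
    static String repeat(String unit, int count) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < count; i++) {
            sb.append(unit);
        }
        return sb.toString();
    }
}
```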
Files.isDirectory
To know if any given file is a directory, you can call that; it returns true if it is and false if it is not.
Files.newDirectoryStream
This is one way to 'walk' a file system: This lets you list each dir/file in a given directory:
Path somePathObject = Paths.get("/foo/bar");
try (var contents = Files.newDirectoryStream(somePathObject)) {
    for (Path content : contents) {
        // this body runs once for each file/dir in '/foo/bar'
    }
}
recursion
Finally, to tie it all together: You'd want to walk through each child in a given starting point, if it is a file, print a number of spacers equal to our nesting level (which starts at 0), then the simple file name, and then move on to the next entry. For directory entries, you want to do that but then 'dive' into the directory, incrementing the nesting level. If you make the nesting level a parameter, you can call your own method, using the directory as new 'root point', and passing 'nestingLevel + 1' for the nesting level.
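Putting those pieces together, a minimal sketch could look like the following. The names TreePrinter, render and append are invented for this example, two spaces per nesting level is an arbitrary choice, and the children are sorted only because a directory stream does not guarantee any order. Like the description above, it walks the file system from a starting directory rather than working from an existing Set of paths.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper class, not part of any library.
class TreePrinter {

    // Returns the tree starting at root as indented text, one entry per line.
    static String render(Path root) throws IOException {
        StringBuilder out = new StringBuilder();
        append(root, 0, out);
        return out.toString();
    }

    private static void append(Path entry, int nestingLevel, StringBuilder out)
            throws IOException {
        out.append("  ".repeat(nestingLevel)) // indentation for this level
           .append(entry.getFileName())       // simple name only, no parent folders
           .append('\n');
        if (!Files.isDirectory(entry)) {
            return; // plain files have no children
        }
        List<Path> children = new ArrayList<>();
        try (var contents = Files.newDirectoryStream(entry)) {
            for (Path child : contents) {
                children.add(child);
            }
        }
        Collections.sort(children); // make the output order deterministic
        for (Path child : children) {
            append(child, nestingLevel + 1, out); // dive in, one level deeper
        }
    }
}
```

System.out.print(TreePrinter.render(Paths.get("/src/test/resources/TestFolder"))) would then print the folder name followed by its indented contents. Because of the sorting, entries appear in alphabetical order rather than in the order shown in the question.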
Good luck & Have fun!

File.length() returns 0

I'm trying to get the length (in bytes) of some files. The problem is that I get zero for their length even though they are not empty (I checked!).
Moreover, every other File method I'm using on these files works just fine, so the issue is only with the length.
Why is this happening?
Thank you.
// add all file names in the directory to the list
String[] files = new File(sourcedir).list();
filesNamesList.addAll(Arrays.asList(files));
filesNamesList.removeIf(name -> ((new File(sourcedir + PATH_BACK_SLASH + name))
        .isDirectory()));
for (String f : files) {
    File e = new File(f);
    System.out.println((e).length());
}
}
Your file paths are possibly incorrect.
Java's File.length() Javadoc states that length() returns 0L if the file does not exist:
"The length, in bytes, of the file denoted by this abstract pathname, or 0L if the file does not exist. Some operating systems may return 0L for pathnames denoting system-dependent entities such as devices or pipes."
Therefore, if you're certain your files have content in them, then you are likely not finding them correctly. Double check your file paths and try again.
Also, the Javadoc recommends using Files.readAttributes() if you need more detailed error information; I echo their sentiment here.
It is worth checking the absolute path of each file with File.getAbsolutePath() to make sure the code is actually being given the right path.
The list() method that you're using lists the file names, not their absolute paths.
So your code lists all file names in sourcedir (not their full paths), then looks for those same file names in the current directory you're running your program from. That's why length() returns 0 for all those files (as per the docs, it'll return 0 if the file doesn't exist.)
If you instead want a list of all the absolute paths, then you can do that concisely using streams like so:
List<String> files = Arrays.stream(new File("C:\\users\\michael\\desktop").listFiles())
        .map(f -> f.getAbsolutePath())
        .collect(Collectors.toList());
However, if all these files are only ever going to be from the same directory, then use new File(sourcedir, name) in your for loop (this is a better alternative than the new File(sourcedir + PATH_BACK_SLASH + name) you use elsewhere.)
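As a rough sketch of that last suggestion (the class name FileSizes and the method sizesIn are invented here, and returning a sorted map is just one way to present the result), each name from list() is resolved against the directory it came from before length() is called:

```java
import java.io.File;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper class, not part of any library.
class FileSizes {

    // Maps each regular file directly inside sourceDir to its length in bytes.
    static Map<String, Long> sizesIn(File sourceDir) {
        Map<String, Long> sizes = new TreeMap<>();
        String[] names = sourceDir.list(); // bare names; null if sourceDir is not a readable directory
        if (names == null) {
            return sizes;
        }
        for (String name : names) {
            File file = new File(sourceDir, name); // resolve against the parent, not the working directory
            if (file.isFile()) {
                sizes.put(name, file.length());
            }
        }
        return sizes;
    }
}
```

With new File(name) alone, the same loop would look for each file in the directory the program was started from, which is why every length came back as 0.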
The problem is that those files do not contain any content, so you are getting 0 as the output.

Real-life usage of Path.relativize() if source path contains anything else than real folders

While preparing for a Java 7 certification exam, I had to start looking closely at the Path.relativize() method. While superficially its purpose seems straightforward (expressing one path relative to another), I have found that its implementation defies all understanding I had of filesystems, namely on Windows or Linux/Unix.
Consider the following:
// Case 1
System.out.println(Paths.get("c:\\folder1\\folder2").relativize(Paths.get("c:\\file.txt")));
// Case 2
System.out.println(Paths.get("c:\\folder1\\folder2\\other-file.txt").relativize(Paths.get("c:\\file.txt")));
// Case 3
System.out.println(Paths.get("c:\\folder1\\..\\.\\folder1\\..\\.\\folder1\\..\\.\\folder1\\..").relativize(Paths.get("c:\\file.txt")));
The output I got is:
..\..\file.txt
..\..\..\file.txt
..\..\..\..\..\..\..\..\..\..\..\file.txt
Case 1 illustrates the straightforward usage one would expect of that function, i.e. finding the path of a file relative to a folder, given the absolute paths of both file and folder. Fine, that gives me something I can type in a cmd window in Windows and it will find my file correctly.
Case 2, as discussed in other StackOverflow questions, highlights the fact that the method has no way of distinguishing a file name from a folder name (a folder can contain a . in its name, and a file can have none). OK, to me this should mean the method should come with the caveat: "use at your own risk, if providing a path where the leaf is a file, the relativized result won't work in a file system". Or does it, if yes, which file system?
Case 3 to me is nonsense. The method does not even take into account the meaning of "." and ".." in the source path, but happily uses ".." in the result as a way to go up a level in the source folder. This may make sense in a theoretical filesystem I'm yet to encounter, but in Windows, Linux or Unix, the result is unusable. "..\..\..\..\..\..\..\..\..\..\..\file.txt" will of course not point to file.txt relatively to a path that despite being expressed as "c:\folder1\..\.\folder1\..\.\folder1\..\.\folder1\.." really points to "c:\".
Thanks for hanging in so far. The question: what possible use could one make of the Path.relativize() method considering its results only make real-life sense in a fraction of cases?
Your expression
Paths.get("c:\\folder1\\..\\.\\folder1\\..\\.\\folder1\\..\\.\\folder1\\..")
is really just c:\ when normalized.
If you then relativize c:\file.txt against that path, you get a relative path that will lead you there. From c:\, that relative path is file.txt.
The result
..\..\..\..\..\..\..\..\..\..\..\file.txt
is exactly equivalent to file.txt when applied from c:\. Once it is resolved against c:\ and normalized, the leading .. elements are discarded, because the root has no parent. I agree it isn't pretty, but that's just an implementation detail.
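If the redundant elements bother you, one option is to normalize both paths before relativizing. This is only a sketch: the class name RelativizeDemo and the method relativeTo are invented for illustration, and the example uses Unix-style paths instead of the c:\ paths from the question. Keep in mind that normalize() works purely on the path string and does not look at the file system, so it can change the meaning of a path that goes through symbolic links.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper class, not part of any library.
class RelativizeDemo {

    // Strips "." and ".." elements from both paths before computing the relative path.
    static Path relativeTo(Path base, Path target) {
        return base.normalize().relativize(target.normalize());
    }

    public static void main(String[] args) {
        Path base = Paths.get("/folder1/.././folder1/.."); // normalizes to "/"
        Path target = Paths.get("/file.txt");
        System.out.println(relativeTo(base, target)); // prints file.txt
    }
}
```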

How to check if a stopword file is corrupted or wrong

I have a file with several hundreds of stopwords. I want to be able to check if the file has been modified by a user for example or even if it is corrupted.
The way I am thinking of doing it currently is by looking if the number of lines is correct. I could also check if the total number of characters is the one expected or even have the whole stopwords list loaded in memory to check if every single one of them is in the file. All 3 of the ways I thought of seem inefficient and/or bad so I thought of asking if there is any better way of doing it.
What I am thinking of implementing:
private static final int WORD_COUNT = 354;
public static boolean stopwordsCorrupted(File file) {
int numOfLines = countLines(file);
return WORD_COUNT != numOfLines;
}
Check this out: http://en.wikipedia.org/wiki/Checksum. The idea is to compute a hash function over the file's contents and compare it with a known value to check that no alterations have been made.
Here you also have an example of how to use it.
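As a sketch of the checksum idea, assuming SHA-256 as the hash function (the answer does not name one) and assuming the expected digest of the original file is stored somewhere in your program: the names StopwordChecksum, sha256Hex and the expectedHex parameter are invented for this example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper class, not part of any library.
class StopwordChecksum {

    // Returns the SHA-256 digest of the file's bytes as lowercase hex.
    static String sha256Hex(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        byte[] digest = md.digest(Files.readAllBytes(file)); // fine for a small word list
        StringBuilder hex = new StringBuilder();
        for (byte b : digest) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    // True if the file's current digest differs from the one recorded for the original.
    static boolean stopwordsCorrupted(Path file, String expectedHex)
            throws IOException, NoSuchAlgorithmException {
        return !sha256Hex(file).equalsIgnoreCase(expectedHex);
    }
}
```

Any change to the bytes, including a different line ending or a trailing newline, produces a different digest, so the expected value has to be computed from exactly the file you ship.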
Java WatchService API might be helpful for your problem.

How to check that a directory exists with the proper case in Java

I want to copy one folder to another location in Java,
but when I use
File f = new File(userInputFilePath);
and check
if(f.isDirectory())
it returns true even when the case of the path does not match.
For example, this happens for the userInputPath "C:\To\TesT" while the actual directory path is "C:\to\Test".
Please suggest a solution.
On Windows systems, file names are case-insensitive; try renaming the directory from Test to TesT and you'll see what I mean. You can of course work around this manually by comparing strings (something like f.getCanonicalPath().equals(userInputFilePath) && f.isDirectory(); note that f.getPath() only gives back the path you passed in, so comparing against it tells you nothing), but that's not necessarily a good idea, as most programs will not differentiate between the two and this could cause unexpected behavior.
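If you do want the manual check, a possible sketch uses Path.toRealPath(), which on case-insensitive file systems is documented to return the name elements in their actual case. The class name ExactCaseCheck and the method isDirectoryWithExactCase are invented for this example. Because toRealPath() also resolves symbolic links, this sketch reports false for any path that goes through a link, which may or may not be what you want.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper class, not part of any library.
class ExactCaseCheck {

    // True only if userInput names an existing directory and is spelled
    // exactly (including letter case) the way the file system stores it.
    static boolean isDirectoryWithExactCase(String userInput) {
        Path given = Paths.get(userInput).toAbsolutePath().normalize();
        try {
            Path real = given.toRealPath(); // actual on-disk spelling; throws if the path is missing
            return Files.isDirectory(real) && real.toString().equals(given.toString());
        } catch (IOException e) {
            return false; // missing or unreadable path
        }
    }
}
```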
