I'm using PathMatcher and SimpleFileVisitor to iterate over a directory and find all the files whose names start with a certain prefix. However, no files are found, although some files do match.
Example of file required:
Prefix_some_text.csv
Here is the main code that calls the SimpleFileVisitor-based class. It passes a regex pattern containing the prefix and is supposed to find all files whose names start with that pattern:
String directoryAsString = "C:/Users";
String pattern = "Prefix";
SearchFileByWildcard sfbw = new SearchFileByWildcard();
try {
List<String> actual = sfbw.searchWithWc(Paths.get(directoryAsString),pattern);
} catch (IOException e) {
e.printStackTrace();
}
The implementation of SearchFileByWildcard class that uses SimpleFileVisitor :
static class SearchFileByWildcard {
List<String> matchesList = new ArrayList<String>();
List<String> searchWithWc(Path rootDir, String pattern) throws IOException {
matchesList.clear();
FileVisitor<Path> matcherVisitor = new SimpleFileVisitor<Path>() {
@Override
public FileVisitResult visitFile(Path file, BasicFileAttributes attribs) throws IOException {
FileSystem fs = FileSystems.getDefault();
PathMatcher matcher = fs.getPathMatcher("regex:" + pattern);
Path name = file.getFileName(); //takes the filename from the full path
if (matcher.matches(name)) {
matchesList.add(name.toString());
}
return FileVisitResult.CONTINUE;
}
};
Files.walkFileTree(rootDir, matcherVisitor);
return matchesList;
}
}
Should I use glob instead of regex, or is something wrong with my regex?
The pattern is wrong. A PathMatcher regex has to match the whole file name, so it matches only files named exactly "Prefix". Try changing it to String pattern = "Prefix.*";.
Alternatively, you can check whether the file name starts with the string "Prefix":
String name = file.getFileName().toString();
if (name.startsWith(pattern)) {
matchesList.add(name);
}
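To make the difference concrete, here is a minimal sketch that checks one file name against a regex: pattern and a glob: pattern. The class and method names (PrefixMatchDemo, matches) are made up for illustration and are not part of the question's code.

```java
import java.nio.file.FileSystems;
import java.nio.file.Path;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

// Hypothetical demo class, not part of the question's code.
class PrefixMatchDemo {

    // Returns true if fileName matches the syntax-prefixed pattern,
    // e.g. "regex:Prefix.*" or "glob:Prefix*".
    static boolean matches(String syntaxAndPattern, String fileName) {
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher(syntaxAndPattern);
        Path name = Paths.get(fileName);
        return matcher.matches(name);
    }

    public static void main(String[] args) {
        String fileName = "Prefix_some_text.csv";
        // The regex must match the whole name, so a bare "Prefix" does not match.
        System.out.println(matches("regex:Prefix", fileName));
        System.out.println(matches("regex:Prefix.*", fileName));
        // The glob equivalent of "starts with Prefix".
        System.out.println(matches("glob:Prefix*", fileName));
    }
}
```

The first call returns false and the other two return true, so either "Prefix.*" as a regex or "Prefix*" as a glob should do for this use case.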
I am trying to write an app that uses the Files.find method.
The program below works perfectly:
package ehsan;
/* I have removed imports for code brevity */
public class Main {
public static void main(String[] args) throws IOException {
Path p = Paths.get("/home/ehsan");
final int maxDepth = 10;
Stream<Path> matches = Files.find(p,maxDepth,(path, basicFileAttributes) -> String.valueOf(path).endsWith(".txt"));
matches.map(path -> path.getFileName()).forEach(System.out::println);
}
}
This works fine and gives me a list of files ending with .txt ( aka text files ) :
hello.txt
...
But the program below does not show anything:
package ehsan;
public class Main {
public static void main(String[] args) throws IOException {
Path p = Paths.get("/home/ehsan");
final int maxDepth = 10;
Stream<Path> matches = Files.find(p,maxDepth,(path, basicFileAttributes) -> path.getFileName().equals("workspace"));
matches.map(path -> path.getFileName()).forEach(System.out::println);
}
}
Here is my home folder hierarchy (ls result):
blog Projects
Desktop Public
Documents Templates
Downloads The.Purge.Election.Year.2016.HC.1080p.HDrip.ShAaNiG.mkv
IdeaProjects The.Purge.Election.Year.2016.HC.1080p.HDrip.ShAaNiG.mkv.aria2
Music Videos
Pictures workspace
So what's going wrong with path.getFileName().equals("workspace")?
Path.getFileName() does not return a String but a Path object, so equals("workspace") is always false. Do this instead:
path.getFileName().toString().equals("workspace")
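As a small illustration of why the comparison fails, the sketch below compares a Path to a String directly, then via toString(), then Path to Path. The class name FileNameEqualsDemo and the helper hasFileName are hypothetical names chosen for this example.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical demo class illustrating Path-versus-String comparison.
class FileNameEqualsDemo {

    // Compares the last path element to a plain string by converting it first.
    static boolean hasFileName(Path path, String expected) {
        Path name = path.getFileName(); // null for a root path such as "/"
        return name != null && name.toString().equals(expected);
    }

    public static void main(String[] args) {
        Path path = Paths.get("home", "ehsan", "workspace");
        // Path.equals(Object) is false for any argument that is not a Path.
        System.out.println(path.getFileName().equals("workspace"));
        // Converting to String first gives the intended comparison.
        System.out.println(hasFileName(path, "workspace"));
        // Comparing Path to Path works as well.
        System.out.println(path.getFileName().equals(Paths.get("workspace")));
    }
}
```

The null check matters only for paths without a name element, but it keeps the predicate safe to use inside Files.find.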
Use the following and look at the console. Maybe none of your files is named workspace:
Files.find(p, maxDepth, (path, basicFileAttributes) -> {
    // compare the file name, not the whole path
    if (String.valueOf(path.getFileName()).equals("workspace")) {
        System.out.println("FOUND : " + path);
        return true;
    }
    System.out.println("\tNOT VALID : " + path);
    return false;
}).forEach(path -> { }); // the stream is lazy, so a terminal operation is needed
Tried the following:
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.spi.FileTypeDetector;
import org.apache.tika.Tika;
import org.apache.tika.mime.MimeTypes;
/**
*
* @author kiriti.k
*/
public class TikaFileTypeDetector {
private final Tika tika = new Tika();
public TikaFileTypeDetector() {
super();
}
public String probeContentType(Path path) throws IOException {
// Try to detect based on the file name only for efficiency
String fileNameDetect = tika.detect(path.toString());
if (!fileNameDetect.equals(MimeTypes.OCTET_STREAM)) {
return fileNameDetect;
}
// Then check the file content if necessary
String fileContentDetect = tika.detect(path.toFile());
if (!fileContentDetect.equals(MimeTypes.OCTET_STREAM)) {
return fileContentDetect;
}
// Specification says to return null if we could not
// conclusively determine the file type
return null;
}
public static void main(String[] args) throws IOException {
Tika tika = new Tika();
// expects file path as the program argument
if (args.length != 1) {
printUsage();
return;
}
Path path = Paths.get(args[0]);
TikaFileTypeDetector detector = new TikaFileTypeDetector();
// Analyse the file - first based on file name for efficiency.
// If cannot determine based on name and then analyse content
String contentType = detector.probeContentType(path);
System.out.println("File is of type - " + contentType);
}
public static void printUsage() {
System.out.print("Usage: java -classpath ... "
+ TikaFileTypeDetector.class.getName()
+ " ");
}
}
The above program checks the file extension only. How do I make it check the content (MIME type) as well and then determine the type? I am using tika-app-1.8.jar in NetBeans 8.0.2. What am I missing?
The code checks the file extension first and returns the MIME type based on that, if it finds a result. If you want it to check the content first, just switch the two statements:
public String probeContentType(Path path) throws IOException {
// Check contents first
String fileContentDetect = tika.detect(path.toFile());
if (!fileContentDetect.equals(MimeTypes.OCTET_STREAM)) {
return fileContentDetect;
}
// Try file name only if content search was not successful
String fileNameDetect = tika.detect(path.toString());
if (!fileNameDetect.equals(MimeTypes.OCTET_STREAM)) {
return fileNameDetect;
}
// Specification says to return null if we could not
// conclusively determine the file type
return null;
}
Be aware that this may have a huge performance impact.
You can use Files.probeContentType(path)
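For reference, a minimal sketch of that JDK-only call is shown below. The class name ProbeContentTypeDemo and the "unknown" fallback are assumptions for illustration. Note that the result depends on the installed FileTypeDetector implementations and the platform, and the default detector often relies on the file extension rather than the content, so this is not guaranteed to behave like Tika's content detection.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper showing the JDK-only alternative to Tika.
class ProbeContentTypeDemo {

    // Returns the MIME type reported by the installed FileTypeDetectors,
    // or "unknown" when probeContentType returns null (type not determined).
    static String describe(Path file) throws IOException {
        String type = Files.probeContentType(file);
        return type != null ? type : "unknown";
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("probe-demo", ".txt");
        try {
            Files.write(file, "hello".getBytes());
            // The printed value is platform dependent.
            System.out.println(describe(file));
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```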
I have a directory containing subdirectories, and each of these subdirectories has a file named
*.properties
I want to search for these files with Java.
Thanks
See the java tutorial Walking the File Tree:
Do you need to create an application that will recursively visit all the files in a file tree? Perhaps you need to delete every .class file in a tree, or find every file that hasn't been accessed in the last year. You can do so with the FileVisitor interface.
In particular see the Finding Files example:
import java.io.*;
import java.nio.file.*;
import java.nio.file.attribute.*;
import static java.nio.file.FileVisitResult.*;
import static java.nio.file.FileVisitOption.*;
import java.util.*;
/**
* Sample code that finds files that
* match the specified glob pattern.
* For more information on what
* constitutes a glob pattern, see
* http://docs.oracle.com/javase/javatutorials/tutorial/essential/io/fileOps.html#glob
*
* The file or directories that match
* the pattern are printed to
* standard out. The number of
* matches is also printed.
*
* When executing this application,
* you must put the glob pattern
* in quotes, so the shell will not
* expand any wild cards:
* java Find . -name "*.java"
*/
public class Find {
/**
* A {@code FileVisitor} that finds
* all files that match the
* specified pattern.
*/
public static class Finder
extends SimpleFileVisitor<Path> {
private final PathMatcher matcher;
private int numMatches = 0;
Finder(String pattern) {
matcher = FileSystems.getDefault()
.getPathMatcher("glob:" + pattern);
}
// Compares the glob pattern against
// the file or directory name.
void find(Path file) {
Path name = file.getFileName();
if (name != null && matcher.matches(name)) {
numMatches++;
System.out.println(file);
}
}
// Prints the total number of
// matches to standard out.
void done() {
System.out.println("Matched: "
+ numMatches);
}
// Invoke the pattern matching
// method on each file.
@Override
public FileVisitResult visitFile(Path file,
BasicFileAttributes attrs) {
find(file);
return CONTINUE;
}
// Invoke the pattern matching
// method on each directory.
@Override
public FileVisitResult preVisitDirectory(Path dir,
BasicFileAttributes attrs) {
find(dir);
return CONTINUE;
}
@Override
public FileVisitResult visitFileFailed(Path file,
IOException exc) {
System.err.println(exc);
return CONTINUE;
}
}
static void usage() {
System.err.println("java Find <path>" +
" -name \"<glob_pattern>\"");
System.exit(-1);
}
public static void main(String[] args)
throws IOException {
if (args.length < 3 || !args[1].equals("-name"))
usage();
Path startingDir = Paths.get(args[0]);
String pattern = args[2];
Finder finder = new Finder(pattern);
Files.walkFileTree(startingDir, finder);
finder.done();
}
}
You can recursively call this method
File dir = new File("C:/");
File [] files = dir.listFiles(new FilenameFilter() {
@Override
public boolean accept(File dir, String name) {
return name.endsWith(".properties");
}
});
Alternatively, you can use the Java 7 Files.walkFileTree method and filter as described here.
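If Java 8 is available, the recursion can also be left to Files.walk. The following is only a sketch under that assumption; the class name PropertiesFinder and the method findProperties are invented for this example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, assuming Java 8 or later.
class PropertiesFinder {

    // Walks rootDir recursively and returns every regular file
    // whose name ends in ".properties".
    static List<Path> findProperties(Path rootDir) throws IOException {
        // try-with-resources closes the directories opened by the walk
        try (Stream<Path> walk = Files.walk(rootDir)) {
            return walk.filter(Files::isRegularFile)
                       .filter(p -> p.getFileName().toString().endsWith(".properties"))
                       .sorted()
                       .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Start from the directory given on the command line, or the current one.
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        findProperties(root).forEach(System.out::println);
    }
}
```

One caveat: if a subdirectory cannot be read, Files.walk fails with an UncheckedIOException, whereas a FileVisitor lets you handle that case in visitFileFailed.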
In addition to other answers:
Guava (since v15):
for (File f : Files.fileTreeTraverser().preOrderTraversal(rootDir)) {
// filter and process
}
Commons IO:
for (File f : FileUtils.listFiles(rootDir, fileFilter, dirFilter)) {
// process
}
Java 7:
Files.walkFileTree(rootDir, fileVisitor);
I have this code which reads all the files from a directory.
File textFolder = new File("text_directory");
File [] texFiles = textFolder.listFiles( new FileFilter() {
public boolean accept( File file ) {
return file.getName().endsWith(".txt");
}
});
It works great. It fills the array with all the files that end with ".txt" from directory "text_directory".
How can I read the contents of a directory in a similar fashion within a JAR file?
So what I really want to do is, to list all the images inside my JAR file, so I can load them with:
ImageIO.read(this.getClass().getResource("CompanyLogo.png"));
(That one works because the "CompanyLogo" is "hardcoded" but the number of images inside the JAR file could be from 10 to 200 variable length.)
EDIT
So I guess my main problem would be: How to know the name of the JAR file where my main class lives?
Granted, I could read it using java.util.zip.
My structure is like this:
my.jar!/Main.class
my.jar!/Aux.class
my.jar!/Other.class
my.jar!/images/image01.png
my.jar!/images/image02a.png
my.jar!/images/imwge034.png
my.jar!/images/imagAe01q.png
my.jar!/META-INF/manifest
Right now I'm able to load for instance "images/image01.png" using:
ImageIO.read(this.getClass().getResource("images/image01.png"));
But only because I know the file name, for the rest I have to load them dynamically.
CodeSource src = MyClass.class.getProtectionDomain().getCodeSource();
if (src != null) {
URL jar = src.getLocation();
ZipInputStream zip = new ZipInputStream(jar.openStream());
while(true) {
ZipEntry e = zip.getNextEntry();
if (e == null)
break;
String name = e.getName();
if (name.startsWith("path/to/your/dir/")) {
/* Do something with this entry. */
...
}
}
}
else {
/* Fail... */
}
Note that in Java 7, you can create a FileSystem from the JAR (zip) file, and then use NIO's directory walking and filtering mechanisms to search through it. This would make it easier to write code that handles JARs and "exploded" directories.
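A minimal sketch of that idea, assuming the path to the JAR file is already known: open the JAR through the zip file system provider and walk a directory inside it. The names JarDirectoryLister and list are hypothetical, and Files.walk requires Java 8 (on Java 7, Files.walkFileTree would be the equivalent).

```java
import java.io.IOException;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical helper: lists files below a directory inside a JAR.
class JarDirectoryLister {

    // Opens the JAR as a zip file system and returns the regular files below dirInJar.
    static List<String> list(Path jarFile, String dirInJar) throws IOException {
        List<String> names = new ArrayList<>();
        // The cast selects the (Path, ClassLoader) overload, available since Java 7.
        try (FileSystem fs = FileSystems.newFileSystem(jarFile, (ClassLoader) null)) {
            Path dir = fs.getPath(dirInJar);
            if (!Files.isDirectory(dir)) {
                return names; // no such directory in this JAR
            }
            try (Stream<Path> walk = Files.walk(dir)) {
                walk.filter(Files::isRegularFile)
                    .map(Path::toString)
                    .sorted()
                    .forEach(names::add);
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        // Assumed arguments: the path to a JAR and a directory inside it, e.g. "/images".
        list(Paths.get(args[0]), args[1]).forEach(System.out::println);
    }
}
```

Because both the file system and the stream are opened in try-with-resources, nothing stays open after the call returns.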
Code that works for both IDEs and .jar files:
import java.io.*;
import java.net.*;
import java.nio.file.*;
import java.util.*;
import java.util.stream.*;
public class ResourceWalker {
public static void main(String[] args) throws URISyntaxException, IOException {
URI uri = ResourceWalker.class.getResource("/resources").toURI();
Path myPath;
if (uri.getScheme().equals("jar")) {
FileSystem fileSystem = FileSystems.newFileSystem(uri, Collections.<String, Object>emptyMap());
myPath = fileSystem.getPath("/resources");
} else {
myPath = Paths.get(uri);
}
Stream<Path> walk = Files.walk(myPath, 1);
for (Iterator<Path> it = walk.iterator(); it.hasNext();){
System.out.println(it.next());
}
}
}
erickson's answer worked perfectly:
Here's the working code.
CodeSource src = MyClass.class.getProtectionDomain().getCodeSource();
List<String> list = new ArrayList<String>();
if( src != null ) {
URL jar = src.getLocation();
ZipInputStream zip = new ZipInputStream( jar.openStream());
ZipEntry ze = null;
while( ( ze = zip.getNextEntry() ) != null ) {
String entryName = ze.getName();
if( entryName.startsWith("images") && entryName.endsWith(".png") ) {
list.add( entryName );
}
}
}
webimages = list.toArray( new String[ list.size() ] );
And I just had to modify my load method from this:
File[] webimages = ...
BufferedImage image = ImageIO.read(this.getClass().getResource(webimages[nextIndex].getName() ));
To this:
String [] webimages = ...
BufferedImage image = ImageIO.read(this.getClass().getResource(webimages[nextIndex]));
I would like to expand on acheron55's answer, since it is an unsafe solution for several reasons:
It doesn't close the FileSystem object.
It doesn't check if the FileSystem object already exists.
It isn't thread-safe.
This is a somewhat safer solution:
private static ConcurrentMap<String, Object> locks = new ConcurrentHashMap<>();
public void walk(String path) throws Exception {
URI uri = getClass().getResource(path).toURI();
if ("jar".equals(uri.getScheme())) {
safeWalkJar(path, uri);
} else {
Files.walk(Paths.get(uri));
}
}
private void safeWalkJar(String path, URI uri) throws Exception {
synchronized (getLock(uri)) {
// this'll close the FileSystem object at the end
try (FileSystem fs = getFileSystem(uri)) {
Files.walk(fs.getPath(path));
}
}
}
private Object getLock(URI uri) {
String fileName = parseFileName(uri);
locks.computeIfAbsent(fileName, s -> new Object());
return locks.get(fileName);
}
private String parseFileName(URI uri) {
String schemeSpecificPart = uri.getSchemeSpecificPart();
return schemeSpecificPart.substring(0, schemeSpecificPart.indexOf("!"));
}
private FileSystem getFileSystem(URI uri) throws IOException {
try {
return FileSystems.getFileSystem(uri);
} catch (FileSystemNotFoundException e) {
return FileSystems.newFileSystem(uri, Collections.<String, String>emptyMap());
}
}
There's no real need to synchronize over the file name; one could simply synchronize on the same object every time (or make the method synchronized), it's purely an optimization.
I would say that this is still a problematic solution, since there might be other parts in the code that use the FileSystem interface over the same files, and it could interfere with them (even in a single threaded application).
Also, it doesn't check for nulls (for instance, on getClass().getResource()).
This particular Java NIO interface is kind of horrible, since it introduces a global/singleton non thread-safe resource, and its documentation is extremely vague (a lot of unknowns due to provider specific implementations). Results may vary for other FileSystem providers (not JAR). Maybe there's a good reason for it being that way; I don't know, I haven't researched the implementations.
So I guess my main problem would be, how to know the name of the jar where my main class lives.
Assuming that your project is packed in a Jar (not necessarily true!), you can use ClassLoader.getResource() or findResource() with the class name (followed by .class) to get the jar that contains a given class. You'll have to parse the jar name from the URL that gets returned (not that tough), which I will leave as an exercise for the reader :-)
Be sure to test for the case where the class is not part of a jar.
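The parsing step could look roughly like the sketch below. It assumes the usual jar:<jar-url>!/<entry> form of the resource URL; the names JarLocator and jarLocation are invented for this example. For nested JARs with several "!/" separators it returns only the outermost archive.

```java
import java.net.URL;
import java.util.Optional;

// Hypothetical helper that extracts the JAR location from a resource URL.
class JarLocator {

    // Extracts the JAR location from a "jar:<jar-url>!/<entry>" URL string.
    // Returns empty when the resource is not inside a JAR (e.g. a "file:" URL).
    static Optional<String> jarLocation(String resourceUrl) {
        if (resourceUrl == null || !resourceUrl.startsWith("jar:")) {
            return Optional.empty();
        }
        int separator = resourceUrl.indexOf("!/");
        if (separator < 0) {
            return Optional.empty();
        }
        return Optional.of(resourceUrl.substring("jar:".length(), separator));
    }

    public static void main(String[] args) {
        // Look up this class's own .class file as a resource.
        URL url = JarLocator.class.getResource("JarLocator.class");
        String location = jarLocation(url == null ? null : url.toExternalForm())
                .orElse("not running from a JAR");
        System.out.println(location);
    }
}
```

The empty result covers the case mentioned above where the class is loaded from an exploded directory instead of a jar.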
I've ported acheron55's answer to Java 7 and closed the FileSystem object. This code works in IDEs, in jar files and in a jar inside a war on Tomcat 7; but note that it does not work in a jar inside a war on JBoss 7 (it gives FileSystemNotFoundException: Provider "vfs" not installed, see also this post). Furthermore, like the original code, it is not thread-safe, as suggested by errr. For these reasons I have abandoned this solution; however, if you can accept these issues, here is my ready-made code:
import java.io.IOException;
import java.net.*;
import java.nio.file.*;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.Collections;
public class ResourceWalker {
public static void main(String[] args) throws URISyntaxException, IOException {
URI uri = ResourceWalker.class.getResource("/resources").toURI();
System.out.println("Starting from: " + uri);
try (FileSystem fileSystem = (uri.getScheme().equals("jar") ? FileSystems.newFileSystem(uri, Collections.<String, Object>emptyMap()) : null)) {
Path myPath = Paths.get(uri);
Files.walkFileTree(myPath, new SimpleFileVisitor<Path>() {
@Override
public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) throws IOException {
System.out.println(file);
return FileVisitResult.CONTINUE;
}
});
}
}
}
Here is an example of using the Reflections library to recursively scan the classpath by a regex name pattern, augmented with a couple of Guava perks to fetch the resources' contents:
Reflections reflections = new Reflections("com.example.package", new ResourcesScanner());
Set<String> paths = reflections.getResources(Pattern.compile(".*\\.template$"));
Map<String, String> templates = new LinkedHashMap<>();
for (String path : paths) {
log.info("Found " + path);
String templateName = Files.getNameWithoutExtension(path);
URL resource = getClass().getClassLoader().getResource(path);
String text = Resources.toString(resource, StandardCharsets.UTF_8);
templates.put(templateName, text);
}
This works with both jars and exploded classes.
Here's a method I wrote for a "run all JUnits under a package". You should be able to adapt it to your needs.
private static void findClassesInJar(List<String> classFiles, String path) throws IOException {
final String[] parts = path.split("\\Q.jar\\\\E");
if (parts.length == 2) {
String jarFilename = parts[0] + ".jar";
String relativePath = parts[1].replace(File.separatorChar, '/');
JarFile jarFile = new JarFile(jarFilename);
final Enumeration<JarEntry> entries = jarFile.entries();
while (entries.hasMoreElements()) {
final JarEntry entry = entries.nextElement();
final String entryName = entry.getName();
if (entryName.startsWith(relativePath)) {
classFiles.add(entryName.replace('/', File.separatorChar));
}
}
}
}
Edit:
Ah, in that case, you might want this snippet as well (same use case :) )
private static File findClassesDir(Class<?> clazz) {
try {
String path = clazz.getProtectionDomain().getCodeSource().getLocation().getFile();
final String codeSourcePath = URLDecoder.decode(path, "UTF-8");
return new File(codeSourcePath, clazz.getPackage().getName().replace('.', File.separatorChar));
} catch (UnsupportedEncodingException e) {
throw new AssertionError("impossible", e);
}
}
Just to mention that if you are already using Spring, you can take advantage of the PathMatchingResourcePatternResolver.
For instance, to get all the PNG files from an images folder in resources:
ClassLoader cl = this.getClass().getClassLoader();
ResourcePatternResolver resolver = new PathMatchingResourcePatternResolver(cl);
Resource[] resources = resolver.getResources("images/*.png");
for (Resource r: resources){
logger.info(r.getFilename());
// From your example
// ImageIO.read(cl.getResource("images/" + r.getFilename()));
}
A jar file is just a zip file with a structured manifest. You can open the jar file with the usual java zip tools and scan the file contents that way, inflate streams, etc. Then use that in a getResourceAsStream call, and it should be all hunky dory.
EDIT / after clarification
It took me a minute to remember all the bits and pieces and I'm sure there are cleaner ways to do it, but I wanted to see that I wasn't crazy. In my project image.jpg is a file in some part of the main jar file. I get the class loader of the main class (SomeClass is the entry point) and use it to discover the image.jpg resource. Then some stream magic to get it into this ImageInputStream thing and everything is fine.
InputStream inputStream = SomeClass.class.getClassLoader().getResourceAsStream("image.jpg");
JPEGImageReaderSpi imageReaderSpi = new JPEGImageReaderSpi();
ImageReader ir = imageReaderSpi.createReaderInstance();
ImageInputStream iis = new MemoryCacheImageInputStream(inputStream);
ir.setInput(iis);
....
ir.read(0); //will hand us a buffered image
Given an actual JAR file, you can list the contents using JarFile.entries(). You will need to know the location of the JAR file though - you can't just ask the classloader to list everything it could get at.
You should be able to work out the location of the JAR file based on the URL returned from ThisClassName.class.getResource("ThisClassName.class"), but it may be a tiny bit fiddly.
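Once the location of the JAR is known, the listing part can be sketched as follows. The class name JarEntryLister and the method entriesUnder are made up for illustration; the JAR path and prefix are assumed to be supplied by the caller.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Hypothetical helper built on JarFile.entries().
class JarEntryLister {

    // Returns the names of all non-directory entries whose name starts with prefix.
    // Entry names always use '/' as separator and have no leading slash.
    static List<String> entriesUnder(String jarPath, String prefix) throws IOException {
        List<String> names = new ArrayList<>();
        try (JarFile jar = new JarFile(jarPath)) {
            Enumeration<JarEntry> entries = jar.entries();
            while (entries.hasMoreElements()) {
                JarEntry entry = entries.nextElement();
                if (!entry.isDirectory() && entry.getName().startsWith(prefix)) {
                    names.add(entry.getName());
                }
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        // Assumed arguments: the path to a JAR and an entry prefix such as "images/".
        entriesUnder(args[0], args[1]).forEach(System.out::println);
    }
}
```

The returned names can be passed to getResource or getResourceAsStream as they are, since those methods expect the same '/'-separated form.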
Some time ago I wrote a function that gets the classes from inside a JAR:
public static Class[] getClasses(String packageName)
throws ClassNotFoundException{
ArrayList<Class> classes = new ArrayList<Class> ();
packageName = packageName.replaceAll("\\." , "/");
File f = new File(jarName);
if(f.exists()){
try{
JarInputStream jarFile = new JarInputStream(
new FileInputStream (jarName));
JarEntry jarEntry;
while(true) {
jarEntry=jarFile.getNextJarEntry ();
if(jarEntry == null){
break;
}
if((jarEntry.getName ().startsWith (packageName)) &&
(jarEntry.getName ().endsWith (".class")) ) {
classes.add(Class.forName(jarEntry.getName().
replaceAll("/", "\\.").
substring(0, jarEntry.getName().length() - 6)));
}
}
}
catch( Exception e){
e.printStackTrace ();
}
Class[] classesA = new Class[classes.size()];
classes.toArray(classesA);
return classesA;
}else
return null;
}
public static ArrayList<String> listItems(String path) throws Exception{
InputStream in = ClassLoader.getSystemClassLoader().getResourceAsStream(path);
byte[] b = new byte[in.available()];
in.read(b);
String data = new String(b);
String[] s = data.split("\n");
List<String> a = Arrays.asList(s);
ArrayList<String> m = new ArrayList<>(a);
return m;
}
There are two very useful utilities both called JarScan:
www.inetfeedback.com/jarscan
jarscan.dev.java.net
See also this question: JarScan, scan all JAR files in all subfolders for specific class
The most robust mechanism for listing all resources in the classpath is currently to use this pattern with ClassGraph, because it handles the widest possible array of classpath specification mechanisms, including the new JPMS module system. (I am the author of ClassGraph.)
How to know the name of the JAR file where my main class lives?
URI mainClasspathElementURI;
try (ScanResult scanResult = new ClassGraph().whitelistPackages("x.y.z")
.enableClassInfo().scan()) {
mainClasspathElementURI =
scanResult.getClassInfo("x.y.z.MainClass").getClasspathElementURI();
}
How can I read the contents of a directory in a similar fashion within a JAR file?
List<String> classpathElementResourcePaths;
try (ScanResult scanResult = new ClassGraph().overrideClasspath(mainClasspathElementURI)
.scan()) {
classpathElementResourcePaths = scanResult.getAllResources().getPaths();
}
There are lots of other ways to deal with resources too.
One more for the road that's a bit more flexible for matching specific filenames because it uses wildcard globbing. In a functional style this could resemble:
import java.io.IOException;
import java.net.URISyntaxException;
import java.nio.file.FileSystem;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.function.Consumer;
import static java.nio.file.FileSystems.getDefault;
import static java.nio.file.FileSystems.newFileSystem;
import static java.util.Collections.emptyMap;
/**
* Responsible for finding file resources.
*/
public class ResourceWalker {
/**
* Globbing pattern to match font names.
*/
public static final String GLOB_FONTS = "**.{ttf,otf}";
/**
* @param directory The root directory to scan for files matching the glob.
* @param glob The globbing pattern that file paths must match.
* @param c The consumer function to call for each matching path
* found.
* @throws URISyntaxException Could not convert the resource to a URI.
* @throws IOException Could not walk the tree.
*/
public static void walk(
final String directory, final String glob, final Consumer<Path> c )
throws URISyntaxException, IOException {
final var resource = ResourceWalker.class.getResource( directory );
final var matcher = getDefault().getPathMatcher( "glob:" + glob );
if( resource != null ) {
final var uri = resource.toURI();
final Path path;
FileSystem fs = null;
if( "jar".equals( uri.getScheme() ) ) {
fs = newFileSystem( uri, emptyMap() );
path = fs.getPath( directory );
}
else {
path = Paths.get( uri );
}
try( final var walk = Files.walk( path, 10 ) ) {
for( final var it = walk.iterator(); it.hasNext(); ) {
final Path p = it.next();
if( matcher.matches( p ) ) {
c.accept( p );
}
}
} finally {
if( fs != null ) { fs.close(); }
}
}
}
}
Consider parameterizing the file extensions; this is left as an exercise for the reader.
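One possible take on that exercise, as a sketch only: build the glob from a list of extensions. The names GlobBuilder and globFor are hypothetical, and the code assumes a non-empty list of extensions without dots or glob metacharacters.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper that turns file extensions into a glob pattern.
class GlobBuilder {

    // Builds a glob such as "**.{ttf,otf}" from extensions given without dots.
    // Assumes the list is non-empty and contains no glob metacharacters.
    static String globFor(List<String> extensions) {
        return "**.{" + String.join(",", extensions) + "}";
    }

    public static void main(String[] args) {
        String glob = globFor(Arrays.asList("ttf", "otf"));
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + glob);
        System.out.println(glob);
        // "**" crosses directory boundaries, so nested paths match too.
        System.out.println(matcher.matches(Paths.get("fonts", "Roboto.ttf")));
    }
}
```

The resulting string can be passed as the glob argument of walk in place of the GLOB_FONTS constant.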
Be careful with Files.walk. According to the documentation:
This method must be used within a try-with-resources statement or similar control structure to ensure that the stream's open directories are closed promptly after the stream's operations have completed.
Likewise, newFileSystem must be closed, but not before the walker has had a chance to visit the file system paths.
Here is a different way of listing/reading files from a JAR URL; it also recurses into nested JARs:
https://gist.github.com/trung/2cd90faab7f75b3bcbaa
URL urlResource = Thread.currentThread().getContextClassLoader().getResource("foo");
JarReader.read(urlResource, new InputStreamCallback() {
@Override
public void onFile(String name, InputStream is) throws IOException {
// got file name and content stream
}
});