Java iterative reading of Files - java

At the moment I'm having a problem writing a tool for my company. I have 384 XML files that I have to read and parse with a SAX parser into txt files.
What I have so far is the parsing of all XML files into one txt file of 43 MB. With a BufferedReader and line.startsWith I want to extract all relevant information from the text file.
Edit: Done
(So my problem is how to solve this more efficiently. I have an idea (but unfortunately not in code, as you might think), and I don't know if it's possible: I want to iterate through a directory, find the XML file I want, then parse it and create a new txt file with the parsed content. Once that is done for all 384 XML files, I want to do the same for the 384 txt files: read them with a BufferedReader to get my relevant information. It's important to read them one at a time. Another problem is the directory path, which is a bit complex: "C:\Users\xxx\Documents\Data\ProjectName\A1\1\1SLin\wanted.xml". Each file has its own directory. The variable part is A1; it ranges over A-P and 1-24. Alternatively, I have all the relevant files with their absolute paths in an ArrayList, so it's also okay to iterate over this list if that's easier.)
Edit:
I came to a solution. The code below contains a method that searches the directories and a main method that parses each XML file in the resulting list into the same directory, with the same file name but a different file extension.
public List<File> searchFile(File dir, String find) {
File[] files = dir.listFiles();
List<File> matches = new ArrayList<File>();
if (files != null) {
for (int i = 0; i < files.length; i++) {
if (files[i].isDirectory()) {
matches.addAll(searchFile(files[i], find));
} else if (files[i].getName().equalsIgnoreCase(find)) {
matches.add(files[i]);
}
}
}
Collections.sort(matches);
return matches;
}
public static void main(String[] args) throws IOException {
Import_Files im = new Import_Files();
File dir = new File("C:\\Users\\xxx\\Desktop\\MS-Daten\\");
String name = "snp_result_5815.xml";
List<File> matches = im.searchFile(dir, name);
System.out.println(matches);
for (int i=0; i<matches.size(); i++) {
String j = String.valueOf(i);
String xml_name = matches.get(i).getAbsolutePath();
File f = new File(matches.get(i).getAbsolutePath().replaceFirst("(?i)\\.xml$", ".txt")); // replaceFirst takes a regex: escape the dot and anchor at the end
System.setOut(new PrintStream(new FileOutputStream(f)));
System.out.println("\nstarting File: "+ i + "\n");
xml_parse myReader = new xml_parse(xml_name);
myReader.setContentHandler(new MyContentHandler());
myReader.setErrorHandler(new MyErrorHandler());
myReader.run();
}
}

The searchFolder method below takes a path and a file-name pattern, searches the path and all subdirectories, and passes every matching file to the processFile method.
public static void main(String[] args) {
String path = "c:\\temp";
Pattern filePattern = Pattern.compile("(?i).*\\.xml$");
searchFolder(path, filePattern);
}
public static void searchFolder(String searchPath, Pattern filePattern){
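File dir = new File(searchPath);
File[] items = dir.listFiles();
if(items == null){ return; } //listFiles returns null if the path is not a readable directory
for(File item : items){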
if(item.isDirectory()){
//recursively search subdirectories
searchFolder(item.getAbsolutePath(), filePattern);
} else if(item.isFile() && filePattern.matcher(item.getName()).matches()){
processFile(item);
}
}
}
public static void processFile(File aFile){
String filename = aFile.getAbsolutePath();
String txtFilename = filename.substring(0, filename.lastIndexOf(".")) + ".txt";
//Do your xml file parsing and write to txtFilename
}
The complexity of the path makes no difference, just specify the root path to search (looks like C:\Users\xxx\Documents\Data\ProjectName in your case) and it will find all the files.
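The parsing step that processFile leaves open could look roughly like the sketch below. It is only an illustration under assumptions: the class XmlToTxt, the handler LineWriterHandler, and the "element: text" output format are all made up here, since the question does not show what MyContentHandler writes. It walks the tree with Files.walk, then parses each match one at a time and writes a .txt file next to the XML file.

```java
import java.io.IOException;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class XmlToTxt {

    /** Hypothetical handler: writes one "element: text" line per element with non-blank text. */
    static class LineWriterHandler extends DefaultHandler {
        private final PrintWriter out;
        private final StringBuilder text = new StringBuilder();

        LineWriterHandler(PrintWriter out) {
            this.out = out;
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            text.setLength(0); // discard text collected before this element opened
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length); // may be called several times per text node
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            String value = text.toString().trim();
            if (!value.isEmpty()) {
                out.println(qName + ": " + value);
            }
            text.setLength(0);
        }
    }

    /** Parses one XML file and writes the result next to it, e.g. wanted.xml -> wanted.txt. */
    static Path convert(Path xml) throws Exception {
        String name = xml.getFileName().toString();
        int dot = name.lastIndexOf('.');
        String base = dot > 0 ? name.substring(0, dot) : name;
        Path txt = xml.resolveSibling(base + ".txt");
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        // try-with-resources flushes and closes the output file after each document
        try (PrintWriter out = new PrintWriter(Files.newBufferedWriter(txt, StandardCharsets.UTF_8))) {
            parser.parse(xml.toFile(), new LineWriterHandler(out));
        }
        return txt;
    }

    /** Finds every file with the given name below root and converts the matches one at a time. */
    static List<Path> convertAll(Path root, String fileName) throws Exception {
        List<Path> matches;
        try (Stream<Path> walk = Files.walk(root)) {
            matches = walk.filter(Files::isRegularFile)
                          .filter(p -> p.getFileName().toString().equalsIgnoreCase(fileName))
                          .sorted()
                          .collect(Collectors.toList());
        }
        for (Path xml : matches) {
            convert(xml);
        }
        return matches;
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.err.println("usage: XmlToTxt <root directory> <file name>");
            return;
        }
        System.out.println(convertAll(Paths.get(args[0]), args[1]));
    }
}
```

Writing through a PrintWriter per file, rather than redirecting System.out as in the question, keeps console output usable and makes sure each txt file is closed before the next XML file is parsed.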

Related

is there a way to use the file path instead of file name in BufferedReader?

Is it possible to pass a folder path instead of a file name to BufferedReader?
I want to use the path because I want the code to pickup any file that has the name "audit" in that specific folder.
So I am currently using this method below, but it only works when I add the absolute path.
public static void main(String[] args)
throws IOException {
List<String> stngFile = new ArrayList<String>();
BufferedReader bfredr = new BufferedReader(new FileReader
("file path"));
String text = bfredr.readLine();
while (text != null) {
stngFile.add(text);
text = bfredr.readLine();
}
bfredr.close();
String[] array = stngFile.toArray(new String[0]);
Arrays.toString(array);
for (String eachstring : array) {
System.out.println(eachstring);
}
}
I am new to programming any help is much appreciated. Thanks in advance.
FileReader also has a constructor that takes a File. You can create a File object from a URI or from a path string. You could use a FileFilter, or just check whether each file matches the name you provide, which is how I would do it:
To get all files in a folder you can use folder.listFiles().
You can then use file.getName().contains("audit") to check whether the file name contains "audit".
Note that this is case sensitive. To ignore case, use file.getName().toLowerCase().contains("audit") (make sure the string you check here, in this case "audit", is always lower case).
As pointed out by g00se, you also have to check that file is actually a file and not a directory, using file.isFile().
Then, in a loop, you read the content of each file that matches the above condition separately.
If you need the files in all the subfolders as well, see this post.
Example:
public static void main(String[] args) throws IOException {
File folder = new File("C:\\MyFolder"); // the folder containing all the files you are looking for
for (File file : folder.listFiles()) { // loop through each file in that folder
if (file.getName().contains("audit") && file.isFile()) { // check if it contains audit in its name
// your previous code for reading out the file content
List<String> stngFile = new ArrayList<String>();
// try-with-resources closes the reader even if reading fails
try (BufferedReader bfredr = new BufferedReader(new FileReader(file))) {
String text = bfredr.readLine();
while (text != null) {
stngFile.add(text);
text = bfredr.readLine();
}
}
for (String eachstring : stngFile) {
System.out.println(eachstring);
}
}
}
}
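The same idea can also be written with java.nio.file, which lets the directory listing do the name matching. This is a minimal sketch, not the answerer's code: the class name AuditReader, the method readAuditFiles, and the choice of UTF-8 are assumptions made for illustration. The glob "*audit*" is matched against the file name only, and whether it is case sensitive depends on the platform.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class AuditReader {

    /** Returns the lines of every regular file in folder whose name contains "audit". */
    static List<String> readAuditFiles(Path folder) throws IOException {
        List<String> lines = new ArrayList<>();
        // The stream must be closed, hence try-with-resources; iteration order is unspecified.
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(folder, "*audit*")) {
            for (Path file : stream) {
                if (Files.isRegularFile(file)) { // skip directories that happen to match
                    lines.addAll(Files.readAllLines(file, StandardCharsets.UTF_8));
                }
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // The folder comes from the first argument; the current directory is used otherwise.
        Path folder = Paths.get(args.length > 0 ? args[0] : ".");
        for (String line : readAuditFiles(folder)) {
            System.out.println(line);
        }
    }
}
```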

How to find specific directory and its files according to the keyword passed in java and loading in memory approach

I have a project structure like below:
My problem is that I have to iterate over the resources folder and, given a key, find the matching folder and its files.
For that, I have written the code below using a recursive approach, but I am not getting the intended output:
public class ConfigFileReader {
public static void main(String[] args) throws Exception {
System.out.println("Print L");
String path = "C:\\...\\ConfigFileReader\\src\\resources\\";
//FileReader reader = new FileReader(path + "\\Encounter\\Encounter.properties");
//Properties p = new Properties();
//p.load(reader);
File[] files = new File(path).listFiles();
String resourceType = "Encounter";
System.out.println(navigateDirectoriesAndFindTheFile(resourceType, files));
}
public static String navigateDirectoriesAndFindTheFile(String inputResourceString, File[] files) {
String entirePathOfTheIntendedFile = "";
for (File file : files) {
if (file.isDirectory()) {
navigateDirectoriesAndFindTheFile(inputResourceString, file.listFiles());
System.out.println("Directory: " + file.getName());
if (file.getName().startsWith(inputResourceString)) {
entirePathOfTheIntendedFile = file.getPath();
}
} else {
System.out.print("Inside...");
entirePathOfTheIntendedFile = file.getPath();
}
}
return entirePathOfTheIntendedFile;
}
}
Output:
The output should return C:\....\Encounter\Encounter.properties as the path.
First of all, if it finds the string while traversing, it should return the file inside that folder without navigating any further. Also, what is the best way to iterate over, say, 1k files? I can't follow this method every time because it doesn't seem effective. So, how can I use an in-memory approach for this problem? Please guide me through it.
You need to check the result of the recursive call and pass it back up when a match is found.
Always use File or Path to handle file names.
Assuming I've understood the logic of the search, try this, which scans for files of the form XXX\XXXyyyy:
public class ConfigReader
{
public static void main(String[] args) throws Exception {
System.out.println("Print L");
File path = new File(args[0]).getAbsoluteFile();
String resourceType = "Encounter";
System.out.println(navigateDirectoriesAndFindTheFile(resourceType, path));
}
public static File navigateDirectoriesAndFindTheFile(String inputResourceString, File path) {
File[] files = path.listFiles();
File found = null;
for (int i = 0; found == null && files != null && i < files.length; i++) {
File file = files[i];
if (file.isDirectory()) {
found = navigateDirectoriesAndFindTheFile(inputResourceString, file);
} else if (file.getName().startsWith(inputResourceString) && file.getParentFile().getName().equals(inputResourceString)) {
found = file;
}
}
return found;
}
}
If this is slow, especially for around 1K files, rewrite it with Files.walkFileTree, which should be considerably faster than recursing with File.listFiles().
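A Files.walkFileTree version of the same search might look like the sketch below. The class name ConfigFinder and the method findFirst are hypothetical, and the matching rule (a file whose name starts with the key and whose parent directory is named exactly the key) is carried over from the code above rather than confirmed by the question. Returning FileVisitResult.TERMINATE stops the walk at the first match.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;

public class ConfigFinder {

    /** Returns the first file XXXyyyy whose parent directory is named XXX, or null if there is none. */
    static Path findFirst(Path root, final String resourceType) throws IOException {
        final Path[] found = new Path[1]; // one-element holder the anonymous class can write to
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                Path parent = file.getParent();
                if (parent != null && parent.getFileName() != null
                        && parent.getFileName().toString().equals(resourceType)
                        && file.getFileName().toString().startsWith(resourceType)) {
                    found[0] = file;
                    return FileVisitResult.TERMINATE; // stop walking at the first match
                }
                return FileVisitResult.CONTINUE;
            }
        });
        return found[0];
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.err.println("usage: ConfigFinder <resources directory> <resource type>");
            return;
        }
        System.out.println(findFirst(Paths.get(args[0]).toAbsolutePath(), args[1]));
    }
}
```

If the same tree is searched many times, one walk could instead fill a Map from directory name to file, which is one way to read the "in-memory approach" the question asks about.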

Java - Getting file name without extension from a folder

I'm using this code to get the absolute path of files inside a folder
public void addFiles(String fileFolder){
ArrayList<String> files = new ArrayList<String>();
fileOp.getFiles(fileFolder, files);
}
But I want to get only the file name of the files (without extension). How can I do this?
I don't think such a method exists. You can get the file name, find the last index of '.' and truncate everything after it, then find the last index of File.separator and remove everything before it.
That gives you the file name.
or you can use FilenameUtils from apache commons IO and use the following
FilenameUtils.removeExtension(fileName);
This code will do the work of removing the extension and printing name of file:
public static void main(String[] args) {
String path = "C:\\Users\\abc\\some";
File folder = new File(path);
File[] files = folder.listFiles();
String fileName;
int lastPeriodPos;
for (int i = 0; i < files.length; i++) {
if (files[i].isFile()) {
fileName = files[i].getName();
lastPeriodPos = fileName.lastIndexOf('.');
if (lastPeriodPos > 0)
fileName = fileName.substring(0, lastPeriodPos);
System.out.println("File name is " + fileName);
}
}
}
If you are OK with third-party libraries, use Apache Commons IO, as it has a ready-made method for that.
There's a really good way to do this - you can use FilenameUtils.removeExtension.
Also, See: How to trim a file extension from a String
// Files here is Guava's com.google.common.io.Files, not java.nio.file.Files
String filePath = "/storage/emulated/0/Android/data/myAppPackageName/files/Pictures/JPEG_20180813_124701_-894962406.jpg";
String nameWithoutExtension = Files.getNameWithoutExtension(filePath);

Use Java to find a File within a directory using only a Name

I'm trying to write this script that takes an Excel sheet, gets all the names of files from the cells, and moves each of those files to a specific folder. I've already got most of the code done, I just need to be able to search for each file in the source directory using just its title. Another problem is that I'm searching for multiple file types (.txt, .repos, .xlsx, .xls, .pdf, and some files don't have extensions), I only can search by the file name without the extension.
In my findAndMoveFiles method, I've got an ArrayList of each File and a Guava Multimap of XSSFCells to Strings (a cell is one cell from the Excel file and a String is the name of the folder it needs to go into, one to many relationship) as parameters. What I've got right now for the method is this.
public static void findAndMoveFiles(List<File> files, Multimap<XSSFCell, String> innerCells) {
// For each file, get its values (folders), and put that file in each of those folders
for (XSSFCell cell : innerCells.keySet()) {
// find the file in the master directory
//Finder f = new Finder();
//if (f.canBeFound(FOLDER, cell.getStringCellValue())) {
File file = find(FOLDER, cell.getStringCellValue());
//System.out.println(file.getAbsolutePath());
//List<String> values = new ArrayList(innerCells.get(cell));
/*for (String folder : values) {
File copy = file;
if (copy != null) {
System.out.println(folder);
System.out.println(copy.getAbsolutePath());
if (copy.renameTo(new File("C:\\strobell\\" + folder + "\\" + copy.getAbsolutePath()))) {
System.out.println(copy.getName() + " has been moved successfully.");
} else {
System.out.println(copy.getName() + " has failed to move.");
}
}
}*/
//}
}
}
public static File find(File dir, String fileName) {
String files = "";
File[] listOfFiles = dir.listFiles();
for (int i = 0; i < listOfFiles.length; i++) {
if (listOfFiles[i].isFile()) {
files = listOfFiles[i].getAbsolutePath();
if (files.equals(fileName)) {
return listOfFiles[i];
}
}
}
return null;
}
I commented out parts because it wasn't working. I was getting NullPointerExceptions because some files were being returned as null. I know that it's returning null, but each file should be found.
If there are any 3rd party libraries that can do this, that would be amazing, I've been racking my brain on how to do this properly.
Instead of
File[] listOfFiles = dir.listFiles();
use
File[] listOfFiles = dir.listFiles(new FilenameFilter() {
public boolean accept(File dir, String name) {
if( /* code to check if file name is ok */ ) {
return true;
}
return false;
}
});
Then you can code your logic on the file names in the condition.
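For the case in the question, where only the title without its extension is known, the filter condition could compare the name with its last extension stripped. The sketch below is an assumption-laden illustration: FindByBaseName, baseName, and find are made-up names, and it assumes the titles in the spreadsheet contain no dots of their own (a file with no extension whose name contains a dot would not match its full name).

```java
import java.io.File;
import java.io.FilenameFilter;

public class FindByBaseName {

    /** Strips the last extension: "report.xlsx" -> "report", "README" -> "README". */
    static String baseName(String fileName) {
        int dot = fileName.lastIndexOf('.');
        return dot > 0 ? fileName.substring(0, dot) : fileName;
    }

    /** Returns a regular file in dir whose name without extension equals title, or null if none. */
    static File find(File dir, final String title) {
        File[] matches = dir.listFiles(new FilenameFilter() {
            @Override
            public boolean accept(File parent, String name) {
                return baseName(name).equals(title) && new File(parent, name).isFile();
            }
        });
        // listFiles returns null if dir is not a directory or cannot be read
        return (matches == null || matches.length == 0) ? null : matches[0];
    }

    public static void main(String[] args) {
        if (args.length < 2) {
            System.err.println("usage: FindByBaseName <directory> <title>");
            return;
        }
        System.out.println(find(new File(args[0]), args[1]));
    }
}
```

Because find returns null when nothing matches, the caller still has to check the result before using it, which is where the NullPointerExceptions in the question come from.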

How to get files with a filename starting with a certain letter?

I wrote some code to read a text file from C drive directly given a path.
String fileName1 = "c:\\M2011001582.TXT";
BufferedReader is = new BufferedReader(new FileReader(fileName1));
I want to get a list of files whose filename starts with M. How can I achieve this?
"but how can i write a code that file is exist in local drive or not"
To scan a directory for files matching a condition:
import java.io.File;
import java.io.FilenameFilter;
public class DirScan
{
public static void main(String[] args)
{
File root = new File("C:\\");
FilenameFilter beginswithm = new FilenameFilter()
{
public boolean accept(File directory, String filename) {
return filename.startsWith("M");
}
};
File[] files = root.listFiles(beginswithm);
for (File f: files)
{
System.out.println(f);
}
}
}
(The files will exist, otherwise they wouldn't be found).
You can split the string on the '\' separator, take the second element of the resulting array, and check it with the startsWith() method available on String:
String[] splitString = fileName1.split("\\\\"); // split takes a regex, so a single backslash must be written as "\\\\"
//check that splitString has more than one element and then do the following
if(splitString[1].startsWith("M")){
// do whatever you want
}
To check whether a file exists, see the File class docs.
In a nutshell:
File f = new File(fileName1);
if(f.exists()) {
//do something
}
