I'm currently trying to write a program that will delete duplicate files within a given folder. I've been told to prefer the Path object over the File object, and that the Path API has everything that File has, but I can't seem to figure out how to get an array of the items within a given path. Is it poor practice to convert the Path to a File and use the listFiles() method, converting back and forth between Path and File as I do in the code below?
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;

public class FileIO {
final static String FILE_PATH = "C:\\Users\\" + System.getProperty("user.name") + "\\Documents\\Duplicate Test";
public static void main(String args[]) throws IOException {
Path path = Paths.get(FILE_PATH);
folderDive(path);
}
public static void folderDive(Path path) throws IOException {
File [] pathList = path.toFile().listFiles();
ArrayList<String> deletedList = new ArrayList<String>();
Arrays.sort(pathList);
BufferedWriter writer = new BufferedWriter(new FileWriter(FILE_PATH + "\\Deleted.txt"));
deletedList.add("Listed below are files that have been successfully deleted: ");
for(int pivot = 0; pivot < pathList.length - 1; pivot++) {
for(int index = pivot + 1; index < pathList.length; index++) {
if(pathList[pivot].exists() && pathList[index].exists() &&
fileCompare(pathList[pivot].toPath(), pathList[index].toPath())) {
deletedList.add(pathList[index].getName());
pathList[index].delete();
}
}
}
for(String list: deletedList) {
writer.write(list);
writer.newLine();
}
writer.close();
}
public static boolean fileCompare(Path firstFile, Path comparedFile) throws IOException {
byte [] first = Files.readAllBytes(firstFile);
byte [] second = Files.readAllBytes(comparedFile);
if(Arrays.equals(first, second)) {
return true;
}
return false;
}
}
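For reference, here is a minimal sketch of how the directory listing itself could be done purely with the NIO API via Files.newDirectoryStream, so no Path-to-File conversion is needed; the class name is just for illustration and the folder constant mirrors FILE_PATH above:
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class NioListingSketch {
    public static void main(String[] args) throws IOException {
        // Same folder as FILE_PATH above; adjust to your environment.
        Path dir = Paths.get("C:\\Users\\" + System.getProperty("user.name")
                + "\\Documents\\Duplicate Test");

        // Collect the entries into a List<Path> instead of a File[].
        List<Path> entries = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path entry : stream) {
                if (Files.isRegularFile(entry)) {
                    entries.add(entry);
                }
            }
        }

        // Path-based equivalents of the File calls used above:
        // entry.getFileName() instead of getName(), Files.exists(entry)
        // instead of exists(), Files.delete(entry) instead of delete().
        for (Path entry : entries) {
            System.out.println(entry.getFileName());
        }
    }
}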
I'm attempting to build a list of lists (an ArrayList of ArrayLists) inside a while loop that iterates over the lines of a Scanner. The Scanner is reading a 12-line text file of binary strings. The list of lists is populated correctly inside the loop, but as soon as the while loop terminates the ArrayList is empty and an empty list of lists is returned. I also tested the code by declaring a counter at the same time I declare the list of lists; the counter is incremented in the while loop and retains its value after the loop.
I'm still very new to coding! Thank you in advance.
public static void main(String[] args) throws Exception{
try {
readFile();
data = dataPrep();
}
catch (Exception e) {
e.printStackTrace();
}
}
public static void readFile() throws FileNotFoundException {
try {
File inputtxt = new File("test.txt");
scanner = new Scanner(inputtxt);
}
catch (FileNotFoundException error) {
System.out.println(error);
}
}
public static ArrayList<ArrayList> dataPrep(){
ArrayList<ArrayList> allBinaryNumbers = new ArrayList<ArrayList>();
ArrayList<Integer> singleBinaryNumber = new ArrayList<Integer>();
int counter = 0;
while (scanner.hasNextLine()) {
String line = scanner.nextLine();
char[] charLine = line.toCharArray();
for (char numb : charLine){
singleBinaryNumber.add(Integer.parseInt(String.valueOf(numb)));
}
allBinaryNumbers.add(singleBinaryNumber);
System.out.println(allBinaryNumbers);
singleBinaryNumber.clear();
counter++;
}
System.out.println(allBinaryNumbers);
System.out.println(counter);
return allBinaryNumbers;
}
My test.txt is this
00100
11110
10110
10111
10101
01111
00111
11100
10000
11001
00010
01010
You are reusing the same singleBinaryNumber list, which you clear after you finish populating it. Remember, the variable is a reference (pointer), which means you are adding the same list object each time rather than a new list on each iteration.
Your code should be something like this:
public static ArrayList<ArrayList> dataPrep(){
ArrayList<ArrayList> allBinaryNumbers = new ArrayList<ArrayList>();
int counter = 0;
while (scanner.hasNextLine()) {
String line = scanner.nextLine();
char[] charLine = line.toCharArray();
ArrayList<Integer> singleBinaryNumber = new ArrayList<Integer>(); // create a new list for each iteration
for (char numb : charLine){
singleBinaryNumber.add(Integer.parseInt(String.valueOf(numb)));
}
allBinaryNumbers.add(singleBinaryNumber);
System.out.println(allBinaryNumbers);
// singleBinaryNumber.clear(); <-- remove this line
counter++;
}
System.out.println(allBinaryNumbers);
System.out.println(counter);
return allBinaryNumbers;
}
I think a better way to go would be to have the readFile() method do exactly that: read the file rather than just open it, and have it return a List. It should also accept a String argument for the file path (including the file name) rather than hard-coding the path inside the method itself, for example:
// Class instance member variable with Getter & Setter methods.
private String sourceFile;
// In your main() method:
// List of Lists. Each internal list is from a different file.
List<List<String>> filesBinaries = new ArrayList<>();
// List for current file to be read.
List<String> binaries = readFile(sourceFile);
// Add the current file List object to the
// List of Lists (filesBinaries).
if (!binaries.isEmpty()) {
filesBinaries.add(binaries);
}
With this, your readFile() method would then possibly look something like this:
public static List<String> readFile(String filePath) throws FileNotFoundException {
File file = new File(filePath);
if (!file.exists()) {
throw new FileNotFoundException("readFile() Method Error! The "
+ "supplied file in the path shown below does not exist!" + System.lineSeparator()
+ file.getAbsolutePath() + System.lineSeparator());
}
List<String> binaryLines = new ArrayList<>();
// 'Try With Resources' is used here to auto-close the reader.
try (Scanner reader = new Scanner(file)) {
String line = "";
while (reader.hasNextLine()) {
line = reader.nextLine().trim();
// Data Line Validation:
/* Skip past blank lines (if any) and any lines
that do not contain a valid binary string. Valid
lines would be: 100100 or 00100 11100 01110 */
if (line.isEmpty() || !line.matches("[01 ]+")) {
continue;
}
binaryLines.add(line);
}
}
return binaryLines;
}
And to fire this method, your main() method might look something like this:
public static void main(String[] args) {
List<List<String>> filesData = new ArrayList<>();
String sourceFile = "BinaryData.txt"; // The CURRENT file to read
try {
// Read the desired file...
List<String> binaryData = readFile(sourceFile);
// If the binaryData list is not empty then add it to the fileData List.
if (!binaryData.isEmpty()) {
filesData.add(binaryData);
}
}
catch (FileNotFoundException ex) {
System.err.println("The file: \"" + sourceFile + "\" could not be "
+ "found!" + System.lineSeparator() + ex.getMessage());
}
// Display the contents of the filesData List...
for (int i = 0; i < filesData.size(); i++) {
System.out.println("File #" + (i + 1) + " Binary Data:");
System.out.println("====================");
for (String binaryString : filesData.get(i)) {
System.out.println(binaryString);
}
}
}
List of Russian names from the input txt file:
Александр
Роман
Михаил
This code sorts these names correctly when run in IntelliJ IDEA during debugging.
When I create a jar file and run it from the Windows console with java -jar E:\\sort-it.jar, the first name in the output file is Роман, although it should be Александр, as it is during debugging.
The incorrect order from jar launch is
Роман
Александр
Михаил
The correct order is
Александр
Михаил
Роман
What could be the problem?
package programs;
import java.io.*;
import java.util.*;
public class Main{
public static String inputFileName = "E:/in.txt";
public static String outputFileName = "E:/out.txt";
public static List<String> FetchFileData(String fileName) throws IOException {
List<String> tempArray = new ArrayList();
BufferedReader reader = new BufferedReader(new FileReader(fileName));
String line;
while ((line = reader.readLine()) != null){
tempArray.add(line);
}
reader.close();
return tempArray;
}
public static List<String> SortWords(List<String> inputArray) {
String temp;
for (int i = 0; i < inputArray.size(); i++){
for (int j = i + 1; j < inputArray.size(); j++){
if (inputArray.get(i).compareTo(inputArray.get(j)) > 0){
temp = inputArray.get(i);
inputArray.set(i, inputArray.get(j));
inputArray.set(j, temp);
}
}
}
return inputArray;
}
public static void WriteToFile(List<String> inputArray, String fileName) throws IOException {
BufferedWriter writer = new BufferedWriter(new FileWriter(fileName));
for (int i = 0; i < inputArray.size(); i++) {
writer.write(inputArray.get(i));
writer.newLine();
}
writer.close();
}
public static void main(String[] args) throws IOException {
List<String> unsortedArray;
List<String> sortedArray;
unsortedArray = FetchFileData(inputFileName);
sortedArray = SortWords(unsortedArray);
WriteToFile(sortedArray, outputFileName);
}
}
A small problem is that FileReader uses the default platform encoding.
In the IDE that default can be different from the one used in the Windows console.
Better do:
// Requires: import java.nio.charset.Charset; import java.nio.file.Files; import java.nio.file.Paths;
public static List<String> FetchFileData(String fileName) throws IOException {
Charset charset = Charset.forName("Cp1251");
return Files.readAllLines(Paths.get(fileName), charset);
}
Specifying the charset of your files ensures that the application is portable to other computers (with the same file). Files provides support for writing too.
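For example, a sketch of what the writing side could look like with the same charset, mirroring the WriteToFile method above:
// Requires the same java.nio imports as FetchFileData above.
public static void WriteToFile(List<String> lines, String fileName) throws IOException {
    Charset charset = Charset.forName("Cp1251");
    // Creates or truncates the file and writes one element per line.
    Files.write(Paths.get(fileName), lines, charset);
}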
Ensure that every line is trimmed of spaces and, possibly, of the Unicode BOM character \uFEFF:
String line = lines.get(i);
line = line.trim().replace("\uFEFF", "");
That there are better alternatives than plain compareTo has already been said.
No sneaky Latin lookalike letters were inserted in place of the Cyrillic ones.
The code looks fine too.
So check the charset; I do not see anything else it could be, unlikely as that is.
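Since better alternatives to plain compareTo came up: a minimal sketch of locale-aware sorting with java.text.Collator (the choice of the Russian locale is an assumption about the data):
import java.text.Collator;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

public class CollatorSortSketch {
    public static void sortRussian(List<String> names) {
        // A Collator compares strings by language-specific rules
        // rather than by raw Unicode code point order.
        Collator collator = Collator.getInstance(new Locale("ru", "RU"));
        Collections.sort(names, collator);
    }
}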
I want to interchange the last two words in a text file using Java.
The file is called text_d.txt and it contains:
Student learns programming java.
and this is the code (below). The output is the same and I don't understand why it does not change.
import java.nio.*;
import java.io.*;
import java.lang.*;
public class Test3 {
public static void main(String[] args) throws Exception {
String s2="text_t.txt";
File _newf = new File("text_d.txt");
changeOrder(_newf);
}
public static void changeOrder(File f) throws Exception {
FileInputStream _inp=new FileInputStream(f.getAbsolutePath());
BufferedReader _rd=new BufferedReader(new InputStreamReader(_inp));
String _p=_rd.readLine();
while (_p != null) {
String [] _b = _p.split(" ");
for(int i = 0; i <= _b.length; i++) {
if(i == 2) {
String aux=_b[i];
_b[i]=_b[i+1];
_b[i+1]=aux;
break;
}
}
_p=_rd.readLine();
}
}
}
For reading, interchanging, and writing the file, I suggest you do something like this:
import java.io.*;
import java.util.ArrayList;

public class Test3 {
public static void main(String[] args) throws Exception {
String s2="text_t.txt";
File _newf = new File("text_d.txt");
changeOrder(_newf);
}
public static void changeOrder(File f) throws Exception {
FileInputStream _inp = new FileInputStream(f.getAbsolutePath());
BufferedReader _rd = new BufferedReader(new InputStreamReader(_inp));
ArrayList<String[]> newFileContent = new ArrayList<String[]>();
String _p=_rd.readLine();
while (_p != null) {
String [] _b = _p.split(" ");
String temp = _b[_b.length - 2];
_b[_b.length - 2] = _b[_b.length - 1];
_b[_b.length - 1] = temp;
newFileContent.add(_b);
_p=_rd.readLine();
}
_rd.close(); // close the reader before rewriting the same file
PrintWriter writer = new PrintWriter(f.getAbsolutePath(), "UTF-8");
for (String[] line : newFileContent) {
for (String word : line) {
writer.print(word + " "); // keep a space between the words
}
writer.println();
}
writer.close();
}
}
There are two minor changes:
First, I replaced the for loop you used in your code with three lines of code.
Second, inside the while loop I add each changed line to an ArrayList of String arrays, which holds the changes so they can be written back to the file afterwards.
Finally, I used an instance of the PrintWriter class to write the file to disk, and in a for-each loop I wrote the new content back over the input file.
You could try something like this:
public static void changeOrder(File f) throws Exception {
FileInputStream _inp = new FileInputStream(f.getAbsolutePath());
BufferedReader _rd = new BufferedReader(new InputStreamReader(_inp));
String _p = _rd.readLine();
while (_p != null) {
String [] _b=_p.split(" ");
String temp = _b[_b.length - 1];
_b[_b.length - 1] = _b[_b.length - 2];
_b[_b.length - 2] = temp;
_p = _rd.readLine();
}
}
But if you want the file to be updated, you need to write the results back to the file. You should use something like a PrintWriter for that, as sketched below.
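For example, a rough sketch of that missing write step, collecting the swapped lines and writing them back with a PrintWriter (this assumes the swap loop above is changed to collect its results):
// Requires: import java.util.ArrayList; import java.util.List;
List<String> swappedLines = new ArrayList<>();
String _p = _rd.readLine();
while (_p != null) {
    String[] _b = _p.split(" ");
    String temp = _b[_b.length - 1];
    _b[_b.length - 1] = _b[_b.length - 2];
    _b[_b.length - 2] = temp;
    swappedLines.add(String.join(" ", _b));
    _p = _rd.readLine();
}
_rd.close();
// Write the swapped lines back over the original file.
try (PrintWriter writer = new PrintWriter(f, "UTF-8")) {
    for (String line : swappedLines) {
        writer.println(line);
    }
}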
This should do the trick you want:
public static String changeOrder(File fileName) throws IOException {
Scanner file = new Scanner(fileName);
String line = file.nextLine();
line = line.replace('.', ' ');
String[] items = line.split(" ");
StringBuilder sb = new StringBuilder();
sb.append(items[0] + " ");
sb.append(items[1] + " ");
sb.append(items[3] + " ");
sb.append(items[2] + ".");
return sb.toString();
}
Expected result: Student learns java programming.
I am trying to write a program which will delete all duplicate files in a directory. It is currently able to detect duplicates, but my deleting code does not seem to be working (File.delete() returns false). Can anybody tell me why this is?
Current code:
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.lang.SecurityManager;
public class Duplicate {
@SuppressWarnings("resource")
public static boolean isDuplicate(File a, File b) throws IOException {
FileInputStream as = new FileInputStream(a);
FileInputStream bs = new FileInputStream(b);
while(true) {
int aBytes = as.read();
int bBytes = bs.read();
if(aBytes != bBytes) {
return false;
} else if(aBytes == -1) {
System.out.println("Duplicate found: "+a.getName()+", "+b.getName());
return true;
}
}
}
public static void main(String[] args) throws IOException {
File dir = new File(System.getProperty("user.dir"));
File[] files = dir.listFiles();
for(int i = 0; i < files.length; i++) {
for(int j = i+1; j < files.length; j++) {
if(isDuplicate(files[i], files[j])) {
String filePath = System.getProperty("user.dir").replace("\\", "/")+"/"+files[i].getName();
System.out.println("Deleting "+filePath);
File f = new File(filePath);
if(f.delete())
System.out.println(filePath+" deleted successfully");
else
System.out.println("Could not delete "+filePath);
}
}
}
}
}
Did you close your file streams? It would make sense for delete() to return false while the file is still open.
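For instance, here is isDuplicate rewritten with try-with-resources so both streams are always closed before delete() is attempted (same byte-by-byte comparison as in the question):
public static boolean isDuplicate(File a, File b) throws IOException {
    // try-with-resources closes both streams even if an exception is thrown,
    // so the files are no longer held open when delete() is called.
    try (FileInputStream as = new FileInputStream(a);
         FileInputStream bs = new FileInputStream(b)) {
        while (true) {
            int aBytes = as.read();
            int bBytes = bs.read();
            if (aBytes != bBytes) {
                return false;
            } else if (aBytes == -1) {
                System.out.println("Duplicate found: " + a.getName() + ", " + b.getName());
                return true;
            }
        }
    }
}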
Apart from the resources problem (which certainly explains why you can't delete), the problem is that you won't know why the deletion fails -- in fact, with File you have no means to know at all.
Here is the equivalent program written with java.nio.file, with resource management:
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.*;
import java.nio.file.attribute.BasicFileAttributeView;
import java.util.ArrayList;
import java.util.List;

public final class Duplicates
{
private Duplicates()
{
throw new Error("nice try!");
}
private static boolean duplicate(final Path path1, final Path path2)
throws IOException
{
if (Files.isSameFile(path1, path2))
return true;
final BasicFileAttributeView view1
= Files.getFileAttributeView(path1, BasicFileAttributeView.class);
final BasicFileAttributeView view2
= Files.getFileAttributeView(path2, BasicFileAttributeView.class);
final long size1 = view1.readAttributes().size();
final long size2 = view2.readAttributes().size();
if (size1 != size2)
return false;
try (
final FileChannel channel1 = FileChannel.open(path1,
StandardOpenOption.READ);
final FileChannel channel2 = FileChannel.open(path2,
StandardOpenOption.READ);
) {
final ByteBuffer buf1
= channel1.map(FileChannel.MapMode.READ_ONLY, 0L, size1);
final ByteBuffer buf2
= channel2.map(FileChannel.MapMode.READ_ONLY, 0L, size1);
// Yes, this works; see javadoc for ByteBuffer.equals()
return buf1.equals(buf2);
}
}
public static void main(final String... args)
throws IOException
{
final Path dir = Paths.get(System.getProperty("user.dir"));
final List<Path> list = new ArrayList<>();
for (final Path entry: Files.newDirectoryStream(dir))
if (Files.isRegularFile(entry))
list.add(entry);
final int size = list.size();
for (int i = 0; i < size; i++)
for (int j = i + 1; j < size; j++)
try {
if (duplicate(list.get(i), list.get(j)))
Files.deleteIfExists(list.get(j));
} catch (IOException e) {
System.out.printf("Aiie... Failed to delete %s\nCause:\n%s\n",
list.get(j), e);
}
}
}
Note: a better strategy would probably be to create a directory in which you will move all duplicates you detect; when done, just delete all files in this directory then the directory itself. See Files.move().
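A rough sketch of that move-to-a-duplicates-directory strategy, which could be dropped into the Duplicates class above (the subdirectory name "duplicates" is an assumption):
// Instead of deleting immediately, move each detected duplicate into a
// "duplicates" subdirectory; review it, then delete the whole directory.
private static void quarantine(final Path duplicate, final Path dir)
    throws IOException
{
    final Path trash = dir.resolve("duplicates");
    Files.createDirectories(trash);
    Files.move(duplicate, trash.resolve(duplicate.getFileName()),
        StandardCopyOption.REPLACE_EXISTING);
}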
I have around 100 files in a folder. Each file has data like this, where each line represents a user id.
960904056
6624084
1096552020
750160020
1776024
211592064
1044872088
166720020
1098616092
551384052
113184096
136704072
And I am trying to keep merging the files from that folder into a new big file until the total number of user ids in that new big file reaches 10 million.
I am able to read all the files from that folder, and I keep adding the user ids from those files to a LinkedHashSet. Then I was thinking of checking whether the size of the set has reached 10 million and, if it has, writing all those user ids to a new text file. Is that a feasible solution?
That 10 million number should be configurable: if in the future I need to change it from 10 million to 50 million, I should be able to do that.
Below is the code I have so far
public static void main(String args[]) {
File folder = new File("C:\\userids-20130501");
File[] listOfFiles = folder.listFiles();
Set<String> userIdSet = new LinkedHashSet<String>();
for (int i = 0; i < listOfFiles.length; i++) {
File file = listOfFiles[i];
if (file.isFile() && file.getName().endsWith(".txt")) {
try {
List<String> content = FileUtils.readLines(file, Charset.forName("UTF-8"));
userIdSet.addAll(content);
if(userIdSet.size() >= 10_000_000) { // 10 million; should be configurable
break;
}
System.out.println(userIdSet);
} catch (IOException e) {
e.printStackTrace();
}
}
}
}
Any help on this will be appreciated. Is there any better way to do the same process?
Continuing from where we left off. ;)
You can use FileUtils to write the file as well, with the writeLines() method.
Try this:
public static void main(String args[]) {
File folder = new File("C:\\userids-20130501");
Set<String> userIdSet = new LinkedHashSet<String>();
int count = 1;
for (File file : folder.listFiles()) {
if (file.isFile() && file.getName().endsWith(".txt")) {
try {
List<String> content = FileUtils.readLines(file, Charset.forName("UTF-8"));
userIdSet.addAll(content);
if(userIdSet.size() >= 10_000_000) { // 10 million
File bigFile = new File("<path>" + count + ".txt");
FileUtils.writeLines(bigFile, userIdSet);
count++;
userIdSet = new LinkedHashSet<String>();
}
} catch (IOException e) {
e.printStackTrace();
}
}
}
}
If the purpose of saving the data in the LinkedHashSet is just to write it out again to another file, then I have another solution.
EDIT: to avoid an OutOfMemoryError
public static void main(String args[]) {
File folder = new File("C:\\userids-20130501");
int fileNameCount = 1;
int contentCounter = 1;
File bigFile = new File("<path>" + fileNameCount + ".txt");
boolean isFileRequired = true;
for (File file : folder.listFiles()) {
if (file.isFile() && file.getName().endsWith(".txt")) {
try {
List<String> content = FileUtils.readLines(file, Charset.forName("UTF-8"));
contentCounter += content.size();
if(contentCounter < 10_000_000) { // 10 million
FileUtils.writeLines(bigFile, content, true);
} else {
fileNameCount++;
bigFile = new File("<path>" + fileNameCount + ".txt");
FileUtils.writeLines(bigFile, content);
contentCounter = 1;
}
} catch (IOException e) {
e.printStackTrace();
}
}
}
}
You can avoid using the Set as intermediate storage if you write at the same time as you read from the files. You could do something like this:
import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
import java.io.PrintWriter;
public class AppMain {
private static final int NUMBER_REGISTERS = 10000000;
private static String[] filePaths = {"filePath1", "filePath2", "filePathN"};
private static String mergedFile = "mergedFile";
public static void main(String[] args) throws IOException {
mergeFiles(filePaths, mergedFile);
}
private static void mergeFiles(String[] filePaths, String mergedFile) throws IOException{
BufferedReader[] readerArray = createReaderArray(filePaths);
boolean[] closedReaderFlag = new boolean[readerArray.length];
PrintWriter writer = createWriter(mergedFile);
int currentReaderIndex = 0;
int numberLinesInMergedFile = 0;
BufferedReader currentReader = null;
String currentLine = null;
while(numberLinesInMergedFile < NUMBER_REGISTERS && getNumberReaderClosed(closedReaderFlag) < readerArray.length){
currentReaderIndex = (currentReaderIndex + 1) % readerArray.length;
if(closedReaderFlag[currentReaderIndex]){
continue;
}
currentReader = readerArray[currentReaderIndex];
currentLine = currentReader.readLine();
if(currentLine == null){
currentReader.close();
closedReaderFlag[currentReaderIndex] = true;
continue;
}
writer.println(currentLine);
numberLinesInMergedFile++;
}
writer.close();
for(int index = 0; index < readerArray.length; index++){
if(!closedReaderFlag[index]){
readerArray[index].close();
}
}
}
private static BufferedReader[] createReaderArray(String[] filePaths) throws FileNotFoundException{
BufferedReader[] readerArray = new BufferedReader[filePaths.length];
for (int index = 0; index < readerArray.length; index++) {
readerArray[index] = createReader(filePaths[index]);
}
return readerArray;
}
private static BufferedReader createReader(String path) throws FileNotFoundException{
BufferedReader reader = new BufferedReader(new FileReader(path));
return reader;
}
private static PrintWriter createWriter(String path) throws FileNotFoundException{
PrintWriter writer = new PrintWriter(path);
return writer;
}
private static int getNumberReaderClosed(boolean[] closedReaderFlag){
int count = 0;
for (boolean currentFlag : closedReaderFlag) {
if(currentFlag){
count++;
}
}
return count;
}
}
The way you're going, you may well run out of memory; you are keeping an unnecessary record of every id in userIdSet.
A slight modification that can improve your code is as follows:
public static void main(String args[]) {
File folder = new File("C:\\userids-20130501");
File[] listOfFiles = folder.listFiles();
// there's no need for the userIdSet!
//Set<String> userIdSet = new LinkedHashSet<String>();
// Instead I'd go for a counter ;)
long userIdCount = 0;
for (int i = 0; i < listOfFiles.length; i++) {
File file = listOfFiles[i];
if (file.isFile() && file.getName().endsWith(".txt")) {
try {
List<String> content = FileUtils.readLines(file, Charset.forName("UTF-8"));
// I just want to know how many lines there are...
userIdCount += content.size();
// my guess is you'd probably want to print what you've got
// before a possible break?? - You know better!
System.out.println(content);
if(userIdCount >= 10_000_000) { // 10 million
break;
}
} catch (IOException e) {
e.printStackTrace();
}
}
}
}
Like I noted, this is just a slight modification. It was not my intention to run a very detailed analysis of your code; I just pointed out a glaring design flaw.
Finally, where you wrote System.out.println(content);, you might consider writing to the output file at that point.
If you write to the file one line at a time, your try-catch block may look like this:
try {
List<String> content = FileUtils.readLines(file, Charset.forName("UTF-8"));
for(int lineNumber = 0; lineNumber < content.size(); lineNumber++){
if(++userIdCount >= 10_000_000){ // 10 million
break;
}
// here, write to file... But I will use simple System.out.print for example
System.out.println(content.get(lineNumber));
}
} catch (IOException e) {
e.printStackTrace();
}
Your code can be improved in many ways, but I don't have time to do all of that here. I hope these suggestions push you further along the right track. Cheers!
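If it helps, here is a rough sketch of that write-as-you-go idea pulled out into its own method; the method name and parameters are mine, and the limit is passed in so it stays configurable:
// Sketch only: stream ids straight to the merged file instead of keeping
// them all in a Set; stop once the configured limit is reached.
public static void mergeUserIds(File folder, File mergedFile, long limit) throws IOException {
    long userIdCount = 0;
    try (BufferedWriter out = new BufferedWriter(new FileWriter(mergedFile))) {
        for (File file : folder.listFiles()) {
            if (!(file.isFile() && file.getName().endsWith(".txt"))) {
                continue;
            }
            for (String userId : FileUtils.readLines(file, Charset.forName("UTF-8"))) {
                if (userIdCount >= limit) {
                    return; // limit reached
                }
                out.write(userId);
                out.newLine();
                userIdCount++;
            }
        }
    }
}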