I'm currently trying to create a method that merges several ZipFiles into one big one. For that I created a method that takes an output file and a list of InputStreams.
These InputStreams are later wrapped in ZipInputStreams. That works fine!
But I run into trouble when a file has already been added to the archive. At that point I need to override the entry that was already added (InputStreams with a higher index, i.e. lower in the list, should override the files from streams with a lower index). I also know how to do that: I simply do not add the entry from the current archive if a later archive contains an entry that would override it.
But the problem is: how can I check whether an entry is contained in a ZipInputStream, so I can skip adding that entry from the current stream?
My code so far:
public static void makeNewZipFromInputStreamList(File outputFile,
ArrayList<InputStream> inputStreamList,
ArrayList<String> includeList, ArrayList<String> excludeList)
throws IOException, IllegalArgumentException {
final int sizeOfLists[] = new int[] { inputStreamList.size(),
includeList.size(), excludeList.size() };
if ((sizeOfLists[0] != sizeOfLists[1])
|| (sizeOfLists[0] != sizeOfLists[2])
|| (sizeOfLists[1] != sizeOfLists[2]))
throw new IllegalArgumentException(
"The ArrayLists do not have the same size ("
+ sizeOfLists[0] + ", " + sizeOfLists[1] + ", "
+ sizeOfLists[2] + ")");
final ZipOutputStream zipOutputFile = new ZipOutputStream(
new FileOutputStream(outputFile));
final int size = sizeOfLists[0];
InputStream inputStreamTempArray[] = inputStreamList
.toArray(new InputStream[size]);
String includeArray[] = includeList.toArray(new String[size]);
String excludeArray[] = excludeList.toArray(new String[size]);
int i, j;
ZipInputStream stream, streamTmp;
ZipInputStream inputStreamArray[] = new ZipInputStream[size];
String include, exclude, fileName;
ZipEntry entry;
for (i = 0; i < size; i++) {
inputStreamArray[i] = new ZipInputStream(inputStreamTempArray[i]);
if (includeArray[i] == null) {
includeArray[i] = "";
}
if (excludeArray[i] == null) {
excludeArray[i] = "";
}
}
for (i = 0; i < size; i++) {
while ((entry = inputStreamArray[i].getNextEntry()) != null) {
fileName = entry.getName();
for (j = i + 1; j < size; j++) {
// Check if the entry exists in the following archives (Then skip this entry)
}
if (fileName.matches(includeArray[i]) || !fileName.matches(excludeArray[i])) {
zipOutputFile.putNextEntry(entry);
if (!entry.isDirectory()) {
copyStream(inputStreamArray[i], zipOutputFile, false, false);
}
}
}
inputStreamArray[i].close();
}
zipOutputFile.close();
}
copyStream:
private static boolean copyStream(final InputStream is,
final OutputStream os, boolean closeInputStream,
boolean closeOutputStream) {
try {
final byte[] buf = new byte[1024];
int len = 0;
while ((len = is.read(buf)) > 0) {
os.write(buf, 0, len);
}
if (closeInputStream) {
is.close();
}
if (closeOutputStream) {
os.close();
}
return true;
} catch (final IOException e) {
e.printStackTrace();
}
return false;
}
EDIT:
I had the idea to just append the entries the other way round, meaning starting from the end of the list, and if an entry has already been put, it simply gets skipped.
When I do this I get a really weird error:
java.util.zip.ZipException: invalid entry compressed size (expected 1506 but got 1507 bytes)
at java.util.zip.ZipOutputStream.closeEntry(Unknown Source)
at java.util.zip.ZipOutputStream.putNextEntry(Unknown Source)
at io.brainstone.github.installer.FileUtils.makeNewZipFromInputStreamList(FileUtils.java:304)
at io.brainstone.github.installer.Main.startInstalling(Main.java:224)
at io.brainstone.github.installer.Window$3$1.run(Window.java:183)
This is my current code:
public static void makeNewZipFromInputStreamList(File outputFile,
ArrayList<InputStream> inputStreamList,
ArrayList<String> includeList, ArrayList<String> excludeList)
throws IOException, IllegalArgumentException {
final int sizeOfLists[] = new int[] { inputStreamList.size(),
includeList.size(), excludeList.size() };
if ((sizeOfLists[0] != sizeOfLists[1])
|| (sizeOfLists[0] != sizeOfLists[2])
|| (sizeOfLists[1] != sizeOfLists[2]))
throw new IllegalArgumentException(
"The ArrayLists do not have the same size ("
+ sizeOfLists[0] + ", " + sizeOfLists[1] + ", "
+ sizeOfLists[2] + ")");
final ZipOutputStream zipOutputFile = new ZipOutputStream(
new FileOutputStream(outputFile));
final int size = sizeOfLists[0];
final InputStream inputStreamTempArray[] = inputStreamList
.toArray(new InputStream[size]);
final String includeArray[] = includeList.toArray(new String[size]);
final String excludeArray[] = excludeList.toArray(new String[size]);
final ZipInputStream inputStreamArray[] = new ZipInputStream[size];
HashMap<String, Object[]> tmp;
int i, j;
String fileName;
ZipEntry entry;
for (i = size - 1; i >= 0; i--) {
System.out.println(i);
inputStreamArray[i] = new ZipInputStream(inputStreamTempArray[i]);
if (includeArray[i] == null) {
includeArray[i] = "";
}
if (excludeArray[i] == null) {
excludeArray[i] = "";
}
while ((entry = inputStreamArray[i].getNextEntry()) != null) {
fileName = entry.getName();
if (fileName.matches(includeArray[i])
|| !fileName.matches(excludeArray[i])) {
// Here is where I would check if an entry has already been put,
// probably just by catching the exception thrown in that
// case
zipOutputFile.putNextEntry(entry);
if (!entry.isDirectory()) {
copyStream(inputStreamArray[i], zipOutputFile, false,
false);
}
}
}
inputStreamArray[i].close();
}
zipOutputFile.close();
}
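For what it's worth, a common cause of this particular ZipException (an assumption on my part, not confirmed in this thread): putNextEntry is handed the ZipEntry object read from the source archive, and that entry still carries the original compressed size, CRC and method. When the data is recompressed into the new archive, the compressed size can come out a byte or two different, which is exactly what the message reports. A minimal sketch of the usual workaround, copying only the entry name so ZipOutputStream recomputes the sizes itself:
// Sketch: build a fresh entry instead of reusing the one read from the
// source archive, so sizes and CRC are recomputed on write.
ZipEntry freshEntry = new ZipEntry(entry.getName());
zipOutputFile.putNextEntry(freshEntry);
if (!entry.isDirectory()) {
copyStream(inputStreamArray[i], zipOutputFile, false, false);
}
zipOutputFile.closeEntry();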
Hold a map from fileName to entry.
Iterate over all entries in all input streams and put them in the map, keyed by file name. A later entry always overrides a previous one, so when you finish you have only the highest-indexed entry per file name.
Then iterate over the map's entries and put them to zipOutputFile.
// (1) here all entries will be stored, overriding low-indexed with high-indexed
final Map<String, ZipEntry> fileNameToZipEntry = new HashMap<String, ZipEntry>();
// (2) Iterate over all entries and store in map, overriding low-indexed
for (i = 0; i < size; i++) {
while ((entry = inputStreamArray[i].getNextEntry()) != null) {
fileName = entry.getName();
fileNameToZipEntry.put(fileName, entry);
}
inputStreamArray[i].close();
}
// (3) Iterating the map that holds only the entries required for zipOutputFile
int j = 0;
for (Map.Entry<String, ZipEntry> mapEntry : fileNameToZipEntry.entrySet()) {
fileName = mapEntry.getKey();
entry = mapEntry.getValue();
if (fileName.matches(includeArray[j]) || !fileName.matches(excludeArray[j])) {
zipOutputFile.putNextEntry(entry);
if (!entry.isDirectory()) {
// NOTE: the streams were already consumed and closed in step (2);
// the entry data would have to be buffered there for this copy to work
copyStream(inputStreamArray[j], zipOutputFile, false, false);
}
}
j++;
}
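Note that by the time step (3) runs, the input streams were already fully read and closed in step (2), so there is nothing left to copy. A minimal sketch of one way around that, assuming the merged content fits in memory and leaving out the include/exclude filtering (java.util and java.io imports assumed): buffer each entry's bytes in the map, so later (higher-indexed) streams override earlier ones.
// Sketch: buffer entry bytes while scanning; later streams overwrite
// earlier ones because put() replaces the previous mapping.
Map<String, byte[]> nameToBytes = new LinkedHashMap<String, byte[]>();
for (i = 0; i < size; i++) {
while ((entry = inputStreamArray[i].getNextEntry()) != null) {
ByteArrayOutputStream buf = new ByteArrayOutputStream();
if (!entry.isDirectory()) {
copyStream(inputStreamArray[i], buf, false, false);
}
nameToBytes.put(entry.getName(), buf.toByteArray());
}
inputStreamArray[i].close();
}
// Write the surviving entries; fresh ZipEntry objects avoid stale size fields.
for (Map.Entry<String, byte[]> e : nameToBytes.entrySet()) {
zipOutputFile.putNextEntry(new ZipEntry(e.getKey()));
zipOutputFile.write(e.getValue());
zipOutputFile.closeEntry();
}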
The simplest way to solve this is iterating backwards through the ArrayLists.
public static void makeNewZipFromInputStreamList(File outputFile,
ArrayList<InputStream> inputStreamList,
ArrayList<String> includeList, ArrayList<String> excludeList)
throws IOException, IllegalArgumentException {
final int sizeOfLists[] = new int[] { inputStreamList.size(),
includeList.size(), excludeList.size() };
if ((sizeOfLists[0] != sizeOfLists[1])
|| (sizeOfLists[0] != sizeOfLists[2])
|| (sizeOfLists[1] != sizeOfLists[2]))
throw new IllegalArgumentException(
"The ArrayLists do not have the same size ("
+ sizeOfLists[0] + ", " + sizeOfLists[1] + ", "
+ sizeOfLists[2] + ")");
final ZipOutputStream zipOutputFile = new ZipOutputStream(
new FileOutputStream(outputFile));
final int size = sizeOfLists[0];
final InputStream inputStreamTempArray[] = inputStreamList
.toArray(new InputStream[size]);
final String includeArray[] = includeList.toArray(new String[size]);
final String excludeArray[] = excludeList.toArray(new String[size]);
final ZipInputStream inputStreamArray[] = new ZipInputStream[size];
HashMap<String, Object[]> tmp;
int i, j;
String fileName;
ZipEntry entry;
for (i = size - 1; i >= 0; i--) {
inputStreamArray[i] = new ZipInputStream(inputStreamTempArray[i]);
if (includeArray[i] == null) {
includeArray[i] = "";
}
if (excludeArray[i] == null) {
excludeArray[i] = "";
}
while ((entry = inputStreamArray[i].getNextEntry()) != null) {
fileName = entry.getName();
if (fileName.matches(includeArray[i])
|| !fileName.matches(excludeArray[i])) {
try {
zipOutputFile.putNextEntry(entry);
if (!entry.isDirectory()) {
copyStream(inputStreamArray[i], zipOutputFile,
false, false);
}
} catch (ZipException ex) {
if (!ex.getMessage()
.matches("duplicate entry: .*\\..*")) {
throw new RuntimeException(
"Unexpected " + ex.getClass() + " (\""
+ ex.getMessage()
+ "\")\n(only duplicate entry execptions are expected!)",
ex);
}
}
}
}
inputStreamArray[i].close();
}
zipOutputFile.close();
}
But thank you anyways!
Related
When running this method I get a "wrong number of arguments" exception like this:
Exception in thread "JavaFX Application Thread" java.lang.IllegalArgumentException: wrong number of arguments
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at javafxcyberwind.Participant_MethodsIMPL.execution(Participant_MethodsIMPL.java:85)
The exception occurs at the commented line, even though the arguments passed are correct. This is my code:
@Override
public void execution(String cls, String ip, Object... par) throws InvocationTargetException, RemoteException {
try {
URLClassLoader loader = new URLClassLoader(new URL[]{new URL("file:///" + prep)});
Class<?> c = loader.loadClass(cls);
Object j = c.newInstance();
Method[] methods = c.getDeclaredMethods();
for (Method method : methods) {
ArrayList<Object> tab = new ArrayList<>();
if (method.getReturnType() == int.class || method.getReturnType() == String.class || method.getReturnType() == boolean.class || method.getReturnType() == double.class) {
tab.clear();
tab.addAll(Arrays.asList(par));
int i = 0;
HashMap<Integer, File> lif = new HashMap<>();
File file = null;
for (Object o : tab) {
if (o.getClass().equals(Fichier.class)) {
String nomfichier = ((Fichier) o).getNom();
file = new File(prep + nomfichier);
lif.put(i, file);
}
i++;
}
for (Map.Entry<Integer, File> entry : lif.entrySet()) {
tab.remove(entry.getKey());
tab.add(entry.getKey(), entry.getValue());
}
k = method.invoke(j, tab.toArray());//line of exception
if (file != null) {
file.delete();
}
}
if (method.getReturnType().toString().equals("class java.io.File")) {
tab.clear();
tab.addAll(Arrays.asList(par));
int i = 0;
int t = -1;
String nomfichier = null;
File file = null;
for (Object o : tab) {
if (o.getClass().equals(Fichier.class)) {
nomfichier = ((Fichier) o).getNom();
file = new File(prep + nomfichier);
t = i;
}
i++;
}
if (t != -1) {
tab.remove(t);
tab.add(t, file);
}
k = method.invoke(j, tab.toArray());
if (file != null) {
file.delete();
}
fff = nomfichier.replace(nomfichier, cls + "_" + nomfichier);
File fres = new File(prep + fff);
R.uploadToCloud(fff);
Socket s = new Socket(ip, R.getPort());
FileInputStream inf = new FileInputStream(fres);
ObjectOutputStream out = new ObjectOutputStream(s.getOutputStream());
byte buf[] = new byte[1024];
int n;
while ((n = inf.read(buf)) != -1) {
out.write(buf, 0, n);
}
out.close();
inf.close();
s.close();
fres.delete();
}
}
} catch (IOException | ClassNotFoundException | InstantiationException | IllegalAccessException ex) {
Logger.getLogger(Participant_MethodsIMPL.class.getName()).log(Level.SEVERE, null, ex);
}
}
I can avoid the exception and make it work, but as you can see only for one file passed as an argument. This is the code:
if (method.getReturnType() == int.class || method.getReturnType() == String.class || method.getReturnType() == boolean.class || method.getReturnType() == double.class) {
tab.clear();
tab.addAll(Arrays.asList(par));
int i = 0;
int t = -1;
File file = null;
for (Object o : tab) {
if (o.getClass().equals(Fichier.class)) {//means there is an argument of type File
String nomfichier = ((Fichier) o).getNom();//getting the file name
file = new File(prep + nomfichier);//file that will replace the remote file
t = i;
}
i++;
}
if (t != -1) {//replacing the remote file
tab.remove(t);
tab.add(t, file);
}
k = method.invoke(j, tab.toArray());
if (file != null) {
file.delete();
}
}
This method is called remotely, so I have to create a new file for each file passed as an argument, given that the files are received in advance.
The problem is when passing more than one file as an argument. In that case, how can I create a list of files where each file has an id equal to i, and then just iterate over that list? I tried to do this with a HashMap as above, but I keep getting the exception!
I solved this by just replacing:
for (Map.Entry<Integer, File> entry : lif.entrySet()) {
tab.remove(entry.getKey());
tab.add(entry.getKey(), entry.getValue());
}
with:
lif.entrySet().stream().forEach(entry -> tab.set(entry.getKey(), entry.getValue()));
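The likely reason the original loop corrupted the argument list (my reading, not stated above): tab is a List<Object> and entry.getKey() is an Integer, so tab.remove(entry.getKey()) binds to remove(Object), which searches for a value equal to the key instead of removing by index. set(index, value) has no such ambiguity. A small self-contained illustration of the overload pitfall:
import java.util.ArrayList;
import java.util.List;

public class RemoveOverloadDemo {
public static void main(String[] args) {
List<Object> tab = new ArrayList<>();
tab.add("a");
tab.add("b");
tab.add("c");
Integer key = 1;
tab.remove(key); // remove(Object): looks for a VALUE equal to 1, removes nothing here
System.out.println(tab); // [a, b, c]
tab.remove(1); // remove(int): removes the element at INDEX 1
System.out.println(tab); // [a, c]
}
}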
FileReader fr = new FileReader(inp);
CSVReader reader = new CSVReader(fr, ',', '"');
// writer
File writtenFromWhile = new File(dliRootPath + writtenFromWhilePath);
writtenFromWhile.createNewFile();
CSVWriter writeFromWhile = new CSVWriter(new FileWriter(writtenFromWhile), ',', '"');
int insideWhile = 0;
String[] currRow = null;
while ((currRow = reader.readNext()) != null) {
insideWhile++;
writeFromWhile.writeNext(currRow);
}
System.out.println("inside While: " + insideWhile);
System.out.println("lines read (acc.to CSV reader): " + reader.getLinesRead());
The output is:
inside While: 162199
lines read (acc.to CSV reader): 256865
Even though all lines are written to the output CSV (when viewed in a text editor; Excel shows a much smaller number of rows), the while loop does not iterate as many times as there are rows in the input CSV. My main objective is to implement some other logic inside the while loop for each line.
I have been trying to debug this for two whole days (in a bigger code base) without any results.
Please explain how I can make the while loop run 256865 times.
Reference data, complete picture:
Here is the CSV I am reading in the above snippet.
My complete program tries to separate out those records from this CSV which are not present in this CSV, based on the fields title and author (i.e. if author and title are the same in two records, even if other fields differ, they are counted as duplicates and should not be written to the output file). Here is my complete code (the difference should be around 300000, but I get only ~210000 in the output file with my code):
//TODO ask id
/*
* id is also in the fields getting matched (thisRow[0] is id)
* you can replace it by thisRow[fieldAndColumnNo.get(0)] to eliminate id
*/
package mainOne;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import com.opencsv.CSVReader;
import com.opencsv.CSVWriter;
public class Diff_V3 {
static String dliRootPath = "/home/gurnoor/Incoming/Untitled Folder 2/";
static String dli = "new-dli-IITG.csv";
static String oldDli = "dli-iisc.csv";
static String newFile = "newSampleFile.csv";// not used
static String unqFile = "UniqueFileFinal.csv";
static String log = "Diff_V3_log.txt";
static String splittedNewDliDir = "/home/gurnoor/Incoming/Untitled Folder 2/splitted new file";
static String splittedOldDliDir = "/home/gurnoor/Incoming/Untitled Folder 2/splitted old file";
// debug
static String testFilePath = "testFile.csv";
static int insidepopulateMapFromSplittedCSV = 0;
public static void main(String[] args) throws IOException, CustomException {
// _readSample(dliRootPath+dli, dliRootPath+newFile);
// System.out.println(areIDsunique(dliRootPath + dli, 550841) );// open
// in geany to get total no
// of lines
// TODO implement separate function to check equals
// File filteredFile = new File(dliRootPath + "filteredFile.csv");
// filteredFile.createNewFile();
File logFile = new File(dliRootPath + log);
logFile.createNewFile();
new File(dliRootPath + testFilePath).createNewFile();
List<String> fieldsToBeMatched = new ArrayList<>();
fieldsToBeMatched.add("dc.contributor.author[]");
fieldsToBeMatched.add("dc.title[]");
filterUniqueFileds(new File(splittedNewDliDir), new File(splittedOldDliDir), fieldsToBeMatched);
}
/**
* NOTE: might remove the row where fieldToBeMatched is null
*
@param inpfile
@param file
@param filteredFile
@param fieldsToBeMatched
@throws IOException
@throws CustomException
*/
private static void filterUniqueFileds(File newDir, File oldDir, List<String> fieldsToBeMatched)
throws IOException, CustomException {
CSVReader reader = new CSVReader(new FileReader(new File(dliRootPath + dli)), '|');
// writer
File unqFileOp = new File(dliRootPath + unqFile);
unqFileOp.createNewFile();
CSVWriter writer = new CSVWriter(new FileWriter(unqFileOp), '|');
// logWriter
BufferedWriter logWriter = new BufferedWriter(new FileWriter(new File(dliRootPath + log)));
String[] headingRow = // allRows.get(0);
reader.readNext();
writer.writeNext(headingRow);
int headingLen = headingRow.length;
// old List
System.out.println("[INFO] reading old list...");
// CSVReader oldReader = new CSVReader(new FileReader(new
// File(dliRootPath + oldDli)));
Map<String, List<String>> oldMap = new HashMap<>();
oldMap = populateMapFromSplittedCSV(oldMap, oldDir);// populateMapFromCSV(oldMap,
// oldReader);
// oldReader.close();
System.out.println("[INFO] Read old List. Size = " + oldMap.size());
printMapToCSV(oldMap, dliRootPath + testFilePath);
// map of fieldName, ColumnNo
Map<String, Integer> fieldAndColumnNoInNew = new HashMap<>(getColumnNo(fieldsToBeMatched, headingRow));
Map<String, Integer> fieldAndColumnNoInOld = new HashMap<>(
getColumnNo(fieldsToBeMatched, (String[]) oldMap.get("id").toArray()));
// error check: did columnNo get populated?
if (fieldAndColumnNoInNew.isEmpty()) {
reader.close();
writer.close();
throw new CustomException("field to be matched not present in input CSV");
}
// TODO implement own array compare using areEqual()
// error check
// if( !Arrays.equals(headingRow, (String[]) oldMap.get("id").toArray())
// ){
// System.out.println("heading in new file, old file: \n"+
// Arrays.toString(headingRow));
// System.out.println(Arrays.toString((String[])
// oldMap.get("id").toArray()));
// reader.close();
// writer.close();
// oldReader.close();
// throw new CustomException("Heading rows are not same in old and new
// file");
// }
int noOfRecordsInOldList = 0, noOfRecordsWritten = 0, checkManually = 0;
String[] thisRow;
while ((thisRow = reader.readNext()) != null) {
// for(int l=allRows.size()-1; l>=0; l--){
// thisRow=allRows.get(l);
// error check
if (thisRow.length != headingLen) {
String error = "Line no: " + reader.getLinesRead() + " in file: " + dliRootPath + dli
+ " not read. Check manually";
System.err.println(error);
logWriter.append(error + "\n");
logWriter.flush();
checkManually++;
continue;
}
// write if not present in oldMap
if (!oldMap.containsKey(thisRow[0])) {
writer.writeNext(thisRow);
writer.flush();
noOfRecordsWritten++;
} else {
// check if all reqd fields match
List<String> twinRow = oldMap.get(thisRow[0]);
boolean writtenToOp = false;
// for (int k = 0; k < fieldsToBeMatched.size(); k++) {
List<String> newFields = new ArrayList<>(fieldAndColumnNoInNew.keySet());
List<String> oldFields = new ArrayList<>(fieldAndColumnNoInOld.keySet());
// redundant error check
if (newFields.size() != oldFields.size()) {
reader.close();
writer.close();
CustomException up = new CustomException("something is really wrong");
throw up;
}
// for(String fieldName : fieldAndColumnNoInNew.keySet()){
for (int m = 0; m < newFields.size(); m++) {
int columnInNew = fieldAndColumnNoInNew.get(newFields.get(m)).intValue();
int columnInOld = fieldAndColumnNoInOld.get(oldFields.get(m)).intValue();
String currFieldTwin = twinRow.get(columnInOld);
String currField = thisRow[columnInNew];
if (!areEqual(currField, currFieldTwin)) {
writer.writeNext(thisRow);
writer.flush();
writtenToOp = true;
noOfRecordsWritten++;
System.out.println(noOfRecordsWritten);
break;
}
}
if (!writtenToOp) {
noOfRecordsInOldList++;
// System.out.println("[INFO] present in old List: \n" +
// Arrays.toString(thisRow) + " AND\n"
// + twinRow.toString());
}
}
}
System.out.println("--------------------------------------------------------\nDebug info");
System.out.println("old File: " + oldMap.size());
System.out.println("new File:" + reader.getLinesRead());
System.out.println("no of records in old list (present in both old and new) = " + noOfRecordsInOldList);
System.out.println("checkManually: " + checkManually);
System.out.println("noOfRecordsInOldList+checkManually = " + (noOfRecordsInOldList + checkManually));
System.out.println("no of records written = " + noOfRecordsWritten);
System.out.println();
System.out.println("inside populateMapFromSplittedCSV() " + insidepopulateMapFromSplittedCSV + "times");
logWriter.close();
reader.close();
writer.close();
}
private static void printMapToCSV(Map<String, List<String>> oldMap, String testFilePath2) throws IOException {
// writer
int i = 0;
CSVWriter writer = new CSVWriter(new FileWriter(new File(testFilePath2)), '|');
for (String key : oldMap.keySet()) {
List<String> row = oldMap.get(key);
String[] tempRow = new String[row.size()];
tempRow = row.toArray(tempRow);
writer.writeNext(tempRow);
writer.flush();
i++;
}
writer.close();
System.out.println("[hello from line 210 ( inside printMapToCSV() ) of ur code] wrote " + i + " lines");
}
private static Map<String, List<String>> populateMapFromSplittedCSV(Map<String, List<String>> oldMap, File oldDir)
throws IOException {
File defective = new File(dliRootPath + "defectiveOldFiles.csv");
defective.createNewFile();
CSVWriter defectWriter = new CSVWriter(new FileWriter(defective));
CSVReader reader = null;
for (File oldFile : oldDir.listFiles()) {
insidepopulateMapFromSplittedCSV++;
reader = new CSVReader(new FileReader(oldFile), ',', '"');
oldMap = populateMapFromCSV(oldMap, reader, defectWriter);
// printMapToCSV(oldMap, dliRootPath+testFilePath);
System.out.println(oldMap.size());
reader.close();
}
defectWriter.close();
System.out.println("inside populateMapFromSplittedCSV() " + insidepopulateMapFromSplittedCSV + "times");
return new HashMap<String, List<String>>(oldMap);
}
private static Map<String, Integer> getColumnNo(List<String> fieldsToBeMatched, String[] headingRow) {
Map<String, Integer> fieldAndColumnNo = new HashMap<>();
for (String field : fieldsToBeMatched) {
for (int i = 0; i < headingRow.length; i++) {
String heading = headingRow[i];
if (areEqual(field, heading)) {
fieldAndColumnNo.put(field, Integer.valueOf(i));
break;
}
}
}
return fieldAndColumnNo;
}
private static Map<String, List<String>> populateMapFromCSV(Map<String, List<String>> oldMap, CSVReader oldReader,
CSVWriter defectWriter) throws IOException {
int headingLen = 0;
List<String> headingRow = null;
if (oldReader.getLinesRead() > 1) {
headingRow = oldMap.get("id");
headingLen = headingRow.size();
}
String[] thisRow;
int insideWhile = 0, addedInMap = 0, doesNotContainKey = 0, containsKey = 0;
while ((thisRow = oldReader.readNext()) != null) {
// error check
// if (oldReader.getLinesRead() > 1) {
// if (thisRow.length != headingLen) {
// System.err.println("Line no: " + oldReader.getLinesRead() + " in
// file: " + dliRootPath + oldDli
// + " not read. Check manually");
// defectWriter.writeNext(thisRow);
// defectWriter.flush();
// continue;
// }
// }
insideWhile++;
if (!oldMap.containsKey(thisRow[0])) {
doesNotContainKey++;
List<String> fullRow = Arrays.asList(thisRow);
fullRow = oldMap.put(thisRow[0], fullRow);
if (fullRow == null) {
addedInMap++;
}
} else {
List<String> twinRow = oldMap.get(thisRow[0]);
boolean writtenToOp = false;
// for(String fieldName : fieldAndColumnNoInNew.keySet()){
for (int m = 0; m < headingRow.size(); m++) {
String currFieldTwin = twinRow.get(m);
String currField = thisRow[m];
if (!areEqual(currField, currFieldTwin)) {
System.err.println("do something!!!!!! DUPLICATE ID in old file");
containsKey++;
FileWriter logWriter = new FileWriter(new File((dliRootPath + log)));
System.err.println("[Skipped record] in old file. Row no: " + oldReader.getLinesRead()
+ "\nRecord: " + Arrays.toString(thisRow));
logWriter.append("[Skipped record] in old file. Row no: " + oldReader.getLinesRead()
+ "\nRecord: " + Arrays.toString(thisRow));
logWriter.close();
break;
}
}
}
}
System.out.println("inside while: " + insideWhile);
System.out.println("oldMap size = " + oldMap.size());
System.out.println("addedInMap: " + addedInMap);
System.out.println("doesNotContainKey: " + doesNotContainKey);
System.out.println("containsKey: " + containsKey);
return new HashMap<String, List<String>>(oldMap);
}
private static boolean areEqual(String field, String heading) {
// TODO implement, askSubhayan
return field.trim().equals(heading.trim());
}
/**
* Returns the first duplicate ID OR the string "unique" OR (rarely)
* totalLinesInCSV != totaluniqueIDs
*
@param inpCSV
@param totalLinesInCSV
@return
@throws IOException
*/
private static String areIDsunique(String inpCSV, int totalLinesInCSV) throws IOException {
CSVReader reader = new CSVReader(new FileReader(new File(dliRootPath + dli)), '|');
List<String[]> allRows = new ArrayList<>(reader.readAll());
reader.close();
Set<String> id = new HashSet<>();
for (String[] thisRow : allRows) {
if (thisRow[0] == null || thisRow[0].isEmpty() || !id.add(thisRow[0])) { // null/empty id or duplicate
return thisRow[0];
}
}
if (id.size() == totalLinesInCSV) {
return "unique";
} else {
return "totalLinesInCSV != totaluniqueIDs";
}
}
/**
* writes 20 rows of the input csv into the output file
*
@param input
@param output
@throws IOException
*/
public static void _readSample(String input, String output) throws IOException {
File opFile = new File(dliRootPath + newFile);
opFile.createNewFile();
CSVWriter writer = new CSVWriter(new FileWriter(opFile));
CSVReader reader = new CSVReader(new FileReader(new File(dliRootPath + dli)), '|');
for (int i = 0; i < 20; i++) {
// String[] op;
// for(String temp: reader.readNext()){
writer.writeNext(reader.readNext());
// }
// System.out.println();
}
reader.close();
writer.flush();
writer.close();
}
}
RC's comment nailed it!
If you check the javadocs you will see that there are two methods in CSVReader: getLinesRead and getRecordsRead. And they both do exactly what they say. getLinesRead returns the number of lines that were read using the FileReader. getRecordsRead returns the number of records that the CSVReader read. Keep in mind that if you have embedded newlines in the records of your file, then it will take multiple line reads to get one record. So it is very conceivable to have a CSV file with 100 records that takes 200 line reads to read them all.
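A small sketch of the difference, assuming an opencsv version that has both counters (as described above; the question's code already calls getLinesRead()): a quoted field with an embedded newline makes getLinesRead() run ahead of getRecordsRead().
import com.opencsv.CSVReader;
import java.io.StringReader;

public class LinesVsRecords {
public static void main(String[] args) throws Exception {
// One record whose second field contains an embedded newline:
// reading it consumes two physical lines but yields one record.
String csv = "id,\"text\nwith newline\"\n";
CSVReader reader = new CSVReader(new StringReader(csv));
while (reader.readNext() != null) {
// consume all records
}
System.out.println("lines read: " + reader.getLinesRead()); // 2
System.out.println("records read: " + reader.getRecordsRead()); // 1
reader.close();
}
}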
Unescaped quotes inside a CSV cell can mess up your whole data. This might happen in a CSV if the data you are working with has been created manually. Below is a function I wrote a while back for this situation. Let me know if this is not the right place to share it.
/**
* removes quotes inside a cell/column and puts curated data in
* "../CuratedFiles"
*
@param curateDir
@param del CSV column delimiter
@throws IOException
*/
public static void curateCsvRowQuotes(File curateDir, String del) throws IOException {
File parent = curateDir.getParentFile();
File curatedDir = new File(parent.getAbsolutePath() + "/CuratedFiles");
curatedDir.mkdir();
for (File file : curateDir.listFiles()) {
BufferedReader bufRead = new BufferedReader(new FileReader(file));
// output
File fOp = new File(curatedDir.getAbsolutePath() + "/" + file.getName());
fOp.createNewFile();
BufferedWriter bufW = new BufferedWriter(new FileWriter(fOp));
bufW.append(bufRead.readLine() + "\n");// heading
// logs
File logFile = new File(curatedDir.getAbsolutePath() + "/CurationLogs.txt");
logFile.createNewFile();
BufferedWriter logWriter = new BufferedWriter(new FileWriter(logFile));
String thisLine = null;
int lineCount = 0;
while ((thisLine = bufRead.readLine()) != null) {
String opLine = "";
int endIndex = thisLine.indexOf("\"" + del);
if (endIndex == -1) {
// no quoted cell in this line; copy it through unchanged
bufW.append(thisLine + "\n");
lineCount++;
continue;
}
String str = thisLine.substring(0, endIndex);
opLine += str + "\"" + del;
while (endIndex != (-1)) {
// leave out first " in a cell
int tempIndex = thisLine.indexOf("\"" + del, endIndex + 2);
if (tempIndex == (-1)) {
break;
}
str = thisLine.substring(endIndex + 2, tempIndex);
int indexOfQuote = str.indexOf("\"");
opLine += str.substring(0, indexOfQuote + 1);
// remove all "
str = str.substring(indexOfQuote + 1);
str = str.replace("\"", "");
opLine += str + "\"" + del;
endIndex = thisLine.indexOf("\"" + del, endIndex + 2);
}
str = thisLine.substring(thisLine.lastIndexOf("\"" + del) + 2);
if ((str != null) && str.matches("[" + del + "]+")) {
opLine += str;
}
System.out.println(opLine);
bufW.append(opLine + "\n");
bufW.flush();
lineCount++;
}
System.out.println(lineCount + " no of lines in " + file.getName());
bufRead.close();
bufW.close();
}
}
In my case, I had used csvReader.readAll() before readNext(), like:
List<String[]> myData = csvReader.readAll();
while ((nextRecord = csvReader.readNext()) != null) {
}
So my csvReader.readNext() always returned null, since all the values had already been read by readAll().
Please be cautious when using the readNext() and readAll() functions together.
The background info here is that I have a working Indexer and Search (in java) that indexes and searches a file directory for the filenames and then copies the files to a "Results" Directory.
What I need, and don't have much experience in, is writing JSP files. I need the JSP file to have a search bar for the text and then a search button. When text is entered in the bar and the button is clicked, I need it to run my search program with the entered text as an arg.
I have added the IndexFiles and the SearchFiles classes for reference.
Please explain with a good example if you can help out!
public class SearchFiles {
static File searchDirectory = new File(
"C:\\Users\\flood.j.2\\Desktop\\IndexSearch\\Results");
static String v = new String();
static String path = null;
String title = null;
File addedFile = null;
OutputStream out = null;
String dirName = "C:\\Users\\flood.j.2\\Desktop\\IndexSearch\\Results";
public static void main(String[] args) throws Exception {
String usage = "Usage:\tjava org.apache.lucene.demo.SearchFiles [-index dir] [-field f] [-repeat n] [-queries file] [-query string]";
if (args.length > 0
&& ("-h".equals(args[0]) || "-help".equals(args[0]))) {
System.out.println(usage);
System.exit(0);
}
for (int j = 5; j < args.length; j++) {
v += args[j] + " ";
}
String index = "index";
String field = "contents";
String queries = null;
boolean raw = false;
String queryString = null;
int hits = 100;
for (int i = 0; i < args.length; i++) {
if ("-index".equals(args[i])) {
index = args[i + 1];
i++;
} else if ("-field".equals(args[i])) {
field = args[i + 1];
i++;
} else if ("-queries".equals(args[i])) {
queries = args[i + 1];
i++;
} else if ("-query".equals(args[i])) {
queryString = v;
i++;
}
}
IndexReader reader = DirectoryReader.open(FSDirectory.open(new File(
index)));
IndexSearcher searcher = new IndexSearcher(reader);
Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_40);
BufferedReader in = null;
if (queries != null) {
in = new BufferedReader(new InputStreamReader(new FileInputStream(
queries), "UTF-8"));
} else {
in = new BufferedReader(new InputStreamReader(System.in, "UTF-8"));
}
QueryParser parser = new QueryParser(Version.LUCENE_40, field, analyzer);
for (int m = 0; m < 2; m++) {
if (queries == null && queryString == null) {
System.out.println("Enter query: ");
}
String line = queryString != null ? queryString : in.readLine();
if (line == null || line.length() == -1) {
break;
}
line = line.trim();
if (line.length() == 0) {
break;
}
Query query = parser.parse(line);
System.out.println("Searching for: " + query.toString(field));
doPagingSearch(in, searcher, query, hits, raw, queries == null
&& queryString == null);
if (queryString == null) {
break;
}
}
reader.close();
}
public static void doPagingSearch(BufferedReader in,
IndexSearcher searcher, Query query, int hitsPerPage, boolean raw,
boolean interactive) throws IOException {
// Collect enough docs to show 500 pages
TopDocs results = searcher.search(query, 5 * hitsPerPage);
ScoreDoc[] hits = results.scoreDocs;
int numTotalHits = results.totalHits;
System.out.println(numTotalHits + " total matching documents");
int start = 0;
int end = Math.min(numTotalHits, hitsPerPage);
FileUtils.deleteDirectory(searchDirectory);
while (true) {
for (int i = start; i < end; i++) {
Document doc = searcher.doc(hits[i].doc);
path = doc.get("path");
if (path != null) {
System.out.println((i + 1) + ". " + path);
File addFile = new File(path);
try {
FileUtils.copyFileToDirectory(addFile, searchDirectory);
} catch (IOException e) {
e.printStackTrace();
}
}
}
if (!interactive || end == 0) {
break;
}
System.exit(0);
}
}
}
public class IndexFiles {
private IndexFiles() {
}
public static void main(String[] args) {
String usage = "java org.apache.lucene.demo.IndexFiles"
+ " [-index INDEX_PATH] [-docs DOCS_PATH] [-update]\n\n"
+ "This indexes the documents in DOCS_PATH, creating a Lucene index"
+ "in INDEX_PATH that can be searched with SearchFiles";
String indexPath = null;
String docsPath = null;
boolean create = true;
for (int i = 0; i < args.length; i++) {
if ("-index".equals(args[i])) {
indexPath = args[i + 1];
i++;
} else if ("-docs".equals(args[i])) {
docsPath = args[i + 1];
i++;
} else if ("-update".equals(args[i])) {
create = false;
}
}
if (docsPath == null) {
System.err.println("Usage: " + usage);
System.exit(1);
}
final File docDir = new File(docsPath);
if (!docDir.exists() || !docDir.canRead()) {
System.out
.println("Document directory '"
+ docDir.getAbsolutePath()
+ "' does not exist or is not readable, please check the path");
System.exit(1);
}
Date start = new Date();
try {
System.out.println("Indexing to directory '" + indexPath + "'...");
Directory dir = FSDirectory.open(new File(indexPath));
Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_40);
IndexWriterConfig iwc = new IndexWriterConfig(Version.LUCENE_40,
analyzer);
if (create) {
iwc.setOpenMode(OpenMode.CREATE);
} else {
iwc.setOpenMode(OpenMode.CREATE_OR_APPEND);
}
IndexWriter writer = new IndexWriter(dir, iwc);
indexDocs(writer, docDir);
writer.close();
Date end = new Date();
System.out.println(end.getTime() - start.getTime()
+ " total milliseconds");
} catch (IOException e) {
System.out.println(" caught a " + e.getClass()
+ "\n with message: " + e.getMessage());
}
}
static void indexDocs(IndexWriter writer, File file) throws IOException {
if (file.canRead()) {
if (file.isDirectory()) {
String[] files = file.list();
if (files != null) {
for (int i = 0; i < files.length; i++) {
indexDocs(writer, new File(file, files[i]));
}
}
} else {
FileInputStream fis;
try {
fis = new FileInputStream(file);
} catch (FileNotFoundException fnfe) {
return;
}
try {
Document doc = new Document();
Field pathField = new StringField("path",
file.getAbsolutePath(), Field.Store.YES);
doc.add(pathField);
doc.add(new LongField("modified", file.lastModified(),
Field.Store.NO));
doc.add(new TextField("title", file.getName(), null));
System.out.println(pathField);
if (writer.getConfig().getOpenMode() == OpenMode.CREATE) {
System.out.println("adding " + file);
writer.addDocument(doc);
} else {
System.out.println("updating " + file);
writer.updateDocument(new Term("path", file.getPath()),
doc);
}
} finally {
fis.close();
}
}
}
}
}
First, you should definitely do this in a servlet rather than a JSP. Putting lots of logic in JSP is bad practice. (See the servlets info page).
Second, it would probably be better for performance to make a cronjob (Linux) or Task (Windows) that runs the search program every hour and stores the results in a database, and just have your servlet pull from there rather than letting the user initiate the search program.
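To make the first point concrete, here is a minimal sketch of such a servlet, assuming a Servlet 3.0 container; the form field name q, the /search mapping and the results.jsp page are my own placeholders. Calling the existing main(String[]) is just the quickest bridge; note the query has to sit at argument index 5 or later because main() builds the query string v from args[5] onward.
import java.io.IOException;
import javax.servlet.ServletException;
import javax.servlet.annotation.WebServlet;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Assumes a form like: <form action="search"><input name="q"/><input type="submit" value="Search"/></form>
@WebServlet("/search")
public class SearchServlet extends HttpServlet {
@Override
protected void doGet(HttpServletRequest req, HttpServletResponse resp)
throws ServletException, IOException {
String query = req.getParameter("q");
if (query == null || query.trim().isEmpty()) {
resp.sendError(HttpServletResponse.SC_BAD_REQUEST, "missing query");
return;
}
try {
// Bridge to the existing entry point; the query lands at index 5,
// where SearchFiles.main() picks it up. A search(String) method
// would be cleaner than going through main().
SearchFiles.main(new String[] { "-index", "index", "-field", "contents", "-query", query });
} catch (Exception e) {
throw new ServletException(e);
}
resp.sendRedirect("results.jsp"); // hypothetical results page
}
}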
I'm trying to load an Excel file (xlsx) into a Workbook object using Apache POI 3.10.
I'm receiving a java.lang.OutOfMemoryError.
I'm using Java 8 with the -Xmx2g argument on the JVM.
All 4 cores (64-bit system) and my RAM (4 GB) are maxed out when I run the program.
The Excel sheet has 43 columns and 166,961 rows, which equals 7,179,323 cells.
I'm using Apache POI's WorkbookFactory.create(File) because it uses less memory than using a FileInputStream.
Does anyone have any ideas how to optimize memory usage, or another way to create the Workbook?
Below is my test Reader class, don't judge, it's rough and includes debugging statements:
import java.io.File;
import java.io.IOException;
import org.apache.poi.openxml4j.exceptions.InvalidFormatException;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.ss.usermodel.WorkbookFactory;
public class Reader {
private Workbook wb;
public Reader(File excel) {
System.out.println("CONSTRUCTOR");
wb = null;
try {
wb = WorkbookFactory.create(excel);
} catch (IOException e) {
System.out.println("IO Exception");
System.out.println(e.getMessage());
} catch (InvalidFormatException e) {
System.out.println("Invalid Format");
System.out.println(e.getMessage());
}
}
public boolean exists() { return (wb != null); }
public void print() {}
public static void main(String[] args) {
System.out.println("START PRG");
//File f = new File("oldfilename.xls");
File f = new File("filename.xlsx");
System.out.println("PATH:" + f.getAbsoluteFile());
if (!f.exists()) {
System.out.println("File does not exist.");
System.exit(0);
}
System.out.println("FILE");
Reader r = new Reader(f);
System.out.println("Reader");
r.print();
System.out.println("PRG DONE");
}
}
Apparently loading a 24 MB file shouldn't be causing an OOM...
At first glance it appears to me that, though Xmx is set to 2G, there's actually not that much memory free in the system. In other words, the OS and other processes may have taken more than 2G out of the 4G of physical memory! Check available physical memory first. In case less is available than expected, try closing some other running apps/processes.
If that's not the case and there's indeed enough memory left, it's really hard to identify the real cause without profiling. Use a profiling tool to check the JVM status, starting with memory. You may simply use jconsole (it comes with the JDK); see this on how to activate JMX.
Once you are connected, check readings related to memory, specifically these memory spaces:
old gen
young gen
perm gen
Monitor these spaces and see where it's struggling. I assume this is a standalone application. In case it is deployed on a server (as a web app or services), you may consider the '-XX:NewRatio' option for distributing heap spaces effectively and efficiently; see tuning-related details here.
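For reference, a minimal way to start the JVM with JMX exposed for jconsole (no authentication or SSL, so development use only; port 9010 is an arbitrary choice, and Reader is the test class from the question):
java -Xmx2g \
-Dcom.sun.management.jmxremote \
-Dcom.sun.management.jmxremote.port=9010 \
-Dcom.sun.management.jmxremote.authenticate=false \
-Dcom.sun.management.jmxremote.ssl=false \
Reader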
Please confirm these before proceeding:
Is there any infinite execution in a loop (for/while)?
Ensure your physical storage size.
Maximize buffer memory.
Note
As per my understanding, Apache POI should not consume that much memory.
I am just a beginner, but may I ask you some questions?
Why not use the XSSFWorkbook class to open the XLSX file? I mean, I always use it to handle XLSX files, and this time I tried with a 7 MB file (the largest I could find on my computer), and it worked perfectly.
Why not use the newer File API (NIO, Java 7)? Again, I do not know if this will make any difference or not, but it worked for me.
Windows 7 Ultimate | 64 bit | Intel 2nd Gen Core i3 | Eclipse Juno | JDK 1.7.45 | Apache POI 3.9
Path file = Paths.get("XYZABC.xlsx");
try {
XSSFWorkbook wb = new XSSFWorkbook(Files.newInputStream(file, StandardOpenOption.READ));
} catch (IOException e) {
System.out.println("Some IO Error!!!!");
}
Do tell if it works for you or not.
Have you tried using SXSSFWorkbook? We also used Apache POI to handle relatively big XLSX files, and we also had memory problems when using a plain XSSFWorkbook. Although we didn't have to read the files in, we were just writing tens of thousands of lines of information. Using this, our memory problems were solved. You can pass an XSSFWorkbook to its constructor, together with the number of rows you want to keep in memory.
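A minimal sketch of the streaming write path, assuming POI 3.8+ (names here are illustrative). The window argument caps how many rows stay on the heap; older rows are flushed to temporary files:
import java.io.FileOutputStream;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.xssf.streaming.SXSSFWorkbook;

public class StreamingWriteDemo {
public static void main(String[] args) throws Exception {
// Keep only 100 rows in memory; earlier rows are flushed to disk.
SXSSFWorkbook wb = new SXSSFWorkbook(100);
Sheet sheet = wb.createSheet("big");
for (int r = 0; r < 200000; r++) {
Row row = sheet.createRow(r);
row.createCell(0).setCellValue(r);
row.createCell(1).setCellValue("value " + r);
}
FileOutputStream out = new FileOutputStream("big.xlsx");
wb.write(out);
out.close();
wb.dispose(); // delete the temporary files backing the flushed rows
}
}
Note that SXSSF only helps on the writing side; for reading large files, the event/streaming readers shown further down (XSSFReader with a SAX handler) are the usual route.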
Java 1.8, Apache POI 3.17.
Based on the HSSF and XSSF Limitations and the POI Examples.
This launches my code:
public class Controller {
EX stressTest;
public void fineFile() {
String stresstest = "C:\\Stresstest.xlsx";
HashMap<String, String[]> stressTestMap = new HashMap<>();
stressTestMap.put("aaaa", new String[]{"myField", "The field"});
stressTestMap.put("bbbb", new String[]{"other", "Other value"});
try {
InputStream stressTestIS = new FileInputStream(stresstest);
stressTest = new EX(stresstest, stressTestIS, stressTestMap);
} catch (IOException exp) {
}
}
public void printErr() {
if (stressTest.thereAreErrors()) {
try {
FileWriter myWriter = new FileWriter(
"C:\\logErrorsStressTest" +
(new SimpleDateFormat("ddMMyyyyHHmmss")).format(new Date()) +
".txt"
);
myWriter.write(stressTest.getBodyFileErrors());
myWriter.close();
} catch (IOException e) {
e.printStackTrace();
}
} else {
}
}
public void createBD() {
List<OneObjectWhatever> entitiesList =
(
!stressTest.thereAreErrors()
? ((List<OneObjectWhatever>) stressTest.toListCustomerObject(OneObjectWhatever.class))
: new ArrayList<>()
);
entitiesList.forEach(entity -> {
Field[] fields = entity.getClass().getDeclaredFields();
String valueString = "";
for (Field attr : fields) {
try {
attr.setAccessible(true);
valueString += " StressTest:" + attr.getName() + ": -" + attr.get(fields) + "- ";
attr.setAccessible(true);
} catch (Exception reflectionError) {
System.out.println(reflectionError);
}
}
});
}
}
MY CODE
public class EX {
private HashMap<Integer, HashMap<Integer, String> > rows;
private List<String> errors;
private int maxColOfHeader, minColOfHeader;
private HashMap<Integer, String> header;
private HashMap<String,String[]> relationHeaderClassPropertyDescription;
private void initVariables(String name, InputStream file) {
this.rows = new HashMap();
this.header = new HashMap<>();
this.errors = new ArrayList<String>(){{add("["+name+"] empty cells in position -> ");}};
try{
InputStream is = FileMagic.prepareToCheckMagic(file);
FileMagic fm = FileMagic.valueOf(is);
is.close();
switch (fm) {
case OLE2:
XLS2CSVmra xls2csv = new XLS2CSVmra(name, 50, rows);
xls2csv.process();
System.out.println("OLE2");
break;
case OOXML:
File flatFile = new File(name);
OPCPackage p = OPCPackage.open(flatFile, PackageAccess.READ);
XLSX2CSV xlsx2csv = new XLSX2CSV(p, System.out, 50, this.rows);
xlsx2csv.process();
p.close();
System.out.println("OOXML");
break;
default:
System.out.println("Your InputStream was neither an OLE2 stream, nor an OOXML stream");
break;
}
} catch (IOException | EncryptedDocumentException | SAXException | OpenXML4JException exp){
System.out.println(exp);
exp.printStackTrace();
}
int rowHeader = rows.keySet().stream().findFirst().get();
this.header.putAll(rows.get(rowHeader));
this.rows.remove(rowHeader);
this.minColOfHeader = this.header.keySet().stream().findFirst().get();
this.maxColOfHeader = this.header.entrySet().stream()
.mapToInt(e -> e.getKey()).max()
.orElseThrow(NoSuchElementException::new);
}
public EX(String name, InputStream file, HashMap<String,String[]> relationHeaderClassPropertyDescription_) {
this.relationHeaderClassPropertyDescription = relationHeaderClassPropertyDescription_;
initVariables(name, file);
validate();
}
private void validate(){
rows.forEach((inx,row) -> {
for(int i = minColOfHeader; i <= maxColOfHeader; i++) {
//System.out.println("r:"+inx+" c:"+i+" cr:"+(!row.containsKey(i))+" vr:"+((!row.containsKey(i)) || row.get(i).trim().isEmpty())+" ch:"+header.containsKey(i)+" vh:"+(header.containsKey(i) && (!header.get(i).trim().isEmpty()))+" val:"+(row.containsKey(i)&&!row.get(i).trim().isEmpty()?row.get(i):"empty"));
if((!row.containsKey(i)) || row.get(i).trim().isEmpty()) {
if(header.containsKey(i) && (!header.get(i).trim().isEmpty())) {
String description = getRelationHeaders(i,1);
errors.add(" ["+header.get(i)+"]{"+description+"} = fila: "+(inx+1)+" - columna: "+ CellReference.convertNumToColString(i));
// System.out.println(" fila: "+inx+" - columna: " + i + " - valor: "+ (row.get(i).isEmpty()?"empty":row.get(i)));
}
}
}
});
header.forEach((i,v)->{System.out.println("stressTestMap.put(\""+v+"\", new String[]{\"{"+i+"}\",\"My description XD\"});");});
}
public String getBodyFileErrors()
{
return String.join(System.lineSeparator(), errors);
}
public boolean thereAreErrors() {
return errors.stream().count() > 1;
}
public<T extends Class> List<? extends Object> toListCustomerObject(T type) {
List<Object> list = new ArrayList<>();
rows.forEach((inx, row) -> {
try {
Object obj = type.newInstance();
for(int i = minColOfHeader; i <= maxColOfHeader; i++) {
if (row.containsKey(i) && !row.get(i).trim().isEmpty()) {
if (header.containsKey(i) && !header.get(i).trim().isEmpty()) {
if(relationHeaderClassPropertyDescription.containsKey(header.get(i))) {
String nameProperty = getRelationHeaders(i,0);
Field field = type.getDeclaredField(nameProperty);
try{
field.setAccessible(true);
field.set(obj, (isConvertibleTo(field.getType(),row.get(i)) ? toObject(field.getType(),row.get(i)) : defaultValue(field.getType())) );
field.setAccessible(false);
}catch (Exception fex) {
//System.out.println("113"+fex);
continue;
}
}
}
}
}
list.add(obj);
} catch (Exception ex) {
//System.out.println("123:"+ex);
}
});
return list;
}
private Object toObject( Class clazz, String value ) {
if( Boolean.class == clazz || Boolean.TYPE == clazz) return Boolean.parseBoolean( value );
if( Byte.class == clazz || Byte.TYPE == clazz) return Byte.parseByte( value );
if( Short.class == clazz || Short.TYPE == clazz) return Short.parseShort( value );
if( Integer.class == clazz || Integer.TYPE == clazz) return Integer.parseInt( value );
if( Long.class == clazz || Long.TYPE == clazz) return Long.parseLong( value );
if( Float.class == clazz || Float.TYPE == clazz) return Float.parseFloat( value );
if( Double.class == clazz || Double.TYPE == clazz) return Double.parseDouble( value );
return value;
}
private boolean isConvertibleTo( Class clazz, String value ) {
String ptn = "";
if( Boolean.class == clazz || Boolean.TYPE == clazz) ptn = ".*";
if( Byte.class == clazz || Byte.TYPE == clazz) ptn = "^\\d+$";
if( Short.class == clazz || Short.TYPE == clazz) ptn = "^\\d+$";
if( Integer.class == clazz || Integer.TYPE == clazz) ptn = "^\\d+$";
if( Long.class == clazz || Long.TYPE == clazz) ptn = "^\\d+$";
if( Float.class == clazz || Float.TYPE == clazz) ptn = "^\\d+(\\.\\d+)?$";
if( Double.class == clazz || Double.TYPE == clazz) ptn = "^\\d+(\\.\\d+)?$";
Pattern pattern = Pattern.compile(ptn, Pattern.CASE_INSENSITIVE);
Matcher matcher = pattern.matcher(value);
return matcher.find();
}
private Object defaultValue( Class clazz) {
if( Boolean.class == clazz || Boolean.TYPE == clazz) return Boolean.parseBoolean( "false" );
if( Byte.class == clazz || Byte.TYPE == clazz) return Byte.parseByte( "0" );
if( Short.class == clazz || Short.TYPE == clazz) return Short.parseShort( "0" );
if( Integer.class == clazz || Integer.TYPE == clazz) return Integer.parseInt( "0" );
if( Long.class == clazz || Long.TYPE == clazz) return Long.parseLong( "0" );
if( Float.class == clazz || Float.TYPE == clazz) return Float.parseFloat( "0.0" );
if( Double.class == clazz || Double.TYPE == clazz) return Double.parseDouble( "0.0" );
return "";
}
private String getRelationHeaders(Integer columnIndexHeader, Integer TypeOrDescription /*0 - Type, 1 - Description*/) {
try {
return relationHeaderClassPropertyDescription.get(header.get(columnIndexHeader))[TypeOrDescription];
} catch (Exception e) {
}
return header.get(columnIndexHeader);
}
}
These are the modifications I made to the examples:
XLSX2CSV
public class XLSX2CSV {
/**
* Uses the XSSF Event SAX helpers to do most of the work
* of parsing the Sheet XML, and outputs the contents
* as a (basic) CSV.
*/
private class SheetToCSV implements SheetContentsHandler {
private boolean firstCellOfRow = false;
private int currentRow = -1;
private int currentCol = -1;
HashMap<Integer, String> valuesCell;
private void outputMissingRows(int number) {
for (int i=0; i<number; i++) {
for (int j=0; j<minColumns; j++) {
output.append(',');
}
output.append('\n');
}
}
@Override
public void startRow(int rowNum) {
// If there were gaps, output the missing rows
outputMissingRows(rowNum-currentRow-1);
// Prepare for this row
firstCellOfRow = true;
currentRow = rowNum;
currentCol = -1;
valuesCell = new HashMap<>();
}
@Override
public void endRow(int rowNum) {
// Ensure the minimum number of columns
for (int i = currentCol; i < minColumns; i++) {
output.append(',');
}
output.append('\n');
if (!valuesCell.isEmpty())
_rows.put(rowNum, valuesCell);
}
@Override
public void cell(String cellReference, String formattedValue,
XSSFComment comment) {
if (firstCellOfRow) {
firstCellOfRow = false;
} else {
output.append(',');
}
// gracefully handle missing CellRef here in a similar way as XSSFCell does
if (cellReference == null) {
cellReference = new CellAddress(currentRow, currentCol).formatAsString();
}
// Did we miss any cells?
int thisCol = (new CellReference(cellReference)).getCol();
int missedCols = thisCol - currentCol - 1;
for (int i = 0; i < missedCols; i++) {
output.append(',');
}
currentCol = thisCol;
if (!formattedValue.isEmpty())
valuesCell.put(thisCol, formattedValue);
// Number or string?
output.append(formattedValue);
/*try {
//noinspection ResultOfMethodCallIgnored
Double.parseDouble(formattedValue);
output.append(formattedValue);
} catch (NumberFormatException e) {
output.append('"');
output.append(formattedValue);
output.append('"');
}*/
}
@Override
public void headerFooter(String text, boolean isHeader, String tagName) {
// Skip, no headers or footers in CSV
}
}
///////////////////////////////////////
private final OPCPackage xlsxPackage;
/**
* Number of columns to read starting with leftmost
*/
private final int minColumns;
/**
* Destination for data
*/
private final PrintStream output;
public HashMap<Integer, HashMap<Integer, String>> _rows;
/**
* Creates a new XLSX -> CSV converter
*
@param pkg The XLSX package to process
@param output The PrintStream to output the CSV to
@param minColumns The minimum number of columns to output, or -1 for no minimum
*/
public XLSX2CSV(OPCPackage pkg, PrintStream output, int minColumns, HashMap<Integer, HashMap<Integer, String> > __rows) {
this.xlsxPackage = pkg;
this.output = output;
this.minColumns = minColumns;
this._rows = __rows;
}
/**
* Parses and shows the content of one sheet
* using the specified styles and shared-strings tables.
*
@param styles The table of styles that may be referenced by cells in the sheet
@param strings The table of strings that may be referenced by cells in the sheet
@param sheetInputStream The stream to read the sheet-data from.
@exception java.io.IOException An IO exception from the parser,
possibly from a byte stream or character stream
supplied by the application.
@throws SAXException if parsing the XML data fails.
*/
public void processSheet(
StylesTable styles,
ReadOnlySharedStringsTable strings,
SheetContentsHandler sheetHandler,
InputStream sheetInputStream) throws IOException, SAXException {
DataFormatter formatter = new DataFormatter();
InputSource sheetSource = new InputSource(sheetInputStream);
try {
XMLReader sheetParser = SAXHelper.newXMLReader();
ContentHandler handler = new XSSFSheetXMLHandler(
styles, null, strings, sheetHandler, formatter, false);
sheetParser.setContentHandler(handler);
sheetParser.parse(sheetSource);
} catch(ParserConfigurationException e) {
throw new RuntimeException("SAX parser appears to be broken - " + e.getMessage());
}
}
/**
* Initiates the processing of the XLS workbook file to CSV.
*
@throws IOException If reading the data from the package fails.
@throws SAXException if parsing the XML data fails.
*/
public void process() throws IOException, OpenXML4JException, SAXException {
ReadOnlySharedStringsTable strings = new ReadOnlySharedStringsTable(this.xlsxPackage);
XSSFReader xssfReader = new XSSFReader(this.xlsxPackage);
StylesTable styles = xssfReader.getStylesTable();
XSSFReader.SheetIterator iter = (XSSFReader.SheetIterator) xssfReader.getSheetsData();
int index = 0;
while (iter.hasNext()) {
InputStream stream = iter.next();
String sheetName = iter.getSheetName();
this.output.println();
this.output.println(sheetName + " [index=" + index + "]:");
processSheet(styles, strings, new SheetToCSV(), stream);
stream.close();
++index;
break;
}
}
}
XLS2CSVmra
public class XLS2CSVmra implements HSSFListener {
private int minColumns;
private POIFSFileSystem fs;
private PrintStream output;
public HashMap<Integer, HashMap<Integer, String>> _rows;
private HashMap<Integer, String> valuesCell;
private int lastRowNumber;
private int lastColumnNumber;
/** Should we output the formula, or the value it has? */
private boolean outputFormulaValues = false;
/** For parsing Formulas */
private SheetRecordCollectingListener workbookBuildingListener;
private HSSFWorkbook stubWorkbook;
// Records we pick up as we process
private SSTRecord sstRecord;
private FormatTrackingHSSFListener formatListener;
/** So we know which sheet we're on */
private int sheetIndex = -1;
private BoundSheetRecord[] orderedBSRs;
private List<BoundSheetRecord> boundSheetRecords = new ArrayList<BoundSheetRecord>();
// For handling formulas with string results
private int nextRow;
private int nextColumn;
private boolean outputNextStringRecord;
/**
* Creates a new XLS -> CSV converter
@param fs The POIFSFileSystem to process
@param output The PrintStream to output the CSV to
@param minColumns The minimum number of columns to output, or -1 for no minimum
*/
public XLS2CSVmra(POIFSFileSystem fs, PrintStream output, int minColumns, HashMap<Integer, HashMap<Integer, String>> __rows) {
this.fs = fs;
this.output = output;
this.minColumns = minColumns;
this._rows = __rows;
this.valuesCell = new HashMap<>();
}
/**
* Creates a new XLS -> CSV converter
* #param filename The file to process
* #param minColumns The minimum number of columns to output, or -1 for no minimum
* #throws IOException
* #throws FileNotFoundException
*/
public XLS2CSVmra(String filename, int minColumns, HashMap<Integer, HashMap<Integer, String>> __rows) throws IOException, FileNotFoundException {
this(
new POIFSFileSystem(new FileInputStream(filename)),
System.out, minColumns,
__rows
);
}
/**
* Initiates the processing of the XLS file to CSV
*/
public void process() throws IOException {
MissingRecordAwareHSSFListener listener = new MissingRecordAwareHSSFListener(this);
formatListener = new FormatTrackingHSSFListener(listener);
HSSFEventFactory factory = new HSSFEventFactory();
HSSFRequest request = new HSSFRequest();
if(outputFormulaValues) {
request.addListenerForAllRecords(formatListener);
} else {
workbookBuildingListener = new SheetRecordCollectingListener(formatListener);
request.addListenerForAllRecords(workbookBuildingListener);
}
factory.processWorkbookEvents(request, fs);
}
/**
* Main HSSFListener method, processes events, and outputs the
* CSV as the file is processed.
*/
@Override
public void processRecord(Record record) {
if(sheetIndex>0)
return;
int thisRow = -1;
int thisColumn = -1;
String thisStr = null;
switch (record.getSid())
{
case BoundSheetRecord.sid:
if(sheetIndex==-1)
boundSheetRecords.add((BoundSheetRecord)record);
break;
case BOFRecord.sid:
BOFRecord br = (BOFRecord)record;
if(br.getType() == BOFRecord.TYPE_WORKSHEET && sheetIndex==-1) {
// Create sub workbook if required
if(workbookBuildingListener != null && stubWorkbook == null) {
stubWorkbook = workbookBuildingListener.getStubHSSFWorkbook();
}
// Output the worksheet name
// Works by ordering the BSRs by the location of
// their BOFRecords, and then knowing that we
// process BOFRecords in byte offset order
sheetIndex++;
if(orderedBSRs == null) {
orderedBSRs = BoundSheetRecord.orderByBofPosition(boundSheetRecords);
}
output.println();
output.println(
orderedBSRs[sheetIndex].getSheetname() +
" [" + (sheetIndex+1) + "]:"
);
}
break;
case SSTRecord.sid:
sstRecord = (SSTRecord) record;
break;
case BlankRecord.sid:
BlankRecord brec = (BlankRecord) record;
thisRow = brec.getRow();
thisColumn = brec.getColumn();
thisStr = "";
break;
case BoolErrRecord.sid:
BoolErrRecord berec = (BoolErrRecord) record;
thisRow = berec.getRow();
thisColumn = berec.getColumn();
thisStr = "";
break;
case FormulaRecord.sid:
FormulaRecord frec = (FormulaRecord) record;
thisRow = frec.getRow();
thisColumn = frec.getColumn();
if(outputFormulaValues) {
if(Double.isNaN( frec.getValue() )) {
// Formula result is a string
// This is stored in the next record
outputNextStringRecord = true;
nextRow = frec.getRow();
nextColumn = frec.getColumn();
} else {
thisStr = formatListener.formatNumberDateCell(frec);
}
} else {
thisStr = '"' +
HSSFFormulaParser.toFormulaString(stubWorkbook, frec.getParsedExpression()) + '"';
}
break;
case StringRecord.sid:
if(outputNextStringRecord) {
// String for formula
StringRecord srec = (StringRecord)record;
thisStr = srec.getString();
thisRow = nextRow;
thisColumn = nextColumn;
outputNextStringRecord = false;
}
break;
case LabelRecord.sid:
LabelRecord lrec = (LabelRecord) record;
thisRow = lrec.getRow();
thisColumn = lrec.getColumn();
thisStr = '"' + lrec.getValue() + '"';
break;
case LabelSSTRecord.sid:
LabelSSTRecord lsrec = (LabelSSTRecord) record;
thisRow = lsrec.getRow();
thisColumn = lsrec.getColumn();
if(sstRecord == null) {
thisStr = '"' + "(No SST Record, can't identify string)" + '"';
} else {
thisStr = '"' + sstRecord.getString(lsrec.getSSTIndex()).toString() + '"';
}
break;
case NoteRecord.sid:
NoteRecord nrec = (NoteRecord) record;
thisRow = nrec.getRow();
thisColumn = nrec.getColumn();
// TODO: Find object to match nrec.getShapeId()
thisStr = '"' + "(TODO)" + '"';
break;
case NumberRecord.sid:
NumberRecord numrec = (NumberRecord) record;
thisRow = numrec.getRow();
thisColumn = numrec.getColumn();
// Format
thisStr = formatListener.formatNumberDateCell(numrec);
break;
case RKRecord.sid:
RKRecord rkrec = (RKRecord) record;
thisRow = rkrec.getRow();
thisColumn = rkrec.getColumn();
thisStr = '"' + "(TODO)" + '"';
break;
default:
break;
}
// Handle new row
if(thisRow != -1 && thisRow != lastRowNumber) {
lastColumnNumber = -1;
}
// Handle missing column
if(record instanceof MissingCellDummyRecord) {
MissingCellDummyRecord mc = (MissingCellDummyRecord)record;
thisRow = mc.getRow();
thisColumn = mc.getColumn();
thisStr = "";
}
// If we got something to print out, do so
if(thisStr != null) {
if (thisColumn > 0) {
output.print(',');
}
if (!thisStr.isEmpty())
valuesCell.put(thisColumn, thisStr);
output.print(thisStr);
}
// Update column and row count
if(thisRow > -1)
lastRowNumber = thisRow;
if(thisColumn > -1)
lastColumnNumber = thisColumn;
// Handle end of row
if(record instanceof LastCellOfRowDummyRecord) {
// Print out any missing commas if needed
if(minColumns > 0) {
// Columns are 0 based
if(lastColumnNumber == -1) { lastColumnNumber = 0; }
for(int i=lastColumnNumber; i<(minColumns); i++) {
output.print(',');
}
}
// We're onto a new row
lastColumnNumber = -1;
// End the row
output.println();
if(!valuesCell.isEmpty()) {
HashMap<Integer, String> newRow = new HashMap<>();
valuesCell.forEach((inx,vStr) -> {
newRow.put(inx, vStr);
});
_rows.put(lastRowNumber, newRow);
valuesCell = new HashMap<>();
}
}
}
}
This code gets the file name, but I want to get the file path:
private List <String> checkFiles(FTPClient clients){
List <String> it = new ArrayList <String>();
try {
FTPFile[] ftpFiles = clients.listFiles();
int length = ftpFiles.length;
for (int i = 0; i < length; i++) {
String name = ftpFiles[i].getName();
Calendar date = ftpFiles[i].getTimestamp();
Log.v("aasd", name );
it.add (name);
}
} catch(Exception e) {
e.printStackTrace();
}
return it ;
}
The path is in the client, not the files.
String path = clients.printWorkingDirectory();
If you want a specific path:
client.changeWorkingDirectory(pathName); e.g. client.changeWorkingDirectory("folder1/folder2"), where folder2 is inside folder1.
System.out.println(client.printWorkingDirectory());
printWorkingDirectory() gives the current path.
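Putting the two together, a small sketch based on the question's method, prefixing each name with the directory the client reports (assumes org.apache.commons.net.ftp and that the client stays in one directory while listing):
private List<String> checkFilePaths(FTPClient client) {
List<String> paths = new ArrayList<String>();
try {
String dir = client.printWorkingDirectory(); // e.g. "/folder1/folder2"
for (FTPFile f : client.listFiles()) {
paths.add(dir + "/" + f.getName()); // full path = working directory + name
}
} catch (Exception e) {
e.printStackTrace();
}
return paths;
}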
The code below finds the paths of all files in a folder on an FTP server (this example is C#).
ftpPath looks like "ftpserver/folder". The list contains the paths of all files in the folder.
public List<string> GetFilesPath(string ftpPath)
{
FtpWebRequest request;
string FtpServerPath = ftpPath;
List<string> filePathList=new List<string>();
try
{
request = WebRequest.Create(new Uri(FtpServerPath)) as FtpWebRequest;
request.Method = WebRequestMethods.Ftp.ListDirectoryDetails;
request.UseBinary = true;
request.UsePassive = true;
request.KeepAlive = true;
request.Credentials = new NetworkCredential("ftpuser", "ftpPassword");
request.ConnectionGroupName = "group";
Stream rs = (Stream)request.GetResponse().GetResponseStream();
StreamReader sr = new StreamReader(rs);
string strList = sr.ReadToEnd();
string[] lines = null;
if (strList.Contains("\r\n"))
{
lines = strList.Split(new string[] { "\r\n" }, StringSplitOptions.None);
}
else if (strList.Contains("\n"))
{
lines = strList.Split(new string[] { "\n" }, StringSplitOptions.None);
}
if (lines == null || lines.Length == 0)
return null;
else{
foreach (string line in lines)
{
if (line.Length == 0)
continue;
int x=line.LastIndexOf(' ');
int len = line.Length;
var str = line.Substring( (x+1), (len - x - 1));
var filePath = FtpServerPath+"/"+str;
filePathList.Add(filePath);
}
}
}
catch (Exception ex)
{
MessageBox.Show("Error: " + ex.Message);
}
return filePathList;
}