Issue with writing a List into CSV file using CSVWriter - java

I have the following method to write a list into a CSV file using CSVWriter. Unfortunately, the values are not separated by commas, which makes the file messy when I open it in Excel. How can I modify it?
private void generateCSV(List<String> dataset) throws IOException {
CSVWriter writer = null;
JFileChooser chooser = new JFileChooser();
chooser.setAcceptAllFileFilterUsed(true);
if (chooser.showSaveDialog(chooser) == JFileChooser.APPROVE_OPTION) {
File f = chooser.getSelectedFile();
String file_name = f.toString();
if (!(file_name.endsWith(".csv"))) {
file_name += ".csv";
}
writer = new CSVWriter(new FileWriter(f));
for(int i=0; i< dataset.size(); i++){
String[] str = new String[] {dataset.get(i)};
writer.writeNext(str);
}
} else {
return;
}
writer.close();
}

Your for loop creates a new String[] for each element of the dataset and writes it with writeNext(), so every line contains exactly one element. Nothing is comma-separated because there is only one value per line, followed by the line separator.
I think this is what you want: put all elements into a single array and write it once. Am I right?
private void generateCSV(List<String> dataset) throws IOException {
CSVWriter writer = null;
JFileChooser chooser = new JFileChooser();
chooser.setAcceptAllFileFilterUsed(true);
if (chooser.showSaveDialog(chooser) == JFileChooser.APPROVE_OPTION) {
File f = chooser.getSelectedFile();
String file_name = f.toString();
if (!(file_name.endsWith(".csv"))) {
file_name += ".csv";
}
writer = new CSVWriter(new FileWriter(file_name)); // use the name that has the .csv extension appended
String[] str = new String[dataset.size()];
for (int i = 0; i < dataset.size(); i++) {
str[i] = dataset.get(i);
}
writer.writeNext(str);
} else {
return;
}
writer.close();
}
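To make the difference concrete without depending on a CSV library, here is a minimal plain-Java sketch. The class and method names (RowVsColumn, asSingleRow, asOneValuePerLine) are made up for illustration, and the sketch does no quoting, so it assumes the values contain no commas, quotes, or line breaks. As far as I know, OpenCSV additionally wraps each value in double quotes by default, so its real output will look slightly different.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class, for illustration only (not part of OpenCSV).
class RowVsColumn {

    // All values in ONE record: mirrors the layout you get when a single
    // String[] holding every element is written once.
    static String asSingleRow(List<String> dataset) {
        return String.join(",", dataset) + "\n";
    }

    // One value per record: mirrors the layout you get when each element
    // is written as its own one-element array, so no commas ever appear.
    static String asOneValuePerLine(List<String> dataset) {
        StringBuilder sb = new StringBuilder();
        for (String value : dataset) {
            sb.append(value).append("\n");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> dataset = Arrays.asList("a", "b", "c");
        System.out.print(asSingleRow(dataset));       // a,b,c
        System.out.print(asOneValuePerLine(dataset)); // a, b and c on separate lines
    }
}
```

With the library itself, dataset.toArray(new String[0]) builds the array in one call instead of the copy loop.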


Not able to write string to temp file in spring boot

I am creating a temp file (outputFile) and writing text to it with a BufferedWriter. I am not getting an exception, but the string is not appended. I also printed the contents of outputFile with System.out.println, but it prints nothing.
File outputFile = File.createTempFile("abc",".tmp");
ArrayList<LoadDirectoryResponse> dirlist = new ArrayList<LoadDirectoryResponse>();
ArrayList<LoadDirectoryResponse> dirlistReq = new ArrayList<LoadDirectoryResponse>();
filenames = filenames.substring(0, filenames.length()-1);
String[] finalFileName = filenames.split(",");
dirlistReq = DataQuestService.mapLoadDirectoryList(lines,keyword,finalFileName);
for(MultipartFile multifile : files)
{
String fileName = multifile.getOriginalFilename();
String prefix = fileName.substring(fileName.lastIndexOf("."));
File file = null;
file = File.createTempFile(fileName, prefix);
multifile.transferTo(file);
for(LoadDirectoryResponse obj: dirlistReq )
{
if(fileName.equals(obj.getFilename()))
{
LoadDirectoryResponse objRes = new LoadDirectoryResponse(obj.getFilename(),obj.getKeyWord(),obj.getLinesToBeCopied(),obj.isChecked());
Scanner sc = new Scanner(file);
if(obj.isChecked()) {
BufferedWriter br = new BufferedWriter(new FileWriter(outputFile));
ArrayList<String> linesToBeAdd = new ArrayList<String>();
int i = 0;
while (sc.hasNextLine()) {
String line = sc.nextLine();
linesToBeAdd.add(line);
if(line.contains(obj.getKeyWord()))
{
String value = DataQuestService.getLinesToBeAdd(i,linesToBeAdd,lines);
br.write(value);
break;
}
else
{
objRes.setStatus("Not Found");
}
}
}
dirlist.add(objRes);
}
}
}
Scanner sc1 = new Scanner(outputFile);
while(sc1.hasNextLine())
{
System.out.println(sc1.nextLine());
}
return ResponseEntity.ok()
.header(HttpHeaders.CONTENT_DISPOSITION, "attachment; filename=\"" + outputFile.getName() + "\"")
.body(outputFile);
}
Please help with this
Thanks in Advance
When you use a BufferedWriter, you need to flush and close it; otherwise the buffered text may never reach the file.
You can use try-with-resources in Java 7+:
try (BufferedWriter br = new BufferedWriter(new FileWriter(outputFile))) {
// write something...
}
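In this case, I recommend moving the BufferedWriter outside the loop, because new FileWriter(outputFile) truncates the file each time it is opened and you are writing the output of many files into the same temporary file.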
// Your code ...
dirlistReq = DataQuestService.mapLoadDirectoryList(lines,keyword,finalFileName);
try (BufferedWriter br = new BufferedWriter(new FileWriter(outputFile))) {
for(MultipartFile multifile : files) {
String fileName = multifile.getOriginalFilename();
// Your code ...
}
}
// Your code ...
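A small self-contained sketch of the same pattern, in case it helps to see it run outside Spring. The class name, the "abc" prefix and the sample lines are placeholders; the point is only that reading the file back happens after the try-with-resources block has closed (and therefore flushed) the writer.

```java
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class; the file prefix and the sample lines are made up.
class TempFileWriteDemo {

    // Writes all lines through ONE BufferedWriter and returns the temp file.
    static File writeLines(List<String> lines) throws IOException {
        File outputFile = File.createTempFile("abc", ".tmp");
        // try-with-resources closes (and therefore flushes) the writer,
        // even if write() throws.
        try (BufferedWriter bw = new BufferedWriter(new FileWriter(outputFile))) {
            for (String line : lines) {
                bw.write(line);
                bw.newLine();
            }
        }
        return outputFile;
    }

    public static void main(String[] args) throws IOException {
        File f = writeLines(Arrays.asList("first", "second"));
        // Reading happens only after the writer has been closed.
        System.out.println(Files.readAllLines(f.toPath()));
        f.delete();
    }
}
```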

How to convert txt file delimited by pipe symbol to xls file in Java

I need to get the data from a CSV file into Excel in Selenium.
I have a CSV file in a format like this:
PERIOD|EMPLID|EMPL_RCD|HOME HOST|NAME|FIRST_NAME|LAST_NAME|FTE|EMPL_STATUS
5/04/2018|78787|0|Home|mandon|steven|jabobs|1|A
6/04/2018|78789|0|Home|stacy|carvin|tans|1|A
11/04/2018|17892|0|Home|neel|harvis|bammer|1|A
I need to have this data in Excel as shown in the image:
EDIT: My attempt at creating an Excel file
I am using the code below to generate the .xls file from the CSV file with the pipe delimiter, as shown in the image,
but it is giving a java.lang.NullPointerException after reading the first line.
public class DelimitedToXls {
@SuppressWarnings("deprecation")
public static void main(String args[]) throws IOException {
ArrayList<ArrayList<String>> allRowAndColData = null;
ArrayList<String> oneRowData = null;
String fName = "C:\\input.csv";
String currentLine;
FileInputStream fis = new FileInputStream(fName);
DataInputStream myInput = new DataInputStream(fis);
int i = 0;
allRowAndColData = new ArrayList<ArrayList<String>>();
while ((currentLine = myInput.readLine()) != null) {
oneRowData = new ArrayList<String>();
String oneRowArray[] = currentLine.split(";");
for (int j = 0; j < oneRowArray.length; j++) {
oneRowData.add(oneRowArray[j]);
}
allRowAndColData.add(oneRowData);
System.out.println();
i++;
}
try {
HSSFWorkbook workBook = new HSSFWorkbook();
HSSFSheet sheet = workBook.createSheet("sheet1");
for (int i = 0; i < allRowAndColData.size(); i++) {
ArrayList<?> ardata = (ArrayList<?>) allRowAndColData.get(i);
HSSFRow row = sheet.createRow((short) 0 + i);
for (int k = 0; k < ardata.size(); k++) {
System.out.print(ardata.get(k));
HSSFCell cell = row.createCell((short) k);
cell.setCellValue(ardata.get(k).toString());
}
System.out.println();
}
FileOutputStream fileOutputStream = new FileOutputStream("C:\\outputFile.xls");
workBook.write(fileOutputStream);
fileOutputStream.close();
} catch (Exception ex) {
}
}
}
You can open the file directly with Excel: File -> Open, then select the file type 'Text Files'.
Select the file, then choose the 'Delimited' option in the next window. In the window after that, select 'Other' and type | as the delimiter.
Then save it as xls.
That's all.
You have 3 main options:
1. Open it directly with Excel, setting the delimiter to | (pipe).
2. Rewrite it as a valid CSV (Comma-Separated Values) file (i.e., replace the pipes with commas).
3. Write the content of the file into a proper Excel file.
Option 1 - Open it directly with Excel
See Fabrizio's answer.
Option 2 - Rewrite it as a valid CSV file
If you are sure you have no commas in your file
You just need to replace all occurrences of | by , to have a valid csv (Comma-Separated Values) file. Then you can open it with Excel.
String fileName = "/path/to/your/file/textFile.txt";
String csvFileName = "/path/to/your/file/csvFile.csv";
try (BufferedReader br = new BufferedReader(new FileReader(fileName));
Writer writer = new FileWriter(csvFileName)) {
String line;
while ((line = br.readLine()) != null) {
writer.append(line.replaceAll("[|]", ","));
writer.append("\n");
}
} catch(Exception e) {
e.printStackTrace();
}
This code changes the content of your file to
PERIOD,EMPLID,EMPL_RCD,HOME HOST,NAME,FIRST_NAME,LAST_NAME,FTE,EMPL_STATUS
5/04/2018,78787,0,Home,mandon,steven,jabobs,1,A
6/04/2018,78789,0,Home,stacy,carvin,tans,1,A
11/04/2018,17892,0,Home,neel,harvis,bammer,1,A
If you might have commas in your file
You need to read the line token by token, surround every token that contains a comma with double quotes, and then join the tokens with commas instead of pipes. For example, this line
5/04/2018|78787|0|Home, Work|mandon|steven|jabobs|1|A
Would be transformed to
5/04/2018,78787,0,"Home, Work",mandon,steven,jabobs,1,A
You can do it this way:
String fileName = "/path/to/your/file/textFile.txt";
String csvFileName = "/path/to/your/file/csvFile.csv";
try (BufferedReader br = new BufferedReader(new FileReader(fileName));
Writer writer = new FileWriter(csvFileName)) {
String line;
while ((line = br.readLine()) != null) {
String csvLine = Arrays.stream(line.split("[|]")) // split on pipes
.map(token -> token.contains(",") ? "\""+token+"\"" : token) // surround with double quotes if there is a comma in the value
.collect(Collectors.joining(",", "", "\n")); // join with commas
writer.append(csvLine);
}
} catch(Exception e) {
e.printStackTrace();
}
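The snippet above only handles commas. A slightly more defensive sketch of the per-line conversion is shown below; it also doubles embedded double quotes, which is the usual CSV convention, and keeps trailing empty columns. The class and method names are hypothetical, and it assumes the input never contains an escaped pipe inside a value.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

// Hypothetical helper, for illustration only.
class PipeToCsv {

    // Quotes a field if it contains a comma, a double quote or a line break,
    // doubling any embedded double quotes (the usual CSV convention).
    static String quoteIfNeeded(String field) {
        boolean needsQuotes = field.contains(",") || field.contains("\"")
                || field.contains("\n") || field.contains("\r");
        if (!needsQuotes) {
            return field;
        }
        return "\"" + field.replace("\"", "\"\"") + "\"";
    }

    // Converts one pipe-delimited line to one CSV line (no trailing newline).
    static String toCsvLine(String pipeLine) {
        // The -1 limit keeps trailing empty fields such as "a|b||".
        return Arrays.stream(pipeLine.split("[|]", -1))
                .map(PipeToCsv::quoteIfNeeded)
                .collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        System.out.println(toCsvLine("5/04/2018|78787|0|Home, Work|mandon|steven|jabobs|1|A"));
    }
}
```

In the file loop you would call toCsvLine(line) and append the result followed by a newline.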
Option 3 - Writing to an Excel file
You can also create a proper Excel file (.xls or .xlsx) using the Apache POI library. Here is an example using POI-OOXML 3.17 (the latest version as of today). You can get it from the Maven Repository.
String fileName = "/path/to/your/file/textFile.txt";
String excelFileName = "/path/to/your/file/excelFile.xlsx";
// Create a Workbook and a sheet in it
XSSFWorkbook workbook = new XSSFWorkbook();
XSSFSheet sheet = workbook.createSheet("Sheet1");
// Read your input file and make cells into the workbook
try (BufferedReader br = new BufferedReader(new FileReader(fileName))) {
String line;
Row row;
Cell cell;
int rowIndex = 0;
while ((line = br.readLine()) != null) {
row = sheet.createRow(rowIndex);
String[] tokens = line.split("[|]");
for(int iToken = 0; iToken < tokens.length; iToken++) {
cell = row.createCell(iToken);
cell.setCellValue(tokens[iToken]);
}
rowIndex++;
}
} catch(Exception e) {
e.printStackTrace();
}
// Write your xlsx file
try (FileOutputStream outputStream = new FileOutputStream(excelFileName)) {
workbook.write(outputStream);
workbook.close();
} catch (IOException e) {
e.printStackTrace();
}
You can replace each | with a tab character (\t) and store the result in a .csv file.
I have done something similar while converting CSV to XLS:
try{
FileReader fr=new FileReader("TEJAS.CSV");
FileWriter fw=new FileWriter("TEJASEXEL.xls");
int c; // read() returns the next character as an int, or -1 at end of file
while((c=fr.read())!=-1){
if(c==','){
c='\t';
}
fw.write(c);
}
fr.close();
fw.close();
} catch (IOException e) {
e.printStackTrace();
}

Read and Write CSV File using Java

I have a CSV log file and it contains many rows like this:
2016-06-21 12:00:00,000 : helloworld: header1=2;header2=6;header=0
I want to write them to a new CSV file.
public void readLogFile() throws Exception
{
String currentLine = "";
String nextLine = "";
BufferedReader reader = new BufferedReader(new FileReader(file(false)));
while ((currentLine = reader.readLine()) != null)
{
if (currentLine.contains("2016") == true)
{
nextLine = reader.readLine();
if (nextLine.contains("helloworld") == true)
{
currentLine = currentLine.substring(0, 23);
nextLine = nextLine.substring(22, nextLine.length());
String nextBlock = replaceAll(nextLine);
System.out.println(currentLine + " : helloworld: " + nextBlock);
String[] data = nextBlock.split(";");
for (int i = 0, max = data.length; i < max; i++)
{
String[] d = data[i].split("=");
map.put(d[0], d[1]);
}
}
}
}
reader.close();
}
This is my method to write the content:
public void writeContentToCsv() throws Exception
{
FileWriter writer = new FileWriter(".../file_new.csv");
for (Map.Entry<String, String> entry : map.entrySet())
{
writer.append(entry.getKey()).append(";").append(entry.getValue()).append(System.getProperty("line.separator"));
}
writer.close();
}
This is the output I want to have:
header1; header2; header3
2;6;0
1;5;1
5;8;8
...
Currently, the CSV file looks like this (only showing one dataset):
header1;4
header2;0
header3;0
Can anyone help me fix the code?
Create a class to store the header values of one log line, and keep the instances in a list.
Then iterate over the list to write the results.
The map you currently use stores only one value per key (the header name and its corresponding value), so each new log line overwrites the previous values:
map.put(d[0], d[1]);
Here d[0] will be header1 and d[1] will be 4 (but we only want the 4).
class HeaderValues {
String[] header = new String[3];
}
// Shared by readLogFile() and writeContentToCsv()
private List<HeaderValues> list = new ArrayList<>();
public void readLogFile() throws Exception
{
String currentLine = "";
BufferedReader reader = new BufferedReader(new FileReader(file(false)));
while ((currentLine = reader.readLine()) != null)
{
if (currentLine.contains("2016") && currentLine.contains("helloworld"))
{
String nextBlock = replaceAll(currentLine.substring(22, currentLine.length()));
String[] data = nextBlock.split(";");
HeaderValues headerValues = new HeaderValues();
//Assuming data.length will always be 3.
for (int i = 0, max = data.length; i < max; i++)
{
String[] d = data[i].split("=");
//Assuming split will always have size 2
headerValues.header[i] = d[1];
}
list.add(headerValues);
}
}
reader.close();
}
public void writeContentToCsv() throws Exception
{
FileWriter writer = new FileWriter(".../file_new.csv");
for (HeaderValues value : list)
{
// One row per log line, terminated by a line separator
writer.append(value.header[0]).append(";").append(value.header[1]).append(";").append(value.header[2]).append(System.getProperty("line.separator"));
}
writer.close();
}
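As a rough illustration of the per-line step, here is a minimal sketch that turns the key=value block of one log line into a row of values. The class and method names are hypothetical, it locates the block by searching for the "helloworld:" marker rather than using fixed substring offsets, and it assumes the log layout matches the sample line from the question.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; the log layout is assumed to match the sample line
// "2016-06-21 12:00:00,000 : helloworld: header1=2;header2=6;header=0".
class LogToCsv {

    // Extracts the value after each '=' from "k1=v1;k2=v2;..." and joins them with ';'.
    static String valuesRow(String keyValueBlock) {
        List<String> values = new ArrayList<>();
        for (String pair : keyValueBlock.split(";")) {
            String[] kv = pair.split("=", 2);
            // A pair without '=' yields an empty value instead of throwing.
            values.add(kv.length == 2 ? kv[1].trim() : "");
        }
        return String.join(";", values);
    }

    // Returns the part of a log line after the "helloworld:" marker, or null if absent.
    static String keyValueBlock(String logLine) {
        String marker = "helloworld:";
        int idx = logLine.indexOf(marker);
        return idx < 0 ? null : logLine.substring(idx + marker.length()).trim();
    }

    public static void main(String[] args) {
        String line = "2016-06-21 12:00:00,000 : helloworld: header1=2;header2=6;header=0";
        System.out.println("header1;header2;header3"); // header row, written once
        System.out.println(valuesRow(keyValueBlock(line))); // 2;6;0
    }
}
```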
For writing to CSV
public void writeCSV() {
// Record separator used in the CSV file
final String NEW_LINE_SEPARATOR = "\n";
// CSV file header
final Object[] FILE_HEADER = { "Employee Name","Employee Code", "In Time", "Out Time", "Duration", "Is Working Day" };
String fileName = "fileName.csv";
List<Objects> objects = new ArrayList<Objects>();
FileWriter fileWriter = null;
CSVPrinter csvFilePrinter = null;
// Create the CSVFormat object with "\n" as a record delimiter
CSVFormat csvFileFormat = CSVFormat.DEFAULT.withRecordSeparator(NEW_LINE_SEPARATOR);
try {
fileWriter = new FileWriter(fileName);
csvFilePrinter = new CSVPrinter(fileWriter, csvFileFormat);
csvFilePrinter.printRecord(FILE_HEADER);
// Write a new student object list to the CSV file
for (Object object : objects) {
List<String> record = new ArrayList<String>();
record.add(object.getValue1().toString());
record.add(object.getValue2().toString());
record.add(object.getValue3().toString());
csvFilePrinter.printRecord(record);
}
} catch (Exception e) {
e.printStackTrace();
} finally {
try {
fileWriter.flush();
fileWriter.close();
csvFilePrinter.close();
} catch (IOException e) {
e.printStackTrace();
}
}
}
Read and write/append CSV file using org.apache.commons.csv.CSVParser.
public void appendCSV(){
String [] records = {};
String csvWrite= "";
Boolean status = false;
try(BufferedReader csvReaders = new BufferedReader(new FileReader("csvfile.csv"));
CSVParser parser = CSVFormat.DEFAULT.withDelimiter(',').withHeader().parse(csvReaders);
) {
for(CSVRecord record : parser) {
status= record.get("Microservice").equalsIgnoreCase(apipath);
int status_code=0;
String httpMethod = record.get("Method");
if(status==true) {
csvWrite = record.get("apiName")+"-"+record.get("Microservice")+"-"+record.get("R_Data")+"-"+record.get("Method")+"-"+record.get("A_Status")+"-"+400+"-"+record.get("A_Response")+"-"+"{}";
records = csvWrite.split("-");
CSVWriter writer = new CSVWriter(new FileWriter(pathTowritecsv,true));
writer.writeNext(records);
writer.close();
}else {
}
}
}
catch (Exception e) {
System.out.println(e);
}
}

File Problems Java

I am struggling to understand what I am doing wrong here. I have checked many times that the file does exist, but I can't get the for loop to find it. When I debug this section of code, the debugger shows the path for the variable "folder" but says that filePath is null for that variable. I am very confused; any help would be amazing.
String path = varablePath1;
File folder = new File(path);
if (folder.exists()){
System.out.println("got folder");
}
File[] listOfFiles = folder.listFiles();
for (int i = 0; i < listOfFiles.length; i++) {
if (listOfFiles[i].isDirectory()) {
String FileNames = listOfFiles[i].getName();
FileWriter fw1 = new FileWriter(file1, true);
BufferedWriter bw1 = new BufferedWriter(fw1);
bw1.write(FileNames);
bw1.newLine();
bw1.close();
}
}
You can check in code whether the path is a folder or a file; if it is a folder, list its contents and write the names using a BufferedWriter. This works fine for me. Please check whether it needs any changes.
public class Check {
public static void main(String args[]) throws IOException {
File f = null;
String path = "/home/ananddw";
f = new File(path);
if (f.isDirectory()) {
System.out.println("if");
File[] ss = f.listFiles();
for (File file : ss) {
if (file.isFile()) {
String FileFinalName = file.getName();
System.out.println(file.getName());
FileWriter fw1 = new FileWriter(file, true);
BufferedWriter bw1 = new BufferedWriter(fw1);
bw1.write(FileFinalName);
bw1.newLine();
bw1.close();
}
}
} else if (f.isFile()) {
System.out.println("else");
}
}
}
What about the else branch of that check, i.e.:
if (folder.exists()){
System.out.println("got folder");
}
else {
File[] listOfFiles = folder.listFiles();
for (int i = 0; i < listOfFiles.length; i++) {
if (listOfFiles[i].isDirectory()) {
String FileNames = listOfFiles[i].getName();
FileWriter fw1 = new FileWriter(file1, true);
BufferedWriter bw1 = new BufferedWriter(fw1);
bw1.write(FileNames);
bw1.newLine();
bw1.close();
}
}
}
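One thing worth checking in snippets like the ones above: File.listFiles() returns null (not an empty array) when the path is not a directory or an I/O error occurs, so looping over the result without a check can fail with a NullPointerException. A minimal sketch of a guarded listing follows; the class and method names are made up for illustration.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only.
class DirectoryNames {

    // Returns the names of the sub-directories of 'folder', or an empty list
    // when 'folder' is missing, is not a directory, or cannot be read.
    static List<String> subDirectoryNames(File folder) {
        List<String> names = new ArrayList<>();
        File[] entries = folder.listFiles(); // null if not a readable directory
        if (entries == null) {
            return names;
        }
        for (File entry : entries) {
            if (entry.isDirectory()) {
                names.add(entry.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(subDirectoryNames(new File(System.getProperty("user.dir"))));
        System.out.println(subDirectoryNames(new File("does-not-exist-12345"))); // []
    }
}
```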

Java - Read file and split into multiple files

I have a file which I would like to read in Java and split this file into n (user input) output files. Here is how I read the file:
int n = 4;
BufferedReader br = new BufferedReader(new FileReader("file.csv"));
try {
String line = br.readLine();
while (line != null) {
line = br.readLine();
}
} finally {
br.close();
}
How do I split the file - file.csv into n files?
Note - Since the number of entries in the file are of the order of 100k, I can't store the file content into an array and then split it and save into multiple files.
Since one file can be very large, each split file could be large as well.
Example:
Source File Size: 5GB
Num Splits: 5
Destination File Size: 1GB each (5 files)
A split chunk this large cannot be read in one go, even if that much memory were available. Instead, for each split we can repeatedly read a fixed-size byte array, which is feasible in terms of both performance and memory.
NumSplits: 10 MaxReadBytes: 8KB
public static void main(String[] args) throws Exception
{
RandomAccessFile raf = new RandomAccessFile("test.csv", "r");
long numSplits = 10; //from user input, extract it from args
long sourceSize = raf.length();
long bytesPerSplit = sourceSize/numSplits ;
long remainingBytes = sourceSize % numSplits;
int maxReadBufferSize = 8 * 1024; //8KB
for(int destIx=1; destIx <= numSplits; destIx++) {
BufferedOutputStream bw = new BufferedOutputStream(new FileOutputStream("split."+destIx));
if(bytesPerSplit > maxReadBufferSize) {
long numReads = bytesPerSplit/maxReadBufferSize;
long numRemainingRead = bytesPerSplit % maxReadBufferSize;
for(int i=0; i<numReads; i++) {
readWrite(raf, bw, maxReadBufferSize);
}
if(numRemainingRead > 0) {
readWrite(raf, bw, numRemainingRead);
}
}else {
readWrite(raf, bw, bytesPerSplit);
}
bw.close();
}
if(remainingBytes > 0) {
BufferedOutputStream bw = new BufferedOutputStream(new FileOutputStream("split."+(numSplits+1)));
readWrite(raf, bw, remainingBytes);
bw.close();
}
raf.close();
}
static void readWrite(RandomAccessFile raf, BufferedOutputStream bw, long numBytes) throws IOException {
byte[] buf = new byte[(int) numBytes];
int val = raf.read(buf);
if(val != -1) {
bw.write(buf, 0, val); // write only the bytes that were actually read
}
}
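A note on the read call: a single read(byte[]) is allowed to return fewer bytes than the buffer holds, so a more defensive variant loops until the requested count has been copied. The sketch below is one way to do that; the class and method names are hypothetical, and it uses InputStream/OutputStream instead of RandomAccessFile so that it can run without a file on disk (RandomAccessFile.read(byte[], int, int) follows the same contract). It assumes bufferSize is positive.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper, for illustration only.
class BoundedCopy {

    // Copies up to 'count' bytes from 'in' to 'out' using a small fixed buffer.
    // Returns the number of bytes actually copied (less than 'count' only at end of input).
    static long copy(InputStream in, OutputStream out, long count, int bufferSize) throws IOException {
        byte[] buf = new byte[bufferSize];
        long copied = 0;
        while (copied < count) {
            int want = (int) Math.min(buf.length, count - copied);
            int got = in.read(buf, 0, want); // may return fewer bytes than requested
            if (got == -1) {
                break; // end of input
            }
            out.write(buf, 0, got); // write only what was read
            copied += got;
        }
        return copied;
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream("0123456789".getBytes("US-ASCII"));
        ByteArrayOutputStream first = new ByteArrayOutputStream();
        ByteArrayOutputStream second = new ByteArrayOutputStream();
        copy(in, first, 4, 3);
        copy(in, second, 100, 3);
        System.out.println(first.toString("US-ASCII"));  // 0123
        System.out.println(second.toString("US-ASCII")); // 456789
    }
}
```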
import java.io.*;
import java.util.Scanner;
public class split {
public static void main(String args[])
{
try{
// Reading file and getting no. of files to be generated
String inputfile = "C:/test.txt"; // Source File Name.
double nol = 2000.0; // No. of lines to be split and saved in each output file.
File file = new File(inputfile);
Scanner scanner = new Scanner(file);
int count = 0;
while (scanner.hasNextLine())
{
scanner.nextLine();
count++;
}
System.out.println("Lines in the file: " + count); // Displays no. of lines in the input file.
double temp = (count/nol);
int temp1=(int)temp;
int nof=0;
if(temp1==temp)
{
nof=temp1;
}
else
{
nof=temp1+1;
}
System.out.println("No. of files to be generated :"+nof); // Displays no. of files to be generated.
//---------------------------------------------------------------------------------------------------------
// Actual splitting of file into smaller files
FileInputStream fstream = new FileInputStream(inputfile); DataInputStream in = new DataInputStream(fstream);
BufferedReader br = new BufferedReader(new InputStreamReader(in)); String strLine;
for (int j=1;j<=nof;j++)
{
FileWriter fstream1 = new FileWriter("C:/New Folder/File"+j+".txt"); // Destination File Location
BufferedWriter out = new BufferedWriter(fstream1);
for (int i=1;i<=nol;i++)
{
strLine = br.readLine();
if (strLine!= null)
{
out.write(strLine);
if(i!=nol)
{
out.newLine();
}
}
}
out.close();
}
in.close();
}catch (Exception e)
{
System.err.println("Error: " + e.getMessage());
}
}
}
Though it's an old question, for reference I am listing the code I used to split large files into chunks of any size; it works with any Java version above 1.4.
The split and join methods look like this:
public void join(String FilePath) {
long leninfile = 0, leng = 0;
int count = 1, data = 0;
try {
File filename = new File(FilePath);
//RandomAccessFile outfile = new RandomAccessFile(filename,"rw");
OutputStream outfile = new BufferedOutputStream(new FileOutputStream(filename));
while (true) {
filename = new File(FilePath + count + ".sp");
if (filename.exists()) {
//RandomAccessFile infile = new RandomAccessFile(filename,"r");
InputStream infile = new BufferedInputStream(new FileInputStream(filename));
data = infile.read();
while (data != -1) {
outfile.write(data);
data = infile.read();
}
leng++;
infile.close();
count++;
} else {
break;
}
}
outfile.close();
} catch (Exception e) {
e.printStackTrace();
}
}
public void split(String FilePath, long splitlen) {
long leninfile = 0, leng = 0;
int count = 1, data;
try {
File filename = new File(FilePath);
//RandomAccessFile infile = new RandomAccessFile(filename, "r");
InputStream infile = new BufferedInputStream(new FileInputStream(filename));
data = infile.read();
while (data != -1) {
filename = new File(FilePath + count + ".sp");
//RandomAccessFile outfile = new RandomAccessFile(filename, "rw");
OutputStream outfile = new BufferedOutputStream(new FileOutputStream(filename));
while (data != -1 && leng < splitlen) {
outfile.write(data);
leng++;
data = infile.read();
}
leninfile += leng;
leng = 0;
outfile.close();
count++;
}
} catch (Exception e) {
e.printStackTrace();
}
}
Complete java code available here in File Split in Java Program link.
A clean solution that you can adapt.
Note that this solution loads the entire file into memory:
all lines of the file are stored in List<String> rowsOfFile.
Edit maxSizeFile to choose the maximum size of each split file.
public void splitFile(File fileToSplit) throws IOException {
long maxSizeFile = 10000000; // 10 MB
StringBuilder buffer = new StringBuilder((int) maxSizeFile);
int sizeOfRows = 0;
int recurrence = 0;
String fileName;
List<String> rowsOfFile;
rowsOfFile = Files.readAllLines(fileToSplit.toPath(), Charset.defaultCharset());
for (String row : rowsOfFile) {
// keep each row on its own line in the output
buffer.append(row).append(System.lineSeparator());
sizeOfRows += row.getBytes(StandardCharsets.UTF_8).length;
if (sizeOfRows >= maxSizeFile) {
fileName = generateFileName(recurrence);
File newFile = new File(fileName);
try (PrintWriter writer = new PrintWriter(newFile)) {
writer.print(buffer.toString());
}
recurrence++;
sizeOfRows = 0;
buffer = new StringBuilder();
}
}
// last rows
if (sizeOfRows > 0) {
fileName = generateFileName(recurrence);
File newFile = new File(fileName);
try (PrintWriter writer = new PrintWriter(newFile)) {
writer.print(buffer.toString());
}
}
Files.delete(fileToSplit.toPath());
}
method to generate Name of file:
public String generateFileName(int numFile) {
String extension = ".txt";
return "myFile" + numFile + extension;
}
Have a counter to count the number of entries. Let's say there is one entry per line.
Step 1: create a new subfile and set counter = 0.
Step 2: increment the counter as you read each entry from the source file into the buffer.
Step 3: when the counter reaches the number of entries you want in each subfile, flush the contents of the buffer to the subfile and close it.
Step 4: go back to step 1 as long as there is data left to read in the source file.
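The steps above can be sketched roughly as follows. This is only one possible reading of them: the class name, the ".partN" file naming and the UTF-8 charset are assumptions, and instead of a separate buffer it lets the BufferedWriter do the buffering, so only one line is held in memory at a time.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; the ".partN" naming scheme is made up.
class LineSplitter {

    // Splits 'source' into files of at most 'linesPerFile' lines each.
    // Returns the created files in order.
    static List<Path> split(Path source, int linesPerFile) throws IOException {
        if (linesPerFile <= 0) {
            throw new IllegalArgumentException("linesPerFile must be positive");
        }
        List<Path> parts = new ArrayList<>();
        try (BufferedReader reader = Files.newBufferedReader(source, StandardCharsets.UTF_8)) {
            BufferedWriter writer = null;
            int counter = 0;
            try {
                String line;
                while ((line = reader.readLine()) != null) {
                    if (writer == null) { // step 1: start a new subfile
                        Path part = source.resolveSibling(source.getFileName() + ".part" + (parts.size() + 1));
                        parts.add(part);
                        writer = Files.newBufferedWriter(part, StandardCharsets.UTF_8);
                        counter = 0;
                    }
                    writer.write(line); // step 2: copy the entry and count it
                    writer.newLine();
                    counter++;
                    if (counter == linesPerFile) { // step 3: limit reached, close the subfile
                        writer.close();
                        writer = null;
                    }
                } // step 4: loop while the source still has data
            } finally {
                if (writer != null) {
                    writer.close();
                }
            }
        }
        return parts;
    }

    public static void main(String[] args) throws IOException {
        Path src = Files.createTempFile("demo", ".csv");
        Files.write(src, Arrays.asList("a", "b", "c", "d", "e"), StandardCharsets.UTF_8);
        System.out.println(split(src, 2)); // three files holding 2, 2 and 1 lines
    }
}
```

Because a subfile is only opened when there is a line to put in it, no empty trailing file is created when the line count is an exact multiple of the limit.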
There's no need to loop through the file twice. You could estimate the size of each chunk as the source file size divided by the number of chunks needed. Then you just stop filling each chunk with data once its size exceeds the estimate.
Here is one that worked for me; I used it to split a 10 GB file. It also lets you add a header and a footer, which is very useful when splitting document-based formats such as XML and JSON, because you need to add a document wrapper to each of the new split files.
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
public class FileSpliter
{
public static void main(String[] args) throws IOException
{
splitTextFiles("D:\\xref.csx", 750000, "", "", null);
}
public static void splitTextFiles(String fileName, int maxRows, String header, String footer, String targetDir) throws IOException
{
File bigFile = new File(fileName);
int i = 1;
String ext = fileName.substring(fileName.lastIndexOf("."));
String fileNoExt = bigFile.getName().replace(ext, "");
File newDir = null;
if(targetDir != null)
{
newDir = new File(targetDir);
}
else
{
newDir = new File(bigFile.getParent() + "\\" + fileNoExt + "_split");
}
newDir.mkdirs();
try (BufferedReader reader = Files.newBufferedReader(Paths.get(fileName)))
{
String line = null;
int lineNum = 1;
Path splitFile = Paths.get(newDir.getPath() + "\\" + fileNoExt + "_" + String.format("%02d", i) + ext);
BufferedWriter writer = Files.newBufferedWriter(splitFile, StandardOpenOption.CREATE);
while ((line = reader.readLine()) != null)
{
if(lineNum == 1)
{
System.out.print("new file created '" + splitFile.toString());
if(header != null && header.length() > 0)
{
writer.append(header);
writer.newLine();
}
}
writer.append(line);
if (lineNum >= maxRows)
{
if(footer != null && footer.length() > 0)
{
writer.newLine();
writer.append(footer);
}
writer.close();
System.out.println(", " + lineNum + " lines written to file");
lineNum = 1;
i++;
splitFile = Paths.get(newDir.getPath() + "\\" + fileNoExt + "_" + String.format("%02d", i) + ext);
writer = Files.newBufferedWriter(splitFile, StandardOpenOption.CREATE);
}
else
{
writer.newLine();
lineNum++;
}
}
if(lineNum <= maxRows) // early exit
{
if(footer != null && footer.length() > 0)
{
writer.newLine();
lineNum++;
writer.append(footer);
}
}
writer.close();
System.out.println(", " + lineNum + " lines written to file");
}
System.out.println("file '" + bigFile.getName() + "' split into " + i + " files");
}
}
The code below splits a big file into smaller files with fewer lines.
long linesWritten = 0;
int count = 1;
try {
File inputFile = new File(inputFilePath);
InputStream inputFileStream = new BufferedInputStream(new FileInputStream(inputFile));
BufferedReader reader = new BufferedReader(new InputStreamReader(inputFileStream));
String line = reader.readLine();
String fileName = inputFile.getName();
String outfileName = outputFolderPath + "\\" + fileName;
while (line != null) {
File outFile = new File(outfileName + "_" + count + ".split");
Writer writer = new OutputStreamWriter(new FileOutputStream(outFile));
while (line != null && linesWritten < linesPerSplit) {
writer.write(line);
writer.write(System.lineSeparator()); // readLine() strips the line terminator, so add it back
line = reader.readLine();
linesWritten++;
}
writer.close();
linesWritten = 0;//next file
count++;//next file count
}
reader.close();
} catch (Exception e) {
e.printStackTrace();
}
Split a file into multiple chunks (an in-memory operation). Here I'm splitting any file into chunks of 500 KB (500000 bytes):
public static List<ByteArrayOutputStream> splitFile(File f) {
List<ByteArrayOutputStream> datalist = new ArrayList<>();
try {
int sizeOfFiles = 500000;
byte[] buffer = new byte[sizeOfFiles];
try (FileInputStream fis = new FileInputStream(f); BufferedInputStream bis = new BufferedInputStream(fis)) {
int bytesAmount = 0;
while ((bytesAmount = bis.read(buffer)) > 0) {
try (OutputStream out = new ByteArrayOutputStream()) {
out.write(buffer, 0, bytesAmount);
out.flush();
datalist.add((ByteArrayOutputStream) out);
}
}
}
} catch (Exception e) {
//get the error
}
return datalist; }
I am a bit late to answer, but here's how I did it:
Approach:
First I determine how many bytes each of the individual files should contain, then I split the large file by bytes. Only one file chunk's worth of data is loaded into memory at a time.
Example: if a 5 GB file is split into 10 files, then only 500 MB worth of bytes are loaded into memory at a time; they are held in the buffer variable in the splitBySize method below.
Code explanation:
The method splitFile first gets the number of bytes each file chunk should contain by calling the getSizeInBytes method; then it calls the splitBySize method, which splits the large file by size (i.e., maxChunkSize is the number of bytes each file chunk will contain).
public static List<File> splitFile(File largeFile, int noOfFiles) throws IOException {
return splitBySize(largeFile, getSizeInBytes(largeFile.length(), noOfFiles));
}
public static List<File> splitBySize(File largeFile, int maxChunkSize) throws IOException {
List<File> list = new ArrayList<>();
int numberOfFiles = 0;
try (InputStream in = Files.newInputStream(largeFile.toPath())) {
final byte[] buffer = new byte[maxChunkSize];
int dataRead = in.read(buffer);
while (dataRead > -1) {
list.add(stageLocally(buffer, dataRead));
numberOfFiles++;
dataRead = in.read(buffer);
}
}
System.out.println("Number of files generated: " + numberOfFiles);
return list;
}
private static int getSizeInBytes(long totalBytes, int numberOfFiles) {
if (totalBytes % numberOfFiles != 0) {
totalBytes = ((totalBytes / numberOfFiles) + 1)*numberOfFiles;
}
long x = totalBytes / numberOfFiles;
if (x > Integer.MAX_VALUE){
throw new NumberFormatException("Byte chunk too large");
}
return (int) x;
}
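The arithmetic in getSizeInBytes is a ceiling division: the chunk size is the total size divided by the number of files, rounded up. A minimal equivalent sketch follows; the class and method names are hypothetical, and the argument checks are my own addition.

```java
// Hypothetical helper, for illustration only.
class ChunkSize {

    // Smallest chunk size such that 'numberOfFiles' chunks cover 'totalBytes'.
    static int chunkSize(long totalBytes, int numberOfFiles) {
        if (totalBytes < 0 || numberOfFiles <= 0) {
            throw new IllegalArgumentException("totalBytes must be >= 0 and numberOfFiles > 0");
        }
        // Ceiling division without floating point.
        long size = totalBytes / numberOfFiles + (totalBytes % numberOfFiles == 0 ? 0 : 1);
        if (size > Integer.MAX_VALUE) {
            throw new ArithmeticException("Chunk too large for a single byte[]");
        }
        return (int) size;
    }

    public static void main(String[] args) {
        System.out.println(chunkSize(10, 5)); // 2
        System.out.println(chunkSize(11, 5)); // 3
    }
}
```

Note that rounding up can yield fewer files than requested: 11 bytes with a chunk size of 3 fill only 4 files (3 + 3 + 3 + 2), not 5.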
Full Code:
public class StackOverflow {
private static final String INPUT_FILE_PATH = "/Users/malkesingh/Downloads/5MB.zip";
private static final String TEMP_DIRECTORY = "/Users/malkesingh/temp";
public static void main(String[] args) throws IOException {
File input = new File(INPUT_FILE_PATH);
File outPut = fileJoin2(splitFile(input, 5));
try (InputStream in = Files.newInputStream(input.toPath()); InputStream out = Files.newInputStream(outPut.toPath())) {
System.out.println(IOUtils.contentEquals(in, out));
}
}
public static List<File> splitFile(File largeFile, int noOfFiles) throws IOException {
return splitBySize(largeFile, getSizeInBytes(largeFile.length(), noOfFiles));
}
public static List<File> splitBySize(File largeFile, int maxChunkSize) throws IOException {
List<File> list = new ArrayList<>();
int numberOfFiles = 0;
try (InputStream in = Files.newInputStream(largeFile.toPath())) {
final byte[] buffer = new byte[maxChunkSize];
int dataRead = in.read(buffer);
while (dataRead > -1) {
list.add(stageLocally(buffer, dataRead));
numberOfFiles++;
dataRead = in.read(buffer);
}
}
System.out.println("Number of files generated: " + numberOfFiles);
return list;
}
private static int getSizeInBytes(long totalBytes, int numberOfFiles) {
if (totalBytes % numberOfFiles != 0) {
totalBytes = ((totalBytes / numberOfFiles) + 1)*numberOfFiles;
}
long x = totalBytes / numberOfFiles;
if (x > Integer.MAX_VALUE){
throw new NumberFormatException("Byte chunk too large");
}
return (int) x;
}
private static File stageLocally(byte[] buffer, int length) throws IOException {
File outPutFile = File.createTempFile("temp-", "split", new File(TEMP_DIRECTORY));
try(FileOutputStream fos = new FileOutputStream(outPutFile)) {
fos.write(buffer, 0, length);
}
return outPutFile;
}
public static File fileJoin2(List<File> list) throws IOException {
File outPutFile = File.createTempFile("temp-", "unsplit", new File(TEMP_DIRECTORY));
FileOutputStream fos = new FileOutputStream(outPutFile);
for (File file : list) {
Files.copy(file.toPath(), fos);
}
fos.close();
return outPutFile;
}}
import java.util.*;
import java.io.*;
public class task13 {
public static void main(String[] args)throws IOException{
Scanner s =new Scanner(System.in);
System.out.print("Enter path:");
String a=s.next();
File f=new File(a+".txt");
Scanner st=new Scanner(f);
System.out.println(f.canRead()+"\n"+f.canWrite());
long l=f.length();
System.out.println("Length is:"+l);
System.out.print("Enter no.of partitions:");
int p=s.nextInt();
long x=l/p;
st.useDelimiter("\\Z");
String t=st.next();
int j=0;
System.out.println("Each File Length is:"+x);
for(int i=1;i<=p;i++){
File ft=new File(a+"-"+i+".txt");
ft.createNewFile();
int g=(j*(int)x);
int h=(j+1)*(int)x;
if(g<=l&&h<=l){
FileWriter fw=new FileWriter(a+"-"+i+".txt");
String v=t.substring(g,h);
fw.write(v);
j++;
fw.close();
}}
}}
