Java - Apache Beam: Read file from GCS with "UCS2-LE BOM" encoding

I want to read a file encoded as UCS2-LE with BOM using TextIO, but it doesn't seem to work.
Is there a way to use TextIO with this encoding? Or is there another library that handles this type of encoding well?
My code is in Java (Apache Beam):
PCollection<KV<String, String>> csvElements =
pipeline.apply("Reads the input csv file", TextIO
.read()
.from(options.getPolledFile()))
.apply("Read File", ParDo.of(new DoFn<String, KV<String,String>>(){
@ProcessElement
public void processElement(ProcessContext c) throws UnsupportedEncodingException {
String element = c.element();
String elStr = new String(element.getBytes(),"UTF-16LE");
c.output(elStr);}}));

I found a solution in a Medium post.
The file I am reading is stored in GCS, hence the added lines in the try block (compared to the original code).
file = "path to GCS file";
PCollection<String> readCollection = pipeline.apply(FileIO.match().filepattern(file))
.apply(FileIO.readMatches())
.apply(FlatMapElements
.into(strings())
.via((FileIO.ReadableFile f) -> {
List<String> result = new ArrayList<>();
try {
ReadableByteChannel byteChannelParse = f.open();
InputStream inputStream = Channels.newInputStream(byteChannelParse);
BufferedReader br = new BufferedReader(new InputStreamReader(inputStream, "UTF-16"));
String line = br.readLine();
while (line != null) {
result.add(line);
line = br.readLine();
}
br.close();
inputStream.close();
}
catch (IOException e) {
throw new RuntimeException("Error while reading", e);
}
return result;
}));
P.S: I didn't add a line with credentials because I passed it into IntelliJ parameters.
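A side note on why "UTF-16" works in the reader above while "UTF-16LE" on the decoded string does not: as far as I know, Java's UTF-16 decoder uses a byte-order mark at the start of the input to pick the byte order and then drops it, whereas UTF-16LE keeps the BOM as a leading U+FEFF character. Here is a minimal sketch without Beam. The class name, the helper names and the in-memory sample bytes are made up for illustration; they stand in for the real GCS file.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class Utf16BomDemo {

    // Hypothetical sample: the bytes of a small UCS2-LE file that starts with a BOM (FF FE).
    public static byte[] sampleWithBom() {
        byte[] body = "a,b\nc,d\n".getBytes(StandardCharsets.UTF_16LE);
        byte[] data = new byte[body.length + 2];
        data[0] = (byte) 0xFF;
        data[1] = (byte) 0xFE;
        System.arraycopy(body, 0, data, 2, body.length);
        return data;
    }

    // Decodes the bytes line by line with the given charset.
    public static List<String> readLines(byte[] data, Charset charset) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader br = new BufferedReader(
                new InputStreamReader(new ByteArrayInputStream(data), charset))) {
            String line;
            while ((line = br.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        byte[] data = sampleWithBom();
        // UTF-16: the BOM selects little-endian and is not part of the decoded text.
        System.out.println(readLines(data, StandardCharsets.UTF_16));
        // UTF-16LE: the BOM stays in the text as a leading U+FEFF character.
        String first = readLines(data, StandardCharsets.UTF_16LE).get(0);
        System.out.println(first.charAt(0) == '\uFEFF');
    }
}
```

If the BOM-aware charset cannot be used for some reason, stripping a leading U+FEFF from the first line by hand should give the same result.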

Related

How to delete a line of string in a text file - Java [duplicate]

I'm looking for a small code snippet that will find a line in file and remove that line (not content but line) but could not find. So for example I have in a file following:
myFile.txt:
aaa
bbb
ccc
ddd
Need to have a function like this: public void removeLine(String lineContent), and if I pass
removeLine("bbb"), I get file like this:
myFile.txt:
aaa
ccc
ddd
This solution may not be optimal or pretty, but it works. It reads in an input file line by line, writing each line out to a temporary output file. Whenever it encounters a line that matches what you are looking for, it skips writing that one out. It then renames the output file. I have omitted error handling, closing of readers/writers, etc. from the example. I also assume there is no leading or trailing whitespace in the line you are looking for. Change the code around trim() as needed so you can find a match.
File inputFile = new File("myFile.txt");
File tempFile = new File("myTempFile.txt");
BufferedReader reader = new BufferedReader(new FileReader(inputFile));
BufferedWriter writer = new BufferedWriter(new FileWriter(tempFile));
String lineToRemove = "bbb";
String currentLine;
while((currentLine = reader.readLine()) != null) {
// trim newline when comparing with lineToRemove
String trimmedLine = currentLine.trim();
if(trimmedLine.equals(lineToRemove)) continue;
writer.write(currentLine + System.getProperty("line.separator"));
}
writer.close();
reader.close();
boolean successful = tempFile.renameTo(inputFile);
public void removeLineFromFile(String file, String lineToRemove) {
try {
File inFile = new File(file);
if (!inFile.isFile()) {
System.out.println("Parameter is not an existing file");
return;
}
//Construct the new file that will later be renamed to the original filename.
File tempFile = new File(inFile.getAbsolutePath() + ".tmp");
BufferedReader br = new BufferedReader(new FileReader(file));
PrintWriter pw = new PrintWriter(new FileWriter(tempFile));
String line = null;
//Read from the original file and write to the new
//unless content matches data to be removed.
while ((line = br.readLine()) != null) {
if (!line.trim().equals(lineToRemove)) {
pw.println(line);
pw.flush();
}
}
pw.close();
br.close();
//Delete the original file
if (!inFile.delete()) {
System.out.println("Could not delete file");
return;
}
//Rename the new file to the filename the original file had.
if (!tempFile.renameTo(inFile))
System.out.println("Could not rename file");
}
catch (FileNotFoundException ex) {
ex.printStackTrace();
}
catch (IOException ex) {
ex.printStackTrace();
}
}
I found this on the internet.
You want to do something like the following:
Open the old file for reading
Open a new (temporary) file for writing
Iterate over the lines in the old file (probably using a BufferedReader)
For each line, check if it matches what you are supposed to remove
If it matches, do nothing
If it doesn't match, write it to the temporary file
When done, close both files
Delete the old file
Rename the temporary file to the name of the original file
(I won't write the actual code, since this looks like homework, but feel free to post other questions on specific bits that you have trouble with)
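For readers who are past the homework stage, the steps above can be sketched with java.nio.file roughly as follows. This is only one way to do it: the class and method names are made up, UTF-8 is an assumed encoding, and the delete-then-rename pair is swapped for a single Files.move with REPLACE_EXISTING.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Arrays;

class LineRemover {

    // Copies every line except exact matches of lineToRemove into a temporary
    // file next to the original, then moves the temporary file over the original.
    public static void removeLine(Path file, String lineToRemove) throws IOException {
        Path parent = file.toAbsolutePath().getParent();
        Path temp = Files.createTempFile(parent, "remove-line", ".tmp");
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8);
             BufferedWriter writer = Files.newBufferedWriter(temp, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (!line.equals(lineToRemove)) {
                    writer.write(line);
                    writer.newLine();
                }
            }
        }
        // Both files are closed at this point; replace the original in one step.
        Files.move(temp, file, StandardCopyOption.REPLACE_EXISTING);
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("myFile", ".txt");
        Files.write(file, Arrays.asList("aaa", "bbb", "ccc", "ddd"), StandardCharsets.UTF_8);
        removeLine(file, "bbb");
        System.out.println(Files.readAllLines(file, StandardCharsets.UTF_8));
    }
}
```

Creating the temporary file in the same directory as the original keeps the move on one file system, which is usually what you want for a rename.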
So, whenever I hear someone mention that they want to filter out text, I immediately think to go to Streams (mainly because there is a method called filter which filters exactly as you need it to). Another answer mentions using Streams with the Apache commons-io library, but I thought it would be worthwhile to show how this can be done in standard Java 8. Here is the simplest form:
public void removeLine(String lineContent) throws IOException
{
File file = new File("myFile.txt");
List<String> out = Files.lines(file.toPath())
.filter(line -> !line.contains(lineContent))
.collect(Collectors.toList());
Files.write(file.toPath(), out, StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING);
}
I think there isn't too much to explain there, basically Files.lines gets a Stream<String> of the lines of the file, filter takes out the lines we don't want, then collect puts all of the lines of the new file into a List. We then write the list over top of the existing file with Files.write, using the additional option TRUNCATE so the old contents of the file are replaced.
Of course, this approach has the downside of loading every line into memory as they all get stored into a List before being written back out. If we wanted to simply modify without storing, we would need to use some form of OutputStream to write each new line to a file as it passes through the stream, like this:
public void removeLine(String lineContent) throws IOException
{
File file = new File("myFile.txt");
File temp = new File("_temp_");
PrintWriter out = new PrintWriter(new FileWriter(temp));
Files.lines(file.toPath())
.filter(line -> !line.contains(lineContent))
.forEach(out::println);
out.flush();
out.close();
temp.renameTo(file);
}
Not much has been changed in this example. Basically, instead of using collect to gather the file contents into memory, we use forEach so that each line that makes it through the filter gets sent to the PrintWriter to be written out to the file immediately and not stored. We have to save it to a temporary file, because we can't overwrite the existing file at the same time as we are still reading from it, so then at the end, we rename the temp file to replace the existing file.
Using apache commons-io and Java 8 you can use
List<String> lines = FileUtils.readLines(file);
List<String> updatedLines = lines.stream().filter(s -> !s.contains(searchString)).collect(Collectors.toList());
FileUtils.writeLines(file, updatedLines, false);
public static void deleteLine() throws IOException {
RandomAccessFile file = new RandomAccessFile("me.txt", "rw");
String delete;
String task="";
byte []tasking;
while ((delete = file.readLine()) != null) {
if (delete.startsWith("BAD")) {
continue;
}
task+=delete+"\n";
}
System.out.println(task);
BufferedWriter writer = new BufferedWriter(new FileWriter("me.txt"));
writer.write(task);
file.close();
writer.close();
}
Here you go. This solution uses a DataInputStream to scan for the position of the string you want replaced and uses a FileChannel to replace the text at that exact position. It only replaces the first occurrence of the string that it finds. This solution doesn't store a copy of the entire file somewhere, (either the RAM or a temp file), it just edits the portion of the file that it finds.
public static long scanForString(String text, File file) throws IOException {
if (text.isEmpty())
return file.exists() ? 0 : -1;
// First of all, get a byte array off of this string:
byte[] bytes = text.getBytes(/* StandardCharsets.your_charset */);
// Next, search the file for the byte array.
try (DataInputStream dis = new DataInputStream(new FileInputStream(file))) {
List<Integer> matches = new LinkedList<>();
for (long pos = 0; pos < file.length(); pos++) {
byte bite = dis.readByte();
for (int i = 0; i < matches.size(); i++) {
Integer m = matches.get(i);
if (bytes[m] != bite)
matches.remove(i--);
else if (++m == bytes.length)
return pos - m + 1;
else
matches.set(i, m);
}
if (bytes[0] == bite)
matches.add(1);
}
}
return -1;
}
public static void replaceText(String text, String replacement, File file) throws IOException {
// Open a FileChannel with writing ability. You don't really need the read
// ability for this specific case, but there it is in case you need it for
// something else.
try (FileChannel channel = FileChannel.open(file.toPath(), StandardOpenOption.WRITE, StandardOpenOption.READ)) {
long scanForString = scanForString(text, file);
if (scanForString == -1) {
System.out.println("String not found.");
return;
}
channel.position(scanForString);
channel.write(ByteBuffer.wrap(replacement.getBytes(/* StandardCharsets.your_charset */)));
}
}
Example
Input: ABCDEFGHIJKLMNOPQRSTUVWXYZ
Method Call:
replaceText("QRS", "000", new File("path/to/file"));
Resulting File: ABCDEFGHIJKLMNOP000TUVWXYZ
Here is the complete Class. In the below file "somelocation" refers to the actual path of the file.
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
public class FileProcess
{
public static void main(String[] args) throws IOException
{
File inputFile = new File("C://somelocation//Demographics.txt");
File tempFile = new File("C://somelocation//Demographics_report.txt");
BufferedReader reader = new BufferedReader(new FileReader(inputFile));
BufferedWriter writer = new BufferedWriter(new FileWriter(tempFile));
String currentLine;
while((currentLine = reader.readLine()) != null) {
if(null!=currentLine && !currentLine.equalsIgnoreCase("BBB")){
writer.write(currentLine + System.getProperty("line.separator"));
}
}
writer.close();
reader.close();
boolean successful = tempFile.renameTo(inputFile);
System.out.println(successful);
}
}
This solution reads in an input file line by line, appending each line to a StringBuilder variable. Whenever it encounters a line that matches what you are looking for, it skips that one. Then it clears the file and writes the StringBuilder content back into it.
public void removeLineFromFile(String lineToRemove, File f) throws FileNotFoundException, IOException{
//Reading File Content and storing it to a StringBuilder variable ( skips lineToRemove)
StringBuilder sb = new StringBuilder();
try (Scanner sc = new Scanner(f)) {
String currentLine;
while(sc.hasNext()){
currentLine = sc.nextLine();
if(currentLine.equals(lineToRemove)){
continue; //skips lineToRemove
}
sb.append(currentLine).append("\n");
}
}
//Delete File Content
PrintWriter pw = new PrintWriter(f);
pw.close();
BufferedWriter writer = new BufferedWriter(new FileWriter(f, true));
writer.append(sb.toString());
writer.close();
}
Super simple method using maven/gradle+groovy.
public void deleteConfig(String text) {
File config = new File("/the/path/config.txt")
def lines = config.readLines()
lines.remove(text);
config.write("")
lines.each {line -> {
config.append(line+"\n")
}}
}
public static void deleteLine(String line, String filePath) {
File file = new File(filePath);
File file2 = new File(file.getParent() + "\\temp" + file.getName());
PrintWriter pw = null;
Scanner read = null;
FileInputStream fis = null;
FileOutputStream fos = null;
FileChannel src = null;
FileChannel dest = null;
try {
pw = new PrintWriter(file2);
read = new Scanner(file);
while (read.hasNextLine()) {
String currline = read.nextLine();
if (line.equalsIgnoreCase(currline)) {
continue;
} else {
pw.println(currline);
}
}
pw.flush();
fis = new FileInputStream(file2);
src = fis.getChannel();
fos = new FileOutputStream(file);
dest = fos.getChannel();
dest.transferFrom(src, 0, src.size());
} catch (IOException e) {
e.printStackTrace();
} finally {
pw.close();
read.close();
try {
fis.close();
fos.close();
src.close();
dest.close();
} catch (IOException e) {
e.printStackTrace();
}
if (file2.delete()) {
System.out.println("File is deleted");
} else {
System.out.println("Error occured! File: " + file2.getName() + " is not deleted!");
}
}
}
package com.ncs.cache;
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.File;
import java.io.FileWriter;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.PrintWriter;
public class FileUtil {
public void removeLineFromFile(String file, String lineToRemove) {
try {
File inFile = new File(file);
if (!inFile.isFile()) {
System.out.println("Parameter is not an existing file");
return;
}
// Construct the new file that will later be renamed to the original
// filename.
File tempFile = new File(inFile.getAbsolutePath() + ".tmp");
BufferedReader br = new BufferedReader(new FileReader(file));
PrintWriter pw = new PrintWriter(new FileWriter(tempFile));
String line = null;
// Read from the original file and write to the new
// unless content matches data to be removed.
while ((line = br.readLine()) != null) {
if (!line.trim().equals(lineToRemove)) {
pw.println(line);
pw.flush();
}
}
pw.close();
br.close();
// Delete the original file
if (!inFile.delete()) {
System.out.println("Could not delete file");
return;
}
// Rename the new file to the filename the original file had.
if (!tempFile.renameTo(inFile))
System.out.println("Could not rename file");
} catch (FileNotFoundException ex) {
ex.printStackTrace();
} catch (IOException ex) {
ex.printStackTrace();
}
}
public static void main(String[] args) {
FileUtil util = new FileUtil();
util.removeLineFromFile("test.txt", "bbbbb");
}
}
src : http://www.javadb.com/remove-a-line-from-a-text-file/
This solution requires the Apache Commons IO library to be added to the build path. It works by reading the entire file and writing each line back but only if the search term is not contained.
public static void removeLineFromFile(File targetFile, String searchTerm)
throws IOException
{
StringBuffer fileContents = new StringBuffer(
FileUtils.readFileToString(targetFile));
String[] fileContentLines = fileContents.toString().split(
System.lineSeparator());
emptyFile(targetFile);
fileContents = new StringBuffer();
for (int fileContentLinesIndex = 0; fileContentLinesIndex < fileContentLines.length; fileContentLinesIndex++)
{
if (fileContentLines[fileContentLinesIndex].contains(searchTerm))
{
continue;
}
fileContents.append(fileContentLines[fileContentLinesIndex] + System.lineSeparator());
}
FileUtils.writeStringToFile(targetFile, fileContents.toString().trim());
}
private static void emptyFile(File targetFile) throws FileNotFoundException,
IOException
{
RandomAccessFile randomAccessFile = new RandomAccessFile(targetFile, "rw");
randomAccessFile.setLength(0);
randomAccessFile.close();
}
I refactored Narek's solution into what I believe is slightly more efficient and easier-to-understand code. I used nested try-with-resources statements (automatic resource management, a fairly recent Java feature) and the Scanner class, which I find easier to understand and use.
Here is the code with edited Comments:
public class RemoveLineInFile {
private static File file;
public static void main(String[] args) {
//create a new File
file = new File("hello.txt");
//takes in String that you want to get rid off
removeLineFromFile("Hello");
}
public static void removeLineFromFile(String lineToRemove) {
//if file does not exist, a file is created
if (!file.exists()) {
try {
file.createNewFile();
} catch (IOException e) {
System.out.println("File "+file.getName()+" not created successfully");
}
}
// Construct the new temporary file that will later be renamed to the original
// filename.
File tempFile = new File(file.getAbsolutePath() + ".tmp");
//Two nested try-with-resources blocks are used
// to handle the IO resources effectively
try(Scanner scanner = new Scanner(file)) {
try (PrintWriter pw = new PrintWriter(new FileWriter(tempFile))) {
//a declaration of a String line which will be assigned later
String line;
// Read from the original file and write to the new
// unless content matches data to be removed.
while (scanner.hasNextLine()) {
line = scanner.nextLine();
if (!line.trim().equals(lineToRemove)) {
pw.println(line);
}
}
}
}
catch (IOException e)
{
System.out.println("IO Exception Occurred");
return;
}
// Both streams are closed here, so the files can be deleted and renamed
// (deleting a file that is still open fails on some platforms)
if (!file.delete()) {
System.out.println("Could not delete file");
return;
}
// Rename the new file to the filename the original file had.
if (!tempFile.renameTo(file))
System.out.println("Could not rename file");
}
}
Try this:
public static void main(String[] args) throws IOException {
File file = new File("file.csv");
CSVReader csvFileReader = new CSVReader(new FileReader(file));
List<String[]> list = csvFileReader.readAll();
// removeIf avoids skipping the row that follows each removed row,
// which an index loop with list.remove(i) would do
list.removeIf(row -> row[0].equalsIgnoreCase("bbb"));
csvFileReader.close();
CSVWriter csvOutput = new CSVWriter(new FileWriter(file));
csvOutput.writeAll(list);
csvOutput.flush();
csvOutput.close();
}
Old question, but an easy way is to:
Iterate through file, adding each line to an new array list
iterate through the array, find matching String, then call the remove method.
iterate through array again, printing each line to the file, boolean for append should be false, which basically replaces the file
This solution uses a RandomAccessFile to only cache the portion of the file subsequent to the string to remove. It scans until it finds the String you want to remove. Then it copies all of the data after the found string, then writes it over the found string, and everything after. Last, it truncates the file size to remove the excess data.
public static long scanForString(String text, File file) throws IOException {
if (text.isEmpty())
return file.exists() ? 0 : -1;
// First of all, get a byte array off of this string:
byte[] bytes = text.getBytes(/* StandardCharsets.your_charset */);
// Next, search the file for the byte array.
try (DataInputStream dis = new DataInputStream(new FileInputStream(file))) {
List<Integer> matches = new LinkedList<>();
for (long pos = 0; pos < file.length(); pos++) {
byte bite = dis.readByte();
for (int i = 0; i < matches.size(); i++) {
Integer m = matches.get(i);
if (bytes[m] != bite)
matches.remove(i--);
else if (++m == bytes.length)
return pos - m + 1;
else
matches.set(i, m);
}
if (bytes[0] == bite)
matches.add(1);
}
}
return -1;
}
public static void remove(String text, File file) throws IOException {
try (RandomAccessFile rafile = new RandomAccessFile(file, "rw");) {
long scanForString = scanForString(text, file);
if (scanForString == -1) {
System.out.println("String not found.");
return;
}
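// Work with the byte length (not the char count) so multi-byte text is handled;
// the charset should match the one used in scanForString
int textByteLength = text.getBytes().length;
long remainderStartPos = scanForString + textByteLength;
rafile.seek(remainderStartPos);
int remainderSize = (int) (rafile.length() - rafile.getFilePointer());
byte[] bytes = new byte[remainderSize];
rafile.readFully(bytes);
rafile.seek(scanForString);
rafile.write(bytes);
rafile.setLength(rafile.length() - textByteLength);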
}
}
Usage:
File Contents: ABCDEFGHIJKLMNOPQRSTUVWXYZ
Method Call: remove("ABC", new File("Drive:/Path/File.extension"));
Resulting Contents: DEFGHIJKLMNOPQRSTUVWXYZ
This solution could easily be modified to remove with a certain, specifiable cacheSize, if memory is a concern. This would just involve iterating over the rest of the file to continually replace portions of size, cacheSize. Regardless, this solution is generally much better than caching an entire file in memory, or copying it to a temporary directory, etc.
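The cacheSize variant mentioned above could look roughly like this. It is a sketch, not a tested drop-in: the class name, the removeRange method and its parameters are made up, and the byte offset and length are assumed to come from something like scanForString and the byte length of the text.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class ChunkedRemove {

    // Removes `length` bytes starting at byte offset `start` by shifting the
    // rest of the file forward in chunks of at most cacheSize bytes.
    public static void removeRange(Path file, long start, long length, int cacheSize)
            throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "rw")) {
            long fileLength = raf.length();
            if (start < 0 || length < 0 || cacheSize <= 0 || start + length > fileLength) {
                throw new IllegalArgumentException("range or cacheSize out of bounds");
            }
            byte[] cache = new byte[cacheSize];
            long readPos = start + length; // first byte to keep
            long writePos = start;         // where that byte is moved to
            while (readPos < fileLength) {
                int wanted = (int) Math.min(cache.length, fileLength - readPos);
                raf.seek(readPos);
                raf.readFully(cache, 0, wanted);
                raf.seek(writePos);
                raf.write(cache, 0, wanted);
                readPos += wanted;
                writePos += wanted;
            }
            // Cut off the tail, which is now a duplicate of the shifted data.
            raf.setLength(fileLength - length);
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("alphabet", ".txt");
        Files.write(file, "ABCDEFGHIJKLMNOPQRSTUVWXYZ".getBytes(StandardCharsets.US_ASCII));
        removeRange(file, 0, 3, 4); // remove "ABC" using a 4-byte cache
        System.out.println(new String(Files.readAllBytes(file), StandardCharsets.US_ASCII));
    }
}
```

Each chunk is read completely before it is written, so the overlap between the read and write regions is harmless. Memory use stays at cacheSize bytes no matter how large the file is.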

Unable to a read a large file using BufferedReader in Java

I am trying to read a file using BufferedReader, but when I print the result, I get some weird characters.
The code that reads the file is:
private static String readJsonFile(String fileName) throws IOException{
BufferedReader br = null;
try {
StringBuilder sb = new StringBuilder();
br = new BufferedReader(new FileReader(fileName));
String line = br.readLine();
while(line != null ){
sb.append(line);
System.out.println(line);
line=br.readLine();
}
return sb.toString();
} finally{
br.close();
}
}
This function is being called as :
String jsonString = null;
try {
jsonString = readJsonFile(fileName);
} catch (IOException e) {
e.printStackTrace();
}
But when I print this to the console using System.out.println(jsonString);, I get some strange symbols.
Note: it works fine when the file is small.
Is there a limit on the size of the file it can read?
You're using the platform default encoding to read the file, which is probably encoded in UTF-8. Check the actual encoding of the file, and specify the encoding:
BufferedReader r = new BufferedReader(new InputStreamReader(new FileInputStream("..."), StandardCharsets.UTF_8));
Note that since you simply want to read everything from the file, you could simply use
String json = new String(Files.readAllBytes(...), StandardCharsets.UTF_8);

Java : Writing CSV in String format to CSV in a file

A method returns a String in comma separated format. For example, the returned String can be like the one given below.
Tarantino,50,M,USA\n Carey Mulligan,27,F,UK\n Gong Li,45,F,China
I will need to get this String and write it into a CSV file. I'll have to insert a header and a footer for this file as well.
For example, when I open the file, the contents for the above data will be
Name,Age,Gender,Country
Tarantino,50,M,USA
Carey Mulligan,27,F,UK
Gong Li,45,F,China
How do we do that ? Are there any open source libraries to do this task ?
The CSV format is not very strictly defined, and a header line is optional. It is a pretty simple format: data values are separated by commas, semicolons, spaces, etc.
You just have to write your own simple method that writes your string to a local file using FileOutputStream or a Writer from the java.io package.
You can use this as a learning example.
I used BufferedReader because it takes care of the line separators, but you can also use the String#split method and write the resulting tokens.
import java.io.*;
public class Tests {
public static void main(String[] args) {
File file = new File("out.csv");
BufferedWriter out = null;
try {
out = new BufferedWriter(new FileWriter(file));
String string = "Tarantino,50,M,USA\n Carey Mulligan,27,F,UK\n Gong Li,45,F,China";
BufferedReader reader = new BufferedReader(new InputStreamReader(new ByteArrayInputStream(string.getBytes())));
String line;
while ((line = reader.readLine()) != null) {
out.write(line.trim());
out.newLine();
}
}
catch (IOException e) {
// log something
e.printStackTrace();
}
finally {
if (out != null) {
try {
out.close();
} catch (IOException e) {
// ignored
}
}
}
}
}
This is pretty simple
String str = "Tarantino,50,M,USA\n Carey Mulligan,27,F,UK\n Gong Li,45,F,China";
PrintWriter pr = new PrintWriter(new FileWriter(new File("test.csv"), true));
String arr[] = str.split("\\n");
// splited the string by new line provided with the string
pr.println("Name,Age,Gender,Country");
// header written first and rest of data appended
for(String s : arr){
pr.println(s);
}
pr.close();
Don't forget to close the stream in a finally block and handle the exception.
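The question also asks for a footer, which neither snippet writes. Here is a sketch that adds both a header and a footer and lets try-with-resources do the closing. The class name, the method signature, UTF-8 and the footer text are all assumptions, since the question does not say what the footer should contain.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class CsvStringWriter {

    // Writes header, then each non-empty row of csvBody (rows separated by \n), then footer.
    public static void write(Path target, String header, String csvBody, String footer)
            throws IOException {
        try (BufferedWriter out = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            out.write(header);
            out.newLine();
            for (String row : csvBody.split("\n")) {
                String trimmed = row.trim(); // drops the space that follows each \n in the sample
                if (!trimmed.isEmpty()) {
                    out.write(trimmed);
                    out.newLine();
                }
            }
            out.write(footer);
            out.newLine();
        }
    }

    public static void main(String[] args) throws IOException {
        String data = "Tarantino,50,M,USA\n Carey Mulligan,27,F,UK\n Gong Li,45,F,China";
        Path target = Files.createTempFile("people", ".csv");
        // "End of file" is a placeholder footer, not something from the question.
        write(target, "Name,Age,Gender,Country", data, "End of file");
        Files.readAllLines(target, StandardCharsets.UTF_8).forEach(System.out::println);
    }
}
```

Note that this does no quoting or escaping, so it only suits data whose values contain no commas or line breaks.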

Reading multiple text file in Java

I have few text files. Each text file contains some path and/or the reference of some other file.
File1
#file#>D:/FilePath/File2.txt
Mod1>/home/admin1/mod1
Mod2>/home/admin1/mod2
File2
Mod3>/home/admin1/mod3
Mod4>/home/admin1/mod4
All I want is, copy all the paths Mod1, Mod2, Mod3, Mod4 in another text file by supplying only File1.txt as input to my java program.
What I have done so far:
public void readTextFile(String fileName){
try {
br = new BufferedReader(new FileReader(new File(fileName)));
String line = br.readLine();
while(line!=null){
if(line.startsWith("#file#>")){
String string[] = line.split(">");
readTextFile(string[1]);
}
else if(line.contains(">")){
String string[] = line.split(">");
svnLinks.put(string[0], string[1]);
}
line=br.readLine();
}
} catch (Exception e) {
e.printStackTrace();
}
}
Currently my code reads the contents of File2.txt only, control does not come back to File1.txt.
Please ask if more inputs are required.
First of all you are jumping to another file without closing the current reader and when you come back you lose the cursor. Read one file first and then write all its contents that match to another file. Close the current reader (Don't close the writer) and then open the next file to read and so on.
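To illustrate the point about losing the cursor: in the question's snippet br is not declared inside the method, so it looks like a field that the nested call overwrites. If that is the case, the recursion can be kept as long as every call owns its reader. A sketch under that assumption (the class name and the getter are made up; svnLinks and the "#file#>" marker are taken from the question):

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;

class LinkCollector {

    private static final String FILE_MARKER = "#file#>";
    private final Map<String, String> svnLinks = new LinkedHashMap<>();

    public Map<String, String> getSvnLinks() {
        return svnLinks;
    }

    public void readTextFile(String fileName) throws IOException {
        // The reader is local to this call, so a nested call cannot replace it,
        // and try-with-resources closes it when the call returns.
        try (BufferedReader br = new BufferedReader(new FileReader(fileName))) {
            String line;
            while ((line = br.readLine()) != null) {
                if (line.startsWith(FILE_MARKER)) {
                    readTextFile(line.substring(FILE_MARKER.length()));
                } else if (line.contains(">")) {
                    String[] parts = line.split(">", 2);
                    svnLinks.put(parts[0], parts[1]);
                }
            }
        }
    }
}
```

After the nested call returns, the outer loop carries on with the next line of File1, so Mod1 and Mod2 end up in the map as well.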
Seems pretty simple. You need to write your file once your svnLinks Map is populated, assuming your present code works (haven't seen anything too weird in it).
So, once the Map is populated, you could use something along the lines of:
File newFile = new File("myPath/myNewFile.txt");
// TODO check file can be written
// TODO check file exists or create
FileOutputStream fos = null;
OutputStreamWriter osw = null;
BufferedWriter bw = null;
try {
fos = new FileOutputStream(newFile);
osw = new OutputStreamWriter(fos);
bw = new BufferedWriter(osw);
for (String key: svnLinks.keySet()) {
bw.write(key.concat(" my separator ").concat(svnLinks.get(key)).concat("myNewLine"));
}
}
catch (Throwable t) {
// TODO handle more gracefully
t.printStackTrace();
}
finally {
// always close the writer, whether or not writing succeeded
if (bw != null) {
try {
bw.close();
}
catch (Throwable t2) {
t2.printStackTrace();
}
}
}
Here is an non-recursive implementation of your method :
public static void readTextFile(String fileName) throws IOException {
LinkedList<String> list = new LinkedList<String>();
list.add(fileName);
while (!list.isEmpty()) {
BufferedReader br = null;
try {
br = new BufferedReader(new FileReader(new File(list.pop())));
String line;
while ((line = br.readLine()) != null) {
if (line.startsWith("#file#>")) {
String string[] = line.split(">");
list.add(string[1]);
} else if (line.contains(">")) {
String string[] = line.split(">");
svnLinks.put(string[0], string[1]);
}
}
} catch (Exception e) {
e.printStackTrace();
} finally {
// br is still null if the file could not be opened
if (br != null) {
br.close();
}
}
}
}
I just used a LinkedList to maintain the order. I suggest adding a counter if you want to limit the number of files read (the depth), e.g.:
while (!list.isEmpty() && readCount < 10 )
This will eliminate the chance of the code running forever (in case of a circular reference).
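An alternative to a fixed counter is to remember which files have already been read and skip them the second time. This is a different technique from the counter above, sketched here with made-up names; only the "#file#>" marker comes from the question, and the handling of the Mod lines is left out to keep it short.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class LinkedFileWalker {

    private static final String FILE_MARKER = "#file#>";

    // Returns the canonical paths of all files reached from firstFile,
    // in the order they were read; each file is read at most once.
    public static List<String> readOrder(String firstFile) throws IOException {
        Deque<String> pending = new ArrayDeque<>();
        Set<String> visited = new LinkedHashSet<>();
        pending.add(firstFile);
        while (!pending.isEmpty()) {
            String current = pending.poll();
            String key = new File(current).getCanonicalPath();
            if (!visited.add(key)) {
                continue; // already read: a circular or repeated reference
            }
            try (BufferedReader br = new BufferedReader(new FileReader(current))) {
                String line;
                while ((line = br.readLine()) != null) {
                    if (line.startsWith(FILE_MARKER)) {
                        pending.add(line.substring(FILE_MARKER.length()));
                    }
                }
            }
        }
        return new ArrayList<>(visited);
    }
}
```

With two files that reference each other, the loop ends after both have been read once, without an arbitrary limit on the number of files.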

csv parser reading headers

I'm working on a CSV parser and want to read the headers and the rest of the CSV file separately.
Here is my code to read the CSV file.
The current code reads everything in the CSV file, but I need to read the headers separately.
Please help me with this.
public class csv {
private void csvRead(File file)
{
try
{
BufferedReader br = new BufferedReader( new FileReader(file));
String strLine = "";
StringTokenizer st = null;
File cfile=new File("csv.txt");
BufferedWriter writer = new BufferedWriter(new FileWriter(cfile));
int tokenNumber = 0;
while( (strLine = br.readLine()) != null)
{
st = new StringTokenizer(strLine, ",");
while(st.hasMoreTokens())
{
tokenNumber++;
writer.write(tokenNumber+" "+ st.nextToken());
writer.newLine();
}
tokenNumber = 0;
writer.flush();
}
}
catch(Exception e)
{
e.getMessage();
}
}
CSVFormat provides a withHeader() method. If you use this option, you can read the file using its headers.
CSVFormat format = CSVFormat.newFormat(',').withHeader();
Map<String, Integer> headerMap = dataCSVParser.getHeaderMap();
will give you all headers.
public class CSVFileReaderEx {
public static void main(String[] args){
readFile();
}
public static void readFile(){
List<Map<String, String>> csvInputList = new CopyOnWriteArrayList<>();
List<Map<String, Integer>> headerList = new CopyOnWriteArrayList<>();
String fileName = "C:/test.csv";
CSVFormat format = CSVFormat.newFormat(',').withHeader();
try (BufferedReader inputReader = new BufferedReader(new FileReader(new File(fileName)));
CSVParser dataCSVParser = new CSVParser(inputReader, format); ) {
List<CSVRecord> csvRecords = dataCSVParser.getRecords();
Map<String, Integer> headerMap = dataCSVParser.getHeaderMap();
headerList.add(headerMap);
headerList.forEach(System.out::println);
for(CSVRecord record : csvRecords){
Map<String, String> inputMap = new LinkedHashMap<>();
for(Map.Entry<String, Integer> header : headerMap.entrySet()){
inputMap.put(header.getKey(), record.get(header.getValue()));
}
if (!inputMap.isEmpty()) {
csvInputList.add(inputMap);
}
}
csvInputList.forEach(System.out::println);
} catch (Exception e) {
System.out.println(e);
}
}
}
Please consider using Commons CSV. This library is written according to RFC 4180 - Common Format and MIME Type for Comma-Separated Values (CSV) Files. It can read lines such as:
"aa,a","b""bb","ccc"
Its use is quite simple: there are just 3 classes, and here is a small sample from the documentation:
Parsing of a csv-string having tabs as separators, '"' as an optional
value encapsulator, and comments starting with '#':
CSVFormat format = new CSVFormat('\t', '"', '#');
Reader in = new StringReader("a\tb\nc\td");
String[][] records = new CSVParser(in, format).getRecords();
And additionally you get this parsers already available as constants:
DEFAULT - Standard comma separated format as defined by RFC 4180.
EXCEL - Excel file format (using a comma as the value delimiter).
MYSQL - Default MySQL format used by the SELECT INTO OUTFILE and LOAD DATA INFILE operations.
TDF - Tabulation delimited format.
Have you considered OpenCSV?
Previous question here...
CSV API for Java
Looks like you can split out the header quite easily...
String fileName = "data.csv";
CSVReader reader = new CSVReader(new FileReader(fileName ));
// if the first line is the header
String[] header = reader.readNext();
// iterate over reader.readNext until it returns null
String[] line = reader.readNext();
Your code here, being
while( (strLine = br.readLine()) != null)
{
//reads everything in your csv
}
will print all of your CSV content.
For example, the following fetches your header:
Reader in = ...;
CSVFormat.EXCEL.withHeader("Col1", "Col2", "Col3").parse(in);
As suggested, life could be easier using the predefined CSVFormat from the apache commons library. Link here (https://commons.apache.org/proper/commons-csv/user-guide.html).
Cheers.
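If pulling in a library is not an option, the header can also be separated in plain Java by reading the first line before the loop starts. A naive sketch, with made-up class and method names; it splits on commas only and does not handle quoted fields, so it is not a substitute for the parsers mentioned above:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class HeaderSplit {

    // Reads the first line as the header; returns an empty array for empty input.
    public static String[] readHeader(BufferedReader br) throws IOException {
        String first = br.readLine();
        return first == null ? new String[0] : first.split(",", -1);
    }

    // Reads all remaining lines as data rows.
    public static List<String[]> readRows(BufferedReader br) throws IOException {
        List<String[]> rows = new ArrayList<>();
        String line;
        while ((line = br.readLine()) != null) {
            rows.add(line.split(",", -1));
        }
        return rows;
    }

    public static void main(String[] args) throws IOException {
        String csv = "Name,Age\nTarantino,50\nGong Li,45\n";
        try (BufferedReader br = new BufferedReader(new StringReader(csv))) {
            String[] header = readHeader(br);   // consumes only the first line
            List<String[]> rows = readRows(br); // continues after the header
            System.out.println(Arrays.toString(header));
            System.out.println(rows.size());
        }
    }
}
```

The same reader is passed to both methods, so the rows start where the header ended.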
