Convert CSV values to HashMap key-value pairs in Java

Hi, I have a CSV file called test.csv. I am trying to read the CSV line by line and convert the values into hash key-value pairs.
Here is the code:
public class Example {
    public static void main(String[] args) throws ParseException, IOException {
        BufferedReader br = new BufferedReader(new FileReader("test.csv"));
        String line = null;
        HashMap<String, String> map = new HashMap<String, String>();
        while ((line = br.readLine()) != null) {
            String str[] = line.split(",");
            for (int i = 0; i < str.length; i++) {
                String arr[] = str[i].split(":");
                map.put(arr[0], arr[1]);
            }
        }
        System.out.println(map);
    }
}
The CSV file is as follows:
1,"testCaseName":"ACLTest","group":"All_Int","projectType":"GEN","vtName":"NEW_VT","status":"ACTIVE","canOrder":"Yes","expectedResult":"duplicateacltrue"
2,"testCaseName":"DCLAddTest","group":"India_Int","projectType":"GEN_NEW","vtName":"OLD_VT","status":"ACTIVE","canOrder":"Yes","expectedResult":"invalidfeaturesacltrue"
When I run this code I get this error:
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 1
Example.main(Example.java:33)
Can anyone please help me fix the code and find the error in my program?

Using FasterXML's CSV package:
https://github.com/FasterXML/jackson-dataformats-text/tree/master/csv
public static List<Map<String, String>> read(File file) throws JsonProcessingException, IOException {
    List<Map<String, String>> response = new LinkedList<Map<String, String>>();
    CsvMapper mapper = new CsvMapper();
    CsvSchema schema = CsvSchema.emptySchema().withHeader();
    MappingIterator<Map<String, String>> iterator = mapper.reader(Map.class)
            .with(schema)
            .readValues(file);
    while (iterator.hasNext()) {
        response.add(iterator.next());
    }
    return response;
}

When you split your String, the first token of each line is just the row number, so arr contains only "1" and there is nothing in arr[1], which causes the exception.
If you do not need the leading 1, 2, etc., you can use the following code:
String str[] = line.split(",");
for (int i = 1; i < str.length; i++) {
    String arr[] = str[i].split(":");
    map.put(arr[0], arr[1]);
}

The problem is that when you split your str, the first element in each line stands alone (i.e. 1 and 2). So arr only contains ["1"], and hence arr[1] doesn't exist.
E.g. for the example input:
1,"testCaseName":"ACLTest"
split by , => str contains {1, testCaseName:ACLTest}
split by : at the first iteration => arr contains {1}
Example:
String s = "1,testCaseName:ACLTest";
String str[] = s.split(",");
System.out.println(Arrays.toString(str));
for (String p : str) {
    String arr[] = p.split(":");
    System.out.println(Arrays.toString(arr));
}
Output:
[1, testCaseName:ACLTest]
[1] // <- here arr[1] doesn't exist; you only have arr[0], hence the ArrayIndexOutOfBoundsException when trying to access arr[1]
[testCaseName, ACLTest]
To fix your code (if you don't want to use a CSV parser), start your loop at 1:
for (int i = 1; i < str.length; i++) {
    String arr[] = str[i].split(":");
    map.put(arr[0], arr[1]);
}
Another problem is that a HashMap stores only one value per key.
So when inserting "testCaseName":"ACLTest" and then "testCaseName":"DCLAddTest", the first value will be erased and replaced by the second one:
Map<String, String> map = new HashMap<>();
map.put("testCaseName","ACLTest");
map.put("testCaseName","DCLAddTest");
System.out.println(map);
Output:
{testCaseName=DCLAddTest}
So you have to fix that too.
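One way to fix that (a sketch, not the only design): key an outer map by the leading row number, so each CSV line gets its own inner map. The lines below are shortened, hard-coded versions of the question's sample data:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class RowMapExample {
    public static void main(String[] args) {
        String[] lines = {
            "1,\"testCaseName\":\"ACLTest\",\"group\":\"All_Int\"",
            "2,\"testCaseName\":\"DCLAddTest\",\"group\":\"India_Int\""
        };
        // one inner map per CSV row, keyed by the leading row number
        Map<String, Map<String, String>> rows = new LinkedHashMap<>();
        for (String line : lines) {
            String[] parts = line.split(",");
            Map<String, String> row = new LinkedHashMap<>();
            for (int i = 1; i < parts.length; i++) {          // skip the row number
                String[] kv = parts[i].split(":");
                row.put(kv[0].replace("\"", ""), kv[1].replace("\"", ""));
            }
            rows.put(parts[0], row);                          // "1" -> {...}, "2" -> {...}
        }
        System.out.println(rows.get("1").get("testCaseName")); // ACLTest
        System.out.println(rows.get("2").get("testCaseName")); // DCLAddTest
    }
}
```

This way the second line no longer overwrites the first, because the duplicate keys live in separate inner maps.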

Look at the output of the call
String arr[] = str[i].split(":");
arr[1] does not exist for the first element of each line in your CSV file, which happens to be 1, 2, ... You can start the loop at int i = 1 to skip it and fix this issue.

String.split is rubbish for parsing CSV. Either use the Guava Splitter or a proper CSV parser. You can parse CSV into beans using the Jackson CSV mapper like this:
public class CSVPerson {
    public String firstname;
    public String lastname;
    //etc
}

CsvMapper mapper = new CsvMapper();
CsvSchema schema = CsvSchema.emptySchema().withHeader().withColumnSeparator(delimiter);
MappingIterator<CSVPerson> it = mapper.reader(CSVPerson.class).with(schema).readValues(input);
while (it.hasNext()) {
    CSVPerson row = it.next();
}
more info at http://demeranville.com/how-not-to-parse-csv-using-java/

Besides the problem with the first number, which is not a pair and is causing the exception, you will not want to use a HashMap, since a HashMap uses unique keys, so line 2 will replace the values from line 1.
You should use a MultiMap, or a List of pairs, in this case.
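If you'd rather not pull in a Multimap library, a plain Map<String, List<String>> can emulate one. A minimal sketch, with hard-coded values standing in for the parsed CSV fields:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class MultimapSketch {
    public static void main(String[] args) {
        // each key maps to a list, so repeated keys accumulate instead of overwriting
        Map<String, List<String>> multi = new LinkedHashMap<>();
        multi.computeIfAbsent("testCaseName", k -> new ArrayList<>()).add("ACLTest");
        multi.computeIfAbsent("testCaseName", k -> new ArrayList<>()).add("DCLAddTest");
        System.out.println(multi.get("testCaseName")); // [ACLTest, DCLAddTest]
    }
}
```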

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.*;

public class Example {
    public static void main(String[] args) {
        String csvFile = "test.csv";
        String line = "";
        String cvsSplitBy = ",";
        HashMap<String, String> list = new HashMap<>();
        try (BufferedReader br = new BufferedReader(new FileReader(csvFile))) {
            while ((line = br.readLine()) != null) {
                // use comma as separator
                String[] country = line.split(cvsSplitBy);
                list.put(country[0], country[1]);
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
        System.out.println(list);
    }
}

Using openCSV would be one way to do it:
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
import au.com.bytecode.opencsv.CSVReader;

public class CsvFileReader {
    public static void main(String[] args) {
        try {
            System.out.println("\n**** readLineByLineExample ****");
            String csvFilename = "C:/Users/hussain.a/Desktop/sample.csv";
            CSVReader csvReader = new CSVReader(new FileReader(csvFilename));
            String[] col = null;
            while ((col = csvReader.readNext()) != null) {
                System.out.println(col[0]);
            }
            csvReader.close();
        } catch (ArrayIndexOutOfBoundsException ae) {
            System.out.println(ae + " : error here");
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
the jar is available here

Related

Compare two columns in csv and write only unique values to another csv

I am fairly new to Java. I have a CSV file with 8 columns and I need to create a new CSV from that file with 5 columns. I have already done that part: reading the CSV and creating a new one. But there is repeated data in the original CSV, and the scenario is: if rows are repeated, I only need to keep a single row. For example:
a, 123, value1, a#email.com
a, 123, value1, a#email.com
a, 123, value1, a#email.com
a, 123, Value7, a#email.com
b, 567, Value5, b#email.com
b, 567, Value6, b#email.com
b, 567, Value6, b#email.com
In the values above, a has value1 repeated 3 times, and b has Value6 repeated twice. In my new CSV, I only need to write those values once, so that the output looks something like this:
a, 123, value1, a#email.com
a, 123, Value7, a#email.com
b, 567, Value5, b#email.com
b, 567, Value6, b#email.com
Below is the code I have written to read and write the CSV file. I am finding it hard to come up with the logic for the above scenario. Any help would be appreciated.
Thank you.
public static void main(String[] args) throws IOException {
    try {
        String row = "";
        List<List<String>> data = new ArrayList<>();
        Map newMap = new HashMap();
        BufferedReader br = new BufferedReader(new FileReader("myFile.csv"));
        row = br.readLine(); // skip header
        while ((row = br.readLine()) != null) {
            String[] line = row.split(",", -1);
            List<String> newList = new ArrayList<String>();
            for (String cell : line) {
                newList.add(cell);
            }
            data.add(newList);
        }
        FileWriter csvWriter = new FileWriter("newFile.csv");
        // Write headers to the new file
        csvWriter.append("User Name," + "User LoginID," + "User Position," + "Permission," + "Email Address" + "\n");
        for (List rowData : data) {
            if (rowData.toString().length() > 1) {
                rowData.remove(5);
                rowData.remove(2);
                rowData.remove(4);
                newMap.put(rowData.get(0), rowData.get(3));
                csvWriter.append(String.join(",", rowData));
                csvWriter.append("\n");
            }
        }
        csvWriter.flush();
        csvWriter.close();
    } catch (FileNotFoundException e) {
        e.printStackTrace();
    }
}
I can't see how you're getting the values for users/permissions, so I'm taking some liberty here by specifying some bits arbitrarily which I'm assuming you already have:
Map<String, Set<String>> userToPermissionMap = new HashMap<>();
// read in CSV line by line
String[] lines = csvFileInput; // not actual code
for (String line : lines) {
    String[] rowData = line.split(",");
    String user = rowData[0];
    String permission = rowData[3];
    if (!userToPermissionMap.containsKey(user)) {
        userToPermissionMap.put(user, new HashSet<>());
    }
    userToPermissionMap.get(user).add(permission);
}
This only shows how to group the permissions per user. You'll still want to record the other details, I'm guessing, but that should be straightforward to add as you see fit. Then you write it to a new CSV.
Alternatively, you could delete rows that you find are duplicates. This introduces the interesting problem that after deleting a row, the next row shifts into its place, so iterate in reverse if you follow this method. The approach will be similar to the above, though: maintain a record of user/permission combinations already seen and only keep the rows where the combination has not yet been encountered.
See what you think of this:
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class Scratch {
    public static final int USER_COL = 0;
    public static final int LOGIN_ID_COL = 1;
    public static final int USER_POSITION_COL = 2;
    public static final int PERMISSION_COL = 3;
    public static final int EMAIL_COL = 4;

    private static Map<String, List<String>> userData = new HashMap<>();
    private static Map<String, Set<String>> userToPermissionMap = new HashMap<>();

    public static void main(String[] args) throws IOException {
        BufferedReader br = new BufferedReader(new FileReader("myFile.csv"));
        String header = br.readLine(); // store header for later use
        String rowEntry = null;
        while ((rowEntry = br.readLine()) != null) {
            String[] row = rowEntry.split(",");
            String user = row[USER_COL];
            String permission = row[PERMISSION_COL];
            // doesn't matter if we overwrite an entry here, as we extract the unique permissions separately
            userData.put(user, Arrays.asList(row));
            if (!userToPermissionMap.containsKey(user)) {
                userToPermissionMap.put(user, new HashSet<>()); // new user
            }
            userToPermissionMap.get(user).add(permission); // add permission to the Set
        }
        br.close();
        FileWriter csvWriter = new FileWriter("newFile.csv");
        csvWriter.append(header + "\n"); // copy of original header
        for (String user : userToPermissionMap.keySet()) { // for each user
            for (String permission : userToPermissionMap.get(user)) { // for each unique permission
                StringBuilder builder = new StringBuilder();
                builder.append(user + ",");
                builder.append(userData.get(user).get(LOGIN_ID_COL) + ",");
                builder.append(userData.get(user).get(USER_POSITION_COL) + ",");
                builder.append(permission + ",");
                builder.append(userData.get(user).get(EMAIL_COL) + "\n");
                csvWriter.append(builder.toString());
            }
        }
        csvWriter.close(); // flush and release the output file
    }
}

Scanning integer and string from file in Java

I'm new to Java and I have to read from a file, then convert what I have read into variables. My file consists of a fruit name and then a price, in a long list. The file looks like this:
Bananas,4
Apples,5
Strawberry,8
...
Kiwi,3
So far I have created two variables (double price and String name), then set up a Scanner that reads from the file.
public void read_file() {
    try {
        fruits = new Scanner(new File("fruits.txt"));
        print_file();
    } catch (Exception e) {
        System.out.printf("Could not find file\n");
    }
}

public void print_file() {
    while (fruits.hasNextLine()) {
        String a = fruits.nextLine();
        System.out.printf("%s\n", a);
        return;
    }
}
Currently I am only able to print out the entire line. But I was wondering how I could break this up to be able to store the lines into variables.
Your string a holds an entire line, like Apples,5. So split it on the comma and store the pieces into variables:
String arr[] = a.split(",");
String name = arr[0];
int number = Integer.parseInt(arr[1]);
Or, if prices are not integers:
double number = Double.parseDouble(arr[1]);
Using Java 8 streams and the improved file-reading API, you can do it as follows. It stores item and count as key-value pairs in a map, so it is easy to access by key afterwards.
This may be too advanced for now, but it will eventually help you as you get to know the newer features of Java.
// requires: import static java.util.stream.Collectors.toMap;
try (Stream<String> stream = Files.lines(Paths.get("src/test/resources/items.txt"))) {
    Map<String, Integer> itemMap = stream.map(s -> s.split(","))
            .collect(toMap(a -> a[0], a -> Integer.valueOf(a[1])));
    System.out.println(itemMap);
} catch (IOException e) {
    e.printStackTrace();
}
Output:
{Apples=5, Kiwi=3, Bananas=4, Strawberry=8}
You can specify a delimiter for the scanner by calling the useDelimiter method, like:
public static void main(String[] args) {
    String str = "Bananas,4\n" + "Apples,5\n" + "Strawberry,8\n";
    try (Scanner sc = new Scanner(str).useDelimiter(",|\n")) {
        while (sc.hasNext()) {
            String fruit = sc.next();
            int price = sc.nextInt();
            System.out.printf("%s,%d\n", fruit, price);
        }
    } catch (Exception e) {
        e.printStackTrace(System.out);
    }
}
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

public class Test {
    public static void main(String[] args) {
        BufferedReader reader;
        try {
            reader = new BufferedReader(new FileReader(
                    "C://Test/myfile.txt")); // your file location
            String line = reader.readLine(); // reading the first line
            while (line != null) {
                if (line.contains(",")) {
                    String[] data = line.split(",");
                    System.out.println("Fruit:: " + data[0] + " Count:: " + Integer.parseInt(data[1]));
                }
                // going over to the next line
                line = reader.readLine();
            }
            reader.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

How to read each line and then find the missing document?

Sample Input format:
Name of the file, Author, format type, id, content length.
resume, abc, pdf, 7, 90
resume, asc, doc, 2, 90
resume, azc, docx, 3, 90
Output:
Missing document format
pdf
2,3
doc
7,3
Here is my approach: take input from an external txt file (required).
File file = new File("//Users//Downloads//test_cases//input.txt");
ArrayList<String> al = new ArrayList<String>(); // creating a new generic arraylist
BufferedReader br = new BufferedReader(new FileReader(file));
String st;
while ((st = br.readLine()) != null)
    al.add(st);
br.close();
So, my question is: which is the apt data structure to use after reading each line? Also, how should I approach storing the data?
A sample code would be a great help. Thanks in advance.
This solution is based on the premise that there will be only one value in the "format type" field of each entry.
It requires the Google Guava collections; the jars can be downloaded from https://github.com/google/guava/wiki/Release19.
import java.io.BufferedReader;
import java.io.File;
import java.util.*;
import com.google.common.collect.ArrayListMultimap;
import com.google.common.collect.Multimap;

public class FileReader {
    public void processData() {
        Multimap dataMap = readFile();
        Object[] array = ((ArrayListMultimap) dataMap).get("format type").toArray();
        System.out.println("Missing formats");
        for (Object entry : array) {
            System.out.println(entry.toString().trim());
            String position = "";
            for (int i = 0; i < array.length; i++) {
                if (!entry.toString().equalsIgnoreCase(array[i].toString())) {
                    position = position + " " + i;
                }
            }
            System.out.println(position);
        }
    }

    public Multimap readFile() {
        File file = new File("/Users/sree/Desktop/text.txt");
        Multimap<String, String> dataMap = ArrayListMultimap.create();
        ArrayList<String> al = new ArrayList<String>();
        BufferedReader br;
        try {
            br = new BufferedReader(new java.io.FileReader(file));
            // the header line gives the column names
            Arrays.stream(br.readLine().split(",")).forEach(s -> al.add(s.trim()));
            String st;
            while ((st = br.readLine()) != null) {
                VariableIcrementor instance = new VariableIcrementor();
                Arrays.stream(st.split(","))
                        .forEach(s -> dataMap.put(al.get(instance.increment()), s));
            }
            br.close();
        } catch (Exception e) {
            e.printStackTrace();
        }
        return dataMap;
    }

    public static void main(String[] args) {
        FileReader instance = new FileReader();
        instance.processData();
    }
}
Variable incrementor implementation details:
public class VariableIcrementor {
    private int i = 0;

    public int increment() {
        return i++;
    }
}
You would use hashmaps to store the data, with formats as keys and ids as values. If you want to use Python, here's a sample:
# Read line from file
with open('filename.txt') as f:
    line = f.readline()
entries = line.split(' ')
# Hashmap to store formats and corresponding ids
formats = {}
# set to store all the ids
ids = set()
for en in entries:
    vals = en.split(',')
    frmt, identifier = vals[2], vals[3]
    # Set of ids for each format
    if frmt not in formats:
        formats[frmt] = set()
    formats[frmt].add(identifier)
    ids.add(identifier)
print("Missing formats")
for frmt in formats:
    print(frmt)
    # Missing formats are the set difference between all the ids and the current ids
    print(ids - formats[frmt])
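For comparison, here is a rough plain-JDK Java version of the same idea (the question is about Java). The input lines are hard-coded from the sample, and the field positions (format in column 3, id in column 4) are assumed to match the sample format:

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

public class MissingFormats {
    public static void main(String[] args) {
        String[] lines = {
            "resume, abc, pdf, 7, 90",
            "resume, asc, doc, 2, 90",
            "resume, azc, docx, 3, 90"
        };
        Map<String, Set<String>> byFormat = new LinkedHashMap<>();
        Set<String> allIds = new LinkedHashSet<>();
        for (String line : lines) {
            String[] f = line.split(",");
            String format = f[2].trim(), id = f[3].trim();
            byFormat.computeIfAbsent(format, k -> new LinkedHashSet<>()).add(id);
            allIds.add(id);
        }
        System.out.println("Missing document format");
        for (Map.Entry<String, Set<String>> e : byFormat.entrySet()) {
            // ids that have no document in this format = all ids minus this format's ids
            Set<String> missing = new LinkedHashSet<>(allIds);
            missing.removeAll(e.getValue());
            System.out.println(e.getKey());
            System.out.println(String.join(",", missing));
        }
    }
}
```

For the sample data this prints pdf -> 2,3 and doc -> 7,3, matching the expected output in the question.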

Adding data from .txt document to array

Below is what the text document looks like. The first line is the number of elements I want the array to contain. The second is the IDs of the products, separated by #, and the third line is the total prices of the products, again separated by #.
10
PA/1234#PV/5732#Au/9271#DT/9489#HY/7195#ZR/7413#bT/4674#LR/4992#Xk/8536#kD/9767#
153#25#172#95#235#159#725#629#112#559#
I want to use the following method to pass inputFile to the readProductDataFile method:
public static Product[] readProductDataFile(File inputFile){
// Code here
}
I want to create an array of size 10, or maybe an ArrayList, preferably holding a concatenation of the customer ID and the price, such as Array[1] = PA/1234_153.
There you go, the full class; it does exactly what you want:
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.FileNotFoundException;
import java.io.IOException;

class myRead {
    public static void main(String[] args) throws FileNotFoundException, IOException {
        BufferedReader inputFile = new BufferedReader(new FileReader("test.txt"));
        // the first line contains the number "10"
        String numberOfElements = inputFile.readLine();
        // the second line contains the IDs; split it using "#" as a delimiter
        String secondLine = inputFile.readLine();
        String[] strArray = secondLine.split("#");
        // the third line contains the prices; split it using "#" as a delimiter
        String thirdLine = inputFile.readLine();
        String[] dataArray = thirdLine.split("#");
        // combine the arrays
        String[] combinedArray = new String[strArray.length];
        for (int i = 0; i < strArray.length; i++) {
            combinedArray[i] = strArray[i] + "_" + dataArray[i];
            System.out.println(combinedArray[i]);
        }
    }
}
OUTPUT:
PA/1234_153
PV/5732_25
Au/9271_172
DT/9489_95
HY/7195_235
ZR/7413_159
bT/4674_725
LR/4992_629
Xk/8536_112
kD/9767_559
The trick in what I am doing is using a BufferedReader to read the file, readLine to read each of the three lines, split("#") to split each line using # as the delimiter and create the arrays, and combinedArray[i] = strArray[i] + "_" + dataArray[i]; to put the elements in a combined array, as you want.
public static Product[] readProductDataFile(File inputFile) throws IOException {
    BufferedReader br = new BufferedReader(new FileReader(inputFile));
    // the rest of my previous code goes here
EDIT: everything together, calling a separate method from inside main with the file as an input argument:
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.File;

class myRead {
    public static void main(String[] args) throws FileNotFoundException, IOException {
        File myFile = new File("test.txt");
        readProductDataFile(myFile);
    }

    public static String[] readProductDataFile(File inputFile) throws FileNotFoundException, IOException {
        BufferedReader myReader = new BufferedReader(new FileReader(inputFile));
        // the first line contains the number "10"
        String numberOfElements = myReader.readLine();
        // the second line contains the IDs; split it using "#" as a delimiter
        String secondLine = myReader.readLine();
        String[] strArray = secondLine.split("#");
        // the third line contains the prices; split it using "#" as a delimiter
        String thirdLine = myReader.readLine();
        String[] dataArray = thirdLine.split("#");
        // combine the arrays
        String[] combinedArray = new String[strArray.length];
        for (int i = 0; i < strArray.length; i++) {
            combinedArray[i] = strArray[i] + "_" + dataArray[i];
            System.out.println(combinedArray[i]);
        }
        return combinedArray;
    }
}
OUTPUT
PA/1234_153
PV/5732_25
Au/9271_172
DT/9489_95
HY/7195_235
ZR/7413_159
bT/4674_725
LR/4992_629
Xk/8536_112
kD/9767_559
You don't even need the first line. Just read the second line directly into a single string and then split it using the String.split() method.
Read more about the split method here.
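As a small illustration of that (with a shortened, hard-coded version of the question's second line standing in for the file read): note that String.split drops trailing empty strings, so the trailing # does not produce an extra empty element:

```java
import java.util.Arrays;

public class SplitSecondLine {
    public static void main(String[] args) {
        // pretend this came from readLine(); skip the count line, split the data line
        String secondLine = "PA/1234#PV/5732#Au/9271#";
        String[] ids = secondLine.split("#"); // trailing '#' yields no empty element
        System.out.println(ids.length);
        System.out.println(Arrays.toString(ids));
    }
}
```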
You could use something like this (be aware that I can't test it at the moment):
BufferedReader in = null;
try {
    in = new BufferedReader(new FileReader("fileeditor.txt"));
    String read = null;
    // reads (and skips) the first line
    String firstLine = in.readLine();
    while ((read = in.readLine()) != null) {
        // reads all the other lines; split each row on the "#" character
        String[] splited = read.split("#");
        for (String part : splited) {
            System.out.println(part);
        }
    }
} catch (IOException e) {
    System.out.println("There was a problem: " + e);
    e.printStackTrace();
} finally {
    try {
        // close file
        in.close();
    } catch (Exception e) {
    }
}
This is how you can do it using Java (don't forget the imports):
public static String[] readProductDataFile(File inputFile) throws FileNotFoundException {
    Scanner s = new Scanner(inputFile);
    String data = "";
    while (s.hasNext())
        data += s.nextLine();
    return data.split("#");
}
You can try this way:
Read line by line, storing each row in an array.
Split while storing, so each row is already saved in pieces:
String[] strArray = secondLine.split("#");
Now use a for loop, concatenate the values as you wish, and save them into a third array:
for (int i = 0; i < strArray.length; i++) {
    combinedArray[i] = strArray[i] + "_" + dataArray[i];
}

Sort one column of data in a csv file in ascending order in java

import java.io.*;
import java.util.*;

public class Sort {
    public static void main(String[] args) throws Exception {
        BufferedReader reader = new BufferedReader(new FileReader("data1.csv"));
        Map<String, String> map = new TreeMap<String, String>();
        String line = "";
        while ((line = reader.readLine()) != null) {
            map.put(getField(line), line);
        }
        reader.close();
        FileWriter writer = new FileWriter("sorted_numbers.txt");
        for (String val : map.values()) {
            writer.write(val);
            writer.write('\n');
        }
        writer.close();
    }

    private static String getField(String line) {
        return line.split(",")[0]; // extract value you want to sort on
    }
}
Hi,
I am trying to read an unsorted file and get Java to sort one column of the CSV data file and print the results in a new file. I borrowed this solution while I was searching on this website because I think it is ideal for what I am trying to accomplish. I have 282 rows of data in the form:
UserID, Module, Mark
Ab004ui, g46PRo, 54
cb004ui, g46GRo, 94
gy004ui, g46GRo, 12
ab004ui, g46PRo, 34
this is in the csv file.
When I use the above code it only gives me one line in sorted_marks.txt, like this:
ab004ui, g46PRo, 34
and I believe it wasn't even sorted.
I want all the results in the new file to be sorted based on their UserID and nothing else, but I can't seem to get it to work. Any help would be greatly appreciated.
Remove the new lines from data1.csv.
I would prefer to make the Map's value a list of strings; everything else is almost the same, as below:
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.FileWriter;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class Sort {
    public static void main(String[] args) throws Exception {
        BufferedReader reader = new BufferedReader(new FileReader("data1.csv"));
        Map<String, List<String>> map = new TreeMap<String, List<String>>();
        String line = reader.readLine(); // read header
        while ((line = reader.readLine()) != null) {
            String key = getField(line);
            List<String> l = map.get(key);
            if (l == null) {
                l = new LinkedList<String>();
                map.put(key, l);
            }
            l.add(line);
        }
        reader.close();
        FileWriter writer = new FileWriter("sorted_numbers.txt");
        writer.write("UserID, Module, Mark\n");
        for (List<String> list : map.values()) {
            for (String val : list) {
                writer.write(val);
                writer.write("\n");
            }
        }
        writer.close();
    }

    private static String getField(String line) {
        return line.split(",")[0]; // extract value you want to sort on
    }
}
You probably want line.split(",")[0], not a split on a space. But you should also trim the whitespace from the beginning and end of your sort key using the appropriate String methods.
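A sketch of that idea, with hard-coded rows in place of the file: normalise the sort key with trim() and toLowerCase() so "Ab004ui" and "ab004ui" sort together, and append the full line to the key so rows with equal UserIDs don't collide in the TreeMap (an assumption here: the tie-break order among equal IDs doesn't matter):

```java
import java.util.Map;
import java.util.TreeMap;

public class SortKeyExample {
    public static void main(String[] args) {
        String[] lines = {
            "cb004ui, g46GRo, 94",
            "Ab004ui, g46PRo, 54",
            "ab004ui, g46PRo, 34"
        };
        // TreeMap iterates in key order; the "key|line" trick keeps duplicates distinct
        Map<String, String> sorted = new TreeMap<>();
        for (String line : lines) {
            String key = line.split(",")[0].trim().toLowerCase();
            sorted.put(key + "|" + line, line);
        }
        for (String val : sorted.values()) {
            System.out.println(val);
        }
    }
}
```

Both "Ab004ui" rows now sort before "cb004ui" instead of interleaving by case, and no row is lost.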
