Splitting data in CSV file - java

Below is the data format in my CSV file:
userid,group,username,status
In my Java code I split each line using , as the delimiter.
E.g.:
Normal scenario, in which my code works fine:
1001,admin,ram,active
In this scenario (a user with firstname,lastname), when I read the status of user 1002 it comes out as kumar, because the code takes the 4th column as the status:
1002,User,ravi,kumar,active
Kindly help me change the code logic so that it works for both scenarios.

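You can use the OpenCSV library. The file is comma-separated, so the default separator is the right one:
CSVReader csvReader = new CSVReader(new FileReader(fileName));
List<String[]> rows = csvReader.readAll();
Then you can test the first column of a row. Compare it as a string, for example: if ("1002".equals(rows.get(0)[0])) ....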
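Because the status is always the last field, one way to handle both scenarios is to read it from the end of the row instead of from a fixed index. The following is a minimal sketch in plain Java, assuming that userid and group never contain commas and that any extra fields belong to the username. The names UserRecordParser, UserRecord and parse are hypothetical and not part of any library.

```java
import java.util.Arrays;

public class UserRecordParser {

    /** Holds the four logical fields of one CSV line. */
    static final class UserRecord {
        final String userId;
        final String group;
        final String userName;
        final String status;

        UserRecord(String userId, String group, String userName, String status) {
            this.userId = userId;
            this.group = group;
            this.userName = userName;
            this.status = status;
        }
    }

    /**
     * Parses one line. Assumption: the first two fields and the last field
     * never contain commas, so everything in between is the username.
     */
    static UserRecord parse(String line) {
        String[] parts = line.split(",", -1); // -1 keeps trailing empty fields
        if (parts.length < 4) {
            throw new IllegalArgumentException("Expected at least 4 fields: " + line);
        }
        String userId = parts[0].trim();
        String group = parts[1].trim();
        // The status is always the last field, however many fields there are.
        String status = parts[parts.length - 1].trim();
        // Re-join whatever lies between group and status into the username.
        String userName = String.join(",", Arrays.copyOfRange(parts, 2, parts.length - 1));
        return new UserRecord(userId, group, userName, status);
    }

    public static void main(String[] args) {
        String[] lines = {"1001,admin,ram,active", "1002,User,ravi,kumar,active"};
        for (String line : lines) {
            UserRecord r = parse(line);
            System.out.println(r.userId + " username=" + r.userName + " status=" + r.status);
        }
    }
}
```

With OpenCSV the same idea applies to each String[] returned by readAll(): take row[row.length - 1] as the status.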

Related

I want to convert a report that is in text format into an xlsx document, but the problem is that the data in the text file has some missing column values.

Typical report data looks like this (posted as an image, not reproduced here):
A simple approach that I wanted to follow was to use space as a delimiter, but the data is not well structured.
Read the first line of the file and split it into columns wherever there is more than one whitespace character. In addition, record where each column starts and how wide it is.
After that you can simply go through the remaining rows and extract the values by slicing each row at the column positions you recorded.
(And please don't put images of text on Stack Overflow; actual text is better.)
EDIT:
python implementation:
import pandas as pd
import re

file = "path/to/file.txt"
with open(file, "r") as f:
    # The header line gives the column names and where each column starts.
    line = f.readline()
    columns = re.split(" +", line)
    # Start offset of each column within the header line. The final -1 makes
    # the last slice stop just before the trailing newline.
    column_sizes = [line.index(column) for column in columns]
    column_sizes.append(-1)
    # ------
    f.readline()  # skip the line that follows the header
    rows = []
    while True:
        line = f.readline()
        if len(line) == 0:
            break
        elif line[-1] != "\n":
            # Make sure the last line also ends with a newline.
            line += "\n"
        row = []
        for i in range(len(column_sizes) - 1):
            value = line[column_sizes[i]:column_sizes[i + 1]]
            row.append(value)
        rows.append(row)

columns = [column.strip() for column in columns]
df = pd.DataFrame(data=rows, columns=columns)
print(df)
df.to_excel(file.split(".")[0] + ".xlsx")
You are correct that exporting from text to CSV is not a practical starting point; however, it would be good for import. So here is your 100% well-structured source text, to be saved as plain text.
And here is the import to Excel
You can use Google Lens to extract the data from this picture, then copy and paste it into an Excel file. That is the easiest way.
Alternatively, first convert it to a PDF and then use Google Lens: go to File, scroll to the Print option, and in the print settings select Microsoft Print to PDF. Press Print, choose a location when prompted, and use the resulting file.

Karate: columns in my CSV file do not have the same row count. While reading the data, empty values are added for the columns with fewer rows

My CSV file data: one column is HeaderText (6 rows) and the other is accountBtn (3 rows)
accountBtn,HeaderText
New Case,Type
New Note,Phone
New Contact,Website
,Account Owner
,Account Site
,Industry
When I read the file with the code below:
* def csvData = read('../TestData/Button.csv')
* def expectedButton = karate.jsonPath(csvData,"$..accountBtn")
* def eHeaderTest = karate.jsonPath(csvData,"$..HeaderText")
the data set generated for accountBtn is: ["New Case","New Note","New Contact","","",""]
My expected data set is: ["New Case","New Note","New Contact"]
Any idea how this can be handled?
That's how it is in Karate and it shouldn't be a concern since you are just using it as data to drive a test. You can run a transform to convert empty strings to null if required: https://stackoverflow.com/a/56581365/143475
Otherwise, please consider contributing code to make Karate better!
The other option is to use JSON as a data-source instead of CSV: https://stackoverflow.com/a/47272108/143475

Adding header to processed RDDs in Spark java

My question is almost the same as "Add a header before text file on save in Spark". The difference is that my header RDD is:
String headerSTR = "inc_id,po_id,ass,inci_type,cat,sub_cat";
JavaRDD<String> PMheader = jsc.parallelize(Arrays.asList(headerSTR));
And my lines RDD is of type PMTable:
JavaRDD<PMTable>rdd_records=noheader.map(new Function<String,PMTable>(){---
PMTable sd = new PMTable(----);
return sd;});
rdd_records.saveAsTextFile();
mergeAllFiles();
I have merged all the result files into a single CSV file, which does not contain a header. Now I need the union of the header RDD and the lines RDD, but the method union(JavaRDD<String>) in the type JavaRDD<String> is not applicable for the arguments (JavaRDD<PMTable>). So how can I get the union of the header and the lines using the Spark Java API?
Thanks in advance.

Extracting a column from a paragraph from a csv file using java

MAJOR ACC NO,MINOR ACC NO,STD CODE,TEL NO,DIST CODE
7452145,723456, 01,4213036,AAA
7254287,7863265, 01,2121920,AAA
FRUNDTE,FMACNO,FACCNO,FDISTCOD,FBILSEQ,FOOCTYP,FOOCDES,FOOCAMT,FSTD,FTELNO,FNORECON,FXFRACCN,FLANGIND,CUR
12345,71234,7643234,AAA,001,DX,WLR Promotion - Insitu /Pre-Cabled PSTN Connection,-37.87,,,0,,E,EUR
FRUNDTE,FMACNO,FACCNO,FDISTCOD,FBILSEQ,FORDNO,FREF,FCHGDES,FCHGAMT,CUR,FORENFRM,FORENTO
3242241,72349489,2345352,AAA,001,30234843P ,1,NEW CONNECTION - PRECABLED CHARGE,37.87,EUR,2123422,201201234
12123471,7618412389,76333232,AAA,001,3123443P ,2,BROKEN PERIOD RENTAL,5.40,EUR,201234523,20123601
I have a CSV file like the one above and I want to extract certain columns from it. For example, I want to extract the first column of the first paragraph. I'm fairly new to Java. I am able to read the file, but I don't know how to extract specific columns from the different paragraphs. Any help will be appreciated.

Reading selective column data from a text file into a list in Java

Can someone help me read a selected column of data from a text file into a list?
E.g. if the text file data is as follows:
-------------
id name age
01 ron 21
02 harry 12
03 tom 23
04 jerry 25
-------------
From the above data, I need to gather the column "name" into a list in Java and print it.
java.util.Scanner could be used to read the file, discarding the unwanted columns.
Either print the wanted column values as the file is processed or add() them to a java.util.ArrayList and print them once processing is complete.
A small example with limited error checking:
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

Scanner s = new Scanner(new File("input.txt"));
List<String> names = new ArrayList<String>();

// Skip column headings (assumes the header is the first line of the file).
s.nextLine();

// Read each line, ensuring correct format.
while (s.hasNext())
{
    s.nextInt();         // read and skip 'id'
    names.add(s.next()); // read and store 'name'
    s.nextInt();         // read and skip 'age'
}
s.close();

for (String name: names)
{
    System.out.println(name);
}
Use a file reader to read the file line by line, split each line on the spaces, and add whichever column you want to the List. Use a BufferedReader to grab the lines, with something like this:
BufferedReader br = new BufferedReader(new FileReader("C:\\readFile.txt"));
Then you can grab a line with:
String line = br.readLine();
Finally you can split the string into an array by column by doing this:
String[] columns = line.split(" ");
Then you can access the columns and add them into the list depending on if you want column 0, 1, or 2.
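The steps above can be put together roughly as follows. This is only a sketch, assuming that columns are separated by whitespace, that the first data-bearing line is the header row, and that blank or dashed separator lines should be skipped. ColumnReader and readColumn are illustrative names, and the sample data is read from a string rather than from C:\\readFile.txt so that the example is self-contained.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ColumnReader {

    /**
     * Reads whitespace-separated rows and returns the values of one column.
     * The first line that is neither blank nor dashed is treated as the header.
     */
    static List<String> readColumn(Reader source, String columnName) {
        List<String> values = new ArrayList<>();
        int columnIndex = -1; // not known until the header has been read
        try (BufferedReader br = new BufferedReader(source)) {
            String line;
            while ((line = br.readLine()) != null) {
                line = line.trim();
                // Assumption: blank lines and lines starting with '-' carry no data.
                if (line.isEmpty() || line.startsWith("-")) {
                    continue;
                }
                String[] columns = line.split("\\s+");
                if (columnIndex < 0) {
                    // Header row: find the position of the wanted column.
                    columnIndex = Arrays.asList(columns).indexOf(columnName);
                    if (columnIndex < 0) {
                        throw new IllegalArgumentException("No such column: " + columnName);
                    }
                    continue;
                }
                // Data row: keep the value if the row is long enough.
                if (columnIndex < columns.length) {
                    values.add(columns[columnIndex]);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return values;
    }

    public static void main(String[] args) {
        String sample = "id name age\n01 ron 21\n02 harry 12\n03 tom 23\n04 jerry 25\n";
        for (String name : readColumn(new StringReader(sample), "name")) {
            System.out.println(name);
        }
    }
}
```

To read from a real file, pass new FileReader("C:\\readFile.txt") instead of the StringReader. Splitting on "\\s+" rather than " " also copes with tabs and repeated spaces between columns.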
Are the columns delimited by tabs?
Look at Java CSV, an open source library for reading comma-delimited or tab-delimited text files. It should do most of the job. I've never used it myself, but I assume you'd be able to ask for all the values from column 1 (or similar).
Alternatively, you could read the file one line at a time using a BufferedReader (which has a readLine() method) and then call String.split() and grab the parts you want.
