Converting a CSV file to hierarchical XML with Java

We have a Java program that needs to convert a CSV file to hierarchical XML.
The output should look like this:
<?xml version="1.0" encoding="UTF-8"?>
<UteXmlComuniction xmlns="http://www....../data">
<ClientGeneralData>
<Client>
<pfPg></pfPg>
<name>Arnold</name>
<Family>Bordon</Family>
</Client>
<Contract>
<ContractDetail>
<Contract>100020</Contract>
<ContractYear>2019</ContractYear>
</ContractDetail>
</Contract>
</ClientGeneralData>
</UteXmlComuniction>
We are flexible about the CSV file and can define it however we want. I thought a header like this might work:
"UteXmlComuniction/ClientGeneralData/Client/pfPg", "UteXmlComuniction/ClientGeneralData/Client/name",
"UteXmlComuniction/ClientGeneralData/Client/Family", ...
This is our code, but it only produces flat XML. Also, I cannot put the "/" character in the CSV file, because the program does not accept it.
public class XMLCreators {
// Protected Properties
protected DocumentBuilderFactory domFactory = null;
protected DocumentBuilder domBuilder = null;
public XMLCreators() {
try {
domFactory = DocumentBuilderFactory.newInstance();
domBuilder = domFactory.newDocumentBuilder();
} catch (FactoryConfigurationError exp) {
System.err.println(exp.toString());
} catch (ParserConfigurationException exp) {
System.err.println(exp.toString());
} catch (Exception exp) {
System.err.println(exp.toString());
}
}
public int convertFile(String csvFileName, String xmlFileName,
String delimiter) {
int rowsCount = -1;
try {
Document newDoc = domBuilder.newDocument();
// Root element
Element rootElement = newDoc.createElement("XMLCreators");
newDoc.appendChild(rootElement);
// Read csv file
BufferedReader csvReader;
csvReader = new BufferedReader(new FileReader(csvFileName));
int line = 0;
List<String> headers = new ArrayList<String>(5);
String text = null;
while ((text = csvReader.readLine()) != null) {
StringTokenizer st = new StringTokenizer(text, delimiter, false);
String[] rowValues = new String[st.countTokens()];
int index = 0;
while (st.hasMoreTokens()) {
String next = st.nextToken();
rowValues[index++] = next;
}
if (line == 0) { // Header row
for (String col : rowValues) {
headers.add(col);
}
} else { // Data row
rowsCount++;
Element rowElement = newDoc.createElement("row");
rootElement.appendChild(rowElement);
for (int col = 0; col < headers.size(); col++) {
String header = headers.get(col);
String value = null;
if (col < rowValues.length) {
value = rowValues[col];
} else {
// ?? Default value
value = "";
}
Element curElement = newDoc.createElement(header);
curElement.appendChild(newDoc.createTextNode(value));
rowElement.appendChild(curElement);
}
}
line++;
}
ByteArrayOutputStream baos = null;
OutputStreamWriter osw = null;
try {
baos = new ByteArrayOutputStream();
osw = new OutputStreamWriter(baos);
TransformerFactory tranFactory = TransformerFactory.newInstance();
Transformer aTransformer = tranFactory.newTransformer();
aTransformer.setOutputProperty(OutputKeys.INDENT, "yes");
aTransformer.setOutputProperty(OutputKeys.METHOD, "xml");
aTransformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "4");
Source src = new DOMSource(newDoc);
Result result = new StreamResult(osw);
aTransformer.transform(src, result);
osw.flush();
System.out.println(new String(baos.toByteArray()));
} catch (Exception exp) {
exp.printStackTrace();
} finally {
try {
osw.close();
} catch (Exception e) {
}
try {
baos.close();
} catch (Exception e) {
}
}
// Output to console for testing
// Result result = new StreamResult(System.out);
} catch (IOException exp) {
System.err.println(exp.toString());
} catch (Exception exp) {
System.err.println(exp.toString());
}
return rowsCount;
// "XLM Document has been created" + rowsCount;
}
}
Do you have any suggestions on how I should modify the code, or how I could change my CSV, in order to get hierarchical XML?

csv:
pfPg;name;Family;Contract;ContractYear
There are several libraries for reading CSV in Java. Read the values into a container, e.g. a HashMap.
Then create Java classes that represent your XML structure:
class Client {
    private String pfPg;
    private String name;
    private String Family;
}
class ClientGeneralData {
    private Client client;
    private Contract contract;
}
Map the CSV values to your Java classes, either with custom code or with a mapper such as Dozer. Then use XML binding with Jackson or JAXB to create the XML from the Java objects.
Jackson xml
Dozer HowTo
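If a full object model is more than you need, a smaller alternative is to keep the flat CSV header and look up each column's position in the tree from a table in code, which also avoids putting "/" into the CSV. The following is only a minimal sketch under that assumption, not the binding approach described above: the class name HierarchyXmlSketch, the PATHS table, and the ClientGeneralData root element are illustrative choices, and a single data row is handled.

```java
import java.io.StringWriter;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

class HierarchyXmlSketch {
    // Hypothetical table: flat CSV column name -> element path below the root.
    static final Map<String, String> PATHS = new LinkedHashMap<>();
    static {
        PATHS.put("pfPg", "Client/pfPg");
        PATHS.put("name", "Client/name");
        PATHS.put("Family", "Client/Family");
        PATHS.put("Contract", "Contract/ContractDetail/Contract");
        PATHS.put("ContractYear", "Contract/ContractDetail/ContractYear");
    }

    // Returns the direct child element with the given name, creating it if absent.
    static Element childOrCreate(Document doc, Element parent, String name) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE && n.getNodeName().equals(name)) {
                return (Element) n;
            }
        }
        Element created = doc.createElement(name);
        parent.appendChild(created);
        return created;
    }

    // Builds a nested document from one header line and one data line.
    static String convert(String headerLine, String dataLine, String delimiter) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element root = doc.createElement("ClientGeneralData");
        doc.appendChild(root);

        String[] headers = headerLine.split(Pattern.quote(delimiter), -1);
        String[] values = dataLine.split(Pattern.quote(delimiter), -1);
        for (int i = 0; i < headers.length; i++) {
            String path = PATHS.get(headers[i].trim());
            if (path == null) {
                continue; // column without a mapping is skipped
            }
            Element current = root;
            for (String step : path.split("/")) {
                current = childOrCreate(doc, current, step); // walk down, reusing parents
            }
            current.setTextContent(i < values.length ? values[i] : "");
        }

        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(convert("pfPg;name;Family;Contract;ContractYear",
                ";Arnold;Bordon;100020;2019", ";"));
    }
}
```

Because childOrCreate reuses an existing parent, columns that share a prefix such as Client end up under the same element instead of producing one flat element per column.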

Related

How to replace a part of a string in an xml file?

I have an xml file with something like this:
<Verbiage>
The whiskers plots are based on the responses of incarcerated
<Choice>
<Juvenile> juveniles who have committed sexual offenses. </Juvenile>
<Adult> adult sexual offenders. </Adult>
</Choice>
If the respondent is a
<Choice>
<Adult>convicted sexual offender, </Adult>
<Juvenile>juvenile who has sexually offended, </Juvenile>
</Choice>
#his/her_lc# percentile score, which defines #his/her_lc# position
relative to other such offenders, should be taken into account as well as #his/her_lc# T score. Percentile
scores in the top decile (> 90 %ile) of such offenders suggest that the respondent
may be defensive and #his/her_lc# report should be interpreted with this in mind.
</Verbiage>
I am trying to find a way to parse the XML file (I've been using DOM), search for #his/her_lc#, and replace it with "her". I've tried using FileReader, BufferedReader, String.replaceAll, and FileWriter, but those didn't work.
Is there a way I could do this using XPath?
Ultimately I want to search this XML file for this string and replace it with another string.
Do I have to add a tag around the string I want to replace so that it can be parsed that way?
Code I tried:
protected void parse() throws ElementNotValidException {
try {
//Parse xml File
File inputXML = new File("template.xml");
DocumentBuilderFactory parser = DocumentBuilderFactory.newInstance(); // new instance of doc builder
DocumentBuilder dParser = parser.newDocumentBuilder(); // calls it
Document doc = dParser.parse(inputXML); // parses file
FileReader reader = new FileReader(inputXML);
String search = "#his/her_lc#";
String newString;
BufferedReader br = new BufferedReader(reader);
while ((newString = br.readLine()) != null){
newString.replaceAll(search, "her");
}
FileWriter writer = new FileWriter(inputXML);
writer.write(newString);
writer.close();
} catch (ParserConfigurationException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
} catch (SAXException e) {
e.printStackTrace();
}
Code I was given to fix:
try {
File inputXML = new File("template.xml"); // creates new input file
DocumentBuilderFactory parser = DocumentBuilderFactory.newInstance(); // new instance of doc builder
DocumentBuilder dParser = parser.newDocumentBuilder(); // calls it
Document doc = dParser.parse(inputXML); // parses file
doc.getDocumentElement().normalize();
NodeList pList = doc.getElementsByTagName("Verbiage"); // gets element by tag name and places into list to begin parsing
int gender = 1; // gender has to be taken from the response file, it is hard coded for testing purposes
System.out.println("----------------------------"); // new line
// loops through the list of Verbiage tags
for (int temp = 0; temp < pList.getLength(); temp++) {
Node pNode = pList.item(0); // sets node to temp
if (pNode.getNodeType() == Node.ELEMENT_NODE) { // if the node type = the element node
Element eElement = (Element) pNode;
NodeList pronounList = doc.getElementsByTagName("pronoun"); // gets a list of pronoun element tags
if (gender == 0) { // if the gender is male
int count1 = 0;
while (count1 < pronounList.getLength()) {
if ("#he/she_lc#".equals(pronounList.item(count1).getTextContent())) {
pronounList.item(count1).setTextContent("he");
}
if ("#he/she_caps#".equals(pronounList.item(count1).getTextContent())) {
pronounList.item(count1).setTextContent("He");
}
if ("#his/her_lc#".equals(pronounList.item(count1).getTextContent())) {
pronounList.item(count1).setTextContent("his");
}
if ("#his/her_caps#".equals(pronounList.item(count1).getTextContent())) {
pronounList.item(count1).setTextContent("His");
}
if ("#him/her_lc#".equals(pronounList.item(count1).getTextContent())) {
pronounList.item(count1).setTextContent("him");
}
count1++;
}
pNode.getNextSibling();
} else if (gender == 1) { // female
int count = 0;
while (count < pronounList.getLength()) {
if ("#he/she_lc#".equals(pronounList.item(count).getTextContent())) {
pronounList.item(count).setTextContent("she");
}
if ("#he/she_caps3".equals(pronounList.item(count).getTextContent())) {
pronounList.item(count).setTextContent("She");
}
if ("#his/her_lc#".equals(pronounList.item(count).getTextContent())) {
pronounList.item(count).setTextContent("her");
}
if ("#his/her_caps#".equals(pronounList.item(count).getTextContent())) {
pronounList.item(count).setTextContent("Her");
}
if ("#him/her_lc#".equals(pronounList.item(count).getTextContent())) {
pronounList.item(count).setTextContent("her");
}
count++;
}
pNode.getNextSibling();
}
}
}
// write the content to file
TransformerFactory transformerFactory = TransformerFactory.newInstance();
Transformer transformer = transformerFactory.newTransformer();
DOMSource source = new DOMSource(doc);
System.out.println("-----------Modified File-----------");
StreamResult consoleResult = new StreamResult(System.out);
transformer.transform(source, new StreamResult(new FileOutputStream("template.xml"))); // writes changes to file
} catch (Exception e) {
e.printStackTrace();
}
}
I think this code would work if I could figure out how to associate the Pronoun tag with the pronounParser that this code is in.
I used this example and your template.xml, and I think it works.
public static void main(String[] args) {
    File inputXML = new File("template.xml");
    String search = "#his/her_lc#";
    StringBuilder strTotale = new StringBuilder();
    // try-with-resources closes the reader even if reading fails
    try (BufferedReader br = new BufferedReader(new FileReader(inputXML))) {
        String newString;
        while ((newString = br.readLine()) != null) {
            // replaceAll returns a new String, so the result must be reassigned
            newString = newString.replaceAll(search, "her");
            // readLine() strips the line terminator, so add it back
            strTotale.append(newString).append(System.lineSeparator());
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
    System.out.println(strTotale.toString());
}
First, you must reassign the result of replaceAll:
newString = newString.replaceAll(search, "her");
Second, I used a StringBuilder to collect all the lines.
I hope this helps.
Since strings are immutable, you cannot modify them in place; replaceAll returns a new string. Collect the results in a
StringBuilder/StringBuffer
instead of a String.
String search = "#his/her_lc#";
String newString;
StringBuilder str = new StringBuilder();
BufferedReader br = new BufferedReader(new FileReader(inputXML));
while ((newString = br.readLine()) != null) {
    str.append(newString.replaceAll(search, "her")).append(System.lineSeparator());
}
br.close(); // finish reading before the same file is overwritten
FileWriter writer = new FileWriter(inputXML);
writer.write(str.toString());
writer.close();
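On the XPath part of the question: one option is to select every text node with the XPath expression //text() and do the replacement on each node's value, so no extra tag around the placeholder is needed. This is a minimal sketch, assuming the document fits in memory; the class and method names are made up for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class PlaceholderReplacer {
    // Replaces every occurrence of placeholder inside text nodes; markup is untouched.
    static String replaceInTextNodes(String xml, String placeholder, String replacement)
            throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        // Select all text nodes, wherever they sit in the tree.
        NodeList texts = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate("//text()", doc, XPathConstants.NODESET);
        for (int i = 0; i < texts.getLength(); i++) {
            Node text = texts.item(i);
            String value = text.getNodeValue();
            if (value.contains(placeholder)) {
                // String.replace is literal, so no regex escaping is needed
                text.setNodeValue(value.replace(placeholder, replacement));
            }
        }

        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<Verbiage>#his/her_lc# percentile score <Choice><Adult>x</Adult></Choice>"
                + " and #his/her_lc# T score</Verbiage>";
        System.out.println(replaceInTextNodes(xml, "#his/her_lc#", "her"));
    }
}
```

To work on template.xml instead of a string, the same loop can run on the Document returned by DocumentBuilder.parse(File), with a StreamResult that points at the output file.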

Read and Write CSV File using Java

I have a CSV log file and it contains many rows like this:
2016-06-21 12:00:00,000 : helloworld: header1=2;header2=6;header=0
I want to write them to a new CSV file.
public void readLogFile() throws Exception
{
String currentLine = "";
String nextLine = "";
BufferedReader reader = new BufferedReader(new FileReader(file(false)));
while ((currentLine = reader.readLine()) != null)
{
if (currentLine.contains("2016") == true)
{
nextLine = reader.readLine();
if (nextLine.contains("helloworld") == true)
{
currentLine = currentLine.substring(0, 23);
nextLine = nextLine.substring(22, nextLine.length());
String nextBlock = replaceAll(nextLine);
System.out.println(currentLine + " : helloworld: " + nextBlock);
String[] data = nextBlock.split(";");
for (int i = 0, max = data.length; i < max; i++)
{
String[] d = data[i].split("=");
map.put(d[0], d[1]);
}
}
}
}
reader.close();
}
This is my method to write the content:
public void writeContentToCsv() throws Exception
{
FileWriter writer = new FileWriter(".../file_new.csv");
for (Map.Entry<String, String> entry : map.entrySet())
{
writer.append(entry.getKey()).append(";").append(entry.getValue()).append(System.getProperty("line.separator"));
}
writer.close();
}
This is the output I want to have:
header1; header2; header3
2;6;0
1;5;1
5;8;8
...
Currently, the CSV file looks like this (only showing one dataset):
header1;4
header2;0
header3;0
Can anyone help me fix the code?
Create a class that holds the values of one row, and store one instance per row in a list.
Then iterate over the list to write the results.
The map you are using now stores one value per key, so each new row overwrites the previous values for header1, header2, and header3:
map.put(d[0], d[1]);
Here d[0] is the header name (e.g. header1) and d[1] is its value (e.g. 4); for the data rows we only want the value.
class HeaderValues {
    String[] header = new String[3];
}
// Shared by readLogFile() and writeContentToCsv()
private List<HeaderValues> list = new ArrayList<>();
public void readLogFile() throws Exception
{
    String currentLine = "";
    BufferedReader reader = new BufferedReader(new FileReader(file(false)));
    while ((currentLine = reader.readLine()) != null)
    {
        if (currentLine.contains("2016") && currentLine.contains("helloworld"))
        {
            String nextBlock = replaceAll(currentLine.substring(22, currentLine.length()));
            String[] data = nextBlock.split(";");
            HeaderValues headerValues = new HeaderValues();
            // Assuming data.length will always be 3.
            for (int i = 0, max = data.length; i < max; i++)
            {
                String[] d = data[i].split("=");
                // Assuming split will always have size 2
                headerValues.header[i] = d[1];
            }
            list.add(headerValues);
        }
    }
    reader.close();
}
public void writeContentToCsv() throws Exception
{
    String newLine = System.getProperty("line.separator");
    FileWriter writer = new FileWriter(".../file_new.csv");
    // Header row first, then one line per stored row
    writer.append("header1;header2;header3").append(newLine);
    for (HeaderValues value : list)
    {
        writer.append(value.header[0]).append(";").append(value.header[1]).append(";").append(value.header[2]).append(newLine);
    }
    writer.close();
}
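The same idea can be sketched without a fixed column count: parse each "name=value;..." block into an ordered map, collect the header names as they appear, and emit one header row followed by one line per block. The class and method names below are hypothetical, and the input is assumed to be the part of the log line after "helloworld: ".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class LogToCsvSketch {
    // Parses "header1=2;header2=6" into an insertion-ordered map.
    static Map<String, String> parsePairs(String block) {
        Map<String, String> row = new LinkedHashMap<>();
        for (String pair : block.split(";")) {
            String[] kv = pair.split("=", 2);
            if (kv.length == 2) {
                row.put(kv[0].trim(), kv[1].trim());
            }
        }
        return row;
    }

    // Renders one header line plus one value line per block, separated by ';'.
    static String toCsv(List<String> blocks) {
        List<Map<String, String>> rows = new ArrayList<>();
        Set<String> headers = new LinkedHashSet<>();
        for (String block : blocks) {
            Map<String, String> row = parsePairs(block);
            rows.add(row);
            headers.addAll(row.keySet());
        }
        StringBuilder csv = new StringBuilder(String.join(";", headers)).append('\n');
        for (Map<String, String> row : rows) {
            List<String> cells = new ArrayList<>();
            for (String header : headers) {
                cells.add(row.getOrDefault(header, "")); // empty cell if a value is missing
            }
            csv.append(String.join(";", cells)).append('\n');
        }
        return csv.toString();
    }

    public static void main(String[] args) {
        System.out.print(toCsv(Arrays.asList(
                "header1=2;header2=6;header3=0",
                "header1=1;header2=5;header3=1")));
    }
}
```

Keeping one map per row, rather than one map for the whole file, is what prevents later rows from overwriting earlier ones.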
For writing to CSV with Apache Commons CSV (MyRecord and getValue1() etc. are placeholders for your own row class and getters):
// Record separator used in the CSV file
private static final String NEW_LINE_SEPARATOR = "\n";
// CSV file header
private static final Object[] FILE_HEADER = { "Employee Name", "Employee Code", "In Time", "Out Time", "Duration", "Is Working Day" };
public void writeCSV() {
    String fileName = "fileName.csv";
    // Fill this list with the rows to write
    List<MyRecord> objects = new ArrayList<MyRecord>();
    // Create the CSVFormat object with "\n" as the record separator
    CSVFormat csvFileFormat = CSVFormat.DEFAULT.withRecordSeparator(NEW_LINE_SEPARATOR);
    // try-with-resources closes the printer and the underlying writer
    try (CSVPrinter csvFilePrinter = new CSVPrinter(new FileWriter(fileName), csvFileFormat)) {
        csvFilePrinter.printRecord(FILE_HEADER);
        // Write one record per object
        for (MyRecord object : objects) {
            List<String> record = new ArrayList<String>();
            record.add(object.getValue1().toString());
            record.add(object.getValue2().toString());
            record.add(object.getValue3().toString());
            csvFilePrinter.printRecord(record);
        }
    } catch (Exception e) {
        e.printStackTrace();
    }
}
Read a CSV file with org.apache.commons.csv.CSVParser and append the matching rows to another file. CSVWriter is not part of Commons CSV (writeNext(String[]) matches opencsv's CSVWriter); apipath and pathTowritecsv are fields defined elsewhere.
public void appendCSV() {
    try (BufferedReader csvReaders = new BufferedReader(new FileReader("csvfile.csv"));
         CSVParser parser = CSVFormat.DEFAULT.withDelimiter(',').withHeader().parse(csvReaders);
         // open the output once, in append mode
         CSVWriter writer = new CSVWriter(new FileWriter(pathTowritecsv, true))) {
        for (CSVRecord record : parser) {
            if (record.get("Microservice").equalsIgnoreCase(apipath)) {
                // build the output row directly instead of joining with "-" and
                // splitting again, which breaks on values that contain "-"
                String[] records = {
                    record.get("apiName"), record.get("Microservice"), record.get("R_Data"),
                    record.get("Method"), record.get("A_Status"), "400",
                    record.get("A_Response"), "{}"
                };
                writer.writeNext(records);
            }
        }
    } catch (Exception e) {
        System.out.println(e);
    }
}

splitting of csv file based on column value in java

I want to split a CSV file into multiple CSV files depending on a column value.
Structure of the CSV file: Name,Id,Dept,Course
abc,1,CSE,Btech
fgj,2,EE,Btech
(Rows are not terminated by ; at the end)
If the value of Dept is CSE or ME, write the row to file1.csv; if it is ECE or EE, write it to file2.csv; and so on.
Can I use Drools for this purpose? I don't know Drools well.
Any help on how this can be done?
This is what I have done so far:
public void run() {
String csvFile = "C:/csvFiles/file1.csv";
BufferedReader br = null;
BufferedWriter writer=null,writer2=null;
String line = "";
String cvsSplitBy = ",";
String FileName = "C:/csvFiles/file3.csv";
String FileName2 = "C:/csvFiles/file4.csv";
try {
writer = new BufferedWriter(new FileWriter(FileName));
writer2 = new BufferedWriter(new FileWriter(FileName2));
br = new BufferedReader(new FileReader(csvFile));
while ((line = br.readLine()) != null) {
String[] values=line.split(cvsSplitBy);
if(values[2].equals("CSE"))
{
writer.write(line);
}
else if(values[2].equals("ECE"))
{
writer2.write(line);
}
}
} catch (FileNotFoundException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
} finally {
if (br != null) {
try {
br.close();
writer.flush();
writer.close();
writer2.close();
} catch (IOException e) {
e.printStackTrace();
}
}
}
1) First find the index of the split column, using the header row if there is one, or a fixed index otherwise.
2) Then group the rows by the value in that column. The result is a map whose key is the column value and whose value is the list of rows with that value:
Map<String, List<String>> resultMap = new HashMap<>();
void add(String key, String row) {
    // create the list for this key on first use, then append the row
    resultMap.computeIfAbsent(key, k -> new ArrayList<String>()).add(row);
}
Map<String, List<String>> getSplittedMap(List<String> rows, int keyIndex) {
    for (String currentRow : rows) {
        String key = currentRow.split(",")[keyIndex];
        add(key, currentRow);
    }
    return resultMap;
}
Hope this helps.
FileOutputStream f_ECE = new FileOutputStream("provideloaction&filenamehere");
FileOutputStream f_CSE_ME = new FileOutputStream("provideloaction&filenamehere");
FileInputStream fin = new FileInputStream("provideloaction&filenamehere");
int size = fin.available(); // estimate of the file length; fine for small local files
byte b[] = new byte[size];
fin.read(b);
fin.close();
String s = new String(b); // file copied into string
String s1[] = s.split("\n");
for (int i = 0; i < s1.length; i++) {
    String s3[] = s1[i].trim().split(",");
    if (s3.length < 3) continue; // skip blank or malformed rows
    byte[] row = (s1[i].trim() + "\n").getBytes(); // split("\n") removed the line break
    if (s3[2].equals("ECE") || s3[2].equals("EE"))
        f_ECE.write(row);
    if (s3[2].equals("CSE") || s3[2].equals("ME"))
        f_CSE_ME.write(row);
}
f_ECE.close();
f_CSE_ME.close();
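Since several Dept values share one output file, it can help to keep that routing in a lookup table rather than in if statements. The sketch below is one way to do it, under these assumptions: the first line is the header and is repeated in every output, fields contain no quoted commas, and the class name, method name, and file names are only illustrative.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class CsvSplitSketch {
    // Groups data rows by target file; targetByValue maps a column value to a file name.
    static Map<String, List<String>> split(List<String> lines, int keyIndex,
                                           Map<String, String> targetByValue) {
        Map<String, List<String>> rowsByTarget = new LinkedHashMap<>();
        if (lines.isEmpty()) {
            return rowsByTarget;
        }
        String header = lines.get(0);
        for (String line : lines.subList(1, lines.size())) {
            String[] cells = line.split(",", -1);
            if (cells.length <= keyIndex) {
                continue; // malformed row
            }
            String target = targetByValue.get(cells[keyIndex].trim());
            if (target == null) {
                continue; // no file configured for this value
            }
            // start each output with the header row the first time it is used
            rowsByTarget.computeIfAbsent(target, t -> new ArrayList<>(Arrays.asList(header)))
                        .add(line);
        }
        return rowsByTarget;
    }

    public static void main(String[] args) {
        Map<String, String> targets = new LinkedHashMap<>();
        targets.put("CSE", "file1.csv");
        targets.put("ME", "file1.csv");
        targets.put("ECE", "file2.csv");
        targets.put("EE", "file2.csv");
        List<String> lines = Arrays.asList(
                "Name,Id,Dept,Course", "abc,1,CSE,Btech", "fgj,2,EE,Btech");
        System.out.println(split(lines, 2, targets));
    }
}
```

Each entry of the returned map can then be written out with java.nio.file.Files.write(Paths.get(fileName), rows), which writes one list element per line.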

How to parse xml from NOT resource file

My app works with data and saves it in the file [root]/data/data/appName/files/list.xml
I know how to parse the XML, like this:
XmlResourceParser parser = getResources().getXml(R.xml.list);
But because my file is not in the res directory, I need to find another way.
I know how to get my file as a string, like this:
FileInputStream fIn = openFileInput("samplefile.txt");
InputStreamReader isr = new InputStreamReader(fIn);
char[] inputBuffer = new char[TESTSTRING.length()];
isr.read(inputBuffer);
String readString = new String(inputBuffer);
It is important to be able to specify the name of file.
Also, when I save file with:
FileOutputStream fOut = openFileOutput("list1.xml", MODE_WORLD_READABLE);
The compiler flags MODE_WORLD_READABLE with the warning
"This constant was deprecated in API level 17".
But it works. What does this mean for me?
Read the XML file from a path:
public boolean ReadXmlFile(String filePath)
{
try {
String Data="";
File fIN = new File(filePath);
if (fIN.exists())
{
StringBuffer fileData = new StringBuffer(1000);
BufferedReader reader = new BufferedReader(
new FileReader(filePath));
char[] buf = new char[1024];
int numRead=0;
while((numRead=reader.read(buf)) != -1){
String readData = String.valueOf(buf, 0, numRead);
fileData.append(readData);
buf = new char[1024];
}
reader.close();
Data= fileData.toString();
}
else
{
return false;
}
docData = null;
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
try
{
DocumentBuilder db = dbf.newDocumentBuilder();
InputSource is = new InputSource();
is.setCharacterStream(new StringReader(Data));
docData = db.parse(is);
} catch (ParserConfigurationException e) {
return false;
} catch (SAXException e) {
return false;
} catch (IOException e) {
return false;
}
return true;
} catch (Exception e) {
return false;
}
}
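Reading the file into a string first is not strictly required, because DocumentBuilder.parse also accepts an InputStream. A minimal sketch of that shorter route follows; the class and method names are made up, and on Android the stream could come from openFileInput("list.xml") (the call shown in the question) instead of a FileInputStream.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.InputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

class XmlFileLoader {
    // Parses the XML straight from the stream; the parser handles the encoding itself.
    static Document load(InputStream in) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(in);
    }

    // Convenience overload for a file identified by name.
    static Document load(File xmlFile) throws Exception {
        try (InputStream in = new FileInputStream(xmlFile)) {
            return load(in);
        }
    }

    public static void main(String[] args) throws Exception {
        Document doc = load(new File(args[0]));
        System.out.println(doc.getDocumentElement().getNodeName());
    }
}
```

Passing the stream rather than a decoded string also lets the parser honor the encoding declared in the XML prolog.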

lucene indexing of html files

Dear users, I am working with Apache Lucene for indexing and searching.
I have to index HTML files stored on the computer's local disk, by filename and by contents. I am able to store the file names in the Lucene index, but not the HTML file contents; the index should cover not only the text but the entire page, including images, links, and URLs. How can I access the contents of those indexed files?
For indexing I am using the following code:
File indexDir = new File(indexpath);
File dataDir = new File(datapath);
String suffix = ".htm";
IndexWriter indexWriter = new IndexWriter(
FSDirectory.open(indexDir),
new SimpleAnalyzer(),
true,
IndexWriter.MaxFieldLength.LIMITED);
indexWriter.setUseCompoundFile(false);
indexDirectory(indexWriter, dataDir, suffix);
numIndexed = indexWriter.maxDoc();
indexWriter.optimize();
indexWriter.close();
private void indexDirectory(IndexWriter indexWriter, File dataDir, String suffix) throws IOException {
try {
for (File f : dataDir.listFiles()) {
if (f.isDirectory()) {
indexDirectory(indexWriter, f, suffix);
} else {
indexFileWithIndexWriter(indexWriter, f, suffix);
}
}
} catch (Exception ex) {
System.out.println("exception 2 is" + ex);
}
}
private void indexFileWithIndexWriter(IndexWriter indexWriter, File f,
String suffix) throws IOException {
try {
if (f.isHidden() || f.isDirectory() || !f.canRead() || !f.exists()) {
return;
}
if (suffix != null && !f.getName().endsWith(suffix)) {
return;
}
Document doc = new Document();
doc.add(new Field("contents", new FileReader(f)));
doc.add(new Field("filename", f.getFileName(),
Field.Store.YES, Field.Index.ANALYZED));
indexWriter.addDocument(doc);
} catch (Exception ex) {
System.out.println("exception 4 is" + ex);
}
}
thanks in advance
This line of code is the reason why your contents are not being stored:
doc.add(new Field("contents", new FileReader(f)));
This constructor indexes the text read from the Reader but DOES NOT STORE it in the index.
If you are trying to index HTML files, try using JTidy. It will make the process much easier.
Sample code:
public class JTidyHTMLHandler {
public org.apache.lucene.document.Document getDocument(InputStream is) throws DocumentHandlerException {
Tidy tidy = new Tidy();
tidy.setQuiet(true);
tidy.setShowWarnings(false);
org.w3c.dom.Document root = tidy.parseDOM(is, null);
Element rawDoc = root.getDocumentElement();
org.apache.lucene.document.Document doc =
new org.apache.lucene.document.Document();
String body = getBody(rawDoc);
if ((body != null) && (!body.equals(""))) {
doc.add(new Field("contents", body, Field.Store.NO, Field.Index.ANALYZED));
}
return doc;
}
protected String getTitle(Element rawDoc) {
if (rawDoc == null) {
return null;
}
String title = "";
NodeList children = rawDoc.getElementsByTagName("title");
if (children.getLength() > 0) {
Element titleElement = ((Element) children.item(0));
Text text = (Text) titleElement.getFirstChild();
if (text != null) {
title = text.getData();
}
}
return title;
}
protected String getBody(Element rawDoc) {
if (rawDoc == null) {
return null;
}
String body = "";
NodeList children = rawDoc.getElementsByTagName("body");
if (children.getLength() > 0) {
body = getText(children.item(0));
}
return body;
}
protected String getText(Node node) {
NodeList children = node.getChildNodes();
StringBuffer sb = new StringBuffer();
for (int i = 0; i < children.getLength(); i++) {
Node child = children.item(i);
switch (child.getNodeType()) {
case Node.ELEMENT_NODE:
sb.append(getText(child));
sb.append(" ");
break;
case Node.TEXT_NODE:
sb.append(((Text) child).getData());
break;
}
}
return sb.toString();
}
}
To get an InputStream from a URL:
URL url = new URL(htmlURLlocation);
URLConnection connection = url.openConnection();
InputStream stream = connection.getInputStream();
To get an InputStream from a File:
InputStream stream = new FileInputStream(new File (htmlFile));
