Java File Iteration in For loop - java

I have few files saved in my local directory. One video file and one
xml file. Video file details will be stored in xml file.
We are moving videos from one system to another system. Before
uploading the video file and xml data from one system to another
system, need to check for the title of video in the other system and
upload only if the same title doesn't exist.
This is working fine. But uploading is happening 4 times instead of 2
times. Please help.
Here is the main code:
List<File> videoFiles = new ArrayList<File>();
List<File> xmlFiles = new ArrayList<File>();
File[] allVideos = checkVideos();
for(File file:allVideos ) {
if(file.getName().endsWith("flv")) {
videoFiles .add(file);
}
if(file.getName().endsWith("xml")) {
xmlFiles .add(file);
}
}
System.out.println(videoFiles.size());
System.out.println(xmlFiles.size());
processUpload(videoFiles ,xmlFiles);
Here are the methods:
private static void processUpload(List<File> videoFiles, List<File> xmlFiles) throws ParserConfigurationException, SAXException, IOException, ApiException {
NodeList nodes = null;
File video= null;
File xml = null;
String title = null;
String localFileTitle = null;
Media newMedia = null;
for(int i=0;i < videoFiles.size();i++) {
System.out.println("videoFiles.getName() ->"+videoFiles.get(i).getName());
video= videoFiles.get(i);
for(int j=0;j < xmlFiles.size();j++) {
xml = xmlFiles.get(j);
System.out.println("xmlFiles.getName() ->"+xmlFiles.get(i).getName());
nodes = parseXml(xml);
localFileTitle = processNodes(nodes);
title = checkTitles(localFileTitle);
newMedia = initiateUploadProcess(flv, title );
}
}
}
private static NodeList parseXml(File xmlFile) throws ParserConfigurationException, SAXException, IOException {
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(xmlFile);
doc.getDocumentElement().normalize();
//System.out.println("Root element :" + doc.getDocumentElement().getNodeName());
NodeList nList = doc.getElementsByTagName("video");
return nList;
}
private static String processNodes(NodeList nodes) {
String fileTitle = null;
if(nodes.getLength() >= 1) {
for (int temp = 0; temp < nodes.getLength(); temp++) {
Node nNode = nodes.item(temp);
if (nNode.getNodeType() == Node.ELEMENT_NODE) {
Element eElement = (Element) nNode;
fileTitle = eElement.getElementsByTagName("title").item(0).getTextContent();
if(fileTitle != null) {
//System.out.println("Local File Title ------>"+fileTitle);
}
}
}
}
return fileTitle;
}
private static String checkTitles(String localTitle) throws ApiException {
String title = null;
MediaList mediaResponse = fetchVideos(caID);
if(mediaResponse.totalCount >= 1) {
for(MediaEntry media:mediaResponse.objects) {
if(!localTitle.equals(media.name)) {
System.out.println("Titles are not same. Need to return");
title = localTitle;
}
}
}
return title ;
}
private static MediaEntry initiateUploadProcess(File videoFile,
String localFileTitle) throws ApiException, ParserConfigurationException, SAXException, IOException {
UploadToken ktoken = null;
UploadMedia entry = null;
MediaEntry mediaEntry = null;
ktoken = generateToken();
if (ktoken != null) {
//System.out.println("ktoken.id ----------->" + ktoken.id);
if (ktoken.id != null) {
uploadToken(ktoken.id, flvFile);
entry = uploadMediaToChannel(categoryID, categoryName, localFileTitle);
if (entry.id != null) {
System.out.println("entry.id ------->" + entry.id);
mediaEntry = addMediaContent(ktoken.id, entry.id);
}
}
}
return mediaEntry;
}
Here is the output:
videoFiles.getName() ->22701846_91167469.flv
xmlFiles.getName() ->22701846_91167469.xml
Titles are not same. Need to return
Titles are not same. Need to return
video.id ------->0_50wh1m4p
xmlFiles.getName() ->22701846_91167469.xml
Titles are not same. Need to return
Titles are not same. Need to return
video.id ------->0_79v605ue
videoFiles.getName() ->22701846_91477939.flv
xmlFiles.getName() ->22701846_91477939.xml
Titles are not same. Need to return
Titles are not same. Need to return
video.id ------->0_0kihent1
xmlFiles.getName() ->22701846_91477939.xml
Titles are not same. Need to return
Titles are not same. Need to return
Titles are not same. Need to return
video.id ------->0_miogft0i

In pseudo-code, you have something like those two loops there:
for (i in 1,2)
for (j in 1, 2)
...
upload
And you are surprised that you have 4 (2 x 2) uploads?
Try something like:
for (i in 1,2)
... fetch video file info
... fetch xml file info
upload
instead!

Doing the nested for loops is not very efficient, but I think the problem is in your checkVideos() method (not listed in the question) and it is returning duplicate file objects.
Edit: the line where you print the xml file is using the "i" variable (from the outer loop)

Related

Not able to get the specific location of the bookmark in a page using PDFBOX

I am using PDFBOX 2.0.2 jar to add more than one PDF in a existing heading bookmarked PDF file. And for the same i am splitting it and merging other PDF.
Splitter splitter = new Splitter();
splitter.setStartPage(1);
splitter.setEndPage(noOfPagesInHeadingBkmrkedPDF);
Before Split and merge , i am keeping all the bookmark in HashMap with key as pageNumber and value as bookmark name. And after merge i am setting back the bookmark w.r.t. My query is - how to get the specific co-ordinate (location) of bookmark on the page so that the after merge i should be able to set it back to that particular location of the page.
Code snippet for creating the HashMap before Split :
public void getAllBookmarks(PDOutlineNode bookmarksInOriginalFile, String emptyString, Map<Integer, String> bookmarkMap) throws IOException {
PDOutlineItem current = null;
if (null != bookmarksInOriginalFile)
current = bookmarksInOriginalFile.getFirstChild();
while (current != null) {
Integer pageNumber = 0;
PDPageDestination pd = null;
if (current.getDestination() instanceof PDPageDestination) {
pd = (PDPageDestination) current.getDestination();
pageNumber = (pd.retrievePageNumber() + 1); // Do we have any method available to get the location on the specific page ??
}
if (current.getAction() instanceof PDActionGoTo) {
PDActionGoTo gta = (PDActionGoTo) current.getAction();
if (gta.getDestination() instanceof PDPageDestination) {
pd = (PDPageDestination) gta.getDestination();
pageNumber = (pd.retrievePageNumber() + 1);
}
}
String bookmarkName = emptyString + current.getTitle();
if(null!=bookmarkName && !EMPTY_STRING.equalsIgnoreCase(bookmarkName)){
bookmarkMap.put(pageNumber-1,bookmarkName);
}
getAllBookmarks(current, emptyString,bookmarkMap);
current = current.getNextSibling();
}
}
Any help would be much appreciated.
Thank you...
As i am able to solve my solution using #TilmanHausherr Suggestion. I am answering my question. I changed the below piece of code :
public void getAllBookmarks(PDOutlineNode bookmarksInOriginalFile, String emptyString, Map<Integer,BookmarkMetaDataBO> bookmarkMap) throws IOException {
PDOutlineItem current = null;
if (null != bookmarksInOriginalFile)
current = bookmarksInOriginalFile.getFirstChild();
while (current != null) {
Integer pageNumber = 0;
PDPageDestination pd = null;
PDPageXYZDestination pdx = null;
// These value will give the specific location
**int left = 0;
int top = 0;**
if (current.getDestination() instanceof PDPageXYZDestination) {
pdx = (PDPageXYZDestination) current.getDestination();
pageNumber = (pdx.retrievePageNumber() + 1);
**left = pdx.getLeft();
top = pdx.getTop();**
}
if (current.getAction() instanceof PDActionGoTo) {
PDActionGoTo gta = (PDActionGoTo) current.getAction();
if (gta.getDestination() instanceof PDPageDestination) {
pd = (PDPageDestination) gta.getDestination();
pageNumber = (pd.retrievePageNumber() + 1);
}
}
String bookmarkName = emptyString + current.getTitle();
if(null!=bookmarkName && !EMPTY_STRING.equalsIgnoreCase(bookmarkName)){
BookmarkMetaDataBO bkmrkBo = new BookmarkMetaDataBO();
**bkmrkBo.setTop(left);
bkmrkBo.setLeft(top);**
bookmarkMap.put(pageNumber-1,bkmrkBo);
}
getAllBookmarks(current, emptyString,bookmarkMap);
current = current.getNextSibling();
}
}
Thank you...

How can get node child tag from xml with dom (Java)

Took a long time searching and watching videos.
I'm trying to access the course subjects ID
This is my xml code
<list>
<Asignatura>
<id>1</id>
<nombre>Programación</nombre>
<curso>
<id>1</id>
<nombre>1º DAM</nombre>
<listaAsignaturas>
<Asignatura reference="../../.."/>
<Asignatura>
<id>2</id>
<nombre>Bases de datos</nombre>
<curso reference="../../.."/>
<listaAlumnos/>
</Asignatura>
<Asignatura>
<id>3</id>
<nombre>Formación y orientación laboral</nombre>
<curso reference="../../.."/>
<listaAlumnos/>
</Asignatura>
<Asignatura>
<id>4</id>
<nombre>Entornos de desarrollo</nombre>
<curso reference="../../.."/>
<listaAlumnos/>
</Asignatura>
</listaAsignaturas>
</curso>
<listaAlumnos/>
</Asignatura>
</list>
And here my code in java
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
try {
DocumentBuilder builder = factory.newDocumentBuilder();
Document document = builder.parse("./datos/Asignaturas.xml");
document.getDocumentElement().normalize();
NodeList Asignatura = document.getElementsByTagName("Asignatura");
for (int i = 0; i < Asignatura.getLength(); i++) {
Node c = Asignatura.item(i);
if (c.getNodeType() == Node.ELEMENT_NODE) {
Element elemento = (Element) c;
int id = Integer.parseInt(getValorHijo("id", elemento));
String nombre = getValorHijo("nombre", elemento);
//int idCurso = Integer.parseInt(getValorHijo("curso", elemento));
curso = new Curso();
curso.setId(idCurso);
curso = (Curso) FileXMLDAOFactory.getInstance().getCursoDAO().buscar(curso);
}
}
} catch (Exception e) {
e.printStackTrace();
}
I have to say it works, pick up the ID and name of the subject.
But I can not pick up the ID or name of the course subject that is within.
Im not have idea how can get it :(
I tried a slighty different approach with the same xml input file with the following code.
public static void main(String[] args) {
List<Assignatura> assignaturas = new ArrayList<Assignatura>();
List<Curso> cursos = new ArrayList<Curso>();
try{
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder loader = factory.newDocumentBuilder();
Document document = loader.parse("datos.xml");
DocumentTraversal traversal = (DocumentTraversal) document;
NodeIterator iterator = traversal.createNodeIterator(
document.getDocumentElement(), NodeFilter.SHOW_ALL, new ListAsignaturasFilter(), true);
for (Node n = iterator.nextNode(); n != null; n = iterator.nextNode()) {
Element el = (Element) n;
NodeList assignments = el.getElementsByTagName("Asignatura");
for(int i=0; i<assignments.getLength(); i++){
Node currentNode = assignments.item(i);
NodeList childs = currentNode.getChildNodes();
String id = getValorHijo(childs, "id");
String nombre = getValorHijo(childs, "nombre");
if(!id.isEmpty() || !nombre.isEmpty())
assignaturas.add(new Assignatura(id, nombre));
}
}
NodeIterator itCurso = traversal.createNodeIterator(
document.getDocumentElement(), NodeFilter.SHOW_ALL, new CursoFilter(), true);
for (Node n = itCurso.nextNode(); n != null; n = itCurso.nextNode()) {
Element el = (Element) n;
NodeList cursos = el.getChildNodes();
String id = getValorHijo(cursos, "id");
String nombre = getValorHijo(cursos, "nombre");
if(!id.isEmpty() || !nombre.isEmpty())
cursos.add(new Curso(id, nombre));
}
for(Assignatura assignatura : assignaturas){
System.out.println(assignatura);
}
for(Curso curso : cursos){
System.out.println(curso);
}
}catch (Exception e) {
e.printStackTrace();
}
}
private static String getValorHijo(NodeList childs, String data){
String search="";
if(childs.getLength()>0)
for(int j=0; j<childs.getLength(); j++){
if(childs.item(j).getNodeName().equals(data)){
return childs.item(j).getTextContent();
}
}
return search;
}
private static final class ListAsignaturasFilter implements NodeFilter {
public short acceptNode(Node n) {
if (n instanceof Element) {
if (((Element) n).getTagName().equals("listaAsignaturas")) {
return NodeFilter.FILTER_ACCEPT;
}
}
return NodeFilter.FILTER_REJECT;
}
}
private static final class CursoFilter implements NodeFilter {
public short acceptNode(Node n) {
if (n instanceof Element) {
if (((Element) n).getTagName().equals("curso")) {
return NodeFilter.FILTER_ACCEPT;
}
}
return NodeFilter.FILTER_REJECT;
}
}
As you can see I created two lists to memorize "Assignatura" an "Curso" object which are simple POJO with two attributes id and nombre. At the end of the main method I display the content of these lists and I've got the id and nombre information of element "Curso" besides all "Asignatura" elements. Hope this code help you !
Just for the record. The use of DOM was compulsory ? Because, it would have been easier to use JAXB.

Get text from xml - Android

I have this xml online http://64.182.231.116/~spencerf/test.xml
And I am trying to get the two text values Assorted Cereal and Yogurt Parfait (2). Here is how I am currently parsing it, and I get the values I want as well as all the values under then, all the numbers and such, but I just want to get the names, and I am struggling how to just do that, any help or guidance would be great. Here is my code:
String currentDay = "";
String currentMeal = "";
String counter = "";
String icon1 = "";
LinkedHashMap<String, List<String>> itemsByCounter = new LinkedHashMap<String , List<String>>();
List<String> items = new ArrayList<String>();
while (eventType != XmlResourceParser.END_DOCUMENT) {
String tagName = xmlData.getName();
switch (eventType) {
case XmlResourceParser.START_TAG:
if (tagName.equalsIgnoreCase("day")) {
currentDay = xmlData.getAttributeValue(null, "name");
}
if (tagName.equalsIgnoreCase("meal")) {
currentMeal = xmlData.getAttributeValue(null, "name");
}
if (tagName.equalsIgnoreCase("counter") && currentDay.equalsIgnoreCase(day) && currentMeal.equalsIgnoreCase(meal)) {
counter = xmlData.getAttributeValue(null, "name");
}
if (tagName.equalsIgnoreCase("name") && counter != null && currentDay.equalsIgnoreCase(day) && currentMeal.equalsIgnoreCase(meal)) {
icon1 = xmlData.getAttributeValue(null, "icon1");
Log.i(TAG, "icon1: " + icon1);
}
break;
case XmlResourceParser.TEXT:
if (currentDay.equalsIgnoreCase(day) && currentMeal.equalsIgnoreCase(meal) && counter !=(null)) {
if (xmlData.getText().trim().length() > 0) {
//Here gets everything but I just want 2 names
Log.i(TAG, "data: " + xmlData.getText());
items.add(xmlData.getText().trim().replaceAll(" +", " "));
}
}
break;
case XmlPullParser.END_TAG:
if (tagName.equalsIgnoreCase("counter")) {
if (items.size() > 0) {
itemsByCounter.put(counter, items);
items = new ArrayList<String>();
recordsFound++;
}
}
break;
}
eventType = xmlData.next();
So as you can see in the comment in my code I am getting everything under the name tag, back but I just want the value of the name, and not all the other stuff.
You will need to store the name in its own child element (meaning put an end tag before the nutritional facts). Under each dish, you could have this:
<name>Assorted Cereal</name>
<nutrition_facts> ... </nutrition_facts>
Not tested but could do it along these lines:
List<Nutrition_Facts> nutrition_facts = new ArrayList<Nutrition_Facts>();
XMLDOMParser parser = new XMLDOMParser();
AssetManager manager = context.getAssets();
InputStream stream;
try {
stream = manager.open("test.xml"); //need full path to your file here - mine is stored in assets folder
Document doc = parser.getDocument(stream);
}catch(IOException ex){
System.out.printf("Error reading map %s\n", ex.getMessage());
}
NodeList nodeList = doc.getElementsByTagName("nutrition_facts");
for (int i = 0; i < nodeList.getLength(); i++) {
Element e = (Element) nodeList.item(i);
String name;
if (elementName.equals(e.getAttribute("Assorted Cereal"))){
name = e.getAttribute("name");
//do some stuff
}
}
//XMLDOMParser Class
public class XMLDOMParser {
//Returns the entire XML document
public Document getDocument(InputStream inputStream) {
Document document = null;
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
try {
DocumentBuilder db = factory.newDocumentBuilder();
InputSource inputSource = new InputSource(inputStream);
document = db.parse(inputSource);
} catch (ParserConfigurationException e) {
Log.e("Error: ", e.getMessage());
return null;
} catch (SAXException e) {
Log.e("Error: ", e.getMessage());
return null;
} catch (IOException e) {
Log.e("Error: ", e.getMessage());
return null;
}
return document;
}
/*
* I take a XML element and the tag name, look for the tag and get
* the text content i.e for <employee><name>Kumar</name></employee>
* XML snippet if the Element points to employee node and tagName
* is name I will return Kumar. Calls the private method
* getTextNodeValue(node) which returns the text value, say in our
* example Kumar. */
public String getValue(Element item, String name) {
NodeList nodes = item.getElementsByTagName(name);
return this.getTextNodeValue(nodes.item(0));
}
private final String getTextNodeValue(Node node) {
Node child;
if (node != null) {
if (node.hasChildNodes()) {
child = node.getFirstChild();
while(child != null) {
if (child.getNodeType() == Node.TEXT_NODE) {
return child.getNodeValue();
}
child = child.getNextSibling();
}
}
}
return "";
}
}

getChildNodes() on org.w3c.dom.Node returns NodeLits of only nulls

I'm trying to parse a configuration file, but when I call getChildNodes() on the node excludes, the returned NodeList contains 7 null values and nothing else.
My code:
public class MinNull {
private static final List<String> excludes = new ArrayList<>();
public static void main(String[] args) throws ParserConfigurationException, SAXException, IOException {
File cfgFile = new File("ver.cfg");
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
Document doc = db.parse(cfgFile);
Element rootElement = doc.getDocumentElement();
NodeList cfg = rootElement.getChildNodes();
parseConfig(cfg);
}
private static void parseConfig(NodeList cfg) {
for (int i = 0; i < cfg.getLength(); i++) {
Node node = cfg.item(i);
String name = node.getNodeName();
switch (name) {
case "excludes":
NodeList exc = node.getChildNodes();
for(int j = 0; j < exc.getLength();j++){
Node ex = exc.item(i);
System.out.println(ex);
if(ex != null && ex.getNodeName().equals("file")){
excludes.add(ex.getTextContent());
}
}
break;
}
}
}
}
My xml file (named ver.cfg) :
<?xml version="1.0" encoding="UTF-8"?>
<root>
<server>localhost</server>
<port>6645</port>
<project>MyApp</project>
<mainfile>MyApp.jar</mainfile>
<version>v1</version>
<excludes>
<file>Launcher.jar</file>
<file>Launch.bat</file>
<file>ver.cfg</file>
</excludes>
</root>
Output:
null
null
null
null
null
null
null
Everything else is working correctly.
You use the wrong index variable in
Node ex = exc.item(i);
you should have used j here.
At that point i = 5, and child node 5 does not exists, so you are getting the null indicating that that node does not exists.

Parsing dblp.xml with java DOM/SAX

I am trying to parse dblp.xml in java to get the author names/title/year etc, but since the file is huge (860MB), I cannot use DOM/SAX on the complete file.
So I split the file into multiple small files of around 100MB each.
Now each file contains various (thousands of) nodes like this:
<dblp>
<inproceedings mdate="2011-06-23" key="conf/aime/BianchiD95">
<author>Nadia Bianchi</author>
<author>Claudia Diamantini</author>
<title>Integration of Neural Networks and Rule Based Systems in the Interpretation of Liver Biopsy Images.</title>
<pages>367-378</pages>
<year>1995</year>
<crossref>conf/aime/1995</crossref>
<booktitle>AIME</booktitle>
<url>db/conf/aime/aime1995.html#BianchiD95</url>
<ee>http://dx.doi.org/10.1007/3-540-60025-6_152</ee>
</inproceedings>
</dblp>
100MB should be readable in DOM, I am assuming, but the code stops after roughly 45k lines. Here is the java code I am using:
#SuppressWarnings({"unchecked", "null"})
public List<dblpModel> readConfigDOM(String configFile) {
List<dblpModel> items = new ArrayList<dblpModel>();
List<String> strList = null;
dblpModel item = null;
try {
File fXmlFile = new File(configFile);
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(fXmlFile);
doc.getDocumentElement().normalize();
NodeList nList = doc.getElementsByTagName("incollection");
for (int temp = 0; temp < nList.getLength(); temp++) {
item = new dblpModel();
strList = new ArrayList<String>();
Node nNode = nList.item(temp);
if (nNode.getNodeType() == Node.ELEMENT_NODE) {
Element eElement = (Element) nNode;
strList = getTagValueString("title", eElement);
System.out.println(strList.get(0).toString());
strList = getTagValueString("author", eElement);
System.out.println("Author : " + strList.size());
for(String s: strList) {
System.out.println(s);
}
}
items.add(item);
}
} catch (Exception e) {
e.printStackTrace();
}
return items;
}
private static String getTagValueString(String sTag, Element eElement) {
String temp = "";
StringBuffer concatTestSb = new StringBuffer();
List<String> strList = new ArrayList<String>();
int len = eElement.getElementsByTagName(sTag).getLength();
try {
for (int i = 0; i < len; i++) {
NodeList nl = eElement.getElementsByTagName(sTag).item(i).getChildNodes();
if (nl.getLength() > 1) {
for (int j = 0; j < nl.getLength(); j++) {
concatTestSb.append(nl.item(j).getTextContent());
}
} else {
temp = nl.item(0).getNodeValue();
concatTestSb.append(temp);
if (len > 1) {
concatTestSb.append("*");
}
}
}
} catch (Exception e) {
e.printStackTrace();
}
return concatTestSb.toString();
}
Any help? I have tried using STAX api for parsing large documents also, but that also
If you goal is to just get the details out, the just use a BufferedReader to read the file as a text file. If you want, throw in some regex.
if using mysql is an option, you may be able to get it to do the heavy lifting through it's XML Functions
Hope this helps.
Don't fuss too much about the xml format. It is not terribly useful anyway. Just read it as text file and parse the lines as string. You can then export the data to a csv and use it the way you want from that point.
Unfortunately xml is not very efficient for large documents. I did something similar here for a research project:
http://qualityofdata.com/2011/03/27/dblp-for-sql-server/

Categories