Below I try to look up an element and append a child to it, but what is wrong with it?
// Document doc;
Element cust = doc.createElement("cust");
cust.appendChild(doc.createTextNode("anyname"));
org.w3c.dom.Node custmers = doc.getElementsByTagName("custmers").item(0);
custmers.appendChild(cust);
If you want to avoid navigating to the specific element, then try using xpath.
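As a rough sketch of the XPath suggestion, the snippet below looks up the first custmers element and only appends when it was found, which avoids a NullPointerException if the tag name does not match anything. The class name AppendWithXPath, the helper parse, and the sample <shop> document are assumptions made up for illustration; only the tag names cust and custmers come from the question.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class AppendWithXPath {
    // Hypothetical helper: parses an XML string into a DOM Document.
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Appends <cust>name</cust> to the first <custmers> element.
    // Returns false (and changes nothing) when no such element exists.
    static boolean appendCustomer(Document doc, String name) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        Node parent = (Node) xpath.evaluate("(//custmers)[1]", doc, XPathConstants.NODE);
        if (parent == null) {
            return false;
        }
        Element cust = doc.createElement("cust");
        cust.appendChild(doc.createTextNode(name));
        parent.appendChild(cust);
        return true;
    }

    public static void main(String[] args) throws Exception {
        // Sample document invented for this sketch
        Document doc = parse("<shop><custmers/></shop>");
        System.out.println(appendCustomer(doc, "anyname"));
        System.out.println(doc.getElementsByTagName("cust").getLength());
    }
}
```

If appendCustomer returns false, the most likely cause is a tag name in the expression that does not match the document exactly (XML names are case-sensitive).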
Consider that I have an XML file like the one below.
<top>
<CRAWL>
<NAME>div[class=name],attr=0</NAME>
<PRICE>span[class~=(?i)(price-new|price-old)],attr=0</PRICE>
<DESC>div[class~=(?i)(sttl dyn|bin)],attr=0</DESC>
<PROD_IMG>div[class=image]>a>img,attr=src</PROD_IMG>
<URL>div[class=name]>a,attr=href</URL>
</CRAWL>
<CRAWL>
<NAME>img[class=img],attr=alt</NAME>
<PRICE>div[class=g-b],attr=0</PRICE>
<DESC>div[class~=(?i)(sttl dyn|bin)],attr=0</DESC>
<PROD_IMG>img[itemprop=image],attr=src</PROD_IMG>
<URL>a[class=img],attr=href</URL>
</CRAWL>
</top>
What I want is to first take all the values under one <CRAWL> and, after finishing that operation, go to the next <CRAWL> and repeat, even if there are more than two <CRAWL> tags. I have managed to get the values when just one <CRAWL> is available. I use the values inside the tags for some other function: each <CRAWL> holds a different set of values, and I use those values for different operations. Everything else is fine; I just don't know how to loop over the entries when fetching from the XML file.
regards
If I'm understanding this correctly, you're trying to extract data from ALL tags that exist within your XML fragment. There are multiple solutions to this. I'm listing them below:
XPath: If you know exactly what your XML structure is, you can employ XPath for each node=CRAWL to find data within tags:
// Instantiate XPath variable
XPath xpath = XPathFactory.newInstance().newXPath();
// Define the exact XPath expressions you want to get data for:
XPathExpression name = xpath.compile("//top/CRAWL/NAME/text()");
XPathExpression price = xpath.compile("//top/CRAWL/PRICE/text()");
XPathExpression desc = xpath.compile("//top/CRAWL/DESC/text()");
XPathExpression prod_img = xpath.compile("//top/CRAWL/PROD_IMG/text()");
XPathExpression url = xpath.compile("//top/CRAWL/URL/text()");
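At this point, each of the variables above holds a compiled expression, not the data itself. Evaluate each one against the parsed document (with XPathConstants.NODESET as the return type) to get the matching nodes from all CRAWL elements. You could then drop the values into one array per tag, so that each array holds that tag's data across all elements.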
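A minimal sketch of evaluating XPath one CRAWL at a time, which keeps each entry's values together instead of in one array per tag. The class name CrawlXPathSketch, the method readCrawls, and the choice of a list of maps are assumptions for illustration; the tag names are the ones from the question.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class CrawlXPathSketch {
    static final String[] FIELDS = {"NAME", "PRICE", "DESC", "PROD_IMG", "URL"};

    // Returns one map per <CRAWL>, keyed by child tag name.
    static List<Map<String, String>> readCrawls(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        XPath xpath = XPathFactory.newInstance().newXPath();

        // Select every CRAWL element under the root
        NodeList crawls = (NodeList) xpath.evaluate("/top/CRAWL", doc, XPathConstants.NODESET);

        List<Map<String, String>> result = new ArrayList<>();
        for (int i = 0; i < crawls.getLength(); i++) {
            Node crawl = crawls.item(i);
            Map<String, String> values = new LinkedHashMap<>();
            for (String field : FIELDS) {
                // Relative expression: evaluated against this CRAWL only.
                // A missing child yields an empty string.
                values.put(field, xpath.evaluate(field, crawl));
            }
            result.add(values);
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<top>"
                + "<CRAWL><NAME>div[class=name],attr=0</NAME><URL>div[class=name]>a,attr=href</URL></CRAWL>"
                + "<CRAWL><NAME>img[class=img],attr=alt</NAME><URL>a[class=img],attr=href</URL></CRAWL>"
                + "</top>";
        for (Map<String, String> crawl : readCrawls(xml)) {
            System.out.println(crawl);
        }
    }
}
```

Each pass of the outer loop corresponds to one complete CRAWL entry, so the per-entry processing can go where the map is added to the result.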
The other (more efficient) solution is to walk the document with DOM-based parsing:
// Instantiate the doc builder
DocumentBuilderFactory domFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder xmlDocBuilder = domFactory.newDocumentBuilder();
Document xmlDoc = xmlDocBuilder.parse("xmlFile.xml");
// Create NodeList of element tag "CRAWL"
NodeList crawlNodeList = xmlDoc.getElementsByTagName("CRAWL");
// NodeList is not Iterable, so iterate by index over each CRAWL and get
// the values of each of the elements in Name, Price, Desc etc.
for (int i = 0; i < crawlNodeList.getLength(); i++) {
    // getChildNodes() returns a NodeList, not a NamedNodeMap
    NodeList children = crawlNodeList.item(i).getChildNodes();
    for (int j = 0; j < children.getLength(); j++) {
        Node child = children.item(j);
        // Skip the whitespace text nodes between the elements
        if (child.getNodeType() != Node.ELEMENT_NODE) continue;
        String tag = child.getNodeName();       // e.g. NAME, PRICE, DESC
        String value = child.getTextContent();
        // Do something with these values
    }
}
Hope this helps!
There is this element which has child elements, those child elements again have child elements and so on. I would like to get all elements that are descendants of the element. Thanks.
Try this one:
(Java)
List<WebElement> childs = rootWebElement.findElements(By.xpath(".//*"));
(C#)
IReadOnlyList<IWebElement> childs = rootWebElement.FindElements(By.XPath(".//*"));
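Try this one:
List<WebElement> allDescendantsChilds = rootWebElement.findElements(By.xpath("//tr[@class='parent']//*"));
This gives you all descendant elements (not only the immediate children) of the parent tr. Note that an XPath starting with // searches the whole document rather than only below rootWebElement; start it with .// to stay inside that element.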
Or use a CSS selector, since By.tagName expects a plain tag name rather than an XPath expression:
List<WebElement> childs = rootWebElement.findElements(By.cssSelector("*"));
I am trying to get all the info contained in the div with class bg_block_info, but I also get the info for another div, <div class="bg_block_info pad_20">. Why am I getting the wrong result?
Document doc = Jsoup.connect("http://www.maib.md").get();
Elements myin = doc.getElementsByClass("bg_block_info");
You can combine and chain selectors to refine your query, e.g.:
Document doc = Jsoup.connect("http://www.maib.md/").get();
Elements els = doc.getElementsByClass("bg_block_info").not(".pad_10").not(".pad_20");
That element has two classes (notice the space between bg_block_info and pad_20):
<div class="bg_block_info pad_20">
So it does have the class bg_block_info and your code is working as expected.
Elements downloadLinks = dContent.select("a[href]");
Elements pdfLinks = downloadLinks.select("a[data-format$=pdf]");
Full reference jsoup selector syntax
In your case you could probably use Element content = doc.getElementById("pollsstart"); instead of Elements myin = doc.getElementsByClass("bg_block_info");.
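To match the div that has both classes, put a dot between bg_block_info and pad_20 in a CSS selector (getElementsByClass takes a single class name, not a selector). It should look like this:
Elements myin = doc.select("div.bg_block_info.pad_20");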
I have an XML structure as follows:
<rurl modify="0" children="yes" index="8" name="R-URL">
<status>enabled</status>
<rurl-link priority="3">http</rurl-link>
<rurl-link priority="5">http://localhost:80</rurl-link>
<rurl-link priority="4">abc</rurl-link>
<rurl-link priority="3">b</rurl-link>
<rurl-link priority="2">a</rurl-link>
<rurl-link priority="1">newlinkkkkkkk</rurl-link>
</rurl>
Now I want to remove the child node whose text is equal to http. Currently I am using this code:
while (subchilditr.hasNext()) {
    Element subchild = (Element) subchilditr.next();
    if (subchild.getText().equalsIgnoreCase(text)) {
        message = subchild.getText();
        update = "Success";
        subchild.removeAttribute("priority");
        subchild.removeContent();
    }
}
But it is not completely removing the sub element from xml file. It leaves me with
<rurl-link/>
Any suggestions?
You'll need to do this:
List<Element> elements = new ArrayList<Element>();
while (subchilditr.hasNext()) {
Element subchild = (Element) subchilditr.next();
if (subchild.getText().equalsIgnoreCase(text)) {
elements.add(subchild);
}
}
for (Element element : elements) {
element.getParent().removeContent(element);
}
If you try to remove an element inside of the loop you'll get a ConcurrentModificationException.
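To illustrate the failure mode with plain java.util collections (not JDOM itself, which is an external library), the sketch below shows a removal through the list during a for-each loop failing fast, next to the Iterator.remove() form that is permitted. The class and method names are invented for this example, and whether a given JDOM iterator supports remove() is something to verify against its documentation.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

public class SafeRemovalSketch {
    // Removes matches through the iterator itself, which is allowed mid-iteration.
    static List<String> removeMatching(List<String> items, String text) {
        List<String> copy = new ArrayList<>(items);
        Iterator<String> it = copy.iterator();
        while (it.hasNext()) {
            if (it.next().equalsIgnoreCase(text)) {
                it.remove();
            }
        }
        return copy;
    }

    // Returns true if removing through the list while a for-each loop is
    // iterating it was detected and rejected with an exception.
    static boolean removalDuringForEachFails(List<String> items, String text) {
        List<String> copy = new ArrayList<>(items);
        try {
            for (String item : copy) {
                if (item.equalsIgnoreCase(text)) {
                    copy.remove(item); // modifies the list behind the iterator's back
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> links = List.of("http", "http://localhost:80", "abc", "b", "a");
        System.out.println(removalDuringForEachFails(links, "http"));
        System.out.println(removeMatching(links, "http"));
    }
}
```

Collecting the matches first and removing them afterwards, as in the answer above, sidesteps the problem without relying on the iterator supporting remove().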
If you have the parent element rurl you can remove its children using the method removeChild or removeChildren.
Use removeChild()
http://download.oracle.com/javase/1.5.0/docs/api/org/w3c/dom/Node.html#removeChild(org.w3c.dom.Node)
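For the org.w3c.dom API that this link points to (the code in the question looks like JDOM, so this is the same idea in a different API), a sketch of removing matching rurl-link elements could look like the following. The class name RemoveChildSketch and the helpers parse and removeLinks are assumptions for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class RemoveChildSketch {
    // Hypothetical helper: parses an XML string into a DOM Document.
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Removes every <rurl-link> whose text equals the given value
    // (ignoring case) and returns how many were removed.
    static int removeLinks(Document doc, String text) {
        NodeList links = doc.getElementsByTagName("rurl-link");
        // Collect first: the NodeList is live and shrinks as nodes are removed
        List<Node> toRemove = new ArrayList<>();
        for (int i = 0; i < links.getLength(); i++) {
            Node link = links.item(i);
            if (link.getTextContent().equalsIgnoreCase(text)) {
                toRemove.add(link);
            }
        }
        for (Node link : toRemove) {
            // Detach the whole element from its parent, not just its content
            link.getParentNode().removeChild(link);
        }
        return toRemove.size();
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<rurl><status>enabled</status>"
                + "<rurl-link priority=\"3\">http</rurl-link>"
                + "<rurl-link priority=\"2\">a</rurl-link></rurl>");
        System.out.println(removeLinks(doc, "http"));
        System.out.println(doc.getElementsByTagName("rurl-link").getLength());
    }
}
```

Removing the node from its parent is what makes the element disappear entirely; clearing its attributes and content leaves the empty tag behind.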
<amount currency="USD">1000500</amount>
While parsing the string above I get only the attribute value. When I try to get the node value, I get a NullPointerException.
To get the node value I am using:
NodeList amountList= estimateElement.getElementsByTagName("amount");
Element amtElement= (Element)amountList.item(0);
String amount = amtElement.getFirstChild().getNodeValue();
Thanks in advance
Aswan
Please try this; I believe it should work:
NodeList list = estimateElement.getElementsByTagName("amount").item(0).getChildNodes();
Node node = (Node) list.item(0);
String value = node.getNodeValue();
Source : DOM parser
Element amtElement= (Element)amountList.item(0);
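seems to be your element, so why are you calling getFirstChild()?
Try this:
String amount = amtElement.getNodeValue();
Note, though, that for an Element node getNodeValue() is defined to return null, so getTextContent() is the call that returns the element's text.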
Have you checked out JDOM? It has nice documentation and is easy to use.
Try using the getTextContent() method:
NodeList amountList= estimateElement.getElementsByTagName("amount");
Element amtElement= (Element)amountList.item(0);
String amount=amtElement.getTextContent();
See here for more info.
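To make the difference between these calls concrete, here is a small sketch that reads both the attribute and the text of the amount element. The class name AmountTextSketch, the helper parseAmount, and the wrapping <estimate> element are assumptions for illustration; only the amount element itself comes from the question.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class AmountTextSketch {
    // Parses the XML and returns the first <amount> element, or null if absent.
    static Element parseAmount(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        NodeList amountList = doc.getElementsByTagName("amount");
        // item(0) returns null for an empty list, so guard before casting/using
        return amountList.getLength() == 0 ? null : (Element) amountList.item(0);
    }

    public static void main(String[] args) throws Exception {
        Element amt = parseAmount(
                "<estimate><amount currency=\"USD\">1000500</amount></estimate>");
        if (amt != null) {
            // Attribute value
            System.out.println(amt.getAttribute("currency"));
            // Text of the element and its descendants
            System.out.println(amt.getTextContent());
            // Always null for Element nodes per the DOM specification
            System.out.println(amt.getNodeValue());
            // The child text node carries the value; getFirstChild() is null
            // when the element is empty, which would cause a NullPointerException
            System.out.println(amt.getFirstChild().getNodeValue());
        }
    }
}
```

Checking the list length (or the result of item(0)) before dereferencing is a cheap way to rule out the case where the tag simply was not found.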