Is it possible to parse and then modify a single element in an XML document?
I'm currently writing a script in ruby which needs to modify a value (specified by xpath) in an xml file. I'm currently using the REXML library to do this:
require "rexml/document"
include REXML

xmldocument = Document.new(File.new(filename))
property = XPath.first(xmldocument, "/parent/element/property")
property.text = "New property value"
puts xmldocument
Where the input xml is:
<?xml version="1.0" encoding="UTF-8"?>
<parent>
<element>
<property>Old property value</property>
<verbose />
</element>
...
(more elements here)
...
</parent>
And the output is:
<?xml version='1.0' encoding='UTF-8'?>
<parent>
<element>
<property>New property value</property>
<verbose/>
</element>
...
(more elements here)
...
</parent>
Notice that the output XML is slightly reformatted, so more than my desired change is made. For example, the tag <verbose /> is changed to <verbose/>, and the double quotes in the first line are replaced with single quotes.
What is the best way to modify just a given element of an xml file and leave the rest of the file intact? Ideally, there is a solution for Ruby but I'd love to know the solution in other languages such as Java.
In Java, the Saxon library should accomplish everything you're looking for:
http://sourceforge.net/projects/saxon/
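For comparison, here is a minimal sketch of the same XPath-based edit in Java. It uses only the JDK's built-in javax.xml APIs rather than Saxon, and the class and method names are made up for illustration. Note that a parse-and-reserialize round trip like this one does not promise to keep the untouched parts byte-for-byte identical; the serializer may still normalize quoting and empty-element tags, just as REXML does.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of any library.
class SingleElementEdit {

    // Parses the XML, replaces the text of the first node matching the XPath,
    // and serializes the whole document back to a string.
    static String setText(String xml, String xpath, String newValue) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        Node node = (Node) XPathFactory.newInstance().newXPath()
                .evaluate(xpath, doc, XPathConstants.NODE);
        if (node == null) {
            throw new IllegalArgumentException("No node matches " + xpath);
        }
        node.setTextContent(newValue);

        // Identity transform: DOM tree in, text out.
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<parent><element><property>Old property value</property>"
                + "<verbose /></element></parent>";
        System.out.println(setText(xml, "/parent/element/property", "New property value"));
    }
}
```

If the rest of the file really must stay untouched, a DOM round trip is probably the wrong tool, whichever library does the parsing.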
Related
I am trying to read this XML file using PHP and I have two root elements. The code that I wrote in PHP reads only one root element and when I add the other one (<action>) it gives me an error.
I want to do something like this : if($xml->action=="register") then print all parameters.
This is my XML file:
<?xml version='1.0' encoding='ISO-8859-1'?>
<action>register</action>
<paramters>
<name>Johnny B</name>
<username>John</username>
</paramters>
And this is my PHP script:
<?php
$xml = simplexml_load_file("test.xml");
echo $xml->getName() . "<br />";
foreach($xml->children() as $child)
{
echo $child->getName() . ": " . $child . "<br />";
}
?>
I really don't know how to do all this...
Fix your XML, it's invalid. XML files can only have 1 root element.
Example valid XML:
<?xml version='1.0' encoding='ISO-8859-1'?>
<action>
<type>register</type>
<name>Johnny B</name>
<username>John</username>
</action>
Or, if you want only the parameters to have their own elements:
<?xml version='1.0' encoding='ISO-8859-1'?>
<action type="register">
<name>Johnny B</name>
<username>John</username>
</action>
or if you want multiple actions:
<?xml version='1.0' encoding='ISO-8859-1'?>
<actions>
<action type="register">
<name>Johnny B</name>
<username>John</username>
</action>
</actions>
EDIT:
As I've said in my comment, your teacher should fix his XML. It is invalid. Also he should put his XML through a validator.
If you're really desperate you can introduce an artificial root element, but this is really bad practice and should be avoided at all costs:
$xmlstring = str_replace(
array('<action>','</paramters>'),
array('<root><action>', '</paramters></root>'),
$xmlstring
);
None of the previous answers is quite accurate. The XML specification defines several kinds of entity: document entities, external parsed entities, document type definitions for example. Your example is not a well-formed document entity, which is what XML parsers are normally asked to parse. However, it is a well-formed external parsed entity. The way to process a well-formed external parsed entity is to reference it from a skeletal document entity, like this:
<!DOCTYPE wrapper [
<!ENTITY e SYSTEM "my.xml">
]>
<wrapper>&e;</wrapper>
and then pass the document entity to the XML parser.
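In Java this could look roughly like the sketch below, which uses the JDK's DOM parser. The wrapper document is built in memory and an EntityResolver hands the multi-root content to the parser when it asks for the entity, so the system identifier "file:///my.xml" is only a placeholder and is never opened. The class name is hypothetical, and the sketch assumes the parser is left at its defaults, with DOCTYPE declarations and entity expansion enabled.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of any library.
class EntityWrapper {

    // Skeletal document entity that pulls the real content in through &e;.
    private static final String WRAPPER =
            "<!DOCTYPE wrapper [\n"
          + "<!ENTITY e SYSTEM \"file:///my.xml\">\n"
          + "]>\n"
          + "<wrapper>&e;</wrapper>";

    // Parses a well-formed external parsed entity (possibly with several
    // top-level elements) as the content of a single <wrapper> root.
    static Document parse(String externalEntity) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // Serve the entity text from memory instead of resolving the system identifier.
        builder.setEntityResolver((publicId, systemId) ->
                new InputSource(new StringReader(externalEntity)));
        return builder.parse(new InputSource(new StringReader(WRAPPER)));
    }

    public static void main(String[] args) throws Exception {
        String entity = "<?xml version='1.0' encoding='ISO-8859-1'?>\n"
                + "<action>register</action>\n"
                + "<paramters>\n<name>Johnny B</name>\n<username>John</username>\n</paramters>";
        Document doc = parse(entity);
        System.out.println(doc.getElementsByTagName("action").item(0).getTextContent());
        System.out.println(doc.getElementsByTagName("name").item(0).getTextContent());
    }
}
```

Only do this with input you trust: resolving external entities is exactly what hardened parser configurations switch off.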
As it is an invalid xml file, you can do the following trick.
Insert a dummy start tag at the second line as <dummy>
In the end finish it with </dummy>
Happy parsing ;)
I have the two following xml files:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<pricebooks xmlns="http://www.blablabla.com">
<pricebook>
<header pricebook-id="my-id">
<currency>GBP</currency>
<display-name xml:lang="x-default">display name</display-name>
<description>my description 1</description>
</header>
<price-tables>
<price-table product-id="id1" mode="mode1">
<amount quantity="1">30.00</amount>
</price-table>
<price-table product-id="id2" mode="mode2">
<amount quantity="1">60.00</amount>
</price-table>
</price-tables>
</pricebook>
</pricebooks>
and
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<pricebooks xmlns="http://www.blablabla.com">
<pricebook>
<header pricebook-id="my-id">
<currency>GBP</currency>
<display-name xml:lang="x-default">display name</display-name>
<description>my description 1</description>
</header>
<price-tables>
<price-table product-id="id2" mode="mode2">
<amount quantity="1">60.00</amount>
</price-table>
<price-table product-id="id1" mode="mode1">
<amount quantity="1">30.00</amount>
</price-table>
</price-tables>
</pricebook>
</pricebooks>
I'm trying to compare them while ignoring the order of the price-table elements, so for me the two files are equal. I'm using
<dependency>
<groupId>org.xmlunit</groupId>
<artifactId>xmlunit-core</artifactId>
<version>2.5.0</version>
</dependency>
and the code is the following, but I'm not able to make it work. It complains because the attribute values id1 and id2 are different.
Diff myDiffSimilar = DiffBuilder
.compare(expected)
.withTest(actual)
.checkForSimilar()
.ignoreWhitespace()
.ignoreComments()
.withNodeMatcher(new DefaultNodeMatcher(ElementSelectors.byNameAndText))
.build();
assertFalse(myDiffSimilar.hasDifferences());
I have also tried to edit the nodeMatcher as follows:
.withNodeMatcher(new DefaultNodeMatcher(ElementSelectors.conditionalBuilder()
.whenElementIsNamed("price-tables")
.thenUse(ElementSelectors.byXPath("./price-table", ElementSelectors.byNameAndText))
.elseUse(ElementSelectors.byName)
.build()))
Thank you for your help.
I don't see any nested text inside your price-table elements at all, so byNameAndText matches on element names only - which will be the same for all price-tables and thus not do what you want.
In your example there is no ambiguity for price-tables as there is only one anyway. So the byXPath approach looks wrong. At least in your snippet XMLUnit should do fine with byName except for the price-table elements.
I'm not sure whether product-id alone is what identifies your price-table elements or the combination of all attributes. Either byNameAndAttributes("product-id") or its byNameAndAllAttributes cousin should work.
If it is only product-id then byNameAndAttributes("product-id") becomes byName for all elements that don't have any product-id attribute at all. In this special case byNameAndAttributes("product-id") alone will work for your whole document as we can see it - more or less by accident.
If you need more complex rules for elements other than price-table, or you want to make things more explicit, then
ElementSelectors.conditionalBuilder()
.whenElementIsNamed("price-table")
.thenUse(ElementSelectors.byNameAndAttributes("product-id"))
// more special cases
.elseUse(ElementSelectors.byName)
.build()
is the better choice.
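To make the idea of "identify a price-table by its product-id instead of by its position" concrete, here is a small sketch that does not use XMLUnit at all, only the JDK's DOM parser. It is not how XMLUnit works internally, and it compares far less than a real Diff would (just the mode attribute and the text content of each price-table); the class and method names are hypothetical.

```java
import java.io.StringReader;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical illustration class, unrelated to XMLUnit's own classes.
class PriceTableIndex {

    // Maps each price-table's product-id to "mode|text content",
    // so the order in which the elements appear no longer matters.
    static Map<String, String> index(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // the documents declare a default namespace
        Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        NodeList tables = doc.getElementsByTagNameNS("*", "price-table");
        Map<String, String> byId = new TreeMap<>();
        for (int i = 0; i < tables.getLength(); i++) {
            Element table = (Element) tables.item(i);
            byId.put(table.getAttribute("product-id"),
                    table.getAttribute("mode") + "|" + table.getTextContent().trim());
        }
        return byId;
    }

    // True when both documents hold the same price-tables, in any order.
    static boolean sameTables(String expected, String actual) throws Exception {
        return index(expected).equals(index(actual));
    }

    public static void main(String[] args) throws Exception {
        String head = "<pricebooks xmlns=\"http://www.blablabla.com\"><pricebook><price-tables>";
        String tail = "</price-tables></pricebook></pricebooks>";
        String t1 = "<price-table product-id=\"id1\" mode=\"mode1\">"
                + "<amount quantity=\"1\">30.00</amount></price-table>";
        String t2 = "<price-table product-id=\"id2\" mode=\"mode2\">"
                + "<amount quantity=\"1\">60.00</amount></price-table>";
        System.out.println(sameTables(head + t1 + t2 + tail, head + t2 + t1 + tail));
    }
}
```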
I am having some trouble using XPath to extract the "Payload" values below using apache-camel. I use the XPath below in my route for both of the example XML documents. The first example returns SomeElement and SomeOtherElement as expected, but for the second one the XPath does not seem to match anything at all.
xpath("//Payload/*")
This example xml parses just fine.
<Message>
<Payload>
<SomeElement />
<SomeOtherElement />
</Payload>
</Message>
This example xml does not parse.
<Message xmlns="http://www.fake.com/Message/1">
<Payload>
<SomeElement />
<SomeOtherElement />
</Payload>
</Message>
I found a similar question about xml and xpath, but it deals with C# and is not a camel solution.
Any idea how to solve this using apache-camel?
Your 2nd example XML specifies a default namespace, xmlns="http://www.fake.com/Message/1", so your XPath expression will not match, as it specifies no namespace.
See http://camel.apache.org/xpath.html#XPath-Namespaces on how to specify a namespace.
You would need something like
Namespaces ns = new Namespaces("fk", "http://www.fake.com/Message/1");
xpath("//fk:Payload/*", ns)
I'm not familiar with Apache-Camel, this was just a result of some quick googling.
An alternative may be to just change your XPath to something like
xpath("//*[local-name()='Payload']/*")
Good luck.
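The same behaviour can be reproduced outside Camel with the JDK's own XPath API, which may help when checking an expression before putting it into a route. The sketch below is plain javax.xml.xpath, not Camel code; the "fk" prefix is an arbitrary choice and the class name is hypothetical.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class for trying out expressions.
class NamespacedXPath {
    static final String NS = "http://www.fake.com/Message/1";

    // Returns how many nodes the expression selects in the given document.
    static int count(String xml, String expression) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // without this, namespaces are not tracked
        Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        // Bind the prefix "fk" to the document's default namespace URI.
        xpath.setNamespaceContext(new NamespaceContext() {
            public String getNamespaceURI(String prefix) {
                return "fk".equals(prefix) ? NS : XMLConstants.NULL_NS_URI;
            }
            public String getPrefix(String uri) {
                return NS.equals(uri) ? "fk" : null;
            }
            public Iterator<String> getPrefixes(String uri) {
                return NS.equals(uri)
                        ? Collections.singletonList("fk").iterator()
                        : Collections.<String>emptyIterator();
            }
        });
        return ((NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET)).getLength();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<Message xmlns=\"" + NS + "\"><Payload>"
                + "<SomeElement /><SomeOtherElement /></Payload></Message>";
        System.out.println(count(xml, "//Payload/*"));                      // no namespace given
        System.out.println(count(xml, "//fk:Payload/*"));                   // prefixed
        System.out.println(count(xml, "//*[local-name()='Payload']/*"));    // namespace-agnostic
    }
}
```

The local-name() form is convenient, but it also matches a Payload element from any other namespace, so the prefixed form is the stricter choice.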
I have this XML document:
<?xml version="1.0" encoding="UTF-16"?>
<root>
<items>
<item1>
<tag1>1</tag1>
<tag2>2</tag2>
<tag3>3</tag3>
</item1>
<item2>
<tag1>4</tag1>
<tag2>5</tag2>
<tag3>6</tag3>
</item2>
</items>
</root>
I want to iterate the item elements (item1, item2...), and for each tag get the tag name and after that the value of the tag.
I am using DOM parser.
Any ideas?
Sorry, but this ain't an unsolvable or complicated problem, this is simply reading a tutorial which can be googled within seconds.
And of course, you might also check the documentation, which will give you a hint about this handy method called "getNodeName()".
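For what it's worth, a minimal sketch with the JDK's DOM parser could look like this, assuming Java is the language in use (the question does not say). It walks the children of items, skips the whitespace text nodes between elements, and reads each tag's name with getNodeName() and its value with getTextContent(). The class name is hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of any library.
class ItemWalker {

    // Returns one "item/tag = value" line per tag inside each child of <items>.
    static List<String> describe(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        Node items = doc.getElementsByTagName("items").item(0);

        List<String> lines = new ArrayList<>();
        for (Node item = items.getFirstChild(); item != null; item = item.getNextSibling()) {
            if (item.getNodeType() != Node.ELEMENT_NODE) continue; // skip whitespace text
            for (Node tag = item.getFirstChild(); tag != null; tag = tag.getNextSibling()) {
                if (tag.getNodeType() != Node.ELEMENT_NODE) continue;
                lines.add(item.getNodeName() + "/" + tag.getNodeName()
                        + " = " + tag.getTextContent());
            }
        }
        return lines;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<root><items>"
                + "<item1><tag1>1</tag1><tag2>2</tag2><tag3>3</tag3></item1>"
                + "<item2><tag1>4</tag1><tag2>5</tag2><tag3>6</tag3></item2>"
                + "</items></root>";
        for (String line : describe(xml)) {
            System.out.println(line);
        }
    }
}
```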
I download an XML file, which I generate using PHP, that looks similar to this
<?xml version="1.0" encoding="utf-8" ?>
<customersXML>
...
<customer id="12" name="Me+%26+My+Brother" swid="1" />
...
</customersXML>
Now I need to parse it in Java, but before that I URL-decode it, so the XML becomes this
<?xml version="1.0" encoding="utf-8" ?>
<customersXML>
...
<customer id="12" name="Me & My Brother" swid="1" />
...
</customersXML>
But when I parse the XML-file using SAX, I get a problem with "&". How can I get around this?
The ampersand is a special character in XML (O'Reilly XML: Entities: Handling Special Content) and needs to be encoded. Replace it with &amp; before sending it.
If the XML in question isn't urlencoded in the first place (which it doesn't look like it is), then you shouldn't be urldecoding it. Breaking the xml and then "unbreaking" it really doesn't seem like the best way to go about it. Just use the original xml and parse that.
Never process XML as a string without parsing it, or you are liable to end up with something that is no longer XML. As you have discovered.
You should FIRST parse, THEN url decode.
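A minimal SAX sketch of that order of operations: the downloaded XML is parsed exactly as it arrived, and only the extracted attribute value is passed through URLDecoder. The class and method names are hypothetical, and UTF-8 is assumed as the encoding used when the value was URL-encoded.

```java
import java.io.StringReader;
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of any library.
class DecodeAfterParse {

    // Parses the untouched XML and URL-decodes each customer's name attribute.
    static List<String> customerNames(String xml) throws Exception {
        final List<String> names = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName,
                                     Attributes atts) throws SAXException {
                if (!"customer".equals(qName)) return;
                try {
                    // Decode the value only after the parser has extracted it.
                    names.add(URLDecoder.decode(atts.getValue("name"), "UTF-8"));
                } catch (UnsupportedEncodingException e) {
                    throw new SAXException(e);
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return names;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<?xml version=\"1.0\" encoding=\"utf-8\" ?>"
                + "<customersXML>"
                + "<customer id=\"12\" name=\"Me+%26+My+Brother\" swid=\"1\" />"
                + "</customersXML>";
        System.out.println(customerNames(xml)); // prints [Me & My Brother]
    }
}
```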