I have an object which I write to an XML file. The data contains special characters like "&" and "<". Before I process this XML, I want a utility to escape these special characters so that the resulting XML has "&amp;" for "&" and "&lt;" for "<". I tried StringUtils, XMLWriter and a few more, but they also convert the "<" in the opening and closing tags, which I don't want. I only want the "<" inside the values to be replaced. Please help.
Example:
I have this input XML:
<?xml version="1.0" encoding="UTF-8"?>
<personName><firstName>Sam & Pat </firstName>
<sal> > than 10000 </sal>
</personName>
And the expected XML should be:
<?xml version="1.0" encoding="UTF-8"?>
<personName><firstName>Sam &amp; Pat </firstName>
<sal> &lt; than 10000 </sal>
</personName>
If I use StringUtils, it converts all the "<" characters, like this:
&lt;sal&gt; &lt; than 10000 &lt;/sal&gt;
EDIT: I can't actually use JAXB. I am using a FreeMarker template to do this. Here is the code:
File tempFile = File.createTempFile(fileName, ".tmp");
try (FileWriter writer = new FileWriter(tempFile)) {
freeMarkerConfig.setOutputEncoding(UTF_8);
Template template = freeMarkerConfig.getTemplate(templateName);
template.process(data, writer);
}
The resulting file should have the special characters escaped.
You can also use Apache Commons Lang Library for escaping the characters:
Example:
String escapeString1 = "Sam & Pat ";
System.out.println("Escaped : " + StringEscapeUtils.escapeXml11(escapeString1));
String escapeString2 = " > than 10000";
System.out.println("Escaped : " + StringEscapeUtils.escapeXml11(escapeString2));
Output:
Escaped : Sam &amp; Pat
Escaped : &gt; than 10000
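If adding Commons Lang is not an option, the same idea can be sketched with the JDK alone: escape each value before it goes into the document, and leave the markup untouched. The class and method names below are made up for illustration, and unlike escapeXml11 this sketch does not strip characters that are illegal in XML 1.1.

```java
public class XmlEscape {
    /** Escapes the five predefined XML entities in a text or attribute value. */
    public static String escape(String value) {
        StringBuilder out = new StringBuilder(value.length() + 16);
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&apos;"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Only the values are escaped; the tags are written as-is.
        System.out.println("<firstName>" + escape("Sam & Pat") + "</firstName>");
        System.out.println("<sal>" + escape("< than 10000") + "</sal>");
    }
}
```

The point is to call this on the data (for example on the fields of the model passed to the template), never on the finished XML string, because at that stage the tags and the values can no longer be told apart.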
You can use JAXB for the XML generation. Annotate your model class with @XmlRootElement.
Then you can use JAXB to marshal the object to XML:
try {
JAXBContext context = JAXBContext.newInstance(Person.class);
Marshaller m = context.createMarshaller();
m.setProperty(Marshaller.JAXB_FORMATTED_OUTPUT, true);
Person object = new Person();
object.setPersonName("Sam & Pat");
object.setSal("> than 10000");
m.marshal(object, System.out);
} catch (JAXBException e) {
e.printStackTrace();
}
The output will be
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<person>
<personName>Sam &amp; Pat</personName>
<sal>&gt; than 10000</sal>
</person>
Using CDATA will fix your problem.
Such as <![CDATA[abc]]>
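One thing to watch with CDATA: the text itself must not contain "]]>", because that sequence ends the section. A small sketch of a wrapper that handles this by splitting the sequence across two sections (the class and method names are hypothetical, not from any library):

```java
public class CdataWrap {
    /**
     * Wraps text in a CDATA section. Any "]]>" inside the text is split
     * across two adjacent sections so it cannot close the section early.
     */
    public static String wrap(String text) {
        return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>";
    }

    public static void main(String[] args) {
        System.out.println("<firstName>" + wrap("Sam & Pat") + "</firstName>");
        System.out.println("<sal>" + wrap("< than 10000") + "</sal>");
    }
}
```

Note that this keeps the characters literally in the file instead of turning them into "&amp;" and "&lt;", so it only fits if whoever reads the XML afterwards accepts CDATA sections.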
You can include XML special characters in XPL. XPL has exactly the same structure as XML, but allows the special characters in text fields. http://hll.nu/
Hi, I found the Apache Commons Lang method
StringUtils.substringBetween(fileContent, "<![CDATA[", "]]>")
really useful for extracting the content of the CDATA section in:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<envelope>
<xxxx>
<yyyy>
<![CDATA[
<?xml version="1.0" encoding="UTF-8" ?>
<Document >
<eee>
<tt>
<ss>zzzzzzz</ss>
<aa>2021-09-09T10:39:29.850Z</aa>
<aaaa>
<Cd>cccc</Cd>
</aaaa>
<dd>ssss</dd>
<ff></ff>
</tt>
</eee>
</Document>
]]>
</yyyy>
</xxxx>
</envelope>
But what I'm looking for now is another method or regex that lets me replace a dynamic XML
<![CDATA["old_xml"]]>
with another XML
<![CDATA["new_xml"]]>
Any idea how to accomplish this?
Regards.
Instead of StringUtils, you can use the String#replaceAll method:
fileContent = fileContent
.replaceAll("(?s)(<!\\[CDATA\\[).+?(]]>)", "$1foo$2");
Explanation:
(?s): Enable DOTALL mode so that . can match line breaks as well in .+?
(<!\\[CDATA\\[): Match opening <![CDATA[ substring and capture in group #1
.+?: Match 1 or more of any characters including line breaks, as few as possible (lazy)
(]]>): Match closing ]]> substring and capture in group #2
$1foo$2: Replace with foo surrounded with back-references of capture group 1 and 2 on both sides
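When the replacement is real XML rather than "foo", it may contain "$" or "\", which replaceAll treats as group references or escapes. A sketch that guards against this with Matcher.quoteReplacement (the class and method names are just for illustration):

```java
import java.util.regex.Matcher;

public class CdataReplace {
    /** Replaces the content of every CDATA section in fileContent with newXml. */
    public static String replaceCdata(String fileContent, String newXml) {
        // quoteReplacement makes '$' and '\' in newXml literal,
        // so only $1 and $2 are interpreted as group references.
        String replacement = "$1" + Matcher.quoteReplacement(newXml) + "$2";
        return fileContent.replaceAll("(?s)(<!\\[CDATA\\[).+?(]]>)", replacement);
    }

    public static void main(String[] args) {
        String envelope = "<yyyy>\n<![CDATA[\n<Document>old</Document>\n]]>\n</yyyy>";
        System.out.println(replaceCdata(envelope, "<Document price=\"$5\">new</Document>"));
    }
}
```

Without the quoting, a replacement such as price="$5" would throw an exception, since there is no group 5 in the pattern.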
You can use the regex, (\<!\[CDATA\[).*?(\]\]>).
Demo:
public class Main {
public static void main(String[] args) {
String xml = """
...
<data><![CDATA[a < b]]></data>
...
""";
String replacement = "foo";
xml = xml.replaceAll("(\\<!\\[CDATA\\[).*?(\\]\\]>)", "$1" + replacement + "$2");
System.out.println(xml);
}
}
Output:
...
<data><![CDATA[foo]]></data>
...
Explanation of the regex:
( : Start of group#1
\<!\[CDATA\[ : String <![CDATA[
) : End of group#1
.*? : Any character any number of times, as few as possible (lazy); it does not cross line breaks unless DOTALL mode, (?s), is enabled
( : Start of group#2
\]\]>: String ]]>
) : End of group#2
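As an alternative to the regex, the envelope can also be parsed and the CDATA node changed through the DOM. This is a different technique from the answers above, sketched with the JDK's built-in parser; the class and method names are made up, and it assumes the factory is left at its default (non-coalescing) setting so that CDATA sections are kept as separate nodes.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.CDATASection;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class CdataDomReplace {
    public static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    /** Depth-first search for the first CDATA section; returns null if there is none. */
    public static CDATASection firstCdata(Node node) {
        if (node.getNodeType() == Node.CDATA_SECTION_NODE) {
            return (CDATASection) node;
        }
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            CDATASection found = firstCdata(child);
            if (found != null) {
                return found;
            }
        }
        return null;
    }

    public static String toXml(Document doc) {
        try {
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not serialize XML", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<envelope><yyyy><![CDATA[<Document>old</Document>]]></yyyy></envelope>");
        CDATASection cdata = firstCdata(doc);
        if (cdata != null) {
            cdata.setData("<Document>new</Document>"); // swap the payload in place
        }
        System.out.println(toXml(doc));
    }
}
```

This avoids the regex pitfalls (line breaks, "$" in the replacement), at the cost of parsing the whole envelope. The new payload still must not contain "]]>".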
I'm trying to generate an XML file with JAXB annotations.
So I generate the JAXB classes and package-info.java from the XSD.
Let's go:
1 - package-info.java
//
// This file was generated by the JavaTM Architecture for XML Binding(JAXB) Reference Implementation, v2.2.8-b130911.1802
// See http://java.sun.com/xml/jaxb
// Any modifications to this file will be lost upon recompilation of the source schema.
// Generated on: 2017.01.05 at 01:51:40 PM CET
//
@XmlSchema(
xmlns = {
@XmlNs(namespaceURI = "urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2", prefix = "cac"),
@XmlNs(namespaceURI = "urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2", prefix = "cbc"),
@XmlNs(namespaceURI = "urn:oasis:names:specification:ubl:schema:xsd:Invoice-2", prefix = "") //this must be empty prefix
},
elementFormDefault = javax.xml.bind.annotation.XmlNsForm.QUALIFIED)
package com.audaxis.compiere.osg.ei.ubl2;
import javax.xml.bind.annotation.XmlNs;
import javax.xml.bind.annotation.XmlSchema;
2 - Then I generate the XML with the following code:
JAXBContext context = JAXBContext.newInstance(InvoiceType.class);
Marshaller m = context.createMarshaller();
m.setProperty(Marshaller.JAXB_FORMATTED_OUTPUT, Boolean.TRUE);
m.setProperty(Marshaller.JAXB_ENCODING, "UTF-8");
m.setProperty(Marshaller.JAXB_SCHEMA_LOCATION, TUEInvoiceConstants.UBLInvoiceShcemaLocation);
// Write to File
File f = new File(System.getProperty("java.io.tmpdir"), getOutputFileNameSimple(root));
m.marshal(root, f);
3 - Resulting generated XML:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Invoice
xmlns:cbc="urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2"
xmlns:cac="urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2"
xmlns:ns11="urn:oasis:names:specification:ubl:schema:xsd:Invoice-2"
xsi:schemaLocation="urn:oasis:names:specification:ubl:schema:xsd:Invoice-2 UBL-Invoice-2.1.xsd">
<cbc:UBLVersionID>2.1</cbc:UBLVersionID>
<-- XML tags -->
<END XML>
4 - As you can see, the third namespace was generated with the prefix ns11, which will cause problems for me in the next step.
Question: How can I make it generate the XML without any prefix?
I want to take an XML file as input which contains the following:
<?xml version='1.0' encoding='utf-8' standalone='yes' ?>
<map>
<int name="count" value="10" />
</map>
and then read it and change the value from 10 to any other integer value.
How can I do this in Android/Java? I'm new to Android and Java, and all the tutorials available on the internet are way too complicated.
Thank you.
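You can change the value by matching a pattern and replacing only the matched number, like below:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

String xmlString = "<int name=\"count\" value=\"10\" />";
int newValue = 100;
// Group 2 captures only the digits of the value attribute
Pattern pattern = Pattern.compile("(<int name=\"count\" value=\")([0-9]*)(\" />)");
Matcher matcher = pattern.matcher(xmlString);
if (matcher.find()) {
    // Splice the new number in at the position of group 2. Calling
    // xmlString.replace("10", ...) would also change every other "10" in the file.
    xmlString = xmlString.substring(0, matcher.start(2)) + newValue + xmlString.substring(matcher.end(2));
}
System.out.println(xmlString);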
You can find your answer here. It is like parsing JSON: you can parse the string (from the file) into an object and then do anything with its values.
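A rough sketch of that "parse into an object" route, using the DOM classes from javax.xml (the class and method names are made up for illustration). It looks up the int element whose name attribute is "count" and sets its value attribute, without depending on the exact spacing of the tag:

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class UpdateCount {
    /** Returns the XML with the value of <int name="count" .../> set to newValue. */
    public static String setCount(String xml, int newValue) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            NodeList ints = doc.getElementsByTagName("int");
            for (int i = 0; i < ints.getLength(); i++) {
                Element element = (Element) ints.item(i);
                // Only touch the entry named "count"; other <int> entries stay as they are.
                if ("count".equals(element.getAttribute("name"))) {
                    element.setAttribute("value", String.valueOf(newValue));
                }
            }

            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not update XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(setCount("<map><int name=\"count\" value=\"10\" /></map>", 100));
    }
}
```

To work on a file instead of a string, the parse call also accepts a java.io.File, and StreamResult can be constructed from a File to write the result back.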
I have an XML File that looks like this:
<?xml version='1.0' encoding='UTF-8' standalone='yes' ?>
<allinfo>
<filepath>/mnt/sdcard/Audio_Recorder/</filepath>
<filename>newxml35500.3gp</filename>
<annotation>
<file>newxml35500.3gp</file>
<timestamp>0:05</timestamp>
<note>uuuouou</note>
</annotation>
<filepath>/mnt/sdcard/Audio_Recorder/</filepath>
<filename>newxml35501.3gp</filename>
<annotation>
<file>newxml35501.3gp</file>
<timestamp>0:04</timestamp>
<note>tyty</note>
</annotation>
</allinfo>
I am trying to add an additional annotation to the XML after it has been created, so that the XML also contains:
<annotation>
<file>blah</file>
<timestamp>0:00</timestamp>
<note>this is a note</note>
</annotation>
What is the best way to find the root and then write a few lines to the XML in Java? I have seen DocumentBuilderFactory get some use from others but I am not sure how to implement it correctly. Any help would be much appreciated.
This works:
final DocumentBuilder documentBuilder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
final Document document = documentBuilder.parse(new ByteArrayInputStream("<foo><bar/></foo>".getBytes("UTF-8")));
final Element documentElement = document.getDocumentElement();
documentElement.appendChild(document.createElement("baz"));
You will get:
<foo><bar/><baz/></foo>
Load the file contents into a String and use Regex and String operations to perform the insertion.
String xml = "<?xml version='1.0' encoding='UTF-8' standalone='yes' ?>"
+ "<allinfo>"
+ "<filepath>/mnt/sdcard/Audio_Recorder/</filepath>"
+ "...";
// String xml = loadFromFile();
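// DOTALL lets "." match line breaks, so this also works when the file spans several lines
Pattern p = Pattern.compile("(.*?)(<allinfo>)(.*?)", Pattern.DOTALL);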
Matcher m = p.matcher(xml);
if (m.matches()) {
StringBuilder bld = new StringBuilder(m.group(1));
bld.append(m.group(2));
bld.append("<annotation>").append("\n");
bld.append("<file>blah</file>").append("\n");
bld.append("<timestamp>0:00</timestamp>").append("\n");
bld.append("<note>this is a note</note>").append("\n");
bld.append("</annotation>").append("\n");
bld.append(m.group(3));
xml = bld.toString();
}
I have some user-defined tags, for example data here , jssj . I have a file (not XML) which contains data embedded in these tags. I need a parser which will identify my tags and extract the data in the proper format.
E.g.
<newpage> thix text </newpage>
<tagD>
<tagA> kk</tagA>
</tagD>
Tags can also have attributes, similar to HTML tags. E.g.
<mytag height="f" width ="d" > bla bla bla </mytag>
<mytag attribute="val"> bla bla bla</mytag>
You could look at a parser generator like ANTLR.
Unless your tag syntax can be represented with a (simple) regular grammar (in which case you could try to scan the file with regexes), you will need a proper parser. It is actually not very hard to do at all - just the first time tastes like biting bullets...
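For the simple case, a regex scan could look roughly like the sketch below. It assumes tags of the same name are never nested inside each other and that attribute values are always double-quoted; the class name and the output format are made up for illustration. Anything beyond that is where a real parser becomes necessary.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SimpleTagScanner {
    // <name attrs>body</name>; the backreference \1 forces the closing tag
    // to have the same name as the opening tag.
    private static final Pattern TAG =
            Pattern.compile("<(\\w+)([^<>]*)>(.*?)</\\1>", Pattern.DOTALL);
    // name="value", with optional whitespace around '='
    private static final Pattern ATTR =
            Pattern.compile("(\\w+)\\s*=\\s*\"([^\"]*)\"");

    /** Returns one line per top-level tag: "name {attributes} -> body". */
    public static List<String> scan(String input) {
        List<String> found = new ArrayList<>();
        Matcher tag = TAG.matcher(input);
        while (tag.find()) {
            Map<String, String> attributes = new LinkedHashMap<>();
            Matcher attr = ATTR.matcher(tag.group(2));
            while (attr.find()) {
                attributes.put(attr.group(1), attr.group(2));
            }
            found.add(tag.group(1) + " " + attributes + " -> " + tag.group(3).trim());
        }
        return found;
    }

    public static void main(String[] args) {
        String input = "<mytag height=\"f\" width =\"d\" > bla bla bla </mytag>\n"
                + "<newpage> thix text </newpage>";
        for (String line : scan(input)) {
            System.out.println(line);
        }
    }
}
```

Nested tags such as tagA inside tagD are returned as part of the outer tag's body and are not descended into, which is exactly the kind of limit that makes a regular grammar insufficient.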
You can use JAXB. It was bundled with Java SE 6 through 10; from Java 11 on it has to be added as a separate dependency. It's quite simple.
First you need to create a binding to your XML code. The binding provides a map between Java objects and the XML code.
An example would be:
@XmlRootElement(name = "YourRootElement", namespace ="http://someurl.org")
@XmlAccessorType(XmlAccessType.FIELD)
@XmlType(name = "", propOrder = {
"intValue",
"stringArray",
"stringValue"}
)
public class YourBindingClass {
protected int intValue;
@XmlElement(nillable = false)
protected List<String> stringArray;
@XmlElement(name = "stringValue", required = true)
protected String stringValue;
public int getIntValue() {
return intValue;
}
public void setIntValue(int value) {
this.intValue = value;
}
public List<String> getStringArray() {
if (stringArray == null) {
stringArray = new ArrayList<String>();
}
return this.stringArray;
}
public String getStringValue() {
return stringValue;
}
public void setStringValue(String value) {
this.stringValue = value;
}
}
Then, to encode your Java objects into XML, you can use:
YourBindingClass yourBindingClass = ...;
JAXBContext jaxbContext = JAXBContext.newInstance(YourBindingClass.class);
Marshaller marshaller = jaxbContext.createMarshaller();
marshaller.setProperty(Marshaller.JAXB_FORMATTED_OUTPUT, true);
marshaller.setProperty(Marshaller.JAXB_FRAGMENT, false);
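/** If you need to specify a schema */
SchemaFactory sf = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
// Forward slashes: "http:\\www.someurl.org" is not a valid URL
Schema schema = sf.newSchema(new URL("http://www.someurl.org"));
marshaller.setSchema(schema);
// JAXB_SCHEMA_LOCATION expects a String of "namespace location" pairs, not a boolean (placeholder value)
marshaller.setProperty(Marshaller.JAXB_SCHEMA_LOCATION, "http://someurl.org http://www.someurl.org");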
ByteArrayOutputStream stream = new ByteArrayOutputStream();
marshaller.marshal(yourBindingClass, stream);
System.out.println(stream);
To parse your XML back to objects:
InputStream resourceAsStream = ... // Your XML, File, etc.
JAXBContext jaxbContext = JAXBContext.newInstance(YourBindingClass.class);
Unmarshaller unmarshaller = jaxbContext.createUnmarshaller();
Object r = unmarshaller.unmarshal(resourceAsStream);
if (r instanceof YourBindingClass) ...
Example starting from a Java object:
YourBindingClass s = new YourBindingClass();
s.setIntValue(1);
s.setStringValue("a");
s.getStringArray().add("b1");
s.getStringArray().add("b2");
// marshal ...
Result:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<ns2:YourRootElement xmlns:ns2="http://someurl.org">
<intValue>1</intValue>
<stringArray>b1</stringArray>
<stringArray>b2</stringArray>
<stringValue>a</stringValue>
</ns2:YourRootElement>
If you don't know the input format, that means you probably don't have an XML schema. Without a schema you lose some of its benefits, such as:
It is easier to describe allowable document content
It is easier to validate the correctness of data
It is easier to define data facets (restrictions on data)
It is easier to define data patterns (data formats)
It is easier to convert data between different data types
Anyway, the previous code also works with XML that contains 'unknown' tags. However, your XML still has to contain the required fields and follow the declared patterns.
So the following XML code is also valid. The only restriction is: the tag 'stringValue' should be there. Note that 'stringArrayQ' was not previously declared.
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<ns2:YourRootElement xmlns:ns2="http://someurl.org">
<stringValue>a</stringValue>
<stringArrayQ>b1</stringArrayQ>
</ns2:YourRootElement>
Are these XML tags? If so, look into one of the many Java XML libraries already available. If they're some kind of custom tagging format, then you're just going to have to write it yourself.
For xml tags - use DOM parser or SAX parser.
Your example is XML with this modification (a single root element wrapped around it):
<root>
<newpage> thix text </newpage>
<tagD>
<tagA> kk</tagA>
</tagD>
</root>
You can use any XML parser you want to parse it.
Edit:
Attributes are a normal part of XML.
<root>
<newpage> thix text </newpage>
<tagD>
<tagA> kk</tagA>
</tagD>
<mytag height="f" width ="d" > bla bla bla </mytag>
<mytag attribute="val"> bla bla bla</mytag>
</root>
Every XML parser can deal with them.
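As a rough illustration with the DOM parser that ships with Java, reading the mytag elements and their attributes from the document above could look like this (the class and method names are made up):

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class ReadCustomTags {
    /** Returns "{attributes} text" for every element called tagName. */
    public static List<String> read(String xml, String tagName) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            List<String> result = new ArrayList<>();
            NodeList tags = doc.getElementsByTagName(tagName);
            for (int i = 0; i < tags.getLength(); i++) {
                Element tag = (Element) tags.item(i);
                // TreeMap gives a stable, alphabetical attribute order.
                Map<String, String> attributes = new TreeMap<>();
                NamedNodeMap raw = tag.getAttributes();
                for (int j = 0; j < raw.getLength(); j++) {
                    attributes.put(raw.item(j).getNodeName(), raw.item(j).getNodeValue());
                }
                result.add(attributes + " " + tag.getTextContent().trim());
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<root>"
                + "<mytag height=\"f\" width =\"d\" > bla bla bla </mytag>"
                + "<mytag attribute=\"val\"> bla bla bla</mytag>"
                + "</root>";
        for (String line : read(xml, "mytag")) {
            System.out.println(line);
        }
    }
}
```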
Edit:
If you were able to use Python, you could do something like this:
import lxml.etree
doc = lxml.etree.parse("foo.xml")
print(doc.xpath("//mytag[1]/@width"))
# => ['d']
That's what I call simple.
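If Python is not available, the same XPath lookup can be sketched in Java with the javax.xml.xpath package from the JDK. The class and method names are made up; the expression is the one used above.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

public class XPathLookup {
    /** Evaluates //mytag[1]/@width and returns the attribute value as a string. */
    public static String firstWidth(String xml) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            // evaluate(String, InputSource) parses the input and returns
            // the string value of the first matching node.
            return xpath.evaluate("//mytag[1]/@width", new InputSource(new StringReader(xml)));
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("Could not evaluate XPath", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<root>"
                + "<mytag height=\"f\" width =\"d\" > bla bla bla </mytag>"
                + "<mytag attribute=\"val\"> bla bla bla</mytag>"
                + "</root>";
        System.out.println(firstWidth(xml));
    }
}
```

If nothing matches, evaluate returns an empty string rather than null.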