I have a requirement to publish an HTML file from an XML file, where the HTML file shows hard-coded values as they were in the XML file at a specific point in time (i.e. independent of any XML changes made after the HTML document is created).
Example: XML File
<dvd>
<name>Titanic</name>
<price>10</price>
</dvd>
<dvd>
<name>Avatar</name>
<price>12</price>
</dvd>
Now I need to convert these into an HTML document in which the values are hard-coded.
Example HTML File
<html>
<body>
<h1>DVD List</h1>
<table>
<tr ...>
<th>Name</th><th>Price</th>
<td>Titanic</td><td>10</td>
<td>Avatar</td><td>12</td>
I have tried using XSLT, but this only provides a rendering of the XML document that updates whenever the XML changes. I need a point-in-time HTML document that holds the values as they were in the XML.
Perhaps there is an easy way to do this with existing technologies, or with some simple custom Java code?
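One way to get such a snapshot is to run the XSLT transformation once on the server side (rather than letting a browser apply the stylesheet on every view) and keep the resulting HTML string. The following is only a minimal sketch using the JDK's built-in javax.xml.transform API. The class name HtmlSnapshot, the stylesheet, and the <dvds> root element wrapping the <dvd> entries are assumptions made for illustration, not part of the original files.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class HtmlSnapshot {
    // Hypothetical stylesheet; assumes the <dvd> elements sit under a <dvds> root.
    static final String XSLT =
        "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:output method=\"html\" indent=\"no\"/>"
      + "<xsl:template match=\"/dvds\">"
      + "<html><body><h1>DVD List</h1><table>"
      + "<tr><th>Name</th><th>Price</th></tr>"
      + "<xsl:for-each select=\"dvd\">"
      + "<tr><td><xsl:value-of select=\"name\"/></td>"
      + "<td><xsl:value-of select=\"price\"/></td></tr>"
      + "</xsl:for-each>"
      + "</table></body></html>"
      + "</xsl:template></xsl:stylesheet>";

    // Applies the stylesheet once and returns the HTML as a plain string.
    // The result no longer depends on the XML source after this call.
    static String render(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }
}
```

The returned string can then be written to a file (for example with java.nio.file.Files.write), which gives a static HTML document that later XML edits do not affect.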
Related
Part of my assignment requires me to make a converter that, as the title suggests, reads -any- JSON file and converts it into an HTML file.
Having scoured the web, I've personally found naught except people converting hand made JSON files into HTML ones, but the problem with that is that they are preset and I need to be able to read any JSON file instead of a premade one.
Easy:
<html>
<body>
<pre>
$jsonContent
</pre>
</body>
</html>
If you can't make any assumptions about the structure of the JSON, there is not much more you can do. If you know the structure of the JSON, you can try things like:
convert json to xml (Converting JSON to XML in Java)
use XSLT to generate HTML based on XML
Or do all in one: XSLT equivalent for JSON
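The "Easy" template above can be sketched in Java roughly as follows. This is a minimal sketch, not a full converter: the class and method names are made up for illustration, and the escaping step is an addition, included because JSON string values may contain characters such as < or & that would otherwise be interpreted as HTML.

```java
class JsonAsHtml {
    // Escapes the characters that have special meaning in HTML text.
    // "&" must be replaced first so that later replacements are not double-escaped.
    static String escapeHtml(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;");
    }

    // Wraps arbitrary JSON text in a <pre> block without looking at its structure.
    static String toHtmlPage(String jsonContent) {
        return "<html>\n<body>\n<pre>\n"
             + escapeHtml(jsonContent)
             + "\n</pre>\n</body>\n</html>\n";
    }
}
```

Because the JSON is treated as opaque text, this works for any input file, at the cost of producing no real HTML structure.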
I have an XML string which is actually HTML. It contains a few custom tags that should be read and replaced with actual values. I am unable to figure out how to do this using SAX parsing.
<html>
<body>
<p>The joiner report for today</p>
<p><APP:FT value="THIS_WEEKDAY"/></p>
<p> </p>
</body>
</html>
This template would be evaluated using SAX parsing and Java code, where the value of the custom tag
<APP:FT>
would be computed in Java. For example
<APP:FT value="THIS_WEEKDAY"/>
should be replaced by TUESDAY considering today is 13-Dec-2016. It is easy to find the value, but I am unable to figure out a way to replace this in the HTML string. The final HTML should look like
<html>
<body>
<p>The joiner report for today</p>
<p>TUESDAY</p>
<p> </p>
</body>
</html>
Thank you folks for reading through. I solved the problem not with XML parsing but by using the FreeMarker template API - http://freemarker.org/
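For readers who want to stay with plain Java, the replacement described in the question can also be sketched with a regular expression over the template string. Note that this is neither SAX parsing nor the FreeMarker approach the asker settled on; it is a swapped-in technique. The class name, the resolve method, and the handling of unknown keys are assumptions made for illustration.

```java
import java.time.LocalDate;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CustomTagFiller {
    // Matches self-closing tags of the form <APP:FT value="SOME_KEY"/>.
    private static final Pattern APP_FT =
        Pattern.compile("<APP:FT\\s+value=\"([^\"]*)\"\\s*/>");

    // Hypothetical lookup: only THIS_WEEKDAY is handled here;
    // unknown keys resolve to an empty string.
    static String resolve(String key, LocalDate today) {
        if ("THIS_WEEKDAY".equals(key)) {
            return today.getDayOfWeek().name();
        }
        return "";
    }

    // Replaces every <APP:FT .../> occurrence with its computed value.
    static String fill(String template, LocalDate today) {
        Matcher m = APP_FT.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = resolve(m.group(1), today);
            // quoteReplacement keeps "$" and "\" in the value from being
            // treated as group references.
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

Passing the date in as a parameter, instead of calling LocalDate.now() inside, keeps the substitution easy to test for a fixed day such as 13-Dec-2016.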
I want to extract an HTML page from an XML file. Any ideas, please?
<?xml ....>
<first>
</first>
<second>
</second>
<xhtml>
<html>
.....some html code here
</html>
</xhtml>
I want to extract the HTML page as it is from the above.
Because XML and HTML markup are similar, an XML parser might have issues with it. I would suggest that when you save the HTML data in the XML file, you encode it to prevent the XML parser from having issues. Then, when you read the data back from the XML, you just need to decode it for use.
<?xml ....?>
<first></first>
<second></second>
<markup>
&lt;html&gt;
code here
&lt;/html&gt;
</markup>
When you decode the markup section, it will look like this:
<html>
code here
</html>
You might find this of some use:
http://www.w3schools.com/xml/xml_parser.asp
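The encode-then-decode idea can be sketched in Java with the JDK's DOM parser, which decodes the escaped characters automatically when the text content is read. This is a minimal sketch under assumptions: the class name, the <doc> wrapper element used in the usage example, and the presence of exactly one <markup> element are all illustrative.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class EscapedMarkup {
    // Encodes HTML so it can be stored as plain text inside an XML element.
    // "&" is replaced first so the other entities are not double-escaped.
    static String encode(String html) {
        return html.replace("&", "&amp;")
                   .replace("<", "&lt;")
                   .replace(">", "&gt;");
    }

    // Parses the XML and returns the decoded text of the first <markup> element.
    // Assumes the document contains a <markup> element.
    static String readMarkup(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            return doc.getElementsByTagName("markup").item(0).getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }
}
```

For example, storing "<doc><markup>" + EscapedMarkup.encode(html) + "</markup></doc>" and later calling readMarkup on that string gives back the original html value.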
You can extract the HTML from the XML using JavaScript. You can then create an element on your HTML page in JavaScript and dump the HTML in there. The only issue with this is that the XML data you're receiving seems to contain an html tag of its own.
If you want to add the content to an existing page, then you would have to strip the html and body tags.
If you use Python, extraction can be very easy.
from simplified_scrapy.simplified_doc import SimplifiedDoc
html='''
<?xml >
<first>
</first>
<second>
</second>
<xhtml>
<html>
.....some html code here
</html>
</xhtml>
'''
doc = SimplifiedDoc(html)
html = doc.xhtml.html
print (html)
First you need to install simplified_scrapy using pip.
pip install simplified_scrapy
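A comparable extraction can be sketched in Java with only the JDK: parse the XML into a DOM, locate the embedded <html> element, and serialize that subtree back to a string. This sketch assumes the input is well-formed XML with a single root element (the sample in the question has several top-level elements, so it would need a wrapper) and that the embedded HTML is itself well-formed. The class name and the <root> wrapper in the usage note are hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class EmbeddedHtmlExtractor {
    // Returns the first <html> element of the document, serialized as a string.
    static String extractHtml(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            Node html = doc.getElementsByTagName("html").item(0);

            // An identity transform copies the DOM subtree to text output.
            Transformer serializer = TransformerFactory.newInstance().newTransformer();
            serializer.setOutputProperty(OutputKeys.METHOD, "xml");
            serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            serializer.setOutputProperty(OutputKeys.INDENT, "no");

            StringWriter out = new StringWriter();
            serializer.transform(new DOMSource(html), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not extract embedded HTML", e);
        }
    }
}
```

Called on a string such as "<root><first/><xhtml><html>...</html></xhtml></root>", this returns only the <html>...</html> part.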
I have an xml file with html tags like:
<?xml version="1.0" encoding="utf-8" ?>
<blog>
<blogid>49</blogid>
<title>[FIXED] Job requests page broken</title>
<fulltext>
<img title="page broken" src="images/west/blog/site-broken.jpg" alt="page broken" />
<p><span style="background-color: #ccffcc;">Update 28/05/2011</span>: Job requests page seems to be working OK now. If you find any issues please use the contact page to notify us. Thank you for your patience!</p>
<p> </p>
<p>Well, what can I say? Why does it always have to be that way? You are trying to create something new and something else gets broken on the way...</p>
</fulltext>
Now I want the whole HTML part between the <fulltext> tags as it is.
What I get right now is blank, I think because the DOM parser is parsing the HTML tags as well.
I tried XPath, but it is not working on Android.
I don't think you can get this XML into a DOM as-is if it is not well-formed. (EDIT: or is it well-formed?)
You would need to either a) escape the characters, making the XML well-formed and parseable (but probably not into the DOM you want; I guess you want to display the HTML in a different system), b) parse it using a stream processor, or c) fix it using string manipulation (add <![CDATA[ ... ]]>) and then parse it into a DOM.
HTH
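Option c) could look roughly like the following in Java. It is a minimal sketch under assumptions: the class and method names are invented, only the first <fulltext> element is handled, the tag is assumed to have no attributes, and the content must not itself contain the sequence ]]>.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class FulltextReader {
    // Wraps the body of the first <tag>...</tag> pair in a CDATA section,
    // so the parser treats the inner markup as plain text.
    // Returns the input unchanged if the pair is not found.
    static String wrapInCdata(String xml, String tag) {
        String open = "<" + tag + ">";
        String close = "</" + tag + ">";
        int start = xml.indexOf(open);
        int end = xml.indexOf(close);
        if (start < 0 || end < start) {
            return xml;
        }
        int bodyStart = start + open.length();
        return xml.substring(0, bodyStart)
             + "<![CDATA[" + xml.substring(bodyStart, end) + "]]>"
             + xml.substring(end);
    }

    // Returns the raw markup between <fulltext> and </fulltext> as a string.
    static String readFulltext(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(wrapInCdata(xml, "fulltext"))));
            return doc.getElementsByTagName("fulltext").item(0).getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }
}
```

Because the content sits inside CDATA, HTML that is not well-formed XML (an unclosed <br>, or an entity such as &nbsp;) no longer breaks the parse.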
The HTML inside <fulltext> is written here as well-formed XML (XHTML-style, with every tag closed). Therefore, there is no reason for the DOM parser not to treat those inner tags as XML elements.
Maybe what you're looking for is a way to flatten what's inside <fulltext>?
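Use a library like Jsoup for this purpose.
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.parser.Parser;

public class JsoupXmlExample {
    public static void main(String[] args) {
        // &mdash; is not a predefined XML entity, so a strict XML parser would reject this input
        String html = "<?xml version=\"1.0\"?><foo>" +
                "<bar>Some text &mdash; invalid!</bar></foo>";
        // Jsoup's XML parser is lenient and still builds a document
        Document doc = Jsoup.parse(html, "", Parser.xmlParser());
        for (Element e : doc.select("bar")) {
            System.out.println(e);
        }
    }
}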
I am developing an XML editor using JSP and servlets, with a DOM parser.
I have one problem in the XML editor:
how do I edit the following XML file without losing elements?
eg:
<book id="b1">
<bookbegin id="bb1">
<para id="p1">This is<b>first</b>line</para>
<para id="p2">This is<b>second</b>line</para>
<para id="p3">This is<b>third</b>line</para>
</bookbegin>
</book>
I am trying to edit the above XML file (which uses a DTD) with JSP and servlets, but when I read the text value from the XML, it returns only first, second, third. How do I read the 'This is' and 'line' parts? And how do I then store them back to the XML file using XPath?
Thanks in advance.
The <b> tag inside the <para> tag is another element, not a formatting tag (in XML). Therefore, you need to traverse down to it.
As @JRL says, the <b> tags are well-formed XML elements and, as a consequence, are split out into separate nodes by your DOM processor.
I think you fail to read the other text nodes because you only read text when an XML node has no child nodes, which is not the case here.
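To make the point about mixed content concrete, here is a minimal Java sketch that walks all child nodes of a <para> element with the JDK's DOM API. Text nodes and the nested <b> element show up as siblings, so reading only leaf elements misses the surrounding text. The class and method names are made up for illustration, and writing changes back to the file is not covered.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class MixedContentReader {
    // Parses an XML string and returns its root element.
    static Element parseRoot(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Lists the direct children of an element in document order:
    // text nodes as "text:<value>", elements as "<name>:<text content>".
    static List<String> describeChildren(Element parent) {
        List<String> parts = new ArrayList<>();
        NodeList children = parent.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.TEXT_NODE) {
                parts.add("text:" + child.getNodeValue());
            } else if (child.getNodeType() == Node.ELEMENT_NODE) {
                parts.add(child.getNodeName() + ":" + child.getTextContent());
            }
        }
        return parts;
    }
}
```

For the first <para> in the question this yields three parts: the text "This is", the <b> element containing "first", and the text "line". If only the combined text is needed, calling getTextContent() on the <para> element concatenates all of them.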