How can I transform xml to html on android? - java

I'm relatively new to Java and Android. In my Android application I need to get an XML file, transform it, and show it to the user.
I know how to parse XML, but I don't want to parse it and then generate views. I'd like to transform it to HTML and display it in a WebView.
I've searched the Internet but can't find anything.
How can I do it? Any ideas or links will be appreciated.
Thanks.

The usual tool to use for transforming XML to HTML is XSLT. Search SO for XSLT tutorial and you'll get some good results.
Here is another question showing how one developer used XSLT on Android.
You can also search for examples of the use of Transformer in Java, as in this helpful article:
// JAXP reads data using the Source interface
Source xmlSource = new StreamSource(xmlFile);
Source xsltSource = new StreamSource(xsltFile);
// the factory pattern supports different XSLT processors
TransformerFactory transFact = TransformerFactory.newInstance();
Transformer trans = transFact.newTransformer(xsltSource);
trans.transform(xmlSource, new StreamResult(System.out));
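For a self-contained variant of the snippet above, here is a minimal sketch that uses only the JAXP classes in javax.xml.transform and keeps the XML and the stylesheet in strings. The class name XmlToHtmlSketch, the method toHtml and the sample note document are made up for illustration, and the code assumes the platform ships a JAXP XSLT processor.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class XmlToHtmlSketch {

    // Hypothetical stylesheet: renders <note><title/><body/></note> as a small HTML page.
    static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='html'/>"
      + "<xsl:template match='/note'>"
      + "<html><body><h1><xsl:value-of select='title'/></h1>"
      + "<p><xsl:value-of select='body'/></p></body></html>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the stylesheet to the XML and returns the HTML as a string.
    static String toHtml(String xml, String xslt) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<note><title>Hello</title><body>Transformed with XSLT</body></note>";
        System.out.println(toHtml(xml, XSLT));
    }
}
```

On Android, the returned string could then presumably be handed to the WebView (for example via loadDataWithBaseURL) instead of being printed.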
Update:
For older versions of the Android Java API:
This article shows how to use a SAX parser to parse the input XML, then use XmlSerializer to output XML. The latter could easily output whatever XHTML you want. Both are available since API level 1.
Unfortunately I don't see a way to do XPath in API level 3, but if your input XML isn't too complex you should be able to code your own transformations. I know you "don't want to parse it and generate view after", but if you mean you don't want to even use an XML parser that's provided by Android, then I don't know of any alternative.
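As a rough illustration of coding your own transformation, the sketch below walks the input with a SAX handler and emits XHTML by hand. It appends to a StringBuilder instead of using Android's XmlSerializer so that it runs on a plain JVM, which is a substitution on my part. The changes/change element names, the version attribute and the class name SaxToXhtmlSketch are assumptions made up for the example.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxToXhtmlSketch {

    // Turns <changes><change version="...">text</change></changes> into an XHTML list.
    static String toXhtml(String xml) {
        final StringBuilder html = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                if (qName.equals("changes")) {
                    html.append("<ul>");
                } else if (qName.equals("change")) {
                    html.append("<li><b>").append(escape(atts.getValue("version"))).append("</b>: ");
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // May be called several times per text node; appending each chunk is enough.
                html.append(escape(new String(ch, start, length)));
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if (qName.equals("changes")) {
                    html.append("</ul>");
                } else if (qName.equals("change")) {
                    html.append("</li>");
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse input XML", e);
        }
        return html.toString();
    }

    // Minimal escaping for text placed into the generated markup.
    static String escape(String s) {
        if (s == null) {
            return "";
        }
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    public static void main(String[] args) {
        System.out.println(toXhtml(
            "<changes><change version=\"1.1\">Fixed a crash &amp; a typo</change></changes>"));
    }
}
```

This only scales to simple inputs, since every mapping from an input element to output markup has to be written out by hand.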
Update 2:
I just learned about XOM, which supports a subset of XPath. This question shows someone using XOM on Android (to write XML) with API level 4. You could take advantage of the XPath features, as well as the serialization features. It requires a small external library, XOM's jar. I don't know if it's compatible with API level 3.

Please see the complete working example, created as part of the "AndStatus" application.
It uses XSLT to show the localized (!) application change log in a WebView of the HelpActivity.
The working example consists of these parts:
The small utility class without any dependencies (i.e. it may be easily reused) which has this function:
/**
 * Transform XML input files using supplied XSL stylesheet and show it in the WebView
 * @param activity Activity hosting the WebView
 * @param resView WebView in which the output should be shown
 * @param resXml XML file to transform. This file is localized! It should be put into "raw-" folder
 * @param resXsl XSL stylesheet. In the "raw" folder. May be single for all languages...
 */
public static void toWebView(Activity activity, int resView, int resXml, int resXsl) {
...
}
See full source code here: Xslt.java
The example of its usage: See HelpActivity.java
The XML file to be transformed: changes.xml
and the corresponding XSL stylesheet: changesxsl.xsl

Use XSLT. It converts your XML into HTML that you can display in an Android WebView.
This question will help.

Related

How to add custom XML storage part to Word doc - preferably with docx4j

I'm trying to populate a Word content control with XML data using docx4j (version 3.2.1). I'm evaluating this in order to use it for invoice generation. The documents we want to generate are not very complicated so this looks like a good approach to me.
I have created the content control through Word 2010 dev tools. This is how I try to inject the XML into the docx (taken from this example):
WordprocessingMLPackage wordMLPackage = Docx4J.load(new File(input_DOCX));
FileInputStream xmlStream = new FileInputStream(new File(input_XML));
Docx4J.bind(wordMLPackage, xmlStream, Docx4J.FLAG_BIND_INSERT_XML & Docx4J.FLAG_BIND_BIND_XML);
I get the following exception:
org.docx4j.openpackaging.exceptions.Docx4JException: Couldn't find CustomXmlDataStoragePart! exiting..
at org.docx4j.Docx4J.bind(Docx4J.java:300)
at org.docx4j.Docx4J.bind(Docx4J.java:271)
How can I add the CustomXmlDataStoragePart with docx4j, if it doesn't exist yet? Or should/can I do this in Word directly?
Note: I decided to prepare templates in Word directly, because later on these templates must be edited by non-technical users and I don't want to burden them with extra tools, if possible.
You say you "created the content control through Word 2010 dev tools". Unless you mean the Content Control Toolkit, you need to use that or, better, one of the OpenDoPE Word add-ins (one or the other, not both).
These tools add a custom xml part into the docx, and allow you to associate it with your content controls via XPath data bindings.
Then, when at runtime you invoke Docx4J.bind, docx4j finds that existing custom xml part, and replaces it with the xml file you provide which contains your runtime data.

Saxon in Java: XSLT for CSV to XML

Mostly continued from this question: XSLT: CSV (or Flat File, or Plain Text) to XML
So, I have an XSLT from here: http://andrewjwelch.com/code/xslt/csv/csv-to-xml_v2.html
And it converts a CSV file to an XML document. It does this when used with the following command on the command line:
java -jar saxon9he.jar -xsl:csv-to-xml.xsl -it:main -o:output.xml
So now the question becomes: How do I do this in my Java code?
Right now I have code that looks like this:
TransformerFactory transformerFactory = TransformerFactory.newInstance();
StreamSource xsltSource = new StreamSource(new File("location/of/csv-to-xml.xsl"));
Transformer transformer = transformerFactory.newTransformer(xsltSource);
StringWriter stringWriter = new StringWriter();
transformer.transform(documentSource, new StreamResult(stringWriter));
String transformedDocument = stringWriter.toString().trim();
(The Transformer is an instance of net.sf.saxon.Controller.)
The trick on the command line is to specify "-it:main" to point right at the named template in the XSLT. This means you don't have to provide the source file with the "-s" flag.
The problem starts again on the Java side. Where and how would I specify this "-it:main"? Wouldn't doing so break other XSLTs that don't need it specified? Would I have to name every template in every XSLT file "main"? Given the method signature of Transformer.transform(), I have to specify the source file, so doesn't that defeat all the progress I've made in figuring this thing out?
Edit: I found the s9api hidden inside the saxon9he.jar, if anyone is looking for it.
You are using the JAXP API, which was designed for XSLT 1.0. If you want to make use of XSLT 2.0 features, like the ability to start a transformation at a named template, I would recommend using the s9api interface instead, which is much better designed for this purpose.
However, if you've got a lot of existing JAXP code and you don't want to rewrite it, you can usually achieve what you want by downcasting the JAXP objects to the underlying Saxon implementation classes. For example, you can cast the JAXP Transformer as net.sf.saxon.Controller, and that gives you access to controller.setInitialTemplate(); when it comes to calling the transform() method, just supply null as the Source parameter.
Incidentally, if you're writing code that requires a 2.0 processor then I wouldn't use TransformerFactory.newInstance(), which will give you any old XSLT processor that it finds on the classpath. Use new net.sf.saxon.TransformerFactoryImpl() instead, which (a) is more robust, and (b) much much faster.
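To see which processor TransformerFactory.newInstance() actually picked up, a small check like the following can help. It is only a sketch: it asks the processor to report itself through the standard XSLT system-property() function, and the class name WhichXsltProcessorSketch and the dummy input document are made up for the example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class WhichXsltProcessorSketch {

    // Stylesheet that prints the XSLT version and vendor reported by the processor.
    static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/'>"
      + "XSLT <xsl:value-of select=\"system-property('xsl:version')\"/>"
      + " from <xsl:value-of select=\"system-property('xsl:vendor')\"/>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Returns the factory class name plus whatever the processor says about itself.
    static String describe() {
        try {
            TransformerFactory factory = TransformerFactory.newInstance();
            Transformer transformer =
                    factory.newTransformer(new StreamSource(new StringReader(XSLT)));
            StringWriter out = new StringWriter();
            // A JAXP transform needs some source document, so a dummy one is supplied.
            transformer.transform(new StreamSource(new StringReader("<dummy/>")),
                    new StreamResult(out));
            return factory.getClass().getName() + ": " + out.toString().trim();
        } catch (TransformerException e) {
            throw new IllegalStateException("Could not run the probe stylesheet", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

If the reported version is 1.0, named-template entry points and other XSLT 2.0 features will not be available through that factory.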

Storing html values in xml

I'm trying to figure out a way to extract specific information (name, description, id, etc.) from an HTML file, leaving the unwanted information behind, and store it in an XML file.
I thought of using XSLT since it can do XML to HTML, but it doesn't seem to work the other way around.
I honestly don't know what other language I should try to accomplish this. I know basic Java and JavaScript but I'm not sure if they can do it. I'm kind of lost on getting this started.
I'm open to any advice or help, and willing to learn a new language too, as I'm just doing this for fun.
There are a number of Java libraries for handling HTML input that isn't well-formed (according to XML). These libraries also have built-in methods for querying or manipulating the document, but it's important to realize that once you've parsed the document it's usually pretty easy to treat it as though it were XML in the first place (using the standard Java XML interfaces). In other words, you only need these libraries to parse the malformed input; the other utilities they provide are mostly superfluous.
Here's an example that shows parsing HTML using HTMLCleaner and then converting that object into a standard org.w3c.dom.Document:
TagNode tagNode = new HtmlCleaner().clean("<html><div><p>test");
DomSerializer ser = new DomSerializer(new CleanerProperties());
org.w3c.dom.Document doc = ser.createDOM(tagNode);
In Jsoup, simply parse the input and serialize it into a string:
String text = Jsoup.parse("<html><div><p>test").outerHtml();
And convert that string into a W3C Document using one of the methods described here:
How to parse a String containing XML in Java and retrieve the value of the root node?
You can now use the standard JAXP interfaces to transform this document:
TransformerFactory tFact = TransformerFactory.newInstance();
Transformer transformer = tFact.newTransformer();
Source source = new DOMSource(doc);
Result result = new StreamResult(System.out);
transformer.transform(source, result);
Note: Provide some XSLT source to tFact.newTransformer() to do something more useful than the identity transform.
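To make that note concrete, here is a hedged sketch that feeds a DOM built from already well-formed markup through a stylesheet that pulls out a few fields. The input layout (div elements with class item, an h2 for the name, a p for the description), the output element names and the class name HtmlToXmlSketch are all assumptions invented for the example. Real-world HTML would first go through one of the cleaners mentioned above.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class HtmlToXmlSketch {

    // Hypothetical stylesheet: one <item> per <div class="item"> in the input.
    static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
      + "<xsl:template match='/'>"
      + "<items><xsl:for-each select=\"//div[@class='item']\">"
      + "<item id='{@id}'>"
      + "<name><xsl:value-of select='h2'/></name>"
      + "<description><xsl:value-of select='p'/></description>"
      + "</item>"
      + "</xsl:for-each></items>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Parses well-formed markup into a DOM and extracts the fields as XML.
    static String extract(String wellFormedHtml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true); // XSLT processors expect a namespace-aware DOM
            Document doc = dbf.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(wellFormedHtml)));
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSLT)));
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Extraction failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(extract(
            "<html><body><div class='item' id='42'><h2>Widget</h2>"
          + "<p>A small widget</p></div></body></html>"));
    }
}
```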
I would use HTMLAgilityPack or Chris Lovett's SGMLReader.
Or, simply HTML Tidy.
Ideally, you can treat your HTML as XML. If you're lucky, it will already be XHTML, and you can process it as XML. If not, use something like http://nekohtml.sourceforge.net/ (an HTML tag balancer, etc.) to process the HTML into something that is XML compliant so that you can use XSLT.
I have a specific example and some notes around doing this on my personal blog at http://blogger.ziesemer.com/2008/03/scraping-suns-bug-database.html.
TagSoup
JSoup
Beautiful Soup

Validating an XML NCName in Java

I'm getting some values from Java annotations in an annotation processor to generate metadata. Some of these values are supposed to indicate XML element or attribute names. I'd like to validate the input to find out if the provided values are actually legal NCNames according to the XML specification. Only the local name is important in this case, the namespace URI doesn't play a part here.
Is there some simple way of finding out if a string is a legal XML element or attribute name? Preferably I'd use some XML API that is readily available in Java SE. One of the reasons I'm doing this stuff in the first place is to cut back on dependencies. I'm using JDK 7 so I have access to the most up-to-date classes/methods.
So far, browsing through content handler classes and SAX/DOM stuff hasn't yielded any result.
If you're prepared to have Saxon on your class path you can do
new Name10Checker().isValidNCName(s);
I can't see anything simpler in the public JDK interface.
I didn't find anything straightforward in any of the JDK 6 APIs (I don't know about JDK 7). A quick but possibly "hackish" way to check would be to convert it to an XML doc and see if it parses:
String name = ...;
if(name.contains(">")) {
return false;
}
String xmlDoc = "<" + name + "/>";
DocumentBuilder db = ...;
db.parse(new InputSource(new StringReader(xmlDoc)));
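Filled in, the parsing hack above might look like the sketch below. The class and method names are made up. Two extra guards are my own additions rather than part of the original answer: a colon check (an NCName never contains one) and a comparison of the parsed root element name with the input, because something like a b='c' parses fine but yields an element named a. The result follows whatever XML 1.0 name rules the JDK's bundled parser applies, so treat it as an approximation.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class NcNameCheckSketch {

    // Returns true if the parser accepts <name/> and the root element is named exactly `name`.
    static boolean isProbablyNcName(String name) {
        if (name == null || name.isEmpty() || name.contains(">") || name.contains(":")) {
            return false;
        }
        try {
            DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // Replace the default error handler so parse errors are only thrown, not reported.
            db.setErrorHandler(new DefaultHandler());
            Document doc = db.parse(new InputSource(new StringReader("<" + name + "/>")));
            // Rejects input that smuggles in attributes or trailing whitespace.
            return name.equals(doc.getDocumentElement().getTagName());
        } catch (Exception e) {
            // Not well-formed, so not a usable element name.
            return false;
        }
    }

    public static void main(String[] args) {
        for (String candidate : new String[] {"foo", "my-name_1.x", "1abc", "a:b", "a b='c'"}) {
            System.out.println(candidate + " -> " + isProbablyNcName(candidate));
        }
    }
}
```

Building a parser per call is slow, so for anything beyond occasional validation one of the library checkers is likely the better choice.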
I ran into the same problem and found lots of implementations in FOSS libraries, and even an old implementation in a Java class library, which was removed ages ago... So here are a few options to choose from:
Java Class Library: XMLUtils.isValidNCName(String ncName) (note: removed in 2004)
Apache Axis: NCName.isValid(String stValue)
Saxonica: NameChecker.isValidNCName(CharSequence ncName)
OWL API: XMLUtils.isNCName(java.lang.CharSequence s)
Validator.nu HTML Parser: NCName.isNCName(java.lang.String str)
So, if you're using one of these libraries anyway, you're fine.
As I am not, I'll go with a copy of the XMLUtils from the OWL API, which has no external dependencies, is available under non-restrictive licenses (LGPL and Apache 2.0) and consists of nice and clean code.

how to create an odt file programmatically with java?

How can I create an odt (LibreOffice/OpenOffice Writer) file with Java programmatically? A "hello world" example will be sufficient. I looked at the OpenOffice website but the documentation wasn't clear.
Take a look at ODFDOM - the OpenDocument API
ODFDOM is a free OpenDocument Format (ODF) library. Its purpose is to provide an easy common way to create, access and manipulate ODF files, without requiring detailed knowledge of the ODF specification. It is designed to provide the ODF developer community with an easy lightwork programming API portable to any object-oriented language.
The current reference implementation is written in Java.
// Create a text document from a standard template (empty documents within the JAR)
OdfTextDocument odt = OdfTextDocument.newTextDocument();
// Append text to the end of the document.
odt.addText("This is my very first ODF test");
// Save document
odt.save("MyFilename.odt");
later
As of this writing (2016-02), we are told that these classes are deprecated... big time, and the OdfTextDocument API documentation tells you:
As of release 0.8.8, replaced by org.odftoolkit.simple.TextDocument in
Simple API.
This means you still include the same active .jar file in your project, simple-odf-0.8.1-incubating-jar-with-dependencies.jar, but you want to be unpacking the following .jar to get the documentation: simple-odf-0.8.1-incubating-javadoc.jar, rather than odfdom-java-0.8.10-incubating-javadoc.jar.
Incidentally, the documentation link downloads a bunch of jar files inside a .zip which says "0.6.1"... but most of the stuff inside appears to be more like 0.8.1. I have no idea why they say "as of 0.8.8" in the documentation for the "deprecated" classes: just about everything is already marked deprecated.
The equivalent simple code to the above is then:
odt_doc = org.odftoolkit.simple.TextDocument.newTextDocument()
para = odt_doc.getParagraphByIndex( 0, False )
para.appendTextContent( 'stuff and nonsense' )
odt_doc.save( 'mySpankingNewFile.odt' )
PS am using Jython, but the Java should be obvious.
I have not tried it, but using JOpenDocument may be an option. (It seems to be a pure Java library to generate OpenDocument files.)
A complement of previously given solutions would be JODReports, which allows creating office documents and reports in ODT format (from templates, composed using the LibreOffice/OpenOffice.org Writer word processor).
DocumentTemplateFactory templateFactory = new DocumentTemplateFactory();
DocumentTemplate template = templateFactory.getTemplate(new File("template.odt"));
Map data = new HashMap();
data.put("title", "Title of my doc");
data.put("picture", new RenderedImageSource(ImageIO.read(new File("/tmp/lena.png"))));
data.put("answer", "42");
//...
template.createDocument(data, new FileOutputStream("output.odt"));
Optionally the documents can then be converted to PDF, Word, RTF, etc. with JODConverter.
Edit/update
Here you can find a sample project using JODReports (with non-trivial formatting cases).
I have written a jruby DSL for programmatically manipulating ODF documents.
https://github.com/noah/ocelot
It's not strictly java, but it aims to be much simpler to use than the ODFDOM.
Creating a hello world document is as easy as:
% cat examples/hello.rb
include OCELOT
Text::create "hello" do
paragraph "Hello, world!"
end
There are a few more examples (including a spreadsheet example or two) here.
I have been searching for an answer to this question myself. I am working on a project that generates documents in different formats, and I badly needed a library to generate ODT files.
I can finally say that the ODF Toolkit with the latest version of the simple-odf library is the answer for generating text documents.
You can find the official page here:
Apache ODF Toolkit(Incubating) - Simple API
Here is a page to download version 0.8.1 (the latest version of Simple API) as I didn't find the latest version at the official page, only version 0.6.1
And here you can find Apache ODF Toolkit (incubating) cookbook
You can try using JasperReports to generate your reports, then export it to ODS. The nice thing about this approach is
you get broad support for all JasperReports output formats, e.g. PDF, XLS, HTML, etc.
Jasper Studio makes it easy to design your reports
The ODF Toolkit project (code hosted at GitHub) is the new home of the former ODFDOM project, which was an Apache Incubator project until 2018-11-27.
Another solution may be the JODF Java API from the Independentsoft company.
For example, if we want to create an Open Document file using this Java API we could do the following:
import com.independentsoft.office.odf.Paragraph;
import com.independentsoft.office.odf.TextDocument;

public class Example {
    public static void main(String[] args)
    {
        try
        {
            TextDocument doc = new TextDocument();
            Paragraph p1 = new Paragraph();
            p1.add("Hello World");
            doc.getBody().add(p1);
            doc.save("c:\\test\\output.odt", true);
        }
        catch (Exception e)
        {
            System.out.println(e.getMessage());
            e.printStackTrace();
        }
    }
}
There are also .NET solutions for this API.
