Making Saxon produce new result document when run from Java

I am trying to run Saxon HE from Java, using code that can be found in the Saxon resources. I have tried changing it so that the Java code no longer creates an additional file; instead, the XSLT file should do that through the use of "result-document".
My XSLT did work as intended in Altova XMLSpy, but I wanted to see if I could get Saxon to do the same thing. No luck there, apart from a massive headache, loads of frustration, and lots of wishes that Python will get support for this some day soon...
I get the following error message: The system identifier of the principal output file is unknown.
When I google it, I find an answer saying that the base URI can't be found, but nowhere does it say how to set the base URI...
So my first question is: Where is the base URI set? Is it in the Java class or in the XSLT file? I cannot see where I would set this in the XSLT file, so my guess is that I would have to set it as a property of the compiler/transformer?
Another question is about the actual href attribute of result-document. If I want to point to a relative path, what is the syntax, and what would an example look like?
And what about absolute paths?
In my file that works in Altova, I get the base URI of the source XML file that is to be transformed, and then I direct the output to a relative directory. In Saxon, the base URI instead seems to be the location of the XSLT file... No idea why this is the case.
When setting an absolute path, I get an error stating that I'm using an unknown protocol, so I put "file:///" before the path. Now I get a warning complaining about a document not being available at a path that is a concatenation of the XSLT file path and a lookup path I'm using during the transform.
As you can see, I'm all over the place here, so some guidelines and help would be greatly appreciated.

There are two APIs for running a Saxon transformation, and you haven't said which of them you are using.
Either way, a relative URI used in the href attribute of xsl:result-document is resolved relative to the "base output URI" of the transformation.
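To make that resolution rule concrete, here is a minimal sketch that uses only java.net.URI from the JDK. It does not call Saxon; the class name HrefResolution and all paths are made up for illustration, and it assumes the href is resolved by the ordinary URI resolution rules.

```java
import java.net.URI;

final class HrefResolution {
    // Resolve an href (as it might appear on xsl:result-document)
    // against a base output URI using standard URI resolution.
    static URI resolve(String baseOutputUri, String href) {
        return URI.create(baseOutputUri).resolve(href);
    }

    public static void main(String[] args) {
        // Hypothetical base output URI: the location of the principal output file.
        String base = "file:///home/user/out/result.xml";

        // Relative href: lands next to the principal output.
        System.out.println(resolve(base, "pages/a.xml").getPath());
        // Relative href that climbs one directory.
        System.out.println(resolve(base, "../other/b.xml").getPath());
        // Absolute href: the base is ignored entirely.
        System.out.println(resolve(base, "file:///var/tmp/c.xml").getPath());
    }
}
```

Note that an absolute href has to be a URI, not a bare filesystem path. A Windows path such as C:\out\a.xml is likely to be read as a URI with the scheme "c", which would explain an "unknown protocol" error; java.nio.file.Path.toUri() is one way to turn a path into a file: URI.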
If you're using the JAXP transformation API, this was designed for XSLT 1.0, which doesn't recognize the concept of a base output URI. Saxon therefore uses the SystemID of the JAXP Result object provided as the destination of the transformation. If the JAXP Result object doesn't have a system ID, for example if you supply a DOMResult or StreamResult with no system ID specified, you're likely to get an error.
By contrast the s9api API was designed for XSLT 2.0 (with extensions for 3.0), and its XsltTransformer object therefore has an explicit setBaseOutputURI() method.
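For the JAXP route, the following is a minimal sketch of giving the Result a system ID. The class name PrincipalOutput, the helper tempOutput() and the trivial stylesheet are assumptions made for illustration. The stylesheet is deliberately XSLT 1.0 so that the sketch also runs on the JDK's built-in processor; with Saxon on the classpath, TransformerFactory.newInstance() would typically return Saxon's factory, and a relative href on xsl:result-document should then be resolved against the system ID set here.

```java
import java.io.File;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class PrincipalOutput {
    // Trivial XSLT 1.0 stylesheet (illustrative only).
    static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:template match='/'><out><xsl:value-of select='/in'/></out></xsl:template>"
      + "</xsl:stylesheet>";

    // Creates a temporary file to act as the principal output file.
    static File tempOutput() {
        try {
            return File.createTempFile("principal", ".xml");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Transforms the XML string and writes the principal output to the given file.
    static File transform(String xml, File principalOutput) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSLT)));
            // new StreamResult(File) sets the result's system ID to the file's URI,
            // which is the piece that is missing when the error above is raised.
            StreamResult result = new StreamResult(principalOutput);
            transformer.transform(new StreamSource(new StringReader(xml)), result);
            return principalOutput;
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        File out = transform("<in>hello</in>", tempOutput());
        System.out.println("principal output: " + out.toURI());
    }
}
```

If the Result is built from an OutputStream or Writer instead of a File, calling setSystemId(...) on it afterwards supplies the same information.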
If you did something and it didn't work, then please tell us exactly what you did and exactly how it failed, and then we can help you get it right next time. It's hard to debug code that we can't see.

Related

Create XML file according to XSD schema

I know that there are a lot of such topics online, but the problem is that none of the solutions work for me.
Language I'm using: Java
IDE: IntelliJ
Just to be clear, I'm using the Community Edition; maybe that's why none of the plugins, like JAXB, work.
I extracted data (values) from another file, and I need to create an XML file with this data. There is also an XSD schema here: http://www.bpsim.org/schemas/1.0/
I'm thinking there may be a third-party solution I can use?
I really don't want to code the whole XML file by hand, because it's thousands of values and lines of code.
Does anyone know a good solution?

On Linux you can use xsd2inst, from the xmlbeans and xmlbeans-tools packages, to generate an instance out of the XSD:
XMLBEANS_LIB='/usr/share/java/xmlbeans/' xsd2inst test.xsd -name shiporder > test.xml
xsd2inst -h
Generates a document based on the given Schema file
having the given element as root.
The tool makes reasonable attempts to create a valid document,
but this is not always possible since, for example,
there are schemas for which no valid instance document
can be produced.
Usage: xsd2inst [flags] schema.xsd -name element_name
Flags:
-name the name of the root element
-dl enable network downloads for imports and includes
-nopvr disable particle valid (restriction) rule
-noupa disable unique particle attribution rule

How to read .EAP file using java

Within an Enterprise Architect file I have the definition of an XML document (the definition specifies which attributes are mandatory and which are not). My goal is to read this definition and afterwards validate the actual XML file.
Is there a way to read an .EAP file using Java with Eclipse?
PS: The definition might change, which is why I need to do it programmatically.
Any help would be appreciated.

No. Or: not directly. EAP files are actually Microsoft Access databases with just another suffix. To read them you need to use the EA API, or use an ODBC driver for Access.

I managed to find a way around it. With Enterprise Architect, I exported the model into XML format using the program's XML export option.
Once the XML is generated, I am able to read the definitions easily with a couple of XPath queries.
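As a rough illustration of that kind of query, here is a minimal sketch using the JDK's javax.xml.xpath API. The element and attribute names (attribute, name, mandatory) and the class name ExportQuery are purely hypothetical; the real Enterprise Architect export has its own structure, so the XPath expression would need to be adapted to it.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class ExportQuery {
    // Returns the names of all attributes flagged as mandatory
    // in a (hypothetical) exported model document.
    static List<String> mandatoryAttributes(String exportedXml) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                    "//attribute[@mandatory='true']/@name",
                    new InputSource(new StringReader(exportedXml)),
                    XPathConstants.NODESET);
            List<String> names = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                names.add(nodes.item(i).getNodeValue());
            }
            return names;
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // Made-up sample data standing in for the exported model.
        String xml = "<model>"
                + "<attribute name='id' mandatory='true'/>"
                + "<attribute name='note' mandatory='false'/>"
                + "</model>";
        System.out.println(mandatoryAttributes(xml));
    }
}
```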
Before that, I was using the viewer version of the program, so I didn't have the option to export the model into an XML file. Once I downloaded the trial version, I got this option.
For those who still want to read the actual .EAP file, you can refer to the answer of Thomas Kilian.

Random error with OpenSAML XML Parser Configuration

I'm running a webapp in Tomcat 8 that uses OpenSAML. I've endorsed Xerces within Tomcat and checked that the endorsed dir path is set correctly, and everything appears to be working fine:
[ajp-apr-8009-exec-22] DEBUG org.opensaml.xml.Configuration - VM using JAXP parser org.apache.xerces.jaxp.DocumentBuilderFactoryImpl
Several requests work just fine and run through that section of code without error; then, all of a sudden, I start getting this error:
OpenSAML requires an xml parser that supports JAXP 1.3 and DOM3.
The JVM is currently configured to use the Sun XML parser, which is known
to be buggy and can not be used with OpenSAML. Please endorse a functional
JAXP library(ies) such as Xerces and Xalan. For instructions on how to endorse
a new parser see http://java.sun.com/j2se/1.5.0/docs/guide/standards/index.html
at org.opensaml.xml.Configuration.validateNonSunJAXP(Configuration.java:278)
at org.opensaml.xml.parse.BasicParserPool.<init>(BasicParserPool.java:126)
Once I start getting the error, I will get an error every time but I haven't been able to isolate what it takes to trigger the problem. (Edit: it appears that this may be related in some way to docx4j usage, the errors start after a request that uses docx4j to generate a file as a word document. Since docx4j is so reliant on XML, this maybe makes some sense.)
Basically, what validateNonSunJAXP() does is pretty simple. All it does is check the class name for the DocumentBuilderFactory and if it starts with "com.sun", it throws the error.
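The check described above can be approximated with a small diagnostic like the one below. This is a sketch based only on that description, not the actual OpenSAML source; the class and method names (ParserCheck, isSunParser, currentFactoryClassName) are made up. Logging its output per request might help pin down the moment the factory changes.

```java
import javax.xml.parsers.DocumentBuilderFactory;

final class ParserCheck {
    // Mirrors the described check: a factory class name that
    // starts with "com.sun" is treated as the Sun parser.
    static boolean isSunParser(String factoryClassName) {
        return factoryClassName.startsWith("com.sun");
    }

    // Class name of the DocumentBuilderFactory that JAXP resolves right now.
    static String currentFactoryClassName() {
        return DocumentBuilderFactory.newInstance().getClass().getName();
    }

    public static void main(String[] args) {
        String name = currentFactoryClassName();
        System.out.println(name + " -> Sun parser? " + isSunParser(name));
        // JAXP consults this system property first, so any code that sets it
        // changes the factory for the whole JVM; null means it is not set.
        System.out.println(System.getProperty("javax.xml.parsers.DocumentBuilderFactory"));
    }
}
```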
Any ideas on what could be going on that would cause the VM to stop using the endorsed library?

docx4j manipulates:
javax.xml.parsers.SAXParserFactory
javax.xml.parsers.DocumentBuilderFactory
javax.xml.transform.TransformerFactory
You can see what it does, at https://github.com/plutext/docx4j/blob/master/src/main/java/org/docx4j/XmlUtils.java
javax.xml.parsers.SAXParserFactory
In summary, you can prevent docx4j from touching this value via a docx4j properties setting.
We found Crimson fails to parse docx4j XSLT files, which is why docx4j by default tries to use Xerces, where it is included in the JDK. (Things may be better in more recent JDKs)
If you don't want this, you can specify different behaviour via docx4j.properties:
docx4j.javax.xml.parsers.SAXParserFactory.donotset=true stops docx4j from changing the setting, or
javax.xml.parsers.SAXParserFactory allows you to specify what you want
Note that we don't restore the value to its original setting since we want to avoid Crimson being used for the life of the application.
javax.xml.parsers.DocumentBuilderFactory
This works similarly to SAXParserFactory
The relevant docx4j properties are as follows:
docx4j.javax.xml.parsers.DocumentBuilderFactory.donotset
javax.xml.parsers.DocumentBuilderFactory
We don't restore the value to its original setting (though maybe we could; would need to review whether docx4j always uses XmlUtils.getNewDocumentBuilder() )

What is the Python equivalent of Java's Class.getResource()? [duplicate]

This question already has answers here:
Way to access resource files in python
In Java, if I want to read a file that contains resource data for my algorithms, how do I do it so that the path is correctly referenced?
Clarification
I am trying to understand how in the Python world one packages data along with code in a module.
For example I might be writing some code that looks at a string and tries to classify the language the text is written in. For this to work I need to have a file that contains data about language models.
So when my code is called I would like to load a file (or files) that is packaged along with the module. I am not clear on how I should do that in Python.
TIA.

I think you may be looking for pkgutil.get_data(). The docs for this say:
pkgutil.get_data(package, resource)
Get a resource from a package.
This is a wrapper for the PEP 302 loader get_data() API. The package
argument should be the name of a package, in standard module format
(foo.bar). The resource argument should be in the form of a relative
filename, using / as the path separator. The parent directory name ..
is not allowed, and nor is a rooted name (starting with a /).
The function returns a binary string that is the contents of the
specified resource.
For packages located in the filesystem, which have already been
imported, this is the rough equivalent of:
d = os.path.dirname(sys.modules[package].__file__)
data = open(os.path.join(d, resource), 'rb').read()
If the package cannot be
located or loaded, or it uses a PEP 302 loader which does not support
get_data(), then None is returned.

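I think you are looking for imp.load_source. The imp module was removed in Python 3.12, so the equivalent with importlib.util is:
import importlib.util
# Build a module spec from the file path, then load and execute the module.
spec = importlib.util.spec_from_file_location('ModuleName', '/path/of/the/file.py')
module = importlib.util.module_from_spec(spec)
spec.loader.exec_module(module)
module.FooBar()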

For Pythonistas who don't know, the behaviour of Java's Class.getResource is basically: the supplied file name is (unless it's already an absolute path) transformed into a relative path by using the class' package (since the directory path to the class file is expected to mirror the explicit "package" declaration for the class). The ClassLoader that was used to load the class in the first place then gets to transform this path string, by its own logic, into a URL object that could encode a file name, a location on the WWW, etc.
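For reference, the Java behaviour just described looks roughly like this in code. It is a minimal sketch: the class name ResourceDemo and the resource name models/en.txt are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.Optional;

final class ResourceDemo {
    // Reads a resource located relative to the anchor class's package;
    // a name starting with "/" is looked up from the classpath root instead.
    static Optional<byte[]> read(Class<?> anchor, String name) {
        URL url = anchor.getResource(name);
        if (url == null) {
            return Optional.empty(); // the class loader could not find it
        }
        try (InputStream in = url.openStream()) {
            return Optional.of(in.readAllBytes());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical data file packaged next to this class.
        Optional<byte[]> model = read(ResourceDemo.class, "models/en.txt");
        System.out.println("found: " + model.isPresent());
    }
}
```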
Python is not Java, so we have to approximate a few things and read intent into the question.
Python classes don't really explicitly go into packages, although you can create packages by putting them in folders with an additional __init__.py file.
Python does not really have anything quite like the URL class in its standard library; although there is plenty of support for connecting to the Internet, you're generally expected to just use strings to represent URLs (and file names) and format them appropriately. This is arguably an unfortunate missed opportunity for polymorphism (it would not be hard to make your own wrapper, though you might miss lots of special cases and useful functionality). Anyway, in normal cases with Java, you're not expecting to get a web URL from this process.
Python has a concept of a "working directory" that depends on how the Python process was launched. File paths are not necessarily relative to the directory where the "main class" (well, really, "main module", because Python doesn't make you put everything in a class) is found.
So what you really want, probably, is to get the absolute path on disk to the source file corresponding to the class. But that isn't really going to work out either. The problem is: given a class, you can get the name of the module it comes from, and then look up that name to get the actual module object, and then from the module object get the file name that the module was loaded from. However, that file name is relative to whatever the working directory was when the module was loaded, and that information isn't recorded. If the working directory has changed since then (with os.chdir), you're out of luck.
Please try to be more clear about what you're really trying to do.

Anyone know of a Java File servlet framework that supports HttpRanges etc

I have been trying to find a servlet file framework that provides a bit more than just reading a file and setting the appropriate headers. There are countless samples available on the net; most are quite elementary, and very few (almost none) support something more complex, as I will describe below.
HTTP
HTTP provides much richer features, such as
- ranges, which help implement file download resumes.
- cache control via ETags and last-modified dates.
Googling
However, I cannot find anything more than a simple file servlet example. Unfortunately, "Java File Download Servlet Framework" and other similar combinations are very overloaded search terms, and most of the time Google returns web frameworks rather than something that makes it easy to support some or all of the advanced features mentioned above.
Thinking...
Just off the top of my head the framework would provide an interface like this:
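import java.io.InputStream;
import java.util.Date;

interface FileProvider {
    Date lastModified();        // when the underlying content last changed
    InputStream inputStream();  // the content to stream to the client
    String etag();              // validator for conditional requests
    // ...
}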
The FileProvider takes the file path and resolves it to a real file, perhaps something from a database, etc.
If the file has not changed (determined by reading FileProvider.lastModified()), that's it.
If the request asks for a range, then the framework would read FileProvider.inputStream(), writing only the requested ranges to the HttpServletResponse.
The etag value would be used during the negotiation phase to determine if ranges are supported, etc.
Another interface would exist to "create" the FileProvider given a path, etc.
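The range step could be sketched as follows. This is a standard-library-only illustration, not part of any existing framework: the class RangeCopy and its methods are made up, only a single range per request is handled, and the servlet plumbing is left out (a real response would also need status 206 and a Content-Range header).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

final class RangeCopy {
    // Parses "bytes=0-499", "bytes=500-" or "bytes=-200" into {start, endInclusive}.
    static long[] parseRange(String header, long length) {
        if (header == null || !header.startsWith("bytes=") || header.contains(",")) {
            throw new IllegalArgumentException("unsupported Range header: " + header);
        }
        String[] parts = header.substring("bytes=".length()).split("-", -1);
        if (parts.length != 2) {
            throw new IllegalArgumentException("malformed Range header: " + header);
        }
        long start;
        long end;
        if (parts[0].isEmpty()) {
            // Suffix form: the last N bytes of the content.
            start = Math.max(0, length - Long.parseLong(parts[1]));
            end = length - 1;
        } else {
            start = Long.parseLong(parts[0]);
            end = parts[1].isEmpty() ? length - 1 : Math.min(Long.parseLong(parts[1]), length - 1);
        }
        if (start > end) {
            throw new IllegalArgumentException("unsatisfiable range: " + header);
        }
        return new long[] {start, end};
    }

    // Copies bytes start..end (inclusive) from in to out.
    static void copy(InputStream in, OutputStream out, long start, long end) throws IOException {
        long toSkip = start;
        while (toSkip > 0) {
            long skipped = in.skip(toSkip);
            if (skipped <= 0) {
                if (in.read() < 0) {
                    return; // stream ended before the range started
                }
                skipped = 1;
            }
            toSkip -= skipped;
        }
        byte[] buffer = new byte[8192];
        long remaining = end - start + 1;
        while (remaining > 0) {
            int n = in.read(buffer, 0, (int) Math.min(buffer.length, remaining));
            if (n < 0) {
                break; // stream ended early
            }
            out.write(buffer, 0, n);
            remaining -= n;
        }
    }

    // Convenience wrapper for in-memory data, handy for trying the logic out.
    static byte[] slice(byte[] data, String rangeHeader) {
        long[] range = parseRange(rangeHeader, data.length);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            copy(new ByteArrayInputStream(data), out, range[0], range[1]);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }
}
```

In a servlet, copy(...) would be called with FileProvider.inputStream() and the response's output stream, using the range parsed from the request's Range header.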
If anyone knows of a framework that separates out all the nasty bits of reading headers and comparing values, that would be great.
The best source I could find to get started is
http://balusc.blogspot.com/2009/02/fileservlet-supporting-resume-and.html
but unfortunately the sample has no provision for plugging in a FileProvider and assumes that the path info from the request maps to a file on disk in some directory.

Apache Tomcat's DefaultServlet might have most of what you're looking for. At a glance, I see ETag parsing and AcceptRange handling.
http://www.docjar.com/html/api/org/apache/catalina/servlets/DefaultServlet.java.html
