Convert PMML Model (Artificial Neural Network) to Java Code

I have a PMML file of a trained Artificial Neural Network (ANN). I would like to create a Java method which simply takes in the inputs and returns the targeted value.
This seems pretty easy, but I do not know how to realize it.
The PMML version is 3.0.
Update: 24.05.2013
I tried to use the JPMML Java API.
This is what I have done:
(1) Downloaded three .jar files via the Maven Central Repository (link):
pmml-manager-1.0.2.jar
pmml-model-1.0.2.jar
pmml-evaluator-1.0.2.jar
(2) Used Eclipse's "Configure Build Path" to add those three external .jars
(3) Imported my PMML file named "text.xml" (an artificial neural network (ANN), PMML version="3.0")
(4) Tried to run the example "TreeModelTraversalExample.java" provided by the JPMML project
Obviously it did not work, for several reasons:
The mentioned example is not for ANNs. How do I rewrite it?
My PMML file is in XML format. Is that the right format?
I do not know how to handle or add Java APIs. Should I even add them via "Configure Build Path" in Eclipse?
Obvious fact #2: I have no clue what I'm doing :-)
Thanks again and kindest regards.
Stefan

JPMML should be able to handle NeuralNetwork models in PMML 3.X and newer without problem. Moreover, it should be able to handle all the normalization and denormalization transformations that may accompany such models.
I could use a clarification as to why you are interested in converting PMML models to Java code in the first place. It complicates the whole matter a lot and doesn't add any value. The JPMML library itself is rather compact and has minimal external dependencies (at the moment of writing this, it only depends on commons-math). There shouldn't be much difference performance-wise. You can reasonably expect to obtain up to 10,000 scorings/sec on a modern desktop computer.
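For the record, evaluating a PMML document directly through the library only takes a few lines. The sketch below follows the JPMML-Evaluator API of a later release (the 1.0.2 jars mentioned above exposed an older PMMLManager-based API), so treat the class names as assumptions and check the README of your version:

    import java.io.File;
    import java.util.LinkedHashMap;
    import java.util.Map;

    import org.dmg.pmml.FieldName;
    import org.jpmml.evaluator.Evaluator;
    import org.jpmml.evaluator.EvaluatorUtil;
    import org.jpmml.evaluator.FieldValue;
    import org.jpmml.evaluator.InputField;
    import org.jpmml.evaluator.LoadingModelEvaluatorBuilder;

    public class NeuralNetworkScorer {

        // Takes raw input values keyed by field name, returns the target value
        public static Object score(File pmmlFile, Map<String, ?> userInput) throws Exception {
            Evaluator evaluator = new LoadingModelEvaluatorBuilder()
                    .load(pmmlFile)
                    .build();

            // Convert raw user input to PMML-typed values, field by field
            Map<FieldName, FieldValue> arguments = new LinkedHashMap<>();
            for (InputField inputField : evaluator.getInputFields()) {
                FieldName name = inputField.getName();
                arguments.put(name, inputField.prepare(userInput.get(name.getValue())));
            }

            // Evaluate the network and decode the result back to a plain Java object
            Map<FieldName, ?> results = evaluator.evaluate(arguments);
            return EvaluatorUtil.decode(results.get(evaluator.getTargetField().getName()));
        }
    }

That is essentially the whole "Java method which takes in the inputs and returns the targeted value" asked for in the question.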
The JPMML codebase has recently moved to GitHub: http://github.com/jpmml/jpmml
Fellow coders in Turn Inc. have forked this codebase and are implementing PMML-to-Java translation (see top-level module "pmml-translation") for selected model types: https://github.com/turn/jpmml
At the moment I recommend you check out the Openscoring project (which uses JPMML internally): http://www.openscoring.org
Then, you could try the following (a rough Java sketch of the deployment step follows this list):
Deploy your XML file using the HTTP PUT method.
Get your model summary information using the HTTP GET method. If the request succeeds (as opposed to failing with an HTTP status 500 error code) then your model is well supported.
Execute the model either in single prediction mode or batch prediction mode using the HTTP POST method. Try sending larger batches to see if it meets your performance requirements.
Undeploy the model using the HTTP DELETE method.
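To give a feel for the first step, here is a minimal Java sketch of deploying the model over HTTP PUT. The endpoint layout (/openscoring/model/{id}) is an assumption on my part; check the Openscoring documentation of your version for the exact paths:

    import java.io.OutputStream;
    import java.net.HttpURLConnection;
    import java.net.URL;
    import java.nio.file.Files;
    import java.nio.file.Paths;

    public class DeployModel {

        public static void main(String[] args) throws Exception {
            // Assumed endpoint layout; "text" becomes the model id
            URL url = new URL("http://localhost:8080/openscoring/model/text");
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            conn.setRequestMethod("PUT");
            conn.setDoOutput(true);
            conn.setRequestProperty("Content-Type", "text/xml");
            try (OutputStream os = conn.getOutputStream()) {
                // Upload the PMML document as-is
                Files.copy(Paths.get("text.xml"), os);
            }
            // A 2xx status means deployed; a 500 would indicate the model is unsupported
            System.out.println("HTTP status: " + conn.getResponseCode());
        }
    }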
You can always try contacting project owners for more insight. I'm sure they are nice people.

Another approach would be to use the Cascading API. There's a library called "Pattern" for Cascading, which translates PMML models into Cascading apps in Java. https://github.com/Cascading/pattern
Generally those are for Hadoop jobs; however, if you use the "local mode" flow planner in Cascading, it can be built as a JAR file to include with some other Java app.
There is work in progress for ANN models. Check on the developer email list: https://groups.google.com/forum/?fromgroups#!forum/pattern-user

I think this might do what you need. It is an open source library that claims to be able to read and evaluate PMML neural networks. I have not tried it.
https://code.google.com/p/jpmml/

Related

Process Mining and Process Discovery using ProM

I am new to this domain. My goal is to find similarities between event log patterns. For this I have selected the alpha algorithm. I have already seen videos about the heuristic approach in ProM. But my confusion is how I can implement this in my Java project using the ProM framework/plugins. Is this possible or not? Have I selected the right algorithm for this task?
As I said, I am new to this domain; it would be very helpful if someone could guide me on the first steps.
Thanks
You cannot. ProM stands on its own and does not support being embedded in other projects (such as plain Java or web applications). You may write a ProM plugin to use your algorithm inside ProM, or create your own Java project, but then you would have to implement the process mining logic from the bottom up.
You can implement your ProM plugin as a class in Java. You can also modify the current ProM plugins locally on your machine. However, the Alpha algorithm is not suitable for this task. There are plenty of plugins available that can help you in this regard. For example, if you consider directly-follows relations as the pattern, the "Discover Matrix" plugin could be useful.
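To give you a starting point, here is roughly what a ProM plugin class looks like. The annotation attributes and imports below are from memory of the ProM developer documentation, so treat them as assumptions and compare against the source of an existing plugin:

    import org.deckfour.xes.model.XLog;
    import org.processmining.framework.plugin.PluginContext;
    import org.processmining.framework.plugin.annotations.Plugin;

    public class LogPatternMiner {

        @Plugin(
            name = "Discover Event Log Patterns",   // shown in ProM's plugin list
            parameterLabels = { "Event log" },
            returnLabels = { "Pattern report" },
            returnTypes = { String.class },
            userAccessible = true)
        public static String mine(PluginContext context, XLog log) {
            // XLog is a List of traces; walk them and collect e.g.
            // directly-follows pairs, then compare the pairs across logs
            return "Analyzed " + log.size() + " traces";
        }
    }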

Tensorflow Android demo: load a custom graph in?

The Tensorflow Android demo provides a decent base for building an Android app that uses a TensorFlow graph, but I've been getting stuck on how to repurpose it for an app that does not do image classification. As it is, it loads in the Inception graph from a .pb file and uses that to run inferences (and the code assumes as such), but what I'd like to do is load my own graph in (from a .pb file), and do a custom implementation of how to handle the input/output of the graph.
The graph in question is from Assignment 6 of Udacity's deep learning course, an RNN that uses LSTMs to generate text. (I've already frozen it into a .pb file.) However, the Android demo's code is based on the assumption that they're dealing with an image classifier. So far I've figured out that I'll need to change the values of the parameters passed into tensorflow.initializeTensorflow (called in TensorFlowImageListener), but several of the parameters represent properties of image inputs (e.g. IMAGE_SIZE), which the graph I'm looking to load in doesn't have. Does this mean I'll have to change the native code? More generally, how can I approach this entire issue?
Look at TensorFlow Serving for a generic way to load and serve tensorflow models.
Good news: it recently became a lot easier to embed a pre-trained TensorFlow model in your Android app. Check out my blog posts here:
https://medium.com/@daj/using-a-pre-trained-tensorflow-model-on-android-e747831a3d6 (part 1)
https://medium.com/@daj/using-a-pre-trained-tensorflow-model-on-android-part-2-153ebdd4c465 (part 2)
My blog post goes into a lot more detail, but in summary, all you need to do is:
Include the compile 'org.tensorflow:tensorflow-android:+' dependency in your build.gradle.
Use the Java TensorFlowInferenceInterface class to interface with your model (no need to modify any of the native code).
The TensorFlow Android demo app has been updated to use this new approach. See TensorFlowImageClassifier.recognizeImage for where it uses the TensorFlowInferenceInterface.
You'll still need to specify some configuration, like the names of the input and output nodes in the graph, and the size of the input, but you should be able to figure that information out from using TensorBoard, or inspecting the training script.
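As a concrete illustration, feeding a non-image graph through TensorFlowInferenceInterface looks roughly like this. The node names, file name and tensor shapes below are placeholders for whatever your frozen RNN actually uses (look them up in TensorBoard or the training script):

    import android.content.res.AssetManager;
    import org.tensorflow.contrib.android.TensorFlowInferenceInterface;

    public class TextGenerator {

        private final TensorFlowInferenceInterface inference;

        public TextGenerator(AssetManager assets) {
            // "frozen_model.pb" placed under src/main/assets (placeholder name)
            inference = new TensorFlowInferenceInterface(assets, "frozen_model.pb");
        }

        public float[] predictNextChar(float[] oneHotInput, int vocabSize) {
            // "input" and "output" are placeholder node names
            inference.feed("input", oneHotInput, 1, vocabSize);
            inference.run(new String[] { "output" });
            float[] probabilities = new float[vocabSize];
            inference.fetch("output", probabilities);
            return probabilities;
        }
    }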

Dart: How to setup a project

Since my attempt to set up a Dart project myself did not succeed, I think I am missing something fundamental, so I still need the help of the community.
Coming from GWT, I am used to a single application forming a single JS file which is run and will augment an HTML element once it is recognized by the application.
There will usually be two JS files: one for the user frontend and one for the web application's backend application.
I want a solution with an incremental build during development time (which I guess Dart offers when used in Dartium).
I have an in-house web framework that I want to be started and used to serve the Dart files for the Dartium session. How will this integrate and interfere with the debug sessions?
Update regarding web framework:
The web framework is a component-based rendering engine, including a database, and uses its own resource management, including everything HTTP-related such as setting the cache flags. It's about 1.5 MB with 1200+ tests. It's simply everything you need, starting from a simple servlet. It's also using an embedded Jetty.
The relevance here is that I need to know how the debugger connects to Dartium and how it finds the files once an instance is running and has delivered an HTML file containing Dart sources. So how can I start my own web server at a given port and still have Dartium debug capabilities?
Update regarding the former answers:
I tried it, but after two days gave up to learn more and do some other stuff. I just don't know why it is not possible to add a simple file to the root package of my Dart module, like the good old package.html (Javadoc) file. I would then just add the Dart libraries to my project, the Dart plugin would add the required Dart nature to the project and create a builder entry, done. Why do I have to do all the fuss? Or even better, why can't I just annotate my module's main class to form a module, so I can replace the extra file completely?
I guess the Dart plugin has a model of the Dart code already, so discovery is done on the fly in Eclipse.
I also do not know why I can't put my Dart code in a Dart source folder like src/dart/main and src/dart/test.
Or is this possible? I am still trying to get this done. I will use a fresh Eclipse 3.8 install and check if I can get Dartium to work. Just installing the plugin seems not to do the trick.
Update regarding the JS generation:
I cannot understand why Dart is not offering an incremental build of JS files, even if it is a single file. It should not be that hard to unbundle the given compile steps. I guess it would be something like compiling each source file independently and linking them together, plus some tree shaking, done. It would be awesome if this could be made possible. Remember, one can hold a model of the output file in memory (or on disk) and know what part of the JS relates to which source file. Then just look up the linker symbol tables and write back the part that has changed.
For me the killer feature of Dart would be the ease of configuration I outlined plus an incremental build of JS files, making co-developing in JS a no-brainer. I guess in the end both JS files will be only about 750 kB combined, so all the stuff with additional compression would neither force me to upgrade my 8 GB of memory nor stress my SSD at all (350 MB/sec for writes in burst mode).
Is there any work planned on this? It would be great to have Dart as the final solution for JS creation, but to be honest I do not understand why GWT creates JS the way it does. An incremental build and easy setup for GWT would also be welcome.
Seems not to be a question ...
In Dart you usually have one JS file, because Dart on the server runs natively (without transpiling).
With Dartium you don't have a build step at all, because it also runs Dart natively.
You build to JavaScript only for deployment (and of course to test the build output before deployment).
The debugging is done by Dartium itself (you can use the Chrome DevTools debugger without DartEditor if you want). DartEditor accesses the debugger API of Dartium and acts as a remote display/control.
Debugging web clients loaded from other web servers is supported.
What might cause some work is setting up your custom web server so that it forwards requests for source files to pub serve, the web server used by DartEditor (it can also run standalone).
pub serve runs transformers (on-the-fly code transformations/generation). Some frameworks depend on transformers being run on the code to make it functional.
I have no idea what this means, but I don't use the Eclipse Dart plugin.
[Update regarding the former answers] I tried it but after two days gave up to learn more and do some other stuff. I just don't know why it is not possible to add a simple file to the root package of my module, like the good old package.html file for the Javadocs, and then all I do is add the Dart libraries to my project and the Dart plugin adds the nature to it and creates a builder entry, done. Why do I have to do all the fuss? Or even better, why can't I just annotate my module's main class to form a module, so I can replace the extra files?
To integrate Dart with your Java project, create the Dart project independently from your project and move the Dart build output to a directory where you keep your other static files.
While developing, configure your web server to forward to pub serve as explained above.
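If the in-house framework runs on embedded Jetty, one possible wiring (a sketch assuming Jetty 9's ProxyServlet is on the classpath and that pub serve listens on its default port 8080) would be:

    import org.eclipse.jetty.proxy.ProxyServlet;
    import org.eclipse.jetty.server.Server;
    import org.eclipse.jetty.servlet.ServletContextHandler;
    import org.eclipse.jetty.servlet.ServletHolder;

    public class DevServer {

        public static void main(String[] args) throws Exception {
            Server server = new Server(9090); // your framework's own port

            // Forward everything under /dart/* to pub serve, so Dartium still
            // sees the original source files and debugging keeps working
            ServletContextHandler context = new ServletContextHandler();
            ServletHolder proxy = new ServletHolder(ProxyServlet.Transparent.class);
            proxy.setInitParameter("proxyTo", "http://localhost:8080");
            proxy.setInitParameter("prefix", "/dart");
            context.addServlet(proxy, "/dart/*");

            server.setHandler(context);
            server.start();
            server.join();
        }
    }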
As already stated in my first answer, this
[Update regarding the JS generation] I cannot understand why Dart is not offering an incremental build of JS files. Even if it is a single file, it should not be that hard to unbundle the given compile steps. I guess it would be something like compiling a single file and linking those, then the magical tree shaking, and done.
is irrelevant. You don't do anything with JavaScript while developing. If you load the page with a non-Dartium browser, pub serve will serve built JavaScript instead of Dart. Incremental build is in the works to improve responsiveness, but incremental build is not available for file generation (which would make sense anyway, IMHO).

Java crawler library - recursive HTTP subtree download with directory listing parser

My application currently reads data by copying a filesystem tree from a remote machine via a shared disk, so it works as a filesystem deep copy from the application's point of view.
This solution is somewhat limiting and I want to also support a second option: copying the subtree via HTTP.
The library should do something like wget --recursive, which parses the directory listing and uses it for traversing down the tree.
I could not find any Java library doing this.
I am able to implement such functionality myself (with NekoHTML or something similar; a rough sketch of that DIY approach is at the end of this question), but I don't like reinventing the wheel.
Is there such a library that I can easily use within my application?
Ideally:
published in the Maven Central Repository, as I am using Maven for builds
with as few dependencies on other libraries as possible
no need for robots exclusion support - it will operate on a limited set of interim servers only
Thanks.
Note: please post pointers to homepages of libraries which you personally used.
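For clarity, here is roughly the DIY implementation I'd like to avoid writing myself (sketched with jsoup instead of NekoHTML just for brevity; the link handling assumes Apache/nginx-style auto-generated indexes):

    import java.util.HashSet;
    import java.util.Set;

    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;
    import org.jsoup.nodes.Element;

    public class HttpTreeCopier {

        // Recursively walk an auto-generated directory listing, wget --recursive style
        public static void crawl(String url, Set<String> visited) throws Exception {
            if (!visited.add(url)) {
                return; // already seen
            }
            if (url.endsWith("/")) {
                // Directory listing page: follow only links that descend into it
                Document listing = Jsoup.connect(url).get();
                for (Element link : listing.select("a[href]")) {
                    String child = link.absUrl("href");
                    if (child.startsWith(url) && child.length() > url.length()) {
                        crawl(child, visited);
                    }
                }
            } else {
                // Leaf node: fetch the file itself
                byte[] bytes = Jsoup.connect(url).ignoreContentType(true)
                        .execute().bodyAsBytes();
                System.out.println("downloaded " + url + " (" + bytes.length + " bytes)");
            }
        }

        public static void main(String[] args) throws Exception {
            crawl("http://example.com/data/", new HashSet<String>());
        }
    }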
The Norconex HTTP Collector traverses websites like a tree, given one or more start URLs. It can be used as a Java library in your application, or as a command line application. You can decide what to do with each document it crawls. Being a full-blown web crawler, it probably does more than what you are after, but you can configure it to suit your need.
For instance, it will by default extract the text found in your documents, and it lets you decide what to do with that text by plugging in a "Committer" (i.e. where to "commit" the extracted content). In your case I think you want the raw documents only, ignoring the text conversion part. You can do so by plugging in your own document processor, followed by "filtering out" documents so they stop being processed once you have dealt with them in your own way.
The project is open source, hosted on GitHub, and is fully "mavenized". It supports robots.txt, but you can turn that off if you want. The only downside for you is that it has more than a few dependencies, but since you are using Maven, those should get resolved automatically without effort. You'll find the Maven repository info on the product site.

Testing StarTeam operations

In a Java application I need to check out files from Borland StarTeam 2006 R2 using the StarTeam API, selecting them by various parameters (date, label). Is there any framework that helps to write automated tests for such functionality?
I'm not aware of any; the approach I'd take is a project which has sample files you can check out by various criteria, and then verify that everything you expected arrived and that it is the right file (its hash matches).
You're aware that they ship a command-line client (stcmd) too, right? For a lot of things, you don't need to use the API at all.
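To sketch the test-project approach combined with the command-line client: the JUnit test below drives stcmd and verifies the checked-out file by hash. The stcmd flags and the project spec are from memory and purely illustrative, so double-check them against your StarTeam documentation.

    import static org.junit.Assert.assertEquals;

    import java.io.File;
    import java.nio.file.Files;
    import java.security.MessageDigest;

    import org.junit.Test;

    public class CheckoutTest {

        private static final String EXPECTED_SHA1 = "..."; // hash of the known sample file

        @Test
        public void checkoutByLabelYieldsExpectedFile() throws Exception {
            File workDir = Files.createTempDirectory("starteam-co").toFile();

            // Hypothetical invocation: check out one file at a given view label
            Process p = new ProcessBuilder(
                    "stcmd", "co",
                    "-p", "user:password@starteam-host:49201/Project/View", // illustrative project spec
                    "-vl", "RELEASE_1_0",                                   // the label under test
                    "-fp", workDir.getAbsolutePath(),
                    "sample.txt")
                    .inheritIO()
                    .start();
            assertEquals(0, p.waitFor());

            // Verify we got exactly the file we expected
            byte[] content = Files.readAllBytes(new File(workDir, "sample.txt").toPath());
            byte[] digest = MessageDigest.getInstance("SHA-1").digest(content);
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            assertEquals(EXPECTED_SHA1, hex.toString());
        }
    }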
