Most performant Bayesian Network Java API available?

I'm currently planning the implementation of a Bayesian network solution for inference of outcome probabilities given known node networks, in a Java application. I've been looking around the web for what Java APIs are available, and have come across a number of them – jSMILE, AgenaRisk, JavaBayes, Netica-J, Jayes, WEKA(?), etc.
Now, I'm struggling to find any good comparison of all these APIs in terms of performance, usability, and price for commercial applications, so as to know which one is best to go for. I have tested the AgenaRisk API, which suited my needs; however, before committing to this product it would be great to hear if anyone has any knowledge of:
variation in performance between the different APIs (or is it negligible, i.e. do they all rely on the same fundamental Bayesian calculations?)
robust free alternatives to AgenaRisk?
does it matter if one of these solutions seems to no longer be supported or relies on a very old version of Java (e.g. JavaBayes targets Java 1.1, I believe)?
(Bonus points) are Bayesian networks and Bayesian network classifiers one and the same thing? For example, WEKA advertises itself as providing the latter.
The last post on here looking for a good solution was from 2012, so I'm wondering if anyone would recommend any new solutions which have emerged since, or if it's still a good bet to work with the ones listed above.
Thanks!

Well for anyone who comes across this and is interested:
I'll be figuring this out myself soon and will update this answer (if I forget, nudge me).
Jayes seems to do everything AgenaRisk does if you're simply creating a graph and want to get some beliefs out of it, minus the GUI. It also seems you can choose which inference algorithm to use, which I haven't seen with AgenaRisk.
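For anyone curious, this is roughly what that looks like in Jayes: a minimal sketch pieced together from its tutorial, so treat the package paths and method names as assumptions that may differ slightly in your version.

import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import org.eclipse.recommenders.jayes.BayesNet;
import org.eclipse.recommenders.jayes.BayesNode;
import org.eclipse.recommenders.jayes.inference.IBayesInferer;
import org.eclipse.recommenders.jayes.inference.junctionTree.JunctionTreeAlgorithm;

public class JayesSketch {
    public static void main(String[] args) {
        BayesNet net = new BayesNet();

        BayesNode rain = net.createNode("rain");
        rain.addOutcomes("true", "false");
        rain.setProbabilities(0.2, 0.8);             // P(rain)

        BayesNode grassWet = net.createNode("grassWet");
        grassWet.addOutcomes("true", "false");
        grassWet.setParents(Arrays.asList(rain));
        grassWet.setProbabilities(
                0.9, 0.1,                            // P(grassWet | rain = true)
                0.2, 0.8);                           // P(grassWet | rain = false)

        // This is where Jayes lets you pick the inference algorithm.
        IBayesInferer inferer = new JunctionTreeAlgorithm();
        inferer.setNetwork(net);

        Map<BayesNode, String> evidence = new HashMap<>();
        evidence.put(grassWet, "true");
        inferer.setEvidence(evidence);

        double[] beliefs = inferer.getBeliefs(rain); // posterior P(rain | grassWet = true)
        System.out.println(Arrays.toString(beliefs));
    }
}

One note on the performance question above: junction tree inference is exact, so any two correct implementations should return identical beliefs for the same network; performance differences between APIs come from the implementation and the algorithm chosen, not the underlying math.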
I'm going to stick to currently supported solutions, to be safe.
...yes, they are: a Bayesian network classifier is essentially a Bayesian network used for a classification task.

Related

IVR Development in java

I'm going to develop an online IVR application using Java (without a PBX).
The software requirements include some mathematical calculations and database communication, which I prefer to implement on the Java side.
As you know, different technologies are ready to integrate with Java, such as JTAPI, Zanzibar OpenIVR, Moho, VoiceXML, CCXML, Jive, Prophecy, Voicent, Voxeo etc.
Now the question is: What is the best solution? Which one is easiest to get started with? Which one has the best efficiency? Do you recommend open-source frameworks? Is there any Windows API for handling IVR systems?
If you're going to do VoiceXML with Java, you should take a look at Rivr, an open-source VoiceXML dialogue engine.
Rivr lets you code your callflow naturally in the Java language. Thus you can reuse all the available Java tools (e.g. debugger, unit testing framework, coverage tool) to develop the callflow. You also benefit from all your IDE features (refactorings, source navigation, version control, etc.).
The API is very simple. You can code a complete callflow with a single method. No need to define "states" or to manipulate templates or XML files.
Integration with server-side logic is trivial since you are only coding for the server side.
There is far too little information here to provide a direct answer, but I'll try to give you some basics.
The standards for IVR application development are VoiceXML for dialogue (caller interaction) and CCXML for call control. The latter is not as commonly available. There are also numerous proprietary solutions. Your choice of an open standard versus a proprietary solution should be mostly about vendor/solution lock-in. Even with the open standards you'll likely use custom enhancements and have some amount of lock-in, but porting will be easier. You can code directly to the telephony boards (challenging, and usually poorly documented for someone new to telephony) or work with solutions that provide end-to-end capability. I find very few people porting IVR applications, so I would focus on supportability of your application, features, and ease of use in your decision.
Platform choices run the spectrum. You have premise (onsite) and hosted solutions. You mostly have high end enterprise solutions and low end solutions. There are very few middle ground solutions. Features (telephony and integration capabilities) vary dramatically.
From a telephony perspective, take nothing for granted. In particular, transfers. There are many ways to transfer a call. How it is done will be constrained by your connection. An analog line to the CO (phone company) can have multiple mechanisms and the one in place will typically be dictated to you. Not all telephony platforms will support what you need. Hangup detection, at least on analog lines, can also catch the novice out. Hosted solutions will typically allow you to avoid most of these problems. VoIP solutions are even more complicated due to compatibility between devices (yes there are standards, lots of them, with lots of optional parts and then there are custom flavors).
For windows specifically, you can use Lync, but it is complicated...though many of the solutions you will explore will be complicated.
In short, there is no best solution. Your knowledge of the technologies, requirements and budget are going to drive the decision. I've generally worked with enterprise IVRs in on premise and hosted configurations that are typically fronting large call centers. I have come in contact with many of the open source solutions. Anything on premise is likely to be complicated because of the system and telephony configuration. Hosted solutions have typically done most of that for you.
I know that those are "de jure" standards, but you should also take Asterisk (with AGI/AMI) into consideration for your project. If you decide to try Asterisk and Java, take a look at astivetoolkit.org; it may be very helpful.
Ricky from Twilio here.
For me, picking the best tool for a particular problem is one of my favorite tasks as a developer. One technique for figuring this out is blocking off a day and spending an hour or two with each potential option. A few questions I'll typically explore:
Which tool is the easiest to get started with?
Which tool has the best documentation?
Which tool has an engaged community that I can learn from?
I'm sure there are a ton more questions you'd want to explore depending on your scenario (Does it fit within my budget? Can I use it with the technologies I already know and love?).
If you're looking at building an IVR, we have an API that could help. We just dropped some new tutorials, including a non-trivial, production-ready IVR application using Java.

Drop-in replacement for MICO Corba?

We are currently using MICO to establish the communication between a server and a client, where the client is a simulator written in C++ and the server is a Java program displaying an animation of what happens in the simulation. It seems that the development of MICO has slowed to an almost complete halt, and bugs that we can only hack around (as we don't have the time to first figure out which parts of MICO are responsible for code generation and so on) keep us from making real progress.
So, does any one of you know of a good drop-in replacement? We would like to have the following:
The compiler can generate both C++ and Java code from the IDL.
The project should still show considerable support.
The implementation should be open source (GPL, BSD, or something similar), as we use our programs to teach students as well.
The migration from MICO should be as easy as possible. (This is not a hard requirement, but would be a good thing)
I found some other CORBA implementations, e.g. TAO, but I couldn't say of any of them that they are still supported. Correct me if I am wrong here.
The Free CORBA® Downloads page might be of interest to you.
Just naming a few:
ORBit2 [1], also PyORBit etc.
omniORB
TAO (has already been mentioned)
[1] On my Ubuntu box, apt-rdepends -r liborbit2 returns 5530 lines...
I don't know where CORBA, or MICO in particular, has gone in the last 5 years, but back then a drop-in replacement was not really possible, since differences between vendors were still there.
Not necessarily API differences (POA, etc.) but
in implementation behavior,
in custom extensions which were required to make it work in a real-world environment (threading, load balancing, security, etc.),
in how the development tools worked
and also about the whole deployment or runtime story.
We had Orbix ASP/2000/Whatever and ORBacus, which were interchangeable thanks to a small compatibility layer, some Makefile framework to hide differences in tools (e.g. the IDL compiler), and some scripts for wrapping ORB-specific processes.
Unfortunately, ORBacus was long ago bought by (then) IONA, which already made Orbix. IONA itself has been bought by someone else (I forget whom). The original authors of ORBacus, plus some devs from IONA Orbix, changed their ways somewhat and produced Ice, which is not CORBA but somewhat alike, of course without the glitches ;-)
Concerning TAO, I think it would be the "best" choice in terms of still being developed, as it is driven by research at Washington University. But last time I looked they didn't have a Java implementation; people seemed to use JacORB.
Maybe all this helps you a little, unless it brings even more confusion :-)
TAO as C++ ORB is still actively supported and developed (see http://www.cs.wustl.edu/~schmidt/commercial-support.html). For Java I would propose JacORB.
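Whatever ORB you end up with, Java code written against the standard org.omg.CORBA APIs stays largely portable between vendors. A minimal, hypothetical client sketch (Animator and AnimatorHelper stand in for whatever your own IDL generates; only the bootstrap and narrow calls are standard):

import org.omg.CORBA.ORB;

public class AnimationClient {
    public static void main(String[] args) {
        // The concrete ORB implementation is selected via system properties
        // (org.omg.CORBA.ORBClass), so swapping vendors shouldn't touch this code.
        ORB orb = ORB.init(args, null);

        // Stringified IOR published by the C++ simulator, passed as the first argument.
        org.omg.CORBA.Object ref = orb.string_to_object(args[0]);

        // Hypothetical types generated from your IDL by the vendor's IDL compiler.
        Animator animator = AnimatorHelper.narrow(ref);
        // animator.update(...);  // remote operations then look like plain Java calls
    }
}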
On http://www.orbzone.org there is an overview of available CORBA implementations.

How can one convince a team to use a new technology (LinQ, MVC, etc )?

Obviously, it's easier to do with some developers, but I'm sure many of us are on teams that prefer the status quo.
You know the type. You see some benefit in a piece of new technology and they prefer the tried and true methods.
Try, for example, convincing a DBA/C# programmer of the advantages of using LINQ (not necessarily LINQ to SQL, just LINQ in general).
For example, when a project requirement is to be cross-platform: instead of thinking about how one can run Windows on a Mac through a VM, introduce the idea of using the relatively new Silverlight, or of creating it in Java (as an option to look into).
I know most people don't like to be outside their comfort zone, so it takes a bit of convincing, and not ALL new technology makes business sense... but how have you convinced your team to look at a new technology?
What technologies have you successfully introduced to your workplace?
What technologies do you think are hardest to introduce? (I'm thinking paradigm-shifting ones, like moving from WebForms to MVC... or new languages)
What strategies do you employ to make these new technologies appealing?
Know the technology well before pitching it. You're going to get questions like "but how can we make it do X?", and you want to be able to give at least a general answer.
Try not to be a religious zealot. Acknowledging that the new technology is not perfect, that it's just another tool in the toolbox, goes a long way towards credibility.
Give a well-prepared live demo to show what it can do. For example, a friend of mine built a simple blog in Ruby on Rails in half an hour, in front of a live audience. I want to stress the word "well-prepared"; if things keep breaking along the way, or you don't fully understand what you're doing, or you are unable to answer basic questions, you'll hurt your cause rather than help it.
When it comes to coding practices, my favorite is quite simply to use examples. I will take a few hours and edit our code base to use the new technique in place of the previous pattern, then send a shelveset or changelist around to the rest of the developer list showing the difference. Or just have a meeting to talk about the difference.
Showing examples in real production code really helps other developers see the advantages.
I've successfully introduced LINQ to my company and it has helped quite a bit.
What worked for me? Show and tell. Our previous technology was database programming with C, which was quite a mess. Our lead developer wrote about 3000 lines of code to fill a dataset; I did it in a tenth of that with LINQ/C#.
Once I broke down what I did and he saw how powerful it was, he was convinced it was time to upgrade.
The advice from people who have convinced management to consider using F# goes something like this:
Implement the most important bits of the next key project of the company in F# in your free time and then show others what benefits it has, how quickly you were able to implement it and how easy it is to adapt the solution to changing requirements.
I think this is quite an effective way: when people actually see the productivity (of any new technology), it is much easier to convince them that it is worth learning.
It's best to lead by example. Complete a successful project using the new tool and wait for developers to ask how you did it.
I managed to convince the team I'm part of to switch from CVS to Mercurial. Can you believe we were still using CVS? Neither could I when I started.
I became almost somewhat of a preacher, a royal pain in the butt. Every time CVS screwed up or caused some sort of discomfort (being brutally slow, for example) I gave a little speech on how much better things could be.
Soon enough they accepted the possibility that there were alternatives (none of them actually knew there were alternatives to CVS!) and began to say things like "if there indeed are alternatives, anything must be better than this".
That's when I made my move: I simply ran some scripts converting the CVS repository into a Mercurial one and uploaded it to the company server. Once they saw it in action, they were sold.
Not that I planned any of this during my little migration black op, but in retrospect I would give the following advice to anyone attempting something similar:
Let people know there are (better) alternatives, it is entirely possible to work outside your comfort zone.
Lead by example, if you want something done, do it yourself. Show the alternative in action. No one is going to make the jump unless you jump first, especially not if they are already hesitating.
Show them how it solves a common problem. Pick out some problem that appears regularly for them and show them the solution. That will usually at least get them thinking about it.
Stand the two technologies next to one another; it's safe to assume that advancements have been made and that what you are bringing to the table is better suited to the job at hand.
Put the raw results in front of them and let them decide for themselves!
I work for a data bureau, and until recently the company was hooked on MS Access, which was cumbersome and unfit for the job. After some serious convincing and a demonstration of the power of SQL in comparison to Access, it's now the weapon of choice.
And it took standing the two techs side by side and allowing the guys up top to see for themselves; the time saved did make business sense!
You'll need to show why it's a BETTER technology (or at least better at something) than the current tool/method in use, and probably significantly so. Otherwise, why go through the effort of learning something new?
Otherwise, convince the boss and then get a mandate ... (though I don't really recommend that if you can't get at least half the team on board).

NetLogo vs. Repast Simphony? [closed]

I would like to simulate some scenarios using the multi-agent paradigm, and it seems NetLogo and Repast are the most popular tools for that.
I'd like to know if anyone has had any experience with either one and could tell me more about them. For example, I've noticed that there is a flowchart-like modeling option for Repast, but I believe it is rather limited. I've looked around the tutorials and documentation on the official site, and the documentation seems to be lacking. While there are some examples, extending it to simulate an environment it has not been specifically prepared for seems like an unreachable goal at the moment, despite Repast obviously being very robust and apparently able to handle it, given enough familiarity.
On the other hand, NetLogo has more examples, and overall I've liked it more for its simplicity, but it seems to be more focused on simulating the propagation of diseases or similar models. I've found a programming book teaching Logo, so I figure it'd be easier to get started with too.
Currently, I am thinking of simulating botnets and IDSes as multi-agent systems. The problem, however, is that I would have to abstract the network and transport layers to an extent to be able to do it, as well as generate traffic between the nodes. Repast is apparently a better fit for this, but given its complexity and lack of documentation I'm thinking of using NetLogo. While there are some examples of NetLogo with traditional applications (e.g. Tetris or Pac-Man), I'm not sure how appropriate it'd be for that.
I have a webpage with a couple dozen NetLogo multi-agent simulations. I use NetLogo for teaching, and I have found that, once you get past the learning curve, you can develop simulations amazingly fast. Stuff that would take you 80 man-hours in other so-called agent environments (Jade, Repast, which are really mostly just programming libraries) can be done in 2 hours.
On the other hand, NetLogo is not really good for simulations that require an immense amount of detail, like, say, simulating a network all the way from TCP/IP to HTTP. That would just require large amounts of code, regardless of programming language, and NetLogo currently sucks if your program ends up being more than 10 pages long. Having said that, most people would be amazed at what you can get done in 10 pages of NetLogo code.
Short answer: it depends on the programming paradigm or language you want to use, and the design you want for your agents:
If you want a low-entry-high-ceiling language allowing quick prototyping but sophisticated simulations, and are willing to learn a new paradigm (avoiding loops) use NetLogo. Good documentation.
If you want to make a real application to run on highly parallelized clusters, or just want to use Java/Groovy, or need a specific Java library for your purpose, use Repast, or better, Repast for High Performance Computing (but avoid ReLogo, which is very slow). Mild documentation.
If you want to model cognitive agents (instead of reactive ones) with FIPA communications, better use Jason, or better yet JaCaMo, which supports AgentSpeak + Java (so you can also use your favourite Java libraries), and no Groovy is required. Bad documentation (many features and commands are not detailed, and the examples are too complex and uncommented).
Long answer:
Disclaimer: I am more experienced with NetLogo but I also used Repast and a few others like Jason.
Basically, the difference between NetLogo and Repast is that with NetLogo you have a simpler framework but need to learn how to program in a turtle-and-patch-oriented paradigm, while with Repast you have to learn that plus the mechanisms behind Java/Groovy, but you eventually get more flexibility. Speed isn't really a criterion here (see below).
To be clearer: you can program efficiently in NetLogo if you make maximum use of the native turtle and patch functions. For example, if you want to implement A*, instead of implementing a list of nodes you should directly use the patches and filter them with constructs like these:
ask patches with [criteria1 = value1 and criteria2 = value2] [do-some-stuff]
ask patches with-min [criteria] [do-stuff]
let var [somevalue] of min-one-of patches [criteria]
Also, if you can't find a way to do what you want efficiently, be sure to check whether an extension exists for your purpose (check also here under Libraries and Tools), like the now-native matrix extension, which allowed me to make an efficient neural network in NetLogo.
On the other hand, Repast is potentially more flexible than NetLogo (since you have access to the whole range of Java libraries), but a bit more complex since you have to know how to handle Groovy.
If you are solely interested in speed, do NOT use ReLogo (a NetLogo-like syntax for Repast), which has been shown to be a whole lot slower than NetLogo (see the 2012 paper below). In any case, your best bet would be either to try an implementation in NetLogo using the tricks above, or, if you want to use your application for real later, there is also a distribution called Repast for High Performance Computing which removes most of the overhead that comes with turtle and patch objects, and thus can be used for real applications. A similar extension exists for NetLogo to compute on clusters with parallelization, but it's not an official distribution.
If you want more info about the various platforms, here is a nice review from 2006:
Railsback, S. F., Lytinen, S. L., & Jackson, S. K. (2006). Agent-based Simulation Platforms: Review and Development Recommendations. SIMULATION, 82(9), 609-623.
And an updated version of this paper in 2012 dealing with NetLogo vs ReLogo:
Lytinen, S. L., & Railsback, S. F. (2012, April). The evolution of agent-based simulation platforms: A review of NetLogo 5.0 and ReLogo. In Proceedings of the Fourth International Symposium on Agent-Based Modeling and Simulation.
EDIT: I cited Jason but didn't give any more details. If you want to model cognitive agents (instead of reactive agents), you can do that in NetLogo using the unofficial BDI extension, which works well but is a bit limited (though easily extensible, since it's pure NetLogo); but your best bet is to use a framework specifically designed to model cognitive agents with full support of AgentSpeak.
Jason is very nice since you have access to a full AgentSpeak language plus Java to implement the technical side. In fact, you can do whole projects using only AgentSpeak (which I did), but you can also make more Java-oriented versions; it's up to you how you want to design your program, and the result will be more or less the same. This offers you a lot of flexibility in your design workflow.
Tip: search for "Jason internal actions" in the documentation to get a good description of the available AgentSpeak commands.
Also, if you are interested in Jason, you might be interested in JaCaMo (= Jason + Cartago + Moise), which is the result of a cooperation between the three projects' authors to make a full-fledged cognitive agents framework that can also model complex environments (with artifacts theory) and multi-agent organisations (roles, groups, missions, etc.).
A last framework I know of but didn't have a chance to try is Mason, which supports 2D and 3D environments. Since I never tried it, I don't know how it compares with the others, but you can check it out.
Here's a generic comparison.
http://www.duncanrobertson.com/research/AMLE.pdf
I had more or less the same problem a few months ago when I had to choose a framework for my simulation. I looked at Repast, NetLogo, Swarm, and Jade.
NetLogo was nice, and I tried writing some simple test applications, but since I wanted to use Java as my programming language, NetLogo wasn't the best candidate. Repast has pretty much everything you need to write larger simulations, and there are many projects (especially in the social sciences) where Repast is used. My problems with Repast were: bad API documentation, parameters that are passed to methods or constructors but never used and don't make any sense at all (have a look at the source code), and a lot of boilerplate code.
I'm using Jade (http://jade.tilab.com/) now and I'm really happy with it. The community is good and the mailing list is VERY active. That said, Jade is just a library and a framework for agent-based modelling. You don't get anything like the visual editor in Repast, and you'll have to write your own tools for visualising the results.
Cheers
You could simulate the traffic using an agent type called "packet" that is spawned and sent from an agent called "bot" to another agent called "bot" or "server". Instead of sending the packets to an IP address, you would send them to a pair of X and Y coordinates.
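To make the idea concrete, here is a plain-Java toy sketch of a packet agent homing in on grid coordinates (all names hypothetical; in NetLogo you would declare breeds like packet and bot and use the built-in movement primitives instead):

// A toy "packet as agent": each tick it moves a fixed step toward its
// destination's grid coordinates rather than an IP address.
class Packet {
    double x, y;                 // current position
    final double destX, destY;   // coordinates of the target bot/server

    Packet(double x, double y, double destX, double destY) {
        this.x = x; this.y = y; this.destX = destX; this.destY = destY;
    }

    // Advance one tick; returns true once the packet is delivered.
    boolean step(double speed) {
        double dx = destX - x, dy = destY - y;
        double dist = Math.hypot(dx, dy);
        if (dist <= speed) { x = destX; y = destY; return true; }
        x += speed * dx / dist;
        y += speed * dy / dist;
        return false;
    }
}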
NetLogo has an example of how a virus spreads in a network; this might be a good starting point.
I have never tried NetLogo, but I have tried Repast-J and Simphony. Simphony seems good, but at the moment I am stuck on changing the edge type from a straight line to a curved one. There is not enough documentation, and few examples are available.
I once tried Mason, which is based on Java too. It is similar to Repast-J, yet faster. But recently there has not been much development on Mason.
I would like to try out Jade later.
If you can already code in Java, you can also look at the following paper for a comparison between RePast, Swarm, Quicksilver, and VSEit, four freely available programming libraries supporting social-scientific agent-based computer simulation:
Tobias, Robert, and Carole Hofmann. "Evaluation of free Java-libraries for social-scientific agent based simulation." Journal of Artificial Societies and Social Simulation 7.1 (2004).
Repast is definitely more flexible than NetLogo, but the documentation is not very detailed for Repast Simphony.

Using SIFT for Augmented Reality

I've come across MANY AR libraries/SDKs/APIs, and they were all marker-based, until I found this video; from the description and the comments, it looks like he's using SIFT to detect the object and follow it around.
I need to do that for Android, so I'm gonna need a full implementation of SIFT in pure Java.
I'm willing to do that but I need to know how SIFT is used for augmented reality first.
I could make use of any information you give.
In my opinion, trying to implement SIFT for a portable device is madness. SIFT is an image feature extraction algorithm, which includes complex math and certainly requires a lot of computing power. SIFT is also patented.
Still, if you indeed want to go forth with this task, you should do quite some research at first. You need to check things like:
Any variants of SIFT that enhance performance, including different algorithms all around
I would recommend looking into SURF, which is very robust and much faster (but still one of those scary algorithms)
Android NDK (I'll explain later)
Lots and lots of publications
Why Android NDK? Because you'll probably have a much more significant performance gain by implementing the algorithm in a C library that's being used by your Java application.
Before starting anything, make sure you do that research, because it would be a pity to realize halfway through that image feature extraction algorithms are just too much for an Android phone. Implementing such an algorithm so that it provides good results and runs in an acceptable amount of time is a serious endeavor in itself, let alone using it to create an AR application.
As for how you would use that for AR, I guess the descriptor you get from running the algorithm on an image would have to be matched against data saved in a central database, and then the results can be displayed to the user. The features gathered from SURF are supposed to describe an image such that it can then be identified from them. I'm not really experienced at doing that, but there are always resources on the web. You'd probably want to start with generic stuff such as Object Recognition.
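To give a feel for that matching step, here is a minimal brute-force sketch with Lowe's ratio test (the 0.8 threshold comes from the SIFT paper; the descriptor layout and all names are otherwise illustrative):

public class DescriptorMatcher {
    // Returns the index of the best database match for a query descriptor,
    // or -1 if it fails the ratio test. Descriptors are plain float vectors
    // (e.g. 128-dim for SIFT, 64-dim for SURF).
    static int match(float[] query, float[][] database) {
        double best = Double.MAX_VALUE, secondBest = Double.MAX_VALUE;
        int bestIndex = -1;
        for (int i = 0; i < database.length; i++) {
            double dist = 0;
            for (int j = 0; j < query.length; j++) {
                double d = query[j] - database[i][j];
                dist += d * d;                       // squared Euclidean distance
            }
            if (dist < best) {
                secondBest = best; best = dist; bestIndex = i;
            } else if (dist < secondBest) {
                secondBest = dist;
            }
        }
        // Accept only if clearly better than the runner-up: 0.8 on distances
        // becomes 0.8^2 on squared distances.
        return best < 0.8 * 0.8 * secondBest ? bestIndex : -1;
    }
}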
Best of luck :)
I have tried SURF on a 330 MHz Symbian mobile and it was still too slow, even with all the optimizations and lookup tables. And SIFT should be even slower. Everyone is using FAST on mobiles now. Anyway, feature extraction is not the biggest problem; finding correspondences and weeding out the false positives is more difficult.
FAST link
http://svr-www.eng.cam.ac.uk/~er258/work/fast.html
If I were you, I'd look into how (and why) the SIFT feature works (as was said, its Wikipedia page offers a good concise explanation, and for more details check the original paper, which is linked from Wikipedia), and then build your own variant that suits your needs, i.e. one with the optimal balance between performance and CPU load for your application.
For instance, I think Gaussian smoothing might be replaced by some faster way of smoothing.
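One plausible substitution (my assumption, not something the SIFT paper prescribes) is box filtering over a summed-area table, which smooths in constant time per pixel regardless of kernel size; repeated box blurs approximate a Gaussian, and SURF's box filters build on the same trick:

public class BoxBlur {
    // Summed-area table: ii[y][x] holds the sum of img over [0,y) x [0,x).
    static long[][] integralImage(int[][] img) {
        int h = img.length, w = img[0].length;
        long[][] ii = new long[h + 1][w + 1];
        for (int y = 0; y < h; y++)
            for (int x = 0; x < w; x++)
                ii[y + 1][x + 1] = img[y][x] + ii[y][x + 1] + ii[y + 1][x] - ii[y][x];
        return ii;
    }

    // Mean of any axis-aligned box in O(1), independent of box size.
    static double boxMean(long[][] ii, int x0, int y0, int x1, int y1) {
        long sum = ii[y1 + 1][x1 + 1] - ii[y0][x1 + 1] - ii[y1 + 1][x0] + ii[y0][x0];
        return sum / (double) ((x1 - x0 + 1) * (y1 - y0 + 1));
    }
}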
Also, when you build your own variant, you don't have to worry about the patent (there are already lots of variants, like GLOH).
I would recommend starting by looking at the feature detectors already implemented in the OpenCV library, which include SURF, MSER and others:
http://opencv.willowgarage.com/documentation/cpp/feature_detection.html
These might be enough for your application, and they are faster than SIFT. And as mentioned above, SIFT is patented.
Also, start by making performance tests on your mobile platform, just extracting the features at every frame; this way you'll get an idea of which ones can run in real time and which can't.
Have you tried OpenCV's FAST implementation in the Android port? I've tested it out and it runs blazingly fast.
You can also compute reduced histogram descriptors around the detected FAST keypoints. I've heard of using a 3x3 grid rather than SIFT's standard 4x4. That has a decent chance of working in real time if you optimize it heavily with NEON instructions. Otherwise, I'd recommend something fast and simple like the sum of squared or absolute differences over a patch around each keypoint, which is very fast to compute.
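For illustration, the sum-of-absolute-differences comparison mentioned above is only a few lines (a sketch; extracting the patches around each keypoint is assumed to happen elsewhere):

public class PatchCompare {
    // Sum of absolute differences between two equal-sized grayscale patches.
    // Smaller means more similar: compare a patch around a keypoint in one
    // frame against candidate patches in the next frame to track it.
    static int sad(int[][] a, int[][] b) {
        int sum = 0;
        for (int y = 0; y < a.length; y++)
            for (int x = 0; x < a[y].length; x++)
                sum += Math.abs(a[y][x] - b[y][x]);
        return sum;
    }
}

It is not rotation- or scale-invariant, which is the trade-off against SIFT-style descriptors, but combined with FAST keypoints it is about the cheapest matching you can do.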
SIFT is not a panacea. For real time video applications, it's usually overkill.
As always, Wikipedia is a good place to start from : http://en.wikipedia.org/wiki/Scale-invariant_feature_transform, but note that SIFT is patented.
