Java: Text to Speech engines overview [closed] - java

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 5 years ago.
Improve this question
I'm now in search for a Java Text to Speech (TTS) framework. During my investigations I've found several JSAPI1.0-(partially)-compatible frameworks listed on JSAPI Implementations page, as well as a pair of Java TTS frameworks which do not appear to follow JSAPI spec (Mary, Say-It-Now). I've also noted that currently no reference implementation exists for JSAPI.
Brief tests I've done for FreeTTS (first one listed in JSAPI impls page) show that it is far from reading simple and obvious words (examples: ABC, blackboard). Other tests are currently in progress.
And here goes the question (6, actually):
Which of the Java-based TTS frameworks have you used?
Which ones, by your opinion, are capable of reading the largest wordbase?
What about their voice quality?
What about their performance?
Which non-Java frameworks with Java bindings are there on the scene?
Which of them would you recommend?
Thank you in advance for your comments and suggestions.

I've actually had pretty good luck with FreeTTS

Google Translate has a secret tts api:
https://translate.google.com/translate_tts?ie=utf-8&tl=en&q=Hello%20World

Actually, there is not a big choice:
Festival, most old. Written in C++ but has bindings to Java.
eSpeak, quick and simple, used by Google Translate
mbrola
Pure Java:
FreeTTS, which code was ported from Festival, and then was open-sourced and development was stopped.
MaryTTS - more powerful and looks production ready.
Also there is other proprietary programs like:
Acapella
Nuance Vocalizer
If your software is Windows only, you can use Microsoft Speech API.

I've used Mary before and I was very impressed with the quality of the voices. Unfortunately, I haven't used any of the other ones.

I've used AT&T Natural Voices which provides JSAPI and MS SAPI hooks. It provides excellent quality voices, a good "general" speech dictionary, many controls over pronunciation, and multiple languages. It's a little pricey, but works very well.
I used it to read important sensor telemetry to drivers in a mobile sensor application. We had no complaints about the voice quality. It had about 75% out-of-the-box accuracy with scientific terms and a much higher (maybe 90%+) with normal dialogue. We got it up to about 99+% accuracy by using markups (most errors were on scientific terms with unusual phoneme combinations).
It was a bit hard on the processor (we were running on a Pentium-III equivalent machine and it was pushing 50%-75% peak CPU). This uses a native speech engine (Windows, Linux, and Mac compatible) with a Java interface.
There's a huge variety of voices and languages...

I used FreeTTS but had a major problem getting the MBrola voices to run on My MacbookPro. I did get MBrola voices to run on Windows (painfully) and Linux. I've had no luck loading any other voice packages on FreeTTS which is a shame because the supplied voices are horrible IMO. Outside of that I had a little success with Cloudgarden as well but that only runs on Windows AFAIK. I'd be interested to hear others successes/failures with Voice engines as this type of work is particular challenging. I'm also toying a bit with Sphinx4. I just pulled down JVXML (which appears to be based on Sphinx4) last night but could not get it to run for some strange reason.

I've contributed to mary. I feel it has potential if someone smarter than me separated the HMM voices out of the core (those voices don't need large data sets and sound ok). I'm also trying to do a event system to freetts to send events when it says a word. I've had success, but it is broken in linux now. (probably because of a timer bug).

Thanks a lot everyone, the trick is in FreeTTS source. Briefly: if being run as java -jar freetts.jar some-more-args-here, it spells lesser words than when being executed in a manner of bin/Server.jar and bin/Client.jar.

I found little comfortable with MarryTTS It has multilanguage and clear voice to understand.
T convert speech to text, the better optiion is sphinx4-5prealpha.
I give one thumb, because it has adjustable, flexibility and modifiable recognizer and grammer.

Related

Java for embedded systems? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 2 years ago.
Improve this question
I have recently learned some basic Java and was thinking of seeing if I can use these new skills for an embedded computing project.
I have been looking around but I can't seem to be able to find any microcontrollers that are capable of running Java.
Does such a thing even exist?
Because of Java virtual machine architecture, you need considerable resources just to run the JVM. The path of least resistance to getting an JVM is probably to run an OS on the target that already supports it such as Linux, but that itself requires relatively huge resources.
There are a few stand-alone JVMs that either work bare-metal or integrate with and embedded RTOS for threading support. I compiled a list a while ago in an answer to a similar question, but some of the links are now out of date.
Running Java on an embedded system will certainly hit performance, and is probably not suited to hard real-time applications without a great deal of care.
Microcontrollers are not made for this use. Controllers called "mini computers" can embed JAVA applications (Raspberry PI, BeagleBone, Intel Edison, etc, because they embed an OS, and so can use JRE).
For microcontrollers, C/C++ are really better and more reliable.
Microcontrollers are meant for real low level - they normally don't have much functionality and won't have enough memory/processor speed to run JAVA.
Most entry level microcontrollers use C/C++ and maybe even their own variant of it.
Arduino/Atmega uses Haiku VM to run java. Using the haiku VM you can compile your JAVA code into C - and this will be programmed into Arduino. This makes debugging a little difficult, but it's not that bad - and hey, a high level language like JAVA cuts down your coding time a lot. Issue with this is normally your memory will get over soon, and you cant write huge pieces of code.
PIC - Muvium claimed support for PIC, but they stopped supporting it after a while and have closed down now. I don't think PIC has JAVA support for now.
Renesas is another popular microcontroller provider which has it's own SDK called MicroEJ for java o n RX and RZ boards of theirs. I've never used it, but their boards boast more RAM and flash memory - which helps a lot.
Single board computers (basically, a microcontroller/processor which is more powerful + has more peripherals) is useful when using JAVA for embedded programming. The two most popular ones are Beagle bone and Raspberry Pi. These are basically computers on a chip - and can run a full fledged ARM Ubuntu + Java/Python/any other language.
The easiest to use is Raspberry Pi (in my opinion) - which has huge community support.
Recently I started working on a CM12001/1000000 board that runs Java. It contains two controllers on the same board. Currently I do not have much knowledge of this thing. I will update the answer as soon as I get more info.
To answer your question: Yes, such a thing exists, but it is quite rare. Python however is gaining popularity in the embedded field recently using MicroPython, that includes a small subset of the Python 3 standard library and is optimized to run on micro-controllers.
Edit: ATOP Modules from Telit provide such functionality. Generally they have good amount of both RAM and Flash(A few MB to a few hundred MB). They run Linux over which they load JVM (as pointed by Clifford). Telit provides Java APIs to so control stuffs like GPIO (though very limited) and do stuffs like serial communications, GPS, GSM control etc.
Yes, microcontrollers that are capable of running Java on the bare metal exist
But JVM on this microcontrollers optimized for speed, and low memory usage. It's mean optimized JVM have some limitation instead of regular JVM, it's like Python and MicroPython
But pure code on Java allow you to transfer code with ease from the desktop to a microcontroller or embedded system
For self education with embedded computing project you can try to use Javaino
allow execute Java programs on this development board, read data from sensors via i2c, UART, etc like Arduino

Restful web application, Java or PHP? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 8 years ago.
Improve this question
I'm creating an Android application which will access data using a restful web application.
I have quite a lot of experience with Java, but minimal experience with PHP.
Having looked online, it is difficult to determine which language is more suitable, scalable, portable etc.
I'm hoping that the Application may one day have many concurrent users and therefore I need the most suitable option.
If anyone has any experience writing a web application in either language, I'd be really interested to hear about your experiences, and any problems you faced.. i.e. for a java web application do you need a tomcat server or another embeddded to server for it to be able to run?
Thanks, for any answers, Matt.
If you already have experience with java, I would suggest you use the following to build your REST services: http://restlet.com/
Its very easy, and efficient. The performance is very smooth. For PHP, you will have some learning curve, and also there is no standard. Mixing java with PHP is like combining a VERY STRICT LANGUAGE (java) with a VERY LENIENT LANGUAGE (PHP). So its safer to be on the same language.
Tutorials:
Official tutorials to get started: http://restlet.com/learn/tutorial/2.2/
Good step by step tutorial with screenshots and code snippets: http://java.dzone.com/articles/restlet-framework-hello-world
Short:
Take JAVA!
Always choose the language you are comfortable with. Also I think Java is better suited in the end.
PHP isn't my favorit. Most of the people like it, because it is easy to start with. (It was also for me the second (non Browser) language I touched.)
Framework Tips
WebFrontend: Play Framework
Back End&Scaleability: AKKA
JSON: Gson
Long:
Scaleability in the meaning to scale to lots of concurrent users:,
is more a architectonical issue, as a question for the right language. You can write scaleable software in any language. The difference isn't the scaleability of a language, but it could be the performance. One language will take longer for the same task as the other one. But you could always throw more Servers in, to scale out.
Architectures to consider if you want to scale out, are in my opinion message based designs. My favorite is the actor model, there is a very good framework for that in Java, the akka framework (production proved). But I think you first should get your software running. If you get enough users... scalability problems are the problems you like to have (they mean you have users).
Scaleable doesn't only mean, that you can scale to many concurrent users. But the ability, to handle the complexity of the software or can handle concurrent development and so on (your team will grow, thats also a problem to handle). In this topics Java is as clearly static typed OOP language, better suited.
Also the performance will not be as good as in Java (it is a interpreted language). But there are always options. Facebook started with PHP. In an interview one of the lead developers, told that PHP isn't that scaleable, because PHP wasn't designed for OOP. But the performance issue was handled, through writing a compiler for PHP (outputs C++). [If if find the link I will post it] .
Update the PHP Compiler is Called HipHop and it uses HHVM (Hiphop virtual machine), Facebook developed it after excessive CPU usage
You can consider looking at https://jersey.java.net/ As a web container you can use anything like Tomcat. I have used Google App Engine in the past.
To get started quickly with Java look into http://dropwizard.io/, using less EE frameworks and more standard Java.
Has Jersey for REST and is supereasy to run.

Java OCR library recommendations? [duplicate]

This question already has answers here:
Java OCR implementation [closed]
(5 answers)
Closed 9 years ago.
I need to check a tonne of pictures to see if they have a keyword on them. Can anyone recommend a good, reliable OCR library? I'll happily sacrifice speed for accuracy.
There is no pure Java OCR libraries that have something to do with accuracy. Depending on your budget you may choose something that is not purely Java, but can be called from Java:
If you have plenty of time but zero budget - your choice is Tesseract. It is definetely the best among open source
If you have small budget to spend and you only need run this recognition once - Cloud OCR API service would be your best choice. It is based on leading commertial grade OCR engine and offers quite affordable per-project prices. Disclaimer: I work for ABBYY
In case you will need to run this recognition as ongoing process forever, then you may think that it is economically more efficient to purchase dedicated conversion software, for example this one, it has API and can be called from Java too. But there are actually lot of alternatives, if you are prepared to invest some budget in licensing.
If you have plans for recognize not Latin or digit symbols then better way find non java library, but select from some (external) tools and use other ways(1) for get your text.
On Linux I have used cuneiform(2) via command line interface.
command line interface and pipe, for example.
cuneiform have ported on Linux but I don't know about work command line interface for Windows

Starting out NLP - Python + large data set [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 3 years ago.
Improve this question
I've been wanting to learn python and do some NLP, so have finally gotten round to starting. Downloaded the english wikipedia mirror for a nice chunky dataset to start on, and have been playing around a bit, at this stage just getting some of it into a sqlite db (havent worked with dbs in the past unfort).
But I'm guessing sqlite is not the way to go for a full blown nlp project(/experiment :) - what would be the sort of things I should look at ? HBase (.. and hadoop) seem interesting, i guess i could run then im java, prototype in python and maybe migrate the really slow bits to java... alternatively just run Mysql.. but the dataset is 12gb, i wonder if that will be a problem? Also looked at lucene, but not sure how (other than breaking the wiki articles into chunks) i'd get that to work..
What comes to mind for a really flexible NLP platform (i dont really know at this stage WHAT i want to do.. just want to learn large scale lang analysis tbh) ?
Many thanks.
NLTK is where you should start from (it's Python-based -- not sure why you're already thinking about parallelizing your processing at such an early stage... start with a more flexible experimental setup, is my advice). sqlite should be fine for a few GB -- if you need more advanced and standard SQL power you could consider postgresql.
There is a related talk on PyCon 2010 "The Python and the Elephant: Large Scale Natural Language Processing with NLTK and Dumbo".
The link has introductory information, slides and video.
I think sqlite is still a good choice for 12G size data. I have a text classification training set which has the similar size, both sqlite and plain text is fine as long as just iterator it line by line.
It is most likely that you are going to use Vector Space Model to represent the text while doing the anlaysis.
In which case, you should look at platforms that can help you store term vectors with term frequencies. It makes your life so much easier.
Have a look at Apache Lucene which has a python library to access Java Lucene. Elasticsearch is also a good alternative, which uses Apache Lucene underneath and has a really good python package. Elasticsearch also exposes a REST API.
Postgresql is also really good at storing tokens. Check out this article to learn more.
I have worked with sizable language data before and I personally prefer Lucene/Elasticsearch for analysis projects.
Cheers.
Summary from the internet:
Spacy is a natural language processing (NLP) library for Python designed to have fast performance, and with word embedding models built in, it’s perfect for a quick and easy start.
Gensim is a topic modelling library for Python that provides access to Word2Vec and other word embedding algorithms for training, and it also allows pre-trained word embeddings that you can download from the internet to be loaded.
NLTK details already given above.
Standford NLP has recently launched 50+ langauge supported python framework. You should check it out for sure.
There are many others but the above 4 are most usable in the sense of community support and latest features
I personally prefer Spacy.
Spacy is one of fastest of all and can use gensim/other APIs integrated into its model.
Moreover, Spacy models has a lots of languages in its alpha stage making it a perfect choice for multilingual apps.
Scaling is whole different thing[you can use alot of tools].But lets stick to scaling in NLP: Spacy gives so much control over different pipelines that you can disable unwanted pipelines making it faster.
Look into it try yourself and explore.

Pros and Cons of JavaFX and Silverlight [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 9 years ago.
Improve this question
I know there is already a question about the performance of Flex, JavaFX, and Silverlight. My question is a bit more broad:
We are evaluating the merits of JavaFX and Silverlight to serve as the GUI technology that controls/configures our back-end service (currently written in Java). The service and GUI are usually on the same machine, but remote management (via browser) must also be supported. We are currently split into 2 teams: one .Net and one Java, although the Java developers also have some .Net experience.
As I see it, the pros & cons are currently the following:
Silverlight 4:
Pros:
Excellent IDE integration
Good developer-designer workflow
Performance
Extensive 3rd-party support (tools, controls, etc)
Lots of momentum and drive from Microsoft
Very good out-of-browser integration
Cons:
Only supported on Mac & Windows (Moonlight support is not up to standard)
JavaFX 1.2:
Pros:
Wide platform support
Cons:
Only supports Latin character sets (at this time)
Fewer designer tools
Little or no out-of-browser integration. Update: apparently there is out-of-browser support.
Performance (at least on the demos I viewed at www.javafx.com)
Maturity
Please let me know if I'm missing anything or mistaken about something, and what else I haven't considered. We also looked at Adobe AIR, but ruled it out because all our developers already have experience in Java and/or .Net.
Please don't start any flame wars here. This is not a religious question, and I really would like some practical advice and facts.
I have been playing around with JavaFX the last months and i would not recommend anyone to start using it unless the limitations (like lack of Linux support) are too harsh. The IDE support for JavaFX is ridiculous at the moment. You have no refactoring help, no autoformat and not even help with indentaion.
I like JavaFX and will continue to play around with it, but for 2 equally good languages, the huge IDE different is hard to overcome.
Silverlight has got Expression Blend as well, for (kind of) WYSIWYG.
I think a solution in JavaFX would be better, but creating it will probably be alot more difficult.
Several thinks about JavaFX.
Only supports Latin character sets (at this time) (false) JavaFx uses standard Java string representation and also rendering is fully capable to handle non Latin characters.
Fewer designer tools (true) but take a look at newest NetBeans (more # link text)
Little or no out-of-browser integration (false) JavaFX runs in web/desktop, mobile and new platforms are planned.
Performance is improving with each release.
Maturity has same as Silverlight, but with better market share based on installed JVMs.
Your evaluation of JavaFX is kind of wrong.
I've been developing some materials in JavaFX recently.
The performance of JavaFX has improved markedly over the last 6 months (between 1.0 and 1.2), and is supposed to improve yet again with the 1.3 release.
"Out of browser integration" is essentially JNLP (ie, Web-start). It's perfectly reasonable from what I can tell. For instance, WidgetFX have written a Vista/7-like desktop sidebar entirely in JavaFX http://widgetfx.org/
There is supposedly "momentum and drive" from Oracle -- Larry Ellison has been publicly enthusing about it -- but that of course is held up by the EU's investigation of the Oracle-Sun merger.
Note that JavaFX does not use Java syntax. It is, however, a very concise and quick language to write a GUI in, but does have a (relatively short) learning curve of its own. It can however include any Swing components (and there are quite a few libraries of them out there), and can use Java classes.
I wanted to expand a bit on your point about the IDE and dev/designer workflow - I've been working with Silverlight for a year and half now, and I have to say the key to my success has been the tooling. On the dev side the ability to step through code in the debugger from client side to server side across a web service call is very helpful. We've hired designers with experience in the Adobe toolset and seen them become immediately productive in Blend (animating UIs, transitioning screens, hiding/showing elements, etc). Couple that with the fact that both Visual Studio and Blend can share the same source control system and you've got a great ecosystem for rapidly pulling together good looking web apps.
One other pro for Silverlight is the language independence. If you choose C# you also get LINQ, lambda expression and (soon) parallel foreach loops.

Categories