Calling java code from python - java

I am trying to optimize the performance of some natural language processing in a python project I am currently working on. Basically I would like to outsource the computationally intensive parts to use apache OpenNLP, which is written in Java.
My question is what would be the recommended way to link Java functions/classes back to my python code? The three main ways I have thought about are
using C/C++ bindings in python and then embedding a JVM in my C program. This is what I am leaning towards because I am somewhat familiar writing C extensions to python, but using a triangle of languages where C only functions as an intermediary doesn't seem right somehow.
using Jython. My main concern with this is that CPython is the overwhelmingly popular python implementation as far as I know and I don't want to break compatibility with other collaborators or packages.
streaming input and output to the binaries that come with OpenNLP. Apache provides tokenizers and such as stand-alone binaries that you can pipe data to and from. This would probably be the easiest option to implement, but it also seems like the most crude.
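A minimal sketch of the third option, assuming the tool is a long-running child process that reads one line from stdin and writes one line to stdout. The actual OpenNLP command line is not given here, so a tiny Python child process stands in for it; `STAND_IN_CMD` and `LineTool` are hypothetical names.

```python
import subprocess
import sys

# Stand-in for the real tool: echoes each input line back with its
# whitespace-separated tokens joined by single spaces.
# Swap this list for the actual OpenNLP command line (not shown above).
STAND_IN_CMD = [
    sys.executable, "-u", "-c",
    "import sys\n"
    "for line in iter(sys.stdin.readline, ''):\n"
    "    print(' '.join(line.split()), flush=True)\n",
]


class LineTool:
    """Keeps one child process alive and exchanges one line per call."""

    def __init__(self, cmd):
        self.proc = subprocess.Popen(
            cmd,
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
            bufsize=1,  # line-buffered pipes on the parent side
        )

    def process(self, line):
        # Send one request line, then block until one response line arrives.
        self.proc.stdin.write(line.rstrip("\n") + "\n")
        self.proc.stdin.flush()
        return self.proc.stdout.readline().rstrip("\n")

    def close(self):
        # Closing stdin signals end of input, so the child can exit.
        self.proc.stdin.close()
        self.proc.wait()
        self.proc.stdout.close()


tool = LineTool(STAND_IN_CMD)
tokens = tool.process("Pierre  Vinken is   61 years old").split(" ")
tool.close()
```

Keeping a single child alive for many calls, rather than launching one per input, means any process startup cost is paid only once.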
I'm wondering if anyone who has experience interfacing python and java knows how much the performance is likely to differ between these options, and which one is "recommended" or considered best practice in such a situation - or of course if there is an entirely different way to do it that I haven't thought of.
I did search SO for existing answers and found this, but it's an answer from 3.5 years ago and mentions some projects that are either dead, hard to integrate/configure/install or still under development.
Some comments mentioned that the overhead for all three methods is likely to be insignificant compared to the time required to run the actual NLP code. This is probably true, but I'm still interested in what the answer is from a more general perspective.
Thanks!

Consider building a Java server with an existing language-independent RPC mechanism (Thrift, ...) and using Python as the RPC client that talks to the server. This keeps the two sides loosely coupled.
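A minimal sketch of the client side of this design. Thrift needs generated stubs, so the sketch uses XML-RPC from the Python standard library purely as a stand-in for whichever language-independent RPC mechanism is chosen. The `tokenize` method and the in-process Python server that plays the role of the Java server are assumptions for illustration.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


def tokenize(text):
    # Placeholder for the work the Java/OpenNLP server would do.
    return text.split()


# Stand-in server; in the real design this would be a separate Java process.
server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
server.register_function(tokenize, "tokenize")
port = server.server_address[1]  # port 0 above lets the OS pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()

# The Python side only needs the host, the port and the method names.
client = ServerProxy(f"http://127.0.0.1:{port}")
tokens = client.tokenize("Calling Java code from Python")

server.shutdown()
server.server_close()
```

The client code would look the same if the server were replaced by a Java process speaking the same protocol, which is the loose coupling the answer refers to.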

Related

What are the possible approaches to Common-Lisp / Java Interoperability?

So... in an attempt to use preexisting wheels, rather than reinvent my own at every turn, I've been trying to get a decent Common Lisp environment working with [a particular Java library]. My ABCL adventures actually went reasonably well and I was able, eventually, to get ABCL talking nicely to [it]. Of course I wanted more than just that: I wanted interoperability between [it] and my half-round wheel, chemicl, a cheminformatics package I started writing in Common Lisp. This is where the train began to fall off the tracks.
ABCL and cxml-stp
A while back, in an earlier, aborted attempt to get some of my chem/bioinformatics (https://github.com/slyrus/cl-bio) stuff working with ABCL I noticed that plexippus-xpath couldn't be loaded into ABCL. This was fixed, so I was encouraged that things might work with ABCL. However, cxml-stp seems to break ABCL.
Hopefully this is a fixable bug and some future version of ABCL will work with cxml-stp.
In the meantime...
Other CL and Java
So, I figured I'd try some other approaches to getting Java and a Common Lisp implementation to play nice. I know, you're thinking "why doesn't the dude just use clojure? After all, that's what clojure was designed for!" Well, that's a good question. I did use clojure for some earlier explorations with [this Java library] and, while the java integration generally works well, I have a bunch of existing Common Lisp code I'd like to use and, at the time at least, it seemed like all of the clojure wrappers were thin wrappers around ugly Java libraries. I've grown to know and love many Common Lisp libraries, many of which are nicely available in QuickLisp, and I'd like to be able to use those (things like cxml-stp, plexippus-xpath, opticl, etc...).
Clozure Common-Lisp (CCL), for five years now, has shipped with a fully ported distribution of JFLI (JFLI previously depended on the LispWorks FFI) as a standard component of the "examples" provided with the CCL source distribution. JFLI (by Rich Hickey, creator of Clojure) uses an in-process model and will likely be at least an order of magnitude more performant than anything you might put together from the model employed by Hickey's next attempt, a more widely compatible socket-based solution he named FOIL.
Have a look at the following URL to browse the JFLI source code as it currently exists in the Clozure development trunk:
https://github.com/Clozure/ccl/tree/master/examples/jfli
Rich Hickey introduced JFLI with the following summary of the approach he had taken
(Substitute CCL's FFI where he references LW-FFI obviously):
My objective was to provide comprehensive, safe, dynamic and Lisp-y access to
Java and Java libraries as if they were Lisp libraries, for use in Lisp programs,
i.e. with an emphasis on working in Lisp rather than in Java.
The approach I took was to embed a JVM instance in the Lisp process using JNI. I
was able to do this using LispWorks' own FLI and no C (or Java! *) code, which
is a tribute to the LW FLI. On top of the JNI layer (essentially a wrapper
around the entire JNI API), I built this user-level API using Java Reflection.

Python egg that makes Java calls to a running Java application

I am making a proposal for a Python adapter to the Oracle NoSQL Database. The Oracle NoSQL Database runs as a stand alone java application, and at least in a Java program, you interface with it by telling your program the hostname and port to connect to, and some configuration settings. Then you make java calls off of the "kvstore" object that contains that configuration.
I'd like to make a Python library that essentially exposes Python versions of the java methods Oracle NoSQL has, and converts those to Java to speak with the running Oracle NoSQL application, but I'm not sure what technologies would be best to be able to do that.
Does anyone know what technology I would want to use? I'd rather not use Jython (so the application could run in a standard Python environment) or JNI (as it seems to have some nasty caveats.)
EDIT: The only potential technology I've found so far is Jpype.
Would it work for me?
Also, here are the ideal requirements the library would have. I would consider using Jython or JNI if one of them really did best match these requirements.
Performance. The main benefits of Oracle NoSQL are performance and scalability, so that would be the most important component for the adapter.
Easy to implement for the Python users. In order for the library to actually be used by Python programmers, it would have to be relatively easy for them to use in a natural sort of way.
Reliability. It would need to be possible for it to be trustworthy and bug free, while working on the platforms you naturally expect Python to work on. (This is what made me concerned about JNI. It sounds like it is platform dependent for its implementation, and can be error-prone.)
Development speed. The last point of importance is that it be relatively fast to develop. The team of developers would enjoy learning Python or C, but we know Java better than any other programming language right now.
I've tried to answer each point in order, with its own related notes.
Performance: My opinion is that JNI would be the winner here, but I could be wrong, because Jython could be JITed as well.
Easy to Implement: I am taking this as you mean easy to consume the library? That would depend entirely on how you build the API, 1 to 1 method calls, object handles, etc.
Reliability: Jython is the clear winner here, because there is no room for error when you merely instantiate POJOs/ directly access the API right in the client code.
Development Speed: JNI can be very tedious. You are essentially learning CPython modules, C, Python extensions, JNI and then referencing an existing API built in Java.
All in all, if you can make the jump, I think you'd get more benefit in the short term embedding Jython, mostly because there you can directly manipulate the API. I have personally embedded IronPython in a .NET codebase with good success. Yes you lose native speed, but the tradeoffs are hard to justify with the amount of C coding needed for a working JNI bridge. That said, you may well find projects like the one you listed (Jpype) that can do much of the legwork for you.
I would be asking what features of a native CPython runtime I need that you lose when going to Jython. Is your existing codebase in CPython heavily reliant on CPython native features?
Anyway, there's my attempt at an answer.
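To make the "Easy to Implement" point concrete, here is a minimal sketch of one possible API shape: a thin, dict-like Python facade whose methods map one-to-one onto calls to some bridge object. Everything in it is hypothetical: `KVStoreFacade`, the `put`/`get`/`delete` method names, and the in-memory `FakeBridge` that stands in for Jpype, Jython or any other bridge. None of it is Oracle NoSQL's actual API.

```python
class FakeBridge:
    """In-memory stand-in for whatever actually talks to the Java side."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)

    def delete(self, key):
        # Report whether the key existed, then remove it if present.
        existed = key in self._data
        self._data.pop(key, None)
        return existed


class KVStoreFacade:
    """Dict-like wrapper: Python users never touch the bridge directly."""

    def __init__(self, bridge):
        self._bridge = bridge

    def __setitem__(self, key, value):
        self._bridge.put(key, value)

    def __getitem__(self, key):
        value = self._bridge.get(key)
        if value is None:
            raise KeyError(key)
        return value

    def __delitem__(self, key):
        if not self._bridge.delete(key):
            raise KeyError(key)


store = KVStoreFacade(FakeBridge())
store["user/42"] = b"alice"
name = store["user/42"]
del store["user/42"]
```

Because the facade depends only on the bridge's three methods, the bridging technology could in principle be swapped later without changing the code Python users write.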

ML/Data Mining/Big Data : Popular language for programming and community support

I am not sure if this question is correct, but I am asking to resolve the doubts I have.
For Machine Learning/Data Mining, we need to learn about data, which means you need to learn Hadoop, which has implementation in Java for MapReduce(correct me if I am wrong).
Hadoop also provides streaming api to support other languages(like python)
Most grad students/researchers I know solve ML problems in python
we see job posts for hadoop and Java combination very often
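As a rough illustration of what the streaming API mentioned above makes possible: Hadoop Streaming runs executables that read lines from stdin and write tab-separated key/value lines to stdout. The sketch below is a word-count mapper and reducer written as plain functions over iterables of lines so it can be tried locally. The function names and the local `sorted()` call, which stands in for Hadoop's shuffle-and-sort step, are assumptions for illustration.

```python
from itertools import groupby


def mapper(lines):
    # Emit one "word<TAB>1" record per word.
    for line in lines:
        for word in line.split():
            yield f"{word.lower()}\t1"


def reducer(sorted_records):
    # Records arrive sorted by key, so equal words are consecutive.
    pairs = (record.split("\t") for record in sorted_records)
    for word, group in groupby(pairs, key=lambda pair: pair[0]):
        yield f"{word}\t{sum(int(count) for _, count in group)}"


# Local dry run; sorted() plays the role of the shuffle between the phases.
text = ["Python and Java", "java and hadoop"]
counts = list(reducer(sorted(mapper(text))))
```

In an actual job, each function would live in its own script that reads `sys.stdin` and prints its records, and the two scripts would be handed to the streaming jar as the mapper and the reducer.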
I observed that Java and Python(in my observation) are most widely used languages for this domain.
My question is what is most popular language for working on this domain.
what factors involve in deciding which language/framework one should choose
I know both Java and python but confused always :
whether I start programming in Java(because of hadoop implementation)
whether I start programming in Python(because its easier and quicker to write)
This is a very open ended question, I am sure the advices might help me and people who have same doubt.
Thanks a lot in advance
Unfortunately, it seems to me that the reigning language is MATLAB... I say unfortunately because I neither like nor use this language, I'm much more likely to program in C++/Java. But Data Miners and Machine Learning persons around me tend to stick to MATLAB...
Edit : I've just read a really interesting line in Wikipedia's page on R :
According to Rexer's Annual Data Miner Survey in 2010, R has become
the data mining tool used by more data miners (43%) than any other.
I'm not experienced in Java and Hadoop but I used both Python and MATLAB for machine learning stuff and I use MATLAB more often now. Actually, the important factors for my case are as follows:
Almost all of my colleagues use MATLAB and C++, and very few of them use Python. Their Python usage is limited to general scripting, not machine learning in particular. So, when I use Python, the only way to get help is the web, and we have trouble sharing code within the lab.
MATLAB's IDE and its extensive documentation make it powerful for my case.
You can handle large data sets in MATLAB.
There are many machine learning/data mining libraries written in MATLAB, and most of the libraries written in C++/Java have MATLAB wrappers.
Some points are also true for Python. But as I mentioned, the community I work in plays an important role in deciding the language.
R is an excellent candidate for data mining (certainly) and machine learning as well.
(Generalizations, of course.)
Java and Hadoop are really meaningful in the context of seriously big data and/or scaling requirements. Java gives you the libraries and an army of programmers. Hadoop gives you fairly painless distribution and a growing knowledge base of mapping various algorithms to the framework.
Python seems to have the academics on its side, especially recent graduates who are now active and influential in professional practice. Also, if you just want to try out stuff, an expressive dynamic language like Python will obviously prove to be quite useful.
Then there is R. (There is a lot more, but this is the extent of my knowledge /g/)
I think besides the obvious focus on data that R brings to the table (and thus a community of data geeks to help out with the science part as well), it is a delightfully lightweight system and not too shabby at all in terms of libraries as well.
That said, one would think the (~) functional languages (Scala, Clojure on JVM; Haskell, etc.) would be quite a good fit for manipulating data and working on huge datasets.
I think the most popular combination in this field is Java/Hadoop. When vacancies also require Python/Perl/Ruby, it usually means that the company is migrating from those scripting languages (usually its main languages until then) to Java as it moves from a startup code base to an enterprise one.
Also, in real-world data mining applications, Python is frequently used for prototyping and small data processing tasks.
Python is gaining in popularity, has a lot of libraries, and is very useful for prototyping. However, I find it difficult to deploy because of the many versions of Python and its dependencies on C libraries.
R is also very popular, has a lot of libraries, and was designed for data science. However, the underlying language design tends to make things overcomplicated.
Personally, I prefer Clojure because it has great data manipulation support and can interop with Java ecosystem. The downside of it currently is that there aren't too many data science libraries yet!

Choosing Java vs Python on Google App Engine

Currently Google App Engine supports both Python & Java. Java support is less mature. However, Java seems to have a longer list of libraries and especially support for Java bytecode regardless of the languages used to write that code. Which language will give better performance and more power? Please advise. Thank you!
Edit:
http://groups.google.com/group/google-appengine-java/web/will-it-play-in-app-engine?pli=1
Edit:
By "power" I mean better expandability and inclusion of available libraries outside the framework. Python allows only pure Python libraries, though.
I'm biased (being a Python expert but pretty rusty in Java) but I think the Python runtime of GAE is currently more advanced and better developed than the Java runtime -- the former has had one extra year to develop and mature, after all.
How things will proceed going forward is of course hard to predict -- demand is probably stronger on the Java side (especially since it's not just about Java, but other languages perched on top of the JVM too, so it's THE way to run e.g. PHP or Ruby code on App Engine); the Python App Engine team however does have the advantage of having on board Guido van Rossum, the inventor of Python and an amazingly strong engineer.
In terms of flexibility, the Java engine, as already mentioned, does offer the possibility of running JVM bytecode made by different languages, not just Java -- if you're in a multi-language shop that's a pretty large positive. Vice versa, if you loathe Javascript but must execute some code in the user's browser, Java's GWT (generating the Javascript for you from your Java-level coding) is far richer and more advanced than Python-side alternatives (in practice, if you choose Python, you'll be writing some JS yourself for this purpose, while if you choose Java GWT is a usable alternative if you loathe writing JS).
In terms of libraries it's pretty much a wash -- the JVM is restricted enough (no threads, no custom class loaders, no JNI, no relational DB) to hamper the simple reuse of existing Java libraries as much, or more, than existing Python libraries are similarly hampered by the similar restrictions on the Python runtime.
In terms of performance, I think it's a wash, though you should benchmark on tasks of your own -- don't rely on the performance of highly optimized JIT-based JVM implementations while discounting their large startup times and memory footprints, because the app engine environment is very different (startup costs will be paid often, as instances of your app are started, stopped, moved to different hosts, etc, all transparently to you -- such events are typically much cheaper with Python runtime environments than with JVMs).
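A minimal sketch of the kind of benchmark suggested here, separating the first (cold) call, which pays one-time startup costs, from later (warm) calls. `handle_request` and its artificial 50 ms initialization delay are hypothetical placeholders for your own task; nothing in the sketch is App Engine specific.

```python
import time

_cache = {}


def handle_request(n):
    # One-time setup, paid only by the first request in a fresh instance.
    if "table" not in _cache:
        time.sleep(0.05)  # artificial stand-in for loading code and data
        _cache["table"] = [i * i for i in range(1000)]
    return _cache["table"][n]


def timed(func, *args):
    # Return the function's result together with its wall-clock duration.
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


_, cold_seconds = timed(handle_request, 12)
warm_runs = [timed(handle_request, 12)[1] for _ in range(100)]
warm_seconds = sum(warm_runs) / len(warm_runs)
```

Reporting the cold and warm numbers separately matters in an environment where instances are started often, because an average over a long-running process would hide the startup cost.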
The XPath/XSLT situation (to be euphemistic...) is not exactly perfect on either side, sigh, though I think it may be a tad less bad in the JVM (where, apparently, substantial subsets of Saxon can be made to run, with some care). I think it's worth opening issues on the Appengine Issues page with XPath and XSLT in their titles -- right now there are only issues asking for specific libraries, and that's myopic: I don't really care HOW a good XPath/XSLT is implemented, for Python and/or for Java, as long as I get to use it. (Specific libraries may ease migration of existing code, but that's less important than being able to perform such tasks as "rapidly apply XSLT transformation" in SOME way!-). I know I'd star such an issue if well phrased (especially in a language-independent way).
Last but not least: remember that you can have different versions of your app (using the same datastore), some of which are implemented with the Python runtime, some with the Java runtime, and you can access versions that differ from the "default/active" one with explicit URLs. So you could have both Python and Java code (in different versions of your app) use and modify the same data store, granting you even more flexibility (though only one will have the "nice" URL such as foobar.appspot.com -- which is probably important only for access by interactive users on browsers, I imagine;-).
Watch this app for changes in Python and Java performance:
http://gaejava.appspot.com/
(edit: apologies, link is broken now. But following para still applied when I saw it running last)
Currently, Python and using the low-level API in Java are faster than JDO on Java, for this simple test. At least if the underlying engine changes, that app should reflect performance changes.
Based on experience with running these VMs on other platforms, I'd say that you'll probably get more raw performance out of Java than Python. Don't underestimate Python's selling points, however: The Python language is much more productive in terms of lines of code - the general agreement is that Python requires a third of the code of an equivalent Java program, while remaining as or more readable. This benefit is multiplied by the ability to run code immediately without an explicit compile step.
With regards to available libraries, you'll find that much of the extensive Python runtime library works out of the box (as does Java's). The popular Django Web framework (http://www.djangoproject.com/) is also supported on AppEngine.
With regards to 'power', it's difficult to know what you mean, but Python is used in many different domains, especially the Web: YouTube is written in Python, as is Sourceforge (as of last week).
June 2013: This video is a very good answer by a google engineer:
http://www.youtube.com/watch?v=tLriM2krw2E
TLDR; is:
Pick the language that you and your team are most productive with
If you want to build something for production: Java or Python (not Go)
If you have a big team and a complex code base: Java (because of static code analysis and refactoring)
Small teams that iterate quickly: Python (although Java is also okay)
An important question to consider in deciding between Python and Java is how you will use the datastore in each language (and most other angles to the original question have already been covered quite well in this topic).
For Java, the standard method is to use JDO or JPA. These are great for portability but are not very well suited to the datastore.
A low-level API is available but this is too low level for day-to-day use - it is more suitable for building 3rd party libraries.
For Python there is an API designed specifically to provide applications with easy but powerful access to the datastore. It is great except that it is not portable so it locks you into GAE.
Fortunately, there are solutions being developed for the weaknesses listed for both languages.
For Java, the low-level API is being used to develop persistence libraries that are much better suited to the datastore than JDO/JPA (IMO). Examples include the Siena project, and Objectify.
I've recently started using Objectify and am finding it to be very easy to use and well suited to the datastore, and its growing popularity has translated into good support. For example, Objectify is officially supported by Google's new Cloud Endpoints service. On the other hand, Objectify only works with the datastore, while Siena is 'inspired' by the datastore but is designed to work with a variety of both SQL databases and NoSQL datastores.
For Python, there are efforts being made to allow the use of the Python GAE datastore API off of the GAE. One example is the SQLite backend that Google released for use with the SDK, but I doubt they intend this to grow into something production ready. The TyphoonAE project probably has more potential, but I don't think it is production ready yet either (correct me if I am wrong).
If anyone has experience with any of these alternatives or knows of others, please add them in a comment. Personally, I really like the GAE datastore - I find it to be a considerable improvement over the AWS SimpleDB - so I wish for the success of these efforts to alleviate some of the issues in using it.
I'm strongly recommending Java for GAE and here's why:
Performance: Java is potentially faster than Python.
Python development is hampered by a lack of third-party libraries. For example, there is no XSLT for Python/GAE at all. Almost all Python libraries are C bindings (and those are unsupported by GAE).
Memcache API: the Java SDK has more interesting abilities than the Python SDK.
Datastore API: JDO is very slow, but native Java datastore API is very fast and easy.
I'm using Java/GAE in development right now.
As you've identified, using a JVM doesn't restrict you to using the Java language. A list of JVM languages and links can be found here. However, the Google App Engine does restrict the set of classes you can use from the normal Java SE set, and you will want to investigate if any of these implementations can be used on the app engine.
EDIT: I see you've found such a list
I can't comment on the performance of Python. However, the JVM is a very powerful platform performance-wise, given its ability to dynamically compile and optimise code during the run time.
Ultimately performance will depend on what your application does, and how you code it. In the absence of further info, I think it's not possible to give any more pointers in this area.
I've been amazed at how clean, straightforward, and problem free the Python/Django SDK is. However I started running into situations where I needed to start doing more JavaScript and thought I might want to take advantage of the GWT and other Java utilities. I've gotten just half way through the GAE Java tutorial, and have had one problem after another: Eclipse configuration issues, JRE versionitis, the mind-numbing complexity of Java, and a confusing and possibly broken tutorial. Checking out this site and others linked from here clinched it for me. I'm going back to Python, and I'll look into Pyjamas to help with my JavaScript challenges.
I'm a little late to the conversation, but here are my two cents. I really had a hard time choosing between Python and Java, since I am well versed in both languages. As we all know, there are advantages and disadvantages for both, and you have to take into account your requirements and the frameworks that work best for your project.
As I usually do with this type of dilemma, I look for numbers to support my decision. I decided to go with Python for many reasons, but in my case, there was one plot that was the tipping point. If you search "Google App Engine" in GitHub as of September 2014, you will find the following figure:
There could be many biases in these numbers, but overall, there are three times more GAE Python repositories than GAE Java repositories. Not only that, but if you list the projects by the "number of stars" you will see that a majority of the Python projects appear at the top (you have to take into account that Python has been around longer). To me, this makes a strong case for Python because I take into account community adoption & support, documentation, and availability of open-source projects.
It's a good question, and I think many of the responses have given good view points of pros and cons on both sides of the fence. I've tried both Python and JVM-based AppEngine (in my case I was using Gaelyk which is a Groovy application framework built for AppEngine). When it comes to performance on the platform, one thing I hadn't considered until it was staring me in the face is the implication of "Loading Requests" that occur on the Java side of the fence. When using Groovy these loading requests are a killer.
I put a post together on the topic (http://distractable.net/coding/google-appengine-java-vs-python-performance-comparison/) and I'm hoping to find a way of working around the problem, but if not I think I'll be going back to a Python + Django combination until cold starting java requests has less of an impact.
Based on how much I hear Java people complain about AppEngine compared to Python users, I would say Python is much less stressful to use.
There's also project Unladen Swallow, which is apparently Google-funded if not Google-owned. They're trying to implement a LLVM-based backend for Python 2.6.1 bytecode, so they can use a JIT and various nice native code/GC/multi-core optimisations. (Nice quote: "We aspire to do no original work, instead using as much of the last 30 years of research as possible.") They're looking for a 5x speed-up to CPython.
Of course this doesn't answer your immediate question, but points towards a "closing of the gap" (if any) in the future (hopefully).
The beauty of Python nowadays is how well it communicates with other languages. For instance, you can have both Python and Java at the same table with Jython. Of course, although Jython fully supports Java libraries, it does not fully support Python libraries. But it's an ideal solution if you want to work with Java libraries. It even allows you to mix Python with Java code with no extra coding.
But Python itself has also made some steps forward. See ctypes, for example: near-C speed and direct access to C libraries, all without leaving the comfort of Python coding. Cython goes one step further, allowing you to mix C code with Python code with ease; or, if you don't want to mess with C or C++, you can still code in Python but use statically typed variables, making your Python programs as fast as C apps. Cython is both used and supported by Google, by the way.
Yesterday I even found tools for Python to inline C or even assembly (see CorePy); you can't get any more powerful than that.
Python is surely a very mature language, not only standing on its own but able to cooperate with any other language with ease. I think that is what makes Python an ideal solution even in very advanced and demanding scenarios.
With Python you can have access to C/C++, Java, .NET and many other libraries with almost zero additional coding, in a language that also minimises, simplifies and beautifies coding. It's a very tempting language.
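For a concrete feel of the ctypes point made in this answer, here is a minimal sketch that calls the C library's `strlen` directly from Python. It assumes a POSIX system (Linux or macOS), where `ctypes.CDLL(None)` exposes the symbols already loaded into the process, including the C library; on Windows the library would have to be loaded differently.

```python
import ctypes

# On POSIX, passing None gives a handle to the running process itself,
# through which the C library's functions can be looked up.
libc = ctypes.CDLL(None)

# Declare the C signature: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

length = libc.strlen(b"hello from C")
```

Declaring `argtypes` and `restype` is optional for simple cases, but it lets ctypes check the arguments and convert the return value correctly.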
I've gone with Python even though GWT seems a perfect match for the kind of app I'm developing. JPA is pretty messed up on GAE (e.g. no @Embeddable and other obscure undocumented limitations). Having spent a week on it, I can tell that Java just doesn't feel right on GAE at the moment.
One thing to take into account is the frameworks you intend to use. Not all frameworks on the Java side are well suited for applications running on App Engine, which is somewhat different from traditional Java app servers.
One thing to consider is the application startup time. With traditional Java web apps you don't really need to think about this. The application starts and then it just runs. Doesn't really matter if the startup takes 5 seconds or couple of minutes. With App Engine you might end up in a situation where the application is only started when a request comes in. This means the user is waiting while your application boots up. New GAE features like reserved instances help here, but check first.
Another thing is the different limitations GAE poses on Java. Not all frameworks are happy with the limitations on what classes you can use or the fact that threads are not allowed or that you can't access the local filesystem. These issues are probably easy to find out by just googling about GAE compatibility.
I've also seen some people complaining about issues with session size on modern UI frameworks (Wicket, namely). In general these frameworks tend to do certain trade-offs in order to make development fun, fast and easy. Sometimes this may lead to conflicts with the App Engine limitations.
I initially started developing on GAE with Java, but then switched to Python for these reasons. My personal feeling is that Python is a better choice for App Engine development. I think Java is more "at home" on, for example, Amazon's Elastic Beanstalk.
BUT with App Engine things are changing very rapidly. GAE is changing itself and as it becomes more popular, the frameworks are also changing to work around its limitations.

Should I study Scala? [closed]

I am an experienced C++ programmer with average Python skills. The reasons I studied Python in the first place were:
to get a different perspective on programming (static vs dynamic, interpreted vs compiled, etc.)
to increase the breadth of projects that I can work on (Python allows me to do web development, develop for Symbian phones or knock up quick system administration scripts)
to complement my C++ skills.
I think that Python is great and I believe that I have achieved the above goals. I will continue to use it for small projects, scripts and web development.
I doubt that I can use it for medium to large projects though. While the dynamic typing is convenient, it allows a certain class of bugs that I find disturbing. Unit testing and linting can alleviate this problem, but static typing completely eliminates it.
After looking at some programming languages, I think that Scala looks like a good candidate:
I like the type inference and it runs on the JVM so it should be available wherever the JVM is available. I can also learn more about functional programming when using it.
But... I also have some doubts, and this is where I hope that the Stack Overflow community can help:
Portability: Linux and Windows at least I hope. What about mobile phones, is it possible to get it to run there?
C++ compatibility: can I mix C++ code with Scala? (JNI?)
Programming paradigm: I don't feel comfortable with switching to functional programming (FP) at this time. Can I use object oriented and procedural with some FP at first and then change the proportions as I learn?
Tool chain maturity: what's your experience with IDEs and debuggers? I'm using Eclipse right now and it seems OK.
Learning speed: considering my experience, how fast do you think that I can reach a workable level with Scala?
Deployment: how exactly do you deploy a Scala program? Is it a jar, is it an executable?
Finally, what do you think are some of Scala's disadvantages?
Portability: Linux and Windows at least I hope. What about mobile phones, did anyone succeed in getting it to run there?
Yes. There is quite some movement about Scala on Android. As for J2ME, I saw something in that respect, but not much. There is some code pertaining to J2ME on the source code repository. I'm not sure how viable it is, but it looks to me that there isn't much demand for that.
I'll also mention that there is/was a poll on Scala-Lang about the desired target platforms, and J2ME was one of them, very low on the totem pole.
C++ compatibility: can I mix C++ code with Scala? (JNI?)
As well as you can mix C++ with Java, for whatever that is worth. If you haven't any experience with that, you can just read the Java resources, as anything in them will be applicable to Scala with no changes (aside from Scala syntax).
Programming paradigm: I don't feel comfortable with switching to FP at this time. Can I use OO and procedural with some FP at first and then change the proportions as I learn?
Definitely, yes. Scala goes out of its way to make sure you don't need to program in a functional style. This is the main criticism of Scala from functional folks, as a matter of fact: some do not consider a language functional unless it forces the programmer to write in a functional style.
Anyway, you can go right on doing things your way. My bet, though, is that you'll pick up functional habits without even realizing they are functional.
Perhaps you can look at the Matrices series in my own blog about writing a Matrix class. Even though it looks like standard OO code, it is, in fact, very functional.
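To make the point about mixing styles concrete, here is a minimal sketch. It is not taken from the blog series mentioned above; `Vec2`, `MixedStyles`, `sumImperative` and `sumFunctional` are hypothetical names made up for illustration. It shows a class that looks like ordinary OO code but never mutates anything, plus the same computation written once with a mutable accumulator and once with a fold.

```scala
// Hypothetical example type, invented for this illustration.
final case class Vec2(x: Double, y: Double) {
  // Reads like a normal OO method, but returns a new value
  // instead of modifying this instance.
  def +(other: Vec2): Vec2 = Vec2(x + other.x, y + other.y)
}

object MixedStyles {
  // Imperative style: a mutable accumulator and an explicit loop.
  def sumImperative(vs: List[Vec2]): Vec2 = {
    var total = Vec2(0.0, 0.0)
    for (v <- vs) {
      total = total + v
    }
    total
  }

  // Functional style: the same result via a fold, with no mutable state.
  def sumFunctional(vs: List[Vec2]): Vec2 =
    vs.foldLeft(Vec2(0.0, 0.0))(_ + _)

  def main(args: Array[String]): Unit = {
    val vs = List(Vec2(1.0, 2.0), Vec2(3.0, 4.0))
    // Both styles give the same answer, so they can coexist in one code base.
    println(sumImperative(vs) == sumFunctional(vs))
  }
}
```

You could start with the first form everywhere and move individual methods to the second as it becomes familiar; callers do not need to change.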
Tool chain maturity: what's your experience with IDEs and debuggers? I'm using Eclipse right now and it seems ok.
IDEA (IntelliJ), NetBeans and Eclipse all have good support for Scala. It seems IDEA's is the best, and NetBeans and Eclipse keep leapfrogging each other, though NetBeans has certainly been more stable than Eclipse of late. On the other hand, the support on Eclipse is taking a very promising route that should produce results in the next 6 months or so -- it's just that it's a bumpy route. :-)
Some interesting signs of Scala tooling for these environments are that the Eclipse plugin in development uses AOP to merge more seamlessly with the whole IDE, that the NetBeans plugin is being completely rewritten in Scala, and that there's a Scala Power Pack on IDEA that supports, among other things, translating Java code into Scala code.
The EMACS folks have extensive tools for Scala as well, and lots of smaller editors have support for it too. I'm very comfortable with jEdit's support for small programs and scripts, for instance.
There is also good Maven support -- in fact, the standard way to install Lift is to install Maven, and then build a Lift archetype. That will pull in an appropriate Scala version. There's a scala:cc target that will do triggered recompilation as well.
Speaking of recompilation, neither Maven nor, particularly, Ant does a good job of identifying what needs to be recompiled. From that problem sprang SBT (Simple Build Tool), written in Scala, which solves that problem through the use of a Scala compiler plugin. SBT uses the same project layout as Maven, as well as Maven/Ivy repositories, but project configurations are done in Scala code instead of XML -- with support for Maven/Ivy configuration files as well.
Learning speed: considering my experience, how fast do you think that I can reach a workable level with Scala?
Very fast. As a purely OO language, Scala already introduces some nice features, comparable to some stuff that's present in C++ but not Java, though they work in a different fashion. In that respect, once you realize what such features are for and relate them to C++ stuff, you'll be well ahead of Java programmers, as you'll already know what to do with them.
Deployment: how exactly do you deploy a Scala program? Is it a jar, is it an executable?
The same thing as Java. You can deploy JARs, WARs, or any other Java target, because the Scala compiler generates class files. In fact, you use Java's jar tool to generate a Scala JAR file from the class files, and the Maven targets for Lift support building WAR files.
There is an alternative for script files, though. You can call "scala" to run Scala source code directly, similar to a Perl or shell script. It can also be done on Windows. However, even with the use of a compilation daemon to speed up execution, start-up times are slow enough that effective use of Scala in a heavy scripting environment needs something like Nailgun.
As for Scala's disadvantages, take a look at my answer (and others') in this Stack Overflow question.
Scala is an evolving language well worth investing in, especially if you are coming from the Java world. Scala is widely covered at Artima. See this article from Bill Venners and also read about Twitter and Scala.
Regarding your questions:
Java can run wherever there is a JVM. No luck with the mobile phones however. You need a full JRE, not the subset that is available there.
This is possible with JNI. If something is possible with Java, then it is possible with Scala. Scala can call Java classes.
Functional programming is a strong point of Scala - you do need to learn it. However, you could also start using Scala without taking full advantage of it and work your way into it.
There is a plug-in for Eclipse. It is not the best, but it will do the job. More details here.
If you are experienced, I would say really fast. I recommend that you find a book to start with.
See this FAQ entry for deployment.
Programming paradigm: I don't feel comfortable with switching to FP at this time. Can I use OO and procedural with some FP at first and then change the proportions as I learn?
Scala has full support for imperative programming; writing programs with no FP elements in them is a breeze (however, FP is useful and worth learning anyway).
Learning speed: considering my experience, how fast do you think that I can reach a workable level with Scala?
Quickly. There are a number of interesting features in Scala that may not be familiar to people coming from a C++ or Java environment, for example some of the features of the type system. Some argue that the fact that there is a lot to learn in Scala before you know all of it is a problem with the language; I disagree. The presence of those features is an advantage of the language. The more features the merrier. After all, you don't have to use them all at once, just like you don't have to buy everything that is being sold in the store.
Learning speed: considering my experience, how fast do you think that I can reach a workable level with Scala?
I also come from a C++ background. One thing I noticed is that since you will write a lot less code than in C++ for a comparable task, your learning will be expedited, as you will get more done in the same time period. This was the same phenomenon that I experienced with Ruby.
Actually - if I were you - I'd study programming paradigms instead of languages. Of course you have to study an example language to study the paradigm. Knowing the drawbacks & benefits of different paradigms enables you to view your problems from a different side and makes you a better programmer (even in the languages you already know).
Picking up a language in a paradigm you already know is a relatively easy task if needed. Since Scala is FP (at least you mentioned it) and C++/Python is OOP, it's a good language for you, I'd say.
You should register for this course by the creator of Scala himself.
https://www.coursera.org/course/progfun
James Strachan (productive Java open source developer, for those not in the loop) has an interesting discussion of Scala here, and why he feels it's a progression from Java (the language, not the platform).
Scala looks like it's gaining a lot of traction. I don't think it's a flash in the pan, and is currently on my list of languages to learn (partly for the functional aspect)
Here's some anecdotal evidence regarding the learnability of Scala.
In our company, we got several interns from U.Waterloo. They were told to write in Scala; they had never seen it before.
They picked up Scala and Lift remarkably fast; now they are producing Scala code; it may not be perfect, but nobody's perfect.
So, the fact that a manager does not know Scala may not be the best argument when you decide on adoption.
