Related
I am trying to optimize the performance of some natural language processing in a python project I am currently working on. Basically I would like to outsource the computationally intensive parts to use apache OpenNLP, which is written in Java.
My question is what would be the recommended way to link Java functions/classes back to my python code? The three main ways I have thought about are
using C/C++ bindings in python and then embedding a JVM in my C program. This is what I am leaning towards because I am somewhat familiar writing C extensions to python, but using a triangle of languages where C only functions as an intermediary doesn't seem right somehow.
using Jython. My main concern with this is that CPython is the overwhelmingly popular python implementation as far as I know and I don't want to break compatibility with other collaborators or packages.
streaming input and output to the binaries that come with OpenNLP. Apache provides tokenizers and such as stand-alone binaries that you can pipe data to and from. This would probably be the easiest option to implement, but it also seems like the most crude.
I'm wondering if anyone who has experience interfacing python and java knows how much the performance is likely to differ between these options, and which one is "recommended" or considered best practice in such a situation - or of course if there is an entirely different way to do it that I haven't thought of.
I did search SO for existing answers and found this, but it's an answer from 3.5 years ago and mentions some projects that are either dead, hard to integrate/configure/install or still under development.
Some comments mentioned that the overhead for all three methods is likely to be insignificant compared to the time required to run the actual NLP code. This is probably true, but I'm still interested in what the answer is from a more general perspective.
Thanks!
Consider building a java server with existing language independent RPC mecahnism(thirift, ....). And use python as the RPC client to talk with the server. It has loose coupling。
I am making a proposal for a Python adapter to the Oracle NoSQL Database. The Oracle NoSQL Database runs as a stand alone java application, and at least in a Java program, you interface with it by telling your program the hostname and port to connect to, and some configuration settings. Then you make java calls off of the "kvstore" object that contains that configuration.
I'd like to make a Python library that essentially exposes Python versions of the java methods Oracle NoSQL has, and converts those to Java to speak with the running Oracle NoSQL application, but I'm not sure what technologies would be best to be able to do that.
Does anyone know what technology I would want to use? I'd rather not use Jython (so the application could run in a standard Python environment) or JNI (as it seems to have some nasty caveats.)
EDIT: The only potentially technology I've found so far is: Jpype
Would it work for me?
Also, here are the ideal requirements the library would have. I would consider using Jython or JNI if one of them really did best match these requirements.
Performance. The main benefits of Oracle NoSQL are performance and scalability, so that would be the most important component for the adapter.
Easy to implement for the Python users. In order for the library to actually be used by Python programmers, it would have to relatively easy for them to use in a natural sort of way.
Reliability. It would need to be possible for it to be trustworthy and bug free, while working on the platforms you naturally expect Python to work on. (This is what made me concerned about JNI. It sounds like it is platform dependent for its implementation, and can be error-prone.)
Development speed. The last point of importance is that it be relatively fast to develop. The team of developers would enjoy learning Python or C, but we know Java better than any other programming langauge right now.
I've tried to answer each point in order, with it's own related notes.
Performance: My opinion is that JNI would be the winner here, but I could be wrong, because Jython could be JITed as well.
Easy to Implement: I am taking this as you mean easy to consume the library? That would depend entirely on how you build the API, 1 to 1 method calls, object handles, etc.
Reliability: Jython is the clear winner here, because there is no room for error when you merely instantiate POJOs/ directly access the API right in the client code.
Development Speed: JNI can be very tedious. You are essentially learning CPython modules, C, Python extensions, JNI and then referencing an existing API built in Java.
All in all, if you can make the jump, I think you'd get more benefit in the short term embedding Jython, mostly because there you can directly manipulate the API. I have personally embedded IronPython in a .NET codebase with good success. Yes you lose native speed, but the tradeoffs are hard to justify with the amount of C coding needed for a working JNI bridge. That said, you may well find projects like the one you listed (Jpype) that can do much of the legwork for you.
I would be asking what features of a native CPython runtime I need that you lose when going to Jython. Is your existing codebase in CPython heavily reliant on CPython native features?
Anyway, there's my attempt at an answer.
I must use a commercial Java library, and would like to do it from Python. Jython is robust and I am fine with it being a few dot releases behind. However, I would like to use NumPy as well, which obviously does not work with Jython. Options like CPype and Java numeric libraries are unappealing. The former is essentially dead. The latter are mostly immature and lack the ease of use and wide acceptance of NumPy. My question is: How can one have Jython and Python code interoperate? It would be acceptable for me to call Jython from Cpython or the other way around.
It's ironic, considering that Jython and Numeric (NumPy's ancestor) were initiated by the same developer (Jim Hugunin, who then moved on to also initiate IronPython and now holds some kind of senior architect position at Microsoft, working on all kind of dynamic languages support for .NET and Silverlight), that there's no really good way to use numpy in Jython. The closest thing to that, which I know of, is the "jnumerical" project -- the (scarce) docs are on sourceforge, but the updated sources are on bitbucket.
"Numeric Python", what jnumerical implements, is not as slick and streamlined as its numpy descendant, but it has about the same functionality and shares a lot of the concepts and philosophy, so maybe you could find it usable -- worth checking out, at least.
Consider using execnet, which allows you to combine the strengths of both Jython and CPython, including current NumPy. The disadvantage here is that you will have to pay for the cost of serializing/deserializing objects between the two interpreters in two different process spaces. (You can avoid network overhead by using its support for subprocess.) But such a combination may work well, given that you're considering JPype, which would have similar (and probably higher) overhead. Just ensure you've partitioned the work appropriately.
The Jython developers (and I'm one of them) are looking at supporting NumPy in the future, via support of the C Extension API, but this is very much preliminary planning indeed.
I look very much formard to the Jython C Extension API! That would be awesome!
Until, that point, I think you have two alternatives:
http://jepp.sourceforge.net/ for embedding python in java, it has a nice console. The disadvatage, for me a too big disadvatage, is that it needs to be compiled against your own python. And with the python upgrade, you have to recompile (I don't want to compile python, in order to compile and use the extension - it is also not possible, especially if the code should be executed on different machines, on grid for example)
http://lucene.apache.org/pylucene/jcc/ - this is used for lucene, and for many other projects. I personally use it to wrap GATE NLP engine and also solr. To make that available to Python. Jcc is much faster than the (dead) JPype, probably because some data structures (like lists) are optimized and also because it is interfacing python<->java via C++ extension (according to this: http://www.slideshare.net/onyame/mixing-python-and-java page 30) I have tried moving 6mil of integers in the list between python and java, JPype was orders of magnitude slower (but i don't remember the numbers)
However, using Jcc, you can wrap only public methods, and sometimes it is tricky, especially if that method is receiving or returning certain java objects (in short, JCC must compile wrappers also for the passed-in objects, otherwise all the methods using/returning such methods are not accessible). So unless you need to distribute your code, you are better of with JEPP.
Disclaimer: Have not had persnal experience with it yet
Seems like JyNI – Jython Native Interface is the way to go.
There's also a newer question posted which may have newer alternatives.
If you stick to vector and matrix maths, I suggest to have a look onto vectorz.
It is a pure Java implementation and shall be 100% usable from within jython. I still didn't try it, but will soon, since I have the same necessity in finding a numpy alternative.
Currently Google App Engine supports both Python & Java. Java support is less mature. However, Java seems to have a longer list of libraries and especially support for Java bytecode regardless of the languages used to write that code. Which language will give better performance and more power? Please advise. Thank you!
Edit:
http://groups.google.com/group/google-appengine-java/web/will-it-play-in-app-engine?pli=1
Edit:
By "power" I mean better expandability and inclusion of available libraries outside the framework. Python allows only pure Python libraries, though.
I'm biased (being a Python expert but pretty rusty in Java) but I think the Python runtime of GAE is currently more advanced and better developed than the Java runtime -- the former has had one extra year to develop and mature, after all.
How things will proceed going forward is of course hard to predict -- demand is probably stronger on the Java side (especially since it's not just about Java, but other languages perched on top of the JVM too, so it's THE way to run e.g. PHP or Ruby code on App Engine); the Python App Engine team however does have the advantage of having on board Guido van Rossum, the inventor of Python and an amazingly strong engineer.
In terms of flexibility, the Java engine, as already mentioned, does offer the possibility of running JVM bytecode made by different languages, not just Java -- if you're in a multi-language shop that's a pretty large positive. Vice versa, if you loathe Javascript but must execute some code in the user's browser, Java's GWT (generating the Javascript for you from your Java-level coding) is far richer and more advanced than Python-side alternatives (in practice, if you choose Python, you'll be writing some JS yourself for this purpose, while if you choose Java GWT is a usable alternative if you loathe writing JS).
In terms of libraries it's pretty much a wash -- the JVM is restricted enough (no threads, no custom class loaders, no JNI, no relational DB) to hamper the simple reuse of existing Java libraries as much, or more, than existing Python libraries are similarly hampered by the similar restrictions on the Python runtime.
In terms of performance, I think it's a wash, though you should benchmark on tasks of your own -- don't rely on the performance of highly optimized JIT-based JVM implementations discounting their large startup times and memory footprints, because the app engine environment is very different (startup costs will be paid often, as instances of your app are started, stopped, moved to different hosts, etc, all trasparently to you -- such events are typically much cheaper with Python runtime environments than with JVMs).
The XPath/XSLT situation (to be euphemistic...) is not exactly perfect on either side, sigh, though I think it may be a tad less bad in the JVM (where, apparently, substantial subsets of Saxon can be made to run, with some care). I think it's worth opening issues on the Appengine Issues page with XPath and XSLT in their titles -- right now there are only issues asking for specific libraries, and that's myopic: I don't really care HOW a good XPath/XSLT is implemented, for Python and/or for Java, as long as I get to use it. (Specific libraries may ease migration of existing code, but that's less important than being able to perform such tasks as "rapidly apply XSLT transformation" in SOME way!-). I know I'd star such an issue if well phrased (especially in a language-independent way).
Last but not least: remember that you can have different version of your app (using the same datastore) some of which are implemented with the Python runtime, some with the Java runtime, and you can access versions that differ from the "default/active" one with explicit URLs. So you could have both Python and Java code (in different versions of your app) use and modify the same data store, granting you even more flexibility (though only one will have the "nice" URL such as foobar.appspot.com -- which is probably important only for access by interactive users on browsers, I imagine;-).
Watch this app for changes in Python and Java performance:
http://gaejava.appspot.com/
(edit: apologies, link is broken now. But following para still applied when I saw it running last)
Currently, Python and using the low-level API in Java are faster than JDO on Java, for this simple test. At least if the underlying engine changes, that app should reflect performance changes.
Based on experience with running these VMs on other platforms, I'd say that you'll probably get more raw performance out of Java than Python. Don't underestimate Python's selling points, however: The Python language is much more productive in terms of lines of code - the general agreement is that Python requires a third of the code of an equivalent Java program, while remaining as or more readable. This benefit is multiplied by the ability to run code immediately without an explicit compile step.
With regards to available libraries, you'll find that much of the extensive Python runtime library works out of the box (as does Java's). The popular Django Web framework (http://www.djangoproject.com/) is also supported on AppEngine.
With regards to 'power', it's difficult to know what you mean, but Python is used in many different domains, especially the Web: YouTube is written in Python, as is Sourceforge (as of last week).
June 2013: This video is a very good answer by a google engineer:
http://www.youtube.com/watch?v=tLriM2krw2E
TLDR; is:
Pick the language that you and your team is most productive with
If you want to build something for production: Java or Python (not Go)
If you have a big team and a complex code base: Java (because of static code analysis and refactoring)
Small teams that iterate quickly: Python (although Java is also okay)
An important question to consider in deciding between Python and Java is how you will use the datastore in each language (and most other angles to the original question have already been covered quite well in this topic).
For Java, the standard method is to use JDO or JPA. These are great for portability but are not very well suited to the datastore.
A low-level API is available but this is too low level for day-to-day use - it is more suitable for building 3rd party libraries.
For Python there is an API designed specifically to provide applications with easy but powerful access to the datastore. It is great except that it is not portable so it locks you into GAE.
Fortunately, there are solutions being developed for the weaknesses listed for both languages.
For Java, the low-level API is being used to develop persistence libraries that are much better suited to the datastore then JDO/JPA (IMO). Examples include the Siena project, and Objectify.
I've recently started using Objectify and am finding it to be very easy to use and well suited to the datastore, and its growing popularity has translated into good support. For example, Objectify is officially supported by Google's new Cloud Endpoints service. On the other hand, Objectify only works with the datastore, while Siena is 'inspired' by the datastore but is designed to work with a variety of both SQL databases and NoSQL datastores.
For Python, there are efforts being made to allow the use of the Python GAE datastore API off of the GAE. One example is the SQLite backend that Google released for use with the SDK, but I doubt they intend this to grow into something production ready. The TyphoonAE project probably has more potential, but I don't think it is production ready yet either (correct me if I am wrong).
If anyone has experience with any of these alternatives or knows of others, please add them in a comment. Personally, I really like the GAE datastore - I find it to be a considerable improvement over the AWS SimpleDB - so I wish for the success of these efforts to alleviate some of the issues in using it.
I'm strongly recommending Java for GAE and here's why:
Performance: Java is potentially faster then Python.
Python development is under pressure of a lack of third-party libraries. For example, there is no XSLT for Python/GAE at all. Almost all Python libraries are C bindings (and those are unsupported by GAE).
Memcache API: Java SDK have more interesting abilities than Python SDK.
Datastore API: JDO is very slow, but native Java datastore API is very fast and easy.
I'm using Java/GAE in development right now.
As you've identified, using a JVM doesn't restrict you to using the Java language. A list of JVM languages and links can be found here. However, the Google App Engine does restrict the set of classes you can use from the normal Java SE set, and you will want to investigate if any of these implementations can be used on the app engine.
EDIT: I see you've found such a list
I can't comment on the performance of Python. However, the JVM is a very powerful platform performance-wise, given its ability to dynamically compile and optimise code during the run time.
Ultimately performance will depend on what your application does, and how you code it. In the absence of further info, I think it's not possible to give any more pointers in this area.
I've been amazed at how clean, straightforward, and problem free the Python/Django SDK is. However I started running into situations where I needed to start doing more JavaScript and thought I might want to take advantage of the GWT and other Java utilities. I've gotten just half way through the GAE Java tutorial, and have had one problem after another: Eclipse configuration issues, JRE versionitis, the mind-numbing complexity of Java, and a confusing and possibly broken tutorial. Checking out this site and others linked from here clinched it for me. I'm going back to Python, and I'll look into Pyjamas to help with my JavaScript challenges.
I'm a little late to the conversation, but here are my two cents. I really had a hard time choosing between Python and Java, since I am well versed in both languages. As we all know, there are advantages and disadvantages for both, and you have to take in account your requirements and the frameworks that work best for your project.
As I usually do in this type of dilemmas, I look for numbers to support my decision. I decided to go with Python for many reasons, but in my case, there was one plot that was the tipping point. If you search "Google App Engine" in GitHub as of September 2014, you will find the following figure:
There could be many biases in these numbers, but overall, there are three times more GAE Python repositories than GAE Java repositories. Not only that, but if you list the projects by the "number of stars" you will see that a majority of the Python projects appear at the top (you have to take in account that Python has been around longer). To me, this makes a strong case for Python because I take in account community adoption & support, documentation, and availability of open-source projects.
It's a good question, and I think many of the responses have given good view points of pros and cons on both sides of the fence. I've tried both Python and JVM-based AppEngine (in my case I was using Gaelyk which is a Groovy application framework built for AppEngine). When it comes to performance on the platform, one thing I hadn't considered until it was staring me in the face is the implication of "Loading Requests" that occur on the Java side of the fence. When using Groovy these loading requests are a killer.
I put a post together on the topic (http://distractable.net/coding/google-appengine-java-vs-python-performance-comparison/) and I'm hoping to find a way of working around the problem, but if not I think I'll be going back to a Python + Django combination until cold starting java requests has less of an impact.
Based on how much I hear Java people complain about AppEngine compared to Python users, I would say Python is much less stressful to use.
There's also project Unladen Swallow, which is apparently Google-funded if not Google-owned. They're trying to implement a LLVM-based backend for Python 2.6.1 bytecode, so they can use a JIT and various nice native code/GC/multi-core optimisations. (Nice quote: "We aspire to do no original work, instead using as much of the last 30 years of research as possible.") They're looking for a 5x speed-up to CPython.
Of course this doesn't answer your immediate question, but points towards a "closing of the gap" (if any) in the future (hopefully).
The beauty of python nowdays is how well it communicates with other languages. For instance you can have both python and java on the same table with Jython. Of course jython even though it fully supports java libraries it does not support fully python libraries. But its an ideal solution if you want to mess with Java Libraries. It even allows you to mix it with Java code with no extra coding.
But even python itself has made some steps forwared. See ctypes for example, near C speed , direct accees to C libraries all of this without leaving the comfort of python coding. Cython goes one step further , allowing to mix c code with python code with ease, or even if you dont want to mess with c or c++ , you can still code in python but use statically type variables making your python programms as fast as C apps. Cython is both used and supported by google by the way.
Yesterday I even found tools for python to inline C or even Assembly (see CorePy) , you cant get any more powerful than that.
Python is surely a very mature language, not only standing on itself , but able to coooperate with any other language with easy. I think that is what makes python an ideal solution even in a very advanced and demanding scenarios.
With python you can have acess to C/C++ ,Java , .NET and many other libraries with almost zero additional coding giving you also a language that minimises, simplifies and beautifies coding. Its a very tempting language.
Gone with Python even though GWT seems a perfect match for the kind of an app I'm developing. JPA is pretty messed up on GAE (e.g. no #Embeddable and other obscure non-documented limitations). Having spent a week, I can tell that Java just doesn't feel right on GAE at the moment.
One think to take into account are the frameworks you intend yo use. Not all frameworks on Java side are well suited for applications running on App Engine, which is somewhat different than traditional Java app servers.
One thing to consider is the application startup time. With traditional Java web apps you don't really need to think about this. The application starts and then it just runs. Doesn't really matter if the startup takes 5 seconds or couple of minutes. With App Engine you might end up in a situation where the application is only started when a request comes in. This means the user is waiting while your application boots up. New GAE features like reserved instances help here, but check first.
Another thing are the different limitations GAE psoes on Java. Not all frameworks are happy with the limitations on what classes you can use or the fact that threads are not allowed or that you can't access local filesystem. These issues are probably easy to find out by just googling about GAE compatibility.
I've also seen some people complaining about issues with session size on modern UI frameworks (Wicket, namely). In general these frameworks tend to do certain trade-offs in order to make development fun, fast and easy. Sometimes this may lead to conflicts with the App Engine limitations.
I initially started developing working on GAE with Java, but then switched to Python because of these reasons. My personal feeling is that Python is a better choice for App Engine development. I think Java is more "at home" for example on Amazon's Elastic Beanstalk.
BUT with App Engine things are changing very rapidly. GAE is changing itself and as it becomes more popular, the frameworks are also changing to work around its limitations.
I just started exploring Scala in my free time.
I have to say that so far I'm very impressed. Scala sits on top of the JVM, seamlessly integrates with existing Java code and has many features that Java doesn't.
Beyond learning a new language, what's the downside to switching over to Scala?
Well, the downside is that you have to be prepared for Scala to be a bit rough around the edges:
you'll get the odd cryptic Scala compiler internal error
the IDE support isn't quite as good as Java (neither is the debugging support)
there will be breaks to backwards compatibility in future releases (although these will be limited)
You also have to take some risk that Scala as a language will fizzle out.
That said, I don't think you'll look back! My experiences are positive overall; the IDE's are useable, you get used to what the cryptic compiler errors mean and, whilst your Scala codebase is small, a backwards-compatibility break is not a major hassle.
It's worth it for Option, the monad functionality of the collections, closures, the actors model, extractors, covariant types etc. It's an awesome language.
It's also of great personal benefit to be able to approach problems from a different angle, something that the above constructs allow and encourage.
Some of the downsides of Scala are not related at all to the relative youth of the language. After all, Scala, has about 5 years of age, and Java was very different 5 years into its own lifespan.
In particular, because Scala does not have the backing of an enterprise which considers it a strategic priority, the support resources for it are rather lacking. For example:
Lack of extensive tutorials
Inferior quality of the documentation
Non-existing localization of documentation
Native libraries (Scala uses Java or .NET libraries as base for their own)
Another important difference is due to how Sun saw Java and EPFL sees Scala. Sun saw Java as a product to get enterprise customers. EPFL sees Scala as a language intended to be a better language than existing ones, in some particular respects (OOxFunctional integration, and type system design, mostly).
As a consequence, where Sun made JVM glacially-stable, and Java fully backward compatible, with very slow deprecation and removal of features (actually, removal?), JAR files generated with one version of Scala won't work at all with other versions (a serious problem for third party libraries), and the language is constantly getting new features as well as actually removing deprecated ones, and so is Scala's library. The revision history for Scala 2.x, which I think is barely 3 years old, is impressive.
Finally, because of all of the above, third party support for Scala is incipient. But it's important to note, though, that JetBrains, which makes money out of selling the IntelliJ IDEA IDE, has supported Scala for quite some time, and keeps improving its support. That means, to me, that there is demand for third party support, and support is bound to increase.
I point to the book situation. One year ago there was no Scala book on the market. Right now there are two or three introductory Scala books on the market, about the same number of books should be out before the end of the year, and there is a book about a very important web framework based on Scala, Lift.
I bet we'll see a book about ESME not too far in the future, as well as books about Scala and concurrency. The publishing market has apparently reached the tipping point. Once that happens, enterprises will follow.
I was unshackled from the J2EE leash last year wanted to do something new after 12 years of Java in the enterprise building very large system for some of the worlds biggest companies.
I had tried Ruby on Rails in the past. After building a few sample apps I did not like the feel of it or the fact that I would have to write a ton of unit tests to cover stuff that is normally done by a compiler.
Groovy on Grails was my next port of call. I have to say I do like this but it suffers from the same dynamic typing problems as ROR. Don't get me wrong I am not putting Grails down as it is an excellent framework and I will still use it. Each has its own place IMO.
I then jumped on Scala and have now built a hybrid application based on Scala and Spring MVC. At first working with Scala is difficult but it gets easier and more productive the more time you put into it. I've reached a tipping point where I now want to invest time in Lift as well.
The combination of "Programming in Scala" and David Pollak's "Beginning Scala" books is good for learning the language, the latter with a less academic bent.
Scala is still young and has some way to go. I think it has a bright future and I see momentum is already picking up. Recently one of the creators of the Groovy language said in a blog post he would never have bothered designing Groovy if Scala had been around at the time.
I think some more work on better Java API integration/wrapping will give Scala the boost it needs to win more followers. The basic integration is there already but I think its could be polished a bit more.
Yes IDE support is there but it is basic at the moment. The powerfully refactoring support of Intellij is not there yet and I miss that a lot. The compiler + IDE support with a mix of of other plugins is not mature yet. I sometimes get very weird internal compiler errors caused by how Scala sits with JDO enhancement for the Goggle app engine. However these are little things that can be easily fixed. Early adaptation of new technologies and languages always comes with a little pain. But this bit of pain can produce great pleasure in the future.
If I look at the capabilities of Scala compared to early Java its miles ahead. When I moved from C++ to Java the JVM was not ready yet regarding scalability. There used to be lots of weird crash and burn JVM core dumps on various OSes. All of this has now been fixed in Java and the JVM is rock solid. Scals runs in the JVM so it has been given a massive head start on native platform integration. Its standing on the shoulders of giants!
After years of building and supporting enterprise applications my vote is for a language where a compiler can catch most of the non functional bugs before even unit tests are built. I love the type checking mixed with the power of functional programming. I like the fact that I am doing OO++.
I think the development community will decide if Scala is the future or not. The downside of adopting Scala now would be if it did not pick up momentum and adaptation. It would be very difficult to maintain an Scala code base with very few Scala developers around. However I watched Java come from the skunk works into the enterprise to replace C++ and it was all pushed from the bottom up by the developer community. Time will tell for Scala but currently it has my vote.
Beyond learning a new language, what's the downside to switching over
to Scala?
Thinking, thinking, thinking..... nope, there is none :-)
I'll tell you my little personal experience, and how I found that it wasn't so easy to integrate Scala with existing Java libraries:
I wanted to get started with something easy, and as I thought that Scala was very well suited for scientific computation I wanted to do a little wrapper around JAMA (Java Matrix library)... My initial approach was to extend the Matrix type with a Scala class and then overload the arithmetic operators and call the Java native methods, but:
The Matrix class doesn't provide a default constructor (without arguments)
The Scala class needs one primary constructor
I thought one good primary constructor could be the one accepting an Array[Array[Double]] (first thing that sucks, that syntax is much more verbose and hard to read than Double[][])
As far as I know by reading the manuals, the parameters of the primary constructor are also implicitly fields of the class, so I would end with one Array[Array[Double]] in the Scala subclass and another double[][] in the Java superclass, which is pretty redundant.
I think I could have used an empty primary constructor that initialized the superclass with some default values (for example, a [[0]]), or just make an adapter class that used the Jama.Matrix as a delegate, but if a language is supposed to be elegant and seamless integrated with another, that kind of things shouldn't happen.
Those are my two cents.
I don't think there are any downsides. Actually learning new language is very helpful for broadening your programming knowledge. You might gain from Scala such things as generic classes, variance annotations, upper and lower type bounds, inner classes and abstract types as object members, compound types, explicitly typed self references, views, and polymorphic methods.
It consistently breaks backwards compatibilty.
Community size is small.
IDE support isn't there yet.
Otherwise its fine.
It is just a young language, it will get there eventually.
Great for hobbyists, not ready for enterprise.
The two, by which I mean four, biggest downsides I'm seeing are:
Many working as developers in the professional community aren't trained and are unwilling to learn how to use a functional language, they won't even give it a go so they can understand why it's a better approach. This means you'll always be fighting an uphill battle getting adoption until it's mandated at the corporate level.
RDBMS integration is still a bit spotty. Plenty of solutions, but nothing that really sticks out as becoming a standard. For me though, this might be an advantage rather than a disadvantage. JPA2 is a mess and causes more issues than problems it solves. Hibernate criteria queries aren't much better.
IDE support is still lagging, but mostly in the area of debugging at this point. Code inspection is doing pretty good (at least in IntelliJ).
You'll never want to write another line of Java again! Likely you'll want to punch a wall or break something when forced back into the awkward syntax of Java.
The answers here are somewhat dated circa 2022 so I thought that I would contribute with an update. I have been working in a tech company that started using Scala at about the same time this question was originally asked. I recently blogged about lessons learned in that shop when trying to teach Java developers how to work with a code base written in idiomatic Scala so this topic is top-of-mind for me right now.
Just about all of the maturity issues in tooling, educational
collateral, and integration are gone. Version 2 of Scala is just as
rock solid a programming language as version 8 or 11 of Java.
Some of the most obvious advantages in switching to Scala are no longer
relevant because those language features have been added to newer
versions of Java.
Where Scala continues to outshine Java is in terms
of modularity and readability which affords idiomatic Scala as able
to handle code complexity better than Java.
That improved code scalability comes at the cost of a higher
learning curve which makes Scala less attractive to junior developers
who tend to make up the majority of your engineering group.