As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 10 years ago.
I am embarking on a project that will involve crawling and processing huge amounts of data (hundreds of gigabytes), and mining it for structured data extraction, named entity recognition, deduplication, classification, etc.
I'm familiar with ML tools from both Java and the Python world: Lingpipe, Mahout, NLTK, etc. However, when it comes down to picking a platform for such a large scale problem - I lack sufficient experience to decide between Java or Python.
I know this sounds like a vague question, but I am looking for general advice on picking either Java or Python. The JVM offers better performance(?) than Python, but do libraries like Lingpipe etc. match up with the Python ecosystem? If I went with Python, how easy would it be to scale it and manage it across multiple machines?
Which one should I go with and why?
Apache is going strong, producing excellent stuff like Lucene/Solr/Nutch for search, Mahout for big data machine learning, Hadoop for MapReduce, OpenNLP for NLP, and a lot of NoSQL stuff. The best part is the big "I", which stands for integration: these products integrate well with each other, and in most situations they complement each other.
Python is great too; however, if you consider the ASF projects above, then I would go with Java, like Sean Owen. Python will always be available for the above, but mostly as add-ons and not as the actual stuff. For example, you can use Hadoop from Python through Hadoop Streaming.
I partially switched from C++ to Java in order to utilize some of the very popular Apache products like Lucene, Solr & OpenNLP and also other popular open source NoSQL Java products like Neo4j & OrientDB.
I think one big thing Java has going for it is Hadoop. If you really mean large scale, you'll want to be able to use something like that. Generally speaking Java has the performance advantage, and more libraries available. So: Java.
If you are looking at NoSQL databases fit for ML tasks, then Neo4j is one of the more production-ready (relatively) options and is capable of handling big data. It is native to Java but comes with a beautiful REST API out of the box, and hence can be integrated with the platform of your choice. Java will give you a performance edge here.
Closed 9 years ago.
I need some help on how to choose a technology for developing mobile apps. I have decided to use PhoneGap (the Cordova library) and jQuery Mobile with HTML to handle the device APIs and UI parts.
Now I am in a dilemma over which technology to use to connect to a database:
1. PHP
2. .NET
3. JAVA
I heard/read that PHP is lightweight and easy to work with, but .NET is more robust and secure. Now, I am unsure what exactly security means here. Does it mean PHP is not a secure way to handle database operations?
Can anyone please guide me on how to decide on a technology and take my development to a higher level?
I can give you more inputs as required. :)
Many thanks.
If you have never touched any of these technologies, you should use the easiest one.
Your priority should be like this; I will rank them according to their usability/simplicity:
1. PHP
Good:
By far the simplest of them all. In a matter of days you can learn more than enough to create your basic server, no matter whether you want to handle only REST calls or do full page creation on the server side.
It has the largest overall support and you will easily find hosting, if you don't already have it. It works on all current desktop OSes like Windows, Linux and macOS.
Bad:
Not that much. If I have to think of anything, I would say that it is the smaller brother of Java and .NET.
2. .NET
Good:
My favorite, more secure (but not that much more secure) than PHP. It requires much more time to handle and use right. As with Java, I prefer its syntax over PHP's. Its syntax is still more readable than Java's, especially if you delve into something more complex.
Bad:
However, as it is a Microsoft technology, it will run only on the Windows platform. Skip it if this is a turn-off for you.
3. Java
Good:
Almost the best of both worlds. Better and more powerful syntax than PHP, and unlike .NET you can run it on any available platform. Like .NET, it requires more time to master properly than PHP.
Bad:
Java is usually used in large corporate projects, and you will not find that much help with basic stuff and usage. Even if you master it, you will still need to delve into Java EE if you want to create anything decent and robust; basically, it is the largest time sink if you only need to create one server application. The other problem is memory consumption, which is why you will see far fewer Java hosting platforms than is the case with .NET and PHP.
Conclusion
If you don't have that much time and you are not sure you are ever going to use it again, then stick to PHP. If you are planning on using this technology for a longer period, then stick to .NET. And finally, if you are going to use it for a longer period but the Windows platform is a turn-off, then stick to Java.
Closed 10 years ago.
I've got to transform a bunch of XML documents to another format.
From what I can tell I can use XSLT or Java and an XML reader/writer. From past experience I remember XSLT being a slow, uphill struggle.
Any recommendations as to whether I would be better off using XSLT vs Java + XML reader/writer to make these new XML files?
Speed of transformation does not matter to me. This process will happen once for each of the 20 xml files that I need to process. I don't have to do this on an ongoing basis - in other words this process is throw-away.
The output XML will be similar, simple XML. No swank.
For a one-off process that you are just going to do the once and then throw away, then use whatever you are comfortable with and have the tools for. If you have more experience/tools for working with Java then go with that.
More generally, my experience on larger projects has been that we almost always end up resorting to Java (even if we start development in XSLT). On bigger, ongoing projects, requirements often grow and get more complex, and I have generally found that XSLT transforms for large, complex documents with several nested levels (with list nodes) become unbearable to understand and test. Plus, the inevitable requirement for additional logic, conditional transformations, DB lookups, etc. always crops up somewhere.
In the end, go with what you know!
I recommend you to use the technology you are most familiar with. You should not put too much effort into this because it is not a critical project - even the speed does not matter in this case.
You have certain options:
XSLT
Java SAXParser
Java DOMParser + iterate
Java StAX
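To give a feel for the first option, here is a minimal sketch of applying an XSLT stylesheet from Java using the JDK's built-in javax.xml.transform API. The stylesheet and the element names (items, item, entries, entry) are made up for illustration; your real documents will need their own templates.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class XsltSketch {
    // Hypothetical stylesheet: turns <items><item>..</item></items>
    // into <entries><entry>..</entry></entries>.
    static final String STYLESHEET =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
      + "<xsl:template match='/items'>"
      + "<entries><xsl:apply-templates select='item'/></entries>"
      + "</xsl:template>"
      + "<xsl:template match='item'>"
      + "<entry><xsl:value-of select='.'/></entry>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the given XSLT text to the given XML text and returns the result as a string.
    static String transform(String xml, String xslt) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            // Wrapped so callers do not have to deal with a checked exception.
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform("<items><item>a</item><item>b</item></items>", STYLESHEET));
    }
}
```

For 20 files you would read each file into a StreamSource and write to a file-backed StreamResult instead of strings; the rest stays the same.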
I'd have to go with XSLT, but ultimately the decision would depend on which tools you have available.
You could probably write it fairly trivially in Java using something like Dozer and/or Castor-XML, which would be a good option if all you have is a Java IDE.
If you have a good XSLT tool (like XMLSpy) then XSLT would be the way to go.
So yeah. Depending on which tools you have, either could do this in a couple of hours (probably slightly more in Java if you need to first learn the frameworks).
There is not enough information in this question to answer conclusively, so here are a few possible pointers.
If the job is simply to transform XML into another form of XML, with all the required data available in the original XML, then XSLT is likely to be the most appropriate. After all, that is the entire purpose of XSLT.
If you have never used XSLT, don't want to learn it or don't have time to learn it, and are very competent in Java, then Java is likely to be the tool to use. You will be quicker using a tool with which you are already familiar.
If the transformations require fetching data from some other source (e.g. a database), or doing some complex calculations, then you may have to use Java.
If the difference between the shapes of the two formats is great, then XSLT is likely to be a much clearer and easier way to perform the transformation.
If the differences are trivial, and you have very many documents to transform, then using the DOM in Java might be noticeably more performant.
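For comparison, the DOM route mentioned in the last pointer might look roughly like the sketch below. It only uses classes that ship with the JDK; the element names (item, entries, entry) are assumptions for illustration, not taken from the question.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class DomSketch {
    // Reads every <item> in the input and writes it as an <entry> under a new <entries> root.
    static String convert(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document in = builder.parse(new InputSource(new StringReader(xml)));

            // Build the output as a separate document rather than editing the input in place.
            Document out = builder.newDocument();
            Element entries = out.createElement("entries");
            out.appendChild(entries);

            NodeList items = in.getElementsByTagName("item");
            for (int i = 0; i < items.getLength(); i++) {
                Element entry = out.createElement("entry");
                entry.setTextContent(items.item(i).getTextContent());
                entries.appendChild(entry);
            }

            // A Transformer with no stylesheet simply serializes the DOM tree.
            Transformer serializer = TransformerFactory.newInstance().newTransformer();
            serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter writer = new StringWriter();
            serializer.transform(new DOMSource(out), new StreamResult(writer));
            return writer.toString();
        } catch (ParserConfigurationException | SAXException | IOException | TransformerException e) {
            throw new IllegalStateException("DOM conversion failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(convert("<items><item>a</item><item>b</item></items>"));
    }
}
```

Note how much more code this is than the equivalent pair of XSLT templates, which is part of the trade-off described above.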
Java is a strong language with thousands of APIs and plenty of documentation available to support you. So if I were you, I would definitely opt for Java.
Closed 11 years ago.
The engine I've been wanting to remake is from a PlayStation 1 game called Final Fantasy Tactics, and the game is basically a 2.5D game I guess you could say. Low-resolution sprites and textures, and 3D maps for battlefields. The plan is to mainly load the graphics from a disc, or .iso (I already know the sectors the graphics need to be read from) and fill in the rest with game logic and graphics routines, and probably load other things from the disc like the map data.
I want this to be a multiplatform project, because I use Linux and would like more people to join in on the project once I have enough done (and it's easy to get more people through platforms like Windows). I'll be making a website to host the project. Also, none of the graphics will be distributed; they'll have to be loaded from your own disc. I'd rather not have to deal with any legal issues, at least not soon after the project is hosted on my site.
But anyway, here's my dilemma: I know quite a bit of Java, and some Python, but I'm worried about performance/feature issues if I make this engine using one of these two languages. I chose them due to familiarity and platform independence, but I haven't gotten too far into graphics programming yet. I'm very much willing to learn, however, and I've done quite a bit of ASM work on the game, looking at graphics routines and whatnot. What would be the best route to take for a project like this? Oh, and keep in mind I'll eventually want to add higher-resolution textures in an .iso restructuring patch or something.
I'm assuming based on my results on Google that I could go with something like Pygame + OpenGL, JOGL, Pyglet, etc. Any suggestions on an API? Which has plenty of documentation/support for game or graphics programming? Do they have any serious performance hits?
Thank you for your time.
I'd recommend going with PySFML, and, of course, Python.
If you do your Python programming correctly, and if you really are willing to fiddle with C or ASM Python plugins for faster computations, you shouldn't really see too many performance hits.
First thing, I wouldn't worry too much about language performance at this moment. If you worry about performance unnecessarily and choose the wrong/hard platform, your project will be dead before it has started, because it will take you longer to produce something and it will be harder to get other people to join your project.
Since you are familiar with Java and Python, I'd suggest doing your project with Jython or JRuby. That way you get to write in a nice and powerful language with the benefit of the Java runtime.
By choosing to run it on Java runtime you get:
Multi-platform support, which addresses your concern about the Linux/Windows platforms.
The latest Java runtime is very good, and in most cases the JIT can perform as well as or better than a natively compiled program.
At the end of the day, if you're passionate about the project and committed to getting the most out of the language you choose, the performance difference between Java and Python will be minimal, if not non-existent.
Personally speaking, the biggest challenge is finishing the project once it loses novelty and initial momentum. I suggest you go with whichever language you're more passionate about and are interested in plumbing the depths of, or one that could boost your resume.
Secondly, as you mention you're hoping to attract contributors, you may want to factor that into your decision. I can't comment much here, but have a look at similar projects with lots of activity.
Good luck!
Closed 10 years ago.
There has been a lot of movement in the Scala-based web framework community of late. Coming from Rails, Rake, ActiveRecord and migrations, which is a good Scala framework to build production sites in?
A small hit in performance is acceptable if it gives much more maintainable code. It would also be nice if collaboration features were built in, e.g. something like DB migrations.
(moderator edit: David Pollak is the founder of the Lift framework)
If you want a nice simple Scala web framework for doing CRUD and a few pages, Play would be my suggestion. It's got a nice development cycle and it's simple and approachable.
If you're building an app that is going to grow and handle lots of traffic, Lift is my recommendation ;-)
Lift supports a variety of ORM systems. Mapper is much like ActiveRecord. Rather than using migrations, Mapper uses Schemifier to read the schema definition from the Mapper definitions and updates the RDBMS accordingly.
If you're building any kind of Ajax or Comet app, Lift is the right choice. Lift's Ajax support is simple... just associate a function on the server with an Ajax control. When the user clicks the button, pulls down the select, etc. the function gets invoked.
Lift has the best server-push (Comet) support of any framework. Please check out http://liftweb.blip.tv/file/2033658/ for a flavor of the Comet support.
In terms of performance and scalability, Lift powers Foursquare and other very high traffic sites.
In terms of concise code, Lift is very concise, yet type-safe (the same is not true of Play and other frameworks that represent variables with String names). So, you get the kind of type-safe, very maintainable REST support that's also very concise demonstrated here: http://www.assembla.com/wiki/show/liftweb/REST_Web_Services
Play with the Scala module is far better than Lift in my opinion; Scala is a first-class citizen in Play. It is stateless, fast, simple, powerful, in production use, has a Scalate module, has active users/developers, and is a full-stack framework including caching, DB, logging, ...
Take a look at this video: http://vimeo.com/7731173
The current (and quite likely the future) star of the Scala web frameworks is Lift, although you can use any other Java framework like Play with Scala, too.
You don't have to fear any performance hit when moving from Ruby to Scala/Lift; expect it to run faster (I have heard figures between 600% and 2000% faster than Ruby on Rails), though it depends on what you are doing.
Here are two short explanations from the creator of Lift about what Lift does and why it might be interesting for people coming from Rails.
For migrations, see Scala Migrations.
Lift has no builders (yet), but I think the Play framework has them. However, Lift is probably the way to go if you are developing enterprise sites.
Lifty is a builder/processor for Lift.
For an introduction to Lift, have a look at Lift in Action (prerelease) and The Definitive Guide to Lift: A Scala-based Web Framework. The latter is also available at Google Groups; see the file "master.pdf".
Lift
It is supposed to be like Ruby on Rails and is preferred by many.
Closed 10 years ago.
What is your preferred scripting language in the Java world (a scripting language on the JVM), and why? When do you prefer your scripting language over Java (in what situations, for example for prototyping)? Do you use it for large projects or for personal projects only?
For me, Clojure wins hands down. Being a Lisp, it's concise and dynamically typed, and it has enormous expressive power, so I can write a lot of functionality in a few lines. And its integration with Java is unsurpassed – I'm only partly joking when I say Clojure is more compatible with Java than Java itself. As a result, the entire breadth of Java libraries, both from the JDK and 3rd party, is fully usable.
Finally, with a little care Clojure code can be as fast as Java code, so I'm not even losing performance.
My favorite is Jython, and I use it by embedding a Jython interpreter in my Java application. It is being used in commercial projects, and it lets us easily customize the application according to the customers' needs without having to compile anything. Just send them the script, and that's it. Some customers might even customize their application themselves.
I've successfully used Groovy in a commercial project. I prefer scripting languages because of duck typing and closures:
    def results = []
    def min = 5
    db.select(sql) { row ->
        if (row.value > min)
            results << row
    }
Translation: Run a SQL query against the database and add all rows where the column "value" is larger than "min" to "results". Note how easily you can pass data to the inner "loop" or get results out of it. And yes, I'm aware that I could achieve the same with SQL:
    def results = []
    def min = 5
    db.select(sql, min) { row ->
        results << row
    }
(just imagine that the String in "sql" has a "?" at the right place).
IMHO, using a DB with a language which doesn't offer rich list operations (sort, filter, transform) and closures just serves as an example how you should not do it.
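For readers who stay in plain Java, a rough equivalent of the first snippet is possible with lambdas and streams (Java 8 and later, which did not exist when this was written). This is only a sketch: the Row class below is a hypothetical stand-in for whatever the database layer returns, not part of any real API.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class RowFilterSketch {
    // Hypothetical stand-in for a database row with a numeric "value" column.
    static final class Row {
        final String name;
        final int value;

        Row(String name, int value) {
            this.name = name;
            this.value = value;
        }
    }

    // Keeps only the rows whose value is strictly larger than min.
    // The lambda captures min from the enclosing scope, like the Groovy closure does.
    static List<Row> largerThan(List<Row> rows, int min) {
        return rows.stream()
                   .filter(row -> row.value > min)
                   .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(new Row("a", 3), new Row("b", 7), new Row("c", 9));
        for (Row row : largerThan(rows, 5)) {
            System.out.println(row.name + " = " + row.value);
        }
    }
}
```

It is still more ceremony than the Groovy version, which is rather the point being made above.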
I'd love to use Jython more but the work on Jython 2.5 has started only recently and Python 2.2 is just too old for my purposes.
I might prefer Scala, but I can't say yet as I am still learning it. At the moment I am using Groovy to write small utility programs. I haven't even tried Grails. I have heard lots of good things about the Lift framework for Scala as well.
JavaScript Rhino has a compelling advantage -- it is included with the JDK. That being said, later versions of Rhino than the one with Java 6 have nice features like generators, array comprehensions, and destructuring assignment.
I favor using it whenever the ceremony of handling Java exceptions clutters up the code for no real benefit. I also use it when I want to write a simple command-line script that takes advantage of Java libraries.
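As a sketch of how a bundled engine is reached from Java, the standard javax.script API can look up an engine by name and evaluate a script. Whether a "JavaScript" engine is actually present depends on the JDK version (as far as I know, recent JDKs no longer bundle one), so the helper below, whose name is made up for illustration, returns an empty Optional when no engine is found.

```java
import java.util.Optional;
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;
import javax.script.ScriptException;

class ScriptHostSketch {
    // Evaluates the script with the named engine, if such an engine is installed.
    // Returns Optional.empty() when the engine is missing (or when the script yields null).
    static Optional<Object> evalIfAvailable(String engineName, String script) {
        ScriptEngine engine = new ScriptEngineManager().getEngineByName(engineName);
        if (engine == null) {
            return Optional.empty();
        }
        try {
            return Optional.ofNullable(engine.eval(script));
        } catch (ScriptException e) {
            // Wrapped so callers do not have to handle a checked exception.
            throw new IllegalStateException("Script failed", e);
        }
    }

    public static void main(String[] args) {
        Optional<Object> result = evalIfAvailable("JavaScript", "1 + 1");
        System.out.println(result.isPresent()
            ? "Result: " + result.get()
            : "No JavaScript engine available on this JDK");
    }
}
```

The same lookup works for other JSR 223 engines (Groovy, Jython, and so on) when their jars are on the classpath; only the engine name changes.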
Java. Seriously. It's a powerful, easy-to-use (if a tad verbose) language that everybody knows. The integration with Java is great.
The company I work for embeds Groovy into a Java/Spring website, which is deployed on a number of sites. The scripts are stored externally of the compiled WAR file, and allow us to manipulate some of the site logic without having to roll out new WAR's to each site. So far, this approach has worked very elegantly for us.
A particularly nice feature of Groovy is that it can closely resemble Java code, which makes it very easy to port existing Java classes to it.
How about SISC (Second Interpreter of Scheme Code)?
REF: http://sisc-scheme.org/