Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 8 years ago.
I have a question for you concerning Java. I am basically a Java user and have done most of my work with it. However, in the machine learning classes I took in college, we mostly used Python with the scikit-learn and NumPy packages.
Now I want to do a project where I crawl data from the web, store it in SQL databases, and then do machine learning on this data. Maybe some of you have experience with these things and can share some of it? I mean, of course it is possible to do these things with Java, but maybe you have had particular experiences that suggest I should use something else, or know what I should consider?
I am happy for all your thoughts :-)
Have a great weekend!
It turns out that programming language and database implementation are secondary problems. Think first about the machine learning you want to do. Review the existing packages (in any language) and pick one according to how well it fits the needs of the business problem you are trying to solve. Then work with whatever language is most convenient for that package. You will probably find that no single language is suitable for all parts of the problem; you will end up gluing together Java, Python, R, shell scripts, etc, to make a complete solution, and there's nothing wrong with that. Consider that your job is problem solving instead of programming in a specific language and go from there.
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 6 years ago.
I have looked around, and there are no solid guidelines for converting object-oriented C++ to Java. Most of what I find are references to conversion tools.
My question is: what steps should one take to avoid getting overwhelmed and lost, especially for heavily OO projects?
For example, take one method that accomplishes a task. That method depends on several other .cpp files, and those helper methods in turn depend on other .cpp files, and so on. How should this be addressed?
What techniques can be used to break it down while properly combining the .hpp and .cpp files?
I understand JNI can be used; however, the goal is to have only Java code, unless something can literally only be done in C++.
Tips, suggestions, and ideas will be much appreciated.
PLEASE do NOT mark this as a DUPLICATE. The existing questions are only about specific code or about conversion tools, not about general techniques.
Also, if this is a terrible question, let me know, I'll take it down, no need to thumb it down. Thank you.
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 7 years ago.
My company has a production application being built in Java and C++. We have recently added Data Scientists who are skilled in R and use it for their work. I am wondering what best practices people have for making sure that work done in R is best leveraged. For instance, is our best option to call R code from Java or C++? I have found Renjin: http://www.renjin.org/about.html.
Or is there a good way to convert code from R to Java or C++?
I am not a big fan of Renjin, as its Java-based interpreter will only cover a subset of CRAN, and only the subset that does not involve calls to C++.
I am a bigger fan of either:
separation of concern:
use something like Rserve for headless connection from anything (including Java), or
use something like OpenCPU to turn everything into web-based access
for heavier-duty work, interface with C++ directly via Rcpp, which well over 400 CRAN packages do.
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 6 years ago.
I am working on a small project on metadata extraction from documents and have run into, eh, a dilemma. I have some Java libraries that work well for document handling and information retrieval, like Apache Tika, POI, etc., plus some tools in other languages, like Ruby (pdf-extract), and a bash script that fetches data from a RESTful API using wget.
AFAIK, code reuse is a good thing, right? But then, if it's not possible (natively, I mean) to reuse all this code, what approach should be taken?
Using Java to run terminal commands is a solution, but I don't think it is good programming practice.
Integrating multiple technologies is something that is very common in real world applications. In order for it to scale properly, you probably want to use some methodology to keep things consistent. To me, the weakest part is probably fetching using wget, but that's my opinion.
In order to integrate and for everything to scale nicely you may want to look at some message passing protocols and have some sort of handling of queues where individual workers run in different programming languages and environments. Look at:
https://www.amqp.org/ (message passing standard)
https://www.rabbitmq.com/ (Java, .NET, Ruby, Python, PHP, JavaScript...
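To make the queue idea a bit more concrete, here is a minimal sketch in Python. Note the assumptions: it uses the standard library's `queue.Queue` as an in-process stand-in for a real broker such as RabbitMQ, and `extract_metadata`, `worker`, `run` and the message fields are hypothetical names made up for illustration. The part that carries over is that messages are plain JSON, which is what would let workers written in Java, Ruby or anything else consume the same queue.

```python
import json
import queue
import threading


def extract_metadata(path):
    # Hypothetical placeholder for a real extractor (Tika, pdf-extract, ...).
    return {"path": path, "extension": path.rsplit(".", 1)[-1].lower()}


def worker(tasks, results):
    # Consume JSON messages until the None sentinel arrives.
    while True:
        message = tasks.get()
        if message is None:
            break
        job = json.loads(message)
        results.put(json.dumps(extract_metadata(job["path"])))


def run(paths):
    # queue.Queue is only an in-process stand-in for a message broker.
    tasks, results = queue.Queue(), queue.Queue()
    thread = threading.Thread(target=worker, args=(tasks, results))
    thread.start()
    for path in paths:
        tasks.put(json.dumps({"path": path}))
    tasks.put(None)  # tell the worker to stop
    thread.join()
    collected = []
    while not results.empty():
        collected.append(json.loads(results.get()))
    return collected
```

With a real broker, the producer and each worker would be separate processes, and only the JSON message format would be shared between them.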
Closed. This question is opinion-based. It is not currently accepting answers.
Closed 9 years ago.
I have been a Java programmer for 2 years.
My programs (Java SE) usually use a database (MySQL).
Should I use the classic command-line environment or GUI tools, MySQL Administrator for example?
Is it necessary to be a database administrator? Or not?
I want to be a Java programmer.
Sorry for this question!
The use of GUI tools and IDEs is very common in the IT industry. They also make life easier, as they have very good features for detecting bad usage of the programming language, bad syntax, and illegal uses of the language, plus pretty cool stuff such as refactoring options, to name just one.
I encourage people who are starting out or want to achieve an IT certification (SCJP) not to use them during their training. Sometimes people spot an error only because the IDE is complaining about something, and they do not really know what the root cause is. Removing the GUI or IDE will help you a lot in really understanding the root cause of problems and will give you really good troubleshooting skills.
As for SQL, although you want to become a Java programmer, SQL needs to be one of your skills. I strongly suggest learning it without GUI tools or SQL generators. It is also not bad to know the most frequently used commands by heart and to be able to solve problems using just the command line. I know a lot of people in IT who do not know how to write an INSERT statement and its variants in SQL, as they always use MySQL GUI tools to add records.
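As a small illustration of the INSERT variants mentioned above, here is a sketch with the statements written out by hand. Some assumptions: it runs against an in-memory SQLite database through Python's built-in `sqlite3` module rather than MySQL, so that it works without a server, and the `users` and `archive` tables are made up for the example. The three SQL forms shown (single row, multiple rows, INSERT ... SELECT) are also valid in MySQL.

```python
import sqlite3


def demo_inserts():
    # In-memory SQLite is an assumption to keep the sketch self-contained.
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()
    cur.execute(
        "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL, city TEXT)"
    )

    # 1. Single row, with the target columns named explicitly.
    cur.execute("INSERT INTO users (name, city) VALUES ('Ana', 'Lima')")

    # 2. Several rows in one statement.
    cur.execute(
        "INSERT INTO users (name, city) VALUES ('Ben', 'Oslo'), ('Cho', 'Seoul')"
    )

    # 3. Parameterized insert: values are passed separately from the SQL text.
    cur.execute("INSERT INTO users (name, city) VALUES (?, ?)", ("Dee", None))

    # 4. INSERT ... SELECT copies rows from one table into another.
    cur.execute("CREATE TABLE archive (name TEXT, city TEXT)")
    cur.execute(
        "INSERT INTO archive (name, city) "
        "SELECT name, city FROM users WHERE city IS NOT NULL"
    )
    conn.commit()

    users = cur.execute("SELECT name, city FROM users ORDER BY id").fetchall()
    archived = cur.execute("SELECT COUNT(*) FROM archive").fetchone()[0]
    conn.close()
    return users, archived
```

The parameterized form is what you would normally use from application code such as JDBC; the literal forms are what you would type in the command-line client.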
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 9 years ago.
I want to start learning big data technology from scratch. Is it necessary to learn Java to work with Hadoop, given that I am already well versed in Python?
No, you don't necessarily need Java knowledge, as you can write map-reduce jobs perfectly well in Pig or Hive (similar to SQL). However, as with all layers of abstraction, at some point you may well need to know what is going on "behind the scenes", and being able to read, understand and debug the underlying Java is a big advantage.
There is a lot of effort currently going into providing a more complete SQL interface to Hadoop, with tools such as Impala (Cloudera), Presto (Facebook), Phoenix and Hive (already mentioned).
Check out MRJob, a Python-based wrapper for running, logging and monitoring Hadoop jobs.
Although pure Java solutions might be faster in some cases, you will hardly ever need to debug Java code.
Not needed at all, though that's just my opinion. If you know Python well, you should be fine.
Check this out: writing a Hadoop map-reduce job in Python. There are a lot of ways to implement solutions with Hadoop. Just because a great deal of them are in Java doesn't mean Java is the only tool you can use. If you're working with legacy code that is written in Java, then knowing the basics may help, but to be honest I think you could just look things up as you come across them. There is no need to spend a week learning the intricacies of Java 7 and what's new in Java 8 for your current needs.
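To give an idea of what a map-reduce job in Python looks like, here is a minimal word-count sketch in the style used with Hadoop Streaming, where the mapper and reducer exchange tab-separated `key\tvalue` lines. This is an assumption-laden illustration, not the code from the link above: `mapper`, `reducer` and `run_local` are hypothetical names, and `run_local` only simulates the shuffle step locally with `sorted()` instead of running on a cluster.

```python
from itertools import groupby


def mapper(lines):
    # Emit one "word<TAB>1" line per word.
    for line in lines:
        for word in line.split():
            yield f"{word.lower()}\t1"


def reducer(sorted_lines):
    # Input must be sorted by key, as it is after Hadoop's shuffle phase.
    pairs = (line.rstrip("\n").split("\t", 1) for line in sorted_lines)
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        yield f"{word}\t{sum(int(count) for _, count in group)}"


def run_local(lines):
    # sorted() stands in for the shuffle-and-sort between map and reduce.
    return list(reducer(sorted(mapper(lines))))
```

On a real cluster the mapper and reducer would typically live in two separate scripts that read from standard input and print to standard output, with no Java written by you at all.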