I was thinking of working on a project while I have some free time and this one looks pretty nice: http://mindprod.com/project/filefinder.html
One thing I'm wondering: will it really be much faster than the regular Windows search if I use SQL? I'm planning to use MySQL since it's open source. Also, do I need to be good at databases for this? I have basic knowledge of relational databases and can definitely write some SQL statements.
Thanks.
On Unix there are commands like find and locate. Finding a file is much faster with locate, which looks it up in a database, than with find, which walks the file system. I think Windows Search is also based on a database, so it will be hard to beat.
As for the database, I would use JavaDB or an embedded DB like SQLite. MySQL was bought by Oracle, and in my personal opinion there are better open source alternatives such as PostgreSQL.
I think this task is quite easy from the SQL point of view. For me the hardest part would be keeping the database synchronized after changes in the file system.
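To make the synchronization problem concrete, here is a minimal sketch of one possible approach: periodically rescan the directory tree and compare it with what the index already holds. The class and method names (IndexSync, scan, diff) are made up for illustration, and the "index" is represented as a plain map from path to last-modified time rather than a real database table.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Stream;

/** Hypothetical helper: compares an index snapshot with the current state of a directory tree. */
final class IndexSync {

    /** Paths that would need an INSERT, UPDATE, or DELETE in the index. */
    static final class Diff {
        final Set<String> added = new TreeSet<>();
        final Set<String> changed = new TreeSet<>();
        final Set<String> removed = new TreeSet<>();
    }

    /** Walks the tree under root and records each regular file's last-modified time in milliseconds. */
    static Map<String, Long> scan(Path root) throws IOException {
        Map<String, Long> snapshot = new HashMap<>();
        try (Stream<Path> paths = Files.walk(root)) {
            for (Path p : (Iterable<Path>) paths::iterator) {
                if (Files.isRegularFile(p)) {
                    snapshot.put(p.toString(), Files.getLastModifiedTime(p).toMillis());
                }
            }
        }
        return snapshot;
    }

    /** Compares what the index believes (indexed) with what is on disk now (current). */
    static Diff diff(Map<String, Long> indexed, Map<String, Long> current) {
        Diff d = new Diff();
        for (Map.Entry<String, Long> e : current.entrySet()) {
            Long old = indexed.get(e.getKey());
            if (old == null) {
                d.added.add(e.getKey());          // new file, not yet in the index
            } else if (!old.equals(e.getValue())) {
                d.changed.add(e.getKey());        // timestamp differs, so re-index it
            }
        }
        for (String path : indexed.keySet()) {
            if (!current.containsKey(path)) {
                d.removed.add(path);              // file is gone, so drop it from the index
            }
        }
        return d;
    }
}
```

A full rescan is simple but slow on large disks. The JDK's java.nio.file.WatchService can report changes as they happen instead, at the cost of more bookkeeping.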
Related
I currently have a web application that runs with all of the data in Oracle. At the high level, the application consists of a java applet, some java servlets, some Ajax, and the oracle database. I was wondering what converting the whole suite to Hadoop instead would cost in terms of work? Below are some questions that can help me get a grasp on it.
Is there any software that can take SQL database schema creation scripts and queries and convert them to appropriate calls in Hadoop?
How different are the Java APIs for communicating with Hadoop to that of oracle SQL?
There's a bit of Ajax in there too; how different is that from SQL to Hadoop?
Please consider me a beginner when explaining anything having to do with Hadoop. I don't need to drill down into specifics (unless you want to), just high level talks.
Thanks!
Hadoop is not suitable for use cases that need real-time querying and processing. Hadoop is best used for offline batch processing and data analysis. You can refer to the following link, Common Questions, to get some of your questions answered. There is no schema concept in HDFS, which is the filesystem in Hadoop. Data is stored in blocks on disk as regular files.
I would suggest you visit the Apache Hadoop site to learn what Hadoop is and which use cases it fits best.
If you are looking for a performant SQL-on-Hadoop solution, then check out InfiniDB.
http://infinidb.co
We are a 4th generation columnar MPP engine behind MySQL. We can sit on top of HDFS, GlusterFS or on your local system, so we can be on Hadoop or not, your choice. We are fully open source, GPLv2, there is no difference between the open source version and the enterprise version, use it as you want, scale as you need.
We operate in the interactive SQL area, many people use us for analytical queries against their data. Hadoop MapReduce is great at batch work and transformations, but falls short on the interactive side of things, and that is where solutions like InfiniDB come in.
While you are on Oracle and using Oracle SQL, there may not be much difference between that and the MySQL syntax we support; it depends on which Oracle features you are using. Many people use us as a drop-in replacement for their existing MySQL database to start getting the performance of a clustered MPP database. Transitioning to Hadoop, as you mentioned, is another use case, as we can provide the SQL interface so that your applications do not even realize they are working on top of a Hadoop cluster.
Feel free to contact me if you have any questions / comments.
I am looking for a database which I can use to store data about certain stock over a number of years. There will probably be a few thousand records. I am writing an application in Java and Clojure which will pull out data from this local database when required to display the data.
I was wondering if anyone knew of a good database to work with for this purpose? I only have experience with MySQL running on the server side.
Which database would be easiest to work with in Clojure and Java for local storage?
Thanks,
Adam
JDK 6 and later come bundled with Java DB, which is good enough for your use case.
For this kind of small-scale application it will almost certainly be easiest if you pick one of the many good embedded Java databases.
My personal top choices would probably be:
H2 - probably the best-performing pure Java database overall; if you believe their benchmarks, it is considerably faster than MySQL and indeed most other databases when run in a single-machine environment.
Apache Derby - a good all-rounder, mature and well supported (Oracle has included a version branded as Java DB in recent JDKs)
After that, you should be able to use them pretty easily using the standard JDBC toolset, so not much different from MySQL.
If you're after a really nice DSL for interfacing with SQL databases with Clojure, you should definitely also take a look at Korma.
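As a rough illustration of the "standard JDBC toolset" point, here is a minimal sketch of storing and reading daily closing prices through plain JDBC. The class name StockStore, the quotes table, and its columns are invented for this example, and it assumes the H2 jar is on the classpath. Apart from the URL, nothing here is H2-specific.

```java
import java.math.BigDecimal;
import java.sql.Connection;
import java.sql.Date;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

/** Hypothetical store for daily closing prices; table and column names are illustrative. */
final class StockStore {
    static final String CREATE_SQL =
        "CREATE TABLE quotes (symbol VARCHAR(10), trade_date DATE, close_price DECIMAL(12,4))";
    static final String INSERT_SQL =
        "INSERT INTO quotes (symbol, trade_date, close_price) VALUES (?, ?, ?)";
    static final String SELECT_SQL =
        "SELECT close_price FROM quotes WHERE symbol = ? AND trade_date = ?";

    /** H2 URL for a database kept in a local file, e.g. h2FileUrl("./stocks"). */
    static String h2FileUrl(String path) {
        return "jdbc:h2:" + path;
    }

    /** Opens a connection; fails with SQLException if no matching driver is on the classpath. */
    static Connection open(String url) throws SQLException {
        return DriverManager.getConnection(url);
    }

    static void createTable(Connection con) throws SQLException {
        try (Statement st = con.createStatement()) {
            st.executeUpdate(CREATE_SQL);
        }
    }

    static void save(Connection con, String symbol, Date day, BigDecimal close) throws SQLException {
        try (PreparedStatement ps = con.prepareStatement(INSERT_SQL)) {
            ps.setString(1, symbol);
            ps.setDate(2, day);
            ps.setBigDecimal(3, close);
            ps.executeUpdate();
        }
    }

    /** Returns the stored closing price, or null if there is no row for that symbol and day. */
    static BigDecimal load(Connection con, String symbol, Date day) throws SQLException {
        try (PreparedStatement ps = con.prepareStatement(SELECT_SQL)) {
            ps.setString(1, symbol);
            ps.setDate(2, day);
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() ? rs.getBigDecimal(1) : null;
            }
        }
    }
}
```

Typical use would be to open a connection with StockStore.open(StockStore.h2FileUrl("./stocks")), call createTable once, and then call save and load with dates built via Date.valueOf("2012-01-03").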
I have used Apache Derby for a similar application (although written mostly in Java). They have been running it for almost four years now, and performed more than 60,000 transactions with it with no major problems. Only the occasional bug on my part.
Derby is the same database as JavaDB; however, with Derby it's easier to keep up with releases, as you can just include it as a dependency rather than wait for the next JDK revision to come out.
Also, IIRC, JavaDB is only included with JDK, not the JRE.
Depending on the nature of your data and application and your willingness and/or constraints in working with a new database modality, you might also want to consider one of the document-oriented databases, MongoDB or CouchDB. If your data and application are SQL oriented, use one of the databases suggested.
I want to make a Java application with a rather small database. The PC on which I want to install it has no database software on it (no WAMP server, no Oracle, nothing...). I'm rather new to this kind of thing, and I don't know if it has already been asked, but this is what I want to accomplish
Now I have a couple of questions:
Is this doable?
What should I use? MySQL, Oracle, ...
How can I do this?
I hope this is enough to get a decent answer.
Yes, it is doable.
For use with Java, I strongly recommend Apache Derby because
you have the huge flexibility of being able to choose between embedded and client-server db, with no code refactoring needed to change data access mode
over H2 or HSQLDB: in my experience, I've found Apache Derby
to be much more reliable/resilient (other embedded DBMSs tend to break more than derby when power fails)
to eat up less RAM
to have better performance on bigger deployments (lots of rows, lots of data [in microbenchmarks with little real-world data H2 and HSQLDB can actually score better]).
to be particularly fast with select queries in heavily multithreaded environments
over MySQL and PostgreSQL
it's actually faster when you are not CPU/network-bound: I've seen it perform better than them in many cases (especially with bigger DBs -- say 10 GB) when it comes to filesystem access (MySQL and PostgreSQL, however, are more efficient in terms of CPU/network utilization, when these are a constraint)
over MySQL, PostgreSQL, Oracle DB, etc.
it's surprisingly fast (often faster) with very big DBs (say, 30 GB) -- something one wouldn't expect from a DBMS you can embed in any application with no deployment/configuration
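The first point in the list, switching between embedded and client-server mode without refactoring, comes down to which JDBC URL you hand to DriverManager. A minimal sketch, assuming the appropriate Derby jar (embedded engine or network client) is on the classpath; the class name DerbyUrls and the database name are placeholders:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

/** Hypothetical helper showing the two Derby connection modes. */
final class DerbyUrls {

    /** Embedded mode: the database engine runs inside the application's own JVM. */
    static String embedded(String dbName) {
        return "jdbc:derby:" + dbName + ";create=true";
    }

    /** Client-server mode: connects to a Derby Network Server (1527 is its default port). */
    static String client(String host, int port, String dbName) {
        return "jdbc:derby://" + host + ":" + port + "/" + dbName + ";create=true";
    }

    /** The rest of the data-access code only ever sees the Connection. */
    static Connection open(String url) throws SQLException {
        return DriverManager.getConnection(url);
    }
}
```

If the URL is read from a configuration file, changing the access mode should not require touching the data-access code at all.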
To get started, see
Apache Derby Getting Started guide
Apache Derby tutorial
Apache Derby FAQs
WorkingWithDerby wiki
If you don't need clients from the network to remotely connect to your database, an "embedded database" is what you want to implement.
Flame-preventing disclaimer: all the statements above are according to my very own personal experience, with the projects I've worked on and/or articles/benchmarks that I read and trusted as reliable. Unless otherwise stated (and in fact I'm not stating otherwise anywhere :) ), I'm referring to fresh out-of-the-box un-fine-tuned installations.
You should probably use an embedded database like H2 or HSQLDB. They are just simple libraries that you drop into your application, but they provide exactly the same JDBC interface.
You can use the full power of an SQL database without any external dependencies. H2, my personal favourite, allows you to create in-memory as well as persistent databases; you can optionally connect to it over a socket, it can expose a web interface on the default port 8082, and so on. On my development machine I don't even have a "normal" database installed; I always use H2.
Use HSQLDB, or use one of the SQLite JDBC adapters.
I recommend using the Derby database. It is very simple to embed in a Java application.
What is this computer's hardware like? What CPU, memory, and hard disk does it have?
What is the OS? Do you have administrator/root access?
If you have a typical PC running Windows with enough CPU, memory, and disk space, I recommend that you install MySQL. Just download MySQL for your OS and install it.
Download link:
http://www.mysql.com/downloads/mysql/
Here is the installation documentation:
http://dev.mysql.com/doc/refman/5.5/en/installing.html
Good luck.
I'm looking to add a pretty simple SQLite database to an existing Java EE application. I'm very used to using EJBs, ORMs, EntityManager, etc. to work with SQL databases. I've never not used some sort of ORM to work with relational DBs in Java.
I've been "recommended" to use a Java wrapper for SQLite, rather than a JDBC driver - so I'm kind of naked and ORM-less (right?). I'd like to get this up and running quickly.
Context
I've been using an in-memory cache, implemented as a Map, which gets filled with entries linearly over time. At some point, like when the app runs overnight, the cache will use all available heap space. Hence, storing the cache on disk (as a SQLite database) rather than in memory (as a Java Map).
Questions
How should I manage resources like SQLiteConnection? Normally I would let the container worry about all this, but since I'm not using JDBC, I have to do all this !##$%ing, non-value-added stuff manually - right?
Is there a way to implement this cleanly and transparently? I'd like to be able to just swap out an implementing class - e.g. replace FooMapCacheImpl with FooSQLiteCacheImpl.
"[Most] methods are confined to the thread that was used to open the connection". Is there a simple, straightforward way to ensure that I don't try to access a SQLiteConnection from threads other than the one that opened it?
...and the flip side of that question: can I avoid creating a new connection every time I want to read from/write to the database? It seems a bona fide PITA to have to manage connections per-thread rather than, say, per instance, which is how I've been thinking about communicating with databases in the past.
Basically
I'm rather lost when it comes to working with databases in Java/Java EE, without using an ORM.
What's the right way to do it?
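One possible shape for the swap-out and thread-confinement questions above is sketched below: callers depend only on an interface, and a wrapper funnels every call through a single worker thread, so whatever the wrapped implementation opens stays on that thread. FooCache and ThreadConfinedCache are hypothetical names. A FooSQLiteCacheImpl that opens its SQLiteConnection in its constructor is assumed but not shown, since it depends on the wrapper library. The map-backed class stands in for it here.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Callers depend only on this interface, so implementations can be swapped. */
interface FooCache {
    String get(String key);
    void put(String key, String value);
}

/** Heap-backed implementation; a disk-backed one would implement the same interface. */
final class FooMapCacheImpl implements FooCache {
    private final Map<String, String> map = new HashMap<>();
    public String get(String key) { return map.get(key); }
    public void put(String key, String value) { map.put(key, value); }
}

/** Runs every call to the wrapped cache, including its construction, on one dedicated thread. */
final class ThreadConfinedCache implements FooCache, AutoCloseable {
    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    private final FooCache delegate;

    ThreadConfinedCache(Callable<FooCache> factory) {
        // Build the delegate on the worker thread so any connection it opens belongs to that thread.
        this.delegate = call(factory);
    }

    public String get(String key) {
        return call(() -> delegate.get(key));
    }

    public void put(String key, String value) {
        call(() -> { delegate.put(key, value); return null; });
    }

    /** Submits the task to the worker thread and blocks until it has finished. */
    private <T> T call(Callable<T> task) {
        try {
            return worker.submit(task).get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
    }

    /** Stops the worker thread; without this the JVM will not exit on its own. */
    public void close() {
        worker.shutdown();
    }
}
```

With this arrangement a single connection can be reused for the lifetime of the cache, at the price of serializing all cache access through one thread.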
I don't think it is too hard to make a front end that implements Map and saves everything to a database using JDBC, but think twice before doing it. The performance of the whole system might be badly affected.
If the root cause of your problem is the lack of heap space, you should take a look at Terracotta's BigMemory, although it is a commercial (non-free) product.
Terracotta also has a pretty good cache framework (Ehcache), which is open source. Look at the cookbook; it might be inspiring.
If you want to do everything by hand and you don't mind using Spring, try spring-jdbc. It is very easy to integrate with any JDBC driver. Take a look at SimpleJdbcTemplate. It handles all the boilerplate code for you. You should probably use a connection pool as well, such as commons-dbcp.
The easiest SQLite JDBC driver to use is this one, since it doesn't rely on JNI. It might not be as fast, but for quick testing it is perfect.
If you aren't bound to SQLite, you can take a look at other available JDBC solutions such as HSQLDB or Derby.
I hope this will help you out.
You may also want to look at Berkeley DB Java Edition. It allows you to persist and manage Java objects directly in the library, without requiring an ORM (and the associated overhead). It runs on Android, it's a Java library, and it can manage data sets ranging in size from very small to very large. It was designed with Java application developers in mind and should be both faster and simpler to use than an ORM+RDBMS solution. You can find out more about it on our web site at Oracle Berkeley DB Java Edition.
Regards,
Dave
The sqlite4java wrapper is basically a JNI wrapper; it is nowhere near what you want.
An ORM like EclipseLink is in any case a layer on top of JDBC, and the entity manager always ends up going through JDBC.
sqlite4java, by contrast, lets you call SQLite from Java without having to do all the JNI wrapping yourself.
If you want to use an ORM and your preferred entity manager, then you should use a JDBC driver; the sqlite4java wiki references a few of them.
Hope this helps.
My workmate and I are trainees, and we were given the exercise of carrying out a project. We have decided to create a customer management application in Java. Now we have to choose a database. We are able to use Oracle, MySQL, PostgreSQL, HSQLDB and of course other open source databases.
So, which database would you recommend for us?
I thought Oracle would be too complex for our small project, wouldn't it?
Thank you in advance!
finsterr
Why not just use the database bundled with Java 6? JavaDB is Apache Derby, a production-quality database that originated at IBM as Cloudscape, and it works well.
Does your company already use Oracle and have people who are expert in it to go to when you get stuck? Then I would use it (if you can learn Oracle, all the other databases are easy in comparison; take advantage of the resources if you have them). If not, then use one of the others.
More important than which database you use is to get up to speed on relational database design before you start to put this together. Here's a starting place:
http://www.deeptraining.com/litwin/dbdesign/FundamentalsOfRelationalDatabaseDesign.aspx
Another good read:
Database development mistakes made by application developers
You will save yourself quite a bit of tedious setup and platform dependencies if you use an embeddable database written in Java. Apache Derby (in any of its incarnations) would be a good start.
It is strongly recommended to use a database abstraction layer like Hibernate to avoid having raw SQL in your code. This will allow you to choose a database at deployment time, which makes it much easier to move to a bigger database later.
I would agree that Oracle is probably a bit much; however, since you gave so few details, I cannot really say. I would recommend MySQL: it would probably work well and be kind to the budget. However, you need to give more information.
Do you need the ability to distribute your software? If so, the license will be a part of the decision-making process. Of the ones you listed, PostgreSQL is the easiest to work with in terms of licensing, essentially having a "do whatever you want with it" license.
Depending on the size/scope of your project, SQLite may be a good fit.
Firebird can be a good database for a project like yours:
small footprint
easy to administer
free and open source
good driver for Java (Jaybird)