Summary:
I am trying to write a utility program that is based on the information contained in a separate file. The object has to be such that any information on the physical file can be retrieved quickly and can be updated quickly as well.
Details:
The file is a normal ANSI encoded file that is supposed to store definitions of the physical quantities stated in the SI system. What I really want is that I should be able to read and write changes to the definitions whenever required. I'll be using markers(like ":") to get the headings and definitions like:
Length:metre:m:"..length of path traveled by light in vacuum in
1/299792458th of a second"
and so on.
So in this case is extending RandomAccessFile an option? Will it help me in quick retrieval and syncing of data? Should I use another approach?
If you want these things, then I'd advise you to use an embedded ACID database like H2:
Guarantee that you don't lose changes that you made
Have more than one program access the info
This is because coding up something that correctly does this using low level facilities like RandomAccessFile is quite hard. Storing persistent application state in embedded DBs is commonly done. H2 is probably the most popular among DBs implemented in pure Java.
On how to actually do this, see this: Embedding the Java h2 database programmatically
You prob. want to look at introduction on relational DBs & SQL if you aren't familiar with them.
Related
Unfortunately I couldn't find anything specific to this topic / to my problem. Here we go:
I'm building a JavaFX Business Application for a friend of mine. Unfortunately I do not have any possibility to connect to a Database. I want the Application to load a savestate from a file. The application contains a list with clients and the clients got some specific properties. I do not want to hardcode this to a .prop or .txt file, because I'm sure that there's a different way of doing this, isn't there?
Thanks in advance, appreciate it!
Lots of choices for persisting data to local storage. The exact choice depends on your needs. You do not describe enough details to make a specific recommendation.
Here is a list of possibilities, roughly in increasing order of complexity of your data.
Text file
If you have small amounts of simple data, save to a text file. You can store each piece in a separate file, or combine into a single file. Recent versions of Java have new classes to make this easier than ever. See Oracle Tutorial.
Comma-separate & Tab-delimited
For sets of structured data, write to text files in comma-separated values (CSV) or tab-delimited values. For example a list of people with rows for each person, and columns for name, phone number, and email address.
While reading/writing such files is easy enough to program yourself, I suggest using an established library to eliminate the drudgery, avoid bugs, and save yourself some time. There are a few such libraries written in Java.
My favorite is the Apache Commons CSV project. This library makes easy work of the chore of reading/writing such files. Despite the name, this library supports tab-delimited as well as comma-separated formats. I've written a few Answers here on Stack Overflow showing how to use this library, as you can see here, here, and here.
By the way, plain old ASCII defines a few character positions explicitly for delimiting in data files, with four levels of grouping (document, group, record/row, and field). Unicode, of course, inherits these from ASCII as code points. I am puzzled why these have remained so obscure and so infrequently used. Seems much more logical to me than using commas and tabs which may well exist inside the data payload.
Serialization
You can write out the data values stored within an object. This is called serialization. Java has a serialization facility built-in, but be sure to study up on the details.
To more simply write out an object’s values and later read them back in to reconstitute an object, I have enjoyed using the Simple XML Serialization project. This works well for relatively simple needs, and is aimed at the situation where you want the structure of a class to drive the process of determining what to write.
Java has other XML binding facilities both built-in and third-party. These are much more powerful in their flexibility. They are especially good for when you want to define and verify the XML structure in a rigid fashion such as defining a XML DTD or XML Schema against which to validate the data and perhaps even generate the Java class in which to represent the data.
Embedded database
For more complicated data, use an embedded relational database.
The SQLite database is bundled with many platforms. This is a C-based library, not pure Java. As the name indicates, SQLite is indeed quite “lite“, lacking rigid data types and many other common database features. SQLite is meant to be an alternative to writing text files than as a competitor to more serious databases. It is a great product if your needs fit the sweet-spot of its capabilities.
My first choice for an embedded database would be H2 Database Engine. Built in pure Java. Can be run inside your app, or separately as a server (you choice). Has sophisticated relational database features. Has been around for years, often updated, and is well-worn. The principal author has much experience in the field.
I'm looking to make a web that makes use of two sets of databases, given in CSV format and both are 10 MB in size. I've chosen to use Java dynamic web app with JSP, that users can use to search and sort through the data provided through the CSV.
From what I understand, the user/client sends a request to the server, the server will call upon the Java cases in the backend, which has the different sorting methods and data from the CSV that can be manipulated.
This data, that sits in the backend, is where I'm running into confusion. I know its possible to load the data to a database, and have that sitting on the server that I could call upon.
If I use a class that reads the CSV and loads the data to arrays, Would this reading work be done every time someone accesses the website causing latency or would it already be loaded into arrays in the server?
Depending on the scope you use it would be loaded in an application context, therefore one time (say in a singleton class loaded at the application startup).
But I wouldn't recommend this approach, I would recommend a proper designed database where you can put your csv data into. This way you would have the database engine to help you organize your data which would give you scalability and maintainability (although with a proper design of your classes say a DAO pattern would give you the same).
Organized data in a database would give you more flexibility to search through your data using already made SQL functions.
In order to make my case here are some advantages of a Database system over a file system:
No redundant data – Redundancy removed by data normalization
Data Consistency and Integrity – data normalization takes care of it too
Secure – Each user has a different set of access
Privacy – Limited access
Easy access to data
Easy recovery
Flexible
Concurrency - The database engine will allow you to concurrent read the data or even write to it.
I'm not listing the disadvantages since I'm making my case :)
I can read from a CSV file to build your arrays. You can then add the arrays to session scope. The CSV file will only be read at the servlet that processes it. Future usage will be retrieved from session.
I'd like to save persistent objects to the file system using Hibernate without the need for a SQL database.
Is this possible?
Hibernate works on top of JDBC, so all you need is a JDBC driver and a matching Hibernate dialect.
However, JDBC is basically an abstraction of SQL, so whatever you use is going to look, walk and quack like an SQL database - you might as well use one and spare yourself a lot of headaches. Besides, any such solution is going to be comparable in size and complexity to lighweight Java DBs like Derby.
Of course if you don't insist absolutely on using Hibernate, there are many other options.
It appears that it might technically be possible if you use a JDBC plaintext driver; however I haven't seen any opensource ones which provide write access; the one I found on sourceforge is read-only.
You already have an entity model, I suppose you do not want to lose this nor the relationships contained within it. An entity model is directed to be translated to a relational database.
Hibernate and any other JPA provider (EclipseLink) translate this entity model to SQL. They use a JDBC driver to provide a connection to an SQL database. This, you need to keep as well.
The correct question to ask is: does anybody know an embedded Java SQL database, one that you can start from within Java? There are plenty of those, mentioned in this topic:
HyperSQL: stores the result in an SQL clear-text file, readily imported into any other database
H2: uses binary files, low JAR file size
Derby: uses binary files
Ashpool: stores data in an XML-structured file
I have used HyperSQL on one project for small data, and Apache Derby for a project with huge databases (2Gb and more). Apache Derby performs better on these huge databases.
I don't know exactaly your need, but maybe it's one of below:
1 - If your need is just run away from SQL, you can use a NoSQL database.
Hibernate suports it through Hibernate OGM ( http://www.hibernate.org/subprojects/ogm ).
There are some DBs like Cassandra, MongoDB, CouchDB, Hadoop... You have some suggestions Here
.
2 - Now, if you want not to use a database server (with a service process running always), you can use Apache Derby. It's a DB just like any other SQL, but no need of a server. It uses a singular file to keep data. You can easily transport all database with your program.
Take a look: http://db.apache.org/derby/
3 - If you really want some text plain file, you can do like Michael Borgwardt said. But I don't know if Hibernate would be a good idea in this case.
Both H2 and HyperSQL support embedded mode (running inside your JVM instead of in a separate server) and saving to local file(s); these are still SQL databases, but with Hibernate there's not many other options.
Well, since the question is still opened and the OP said he's opened to new approaches/suggestions, here's mine (a little late but ok).
Do you know Prevayler? It's a Java Prevalence implementation which keep all of your business objects in RAM and mantain Snapshots/Changelogs in the File System, this way it's extremely fast and reliable, since if there's any crash, it'll restore it's last state and reapply every change to it.
Also, it's really easy to setup and run in your app.
Ofcourse this is possible, You can simply use file io features of Java, following steps are required:-
Create a File Object
2.Create an object of FileInputStream (though there are ways which use other Classes)
Wrap this object in a Buffer object or simply inside a java.util.Scanner.
use specific write functions of the object created in previous step.
Note that your object must implement Serializable interface. See following link,
I'm currently working on a simple Java application that calculates and graphs the different types of profit for a company. A company can have many branches, and each branch can have many years, and each year can have up to 12 months.
The hierarchy looks as follows:
-company
+branch
-branch
+year
-year
+month
-month
My intention was to have the data storage as simple as possible for the user. The structure I had in mind was an XML file that stored everything to do with a single company. Either as a single XML file or have multiple XML files that are linked together with unique IDs.
Both of these options would also allow the user to easily transport the data, as apposed to using a database.
The problem with a database that is stopping me right now, is that the user would have to setup a database by him/herself which would be very difficult for them if they aren't the technical type.
What do you think I should go for XML file, database, or something else?
It will be more complicated to use XML, XML is more of an interchange format, not a substitute for a DB.
You can use an embeddedable database such as H2 or Apache Derby / JavaDB, in this case the user won't have to set up a database. The data will be stored only locally though, so if this is ok for your application, you can consider it.
I would defintely go for the DB:
you have relational data, a thing DBs are very good at
you can query your data in that relational much easier than in XML
the CRUD operations (create, read, update, delete) are much more easier in DB than in XML
You can avoid the need for the user to install a DB engine by embedding SQLite with your app for example.
If it's a single-user application and the amount of data is unlikely to exceed a couple of megabytes, then using an XML file for the persistent storage might well make sense in that it reduces the complexity of the package and its installation process. But you're limiting the scalability: is that wise?
I am creating a few JAX-WS endpoints, for which I want to save the received and sent messages for later inspection. To do this, I am planning to save the messages (XML files) into filesystem, in some sensible hierarchy. There will be hundreds, even thousands of files per day. I also need to store metadata for each file.
I am considering to put the metadata (just a couple of fields) into database table, but the XML file content itself into files in a filesystem in order not to bloat the database with content data (that is seldomly read).
Is there some simple library that helps me in saving, loading, deleting etc. the files? It's not that tricky to implement it myself, but I wonder if there are existing solutions? Just a simple library that already provides easy access to filesystem (preferrably over different operating systems).
Or do I even need that, should I just go with raw/custom Java?
Is there some simple library that
helps me in saving, loading, deleting
etc. the files? It's not that tricky
to implement it myself, but I wonder
if there are existing solutions? Just
a simple library that already provides
easy access to filesystem (preferrably
over different operating systems).
Java API
Well, if what you need to do is really simple, you should be able to achieve your goal with java.io.File (delete, check existence, read, write, etc.) and a few stream manipulations with FileInputStream and FileOutputStream.
You can also throw in Apache commons-io and its handy FileUtils for a few more utility functions.
Java is independent of the OS. You just need to make sure you use File.pathSeparator, or use the constructor File(File parent, String child) so that you don't need to explicitly mention the separator.
The Java file API is relatively high-level to abstract the differences of the many OS. Most of the time it's sufficient. It has some shortcomings only if you need some relatively OS-specific feature which is not in the API, e.g. check the physical size of a file on the disk (not the the logical size), security rights on *nix, free space/quota of the hard drive, etc.
Most OS have an internal buffer for file writing/reading. Using FileOutputStream.write and FileOutputStream.flush ensure the data have been sent to the OS, but not necessary written on the disk. The Java API support also this low-level integration to manage these buffering issue (example here) for system such as database.
Also both file and directory are abstracted with File and you need to check with isDirectory. This can be confusing, for instance if you have one file x, and one directory /x (I don't remember exactly how to handle this issue, but there is a way).
Web service
The web service can use either xs:base64Binary to pass the data, or use MTOM (Message Transmission Optimization Mechanism) if files are large.
Transactions
Note that the database is transactional and the file system not. So you might have to add a few checks if operations fails and are re-tried.
You could go with a complicated design involving some form of distributed transaction (see this answer), or try to go with a simpler design that provides the level of robustness that you need. A possible design could be:
Update. If the user wants to overwrite a file, you actually create a new one. The level of indirection between the logical file name and the physical file is stored in database. This way you never overwrite a physical file once written, to ensure rollback is consistent.
Create. Same story when user want to create a file
Delete. If the user want to delete a file, you do it only in database first. A periodic job polls the file system to identify files which are not listed in database, and removes them. This two-phase deletes ensures that the delete operation can be rolled back.
This is not as robust as writting BLOB in real transactional database, but provide some robustness. You could otherwise have a look at commons-transaction, but I feel like the project is dead (2007).
There is DataNucleus, a Java persistence provider. It is little too heavy for this case, but it supports JPA and JDO java standards with different datastores (RDBMS, object storage, XML, JSON, Excel, etc.). If the product is already using JPA or JDO, it might be worth considering using NataNucleus, as saving data into different datastores should be transparent. I suppose DataNucleus supports splitting the data into several files, creating the sensible directory/file structure I wanted (in my question), but this is just a guess.
Support for XML and JSON seems to be experimental.