Binding multi-source data to GUI in Java - java

Thanks for taking time to read my question.
Here's my issue: I have several data files (.csv) that I need to basically cram into the same GUI. There are some operations I need to do on them, nothing really fancy though, and I figured those operations would be much easier if I could use "Sort By" and "Group By". That's why I thought of importing the .csv files as databases (they are actually databases exported into CSV format).
So let's use a simple example: I apply the SUM operation on a column, together with a Group By, and I want to display the result in, say, a JTable (or another component). How should I proceed? Keep in mind that this operation must be done for each file, and I need to display the results from the other files in the same component.
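For small files, the SUM plus Group By step can be sketched in plain Java before reaching for a database. This is only a minimal sketch: the class name `CsvGroupSum`, the column indices, and the assumption of a header line with simple comma-separated fields (no quoting) are all illustrative and do not come from the question.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: groups CSV rows by one column and sums another,
// then flattens the results of several sources into rows for one table.
class CsvGroupSum {

    // Rough equivalent of: SELECT key, SUM(value) ... GROUP BY key
    // Assumes a header line, comma separators, and no quoted fields.
    static Map<String, Double> sumByKey(List<String> csvLines, int keyCol, int valueCol) {
        Map<String, Double> sums = new LinkedHashMap<>();
        for (int i = 1; i < csvLines.size(); i++) { // start at 1 to skip the header
            String[] fields = csvLines.get(i).split(",");
            double value = Double.parseDouble(fields[valueCol].trim());
            sums.merge(fields[keyCol].trim(), value, Double::sum);
        }
        return sums;
    }

    // One row per (source, group), so every file ends up in the same component.
    static Object[][] toRows(Map<String, Map<String, Double>> resultsBySource) {
        List<Object[]> rows = new ArrayList<>();
        resultsBySource.forEach((source, sums) ->
            sums.forEach((group, sum) -> rows.add(new Object[] {source, group, sum})));
        return rows.toArray(new Object[0][]);
    }
}
```

Each file's lines could come from `Files.readAllLines(path)`, and the rows could then be shown with `new JTable(rows, new Object[] {"Source", "Group", "Sum"})`.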

Related

Standard (and most practical) approach to storing large amounts of data to be read by a Java application

I am working with a database that is divided into a few dozen text files, each containing two columns and 200 lines.
Currently, I only load one of the text files and read its data into two arrays. I could simply go through the handful of text files and load the data one after the other, but I wanted to know what the usual approach is for managing a "database" of this size, and what the "standard" format would be if the database were to be included in the end application.
I could simply have a single text file that would hold all the data and would end up 250 000 lines long. While this would work, I do not know whether it is at all professional or practical. A much better approach would be a single file from which I could specify, via code, which table I would like to read into two arrays (the sub-text files are basically two-column tables, hence a few dozen of them).
Why not use a real database?
You could use some in-memory-database.
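If a database feels heavy for this amount of data, the single-file idea from the question can also be sketched in plain Java. Everything about the layout here is an assumption made for illustration: a line starting with `#` names a table, each following line holds two comma-separated numeric columns, and `MultiTableFile` is a hypothetical class name.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical single-file layout: "# name" starts a table, and each
// following line holds two comma-separated numeric columns.
class MultiTableFile {

    // Parses every table; each one is returned as {firstColumn, secondColumn}.
    static Map<String, double[][]> parse(List<String> lines) {
        Map<String, List<double[]>> rowsByTable = new LinkedHashMap<>();
        List<double[]> current = null;
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty()) {
                continue;
            }
            if (line.startsWith("#")) {
                current = new ArrayList<>();
                rowsByTable.put(line.substring(1).trim(), current);
            } else if (current != null) {
                String[] f = line.split(",");
                current.add(new double[] {
                    Double.parseDouble(f[0].trim()), Double.parseDouble(f[1].trim())});
            }
        }
        Map<String, double[][]> tables = new LinkedHashMap<>();
        rowsByTable.forEach((name, rows) -> {
            double[] first = new double[rows.size()];
            double[] second = new double[rows.size()];
            for (int i = 0; i < rows.size(); i++) {
                first[i] = rows.get(i)[0];
                second[i] = rows.get(i)[1];
            }
            tables.put(name, new double[][] {first, second});
        });
        return tables;
    }
}
```

A caller would then pick a table by name, for example `parse(Files.readAllLines(path)).get("tableA")`, and receive the two arrays the question asks for.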

Saving data to file or database

I'm starting to work on a new Java desktop app that should help me and my colleagues learn vocabulary. It will contain around 700 words, some texts (that point to the words contained in them) and maybe some images (not sure about that part yet). The data will never change and I want the program to be able to run offline.
The question is: Should I use database, text file or serialize the data into file? Or perhaps if there is any other option I don't know about? If you could explain your choice in detail I would be glad.
If the data never changes and is only 700 words it would probably be easiest to use a file.
If your data were a bit more complex, had many fields, and were constantly updated, a database would be preferable, but a csv file could still be used.
Since you want to access this data offline and the data never changes, I think the best option would be to just use a text file, which will be more efficient in terms of access and speed.
Keep all the data in memory as Serializable Java objects, and store them serialized when your application is not running. Evaluate airomem - a really nice solution that would work perfectly for you.
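The serialization suggestion can be sketched with the JDK's own object streams. This is a minimal sketch, not how airomem works: `VocabularyStore` and the `Word` class with its two fields are hypothetical stand-ins for whatever the application really stores.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

class VocabularyStore {

    // Hypothetical entry type; a real application would add texts, images, etc.
    static class Word implements Serializable {
        private static final long serialVersionUID = 1L;
        final String term;
        final String translation;

        Word(String term, String translation) {
            this.term = term;
            this.translation = translation;
        }
    }

    // Writes the whole list to one file when the application shuts down.
    static void save(List<Word> words, Path file) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(file))) {
            out.writeObject(new ArrayList<>(words)); // ArrayList is Serializable
        }
    }

    // Reads the list back into memory at startup.
    @SuppressWarnings("unchecked")
    static List<Word> load(Path file) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(file))) {
            return (List<Word>) in.readObject();
        }
    }
}
```

For about 700 words the whole list fits comfortably in memory, so loading it once at startup and keeping it there is enough.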

Java framework to manage BLOB data outside of database

I want to store my blobs outside of the database in files, however they are just random blobs of data and aren't directly linked to a file.
So for example I have a table called Data with the following columns:
id
name
comments
...
I can't just include a column called fileLink or something like that because the blob is just raw data. I do however want to store it outside of the database. I would love to create a file called 3.dat where 3 is the id number for that row entry. The only problem with this setup is that the main folder will quickly fill up with a large number of files, since naming files by id gives a flat folder structure, and there will be OS file issues. And no, the data is not grouped or structured; it's one massive list.
Is there a Java framework or library that will allow me to store and manage the blobs so that I can just do something like MyBlobAPI.saveBlob(id, data); and then do MyBlobAPI.getBlob(id) and so on? In other words something where all the File IO is handled for me?
Simply use an appropriate database which implements blobs as you described, and use JDBC. You really are not looking for another API but a specific implementation. It's up to the DB to take care of effective storing of blobs.
I think a home-rolled solution will include something like a fileLink column in your table, and your API will create files on the first save and then write to that file on update.
I don't know of any code base that will do this for you. There are a bunch that provide an in-memory file system for Java. But it's only a few lines of code to write something that writes and reads Java objects to a file.
You'll have to handle any file system limitations yourself. Though I doubt you'll ever burn through the limitations of modern file systems like btrfs or zfs. FAT32 is limited to 65K files per directory. But even last generation file systems support something on the order of 4 billion files per directory.
So by all means, write a class with two functions: one to serialize an object to a file, giving it a unique key as a name, and another to deserialize the object by that key. If you are using a modern file system, you'll never run out of resources.
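A minimal sketch of such a class, using the `saveBlob`/`getBlob` names from the question, might look like the following. `FileBlobStore`, the `.dat` naming, and the fan-out of 1000 subdirectories are assumptions made for illustration; the subdirectories are one simple way to avoid a single flat folder on file systems with per-directory limits.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical blob store: one file per id, spread over subdirectories so
// that no single directory has to hold every blob.
class FileBlobStore {
    private static final int BUCKETS = 1000; // assumed fan-out, tune as needed
    private final Path root;

    FileBlobStore(Path root) {
        this.root = root;
    }

    // Maps an id to its file, e.g. id 3 -> <root>/003/3.dat
    Path pathFor(long id) {
        String bucket = String.format("%03d", Math.floorMod(id, BUCKETS));
        return root.resolve(bucket).resolve(id + ".dat");
    }

    void saveBlob(long id, byte[] data) throws IOException {
        Path file = pathFor(id);
        Files.createDirectories(file.getParent());
        Files.write(file, data); // creates the file or overwrites it
    }

    byte[] getBlob(long id) throws IOException {
        return Files.readAllBytes(pathFor(id));
    }

    boolean deleteBlob(long id) throws IOException {
        return Files.deleteIfExists(pathFor(id));
    }
}
```

The sketch does not keep the files and the database row in sync; deleting a row and its file together, or recovering from a crash between the two, is left to the caller.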
As far as I can tell there is no framework for this. The closest I could find was Hadoop's HDFS.
That being said, the advice of just putting the BLOBs into the database as per the answers below is not always advisable. Sometimes it's good and sometimes it's not; it really depends on your situation. Here are a few links to such discussions:
Storing Images in DB - Yea or Nay?
https://softwareengineering.stackexchange.com/questions/150669/is-it-a-bad-practice-to-store-large-files-10-mb-in-a-database
I did find some additional really good links but I can't remember them offhand. There was one in particular on Stack Overflow but I can't find it. If you believe you know the link please add it in the comments so that I can confirm it's the right one.

Java - Sorting and csv: good practice with huge data

I need to sort a huge csv file (10+ million records) with several algorithms in Java, but I have some problems with memory usage.
Basically I have a huge csv file where every record has 4 fields, with different type (String, int, double).
I need to load this csv into some structure and then sort it by all fields.
My idea was: write a Record class (with its own fields), read the csv file line by line, make a new Record object for every line, and put them into an ArrayList. Then call my sorting algorithms for each field.
It doesn't work. I get an OutOfMemoryError when I try to load all the Record objects into my ArrayList.
This way I create tons of objects, and I think that is not a good idea.
What should I do when I have this huge amount of data? Which method or data structure would be less expensive in terms of memory usage?
My point is just to use sorting algorithms and see how they work with a big set of data; it's not important to save the result of the sort into a file.
I know that there are some libraries for csv, but I should implement it without external libraries.
Thank you very much! :D
Cut your file into pieces (depending on the size of the file) and look into merge sort. That way you can sort even big files without using a lot of memory, and it's what databases use when they have to do huge sorts.
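The chunk-and-merge idea can be sketched with the JDK alone, which fits the "no external libs" constraint. This is a minimal sketch rather than a tuned implementation: `ExternalSort`, the chunk size parameter, and the use of temporary files are assumptions, and the sketch does not close every reader if an I/O error occurs partway through the merge.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

class ExternalSort {

    // Phase 1: read at most chunkSize lines at a time, sort them in memory,
    // and write each sorted run to its own temporary file.
    static List<Path> writeSortedChunks(Path input, int chunkSize, Comparator<String> order)
            throws IOException {
        List<Path> chunks = new ArrayList<>();
        try (BufferedReader reader = Files.newBufferedReader(input)) {
            List<String> buffer = new ArrayList<>(chunkSize);
            String line;
            while ((line = reader.readLine()) != null) {
                buffer.add(line);
                if (buffer.size() == chunkSize) {
                    chunks.add(flush(buffer, order));
                }
            }
            if (!buffer.isEmpty()) {
                chunks.add(flush(buffer, order));
            }
        }
        return chunks;
    }

    // Sorts the buffered lines, writes them to a temp file, and empties the buffer.
    private static Path flush(List<String> buffer, Comparator<String> order) throws IOException {
        buffer.sort(order);
        Path chunk = Files.createTempFile("chunk", ".csv");
        Files.write(chunk, buffer);
        buffer.clear();
        return chunk;
    }

    // One open chunk plus its current (smallest unread) line.
    private static final class Run {
        final BufferedReader reader;
        String head;

        Run(BufferedReader reader) throws IOException {
            this.reader = reader;
            this.head = reader.readLine();
        }
    }

    // Phase 2: k-way merge. The queue always yields the run whose current
    // line is smallest, so only one line per chunk is held in memory.
    static void merge(List<Path> chunks, Path output, Comparator<String> order)
            throws IOException {
        Comparator<Run> byHead = (a, b) -> order.compare(a.head, b.head);
        PriorityQueue<Run> queue = new PriorityQueue<>(byHead);
        try (BufferedWriter writer = Files.newBufferedWriter(output)) {
            for (Path chunk : chunks) {
                Run run = new Run(Files.newBufferedReader(chunk));
                if (run.head != null) queue.add(run); else run.reader.close();
            }
            while (!queue.isEmpty()) {
                Run run = queue.poll();
                writer.write(run.head);
                writer.newLine();
                run.head = run.reader.readLine();
                if (run.head != null) queue.add(run); else run.reader.close();
            }
        }
        for (Path chunk : chunks) {
            Files.deleteIfExists(chunk);
        }
    }
}
```

To sort by a particular field, the comparator can parse it from each line, for example `Comparator<String> bySecondField = Comparator.comparingInt(l -> Integer.parseInt(l.split(",")[1]));`. Phase 1 can use any in-memory sorting algorithm under study in place of `buffer.sort`.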
I would use an in-memory database such as H2 in in-memory mode (jdbc:h2:mem:), so everything stays in RAM and isn't flushed to disk (provided you have enough RAM; if not, you might want to use the file-based URL). Create your table in there and write every row from the csv. Provided you set up the indexes properly, sorting and grouping will be a breeze with standard SQL.

File-based Document Storage in android

I'm in the early stages of a note-taking application for android and I'm hoping that somebody can point me to a nice solution for storing the note data.
Ideally, I'm looking to have a solution where:
Each note document is a separate file (for dropbox syncing)
A note can be composed of multiple pages
Note pages can have binary data (such as images)
A single page can be loaded without having to parse the entire document into memory
Thread-safety: Multiple reads/writes can occur at the same time.
XML is out (at least for the entire file), since I don't have a good way to extract a single page at a time. I considered using zip files, but (especially when compressed) I think they'd be stuck loading the entire file as well.
It seems like there should be a Java library out there that does this, but my google-fu is failing me. The only other alternative I can think of is to make a separate sqlite database for every note.
Does anybody know of a good solution to this problem? Thanks!
Seems like a relational database would work here. You just need to play around with the schema a little.
Maybe make a Pages table with each page including, say, a field for the document it belongs to and a field for its order in the document. Pages could also have a field for binary data, which might be contained in another table. If the document itself has additional data, maybe you have a table for documents too.
I haven't used SQLite transactions on an Android device, but it seems like that would be a good way to address thread safety.
I would recommend using SQLite to store the documents. Ultimately, it'll be easier than trying to deal with file I/O every time you access the note. Then, when somebody wants to upload to dropbox, you generate the file on the fly and upload it. It would make sense to have a Notes table and a pages table, at least. That way you can load each page individually and a note is just a collection of pages anyway. Additionally, you can store images as BLOBS in the database for a particular page. Basically, if you only want one type of content per page, then you would have, in the pages table, something like an id column and a content column. Alternatively, if you wanted to support something that is more complex such as multiple types of content then you would need to make your pages a collection of something else, like "entities."
IMO, a relational database is going to be the easiest way to accomplish your requirement of reading from particular pages without having to load the entire file.
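As a side note on the zip idea raised in the question: `java.util.zip.ZipFile` looks entries up through the archive's central directory, so a single entry can be read without decompressing the others. The sketch below assumes a hypothetical layout with one entry per page named `pages/<n>.bin`; `NoteArchive` is an illustrative name, not an existing library.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Path;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

class NoteArchive {

    // Reads one page from a note stored as a zip archive.
    // Assumes a hypothetical layout with one entry per page: "pages/<n>.bin".
    static byte[] readPage(Path noteFile, int pageNumber) throws IOException {
        try (ZipFile zip = new ZipFile(noteFile.toFile())) {
            ZipEntry entry = zip.getEntry("pages/" + pageNumber + ".bin");
            if (entry == null) {
                throw new IOException("No such page: " + pageNumber);
            }
            try (InputStream in = zip.getInputStream(entry);
                 ByteArrayOutputStream out = new ByteArrayOutputStream()) {
                byte[] buffer = new byte[8192];
                int n;
                while ((n = in.read(buffer)) != -1) {
                    out.write(buffer, 0, n);
                }
                return out.toByteArray();
            }
        }
    }
}
```

This only covers reading. Changing a page means rewriting the archive, and concurrent writers would still need their own locking, so the SQLite suggestions above may remain the simpler fit for the thread-safety requirement.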
