Resources vs. SQLite - java

I'm trying to analyze the trade-offs between using SQLite and using resources for an app that needs to ship with a fairly sizeable amount of text (several books). I've read this post on raw XML files vs. SQLite and this one on XML resources vs. SQLite. Both of those, however, seem to compare SQLite to parsing XML at run time. I don't know whether the same issues apply to using string and int resource arrays. I actually have a number of unknowns and I'd appreciate any insights others can offer.
Data details: about 40 books; three languages per book; average book length 25 chapters; average chapter length 25 paragraphs; about 75,000 paragraphs total. Text is stored by paragraph; no finer granularity needed. For each language, the app's logical view of the text is as a single array of paragraphs spanning all the books. There are also "table of contents" (TOC) data down to the paragraph level. All the data are strictly read-only. I need to support two query types: 1) retrieve the text for a paragraph or range of paragraphs in a specified language; 2) given a paragraph number, determine the book, chapter, and paragraph offset in the chapter. I don't need to use any of SQLite's string functions.
My analysis so far:
SQLite: Create an SQLite database off-line, package it as a raw resource or asset, and copy it to the database location when the app is run for the first time (and/or upgraded); see the sketch after this list. I have implemented a prototype database for this with half a dozen tables.
Can use SQL to query the database, so I don't need to code any search algorithms.
I know it can handle this much data.
Requires several SQL range queries to answer type 2 queries.
Requires twice the space: once in the .apk file and again when installed into the app's db area.
Android's SQLite implementation requires external storage (SD card), so the app won't work without one. Amazon's guidelines for Kindle Fire apps state that apps cannot require an SD card, so going this way might rule out Kindle Fire compatibility. (Bad!)
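For reference, the first-run copy step I have in mind looks roughly like this (the class name and the books.db asset name are just illustrative, not a definitive implementation):

import android.content.Context;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class BookDbInstaller {
    // Copy the prepackaged SQLite file from assets into the app's database
    // directory on first run. "books.db" is an assumed asset name.
    public static void installIfMissing(Context context) throws IOException {
        File dbFile = context.getDatabasePath("books.db");
        if (dbFile.exists()) {
            return; // already copied on a previous run
        }
        dbFile.getParentFile().mkdirs();
        try (InputStream in = context.getAssets().open("books.db");
             OutputStream out = new FileOutputStream(dbFile)) {
            byte[] buffer = new byte[8192];
            int length;
            while ((length = in.read(buffer)) != -1) {
                out.write(buffer, 0, length);
            }
        }
    }
}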
Resources: Create a collection of XML array resource files off-line and copy them to the project's res/values folder. Text would be partitioned into many string arrays: one array per chapter per book. There would be about 3,000 arrays. Indexes would be implemented as int arrays. For each book, the index data would be shared across languages. I'd probably also need to generate some typed array resources to provide an index into the generated resource IDs. I expect that the index arrays are small enough to load entirely into memory at app startup.
Type 1 queries involve loading the correct string array(s) and accessing array elements. Type 2 queries involve a binary search of the (already loaded) index data; see the sketch below.
Don't know whether the resources system in Android can handle that many resource arrays.
Don't know what the performance would be compared to using SQLite.
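For reference, the type 2 lookup via binary search would look roughly like this (the cumulative-count array layout is only one possible way the int-array index resources might be organized):

import java.util.Arrays;

public final class TocIndex {
    // chapterStart[i] = global paragraph number at which chapter i begins (sorted).
    // chapterBook[i]  = book that chapter i belongs to (parallel array).
    // Both layouts are assumptions for illustration.
    private final int[] chapterStart;
    private final int[] chapterBook;

    public TocIndex(int[] chapterStart, int[] chapterBook) {
        this.chapterStart = chapterStart;
        this.chapterBook = chapterBook;
    }

    // Returns { book, globalChapterIndex, paragraphOffsetInChapter } for a global
    // paragraph number. The chapter number within its book could be derived from a
    // per-book chapter-offset array (omitted here).
    public int[] locate(int paragraph) {
        int i = Arrays.binarySearch(chapterStart, paragraph);
        if (i < 0) {
            i = -i - 2; // insertion point minus one = containing chapter
        }
        int offsetInChapter = paragraph - chapterStart[i];
        return new int[] { chapterBook[i], i, offsetInChapter };
    }
}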
I suppose a hybrid approach is also possible: store the TOC data one way and the text itself in another.
Again, I'd appreciate any thoughts or insights that would help with this analysis.

One tangential point...
Amazon's guidelines for Kindle Fire apps state that apps cannot require an SD card, so going this way might rule out Kindle Fire compatibility. (Bad!)
The version as of today actually recommends
that you deploy a smaller APK that downloads and installs quickly, and then upon first launch downloads additional resources and saves them on a local file system.
for larger apps, instead of packaging everything together. Additionally, what they forbid seems to be [emphasis mine]
copying, recording, downloading, storing, or similar actions of any type of video or audio content onto the Amazon Fire TV or Fire TV Stick device, any SD memory card or any connected external storage (where applicable).
So that restriction seems obsolete now.

Related

How to capture formulas and support formula evaluation in java web application

We have a requirement to incorporate an Excel-based tool into a Java web application. This Excel tool has a set of master data and a couple of result outputs that use formula calculations on the master data.
Master data can be captured in the database with relational tables. We are looking for the best way to provide the capability to capture, validate, and evaluate formulas.
So far we have looked at using scripting engines such as Nashorn and providing formula support using eval. We would like to know how people are doing this in other places.
I've searched and found two possible libraries that could be useful for you; please have a look.
http://mathparser.org/
http://mathparser.org/mxparser-hello-world/mxparser-hello-world-java/
https://lallafa.objecthunter.net/exp4j/
https://lallafa.objecthunter.net/exp4j/#Evaluating_an_expression_asynchronously
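To give a quick feel for exp4j, here is a minimal sketch along the lines of its hello-world examples; the exp4j jar on the classpath is assumed, and the formula string is just a placeholder for one captured from the Excel tool:

import net.objecthunter.exp4j.Expression;
import net.objecthunter.exp4j.ExpressionBuilder;

public class FormulaDemo {
    public static void main(String[] args) {
        // Build and evaluate a formula with named variables.
        Expression e = new ExpressionBuilder("3 * sin(y) - 2 / (x - 2)")
                .variables("x", "y")
                .build()
                .setVariable("x", 2.3)
                .setVariable("y", 3.14);
        System.out.println(e.evaluate());
    }
}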
It depends on how big your data is and what your required SLA is, and also on what kind of formulas and other functions you want to support.
For example, consider a function like sum or max. Now, the master data is in some relational table containing 10k rows. You could pull all of this data into a Java app and compute the sum (or run any function). However, imagine if the table contained 500K rows: streaming all 500K rows to the Java app would take some time and consume a lot of CPU and network bandwidth (database resources, local CPU resources). A better-optimized approach in that case would be to index that column in the database and let the database do all the hard work for you.
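To illustrate, a hedged JDBC sketch that pushes the aggregation to the database instead of streaming the rows to the app; the table and column names (master_data, amount) are made up for the example:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class AggregateInDb {
    // Let the database compute the aggregate server-side and return a single value.
    public static double sumAmounts(String jdbcUrl, String user, String password) throws Exception {
        try (Connection con = DriverManager.getConnection(jdbcUrl, user, password);
             Statement st = con.createStatement();
             ResultSet rs = st.executeQuery("SELECT SUM(amount) FROM master_data")) {
            rs.next();
            return rs.getDouble(1);
        }
    }
}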
Personally, I don't like using eval. I would rather parse the user input to determine what actions to take.
I am assuming that data is not big to use big data tools.

How do I store objects if I want to search them by multiple attributes later?

I want to code a simple project in java in order to keep track of my watched/owned tv shows, movies, books, etc.
Searching and retrieving the metadata from an API (themovieDB, Google Books) is already working.
How would I store some of this metadata together with user-input (like progress or rating)?
I'm planning on displaying the data in a table like form (example). Users should also be able to search the local data with multiple attributes. Is there any easy way to do this? I already thought about a database since it seemed that was the easiest solution.
Any suggestions?
Thanks in advance!
You can use a lightweight database such as H2, HSQLDB, or SQLite. These databases can be embedded in the Java app itself and do not require an extra server.
If your data set is small, you can also save it as XML or JSON by using any XML or JSON parser (e.g. Gson).
Your DB table will have various attributes that are fetched from the API as well as user inputs. You can write queries on top of these DBs to fetch and show the various results.
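As a concrete illustration of the embedded route, here is a minimal H2-over-JDBC sketch; the H2 jar on the classpath is assumed, and the file name, table, and columns are hypothetical:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.Statement;

public class MediaStore {
    public static void main(String[] args) throws Exception {
        // "jdbc:h2:./media" opens (or creates) a local file-based database; no server needed.
        try (Connection con = DriverManager.getConnection("jdbc:h2:./media", "sa", "")) {
            try (Statement st = con.createStatement()) {
                st.execute("CREATE TABLE IF NOT EXISTS shows ("
                        + "title VARCHAR(255), rating INT, progress INT)");
            }
            try (PreparedStatement insert = con.prepareStatement(
                    "INSERT INTO shows (title, rating, progress) VALUES (?, ?, ?)")) {
                insert.setString(1, "Arrival");
                insert.setInt(2, 9);
                insert.setInt(3, 100);
                insert.executeUpdate();
            }
            // Search the local data by multiple attributes.
            try (PreparedStatement query = con.prepareStatement(
                    "SELECT title FROM shows WHERE rating >= ? AND title LIKE ?")) {
                query.setInt(1, 8);
                query.setString(2, "%Arr%");
                try (ResultSet rs = query.executeQuery()) {
                    while (rs.next()) {
                        System.out.println(rs.getString("title"));
                    }
                }
            }
        }
    }
}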
Either write everything to files, or store everything on a database. It depends on what you want though.
If you choose to write everything to files, you'll have to implement both the writing and the reading to suit your needs. You'll also have to deal with read/write bugs and performance issues yourself.
If you choose a database, you'll just have to implement the high level read and write methods, i.e., the methods that format the data and store it on the appropriate tables. The actual reading and writing is already implemented and optimized for performance.
Overall, databases are usually the smart choice. Be careful which one you choose, though: some types might be better for reading, while others are better for writing. You should carefully evaluate what's best, given your problem's domain.
There are many ways to accomplish this but as another user posted, a database is the clear choice.
However, if you're looking to make a program to learn with, or something simple for personal use, you could also use a multi-dimensional array of strings to hold the name of the program as well as any other metadata fields, and treat the array like a table in Excel. This is not the best way to do it, but you can get away with it with very simple code. To search, you would only need to loop through the array elements and check that the name of the program (i.e. movieArray[x][0]) matches the search string. Once located, you can perform actions on or edit the other array indexes pertaining to that movie.
For a little more versatility, you could create a class to hold the movie information, with fields to hold any metadata. The advantage here is that the metadata fields can be different types rather than having to conform to the array type, and they're packaged together in the instance of the class. If you're getting the info from an API, then you can update or create the objects from the API response. These objects can be stored in an ArrayList and searched with a loop that checks for a certain value, e.g.
for (Movie m : movieArrayList) {
    if (m.getTitle().equals("Arrival")) {
        return m;
    }
}
Alternatively, of course, for large scale a database would be the best answer, but it all depends on what this is really for and what its needs will be in the real world.
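Building on the class-based approach, here is a small hedged sketch of multi-attribute search over such a list with the Streams API; the Movie fields are assumptions for illustration:

import java.util.List;
import java.util.stream.Collectors;

public class MovieSearch {
    // Minimal value class; the fields are illustrative assumptions.
    public static class Movie {
        final String title;
        final int year;
        final double rating;

        Movie(String title, int year, double rating) {
            this.title = title;
            this.year = year;
            this.rating = rating;
        }
    }

    // Combine several attribute filters in a single pass over the in-memory list.
    public static List<Movie> search(List<Movie> movies, String titlePart, int minYear, double minRating) {
        return movies.stream()
                .filter(m -> m.title.toLowerCase().contains(titlePart.toLowerCase()))
                .filter(m -> m.year >= minYear)
                .filter(m -> m.rating >= minRating)
                .collect(Collectors.toList());
    }
}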

Inserting to and searching a large amount of data in Java

I am writing a program in Java which tracks data about baseball cards. I am trying to decide how to store the data persistently. I have been leaning towards storing the data in an XML file, but I am unfamiliar with XML APIs. (I have read some online tutorials and started experimenting with the classes in the javax.xml hierarchy.)
The software has two major use cases: the user will be able to add cards and search for cards.
When the user adds a card, I would like to immediately commit the data to the persistent storage. Does the standard API allow me to insert data in a random-access way (or even appending might be okay)?
When the user searches for cards (for example, by a player's name), I would like to load a list from the storage without necessarily loading the whole file.
My biggest concern is that I need to store data for a large number of unique cards (in the neighborhood of thousands, possibly more). I don't want to store a list of all the cards in memory while the program is open. I haven't run any tests, but I believe that I could easily hit memory constraints.
XML might not be the best solution. However, I want to make it as simple as possible to install, so I am trying to avoid a full-blown database with JDBC or any third-party libraries.
So I guess I'm asking if I'm heading in the right direction and if so, where can I look to learn more about using XML in the way I want. If not, does anyone have suggestions about what other types of storage I could use to accomplish this task?
While I would certainly not discourage the use of XML, it does have some drawbacks in your context.
"Does the standard API allow me to insert data in a random-access way"
Yes, in memory. You will have to save the entire model back to file though.
"When the user searches for cards (for example, by a player's name), I would like to load a list from the storage without necessarily loading the whole file"
Unless you're expecting multiple users to be reading/writing the file, I'd probably pull the entire file/model into memory at load and keep it there until you want to save (doing periodic writes in the background is still a good idea).
I don't want to store a list of all the cards in memory while the program is open. I haven't run any tests, but I believe that I could easily hit memory constraints
That would be my concern too. However, you could use a SAX parser to read the file into a custom model. This would reduce the memory overhead (as DOM parsers can be a little greedy with memory).
"However, I want to make it as simple as possible to install, so I am trying to avoid a full-blown database with JDBC"
I'd do some more research in this area. I (personally) use H2 and HSQLDB a lot for storage of large amounts of data. These are small, personal database systems that don't require any additional installation (a JAR file linked to the program) or special server/services.
They make it really easy to build complex searches across the datastore that you would otherwise need to create yourself.
If you were to use XML, I would probably do one of three things
1 - If you're going to maintain the XML document in memory, I'd get familiar with XPath (simple tutorial & Java's API) for searching; see the sketch after this list.
2 - I'd create a "model" of the data using objects to represent the various nodes, reading it in using a SAX parser. Writing may be a little more tricky.
3 - Use a simple SQL DB (and object model) - it will simplify the overall process (IMHO)
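To make option 1 concrete, here is a hedged XPath sketch using the standard javax.xml APIs; the file name and the element names (cards, card, player) are assumptions about your layout:

import java.io.File;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

public class CardSearch {
    public static void main(String[] args) throws Exception {
        // Parse the whole file into a DOM (fine if the model is kept in memory anyway).
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new File("cards.xml"));

        // Find every <card> whose <player> child matches a name.
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList hits = (NodeList) xpath.evaluate(
                "/cards/card[player='Babe Ruth']", doc, XPathConstants.NODESET);
        System.out.println("Matches: " + hits.getLength());
    }
}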
Additional
As if I hadn't dumped enough on you ;)
If you really want to use XML (and again, I wouldn't discourage you from it), you might consider having a look at an XML database style solution
Apache Xindice (apparently retired)
Or you could have a look at what some other people think:
Use XML as database in Java
Java: XML into a Database, whats the simplest way?
For example ;)

Trade off between reading from database and memory storage of Java strings using servlets

I'm in the process of setting up a system which will have to repeatedly parse large amounts of text (as a String or StringBuffer - which might be better?) acquired from a data source. The text will be displayed and may consist of several thousand words, and each time the text is parsed, each word may have to be checked against a list of 550 stop words. This will allow the words to be filtered from display.
So I wonder about performance, as this could be going on in multiple servlet sessions at any one time: is it better to check each word against a MySQL database table (MyISAM or InnoDB) using an index, or simply to store the 550 words in a Java array or ArrayList within the servlet context so they can possibly be read more quickly?
So I wonder about the trade off between database IO against storing 550 strings in memory.
Any advice?
Thanks
Mr Morgan.
Assuming that the "data source" is not your database, you can get better performance by doing the stopword search in memory rather than asking the database to do it. It stands to reason:
Any algorithm that the database uses can equally be used as your in-memory algorithm.
By running the algorithm locally, you avoid the cost of sending the text to the database and sending the results back.
It is also likely that you can implement a better algorithm for detecting the stop-words than a general-purpose database engine could. And the memory needed for a data structure that represents the 550 or so stopwords should be trivial compared with the space used by the rest of your application, the servlet container, and all of the libraries that you use.
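A minimal sketch of such an in-memory check, assuming the stopwords ship in a simple one-word-per-line file (the path and format are assumptions):

import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class StopWordFilter {
    private final Set<String> stopWords;

    // Load the ~550 words once (e.g. at servlet-context startup) and share the instance.
    public StopWordFilter(String path) throws IOException {
        stopWords = new HashSet<>(Files.readAllLines(Paths.get(path), StandardCharsets.UTF_8));
    }

    // Drop stopwords from a parsed text; splitting on whitespace is a simplification.
    public List<String> filter(String text) {
        return Arrays.stream(text.split("\\s+"))
                .filter(word -> !stopWords.contains(word.toLowerCase()))
                .collect(Collectors.toList());
    }
}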
550 Strings is a very small amount of data for today's servers: you don't need the database; it will be much slower.
I recommend using a standard Java Properties file, since you don't have that much data. This lets you use the standard Internationalization/Locale features.
This assumes, of course, that the copy changes fairly slowly. But that is usually the case.

Are all .class files in my Java application loaded into memory after application start?

I am making an app for Android. In my Activity I need to load an array of about 10,000 strings. Loading it from the database was slow, so I decided to put the array directly into one .java file (as a private field). I have about 20 of these classes containing string arrays, and my question is: are all the classes loaded into memory after my application is started? If so, the Activity in which I need these strings would load quickly, but the application as a whole would have a slow start...
Is there another way to load a 10,000-string array from a file very quickly?
UPDATE:
Why do I need these strings? My Android app allows you to find "journeys" on Prague's public transit - you choose a departure stop and an arrival stop and it finds your journey (have a look here). My app has a suggestions feature - you enter the letter "c" as your departure stop and a suggestions ListView appears with stops starting with "c". I need the strings for these suggestions. Fetching the suggestions from the database is slow (about 400ms on a G1).
First, 400ms to perform a simple database query is really slow. So slow that I'd suspect that there is some problem in your database schema (e.g. indices) or your database connection configuration.
But if you are serious about not using a database, there are a couple of alternatives to what you are currently doing:
Arrange that the classes containing the arrays are lazily loaded as required, using Class.forName(...). If you implement it right, it should be possible for the garbage collector to reclaim the classes after they have been loaded and the strings have been added to your primary data structure.
Turn the 10000 Strings into a flat file and put the file into your app's JAR file. Then use Class.getResourceAsStream(...) to open the file and read it into the in-memory array (a sketch follows below).
As above, but using an indexed file and replacing the array with a data structure that allows you to read Strings from the file lazily. (This will be a bit complicated, but if you are worried by the memory consumed by the 10000 Strings, this will help address that.)
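A minimal sketch of the flat-file alternative mentioned above; the resource name and the one-string-per-line format are assumptions:

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class StringLoader {
    // Read a packaged text file (one string per line) into an in-memory list.
    public static List<String> load() throws IOException {
        List<String> strings = new ArrayList<>(10000);
        try (InputStream in = StringLoader.class.getResourceAsStream("/strings.txt");
             BufferedReader reader = new BufferedReader(
                     new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                strings.add(line);
            }
        }
        return strings;
    }
}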
A class is loaded only when it is first referenced.
Though you need an array of 10000, you may not need all of it at once. This is where the concept of paging comes in. This link indicates that paging is often done in Android. Initially keep only a small part of the array in memory and, as you need more, load it into memory and unload any previously loaded data that is no longer wanted.
For example, in any table the user may see at best 50 records at one time; then he will have to scroll (considering his screen is not the size of an IMAX movie theatre). When he scrolls, load the next chunk of data and unload any data that is now invisible to the user.
When is a Type Loaded? This is a surprisingly tricky question to answer. This is due in large part to the significant flexibility afforded, by the JVM spec, to JVM implementations. Loading must be performed before linking and linking must be performed before initialization. The VM spec does stipulate the timing of initialization. It strictly requires that a type be initialized on its first active use (see Appendix A for a list of what constitutes an "active use"). This means that loading (and linking) of a type MUST be performed at or before that type's first active use.
From http://www.developer.com/java/other/article.php/2248831/Java-Class-Loading-The-Basics.htm
I don't think that you will be happy with maintaining 10K Strings hardcoded in Java files.
Rather, check whether you are using the right database for your problem and whether your indices are set correctly. A wrong index can cause really poor performance.
Additionally, you should limit the number of results returned by the query, but make sure you don't fetch the entries one by one.
If nothing fits, you can still preload the Strings from the database at startup.
You could preload, let's say, 10 entries for each character. If a character is keyed in, you can then preload the entries starting with that character followed by another, and so on.
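If the database route is kept, here is a hedged sketch of a limited prefix query on Android's SQLiteDatabase for the suggestions list; the table and column names are assumptions:

import android.database.Cursor;
import android.database.sqlite.SQLiteDatabase;
import java.util.ArrayList;
import java.util.List;

public class StopSuggestions {
    // Return at most 10 stop names starting with the typed prefix.
    // Assumes a table "stops" with an indexed "name" column.
    public static List<String> suggest(SQLiteDatabase db, String prefix) {
        List<String> result = new ArrayList<>();
        Cursor cursor = db.query(
                "stops",                       // table
                new String[] { "name" },       // columns
                "name LIKE ?",                 // selection
                new String[] { prefix + "%" }, // selection args
                null, null,                    // groupBy, having
                "name ASC",                    // orderBy
                "10");                         // limit
        try {
            while (cursor.moveToNext()) {
                result.add(cursor.getString(0));
            }
        } finally {
            cursor.close();
        }
        return result;
    }
}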
