Storing JSON objects: SQLite vs serialization to disk


I'm building an app that will pull down JSON objects from a web service: in the low hundreds, each relatively small, say 20 KB.
The app won't do much other than display these POJOs, download new and updated ones when available, and delete out-of-date ones. What would be the preferred method for persistent storage of these objects? I guess the two main contenders are storing them in a SQLite DB, maybe using ORMLite to cut down on the overhead, or just serializing the objects to disk, probably in one large file, and using a very fast JSON parser.
Any ideas what the preferred method would be?

You could consider using CouchDB as a cache between the mobile client and your web service.
CouchDB would have to run as a service on the internet, caching the objects from the web service. On the client you can use TouchDB-Android: https://github.com/couchbaselabs/TouchDB-iOS/wiki/Why-TouchDB%3F . TouchDB-Android can synchronize automatically with a CouchDB instance running on the internet. The application itself would then access only TouchDB. TouchDB automatically detects whether or not there is an internet connection, so your application keeps running even without one.
Advantages:
- Caching of JSON calls
- The client keeps working when the internet connection is down and synchronizes automatically when the connection is back up.
- Takes load off your web service, and you can scale.
We used this setup before to allow Android software to work seamlessly, even when the internet connection dropped frequently and the service we accessed data from was quite slow and had limited capacity.

A DBMS such as SQLite comes with querying, indexing and sorting capabilities (and other standard SQL DBMS features); you should consider whether you need any of these. How many objects are you planning to have in a production environment? If, say, a million, the disk serialization approach might not scale.
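For the few hundred small objects described in the question, the plain-files side of the comparison could be sketched roughly as follows. This is only an illustration under assumptions: the class name JsonFileCache and its methods are made up for this example, it stores one file per object rather than one large file, and it relies on java.nio.file, which as far as I know is only available on Android from API level 26.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Optional;

// Hypothetical helper: stores each JSON document as its own file, keyed by id.
class JsonFileCache {
    private final Path dir;

    JsonFileCache(Path dir) throws IOException {
        this.dir = Files.createDirectories(dir);
    }

    // Assumption: ids are simple names without path separators.
    private Path fileFor(String id) {
        return dir.resolve(id + ".json");
    }

    // Write to a temp file first, then move it into place, so that an
    // interrupted write is unlikely to leave a half-written document behind.
    void put(String id, String json) throws IOException {
        Path tmp = Files.createTempFile(dir, "doc", ".tmp");
        Files.write(tmp, json.getBytes(StandardCharsets.UTF_8));
        Files.move(tmp, fileFor(id), StandardCopyOption.REPLACE_EXISTING);
    }

    // Returns the raw JSON text, or empty if the object is not cached.
    Optional<String> get(String id) throws IOException {
        Path file = fileFor(id);
        if (!Files.exists(file)) {
            return Optional.empty();
        }
        return Optional.of(new String(Files.readAllBytes(file), StandardCharsets.UTF_8));
    }

    // Removes an out-of-date object; returns true if a file was deleted.
    boolean delete(String id) throws IOException {
        return Files.deleteIfExists(fileFor(id));
    }
}
```

Updating or deleting a single object then touches only one small file. Once you need to query or sort across objects, this stops being convenient, and that is the point where SQLite earns its overhead.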

Related

Data Structure for Saving Username/Password in a Client/Server Java Application

I'm currently getting into socket programming and building a multi-threaded console application where I need to register and log in users. The data needs to be saved locally, but I cannot seem to find the right structure for it.
Here are the ideas I thought about:
Simply saving the data to .txt file. (will be troublesome to search and authenticate the logins)
Using the Java Preferences API, but since the application is multi-threaded I keep overwriting the data each time a new client connects to my server. Can I create a new node for each new user?
What do you guys think is the ideal structure for saving login credentials? (security isn't currently a concern for this application)

I would consider the H2 database engine.
quote: "Very fast, open source, JDBC API; embedded and server modes; in-memory databases; browser-based Console application; small footprint: around 2 MB jar file size"
http://www.h2database.com

It really depends on what you want to do with the application. The result would be different, depending on what you would answer to the following questions:
Do you want/need to persist the databases?
Is there any other data which you need to store along with that?
Are you using plain Java or a framework like Spring?
Some options:
If you're just prototyping and you don't need any persistence, consider using in-memory storage. For simplicity in coding and dependencies, something like a ConcurrentMap can be completely sufficient. If you wrap it properly, you can exchange it later, and you don't add dependencies and complexity at an early stage.
If you're prototyping but you still need persistence, using properties files on top of the ConcurrentMap can give you a quick win.
There might be more stages to this, depending on where you want to go; choosing a database at some point can be an option. Depending on your experience and needs, you can use a SQL or NoSQL database. Personally, I get faster results with NoSQL (MongoDB in my case) but prefer SQL in production for use cases like account management.
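The first two options (a wrapped ConcurrentMap, optionally saved to a properties file) might look something like this. It is a rough sketch, not a recommendation: UserStore and FileBackedUserStore are hypothetical names, and passwords are kept in plain text only because the question states that security is not a concern.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical interface, so the storage can be exchanged for a database later.
interface UserStore {
    boolean register(String username, String password);
    boolean authenticate(String username, String password);
}

class FileBackedUserStore implements UserStore {
    private final ConcurrentMap<String, String> users = new ConcurrentHashMap<>();
    private final Path file;

    // Loads previously saved users, if the properties file already exists.
    FileBackedUserStore(Path file) throws IOException {
        this.file = file;
        if (Files.exists(file)) {
            Properties props = new Properties();
            try (InputStream in = Files.newInputStream(file)) {
                props.load(in);
            }
            for (String name : props.stringPropertyNames()) {
                users.put(name, props.getProperty(name));
            }
        }
    }

    @Override
    public boolean register(String username, String password) {
        // putIfAbsent is atomic, so two client threads cannot both claim one name.
        return users.putIfAbsent(username, password) == null;
    }

    @Override
    public boolean authenticate(String username, String password) {
        // Plain-text comparison: acceptable only because security is out of scope here.
        return password.equals(users.get(username));
    }

    // synchronized so that concurrent saves do not interleave writes to the file.
    synchronized void save() throws IOException {
        Properties props = new Properties();
        props.putAll(users);
        try (OutputStream out = Files.newOutputStream(file)) {
            props.store(out, "users");
        }
    }
}
```

Because the server threads only see the UserStore interface, swapping in H2 or another database later should not require changes to the socket-handling code.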

Do I have to load the csv values into a DB for java web app

I'm looking to make a web app that uses two data sets, both given in CSV format and each about 10 MB in size. I've chosen to use a Java dynamic web app with JSP, which users can use to search and sort through the data provided in the CSV files.
From what I understand, the user/client sends a request to the server, and the server calls the Java classes in the backend, which contain the different sorting methods and the data from the CSV that can be manipulated.
This data, which sits in the backend, is where I'm running into confusion. I know it's possible to load the data into a database and have that sitting on the server for me to call upon.
If I use a class that reads the CSV and loads the data into arrays, would this reading work be done every time someone accesses the website, causing latency, or would the data already be loaded into arrays on the server?

Depending on the scope you use, the data can be loaded in application scope and therefore only once (say, in a singleton class loaded at application startup).
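A minimal sketch of that load-once idea, assuming plain Java without any servlet or framework classes. CsvTable and AppData are hypothetical names, and the parser is deliberately naive: it splits on commas and does not handle quoted fields.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical read-only view of one parsed CSV file.
final class CsvTable {
    private final List<String[]> rows;

    private CsvTable(List<String[]> rows) {
        this.rows = Collections.unmodifiableList(rows);
    }

    // Naive parsing: assumes no quoted fields and no embedded commas.
    static CsvTable load(Path csv) throws IOException {
        List<String[]> parsed = new ArrayList<>();
        try (BufferedReader reader = Files.newBufferedReader(csv, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (!line.isEmpty()) {
                    parsed.add(line.split(",", -1));
                }
            }
        }
        return new CsvTable(parsed);
    }

    List<String[]> rows() {
        return rows;
    }
}

// Hypothetical application-scoped holder: the file is read on first use only,
// and every later request gets the same in-memory table.
final class AppData {
    private static CsvTable table;

    private AppData() {}

    static synchronized CsvTable table(Path csv) throws IOException {
        if (table == null) {
            table = CsvTable.load(csv);
        }
        return table;
    }
}
```

With this shape, only the first request pays the cost of reading the file; later requests just read from memory.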
But I wouldn't recommend this approach. I would recommend a properly designed database that you can load your CSV data into. That way the database engine helps you organize your data, which gives you scalability and maintainability (although a proper class design, say a DAO pattern, would give you the same).
Organized data in a database would also give you more flexibility to search through your data using ready-made SQL functions.
To make my case, here are some advantages of a database system over a file system:
No redundant data – Redundancy removed by data normalization
Data Consistency and Integrity – data normalization takes care of it too
Secure – Each user has a different set of access rights
Privacy – Limited access
Easy access to data
Easy recovery
Flexible
Concurrency - The database engine will allow you to read the data concurrently, or even write to it.
I'm not listing the disadvantages since I'm making my case :)

You can read from the CSV file to build your arrays. You can then add the arrays to session scope. The CSV file will only be read by the servlet that processes it. Later requests will retrieve the arrays from the session.

System architecture - Java Backend, Database, Mobile Apps

I've been building Java backends with Spring, Hibernate and RDBMSs for a while now. I also regularly work on mobile applications for iOS and Android.
So I have a full technology stack for this task, but I am looking for something perhaps more advanced that fits the requirements better. I have some thoughts about it, but first I should explain how my current systems work and then how I want my upcoming systems to look.
Currently using
Spring Framework to connect everything together
Hibernate with Entity beans for persistence
MySQL or others as RDBMS
DTO objects created with Dozer
RESTful API to expose services
DTOs are transferred in JSON format
This setup works. But I have the feeling that it's just too much work and that life could be simpler with other technologies.
What I am looking for
On the mobile side, I want to receive data for the current screen that I can easily cache. JSON is already serialized and would be easy to save to disk in the mobile application, without using yet another database. So the question is: how could I store the data in the backend so that I can retrieve it more easily, without using entity beans, DTOs and Dozer to convert between them? Isn't there another database solution which already delivers JSON? What about graph databases, for example OrientDB or Neo4J?
I definitely want to go with Java and Spring, and I am open to a replacement for Hibernate, RDBMS and entity beans and DTOs.
Looking forward to your answers!

Your current design ("This setup works") has the niceties a good system should have: it is tiered and has good separation of concerns.
If I understand your requirement correctly, your argument is: if my end data format is JSON, why not store the data in JSON format, which would get rid of a lot of plumbing code and effort in the middle tier?
It would let you fetch the data from storage and pass it directly to the requesting client. Those are your requirements in a nutshell. Please correct me if I am wrong.
Now, JSON is more of a textual notation and less of a storage format. JSON is generally consumed by the view tier of an MVC architecture, as it's easy to render on screen using JavaScript.
Your reasoning for using a NoSQL DB which directly delivers JSON is credible, given that the end client is going to be a mobile app.
The overall architecture looks good and highly optimized for mobile access.
Now, coming to NoSQL JSON storage, the following document-store NoSQL DBs support a JSON interface:
i. CouchDB
ii. JasDB
iii. SchemaFreeDB
You can evaluate any of these to suit your needs.

(full disclosure - I'm an engineer with Kinvey, a BaaS provider)
One option you might consider is using Backend-as-a-service. Most BaaS providers use JSON to transfer the data over the wire, which sounds like it would be compatible with your requirements.
In addition, you'll typically get a lot of common mobile app functionality baked in (e.g. push notifications, file storage and CDN infrastructure, user management, etc.). This could be especially useful if you are building multiple apps, each with its own backend; rather than reinventing the wheel each time, simply spin up a new backend.
One last but important note is pricing. A lot depends on your use case, but from what I've seen, a BaaS provider is usually significantly cheaper than rolling your own solution on AWS or some other cloud provider, especially since most providers offer a free tier.

Even though this question is a bit old, maybe a quick alternative for RDBMS: MongoDB. It is a document database with document-level locking. It scales really well.
Main point: it uses JSON as its document storage (actually the Binary JSON a.k.a. BSON, but that is just a superset). Inserting a document into the database is as easy as
db.collection.insert(JSON);
on the mongo shell and
DBObject bson = (DBObject) JSON.parse(JSONstr);
collection.insert(bson);
in the java driver.

Using Java in Google App Engine, what's the best way to store and access large, static data?

I have most of my app's "dynamic" data stored in the datastore.
However, I also have a large collection of static data that would only change with new builds of the app. A series of flat files seems like it might be simpler than managing it in the datastore.
Are there standard solutions to this? How about libraries to make loading/parsing this content quick and easy? Does it make more sense to push this data to the datastore? Which would perform better?
Anyone else have this problem and have war stories they can share?

Everything depends on how you need to use the information.
I for instance have an application that needs to have a starting state provided from static data. Since I wanted this static data to be easily prepared outside the application, I put the data as spreadsheets on Google Docs and then I have an administrative function in my web app to load the starting state through Google Docs Spreadsheet API to objects in the datastore. It works fairly well, although there are some reliability issues that I haven't quite worked out yet (I sometimes need to restart the process).
In other cases, you might get away with just including the data as static properties/XML files and loading them through the standard Java resource APIs (getResourceAsStream and such). I haven't tried this approach, though, since it wasn't meaningful in my particular situation.
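For that second case, loading a bundled properties file through the resource API could look roughly like the sketch below. The class name StaticData and any resource path you pass in are hypothetical, and this is untested on App Engine itself; it only uses the standard ClassLoader.getResourceAsStream and Properties.load calls.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;

// Hypothetical loader for static data that ships inside the deployed app.
final class StaticData {
    private StaticData() {}

    // Parses key=value pairs from any stream (kept separate so it is easy to test).
    static Properties parse(InputStream in) throws IOException {
        Properties props = new Properties();
        props.load(in);
        return props;
    }

    // "name" is a classpath-relative path to a file packaged with the build.
    static Properties fromClasspath(String name) throws IOException {
        ClassLoader loader = StaticData.class.getClassLoader();
        try (InputStream in = loader.getResourceAsStream(name)) {
            if (in == null) {
                // getResourceAsStream returns null rather than throwing when the file is missing.
                throw new IOException("Resource not found: " + name);
            }
            return parse(in);
        }
    }
}
```

Since the data only changes with new builds, the result can be loaded once and kept in a static field for the lifetime of the instance.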

What's the best way to keep java app data stored redundantly in a file?

If I have systems that are based on realtime data, how can I ensure that all the information that is current is redundantly stored in a file? So that when the program starts again, it uses this information to initialize itself back to where it was when it closed.
I know of XStream and HSQLDB, but wasn't sure whether either is the best option for data that needs to be a literal carbon copy.

It really all depends on what type of app data you're storing. If you need to recreate Java objects exactly as they were (i.e. with the same variables and state), you can serialize the objects you need. There are many serialization mechanisms, for example XStream, as you mentioned. If you're storing objects directly, using one of those mechanisms would work.
But a lot of the time you want to store the state of your application, which doesn't necessarily correspond to serializing objects directly. If that's the case, you can write out only the relevant data you need. The type of storage you use depends on your needs. If you have a large amount of data, consider a database. A smaller amount might work better in a flat file.
One other thing is that storing data redundantly in a single file doesn't seem too useful. If the file gets corrupted, you'll lose both copies, so if redundancy is a concern, store the copies in different places (e.g. a primary and a backup database).
There's no one right way to do it, but hopefully these ideas get you started.

Creating a literal copy (i.e. a snapshot) of a large body of in-memory data is expensive. Repeating the process each time you get an update to the in-memory data is probably prohibitively expensive. You need to re-think your application architecture.
One approach is to commit your realtime data to a database as it comes in, and then display the data from the database for coherency.
A second approach is to commit to a database and maintain a parallel in-memory data structure which you display from. You also need to implement code to rebuild the in-memory data structure from the database on application restart. This is more code, and there is more opportunity for glitches where the user sees different stuff after a restart due to some bug.
A third approach is to work entirely from an in-memory data structure and deal with data persistence as follows:
Periodically, you suspend processing of updates and take a snapshot of the entire in-memory data structure using XStream, Java serialization or whatever.
Every update needs to be reliably logged (with a timestamp) to a file or files in a form that can be replayed.
When the application restarts, you reload from the last snapshot and then replay all updates that arrived since the snapshot.
The last approach has the problem that there is only one up-to-date stable copy of the data. If that is lost due to a hard disk or OS failure, then you are toast. In the other approaches, this issue can be addressed using a hot standby database implemented using the RDBMS's off-the-shelf support for such things.
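The third approach (snapshot plus a replayable update log) might be sketched like this. Everything here is an assumption made for illustration: the class name SnapshotStore, the file names, the use of a simple String-to-String map as the "in-memory data structure", and a tab-separated log format that requires keys and values to contain no tabs or newlines.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;
import java.util.HashMap;

// Hypothetical store: state lives in memory; durability comes from a snapshot
// file plus an append-only log of the updates made since that snapshot.
class SnapshotStore {
    private final Path snapshotFile;
    private final Path logFile;
    private HashMap<String, String> state = new HashMap<>();

    @SuppressWarnings("unchecked")
    SnapshotStore(Path dir) throws IOException {
        Files.createDirectories(dir);
        snapshotFile = dir.resolve("snapshot.bin");
        logFile = dir.resolve("updates.log");
        // Step 1 on restart: reload the last snapshot, if there is one.
        if (Files.exists(snapshotFile)) {
            try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(snapshotFile))) {
                state = (HashMap<String, String>) in.readObject();
            } catch (ClassNotFoundException e) {
                throw new IOException("Unreadable snapshot", e);
            }
        }
        // Step 2 on restart: replay every update logged after that snapshot.
        if (Files.exists(logFile)) {
            for (String line : Files.readAllLines(logFile, StandardCharsets.UTF_8)) {
                String[] parts = line.split("\t", 3); // timestamp, key, value
                if (parts.length == 3) {
                    state.put(parts[1], parts[2]);
                }
            }
        }
    }

    // Log the update (with a timestamp) before applying it in memory.
    synchronized void put(String key, String value) throws IOException {
        String entry = System.currentTimeMillis() + "\t" + key + "\t" + value + "\n";
        Files.write(logFile, entry.getBytes(StandardCharsets.UTF_8),
                StandardOpenOption.CREATE, StandardOpenOption.APPEND, StandardOpenOption.SYNC);
        state.put(key, value);
    }

    synchronized String get(String key) {
        return state.get(key);
    }

    // Holding the lock suspends updates while the snapshot is being written.
    synchronized void snapshot() throws IOException {
        Path tmp = snapshotFile.resolveSibling("snapshot.tmp");
        try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(tmp))) {
            out.writeObject(state);
        }
        Files.move(tmp, snapshotFile, StandardCopyOption.REPLACE_EXISTING);
        // Everything logged so far is now contained in the snapshot.
        Files.deleteIfExists(logFile);
    }
}
```

Replaying a put twice gives the same result, so a crash between writing the snapshot and deleting the log should still recover correctly. As noted above, this still leaves a single copy on a single disk.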
