I'm using the built in sqlite library on the Android platform.
I'm considering adding several general purpose fields that users will be able to use for their own custom applications, but these fields will be blank most of the time.
My question is, how much overhead will these blank fields add to my database? Do null fields even take up per record memory in sqlite? If so, how much? I don't quite understand the inner workings of a sqlite database.
The SQLite file format is described here. A NULL field will take one byte.
One way to provide custom/optional fields is to put them in a separate table with a foreign key identifying the corresponding record. Then no additional record is required if there are no custom fields, though there will be a need to do joins to gather all the fields when there are custom fields.
Related
I'm working on a project with an existing cassandra database.
The schema looks like this:
partition key (big int)
clustering key1 (timestamp)
data (text)
1
2021-03-10 11:54:00.000
{a:"somedata", b:2, ...}
My question is: Is there any advantage storing data in a json string?
Will it save some space?
Until now I discovered disadvantages only:
You cannot (easily) add/drop columns at runtime, since the application could override the json string column.
Parsing the json string is currently the bottleneck regarding performance.
No, there is no real advantage to storing JSON as string in Cassandra unless the underlying data in the JSON is really schema-less. It will also not save space but in fact use more because each item has to have a key+value instead of just storing the value.
If you can, I would recommend mapping the keys to CQL columns so you can store the values natively and accessing the data is more flexible. Cheers!
Erick is spot-on-correct with his answer.
The only thing I'd add, would be that storing JSON blobs in a single column makes updates (even more) problematic. If you update a single JSON property, the whole column gets rewritten. Also the original JSON blob is still there...just "obsoleted" until compaction runs. The only time that storing a JSON blob in a single column makes any sense, is if the properties don't change.
And I agree, mapping the keys to CQL columns is a much better option.
I don't disagree with the excellent and already accepted answer by #erick-ramirez.
However there is often a good case to be made for using frozen UDTs instead of separate columns for related data that is only ever going to be set and retrieved at the same time and will not be specifically filtered as part of your query.
The "frozen" part is important as it means less work for cassandra but does mean that you rewrite the whole value each update.
This can have a large performance boost over a large number of columns. The nice ScyllaDB people have a great post on that:
If You Care About Performance, Employ User Defined Types
(I know Scylla DB is not exactly Cassandra but I've seen multiple articles that say the same thing about Cassandra)
One downside is that you add work to the application layer and sometimes mapping complex UDTs to your Java types will be interesting.
I am using Cassandra database integrated into a spring boot application.
My Question is around the schema actions. If I need to make structural changes to the DB, say add a column to a table, the database needs to be recreated, however this means all the existing data gets deleted:
schema-action: CREATE_IF_NOT_EXISTS
The only way I have managed to solve this is by using the RECREATE scheme action, but as mentioned earlier, this results in data-loss.
What would be the best approach to handle this? To add structural changes such as a column name with out having to recreate the database and lose all existing data?
Thanks
Cassandra does allow you to modify the schema of an existing table without recreating it from scratch, using the ALTER TABLE statement via cqlsh. However, as explained in that link, there are some important limitations on the kind of changes you can do. You cannot modify the primary key of the table at all, you can add or delete regular columns, and you can't change the type of a column to a non-compatible one.
The reason for most of these limitations is how Cassandra needs to deal with the old data that already exists in the table. For example, it doesn't make sense to say that a column A that until now contained strings - will now contain integers - how are we supposed to handle all the old values in column A which weren't integers?
As Aaron rightly said in a comment, it is unlikely you'll want to do these schema changes as part of your application. These are usually rare operations which are done manually, or via some management application - not your usual application.
I am fairly new to the android sdk and databases and have been searching for an answer to this quite some time.
I am trying to build an app which has multiple tables within a database. e.g. one for weapons, armours etc.
However, my DatabaseManager class which handles all of my table creating, DatabaseHelper inner class and populating of data is creating for an extremely large class requiring high maintenance. Every time I would like to add or remove a table column I need to change quite a few areas of code,
- Every reference to the addition of a row in that table with data
- The method that the above calls
- The method returning all of the database rows
- The code in the helper class creating the table
- Any specific update methods
My question is this:
Surely there must be some better way of coding this system, maybe using a database isn't the best way to go, or am i just not used to such large classes having only learned java at university and my largest class consisting of a mere 400-600 lines of code.
Thanks for any help!
You could use source code generation, e.g. what is provided by jOOQ. See the examples page. When you generate source code from your development database, you can be sure your Java code accessing the underlying database will always be up to date. If anything changes in the database structurally, you will have compilation errors at all relevant places. That way, you can be sure not to forget anything.
On the other hand, if you add a field to a table and you want to insert/update into that table, that field will automatically be considered in all insert/update statements.
In addition to the above advantages, you can use the IDE of your choice (e.g. Eclipse) to provide you with auto-completion to phrase your SQL statements.
Note that support for SQLite is still a bit experimental in jOOQ. Any feedback is welcome!
What is the recommended way in Java to add prefixes to String keys to be stored in database of a web application?
I have an EntityId per Entity but I want to store different kinds of data for an Entity, in different rows distinguished by prefixed EntityId keys like this format:
EntityId | PrefixForThisDataCategory
In general, databases resist that kind of thing. They're happy to store "prefix" characters in a separate column, though. If that column needs to be part of a composite key, they're happy to do that, too.
But if you want to store different kinds of data in different rows of the same table, I hope I can discourage you. Databases--SQL databases, that is--are designed to keep different kinds of data in different tables. People in one table, addresses in another table . . . not addresses in some rows of a table of people.
Of course, you might be aiming at something completely different.
If I understand your question correctly you basically want to store data representing different sets of information in a common table and use one or more fields to differentiate what type of data has been stored.
THIS IS A REALLY BAD IDEA ! - had to say it :-)
I can tell you from experience on many projects that storing data in this way always leads to problems and really messy code. My strong recommendation would be to store the data is separate tables.
There is one variation I can think of however, that is similar to your request, that is derived class mapping in hibernate. There hibernate maps sets of data into tables based on the class that is being stored. This is only for mapping hierarchies of classes and is controlled by hibernate so that you don't have to worry about it.
I'm getting introduced to serialization and ran into some problems when pairing it with LinkedList
Consider i have the following table:
CREATE TABLE JAVA_OBJECTS (
ID BIGINT NOT NULL UNIQUE AUTO_INCREMENT,
OBJ_NAME VARCHAR(50),
OBJ_VALUE BLOB
);
And i'm planning to store 3 object types - so the table may look like so -
ID OBJ_NAME OBJ_VALUE
============================
1 Class1 BLOB
2 Class2 BLOB
3 Class1 BLOB
4 Class3 BLOB
5 Class3 BLOB
And i'll use 3 different LinkedList's to manage these objects..
I've been able to implement LoadFromTable() and StoreIntoTable(Class1 obj1).
My question is - if i change an attribute for a Class2 object in LinkedList<Class2>, how do i effect the change in the DB for this individual item? Also take into account that the order of the elements in LinkedList may change..
Thanks : )
* EDIT
Yes, i understand that i'll have to delete/update a row in my DB table. But how do i keep track of WHICH row to update? I'm only storing the objects in the List, not their respective IDs in the table.
You'll have to store their IDs in the objects you are storing. However, I would suggest not trying to roll your own ORM system, and instead use something like Hibernate.
If you change an attribute in a an object or the order of items. You will have to delete that row and insert the updated list again.
How do i effect the change in the DB for this individual item?
I hope I get you right. The SQL update and delete statements allow you to add a WHERE clause in which you chose the ID of the row to update.
e.g.
UPDATE JAVA_OBJECTS SET OBJ_NAME ="new name" WHERE ID = 2
EDIT:
To prevent problems with your Ids you could wrap you object
class Wrapper {
int dbId;
Object obj;
}
And add them instead of the 'naked' object into your LinkedList
You can use AUTO_INCREMENT attribute for your table and then use the mysql_insert_id() function to retrieve the id assigned to the row added/updated by the last INSERT/UPDATE statement. Along with this maintain a map (eg a HashMap) from the java object to the Id. Using this map you can keep track of which row to delete/update.
Edit: See the answer to this question as well.
I think the real problem here is, that you mix and match different levels of abstraction. By storing serialized Java objects into a relational database as BLOBs you have to consider several drawbacks:
You loose interoperability. Applications written in other languages than Java are not able to read the data back. Even other Java applications have to have the class files of the serialized classes in their classpath.
Changing the class definitions of the stored classes will end up in maintenance nightmares.
You give up the advantages of a relational database. Serialization hides the actual data from the database. So the database is presented only with a black box. You are unable to execute any meaningfull query against the real data. All what you have is the ID and block of bytes.
You have to implement low level data handling by yourself. Actually the database is made to handle your data effectively, but because of serialization you hinder it doing its job. So you are on your own and you are running into that problem right now.
So in most cases you benifit from separation of concerns and using the right tool for a job.
Here are some suggestions:
Separate the internal data handling inside your application from persistent storage. Design your database schema in a way to enable the built-in database features to handle the data efficently. In case of a relational database like MySQL you can choose from different technologies like plain JDBC, object relational mappers like JPA or simple mappers like MyBatis. Separation here means to avoid to contaminate the database with implementation specific concerns.
If you have for example in your Java application a List of Person instances and each Person consists of a name and an age. Then you would represent that list in a relational database as a table consisting of a VARCHAR field for the name and a numeric field for the age and maybe a third field for a unique key. Then the database is able to do what it can do best: managing large amounts of data.
Inside your application you typically separate the persistent layer from the rest of your program containing the code to communicate with the database.
In some use cases a relational database may not be the appropiate tool. Maybe in a single user desktop application with a small set of data it may be the best to simply serialize your Person list into a plain file and read it back at the next start up.
But there exists other alternatives to persist your data. Maybe some kind of object oriented database is the right tool. In particular I have experiences with Fast Objects. As a simplification it is serialization on steroids. There is no need for a layer like JPA or JDBC between your application and your database. You are able to store the class instances directly into the database. But unlike the relational database with its BLOB field, the OODB knows your classes and the actual data and can benefit from that.
Another alternative may be JDBM or Berkeley DB.
So separation of concerns and choosing the right persistence strategy (and using it the right way) is a key concern for the success of your project. But doing it right is hard even for experienced developers.