MongoDB database schema design tips [duplicate] - java

Possible Duplicate:
Mongodb database Schema Design with shared data
Hi, I am a newbie to MongoDB and I am using Java.
I have four tables in my relational database: Tenant, System, Authorization, and System_prop.
Something like this:
Table          Fields
Tenant         Tenant_ID (PK), Tenant_INFO
System         System_ID (PK), System_Info
Authorization  System_ID, Autho_Info
System_prop    System_ID, Prop_Info, Tenant_ID
In the System_prop table, Tenant_ID references Tenant_ID (PK) in the Tenant table, and System_ID references System_ID in the System table.
In the Authorization table, System_ID references System_ID in the System table.
I am switching my database from relational to MongoDB, and the first thing I need to do is the schema design.
The queries I need to run are:
SELECT D.Prop_Info, D.System_ID, A.Tenant_Info FROM TENANT A, System_prop D, SYSTEM B WHERE D.System_ID = B.System_ID AND D.Tenant_ID = A.Tenant_ID
SELECT C.System_ID, C.auth_Info, B.System_ID FROM Authorization C, SYSTEM B WHERE C.System_ID = B.System_ID
Can anyone help me design these tables as collections in MongoDB?
Do I need to embed or use DBRef? Please help me design the schema for this.

From the schema information you provided, it looks like you have a many-to-many relationship between Tenant and System (through the JOIN table System_prop), and a one-to-many relationship between System and Authorization.
In MongoDB, both of these types of relationships can be implemented using array fields. This is how you could set up your System collection:
{
    System_Info: ...,
    Tenant: [
        {
            Tenant_Id: ...,
            Tenant_Info: ...,
            Prop_Info: ...
        },
        {
            Tenant_Id: ...,
            Tenant_Info: ...,
            Prop_Info: ...
        }
    ],
    Authorization: [
        {
            Auth_Id: ...,
            Auth_Info: ...
        },
        {
            Auth_Id: ...,
            Auth_Info: ...
        }
    ]
}
However, for the Tenant info, you will now have de-normalized duplicate information, i.e. the same Tenant document appears in different System documents. It is up to your application to ensure consistency.
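To show how such a document would be read from Java, here is a minimal sketch using the MongoDB Java driver; the database/collection names and the tenantId value are assumptions, not something from the question:

import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.model.Filters;
import org.bson.Document;
import java.util.List;

// Rough equivalent of the first SQL query: Prop_Info/Tenant_Info together with the system.
MongoClient client = MongoClients.create("mongodb://localhost:27017");
MongoCollection<Document> systems = client.getDatabase("mydb").getCollection("system");

String tenantId = "tenant-42"; // placeholder value
for (Document system : systems.find(Filters.eq("Tenant.Tenant_Id", tenantId))) {
    // Each embedded Tenant element already carries Tenant_Info and Prop_Info, so no join is needed.
    // getList requires a reasonably recent driver version.
    List<Document> tenants = system.getList("Tenant", Document.class);
    System.out.println(system.get("_id") + " -> " + tenants);
}

The Authorization array can be read the same way, which covers the second query.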
As for the queries you mentioned: It looks like there is some information missing. For the first query, you're joining on the Tenant_Id but not requesting any information from the Tenant table. The second one requests Prop_Info from the Authorization table but that table doesn't have Prop_Info. Should that be A.Autho_Info instead? So you might want to double-check these queries.
Here are some additional resources about schema design in MongoDB that are worth a read:
http://www.mongodb.org/display/DOCS/Schema+Design
https://openshift.redhat.com/community/blogs/designing-mongodb-schemas-with-embedded-non-embedded-and-bucket-structures
In the end, it depends on your application and most frequent queries how exactly you choose to store your data, and the example above is just one way to set up your schema.

You are still thinking in relational databases. MongoDB, however, is a document-oriented database.
Artificial ID numbers are usually not needed, because every document automatically has an _id field (by default an ObjectId, as good as guaranteed to be globally unique).
Relation tables should not be used in MongoDB. N-to-many relations are modeled with array fields instead. So when one system uses N authorizations, your system document should have a field "authorization" which is an array of the ObjectIds of the authorizations it has. Yes, that would be a horrible violation of the normalization rules of relational databases, but you don't have a relational database here. In MongoDB it is practical to represent such relations with arrays, because arrays are transparent to the query language, as the sketch below illustrates.
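For illustration only (collection and field names are assumptions, using the MongoDB Java driver), the array-of-ids approach could be sketched like this:

import com.mongodb.client.MongoCollection;
import com.mongodb.client.model.Filters;
import com.mongodb.client.model.Updates;
import org.bson.Document;
import org.bson.types.ObjectId;

// Assume "db" is a MongoDatabase and systemId is the _id of an existing system document.
MongoCollection<Document> systems = db.getCollection("system");
MongoCollection<Document> authorizations = db.getCollection("authorization");

ObjectId authId = new ObjectId();
authorizations.insertOne(new Document("_id", authId).append("Auth_Info", "..."));

// Store the reference in the system's "authorization" array field.
systems.updateOne(Filters.eq("_id", systemId), Updates.push("authorization", authId));

// Arrays are transparent to queries: this matches every system holding that authorization id.
Document match = systems.find(Filters.eq("authorization", authId)).first();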

Related

Spring/JPA/Hibernate database historical changes at column level

In my Spring/JPA/Hibernate application I need to store all historical changes in my database at column level.
For example I have a table users with the following columns: id, name, email, address.
For historical changes I'll introduce a new dedicated table, let's say historical_changes, with the following fields: id, table_name, column_name, previous_value, new_value, create/update_date and user_id.
When someone inserts or updates a record in the users table, I have to add a set of new records to historical_changes with the affected columns; for example, users.name and users.email changes have to be logged in the following way:
1, 'users', 'name', 'old name', 'new name' ...
2, 'users', 'email', 'old#email.com', 'new#email.com' ...
Right now I'm looking for the best way to automate this process. I know that Spring Data supports an Auditing feature, but I can't find functionality that allows me to implement the described case.
Please advise how the described historical-changes feature can be implemented with Spring/JPA/Hibernate, or maybe with some other 3rd-party lib.
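Purely as an illustration of the historical_changes table described above, a JPA entity sketch could look as follows (class and field names are my assumptions; the actual diff could be computed in a Hibernate post-update listener, which exposes the old and new entity state):

import javax.persistence.*;
import java.util.Date;

@Entity
@Table(name = "historical_changes")
public class HistoricalChange {

    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;

    private String tableName;     // e.g. "users"
    private String columnName;    // e.g. "name"
    private String previousValue; // e.g. "old name"
    private String newValue;      // e.g. "new name"

    @Temporal(TemporalType.TIMESTAMP)
    private Date createDate;

    private Long userId;          // who made the change

    // getters and setters omitted for brevity
}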

Get All tables allowed to user with jdbc

I'm connecting to a database using JDBC and getting the list of all schemas and tables from the database (I assume that some databases may return only the tables the current user can query at this point, while others return the full list of tables), and when the user tries to query some tables he gets an "insufficient privileges" error.
Is there a way to get only the tables a user can query, using only JDBC capabilities and without writing a database-specific query?
Now I'm looking at
DatabaseMetaData dbMeta = connection.getMetaData();
dbMeta.getTablePrivileges(null, null, null);
But from the result of this call it's not so clear exactly which tables the user can query.
Currently I'm working with the SAP HANA database, but in general it may be any database, so I'm looking for a common approach.
Please look at
http://docs.oracle.com/javase/7/docs/api/java/sql/DatabaseMetaData.html#getTablePrivileges%28java.lang.String,%20java.lang.String,%20java.lang.String%29
You have to go through each row of the ResultSet and read the column TABLE_NAME, which contains the table name, and PRIVILEGE, which contains the access granted on that table.
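For example, a sketch of that iteration (what privileges are actually reported varies by driver and vendor, as noted in the question):

import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.Set;
import java.util.TreeSet;

// Collect the names of tables on which a SELECT privilege has been granted.
static Set<String> selectableTables(Connection connection) throws SQLException {
    Set<String> tables = new TreeSet<>();
    DatabaseMetaData meta = connection.getMetaData();
    try (ResultSet rs = meta.getTablePrivileges(null, null, "%")) {
        while (rs.next()) {
            String table = rs.getString("TABLE_NAME");
            String privilege = rs.getString("PRIVILEGE");
            if ("SELECT".equalsIgnoreCase(privilege)) {
                tables.add(table);
            }
        }
    }
    return tables;
}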

How does no-sql handle relational data?

I know it is a non-relational database but this does not mean that relational data does not exist.
For example, I have a table that holds urls like this (simplified):
url | domain
and I have a table that holds domains like this (simplified):
domain | favicon_path
Because many different urls may share the same domain, I did not want to repeat the favicon_path for every url when pulling the data to send to the view.
Hence I use a simple join (simplified for the example) when I need the data:
"SELECT bookmarks.*, domains.favicon FROM bookmarks JOIN
domains ON bookmarks.domain=domains.domain"
How would I handle this scenario using no-sql?
I plan on implementing no-sql using indexedDB on the client ( javascript ) and MongoDB on the server ( java ).
If you want to use document-oriented DB, you can use this structure of documents:
URL_ID: {
    "domain": "id_of_domain",
    "another_staff": "..."
}
DOMAIN_ID: {
    "favicon_path": "path or id of another document",
    "another_staff": "..."
}
So you can fetch the document with URL_ID by id from the database, and then fetch the corresponding Domain document.
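In Java that manual "join" is simply two lookups; a rough sketch with the MongoDB driver (collection and field names are assumptions based on the layout above):

import com.mongodb.client.MongoCollection;
import com.mongodb.client.model.Filters;
import org.bson.Document;

// Assume "db" is a MongoDatabase and urlId identifies the url document.
MongoCollection<Document> urls = db.getCollection("urls");
MongoCollection<Document> domains = db.getCollection("domains");

Document url = urls.find(Filters.eq("_id", urlId)).first();
// Second lookup follows the stored domain reference (add null checks in real code).
Document domain = domains.find(Filters.eq("_id", url.getString("domain"))).first();
String faviconPath = domain.getString("favicon_path");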
ADDITION:
You can use the following approach for generating ids. Create a special document (like a sequence) which has only one field: the current value of the sequence. On every insert to the DB you fetch this sequence and increment it. Some DBs, like Couchbase, have low-level support for this mechanism, which is very efficient and thread-safe.
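In MongoDB the usual way to do this is an atomic findOneAndUpdate with $inc on a counter document; a minimal sketch (the counter collection and field names are assumptions):

import com.mongodb.client.MongoCollection;
import com.mongodb.client.model.Filters;
import com.mongodb.client.model.FindOneAndUpdateOptions;
import com.mongodb.client.model.ReturnDocument;
import com.mongodb.client.model.Updates;
import org.bson.Document;

// One counter document per sequence; the $inc is applied atomically on the server.
MongoCollection<Document> counters = db.getCollection("counters");
Document counter = counters.findOneAndUpdate(
        Filters.eq("_id", "url_id"),
        Updates.inc("current_value", 1L),
        new FindOneAndUpdateOptions().upsert(true).returnDocument(ReturnDocument.AFTER));
long nextId = counter.getLong("current_value");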
From years of work experience in the IT area, I would say most business models can be normalized into data structures as simple as these two types:
Entity info.
Entity list.
For example, in a book store business, we will have the Book entity and many lists containing all of the books or a subset of them.
With a NoSQL database such as Redis or SSDB, the Book entity is stored as a key-value pair, where the key is the book SN and the value is the stringified book info (title, publish date, description, etc.), while book lists (list by publish date, list by price, etc.) are stored in the zset data type.
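A rough sketch of that layout with Redis and the Jedis client (keys and values below are placeholders, not real data):

import redis.clients.jedis.Jedis;

try (Jedis jedis = new Jedis("localhost", 6379)) {
    // Entity info: key = book SN, value = stringified book info (JSON here).
    jedis.set("book:0001", "{\"title\":\"Example Book\",\"price\":19.99}");

    // Entity list: a zset scored by price, so a range scan gives "list by price".
    jedis.zadd("books:by_price", 19.99, "book:0001");

    // The ten cheapest books (members are keys of the entity-info entries).
    System.out.println(jedis.zrange("books:by_price", 0, 9));
}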

Transform Cassandra query result to POJO with Astyanax

I am working on a Spring web application using Cassandra with the Astyanax client. I want to transform result data retrieved from Cassandra queries into a POJO, but I do not know which library or Astyanax API supports this.
For example, I have a User column family (CF) with some basic properties (username, password, email), and other related additional information can be added to this CF. Then I fetch one User row from that CF using an OperationResult<ColumnList<String>> to hold the data returned, like this:
OperationResult<ColumnList<String>> columns = getKeyspace().prepareQuery(getColumnFamily()).getRow(rowKey).execute();
What I want to do next is populate "columns" into my User object. Here I have two problems, and I would appreciate your help with them:
1/ What is the best structure for the User class to hold the corresponding data retrieved from the User CF? My suggestion is:
public class User {
    String userName, password, email; // Basic properties
    Map<String, Object> additionalInfo;
}
2/ How can I transform the Cassandra data to this POJO by using a generic method (so that it can be applied to every single CF which has mapped POJO)?
I am so sorry if there are some stupid dummy things in my questions, because I have just approached NoSQL concepts and Cassandra as well as Astyanax for 2 weeks.
Thank you so much for your help.
You can try Achilles: https://github.com/doanduyhai/achilles, a JPA-compliant Entity Manager for Cassandra.
Right now there is a complete implementation using the Thrift API via Hector.
The CQL3 implementation using the DataStax Java Driver is in progress; a beta version will be available in a few months (July-August 2013).
CQL3 is great, but it's still too low level because you need to extract the data yourself from the ResultSet. It's like going back to the time when only JDBC Template was available.
Achilles is there to fill the gap.
I would suggest you use a library like Playorm, with which you can easily perform CRUD operations on your entities. See this for an example of how you can create a User object; then you can get the POJO easily by
User user1 = mgr.find(User.class, email);
assuming that email is your NoSqlId (primary key, or row key in Cassandra).
I use com.netflix.astyanax.mapping.Mapping and com.netflix.astyanax.mapping.MappingCache for exactly this purpose.
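If you would rather not depend on the mapping package, a hand-rolled conversion into the User class suggested in the question is also easy to sketch (this assumes all columns hold string values and that the fields are accessible or have setters):

import com.netflix.astyanax.model.Column;
import com.netflix.astyanax.model.ColumnList;
import java.util.HashMap;
import java.util.Map;

// Copy the known columns into fields and everything else into the map.
static User toUser(ColumnList<String> columns) {
    User user = new User();
    Map<String, Object> additionalInfo = new HashMap<>();
    for (Column<String> column : columns) {
        String name = column.getName();
        String value = column.getStringValue();
        if ("username".equals(name))      user.userName = value;
        else if ("password".equals(name)) user.password = value;
        else if ("email".equals(name))    user.email = value;
        else                              additionalInfo.put(name, value);
    }
    user.additionalInfo = additionalInfo;
    return user;
}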

The best way to import (merge)/export a Java DB database

I have, let's say, two PCs: PC-a and PC-b, which both have the same application installed with Java DB support. I want from time to time to copy the data from the database on PC-a to the database on PC-b and vice versa, so that the two PCs have the same data all the time.
Is there an already-implemented API in the database layer for this (i.e. 1. export/backup the database from PC-a, 2. import/merge the databases on PC-b), or do I have to do this at the SQL layer (manually)?
Since you mention in the comments that you want to "merge" the databases, this sounds like you need to write custom code to do it, as presumably there could be conflicts: the same key in both, but with different details against it, for example.
In short: you can't do this without some work on your side. SalesLogix fixed this problem by giving everything a site code, so here's how your table would look:
Customer:
    SiteCode varchar,
    CustomerID varchar,
    ....
    primary key (SiteCode, CustomerID)
So now you would take your databases, and match up each record by primary key. Where there are conflicts you would have to provide a report to the end-user, on what data was different.
Say machine1:
Record | SiteCode | CustomerID | CustName  | Phone        | Email
1      | XXX      | 0001       | Customer1 | 555.555.1212 | darth#example.com
and on machine2:
Record | SiteCode | CustomerID | CustName  | Phone        | Email
2      | XXY      | 0001       | customer2 | 555.555.1213 | darth#nowhere.com
3      | XXX      | 0001       | customer1 | 555.555.1212 | darth#nowhere.com
When performing a resolution:
Records 1 and 3 are in conflict, because the PK matches but the data doesn't (the email is different).
Record 2 is unique, and can freely exist in both databases (a sketch of this resolution pass follows below).
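A sketch of that resolution pass in Java (the Customer class and the two input maps, keyed by "siteCode|customerId", are assumptions made for illustration):

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// merged starts as a copy of machine1's records; machine2's records are folded in.
Map<String, Customer> merged = new HashMap<>(machine1Records);
List<Customer> conflicts = new ArrayList<>();

for (Map.Entry<String, Customer> entry : machine2Records.entrySet()) {
    Customer existing = merged.get(entry.getKey());
    if (existing == null) {
        merged.put(entry.getKey(), entry.getValue());   // unique record: keep it
    } else if (!existing.equals(entry.getValue())) {
        conflicts.add(entry.getValue());                // same PK, different data: report to the user
    }
}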
There is NO way to do this automatically without error or data corruption or referential integrity issues.
I guess you are using Java DB (aka Derby) - in which case, assuming you just can't use a single instance, you can do a backup/restore.
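For the backup/restore route, Derby exposes a system procedure, and the restore is driven by a connection attribute; a minimal sketch (database name and paths are placeholders):

import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.DriverManager;

// On PC-a: take an online backup of the running database.
try (Connection conn = DriverManager.getConnection("jdbc:derby:myDB");
     CallableStatement cs = conn.prepareCall("CALL SYSCS_UTIL.SYSCS_BACKUP_DATABASE(?)")) {
    cs.setString(1, "/backups");
    cs.execute();
}

// On PC-b: restore from that backup (note this replaces the target database, it does not merge).
// DriverManager.getConnection("jdbc:derby:myDB;restoreFrom=/backups/myDB");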
Why don't you have the database on one PC and have all the other PCs request data from the host PC?
