How to implement one-to-many relationships with DynamoDB


Hi everyone! I have two tables that I would like to join in DynamoDB, but since it is not a relational database, I don't know how to map the link between them.
In particular, I have a Price List table and a Detail List table that contains the details of the first one. How can I implement a one-to-many relationship in Java using DynamoDB with Spring Boot?

DynamoDB is basically a key-value store. You only ever perform a lookup based on a key. That key may be artificial, not just a user id but something like "user_id#product#order", yet it will still be a key-based lookup. If you want to use DynamoDB, you have to store the data in such a way that every query you will need boils down to basic key-based access (plus some sorting).
You have to do the exact opposite of normalizing your data and splitting relations into multiple tables: you de-normalize all your data and store the data and all the relations in just one table, multiple times, with multiple complex artificial keys. See e.g. https://www.youtube.com/watch?v=HaEPXoXVf2k on how to use LSIs and GSIs, how to model your data, how to choose artificial keys, etc.
That means you will not have an Item, an Order and an OrderItem table that you join together. Instead you will have just one Everything table, which may have the fields: userid, username, ordernumber, itemid, itemprice, itemquantity, itemname, orderdate, shippingaddress, etc.
And if you have three items in an order, you will have three entries in this table. That means the username and the itemname will be repeated in the table very often, and changing them will be difficult, but that is how things are if you want to use DynamoDB.
That is how you model one-to-many relations: by packing them into a single table and adding proper indexes.
If you have no idea about the current or future access patterns of your data, or about how to structure your data properly, then DynamoDB is the wrong tool for you.
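To make the composite-key idea more concrete, here is a minimal sketch in plain Java. It does not use the AWS SDK: the class SingleTableSketch, the key names (PRICELIST#1, DETAIL#001, META) and the sample attributes are all made up for illustration, and a TreeMap merely stands in for a table with a partition key and a sort key. It shows one common way to lay out a price list and its details so that "all details of one price list" becomes a single key-based lookup with a sort-key prefix, similar in spirit to a begins_with key condition.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// In-memory stand-in for a single table with a partition key and a sort key.
// This is NOT the AWS SDK; it only illustrates the key layout (all names are hypothetical).
class SingleTableSketch {
    // partition key -> (sort key -> item attributes), sorted by sort key
    private final Map<String, TreeMap<String, Map<String, String>>> table = new TreeMap<>();

    void put(String pk, String sk, Map<String, String> attributes) {
        table.computeIfAbsent(pk, k -> new TreeMap<>()).put(sk, attributes);
    }

    // Similar in spirit to the key condition "pk = :pk AND begins_with(sk, :prefix)".
    List<Map<String, String>> queryByPrefix(String pk, String skPrefix) {
        List<Map<String, String>> result = new ArrayList<>();
        TreeMap<String, Map<String, String>> partition = table.get(pk);
        if (partition == null) {
            return result;
        }
        for (Map.Entry<String, Map<String, String>> e : partition.tailMap(skPrefix, true).entrySet()) {
            if (!e.getKey().startsWith(skPrefix)) {
                break; // keys are sorted, so we are past the prefix range
            }
            result.add(e.getValue());
        }
        return result;
    }

    // Sample data: the parent (META) and its children (DETAIL#...) share a partition key.
    static SingleTableSketch sample() {
        SingleTableSketch t = new SingleTableSketch();
        t.put("PRICELIST#1", "META", Map.of("name", "Summer prices"));
        t.put("PRICELIST#1", "DETAIL#001", Map.of("item", "Chair", "price", "49.90"));
        t.put("PRICELIST#1", "DETAIL#002", Map.of("item", "Table", "price", "129.00"));
        t.put("PRICELIST#2", "META", Map.of("name", "Winter prices"));
        t.put("PRICELIST#2", "DETAIL#001", Map.of("item", "Lamp", "price", "19.50"));
        return t;
    }

    public static void main(String[] args) {
        SingleTableSketch t = sample();
        // All details of price list 1 in one key-based lookup.
        System.out.println(t.queryByPrefix("PRICELIST#1", "DETAIL#"));
    }
}
```

With a real table, the same layout would typically be queried on the partition key plus a prefix condition on the sort key; the point of the sketch is only that the "many" side lives next to the "one" side under the same partition key.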

The question you are asking gets at the very essence of working with DynamoDB and NoSQL data modeling. It is not as simple as applying your relational database knowledge to DynamoDB. Take a moment to familiarize yourself with the DynamoDB basics before you get too far into solving this problem.
Watch this video about modeling one-to-many relationships in DynamoDB. I would recommend you watch the entire video from the beginning, as it's one of the best introductions to the topic currently available.

Related

Storing custom level system in MySQL

I am wondering how I would store my custom network level in a MySQL table. I could make four columns, 'level', 'exp', 'expreq' and 'total'. Only this will take up four columns, and as I am storing name, rank and other data in the same table it will be too many columns in the end. Are there better ways? Should I make another table?

In a relational data model, and to leave room for expansion, you should put this in a separate table: the master table points to a detail table, where you can have as many attributes as you need.
BUT
This has an obvious impact on memory as the data grows. In addition, this approach is often replaced by a less normalized version of the tables that introduces concepts like "Custom Fields".
OR
If it were me, and this table is accessed from a programming language, I would store the values in JSON format in a very simple table and let the program take on the processing overhead.

Hibernate pagination with ____ToMany mapping

I'm writing this on the fly on my phone, so forgive the crappy code samples.
I have entities with a ManyToMany relationship:
@JoinTable(name="foo", joinColumns=@JoinColumn(name="..."), inverseJoinColumns=@JoinColumn(name="..."))
@ManyToMany
List list = new ArrayList();
I want their data to be retrieved in a paginated way.
I know about setFirstResult and setMaxResults. Is there a way to use this with the mapping? As in, I retrieve the object and get the list filled with contents equal to the amount of records for a single page, with the appropriate offset.
I guess I'm just unclear on the best way to do this. I could just use Hibernate criteria manually to get the effect, but I feel that bypasses the mapping. I have this mapping, and I want to see if there's a way to use it in a paginated way.
PS. If this is impractical, just say so. Also, if it is, can I still use the mapping to add new entries to the join table? As in, if the entity is already persisted in the DB but I haven't fetched the ManyToMany list, can I add something new to it so that, when it's persisted with cascade all, it is added to the join table without clearing the other entries?

The type of the relationship between entities that are part of your query isn't that important. There are a couple of ways to tackle this.
If your database supports the LIMIT keyword in its queries, you can use it to fetch one page of data at a time, assuming you sort your data. Note that if your data changes while your user is navigating between pages, you might see some duplicates or miss some records. You'll also be stuck having to rewrite the queries if you move to a database that doesn't have the LIMIT keyword.
If you need to freeze the data at the point of the original query, you need to use a 3rd party framework or write your own code to fetch a list of ids for your query, then split up that list and fetch by id in subsets for pagination. This is more reliable and can be made to work for any database.
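The "fetch the ids once, then page through them" idea can be sketched roughly as follows. This is only an illustration: the class IdSnapshotPager is made up, and the actual "fetch by id" query (Hibernate, JDBC, or anything else) is left out, since it depends on your setup.

```java
import java.util.ArrayList;
import java.util.List;

// Holds a snapshot of the matching ids, taken once by the original query,
// and hands out the subset of ids that belongs to each page.
class IdSnapshotPager {
    private final List<Long> ids; // frozen at the time of the original query
    private final int pageSize;

    IdSnapshotPager(List<Long> ids, int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        this.ids = new ArrayList<>(ids); // defensive copy: later changes do not affect paging
        this.pageSize = pageSize;
    }

    int pageCount() {
        return (ids.size() + pageSize - 1) / pageSize; // ceiling division
    }

    // Ids for the given zero-based page; empty if the page is out of range.
    List<Long> idsForPage(int pageIndex) {
        int from = pageIndex * pageSize;
        if (pageIndex < 0 || from >= ids.size()) {
            return List.of();
        }
        int to = Math.min(from + pageSize, ids.size());
        return new ArrayList<>(ids.subList(from, to));
    }

    public static void main(String[] args) {
        IdSnapshotPager pager = new IdSnapshotPager(List.of(1L, 2L, 3L, 4L, 5L, 6L, 7L), 3);
        for (int page = 0; page < pager.pageCount(); page++) {
            // In a real application these ids would feed a "WHERE id IN (...)" style query.
            System.out.println("page " + page + " -> " + pager.idsForPage(page));
        }
    }
}
```

Because the id list is copied when the pager is created, rows inserted or deleted afterwards do not shift the page boundaries, which is the "freeze" behaviour described above.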
Displaytag is a data paging framework I've used and that I therefore can tell you works well for large datasets. It's also one of the older solutions for this problem and is not part of an extended framework.
http://displaytag.sourceforge.net/11/tut_externalSortAndPage.html
Table sorter is another one I came across. This one uses JQuery and fetches the entire data set in one query, so strictly speaking it doesn't meet your "fetches the data in a paginated way" criteria. (This might not be appropriate for large sets).
http://tablesorter.com/docs/
This tutorial might be helpful:
http://theopentutorials.com/examples/java-ee/jsp/pagination-in-servlet-and-jsp/
If you're already using a framework take a look at whether that framework has tackled pagination:
Spring MVC provides a data pager:
http://blog.fawnanddoug.com/2012/05/pagination-with-spring-mvc-spring-data.html
GWT provides a data pager:
http://www.gwtproject.org/javadoc/latest/com/google/gwt/user/cellview/client/SimplePager.html
The following references might be helpful too:
JDBC Pagination
which also points to:
http://java.avdiel.com/Tutorials/JDBCPaging.html

Retrieve information for the same DTO from two different databases

I tried to make this as simple as possible with a short example.
We have two databases, one in MS SQL Server and the other in Progress.
We have the following User DTO, which we show in a UI table within a web application.
User
int, id
String, name
String, accountNumber
String, street
String, city
String, country
Now this DTO (entity) is not stored in only one database: some fields for the same user are stored in one database and some in the other.
MSsql
Table user
int, id
String, name
String, accountNumber
Progress
Table userModel
int, id
String, street
String, city
String, country
As you can see, the key is the only piece that links the two tables. As I said before, they are not in the same database and do not use the same database vendor.
We have a requirement to sort the UI table by any column. Obviously we need to build the User DTO with the information coming from both databases.
Our proposal at this moment is: if the user wants to sort by the street field, we run a query in the Progress database and obtain one page of results (using pagination). With the keys from that result set we go to the User table in MS SQL Server and run another query to extract the missing information, fill our DTO, and transfer it to the UI. This implies running a query in one database and then another query, based on the returned keys, in the second database.
The order in which the databases are queried could change depending on which column (field) the user wants to sort by.
Technically, we will create a JpaRepository that acts as a facade and, depending on the field, runs the process against the correct database.
My question is:
Is there a pattern that is commonly used in this kind of scenario? We are using Spring, so perhaps Spring has some out-of-the-box features to support this requirement. It would be great if this were possible using JpaRepositories (I have several doubts about it, as we will use two different EntityManagers, one for each database).
Note: Moving data from one database to the other is not an option.

For this, you need a separate DataSource/EntityManagerFactory/JpaRepository for each database.
There is no out-of-the-box support for this architecture in the Spring framework, but you can easily hide the pair of DataSources behind a Service layer. You can even configure JTA DataSources for ACID operations.
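As a rough illustration of such a service layer, here is a plain-Java sketch of the two-step lookup described in the question. Nothing here is Spring or JPA: the class TwoSourceUserService is made up, and the two maps are only stand-ins for the repositories that would sit on top of the SQL Server and Progress databases.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Service-layer sketch: callers see one method, the two data sources stay hidden.
// The maps are in-memory stand-ins for the two repositories (assumption for illustration).
class TwoSourceUserService {
    private final Map<Integer, String> namesById;   // stand-in for the SQL Server "user" table
    private final Map<Integer, String> streetsById; // stand-in for the Progress "userModel" table

    TwoSourceUserService(Map<Integer, String> namesById, Map<Integer, String> streetsById) {
        this.namesById = namesById;
        this.streetsById = streetsById;
    }

    // Returns one page of "id:name:street" rows, sorted by street.
    List<String> pageSortedByStreet(int pageIndex, int pageSize) {
        // Step 1: sorted page of ids from the source that owns the sort column.
        List<Integer> pageIds = streetsById.entrySet().stream()
                .sorted(Map.Entry.comparingByValue())
                .skip((long) pageIndex * pageSize)
                .limit(pageSize)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());

        // Step 2: fetch the remaining fields by key from the other source,
        // keeping the order established in step 1.
        List<String> rows = new ArrayList<>();
        for (Integer id : pageIds) {
            rows.add(id + ":" + namesById.get(id) + ":" + streetsById.get(id));
        }
        return rows;
    }

    static TwoSourceUserService sample() {
        return new TwoSourceUserService(
                Map.of(1, "Ann", 2, "Bob", 3, "Cid"),
                Map.of(1, "Oak St", 2, "Elm St", 3, "Pine St"));
    }

    public static void main(String[] args) {
        TwoSourceUserService service = sample();
        System.out.println(service.pageSortedByStreet(0, 2));
        System.out.println(service.pageSortedByStreet(1, 2));
    }
}
```

In a real setup, step 1 would be a sorted, paginated query against one database and step 2 a "find all by id" call against the other; sorting by a field that lives in the other database would simply swap the roles of the two sources.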

As you will always need to fetch data from both databases, why not populate local Java User objects and then sort these objects (using a comparator on the fields you want to sort by)?
The advantage of sorting locally vs doing the sort in the database query is that you won't have to send requests to the database every time you change the sorting field.
So, to summarize:
1- Issue two sql queries for the two databases to get your users
2- Build your User objects using the retrieved values
3- Use Java comparators to sort the users on any field without having to issue new queries to the database.
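The steps above could look roughly like this. It is only a sketch: the User class is trimmed to three of the fields from the question, the sample data is invented, and the two SQL queries that would build the list are left out.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class LocalUserSort {
    // Trimmed-down User with a few of the fields from the question.
    static final class User {
        final int id;
        final String name;
        final String city;

        User(int id, String name, String city) {
            this.id = id;
            this.name = name;
            this.city = city;
        }
    }

    // Maps a UI column name to a comparator; unknown columns are rejected.
    static Comparator<User> comparatorFor(String field) {
        switch (field) {
            case "id":
                return Comparator.comparingInt((User u) -> u.id);
            case "name":
                return Comparator.comparing((User u) -> u.name);
            case "city":
                return Comparator.comparing((User u) -> u.city);
            default:
                throw new IllegalArgumentException("Unknown sort field: " + field);
        }
    }

    // Sorts a copy, so the list built from the two queries stays untouched.
    static List<User> sortedBy(List<User> users, String field) {
        List<User> copy = new ArrayList<>(users);
        copy.sort(comparatorFor(field));
        return copy;
    }

    // Invented sample data standing in for the merged result of both queries.
    static List<User> sample() {
        return List.of(
                new User(1, "Carol", "Berlin"),
                new User(2, "Alice", "Zurich"),
                new User(3, "Bob", "Austin"));
    }

    public static void main(String[] args) {
        for (User u : sortedBy(sample(), "name")) {
            System.out.println(u.id + " " + u.name + " " + u.city);
        }
    }
}
```

Keep in mind that this only works if the full set of users fits comfortably in memory; with pagination on a large table you would be back to sorting in the database.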

My advice would be to find a way to link the two databases together so that you can use the database driver's features without your code being affected.
Essentially if Progress database can be linked to SQL Server, you will be able to query both databases using a single SQL query with a join on id column and you will get a merged, sorted and paginated result set for your application to display.
I am not an expert in Progress database but it seems there is an ODBC driver for it so you might try to link it to SQL Server.

Efficient use of the GAE DataStore

I am currently developing a Google AppEngine (GAE) application and I am struggling a bit with the GAE DataStore best practices. I would like to use the DataStore in the most efficient way. I am using the Objectify framework, but am flexible to use something else if there is a better alternative.
My application uses three objects/tables:
- Items (id, description)
- List (id, listId, listDescription)
- SecurityProfile (id, listId, username, accessType)
In a relational world, my Items and SecurityProfile tables would have a foreign key linking them to a list (listId), and I would then use joins in my queries.
The typical Queries I need to make:
- Get all lists accessible to a particular user (need an index on "username" to filter by username and need to get the description from the List table)
- Get all items in list for a particular user (get the Items linked to the Lists retrieved in the query above)
I am struggling a bit to come up with a way to link the different objects in an efficient way (minimizing the DataStore queries and indexes).
I have seen in other posts that joins should be avoided and that I should de-normalize the model as much as possible.
So kind of creating one object only:
- Data (id, description, listId, listDescription, username, accessType)
I can see how that works from a read point of view, but if I update a listDescription or an accessType, or add a new username, I could potentially have to update a massive number of records. Is this really the way to go?

I'm only familiar with the Python NDB API, but things are similar in Java.
In Python NDB, I would recommend creating a Model for each of:
User,
List,
List item
Then, you can reference them with repeated KeyProperties, e.g.
class SecurityProfiles(ndb.Model):
    accessibleLists = ndb.KeyProperty(repeated=True)
class List(ndb.Model):
    listItems = ndb.KeyProperty(repeated=True)
Like this, you can pull a user's profile from the DataStore, and with the keys stored in accessibleLists you can get the lists accessible to the user.
Alternatively, you could do it the other way around:
class List(ndb.Model):
    usersWithAccess = ndb.KeyProperty(repeated=True)
and then you could immediately query for lists that are accessible to a given user.

persisting dynamic properties and query

I have a requirement to implement a contact database. This contact database is special in that the user should be able to dynamically (at runtime) add properties he/she wants to track about a contact. Some of these properties are strings, others are numbers or dates. Some of the properties have pre-defined values, others are free fields, etc. The user also wants to be able to query such a structure quickly and easily. The database needs to easily handle 500 000 contacts, each having around 10 properties.
This leads to a dynamic property model with a Contact class that has dynamic properties.
class Contact {
    private Map<DynamicProperty, Collection<DynamicValue>> propertiesAndValues;
    // other useful methods
}
The question is how I can store such a structure in "some database" (it does not have to be an RDBMS) so that I can easily express queries such as:
Get all contacts whose name starts with Martin and who are from a Company of size 5000 or less, ordered by the time the contact was inserted into the database, only the first 100 results (provide pagination), where each of these segments corresponds to a dynamic property.
I need:
filtering - equal, partial equal, (bigger, smaller for integers, dates) and maybe aggregation - but it is not necessary at this point
sorting
pagination
I was considering an RDBMS, but this leads more or less to the following structure, which is quite hard to query and tends to be slow for this amount of data:
contact(id serial pk,....);
dynamic_property(dp_id serial pk, ...);
--only one of the values is not empty
dynamic_property_value(dpv_id serial pk, dynamic_property_fk int, value_integer int, date_value timestamp, text_value text);
contact_properties(pav_id serial pk, contact_id_fk int, dynamic_property_fk int);
property_and_its_value(pav_id_fk int, dpv_id int);
I am considering the following options:
Store contacts in an RDBMS and use Lucene for querying - is there anything that would help with this?
Store dynamic properties as XML in an RDBMS and use XPath support - unfortunately it seems to be pretty slow for 500 000 contacts
Use another database - MongoDB or Jackrabbit - to store this information
Which way would you go and why?

Wikipedia has a great entry on Entity-Attribute-Value modeling which is a data modeling technique for representing entities with arbitrary properties. It's typically used for clinical data, but might apply to your situation as well.
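To give a feel for what querying an entity-attribute-value layout involves, here is a small in-memory Java sketch. The class EavSketch, the Row type and the sample data are invented for illustration; a real implementation would run the equivalent filter, sort and pagination in the database instead of in Java streams.

```java
import java.util.List;
import java.util.stream.Collectors;

class EavSketch {
    // One row of an entity-attribute-value table: (entity id, attribute name, value).
    static final class Row {
        final long contactId;
        final String attribute;
        final String value;

        Row(long contactId, String attribute, String value) {
            this.contactId = contactId;
            this.attribute = attribute;
            this.value = value;
        }
    }

    // Ids of contacts whose given attribute starts with the prefix,
    // ordered by contact id and cut down to one page.
    static List<Long> findByPrefix(List<Row> rows, String attribute, String prefix,
                                   int offset, int limit) {
        return rows.stream()
                .filter(r -> r.attribute.equals(attribute) && r.value.startsWith(prefix))
                .map(r -> r.contactId)
                .distinct()
                .sorted()
                .skip(offset)
                .limit(limit)
                .collect(Collectors.toList());
    }

    // Invented sample data: every dynamic property is its own row.
    static List<Row> sample() {
        return List.of(
                new Row(1, "name", "Martin Novak"),
                new Row(1, "companySize", "5000"),
                new Row(2, "name", "Martina Kral"),
                new Row(2, "companySize", "120"),
                new Row(3, "name", "Peter Maly"),
                new Row(3, "companySize", "80"));
    }

    public static void main(String[] args) {
        System.out.println(findByPrefix(sample(), "name", "Martin", 0, 100));
    }
}
```

Note that every additional condition (for example on company size) needs another pass over the rows, or another self-join in SQL, which is one reason EAV queries tend to get slow and awkward as the number of filtered properties grows.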

Have you considered using Lucene for your querying needs? You could probably get away with just using Lucene and store all your data in the index. Although I wouldn't recommend using Lucene as your only persistence store.
Alternatively, you could use Lucene along with a RDBMS and take advantage of something like Compass.

You could try another kind of database, like CouchDB, which is a distributed, document-oriented db.
If you want a dumb solution, you could add some 50 generic columns to your contacts table, such as STRING_COLUMN1 through STRING_COLUMN10, DATE_COLUMN1 through DATE_COLUMN10, and so on, plus a DESCRIPTION column. So if a row has a name, which is a string, then STRING_COLUMN1 stores the value of the name and the DESCRIPTION column value would be "STRING_COLUMN1-NAME". In this case querying can be a bit tricky. I know many purists laugh at this, but I have seen a similar requirement solved this way in one of the apps :)
