Design for Users-Challenges Relationship in Java/Android - java

I am just starting off with app development and am currently writing an Android application which has registered users and a list of 'challenges' which they are able to select and later mark as completed/failed.
The plan is to eventually store all users/challenge/etc data on a database though I haven't implemented this yet.
The issue I have run into is this: in my current design each User has list fields containing their current challenges and completed challenges, e.g. two ArrayList fields.
Users currently select challenges from a listview of different Challenge objects, which are then added to the user's CurrentChallenges list.
What I had not accounted for is how to structure this so that when a user takes on a challenge, they have their own unique copy of that challenge that can be independently marked as completed etc, whereas at the minute every user that selects say, Challenge 1, is simply adding the same challenge with the same ID etc. as each other user that selects Challenge 1.
I suppose I could have each different challenge be its own subclass of Challenge and give every user who selects that challenge type a different instance of that class. However, this seems very messy and inefficient, as all the different classes would be largely the same.
Does anyone have any good ideas or design patterns for this case? Preferably a solution that will be compatible with later storing these challenges in a database and presumably using ORM.
Thanks a lot for any suggestions,
E

I'd move every aspect of a challenge that is different for each user into a new Attempt class. So Challenge might have variables for name, description etc. and Attempt would have inProgress, completed etc. Obviously these are just examples, replace them with whatever data you're actually storing.
Now in your User class, you can record challenges using a Map. Make it a Map<Challenge, Attempt> and each User will be able to store an Attempt for each Challenge to record their progress. The Challenge instances are shared between users but there is an Attempt instance for each combination of User and Challenge.
When you implement the database later, Challenge, User and Attempt would each translate to a table. Attempt would have foreign keys for both of the other tables. Unfortunately I haven't used ORMs much so I'm not sure whether they'd work with a Map correctly.
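The Challenge/Attempt split described above can be sketched roughly as follows. This is only a minimal illustration: the field names, the Status enum and the take/attemptFor methods are assumptions made for the example, not part of the original design.

```java
import java.util.HashMap;
import java.util.Map;

// Shared definition of a challenge; one instance is reused by every user.
final class Challenge {
    private final long id;
    private final String name;

    Challenge(long id, String name) {
        this.id = id;
        this.name = name;
    }

    long getId() { return id; }
    String getName() { return name; }

    // Equality is based on id so a Challenge works reliably as a map key.
    @Override public boolean equals(Object o) {
        return o instanceof Challenge && ((Challenge) o).id == id;
    }

    @Override public int hashCode() { return Long.hashCode(id); }
}

// Per-user state for one challenge (hypothetical fields).
final class Attempt {
    enum Status { IN_PROGRESS, COMPLETED, FAILED }

    private Status status = Status.IN_PROGRESS;

    Status getStatus() { return status; }
    void complete() { status = Status.COMPLETED; }
    void fail() { status = Status.FAILED; }
}

final class User {
    private final String name;
    // One Attempt per Challenge this user has taken on.
    private final Map<Challenge, Attempt> attempts = new HashMap<>();

    User(String name) { this.name = name; }

    String getName() { return name; }

    // Starts the challenge if it was not started yet and returns this user's own attempt.
    Attempt take(Challenge challenge) {
        return attempts.computeIfAbsent(challenge, c -> new Attempt());
    }

    // Returns null if the user never took the challenge.
    Attempt attemptFor(Challenge challenge) {
        return attempts.get(challenge);
    }
}
```

With this shape, two users who take the same Challenge each get their own Attempt, so one user completing it does not affect the other.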

Related

What is the DDD way to make sure that there is only one obj created with 2 attribute combinations

I'm pretty new to the whole DDD concept and I have the following question:
Let's say I have a UI where users can save cars by entering an id and a name. What is the DDD way to make sure that every unique id and name combination is only created once? The cars are all entities and will be stored in a database. Usually I would just put a primary and a foreign key in the DB, check whether the combination is already there, and create/store the object only if it is not.
Now I'm wondering whether this is domain logic or just simple CRUD. If it is domain logic and I understood correctly, I should make my car object decide whether it is valid or not. If that's the case, how would I do that?
thanks in advance!
edit:
another thing: What if every created object should be deleted after 10 days. That would be a concept in the domain and would hence be also part of the domain logic. But how should the Object know when to delete itself and how should it do it? Would that be a domain service that checks the creation date of the objects and if it is older than 10 days it should perform a delete operation inside the DB?
I would go with a UNIQUE constraint on the two fields if you don't care about the validity of the values entered. That way, even if someone for some reason inserts or updates records directly in the DB, the DB will reject the duplicate.
If you care about the validity of the combined values entered, then on top of that you will have to add some validation logic in your code before saving to the DB.
As for your deletion mechanism, you can have a scheduler that checks every day for data older than 10 days, using a DB column filled in at creation time (e.g. CREATED_ON), and deletes it.
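The cutoff logic of such a cleanup job might look like the sketch below. It works on an in-memory list purely for illustration; StoredCar, ExpiryJob and purgeExpired are hypothetical names, and against a real database the same rule would typically be a single DELETE with a condition on the CREATED_ON column.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.List;

// Hypothetical record type; createdOn mirrors the CREATED_ON column.
final class StoredCar {
    final String id;
    final String name;
    final Instant createdOn;

    StoredCar(String id, String name, Instant createdOn) {
        this.id = id;
        this.name = name;
        this.createdOn = createdOn;
    }
}

final class ExpiryJob {
    static final Duration MAX_AGE = Duration.ofDays(10);

    // Removes every record created more than MAX_AGE before 'now'
    // and returns how many records were removed.
    static int purgeExpired(List<StoredCar> cars, Instant now) {
        Instant cutoff = now.minus(MAX_AGE);
        int before = cars.size();
        cars.removeIf(car -> car.createdOn.isBefore(cutoff));
        return before - cars.size();
    }
}
```

Passing the current time in as a parameter keeps the rule easy to test; a scheduler would simply call it once a day with Instant.now().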
"It depends".
If id and name are immutable properties that are assigned at the beginning of the object's lifetime, then the straightforward thing to do is to incorporate them into the key that you use to look up the aggregate.
car = Garage.get(id, name)
If instead what you have is a relation that changes over time (for instance, if you have to worry about name being corrupted by a data entry error) then things become more complicated.
The general term for the problem you are describing is set-validation. And the riddle is this: in order to reliably verify that a set has some property, you need to know that the property doesn't change between when you check it and when you commit your own change. In other words, you need to be able to lock the entire set.
Expressed more generally, the set is a collection of associated objects that we treat as a unit for the purpose of data changes. And we have a name for that pattern: aggregate.
So "the registry of names" becomes an aggregate in its own right - something that you can load, modify, store, and so on.
In some cases, it can make sense to partition that into smaller aggregates ("the set of things named Bob") - that reduces the amount of data you need to load/store when managing the aggregate itself, but adds some complexity to the use case when you change a name.
Is this "better" than the answer of just using database constraints? It depends on which side of the trade off you value more -- enforcing part of the domain invariant in the domain model and part of it in the data store adds complexity. Also, when you start leaning on the data store to enforce part of the invariant, you begin to limit your choices of what data store to use.
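As a rough illustration of the "registry of names" aggregate discussed above, here is an in-memory sketch. CarKey and CarRegistry are hypothetical names, and the synchronized methods only stand in for whatever locking or optimistic versioning a real repository would use when loading and storing the aggregate.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Value object for the (id, name) combination that must be unique.
final class CarKey {
    final String id;
    final String name;

    CarKey(String id, String name) {
        this.id = Objects.requireNonNull(id);
        this.name = Objects.requireNonNull(name);
    }

    @Override public boolean equals(Object o) {
        if (!(o instanceof CarKey)) return false;
        CarKey other = (CarKey) o;
        return id.equals(other.id) && name.equals(other.name);
    }

    @Override public int hashCode() { return Objects.hash(id, name); }
}

// Aggregate that owns the whole set, so the uniqueness check and the
// change to the set happen together as one unit.
final class CarRegistry {
    private final Set<CarKey> registered = new HashSet<>();

    // Returns true if the combination was new and is now registered,
    // false if it already existed.
    synchronized boolean register(String id, String name) {
        return registered.add(new CarKey(id, name));
    }

    synchronized boolean isRegistered(String id, String name) {
        return registered.contains(new CarKey(id, name));
    }
}
```

The point of the sketch is that the invariant lives in one place: nothing can add a combination without going through the registry.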

Obtain image via an API call to then save and serve up from local for repeated viewings

I'm working on a webapp at the moment that will display a list of items. The list is dynamic and can change between users. A great analogy is to think of the objects as books, with the db backing it as the library.
My database for Book will contain a list of all books in the library.
-A user can add a book to their collection.
-If a user wants to add a new book to their collection they will also add it to the library.
-If a user wants to add a new book to their collection and it exists in the library, nothing will be added to the Book database.
Currently my table is incredibly simple: Book(id, name). I am able to access a plethora of information about these books via an API call, such as a front cover, number of pages etc etc. I would like to store a subset of this information, especially the image url.
I think a sensible approach would be to alter my Book table so that it looks like: Book(id, name, imageUrl, otherValue, idOfThisBookInApiCallTable). The idOfThisBookInApiCallTable value will allow me to get other attributes as I need them. However, I have two issues with this that I'm not sure how to handle.
Firstly, this table can easily get out of date with the APITable. I don't expect there to be much change, if any, but the risk is there.
Secondly, and this is my main concern, there is the image being stored: on a page where there might be 50 books, I'll be making a call to the URL of each image every time. I think a sensible solution would be to download the image locally and then serve it from there on repeated visits, but I'm not sure if this is the correct approach.
Might I ask if anyone can see any issues with this approach and/or suggest a better one please? I have limited experience with db/web/app design so a little out of my depth here.
If saving the image locally is the correct approach, is there a 'best' way of doing this?
Thanks in advance for any help/suggestions/advice.
I can share my 2 cents of a plausible design but I think the question is too broad and is mostly opinion-based. Let's address it by taking one thing at a time.
First, regarding your Book table: why not have a Library table where you maintain the current state of your library, with all the books that the library has at the moment?
Each user can hold a collection (a table with a one-to-many relation, such as user_id to a list of book_ids) so that each user effectively owns a subset of book IDs.
When adding a new book via a user or via the library (the library can also add books even if no particular user brought them in), always add it to the library, and if the user_id of the book's 'owner' is known, also add a relation for this user in the collection table.
More details of a book can be stored separately in a BookDetails table.
Storing images on your side is always a nice option, and you don't want to get blocked by the API for over-usage when requesting the same images over and over again. You can use cloud storage such as S3 to keep the images and then not bother the external API. S3 supports compression and caching, so you can save lots of time and avoid speed problems.
All the above points are just my opinion based on the information you gave on the question. The situation can of course be different for your use-case.
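The "fetch once, then serve from your own storage" idea could be sketched like this, using the local file system in place of S3. ImageFetcher and LocalImageCache are hypothetical names; the fetcher is an assumption standing in for whatever HTTP client calls the external API.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical abstraction over the remote call; a real implementation
// would perform an HTTP GET against the image URL.
interface ImageFetcher {
    byte[] fetch(String imageUrl) throws IOException;
}

final class LocalImageCache {
    private final Path cacheDir;
    private final ImageFetcher fetcher;

    LocalImageCache(Path cacheDir, ImageFetcher fetcher) {
        this.cacheDir = cacheDir;
        this.fetcher = fetcher;
    }

    // Returns the local file holding a book's cover, downloading it only
    // if it has not been stored yet.
    Path coverFor(long bookId, String imageUrl) throws IOException {
        Files.createDirectories(cacheDir);
        Path local = cacheDir.resolve("book-" + bookId + ".img");
        if (Files.notExists(local)) {
            Files.write(local, fetcher.fetch(imageUrl));
        }
        return local;
    }
}
```

Keying the file by book id means a page listing 50 books triggers at most one remote request per book over the lifetime of the cache. Refreshing stale images is not handled here.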

Performance of database call from JAVA to Database

Our team is building a small application wherein a UI has about 10 drop-down list boxes. ( DDLB ).
These list boxes will be populated by selecting data from different tables.
Our JAVA person feels that making separate database call for each list will be very expensive and wants to make a single database call for all lists.
I feel it is impractical to populate all lists in one database call, for the following reason:
a. Imagine an end user chooses state = 'NY' from one DDLB.
b. The next drop down should be populated with values from ZIP_CODES table for STATE='NY'
Unless we know ahead of time what state a user will be choosing - our only choice is to populate a java structure with all values from ZIP_CODES table. And after the user has selected the state - parse this structure for NY zipcodes.
And imagine doing this for all the DDLBs in the form. This will be not only impractical but also resource intensive.
Any thoughts ?
If there are not many items in those lists and the available memory allows it, you could load all values for all drop-down boxes into memory at application startup and then filter the data in memory. That will be better than executing an SQL query for every action the user makes with those drop-down boxes.
You could also use some cache engines (like EhCache) that could offload data to disk and store only some fraction in memory.
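A minimal sketch of the load-once-and-filter-in-memory approach, for the state/zip example from the question. ZipCodeCache is a hypothetical name, and the rows are assumed to be {state, zip} pairs as they might come back from a single query over the ZIP_CODES table at startup.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class ZipCodeCache {
    // Zip codes grouped by state, built once at startup.
    private final Map<String, List<String>> zipsByState = new HashMap<>();

    // Each row is a {state, zip} pair.
    ZipCodeCache(List<String[]> rows) {
        for (String[] row : rows) {
            zipsByState.computeIfAbsent(row[0], s -> new ArrayList<>()).add(row[1]);
        }
    }

    // Called when the user picks a state; no database access is needed.
    List<String> zipsFor(String state) {
        return Collections.unmodifiableList(
                zipsByState.getOrDefault(state, Collections.emptyList()));
    }
}
```

Grouping by state up front means the dependent drop-down is filled with one map lookup rather than a scan of every zip code.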
You can run some timings to see, but I suspect you're sweating something that might take 100th of a second to execute. UI design wise I never put zip codes in selection menus because the list is too long and people already know it well enough to just punch in. When they leave the zip code field I will query the city and state and pre-fill those fields if they're not already set.

Java - Google App Engine - modelling graph structures in Google Datastore

Google App Engine offers the Google Datastore as its only NoSQL database (I think it is based on BigTable).
In my application I have a social-like data structure and I want to model it as I would do in a graph database. My application must save heterogeneous objects (users,files,...) and relationships among them (such as user1 OWNS file2, user2 FOLLOWS user3, and so on).
I'm looking for a good way to model this typical situation, and I thought of two families of solutions:
List-based solutions: Any object contains a list of other related objects and the object presence in the list is itself the relationship (as Google said in the JDO part https://developers.google.com/appengine/docs/java/datastore/jdo/relationships).
Graph-based solution: Both nodes and relationships are objects. The objects exist independently from the relationships while each relationship contain a reference to the two (or more) connected objects.
What are strong and weak points of these two approaches?
About approach 1: This is the simpler approach one can think of, and it is also presented in the official documentation but:
Each directed relationship make the object record grow: are there any limitations on the number of the possible relationships given for instance by the object dimension limit?
Is that a JDO feature, or does the datastore structure itself allow that approach to be implemented naturally?
The relationship search time will increase with the list; is this solution suitable for large numbers (millions) of relationships?
About approach 2: Each relationship can have a higher level of characterization (it is an object and it can have properties). And I think memory size is not a Google problem, but:
Each relationship requires its own record, so the search time for each related pair will increase as the total number of relationships increases. Is this suitable for a large number of relationships (millions, billions)? I.e. does Google have good tricks to search among records if they are well structured? Or will I soon be in a situation in which, if I want to search for a friend of User1 called User4, I have to wait seconds?
On the other side each object doesn't increase in dimension as new relationships are added.
Could you help me find other important points about the two approaches, so that I can choose the best model?
First, the search time in the Datastore does not depend on the number of entities that you store, only on the number of entities that you retrieve. Therefore, if you need to find one relationship object out of a billion, it will take the same time as if you had just one object.
Second, the list approach has a serious limitation called "exploding indexes". You will have to index the property that contains a list to make it searchable. If you ever use a query that references more than just this property, you will run into this issue - google it to understand the implications.
Third, the list approach is much more expensive. Every time you add a new relationship, you will rewrite the entire entity at considerable writing cost. The reading costs will be higher too if you cannot use keys-only queries. With the object approach you can use keys-only queries to find relationships, and such queries are now free.
UPDATE:
If your relationships are directed, you may consider making Relationship entities children of User entities, and using an Object id as an id for a Relationship entity as well. Then your Relationship entity will have no properties at all, which is probably the most cost-efficient solution. You will be able to retrieve all objects owned by a user using keys-only ancestor queries.
I have an AppEngine application and I use both approaches. Which is better depends on two things: the practical limits of how many relationships there can be and how often the relationships change.
NOTE 1: My answer is based on experience with Objectify and heavy use of caching. Mileage may vary with other approaches.
NOTE 2: I've used the term 'id' instead of the proper DataStore term 'name' here. Name would have been confusing and id matches objectify terms better.
Consider users linked to the schools they've attended and vice versa. In this case, you would do both. Link the users to schools with a variation of the 'List' method. Store the list of school ids the user attended as a UserSchoolLinks entity with a different type/kind but with the same id as the user. For example, if the user's id = '6h30n' store a UserSchoolLinks object with id '6h30n'. Load this single entity by key lookup any time you need to get the list of schools for a user.
However, do not do the reverse for the users that attended a school. For that relationship, insert a link entity. Use a combination of the school's id and the user's id for the id of the link entity. Store both id's in the entity as separate properties. For example, the SchoolUserLink for user '6h30n' attending school 'g3g0a3' gets id 'g3g0a3~6h30n' and contains the fields: school=g3g0a3 and user=6h30n. Use a query on the school property to get all the SchoolUserLinks for a school.
Here's why:
Users will see their schools frequently but change them rarely. Using this approach, the user's schools will be cached and won't have to be fetched every time they hit their profile.
Since you will be getting the user's schools via a key lookup, you won't be using a query. Therefore, you won't have to deal with eventual consistency for the user's schools.
Schools may have many users that attended them. By storing this relationship as link entities, we avoid creating a huge single object.
The users that attended a school will change a lot. This way we don't have to write a single, large entity frequently.
By using the id of the User entity as the id for the UserSchoolLinks entity we can fetch the links knowing just the id of the user.
By combining the school id and the user id as the id for the SchoolUserLink, we can do a key lookup to see if a user and school are linked. Once again, no need to worry about eventual consistency for that.
By including the user id as a property of the SchoolUserLink we don't need to parse the SchoolUserLink object to get the id of the user. We can also use this field to check consistency between both directions and have a fallback in case somehow people are attending hundreds of schools.
Downsides:
1. This approach violates the DRY principle. Seems like the least of evils here.
2. We still have to use a query to get the users who attended a school. That means dealing with eventual consistency.
Don't forget to update the UserSchoolLinks entity and add/remove the SchoolUserLink entity in a single transaction.
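To make the id scheme above concrete, here is a plain Java sketch that uses in-memory maps as a stand-in for the datastore. It does not use Objectify or any datastore API; LinkStore and its methods are hypothetical, and each map only plays the role of one entity kind.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class SchoolUserLink {
    final String school;
    final String user;

    SchoolUserLink(String school, String user) {
        this.school = school;
        this.user = user;
    }

    // Composite id in the form described above, e.g. "g3g0a3~6h30n".
    static String idFor(String school, String user) {
        return school + "~" + user;
    }

    String id() { return idFor(school, user); }
}

// In-memory stand-in for the datastore.
final class LinkStore {
    // UserSchoolLinks: keyed by the user's own id.
    private final Map<String, Set<String>> userSchoolLinks = new HashMap<>();
    // SchoolUserLink: keyed by the composite id.
    private final Map<String, SchoolUserLink> schoolUserLinks = new HashMap<>();

    // Updates both directions together; in the datastore this would be one transaction.
    void attend(String user, String school) {
        userSchoolLinks.computeIfAbsent(user, u -> new LinkedHashSet<>()).add(school);
        SchoolUserLink link = new SchoolUserLink(school, user);
        schoolUserLinks.put(link.id(), link);
    }

    // Key lookup: knowing the user's id is enough to find the list.
    Set<String> schoolsOf(String user) {
        return userSchoolLinks.getOrDefault(user, Collections.emptySet());
    }

    // Key lookup on the composite id.
    boolean isLinked(String school, String user) {
        return schoolUserLinks.containsKey(SchoolUserLink.idFor(school, user));
    }

    // Stand-in for a query on the 'school' property.
    List<String> usersOf(String school) {
        List<String> users = new ArrayList<>();
        for (SchoolUserLink link : schoolUserLinks.values()) {
            if (link.school.equals(school)) {
                users.add(link.user);
            }
        }
        return users;
    }
}
```

Only usersOf scans the link entities, which corresponds to the one place where the real design needs a query and therefore has to deal with eventual consistency.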
Your question is quite complex, but I will try to explain the best solution (I will answer in Python, but the same can be done in Java).
    class User(db.Model):
        followers = db.StringListProperty()
Adding a follower is simple:
    user = User.get(key)
    user.followers.append(str(followerKey))
    user.put()  # persist the change
This allows a fast query both for a user's followers and for the users someone follows:
    User.all().filter('followers =', str(followerKey))  # -> users followed by followerKey
This query is costly in I/O, so you can make it faster at the price of more complexity and more expensive writes:
    class User(db.Model):
        followers = db.StringListProperty()
        follows = db.StringListProperty()
However, this complicates changes: deleting a user now requires updating the follows lists as well, so you need two writes.
You can also store relationships as separate entities, but that is the worst scenario since it is more complex than the second example with followers and follows. Keep in mind that an entity can hold up to 1 MB; you may never hit that limit, but you can.

How to efficiently unpublish all datas from a particular user on a blogging application?

We develop and operate a blogging application in which user data is scattered across many tables:
- Blog
- Article
- Comment
- Message
- Trackback
- 50 other tables.
Users are able to close their account, and their account/contents must disappear from the site right away.
For legal/contractual reasons, we must also be able to undelete their account/content for a given duration, and to make that data available to judicial authorities for another duration.
Over the years and across different applications, we have used different approaches:
"deleted" flag everywhere: Each table has a "deleted" column, which is updated when data is deleted/restored. Very nasty because it slows down every list-generating query and causes a lot of updates upon deletion/restore. Also, it does not handle the two-stage deletion described above. In fact we never used this one, but it's worth advising against it :)
"Multi table": For each table, we create a second table with the same schema plus two extra fields (dateDeleted, reason). The extra fields are used to know if the data is still accessible for restoration, when to delete it, and why/how it was deleted in the first place. This version is just a bit better than the previous one, but can be very nasty performance-wise too when tables are growing. Also, you have to change the schema of some tables (i.e. remove UNIQUE constraints), which makes the system harder to understand/upgrade for new developers, administrators ... and mentally healthy people in general.
"Multi DB": Same approach as before, but we move the data to a different database cluster, which allows browsing it without impacting the "end users" DB. Also, for this app, the uniqueness constraint is enforced at the Java level, so all the schemas are the same. Lastly, the double data retention constraint is handled by having a dedicated DB for each constraint, which makes things easier.
I have to admit that none of those approaches satisfies me, even if they can work up to a certain amount of data. I have also imagined that we could just delete some key rows in the DB, and let the rest inconsistent (and scheduled for a more controlled deletion job), but it scares me ...
Do you know other ways of doing the same thing while keeping the same level of features (we could align the two durations to simplify the problem)? I'm not looking for a solution for my existing apps, but would like to improve the next ones.
Any input will be highly appreciated !
It seems that every asset (blog, comment, ...) depends on the user. I would give the user table an "active" column which is 0 or 1. Then, for each query on the different assets, you check whether the owning user is active. Try to optimize this lookup with indexes or something similar. In my opinion it's the cleanest way. After this you can implement a job which runs a cascading delete on users that have been disabled for longer than x days.
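A rough in-memory sketch of that idea is shown below. Account, Article and Visibility are hypothetical names, and a deactivation timestamp is assumed in place of the plain 0/1 flag so that the cleanup job can tell how long an account has been disabled. In SQL the visibility check would be a join against the user table.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class Account {
    final long id;
    Instant deactivatedAt; // null while the account is active

    Account(long id) { this.id = id; }

    boolean isActive() { return deactivatedAt == null; }
}

final class Article {
    final long authorId;
    final String title;

    Article(long authorId, String title) {
        this.authorId = authorId;
        this.title = title;
    }
}

final class Visibility {
    // Keeps only articles whose author exists and is still active.
    static List<Article> visible(List<Article> articles, Map<Long, Account> accounts) {
        return articles.stream()
                .filter(a -> {
                    Account author = accounts.get(a.authorId);
                    return author != null && author.isActive();
                })
                .collect(Collectors.toList());
    }

    // Accounts disabled for longer than the retention period,
    // i.e. candidates for the cascading delete job.
    static List<Account> dueForDeletion(Collection<Account> accounts, Instant now, Duration retention) {
        Instant cutoff = now.minus(retention);
        return accounts.stream()
                .filter(a -> a.deactivatedAt != null && a.deactivatedAt.isBefore(cutoff))
                .collect(Collectors.toList());
    }
}
```

Closing an account then touches a single row, and restoring it within the retention period only means clearing the timestamp again.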
