Design pattern for assembling disparate data?

I am designing a system which assembles disparate data into a standard row/column output.
Each column can:
Exist in an independent system.
Can be paginated.
Can be sorted.
Each column can contain millions of rows.
And the system:
Needs to be extensible so different tables of different columns can be outputted.
The final domain object is known (the row).
The key is constant across all systems.
My current implementation plan is to design two classes per column (or one column class that implements two interfaces). The interfaces would:
Implement pagination and sorting.
Implement "garnishing".
The idea is that the table constructor would receive information about the current sort column and page. The sort column would then return a list of the appropriate keys for the table. These keys would be used to create a list of domain object rows, which would then be passed to each column's "garnishing" implementation in turn so that each column's information could be added.
I guess my question is: what design patterns would you recommend, or what alternative design decisions would you make, for assembling disparate data with common keys and variable columns?
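To make the plan above concrete, here is a minimal sketch of the two interfaces and the assembly step. All names (Row, KeyProvider, Garnisher, TableAssembler) are hypothetical and only illustrate the shape of the idea; how each column talks to its own system is left out.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical domain row: the constant key plus whatever the columns add.
class Row {
    final String key;
    final Map<String, Object> cells = new LinkedHashMap<>();

    Row(String key) {
        this.key = key;
    }
}

// Implemented by the column that drives sorting and paging.
interface KeyProvider {
    List<String> keysForPage(int page, int pageSize, boolean ascending);
}

// Implemented by every column: fills in its own cell on each row.
interface Garnisher {
    void garnish(List<Row> rows);
}

class TableAssembler {
    static List<Row> assemble(KeyProvider sortColumn, List<Garnisher> columns,
                              int page, int pageSize, boolean ascending) {
        // Step 1: the sort column decides which keys are on this page.
        List<Row> rows = new ArrayList<>();
        for (String key : sortColumn.keysForPage(page, pageSize, ascending)) {
            rows.add(new Row(key));
        }
        // Step 2: every column adds its data to the same list of rows.
        for (Garnisher column : columns) {
            column.garnish(rows);
        }
        return rows;
    }
}
```

Passing the whole page of rows to each Garnisher lets a column fetch its values in one batch from its backing system rather than once per row.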

I'm not sure I completely understood what you're trying to do, but from what I gather, you want to store rows of arbitrary data in a way that lets you build structured tables from it later on. What I would do in this case (assuming you're using Java) is make a very simple Column interface that just exposes a value:
public interface Column {
    String getValue();
}
Then, you could make columns by implementing Column:
public class Key implements Column {
    private final String value;
    public Key(String keyValue) {
        this.value = keyValue;
    }
    @Override
    public String getValue() {
        return value;
    }
}
So then you can make a class called DataRow (or whatever you like) whose objects would contain the actual data. For example, you could have a method in that class that would allow you to add data:
import java.util.ArrayList;
import java.util.List;

public class DataRow {
    List<Column> data = new ArrayList<Column>();

    public DataRow(String key) {
        this.setColumn(new Key(key));
    }

    public void setColumn(Column columnData) {
        this.data.add(columnData);
    }

    // Returns the first column of exactly the given class, or null if the row has none.
    public Column getColumn(Class<? extends Column> column) {
        for (Column c : this.data) {
            if (c.getClass().equals(column)) {
                return c;
            }
        }
        return null;
    }
}
As you can see, you can call the method setColumn() by giving it a new Column object. This allows you to add any kind of data you like to the DataRow object. Then, to make some tables, you could have a function that takes a List of DataRows and a List of column classes, and returns rows that contain only the specified columns:
public List<DataRow> createTable(List<DataRow> data, List<Class<? extends Column>> columns) {
    List<DataRow> table = new ArrayList<DataRow>();
    for (DataRow row : data) {
        // Each output row starts with the key, then gets the requested columns it actually has.
        DataRow ret = new DataRow(row.getColumn(Key.class).getValue());
        for (Class<? extends Column> column : columns) {
            Column value = row.getColumn(column);
            if (value != null) {
                ret.setColumn(value);
            }
        }
        table.add(ret);
    }
    return table;
}
This will allow you to "create" tables using your data, and the columns you want to include in the table.
Note that I wrote this code to convey an idea, and that it's pretty messy at the moment. But I hope this will help you in some small way.

Related

How to manage access to a database object properly?

I am wondering whether there is a better solution to my problem.
Better in the sense that not every object of the class Segment has to create a new database object.
I am trying to keep only one database in my program because the database is very big, and I am sure there is a more efficient solution to this.
The Database holds objects of the class SegmentInformation in a List. Each object contains the information a Segment object needs for its instantiation.
The Layer class contains a List of Segments. The Layer's constructor takes an array of IDs. Every Segment gets its information from the Database using the ID it passes in.
class Database {
    List<SegmentInformation> segInfoList;

    public SegmentInformation getSegInfos(int id) {
        return segInfoList.get(id);
    }
}
class Layer {
    List<Segment> segmentList = new ArrayList<>();

    public Layer(int[] segmentIDs) {
        // The loop variable is the ID itself, not an index into the array.
        for (int id : segmentIDs) {
            segmentList.add(new Segment(id));
        }
    }
}
class Segment {
    double value1;
    //....
    double valueN;

    public Segment(int sID) {
        Database db = new Database(); // one new Database per Segment
        SegmentInformation info = db.getSegInfos(sID);
        value1 = info.getValue1();
        //....
        valueN = info.getValueN();
    }
}
I am trying to avoid a global static variable which contains the Database.
Any suggestions for a more suitable way to instantiate all the Segment objects?

Use a Singleton to contain all the Segment objects:
In software engineering, the singleton pattern is a software design
pattern that restricts the instantiation of a class to one "single"
instance. This is useful when exactly one object is needed to
coordinate actions across the system. The term comes from the
mathematical concept of a singleton.
https://en.wikipedia.org/wiki/Singleton_pattern
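As a rough sketch of what that could look like here, the following applies the pattern to the Database class, so that every Segment reads from one shared instance. This is one reading of the suggestion, not the only one. The SegmentInformation stand-in, its single field, and the placeholder data are assumptions made so the example runs on its own.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for the question's SegmentInformation (hypothetical single field).
class SegmentInformation {
    private final double value1;

    SegmentInformation(double value1) {
        this.value1 = value1;
    }

    double getValue1() {
        return value1;
    }
}

class Database {
    // Holder idiom: the instance is created once, on first use, by class initialization.
    private static final class Holder {
        static final Database INSTANCE = new Database();
    }

    private final List<SegmentInformation> segInfoList = new ArrayList<>();

    private Database() {
        // Placeholder data; the real (large) database would be loaded here, once.
        segInfoList.add(new SegmentInformation(1.5));
        segInfoList.add(new SegmentInformation(2.5));
    }

    static Database getInstance() {
        return Holder.INSTANCE;
    }

    SegmentInformation getSegInfos(int id) {
        return segInfoList.get(id);
    }
}

class Segment {
    final double value1;

    Segment(int sID) {
        // Every Segment shares the single Database instance.
        value1 = Database.getInstance().getSegInfos(sID).getValue1();
    }
}
```

Note that a singleton is still global state behind a static accessor. If that is what you want to avoid, an alternative is to create the Database once and pass it into the Layer and Segment constructors.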

Java class: limit instance variable to one of several possible values, depending on other instance variables

I am sorry for the vague question. I am not sure what I'm looking for here.
I have a Java class, let's call it Bar. In that class is an instance variable, let's call it foo. foo is a String.
foo cannot just have any value. There is a long list of strings, and foo must be one of them.
Then, for each of those strings in the list I would like the possibility to set some extra conditions as to whether that specific foo can belong in that specific type of Bar (depending on other instance variables in that same Bar).
What approach should I take here? Obviously, I could put the list of strings in a static class somewhere and upon calling setFoo(String s) check whether s is in that list. But that would not allow me to check for extra conditions - or I would need to put all that logic for every value of foo in the same method, which would get ugly quickly.
Is the solution to make several hundred classes for every possible value of foo and insert in each the respective (often trivial) logic to determine what types of Bar it fits? That doesn't sound right either.
What approach should I take here?
Here's a more concrete example, to make it clearer what I am looking for. Say there is a Furniture class, with a variable material, which can be lots of things, anything from mahogany to plywood. But there is another variable, upholstery, and you can make cotton furniture from plywood but not oak; satin furniture from oak but not walnut; other types of fabric go well with any material; et cetera.

I wouldn't suggest creating multiple classes/templates for such a big use case. This is very opinion-based, but I'll take a shot at answering as best as I can.
When your options are numerous and you want to keep the code base maintainable, the best solution is to separate the values from the logic. I recommend that you store your foo values in a database. At the same time, keep your client code as clean and small as possible, so that it doesn't need to filter through the data to figure out which data is valid. You want to minimize your code's dependency on data. Think of it this way: tomorrow you might need to add a new material to your material list. Do you want to modify all your code for that? Or do you want to just add it to your database and have everything keep working? Obviously the latter is the better option. Here is an example of how to design such a system. Of course, this can vary based on your use case or variables, but it is a good guideline. The basic rule of thumb is: your code should depend on data as little as possible.
Let's say you want to create a Bar which has to have a certain foo. In this case, I would create a table BARS which contains all the possible Bars. Example:
ID  NAME  FOO
1   Door  1,4,10
I would also create a table FOOS which contains the details of each foo. For example:
ID  NAME  PROPERTY1  PROPERTY2  ...
1   Oak   Brown      Soft
When you create a Bar:
Bar door = new Bar(Bar.DOOR);
in the constructor you would query the BARS table for the foo IDs, then query the FOOS table to load those materials and assign them to the fields of your new object.
This way, whenever you create a Bar, the materials are loaded from the DB and can be changed without changing any code. You can add as many types of Bar as you like and change material properties as you go. Your client code, however, doesn't change much.
You might ask why we create a separate FOOS table and refer to its IDs in the BARS table. This way, you can modify the properties of each foo as much as you want. You can also share foos between Bars and vice versa, and you only need to change the DB once. Cross-referencing becomes a breeze. I hope this example explains the idea clearly.
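A minimal sketch of the lookup described above, with two in-memory maps standing in for the BARS and FOOS tables so that it runs without a database. The Tables class, the second foo (ID 4, "Plywood") and the setFoo check are assumptions added for illustration; in a real system the constructor would run the two queries instead.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// In-memory stand-ins for the BARS and FOOS tables (hypothetical data).
class Tables {
    static final Map<String, int[]> BARS = new HashMap<>();   // bar name -> foo IDs
    static final Map<Integer, String> FOOS = new HashMap<>(); // foo ID -> foo name

    static {
        FOOS.put(1, "Oak");
        FOOS.put(4, "Plywood");
        BARS.put("Door", new int[] {1, 4});
    }
}

class Bar {
    final String name;
    final List<String> allowedFoos = new ArrayList<>();
    private String foo;

    Bar(String name) {
        this.name = name;
        // "Query" BARS for the foo IDs, then resolve each ID against FOOS.
        int[] fooIds = Tables.BARS.get(name);
        if (fooIds == null) {
            throw new IllegalArgumentException("Unknown bar: " + name);
        }
        for (int id : fooIds) {
            allowedFoos.add(Tables.FOOS.get(id));
        }
    }

    void setFoo(String foo) {
        // The rule lives in the data: only foos listed for this bar are accepted.
        if (!allowedFoos.contains(foo)) {
            throw new IllegalArgumentException(foo + " is not allowed for " + name);
        }
        this.foo = foo;
    }

    String getFoo() {
        return foo;
    }
}
```

Adding a new material then means adding a row to the data, not editing the Bar class.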

You say:
Is the solution to make several hundred classes for every possible
value of foo and insert in each the respective (often trivial) logic
to determine what types of Bar it fits? That doesn't sound right
either.
Why not have separate classes for each type of Foo? Unless you need to define new types of Foo without changing the code, you can model them as plain Java classes. You can go with enums as well, but that does not really give you any advantage, since you still need to update the enum when adding a new type of Foo.
In any case, here is a type-safe approach that gives you compile-time checking of your rules:
// These are nested types of some enclosing class, hence "static".
public static interface Material {}
public static interface Upholstery {}
public static class Oak implements Material {}
public static class Plywood implements Material {}
public static class Cotton implements Upholstery {}
public static class Satin implements Upholstery {}

public static class Furniture<M extends Material, U extends Upholstery> {
    private final M material;
    private final U upholstery;

    public Furniture(M material, U upholstery) {
        this.material = material;
        this.upholstery = upholstery;
    }

    public M getMaterial() {
        return material;
    }

    public U getUpholstery() {
        return upholstery;
    }
}

// One factory method per allowed combination; other combinations have no factory.
public static Furniture<Plywood, Cotton> cottonFurnitureWithPlywood(Plywood plywood, Cotton cotton) {
    return new Furniture<>(plywood, cotton);
}

public static Furniture<Oak, Satin> satinFurnitureWithOak(Oak oak, Satin satin) {
    return new Furniture<>(oak, satin);
}

It depends on what you really want to achieve. Creating objects and passing them around will not magically solve your domain-specific problems.
If you cannot think of any real behavior to add to your objects (except the validation), then it might make more sense to just store your data and read them into memory whenever you want. Even treat rules as data.
Here is an example:
public class Furniture {
    String name;
    Material material;
    Upholstery upholstery;
    //getters, setters, other behavior

    public Furniture(String name, Material m, Upholstery u) {
        //Read rule files from memory or disk and do all the checks
        //Do not instantiate if validation does not pass
        this.name = name;
        material = m;
        upholstery = u;
    }
}
To specify rules, you will then create three plain text files (e.g. using csv format). File 1 will contain valid values for material, file 2 will contain valid values for upholstery, and file 3 will have a matrix format like the following:
upholstery\material  plywood  mahogany  oak
cotton               1        0         1
satin                0        1         0
To check whether a material goes with an upholstery, just look up the corresponding row and column.
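As a small sketch of that lookup, the class below parses a matrix in the layout shown above and answers whether a material/upholstery pair is allowed. The class name CompatibilityMatrix is hypothetical, and the sketch assumes whitespace-separated cells as printed here; for a real CSV file you would split on commas instead.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class CompatibilityMatrix {
    // upholstery -> materials marked with 1 in that upholstery's row
    private final Map<String, Set<String>> allowed = new HashMap<>();

    // Expects a header row of material names, then one row per upholstery.
    CompatibilityMatrix(String text) {
        String[] lines = text.trim().split("\\R");
        String[] header = lines[0].trim().split("\\s+"); // header[0] is the corner label
        for (int r = 1; r < lines.length; r++) {
            String[] cells = lines[r].trim().split("\\s+");
            Set<String> materials = new HashSet<>();
            for (int c = 1; c < cells.length && c < header.length; c++) {
                if (cells[c].equals("1")) {
                    materials.add(header[c]);
                }
            }
            allowed.put(cells[0], materials);
        }
    }

    boolean isAllowed(String material, String upholstery) {
        Set<String> materials = allowed.get(upholstery);
        return materials != null && materials.contains(material);
    }
}
```

The Furniture constructor could call isAllowed(...) and refuse to instantiate when it returns false. Unknown upholstery names are treated as not allowed in this sketch.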
Alternatively, if you have lots of data, you can opt for a database system along with an ORM. Rule tables then can be join tables and come with extra nice features a DBMS may provide (like easy checking for duplicate values). The validation table could look something like:
MaterialID  UpholsteryID  Compatibility_Score
plywood     cotton        1
oak         satin         0
The advantage of using this approach is that you quickly get a working application and you can decide what to do as you add new behavior to your application. And even if it gets way more complex in the future (new rules, new data types, etc) you can use something like the repository pattern to keep your data and business logic decoupled.
Notes about Enums:
Although the solution suggested by @Igwe Kalu solves the specific case described in the question, it is not scalable. What if you want to find which materials go with a given upholstery (the reverse case)? You will need to create another enum which does not add anything meaningful to the program, or add complex logic to your application.

This is a more detailed description of the idea I threw out there in the comment:
Keep Furniture a POJO, i.e., just hold the data, no behavior or rules implemented in it.
Implement the rules in separate classes, something along the lines of:
interface FurnitureRule {
    void validate(Furniture furniture) throws FurnitureRuleException;
}
class ValidMaterialRule implements FurnitureRule {
    // this you can load in whatever way suitable in your architecture -
    // from enums, DB, an XML file, a JSON file, or inject via Spring, etc.
    private Set<String> validMaterialNames;

    @Override
    public void validate(Furniture furniture) throws FurnitureRuleException {
        if (!validMaterialNames.contains(furniture.getMaterial()))
            throw new FurnitureRuleException("Invalid material " + furniture.getMaterial());
    }
}
class UpholsteryRule implements FurnitureRule {
    // Again however suitable to implement/config this
    private Map<String, Set<String>> validMaterialsPerUpholstery;

    @Override
    public void validate(Furniture furniture) throws FurnitureRuleException {
        Set<String> validMaterialNames = validMaterialsPerUpholstery.get(furniture.getUpholstery());
        if (validMaterialNames != null && !validMaterialNames.contains(furniture.getMaterial()))
            throw new FurnitureRuleException("Invalid material " + furniture.getMaterial() + " for upholstery " + furniture.getUpholstery());
    }
}
// and more complex rules if you need to
Then have some service along the lines of FurnitureManager. It's the "gatekeeper" for all Furniture creation/updates:
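class FurnitureManager {
    // configure these via e.g. Spring.
    private List<FurnitureRule> rules;

    public void updateFurniture(Furniture furniture) throws FurnitureRuleException {
        // A plain loop rather than forEach: validate() declares an exception, and if it is
        // a checked one (assumed here) a forEach lambda could not let it propagate.
        for (FurnitureRule rule : rules) {
            rule.validate(furniture);
        }
        // proceed to persist `furniture` in the database or whatever else you do with a valid piece of furniture.
    }
}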

material should be an enum type.
public enum Material {
    MAHOGANY,
    TEAK,
    OAK,
    ...
}
Furthermore you can have a validator for Furniture that contains the logic which types of Furniture make sense, and then call that validator in every method that can change the material or upholstery variable (typically only your setters).
public class Furniture {
    private Material material;
    private Upholstery upholstery; //Could also be String depending on your needs of course

    public void setMaterial(Material material) {
        if (FurnitureValidator.isValidCombination(material, this.upholstery)) {
            this.material = material;
        }
    }
    ...

    private static class FurnitureValidator {
        private static boolean isValidCombination(Material material, Upholstery upholstery) {
            switch (material) {
                case MAHOGANY:
                    return upholstery != Upholstery.COTTON;
                //and so on
                default:
                    // Assumption: materials without an explicit rule accept any upholstery.
                    return true;
            }
        }
    }
}

We are often oblivious to the power inherent in enum types. The Java™ Tutorials clearly state "you should use enum types any time you need to represent a fixed set of constants."
How do you simply make the best of enum in resolving the challenge you presented? - Here goes:
import java.util.Arrays;

public enum Material {
    MAHOGANY( "satin", "velvet" ),
    PLYWOOD( "leather" ),
    // possibly many other materials and their matching fabrics...
    OAK( "some other fabric - 0" ),
    WALNUT( "some other fabric - 0", "some other fabric - 1" );

    private final String[] listOfSuitingFabrics;

    Material( String... fabrics ) {
        this.listOfSuitingFabrics = fabrics;
    }

    String[] getListOfSuitingFabrics() {
        // Defensive copy, so callers cannot modify the enum's own array.
        return Arrays.copyOf( listOfSuitingFabrics, listOfSuitingFabrics.length );
    }

    @Override
    public String toString() {
        // "MAHOGANY" -> "Mahogany"
        return name().substring( 0, 1 ) + name().substring( 1 ).toLowerCase();
    }
}
Let's test it:
import java.util.Arrays;

public class TestMaterial {
    public static void main( String[] args ) {
        for ( Material material : Material.values() ) {
            System.out.println( material + " goes well with " + Arrays.toString( material.getListOfSuitingFabrics() ) );
        }
    }
}

Probably the approach I'd use (because it involves the least amount of code and it's reasonably fast) is to "flatten" the hierarchical logic into a one-dimensional Set of allowed value combinations. Then when setting one of the fields, validate that the proposed new combination is valid. I'd probably just use a Set of concatenated Strings for simplicity. For the example you give above, something like this:
class Furniture {
    private String wood;
    private String upholstery;

    /**
     * Set of all acceptable values, with each combination as a String.
     * Example value: "plywood:cotton"
     */
    private static final Set<String> allowed = new HashSet<>();

    /**
     * Load allowed values in initializer.
     *
     * TODO: load allowed values from DB or config file
     * instead of hard-wiring.
     */
    static {
        allowed.add("plywood:cotton");
        ...
    }

    public void setWood(String wood) {
        if (!allowed.contains(wood + ":" + this.upholstery)) {
            throw new IllegalArgumentException("bad combination of materials!");
        }
        this.wood = wood;
    }

    public void setUpholstery(String upholstery) {
        if (!allowed.contains(this.wood + ":" + upholstery)) {
            throw new IllegalArgumentException("bad combination of materials!");
        }
        this.upholstery = upholstery;
    }

    public void setMaterials(String wood, String upholstery) {
        if (!allowed.contains(wood + ":" + upholstery)) {
            throw new IllegalArgumentException("bad combination of materials!");
        }
        this.wood = wood;
        this.upholstery = upholstery;
    }

    // getters
    ...
}
The disadvantage of this approach compared to other answers is that there is no compile-time type checking. For example, if you try to set the wood to plywoo instead of plywood you won’t know about your error until runtime. In practice this disadvantage is negligible since presumably the options will be chosen by a user through a UI (or through some other means), so you won’t know what they are until runtime anyway. Plus the big advantage is that the code will never have to be changed so long as you’re willing to maintain a list of allowed combinations externally. As someone with 30 years of development experience, take my word for it that this approach is far more maintainable.
With the above code, you'll need to use setMaterials before using setWood or setUpholstery, since the other field will still be null and therefore not an allowed combination. You can initialize the class's fields with default materials to avoid this if you want.

Can I reduce code duplication without unduly compromising efficiency or introducing overheads?

My problem is centered around having code that is easily maintained and efficient. More specifically it revolves around getting data from an SQLite Cursor.
When I first started using cursors I would hard-code something along the lines of mystrvar = cursor.getString(?) where ? would be the offset of the respective column.
I then started using constants that were defined along with the table column names. e.g. I'd have something like :-
// Table Aisles
public static final String AISLES_TABLE_NAME = "aisles";
public static final String AISLES_COLUMN_ID = PRIMARY_KEY_NAME;
public static final String AISLES_COLUMN_ID_FULL = AISLES_TABLE_NAME + AISLES_COLUMN_ID;
public static final int AISLES_COLUMN_ID_INDEX = 0; ........
and, as an example would code something along the lines of :-
mystrvar = cursor.getString(DBHelper.AISLES_COLUMN_ID_INDEX);
This was an improvement, but had the flaw of not being that good at coping with joined tables.
I then became aware of cursor.getColumnIndex(), but suspected that relying solely on it would have overheads that could be circumvented.
What I have done is to include code that has sparse use of getColumnIndex(). It sets offset variables via getColumnIndex() just once in an activity/custom cursor adapter and subsequently uses the respective offset variable which is the cursor offset for the respective column.
The following is an example, split into 3 chunks: first the variable definitions, second a method that sets the variables, and third the actual data extraction from the cursor :-
1) variable definitions :-
public class Database_Inspector_AislesDB_Adapter extends CursorAdapter {
    // Variables to store aisles table offsets as obtained via the defined column names by
    // call to setAislesOffsets (aisles_aisleid_offset set -1 to act as notdone flag )
    public static int aisles_aisleid_offset = -1;
    public static int aisles_aislename_offset;
    public static int aisles_aisleorder_offset;
    public static int aisles_aisleshopref_offset;

    public Database_Inspector_AislesDB_Adapter(Context context, Cursor cursor, int flags) {
        super(context, cursor, 0);
        setAislesOffsets(cursor); //** Calls method to set offsets
        ........
    }
2) Method that sets the offsets just once (returns virtually immediately if they have already been set)
// Set Aisles Table query offsets into returned cursor, if not already set
public void setAislesOffsets(Cursor cursor) {
    if (aisles_aisleid_offset != -1) {
        return;
    }
    aisles_aisleid_offset = cursor.getColumnIndex(ShopperDBHelper.AISLES_COLUMN_ID);
    aisles_aislename_offset = cursor.getColumnIndex(ShopperDBHelper.AISLES_COLUMN_NAME);
    aisles_aisleorder_offset = cursor.getColumnIndex(ShopperDBHelper.AISLES_COLUMN_ORDER);
    aisles_aisleshopref_offset = cursor.getColumnIndex(ShopperDBHelper.AISLES_COLUMN_SHOP);
}
3) example use of offsets
textviewaisleid.setText(cursor.getString(aisles_aisleid_offset));
textviewaislesaislename.setText(cursor.getString(aisles_aislename_offset));
textviewaislesorder.setText(cursor.getString(aisles_aisleorder_offset));
textviewaisleshopref.setText(cursor.getString(aisles_aisleshopref_offset));
However, the above coding has to be repeated for each activity/adapter that uses the table. There are 7 tables with 56 columns, and joined tables need combinations. Is there a way to use an equivalent of global variables (I'm assuming shared preferences would be more of an overhead)? That is, could I set the offsets just once from anywhere and then access them from anywhere (by anywhere I mean from within any activity or adapter)? To reiterate, the aim is mainly to reduce maintenance overheads/issues, with consideration for run-time efficiency.

Each query can have different column indexes, so it would not be a good idea to use fixed per-table column indexes.
There is no performance problem with getColumnIndex(), especially not when you're returning only a single row. (But to avoid additional checks for missing or wrong columns, you should use getColumnIndexOrThrow(), if possible.)
To reduce the amount of typing, write a helper function that calls both getColumnIndexOrThrow() and getString()/getXxx().
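A minimal sketch of such a helper follows. To keep it runnable outside Android, it is written against a small stand-in interface, ColumnCursor, which declares only the two methods it needs with the same names and parameter types as on android.database.Cursor. ColumnCursor, CursorUtil and SingleRowCursor are hypothetical names; in an app the helper would take a real Cursor instead.

```java
import java.util.List;

// Stand-in for the two android.database.Cursor methods used by the helper.
interface ColumnCursor {
    int getColumnIndexOrThrow(String columnName);
    String getString(int columnIndex);
}

final class CursorUtil {
    private CursorUtil() {}

    // Resolves the column by name on each call, so it works for any query or join.
    static String getString(ColumnCursor cursor, String columnName) {
        return cursor.getString(cursor.getColumnIndexOrThrow(columnName));
    }
}

// Tiny single-row cursor, only for trying the helper out.
class SingleRowCursor implements ColumnCursor {
    private final List<String> names;
    private final List<String> values;

    SingleRowCursor(List<String> names, List<String> values) {
        this.names = names;
        this.values = values;
    }

    @Override
    public int getColumnIndexOrThrow(String columnName) {
        int index = names.indexOf(columnName);
        if (index < 0) {
            throw new IllegalArgumentException("No such column: " + columnName);
        }
        return index;
    }

    @Override
    public String getString(int columnIndex) {
        return values.get(columnIndex);
    }
}
```

With a helper like this, the call sites shrink to something like CursorUtil.getString(cursor, ShopperDBHelper.AISLES_COLUMN_NAME), and no per-adapter offset variables need to be maintained.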

It appears that setting the column offset values once per activity cannot be improved upon without introducing overheads.

Java tablemodel hashmap vs list

I have a custom AbstractTableModel
That model stores the data in a HashMap. So for my method for getValueAt(int rowIndex, int columnIndex)
I do
new ArrayList<Object>(data.values()).get(index);
However my data has over 2000 entries, so doing this every single time whenever I have to get the data for my table creates a huge performance hit.
So what solution can you recommend?
Should I try using a List to store all my data instead of a HashMap? What is the accepted standard for storing data when using table models?
Thanks to anyone for their suggestions, and I apologize for what might be a stupid question, but I am not too great when it comes to tables and how to store data in them.

A HashMap doesn't generally make a good fit for a table model because the table needs to be able to access data at a row/column location.
An ArrayList of ArrayLists is a reasonable way to store a table model. This still gives you fast access. Getting to a particular row is a constant-time lookup, and then getting the column is also a constant-time lookup.
If you don't want the overhead of the lists, you can always store the data in a 2D array.
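A minimal sketch of the list-of-lists variant, assuming a fixed number of columns per row; the class name ListTableModel and the addRow method are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;
import javax.swing.table.AbstractTableModel;

class ListTableModel extends AbstractTableModel {
    // Outer list = rows, inner list = the cells of one row.
    private final List<List<Object>> rows = new ArrayList<>();
    private final int columnCount;

    ListTableModel(int columnCount) {
        this.columnCount = columnCount;
    }

    void addRow(List<Object> row) {
        if (row.size() != columnCount) {
            throw new IllegalArgumentException("Expected " + columnCount + " cells");
        }
        rows.add(new ArrayList<>(row));
        // Tell any attached JTable that a row was appended.
        fireTableRowsInserted(rows.size() - 1, rows.size() - 1);
    }

    @Override
    public int getRowCount() {
        return rows.size();
    }

    @Override
    public int getColumnCount() {
        return columnCount;
    }

    @Override
    public Object getValueAt(int rowIndex, int columnIndex) {
        // Two index lookups, no copying.
        return rows.get(rowIndex).get(columnIndex);
    }
}
```

If the data arrives in a HashMap, it can be copied into the model once, row by row, when the model is built, so that getValueAt never has to touch the map.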

Yes, the code you cite is going to perform badly: for every cell you render, you're creating a new ArrayList from the values in your Map (you can do the math).
At the very least, do the list creation once, probably in the constructor of your table model, like this (which assumes you've got some arbitrary object, which you don't mention in your question, as the values of the map):
public class MyTableModel extends AbstractTableModel
{
    private static final int COLUMN_0 = 0;
    private static final int COLUMN_1 = 1;
    private List<MyObject> data;

    public MyTableModel(Map<?, MyObject> data)
    {
        this.data = new ArrayList<MyObject>(data.values());
    }

    public Object getValueAt(int rowIndex, int columnIndex)
    {
        switch (columnIndex)
        {
            case COLUMN_0: return this.data.get(rowIndex).getColumn0();
            case COLUMN_1: return this.data.get(rowIndex).getColumn1();
            ...
            case COLUMN_N: return this.data.get(rowIndex).getColumnN();
        }
        throw new IllegalStateException("Unhandled column index: " + columnIndex);
    }
}

How to perform joins in Java without a database

I need to perform joins on 2 tables (that I have read from 2 CSV files) without using a database. I have no experience with collections (List, ArrayList). If anyone can give a detailed piece of sample code for any one type of join, that would be helpful.
For example I have 2 lists :
a=[2,3,4]
b=[3,4,5]
If it is an inner join
output: [3,4]
Tried so far:
for i in a:
    for j in b:
        if i == j:
            print(i)

Assuming that you have the following CSV files:
id,name,description
1,Foo,FooBar
2,Bar,BarFo
3,Hey,Ho
and the second one:
id,year
2,1990
1,1923
Then you could have the following structures (I'm skipping the constructors and methods for now):
public class Item {
    public String name;
    public String description;
}
and the second:
public class Date {
    public final int year;
}
Then you could have a third one:
public class Joined {
    public final Item item;
    public Date date; // not final: filled in later, when the second CSV is read
}
And then you could have a Map<Integer,Joined>, and you can read the first CSV and create the Joined objects with only the Item part filled out, then read the second CSV and you could fill up the Date part of the Joined object.
In this joining part, you can decide which joining type you want to implement.
If you have a different key, then you have to change the key of the Map, or you may need to create a new class if you have a complex key.
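A compact sketch of the Map-based inner join on the two sample files above. To stay short it keeps each row as a String[] instead of the Item/Date/Joined classes, and it assumes simple CSV (a header line, the id in the first column, no quoted commas). The class and method names are made up for this example.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

class CsvJoin {
    // Parses "id,field,..." lines (header skipped) into id -> remaining fields.
    static Map<Integer, String[]> readById(String csv) {
        Map<Integer, String[]> rows = new LinkedHashMap<>();
        String[] lines = csv.trim().split("\\R");
        for (int i = 1; i < lines.length; i++) {
            String[] fields = lines[i].trim().split(",");
            rows.put(Integer.parseInt(fields[0].trim()),
                     Arrays.copyOfRange(fields, 1, fields.length));
        }
        return rows;
    }

    // Inner join: keeps only the ids that appear in both inputs.
    static Map<Integer, String[]> innerJoin(Map<Integer, String[]> left,
                                            Map<Integer, String[]> right) {
        Map<Integer, String[]> joined = new LinkedHashMap<>();
        for (Map.Entry<Integer, String[]> entry : left.entrySet()) {
            String[] r = right.get(entry.getKey());
            if (r == null) {
                continue; // a left join would keep the row here, with empty right-hand fields
            }
            String[] l = entry.getValue();
            String[] both = Arrays.copyOf(l, l.length + r.length);
            System.arraycopy(r, 0, both, l.length, r.length);
            joined.put(entry.getKey(), both);
        }
        return joined;
    }
}
```

Calling innerJoin(readById(firstCsv), readById(secondCsv)) on the sample data keeps ids 1 and 2 and drops id 3, which has no year. The lookup in the right-hand map is what replaces the nested loop over both lists.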
