Flattened data into list of trees in java - java

I have data in tabular format like below:
Activity | ActivityID | ParentID
a1 | 1 |
a2 | 2 | 1
a3 | 3 |
a4 | 4 | 3
a5 | 5 | 2
a6 | 6 | 3
a7 | 7 | 1
I want to represent it like below in java:
a1 -> a2 -> a5
   -> a7
a3 -> a4
   -> a6
Basically, a List of tree objects where a1 and a3 are the roots, each having 2 children (a2, a7 and a4, a6 respectively), and a2 has one child (a5). The trees are not necessarily binary, and the data set can be big, with one parent having 50-100 children.
What would be the most effective way to do this in Java?

For a list of trees, you can store your data in a structure like this:
final class Node {
    final int id;
    final String name;
    final List<Node> children = new ArrayList<>();
    Node(int id, String name) {
        this.id = id;
        this.name = name;
    }
}
So the final structure is a List<Node>.

The data structure you are looking for is called an N-ary tree (see Wikipedia and NIST).
If you are familiar with binary trees (at most two children per node), it is easy to generalize to an n-ary tree (n children per node).
In your case you have a forest of n-ary trees (a list of n-ary trees), or you can treat it as one big tree with a common root, where your actual trees begin at level one.
The simplest way is to create a Node{info, listChildren} with an info field and a list (an ArrayList, perhaps) that holds the children, plus an NTree class that contains methods such as addChild. Generally, a recursive function checks some condition to choose the right path along which to insert a new node.
Examples: N-ary tree source code and binary tree source code.
If you want to improve or optimise your implementation, you have to consider some other points, such as:
The type of the list of children in each node, which depends on the possible number of children: if the number is small a simple array will do, if it is big a hash table may be better, etc.
Whether your tree changes or not (dynamic, with inserts and deletes).
How the tree will be traversed.
Avoiding recursion: replace the recursive method with an iterative one.
The construction steps: the usual way is, for each element, to start from the root and follow the right path until reaching the leaf where the new child should be inserted. You may be able to insert the children directly in the right place; it depends on your data and how you want to organize your tree.
You can also consider using an array to store the tree; this too depends strongly on your data.
I hope this helps you get started and dig further.
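To illustrate the "avoid recursion" point above, here is a minimal sketch of a pre-order walk over a forest that uses an explicit stack. The names TreeNode, ForestTraversal and preOrder are made up for this example and are not part of the question or of any library; the node layout follows the Node{info, listChildren} idea described above.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Hypothetical n-ary node: one payload field plus a list of children.
final class TreeNode {
    final String info;
    final List<TreeNode> children = new ArrayList<>();

    TreeNode(String info) {
        this.info = info;
    }

    // Appends a child and returns this node so calls can be chained.
    TreeNode addChild(TreeNode child) {
        children.add(child);
        return this;
    }
}

final class ForestTraversal {
    // Pre-order walk over a list of trees using an explicit stack instead of recursion.
    static List<String> preOrder(List<TreeNode> roots) {
        List<String> visited = new ArrayList<>();
        Deque<TreeNode> stack = new ArrayDeque<>();
        // Push roots in reverse so the first root is popped first.
        for (int i = roots.size() - 1; i >= 0; i--) {
            stack.push(roots.get(i));
        }
        while (!stack.isEmpty()) {
            TreeNode node = stack.pop();
            visited.add(node.info);
            // Push children in reverse so they are visited left to right.
            for (int i = node.children.size() - 1; i >= 0; i--) {
                stack.push(node.children.get(i));
            }
        }
        return visited;
    }

    public static void main(String[] args) {
        TreeNode a1 = new TreeNode("a1")
                .addChild(new TreeNode("a2").addChild(new TreeNode("a5")))
                .addChild(new TreeNode("a7"));
        TreeNode a3 = new TreeNode("a3")
                .addChild(new TreeNode("a4"))
                .addChild(new TreeNode("a6"));
        System.out.println(preOrder(Arrays.asList(a1, a3)));
    }
}
```

The stack depth is bounded by the number of pending nodes rather than by the call stack, which may matter for very deep trees.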

Here is a short and straightforward algorithm. It's the quick-and-dirty version; you can rewrite it into a better one as you wish:
I am assuming there is a table array such that table[i].activityID gives the i-th activity's id, table[i].parentID gives that activity's parent id, and so on.
[Note: I'm also assuming that if an element has no parent, its parent id is -1; in your example, the parent ids of a1 and a3 would be -1. This way we can tell which activities have no parent. If you have a better idea, you can use that.]
Let us first define the class Activity:
import java.util.ArrayList;
import java.util.List;

class Activity implements Comparable<Activity> {
    int id;
    String name;
    List<Activity> children;

    Activity(int id, String name) {
        this.id = id;
        this.name = name;
        this.children = new ArrayList<>();
    }

    @Override
    public int compareTo(Activity activity) {
        // Integer.compare avoids the overflow that plain subtraction can cause
        return Integer.compare(this.id, activity.id);
    }
}
Now that the Activity class is defined, we create the activities and put them in a TreeMap keyed by id (just to keep track of them).
Note that activities whose parent id is -1 are the roots of the trees [and, as in your example, several trees can be formed from your table]. We keep track of the roots while linking the activities.
// to keep track of all activities, keyed by activity id
TreeMap<Integer, Activity> activities = new TreeMap<>();
for (int i = 0; i < table.length; i++) {
    Activity a = new Activity(table[i].activityID, table[i].activityName);
    activities.put(a.id, a);
}
// to keep track of root activities
List<Activity> roots = new ArrayList<>();
for (int i = 0; i < table.length; i++) {
    Activity activity = activities.get(table[i].activityID);
    // if the activity has a parent,
    // add it to the parent's children list
    if (table[i].parentID != -1) {
        Activity parent = activities.get(table[i].parentID);
        // accessing the field directly is a poor way of writing code;
        // it will do for now, rewrite it later
        parent.children.add(activity);
    } else {
        // the activity does not have a parent,
        // so it is a root
        roots.add(activity);
    }
}
Now, we're going to create a traverse method, which will help to traverse the tree node in order.
void traverse(Activity u) {
System.out.println(u.name);
for(Activity v: u.children) {
traverse(v);
}
}
Now, you can call this function using the root activities:
for(Activity rootActivity: roots) {
traverse(rootActivity);
}
That's it...

Related

Sorting parents and childs using Java

I have an "Item" class that contains the following fields (in short): id (related to the primary key of the Item table on SQL Server), description, sequence (non-null integer), and link (a reference to the id of the parent object; can be null).
I would like to sort by using Java as follows:
Id Sequence Link Description
1 1 null Item A
99 ..1 1 Son of A, first of the sequence
57 ..2 1 Son of A, second of the sequence
66 ..3 1 Son of A, third of the sequence
2 2 null Item B
3 3 null Item C
...
(I put the dots for better visualization)
That is, I would like the children of a certain item to come directly below their parent, ordered by the "sequence" field.
I tried using the comparator, but it failed:
public class SequenceComparator implements Comparator<Item> {
@Override
public int compare(Item o1, Item o2) {
String x1 = o1.getSequence().toString();
String x2 = o2.getSequence().toString();
int sComp = x1.compareTo(x2);
if (sComp != 0) {
return sComp;
} else {
x1 = o1.getLink().toString();
x2 = o2.getLink() == null?"":o2.getLink().toString();
return x1.compareTo(x2);
}
}
}
How can I do that?
New answer: I don’t think you want one comparator to control the complete sorting, because when sorting children you need the sequence of the parent, and you don’t have an easy or natural access to that from within the comparator.
Instead I suggest a sorting in a number of steps:
Put the items into groups by parent items. So one group will be the item with id 1 and all its children. Items with no children will be in a group on their own.
Sort each group so the parent comes first and then all the children in the right order.
Sort the groups by the parent’s sequence.
Concatenate the sorted groups into one list.
Like this, using both Java 8 streams and List.sort():
// group by parent id
Map<Integer, List<Item>> intermediate = input.stream()
.collect(Collectors.groupingBy(i -> i.getLink() == null ? Integer.valueOf(i.getId()) : i.getLink()));
// sort each inner list so that parent comes first and then children by sequence
for (List<Item> innerList : intermediate.values()) {
innerList.sort((i1, i2) -> {
if (i1.getLink() == null) { // i1 is parent
return -1; // parent first
}
if (i2.getLink() == null) {
return 1;
}
return i1.getSequence().compareTo(i2.getSequence());
});
}
// sort lists by parent’s sequence, that is, sequence of first item
List<Item> result = intermediate.values().stream()
.sorted(Comparator.comparing(innerList -> innerList.get(0).getSequence()))
.flatMap(List::stream)
.collect(Collectors.toList());
The output is (leaving out the item description):
1 1 null
99 ..1 1
57 ..2 1
66 ..3 1
2 2 null
3 3 null
(This output was produced with a toString method that printed the dots when converting an item with a parent to a String.)
If you cannot use Java 8, I still believe the general idea of the steps mentioned above will work, only some of the steps will require a little more code.
I deleted my previous answer since I had misunderstood the part about what getLink() returns and then decided that that answer wasn’t worth trying to salvage.
Edit:
I am actually ignoring this piece from the documentation of Collectors.groupingBy(): “There are no guarantees on the …, mutability, of the … List objects returned.” It still works with my Java 8. If immutability of the list should prevent sorting, the solution is to create a new ArrayList containing the same items.
With thanks to Stuart Marks for the inspiration, the comparator for sorting the inner lists need not be as clumsy as above. The sorting can be written in this condensed way:
innerList.sort(Comparator.comparing(itm -> itm.getLink() == null ? null : itm.getSequence(),
Comparator.nullsFirst(Integer::compare)));
Given that there are only two layers in the hierarchy, this boils down to a classic multi-level sort. There are two kinds of items, parents and children, distinguished by whether the link field is null. The trick is that the sorting at each level isn't on a particular field. Instead, the value to sort on depends on what kind of item it is.
The first level of sorting should be on the parent value. The parent value of a parent item is its sequence, but the parent value of a child item is the sequence of the parent it's linked to. Child items are linked to parent items via their id, so the first thing we need to do is to build up a map from ids to sequence values of parent nodes:
Map<Integer, Integer> idSeqMap =
list.stream()
.filter(it -> it.getLink() == null)
.collect(Collectors.toMap(Item::getId, Item::getSequence));
(This assumes that ids are unique, which is reasonable as they're related to the table primary key.)
Now that we have this map, you can write a lambda expression that gets the appropriate parent value from the item. (This assumes that all non-null link values point to existing items.) This is as follows:
(Item it) -> it.getLink() == null ? it.getSequence() : idSeqMap.get(it.getLink())
The second level of sorting should be on the child value. The child value of a parent item is null, so nulls will need to be sorted before any non-null value. The child value of a child item is its sequence. A lambda expression for getting the child value is:
(Item it) -> it.getLink() == null ? null : it.getSequence()
Now, we can combine these using the Comparator helper functions introduced in Java 8. The result can be passed directly to the List.sort() method.
list.sort(Comparator.comparingInt((Item it) -> it.getLink() == null ? it.getSequence() : idSeqMap.get(it.getLink()))
.thenComparing((Item it) -> it.getLink() == null ? null : it.getSequence(),
Comparator.nullsFirst(Integer::compare))
.thenComparingInt(Item::getId));
The first level of sorting is pretty straightforward; just pass the first lambda expression (which extracts the parent value) to Comparator.comparingInt.
The second level of sorting is a bit tricky. I'm assuming the result of getLink() is a nullable Integer. First, we have to extract the child value using the second lambda expression. This results in a nullable value, so if we were to pass this to thenComparing we'd get a NullPointerException. Instead, thenComparing allows us to pass a secondary comparator. We'll use this to handle nulls. For this secondary comparator we pass
Comparator.nullsFirst(Integer::compare)
This compares Integer objects, with nulls sorted first, and non-nulls compared in turn using the Integer.compare method.
Finally, we compare id values as a last resort. This is optional if you're using this comparator only for sorting; duplicates will end up next to each other. But if you use this comparator for a TreeSet, you'll want to make sure that different items never compare equal. Presumably a database id value would be sufficient to differentiate all unique items.
Considering your data structure is a Tree (with null as the root node) with no cycles:
You have to walk up the tree for both o1 and o2 until you find a common ancestor. Once you do, take one step back along both branches to find their relative order (with Sequence)
Finding the common ancestor may be tricky to do, and I don't know if it is possible in linear time, but certainly possible in O(n log n) time (with n the length of the branches)
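One way to sketch the common-ancestor idea is to compare, for each item, the chain of sequence values from its top-level ancestor down to itself: the first position where the two chains differ is exactly one step below the common ancestor. The Item class below is a hypothetical stand-in for the one in the question, and HierarchySort, sequencePath and comparator are names invented for this sketch. It assumes there are no cycles, every non-null link points to an existing item, and sequence values are unique among siblings.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the Item class from the question.
final class Item {
    private final Integer id;
    private final Integer sequence;
    private final Integer link; // id of the parent, or null for a top-level item

    Item(Integer id, Integer sequence, Integer link) {
        this.id = id;
        this.sequence = sequence;
        this.link = link;
    }

    Integer getId() { return id; }
    Integer getSequence() { return sequence; }
    Integer getLink() { return link; }
}

final class HierarchySort {
    // Sequence values from the top-level ancestor down to the item itself.
    static List<Integer> sequencePath(Item item, Map<Integer, Item> byId) {
        List<Integer> path = new ArrayList<>();
        for (Item cur = item; cur != null;
                cur = (cur.getLink() == null) ? null : byId.get(cur.getLink())) {
            path.add(cur.getSequence());
        }
        Collections.reverse(path);
        return path;
    }

    // Builds a comparator that places every item directly below its ancestors.
    static Comparator<Item> comparator(List<Item> items) {
        Map<Integer, Item> byId = new HashMap<>();
        for (Item it : items) {
            byId.put(it.getId(), it);
        }
        return (a, b) -> {
            List<Integer> pa = sequencePath(a, byId);
            List<Integer> pb = sequencePath(b, byId);
            int common = Math.min(pa.size(), pb.size());
            for (int i = 0; i < common; i++) {
                int c = pa.get(i).compareTo(pb.get(i));
                if (c != 0) {
                    return c; // first difference below the common ancestor
                }
            }
            // One path is a prefix of the other: the ancestor comes first.
            return Integer.compare(pa.size(), pb.size());
        };
    }
}
```

Usage would be something like items.sort(HierarchySort.comparator(items)). Each comparison walks both branches, so for deep hierarchies it may be worth caching the paths per item.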

Generating Hierarchy using Maps and path variable

I've got a Family class that pulls data from a Postgres database. The class looks something like this:
@Entity
public class Family {
    @Id
    private long id;
    private String firstName;
    private String lastName;
    private Long parentId;
    private String familyPath;
    private List<Family> children;
    // getters and setters
}
In the database, I have their relation to each other stored as a period-delimited string. So for example, if Bob is the child of Sue, the tree column would look like: "bob.sue". This path is stored as part of the family object in the familyPath variable.
CLARIFICATION
familyPath is a path based on unique IDs for each row in the database. So a path may look like "1.2.3" where the last number is the current row.
"1.2.4" is another potential path. so rows with IDs 3 and 4 are children of 2, etc.
In my code I query the database for all family members in the data, so I have a set of every member of the family in the database. My goal is to generate a set of all family members as a hierarchical structure using this initial, flat set. So, in the end if I call getChildren on Bob, I get a list back containing Sue and any other children.
My Solution:
First, I iterate through my list of families, find what I call the root members (those at the top level of the family path), and move them into a separate list. So now I have a list of top-level family members, and a list of everyone else.
Then, for each member in the top level list, I call the following recursive method:
private Family familyTree(Family root, List<Family> members) {
List<Family> children = new ArrayList<>();
for (Family f : members) {
if (isChildOf(f, root)) {
children.add(familyTree(f, members));
}
}
root.setChildren(children);
return root;
}
private boolean isChildOf(Family a, Family b) {
String pCPath = a.getFamilyPath();
String pPPath = b.getFamilyPath();
return pCPath.indexOf('.') >= 0
&& pCPath.substring(0, pCPath.lastIndexOf('.')).equals(pPPath);
}
and save the output to a list. This generates the desired results.
My Question
However, I feel that this recursive method is very expensive (n^2). I'm thinking that there may be a more efficient way to generate this hierarchy using sets, maps and the Family object's familyPath variable, but I keep getting stuck in multiple iterative loops. Does anyone have any thoughts?
Option 1 - Single pass
private Family familyTree(Family root, List<Family> members) {
Map<Long, List<Family>> parentMap = new HashMap<>();
// Assuming root is not contained in members
root.children = new ArrayList<>();
parentMap.put(root.id, root.children);
// Assign each member to a child list
for (Family member : members) {
// Put the family member in the right child list
Long parentId = member.getParentId();
List<Family> parentChildren = parentMap.get(parentId);
if (parentChildren == null) {
parentChildren = new ArrayList<>();
parentMap.put(parentId, parentChildren);
}
parentChildren.add(member);
// Get or create the child list of the family member
List<Family> ownChildren = parentMap.get(member.id);
if (ownChildren == null) {
ownChildren = new ArrayList<>();
parentMap.put(member.id, ownChildren);
}
member.children = ownChildren;
}
return root;
}
private Long getParentId() {
// left as an exercise...
}
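If the parentId field is not populated, the parent id can probably be derived from familyPath. The following is a sketch only, based on the question's clarification that the path is a dot-separated list of ids ending with the row's own id (for example "1.2.3"). FamilyPaths and parentIdFromPath are hypothetical names, not part of the original code.

```java
final class FamilyPaths {
    // Returns the parent id encoded in a path such as "1.2.3" (here 2),
    // or null for a top-level path such as "1" that has no separator.
    // Assumes the path is well formed: ids separated by single dots.
    static Long parentIdFromPath(String familyPath) {
        int last = familyPath.lastIndexOf('.');
        if (last < 0) {
            return null; // no separator, so this is a root member
        }
        // Start of the second-to-last segment (prev is -1 if there is only one dot).
        int prev = familyPath.lastIndexOf('.', last - 1);
        return Long.valueOf(familyPath.substring(prev + 1, last));
    }
}
```

A getParentId() implementation could then delegate to FamilyPaths.parentIdFromPath(familyPath).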
Option 1.b - Single pass over all members including roots
private List<Family> familyTree(List<Family> members) {
List<Family> roots = new ArrayList<>();
Map<Long, List<Family>> parentMap = new HashMap<>();
// Assign each member to a child list
for (Family member : members) {
// Put the family member in the right child list
Long parentId = member.getParentId();
if (parentId == null) {
// a root member
roots.add(member);
} else {
// a non-root member
List<Family> parentChildren = parentMap.get(parentId);
if (parentChildren == null) {
parentChildren = new ArrayList<>();
parentMap.put(parentId, parentChildren);
}
parentChildren.add(member);
}
// Get or create the child list of the family member
List<Family> ownChildren = parentMap.get(member.id);
if (ownChildren == null) {
ownChildren = new ArrayList<>();
parentMap.put(member.id, ownChildren);
}
member.children = ownChildren;
}
return roots;
}
Option 2 - Add a reference to the parent
Your Family class should have a private Family parent attribute. You will then be able to do a single query per family "level". That is:
1. get all children of Sue
2. get all children of people from (1) and assign them to the proper parent
3. etc.
Option 3 - Nested Set Model of Hierarchies
The database schema can be modified to retrieve whole sub-trees in a single query. The trick is to give each tree node a "left" and a "right" value. These values establish a range for the "left" and "right" values of a node's children.
Selecting a full tree can then be done like this:
SELECT child.id, ...
FROM family AS child, family AS parent
WHERE child.lft BETWEEN parent.lft AND parent.rgt
AND parent.id = 111
ORDER BY child.lft;
There are many other hierarchical operations which can be done very easily with such a schema. See this post and Joe Celko's Trees and Hierarchies in SQL for Smarties for more information.
Finally, your model only considers a single parent for each family member which seems strange.

Create recursive structure from flatten DFS structure

Problem
I have the following tree:
      2
     / \
    3   5
   /   / \
  6   4   1
that is represented in the following way and order:
id parent
------------
2 null
3 2
6 3
5 2
4 5
1 5
Purpose:
Store this flattened tree in a recursive structure in O(n). O(n*log(n)) is acceptable, but not very good. I know how to solve it in O(n^2), but I stored the data in that DFS order to be able to "parse" it more efficiently. E.g.:
class R {
int id;
List<R> children;
}
that looks like this in a JSON form:
{
    id: 2,
    children: [
        {
            id: 3,
            children: [ ... ]
        },
        {
            id: 5,
            children: [ ... ]
        }
    ]
}
How can I do this? The programming language is not important, because I can translate it in Java.
Java code:
R r = new R();
Map<Long, Line> map = createMap2();
List<Line> vals = new ArrayList<Line>(map.values());
r.id = vals.get(0).id;
vals.remove(0);
r.children = createResource(vals, r.id);
...
private static List<R> createResource(List<Line> l, Long pid) {
List<R> lr = new ArrayList<R>();
if ( l.size() > 0 ) {
Long id = l.get(0).id;
Long p = l.get(0).pid;
l.remove(0);
if ( pid.equals(p) ) {
R r = new R();
r.id = id;
r.children = createResource(l, id);
lr.add(r);
}
else {
return createResource(l, pid); // of course, this is not ok
}
}
return lr;
}
The problem in the code above is that only 2, 3 and 6 are stored in the recursive structure (class R). I want to store the whole flattened tree structure (many Line objects) in that recursive structure (R object), not only some of the nodes.
P.S.: The problem is simplified. I'm not interested in a specific solutions because there are many fields involved and thousands of entries. I am also interested in solutions that work fine for the worst case scenarios (different kind of trees) because this is the user's guarantee.
What about something like this? In the first pass, hash the parents as arrays of their children and identify the root; in the second, begin with the root and for each of its children, insert a new object, with its own children and so on:
To take your example, the first pass would generate
parent_hash = {2:[3,5], 3:[6], 5:[4,1]}
root = 2
The second pass would go something like:
object 2 -> object 3 -> object 6
         -> object 5 -> object 4
                     -> object 1
done
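The two passes described above could be sketched in Java roughly as follows. TwoPassTree, RNode, build and attach are names chosen for this example (RNode mirrors the question's R class), and the input is assumed to be an array of {id, parentId} pairs with a null parentId marking the single root.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class TwoPassTree {
    // Recursive structure modelled on the question's class R.
    static final class RNode {
        long id;
        List<RNode> children = new ArrayList<>();
    }

    // Each row is {id, parentId}; parentId is null for the root (assumed unique).
    static RNode build(Long[][] rows) {
        // First pass: group child ids by parent id and remember the root.
        Map<Long, List<Long>> childrenOf = new HashMap<>();
        Long rootId = null;
        for (Long[] row : rows) {
            if (row[1] == null) {
                rootId = row[0];
            } else {
                childrenOf.computeIfAbsent(row[1], k -> new ArrayList<>()).add(row[0]);
            }
        }
        if (rootId == null) {
            throw new IllegalArgumentException("no root row found");
        }
        // Second pass: create objects top-down, starting from the root.
        return attach(rootId, childrenOf);
    }

    private static RNode attach(Long id, Map<Long, List<Long>> childrenOf) {
        RNode node = new RNode();
        node.id = id;
        for (Long childId : childrenOf.getOrDefault(id, Collections.emptyList())) {
            node.children.add(attach(childId, childrenOf));
        }
        return node;
    }
}
```

Every row is touched once in each pass and the map lookups take expected constant time, so the whole build is expected O(n) and does not depend on the rows being in DFS order. For very deep trees the recursion in attach may need to be replaced by an explicit stack.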
The problem with your code is that once an entry doesn't satisfy p == pid condition it is lost forever. Instead of losing entries you should break the loop and return immediately. The offending entry shall also be returned and handled by a proper instance of R upstream.
You can easily represent the whole tree in an array, since each node of the tree can be represented by an index in the array. For a binary tree, the children of index i would be at index 2*i+1 and index 2*i+2. It would then be simple to convert the array to any other representation. The array itself would be a space-efficient representation for balanced trees, but would waste a lot of space for very unbalanced trees. (This should not matter unless you're dealing with a very large amount of data.)
If you need a memory-efficient way for large unbalanced trees, then it would make sense to use the standard Node-representation of trees. To convert from your list, you could use a HashMap as גלעד ברקן suggested. However, if the id's of the nodes are mostly continuous (like the example where they're from 1 to 6), you could also just use an array where each index of the array i is used to store a Node with an id of i. This will let you easily find a parent node and assign it children nodes as they're created.
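As a sketch of the array-indexed variant just described, assuming ids are small positive integers in the range 1..maxId and that a parent value of 0 marks the root (both conventions are made up for this example, as are the names ArrayIndexedTree, RNode and build):

```java
import java.util.ArrayList;
import java.util.List;

final class ArrayIndexedTree {
    static final class RNode {
        final int id;
        final List<RNode> children = new ArrayList<>();

        RNode(int id) {
            this.id = id;
        }
    }

    // ids[i] and parents[i] describe row i; parents[i] == 0 marks the root.
    // Assumes every id lies in 1..maxId and every parent id also appears in ids.
    static RNode build(int[] ids, int[] parents, int maxId) {
        // Slot i holds the node whose id is i, so lookups are plain array reads.
        RNode[] byId = new RNode[maxId + 1];
        for (int id : ids) {
            byId[id] = new RNode(id);
        }
        RNode root = null;
        for (int i = 0; i < ids.length; i++) {
            if (parents[i] == 0) {
                root = byId[ids[i]];
            } else {
                byId[parents[i]].children.add(byId[ids[i]]);
            }
        }
        return root;
    }
}
```

Because all nodes are created before any of them is linked, the rows can arrive in any order. The array wastes space if the ids are sparse, in which case a HashMap is the more natural choice.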
(cf. my Trees tutorial for storing trees as arrays.)
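I found a simple solution based on that "DFS" order.
This approach works whether I use a list of "Line" objects or a map.
private static List<R> createResource(List<Line> l, Long pid) {
    List<R> lr = new ArrayList<R>();
    // l is consumed from the front: thanks to the DFS order,
    // the next child of pid (if there is one) is always at index 0
    while ( !l.isEmpty() && pid.equals(l.get(0).pid) ) {
        Line line = l.remove(0);
        R r = new R();
        r.id = line.id;
        // the recursive call consumes the whole subtree of this child
        r.children = createResource(l, line.id);
        lr.add(r);
    }
    // the first entry that is not a child of pid stays in the list for the caller
    return lr;
}
It looks like O(n^2) because there is a loop plus recursion over all elements, but it is O(n). Thanks to the DFS order, the next element for which createResource is called is always in the first position (index 0, so it is found in O(1)). Because the recursion consumes every element exactly once, the complexity is O(n). (Strictly, remove(0) is O(1) only on a LinkedList or a deque; on an ArrayList it shifts the remaining elements.)
But if the order is not the DFS order (for example when a Map that is not a LinkedHashMap is involved), I recommend the solution that hashes parents to arrays of their children (as suggested by גלעד ברקן).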

Neo4j How to create relationships after ingesting data

Say I have a csv file where one of the columns is unix timestamp. It is not sorted on that, but following those in order could be a useful relationship. When I want that, I could use ORDER BY, but adding relationship pointers should be faster right, and make use of the NOSQL? Do I have to sort this and add the relationship as I ingest, or can I do it after a query?
After I run the first query and get a subset back:
result = engine.execute(START...WHERE...ORDER BY time)
Can I then go through results adding this relationship like:
prev.createRelationshipTo(next, PRECEDES);
I tried two different ways using foreach or iterator and both had runtime errors casting a string to a Node:
for (Map<String, Object>row : result) {
String n = (String) row.values().iterator().next();
System.out.println(n);
}
Iterator<Node> nodes = result.columnAs("n.chosen");
if (nodes.hasNext()) {
Node prev = nodes.next();
while (nodes.hasNext()) {
Node n = nodes.next();
prev.createRelationshipTo(n, null);
prev = n;
}
}
Also, there is the edge case of two rows having the same timestamp. I don't care about the order that is chosen, but I want it to not break the relation chain.

Advice to get selected elements from CheckedTreeSelectionDialog

I'm using CheckedTreeSelectionDialog to implement some kind of refactoring. The refactoring is performed over a large set of objects, so each root node of the selection tree is an object, and each of those objects has its suggested modifications as child nodes. For example,
CheckedTreeSelectionDialog:
ObjectA
---------- Remove attribute attA1
---------- Remove attribute attA2
Object B
---------- Remove attribute attB1
.
.
.
I obtain the selected elements this way:
Object[] result = dialog.getResult();
and, if I select all 5 elements shown before, I get the list:
ObjectA
attA1
attA2
ObjectB
attB1
I thought I would get some kind of tree where, for example, I can get the object "ObjectA" and see which of its children were selected.
Am I doing this right?
Thanks!
Alternatively you can get the tree viewer and from that get the checked elements.
Map<Object, List<Object>> mapOfCheckedElements = new HashMap<Object, List<Object>>();
for (TreeItem level1 : checkBoxTreeViewer.getTree().getItems()) {
if (level1.getChecked()) {
List<Object> checkedChildren = new ArrayList<Object>();
for (TreeItem level2 : level1.getItems()) {
if (level2.getChecked()) {
checkedChildren.add(level2);
}
}
mapOfCheckedElements.put(level1, checkedChildren);
}
}
