Sorting and comparing XML from Java

I was given XML and schema files. My goal is to output all data from the XML (without duplicates), ordered by date of birth. Currently I get all the data printed out (with duplicates) and I don't know what to do next. I've tried different things, but without success.

HashSet will depend on the Node.equals() method to determine equality, and you're adding distinct nodes, albeit with the same underlying text. From the doc:
adds the specified element e to this set if this set contains no
element e2 such that (e==null ? e2==null : e.equals(e2))
I would extract the underlying text (String) from the Node, and a HashSet<String> will determine uniqueness correctly.
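As a rough sketch of that idea (the sample XML, the class name UniqueTextDemo and the choice of the "firstname" tag are assumptions for illustration, not taken from the asker's file), collecting the text content into a set of Strings could look like this. It uses LinkedHashSet, a HashSet subclass, only so that the first-seen order is kept:

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashSet;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

class UniqueTextDemo {
    // Collects the text content of every element with the given tag name.
    // Strings compare by value, so repeated text collapses to one entry.
    static Set<String> uniqueTexts(String xml, String tagName) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList nodes = doc.getElementsByTagName(tagName);
            Set<String> texts = new LinkedHashSet<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                texts.add(nodes.item(i).getTextContent().trim());
            }
            return texts;
        } catch (Exception ex) {
            throw new IllegalStateException("Failed to parse XML", ex);
        }
    }

    public static void main(String[] args) {
        // Hypothetical sample data; the real file layout may differ.
        String xml = "<people>"
                + "<isik><firstname>Mari</firstname></isik>"
                + "<isik><firstname>Mari</firstname></isik>"
                + "<isik><firstname>Jaan</firstname></isik>"
                + "</people>";
        System.out.println(uniqueTexts(xml, "firstname")); // prints [Mari, Jaan]
    }
}
```

Two distinct Node objects with the same text end up as a single String entry, which is the behaviour the Node-based set could not give.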

EDIT
After reading the post again I realised you need to remove duplicates too, so:
You can use a TreeSet to impose uniqueness and sort by DOB. I presume that two entries with the same first name, surname and date of birth are the same person.
First I would wrap your Node in a class that implements Comparable and that also does the getting of all those properties you have. The wrapper needs to implement Comparable as the TreeSet uses this method to decide whether elements are different (a.compareTo(b) != 0) and also how to order them.
public static final class NodeWrapper implements Comparable<NodeWrapper> {
private static final SimpleDateFormat DOB_FORMAT = new SimpleDateFormat("yyyy-MM-dd");
private final Element element;
private final Date dob;
private final String firstName;
private final String surName;
private final String sex;
public NodeWrapper(final Node node) {
this.element = (Element) node;
try {
this.dob = DOB_FORMAT.parse(initDateOfBirth());
} catch (ParseException ex) {
throw new RuntimeException("Failed to parse dob", ex);
}
this.firstName = initFirstName();
this.surName = initSurnameName();
this.sex = initSex();
}
private String initFirstName() {
return getNodeValue("firstname");
}
private String initSurnameName() {
return getNodeValue("surname");
}
private String initDateOfBirth() {
return getNodeValue("dateofbirth");
}
private String initSex() {
return getNodeValue("sex");
}
private String getNodeValue(final String name) {
return element.getElementsByTagName(name).item(0).getTextContent();
}
public Node getNode() {
return element;
}
Date getDob() {
return dob;
}
public String getFirstName() {
return firstName;
}
public String getSurName() {
return surName;
}
public String getDateOfBirth() {
return DOB_FORMAT.format(dob);
}
public String getSex() {
return sex;
}
public int compareTo(NodeWrapper o) {
int c;
c = getDob().compareTo(o.getDob());
if (c != 0) {
return c;
}
c = getSurName().compareTo(o.getSurName());
if (c != 0) {
return c;
}
return getFirstName().compareTo(o.getFirstName());
}
@Override
public int hashCode() {
int hash = 5;
hash = 47 * hash + (this.dob != null ? this.dob.hashCode() : 0);
hash = 47 * hash + (this.firstName != null ? this.firstName.hashCode() : 0);
hash = 47 * hash + (this.surName != null ? this.surName.hashCode() : 0);
return hash;
}
@Override
public boolean equals(Object obj) {
if (obj == null) {
return false;
}
if (getClass() != obj.getClass()) {
return false;
}
final NodeWrapper other = (NodeWrapper) obj;
if (this.dob != other.dob && (this.dob == null || !this.dob.equals(other.dob))) {
return false;
}
if ((this.firstName == null) ? (other.firstName != null) : !this.firstName.equals(other.firstName)) {
return false;
}
if ((this.surName == null) ? (other.surName != null) : !this.surName.equals(other.surName)) {
return false;
}
return true;
}
@Override
public String toString() {
return "FirstName: " + getFirstName() + ". Surname: " + getSurName() + ". DOB: " + getDateOfBirth() + ". Sex: " + getSex() + ".";
}
}
So if the date of birth, surname and first name are all equal, we assume it is the same person and return 0. When using compareTo in this way, it is good practice to make it consistent with equals, so that a.compareTo(b)==0 exactly when a.equals(b). I have added the required equals and hashCode methods as well.
Now you can use a TreeSet in your code, which will automatically sort and guarantee uniqueness:
final Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(new File("file.xml"));
final Set<NodeWrapper> inimesteList = new TreeSet<NodeWrapper>();
final NodeList isa = doc.getElementsByTagName("isa");
for (int i = 0; i < isa.getLength(); i++) {
inimesteList.add(new NodeWrapper(isa.item(i)));
}
final NodeList ema = doc.getElementsByTagName("ema");
for (int i = 0; i < ema.getLength(); i++) {
inimesteList.add(new NodeWrapper(ema.item(i)));
}
final NodeList isik = doc.getElementsByTagName("isik");
for (int i = 0; i < isik.getLength(); i++) {
inimesteList.add(new NodeWrapper(isik.item(i)));
}
System.out.println();
System.out.println("Total: " + inimesteList.size());
for (final NodeWrapper nw : inimesteList) {
System.out.println(nw);
}
I have also added a toString method and used that to print the nodes - this makes the code much cleaner.
The Document approach, while seemingly simpler than JAXB, is riddled with this sort of tedium. As you already have a schema, I would strongly recommend that you move to xjc and JAXB unmarshalling; this will make this sort of thing far easier.

It's better to create a Java Bean (POJO) holding the details of a single node. Override equals() and hashCode() in it. Store all the node data in a List of beans. Then use a LinkedHashSet to remove duplicates. Implement Comparable, or use a Comparator, and call Collections.sort() to sort the result.
Alternatively, extend or encapsulate Node in another class and override equals() and hashCode() there. Store all the nodes in a List of instances of the new class. Then use a LinkedHashSet to remove duplicates. Implement Comparable, or use a Comparator, and call Collections.sort() to sort the result.
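A minimal sketch of that bean approach, assuming a hypothetical Person class with three String fields and a date of birth stored as an ISO yyyy-MM-dd string (so that string order matches date order); none of these names come from the question:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Objects;

// Hypothetical bean; the fields are assumptions for illustration.
class Person {
    final String firstName;
    final String surName;
    final String dateOfBirth; // ISO yyyy-MM-dd, so string order equals date order

    Person(String firstName, String surName, String dateOfBirth) {
        this.firstName = firstName;
        this.surName = surName;
        this.dateOfBirth = dateOfBirth;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Person)) return false;
        Person other = (Person) o;
        return Objects.equals(firstName, other.firstName)
                && Objects.equals(surName, other.surName)
                && Objects.equals(dateOfBirth, other.dateOfBirth);
    }

    @Override
    public int hashCode() {
        return Objects.hash(firstName, surName, dateOfBirth);
    }

    @Override
    public String toString() {
        return firstName + " " + surName + " (" + dateOfBirth + ")";
    }
}

class DedupeAndSort {
    static List<Person> uniqueSortedByDob(List<Person> people) {
        // LinkedHashSet drops equal beans; copying back to a list allows sorting.
        List<Person> unique = new ArrayList<>(new LinkedHashSet<>(people));
        Collections.sort(unique, Comparator.comparing((Person p) -> p.dateOfBirth));
        return unique;
    }

    public static void main(String[] args) {
        List<Person> people = Arrays.asList(
                new Person("Mari", "Tamm", "1990-05-01"),
                new Person("Jaan", "Kask", "1985-01-15"),
                new Person("Mari", "Tamm", "1990-05-01"));
        System.out.println(uniqueSortedByDob(people));
    }
}
```

Unlike the TreeSet answer, uniqueness here comes from equals()/hashCode() and ordering from the Comparator, so the two concerns can be changed independently.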

Related

toString() for a List Stack

I have classes for List, Node, and Stack.
The List and Node classes are done; now I want to finish my Stack class, which uses my List class.
I am in my main method and want to try out my push/pop methods, but I don't know how to output the stack as a string.
I did this in my List class, but I don't know how to recreate it for the Stack class.
Can someone help me? Thanks.
public class Stack {
private List list;
public Stack() {
list = new List();
}
// push/pop methods not shown
}
public class List {
// fields and other methods not shown; head refers to the first Node
public String toString() {
Node temp = head;
String string = "";
while (temp != null && temp.getNext() != null) {
string = string + temp.getElement() + ", ";
temp = temp.getNext();
}
if (temp != null) {
string = string + temp.getElement() + ".";
}
return string;
}
}

In your Stack class, you could invoke list.toString(). Something like
public String toString() {
return String.format("Stack: %s", list.toString());
}
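To make the delegation concrete, here is a self-contained sketch. SimpleList and LinkedStack are hypothetical stand-ins for the asker's List and Stack classes (whose push/pop code is not shown in the question), and the element formatting copies the "a, b, c." style of the List.toString() above:

```java
// Hypothetical stand-in for the asker's List class.
class SimpleList {
    private static final class Node {
        final Object element;
        final Node next;

        Node(Object element, Node next) {
            this.element = element;
            this.next = next;
        }
    }

    private Node head;

    void addFirst(Object element) {
        head = new Node(element, head);
    }

    Object removeFirst() {
        if (head == null) {
            throw new IllegalStateException("list is empty");
        }
        Object element = head.element;
        head = head.next;
        return element;
    }

    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder();
        for (Node n = head; n != null; n = n.next) {
            // ", " between elements, "." after the last one
            sb.append(n.element).append(n.next != null ? ", " : ".");
        }
        return sb.toString();
    }
}

// Hypothetical stand-in for the asker's Stack class.
class LinkedStack {
    private final SimpleList list = new SimpleList();

    void push(Object element) {
        list.addFirst(element);
    }

    Object pop() {
        return list.removeFirst();
    }

    @Override
    public String toString() {
        // Delegate to the list; string concatenation calls list.toString().
        return "Stack: " + list;
    }

    public static void main(String[] args) {
        LinkedStack stack = new LinkedStack();
        stack.push(1);
        stack.push(2);
        stack.push(3);
        System.out.println(stack); // prints Stack: 3, 2, 1.
        stack.pop();
        System.out.println(stack); // prints Stack: 2, 1.
    }
}
```

Because System.out.println calls toString() on its argument, printing the stack in main needs no extra code once the method exists.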

You have to combine the two toString() methods.
So you have to create a new toString() in your Stack class that calls list.toString().
public class Stack {
private List list;
public Stack() {
list = new List();
}
public String toString() {
String myreturn = "//Anything you need" + list.toString();
return myreturn;
}
}

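You can also handle each element in the list. This assumes list is a java.util.List, or that your own List class exposes a stream() method.
For example, if you need the list values separated by commas: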
public String toString() {
return list.stream()
.map(Objects::toString)
.collect(Collectors.joining(","));
}
Or, if you also need to modify each value:
public String toString() {
return list.stream()
.map(p -> "[" + p.toString() + "]")
.collect(Collectors.joining(","));
}

Java Custom Object with multiple properties as Map key or concatenation of its properties

I have a requirement where I have to aggregate a number of objects based on its properties. Object has around 10 properties and aggregation must be done on all its properties. For example -
If there are two objects A and B of some class C with properties p1, p2, p3,...p10, (all properties are of String type) then these two objects must be considered equal only if all its corresponding properties are equal.
For this I have two approaches in mind using HashMap in Java-
Approach 1 - Use an object of type C as the key and an Integer count as the value; increase the count every time an existing object is found in the Map, otherwise create a new key-value pair.
HashMap<C, Integer>
But in this approach since I have to aggregate on all the properties, I will have to write(override) an equals() method which will check all the string properties for equality and similarly some implementation for hashCode().
Approach 2 - Using key as a single string made by concatenation of all the properties of object and value as a wrapper object which will have two properties one the object of type C and another a count variable of Integer type.
For each object(C) create an String key by concatenation of its properties and if key already exists in the Map, get the wrapper object and update its count property, otherwise create a new key, value pair.
HashMap<String, WrapperObj>
In this approach I don't have to do any manual work to use a String as the key, and using a String as a Map key is also considered good practice.
Approach 2 seems easy to implement and efficient, as opposed to Approach 1, where all the properties are checked one by one every time equals is called.
But I am not sure whether Approach 2 is a standard way of comparing two objects and performing this kind of operation.
Please suggest any other way to implement this requirement, for example a better way to implement equals() for use as a key when all properties must be taken into account when checking objects for equality.
Example -
Class whose objects needs aggregation with hash and equals implementation in case of Approach 1
public class Report {
private String p1;
private String p2;
private String p3;
private String p4;
.
.
.
private String p10;
@Override
public int hashCode() {
final int prime = 31;
int result = 1;
result = prime * result + ((p1 == null) ? 0 : p1.hashCode());
result = prime * result + ((p2 == null) ? 0 : p2.hashCode());
result = prime * result + ((p3 == null) ? 0 : p3.hashCode());
result = prime * result + ((p4 == null) ? 0 : p4.hashCode());
return result;
}
@Override
public boolean equals(Object obj) {
if (this == obj)
return true;
if (!(obj instanceof Report))
return false;
Report other = (Report) obj;
if (p1 == null) {
if (other.p1 != null)
return false;
} else if (!p1.equals(other.p1))
return false;
if (p2 == null) {
if (other.p2 != null)
return false;
} else if (!p2.equals(other.p2))
return false;
if (p3 == null) {
if (other.p3 != null)
return false;
} else if (!p3.equals(other.p3))
return false;
if (p4 == null) {
if (other.p4 != null)
return false;
} else if (!p4.equals(other.p4))
return false;
.
.
.
if (p10 == null) {
if (other.p10 != null)
return false;
} else if (!p10.equals(other.p10))
return false;
return true;
}
}
Code For aggregation Approach 1-
Map<Report, Integer> map = new HashMap<Report, Integer>();
for(Report report : reportList) {
if(map.get(report) != null)
map.put(report, map.get(report)+1);
else
map.put(report, 1);
}
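The get-then-put counting loop above can also be written with Map.merge (available since Java 8). The sketch below assumes a hypothetical two-property ReportKey class in place of the ten-property Report, just to stay short:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical, shortened stand-in for Report with only two properties.
class ReportKey {
    final String p1;
    final String p2;

    ReportKey(String p1, String p2) {
        this.p1 = p1;
        this.p2 = p2;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof ReportKey)) return false;
        ReportKey other = (ReportKey) o;
        return Objects.equals(p1, other.p1) && Objects.equals(p2, other.p2);
    }

    @Override
    public int hashCode() {
        return Objects.hash(p1, p2);
    }
}

class CountDemo {
    static Map<ReportKey, Integer> countDuplicates(List<ReportKey> reports) {
        Map<ReportKey, Integer> counts = new HashMap<>();
        for (ReportKey report : reports) {
            // Inserts 1 for a new key, otherwise adds 1 to the stored count.
            counts.merge(report, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<ReportKey> reports = Arrays.asList(
                new ReportKey("a", "b"),
                new ReportKey("a", "b"),
                new ReportKey("a", "c"));
        Map<ReportKey, Integer> counts = countDuplicates(reports);
        System.out.println(counts.get(new ReportKey("a", "b"))); // prints 2
        System.out.println(counts.get(new ReportKey("a", "c"))); // prints 1
    }
}
```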
Approach 2 - With wrapper class and not implementing equals and hash for Report class.
public class Report {
private String p1;
private String p2;
private String p3;
private String p4;
public String getP1() {
return p1;
}
public void setP1(String p1) {
this.p1 = p1;
}
public String getP2() {
return p2;
}
public void setP2(String p2) {
this.p2 = p2;
}
public String getP3() {
return p3;
}
public void setP3(String p3) {
this.p3 = p3;
}
public String getP4() {
return p4;
}
public void setP4(String p4) {
this.p4 = p4;
}
}
Report wrapper class -
public class ReportWrapper {
private Report report;
private Integer count;
public Report getReport() {
return report;
}
public void setReport(Report report) {
this.report = report;
}
public Integer getCount() {
return count;
}
public void setCount(Integer count) {
this.count = count;
}
}
Code For aggregation Approach 2-
Map<String, ReportWrapper> map = new HashMap<String,
ReportWrapper>();
for(Report report : reportList) {
String key = report.getP1() + ";" + report.getP2() +
";" + report.getP3() +
";" + .....+ ";" + report.getP10();
ReportWrapper rw = map.get(key);
if(rw != null) {
rw.setCount(rw.getCount()+1);
map.put(key, rw);
}
else {
ReportWrapper wrapper = new ReportWrapper();
wrapper.setReport(report);
wrapper.setCount(1);
map.put(key, wrapper);
}
}
PS: Here I am more concerned about which approach is better.

Consider using the equals and hashcode methods that you can get generated from an IDE or use a tool like Lombok which will do it for you using an annotation and you don't have to write any code.
For lombok:
https://projectlombok.org/features/EqualsAndHashCode
How to use @EqualsAndHashCode With Include - Lombok
This is what IDEA generates if you want to go that route. No special process required.
@Override
public boolean equals(Object o) {
if (this == o) {
return true;
}
if (o == null || getClass() != o.getClass()) {
return false;
}
Report report = (Report) o;
return Objects.equals(prop1, report.prop1) &&
Objects.equals(prop2, report.prop2) &&
Objects.equals(prop3, report.prop3) &&
Objects.equals(prop4, report.prop4) &&
Objects.equals(prop5, report.prop5) &&
Objects.equals(prop6, report.prop6) &&
Objects.equals(prop7, report.prop7) &&
Objects.equals(prop8, report.prop8) &&
Objects.equals(prop9, report.prop9);
}
@Override
public int hashCode() {
return Objects.hash(prop1, prop2, prop3, prop4, prop5, prop6, prop7, prop8, prop9);
}
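One thing worth checking before choosing the concatenated-String key from the question: if a property value can itself contain the delimiter, two different objects can produce the same key. The sketch below (class and method names are made up for illustration) shows the collision with two properties, next to a list-based key that compares element by element, much like a generated equals() does:

```java
import java.util.Arrays;
import java.util.List;

class KeyCollisionDemo {
    // Approach 2 style key: properties joined with ";".
    static String concatKey(String p1, String p2) {
        return p1 + ";" + p2;
    }

    // Alternative key that keeps the properties separate.
    static List<String> listKey(String p1, String p2) {
        return Arrays.asList(p1, p2);
    }

    public static void main(String[] args) {
        // ("a;b", "c") and ("a", "b;c") are different property pairs,
        // but both concatenate to "a;b;c".
        System.out.println(concatKey("a;b", "c").equals(concatKey("a", "b;c"))); // prints true
        // The list keys stay distinct because elements are compared one by one.
        System.out.println(listKey("a;b", "c").equals(listKey("a", "b;c")));     // prints false
    }
}
```

If the delimiter can never occur in the data, the concatenated key behaves the same as the field-by-field comparison; otherwise the equals()/hashCode() route is the safer one.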

HashMap with incorrect equals and HashCode implementation

According to what I have read,
to use an object as the key to a hashMap, it has to provide a correct
override and implementation of the equals and hashCode
method. HashMap get(Key k) method calls hashCode method on the key
object and applies returned hashValue to its own static hash
function to find a bucket location(backing array) where keys and
values are stored in form of a nested class called Entry (Map.Entry).
HashMap's internal hash Method defends against poor quality hash
functions.
To test these contracts, I have written a bean class with incorrect but legal implementations of the equals and hashCode method.
The class:
public class HashVO {
private String studentName;
private int age;
private boolean isAdult;
public HashVO(String studentName, int age, boolean isAdult) {
super();
this.studentName = studentName;
this.age = age;
this.isAdult = isAdult;
}
public String getStudentName() {
return studentName;
}
public void setStudentName(String studentName) {
this.studentName = studentName;
}
public int getAge() {
return age;
}
public void setAge(int age) {
this.age = age;
}
public boolean isAdult() {
return isAdult;
}
public void setAdult(boolean isAdult) {
this.isAdult = isAdult;
}
@Override
public String toString() {
return studentName + " : " + age + " : " + isAdult;
}
@Override
public boolean equals(Object obj) {
return false;
}
@Override
public int hashCode() {
return 31;
}
}
In this case, the hash method of the HashMap,
static final int hash(Object key) {
int h;
return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}
should also return the same value every time, because hashCode always returns 31. So if objects of class HashVO are used as keys of a HashMap, the get method should not work: it goes to the same bucket to retrieve the objects, and since the equals method always returns false, it should not be able to find a match for the key object.
But when I am using this method,
public static void main(String[] args) {
HashMap<HashVO, String> voMap = new HashMap<HashVO, String>();
HashVO vo = new HashVO("Item1", 25, true);
HashVO vo1 = new HashVO("Item2", 12, false);
HashVO vo2 = new HashVO("Item3", 1, false);
voMap.put(vo, "Item");
voMap.put(vo1, "Item1");
voMap.put(vo2, "Item2");
System.out.println(voMap.get(vo));
System.out.println(voMap.get(vo1));
System.out.println(voMap.get(vo2));
}
the output is correct, and showing
Item
Item1
Item2
I want to understand why the output is correct even though the equals and hashCode implementations are incorrect.

HashMap has a little trick where it compares object references before using equals. Since you're using the same object references for adding the elements and for retrieving them, HashMap will return them correctly.
See Java 7 source here (Java 8 did a pretty big revamp of HashMap but it does something similar)
final Entry<K,V> getEntry(Object key) {
if (size == 0) {
return null;
}
int hash = (key == null) ? 0 : hash(key);
for (Entry<K,V> e = table[indexFor(hash, table.length)];
e != null;
e = e.next) {
Object k;
// HERE. Uses == with the key
if (e.hash == hash &&
((k = e.key) == key || (key != null && key.equals(k))))
return e;
}
return null;
}
Note that this isn't part of the docs, so don't depend on it.
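A quick way to see the effect is to look up the same entry once with the original reference and once with a fresh object holding the same data. The BrokenKey class below is a hypothetical, shortened version of HashVO with the same equals/hashCode behaviour:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, shortened version of HashVO: equals is always false,
// hashCode is constant.
class BrokenKey {
    final String name;

    BrokenKey(String name) {
        this.name = name;
    }

    @Override
    public boolean equals(Object obj) {
        return false;
    }

    @Override
    public int hashCode() {
        return 31;
    }
}

class BrokenKeyDemo {
    public static void main(String[] args) {
        Map<BrokenKey, String> map = new HashMap<>();
        BrokenKey key = new BrokenKey("Item1");
        map.put(key, "value");

        // Same reference: found through the == check, equals is never needed.
        System.out.println(map.get(key));                    // prints value
        // Different object with the same data: == fails, equals returns false.
        System.out.println(map.get(new BrokenKey("Item1"))); // prints null
    }
}
```

So the map in the question only appears to work because every get uses the very object that was passed to put.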

HashMap works like this:
1) The index of the table cell in which a (key, value) pair is stored is calculated from key.hashCode().
2) Keys in a HashMap are compared by reference, and by equals() if the references differ.
So, in your situation all (K, V) pairs are stored in one cell of the HashMap table as a linked list.
And you can still get them from the Map because the key references are identical.

How to order a unique list based on 2 object attributes in Java

I have a list of objects I'm referring to as Artifacts. I need to sort alphabetically by the "Name" attribute and in numerical order by an attribute that Artifact has called "Level".
The Level is not always set in Artifact and in that case the entire collection should be alphabetical. If the Artifact has a Level then that takes precedence and should be sorted by order of Level.
The Artifacts need to be unique based upon the Name attribute. I could use a Set collection and override the equals method of the Artifact to sort Alphabetically. However, when I want to sort by Level, the equals method relevant to Name will throw off the results of this sort.
What collections and object structure should I use to remain unique by Name but also be able to sort by Level?

You'll want to look at the Comparable interface and the Comparator interface. Implement Comparable if this is the only way your objects can be compared, Comparator otherwise.

I think java.util.TreeSet is a good container for your problem. It is a Set and it uses the Comparable mechanism.
So you have two options:
1) put Comparator into TreeSet constructor
2) make your Artifact implement Comparable
TIP: In compareTo method you can use compareTo method from String.

The code below sorts the set, giving precedence to the level and then to the name. If a level is null, the Artifact is placed at the beginning, as if it were level 0. An Artifact with a null name is positioned as if it had an empty name. Hope that helps.
import java.util.Arrays;
import java.util.SortedSet;
import java.util.TreeSet;
public class Artifact implements Comparable<Artifact> {
private String name;
private Integer level;
public Artifact(String name, Integer level) {
this.name = name;
this.level = level;
}
@Override
public int hashCode() {
final int prime = 31;
int result = 1;
result = prime * result + ((level == null) ? 0 : level.hashCode());
result = prime * result + ((name == null) ? 0 : name.hashCode());
return result;
}
@Override
public boolean equals(Object obj) {
if (this == obj)
return true;
if (obj == null)
return false;
if (getClass() != obj.getClass())
return false;
Artifact other = (Artifact) obj;
if (level == null) {
if (other.level != null)
return false;
} else if (!level.equals(other.level))
return false;
if (name == null) {
if (other.name != null)
return false;
} else if (!name.equals(other.name))
return false;
return true;
}
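@Override
public int compareTo(Artifact o) {
// Treat a missing level as 0 and a missing name as "" on both sides,
// so comparing against an Artifact with null fields cannot throw.
int thisLevel = (level == null) ? 0 : level;
int otherLevel = (o.level == null) ? 0 : o.level;
if (thisLevel != otherLevel) {
return Integer.compare(thisLevel, otherLevel);
}
String thisName = (name == null) ? "" : name;
String otherName = (o.name == null) ? "" : o.name;
return thisName.compareTo(otherName);
}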
public String toString() {
return level + " " + name;
}
public static void main(String[] args) {
Artifact a1 = new Artifact("a", 1);
Artifact a2 = new Artifact("a", 2);
Artifact a3 = new Artifact("a", 3);
Artifact b1 = new Artifact("b", 1);
Artifact b2 = new Artifact("b", 2);
Artifact b2a = new Artifact("b", 2);
Artifact nullLevel = new Artifact("a",null);
Artifact nullName = new Artifact(null,2);
SortedSet<Artifact> set = new TreeSet<Artifact>();
set.add(a1);
set.add(a2);
set.add(a3);
set.add(b1);
set.add(b2);
set.add(b2a);
set.add(nullLevel);
set.add(nullName);
System.out.println(Arrays.toString(set.toArray()));
}
}
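The question also asks for uniqueness by name alone, which a single TreeSet ordering on level and name does not give. One possible way to separate the two concerns, sketched under assumptions (a hypothetical Item class standing in for Artifact, non-null names, and "first occurrence wins" for duplicate names, which the question does not specify), is to deduplicate through a map keyed by name and sort afterwards:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for Artifact; name is assumed to be non-null.
class Item {
    final String name;
    final Integer level; // may be null when no level is set

    Item(String name, Integer level) {
        this.name = name;
        this.level = level;
    }

    @Override
    public String toString() {
        return level + " " + name;
    }
}

class UniqueByNameSortedByLevel {
    static List<Item> uniqueSorted(List<Item> items) {
        // Uniqueness: one entry per name, keeping the first one seen.
        Map<String, Item> byName = new LinkedHashMap<>();
        for (Item item : items) {
            byName.putIfAbsent(item.name, item);
        }
        // Ordering: level first (unset levels at the front), then name.
        List<Item> result = new ArrayList<>(byName.values());
        result.sort(Comparator
                .comparing((Item item) -> item.level,
                        Comparator.nullsFirst(Comparator.naturalOrder()))
                .thenComparing(item -> item.name));
        return result;
    }

    public static void main(String[] args) {
        List<Item> items = Arrays.asList(
                new Item("b", 2),
                new Item("a", null),
                new Item("b", 1),   // duplicate name, dropped
                new Item("c", 1));
        System.out.println(uniqueSorted(items)); // prints [null a, 1 c, 2 b]
    }
}
```

This leaves equals() on the class untouched, so the sort order can change without affecting what counts as a duplicate.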

Sorted List in Java

I need to sort a list in Java as described below.
The list contains a collection of objects like this:
List list1 = {obj1, obj2,obj3,.....};
I need a final list that keeps the lowest value for each name and avoids repeated names.
Ex:
List list1 = {[Nellai,10],[Gujarath,10],[Delhi,30],[Nellai,5],[Gujarath,15],[Delhi,20]}
After sorting, I need the list to look like this:
List list1 = {[Nellai,5],[Gujarath,10],[Delhi,20]};
I have 2 Delhi entries (30, 20) in my list, but I need only the one Delhi with the lowest fare (20).
How can I do that in Java?
Gnaniyar Zubair

If order doesn't matter, a solution is to use a Map<String, Integer>: add an entry each time you find a new town, update the value each time the new value is less than the stored one, and then collect all the pairs into a list.
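A small sketch of that map-based idea, using the data from the question. The class and method names are made up for illustration, and the parallel arrays are only a compact way to feed in the pairs. A LinkedHashMap is used here so that towns come out in first-seen order; a plain HashMap works if order really doesn't matter:

```java
import java.util.LinkedHashMap;
import java.util.Map;

class LowestFare {
    // towns[i] and fares[i] together form one (town, fare) pair.
    static Map<String, Integer> lowestFares(String[] towns, int[] fares) {
        Map<String, Integer> lowest = new LinkedHashMap<>();
        for (int i = 0; i < towns.length; i++) {
            // New town: store the fare. Known town: keep the smaller fare.
            lowest.merge(towns[i], fares[i], Integer::min);
        }
        return lowest;
    }

    public static void main(String[] args) {
        String[] towns = {"Nellai", "Gujarath", "Delhi", "Nellai", "Gujarath", "Delhi"};
        int[] fares = {10, 10, 30, 5, 15, 20};
        System.out.println(lowestFares(towns, fares)); // prints {Nellai=5, Gujarath=10, Delhi=20}
    }
}
```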

Almost the same as @Visage's answer, but the order is different:
public class NameFare {
private String name;
private int fare;
public String getName() {
return name;
}
public int getFare() {
return fare;
}
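@Override public boolean equals(Object o) {
if (o == this) {
return true;
} else if (o instanceof NameFare) {
NameFare other = (NameFare) o;
if (getName() != null) {
return getName().equals(other.getName());
} else {
return other.getName() == null;
}
}
return false;
}
@Override public int hashCode() {
// Keep hashCode consistent with equals, which compares names only.
return (getName() == null) ? 0 : getName().hashCode();
}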
}
....
public Collection<NameFare> sortAndMerge(Collection<NameFare> toSort) {
ArrayList<NameFare> sorted = new ArrayList<NameFare>(toSort.size());
for (NameFare nf : toSort) {
int idx = sorted.indexOf(nf);
if (idx == -1) {
// First time this name is seen: keep it.
sorted.add(nf);
} else if (nf.getFare() < sorted.get(idx).getFare()) {
// Same name with a lower fare: replace the stored entry.
sorted.set(idx, nf);
}
}
Collections.sort(sorted, new Comparator<NameFare>() {
public int compare(NameFare o1, NameFare o2) {
if (o1 == o2) {
return 0;
}
// Entries without a name sort first.
if (o1.getName() == null) {
return (o2.getName() == null) ? 0 : -1;
} else if (o2.getName() == null) {
return 1;
}
return o1.getName().compareTo(o2.getName());
}
});
return sorted;
}

I would do it in two stages.
Firstly, sort the list using a custom comparator.
Secondly, traverse the list and, for duplicate entries (which will now be adjacent to each other, provided you wrote your comparator correctly), remove the entries with the higher values.
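The two stages could be sketched as follows. Fare and SortThenDedupe are hypothetical names, and the comparator sorts by name and then by fare so that the lowest fare is the first entry of each run of equal names:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical value class for a (name, fare) pair.
class Fare {
    final String name;
    final int fare;

    Fare(String name, int fare) {
        this.name = name;
        this.fare = fare;
    }

    @Override
    public String toString() {
        return "[" + name + "," + fare + "]";
    }
}

class SortThenDedupe {
    static List<Fare> lowestPerName(List<Fare> input) {
        // Stage 1: sort by name, then by fare, so duplicates become adjacent
        // and the cheapest entry of each name comes first.
        List<Fare> sorted = new ArrayList<>(input);
        sorted.sort(Comparator.comparing((Fare f) -> f.name)
                .thenComparingInt(f -> f.fare));

        // Stage 2: keep only the first entry of each run of equal names.
        List<Fare> result = new ArrayList<>();
        for (Fare f : sorted) {
            if (result.isEmpty() || !result.get(result.size() - 1).name.equals(f.name)) {
                result.add(f);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Fare> fares = Arrays.asList(
                new Fare("Nellai", 10), new Fare("Gujarath", 10), new Fare("Delhi", 30),
                new Fare("Nellai", 5), new Fare("Gujarath", 15), new Fare("Delhi", 20));
        System.out.println(lowestPerName(fares)); // prints [[Delhi,20], [Gujarath,10], [Nellai,5]]
    }
}
```

Note that the result comes out in alphabetical order by name, not in the first-seen order shown in the question.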

If you want to avoid duplicates, perhaps a class like TreeSet would be a better choice than List.

I would use an ArrayList like this:
ArrayList<Name> listOne = new ArrayList<Name>();
listOne.add(new Name("Nellai", 10));
listOne.add(new Name("Gujarath", 10));
listOne.add(new Name("Delhi", 30));
listOne.add(new Name("Nellai", 5));
listOne.add(new Name("Delhi", 20));
Collections.sort(listOne);
Then create the Name class
class Name implements Comparable
{
private String name;
private int number;
public Name(String name, int number)
{
this.name= name;
this.number= number;
}
public String getName()
{
return this.name;
}
public int getNumber()
{
return this.number;
}
public int compareTo(Object otherName) // must be defined because we implement the Comparable interface
{
if(!(otherName instanceof Name))
{
throw new ClassCastException("Not valid Name object");
}
Name tempName = (Name)otherName;
// order by number, lowest first (this alone does not remove duplicates)
if(this.getNumber() >tempName.getNumber())
{
return 1;
}else if (this.getNumber() < tempName.getNumber()){
return -1;
}else{
return 0;
}
}
}
I didn't compile the code; it was edited here, so you may need to fix it. You also need to figure out how to eliminate the duplicates and print only the lowest one.
You need to sweat too.
