Finding Duplicate Algorithm - Java

I have a set of values in an ArrayList, and I have to find duplicate keys. One approach is to use two loops and iterate through the list for each value, resulting in O(n^2).
The other thing that I can do is to put the values as keys in a Hashtable. I believed that Hashtable would throw an exception if the same key is already in it. But it is not throwing an exception:
Hashtable<String, String> ht = new Hashtable<String, String>();
for (int i = 0; i<20; i++){
ht.put(String.valueOf(i%10), String.valueOf(i%10));
}
Do I understand it wrong? Doesn't Hashtable/HashMap throw an exception if the same key is already in it?

I suggest you use a HashSet instead of a Hashtable:
Set<String> ht = new HashSet<String>();
for (int i = 0; i<20; i++){
if ( !ht.add(String.valueOf(i%10)) ) {
//it already existed, throw an exception or whatever
}
}
If you don't care about the values that you add to a map, you almost certainly want a Set and not a Map/table.

No, it doesn't throw an exception, it simply replaces the old value. You can check if a value already exists by calling get:
if (ht.get(key) != null) {
// value already exists
}
Edit: As @Mark Peters suggested, containsKey is a simpler and sometimes better solution.

You can see in the API docs that put returns null if there was nothing in the table before for that key, and the key's previous value if there was one. (It doesn't throw an exception in either case.)
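As a minimal sketch of using that return value to spot duplicates (the class and method names here are made up for illustration, and a HashMap is used in place of the Hashtable from the question):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class PutReturnDemo {
    // Collects every key whose put() found an earlier entry.
    // A key that occurs three times is therefore reported twice.
    static List<String> findDuplicates(List<String> keys) {
        Map<String, Boolean> seen = new HashMap<>();
        List<String> duplicates = new ArrayList<>();
        for (String key : keys) {
            // put returns the previous value, or null if the key was absent
            if (seen.put(key, Boolean.TRUE) != null) {
                duplicates.add(key);
            }
        }
        return duplicates;
    }

    public static void main(String[] args) {
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < 20; i++) {
            keys.add(String.valueOf(i % 10)); // same keys as in the question
        }
        System.out.println(findDuplicates(keys));
    }
}
```

This only works as a presence check because the stored value is never null; with a map that allows null values, containsKey is the safer test.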

You may want to read up on the performance characteristics of hashes.
For example, hashes will make answering the question "does this key exist?" fast, which might help with your algorithm.

According to the Java docs, the only exception that put may throw is NullPointerException, if the key or value is null. You can change your loop to something like:
for (int i = 0; i < 20; i++) {
    String key = String.valueOf(i % 10);
    if (ht.containsKey(key))
        throw new IllegalStateException("duplicate key: " + key);
    ht.put(key, key);
}

From the JavaDoc:
put
public Object put(Object key, Object value)
Maps the specified key to the specified value in this hashtable. Neither the key nor the value can be null.
The value can be retrieved by calling the get method with a key that is equal to the original key.
Specified by:
put in interface Map
Specified by:
put in class Dictionary
Parameters:
key - the hashtable key.
value - the value.
Returns:
the previous value of the specified key in this hashtable, or null if it did not have one.
Throws:
NullPointerException - if the key or value is null.
See Also:
Object.equals(Object), get(Object)
It looks like it will let you overwrite the value, but then it gives you the old value as a return Object.

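Here's an easy way to do it, relying on Set.add returning false when the element is already present:
List<String> yourList; // assumed to be populated elsewhere
Set<String> seen = new HashSet<String>();
Set<String> duplicates = new HashSet<String>();
for (String s : yourList) {
    if (!seen.add(s)) duplicates.add(s); // second or later occurrence
}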

Depending on your memory vs. runtime constraints, I would recommend the following if you are space constrained:
You can sort the array (O(n log n) in the worst case with something like heapsort or merge sort; quicksort averages O(n log n) but degrades to O(n^2) in the worst case), and then traverse it to find duplicates in adjacent elements.
Hope this helps
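A rough sketch of the sort-then-scan idea, assuming String values with no nulls (the class and method names are hypothetical). It copies the list so the caller's data is left alone; if you really are space constrained you would sort the original in place instead:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class SortedDuplicates {
    // Returns each duplicated value once, in sorted order.
    static List<String> findDuplicates(List<String> values) {
        List<String> sorted = new ArrayList<>(values); // copy, see note above
        Collections.sort(sorted);
        List<String> duplicates = new ArrayList<>();
        for (int i = 1; i < sorted.size(); i++) {
            String current = sorted.get(i);
            // after sorting, equal values sit next to each other
            boolean sameAsPrevious = current.equals(sorted.get(i - 1));
            boolean alreadyReported = !duplicates.isEmpty()
                    && duplicates.get(duplicates.size() - 1).equals(current);
            if (sameAsPrevious && !alreadyReported) {
                duplicates.add(current);
            }
        }
        return duplicates;
    }
}
```

For example, findDuplicates on the list b, a, b, c, a, a gives [a, b].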

Related

How would an empty HashMap with .containsKey() be true?

I am a beginner programmer working on a Two Sum problem. An array of integers is given, as well as an integer, target. The intention of the program is to find which two numbers in the array of integers add up to the target integer. The most efficient solution I am seeing is quite ingenious in how it iterates over all of the integers in the array and checks if the difference between each integer in the array and the target number is another integer in the array. Then those two would be the solution. My issue is with the HashMap part. How would .containsKey() work on a HashMap that is empty and has no keys in it?
class Solution {
public int[] twoSum(int[] nums, int target) {
int n=nums.length;
Map<Integer,Integer> map=new HashMap<>();
int[] result=new int[2];
for(int i=0;i<n;i++){
if(map.containsKey(target-nums[i])){
result[1]=i;
result[0]=map.get(target-nums[i]);
return result;
}
map.put(nums[i],i);
}
return result;
}
}
I tried to research solution explanations but all of them just said that the solution checks if the values are in the map but how would any values be in the map if it is empty and was never linked to the integers array? Thanks a lot for the help.
I give you a phone book. I ask you if 'Ryan Siegrist' is in it. You open it up intending to scan it (or perhaps you're going to be smarter about it and use a fancy algorithm, such as binary searching by opening it to the middle page to see if you should go 'higher' or 'lower') - and it turns out the phone book is simply empty.
The correct answer is of course 'no, Ryan Siegrist is not in this empty phone book'.
HashMap is no different. .containsKey(whateverYouLike) returns false if you invoke it on an empty map. Why wouldn't it?
The stated algorithm finds nothing the first time you loop, but note that at the end of the for loop body, whenever the if ( containsKey ) check fails, an entry is added to the map (when the check succeeds, the method returns right away). So on the second and later passes through that loop, the map is no longer empty.
Short version:
If the map is empty, it does not contain the key. Even so, the line:
map.put(nums[i],i);
will still execute. This is because it is outside of the if check.
Long version
So when the code first iterates through the array, the HashMap is always empty at first because it was initialized as such:
Map<Integer,Integer> map=new HashMap<>();
Then the first iteration of the if check returns false:
if(map.containsKey(target-nums[i]))
But it still executes the line of code that puts an entry into the map, with the value of nums at index i as the key and the index i as the value.
Then the loop will continue iterating until a solution is found or the loop terminates.
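To watch this happen, here is a small sketch of the same algorithm with a print statement added on every pass (the class name and the sample input are just for illustration):

```java
import java.util.HashMap;
import java.util.Map;

class TwoSumTrace {
    static int[] twoSum(int[] nums, int target) {
        Map<Integer, Integer> map = new HashMap<>();
        for (int i = 0; i < nums.length; i++) {
            int needed = target - nums[i];
            boolean found = map.containsKey(needed);
            // show the map as it looks before this pass modifies it
            System.out.println("i=" + i + " map=" + map
                    + " containsKey(" + needed + ")=" + found);
            if (found) {
                return new int[] { map.get(needed), i };
            }
            map.put(nums[i], i); // runs on every pass where nothing was found
        }
        return new int[2]; // no pair found
    }

    public static void main(String[] args) {
        int[] r = twoSum(new int[] { 2, 7, 11, 15 }, 9);
        System.out.println(r[0] + ", " + r[1]);
    }
}
```

With that input, the first pass prints map={} and containsKey(7)=false, the second pass prints map={2=0} and containsKey(2)=true, and the result is 0, 1.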

Should I care about no_entry_value in trove4j?

I'm using trove4j for its primitives collections. I notice it has constructors like
public TLongArrayList( int capacity, long no_entry_value )
where no_entry_value represents null. The default value is zero.
As I understand it, the null value is very important in collections, especially in Set. But after glancing at the source code, I found that trove4j doesn't use this value much.
So I'm confused about whether I should care about that value. Should I carefully pick a value that would never occur in my programs, or just leave it at the default of zero?
This is kind of one of those things that you know you need when you need it. It comes down to whether or not you want to just call get(...) and know whether or not the value was in the map without calling containsKey(...). That's where the "no entry value" comes in because that is what is returned in the case where you didn't store that key in your map. The default is 0, so if you never store a value of 0 in your map, that will work for checks. But of course that can be a bit risky. Typical values would be 0, -1, Integer.MAX_VALUE... things like that.
If there is no value that you can guarantee will never be in your map, then you need to make sure you check with containsKey before you trust the returned value. You can minimize the overhead of doing two lookups with something like:
int value = my_map.get( my_key );
// NOTE: Assuming no entry value of 0
if ( value == 0 && !my_map.containsKey( my_key ) ) {
// value wasn't present
}
else {
// value was present
}
That's a performance improvement over calling containsKey every time before doing a get.
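The same "only do the second lookup when the value is ambiguous" pattern can be tried out without Trove on the classpath. The sketch below is not Trove's API: it uses java.util.HashMap.getOrDefault as a stand-in for a map with a no-entry value, and NO_ENTRY and isPresent are names made up for the example. It also assumes no null values are stored in the map.

```java
import java.util.HashMap;
import java.util.Map;

class NoEntryValueDemo {
    // Stand-in for the no-entry value (0 is the default described above)
    static final int NO_ENTRY = 0;

    // True if the key is really present. containsKey is only called
    // when the returned value could mean either "stored 0" or "missing".
    static boolean isPresent(Map<String, Integer> map, String key) {
        int value = map.getOrDefault(key, NO_ENTRY);
        return value != NO_ENTRY || map.containsKey(key);
    }

    public static void main(String[] args) {
        Map<String, Integer> map = new HashMap<>();
        map.put("zero", 0);
        map.put("five", 5);
        System.out.println(isPresent(map, "zero"));    // stored 0, second lookup needed
        System.out.println(isPresent(map, "five"));    // one lookup is enough
        System.out.println(isPresent(map, "missing")); // second lookup says it is absent
    }
}
```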

Map.Entry<> java

I have no idea about Java whatsoever, but I found this blockchain guide in Java and I have been trying to understand the code and convert it to C++ (my thing). I was doing well so far, but I am stuck here. I can't understand this for loop and the Map.Entry<> thing. Any kind of help is appreciated.
And also I am new to blockchain.
The link to this guide is:
https://medium.com/programmers-blockchain/creating-your-first-blockchain-with-java-part-2-transactions-2cdac335e0ce
If that helps.
Here is the code:
public class Wallet {
public PrivateKey privateKey;
public PublicKey publicKey;
public HashMap<String,TransactionOutput> UTXOs = new HashMap<String,TransactionOutput>();
public float getBalance() {
float total = 0;
for (Map.Entry<String, TransactionOutput> item: NoobChain.UTXOs.entrySet()){
TransactionOutput UTXO = item.getValue();
if(UTXO.isMine(publicKey)) { //if output belongs to me ( if coins belong to me )
UTXOs.put(UTXO.id,UTXO); //add it to our list of unspent transactions.
total += UTXO.value ;
}
}
return total;
}
What this for loop is doing is beyond me. Could anyone provide a simpler C++ version of this loop?
Instead of just providing a C++ code snippet, let me try to explain this:
In Java there are data structures called Maps which contain key-value pairs (you probably could guess this part). The Map itself is not iterable, so in order to loop through a map you can loop through all its keys (also called the key set), all the values, or all the key-value pairs (also known as the entry set). The latter happens in your example.
So in your example you have a map of String (the keys) and TransactionOutput objects (the values). The for loop goes through these pairs, and each pair is stored in the variable item. Then the value part, which is a TransactionOutput object, is extracted from the key-value pair (item).
Then this object is checked with the method isMine(), and if that returns true, it is added to another Map (called UTXOs) that maps Strings (the keys) to TransactionOutput objects. In this case it seems the string (the key in the map) is the id of the TransactionOutput.
The variable total is increased by the value of the added TransactionOutput.
Side note: This for loop could just as well loop through all the values in the map, since the key is never used in this particular loop.
Now, to explain this in other words: it goes through the map of TransactionOutputs, the ones that belong to "me" are put aside in a separate map, and the total amount of the TransactionOutput values that belong to "me" is returned.
Hope this clears things up!
Good luck,
Teo
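To make the side note concrete, here is a small sketch that runs the same loop both ways. Output is a hypothetical stand-in for the guide's TransactionOutput, with a plain String owner instead of a PublicKey, so none of these names come from the guide itself:

```java
import java.util.HashMap;
import java.util.Map;

class EntryLoopDemo {
    // Hypothetical, simplified stand-in for TransactionOutput
    static class Output {
        final String id;
        final String owner;
        final float value;

        Output(String id, String owner, float value) {
            this.id = id;
            this.owner = owner;
            this.value = value;
        }

        boolean isMine(String who) {
            return owner.equals(who);
        }
    }

    // Same shape as the loop in the question: iterate over key-value pairs
    static float balanceViaEntries(Map<String, Output> all, String me, Map<String, Output> mine) {
        float total = 0;
        for (Map.Entry<String, Output> item : all.entrySet()) {
            Output out = item.getValue(); // the key (item.getKey()) is never needed
            if (out.isMine(me)) {
                mine.put(out.id, out);
                total += out.value;
            }
        }
        return total;
    }

    // Equivalent loop over the values only
    static float balanceViaValues(Map<String, Output> all, String me, Map<String, Output> mine) {
        float total = 0;
        for (Output out : all.values()) {
            if (out.isMine(me)) {
                mine.put(out.id, out);
                total += out.value;
            }
        }
        return total;
    }
}
```

If it helps with the C++ port: iterating a std::map with a range-based for loop also hands you key-value pairs, so item.getValue() roughly corresponds to reading the second member of each pair.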

TObjectIntMap.get() returns 0 if null Trove

I am using trove library to create hash maps
http://trove.starlight-systems.com/
The class I am using is TObjectIntMap in which I had to use the function get.
The issue is that get returns 0 in two cases:
1- If the value of the specified key is zero
2- If the key does not exist
For example in the following code
TObjectIntMap<String> featuresMap = new TObjectIntHashMap<String>();
if(String.valueOf(featuresMap.get("B")) == null)
System.out.println("NULL");
else
System.out.println("NotNull");
System.out.println(featuresMap.get("B"));
The program will print the following
1- NotNull: because it gets zero. Although the key "B" has not been set
2- Zero: The return of featuresMap.get("B") is zero instead of null.
I have checked the issue linked below, and it was a mistake in their documentation that they fixed. So get actually returns zero instead of null, because an int cannot be null.
https://bitbucket.org/robeden/trove/issue/43/incorrect-javadoc-for-tobjectintmapget
Now my question is: how can I differentiate between a zero and null in this case? Is there any way around this issue?
Try their containsKey method. If the value comes back 0, use that method to check if the map contains the key - if it does, then the key's value really is 0. If it doesn't, then the key is not set.

Java, multiple iterators on a set, removing proper subsets and ConcurrentModificationException

I have a set A = {(1,2), (1,2,3), (2,3,4), (3,4), (1)}
I want to turn it into A={(1,2,3), (2,3,4)}, remove proper subsets from this set.
I'm using a HashSet to implement the set, 2 iterator to run through the set and check all pairs for proper subset condition using containsAll(c), and the remove() method to remove proper subsets.
the code looks something like this:
HashSet<Integer> hs....
Set<Integer> c=hs.values();
Iterator<Integer> it= c.iterator();
while(it.hasNext())
{
p=it.next();
Iterator<Integer> it2= c.iterator();
while(it2.hasNext())
{
q=it2.next();
if q is a subset of p
it2.remove();
else if p is a subset of q
{
it.remove();
break;
}
}
}
I get a ConcurrentModificationException the first time I come out of the inner while loop and do a
p=it.next();
The exception is meant for modifying the Collection while iterating over it. But that's what .remove() is for.
I have used remove() when using just 1 iterator and encountered no problems there.
If the exception is because I'm removing an element from 'c' or 'hs' while iterating over it, then the exception should be thrown when it encounters the very next it2.next() command, but I don't see it then. I see it when it encounters the it.next() command.
I used the debugger, and the collections and iterators are in perfect order after the element has been removed. They contain and point to the proper updated set and element. it.next() contains the next element to be analyzed, it's not a deleted element.
Any ideas on how I can do what I'm trying to do without making a copy of the HashSet itself and using it as an intermediate before I commit updates?
Thank you
You can't modify the collection with it2 and continue iterating it with it. Just as the exception says, it's concurrent modification, and it's not supported.
I'm afraid you're stuck with an intermediate collection.
Edit
Actually, your code doesn't seem to make sense: are you sure it's a collection of Integer and not of Set<Integer>? In your code p and q are Integers, so "if q is a subset of p" doesn't seem to make too much sense.
One obvious way to make this a little smarter: sort your sets by size first, as you go from largest to smallest, add the ones you want to keep to a new list. You only have to check each set against the keep list, not the whole original collection.
The idea behind the ConcurrentModificationException is to protect the internal state of the iterators. When you add or delete things in a collection while an iterator over it is active, other than through that iterator's own remove(), the iterator will throw this exception the next time it is used, even if nothing appears wrong. This is to save you from coding errors that would end up throwing a NullPointerException in otherwise mundane code. Unless you have very tight space constraints or have an extremely large collection, you should just make a working copy that you can add and delete from without worry.
How about creating another set subsetNeedRemoved containing all subsets you are going to remove? For each subset, if there is a proper superset, add the subset to subsetNeedRemoved. At the end, you can loop over subsetNeedRemoved and remove corresponding subsets in the original set.
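A sketch of that idea, assuming the collection is a Set<Set<Integer>> whose inner sets are not modified while they sit in the outer set (the class and method names are hypothetical). Nothing is removed while either loop is running, so no ConcurrentModificationException can occur:

```java
import java.util.HashSet;
import java.util.Set;

class ProperSubsetFilter {
    static void removeProperSubsets(Set<Set<Integer>> sets) {
        Set<Set<Integer>> subsetNeedRemoved = new HashSet<>();
        for (Set<Integer> candidate : sets) {
            for (Set<Integer> other : sets) {
                // proper superset: strictly larger and contains every element
                if (other.size() > candidate.size() && other.containsAll(candidate)) {
                    subsetNeedRemoved.add(candidate);
                    break;
                }
            }
        }
        // only now, with both loops finished, touch the original set
        sets.removeAll(subsetNeedRemoved);
    }
}
```

For the set A from the question, this leaves (1,2,3) and (2,3,4). It still compares every pair, so it is O(n^2) containsAll checks in the worst case.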
I'd write something like this...
PriorityQueue<Set<Integer>> queue = new PriorityQueue<Set<Integer>>(16,
new Comparator<Set<Integer>>() {
public int compare(Set<Integer> a, Set<Integer> b) {
return b.size() - a.size(); // overflow-safe!
}
});
queue.addAll(sets); // we'll extract them in order from largest to smallest
List<Set<Integer>> result = new ArrayList<>();
while(!queue.isEmpty()) {
Set<Integer> largest = queue.poll();
result.add(largest);
Iterator<Set<Integer>> rest = queue.iterator();
while(rest.hasNext()) {
if(largest.containsAll(rest.next())) {
rest.remove();
}
}
}
Yeah, it consumes some extra memory, but it's idiomatic, straightforward, and possibly faster than another approach.
