ArrayList vs HashSet [duplicate] - java

I was working on this problem https://cses.fi/problemset/task/1660, and here's my code:
import java.util.*;
public class SubarraySumsI {
public static void main(String[] args) {
Scanner in = new Scanner (System.in);
int N = in.nextInt();
int X = in.nextInt();
ArrayList <Long> prefix = new ArrayList <Long> ();
int count = 0;
prefix.add(0l);
long prefixsum = 0;
for (int a = 0; a < N; a++) {
prefixsum += in.nextInt();
prefix.add(prefixsum);
if (prefix.contains(prefixsum - X)) {
count++;
}
}
System.out.println(count);
in.close();
}
}
I noticed that on a lot of the test cases, it was really slow. However, if I just change prefix from an ArrayList to a HashSet, it suddenly becomes a lot faster.
I'm not that experienced with using Sets and HashSets yet, so can someone explain what the difference is between ArrayList and HashSet?

You can read about the differences between List, Set, and Map here:
https://www.geeksforgeeks.org/difference-between-list-set-and-map-in-java/

The question that you asked
"Can someone explain what the difference is between ArrayList and HashSet?"
is too broad. There are many differences, and you should be able to find many resources that summarize them.
The (probable) specific thing that is causing a performance difference in your program will be this:
if (list.contains(prefixsum - X)) {
count++;
}
versus (presumably)
if (set.contains(prefixsum - X)) {
count++;
}
In the ArrayList case, the contains method has to test each element of the list until it either finds a match or reaches the end. The time taken to do that will typically be proportional to the number of elements in the list.
In the HashSet case, a hash table data structure is used which (typically) reduces the number of elements that need to be tested to a small number.
The net result is that contains is fast for a HashSet and slow for an ArrayList.
Explaining in detail how hash tables work in general and specifically in the HashSet case is beyond the scope of this Q&A.
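To make the difference concrete, here is a minimal sketch of the same prefix-sum counting with a HashSet in place of the ArrayList. The class and method names are made up for illustration, and the sketch assumes every input value is positive (so no two prefix sums are equal and a Set loses no information).

```java
import java.util.HashSet;
import java.util.Set;

class SubarraySumSketch {
    // Counts the subarrays whose elements add up to x.
    // Assumption: every element is positive, so all prefix sums are distinct.
    static int countSubarrays(int[] values, long x) {
        Set<Long> seen = new HashSet<>();
        seen.add(0L);                      // the empty prefix
        long prefixSum = 0;
        int count = 0;
        for (int v : values) {
            prefixSum += v;
            // Hash lookup: expected constant time, independent of how many sums are stored.
            if (seen.contains(prefixSum - x)) {
                count++;
            }
            seen.add(prefixSum);
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countSubarrays(new int[] {2, 4, 1, 2, 7}, 7)); // prints 3
    }
}
```

With an ArrayList the contains call inside the loop scans up to N stored sums, so the loop as a whole does on the order of N^2 comparisons; with the HashSet each lookup is expected to touch only a handful of elements.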

Related

USA Coding Olympiad 1st Timed Out and/or Too Much Memory

A few days ago, I participated in the USA coding olympiad for the first time, and got the same error on all my codes. I can't figure out why because it told me that I did incredibly well on the first test case, so I don't understand how the other 9 all timed out. Could someone please explain what is wrong with my code.
Problem
Error Message
import java.io.*;
public class milkmeasure {
private static int [] cows ={7,7,7};
public static void main(String[] args) throws IOException {
// initialize file I/O
BufferedReader br = new BufferedReader(new FileReader("measurement.in"));
PrintWriter pw = new PrintWriter(new BufferedWriter(new FileWriter("measurement.out")));
int N = Integer.parseInt(br.readLine());
String [] entries = new String [N];
for (int i=0;i<N;i++){
entries [i]= br.readLine();
}
int topCow = 1;
int finalN = 0;
for (int i=0; i<N; i++){
String lowEntry = entries[lownum(N,entries)];
String name = lowEntry.substring(2,lowEntry.substring(2).indexOf(" ")+2);
int effect = Integer.parseInt(lowEntry.substring(lowEntry.substring(2).indexOf(" ")+3));
if (name.equals("Bessie")){cows[1]+=effect;}
else if (name.equals("Elsie")){cows[2]+=effect;}
else if (name.equals("Mildred")){cows[0]+=effect;}
int newTop = findTop();
if (newTop!=topCow){finalN++;}
topCow = newTop;
entries[lownum(N,entries)]="101 ";
}
pw.println(finalN);
pw.close();
}
private static int lownum (int N, String [] entries){
int lowNum = 101;
int returnInt=0;
for (int i =0; i<N; i++){
int a = Integer.parseInt(entries[i].substring(0,entries[i].indexOf(" ")));
if (a<lowNum){
lowNum = a;
returnInt =i;
}
}
return returnInt;
}
private static int findTop (){
int maxval = 0;
int returnval =0;
for (int i =0; i<3; i++){
if (cows[i]>= maxval){
returnval += cows[i]*cows[i];
maxval=cows[i];
}
}
return returnval;
}
}
Algorithmic complexity issue
For each entry, your main() method invokes lownum() (twice). lownum() scans all the entries to identify and return the one with the lowest day number. Overall, then, the running time of your program grows at least quadratically in the number of entries, i.e. it is Ω(N^2).
That lower bound could be reduced to Ω(N log N) by sorting the entries once and then simply processing them in order.
With a reasonable bound on the maximum day number of the entries, and the given assurance that there is at most one entry per day, it could be reduced further to Ω(N) by assigning entries to an array or List at positions corresponding to their day numbers, so that no actual sorting is required.
It turns out that this is the main driver of your asymptotic complexity, so improving this lower bound allows you to improve the upper bound, too, all the way to O(N).
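The day-indexed approach can be sketched as follows. This is only an illustration under stated assumptions: the entry format "day name change" and the starting value 7 are taken from the code in the question, days are assumed to lie in 1..100 with at most one entry per day, and the rule that the display changes whenever the set of cows tied for the highest output changes is my reading of the task rather than something stated in this thread. The class and method names are hypothetical.

```java
class MilkMeasureSketch {
    // Returns how many times the set of top cows changes.
    static int countDisplayChanges(String[] entries) {
        // Parse every entry exactly once and file it under its day number.
        String[] nameByDay = new String[101];
        int[] changeByDay = new int[101];
        for (String entry : entries) {
            String[] fields = entry.trim().split("\\s+");
            int day = Integer.parseInt(fields[0]);
            nameByDay[day] = fields[1];
            changeByDay[day] = Integer.parseInt(fields[2]); // accepts "+3" as well as "-1"
        }

        int[] milk = {7, 7, 7};      // Bessie, Elsie, Mildred
        int display = 0b111;         // bit c is set when cow c is among the leaders
        int changes = 0;
        // Walking the days in increasing order replaces the repeated lownum() scans.
        for (int day = 1; day <= 100; day++) {
            if (nameByDay[day] == null) {
                continue;            // no measurement on this day
            }
            int cow = "BEM".indexOf(nameByDay[day].charAt(0)); // first letter identifies the cow
            milk[cow] += changeByDay[day];
            int max = Math.max(milk[0], Math.max(milk[1], milk[2]));
            int leaders = 0;
            for (int c = 0; c < 3; c++) {
                if (milk[c] == max) {
                    leaders |= 1 << c;
                }
            }
            if (leaders != display) {
                changes++;
            }
            display = leaders;
        }
        return changes;
    }
}
```

Each entry is parsed once and visited once, so the work is linear in the number of entries plus the fixed 100-day range.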
General efficiency issues
Since the problem specifies that there will be at most one entry per day for 100 days, however, you are probably in a regime that is still strongly influenced by (in-)efficiencies that affect the cost coefficient. And you in fact have quite a few inefficiencies. Among them:
You parse each entry many times, scanning to split them into fields and converting some of those into integers. That's terribly wasteful. It would be far more efficient to parse each entry just once, and then store the parsed results. In fact, you can get the parsing at input for almost free by using a Scanner.
You invoke the lownum() method twice for each entry. The current implementation of this method is expensive, as discussed above, and nothing changes between the first and second invocation that would affect the result.
(minor) you perform full string comparisons on the cow names, even though it would be sufficient to look only at their first letters
(minor) you invoke separate methods to find the next entry and to compute the new top cow. Method invocation is comparatively expensive, so it is a bit inefficient to make large numbers of invocations of methods that do very little work. That's probably not a significant effect for your particular code, however.

Efficient way to randomise numbers without duplications [duplicate]

I have used this code in order to randomise 1000000 numbers without duplications. Here's what I have so far.
protected void randomise() {
int[] copy = new int[getArray().length];
// used to indicate if elements have been used
boolean[] used = new boolean[getArray().length];
Arrays.fill(used,false);
for (int index = 0; index < getArray().length; index++) {
int randomIndex;
do {
randomIndex = getRandomIndex();
} while (used[randomIndex]);
copy[index] = getArray()[randomIndex];
used[randomIndex] = true;
}
for (int index = 0; index < getArray().length; index++) {
getArray()[index] = copy[index];
//Checks if elements in array have already been used
}
}
public static void main(String[] args) {
RandomListing count = new SimpleRandomListing(1000000);
//Will choose 1000000 random numbers
System.out.println(Arrays.toString(count.getArray()));
}
This method is too slow. Can you let me know how this can be done more efficiently? I appreciate all replies.
Regards,
A more efficient way to do this is to start with a pool of numbers (e.g. a List of all numbers between 0 and 1000000) and then remove the numbers that you've already used. That way, every time you pick a new number, it is guaranteed never to have been used before, and you don't spend time retrying until you hit a "good" unused number.
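One common way to implement the pool idea without paying for list removals is an in-place Fisher-Yates shuffle: the front of the array is the pool of unused values, and each pick is swapped to the back. The sketch below is an illustration, not the asker's class; the names are hypothetical.

```java
import java.util.Random;

class ShuffleSketch {
    // Shuffles values in place. Positions 0..i form the pool of values not yet placed.
    static void shuffle(int[] values, Random rnd) {
        for (int i = values.length - 1; i > 0; i--) {
            int j = rnd.nextInt(i + 1);   // pick a random unused position in 0..i
            int tmp = values[i];
            values[i] = values[j];        // move the pick out of the pool
            values[j] = tmp;
        }
    }

    public static void main(String[] args) {
        int[] numbers = new int[1000000];
        for (int i = 0; i < numbers.length; i++) {
            numbers[i] = i;
        }
        shuffle(numbers, new Random());
        System.out.println(numbers[0] + ", " + numbers[1] + ", " + numbers[2] + ", ...");
    }
}
```

Every element is picked exactly once and each pick costs constant time, so there are no retries. For a List, java.util.Collections.shuffle does the same job.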
It looks like you're using a linear search to find matches. Try using a binary search; it's more efficient. The array you are searching must be sorted for a binary search to work.

java - Remove nearly duplicates from a List

I have a List of Tweet objects (homegrown class) and I want to remove NEAR duplicates based on their text, using the Levenshtein distance. I have already removed the identical duplicates by hashing the tweets' texts, but now I want to remove texts that are identical except for up to 2-3 characters. Since this is an O(n^2) approach, I have to check every single tweet text against all the others. Here's my code so far:
int distance;
for(Tweet tweet : this.tweets) {
distance = 0;
Iterator<Tweet> iter = this.tweets.iterator();
while(iter.hasNext()) {
Tweet currentTweet = iter.next();
distance = Levenshtein.distance(tweet.getText(), currentTweet.getText());
if(distance < 3 && (tweet.getID() != currentTweet.getID())) {
iter.remove();
}
}
}
The first problem is that the code throws ConcurrentModificationException at some point and never completes. The second one: can I do anything better than this double loop? The list of tweets contains nearly 400.000 tweets so we're talking about 160 billion iterations!
This solution works for the question at hand (so far tested with the possible inputs), but the normal set operations for removing duplicates won't work unless you implement the full contract for compare, returning 1, 0, and -1.
Why don't you implement your own compare operation and use a Set, which can hold only distinct values? It is going to be O(n log(n)).
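Set<Tweet> set = new TreeSet<>(new Comparator<Tweet>() {
@Override
public int compare(Tweet first, Tweet second) {
int distance = Levenshtein.distance(first.getText(), second.getText());
if (distance < 3) {
return 0; // treat near duplicates as equal so the set keeps only one of them
}
return 1; // never returns -1, so this does not honour the full Comparator contract
}
});
set.addAll(this.tweets);
this.tweets = new ArrayList<Tweet>(set);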
As for the ConcurrentModificationException: As the others pointed out, I was removing elements from a list that I was also iterating in the outer for-each. Changing the for-each into a normal for resolved the problem.
As for the O(n^2) approach: there's no algorithm with a better complexity than O(n^2) here. What I'm trying to do is an "all-to-all" comparison to find near-duplicate elements. Of course there are optimizations that shrink n, and parallelization that processes sub-lists of the original list concurrently, but the complexity stays quadratic.
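A way to avoid the ConcurrentModificationException altogether is to build a new list of survivors instead of removing from the list being iterated. The sketch below works on plain strings and ships its own Levenshtein routine so that it is self-contained; the class and method names are hypothetical, and it keeps the first text of each group of near duplicates, which may differ from what the original loop intended. The length check is a cheap pre-filter: the edit distance between two strings is never smaller than the difference of their lengths.

```java
import java.util.ArrayList;
import java.util.List;

class NearDuplicateSketch {
    // Classic two-row dynamic-programming edit distance.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] swap = prev;
            prev = curr;
            curr = swap;
        }
        return prev[b.length()];
    }

    // Keeps a text only if it is at least `threshold` edits away from every text kept so far.
    static List<String> removeNearDuplicates(List<String> texts, int threshold) {
        List<String> kept = new ArrayList<>();
        for (String candidate : texts) {
            boolean nearDuplicate = false;
            for (String other : kept) {
                if (Math.abs(candidate.length() - other.length()) >= threshold) {
                    continue; // distance cannot be below the threshold, skip the expensive call
                }
                if (levenshtein(candidate, other) < threshold) {
                    nearDuplicate = true;
                    break;
                }
            }
            if (!nearDuplicate) {
                kept.add(candidate);
            }
        }
        return kept;
    }
}
```

The worst case is still quadratic in the number of texts, but the input list is never modified while it is being traversed, and the pre-filter skips many distance computations.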

ArrayLinkedList Insertion Sort

I have to do an Array List for an insertion sort and my teacher sent this back to me and gave me an F, but says I can make it up before Friday.
I do not understand why this isn't an A.L insertion sort.
Can someone help me fix this so it hits his criteria?
Thanks.
HE SAID:
After checking your first insertion sort you all did it incorrectly. I specifically said to shift the numbers and move the number into its proper place and NOT SWAP THE NUMBER INTO PLACE. In the assignment in MySA I said if you do this you will get a 0 for the assignment.
import java.util.ArrayList;
public class AListINSSORT {
private static void insertionSort(ArrayList<Integer> arr) {
insertionSort();
}
private static void insertionSort() {
ArrayList<Integer> swap = new ArrayList<Integer>();
swap.add(1);
swap.add(2);
swap.add(3);
swap.add(4);
swap.add(5);
int prior = 0;
int latter = 0;
for (int i = 2; i <= latter; i++)
{
for (int k = i; k > prior && (swap.get(k - 1) < swap.get(k - 2)); k--)
{
Integer temp = swap.get(k - 2);
swap.set(k - 2, swap.get(k - 1));
swap.set(k - 1, temp);
}
}
System.out.println(swap);
}
}
First of all, it seems your teacher asked you to use a LinkedList instead of an ArrayList. There is quite a difference between them.
Secondly, and maybe more to the point. In your inner loop you are saving a temp variable and swapping the elements at position k - 2 and k - 1 with each other. From the commentary this is not what your teacher intended. Since he wants you to solve the problem with element insertion, I recommend you look at the following method definition of LinkedList.add(int i, E e): https://docs.oracle.com/javase/7/docs/api/java/util/LinkedList.html#add(int,%20E).
This should point you in the right direction.
As far as I see, your code does nothing at all.
The condition of the outer for loop
for (int i = 2; i <= latter; i++)
is not fulfilled.
Since i starts at 2 and latter is 0, the condition i <= latter is never true.
Thus the body of the outer for loop never runs, and the method just prints the input values unchanged.
If you add the input values to swap in a different order (not already ordered), you will see that your code does not re-order them.
There's a lot of stuff wrong here.
Firstly, your method:
private static void insertionSort(ArrayList<Integer> arr) {
insertionSort();
}
takes an ArrayList and completely ignores it. This should presumably be the List which requires sorting.
Then in insertionSort() you create a new ArrayList, insert some numbers already in order, and then attempt something which looks nothing like insertion sort, but slightly more like bubble sort.
So, when you call insertionSort(List) it won't actually do anything to the list at all, all the work in insertionSort() happens to a completely different List!
Since on SO we don't generally do people's homework for them, I suggest looking at the nice little animated diagram on this page
What you should have then is something like:
public void insertionSort(LinkedList<Integer> numbers) {
//do stuff with numbers, using get() and add()
}
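For reference, the "shift, don't swap" idea from the teacher's comment can be sketched like this. It is an illustration with hypothetical names, written against the List interface; it is not the assignment's required solution. Each element is held aside as the key, larger elements are shifted one position to the right, and the key is then written once into the gap.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ShiftInsertionSortSketch {
    // Sorts the given list in place, ascending.
    static void insertionSort(List<Integer> numbers) {
        for (int i = 1; i < numbers.size(); i++) {
            int key = numbers.get(i);          // the value to be placed
            int j = i - 1;
            // Shift every larger element one slot to the right; no pairwise swaps.
            while (j >= 0 && numbers.get(j) > key) {
                numbers.set(j + 1, numbers.get(j));
                j--;
            }
            numbers.set(j + 1, key);           // drop the key into the gap
        }
    }

    public static void main(String[] args) {
        List<Integer> numbers = new ArrayList<>(Arrays.asList(5, 2, 4, 6, 1, 3));
        insertionSort(numbers);
        System.out.println(numbers); // prints [1, 2, 3, 4, 5, 6]
    }
}
```

The same method accepts a LinkedList, but get(int) and set(int, E) walk the list there, so an implementation based on a ListIterator would suit a linked list better.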

How to divide a set of numbers into two sets such that the difference of their sum is minimum

How to write a Java Program to divide a set of numbers into two sets such that the difference of the sum of their individual numbers, is minimum.
For example, I have an array containing integers- [5,4,8,2]. I can divide it into two arrays- [8,2] and [5,4]. Assuming that the given set of numbers, can have a unique solution like in above example, how to write a Java program to achieve the solution. It would be fine even if I am able to find out that minimum possible difference.
Let's say my method receives an array as parameter. That method has to first divide the array received into two arrays, and then add the integers contained in them. Thereafter, it has to return the difference between them, such that the difference is minimum possible.
P.S.- I have had a look around here, but couldn't find any specific solution to this. Most probable solution seemed to be given here- divide an array into two sets with minimal difference . But I couldn't gather from that thread how can I write a Java program to get a definite solution to the problem.
EDIT:
After looking at the comment of @Alexandru Severin, I tried a Java program. It works for one set of numbers [1,3,5,9], but doesn't work for another set [4,3,5,9, 11]. Below is the program. Please suggest changes:
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
public class FindMinimumDifference {
public static void main(String[] args) {
int[] arr= new int[]{4,3,5,9, 11};
FindMinimumDifference obj= new FindMinimumDifference();
obj.returnMinDiff(arr);
}
private int returnMinDiff(int[] array){
int diff=-1;
Arrays.sort(array);
List<Integer> list1= new ArrayList<>();
List<Integer> list2= new ArrayList<>();
int sumOfList1=0;
int sumOfList2=0;
for(int a:array){
for(Integer i:list1){
sumOfList1+=i;
}
for(Integer i:list2){
sumOfList2+=i;
}
if(sumOfList1<=sumOfList2){
list1.add(a);
}else{
list2.add(a);
}
}
List<Integer> list3=new ArrayList<>(list1);
List<Integer> list4= new ArrayList<>(list2);
Map<Integer, List<Integer>> mapOfProbables= new HashMap<Integer, List<Integer>>();
int probableValueCount=0;
for(int i=0; i<list1.size();i++){
for(int j=0; j<list2.size();j++){
if(abs(list1.get(i)-list2.get(j))<
abs(getSumOfEntries(list1)-getSumOfEntries(list2))){
List<Integer> list= new ArrayList<>();
list.add(list1.get(i));
list.add(list2.get(j));
mapOfProbables.put(probableValueCount++, list);
}
}
}
int minimumDiff=abs(getSumOfEntries(list1)-getSumOfEntries(list2));
List resultList= new ArrayList<>();
for(List probableList:mapOfProbables.values()){
list3.remove(probableList.get(0));
list4.remove(probableList.get(1));
list3.add((Integer)probableList.get(1));
list4.add((Integer)probableList.get(0));
if(minimumDiff>abs(getSumOfEntries(list3)-getSumOfEntries(list4))){
// valid exchange
minimumDiff=abs(getSumOfEntries(list3)-getSumOfEntries(list4));
resultList=probableList;
}
}
System.out.println(minimumDiff);
if(resultList.size()>0){
list1.remove(resultList.get(0));
list2.remove(resultList.get(1));
list1.add((Integer)resultList.get(1));
list2.add((Integer)resultList.get(0));
}
System.out.println(list1+""+list2); // the two resulting set of
// numbers with modified data giving expected result
return minimumDiff;
}
private static int getSumOfEntries(List<Integer> list){
int sum=0;
for(Integer i:list){
sum+=i;
}
return sum;
}
private static int abs(int i){
if(i<=0)
i=-i;
return i;
}
}
First of all, sorting the array and then alternately putting one member in the first group and the next in the second would never work, and here is why:
Given the input[1,2,3,100].
The result would be: [1,3] and [2,100], clearly wrong.
The correct answer should be: [1,2,3] and [100]
You can find many optimization algorithms on google for this problem, but since I assume you're a beginner, I'll try to give you a simple algorithm that you can implement:
sort the array
iterate from highest to lowest value
for each iteration, calculate the sum of each group, then add the element to the group with minimum sum
At the end of the loop you should have two fairly balanced arrays. Example:
Array: [1,5,5,6,7,10,20]
i1: `[20] []`
i2: `[20] [10]`
i3: `[20] [10,7]`
i4: `[20] [10,7,6]`
i5: `[20,5] [10,7,6]`
i6: `[20,5] [10,7,6,5]`
i7: `[20,5,1] [10,7,6,5]`
Where the sums are 26 and 28. As you can see, we can optimize the solution further: if we exchange 5 and 6, resulting in [20,6,1] and [10,7,5,5], the sums are equal (27 each).
For this step you can:
find all groups of elements (x,y) where x is in group1, y is in group2, and |x-y| < |sum(group1) - sum(group2)|
loop all groups and try exchanging x with y until you get a minimum difference
after each exchange, check if the minimum value in the group with the highest sum is higher than the difference of the groups; if so, transfer it to the other group
This algorithm usually gets much closer to the best solution than the plain greedy approach, although it is not guaranteed to find the optimum. It is also not optimal in terms of complexity, speed, and memory. If you need it for very large arrays and resources are limited, the best algorithm may differ depending on the speed/memory ratio and the accepted error percentage.
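The first pass described above (sort, then hand each value from largest to smallest to the group with the smaller sum) can be sketched as follows. The names are hypothetical, ties go to the first group, and the exchange step is left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class GreedyPartitionSketch {
    // Returns the two groups produced by the largest-first greedy pass.
    static List<List<Integer>> split(int[] values) {
        int[] sorted = values.clone();
        Arrays.sort(sorted);                   // ascending; iterated from the end below
        List<Integer> group1 = new ArrayList<>();
        List<Integer> group2 = new ArrayList<>();
        long sum1 = 0;
        long sum2 = 0;
        for (int i = sorted.length - 1; i >= 0; i--) {
            // Running sums avoid re-adding each group on every iteration.
            if (sum1 <= sum2) {
                group1.add(sorted[i]);
                sum1 += sorted[i];
            } else {
                group2.add(sorted[i]);
                sum2 += sorted[i];
            }
        }
        return Arrays.asList(group1, group2);
    }

    public static void main(String[] args) {
        System.out.println(split(new int[] {1, 5, 5, 6, 7, 10, 20}));
        // prints [[20, 5, 1], [10, 7, 6, 5]]
    }
}
```

On the example array this reproduces the groups with sums 26 and 28, which is the starting point for the exchange step.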
This is a variation on the Partition Problem https://en.wikipedia.org/wiki/Partition_problem
If you want the optimal solution you have to test every possible combination of output sets. That may be feasible for small sets but is infeasible for large inputs.
One good approximation is the greedy algorithm I present below.
This heuristic works well in practice when the numbers in the set are
of about the same size as its cardinality or less, but it is not
guaranteed to produce the best possible partition.
First you need to put your input in a sortable collection such as a List.
1) Sort the input collection.
2) Create 2 result sets.
3) Iterate over the sorted input. If the index is even put the item in result1 else put the item in result2.
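List<Integer> input = new ArrayList<Integer>(); // fill this with the numbers to split
Collections.sort(input);
List<Integer> result1 = new ArrayList<Integer>(); // Lists rather than Sets, so duplicate values are kept
List<Integer> result2 = new ArrayList<Integer>();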
for (int i = 0; i < input.size(); i++) {
if (i % 2 == 0) {// if i is even
result1.add(input.get(i));
} else {
result2.add(input.get(i));
}
}
I seem to have got the perfect solution for this. Below Java program works perfectly. Only assumption is that, the given problem has unique solution (just one solution). This assumption implies- only non-zero number. I am putting the program below. I request everyone to tell if the program could fail for certain scenario, or if it could be improved/optimized in some way. Credits to Mr Alexandru Severin's algorithm posted as one of the answers in this thread.
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
public class FindMinimumDifference {
static List<Integer> list1= new ArrayList<>();
static List<Integer> list2= new ArrayList<>();
public static void main(String[] args) {
int[] arr= new int[]{3,-2,9,7};
// tested for these sample data:- [1,5,9,3] ; [4,3,5,9,11] ;
//[7,5,11,2,13,15,14] ; [3,2,1,7,9,11,13] ;
//[3,1,0,5,6,9] ; [6,8,10,2,4,0] ; [3,1,5,7,0] ; [4,-1,5,-3,7] ; [3,-2,9,7]
System.out.println("the minimum possible difference is: "+returnMinDiff(arr));
System.out.println("the two resulting set of nos. are: "+list1+" and "+list2);
}
private static int returnMinDiff(int[] array){
int diff=-1;
Arrays.sort(array);
for(int a:array){
int sumOfList1=0;
int sumOfList2=0;
for(Integer i:list1){
sumOfList1+=i;
}
for(Integer i:list2){
sumOfList2+=i;
}
if(sumOfList1<=sumOfList2){
list1.add(a);
}else{
list2.add(a);
}
}
List<Integer> list3=new ArrayList<>(list1);
List<Integer> list4= new ArrayList<>(list2);
if(list3.size()!=list4.size()){ // both list should contain equal no. of entries.
//If not, add 0 to the list having lesser no. of entries
if(list3.size()<list4.size()){
list3.add(0);
}else{
list4.add(0);
}
}
Map<Integer, List<Integer>> mapOfProbables= new HashMap<Integer, List<Integer>>();
int probableValueCount=0;
for(int i=0; i<list3.size();i++){
for(int j=0; j<list4.size();j++){
if(abs(list3.get(i)-list4.get(j))
<abs(getSumOfEntries(list3)-getSumOfEntries(list4))){
List<Integer> list= new ArrayList<>();
list.add(list3.get(i));
list.add(list4.get(j));
mapOfProbables.put(probableValueCount++, list);
}
}
}
int minimumDiff=abs(getSumOfEntries(list1)-getSumOfEntries(list2));
List resultList= new ArrayList<>();
for(List probableList:mapOfProbables.values()){
list3=new ArrayList<>(list1);
list4= new ArrayList<>(list2);
list3.remove(probableList.get(0));
list4.remove(probableList.get(1));
list3.add((Integer)probableList.get(1));
list4.add((Integer)probableList.get(0));
if(minimumDiff>abs(getSumOfEntries(list3)-getSumOfEntries(list4))){ // valid exchange
minimumDiff=abs(getSumOfEntries(list3)-getSumOfEntries(list4));
resultList=probableList;
}
}
if(resultList.size()>0){ // forming the two set of nos. whose difference of sum comes out to be minimum
list1.remove(resultList.get(0));
list2.remove(resultList.get(1));
if(!resultList.get(1).equals(0) ) // (resultList.get(1).equals(0) && !list1.contains(0))
list1.add((Integer)resultList.get(1));
if(!resultList.get(0).equals(0) || (resultList.get(0).equals(0) && list2.contains(0)))
list2.add((Integer)resultList.get(0));
}
return minimumDiff; // returning the minimum possible difference
}
private static int getSumOfEntries(List<Integer> list){
int sum=0;
for(Integer i:list){
sum+=i;
}
return sum;
}
private static int abs(int i){
if(i<=0)
i=-i;
return i;
}
}
For this question, assume that we can divide the array into two subarrays such that their sums are equal. (Even if they are not equal, it will work.)
So if the sum of elements in array is S. Your goal is to find a subset with sum S/2. You can write a recursive function for this.
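int difference = Integer.MAX_VALUE;
int S; // sum of all elements of the array; set this before the first call
Set<Integer> bestSet = new HashSet<>(); // indices of the best subset found so far
public void recursiveSum(int[] array, int presentSum, int index, Set<Integer> presentSet) {
if (index == array.length) {
// the two groups sum to presentSum and S - presentSum
if (Math.abs(S - 2 * presentSum) < difference) {
difference = Math.abs(S - 2 * presentSum);
bestSet = new HashSet<>(presentSet); // presentSet is your answer
}
return; // always stop at the end of the array
}
recursiveSum(array, presentSum, index + 1, presentSet); // don't consider the present element in the final solution
presentSet.add(index); // store the index, so that duplicate values are handled correctly
recursiveSum(array, presentSum + array[index], index + 1, presentSet); // consider the present element in the final solution
presentSet.remove(index); // undo the choice before returning to the caller
}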
You can also write an equivalent dynamic programming solution for this, which runs in O(N*S) time.
I was just demonstrating the idea.
So when you find this set with sum S/2, automatically you have divided the array in to two parts with same sum (S/2 here).
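The dynamic-programming alternative mentioned above can be sketched as a subset-sum table. This is a minimal illustration with hypothetical names, and it assumes all values are non-negative integers whose total is small enough for an array of that size. It reports only the minimum difference, not the two groups.

```java
class PartitionDpSketch {
    // Returns the smallest possible |sum(group1) - sum(group2)|.
    // Assumption: every value is non-negative.
    static int minDifference(int[] values) {
        int total = 0;
        for (int v : values) {
            total += v;
        }
        int half = total / 2;
        // reachable[s] is true when some subset of the values seen so far sums to s.
        boolean[] reachable = new boolean[half + 1];
        reachable[0] = true;
        for (int v : values) {
            // Go downwards so that each value is used at most once.
            for (int s = half; s >= v; s--) {
                if (reachable[s - v]) {
                    reachable[s] = true;
                }
            }
        }
        // The reachable sum closest to total / 2 gives the best split.
        for (int s = half; s >= 0; s--) {
            if (reachable[s]) {
                return total - 2 * s;
            }
        }
        return total; // not reached, because reachable[0] is always true
    }

    public static void main(String[] args) {
        System.out.println(minDifference(new int[] {5, 4, 8, 2}));   // prints 1
        System.out.println(minDifference(new int[] {1, 2, 3, 100})); // prints 94
    }
}
```

The running time is proportional to the number of elements times half the total sum, so it is practical only when the sum is moderate.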
It seems that you are more interested in the algorithm than the code. So, here is a sketch of it in Java (it needs java.util.ArrayList and java.util.List):
static int minDifference(int[] A) { // A must already be sorted in descending order
List<Integer> a1 = new ArrayList<>(), a2 = new ArrayList<>(); // the two sub-arrays
int sum1 = 0, sum2 = 0; // the sums of the elements of the two sub-arrays
for (int i = 0; i < A.length; i++) {
// put A[i] on the side that gives the smaller absolute difference, then update that side's sum
if (Math.abs(sum1 + A[i] - sum2) <= Math.abs(sum2 + A[i] - sum1)) {
a1.add(A[i]);
sum1 += A[i];
} else {
a2.add(A[i]);
sum2 += A[i];
}
}
return Math.abs(sum1 - sum2); // a1 and a2 hold the two groups
}
This will work for all integers, positive or negative.
