Proper way to iterate over a set while keeping track of index - java

I want to essentially map a set to an array after calling someMethod on each element. Should I loop through using a foreach loop and a "i" variable outside, or use a regular for loop and iterator?
int[] arr = new int[set.size()];
int i = 0;
for(int ele : set) {
arr[i] = someMethod(ele);
i++;
}
or
int[] arr = new int[set.size()];
Iterator<Integer> iterator = set.iterator();
for(int i=0; i<set.size(); i++) {
arr[i] = someMethod(iterator.next().intValue());
}

Both methods are equally good. The for-each loop compiles down to the same iterator calls, so any speed difference between the two is negligible.
There are also other ways to convert a set to an array. Note that toArray() and Arrays.copyOf produce an Integer[], not an int[], because a Set<Integer> holds objects.
You can use the toArray() method:
Integer[] arr = set.toArray(new Integer[0]);
or you can use a stream in Java 8 or above to get an int[] directly:
int[] arr = set.stream().mapToInt(Integer::intValue).toArray();
Another method is to use Arrays.copyOf:
Integer[] arr = Arrays.copyOf(set.toArray(), set.size(), Integer[].class);

tl;dr
Both of your loops are equivalent.
You can use streams to make a one-liner, where the stream does the looping for you.
int[] arrayOfInts =
Set
.of( 1 , 2 , 3 )
.stream()
.mapToInt( Integer :: intValue )
.map( integer -> Math.multiplyExact( integer , integer ) )
.sorted()
.toArray();
arrayOfInts = [1, 4, 9]
Either loop is good
Both of your loops work well.
Choose whichever is easiest to read and understand. For me that would usually be the for-each syntax, as seen in your first loop.
By the way, that first loop could be shortened. The i++ syntax pulls out and utilizes the value of i before incrementing. So you can nest the i++ inside your array index accessor.
Set < Integer > set = Set.of( 1 , 2 , 3 );
int[] arr = new int[ set.size() ];
int i = 0;
for ( int element : set ) {
arr[ i++ ] = Math.multiplyExact( element , element ); // Auto-unboxing converts `Integer` objects into `int` primitive values.
}
System.out.println( "arr = " + Arrays.toString( arr ) );
arr = [1, 9, 4]
Stream instead of loop
The Answer by Pranav Choudhary is correct. But you also mentioned wanting to apply a method to modify your number before assigning to the array.
Java streams make it easy to perform such a modification during collection.
Define our Set of Integer objects.
Set < Integer > set = Set.of( 1 , 2 , 3 );
Create a stream of int primitives, unboxed from our Integer objects.
IntStream intStream = set.stream().mapToInt( Integer :: intValue );
Apply your modification. Here we square each number. The Math.multiplyExact method is handy because it will throw an ArithmeticException if we overflow the limits of the 32-bit int type. We collect each resulting square as an int in our array.
int[] arrayOfInts = intStream.map( integer -> Math.multiplyExact( integer , integer ) ).toArray();
Dump to console.
System.out.println( "arrayOfInts = " + Arrays.toString( arrayOfInts ) );
When run.
arrayOfInts = [9, 4, 1]
Notice the order of the output. A Set by definition has no determined iteration order. To emphasize that, the Set.of method reserves the right to use an implementation of Set that randomly changes its iteration order. So each time you run this code you may see a different ordering of results.
If you care about order, add that to your stream work. Add a call to sorted().
int[] arrayOfInts = intStream.map( integer -> Math.multiplyExact( integer , integer ) ).sorted().toArray();
When run, we get consistent ordering. See this code run live at IdeOne.com.
arrayOfInts = [1, 4, 9]
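If the order matters before the mapping step rather than after it, another option is to pick a Set implementation with a defined iteration order. The sketch below is only an illustration: the class name OrderedSetDemo and the helper squares are made up for this example. It relies on LinkedHashSet iterating in insertion order and TreeSet iterating in ascending order.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.TreeSet;

final class OrderedSetDemo {
    // Hypothetical helper: squares each element in the set's own iteration order.
    static int[] squares(Set<Integer> set) {
        return set.stream()
                .mapToInt(Integer::intValue)
                .map(n -> Math.multiplyExact(n, n))
                .toArray();
    }

    public static void main(String[] args) {
        // LinkedHashSet iterates in insertion order: 3, 1, 2.
        Set<Integer> inserted = new LinkedHashSet<>(Arrays.asList(3, 1, 2));
        System.out.println(Arrays.toString(squares(inserted)));

        // TreeSet iterates in ascending order: 1, 2, 3.
        Set<Integer> sorted = new TreeSet<>(Arrays.asList(3, 1, 2));
        System.out.println(Arrays.toString(squares(sorted)));
    }
}
```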
Not that I recommend it, but we could turn this code into a one-liner.
int[] arrayOfInts = Set.of( 1 , 2 , 3 ).stream().mapToInt( Integer :: intValue ).map( integer -> Math.multiplyExact( integer , integer ) ).sorted().toArray();
Or:
int[] arrayOfInts =
Set
.of( 1 , 2 , 3 )
.stream()
.mapToInt( Integer :: intValue )
.map( integer -> Math.multiplyExact( integer , integer ) )
.sorted()
.toArray();
See this code run live at IdeOne.com.
arrayOfInts = [1, 4, 9]

Both versions are correct in the sense that they both give the same result.
However, in my opinion the first version is the preferable of the two.
It is easier to understand for an average Java programmer.
It is more concise. Fewer lines of code. Simpler statements.
It will probably be faster ... unless the optimizer is amazingly clever.
If it was me, I would save a line and write it like this:
for(int ele : set) {
arr[i++] = someMethod(ele);
}
but that is just my old C habits shining through.
As others have pointed out, in Java 8+ there is an even more concise way to write this; e.g. see this answer.

With java8:
Integer[] arr = set.stream().map(element -> doSomething(element)).toArray(Integer[]::new);
Where doSomething() stands for the method you want to call on each element (someMethod in your question).
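If you want an int[] as in the question rather than an Integer[], a minimal sketch could look like the following. The body of someMethod is a placeholder I made up (it multiplies by 10), and the class name SetToArrayDemo is likewise hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

final class SetToArrayDemo {
    // Placeholder for the question's someMethod; the real logic is unknown.
    static int someMethod(int value) {
        return value * 10;
    }

    // Applies someMethod to every element and collects the results as primitives.
    static int[] mapToArray(Set<Integer> set) {
        return set.stream()
                .mapToInt(e -> someMethod(e)) // e is unboxed from Integer to int
                .toArray();
    }

    public static void main(String[] args) {
        // LinkedHashSet is used here only to make the output order predictable.
        Set<Integer> set = new LinkedHashSet<>(Arrays.asList(1, 2, 3));
        System.out.println(Arrays.toString(mapToArray(set)));
    }
}
```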


Filter out element one by one using filter()

I will kick start with my question:
I have an array:
int[] arr = { 1,2,3,4,5 }; , i want to store values to a List<Integer> li like this : 14,13,12,11,10
How these values came to the List li like this??
Our initial numbers are 1,2 ,3 ,4 , and 5 . We can calculate the following sums using four of the five integers:
If we sum everything except 1, our sum is 14.
If we sum everything except 2, our sum is 13.
If we sum everything except 3, our sum is 12.
If we sum everything except 4, our sum is 11.
If we sum everything except 5, our sum is 10.
My approach and thoughts:
I thought i already have an int [] arr , so i will make it to stream , now i will filter out each elements one by one and will sum rest in each iteration and will add this to List li.
List<Integer> li = IntStream.range(0,1).filter(i-> arr[i] !=i).sum();
^^ This did not work. Can I do something like the following instead?
IntStream.range(0,1).filter(i-> filter(this is anotherfilter)).sum();
I am not able to understand this, i want to do this problem with streams and java-8.
You can break it into two steps and perform the operation as:
int[] arr = { 1,2,3,4,5 };
int total = Arrays.stream(arr).sum(); // total of the array
List<Integer> output = Arrays.stream(arr)
.mapToObj(integer -> total - integer) // (total - current) value as element
.collect(Collectors.toList());
IntStream.range(0, arr.length)
.map(x -> IntStream.of(arr).sum() - arr[x])
.forEachOrdered(System.out::println);
Or IntStream.of(arr).sum() can be computed only once as a single variable.
Supplier<IntStream> supplier = () -> IntStream.range(1, 6);
supplier.get()
.map(integer -> supplier.get().sum() - integer)
.forEach(item -> System.out.println(item));
Instead of forEach, you can use collect(Collectors.toList()) to collect integers to the list
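For comparison, the asker's original idea (skip one position, sum the rest) can also be written literally with a nested index stream. This is only a sketch with made-up names (SumExceptDemo, sumsExcludingEach). It does O(n^2) work, so the total-minus-element versions above are usually preferable, but it filters by index rather than by value, which keeps it correct when the array contains duplicates.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

final class SumExceptDemo {
    // For each index, sums every element except the one at that index.
    static List<Integer> sumsExcludingEach(int[] arr) {
        return IntStream.range(0, arr.length)
                .map(skip -> IntStream.range(0, arr.length)
                        .filter(j -> j != skip) // leave out exactly one position
                        .map(j -> arr[j])
                        .sum())
                .boxed()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(sumsExcludingEach(new int[] {1, 2, 3, 4, 5}));
    }
}
```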

How to set the same value for a range in an array in Java in one line?

Say I have an array of 100 elements and I want a certain range of indices to have a certain value. For example indices 0 through 10 will have "Bob," indices 11 - 57 will have "Jake," and indices 58-99 will have "John". Is that possible to do in one line? Thank you
Yes. You could use an IntStream.range(int, int) to generate the range, then map to a String with nested ternaries. Like,
System.out.println(Arrays.toString(IntStream.range(0, 100)
.mapToObj(i -> i < 11 ? "Bob" : i < 58 ? "Jake" : "John")
.toArray()));
Here's a one-liner for Bob.
for(int i=0; i<=10; i++) myArray[i]="Bob";
I'd write a method:
static void fillRange(String[] arr, int from, int to, String val){
for(int i=from; i<=to; i++) arr[i]=val;
}
and then call it:
fillRange(myArray, 0, 10, "Bob"); fillRange(myArray, 11, 57, "Jake");...
But you must make sure that your array contains the required range already:
int arrLength = 100;
....
String[] myArray = new String[arrLength];
In many languages it would be simple to set up all the parameters in a tuple and evaluate them as a list comprehension. However, Java isn't really well suited for doing this type of thing in a single statement.
Arrays.stream(new Object[] {
new Object[] {0,10,"Bob"},
new Object[] {11,57,"Jake"},
new Object[] {58,99,"John"}
}).forEach(objs -> Arrays.fill(myArray,
(int)(((Object[]) objs)[0]),
((int)(((Object[]) objs)[1]))+1,
((Object[]) objs)[2])
);
Or if you are willing to abuse the ternary operator, you could use this lovely gem (credits to Johannes Kuhn):
Arrays.setAll(myArray, i -> i <= 10 ? "Bob" : i <= 57 ? "Jake" : "John");
However, the moral of the story is: Just because you can do something in one line, doesn't mean you should. Clarity is more important.
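In that spirit, a plain multi-line version is probably the easiest to read. The sketch below uses the range overload of Arrays.fill, whose upper bound is exclusive, so "0 through 10" becomes fill(names, 0, 11, ...). The class and method names are made up for illustration.

```java
import java.util.Arrays;

final class FillRangeDemo {
    // Builds the 100-element array described in the question.
    static String[] buildNames() {
        String[] names = new String[100];
        Arrays.fill(names, 0, 11, "Bob");    // indices 0..10
        Arrays.fill(names, 11, 58, "Jake");  // indices 11..57
        Arrays.fill(names, 58, 100, "John"); // indices 58..99
        return names;
    }

    public static void main(String[] args) {
        String[] names = buildNames();
        System.out.println(names[10] + " " + names[11] + " " + names[57] + " " + names[58]);
    }
}
```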

Java 8 Stream and operation on arrays

I have just discovered the new Java 8 stream capabilities. Coming from Python, I was wondering if there was now a neat way to do operations on arrays like summing, multiplying two arrays in a "one line pythonic" way ?
Thanks
There are new methods added to java.util.Arrays to convert an array into a Java 8 stream which can then be used for summing etc.
int sum = Arrays.stream(myIntArray).sum();
Multiplying two arrays is a little more difficult because I can't think of a way to get the value AND the index at the same time as a Stream operation. This means you probably have to stream over the indexes of the array.
//in this example a[] and b[] are same length
int[] a = ...
int[] b = ...
int[] result = new int[a.length];
IntStream.range(0, a.length).forEach(i -> result[i] = a[i] * b[i]);
Commenter @Holger points out you can use the map method instead of forEach like this:
int[] result = IntStream.range(0, a.length).map(i -> a[i] * b[i]).toArray();
You can turn an array into a stream by using Arrays.stream():
int[] ns = new int[] {1,2,3,4,5};
Arrays.stream(ns);
Once you've got your stream, you can use any of the methods described in the documentation, like sum() or whatever. You can map or filter like in Python by calling the relevant stream methods with a Lambda function:
Arrays.stream(ns).map(n -> n * 2);
Arrays.stream(ns).filter(n -> n % 4 == 0);
Once you're done modifying your stream, you then call toArray() to convert it back into an array to use elsewhere:
int[] ns = new int[] {1,2,3,4,5};
int[] ms = Arrays.stream(ns).map(n -> n * 2).filter(n -> n % 4 == 0).toArray();
Be careful if you have to deal with large numbers.
int[] arr = new int[]{Integer.MIN_VALUE, Integer.MIN_VALUE};
long sum = Arrays.stream(arr).sum(); // Wrong: sum == 0
The sum above is not 2 * Integer.MIN_VALUE.
You need to do this in this case.
long sum = Arrays.stream(arr).mapToLong(Long::valueOf).sum(); // Correct
Please note that Arrays.stream(arr) creates an IntStream (or a LongStream or DoubleStream, depending on the array type) instead of a Stream, so the map function cannot be used to change the element type. This is why the mapToLong, mapToObj, ... functions are provided.
Take a look at why-cant-i-map-integers-to-strings-when-streaming-from-an-array
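Putting the pieces from this thread together, here is a small sketch of element-wise multiplication and an overflow-safe sum. The names ArrayOpsDemo, multiply and sumAsLong are my own, and asLongStream() is used as an alternative to mapToLong for widening before summing.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

final class ArrayOpsDemo {
    // Element-wise product; streams over indices because both arrays are needed.
    static int[] multiply(int[] a, int[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("arrays must have the same length");
        }
        return IntStream.range(0, a.length).map(i -> a[i] * b[i]).toArray();
    }

    // Widens each int to long before summing, so the sum itself cannot overflow int.
    static long sumAsLong(int[] values) {
        return Arrays.stream(values).asLongStream().sum();
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(multiply(new int[] {1, 2, 3}, new int[] {4, 5, 6})));
        System.out.println(sumAsLong(new int[] {Integer.MIN_VALUE, Integer.MIN_VALUE}));
    }
}
```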

Any shortcut to initialize all array elements to zero?

In C/C++ I used to do
int arr[10] = {0};
...to initialize all my array elements to 0.
Is there a similar shortcut in Java?
I want to avoid using the loop, is it possible?
int arr[] = new int[10];
for(int i = 0; i < arr.length; i++) {
arr[i] = 0;
}
A default value of 0 for arrays of integral types is guaranteed by the language spec:
Each class variable, instance variable, or array component is initialized with a default value when it is created (§15.9, §15.10) [...] For type int, the default value is zero, that is, 0.
If you want to initialize a one-dimensional array to a different value, you can use java.util.Arrays.fill() (which will of course use a loop internally).
While the other answers are correct (int array values are by default initialized to 0), if you wanted to explicitly do so (say for example if you wanted an array filled with the value 42), you can use the fill() method of the Arrays class:
int [] myarray = new int[num_elts];
Arrays.fill(myarray, 42);
Or if you're a fan of 1-liners, you can use the Collections.nCopies() routine:
Integer[] arr = Collections.nCopies(3, 42).toArray(new Integer[0]);
Would give arr the value:
[42, 42, 42]
(though it's Integer, and not int; if you need the primitive type you could defer to the Apache Commons ArrayUtils.toPrimitive() routine):
int [] primarr = ArrayUtils.toPrimitive(arr);
In Java, all array elements of the primitive integer types (byte, short, int, long) are initialised to 0 by default, so you can skip the loop.
How does the explicit loop reduce the performance of your application? Read on.
The Java Language Specification gives the default (initial) value for each type as follows.
For type byte, the default value is zero, that is, the value of (byte)0.
For type short, the default value is zero, that is, the value of (short)0.
For type int, the default value is zero, that is, 0.
For type long, the default value is zero, that is, 0L.
For type float, the default value is positive zero, that is, 0.0f.
For type double, the default value is positive zero, that is, 0.0d.
For type char, the default value is the null character, that is, '\u0000'.
For type boolean, the default value is false.
For all reference types, the default value is null.
Considering all this, you don't need to initialize the array elements with zero values, because by default all elements of an int array are 0.
This is because an array is a container object that holds a fixed number of values of a single type.
The type of your array is int, so every element automatically gets the default value 0.
Likewise, in an array of String, every element has the default value null.
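A quick way to see these defaults is to print freshly allocated arrays without assigning anything to them. The class and method names in this sketch are made up for the example.

```java
import java.util.Arrays;

final class DefaultValuesDemo {
    // Returns printable views of new arrays; no element is ever assigned.
    static String[] defaults() {
        return new String[] {
            Arrays.toString(new int[3]),
            Arrays.toString(new double[2]),
            Arrays.toString(new boolean[2]),
            Arrays.toString(new String[2])
        };
    }

    public static void main(String[] args) {
        for (String line : defaults()) {
            System.out.println(line);
        }
    }
}
```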
Why not do it anyway?
You can assign zero values using a loop, as you suggest in your question.
int arr[] = new int[10];
for(int i=0;i<arr.length;i++)
arr[i] = 0;
But if you do so, it is a useless waste of machine cycles.
If your application has many arrays and you do this for each of them, it can affect application performance to a noticeable degree.
More machine cycles used ==> more time to process the data ==> longer time to produce output. Your application's data processing will therefore be slower than it needs to be.
You can save the loop, initialization is already made to 0. Even for a local variable.
But please correct the place where you place the brackets, for readability (recognized best-practice):
int[] arr = new int[10];
If you are using Float or Integer then you can assign default value like this ...
Integer[] data = new Integer[20];
Arrays.fill(data, Integer.valueOf(0)); // new Integer(0) also works but is deprecated in newer Java versions
You can create a new zero-filled array of the same size as your existing array and assign it back to your array variable. This may be faster than the other approaches.
Snippet:
package com.array.zero;
public class ArrayZero {
public static void main(String[] args) {
// Your array with data
int[] yourArray = { 1, 2, 3, 4, 5, 6, 7, 8, 9 };
//Creating same sized array with 0
int[] tempArray = new int[yourArray.length];
// Assigning temp array to replace values by zero [0]
yourArray = tempArray;
//testing the array size and value to be zero
for (int item : yourArray) {
System.out.println(item);
}
}
}
Result :
0
0
0
0
0
0
0
0
0
Initialization is not required in the case of zero, because the default value of int in Java is zero.
For values other than zero java.util.Arrays provides a number of options, simplest one is fill method.
int[] arr = new int[5];
Arrays.fill(arr, -1);
System.out.println(Arrays.toString(arr)); // [-1, -1, -1, -1, -1]
int [] arr = new int[5];
// fill value -1 from index 0, inclusive, to index 3, exclusive
Arrays.fill(arr, 0, 3, -1);
System.out.println(Arrays.toString(arr)); // [-1, -1, -1, 0, 0]
We can also use Arrays.setAll() if we want to fill value on condition basis:
int[] array = new int[20];
Arrays.setAll(array, p -> p > 10 ? -1 : p);
int[] arr = new int[5];
Arrays.setAll(arr, i -> i);
System.out.println(Arrays.toString(arr)); // [0, 1, 2, 3, 4]
The int values are already zero after initialization, as everyone has mentioned. If you have a situation where you actually do need to set array values to zero and want to optimize that, use System.arraycopy:
static private int[] zeros = new int[64];
...
int[] values = ...
if (zeros.length < values.length) zeros = new int[values.length];
System.arraycopy(zeros, 0, values, 0, values.length);
This uses memcpy under the covers in most or all JRE implementations. Note the use of a static like this is safe even with multiple threads, since the worst case is multiple threads reallocate zeros concurrently, which doesn't hurt anything.
You could also use Arrays.fill as some others have mentioned. Arrays.fill could use memcpy in a smart JVM, but is probably just a Java loop and the bounds checking that entails.
Benchmark your optimizations, of course.
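For reference, here is a self-contained sketch of both zeroing approaches side by side. The names ZeroingDemo, zeroWithArraycopy and zeroWithFill are my own. One deliberate difference from the snippet above, which is a design choice on my part rather than something the answer states: the shared buffer is read into a local variable once, so the length check and the copy are guaranteed to use the same array even if another thread swaps the field in between.

```java
import java.util.Arrays;

final class ZeroingDemo {
    // Shared all-zero buffer; it is only ever read from, never written to.
    private static int[] zeros = new int[64];

    static void zeroWithArraycopy(int[] values) {
        int[] source = zeros; // read the shared field once
        if (source.length < values.length) {
            source = new int[values.length]; // new arrays start out all zero
            zeros = source;
        }
        System.arraycopy(source, 0, values, 0, values.length);
    }

    static void zeroWithFill(int[] values) {
        Arrays.fill(values, 0);
    }

    public static void main(String[] args) {
        int[] a = {1, 2, 3};
        zeroWithArraycopy(a);
        System.out.println(Arrays.toString(a));
    }
}
```

Which of the two is faster depends on the JVM, so measure before committing to either.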
In C/C++ the only shortcut is to initialize the array with a single zero, which zero-fills the remaining elements. Ex:
int arr[10] = {0};
But in java there is a magic tool called Arrays.fill() which will fill all the values in an array with the integer of your choice.Ex:
import java.util.Arrays;
public class Main
{
public static void main(String[] args)
{
int ar[] = {2, 2, 1, 8, 3, 2, 2, 4, 2};
Arrays.fill(ar, 10);
System.out.println("Array completely filled" +
" with 10\n" + Arrays.toString(ar));
}
}
You defined it correctly in your question, it is nearly the same as for C++. All you need to do for the primitive data type is to initialize the array. Default values are for int 0.
int[] intArray = new int[10];
you can simply do the following
int[] arrayOfZeros= new int[SizeVar];
declare the array as instance variable in the class i.e. out of every method and JVM will give it 0 as default value. You need not to worry anymore
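Yes, int values in an array are initialized to zero, and the language specification guarantees this. That said, Oracle's Java tutorial describes relying on default values for uninitialized fields as generally bad programming style, so an explicit initialization can still be clearer.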
Yet another approach, using a lambda in Java 8 or later:
Arrays.stream(new Integer[nodelist.size()]).map(e ->
Integer.MAX_VALUE).toArray(Integer[]::new);
int a=7, b=7 ,c=0,d=0;
int dizi[][]=new int[a][b];
for(int i=0;i<a;i++){
for(int q=d;q<b;q++){
dizi[i][q]=c;
System.out.print(dizi[i][q]);
c++;
}
c-=b+1;
System.out.println();
}
result
0123456
-1012345
-2-101234
-3-2-10123
-4-3-2-1012
-5-4-3-2-101
-6-5-4-3-2-10

Algorithm - How to delete duplicate elements in a list efficiently?

There is a list L. It contains elements of arbitrary type each.
How to delete all duplicate elements in such list efficiently? ORDER must be preserved
Just an algorithm is required, so no import any external library is allowed.
Related questions
In Python, what is the fastest algorithm for removing duplicates from a list so that all elements are unique while preserving order?
How do you remove duplicates from a list in Python whilst preserving order?
Removing duplicates from list of lists in Python
How do you remove duplicates from a list in Python?
Assuming order matters:
Create an empty set S and an empty list M.
Scan the list L one element at a time.
If the element is in the set S, skip it.
Otherwise, add it to M and to S.
Repeat for all elements in L.
Return M.
In Python:
>>> L = [2, 1, 4, 3, 5, 1, 2, 1, 1, 6, 5]
>>> S = set()
>>> M = []
>>> for e in L:
...     if e in S:
...         continue
...     S.add(e)
...     M.append(e)
...
>>> M
[2, 1, 4, 3, 5, 6]
If order does not matter:
M = list(set(L))
Special Case: Hashing and Equality
Firstly, we need to determine something about the assumptions, namely the existence of an equals and hash function relationship. What do I mean by this? I mean that for the set of source objects S, given any two objects x1 and x2 that are elements of S there exists a (hash) function F such that:
if (x1.equals(x2)) then F(x1) == F(x2)
Java has such a relationship. That allows you to check for duplicates as a near O(1) operation and thus reduces the algorithm to a simple O(n) problem. If order is unimportant, it's a simple one-liner:
List result = new ArrayList(new HashSet(inputList));
If order is important:
List outputList = new ArrayList();
Set set = new HashSet();
for (Object item : inputList) {
if (!set.contains(item)) {
outputList.add(item);
set.add(item);
}
}
You will note that I said "near O(1)". That's because such data structures (as a Java HashMap or HashSet) rely on a method where a portion of the hash code is used to find an element (often called a bucket) in the backing storage. The number of buckets is a power-of-2. That way the index into that list is easy to calculate. hashCode() returns an int. If you have 16 buckets you can find which one to use by ANDing the hashCode with 15, giving you a number from 0 to 15.
When you try to put something in a bucket, it may already be occupied. If so, a linear comparison of all entries in that bucket occurs. If the collision rate gets too high, or you try to put too many elements in, the structure is grown, typically doubled (but always to a power of 2), and all the items are placed in their new buckets (based on the new mask). Thus resizing such structures is relatively expensive.
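The masking step can be sketched in a couple of lines. This is a simplified illustration with made-up names (BucketIndexDemo, bucketIndex), not the actual HashMap source, which may also mix the hash bits before masking.

```java
final class BucketIndexDemo {
    // bucketCount is assumed to be a power of two, so (bucketCount - 1) is an all-ones mask.
    static int bucketIndex(int hash, int bucketCount) {
        return hash & (bucketCount - 1);
    }

    public static void main(String[] args) {
        // With 16 buckets, hashes 21 and 37 share a bucket.
        System.out.println(bucketIndex(21, 16) + " " + bucketIndex(37, 16));
        // After doubling to 32 buckets, the new mask separates them.
        System.out.println(bucketIndex(21, 32) + " " + bucketIndex(37, 32));
    }
}
```

Because the mask keeps only the low bits, negative hash codes also land in a valid bucket without any extra handling.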
Lookup may also be expensive. Consider this class:
public class A {
private final int a;
A(int a) { this.a = a; }
public boolean equals(Object ob) {
if (ob.getClass() != getClass()) return false;
A other = (A)ob;
return other.a == a;
}
public int hashCode() { return 7; }
}
This code is perfectly legal and it fulfills the equals-hashCode contract.
Assuming your set contains nothing but A instances, your insertion/search now turns into an O(n) operation, turning the entire insertion into O(n^2).
Obviously this is an extreme example but it's useful to point out that such mechanisms also rely on a relatively good distribution of hashes within the value space the map or set uses.
Finally, it must be said that this is a special case. If you're using a language without this kind of "hashing shortcut" then it's a different story.
General Case: No Ordering
If no ordering function exists for the list then you're stuck with an O(n^2) brute-force comparison of every object to every other object. So in Java:
List result = new ArrayList();
for (Object item : inputList) {
boolean duplicate = false;
for (Object ob : result) {
if (ob.equals(item)) {
duplicate = true;
break;
}
}
if (!duplicate) {
result.add(item);
}
}
General Case: Ordering
If an ordering function exists (as it does with, say, a list of integers or strings) then you sort the list (which is O(n log n)) and then compare each element in the list to the next (O(n)) so the total algorithm is O(n log n). In Java:
Collections.sort(inputList);
List result = new ArrayList();
Object prev = null;
for (Object item : inputList) {
if (!item.equals(prev)) {
result.add(item);
}
prev = item;
}
Note: the above examples assume no nulls are in the list.
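A generically typed variant of the sort-then-compare approach might look like the sketch below. The names are hypothetical, it sorts a copy rather than the caller's list, and like the examples above it assumes the list contains no nulls. The result comes out in sorted order, not in the original order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class SortedDedupDemo {
    // O(n log n): sort a copy, then keep each element that differs from its predecessor.
    static <T extends Comparable<? super T>> List<T> sortedDistinct(List<T> input) {
        List<T> sorted = new ArrayList<>(input);
        Collections.sort(sorted);
        List<T> result = new ArrayList<>();
        T prev = null;
        for (T item : sorted) {
            if (prev == null || item.compareTo(prev) != 0) {
                result.add(item);
            }
            prev = item;
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(sortedDistinct(Arrays.asList(3, 1, 2, 3, 1)));
    }
}
```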
If the order does not matter, you might want to try this algorithm written in Python:
>>> array = [1, 2, 2, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6, 6]
>>> unique = set(array)
>>> list(unique)
[1, 2, 3, 4, 5, 6]
in haskell this would be covered by the nub and nubBy functions
nub :: Eq a => [a] -> [a]
nub [] = []
nub (x:xs) = x : nub (filter (/= x) xs)
nubBy :: (a -> a -> Bool) -> [a] -> [a]
nubBy f [] = []
nubBy f (x:xs) = x : nubBy f (filter (not.f x) xs)
nubBy relaxes the dependence on the Eq typeclass, instead allowing you to define your own equality function to filter duplicates.
These functions work over a list of consistent arbitrary types (e.g. [1,2,"three"] is not allowed in haskell), and they are both order preserving.
In order to make this more efficient, using Data.Map (or implementing a balanced tree) could be used to gather the data into a set (key being the element, and value being the index into the original list in order to be able to get the original ordering back), then gathering the results back into a list and sorting by index. I will try and implement this later.
import qualified Data.Map as Map
undup x = go x Map.empty
  where
    go [] _ = []
    go (x:xs) m = case Map.lookup x m of
      Just _  -> go xs m
      Nothing -> x : go xs (Map.insert x True m)
This is a direct translation of @FogleBird's solution. Unfortunately it doesn't work without the import.
a Very basic attempt at replacing Data.Map import would be to implement a tree, something like this
data Tree a = Empty
            | Node a (Tree a) (Tree a)
            deriving (Eq, Show, Read)
insert x Empty = Node x Empty Empty
insert x (Node a left right)
  | x < a = Node a (insert x left) right
  | otherwise = Node a left (insert x right)
lookup x Empty = Nothing --returning maybe type to maintain compatibility with Data.Map
lookup x (Node a left right)
  | x == a = Just x
  | x < a = lookup x left
  | otherwise = lookup x right
an improvement would be to make it autobalancing on insert by maintaining a depth attribute (keeps the tree from degrading into a linked list). This nice thing about this over a hash table is that it only requires your type to be in the typeclass Ord, which is easily derivable for most types.
I take requests it seems. In response to @Jonno_FTW's inquiry here is a solution which completely removes duplicates from the result. It's not entirely dissimilar to the original, simply adding an extra case. However the runtime performance will be much slower since you are going through each sub-list twice, once for the elem, and the second time for the recursion. Also note that now it will not work on infinite lists.
nub [] = []
nub (x:xs) | elem x xs = nub (filter (/=x) xs)
           | otherwise = x : nub xs
Interestingly enough you don't need to filter on the second recursive case because elem has already detected that there are no duplicates.
In Python
>>> L = [2, 1, 4, 3, 5, 1, 2, 1, 1, 6, 5]
>>> a=[]
>>> for i in L:
...     if not i in a:
...         a.append(i)
...
>>> print a
[2, 1, 4, 3, 5, 6]
>>>
In java, it's a one liner.
Set set = new LinkedHashSet(list);
will give you a collection with duplicate items removed.
For Java could go with this:
private static <T> void removeDuplicates(final List<T> list)
{
final LinkedHashSet<T> set;
set = new LinkedHashSet<T>(list);
list.clear();
list.addAll(set);
}
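If you would rather get a new list than modify the original, a stream-based sketch is shown below (Java 8+). The class and method names are made up. It relies on Stream.distinct() keeping the first occurrence of each element for an ordered stream, which a stream from a List is, and like the set-based versions it depends on equals and hashCode.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

final class DistinctDemo {
    // Returns a new list with duplicates dropped, preserving first-occurrence order.
    static <T> List<T> distinctInOrder(List<T> input) {
        return input.stream().distinct().collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(distinctInOrder(Arrays.asList(2, 1, 4, 3, 5, 1, 2, 1, 1, 6, 5)));
    }
}
```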
Delete duplicates in a list inplace in Python
Case: Items in the list are not hashable or comparable
That is we can't use set (dict) or sort.
from itertools import islice
def del_dups2(lst):
    """O(n**2) algorithm, O(1) in memory"""
    pos = 0
    for item in lst:
        if all(item != e for e in islice(lst, pos)):
            # we haven't seen `item` yet
            lst[pos] = item
            pos += 1
    del lst[pos:]
Case: Items are hashable
Solution is taken from here:
def del_dups(seq):
    """O(n) algorithm, O(n) extra memory for the `seen` dict."""
    seen = {}
    pos = 0
    for item in seq:
        if item not in seen:
            seen[item] = True
            seq[pos] = item
            pos += 1
    del seq[pos:]
Case: Items are comparable, but not hashable
That is we can use sort. This solution doesn't preserve original order.
def del_dups3(lst):
    """O(n*log(n)) algorithm, O(1) memory"""
    lst.sort()
    it = iter(lst)
    for prev in it:  # get the first element
        break
    pos = 1  # start from the second element
    for item in it:
        if item != prev:  # we haven't seen `item` yet
            lst[pos] = prev = item
            pos += 1
    del lst[pos:]
go through the list and assign sequential index to each item
sort the list basing on some comparison function for elements
remove duplicates
sort the list basing on assigned indices
for simplicity indices for items may be stored in something like std::map
looks like O(n*log n) if I haven't missed anything
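The steps above can be sketched in Java roughly as follows. This is my own illustration under a few assumptions: elements are Comparable, the list has no nulls, and a plain array of positions stands in for the suggested map of indices. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class IndexedDedupDemo {
    static <T extends Comparable<? super T>> List<T> dedupKeepOrder(List<T> input) {
        int n = input.size();
        // Step 1: assign a sequential index to each item.
        Integer[] idx = new Integer[n];
        for (int i = 0; i < n; i++) {
            idx[i] = i;
        }
        // Step 2: sort positions by element value, breaking ties by original position.
        Arrays.sort(idx, (x, y) -> {
            int c = input.get(x).compareTo(input.get(y));
            return c != 0 ? c : Integer.compare(x, y);
        });
        // Step 3: within each run of equal values, keep only the earliest position.
        List<Integer> kept = new ArrayList<>();
        for (int k = 0; k < n; k++) {
            if (k == 0 || input.get(idx[k]).compareTo(input.get(idx[k - 1])) != 0) {
                kept.add(idx[k]);
            }
        }
        // Step 4: sort the surviving positions to restore the original order.
        Collections.sort(kept);
        List<T> out = new ArrayList<>(kept.size());
        for (int i : kept) {
            out.add(input.get(i));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(dedupKeepOrder(Arrays.asList(2, 1, 4, 3, 5, 1, 2, 1, 1, 6, 5)));
    }
}
```

Both sorts are O(n log n) and the scan is O(n), which matches the estimate above.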
It depends on what you mean by "efficiently". The naive algorithm is O(n^2), and I assume what you actually mean is that you want something of lower order than that.
As Maxim100 says, you can preserve the order by pairing the list with a series of numbers, use any algorithm you like, and then resort the remainder back into their original order. In Haskell it would look like this:
superNub :: (Ord a) => [a] -> [a]
superNub xs = map snd
            . sortBy (comparing fst)
            . map head . groupBy ((==) `on` snd)
            . sortBy (comparing snd)
            . zip [1..] $ xs
Of course you need to import Data.List (sortBy, groupBy), Data.Function (on) and Data.Ord (comparing). I could just recite the definitions of those functions, but what would be the point?
I've written an algorithm for string. Actually it does not matter what type do you have.
static string removeDuplicates(string str)
{
if (String.IsNullOrEmpty(str) || str.Length < 2) {
return str;
}
char[] arr = str.ToCharArray();
int len = arr.Length;
int pos = 1;
for (int i = 1; i < len; ++i) {
int j;
for (j = 0; j < pos; ++j) {
if (arr[i] == arr[j]) {
break;
}
}
if (j == pos) {
arr[pos] = arr[i];
++pos;
}
}
string finalStr = String.Empty;
foreach (char c in arr.Take(pos)) {
finalStr += c.ToString();
}
return finalStr;
}
One line solution in Python.
Using lists-comprehesion:
>>> L = [2, 1, 4, 3, 5, 1, 2, 1, 1, 6, 5]
>>> M = []
>>> zip(*[(e,M.append(e)) for e in L if not e in M])[0]
(2, 1, 4, 3, 5, 6)
Maybe you should look into using associate arrays (aka dict in python) to avoid having duplicate elements in the first place.
My code in Java:
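ArrayList<Integer> list = new ArrayList<Integer>();
list.addAll(Arrays.asList(1,2,1,3,4,5,2,3,4,3));
for (int i=0; i<list.size(); i++)
{
    for (int j=i+1; j<list.size(); j++)
    {
        // compare values with equals(); == would compare Integer references
        if (list.get(i).equals(list.get(j)))
        {
            list.remove(j); // drop the later duplicate, keeping the first occurrence
            j--;
        }
    }
}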
or simply do this:
SetList<Integer> unique = new SetList<Integer>();
unique.addAll(list);
Both ways have Time = nk ~ O(n^2)
where n is the size of input list,
k is number of unique members of the input list
Algorithm delete_duplicates(a[1..n])
// Remove duplicates from the given array
// Input parameters: a[1..n], an array of n elements
{
    temp[1..n]; // an array of n (value, key) records
    for i = 1 to n
        temp[i].value = a[i]
        temp[i].key = i
    // based on 'value', sort the array temp
    // based on 'value', delete duplicate elements from temp
    // based on 'key', sort the array temp
    // construct an array p using temp:
    p[i] = temp[i].value
    return p
}
The order of elements is maintained in the output array using the 'key'. Since there are O(n) keys, the time taken for sorting on the key and on the value is O(n log n). So the time taken to delete all duplicates from the array is O(n log n).
Generic solution close to the accepted answer
k = ['apple', 'orange', 'orange', 'grapes', 'apple', 'apple', 'apple']
m = []
def remove_duplicates(k):
    for i in range(len(k)):
        for j in range(i, len(k)-1):
            if k[i] == k[j+1]:
                m.append(j+1)
    l = list(dict.fromkeys(m))
    l.sort(reverse=True)
    for i in l:
        k.pop(i)
    return k
print(remove_duplicates(k))
