Saving memory and CPU in java loops

This (obvious) code I've written works well, but for testing purposes I need it to handle an array of one million elements in a reasonable time, saving CPU cycles and as much memory as I can.
Any suggestions, please?
Note: the array is sorted in ascending order.
import java.util.Arrays;
class A {
    // Linear scan: checks elements one by one until k is found, O(n) per call.
    static boolean exists(int[] ints, int k) {
        for (int integer : ints) {
            if (integer == k) {
                return true;
            }
        }
        return false;
    }
}

Since your array is in ascending order, one thing you could do (I think) is to use a binary search instead of a linear search.
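As a minimal sketch of that suggestion, the standard library's Arrays.binarySearch can replace the loop. It assumes the array really is sorted in ascending order, as the question states; the class name SortedLookup is only illustrative.

```java
import java.util.Arrays;

class SortedLookup {
    // Assumes ints is sorted in ascending order; on unsorted input
    // the result of Arrays.binarySearch is undefined.
    static boolean exists(int[] ints, int k) {
        // binarySearch returns a non-negative index when k is found
        // and a negative value (encoding the insertion point) otherwise.
        return Arrays.binarySearch(ints, k) >= 0;
    }
}
```

This needs no extra memory beyond the array itself and takes O(log n) comparisons per lookup, which is about 20 steps for one million elements.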

You could use a Set<Integer> that relies on hashing rather than an array where you iterate sequentially.
static boolean exists(Set<Integer> ints, int k) {
return ints.contains(k);
}
You could convert the array to a Set once and pass it to the method as many times as required:
// needs java.util.Arrays, java.util.Set and java.util.stream.Collectors
Set<Integer> set = Arrays.stream(ints).boxed().collect(Collectors.toSet());
boolean isExist = exists(set, 15);
...
isExist = exists(set, 5005);
...
isExist = exists(set, 355);

Memoization of this leetcode problem. How do I memoize this recursive solution

I try all possible swaps and then, at the end, pass the arrays to a check that tells me whether they are strictly increasing.
This is the question, and I have written the recursive approach as follows:
class Solution {
public int minSwap(int[] A, int[] B) {
return helper(A,B,0,0);
}
boolean helper2(int[] A,int[] B){
for(int i=0;i<A.length-1;i++){
if(A[i]>=A[i+1] || B[i]>=B[i+1])
return false;
}
return true;
}
int helper(int[] A,int[] B,int i,int swaps){
if(i==A.length && helper2(A,B)==true)
return swaps;
if(i==A.length)
return 1000;
swap(A,B,i);
int c=helper(A,B,i+1,swaps+1);
swap(A,B,i);
int b=helper(A,B,i+1,swaps);
return Math.min(b,c);
}
private void swap(int[] A, int[] B, int index){
int temp = A[index];
A[index] = B[index];
B[index] = temp;
}
}
Here I have tried all possible swaps, checked each result, and returned the one with the minimum number of swaps. How do I memoize this? Which variables should I use for memoization in this code? Is there a rule of thumb for selecting the variables to memoize on?

Wikipedia says:
In computing, memoization or memoisation is an optimization technique used primarily to speed up computer programs by storing the results of expensive function calls and returning the cached result when the same inputs occur again.
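The array references A and B are the same on every call, so the parameters that vary are i and swaps, and for every combination of the two we need to store the result. (One caveat: helper() swaps elements of A and B in place before recursing, so a result also depends on which earlier indices were swapped, and a key made of only i and swaps does not capture that.)
One way to do this is to use a HashMap whose key holds the two values, e.g.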
class Key {
int i;
int swaps;
// implement methods, especially equals() and hashCode()
}
You can then add the following at the beginning of helper(), though you might want to add it after the two if statements:
Key key = new Key(i, swaps);
Integer cachedResult = cache.get(key);
if (cachedResult != null)
return cachedResult;
Then replace the return statement with:
int result = Math.min(b,c);
cache.put(key, result);
return result;
Whether cache is a field or a parameter being passed along is entirely up to you.
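Because helper() mutates A and B, a key of (i, swaps) may not identify a subproblem uniquely. The sketch below uses a different state than the HashMap key above: the index i plus a flag saying whether index i-1 was swapped, which is all the strictly-increasing check at position i needs. The class name MinSwapMemo and the INF sentinel are hypothetical, and the arrays are assumed to have equal length.

```java
import java.util.Arrays;

class MinSwapMemo {
    private static final int INF = 1_000_000; // "no valid arrangement" sentinel (assumption)
    private int[] a;
    private int[] b;
    private int[][] memo; // memo[i][prevSwapped], -1 means "not computed yet"

    int minSwap(int[] a, int[] b) {
        this.a = a;
        this.b = b;
        memo = new int[a.length][2];
        for (int[] row : memo) {
            Arrays.fill(row, -1);
        }
        return helper(0, 0);
    }

    // Minimum swaps needed for positions i..n-1, given whether
    // position i-1 was swapped (1) or left alone (0).
    private int helper(int i, int prevSwapped) {
        if (i == a.length) {
            return 0;
        }
        if (memo[i][prevSwapped] != -1) {
            return memo[i][prevSwapped];
        }
        int best = INF;
        for (int swapHere = 0; swapHere <= 1; swapHere++) {
            int curA = swapHere == 1 ? b[i] : a[i];
            int curB = swapHere == 1 ? a[i] : b[i];
            boolean ok = true;
            if (i > 0) {
                int prevA = prevSwapped == 1 ? b[i - 1] : a[i - 1];
                int prevB = prevSwapped == 1 ? a[i - 1] : b[i - 1];
                ok = prevA < curA && prevB < curB;
            }
            if (ok) {
                best = Math.min(best, swapHere + helper(i + 1, swapHere));
            }
        }
        memo[i][prevSwapped] = best;
        return best;
    }
}
```

The arrays are never modified here, so each (i, prevSwapped) pair is computed once and the total work is linear in the array length.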

Java Searching through two Arrays

I have two ArrayLists. ArrayList A has 8.1k elements and ArrayList B has 81k elements.
I need to iterate through B, search for each item in A, and then change a field in the matched element of list B.
Here's my code:
private void mapAtoB(List<A> aList, ListIterator<B> it) {
AtomicInteger i = new AtomicInteger(-1);
while(it.hasNext()) {
System.out.print(i.incrementAndGet() + ", ");
B b = it.next();
aList.stream().filter(a -> b.equalsB(a)).forEach(a -> {
b.setId(String.valueOf(a.getRedirectId()));
it.set(b);
});
}
System.out.println();
}
public class B {
public boolean equalsB(A a) {
if (a == null) return false;
if (this.getFullURL().contains(a.getFirstName())) return true;
return false;
}
}
But this is taking forever: the method takes close to 15 minutes to finish. Is there any way to optimize any of this? A 15-minute run time is far too long.

I'll be happy to see a good and thorough solution; meanwhile I can propose two ideas (or maybe two reincarnations of one).
The first is to speed up the search for all objects of type A within one object of type B. For that, the Rabin-Karp algorithm seems applicable and simple enough to implement quickly; Aho-Corasick is harder to implement but will probably give better results, though I'm not sure how much better.
The other option is to limit the number of objects of type B that must be fully processed for each object of type A. For that you could, for example, build an inverted N-gram index: for each fullUrl you take all its substrings of length N ("N-grams") and build a map from each N-gram to the set of B's whose fullUrl contains it. When searching for an object A, you take all of its N-grams, look up the set of B's for each one, and intersect these sets; the intersection contains all the B's that you should fully process.
I implemented this approach quickly. For the sizes you specified it gives a 6-7x speedup for N=4; as N grows, search becomes faster but building the index slows down (so if you can reuse the index you are probably better off choosing a bigger N). The index takes about 200 MB for the sizes you specified, so this approach will only scale so far as the collection of B's grows.
Assuming that all strings are longer than NGRAM_LENGTH, here's the quick and dirty code for building the index using Guava's SetMultimap and HashMultimap:
SetMultimap<String, B> idx = HashMultimap.create();
for (B b : bList) {
for (int i = 0; i < b.getFullURL().length() - NGRAM_LENGTH + 1; i++) {
idx.put(b.getFullURL().substring(i, i + NGRAM_LENGTH), b);
}
}
And for the search:
private void mapAtoB(List<A> aList, SetMultimap<String, B> mmap) {
for (A a : aList) {
Collection<B> possible = null;
for (int i = 0; i < a.getFirstName().length() - NGRAM_LENGTH + 1; i++) {
String ngram = a.getFirstName().substring(i, i + NGRAM_LENGTH);
Set<B> forNgram = mmap.get(ngram);
if (possible == null) {
possible = new ArrayList<>(forNgram);
} else {
possible.retainAll(forNgram);
}
if (possible.size() < 20) { // it's ok to scan through 20
break;
}
}
for (B b : possible) {
if (b.equalsB(a)) {
b.setId(String.valueOf(a.getRedirectId())); // same conversion as in the question's code
}
}
}
}
A possible direction for optimization would be to use hashes instead of full N-grams thus reducing the memory footprint and necessity for N-gram key comparisons.
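A rough sketch of that direction, using only the standard library: the index is keyed by the hash code of each N-gram instead of the N-gram string itself. Plain strings stand in for B.getFullURL() and A.getFirstName(), and the class name HashedNgramIndex and its methods are hypothetical. Because different N-grams can share a hash code, candidates are always re-checked with contains().

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class HashedNgramIndex {
    static final int NGRAM_LENGTH = 4;
    // N-gram hash code -> URLs containing an N-gram with that hash code.
    private final Map<Integer, Set<String>> index = new HashMap<>();

    void add(String url) {
        for (int i = 0; i + NGRAM_LENGTH <= url.length(); i++) {
            int hash = url.substring(i, i + NGRAM_LENGTH).hashCode();
            index.computeIfAbsent(hash, k -> new HashSet<>()).add(url);
        }
    }

    // Returns the indexed URLs that contain needle as a substring.
    List<String> find(String needle) {
        if (needle.length() < NGRAM_LENGTH) {
            throw new IllegalArgumentException("needle shorter than NGRAM_LENGTH");
        }
        Set<String> candidates = null;
        for (int i = 0; i + NGRAM_LENGTH <= needle.length(); i++) {
            int hash = needle.substring(i, i + NGRAM_LENGTH).hashCode();
            Set<String> forNgram = index.getOrDefault(hash, Collections.emptySet());
            if (candidates == null) {
                candidates = new HashSet<>(forNgram);
            } else {
                candidates.retainAll(forNgram);
            }
        }
        // Hash collisions can leave false candidates, so verify each one.
        List<String> matches = new ArrayList<>();
        for (String url : candidates) {
            if (url.contains(needle)) {
                matches.add(url);
            }
        }
        return matches;
    }
}
```

Whether this actually saves memory compared with string keys depends on the data and the JVM, so it would need measuring on the real lists.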

Store and find if a certain array is already stored

My program checks multiple boolean arrays (length 30 each) and I would like to know whether I have already checked a given array. I thought the best way to handle this would be to store all the arrays and search for each new array in that set, but I don't know what structure I should use. At first I thought a hashtable would be best, but it looks like I can't use one with arrays. I looked at Set and List but I have no clue which to use!
Edit/clarification: Hey, it's my first question here and I'm surprised how many answers I received, thanks a lot! A lot of people say they are unsure what exactly I'm looking for, so I'll try to clarify:
I have multiple boolean arrays of length 30 where the order of the elements in the array is important.
I receive one array at a time and I want to check whether I have already received the same array (same elements, same order). I don't need to store them (I don't need any index, and I don't want to know how many arrays I received); I only need to know whether I have already received the array.

A boolean array is basically a list of bits. Since array size is 30, and an int is a 32-bit value, you can convert the array into an int. With a long you could support arrays up to 64 in size.
So, first convert your array to an int:
private static int toBits(boolean[] array) {
if (array.length > 32)
throw new IllegalArgumentException("Array too large: " + array.length);
int bits = 0;
for (int i = 0; i < array.length; i++)
if (array[i])
bits |= 1 << i;
return bits;
}
Then keep track using a Set<Integer>:
private Set<Integer> alreadySeen = new HashSet<>();
private boolean firstTime(boolean[] array) {
    // Set.add() returns true only if the value was not already present.
    return this.alreadySeen.add(toBits(array));
}
This provides a very fast and low-memory implementation that can handle lots of boolean arrays.

You can create a Wrapper class that holds array (content) and a flag. And, instead of storing array of arrays, you can store array of objects of this class. Have a look at the example below:
public class ArrayWrapper {
private boolean checked;
private boolean[] content;
/**
* @return the checked
*/
public boolean isChecked() {
return checked;
}
/**
* @param checked the checked to set
*/
public void setChecked(boolean checked) {
this.checked = checked;
}
/**
* @return the content
*/
public boolean[] getContent() {
return content;
}
/**
* @param content the content to set
*/
public void setContent(boolean[] content) {
this.content = content;
}
}
Now, you can create a List<ArrayWrapper> or ArrayWrapper[], iterate through it and set checked to true once the array (content) is checked.

Use Arrays.equals(array1, array2)
This method returns true if the two specified arrays of booleans are equal to one another. Two arrays are considered equal if both arrays contain the same number of elements, and all corresponding pairs of elements in the two arrays are equal.
I’m giving you a brute force solution.
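List<boolean[]> arrs = new ArrayList<>();
while (true) {
    boolean[] receivedArr = receive();
    boolean seen = false;
    for (boolean[] existingArr : arrs) {
        if (Arrays.equals(existingArr, receivedArr)) {
            seen = true;
            break;
        }
    }
    // Add only after the scan, so the list is not modified while iterating.
    if (seen) {
        drop(receivedArr);
    } else {
        arrs.add(receivedArr);
    }
}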

You can try an adjacency list, or perhaps an array/ArrayList of an object that you call 'Pair', for example. This object has two attributes: the first is an array (the array you have or haven't checked yet) and the second is a boolean value that denotes whether this array has been visited.

You can use an array :)
If you have n arrays, then create a boolean array of size n. Let's call it checked[].
So if checked[5] == true, you already checked the fifth array.
Another option would be to use the index 0 of each array as the 'checked flag'.

Thanks for your clarification!
HashMap is still a good answer, using Arrays.hashCode() to create your key object. Like so:
HashMap<Integer, Boolean> checked = new HashMap<>();
/**
 * Returns true if already checked; false if it's new
 */
public boolean isChecked(boolean[] array) {
    int hashCode = Arrays.hashCode(array);
    Boolean existing = checked.get(hashCode);
    if (existing == null) {
        // First time this hash code is seen: record it and report "new".
        checked.put(hashCode, true);
        return false;
    }
    return true;
}
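One caveat with keying on Arrays.hashCode() alone is that, in general, two different arrays may share a hash code, which would make a new array look already checked. A possible collision-safe variant, sketched here with hypothetical names (SeenArrays, Key), wraps each array in a small key object whose equals() compares contents with Arrays.equals():

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class SeenArrays {
    // Wrapper that gives a boolean[] content-based equals() and hashCode().
    private static final class Key {
        private final boolean[] data;

        Key(boolean[] data) {
            // Defensive copy so later changes by the caller do not affect the set.
            this.data = data.clone();
        }

        @Override
        public boolean equals(Object other) {
            return other instanceof Key && Arrays.equals(data, ((Key) other).data);
        }

        @Override
        public int hashCode() {
            return Arrays.hashCode(data);
        }
    }

    private final Set<Key> seen = new HashSet<>();

    // Returns true if an array with the same contents was passed before.
    boolean alreadySeen(boolean[] array) {
        // add() returns false when an equal key is already in the set.
        return !seen.add(new Key(array));
    }
}
```

This stores a copy of every distinct array, so it uses more memory than the int-based approach, but it works for arrays of any length.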

Iterating through array - java

I was wondering if it was better to have a method for this and pass the Array to that method or to write it out every time I want to check if a number is in the array.
For example:
public static boolean inArray(int[] array, int check) {
for (int i = 0; i < array.length; i++) {
if (array[i] == check)
return true;
}
return false;
}
Thanks for the help in advance!

Since at least Java 1.5.0 (Java 5) the code can be cleaned up a bit. Arrays and anything that implements Iterable (e.g. Collections) can be looped over like this:
public static boolean inArray(int[] array, int check) {
for (int o : array){
if (o == check) {
return true;
}
}
return false;
}
In Java 8 you can also do something like:
// import java.util.stream.IntStream;
public static boolean inArray(int[] array, int check) {
return IntStream.of(array).anyMatch(val -> val == check);
}
Although converting to a stream for this is probably overkill.

You should definitely encapsulate this logic into a method.
There is no benefit to repeating identical code multiple times.
Also, if you place the logic in a method and it changes, you only need to modify your code in one place.
Whether or not you want to use a 3rd party library is an entirely different decision.

If you are using an array (and purely an array), a "contains" lookup is O(N) because, in the worst case, you must iterate over the entire array. If the array is sorted you can use a binary search, which reduces the search time to O(log N), at the cost of sorting first if the array is not already sorted.
If this is something that is invoked repeatedly, place it in a function:
private boolean inArray(int[] array, int value)
{
for (int i = 0; i < array.length; i++)
{
if (array[i] == value)
{
return true;
}
}
return false;
}

You can import the lib org.apache.commons.lang.ArrayUtils
There is a static method where you can pass in an int array and a value to check for.
contains(int[] array, int valueToFind)
Checks if the value is in the given array.
ArrayUtils.contains(intArray, valueToFind);
ArrayUtils API

Using the Java 8 Stream API could simplify your job. For an int[] you need an IntStream (for example via Arrays.stream), because Stream.of(array) would produce a Stream<int[]> with the whole array as its single element.
public static boolean inArray(int[] array, int check) {
return Arrays.stream(array).anyMatch(i -> i == check);
}
The only cost is the overhead of creating a new stream from the array, but in return you get access to the rest of the Stream API. In your case you may not want to create a new method for a one-line operation, unless you wish to use it as a utility.
Hope this helps!

Implementing edit distance method using recursion results in object heap error

private static int editDistance(ArrayList<String> s1, ArrayList<String> s2) {
if (s1.size()==0) {
return s2.size();
}
else if (s2.size()==0) {
return s1.size();
}
else {
String temp1 = s1.remove(s1.size()-1);
String temp2 = s2.remove(s2.size()-1);
if (temp1.equals(temp2)) {
return editDistance((ArrayList<String>)s1.clone(),(ArrayList<String>)s2.clone());
} else {
s1.add(temp1);
int first = editDistance((ArrayList<String>)s1.clone(),(ArrayList<String>)s2.clone())+1;
s2.add(temp2);
s1.remove(s1.size()-1);
int second = editDistance((ArrayList<String>)s1.clone(),(ArrayList<String>)s2.clone())+1;
s2.remove(s2.size()-1);
int third = editDistance((ArrayList<String>)s1.clone(),(ArrayList<String>)s2.clone())+1;
if (first <= second && first <= third ) {
return first;
} else if (second <= first && second <= third) {
return second;
} else {
return third;
}
}
}
}
For example, the input can be ["div","table","tr","td","a"] and ["table","tr","td","a","strong"] and the corresponding output should be 2.
My problem is when either input list has a size too big, e.g., 40 strings in the list, the program will generate a can't reserve enough space for object heap error. The JVM parameters are -Xms512m -Xmx512m. Could my code need so much heap space? Or it is due to logical bugs in my code?
Edit: With or without cloning the list, this recursive approach does not seem to work either way. Could someone please help estimate the total heap memory it requires to work for me? I assume it would be shocking. Anyway, I guess I have to turn to the dynamic programming approach instead.

You clone() each ArrayList instance before each recursive call of your method. That essentially means that you get yet another copy of the whole list for each call, which can easily add up to a very large amount of memory for large recursion depths.
You should consider using List#subList() instead of clone(), or even adding parameters to your method to pass indexes into a single pair of initial List objects.
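The question also mentions switching to dynamic programming. A minimal sketch of that idea, with the same recurrence as the recursive code (match costs 0; insert, delete and replace cost 1), keeps only two rows of the table and never copies the lists. The class name EditDistance is illustrative.

```java
import java.util.List;

class EditDistance {
    static int editDistance(List<String> s1, List<String> s2) {
        int n = s1.size();
        int m = s2.size();
        // prev[j] = distance between the first i-1 items of s1 and the first j items of s2.
        int[] prev = new int[m + 1];
        int[] cur = new int[m + 1];
        for (int j = 0; j <= m; j++) {
            prev[j] = j; // turning an empty prefix into j items takes j insertions
        }
        for (int i = 1; i <= n; i++) {
            cur[0] = i; // turning i items into an empty prefix takes i deletions
            for (int j = 1; j <= m; j++) {
                if (s1.get(i - 1).equals(s2.get(j - 1))) {
                    cur[j] = prev[j - 1];
                } else {
                    cur[j] = 1 + Math.min(prev[j - 1], Math.min(prev[j], cur[j - 1]));
                }
            }
            int[] tmp = prev;
            prev = cur;
            cur = tmp;
        }
        return prev[m];
    }
}
```

For two lists of 40 strings this fills 40 x 40 table cells and needs two int arrays of 41 entries, whereas the plain recursion branches three ways at every mismatch.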
