Optimizing an algorithm in Java

Hi, I have the following method. It finds all the possible paths from the top left to the bottom right of an N x M matrix. What is the best way to optimize it for speed? It is a little slow right now. The resulting paths are stored in a set.
EDIT: I forgot to clarify that you can only move down or right to an adjacent cell from your current position; diagonal moves are not allowed.
For example
ABC
DEF
GHI
A path from the top left to bottom right would be ADEFI
static public void printPaths (String tempString, int i, int j, int m, int n, char [][] arr, HashSet<String> palindrome) {
String newString = tempString + arr[i][j];
if (i == m -1 && j == n-1) {
palindrome.add(newString);
return;
}
//right
if (j+1 < n) {
printPaths (newString, i, j+1, m, n, arr, palindrome);
}
//down
if (i+1 < m) {
printPaths (newString, i+1, j, m, n, arr, palindrome);
}
}
EDIT Here is the entirety of the code
import java.io.*;
import java.util.*;
public class palpath {
public static void main(String[] args) throws IOException {
BufferedReader br = new BufferedReader(new FileReader("palpath.in"));
PrintWriter pw = new PrintWriter(new BufferedWriter(new FileWriter("palpath.out")));
StringTokenizer st = new StringTokenizer(br.readLine());
int d = Integer.parseInt(st.nextToken());
char[][] grid = new char [d][d];
String index = null;
for(int i = 0; i < d; i++)
{
String temp = br.readLine();
index = index + temp;
for(int j = 0; j < d; j++)
{
grid[i][j] = temp.charAt(j);
}
}
br.close();
int counter = 0;
HashSet<String> set = new HashSet<String>();
printPaths ("", 0, 0, grid.length, grid[0].length, grid, set);
Iterator<String> it = set.iterator();
while(it.hasNext()){
String temp = it.next();
StringBuilder sb = new StringBuilder(temp).reverse();
if(temp.equals(sb.toString())) {
counter++;
}
}
pw.println(counter);
pw.close();
}
static public void printPaths (String tempString, int i, int j, int m, int n, char [][] arr, HashSet<String> palindrome) {
String newString = tempString + arr[i][j];
if (i == m -1 && j == n-1) {
palindrome.add(newString);
return;
}
//right
if (j+1 < n) {
printPaths (newString, i, j+1, m, n, arr, palindrome);
}
//down
if (i+1 < m) {
printPaths (newString, i+1, j, m, n, arr, palindrome);
}
}
}

Given a grid with M rows and N columns, all paths from (0,0) to (M-1, N-1) that only involve rightward and downward moves are guaranteed to contain exactly N-1 moves rightward and M-1 moves downward.
This presents us with an interesting property: we can represent a path from (0,0) to (M-1, N-1) as a binary string (0 indicating a rightward move and 1 indicating a downward move).
So, the question becomes: how fast can we print out a list of permutations of that bit string?
Pretty fast.
public static void printPaths(char[][] arr) {
/* Get Smallest Bitstring (e.g. 0000...111) */
long current = 0;
for (int i = 0; i < arr.length - 1; i++) {
current <<= 1;
current |= 1;
}
/* Get Largest Bitstring (e.g. 111...0000) */
long last = current;
for (int i = 0; i < arr[0].length - 1; i++) {
last <<= 1;
}
while (current <= last) {
/* Print Path */
int x = 0, y = 0;
long tmp = current;
StringBuilder sb = new StringBuilder(arr.length + arr[0].length);
while (x < arr.length && y < arr[0].length) {
sb.append(arr[x][y]);
if ((tmp & 1) == 1) {
x++;
} else {
y++;
}
tmp >>= 1;
}
System.out.println(sb.toString());
/* Get Next Permutation */
tmp = (current | (current - 1)) + 1;
current = tmp | ((((tmp & -tmp) / (current & -current)) >> 1) - 1);
}
}

You spend a lot of time on string memory management.
Java strings are immutable, so every tempString + arr[i][j] allocates a new string. Use a single char array of length n+m-1 instead, set its (i+j)th element at every step, and convert it to a String only when a path is complete.
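A minimal sketch of that idea, assuming a rectangular, non-empty grid. The class name PathBuffer and the use of a List instead of the asker's HashSet are my own choices for illustration, not part of the original code:

```java
import java.util.ArrayList;
import java.util.List;

class PathBuffer {
    // Collects every right/down path through the grid, reusing one char[] buffer.
    static List<String> allPaths(char[][] grid) {
        int rows = grid.length, cols = grid[0].length;
        char[] buf = new char[rows + cols - 1]; // every path visits rows + cols - 1 cells
        List<String> out = new ArrayList<>();
        walk(grid, 0, 0, buf, out);
        return out;
    }

    private static void walk(char[][] grid, int i, int j, char[] buf, List<String> out) {
        buf[i + j] = grid[i][j]; // cell (i, j) is always at position i + j of the path
        if (i == grid.length - 1 && j == grid[0].length - 1) {
            out.add(new String(buf)); // the only string allocation per complete path
            return;
        }
        if (j + 1 < grid[0].length) walk(grid, i, j + 1, buf, out); // right
        if (i + 1 < grid.length) walk(grid, i + 1, j, buf, out);    // down
    }
}
```

Because a cell at (i, j) always lands at index i + j, later branches simply overwrite the tail of the buffer, so no cleanup is needed when the recursion backtracks.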

For an N×M array, every path visits N+M-1 cells (N+M-2 steps), so the first optimization step is to get rid of the recursion: allocate one array of that size and replace the recursion with a while loop over an explicit stack.
Each partial path can be extended with one or two steps: right or down. So you can easily make an explicit stack holding the positions visited and the step taken at each position. Push the position (0,0) onto the stack with phase (step taken) 'none', then:
while stack not empty {
if stack is full /* reached lower-right corner, path complete */ {
print the path;
pop;
}
else if stack.top.phase == none {
stack.top.phase = right;
try push right-neighbor with phase none;
}
else if stack.top.phase == right {
stack.top.phase = down;
try push down-neighbor with phase none;
}
else /* stack.top.phase == down */ {
pop;
}
}
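The pseudocode above could be turned into Java roughly as follows. This is only a sketch: the class name StackPaths, the parallel arrays used as the stack, and the integer phase codes are my own assumptions, and it collects paths into a List rather than printing them.

```java
import java.util.ArrayList;
import java.util.List;

class StackPaths {
    static List<String> allPaths(char[][] grid) {
        int rows = grid.length, cols = grid[0].length;
        int full = rows + cols - 1;   // number of cells on a complete path
        int[] r = new int[full];      // row of each stacked position
        int[] c = new int[full];      // column of each stacked position
        int[] phase = new int[full];  // 0 = none, 1 = right tried, 2 = down tried
        char[] buf = new char[full];  // characters along the current partial path
        List<String> out = new ArrayList<>();

        int top = 0;                  // index of the top entry; -1 means the stack is empty
        buf[0] = grid[0][0];          // entry 0 is position (0,0) with phase none
        while (top >= 0) {
            if (top == full - 1) {    // stack full: lower-right corner reached
                out.add(new String(buf));
                top--;
            } else if (phase[top] == 0) {
                phase[top] = 1;
                if (c[top] + 1 < cols) top = push(grid, r, c, phase, buf, top, r[top], c[top] + 1);
            } else if (phase[top] == 1) {
                phase[top] = 2;
                if (r[top] + 1 < rows) top = push(grid, r, c, phase, buf, top, r[top] + 1, c[top]);
            } else {
                top--;                // both directions tried: backtrack
            }
        }
        return out;
    }

    private static int push(char[][] grid, int[] r, int[] c, int[] phase, char[] buf,
                            int top, int nr, int nc) {
        top++;
        r[top] = nr;
        c[top] = nc;
        phase[top] = 0;
        buf[top] = grid[nr][nc];
        return top;
    }
}
```

A stack holding rows + cols - 1 positions can only end at the lower-right corner, which is why the "stack is full" test doubles as the "path complete" test.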

If you make a few observations about your requirements you can optimise this drastically.
There will be exactly (r-1)+(c-1) steps (where r = rows and c = columns).
There will be exactly (c-1) steps to the right and (r-1) steps down.
You can therefore use numbers in which a 1 bit (arbitrarily) indicates a down step and a 0 bit a step across. We can then merely iterate over all numbers of (r-1)+(c-1) bits that have exactly (r-1) bits set. There's a good algorithm for that on the Stanford Bit Twiddling Hacks site: "Compute the lexicographically next bit permutation".
First a BitPatternIterator I have used before. You could pull out the code in hasNext if you wish.
/**
* Iterates all bit patterns containing the specified number of bits.
*
* See "Compute the lexicographically next bit permutation" http://graphics.stanford.edu/~seander/bithacks.html#NextBitPermutation
*
* @author OldCurmudgeon
*/
public static class BitPattern implements Iterable<BigInteger> {
// Useful stuff.
private static final BigInteger ONE = BigInteger.ONE;
private static final BigInteger TWO = ONE.add(ONE);
// How many bits to work with.
private final int bits;
// Value to stop at. 2^max_bits.
private final BigInteger stop;
// All patterns of that many bits up to the specified number of bits.
public BitPattern(int bits, int max) {
this.bits = bits;
this.stop = TWO.pow(max);
}
@Override
public Iterator<BigInteger> iterator() {
return new BitPatternIterator();
}
/*
* From the link:
*
* Suppose we have a pattern of N bits set to 1 in an integer and
* we want the next permutation of N 1 bits in a lexicographical sense.
*
* For example, if N is 3 and the bit pattern is 00010011, the next patterns would be
* 00010101, 00010110, 00011001,
* 00011010, 00011100, 00100011,
* and so forth.
*
* The following is a fast way to compute the next permutation.
*/
private class BitPatternIterator implements Iterator<BigInteger> {
// Next to deliver - initially 2^n - 1 - i.e. first n bits set to 1.
BigInteger next = TWO.pow(bits).subtract(ONE);
// The last one we delivered.
BigInteger last;
@Override
public boolean hasNext() {
if (next == null) {
// Next one!
// t gets v's least significant 0 bits set to 1
// unsigned int t = v | (v - 1);
BigInteger t = last.or(last.subtract(BigInteger.ONE));
// Silly optimisation.
BigInteger notT = t.not();
// Next set to 1 the most significant bit to change,
// set to 0 the least significant ones, and add the necessary 1 bits.
// w = (t + 1) | (((~t & -~t) - 1) >> (__builtin_ctz(v) + 1));
// The __builtin_ctz(v) GNU C compiler intrinsic for x86 CPUs returns the number of trailing zeros.
next = t.add(ONE).or(notT.and(notT.negate()).subtract(ONE).shiftRight(last.getLowestSetBit() + 1));
if (next.compareTo(stop) >= 0) {
// Dont go there.
next = null;
}
}
return next != null;
}
@Override
public BigInteger next() {
last = hasNext() ? next : null;
next = null;
return last;
}
@Override
public void remove() {
throw new UnsupportedOperationException("Not supported.");
}
@Override
public String toString() {
return next != null ? next.toString(2) : last != null ? last.toString(2) : "";
}
}
}
Using that to iterate your solution:
public void allRoutes(char[][] grid) {
int rows = grid.length;
int cols = grid[0].length;
BitPattern p = new BitPattern(rows - 1, cols + rows - 2);
for (BigInteger b : p) {
//System.out.println(b.toString(2));
/**
* Walk all bits, taking a step down if the bit is set and a step right if it is clear.
*/
int x = 0;
int y = 0;
StringBuilder s = new StringBuilder(rows + cols);
for (int i = 0; i < rows + cols - 2; i++) {
s.append(grid[y][x]);
if (b.testBit(i)) {
y += 1;
} else {
x += 1;
}
}
s.append(grid[y][x]);
// That's a solution.
System.out.println("\t" + s);
}
}
public void test() {
char[][] grid = {{'A', 'B', 'C'}, {'D', 'E', 'F'}, {'G', 'H', 'I'}};
allRoutes(grid);
char[][] grid2 = {{'A', 'B', 'C'}, {'D', 'E', 'F'}, {'G', 'H', 'I'}, {'J', 'K', 'L'}};
allRoutes(grid2);
}
printing
ADGHI
ADEHI
ABEHI
ADEFI
ABEFI
ABCFI
ADGJKL
ADGHKL
ADEHKL
ABEHKL
ADGHIL
ADEHIL
ABEHIL
ADEFIL
ABEFIL
ABCFIL
which - to my mind - looks right.

Related

How do I count the maximum number of two characters in a array?

I have a character array consisting of the elements b and r arranged as {'b','b','r','r','b','r'};
What I want to find is the maximum number of characters in a run of one character immediately followed by a run of the other, without interruption.
Example:
ar = {'b','b','r','r','b','r'};
The output should be 4, because bb and rr each contain two characters and neither run is interrupted by the other character.
This is what I came up with :
int i =0;
int max=0;
while(i<ar.length){
char c = ar[i];
int count = 0;
while(i<ar.length&&ar[i] ==c){i++;count++;}
if(i==ar.length)break;
char n_c = ar[i];
while(i<ar.length && ar[i]==n_c){i++;count++;}
if(i==ar.length) break;
if(count>max) max=count;
}
If you want to find the maximum subarray length that contains only contiguous runs of r and b, here is a solution. The basic idea is to use two cursors and a greedy search.
public static int findMaximum(char[] input) {
int result = 0;
int first = 0;
int second = 0;
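// the bounds check keeps an array of identical characters from running past the end
while (second < input.length && input[first] == input[second]) {
second++; // the second index should start from another character
}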
while (second < input.length) {
int preSecond = second; // copy second, in need reset first to it
while (second + 1 < input.length && input[second] == input[second + 1]) {
second++; // increment second
}
result = Math.max(result, second - first + 1);
if (second < input.length - 1) {
first = preSecond;
}
second++;
}
return result;
}
some test cases:
public static void main(String[] args) {
System.out.println(findMaximum(new char[]{'b','b','r'})); //3
System.out.println(findMaximum(new char[]{'b','b','r','r'})); //4
System.out.println(findMaximum(new char[]{'b','b','r','r','r','b','r'})); //5
System.out.println(findMaximum(new char[]{'b','b','b','r','r','b','r'})); //5
System.out.println(findMaximum(new char[]{'b','b','r','r','b','r','r','r','r','r'})); //6
}

How can I build this tree with O(n) space complexity?

The Problem
Given a set of integers, find a subset of those integers which sum to 100,000,000.
Solution
I am attempting to build a tree containing all the combinations of the given set along with the sum. For example, if the given set looked like 0,1,2, I would build the following tree, checking the sum at each node:
{}
{} {0}
{} {1} {0} {0,1}
{} {2} {1} {1,2} {0} {2} {0,1} {0,1,2}
Since I keep both the array of integers at each node and the sum, I should only need the bottom (current) level of the tree in memory.
Issues
My current implementation will maintain the entire tree in memory and therefore uses way too much heap space.
How can I change my current implementation so that the GC will take care of my upper tree levels?
(At the moment I am just throwing a RuntimeException when I have found the target sum but this is obviously just for playing around)
public class RecursiveSolver {
static final int target = 100000000;
static final int[] set = new int[]{98374328, 234234123, 2341234, 123412344, etc...};
Tree initTree() {
return nextLevel(new Tree(null), 0);
}
Tree nextLevel(Tree currentLocation, int current) {
if (current == set.length) { return null; }
else if (currentLocation.sum == target) throw new RuntimeException(currentLocation.getText());
else {
currentLocation.left = nextLevel(currentLocation.copy(), current + 1);
Tree right = currentLocation.copy();
right.value = add(currentLocation.value, set[current]);
right.sum = currentLocation.sum + set[current];
currentLocation.right = nextLevel(right, current + 1);
return currentLocation;
}
}
int[] add(int[] array, int digit) {
if (array == null) {
return new int[]{digit};
}
int[] newValue = new int[array.length + 1];
for (int i = 0; i < array.length; i++) {
newValue[i] = array[i];
}
newValue[array.length] = digit;
return newValue;
}
public static void main(String[] args) {
RecursiveSolver rs = new RecursiveSolver();
Tree subsetTree = rs.initTree();
}
}
class Tree {
Tree left;
Tree right;
int[] value;
int sum;
Tree(int[] value) {
left = null;
right = null;
sum = 0;
this.value = value;
if (value != null) {
for (int i = 0; i < value.length; i++) sum += value[i];
}
}
Tree copy() {
return new Tree(this.value);
}
}
The time and space you need for building the tree here is absolutely nothing at all.
The reason is that, if you're given
A node of the tree
The depth of the node
The ordered array of input elements
you can simply compute its parent, left, and right children nodes using O(1) operations. And you have access to each of those things while you're traversing the tree, so you don't need anything else.
The problem is NP-complete.
If you really want to improve performance, then you have to forget about your tree implementation. You either have to just generate all the subsets and sum them up or to use dynamic programming.
The choice depends on the number of elements and on the target sum. Here the target is known (100,000,000), and the brute-force exponential algorithm runs in O(2^n * n) time, so it makes sense for n below about 22.
In Python you can achieve this with a simple:
from itertools import chain, combinations
def powerset(iterable):
    "powerset([1,2,3]) --> () (1,) (2,) (3,) (1,2) (1,3) (2,3) (1,2,3)"
    s = list(iterable)
    return chain.from_iterable(combinations(s, r) for r in range(len(s)+1))
You can significantly improve this complexity (sacrificing the memory) by using meet in the middle technique (read the wiki article). This will decrease it to O(2^(n/2)), which means that it will perform better than DP solution for n <~ 53
After thinking more about erip's comments, I realized he is correct - I shouldn't be using a tree to implement this algorithm.
Brute force usually is O(n*2^n) because there are n additions for 2^n subsets. Because I only do one addition per node, the solution I came up with is O(2^n) where n is the size of the given set. Also, this algorithm is only O(n) space complexity. Since the number of elements in the original set in my particular problem is small (around 25) O(2^n) complexity is not too much of a problem.
The dynamic solution to this problem is O(t*n) where t is the target sum and n is the number of elements. Because t is very large in my problem, the dynamic solution ends up with a very long runtime and a high memory usage.
This completes my particular solution in around 311 ms on my machine, which is a tremendous improvement over the dynamic programming solutions I have seen for this particular class of problem.
public class TailRecursiveSolver {
public static void main(String[] args) {
final long starttime = System.currentTimeMillis();
try {
step(new Subset(null, 0), 0);
}
catch (RuntimeException ex) {
System.out.println(ex.getMessage());
final long endtime = System.currentTimeMillis();
System.out.println(endtime - starttime);
}
}
static final int target = 100000000;
static final int[] set = new int[]{ . . . };
static void step(Subset current, int counter) {
if (current.sum == target) throw new RuntimeException(current.getText());
else if (counter == set.length) {}
else {
step(new Subset(add(current.subset, set[counter]), current.sum + set[counter]), counter + 1);
step(current, counter + 1);
}
}
static int[] add(int[] array, int digit) {
if (array == null) {
return new int[]{digit};
}
int[] newValue = new int[array.length + 1];
for (int i = 0; i < array.length; i++) {
newValue[i] = array[i];
}
newValue[array.length] = digit;
return newValue;
}
}
class Subset {
int[] subset;
int sum;
Subset(int[] subset, int sum) {
this.subset = subset;
this.sum = sum;
}
public String getText() {
String ret = "";
for (int i = 0; i < (subset == null ? 0 : subset.length); i++) {
ret += " + " + subset[i];
}
if (ret.startsWith(" ")) {
ret = ret.substring(3);
ret = ret + " = " + sum;
} else ret = "null";
return ret;
}
}
EDIT -
The above code still runs in O(n*2^n) time - since the add method runs in O(n) time. This following code will run in true O(2^n) time, and is MUCH more performant, completing in around 20 ms on my machine.
It is limited to sets less than 64 elements due to storing the current subset as the bits in a long.
public class SubsetSumSolver {
static boolean found = false;
static final int target = 100000000;
static final int[] set = new int[]{ . . . };
public static void main(String[] args) {
step(0,0,0);
}
static void step(long subset, int sum, int counter) {
if (sum == target) {
found = true;
System.out.println(getText(subset, sum));
}
else if (!found && counter != set.length) {
step(subset + (1L << counter), sum + set[counter], counter + 1); // 1L: an int shift would wrap around for counter >= 32
step(subset, sum, counter + 1);
}
}
static String getText(long subset, int sum) {
String ret = "";
for (int i = 0; i < 64; i++) if((1 & (subset >> i)) == 1) ret += " + " + set[i];
if (ret.startsWith(" ")) ret = ret.substring(3) + " = " + sum;
else ret = "null";
return ret;
}
}
EDIT 2 -
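Here is another version that uses a meet-in-the-middle attack, along with a little bit shifting, to reduce the work of enumerating subsets from O(2^n) to O(2^(n/2)) per half. Note that the nested loop in main still compares every pair of half-sums, which is 2^(n/2) * 2^(n/2) = 2^n comparisons in the worst case; sorting or hashing one of the two lists would be needed to bring the combining step down as well.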
If you want to use this for sets with between 32 and 64 elements, you should change the int which represents the current subset in the step function to a long although performance will obviously drastically decrease as the set size increases. If you want to use this for a set with odd number of elements, you should add a 0 to the set to make it even numbered.
import java.util.ArrayList;
import java.util.List;
public class SubsetSumMiddleAttack {
static final int target = 100000000;
static final int[] set = new int[]{ ... };
static List<Subset> evens = new ArrayList<>();
static List<Subset> odds = new ArrayList<>();
static int[][] split(int[] superSet) {
int[][] ret = new int[2][superSet.length / 2];
for (int i = 0; i < superSet.length; i++) ret[i % 2][i / 2] = superSet[i];
return ret;
}
static void step(int[] superSet, List<Subset> accumulator, int subset, int sum, int counter) {
accumulator.add(new Subset(subset, sum));
if (counter != superSet.length) {
step(superSet, accumulator, subset + (1 << counter), sum + superSet[counter], counter + 1);
step(superSet, accumulator, subset, sum, counter + 1);
}
}
static void printSubset(Subset e, Subset o) {
String ret = "";
for (int i = 0; i < 32; i++) {
if (i % 2 == 0) {
if ((1 & (e.subset >> (i / 2))) == 1) ret += " + " + set[i];
}
else {
if ((1 & (o.subset >> (i / 2))) == 1) ret += " + " + set[i];
}
}
if (ret.startsWith(" ")) ret = ret.substring(3) + " = " + (e.sum + o.sum);
System.out.println(ret);
}
public static void main(String[] args) {
int[][] superSets = split(set);
step(superSets[0], evens, 0,0,0);
step(superSets[1], odds, 0,0,0);
for (Subset e : evens) {
for (Subset o : odds) {
if (e.sum + o.sum == target) printSubset(e, o);
}
}
}
}
class Subset {
int subset;
int sum;
Subset(int subset, int sum) {
this.subset = subset;
this.sum = sum;
}
}
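As a sketch of how the combining step could avoid comparing every pair, the sums of one half can be stored in a hash map and looked up while walking the other half. This is not the code above: the class name MeetInTheMiddle, the split into a first and second half (rather than even and odd indices), and the use of long sums are all my own assumptions. It assumes each half has fewer than 31 elements so that a half's subset fits in an int mask.

```java
import java.util.HashMap;
import java.util.Map;

class MeetInTheMiddle {
    // sums[mask] is the sum of the elements a[from + i] for every bit i set in mask.
    static long[] subsetSums(int[] a, int from, int to) {
        long[] sums = new long[1 << (to - from)];
        for (int mask = 1; mask < sums.length; mask++) {
            int low = Integer.numberOfTrailingZeros(mask);
            // Reuse the sum of the same subset without its lowest element.
            sums[mask] = sums[mask & (mask - 1)] + a[from + low];
        }
        return sums;
    }

    // Returns a bitmask over set whose chosen elements sum to target, or -1 if there is none.
    static long solve(int[] set, long target) {
        int mid = set.length / 2;
        long[] left = subsetSums(set, 0, mid);
        long[] right = subsetSums(set, mid, set.length);

        Map<Long, Integer> leftBySum = new HashMap<>();
        for (int m = 0; m < left.length; m++) leftBySum.putIfAbsent(left[m], m);

        for (int m = 0; m < right.length; m++) {
            Integer l = leftBySum.get(target - right[m]); // the left half must supply the rest
            if (l != null) return ((long) m << mid) | l;
        }
        return -1;
    }
}
```

With hashing, the expected work is proportional to the 2^(n/2) entries in each half, at the cost of keeping one half's sums in memory. Bit i of the returned mask refers to set[i].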

About the CharMatcher.WHITESPACE implementation

While looking through the implementation of CharMatcher I noticed the field WHITESPACE_MULTIPLIER = 1682554634. I changed this value to 1582554634 and ran the test case CharMatcherTest#testWhitespaceBreakingWhitespaceSubset, which of course failed.
I then changed testWhitespaceBreakingWhitespaceSubset to only invoke WHITESPACE.apply((char) c) without asserting, and printed the index computed in WHITESPACE.matches:
int index = (WHITESPACE_MULTIPLIER * c) >>> WHITESPACE_SHIFT;
This showed that the indexes collide once WHITESPACE_MULTIPLIER is changed from 1682554634 to 1582554634.
No doubt 1682554634 was chosen carefully. My question is: how can I derive this "magic number"?
Following Martin Grajcar's proposal, I wrote the "magic number generator" below, and it worked:
char[] charsReq = WHITESPACE_TABLE.toCharArray();
Arrays.sort(charsReq);
OUTER:
for (int WHITESPACE_MULTIPLIER_WANTTED = 1682553701; WHITESPACE_MULTIPLIER_WANTTED <= 1682554834; WHITESPACE_MULTIPLIER_WANTTED++) {
int matchCnt = 0;
for (int c = 0; c <= Character.MAX_VALUE; c++) {
int position = Arrays.binarySearch(charsReq, (char) c);
char index = WHITESPACE_TABLE.charAt((WHITESPACE_MULTIPLIER_WANTTED * c) >>> WHITESPACE_SHIFT);
if (position >= 0 && index == c) {
matchCnt++;
} else if (position < 0 && index != c) {
matchCnt++;
} else {
continue OUTER;
}
}
// all valid
if ((matchCnt - 1) == (int) (Character.MAX_VALUE)) {
System.out.println(WHITESPACE_MULTIPLIER_WANTTED);
}
}
If I change the order of the characters in WHITESPACE_TABLE (swapping \u2001 and \u2002), the algorithm finds no solution (even with the loop's end condition changed to Integer.MAX_VALUE).
The IntMath.gcd implementation refers to http://en.wikipedia.org/wiki/Binary_GCD_algorithm,
so my question is: where can I find similar reference material for the CharMatcher.WHITESPACE.matches implementation?
I'm not sure if the generator still exists somewhere, but it can be recreated easily. The class Result contains the data used in the implementation of CharMatcher.WHITESPACE:
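static class Result {
private final int shift;
private final int multiplier;
private final String table;
// Constructor used by generate() below.
Result(int shift, int multiplier, String table) {
this.shift = shift;
this.multiplier = multiplier;
this.table = table;
}
}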
// No duplicates allowed.
private final String allMatchingString = "\u2002\r\u0085\u200A\u2005\u2000"
+ "\u2029\u000B\u2008\u2003\u205F\u1680"
+ "\u0009\u0020\u2006\u2001\u202F\u00A0\u000C\u2009"
+ "\u2004\u2028\n\u2007\u3000";
public Result generate(String allMatchingString) {
final char[] allMatching = allMatchingString.toCharArray();
final char filler = allMatching[allMatching.length - 1];
final int shift = Integer.numberOfLeadingZeros(allMatching.length);
final char[] table = new char[1 << (32 - shift)];
OUTER: for (int i=0; i>=0; ++i) {
final int multiplier = 123456789 * i; // Jumping a bit makes the search faster.
Arrays.fill(table, filler);
for (final char c : allMatching) {
final int index = (multiplier * c) >>> shift;
if (table[index] != filler) continue OUTER; // Conflict found.
table[index] = c;
}
return new Result(shift, multiplier, new String(table));
}
return null; // No solution exists.
}
It generates a different multiplier, but this doesn't matter.
In case no solution for a given allMatchingString exists, you can decrement shift and try again.
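To make the connection between the generated values and the lookup concrete, here is a small self-contained sketch. The class TinyCharSet and its methods of and matches are hypothetical names of mine, not Guava's API; the search loop follows the generator above, and the lookup uses the expression quoted in the question. It assumes a non-empty string with no duplicate characters.

```java
import java.util.Arrays;

class TinyCharSet {
    private final int shift;
    private final int multiplier;
    private final char[] table;

    private TinyCharSet(int shift, int multiplier, char[] table) {
        this.shift = shift;
        this.multiplier = multiplier;
        this.table = table;
    }

    // Searches for a multiplier that sends every member to its own table slot.
    static TinyCharSet of(String members) {
        char[] chars = members.toCharArray();
        char filler = chars[chars.length - 1]; // unused slots hold a member, never a non-member
        int shift = Integer.numberOfLeadingZeros(chars.length);
        char[] table = new char[1 << (32 - shift)];
        search:
        for (int i = 1; i > 0; i++) {
            int multiplier = 123456789 * i; // same stride as the generator above
            Arrays.fill(table, filler);
            for (char c : chars) {
                int index = (multiplier * c) >>> shift;
                if (table[index] != filler) continue search; // slot taken: try the next multiplier
                table[index] = c;
            }
            return new TinyCharSet(shift, multiplier, table);
        }
        throw new IllegalStateException("no multiplier found for this table size");
    }

    // One multiply, one shift, one comparison: the slot either holds c or it does not.
    boolean matches(char c) {
        return table[(multiplier * c) >>> shift] == c;
    }
}
```

For example, TinyCharSet.of(" \t\n\r") builds an 8-slot table, after which matches(' ') is true and matches('x') is false. The filler has to be a member of the set, and has to be placed last, so that an empty slot can never equal a non-member and a collision with the filler's own slot is still detected.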

Find all substrings that are palindromes

If the input is 'abba' then the possible palindromes are a, b, b, a, bb, abba.
I understand that determining if string is palindrome is easy. It would be like:
public static boolean isPalindrome(String str) {
int len = str.length();
for(int i=0; i<len/2; i++) {
if(str.charAt(i)!=str.charAt(len-i-1)) {
return false;
}
}
return true;
}
But what is the efficient way of finding palindrome substrings?
This can be done in O(n), using Manacher's algorithm.
The main idea is a combination of dynamic programming and (as others have said already) computing maximum length of palindrome with center in a given letter.
What we really want to calculate is radius of the longest palindrome, not the length.
The radius is simply length/2 or (length - 1)/2 (for odd-length palindromes).
After computing palindrome radius pr at given position i we use already computed radiuses to find palindromes in range [i - pr ; i]. This lets us (because palindromes are, well, palindromes) skip further computation of radiuses for range [i ; i + pr].
While we search in range [i - pr ; i], there are four basic cases for each position i - k (where k is in 1,2,... pr):
no palindrome (radius = 0) at i - k
(this means radius = 0 at i + k, too)
inner palindrome, which means it fits in range
(this means radius at i + k is the same as at i - k)
outer palindrome, which means it doesn't fit in range
(this means radius at i + k is cut down to fit in range, i.e because i + k + radius > i + pr we reduce radius to pr - k)
sticky palindrome, which means i + k + radius = i + pr
(in that case we need to search for potentially bigger radius at i + k)
Full, detailed explanation would be rather long. What about some code samples? :)
I've found C++ implementation of this algorithm by Polish teacher, mgr Jerzy Wałaszek.
I've translated comments to english, added some other comments and simplified it a bit to be easier to catch the main part.
Take a look here.
Note: in case of problems understanding why this is O(n), try to look this way:
after finding radius (let's call it r) at some position, we need to iterate over r elements back, but as a result we can skip computation for r elements forward. Therefore, total number of iterated elements stays the same.
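Since the linked implementation is not reproduced here, the following is my own minimal sketch of Manacher's algorithm in Java, not a translation of that C++ code. The class and method names are assumptions. It interleaves the input with a separator so that odd and even centers are handled uniformly, and counts palindromic substrings by occurrence (so "abba" gives 6, as in the question).

```java
class Manacher {
    // For s = "abba" the working string is "|a|b|b|a|". radii[i] is how far the palindrome
    // centered at position i extends, which equals its length in the original string.
    static int[] radii(String s) {
        int n = 2 * s.length() + 1;
        char[] t = new char[n];
        for (int i = 0; i < n; i++) t[i] = (i % 2 == 0) ? '|' : s.charAt(i / 2);

        int[] p = new int[n];
        int center = 0, right = 0; // right edge of the palindrome reaching furthest so far
        for (int i = 0; i < n; i++) {
            if (i < right) {
                // Reuse the mirrored radius, cut down so it stays inside the known palindrome.
                p[i] = Math.min(right - i, p[2 * center - i]);
            }
            // Try to extend beyond what the mirror guarantees.
            while (i - p[i] - 1 >= 0 && i + p[i] + 1 < n && t[i - p[i] - 1] == t[i + p[i] + 1]) {
                p[i]++;
            }
            if (i + p[i] > right) {
                center = i;
                right = i + p[i];
            }
        }
        return p;
    }

    // A longest palindrome of length L at some center contains (L + 1) / 2 palindromes there.
    static long countPalindromes(String s) {
        long total = 0;
        for (int r : radii(s)) total += (r + 1) / 2;
        return total;
    }
}
```

The inner while loop only runs when it pushes the right edge further, and that edge never moves back, which is the reason the total work stays linear. Positions at even indices always hold separators and odd indices always hold input characters, so the separator is only ever compared with itself.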
Perhaps you could iterate across potential middle character (odd length palindromes) and middle points between characters (even length palindromes) and extend each until you cannot get any further (next left and right characters don't match).
That would save a lot of computation when there are not many palindromes in the string. In that case the cost would be O(n) for palindrome-sparse strings.
For palindrome-dense inputs it would be O(n^2), as each position can be extended by at most half the length of the array. Obviously this is even less towards the ends of the array.
public Set<String> palindromes(final String input) {
final Set<String> result = new HashSet<>();
for (int i = 0; i < input.length(); i++) {
// expanding even length palindromes:
expandPalindromes(result,input,i,i+1);
// expanding odd length palindromes:
expandPalindromes(result,input,i,i);
}
return result;
}
public void expandPalindromes(final Set<String> result, final String s, int i, int j) {
while (i >= 0 && j < s.length() && s.charAt(i) == s.charAt(j)) {
result.add(s.substring(i,j+1));
i--; j++;
}
}
So, each distinct letter is already a palindrome - so you already have N + 1 palindromes, where N is the number of distinct letters (plus empty string). You can do that in single run - O(N).
Now, for non-trivial palindromes, you can test each point of your string to be a center of potential palindrome - grow in both directions - something that Valentin Ruano suggested.
This solution will take O(N^2) since each test is O(N) and number of possible "centers" is also O(N) - the center is either a letter or space between two letters, again as in Valentin's solution.
Note that there is also an O(N) solution to your problem, based on Manacher's algorithm (the article describes "longest palindrome", but the algorithm can be used to count all of them).
I just came up with my own logic which helps to solve this problem.
Happy coding.. :-)
System.out.println("Finding all palindromes in a given string : ");
subPal("abcacbbbca");
private static void subPal(String str) {
String s1 = "";
int N = str.length(), count = 0;
Set<String> palindromeArray = new HashSet<String>();
System.out.println("Given string : " + str);
System.out.println("******** Ignoring single character as substring palindrome");
for (int i = 2; i <= N; i++) {
for (int j = 0; j <= N; j++) {
int k = i + j - 1;
if (k >= N)
continue;
s1 = str.substring(j, i + j);
if (s1.equals(new StringBuilder(s1).reverse().toString())) {
palindromeArray.add(s1);
}
}
}
System.out.println(palindromeArray);
for (String s : palindromeArray)
System.out.println(s + " - is a palindrome string.");
System.out.println("The no.of substring that are palindrome : "
+ palindromeArray.size());
}
Output:-
Finding all palindromes in a given string :
Given string : abcacbbbca
******** Ignoring single character as substring palindrome ********
[cac, acbbbca, cbbbc, bb, bcacb, bbb]
cac - is a palindrome string.
acbbbca - is a palindrome string.
cbbbc - is a palindrome string.
bb - is a palindrome string.
bcacb - is a palindrome string.
bbb - is a palindrome string.
The no.of substring that are palindrome : 6
I suggest building up from a base case and expanding until you have all of the palindromes.
There are two types of palindromes: even-length and odd-length. I haven't figured out how to handle both in the same way, so I'll break it up.
1) Add all single letters
2) With this list you have all of the starting points for your palindromes. Run both of these for each index in the string (or 1 -> length-1 because you need at least 2 length):
void findAllEvenFrom(int index){
int i=0;
// keep going while index-i and index+i+1 are within string bounds
while(index-i >= 0 && index+i+1 < str.length()) {
if(str.charAt(index-i) != str.charAt(index+i+1))
return; // Here we found out that this index isn't a center for palindromes of >=i size, so we can give up
outputList.add(str.substring(index-i, index+i+2)); // end index is exclusive
i++;
}
}
//Odd looks about the same, but with a change in the bounds.
void findAllOddFrom(int index){
int i=0;
// keep going while index-i-1 and index+i+1 are within string bounds
while(index-i-1 >= 0 && index+i+1 < str.length()) {
if(str.charAt(index-i-1) != str.charAt(index+i+1))
return;
outputList.add(str.substring(index-i-1, index+i+2)); // end index is exclusive
i++;
}
}
I'm not sure if this helps the Big-O for your runtime, but it should be much more efficient than trying each substring. Worst case would be a string of all the same letter which may be worse than the "find every substring" plan, but with most inputs it will cut out most substrings because you can stop looking at one once you realize it's not the center of a palindrome.
I tried the following code and its working well for the cases
Also it handles individual characters too
Few of the cases which passed:
abaaa --> [aba, aaa, b, a, aa]
geek --> [g, e, ee, k]
abbaca --> [b, c, a, abba, bb, aca]
abaaba -->[aba, b, abaaba, a, baab, aa]
abababa -->[aba, babab, b, a, ababa, abababa, bab]
forgeeksskeegfor --> [f, g, e, ee, s, r, eksske, geeksskeeg,
o, eeksskee, ss, k, kssk]
Code
static Set<String> set = new HashSet<String>();
static String DIV = "|";
public static void main(String[] args) {
String str = "abababa";
String ext = getExtendedString(str);
// will check for even length palindromes
for(int i=2; i<ext.length()-1; i+=2) {
addPalindromes(i, 1, ext);
}
// will check for odd length palindromes including individual characters
for(int i=1; i<=ext.length()-2; i+=2) {
addPalindromes(i, 0, ext);
}
System.out.println(set);
}
/*
* Generates extended string, with dividors applied
* eg: input = abca
* output = |a|b|c|a|
*/
static String getExtendedString(String str) {
StringBuilder builder = new StringBuilder();
builder.append(DIV);
for(int i=0; i< str.length(); i++) {
builder.append(str.charAt(i));
builder.append(DIV);
}
String ext = builder.toString();
return ext;
}
/*
* Recursive matcher
* If match is found for palindrome ie char[mid-offset] = char[mid+ offset]
* Calculate further with offset+=2
*
*
*/
static void addPalindromes(int mid, int offset, String ext) {
// boundary checks
if(mid - offset <0 || mid + offset > ext.length()-1) {
return;
}
if (ext.charAt(mid-offset) == ext.charAt(mid+offset)) {
set.add(ext.substring(mid-offset, mid+offset+1).replace(DIV, ""));
addPalindromes(mid, offset+2, ext);
}
}
Hope its fine
public class PolindromeMyLogic {
static int polindromeCount = 0;
private static HashMap<Character, List<Integer>> findCharAndOccurance(
char[] charArray) {
HashMap<Character, List<Integer>> map = new HashMap<Character, List<Integer>>();
for (int i = 0; i < charArray.length; i++) {
char c = charArray[i];
if (map.containsKey(c)) {
List list = map.get(c);
list.add(i);
} else {
List list = new ArrayList<Integer>();
list.add(i);
map.put(c, list);
}
}
return map;
}
private static void countPolindromeByPositions(char[] charArray,
HashMap<Character, List<Integer>> map) {
map.forEach((character, list) -> {
int n = list.size();
if (n > 1) {
for (int i = 0; i < n - 1; i++) {
for (int j = i + 1; j < n; j++) {
if (list.get(i) + 1 == list.get(j)
|| list.get(i) + 2 == list.get(j)) {
polindromeCount++;
} else {
char[] temp = new char[(list.get(j) - list.get(i))
+ 1];
int jj = 0;
for (int ii = list.get(i); ii <= list
.get(j); ii++) {
temp[jj] = charArray[ii];
jj++;
}
if (isPolindrome(temp))
polindromeCount++;
}
}
}
}
});
}
private static boolean isPolindrome(char[] charArray) {
// Compare characters from both ends, moving inwards
for (int i = 0, j = charArray.length - 1; i < j; i++, j--) {
if (charArray[i] != charArray[j])
return false;
}
return true;
}
public static void main(String[] args) {
String str = "MADAM";
char[] charArray = str.toCharArray();
countPolindromeByPositions(charArray, findCharAndOccurance(charArray));
System.out.println(polindromeCount);
}
}
Try this out. It's my own solution.
// Maintain a Set of palindromes so that we get distinct elements at the end
// Add each char to the set. Also treat that char as the middle point and traverse the string, checking that the left and right chars are equal
static int palindrome(String str) {
Set<String> distinctPln = new HashSet<String>();
for (int i=0; i<str.length();i++) {
distinctPln.add(String.valueOf(str.charAt(i)));
// Odd-length palindromes centred on i: expand while both ends match
for (int j=i-1, k=i+1; j>=0 && k<str.length() && str.charAt(j)==str.charAt(k); j--, k++) {
distinctPln.add(str.substring(j,k+1));
}
// Even-length palindromes centred between i and i+1
for (int j=i, k=i+1; j>=0 && k<str.length() && str.charAt(j)==str.charAt(k); j--, k++) {
distinctPln.add(str.substring(j,k+1));
}
}
Iterator<String> distinctPlnItr = distinctPln.iterator();
while ( distinctPlnItr.hasNext()) {
System.out.print(distinctPlnItr.next()+ ",");
}
return distinctPln.size();
}
This code finds all distinct substrings that are palindromes.
Here is the code I tried. It is working fine.
import java.util.HashSet;
import java.util.Set;
public class SubstringPalindrome {
public static void main(String[] args) {
String s = "abba";
checkPalindrome(s);
}
public static int checkPalindrome(String s) {
int L = s.length();
int counter =0;
long startTime = System.currentTimeMillis();
Set<String> hs = new HashSet<String>();
// add elements to the hash set
System.out.println("Possible substrings: ");
for (int i = 0; i < L; ++i) {
for (int j = 0; j < (L - i); ++j) {
String subs = s.substring(j, i + j + 1);
counter++;
System.out.println(subs);
if(isPalindrome(subs))
hs.add(subs);
}
}
System.out.println("Total possible substrings are "+counter);
System.out.println("Total palindromic substrings are "+hs.size());
System.out.println("Possible palindromic substrings: "+hs.toString());
long endTime = System.currentTimeMillis();
System.out.println("It took " + (endTime - startTime) + " milliseconds");
return hs.size();
}
public static boolean isPalindrome(String s) {
if(s.length() == 0 || s.length() ==1)
return true;
if(s.charAt(0) == s.charAt(s.length()-1))
return isPalindrome(s.substring(1, s.length()-1));
return false;
}
}
OUTPUT:
Possible substrings:
a
b
b
a
ab
bb
ba
abb
bba
abba
Total possible substrings are 10
Total palindromic substrings are 4
Possible palindromic substrings: [bb, a, b, abba]
It took 1 milliseconds
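The brute-force version above tests every substring separately. As a rough alternative, here is a minimal sketch of the expand-around-centre idea used in some of the other answers: each of the 2n-1 centres (a character, or the gap between two characters) is grown outwards for as long as both ends match. The class and method names are made up for this illustration.

```java
import java.util.Set;
import java.util.TreeSet;

class PalindromeCentres {
    // Collects every distinct palindromic substring of s.
    static Set<String> distinctPalindromes(String s) {
        Set<String> found = new TreeSet<>();
        // Even centre numbers sit on a character, odd ones between two characters.
        for (int centre = 0; centre < 2 * s.length() - 1; centre++) {
            int left = centre / 2;
            int right = left + centre % 2;
            // Grow outwards while the two ends still match.
            while (left >= 0 && right < s.length()
                    && s.charAt(left) == s.charAt(right)) {
                found.add(s.substring(left, right + 1));
                left--;
                right++;
            }
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(distinctPalindromes("abba"));
    }
}
```

For "abba" this finds the same four palindromes as the output above (a, b, bb, abba), without building substrings that cannot be palindromes.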

How to iteratively generate k elements subsets from a set of size n in java?

I'm working on a puzzle that involves analyzing all size k subsets and figuring out which one is optimal. I wrote a solution that works when the number of subsets is small, but it runs out of memory for larger problems. Now I'm trying to translate an iterative function written in Python to Java, so that I can analyze each subset as it's created and keep only the value that represents how optimal it is, rather than the entire collection of subsets. Here is what I have so far; it doesn't seem to finish even for very small problems:
public static LinkedList<LinkedList<Integer>> getSets(int k, LinkedList<Integer> set)
{
int N = set.size();
int maxsets = nCr(N, k);
LinkedList<LinkedList<Integer>> toRet = new LinkedList<LinkedList<Integer>>();
int remains, thresh;
LinkedList<Integer> newset;
for (int i=0; i<maxsets; i++)
{
remains = k;
newset = new LinkedList<Integer>();
for (int val=1; val<=N; val++)
{
if (remains==0)
break;
thresh = nCr(N-val, remains-1);
if (i < thresh)
{
newset.add(set.get(val-1));
remains --;
}
else
{
i -= thresh;
}
}
toRet.add(newset);
}
return toRet;
}
Can anybody help me debug this function or suggest another algorithm for iteratively generating size k subsets?
EDIT: I finally got this function working. I had to create a new variable holding a copy of i for the comparison with thresh, because Python's for loop ignores changes made to the loop index inside the body, whereas in Java i -= thresh changes the loop counter itself.
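For reference, a minimal sketch of what that fix could look like. The question does not show nCr, so the version here is a stand-in, and the names SubsetUnranking and subsetAt are invented for this example. The working copy idx is modified instead of the loop counter, and each subset is built on demand, so nothing has to be stored.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SubsetUnranking {
    // Stand-in binomial coefficient; returns 0 when k is out of range.
    static long nCr(int n, int k) {
        if (k < 0 || k > n) return 0;
        k = Math.min(k, n - k);
        long c = 1;
        for (int i = 1; i <= k; i++) {
            c = c * (n - k + i) / i;   // stays an exact integer at every step
        }
        return c;
    }

    // Builds the k-subset with the given 0-based rank, without storing the others.
    static <T> List<T> subsetAt(long rank, int k, List<T> set) {
        int n = set.size();
        long idx = rank;               // working copy; the caller's counter is untouched
        int remains = k;
        List<T> subset = new ArrayList<>();
        for (int val = 1; val <= n && remains > 0; val++) {
            long thresh = nCr(n - val, remains - 1);
            if (idx < thresh) {
                subset.add(set.get(val - 1));
                remains--;
            } else {
                idx -= thresh;
            }
        }
        return subset;
    }

    public static void main(String[] args) {
        List<Integer> set = Arrays.asList(1, 2, 3, 4);
        long total = nCr(set.size(), 2);
        for (long i = 0; i < total; i++) {
            System.out.println(subsetAt(i, 2, set));
        }
    }
}
```

Note that set.get is still a linear-time call on a LinkedList, so passing an ArrayList is the better choice here.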
First, if you intend to do random access on a list, you should pick a list implementation that supports that efficiently. From the javadoc on LinkedList:
All of the operations perform as could be expected for a doubly-linked
list. Operations that index into the list will traverse the list from
the beginning or the end, whichever is closer to the specified index.
An ArrayList is both more space efficient and much faster for random access. Actually, since you know the length beforehand, you can even use a plain array.
On to the algorithm. Let's start simple: how would you generate all subsets of size 1? Probably like this:
for (int i = 0; i < set.length; i++) {
int[] subset = {i};
process(subset);
}
Where process is a method that does something with the set, such as checking whether it is "better" than all subsets processed so far.
Now, how would you extend that to work for subsets of size 2? What is the relationship between subsets of size 2 and subsets of size 1? Well, any subset of size 2 can be turned into a subset of size 1 by removing its largest element. Put differently, each subset of size 2 can be generated by taking a subset of size 1 and adding a new element larger than all other elements in the set. In code:
static void processSubset(int[] set) {
int[] subset = new int[2];
for (int i = 0; i < set.length; i++) {
subset[0] = set[i];
processLargerSets(set, subset, i);
}
}
void processLargerSets(int[] set, int[] subset, int i) {
for (int j = i + 1; j < set.length; j++) {
subset[1] = set[j];
process(subset);
}
}
For subsets of arbitrary size k, observe that any subset of size k can be turned into a subset of size k-1 by chopping off the largest element. That is, all subsets of size k can be generated by taking every subset of size k - 1 and, for each value larger than the largest element already in it, adding that value to the subset. In code:
static void processSubsets(int[] set, int k) {
int[] subset = new int[k];
processLargerSubsets(set, subset, 0, 0);
}
static void processLargerSubsets(int[] set, int[] subset, int subsetSize, int nextIndex) {
if (subsetSize == subset.length) {
process(subset);
} else {
for (int j = nextIndex; j < set.length; j++) {
subset[subsetSize] = set[j];
processLargerSubsets(set, subset, subsetSize + 1, j + 1);
}
}
}
Test code:
static void process(int[] subset) {
System.out.println(Arrays.toString(subset));
}
public static void main(String[] args) throws Exception {
int[] set = {1,2,3,4,5};
processSubsets(set, 3);
}
But before you invoke this on huge sets remember that the number of subsets can grow rather quickly.
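To get a feel for that growth, the count can be computed before enumerating anything. The sketch below is only an illustration (the class name SubsetCount is made up); it uses BigInteger so the count itself cannot overflow.

```java
import java.math.BigInteger;

class SubsetCount {
    // Number of k-element subsets of an n-element set.
    static BigInteger binomial(int n, int k) {
        if (k < 0 || k > n) return BigInteger.ZERO;
        k = Math.min(k, n - k);        // symmetry: C(n, k) == C(n, n - k)
        BigInteger c = BigInteger.ONE;
        for (int i = 1; i <= k; i++) {
            // Multiply first, then divide: the intermediate value is always exact.
            c = c.multiply(BigInteger.valueOf(n - k + i))
                 .divide(BigInteger.valueOf(i));
        }
        return c;
    }

    public static void main(String[] args) {
        for (int n = 10; n <= 60; n += 10) {
            System.out.println("C(" + n + ", " + (n / 2) + ") = " + binomial(n, n / 2));
        }
    }
}
```

Choosing 10 elements out of 20 already gives 184756 subsets, and 20 out of 40 gives 137846528820, so it is worth checking the count before starting a long run.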
You can use org.apache.commons.math3.util.Combinations.
Example:
import java.util.Arrays;
import java.util.Iterator;
import org.apache.commons.math3.util.Combinations;
public class tmp {
public static void main(String[] args) {
for (Iterator<int[]> iter = new Combinations(5, 3).iterator(); iter.hasNext();) {
System.out.println(Arrays.toString(iter.next()));
}
}
}
Output:
[0, 1, 2]
[0, 1, 3]
[0, 2, 3]
[1, 2, 3]
[0, 1, 4]
[0, 2, 4]
[1, 2, 4]
[0, 3, 4]
[1, 3, 4]
[2, 3, 4]
Here is a combination iterator I wrote recently:
package psychicpoker;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Iterator;
import java.util.List;
import static com.google.common.base.Preconditions.checkArgument;
public class CombinationIterator<T> implements Iterator<Collection<T>> {
private int[] indices;
private List<T> elements;
private boolean hasNext = true;
public CombinationIterator(List<T> elements, int k) throws IllegalArgumentException {
checkArgument(k<=elements.size(), "Impossible to select %d elements from hand of size %d", k, elements.size());
this.indices = new int[k];
for(int i=0; i<k; i++)
indices[i] = k-1-i;
this.elements = elements;
}
public boolean hasNext() {
return hasNext;
}
private int inc(int[] indices, int maxIndex, int depth) throws IllegalStateException {
if(depth == indices.length) {
throw new IllegalStateException("The End");
}
if(indices[depth] < maxIndex) {
indices[depth] = indices[depth]+1;
} else {
indices[depth] = inc(indices, maxIndex-1, depth+1)+1;
}
return indices[depth];
}
private boolean inc() {
try {
inc(indices, elements.size() - 1, 0);
return true;
} catch (IllegalStateException e) {
return false;
}
}
public Collection<T> next() {
Collection<T> result = new ArrayList<T>(indices.length);
for(int i=indices.length-1; i>=0; i--) {
result.add(elements.get(indices[i]));
}
hasNext = inc();
return result;
}
public void remove() {
throw new UnsupportedOperationException();
}
}
I had the same problem today: generating all k-sized subsets of an n-sized set.
I had a recursive algorithm, written in Haskell, but the problem required that I write a new version in Java.
In Java, I thought I'd probably have to use memoization to optimize recursion. Turns out, I found a way to do it iteratively. I was inspired by this image, from Wikipedia, on the article about Combinations.
Method to calculate all k-sized subsets:
public static int[][] combinations(int k, int[] set) {
// binomial(N, K)
int c = (int) binomial(set.length, k);
// where all sets are stored
int[][] res = new int[c][Math.max(0, k)];
// the k indexes (from set) where the red squares are
// see image above
int[] ind = k < 0 ? null : new int[k];
// initialize red squares
for (int i = 0; i < k; ++i) { ind[i] = i; }
// for every combination
for (int i = 0; i < c; ++i) {
// get its elements (red square indexes)
for (int j = 0; j < k; ++j) {
res[i][j] = set[ind[j]];
}
// update red squares, starting by the last
int x = ind.length - 1;
boolean loop;
do {
loop = false;
// move to next
ind[x] = ind[x] + 1;
// if crossing boundaries, move previous
if (ind[x] > set.length - (k - x)) {
--x;
loop = x >= 0;
} else {
// update every following square
for (int x1 = x + 1; x1 < ind.length; ++x1) {
ind[x1] = ind[x1 - 1] + 1;
}
}
} while (loop);
}
return res;
}
Method for the binomial:
(Adapted from Python example, from Wikipedia)
private static long binomial(int n, int k) {
if (k < 0 || k > n) return 0;
if (k > n - k) { // take advantage of symmetry
k = n - k;
}
long c = 1;
for (int i = 1; i < k+1; ++i) {
c = c * (n - (k - i));
c = c / i;
}
return c;
}
Of course, combinations will always have the problem of space, as they likely explode.
In the context of my own problem, the maximum possible is about 2,000,000 subsets. My machine calculated this in 1032 milliseconds.
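If storing every subset is the bottleneck, the same index-advancing idea can hand each combination to a callback instead of filling a result array. The following is a sketch along those lines, not the code from this answer: forEachCombination and the "largest sum" score are hypothetical names chosen for the example.

```java
import java.util.Arrays;
import java.util.function.Consumer;

class StreamingCombinations {
    // Calls visitor once per k-subset of set. The same array is reused on
    // every call, so the visitor must copy it if it wants to keep it.
    static void forEachCombination(int[] set, int k, Consumer<int[]> visitor) {
        int n = set.length;
        if (k < 0 || k > n) return;
        int[] ind = new int[k];
        for (int i = 0; i < k; i++) ind[i] = i;
        int[] current = new int[k];
        while (true) {
            for (int j = 0; j < k; j++) current[j] = set[ind[j]];
            visitor.accept(current);
            // Find the rightmost index that can still move right.
            int x = k - 1;
            while (x >= 0 && ind[x] == n - k + x) x--;
            if (x < 0) return;         // every index is at its final position
            ind[x]++;
            // Reset the following indices to be consecutive.
            for (int j = x + 1; j < k; j++) ind[j] = ind[j - 1] + 1;
        }
    }

    public static void main(String[] args) {
        int[] set = {1, 2, 3, 4, 5};
        int[] best = {Integer.MIN_VALUE};  // hypothetical score: largest sum seen
        forEachCombination(set, 3,
            c -> best[0] = Math.max(best[0], Arrays.stream(c).sum()));
        System.out.println(best[0]);
    }
}
```

Memory use stays at O(k) regardless of how many subsets there are; only the running time grows with the binomial.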
Inspired by afsantos's answer :-)... I decided to write a C# .NET implementation to generate all subset combinations of a certain size from a full set. It doesn't need to calc the total number of possible subsets; it detects when it's reached the end. Here it is:
public static List<object[]> generateAllSubsetCombinations(object[] fullSet, ulong subsetSize) {
if (fullSet == null) {
throw new ArgumentException("Value cannot be null.", "fullSet");
}
else if (subsetSize < 1) {
throw new ArgumentException("Subset size must be 1 or greater.", "subsetSize");
}
else if ((ulong)fullSet.LongLength < subsetSize) {
throw new ArgumentException("Subset size cannot be greater than the total number of entries in the full set.", "subsetSize");
}
// All possible subsets will be stored here
List<object[]> allSubsets = new List<object[]>();
// Initialize current pick; will always be the leftmost consecutive x where x is subset size
ulong[] currentPick = new ulong[subsetSize];
for (ulong i = 0; i < subsetSize; i++) {
currentPick[i] = i;
}
while (true) {
// Add this subset's values to list of all subsets based on current pick
object[] subset = new object[subsetSize];
for (ulong i = 0; i < subsetSize; i++) {
subset[i] = fullSet[currentPick[i]];
}
allSubsets.Add(subset);
if (currentPick[0] + subsetSize >= (ulong)fullSet.LongLength) {
// Last pick must have been the final subsetSize entries; end of subset generation
break;
}
// Update current pick for next subset
ulong shiftAfter = (ulong)currentPick.LongLength - 1;
bool loop;
do {
loop = false;
// Move current picker right
currentPick[shiftAfter]++;
// If we've gotten to the end of the full set, move left one picker
if (currentPick[shiftAfter] > (ulong)fullSet.LongLength - (subsetSize - shiftAfter)) {
if (shiftAfter > 0) {
shiftAfter--;
loop = true;
}
}
else {
// Update pickers to be consecutive
for (ulong i = shiftAfter+1; i < (ulong)currentPick.LongLength; i++) {
currentPick[i] = currentPick[i-1] + 1;
}
}
} while (loop);
}
return allSubsets;
}
This solution worked for me:
private static void findSubsets(int array[])
{
int numOfSubsets = 1 << array.length;
for(int i = 0; i < numOfSubsets; i++)
{
int pos = array.length - 1;
int bitmask = i;
System.out.print("{");
while(bitmask > 0)
{
if((bitmask & 1) == 1)
System.out.print(array[pos]+",");
bitmask >>= 1;
pos--;
}
System.out.print("}");
}
}
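Note that the loop above walks through all 2^n subsets of every size. To restrict it to subsets of size k, as the question asks, one option is to skip every bitmask whose number of set bits is not k. A minimal sketch, assuming at most 30 elements so that 1 << n fits in an int (the class and method names are invented here):

```java
import java.util.ArrayList;
import java.util.List;

class BitmaskSubsets {
    // Returns all subsets of array that contain exactly k elements.
    static List<List<Integer>> subsetsOfSize(int[] array, int k) {
        if (array.length > 30) {
            throw new IllegalArgumentException("too many elements for an int bitmask");
        }
        List<List<Integer>> result = new ArrayList<>();
        int numOfSubsets = 1 << array.length;
        for (int mask = 0; mask < numOfSubsets; mask++) {
            // Keep only masks with exactly k bits set.
            if (Integer.bitCount(mask) != k) continue;
            List<Integer> subset = new ArrayList<>();
            for (int pos = 0; pos < array.length; pos++) {
                if ((mask & (1 << pos)) != 0) subset.add(array[pos]);
            }
            result.add(subset);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(subsetsOfSize(new int[]{1, 2, 3}, 2));
    }
}
```

This still inspects all 2^n masks, so it only pays off for small n; the index-based approaches in the other answers do work proportional to the number of k-subsets instead.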
Swift implementation:
Below are two variants on the answer provided by afsantos.
The first implementation of the combinations function mirrors the functionality of the original Java implementation.
The second implementation is a general case for finding all combinations of k values from the set [0, setSize). If this is really all you need, this implementation will be a bit more efficient.
In addition, they include a few minor optimizations and a smidgen of logic simplification.
/// Calculate the binomial for a set with a subset size
func binomial(setSize: Int, subsetSize: Int) -> Int
{
if (subsetSize <= 0 || subsetSize > setSize) { return 0 }
// Take advantage of symmetry
var subsetSizeDelta = subsetSize
if (subsetSizeDelta > setSize - subsetSizeDelta)
{
subsetSizeDelta = setSize - subsetSizeDelta
}
// Early-out
if subsetSizeDelta == 0 { return 1 }
var c = 1
for i in 1...subsetSizeDelta
{
c = c * (setSize - (subsetSizeDelta - i))
c = c / i
}
return c
}
/// Calculates all possible combinations of subsets of `subsetSize` values within `set`
func combinations(subsetSize: Int, set: [Int]) -> [[Int]]?
{
// Validate inputs
if subsetSize <= 0 || subsetSize > set.count { return nil }
// Use a binomial to calculate total possible combinations
let comboCount = binomial(setSize: set.count, subsetSize: subsetSize)
if comboCount == 0 { return nil }
// Our set of combinations
var combos = [[Int]]()
combos.reserveCapacity(comboCount)
// Initialize the combination to the first group of set indices
var subsetIndices = [Int](0..<subsetSize)
// For every combination
for _ in 0..<comboCount
{
// Add the new combination
var comboArr = [Int]()
comboArr.reserveCapacity(subsetSize)
for j in subsetIndices { comboArr.append(set[j]) }
combos.append(comboArr)
// Update combination, starting with the last
var x = subsetSize - 1
while true
{
// Move to next
subsetIndices[x] = subsetIndices[x] + 1
// If crossing boundaries, move previous
if (subsetIndices[x] > set.count - (subsetSize - x))
{
x -= 1
if x >= 0 { continue }
}
else
{
for x1 in x+1..<subsetSize
{
subsetIndices[x1] = subsetIndices[x1 - 1] + 1
}
}
break
}
}
return combos
}
/// Calculates all possible combinations of subsets of `subsetSize` values within a set
/// of zero-based values for the set [0, `setSize`)
func combinations(subsetSize: Int, setSize: Int) -> [[Int]]?
{
// Validate inputs
if subsetSize <= 0 || subsetSize > setSize { return nil }
// Use a binomial to calculate total possible combinations
let comboCount = binomial(setSize: setSize, subsetSize: subsetSize)
if comboCount == 0 { return nil }
// Our set of combinations
var combos = [[Int]]()
combos.reserveCapacity(comboCount)
// Initialize the combination to the first group of elements
var subsetValues = [Int](0..<subsetSize)
// For every combination
for _ in 0..<comboCount
{
// Add the new combination
combos.append([Int](subsetValues))
// Update combination, starting with the last
var x = subsetSize - 1
while true
{
// Move to next
subsetValues[x] = subsetValues[x] + 1
// If crossing boundaries, move previous
if (subsetValues[x] > setSize - (subsetSize - x))
{
x -= 1
if x >= 0 { continue }
}
else
{
for x1 in x+1..<subsetSize
{
subsetValues[x1] = subsetValues[x1 - 1] + 1
}
}
break
}
}
return combos
}
