I am working through the Minimax algorithm with Alpha-Beta Pruning example found here. In the example, they use an array to implement the search tree. I followed the example, but also tried implementing it with a binary search tree. Here are the values I'm using in the tree: 3, 5, 6, 9, 1, 2, 0, -1.
The optimal value at the end should be 5. With the BST implementation, I keep getting 2.
I think this is the problem, but I don't know how to get around it:
I wrote the code to return out of recursion if it sees a leaf node to stop from getting null pointer exceptions when trying to check the next value. But instead, I think it's stopping the search too early (based off of what I see when stepping through the code with the debugger). If I remove the check though, the code fails on a null pointer.
Can someone point me in the right direction? What am I doing wrong?
Here's the code:
public class AlphaBetaMiniMax {
private static BinarySearchTree myTree = new BinarySearchTree();
static int MAX = 1000;
static int MIN = -1000;
static int opt;
public static void main(String[] args) {
//Start constructing the game
AlphaBetaMiniMax demo = new AlphaBetaMiniMax();
//3, 5, 6, 9, 1, 2, 0, -1
demo.myTree.insert(3);
demo.myTree.insert(5);
demo.myTree.insert(6);
demo.myTree.insert(9);
demo.myTree.insert(1);
demo.myTree.insert(2);
demo.myTree.insert(0);
demo.myTree.insert(-1);
//print the tree
System.out.println("Game Tree: ");
demo.myTree.printTree(demo.myTree.root);
//Print the results of the game
System.out.println("\nGame Results:");
//run the minimax algorithm with the following inputs
int optimalVal = demo.minimax(0, myTree.root, true, MAX, MIN);
System.out.println("Optimal Value: " + optimalVal);
}
/**
* @param alpha = 1000
* @param beta = -1000
* @param nodeIndex - the current node
* @param depth - the depth to search
* @param maximizingPlayer - the current player making a move
* @return - the best move for the current player
*/
public int minimax(int depth, MiniMaxNode nodeIndex, boolean maximizingPlayer, double alpha, double beta) {
//Base Case #1: Reached the bottom of the tree
if (depth == 2) {
return nodeIndex.getValue();
}
//Base Case #2: if reached a leaf node, return the value of the current node
if (nodeIndex.getLeft() == null && maximizingPlayer == false) {
return nodeIndex.getValue();
} else if (nodeIndex.getRight() == null && maximizingPlayer == true) {
return nodeIndex.getValue();
}
//Mini-Max Algorithm
if (maximizingPlayer) {
int best = MIN;
//Recur for left and right children
for (int i = 0; i < 2; i++) {
int val = minimax(depth + 1, nodeIndex.getLeft(), false, alpha, beta);
best = Math.max(best, val);
alpha = Math.max(alpha, best);
//Alpha Beta Pruning
if (beta <= alpha) {
break;
}
}
return best;
} else {
int best = MAX;
//Recur for left and right children
for (int i = 0; i < 2; i++) {
int val = minimax(depth + 1, nodeIndex.getRight(), true, alpha, beta);
best = Math.min(best, val);
beta = Math.min(beta, best);
//Alpha Beta Pruning
if (beta <= alpha) {
break;
}
}
return best;
}
}
}
Output:
Game Tree:
-1 ~ 0 ~ 1 ~ 2 ~ 3 ~ 5 ~ 6 ~ 9 ~
Game Results:
Optimal Value: 2
Your problem is that your iteration is controlled by a fixed loop count of 2, and not by checking whether nodeIndex.getRight() (for max) or nodeIndex.getLeft() (for min) is null.
Remember that a full binary tree has:
1 node (the head) on the first level
2 nodes on the second level
4 nodes on the third level
8 nodes on the fourth level
and so on. So your looping algorithm will not even go down 3 levels.
for (int i = 0; i < 2; i++) {
int val = minimax(depth + 1, nodeIndex.getLeft(), false, alpha, beta);
best = Math.max(best, val);
alpha = Math.max(alpha, best);
//Alpha Beta Pruning
if (beta <= alpha) {
break;
}
}
Change your loops to control iteration correctly and you should find the highest value easily.
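For comparison, here is a minimal sketch of the same search on an explicit game tree instead of a binary search tree. A BST orders values by key, so inserting 3, 5, 6, 9, 1, 2, 0, -1 does not leave them at the leaves in game order. The sketch assumes the eight values are the leaves of a complete binary tree of depth 3; the names GameTreeSketch, Node and build are illustrative and not from the original code. Note that alpha starts at MIN and beta at MAX, whereas the call in the question passes MAX for alpha and MIN for beta.

```java
// Illustrative sketch only: class and method names are hypothetical.
final class GameTreeSketch {
    static final int MAX = 1000;
    static final int MIN = -1000;

    /** A node is a leaf when both children are null; only leaves carry a score. */
    static final class Node {
        final int value;
        final Node left;
        final Node right;

        Node(int value, Node left, Node right) {
            this.value = value;
            this.left = left;
            this.right = right;
        }
    }

    /** Builds a binary tree whose leaves are leaves[lo..hi) in their given order. */
    static Node build(int[] leaves, int lo, int hi) {
        if (hi - lo == 1) {
            return new Node(leaves[lo], null, null);
        }
        int mid = (lo + hi) / 2;
        return new Node(0, build(leaves, lo, mid), build(leaves, mid, hi));
    }

    static int minimax(Node node, boolean maximizingPlayer, int alpha, int beta) {
        // Base case: a leaf returns its own score.
        if (node.left == null && node.right == null) {
            return node.value;
        }
        int best = maximizingPlayer ? MIN : MAX;
        // Visit the left child, then the right child (not the same child twice).
        for (Node child : new Node[] {node.left, node.right}) {
            if (child == null) {
                continue;
            }
            int val = minimax(child, !maximizingPlayer, alpha, beta);
            if (maximizingPlayer) {
                best = Math.max(best, val);
                alpha = Math.max(alpha, best);
            } else {
                best = Math.min(best, val);
                beta = Math.min(beta, best);
            }
            // Alpha-beta pruning: the remaining sibling cannot change the result.
            if (beta <= alpha) {
                break;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        int[] leaves = {3, 5, 6, 9, 1, 2, 0, -1};
        Node root = build(leaves, 0, leaves.length);
        System.out.println("Optimal Value: " + minimax(root, true, MIN, MAX));
    }
}
```

With the leaves in that order, the maximizer's value at the root works out to max(min(5, 9), min(2, 0)) = 5, which matches the expected answer in the question.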
There's a question I saw and I'm wondering if it's possible to solve it using recursion. It goes as follows:
Write an algorithm that, when given an array of input, finds the maximum product from those inputs. For example:
Input: [1, 2, 3]
Output: 6 (1*2*3)
Input: [-1, 1, 2, 3]
Output: 6 (1*2*3)
Input: [-2, -1, 1, 2, 3]
Output: 12 (-2*-1*1*2*3)
I'm trying to find a way of using recursion to solve it, but the algorithm I tried doesn't work. My algorithm, written in Java, is as follows:
Integer[] array;
public int maximumProduct(int[] nums) {
array=new Integer[nums.length];
return multiply(nums, 0);
}
public int multiply(int[] nums, int i){
if (array[i]!=null){
return array[i];
}
if (i==(nums.length-1)){
return nums[i];
}
int returnval=Math.max(nums[i]*multiply(nums, i+1), multiply(nums, i+1));
array[i]=returnval;
return returnval;
}
The problem with this algorithm is that it doesn't work well if there's an even number of negative numbers. For example, if nums[0]=-2, nums[1]=-1 and nums[2]=1, then multiply(nums, 1) will always return 1 instead of -1, and thus it will always see 1 as bigger than 1*-2 at multiply(nums, 0). I'm not sure how to solve this problem, however. Is there any way of solving this using recursion or dynamic programming?
If there is only one non-zero element in the array, and it happens to be a negative number, then the answer is either 0, if there is a 0 present in the input, or, if the array contains only that single negative element, the element itself.
In all other cases, the final answer is going to be positive.
We first make a linear scan to find the number of negative integers. If this number is even, then the answer is the product of all the non-zero elements. If there are an odd number of negative elements, we need to leave out one negative element from the answer, so that the answer is positive. As we want the maximum possible answer, the number we want to leave out should have as small an absolute value as possible. So among all the negative numbers, find the one with the minimum absolute value, and find the product of the remaining non-zero elements, which should be the answer.
All this requires only two linear scans of the array, and hence runs in O(n) time.
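The steps above can be sketched as follows. This is only an illustration under a few assumptions: the product is taken over any non-empty subset of the inputs, the result fits in a long, and the class and method names (MaxProductSketch, maxProduct) are made up for the example. It also folds the two scans into one pass, which does not change the O(n) bound.

```java
// Illustrative sketch only: names are hypothetical, not from the original post.
final class MaxProductSketch {
    static long maxProduct(int[] nums) {
        if (nums.length == 0) {
            throw new IllegalArgumentException("input must not be empty");
        }
        int negatives = 0;
        int zeros = 0;
        int nonZero = 0;
        int closestNeg = 0;   // negative value with the smallest absolute value; 0 = none seen
        long product = 1;     // product of all non-zero elements

        for (int v : nums) {
            if (v == 0) {
                zeros++;
                continue;
            }
            nonZero++;
            product *= v;
            if (v < 0) {
                negatives++;
                if (closestNeg == 0 || v > closestNeg) {
                    closestNeg = v;
                }
            }
        }

        // Only zeros in the input.
        if (nonZero == 0) {
            return 0;
        }
        // A single non-zero element that is negative: take a zero if there is one.
        if (nonZero == 1 && negatives == 1) {
            return zeros > 0 ? 0 : closestNeg;
        }
        // Odd number of negatives: leave out the one closest to zero.
        if (negatives % 2 != 0) {
            product /= closestNeg;
        }
        return product;
    }

    public static void main(String[] args) {
        System.out.println(maxProduct(new int[] {-2, -1, 1, 2, 3}));
    }
}
```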
What is the maximum product of integers?
To obtain the maximum product, you will want to multiply all the positive integers with the product of the negative integers of largest absolute value, with the number of negative integers included in the product being even to obtain a positive final result.
In an algorithm for a single traversal
I am going to treat the positive integers and the negative integers in the input separately. You will want to keep a running product of positive integers, a running product of negative integers and the largest negative integer (ie. the negative integer with the smallest absolute value) found so far.
Let us ignore the edge cases where the final answer is <= 0. That can be handled easily.
//Initialization
int[] nums = {-2, -1, 1, 2, 3}; // Input (example array from the question)
int posProduct = 1;
int negProduct = 1;
int smallestNeg = 1;
//Input Traversal
for (int i : nums) {
if ( i == 0 ) {
// ignore
} else if ( i < 0 ) {
if (smallestNeg == 1) {
smallestNeg = i;
} else if ( i > smallestNeg ) {
negProduct *= smallestNeg; //Integrate the old smallest into the running product
smallestNeg = i; // i is the new smallest
} else {
negProduct *= i;
}
} else {
// i is strictly positive
posProduct *= i;
}
}
//Result Computation
int result = posProduct;
if ( negProduct < 0 ) {
// The running product of negative number numbers is negative
// We use the smallestNeg to turn it back up to a positive product
result *= smallestNeg;
result *= negProduct;
} else {
result *= negProduct;
}
edit: In a recursive traversal
I personally find that writing the array traversal in a recursive manner to be clumsy but it can be done.
For the beauty of the exercise and to actually answer the question of the OP, here is how I would do it.
public class RecursiveSolver {
public static int findMaxProduct (int [] nums) {
return recursiveArrayTraversal(1, 1, 1, nums, 0);
}
private static int recursiveArrayTraversal(int posProduct, int negProduct,
int smallestNeg, int [] nums, int index) {
if (index == nums.length) {
// End of the recursion, we traversed the whole array
posProduct *= negProduct;
if (posProduct < 0) {
posProduct *= smallestNeg;
}
return posProduct;
}
// Processing the "index" element of the array
int i = nums[index];
if ( i == 0 ) {
// ignore
} else if ( i < 0 ) {
if (smallestNeg == 1) {
smallestNeg = i;
} else if ( i > smallestNeg ) {
negProduct *= smallestNeg;
smallestNeg = i;
} else {
negProduct *= i;
}
} else {
// i is strictly positive
posProduct *= i;
}
//Recursive call here!
//Notice the index+1 for the index parameter which carries the progress
//in the array traversal
return recursiveArrayTraversal(posProduct, negProduct,
smallestNeg, nums, index+1);
}
}
First, break the array into subproblems whenever you find a 0 in the list:
1 -2 4 -1 8 0 4 1 0 -3 -4 0 1 3 -5
|_____________| |____| |____| |_______|
p1 p2 p3 p4
Then, for each problem pi, count how many negative numbers are there.
If pi has an even number of negatives (or no negatives at all), the answer of pi is the product of all its elements.
If pi has only 1 negative number (say n), the answer will be the maximum between the product of all the elements in n's right and the product of all elements in n's left.
If pi has an odd number (bigger than only 1) of negative numbers, call the index of the leftmost negative number l and the index of the rightmost negative number r. Supposing pi has n elements, the answer will be:
max(
pi[ 0 ] * pi[ 1 ] * ... * pi[r - 1],
pi[l + 1] * pi[l + 2] * ... * pi[n - 1]
)
Knowing that, it's easy to write a recursion for each step of the solution of this problem: a recursion to divide problems at zeros, another to count negatives and another to find answers, in O(n).
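As a rough illustration, and assuming this answer is about products of contiguous runs of the array, the observation that the best run in each zero-free piece is either a prefix or a suffix of that piece can be coded as a prefix/suffix scan. This is a swapped-in iterative formulation, not the three recursions described above, and the names (SegmentProductSketch, maxSegmentProduct) are hypothetical. Products are assumed to fit in a long.

```java
// Illustrative sketch only: an iterative prefix/suffix scan, names are hypothetical.
final class SegmentProductSketch {
    static long maxSegmentProduct(int[] a) {
        if (a.length == 0) {
            throw new IllegalArgumentException("input must not be empty");
        }
        long best = Long.MIN_VALUE;
        long prefix = 1;  // running product from the left, restarted after each zero
        long suffix = 1;  // running product from the right, restarted after each zero
        int n = a.length;
        for (int i = 0; i < n; i++) {
            prefix *= a[i];
            suffix *= a[n - 1 - i];
            best = Math.max(best, Math.max(prefix, suffix));
            // A zero ends the current piece; start a fresh product after it.
            if (prefix == 0) {
                prefix = 1;
            }
            if (suffix == 0) {
                suffix = 1;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        int[] example = {1, -2, 4, -1, 8, 0, 4, 1, 0, -3, -4, 0, 1, 3, -5};
        System.out.println(maxSegmentProduct(example));
    }
}
```

For the array in the diagram above, the winning piece is p1, whose two negatives cancel, giving 1 * -2 * 4 * -1 * 8 = 64.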
Linear version
List<Integer> vals = new ArrayList<>(List.of(5,1,-2,1,2,3,-4,-1));
int prod = 0;
int min = 1;
for (int v : vals) {
if (v == 0) {
// ignore zero values
continue;
}
if (prod == 0) {
prod = 1;
}
prod *= v;
// track the negative value closest to zero (min == 1 means none seen yet)
if (v < 0 && (min == 1 || v > min)) {
min = v;
}
}
if (prod < 0) {
prod /= min;
}
System.out.println("Maximum product = " + prod);
}
Recursive version
int prod = prod(vals, new int[] {0} , vals.size());
System.out.println("Maximum product = " + prod);
public static int prod(List<Integer> vals, int[]min, int size) {
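int prod = 0;
int t = 0; // first element; declared here so it is still in scope after the recursive call
if(vals.size() > 0) {
t = vals.get(0);
// track the negative value closest to zero (min[0] == 0 means none seen yet)
if (t < 0 && (min[0] == 0 || t > min[0])) {
min[0] = t;
}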
prod = prod(vals.subList(1,vals.size()), min, vals.size());
}
if (vals.isEmpty() || vals.get(0) == 0) {
return prod;
}
if (prod == 0) {
prod = 1;
}
prod *= t;
if (vals.size() == size && prod < 0) {
prod/=min[0];
}
return prod;
}
This is my solution - leaving it open for optimization and to figure out the runtime. This is a general purpose solution that finds the products of all the combinations of integers in a list. Of course, there is a O(n) solution but I present this solution as well.
import java.util.ArrayList;
import java.util.List;
public class MaxProd {
int[] input = {1, 2, 3};
// int[] input = {-2, -1, 1, 2, 3};
public static void main(String[] args) {
MaxProd m = new MaxProd();
List<Integer> ll = m.max(0);
for (int i : ll) {
System.out.println(i);
}
ll.sort((x,y) -> Integer.compare(x, y));
System.out.println("The max: " + ll.get(ll.size() -1 ));
}
private List<Integer> max(int index) {
if (index < input.length){
List<Integer> l = new ArrayList<>();
List<Integer> retList = max(index + 1);
for (int j : retList){
l.add(input[index] * j);
}
l.add(input[index]);
l.addAll(retList);
return l;
}
else return new ArrayList<>();
}
}
it prints:
6
2
3
1
6
2
3
The max: 6
If the requirements are constrained (as in this case) then one can get by without the need for generating all combinations resulting in a linear solution. Also, I'm sorting at the end. Note: you could easily get the result with a single pass on the returned list to find the maximum product as specified in other answers.
Given an integer A representing the square blocks. The height of each square block is 1. The task is to create a staircase of max height using these blocks. The first stair would require only one block, the second stair would require two blocks and so on. Find and return the maximum height of the staircase.
Your submission failed for the following input: A : 92761
Your function returned the following : 65536
The expected returned value : 430
Approach:
We are interested in the number of steps and we know that each step Si uses exactly Bi number of bricks. We can represent this problem as an equation:
n * (n + 1) / 2 = T (For Natural number series starting from 1, 2, 3, 4, 5 …)
n * (n + 1) = 2 * T
n-1 will represent our final solution because our series in problem starts from 2, 3, 4, 5…
Now, we just have to solve this equation and for that we can exploit binary search to find the solution to this equation. Lower and Higher bounds of binary search are 1 and T.
CODE
public int solve(int A) {
int l=1,h=A,T=2*A;
while(l<=h)
{
int mid=l+(h-l)/2;
if((mid*(mid+1))==T)
return mid;
if((mid*(mid+1))>T && (mid!=0 && (mid*(mid-1))<=T) )
return mid-1;
if((mid*(mid+1))>T)
h=mid-1;
else
l=mid+1;
}
return 0;
}
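One thing worth checking in the code above is integer overflow: for A = 92761 the first midpoint is 46381, and 46381 * 46382 is already larger than Integer.MAX_VALUE, so the int comparisons against T stop being reliable. A minimal sketch of the same binary search with long arithmetic is shown below; the class name StaircaseSearchSketch is made up, and the search looks for the largest n with n * (n + 1) <= 2 * A.

```java
// Illustrative sketch only: same binary search idea, but with long arithmetic.
final class StaircaseSearchSketch {
    static int solve(int a) {
        long target = 2L * a;
        long lo = 0;   // n = 0 always satisfies n * (n + 1) <= target
        long hi = a;   // the height can never exceed the number of blocks
        while (lo < hi) {
            // Round the midpoint up so the loop always makes progress.
            long mid = lo + (hi - lo + 1) / 2;
            if (mid * (mid + 1) <= target) {
                lo = mid;       // mid stairs fit, try taller
            } else {
                hi = mid - 1;   // mid stairs need too many blocks
            }
        }
        return (int) lo;
    }

    public static void main(String[] args) {
        System.out.println(solve(92761));
    }
}
```

For the failing input, 430 stairs need 430 * 431 / 2 = 92665 blocks and 431 stairs would need 93096, so the search settles on 430.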
To expand on the comment by Matt Timmermans:
You know that for n steps, you need (n * (n + 1))/2 blocks. You want to know, if given B blocks, how many steps you can create.
So you have:
(n * (n + 1))/2 = B
(n^2 + n)/2 = B
n^2 + n = 2B
n^2 + n - 2B = 0
That looks suspiciously like something for which you'd use the quadratic formula.
In this case, a=1, b=1, and c=(-2B). Plugging the numbers into the formula:
n = ((-b) + sqrt(b^2 - 4*a*c))/(2*a)
= (-1 + sqrt(1 - 4*1*(-2B)))/(2*1)
= (-1 + sqrt(1 + 8B))/2
= (sqrt(1 + 8B) - 1)/2
So if you have 5050 blocks, you get:
n = (sqrt(1 + 40400) - 1)/2
= (sqrt(40401) - 1)/2
= (201 - 1)/2
= 100
Try it with the quadratic formula calculator. Use 1 for the value of a and b, and replace c with negative two times the number of blocks you're given. So in the example above, c would be -10100.
In your program, since you can't have a partial step, you'd want to truncate the result.
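A minimal sketch of that calculation in Java might look like the following. The class and method names (StaircaseFormulaSketch, maxHeight) are made up for illustration, and the two small correction loops are an added safeguard against floating-point rounding in Math.sqrt, not part of the derivation itself.

```java
// Illustrative sketch only: closed-form height from the quadratic formula.
final class StaircaseFormulaSketch {
    static int maxHeight(int blocks) {
        long b = blocks;
        // n = (sqrt(1 + 8B) - 1) / 2, truncated toward zero
        long n = (long) ((Math.sqrt(1.0 + 8.0 * b) - 1.0) / 2.0);
        // Guard against rounding error: step down or up until n is the exact answer.
        while (n > 0 && n * (n + 1) / 2 > b) {
            n--;
        }
        while ((n + 1) * (n + 2) / 2 <= b) {
            n++;
        }
        return (int) n;
    }

    public static void main(String[] args) {
        System.out.println(maxHeight(5050));
        System.out.println(maxHeight(92761));
    }
}
```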
Why are you using all these formulas? A simple while() loop should do the trick, eventually, it's just a simple Gaussian Sum ..
public static int calculateStairs(int blocks) {
int lastHeight = 0;
int sum = 0;
int currentHeight = 0; //number of bricks / level
while (sum <= blocks) {
lastHeight = currentHeight;
currentHeight++;
sum += currentHeight;
}
return lastHeight;
}
So this should do the job, as it also returns the expected value. Correct me if I'm wrong.
public int solve(int blocks) {
int current; //Create Variables
for (int x = 0; x < Integer.MAX_VALUE; x++) { //Increment until return
current = 0; //Set current to 0
//Implementation of the Gauss sum
for (int i = 1; i <= x; i++) { //Sum up [1,*current height*]
current += i;
} //Now we have the amount of blocks required for the current height
//Now we check if the amount of blocks is bigger than
// the wanted amount, and if so we return the last one
if (current > blocks) {
return x - 1;
}
}
return current;
}
I'm trying to build a chess AI. My negamax function with alpha-beta pruning (ABP) runs much slower (about 8 times) than separate min and max functions also with ABP, though the moves returned are equal.
My board evaluation function always returns a value with respect to the red player, i.e. the higher the better for red. For Negamax only, this value is multiplied by -1 for the black player when returning at depth 0.
My Negamax function:
int alphaBeta(Board board, int depth, int alpha, int beta) {
if (depth <= 0 || board.isGameOver()) { // game over == checkmate/stalemate
int color = board.getCurrPlayer().getAlliance().isRed() ? 1 : -1;
return BoardEvaluator.evaluate(board, depth) * color;
}
int bestVal = Integer.MIN_VALUE + 1;
for (Move move : MoveSorter.simpleSort(board.getCurrPlayer().getLegalMoves())) {
MoveTransition transition = board.getCurrPlayer().makeMove(move);
if (transition.getMoveStatus().isAllowed()) { // allowed == legal && non-suicidal
int val = -alphaBeta(transition.getNextBoard(), depth - 1, -beta, -alpha);
if (val >= beta) {
return val; // fail-soft
}
if (val > bestVal) {
bestVal = val;
alpha = Math.max(alpha, val);
}
}
}
return bestVal;
}
The root call:
-alphaBeta(transition.getNextBoard(), searchDepth - 1,
Integer.MIN_VALUE + 1, Integer.MAX_VALUE); // +1 to avoid overflow when negating
My min and max functions:
int min(Board board, int depth, int alpha, int beta) {
if (depth <= 0 || board.isGameOver()) {
return BoardEvaluator.evaluate(board, depth);
}
int minValue = Integer.MAX_VALUE;
for (Move move : MoveSorter.simpleSort(board.getCurrPlayer().getLegalMoves())) {
MoveTransition transition = board.getCurrPlayer().makeMove(move);
if (transition.getMoveStatus().isAllowed()) {
minValue = Math.min(minValue, max(transition.getNextBoard(), depth - 1, alpha, beta));
beta = Math.min(beta, minValue);
if (alpha >= beta) break; // cutoff
}
}
return minValue;
}
int max(Board board, int depth, int alpha, int beta) {
if (depth <= 0 || board.isGameOver()) {
return BoardEvaluator.evaluate(board, depth);
}
int maxValue = Integer.MIN_VALUE;
for (Move move : MoveSorter.simpleSort(board.getCurrPlayer().getLegalMoves())) {
MoveTransition transition = board.getCurrPlayer().makeMove(move);
if (transition.getMoveStatus().isAllowed()) {
maxValue = Math.max(maxValue, min(transition.getNextBoard(), depth - 1, alpha, beta));
alpha = Math.max(alpha, maxValue);
if (alpha >= beta) break; // cutoff
}
}
return maxValue;
}
The root calls for red and black players respectively:
min(transition.getNextBoard(), searchDepth - 1, Integer.MIN_VALUE, Integer.MAX_VALUE);
max(transition.getNextBoard(), searchDepth - 1, Integer.MIN_VALUE, Integer.MAX_VALUE);
I'm guessing there's a bug with the cutoff in the Negamax function although I followed the pseudocode from here. Any help is appreciated, thanks!
EDIT: alphaBeta() is called about 6 times more than min() and max() combined, while the number of beta cutoffs is only about 2 times more.
Solved. I should have posted my full code for the root calls as well -- didn't realise I wasn't passing in the new value for beta. Alpha/beta was actually being updated in the root method for separate min-max.
Updated root method for Negamax:
Move bestMove = null;
int bestVal = Integer.MIN_VALUE + 1;
for (Move move : MoveSorter.simpleSort(currBoard.getCurrPlayer().getLegalMoves())) {
MoveTransition transition = currBoard.getCurrPlayer().makeMove(move);
if (transition.getMoveStatus().isAllowed()) {
int val = -alphaBeta(transition.getNextBoard(), searchDepth - 1, Integer.MIN_VALUE + 1, -bestVal);
if (val > bestVal) {
bestVal = val;
bestMove = move;
}
}
}
return bestMove;
Apologies for the lack of information provided in my question -- I didn't expect the bug to be there.
I am stuck on the coin denomination problem.
I am trying to find the lowest number of coins used to make up $5.70 (or 570 cents). For example, if the coin array is {100,5,2,5,1} (100 x 10c coins, 5 x 20c, 2 x 50c, 5 x $1, and 1 x $2 coin), then the result should be {0,1,1,3,1}
At the moment the coin array will consist of the same denominations ( $2, $1, 50c, 20c, 10c)
public static int[] makeChange(int change, int[] coins) {
// while you have coins of that denomination left and the total
// remaining amount exceeds that denomination, take a coin of that
// denomination (i.e add it to your result array, subtract it from the
// number of available coins, and update the total remainder).
for(int i= 0; i< coins.length; i++){
while (coins[i] > 0) {
if (coins[i] > 0 & change - 200 >= 0) {
coins[4] = coins[4]--;
change = change - 200;
} else
if (coins[i] > 0 & change - 100 >= 0) {
coins[3] = coins[3]--;
change = change - 100;
} else
if (coins[i] > 0 & change - 50 >= 0) {
coins[2] = coins[2]--;
change = change - 50;
} else
if (coins[i] > 0 & change - 20 >= 0) {
coins[1] = coins[1]--;
change = change - 20;
} else
if (coins[i] > 0 & change - 10 >= 0) {
coins[0] = coins[0]--;
change = change - 10;
}
}
}
return coins;
}
I am stuck on how to deduct the values from coins array and return it.
EDIT: New code
The brute-force solution is to try every count of the highest denomination, from zero up to the number of coins available (stopping when you run out or the remaining amount would become negative). For each count, recurse on the remaining amount with a shorter list that excludes that denomination, and pick the minimum over all of these. The base case is the lowest denomination d0: if it is 1c the problem can always be solved and the base case returns n; otherwise it returns n/d0, but care must be taken to return a large value when n is not evenly divisible by d0 so that the optimization can pick a different branch. Memoization is possible, parameterized by the remaining amount and the next denomination to try, so the memo table size is O(n*d), where n is the starting amount and d is the number of denominations.
So the problem can be solved in pseudo-polynomial time.
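A minimal sketch of this idea is given below, written as a bottom-up table instead of a memoized recursion. The table is indexed by the next denomination to try and the remaining amount, as described above. The names (BoundedChangeSketch, makeChange, best, take) are illustrative, and the denominations are assumed to be passed in cents in the same order as the available counts from the question.

```java
import java.util.Arrays;

// Illustrative sketch only: names and table layout are assumptions for this example.
final class BoundedChangeSketch {
    static final int INF = Integer.MAX_VALUE / 2;  // "cannot be made" marker

    /** Returns how many coins of each denomination to use, or null if impossible. */
    static int[] makeChange(int amount, int[] denoms, int[] available) {
        int d = denoms.length;
        // best[i][a]: fewest coins that make amount a using only denominations i..d-1
        int[][] best = new int[d + 1][amount + 1];
        // take[i][a]: how many coins of denomination i that optimum uses
        int[][] take = new int[d][amount + 1];
        Arrays.fill(best[d], INF);
        best[d][0] = 0;

        for (int i = d - 1; i >= 0; i--) {
            for (int a = 0; a <= amount; a++) {
                best[i][a] = INF;
                // Try every count of this coin, up to what is available and what fits.
                for (int k = 0; k <= available[i] && k * denoms[i] <= a; k++) {
                    int rest = best[i + 1][a - k * denoms[i]];
                    if (rest + k < best[i][a]) {
                        best[i][a] = rest + k;
                        take[i][a] = k;
                    }
                }
            }
        }
        if (best[0][amount] >= INF) {
            return null;
        }
        // Walk the table forward to recover the chosen counts.
        int[] used = new int[d];
        int remaining = amount;
        for (int i = 0; i < d; i++) {
            used[i] = take[i][remaining];
            remaining -= used[i] * denoms[i];
        }
        return used;
    }

    public static void main(String[] args) {
        int[] denoms = {10, 20, 50, 100, 200};
        int[] available = {100, 5, 2, 5, 1};
        System.out.println(Arrays.toString(makeChange(570, denoms, available)));
    }
}
```

For the example in the question (570 cents with the counts {100,5,2,5,1}), the only six-coin combination is one 20c, one 50c, three $1 and one $2, which corresponds to the expected {0,1,1,3,1}.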
The wikipedia link is sparse on details on how to decide if a greedy algorithm such as yours will work. A better reference is linked in this CS StackExchange question. Essentially, if the coin system is canonical, a greedy algorithm will provide an optimal solution. So, is [1, 2, 5, 10, 20] canonical? (using 10s of cents for units, so that the sequence starts in 1)
According to this article, a 5-coin system is non-canonical if and only if it satisfies exactly one of the following conditions:
[1, c2, c3] is non-canonical (false for [1, 2, 5])
it cannot be written as [1, 2, c3, c3+1, 2*c3] (true for [1, 2, 5, 10, 20])
the greedyAnswerSize((k+1) * c4) > k+1 with k*c4 < c5 < (k+1) * c4; in this case, this would require a k*10 < 20 < (k+1)*10; there is no integer k in that range, so this is false for [1, 2, 5, 10, 20].
Therefore, since the greedy algorithm will not provide optimal answers (and even if it did, I doubt that it would work with limited coins), you should try dynamic programming or some enlightened backtracking:
import java.util.HashSet;
import java.util.PriorityQueue;
public class Main {
public static class Answer implements Comparable<Answer> {
public static final int coins[] = {1, 2, 5, 10, 20};
private int availableCoins[] = new int[coins.length];
private int totalAvailable;
private int totalRemaining;
private int coinsUsed;
public Answer(int availableCoins[], int totalRemaining) {
for (int i=0; i<coins.length; i++) {
this.availableCoins[i] = availableCoins[i];
totalAvailable += coins[i] * availableCoins[i];
}
this.totalRemaining = totalRemaining;
}
public boolean hasCoin(int coinIndex) {
return availableCoins[coinIndex] > 0;
}
public boolean isPossibleBest(Answer oldBest) {
boolean r = totalRemaining >= 0
&& totalAvailable >= totalRemaining
&& (oldBest == null || oldBest.coinsUsed > coinsUsed);
return r;
}
public boolean isAnswer() {
return totalRemaining == 0;
}
public Answer useCoin(int coinIndex) {
Answer a = new Answer(availableCoins, totalRemaining - coins[coinIndex]);
a.availableCoins[coinIndex]--;
a.totalAvailable = totalAvailable - coins[coinIndex];
a.coinsUsed = coinsUsed+1;
return a;
}
public int getCoinsUsed() {
return coinsUsed;
}
@Override
public String toString() {
StringBuilder sb = new StringBuilder("{");
for (int c : availableCoins) sb.append(c + ",");
sb.setCharAt(sb.length()-1, '}');
return sb.toString();
}
// try to be greedy first
@Override
public int compareTo(Answer a) {
int r = totalRemaining - a.totalRemaining;
return (r==0) ? coinsUsed - a.coinsUsed : r;
}
}
// returns the size of a minimal set of coins that solves the problem, or -1 if there is none
public static int makeChange(int change, int[] availableCoins) {
PriorityQueue<Answer> queue = new PriorityQueue<Answer>();
queue.add(new Answer(availableCoins, change));
HashSet<String> known = new HashSet<String>();
Answer best = null;
int expansions = 0;
while ( ! queue.isEmpty()) {
Answer current = queue.remove();
expansions ++;
String s = current.toString();
if (current.isPossibleBest(best) && ! known.contains(s)) {
known.add(s);
if (current.isAnswer()) {
best = current;
} else {
for (int i=0; i<Answer.coins.length; i++) {
if (current.hasCoin(i)) {
queue.add(current.useCoin(i));
}
}
}
}
}
// debug
System.out.println("After " + expansions + " expansions");
return (best != null) ? best.getCoinsUsed() : -1;
}
public static void main(String[] args) {
for (int i=0; i<100; i++) {
System.out.println("Solving for " + i + ":"
+ makeChange(i, new int[]{100,5,2,5,1}));
}
}
}
You are heading in the wrong direction. This program will not give you an optimal solution. To get an optimal solution, use the dynamic programming algorithms implemented and discussed here. Please visit these links:
link 1
link 2
link 3
I'm trying to build a game tree for my game in order to find my next move.
At first, I build the tree using a recursive algorithm, and then, to find the best move, I use the alpha-beta pruning algorithm.
I want to build the game tree using alpha-beta pruning in order to minimize the size of the game tree, but I'm having trouble writing the algorithm.
Could you help me add alpha-beta pruning to the expand algorithm?
Here is the expand algorithm:
public void expand(int depth)
{
expand++;
if(depth > 0)
{
this.children = new ArrayList<GameTreeNode>();
List<Move> possibleMoves = this.b.possibleMoves(this.b.turn);
ReversiBoard tmp = null;
for(Move m : possibleMoves)
{
TurnState nextState = (this.state == TurnState.PLUS ? TurnState.MINUS : TurnState.PLUS);
tmp = new ReversiBoard(this.b);
tmp.makeMove(m);
int nextTurn = (turn == PLAYER1 ? PLAYER2 : PLAYER1);
if(tmp.possibleMoves(nextTurn).isEmpty())
nextTurn = turn;
this.children.add(new GameTreeNode(tmp, nextState, m, nextTurn));
for(GameTreeNode child : children)
child.expand(depth - 1);
}
}
}
Here is the alpha - beta pruning code:
int alphaBetaMax( int alpha, int beta, int depthleft ) {
alphaBetaNum++;
if ( depthleft == 0 ) return this.b.evaluate();
for (GameTreeNode tree : this.children) {
bestValue = alphaBetaMin( alpha, beta, depthleft - 1 );
if( bestValue >= beta )
{
bestMove = tree.move;
return beta; // fail hard beta-cutoff
}
if( bestValue > alpha )
alpha = bestValue; // alpha acts like max in MiniMax
}
return alpha;
}
int alphaBetaMin( int alpha, int beta, int depthleft ) {
alphaBetaNum++;
if ( depthleft == 0 ) return -this.b.evaluate();
for ( GameTreeNode tree : this.children) {
bestValue = alphaBetaMax( alpha, beta, depthleft - 1 );
if( bestValue <= alpha )
{
bestMove = tree.move;
return alpha; // fail hard alpha-cutoff
}
if( bestValue < beta )
beta = bestValue; // beta acts like min in MiniMax
}
return beta;
}
public void summonAlphaBeta(int depth)
{
this.bestValue = alphaBetaMax(Integer.MIN_VALUE, Integer.MAX_VALUE, depth);
}
Thank You!
You have two options.
You could just combine the two algorithms by converting your expand method into expandAndReturnMin and expandAndReturnMax methods which each take the alpha and beta values as arguments. Ideally any shared code would be put into a third method to keep your code clean.
Here is some example code for you to consider. In this example I've assumed a static member is storing the best move.
public int bestValue(Board board, int depth, int alpha, int beta, boolean aiPlayer) {
if (depth >= MAX_DEPTH || board.possibleMoves(aiPlayer).isEmpty()) {
return board.getValue();
} else {
for (Move move : board.possibleMoves(aiPlayer)) {
int value = bestValue(board.makeMove(move), depth + 1, alpha, beta, !aiPlayer);
if (aiPlayer && value > alpha) {
alpha = value;
bestMove = move;
if (alpha >= beta)
break;
} else if (!aiPlayer && value < beta) {
beta = value;
bestMove = move;
if (beta <= alpha)
break;
}
}
return aiPlayer ? alpha : beta;
}
}
The best initial move is determined by:
board.bestValue(board, 0, Integer.MIN_VALUE, Integer.MAX_VALUE, true);
and then using board.getBestMove().
A more elegant solution would be to store the alpha and beta values in the tree itself. That is very simple: after generating each child node you update the values in the current node. Then if they fall outside the allowed range you can stop generating child nodes. This is the more standard approach and is computationally cheap but makes the nodes use more memory.
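To make the first option (expanding and pruning in one pass) concrete without depending on the Reversi classes, here is a small sketch on a toy game: a pile of stones where each player removes 1 to 3 stones and whoever takes the last stone wins. Everything in it (LazyExpandSketch, Node, expand, the toy rules) is an assumption made for illustration. The point is only that child nodes are created inside the alpha-beta loop, so siblings after a cutoff are never built.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: a toy take-away game, not the Reversi code from the question.
final class LazyExpandSketch {
    static final class Node {
        final int pile;                                // stones left in this position
        final List<Node> children = new ArrayList<>(); // filled in lazily by expand()

        Node(int pile) {
            this.pile = pile;
        }
    }

    /** Expands the tree below node while searching; returns +1 if the maximizer wins, -1 otherwise. */
    static int expand(Node node, boolean maximizing, int alpha, int beta) {
        if (node.pile == 0) {
            // The previous player took the last stone, so the side to move has lost.
            return maximizing ? -1 : 1;
        }
        int best = maximizing ? Integer.MIN_VALUE : Integer.MAX_VALUE;
        for (int takeCount = 1; takeCount <= 3 && takeCount <= node.pile; takeCount++) {
            // The child is created only when the search actually reaches it.
            Node child = new Node(node.pile - takeCount);
            node.children.add(child);
            int value = expand(child, !maximizing, alpha, beta);
            if (maximizing) {
                best = Math.max(best, value);
                alpha = Math.max(alpha, best);
            } else {
                best = Math.min(best, value);
                beta = Math.min(beta, best);
            }
            if (alpha >= beta) {
                break; // cutoff: the remaining siblings are never generated
            }
        }
        return best;
    }

    /** Counts the nodes that were actually created under (and including) node. */
    static int countNodes(Node node) {
        int total = 1;
        for (Node child : node.children) {
            total += countNodes(child);
        }
        return total;
    }

    /** Size of the full, unpruned tree for a given pile, for comparison. */
    static int fullSize(int pile) {
        int total = 1;
        for (int takeCount = 1; takeCount <= 3 && takeCount <= pile; takeCount++) {
            total += fullSize(pile - takeCount);
        }
        return total;
    }

    public static void main(String[] args) {
        Node root = new Node(10);
        int value = expand(root, true, Integer.MIN_VALUE, Integer.MAX_VALUE);
        System.out.println("value=" + value + ", created=" + countNodes(root)
                + ", full tree=" + fullSize(10));
    }
}
```

In the Reversi code, the same shape would mean moving the body of expand into the loop of alphaBetaMax and alphaBetaMin, so that a child GameTreeNode is constructed immediately before it is searched.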