Algorithm Complexity (Big-O) of sudoku solver - java

I'm looking for the "how do you find it", because I have no idea how to approach finding the algorithm complexity of my program.
I wrote a Sudoku solver in Java without efficiency in mind (I wanted to make it work recursively, which I succeeded with!).
Some background:
My strategy employs backtracking to determine, for a given Sudoku puzzle, whether the puzzle has exactly one solution. I read in a given puzzle and solve it. Once I have found one solution I'm not necessarily done; I need to keep exploring for further solutions. At the end, one of three outcomes happens: the puzzle is not solvable at all, the puzzle has a unique solution, or the puzzle has multiple solutions.
My program reads in the puzzle coordinates from a file that has one line for each given digit, consisting of the row, column, and digit. By my own convention, a 7 in the upper-left square is written as 007.
Implementation:
I load the values from the file and store them in a 2-D array.
I go down the array until I find a blank (unfilled value) and set it to 1, then check for conflicts (whether the value I entered is valid or not).
If it is valid, I move on to the next blank.
If it is not, I increment the value by 1 until I find a digit that works. If none of them work (1 through 9), I go back one step to the last value that I adjusted and increment that one (using recursion).
I am done solving when all 81 elements have been filled without conflicts.
If any solutions are found, I print them to the terminal.
Otherwise, if I try to "go back one step" on the FIRST element that I initially modified, it means that there were no solutions.
How can I find my program's algorithm complexity? I thought it might be linear [ O(n) ], but I am accessing the array multiple times, so I'm not sure :(
Any help is appreciated

O(n ^ m) where n is the number of possibilities for each square (i.e., 9 in classic Sudoku) and m is the number of spaces that are blank.
This can be seen by working backwards from only a single blank. If there is only one blank, then you have n possibilities that you must work through in the worst case. If there are two blanks, then you must work through n possibilities for the first blank and n possibilities for the second blank for each of the possibilities for the first blank, n^2 in total. If there are three blanks, then you must work through n possibilities for the first blank. Each of those possibilities will yield a puzzle with two blanks that has n^2 possibilities, giving n * n^2 = n^3 in total, and by the same argument n^m for m blanks.
This algorithm performs a depth-first search through the possible solutions. Each level of the graph represents the choices for a single square. The depth of the graph is the number of squares that need to be filled. With a branching factor of n and a depth of m, finding a solution in the graph has a worst-case performance of O(n ^ m).
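To make the branching factor and depth concrete, here is a minimal sketch of a backtracking solution counter in the spirit of the question (stop once more than one solution is known). The class and method names (`SolutionCounter`, `count`, `fits`) are made up for illustration, and it assumes a 9x9 `int` grid with 0 for blanks and no conflicts among the given digits.

```java
/** Hypothetical sketch: counts solutions of a 9x9 grid (0 = blank), up to a limit. */
final class SolutionCounter {

    /** Returns the number of solutions found, capped at limit (call with limit >= 1). */
    static int count(int[][] g, int limit) {
        for (int r = 0; r < 9; r++) {
            for (int c = 0; c < 9; c++) {
                if (g[r][c] != 0) continue;
                int found = 0;
                // Branching factor n = 9: every digit is tried for this blank.
                for (int d = 1; d <= 9 && found < limit; d++) {
                    if (fits(g, r, c, d)) {
                        g[r][c] = d;
                        // One recursion level per blank, so the depth is at most m.
                        found += count(g, limit - found);
                        g[r][c] = 0; // undo the choice (backtrack)
                    }
                }
                return found; // only the first blank is branched on at this level
            }
        }
        return 1; // no blanks left: the filled grid is one solution
    }

    /** True if digit d does not yet appear in the row, column or box of (r, c). */
    static boolean fits(int[][] g, int r, int c, int d) {
        for (int i = 0; i < 9; i++) {
            if (g[r][i] == d || g[i][c] == d) return false;
            if (g[r / 3 * 3 + i / 3][c / 3 * 3 + i % 3] == d) return false;
        }
        return true;
    }
}
```

Calling `SolutionCounter.count(grid, 2)` returns 0, 1 or 2, which maps onto the three outcomes from the question (unsolvable, unique, multiple). The conflict check prunes most branches in practice, but the worst-case bound is still n^m.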

In many Sudokus, there will be a few numbers that can be placed directly with a bit of thought. By placing a number in the first empty cell, you give up on a lot of opportunities to reduce the possibilities. If the first ten empty cells have lots of possibilities, you get exponential growth. I'd ask the questions:
Where in the first line can the number 1 go?
Where in the first line can the number 2 go?
...
Where in the last line can the number 9 go?
The same questions for each of the nine columns.
The same questions for each of the nine boxes.
Which number can go into the first cell?
...
Which number can go into the 81st cell?
That's 324 questions (81 each for rows, columns, boxes and cells). If any question has exactly one answer, you pick that answer. If any question has no answer at all, you backtrack. If every question has two or more answers, you pick a question with the minimal number of answers.
You may get exponential growth, but only for problems that are really hard.
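As a rough illustration of "pick a question with the minimal number of answers", the sketch below handles only the 81 cell questions ("which number can go into this cell?"), not the row, column and box questions. The names (`FewestCandidates`, `candidates`, `solve`) are hypothetical, and the grid is assumed to be a 9x9 `int` array with 0 for blanks.

```java
/** Hypothetical sketch: backtracking that always branches on the blank with the fewest candidates. */
final class FewestCandidates {

    /** Bitmask of digits still allowed in blank cell (r, c): bit d set means digit d fits. */
    static int candidates(int[][] g, int r, int c) {
        int used = 0;
        for (int i = 0; i < 9; i++) {
            used |= 1 << g[r][i];                                  // row
            used |= 1 << g[i][c];                                  // column
            used |= 1 << g[r / 3 * 3 + i / 3][c / 3 * 3 + i % 3];  // box
        }
        return ~used & 0x3FE; // keep only bits 1..9
    }

    /** Fills g in place; returns false (leaving g unchanged) if there is no solution. */
    static boolean solve(int[][] g) {
        int bestR = -1, bestC = -1, bestMask = 0, bestCount = 10;
        for (int r = 0; r < 9; r++) {
            for (int c = 0; c < 9; c++) {
                if (g[r][c] != 0) continue;
                int mask = candidates(g, r, c);
                int n = Integer.bitCount(mask);
                if (n < bestCount) { bestR = r; bestC = c; bestMask = mask; bestCount = n; }
            }
        }
        if (bestR < 0) return true; // no blanks left
        // Zero candidates: the loop body never runs, so we backtrack immediately.
        // One candidate: the digit is forced. Otherwise we branch as little as possible.
        for (int d = 1; d <= 9; d++) {
            if ((bestMask & (1 << d)) == 0) continue;
            g[bestR][bestC] = d;
            if (solve(g)) return true;
            g[bestR][bestC] = 0;
        }
        return false;
    }
}
```

Extending this to the other 243 questions would mean also counting, for each digit and each row, column and box, the cells that can still take that digit.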


Suffix array nlogn creation

I have been learning suffix array construction, and I understand that we first sort all suffixes by their first character, then by their first 2 characters, then by their first 4 characters, and so on, while the number of characters to be considered is smaller than 2n.
But my question is: why don't we take the first 3 characters, then 9, and so on? Why do we only double the length each time, given that the strings are all parts of the same string and not unrelated random strings?
I haven't analyzed the suffix array construction algorithm thoroughly, but still would like to share my thoughts.
In my humble opinion, your question is similar to the following ones:
Why do computers use binary encoding of information instead of ternary?
Why does binary search bisect the range instead of trisecting it?
Why are there two sexes rather than three?
The reason is that the number 2 is special - it is the smallest plural number. The difference between 1 and 2 is qualitative, whereas the difference between 2 and 3 (as well as any other positive integer) is quantitative and therefore not as drastic.
As a result, binary formulation of many algorithms and data structures turns out to be the simplest one, though some of them may be generalized, with various degrees of added complexity, for an arbitrary base.
The answer is given in the post you linked. And as @Leon answered, the algorithm works because it uses a dichotomous approach to the sorting problem. If you read that answer carefully, the main idea is to divide each word into small 2-character fragments, so that 4 characters can easily be sorted based on the order of the 2 pairs of characters, 6 characters as 4-2, 2-4 or 2-2-2, and so on. Having a 3-letter word in the table therefore makes little sense, since a word of 3 characters can be seen as 2 characters plus the alphabet position of the last character.
I think you are considering only the growth of 2^x versus 3^x, where you would obviously prefer the latter.
But you also have to consider the effort needed for each step.
Since 3^x needs about 1.58 times fewer steps than 2^x (log2(3) ≈ 1.58), a single step of the 3^x scheme would have to cost less than 1.58 times a single step of the 2^x scheme for it to perform better.
Generally, the problem gets much more complex when you have to handle three elements in each step instead of two.
Also if you could expand it to 3^x you could also do it for a bigger n^x and then with big n your algorithm is suddenly not exponential but effectively linear.
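For reference, the doubling scheme from the question can be sketched as follows. This is a simplified version under stated assumptions: it uses a comparison sort in every round, so it runs in O(n log^2 n) rather than the O(n log n) that a radix-sort variant achieves, and the class name `SuffixArrayDoubling` is made up. The point to notice is that each round compares a pair of ranks from the previous round: the rank of the first k characters and the rank of the next k characters. That pair is what makes 2 the natural factor; tripling would need three ranks per comparison.

```java
import java.util.Arrays;
import java.util.Comparator;

/** Hypothetical sketch: suffix array by prefix doubling with a comparison sort per round. */
final class SuffixArrayDoubling {

    static int[] build(String s) {
        int n = s.length();
        if (n == 0) return new int[0];
        Integer[] sa = new Integer[n];
        int[] rank = new int[n];
        int[] next = new int[n];
        for (int i = 0; i < n; i++) { sa[i] = i; rank[i] = s.charAt(i); }

        for (int k = 1; ; k <<= 1) {
            final int len = k;
            final int[] rk = rank;
            // Order suffixes by their first 2k characters, using ranks for the first k characters:
            // (rank of first half, rank of second half), with -1 when the second half is empty.
            Comparator<Integer> cmp = (a, b) -> {
                if (rk[a] != rk[b]) return Integer.compare(rk[a], rk[b]);
                int ra = a + len < n ? rk[a + len] : -1;
                int rb = b + len < n ? rk[b + len] : -1;
                return Integer.compare(ra, rb);
            };
            Arrays.sort(sa, cmp);

            // Re-rank: suffixes that compare equal share a rank.
            next[sa[0]] = 0;
            for (int i = 1; i < n; i++) {
                next[sa[i]] = next[sa[i - 1]] + (cmp.compare(sa[i - 1], sa[i]) < 0 ? 1 : 0);
            }
            int[] t = rank; rank = next; next = t;
            if (rank[sa[n - 1]] == n - 1) break; // all ranks distinct: fully sorted
        }

        int[] result = new int[n];
        for (int i = 0; i < n; i++) result[i] = sa[i];
        return result;
    }
}
```

For "banana" this yields the suffix order 5, 3, 1, 0, 4, 2 (a, ana, anana, banana, na, nana) after two rounds.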

Simple algorithm for a sudoku solver java

I've been stuck on this thing for a while, and I just can't wrap my head around it. For homework, I have to produce an algorithm for a sudoku solver that can check what number goes in a blank square in a row, in a column and in a block. It's a regular 9x9 sudoku and I'm assuming that the grid is already printed, so I have to produce the part that solves it.
I've read a ton of stuff on the subject, but I get stuck expressing it.
I want the solver to do the following:
If the value is smaller than 9, increase it by 1
If the value is 9, set it to zero and go back 1
If the value is invalid, increase by 1
I've already read about backtracking and such but I'm in the early stage of the class so I'd like to keep it as simple as possible.
I'm more capable of writing in pseudo code but not so much with the algorithm itself and it's the algorithm that is needed for this exercise.
Thanks in advance for your help guys.
Seeing as it's homework, I believe I can point you in the general direction.
To start, keep a two-dimensional array (or a data structure that can represent the grid), and keep track of the values that can go there. Let's say it's a class named "possibilities":
public class Possibilities {
//Keep track of the numbers possible internally, with an accessor
}
The way sudoku works, there will usually be a square with only a single answer (in some cases, this won't be available, which means you need to potentially make a copy of the data and play out a little bit, or have a way to roll back). Simply put, fill in the answer, and then iterate over the adjacent squares to remove the freshly put number as a possibility (And check those squares simultaneously as new potential answers).
I think the easiest algorithm for solving a sudoku puzzle is a complete search. That is, trying every single combination until you find one that works. The easiest way to implement this is recursively. I know you don't want to get involved in backtracking, but I actually think that would be the easiest way for you to write this algorithm.
Assume you already have an algorithm that checks whether n can go in some cell at (i, j) in your board. This means that n does not violate any of the constraints (there can only be one number from 1 .. 9 in each row, column and box). This should not be too hard, you just have to loop through the row, column and box that contains cell (i, j) and make sure that n does not appear yet.
Then, you will have a recursive function called solve() that will return true if it finds a solution, otherwise it will return false. The function will keep placing numbers in empty cells of the sudoku board (only if they don't violate the constraints, which we assume you have already written an algorithm for) until it is filled. Once filled, the puzzle is solved. You know that the board is valid because you've been checking the validity of every number you place on the way there. If no number can be placed at any point, it will backtrack by returning false.
The pseudocode for solve will look something like this:
boolean solve()
    if the board is filled
        return true
    pick the first cell that is empty
    for n = 1 .. 9
        if n does not exist in this row, column and box
            place n in this cell
            if solve()
                return true
            remove n from this cell
    return false
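One possible Java rendering of the check and the recursive solve described above. It is only a sketch: the class name `SimpleSudoku`, the method names, and the convention that 0 marks an empty cell are my assumptions, not something given in the question.

```java
/** Hypothetical sketch of the constraint check and recursive solve described above. */
final class SimpleSudoku {

    /** True if n can go in cell (i, j) without repeating in its row, column or box. */
    static boolean canPlace(int[][] board, int i, int j, int n) {
        for (int k = 0; k < 9; k++) {
            if (board[i][k] == n) return false; // same row
            if (board[k][j] == n) return false; // same column
        }
        int r0 = i - i % 3, c0 = j - j % 3;     // top-left corner of the 3x3 box
        for (int r = r0; r < r0 + 3; r++) {
            for (int c = c0; c < c0 + 3; c++) {
                if (board[r][c] == n) return false;
            }
        }
        return true;
    }

    /** Fills the board in place; returns false if no solution exists from this state. */
    static boolean solve(int[][] board) {
        for (int i = 0; i < 9; i++) {
            for (int j = 0; j < 9; j++) {
                if (board[i][j] != 0) continue;     // already filled
                for (int n = 1; n <= 9; n++) {
                    if (canPlace(board, i, j, n)) {
                        board[i][j] = n;            // place n in this cell
                        if (solve(board)) return true;
                        board[i][j] = 0;            // remove n and try the next value
                    }
                }
                return false;                       // nothing fits here: backtrack
            }
        }
        return true;                                // no empty cell left: solved
    }
}
```

The "set it to zero and go back 1" step from the question corresponds to `board[i][j] = 0` followed by `return false`: the caller then continues its own loop with the next value.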

divide and conquer assignment

I have to write a Java program that simulates a robot matching each lid with its corresponding jar. The robot has two arms, one for the lids and one for the jars. I can't compare lids with lids or jars with jars. The user will enter three lines:
5(n)
9 7 2 5 6(size of lids)
2 6 5 7 9(size of jars)
The output should be:
3 5 4 2 1
The 3rd number in line 2 is equal to the 1st number in line 3 and so on.
We are supposed to use a divide and conquer algorithm and I really have no idea where to start. All I have to go by is that it's similar to quicksort. Any help would be greatly appreciated.
Divide and conquer algorithms might be confusing at first. Think about it as if you have some relatively large problem that you can't solve, but if that problem were much, much smaller you could find the answer. Applying it to this situation: suppose instead of having 2 big lists of lid and jar sizes, you instead have 1 lid size and some number of jar sizes. You could easily tell me which jar that lid fits on, right? The idea of solving the problem for 1 lid is essentially breaking the large problem (several lids) into a smaller one (1 lid). Once that makes sense, you can move on to the algorithm.
You will likely employ some recursion in order to write your algorithm. Start with the base case and solve the simplest meaningful problem (I like the 1 lid example). Once you can solve that problem, can you recursively solve the same problem for every lid? I'm not attaching any code because I don't want to spoil the learning experience for you (and this is clearly homework).
The whole point of "divide and conquer" is to divide up the work into multiple, smaller problems; then you solve the smaller problems and roll them up until they are combined into a solution. This pretty much implies a recursive solution.
With any recursive function, you always need a "basis case". This will be a simple case that is trivially easy to solve. For example, if you only have one jar and one lid, then you simply return that the jar matches the lid. (Because as part of the problem statement, you always have one matching lid for each jar.)
So one place to start is a trivial program that only works right for a length-1 list of jars/lids. Then add more machinery to make it more capable.
With quicksort, you choose a place to divide up the numbers (the "pivot"), then do a very rough sort (just take numbers that should be on the left of the pivot but are on the right and move them to the left, and vice versa). Then you call quicksort recursively on the sublist. Eventually each of the recursive calls to quicksort hits a basis case (a length-1 sublist); once they all have hit the basis case the quicksort is done. (Note: there are ways to optimize quicksort and make it faster by adding more code, but I'm talking about the simplest implementation of quicksort here.)
Maybe in this case you should start with a length-n list of just the numbers from 1 to n, and then swap the numbers around until you have a correct list?
Hmm.  With length-2 lists, there are only two possibilities: the lists line up, or not.  If they line up you are done.  If not, you swap the numbers to make them line up, and you are done.  Hmm.  This is similar to sorting in a way, but you can't just compare numbers directly like you can when you are sorting.  (In sorting you always know that 3 sorts below 5, but here it might not be so.) So, now think about a way to break down the list and keep doing it until you have a length-2 or length-1 sublist, then handle those trivial cases.
Sounds like a fun problem. I hope you enjoy working on it.
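For readers who have already attempted the exercise, here is one possible quicksort-style sketch. It assumes all sizes are distinct and every jar has exactly one matching lid; the class name `LidMatcher` and its helper are invented for illustration. Note that every size comparison is between one lid and one jar, never lid-to-lid or jar-to-jar.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: quicksort-like matching using only lid-vs-jar comparisons. */
final class LidMatcher {

    /** result[j] is the 1-based position of the lid that fits jar j. */
    static int[] match(int[] lids, int[] jars) {
        int[] result = new int[jars.length];
        List<Integer> lidIdx = new ArrayList<>();
        List<Integer> jarIdx = new ArrayList<>();
        for (int i = 0; i < lids.length; i++) { lidIdx.add(i); jarIdx.add(i); }
        match(lids, jars, lidIdx, jarIdx, result);
        return result;
    }

    private static void match(int[] lids, int[] jars,
                              List<Integer> lidIdx, List<Integer> jarIdx, int[] result) {
        if (lidIdx.isEmpty()) return;               // base case: nothing left to match
        int pivotLid = lidIdx.get(0);

        // Divide the jars by comparing each one against the pivot lid.
        List<Integer> smallJars = new ArrayList<>();
        List<Integer> bigJars = new ArrayList<>();
        int pivotJar = -1;
        for (int j : jarIdx) {
            int cmp = Integer.compare(jars[j], lids[pivotLid]);
            if (cmp < 0) smallJars.add(j);
            else if (cmp > 0) bigJars.add(j);
            else pivotJar = j;                      // the jar this lid fits
        }
        result[pivotJar] = pivotLid + 1;

        // Divide the remaining lids by comparing each one against the matched jar.
        List<Integer> smallLids = new ArrayList<>();
        List<Integer> bigLids = new ArrayList<>();
        for (int l : lidIdx) {
            if (l == pivotLid) continue;
            if (lids[l] < jars[pivotJar]) smallLids.add(l);
            else bigLids.add(l);
        }

        // Conquer: small lids can only fit small jars, big lids only big jars.
        match(lids, jars, smallLids, smallJars, result);
        match(lids, jars, bigLids, bigJars, result);
    }
}
```

With the lids 9 7 2 5 6 and jars 2 6 5 7 9 from the question, `match` returns 3 5 4 2 1.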

Application of BFS or DFS

I need help in solving this problem, I tried using a 2D array and then finding the least number of swaps. Not sure exactly how to go about this problem. Whether to use BFS or DFS?
You are given two four digits numbers. The first number is the initial number, and the second one is the target number. Write a java program to transform the initial number into the target number using the fewest possible operations. The available operations are as follows:
Add 1 to one of the four digits. Adding 1 to a 9 results in 0.
Subtract 1 from one of the four digits. Subtracting 1 from 0 results in 9.
Swap two adjacent digits
eg 1:
initial no :1111
final no : 9999
min no of operations :8
eg 2:
initial no :1234
final no : 2144
min no of operations :2
BFS.
When DFS finds its first solution, it is usually not the one that uses the smallest possible number of moves. It can also explore long, pointless paths when a solution is close (and it can get stuck in an infinite loop if you don't remember visited nodes). These problems could be solved by iterative deepening DFS, which might be desirable if there are memory constraints, but BFS is simpler for such a small search space.
You should use the BFS algorithm, because it will give you the shortest possible way to transform the first number into the target one. DFS just explores paths, in no particular order of length. In some cases DFS might find a solution faster than BFS, but there is no algorithmic guarantee of that.
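A minimal BFS sketch for this problem, assuming both numbers arrive as four-character digit strings (the class and method names are made up). Each of the 10,000 possible numbers is a node, and each node has up to 11 neighbours: 4 increments, 4 decrements and 3 adjacent swaps.

```java
import java.util.ArrayDeque;
import java.util.Arrays;

/** Hypothetical sketch: BFS over all four-digit states. */
final class DigitBfs {

    static int minOperations(String start, String target) {
        int[] dist = new int[10000];            // dist[x] = fewest operations to reach x
        Arrays.fill(dist, -1);                  // -1 marks "not visited yet"
        int first = Integer.parseInt(start);
        int goal = Integer.parseInt(target);
        ArrayDeque<Integer> queue = new ArrayDeque<>();
        dist[first] = 0;
        queue.add(first);

        while (!queue.isEmpty()) {
            int cur = queue.poll();
            if (cur == goal) return dist[cur];  // BFS reaches the goal by a shortest path first
            int[] d = { cur / 1000, cur / 100 % 10, cur / 10 % 10, cur % 10 };
            for (int i = 0; i < 4; i++) {
                int old = d[i];
                d[i] = (old + 1) % 10;          // add 1 (9 wraps to 0)
                visit(d, dist[cur] + 1, dist, queue);
                d[i] = (old + 9) % 10;          // subtract 1 (0 wraps to 9)
                visit(d, dist[cur] + 1, dist, queue);
                d[i] = old;
                if (i < 3) {                    // swap with the digit to the right
                    swap(d, i);
                    visit(d, dist[cur] + 1, dist, queue);
                    swap(d, i);
                }
            }
        }
        return -1; // not reached in practice: every state is reachable
    }

    private static void visit(int[] d, int steps, int[] dist, ArrayDeque<Integer> queue) {
        int code = d[0] * 1000 + d[1] * 100 + d[2] * 10 + d[3];
        if (dist[code] < 0) {                   // remember visited nodes
            dist[code] = steps;
            queue.add(code);
        }
    }

    private static void swap(int[] d, int i) {
        int t = d[i]; d[i] = d[i + 1]; d[i + 1] = t;
    }
}
```

`minOperations("1111", "9999")` gives 8 and `minOperations("1234", "2144")` gives 2, matching the two examples in the question.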

Fast counting of 2D sub-matrices within a large, dense 2D matrix?

What's a good algorithm for counting submatrices within a larger, dense matrix? If I had a single line of data, I could use a suffix tree, but I'm not sure if generalizing a suffix tree into higher dimensions is exactly straightforward or the best approach here.
Thoughts?
My naive solution to index the first element of the dense matrix and eliminate full-matrix searching provided only a modest improvement over full-matrix scanning.
What's the best way to solve this problem?
Example:
Input:
Full matrix:
123
212
421
Search matrix:
12
21
Output:
2
This sub-matrix occurs twice in the full matrix, so the output is 2. The full matrix could be 1000x1000, however, with a search matrix as large as 100x100 (variable size), and I need to process a number of search matrices in a row. Ergo, a brute force of this problem is far too inefficient to meet my sub-second search time for several matrices.
For an algorithms course, I once worked an exercise in which the Rabin-Karp string-search algorithm had to be extended slightly to search for a matching two-dimensional submatrix in the way you describe.
I think if you take the time to understand the algorithm as it is described on Wikipedia, the natural way of extending it to two dimensions will be clear to you. In essence, you just make several passes over the matrix, creeping along one column at a time. There are some little tricks to keep the time complexity as low as possible, but you probably won't even need them.
Searching an N×N matrix for a M×M matrix, this approach should give you an O(N²⋅M) algorithm. With tricks, I believe it can be refined to O(N²).
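Here is one way the two-dimensional extension might look. It is a sketch, not the exercise solution I worked back then: it assumes non-negative integer entries and non-empty rectangular matrices, and the names, modulus and bases are arbitrary choices of mine. It first rolls a hash along every row, then rolls a second hash down each column of row hashes, and verifies every hash hit by direct comparison.

```java
/** Hypothetical sketch: counting occurrences of a pattern matrix with a two-level rolling hash. */
final class SubmatrixCounter {
    private static final long MOD = 1_000_000_007L;
    private static final long BX = 257L;          // base for hashing along a row
    private static final long BY = 911_382_323L;  // base for hashing down a column

    static int count(int[][] text, int[][] pat) {
        int n = text.length, w = text[0].length, m = pat.length, k = pat[0].length;
        if (m > n || k > w) return 0;
        long powX = 1, powY = 1;                  // BX^(k-1) and BY^(m-1)
        for (int i = 1; i < k; i++) powX = powX * BX % MOD;
        for (int i = 1; i < m; i++) powY = powY * BY % MOD;

        // Pattern hash: hash each row, then hash the sequence of row hashes.
        long patHash = 0;
        for (int[] row : pat) patHash = (patHash * BY + rowHash(row, k)) % MOD;

        // rows[r][c] = hash of text[r][c .. c+k-1], rolled along the row.
        long[][] rows = new long[n][w - k + 1];
        for (int r = 0; r < n; r++) {
            long h = rowHash(text[r], k);
            rows[r][0] = h;
            for (int c = 1; c + k <= w; c++) {
                h = (h - text[r][c - 1] * powX % MOD + MOD) % MOD;
                h = (h * BX + text[r][c + k - 1]) % MOD;
                rows[r][c] = h;
            }
        }

        int count = 0;
        for (int c = 0; c + k <= w; c++) {        // creep along one column at a time
            long h = 0;
            for (int r = 0; r < m; r++) h = (h * BY + rows[r][c]) % MOD;
            for (int r = 0; ; r++) {
                if (h == patHash && equalsAt(text, pat, r, c)) count++;
                if (r + m >= n) break;
                h = (h - rows[r][c] * powY % MOD + MOD) % MOD;
                h = (h * BY + rows[r + m][c]) % MOD;
            }
        }
        return count;
    }

    private static long rowHash(int[] row, int len) {
        long h = 0;
        for (int i = 0; i < len; i++) h = (h * BX + row[i]) % MOD;
        return h;
    }

    /** Direct comparison, used to rule out hash collisions. */
    private static boolean equalsAt(int[][] text, int[][] pat, int r0, int c0) {
        for (int r = 0; r < pat.length; r++)
            for (int c = 0; c < pat[0].length; c++)
                if (text[r0 + r][c0 + c] != pat[r][c]) return false;
        return true;
    }
}
```

On the example from the question (full matrix 123/212/421, search matrix 12/21) this returns 2. The hashing itself is O(N²); the verification step adds O(M²) per hash hit, so inputs with very many true matches will be slower.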
Algorithms and Theory of Computation Handbook suggests what is an N^2 * log(Alphabet Size) solution. Given a sub-matrix to search for, first of all de-dupe its rows. Now note that if you search the large matrix row by row at most one of the de-duped rows can appear at any position. Use Aho-Corasick to search this in time N^2 * log(Alphabet Size) and write down at each cell in the large matrix either null or an identifier for the matching row of the sub-matrix. Now use Aho-Corasick again to search down the columns of this matrix of row matches and signal a match where all the rows are present below each other.
This sounds similar to template matching. If motivated, you could probably transform your original array with the FFT and replace one factor of the brute-force search with a log: O(N log(M)) instead of O(NM).
I don't have a ready answer but here's how I would start:
-- You want very fast lookup, how much (time) can you spend on building index structures? When brute-force isn't fast enough you need indexes.
-- What do you know about your data that you haven't told us? Are all the values in all your matrices single-digit integers?
-- If they are single-digit integers (or anything else you can represent as a single character or index value), think about linearising your 2D structures. One way to do this would be to read the matrix along a diagonal running top-right to bottom-left and scanning from top-left to bottom-right. Difficult to explain in words, but read the matrix:
1234
5678
90ab
cdef
as 125369470c8adbef
(get it?)
Now you can index your super-matrix to whatever depth your speed and space requirements demand; in my example key 1253... points to element (1,1), key abef points to element (3,3). Not sure if this works for you, and you'll have to play around with the parameters to your solution. Choose your favourite method for storing the key-value pairs: a hash, a list, or even build some indexes into the index if things get wild.
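In case the diagonal reading order is hard to picture, here is a small sketch of it, taking the matrix as an array of equal-length strings (the class and method names are made up for illustration).

```java
/** Hypothetical sketch: read a matrix along its anti-diagonals to build a key string. */
final class DiagonalKey {

    static String linearise(String[] rows) {
        int h = rows.length, w = rows[0].length();
        StringBuilder sb = new StringBuilder();
        // Cells on the same anti-diagonal share r + c = d; diagonals go from top-left to bottom-right.
        for (int d = 0; d < h + w - 1; d++) {
            // Within one diagonal, walk from top-right down to bottom-left.
            for (int r = Math.max(0, d - w + 1); r <= Math.min(d, h - 1); r++) {
                sb.append(rows[r].charAt(d - r));
            }
        }
        return sb.toString();
    }
}
```

`DiagonalKey.linearise(new String[] {"1234", "5678", "90ab", "cdef"})` produces `125369470c8adbef`, and applying it to the 2x2 block at the bottom right (`"ab"`, `"ef"`) produces the key `abef`.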
Regards
Mark
