# How to Solve and Create Sudoku Grids in Python

A look a how the backtracking algorithm works using the example of a Sudoku grid

Photo by [John Morgan](https://unsplash.com/@iamfrancismorgan?utm_source=unsplash\&utm_medium=referral\&utm_content=creditCopyText) on [Unsplash](https://unsplash.com/s/photos/sudoku?utm_source=unsplash\&utm_medium=referral\&utm_content=creditCopyText)

What is the problem I am trying to solve? This is the reason I’m actually doing this: because I started creating a Sudoku book (on KDP) using grids generated by an online service, and PDF generation via React. My daughter came to me and said: "Daddy, daddy! Actually, in this grid, there are two possibilities!” And so I wanted to check that I hadn’t made a mistake and that my thing worked. And that’s why I coded this program that does this test. The nice thing about all this is that if you know how to solve sudoku in python, you also know how to generate it.

The code is relatively simple, using four functions: The first one reads the data and creates the grid, the second one looks for the possibilities of a square, the third one that solves, that does the first resolution the simple resolution of a grid, and the fourth one that does the complete resolution of the grid.

### Creating a grid with createGrid

How is the grid structured? First, we read the existing data and then we create an array, actually an array of arrays. In each of these little tables, the nine, well they have nine digits and these nine digits can be between 0 and 9, 0 meaning an empty cell, the digits 1 to 9 meaning the cells already filled in.

### The retrieval of the possibilities of a cell with getPossibilities

In this array of numbers 0, 9 we will make a first function called `getPossibilities`

. This function will return the array of all the possibilities, in the current state of the grid, for one of the cells. If the cell is not null, the only possibility is the number that is already filled in, so we return an array containing only that number. If the cell is currently 0, we look at all the numbers (greater than 0) that are in the same row, column, or 3x3 cell, and we list them in an array.

We then iterate from 1 to 9 to find the set of numbers that are not contained in the table of existing numbers, and that are therefore possible candidates for this case. This function will then return an array that can have no elements (if there is an error in the grid), only one element (which is, therefore, the right element for the square), or several elements.

the getPossibilities function

### A first simple resolution with simpleSolve

The function simpleSolve will do a basic resolution as its name indicates. This basic resolution starts by making a deep copy of the array so as not to alter the array received as a parameter. Then the function iterates over all the cells. For each cell of the grid that contains a 0, it calls `getPossibilities`

to get the possibilities on this cell. There are then three possible outcomes. First, there is only one possibility in a cell: in this case, the function will fill this possibility in the cell (and that’s why we made a deep copy at the beginning). Second, there is no possibility: in this case, the function returns an error. Third, there are several possibilities: in this case, the function increments a counter which is used to measure the number of 0s left in the grid. Once the function has looped over all the cells, one of two things can happen. Either we have found new values but there are still 0s in the grid, in which case we start a new loop by resetting the counter of 0s. Or, there has been no (or no more) replacement or there are no 0s left in the grid, in which case the function has reached the end of what it can do. (If there are no more 0’s, the grid has been solved.) The function, therefore, returns what it has succeeded in doing with the grid, as well as the number of 0’s remaining.

the simpleSolve function

### The complete solving with solveGrid

The `solveGrid `

function will start by running`simpleSolve`

on the grid, which will also have the added benefit of making a deep copy of the grid. If the `simpleSolve`

does not solve the grid, i.e. if it returns a grid where there are 0’s left, this means that there are cells with several possibilities. What do we do then? Well, we try the different possibilities. We choose a cell with several possibilities (for example one of the cells with the least number of different possibilities). Then we create several copies of the grid, one for each possibility. In these grids, we apply each of the possibilities. As if new realities were being created. Like in quantum mechanics, like Schrödinger’s cat, we create a variant of the reality where we say: "this cell has this possibility now”.

What do we do with these grids? We do a recursion: we have `solveGrid`

call itself with the new grids. And these new executions will apply everything described above to each grid variant, starting with `simpleSolve`

. And if the function still finds cells where there are several possibilities, it will itself restart variant resolutions. The process will unfold and will make branches, and branches of branches, like a tree that explores the viability of different possibilities.

In these branches, several things can happen. The simplest case is to end up with a cell with no possibilities. In this case, the branch is not viable: we put it aside, it will just return an empty array. On the other hand, it is also possible to arrive at a valid solution, a solved grid without any 0. In this case, the function returns an array containing the solution.

Now, let’s go back to the context of the function that created several branches. Each branch, each execution returns an array containing the solutions found. At this point, the function concatenates these arrays, and returns them to their parent, as long as this array of valid grids has 0 or 1 element, and this is how the solutions move back up the tree of possibilities. If the array has two or more solutions, the function creates an error, because the presence of two valid grids means that the parent grid is wrong.

There are three possible outcomes, in the case of the first execution of my `solveGrid`

function, depending on the number of solution grids found. If there are several solutions, the grid does not contain enough information to be solved. If no solutions were found, the grid is broken. And if only one solution was found, the grid is valid.

the solveGrid function

### Key Takeways

This recursive backtracking system allows us not only to check that a grid is valid but also to measure the grid’s difficulty by measuring how many levels of recursion were needed since each level means the grid involves another non-obvious cell.
As a bonus, the same code also allows us to generate valid grids. Given an initial set of numbers (placed randomly or by hand), we can call `solveGrid`

function and take a branch that returns only one solution. And we can choose a level of depth of the recursion tree to calibrate the difficulty.

The full code can be found here

Have fun!