The origin of the project was the idea that came up while playing Sudoku in a temporary dormitory in 2019. At that time, there was not enough time to proceed, so it was put on hold until a few days ago. The implementation process was relatively simple, the only small pitfall was the choice of the method for number recognition. I originally planned to use neural networks, but the models I could find were trained on mnist, and the accuracy was only around 60~80% when tested. I didn’t want to spend time training and fine-tuning the model, so I ended up using template matching method. In the end, the project was completed, which can be considered as completing a project that was delayed for 4 years.
Video demonstration: Backtracking recursion method solves Sudoku puzzles with one click.
General rules of Sudoku
Sudoku is a type of logic puzzle, usually composed of a 9×9 grid divided into 9 small 3×3 squares. The goal is to fill each square with numbers from 1 to 9, so that each row, each column, and each small square contains each number exactly once, without repetition.
The basic rules of Sudoku are as follows:
- Each square can only be filled with a number from 1 to 9.
- Each number in every row must be unique, i.e., there cannot be any duplicate numbers in the same row.
- Each number in a column must be unique, i.e., there cannot be duplicate numbers in the same column.
- Each number in the small square must be unique, that is, there cannot be duplicate numbers in the same small square.
Based on these rules, Sudoku puzzles usually provide some pre-filled numbers as hints at the beginning. Players need to fill in the remaining blanks based on these known numbers through logical reasoning and inference until all squares are filled and comply with the rules.
Get Sudoku board state
Sudoku game is Sudoku Zenkai on Steam. Enter the game, choose the difficulty, and the following puzzles will appear. Fill in the corresponding positions with numbers.
This screenshot uses the pyautogui and PIL libraries. The specific code is as follows:
import pyautogui
from PIL import ImageGrab
def get_window(window_name="Sudoku Zenkai"):
# 获取窗口的位置和大小
window = pyautogui.getWindowsWithTitle(window_name)[0]
# 获取窗口的位置和大小
window_left = window.left
window_top = window.top
window_width = window.width
window_height = window.height
# 获取指定窗口的图像
screenshot = ImageGrab.grab((window_left, window_top, window_left + window_width, window_top + window_height))
# 保存图像
# screenshot.save("window_screenshot.png")
# 将PIL图像转换为NumPy数组
numpy_image = np.array(screenshot)
# 将NumPy数组转换为OpenCV图像
opencv_image = cv2.cvtColor(numpy_image, cv2.COLOR_RGB2BGR)
return opencv_image, (window_left, window_top)
This function first retrieves the position information of the specified window, and then calls the grab function to retrieve the image corresponding to the position. Therefore, it is necessary to ensure that the window is on the top layer in order to capture the desired window image.
Afterwards, we need to determine the position of the chessboard, which is the coordinates of the top-left corner of the chessboard. This step requires the use of two key functions.
filter_color
is used to filter the image based on the target color, leaving only the pixels of that color.
get_white_rect
function is used to perform a statistical analysis on the filtered pixels and obtain the position of the top-left corner.
def filter_color(image, target_color, tolerance=20):
# 创建一个与图像大小相同的掩码
mask = np.zeros(image.shape[:2], dtype=np.uint8)
# 在掩码中将与目标颜色相等的像素置为白色(255)
mask[(image == target_color).all(axis=2)] = 255
return mask
def get_white_rect(image):
top_left = None
bottom_right = None
_, binary = cv2.threshold(image, 127, 255, cv2.THRESH_BINARY)
# 获取非零像素的位置
white_pixels = cv2.findNonZero(binary)
if white_pixels is not None:
# 获取最左上和最右下的白点的位置
top_left = np.min(white_pixels, axis=0)[0]
bottom_right = np.max(white_pixels, axis=0)[0]
return top_left, bottom_right
The complete function for obtaining the chessboard is as follows. Its purpose is to convert the chessboard displayed in image form (game screenshot) into a 9×9 numpy array, making it easier for the next step of recursive solving.
def get_sudoku_board(window_img, max_col=9, max_row=9, box_size=64, padding=0):
# 定义目标颜色
target_color = (166, 146, 132) # BGR颜色
# 过滤图像中的单一颜色
filtered_image = filter_color(window_img, target_color)
top_left, bottom_right = get_white_rect(filtered_image)
core_image = window_img[top_left[1]:bottom_right[1], top_left[0]:bottom_right[0]]
core_image = cv2.resize(core_image, (572, 572))
# 坐标
cord_x = padding
cord_y = padding
# 用来剔除分割线的参数
erode_level = 9
# 9x9棋盘,每个位置代表对应位置的数字,0为空,即待填数字
puzzle = np.zeros((max_col, max_col), dtype=np.uint8)
# 切分棋盘并识别每个位置的数字
for i in range(max_row):
cord_x = padding
for j in range(max_col):
# 获取对应位置图像并剔除分割线
box_image = core_image[cord_y:cord_y + box_size, cord_x:cord_x + box_size]
erode_image = box_image[erode_level:-erode_level, erode_level:-erode_level]
erode_height, erode_width = erode_image.shape[:2]
# 调整尺寸 方便识别数字
square_image = np.zeros((box_size, box_size, 3), dtype=np.uint8)
x = (box_size - erode_width) // 2
y = (box_size - erode_height) // 2
square_image[y:y + erode_height, x:x + erode_width] = erode_image
# 二值化并识别数字
_, binary_image = cv2.threshold(square_image, 180, 255, cv2.THRESH_BINARY)
puzzle[i][j] = get_digit_by_template(binary_image)
cord_x += box_size + padding
cord_y += box_size + padding
print("Difficulty: %d" % (81 - np.count_nonzero(puzzle)))
return puzzle, top_left, bottom_right[0] - top_left[0]
The double loop will divide the digital image into 81 positions, as shown below:
The next thing to do is to use the get_digit_by_template
function to convert each image into an integer variable. Of course, if it is a number to be filled in, then the corresponding image on the chessboard is naturally empty, and the image will be completely black.
As mentioned in the preface, a template matching method is used here to recognize numbers. The core code is as follows:
def get_digit_by_template(image):
img_gray = cv.cvtColor(image, cv.COLOR_RGB2GRAY)
top_left, bottom_right = get_white_rect(img_gray)
if top_left is None:
# 如果是全黑的,那么就是零
return 0
confidence = np.zeros(9)
# 读取模板图像和待识别图像
template_path = 'template/'
resized_image = cv2.resize(img_gray, (64, 64))
# 1到9共9个数字
for i in range(9):
image_path = os.path.join(template_path, "%d.png"%(i+1))
template = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
result = cv2.matchTemplate(resized_image, template, cv2.TM_CCOEFF_NORMED)
# 获取最相似的模板编号
confidence[i] = np.max(result)
return np.argmax(confidence)+1
Template image as follows:
So here is the process of extracting the chessboard from the image completed.
Backtracking algorithm to solve Sudoku.
Backtracking is a commonly used algorithmic strategy for problem-solving. It involves trying out various possible solutions and, when unable to proceed further, backtracking (returning) to the previous step and choosing a different path to continue the search until a solution is found or it is determined that there is no solution.
Backtracking recursion is usually used to search for solutions that meet specific conditions in a set of candidate solutions. Its basic idea is to continuously try various possible choices through the continuous invocation of recursive functions, and make conditional judgments at each step to determine whether the requirements of the problem are met. If the current choice does not meet the conditions, backtrack to the previous step and try other choices.
The general steps of backtracking recursion are as follows:
- Define a recursive function: Define a recursive function based on the specific problem, determine the input parameters and return values, and write the recursive termination condition.
- Make a selection: In a recursive function, make a choice according to the requirements of the problem, which may be selecting one from a group of possible options or choosing a value within a certain range.
- Check conditions: For the current selection, check if it meets the requirements or constraints of the problem. If the conditions are met, continue with the recursive call; if the conditions are not met, backtrack to the previous step.
- Recursive call: In the case of meeting the condition, by making a recursive call to itself, continue to make choices in the next step and further explore possible solutions.
- Backtracking: If the result returned by the recursive call is a failure or an unsolvable situation, backtrack to the previous step, undo the current selection, and try other choices.
- Return result: In the termination condition of the recursive function, return the final solution or indicate the result of no solution.
The key to backtracking recursion lies in continuously making choices, conditionally judging, and recursively calling, searching the entire solution space to find a solution to the problem. The efficiency of this algorithm is usually affected by the size of the problem and the number of choices, so when designing the algorithm, attention should be paid to the order of choices and pruning strategies to reduce the search space and improve efficiency.
The code to be solved is as follows:
import copy
def solve_sudoku(board):
# 检查当前位置是否可填数字
def is_valid(board, row, col, num):
# 检查行是否合法
for i in range(9):
if board[row][i] == num:
return False
# 检查列是否合法
for i in range(9):
if board[i][col] == num:
return False
# 检查小九宫格是否合法
start_row = 3 * (row // 3)
start_col = 3 * (col // 3)
for i in range(3):
for j in range(3):
if board[start_row + i][start_col + j] == num:
return False
return True
# 寻找下一个待填数字的位置
def find_empty(board):
for i in range(9):
for j in range(9):
if board[i][j] == 0:
return i, j
return None
# 递归解决数独
def solve(board):
# 找到下一个待填数字的位置
find = find_empty(board)
if not find:
return True
else:
row, col = find
# 尝试填入数字
for num in range(1, 10):
if is_valid(board, row, col, num):
board[row][col] = num
# 递归尝试填下一个位置
if solve(board):
return True
# 如果无解,回溯到上一个位置
board[row][col] = 0
return False
# 创建数独副本,并进行求解
board_copy = copy.deepcopy(board)
if solve(board_copy):
return board_copy
else:
return None
Simulate mouse and keyboard actions to automatically complete Sudoku.
import time
import pyautogui
from puzzle_getter import get_sudoku_board
from window_getter import get_window
from sudoku_solver import solve_sudoku
# 延迟运行画面捕捉,确保数独程序在最上层显示
# 当然也可以使用按键触发
print("Starting in 2s ...")
time.sleep(2)
print("Started.")
# 获取图像,生成numpy棋盘
image, (window_left, window_top) = get_window()
sudoku_board, top_left_pos, board_width = get_sudoku_board(image)
sudoku_board = list(sudoku_board)
# 求解数独
start_time = time.time()
solution = solve_sudoku(sudoku_board)
end_time = time.time()
if solution is not None:
print("Used time : %.4fs"%(end_time-start_time))
for row in solution:
print(row)
else:
print("No solution found.")
move_step = board_width // 9
base_position = (window_left + top_left_pos[0] + move_step // 2, window_top + top_left_pos[1] + move_step // 2)
# 设定每次操作间隔,单位s
pyautogui.PAUSE = 0.0015
for j in range(9):
for i in range(9):
if sudoku_board[j][i] != 0:
continue
pyautogui.click(base_position[0] + i * move_step, base_position[1] + j * move_step)
# time.sleep(0.001)
pyautogui.press(str(solution[j][i]))
certainly like your web site but you need to check the spelling on several of your posts. Several of them are rife with spelling problems and I find it very bothersome to tell the truth nevertheless I will certainly come back again.