UCT-ADP Progressive Bias Algorithm for Solving Gomoku

12/11/2019
by Xu Cao, et al.

We combine Adaptive Dynamic Programming (ADP), a reinforcement learning method, and the UCB applied to trees (UCT) algorithm with a more powerful heuristic function based on the Progressive Bias method and two pruning strategies for the traditional board game Gomoku. For the ADP part, we train a shallow feed-forward neural network to give a quick evaluation of Gomoku board situations. UCT is a general tree policy used in Monte Carlo tree search (MCTS). Our framework uses UCT to balance exploration and exploitation in Gomoku game trees, applies the pruning strategies and heuristic function to re-select the available 2-adjacent grids of the state, and uses ADP instead of simulation to estimate the values of expanded nodes. Experimental results show that this method can eliminate the search depth defect of the simulation process and converges to the correct value faster than UCT alone. This approach can be applied to design new Gomoku AI and to solve other Gomoku-like board games.
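The selection rule and candidate re-selection described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the standard UCB1 score plus the usual progressive-bias term H / (n + 1), where H stands in for the ADP network's evaluation of a node. It also reads "2-adjacent grids" as empty cells within Chebyshev distance 2 of an existing stone. The names `uct_pb_score` and `candidate_moves`, the constant `c`, and the 15x15 board size are all hypothetical.

```python
import math


def uct_pb_score(value_sum, visits, parent_visits, heuristic, c=1.4):
    """UCB1 score plus a progressive-bias term (assumed form: H / (n + 1)).

    `heuristic` stands in for an ADP-style evaluation of the child node;
    the abstract does not give the exact formula used by the authors.
    """
    if visits == 0:
        # Unvisited children are tried first (a common convention, assumed here).
        return math.inf
    exploit = value_sum / visits                              # empirical mean value
    explore = c * math.sqrt(math.log(parent_visits) / visits)  # UCB1 bonus
    bias = heuristic / (visits + 1)                           # fades as visits grow
    return exploit + explore + bias


def candidate_moves(stones, size=15, radius=2):
    """Empty cells within `radius` (Chebyshev distance) of any stone.

    `stones` is a set of occupied (row, col) cells; the interpretation of
    "2-adjacent" as radius 2 is an assumption.
    """
    if not stones:
        # Empty board: fall back to the centre cell.
        return {(size // 2, size // 2)}
    cands = set()
    for r, col in stones:
        for dr in range(-radius, radius + 1):
            for dc in range(-radius, radius + 1):
                cell = (r + dr, col + dc)
                on_board = 0 <= cell[0] < size and 0 <= cell[1] < size
                if on_board and cell not in stones:
                    cands.add(cell)
    return cands
```

Because the bias term is divided by the visit count, the heuristic dominates early selection and the empirical mean takes over as a node accumulates visits. Restricting expansion to `candidate_moves` keeps the branching factor far below the full set of empty cells.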

research
09/11/2018

Massively Parallel Dynamic Programming on Trees

Dynamic programming is a powerful technique that is, unfortunately, ofte...
research
09/04/2023

AlphaZero Gomoku

In the past few years, AlphaZero's exceptional capability in mastering i...
research
07/05/2021

Towards solving the 7-in-a-row game

Our paper explores the game theoretic value of the 7-in-a-row game. We r...
research
08/14/2019

Heuristic Dynamic Programming for Adaptive Virtual Synchronous Generators

In this paper a neural network heuristic dynamic programing (HDP) is use...
research
06/16/2020

Online Reinforcement Learning Control by Direct Heuristic Dynamic Programming: from Time-Driven to Event-Driven

In this paper time-driven learning refers to the machine learning method...
research
11/09/2020

Solving the Steiner Tree Problem with few Terminals

The Steiner tree problem is a well-known problem in network design, rout...
research
06/18/2017

Single item stochastic lot sizing problem considering capital flow and business overdraft

This paper introduces capital flow to the single item stochastic lot siz...
