An Approach to the Development of a Game Agent based on SOM and Reinforcement Learning

  • Keiji Kamei Nishinippon Institute of Technology
  • Yuuki Kakizoe Nishinippon Institute of Technology
Keywords: Game agent, Self-Organizing Maps, Reinforcement learning, Reversi (Othello) game

Abstract

Recently, several studies have reported that computer programs known as game agents have exceeded the ability of experts in some board games, e.g., Deep Blue, AKARA and AlphaGo. Human beings have no advantage over computers in terms of raw numerical ability; nevertheless, experts still often defeat such programs. Consequently, many studies on board-game agents aim to defeat experts by purely computational means, and the resulting agents depend on high computational capability because they apply deep look-ahead search to move determination. In contrast, our final aims are to develop a board-game agent that does not require high computational capability and an “Enjoyable” game agent that is tailored to each player’s skill, based on a simple structure and simple algorithms. To realize these aims, we propose to combine Self-Organizing Maps (SOM) with reinforcement learning. For more effective learning of the optimal moves of a board game, our proposal modifies the update formula of SOM and introduces a tree search with a low calculation load to determine moves in the closing stage. We conduct two experiments: first, we examine the effectiveness of our proposals; second, we aim to improve the winning rate. The game agent developed on the basis of our proposal achieved a 60% winning rate against an opponent program on a general personal computer. Moreover, the results suggest its potential to become an “Enjoyable” game agent for players with diverse skills.
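The paper itself defines the modified SOM formula and the agent’s exact structure; the snippet below is only a generic sketch of the combination the abstract describes, assuming a SOM that quantizes 8x8 board vectors into a small set of prototype units and tabular Q-learning over those units. All names, sizes, and parameters here are illustrative, not the authors’ actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy SOM: a 1-D grid of prototype vectors that quantizes 8x8 board
# states (64-dim vectors in {-1, 0, +1}) into a small number of units.
N_UNITS, DIM = 16, 64
weights = rng.normal(scale=0.1, size=(N_UNITS, DIM))

def bmu(state):
    """Index of the best-matching unit (BMU) for a board state."""
    return int(np.argmin(np.linalg.norm(weights - state, axis=1)))

def som_update(state, lr=0.1, sigma=2.0):
    """Classic SOM update: pull the BMU and its neighbors toward the state.
    (The paper uses a modified formula; this is the standard one.)"""
    b = bmu(state)
    dist = np.abs(np.arange(N_UNITS) - b)     # grid distance to the BMU
    h = np.exp(-dist**2 / (2 * sigma**2))     # neighborhood function
    weights[:] += lr * h[:, None] * (state - weights)

# Tabular Q-learning over SOM units instead of raw boards: the SOM
# compresses the huge board space into N_UNITS abstract states.
N_MOVES = 64
Q = np.zeros((N_UNITS, N_MOVES))

def q_update(s, move, reward, s_next, alpha=0.1, gamma=0.95):
    u, u_next = bmu(s), bmu(s_next)
    Q[u, move] += alpha * (reward + gamma * Q[u_next].max() - Q[u, move])

# One illustrative step: observe a board, train the SOM on it,
# then reinforce the played move with the observed reward.
board = rng.choice([-1.0, 0.0, 1.0], size=DIM)
next_board = rng.choice([-1.0, 0.0, 1.0], size=DIM)
som_update(board)
q_update(board, move=19, reward=1.0, s_next=next_board)
```

In this scheme the SOM plays the role of a learned state abstraction, which is what keeps the approach light enough for a general personal computer; an exact endgame tree search (not shown) would take over move selection in the closing stage.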

Author Biographies

Keiji Kamei, Nishinippon Institute of Technology

Associate Professor

Department of Production Systems
Graduate School of Engineering

Yuuki Kakizoe, Nishinippon Institute of Technology

Department of Production Systems
Graduate School of Engineering

References

M. Campbell, “Deep Blue,” Artificial Intelligence, vol. 134, issues 1-2, Jan. 2002, pp. 57–83

T. Obata, T. Sugiyama, K. Hoki and T. Ito, “Consultation Algorithm for Computer Shogi: Move Decisions by Majority,” Computers and Games: 7th Int’l Conf. (CG 2010), Springer, 2011, pp. 456–465

K. Hoki, “A New Trend in the Computer Shogi: Application of Brute-force Search and Futility Pruning Technique in Shogi,” Joho Shori, vol. 47, no. 8, 2006, pp. 884–889

G. Tesauro, “TD-Gammon, a Self-teaching Backgammon Program, Achieves Master-level Play,” Neural Computation, vol. 6, no. 2, 1994, pp. 215–219

J. B. Pollack and A. D. Blair, “Why Did TD-Gammon Work?,” Advances in Neural Information Processing Systems 9, MIT Press, 1997, pp. 10–16

D. Silver et al., “Mastering the Game of Go with Deep Neural Networks and Tree Search,” Nature, vol. 529, 2016, pp. 484–489

R. S. Sutton and A. G. Barto, Reinforcement Learning, MIT Press, 1998

T. Kohonen, Self-Organizing Maps, 2nd ed., Springer, 1997

Y. Kakizoe and K. Kamei, “Reduction of Dimensions by SOM for Creating a Game Agent,” Proc. 21st Annual Conf. of the Japanese Neural Network Society (JNNS2011), 2011, pp. 212–213

Y. Kakizoe and K. Kamei, “Determination of the Appropriate Thresholds for Memory Retrieval of Reversi from SOM,” Proc. 6th Int’l Conf. on Soft Computing and Intelligent Systems and 13th Int’l Symp. on Advanced Intelligent Systems (SCIS&ISIS 2012), 2012, pp. 473–476

Y. Kakizoe and K. Kamei, “Development of Othello Agent by Reinforcement Learning and SOM,” Proc. 22nd Annual Conf. of the Japanese Neural Network Society (JNNS2012), CD-ROM, 2012

K. Kamei and Y. Kakizoe, “An Approach to the Development of a Game Agent based on SOM and Reinforcement Learning,” Proc. IIAI Int’l Congress on Advanced Applied Informatics 2016 (IIAI AAI 2016), 2016, pp. 669–674

R. Pfeifer and C. Scheier, Understanding Intelligence, MIT Press, 1999

J. Feinstein, “Perfect Play in 6x6 Othello from Two Alternative Starting Positions”; http://www.feinst.demon.co.uk/Othello/6x6sol.html

K. Rocki and R. Suda, “Large-scale Parallel Monte Carlo Tree Search on GPU,” Proc. IEEE Int’l Symp. on Parallel and Distributed Processing Workshops and PhD Forum (IPDPSW), 2011, pp. 2034–2037

D. Strnad and N. Guid, “Parallel Alpha-beta Algorithm on the GPU,” Computing and Information Technology, vol. 19, no. 4, 2011, pp. 269–274

J. Olivito, C. González and J. Resano, “FPGA Implementation of a Strong Reversi Player,” Proc. 2010 Int’l Conf. on Field-Programmable Technology (FPT’10), 2010, pp. 507–510

A. Benbassat and M. Sipper, “Evolving Board-game Players with Genetic Programming,” Proc. 13th Annual Conf. Companion on Genetic and Evolutionary Computation (GECCO’11), 2011, pp. 739–742

S. Y. Chong, M. K. Tan and J. D. White, “Observing the Evolution of Neural Networks Learning to Play the Game of Othello,” IEEE Transactions on Evolutionary Computation, vol. 9, no. 3, 2005, pp. 240–251

Published
2017-09-30