
A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play.

David Silver, Thomas Hubert, Julian Schrittwieser, Ioannis Antonoglou, Matthew Lai, Arthur Guez, Marc Lanctot, Laurent Sifre, Dharshan Kumaran, Thore Graepel, Timothy Lillicrap, Karen Simonyan, Demis Hassabis.

Abstract

The game of chess is the longest-studied domain in the history of artificial intelligence. The strongest programs are based on a combination of sophisticated search techniques, domain-specific adaptations, and handcrafted evaluation functions that have been refined by human experts over several decades. By contrast, the AlphaGo Zero program recently achieved superhuman performance in the game of Go by reinforcement learning from self-play. In this paper, we generalize this approach into a single AlphaZero algorithm that can achieve superhuman performance in many challenging games. Starting from random play and given no domain knowledge except the game rules, AlphaZero convincingly defeated a world champion program in the games of chess and shogi (Japanese chess), as well as Go.
Copyright © 2018 The Authors, some rights reserved; exclusive licensee American Association for the Advancement of Science. No claim to original U.S. Government Works.

Year:  2018        PMID: 30523106     DOI: 10.1126/science.aar6404

Source DB:  PubMed          Journal:  Science        ISSN: 0036-8075            Impact factor:   47.728


  82 in total

1.  Opinion: What does AI's success playing complex board games tell brain scientists?

Authors:  Dale Purves
Journal:  Proc Natl Acad Sci U S A       Date:  2019-07-23       Impact factor: 11.205

2.  Deep learning based detection of intracranial aneurysms on digital subtraction angiography: A feasibility study.

Authors:  Nicolin Hainc; Manoj Mannil; Vaia Anagnostakou; Hatem Alkadhi; Christian Blüthgen; Lorenz Wacht; Andrea Bink; Shakir Husain; Zsolt Kulcsár; Sebastian Winklhofer
Journal:  Neuroradiol J       Date:  2020-07-07

3.  Fast reinforcement learning with generalized policy updates.

Authors:  André Barreto; Shaobo Hou; Diana Borsa; David Silver; Doina Precup
Journal:  Proc Natl Acad Sci U S A       Date:  2020-08-17       Impact factor: 11.205

4.  Archetypal landscapes for deep neural networks.

Authors:  Philipp C Verpoort; Alpha A Lee; David J Wales
Journal:  Proc Natl Acad Sci U S A       Date:  2020-08-25       Impact factor: 11.205

5.  Why deep-learning AIs are so easy to fool.

Authors:  Douglas Heaven
Journal:  Nature       Date:  2019-10       Impact factor: 49.962

6.  The unreasonable effectiveness of deep learning in artificial intelligence.

Authors:  Terrence J Sejnowski
Journal:  Proc Natl Acad Sci U S A       Date:  2020-01-28       Impact factor: 11.205

7.  α-Rank: Multi-Agent Evaluation by Evolution.

Authors:  Shayegan Omidshafiei; Christos Papadimitriou; Georgios Piliouras; Karl Tuyls; Mark Rowland; Jean-Baptiste Lespiau; Wojciech M Czarnecki; Marc Lanctot; Julien Perolat; Remi Munos
Journal:  Sci Rep       Date:  2019-07-09       Impact factor: 4.379

8.  Agent-Based Modeling of Systemic Inflammation: A Pathway Toward Controlling Sepsis.

Authors:  Gary An; R Chase Cockrell
Journal:  Methods Mol Biol       Date:  2021

9.  How Machine Learning Will Transform Biomedicine. [Review]

Authors:  Jeremy Goecks; Vahid Jalili; Laura M Heiser; Joe W Gray
Journal:  Cell       Date:  2020-04-02       Impact factor: 41.582

10.  Fail-safe genetic codes designed to intrinsically contain engineered organisms.

Authors:  Jonathan Calles; Isaac Justice; Detravious Brinkley; Alexa Garcia; Drew Endy
Journal:  Nucleic Acids Res       Date:  2019-11-04       Impact factor: 16.971
