AlphaGo Zero

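AlphaGo Zero is a version of AlphaGo, the Go-playing software developed by DeepMind. On 19 October 2017 the AlphaGo team published an article in Nature introducing AlphaGo Zero, then the newest version, stating that it was trained without any human game records and was stronger than every previous version^[1]. Training purely through self-play, AlphaGo Zero surpassed AlphaGo Lee within three days, beating it 100 games to 0; it reached the level of AlphaGo Master in 21 days and exceeded all earlier versions in 40 days^[2]. Demis Hassabis, co-founder and CEO of DeepMind, said that AlphaGo Zero was so powerful because it was no longer constrained by the limits of human knowledge^[3]. Because expert data is "often expensive, unreliable or simply unavailable", training AI without datasets derived from human experts has significant implications for developing AI with superhuman skills^[4]: such a system does not learn from humans but surpasses them directly through reflection on its own play and its own creativity. David Silver, one of the paper's authors, said that removing the need to learn from humans makes it possible to generalize existing AI algorithms^[5].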

Training

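The AlphaGo Zero neural network was trained using TensorFlow on 64 GPUs and 19 CPU parameter servers; only four TPUs were used for inference. Initially the network knew nothing about Go beyond the rules. The AI engaged in what was described as "unsupervised learning", playing against itself until it could anticipate its own moves and how those moves would affect the outcome of the game^[6]. In the first three days AlphaGo Zero played 4.9 million games against itself in succession^[7]. Within days it developed the skills needed to beat top human players, whereas the earlier AlphaGo had required months of training to reach the same level^[8]. For comparison, the researchers also trained a version of AlphaGo Zero on human game records; that version learned more quickly at first but performed worse in the long run^[9].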
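The idea of learning only from one's own games can be illustrated with a deliberately tiny sketch. It is not AlphaGo Zero's method: there is no neural network and no tree search, the game is a toy version of Nim (remove 1 to 3 stones, taking the last stone wins) rather than Go, and every name in it (`legal_moves`, `self_play_game`, `train`) is hypothetical. It only shows the loop of playing against oneself, recording each move with the final result, and using those results to choose later moves.

```python
import random

def legal_moves(pile):
    # In this toy Nim variant a player removes 1, 2, or 3 stones.
    return list(range(1, min(3, pile) + 1))

def self_play_game(policy, rng, pile=10):
    """Play one game in which the same policy moves for both sides.

    Returns (pile_before_move, move, outcome) samples, where outcome is +1
    if the player who made the move eventually won and -1 otherwise.
    """
    history = []             # (pile, move, player) for each move played
    player = 0
    while pile > 0:
        move = policy(pile, rng)
        history.append((pile, move, player))
        pile -= move
        player = 1 - player
    winner = history[-1][2]  # whoever made the final move took the last stone
    return [(p, m, 1 if who == winner else -1) for p, m, who in history]

def train(num_games=500, epsilon=0.2, seed=0):
    """Improve a tabular policy using only the outcomes of its own games."""
    rng = random.Random(seed)
    stats = {}  # (pile, move) -> [sum of outcomes, visit count]

    def mean_outcome(pile, move):
        total, count = stats.get((pile, move), (0, 0))
        return total / count if count else 0.0

    def policy(pile, rng):
        moves = legal_moves(pile)
        if rng.random() < epsilon:  # occasional random exploration
            return rng.choice(moves)
        # Otherwise pick the move with the best average result so far.
        return max(moves, key=lambda m: mean_outcome(pile, m))

    for _ in range(num_games):
        for pile, move, outcome in self_play_game(policy, rng):
            entry = stats.setdefault((pile, move), [0, 0])
            entry[0] += outcome
            entry[1] += 1
    return {key: total / count for key, (total, count) in stats.items()}

values = train()
```

Here `values` maps each visited (pile, move) pair to the average result obtained after playing it. The only inputs are the rules and the results of the program's own games, which is the sense in which no human game data is involved.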

Applications

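Hassabis said that AlphaGo's algorithms are likely to be of most benefit in domains that require an intelligent search through an enormous space of possibilities, such as protein folding or the accurate simulation of chemical reactions^[10]. They are probably less useful in domains that are difficult to simulate, such as learning to drive a car^[11].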

Reception

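AlphaGo Zero was widely regarded as a significant advance, even when compared with its groundbreaking predecessor, AlphaGo. Oren Etzioni of the Allen Institute for Artificial Intelligence called AlphaGo Zero "a very impressive technical result", both in the team's ability to achieve its goal and in its ability to train the system in 40 days on four TPUs^[6]. The Guardian called it a "major breakthrough for artificial intelligence", citing Eleni Vasilaki of the University of Sheffield and Tom Mitchell of Carnegie Mellon University, who described it respectively as an impressive feat and an "outstanding engineering accomplishment"^[11]. Mark Pesce of the University of Sydney called AlphaGo Zero "a big technological advance" that takes us into "undiscovered territory"^[12].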

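Gary Marcus, a psychologist at New York University, was more cautious. He noted that, for all we know, AlphaGo may contain "implicit knowledge that the programmers have about how to construct machines to play problems like Go", and that it would need to be tested in other domains before one could be sure that its underlying architecture is effective at much more than playing Go. DeepMind, in contrast, is "confident that this approach is generalisable to a large number of domains"^[7].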

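South Korean professional Go player Lee Sedol responded: "The previous version of AlphaGo wasn't perfect, and I believe that's why AlphaGo Zero was made." On AlphaGo's potential for further development, Lee said he would have to wait and see, but added that it would affect young Go players. Mok Jin-seok, coach of the South Korean national Go team, said that the Go world had already been imitating the playing styles of previous AlphaGo versions and creating new ideas from them, and that he hoped AlphaGo Zero would bring new ideas as well. Mok added that general trends in the Go world were now being influenced by AlphaGo's style of play. "At first, it was hard to understand and I almost felt like I was playing against an alien. However, having had a great amount of experience, I've become used to it," he said. "We are now past the point where we debate the gap between the capability of AlphaGo and humans. It's now between computers." Mok had reportedly already begun analyzing AlphaGo Zero's playing style with players from the national team: "Though having watched only a few matches, we received the impression that AlphaGo Zero plays more like a human than its predecessors."^[13] Chinese professional Ke Jie wrote on his Weibo account: "A pure, purely self-taught AlphaGo is the strongest... as far as AlphaGo's self-improvement is concerned... humans are superfluous."^[14]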

Comparison with previous versions

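Architecture and strength^[15]
Version	Hardware	Elo rating	Matches
AlphaGo Fan	176 GPUs^[4], distributed	3144^[1]	5:0 against Fan Hui
AlphaGo Lee	48 TPUs^[4], distributed	3739^[1]	4:1 against Lee Sedol
AlphaGo Master	4 second-generation TPUs^[4], single machine	4858^[1]	60:0 against 44 professional players in online games; Future of Go Summit in Wuzhen, China: 3:0 against Ke Jie, 1:0 against a team of five top players
AlphaGo Zero	4 second-generation TPUs^[4], single machine	5185^[1]	100:0 against AlphaGo Lee; 89:11 against AlphaGo Master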
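To give a rough sense of what the rating gaps in the table mean, the sketch below converts a rating difference into an expected score. It assumes the conventional Elo formula (base 10, scale 400); the rating scale used in the Nature paper may not follow this convention exactly, so the numbers are indicative only. The function name `elo_expected_score` is hypothetical.

```python
def elo_expected_score(rating_a, rating_b):
    """Expected score of A against B under the conventional Elo model.

    Assumption: base-10 logistic curve with a 400-point scale, which may
    differ from the convention behind the ratings in the table.
    """
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

# Ratings copied from the table above.
ratings = {
    "AlphaGo Fan": 3144,
    "AlphaGo Lee": 3739,
    "AlphaGo Master": 4858,
    "AlphaGo Zero": 5185,
}

for opponent in ("AlphaGo Lee", "AlphaGo Master"):
    expected = elo_expected_score(ratings["AlphaGo Zero"], ratings[opponent])
    print(f"AlphaGo Zero vs {opponent}: expected score {expected:.3f}")
```

Under this assumption the 327-point gap between AlphaGo Zero and AlphaGo Master corresponds to an expected score of roughly 0.87, close to the 89:11 result in the table, while the gap to AlphaGo Lee gives an expected score above 0.999, in line with the 100:0 result.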

References

[1] Mastering the game of Go without human knowledge. Nature. 2017-10-19. Retrieved 2017-10-19. Archived from the original on 2017-10-19.
[2] Google's New AlphaGo Breakthrough Could Take Algorithms Where No Humans Have Gone. Yahoo!. 2017-10-19. Retrieved 2017-10-19. Archived from the original on 2017-10-19.
[3] AlphaGo Zero: Google DeepMind supercomputer learns 3,000 years of human knowledge in 40 days. The Telegraph. 2017-10-18. Retrieved 2017-10-19. Archived from the original on 2017-10-20.
[4] Hassabis, Demis; Silver, David. AlphaGo Zero: Learning from scratch. DeepMind. 2017-10-18. Retrieved 2017-10-19. Archived from the original on 2017-10-19.
[5] DeepMind AlphaGo Zero learns on its own without meatbag intervention. ZDNet. 2017-10-19. Retrieved 2017-10-20. Archived from the original on 2017-10-20.
[6] Greenemeier, Larry. AI versus AI: Self-Taught AlphaGo Zero Vanquishes Its Predecessor. Scientific American. Retrieved 2017-10-20. Archived from the original on 2017-10-19.
[7] Computer Learns To Play Go At Superhuman Levels 'Without Human Knowledge'. NPR. 2017-10-18. Retrieved 2017-10-20. Archived from the original on 2017-10-20.
[8] Google's New AlphaGo Breakthrough Could Take Algorithms Where No Humans Have Gone. Fortune. 2017-10-19. Retrieved 2017-10-20. Archived from the original on 2017-10-19.
[9] This computer program can beat humans at Go—with no human instruction. Science | AAAS. 2017-10-18. Retrieved 2017-10-20. Archived from the original on 2017-10-19.
[10] The latest AI can work things out without being taught. The Economist. Retrieved 2017-10-20. Archived from the original on 2017-10-19.
[11] Sample, Ian. 'It's able to create knowledge itself': Google unveils AI that learns on its own. The Guardian. 2017-10-18. Retrieved 2017-10-20. Archived from the original on 2017-10-19.
[12] How Google's new AI can teach itself to beat you at the most complex games. Australian Broadcasting Corporation. 2017-10-19. Retrieved 2017-10-20. Archived from the original on 2017-10-20.
[13] Go Players Excited About 'More Humanlike' AlphaGo Zero. Korea Bizwire. 2017-10-19. Retrieved 2017-10-21. Archived from the original on 2017-10-21.
[14] 柯洁:对于AlphaGo的自我进步来讲人类太多余了 [Ke Jie: As far as AlphaGo's self-improvement is concerned, humans are superfluous]. 环球网 (Huanqiu). 2017-10-20. Retrieved 2017-11-08. Archived from the original on 2017-11-09.
[15] 【柯洁战败解密】AlphaGo Master最新架构和算法，谷歌云与TPU拆解 [Ke Jie's defeat decoded: AlphaGo Master's latest architecture and algorithms, Google Cloud and TPU breakdown]. 搜狐 (Sohu). 2017-05-24. Retrieved 2017-06-01. Archived from the original on 2017-09-17.

External links

AlphaGo blog (archived at the Internet Archive)
Nature news on AlphaGo Zero (archived at the Internet Archive)
Full Nature article on AlphaGo Zero (archived at Archive.is, 2018-01-03)
AlphaGo Zero Games (archived at the Internet Archive)
AMA on Reddit (archived at the Internet Archive)
