Differentiable programming

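Differentiable programming is a programming paradigm in which a numerical program can be differentiated throughout via automatic differentiation.[1][2][3][4] This allows gradient-based optimization of the parameters in the program, usually by gradient descent. Differentiable programming is used in a wide range of fields, particularly scientific computing and artificial intelligence.[4]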
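As an illustration of the idea, the following is a minimal sketch of forward-mode automatic differentiation with dual numbers, followed by gradient descent on a parameter of an ordinary program. The names `Dual`, `derivative`, `loss` and `gradient_descent` are hypothetical and are not taken from any framework mentioned in this article; only addition, subtraction and multiplication are supported.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A value paired with its derivative (forward-mode AD)."""
    value: float
    deriv: float = 0.0

    def __add__(self, other):
        other = _lift(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __sub__(self, other):
        other = _lift(other)
        return Dual(self.value - other.value, self.deriv - other.deriv)

    def __rsub__(self, other):
        return _lift(other) - self

    def __mul__(self, other):
        other = _lift(other)
        # Product rule: (u * v)' = u' * v + u * v'
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

    __rmul__ = __mul__


def _lift(x):
    """Treat plain numbers as constants (derivative zero)."""
    return x if isinstance(x, Dual) else Dual(float(x))


def derivative(f, x):
    """Evaluate df/dx at x by seeding the input with derivative 1."""
    return f(Dual(x, 1.0)).deriv


def loss(w):
    # An ordinary program with a loop: sum of squared errors, minimal at w = 3.
    total = 0.0
    for target in (2.0, 3.0, 4.0):
        diff = w - target
        total = total + diff * diff
    return total


def gradient_descent(f, w, lr=0.1, steps=100):
    """Repeatedly step against the derivative computed by automatic differentiation."""
    for _ in range(steps):
        w -= lr * derivative(f, w)
    return w
```

Calling `gradient_descent(loss, 0.0)` moves the parameter toward 3, the minimizer of `loss`. The derivative is obtained by running the program itself on `Dual` values rather than by manipulating a formula.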

Approaches

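Most differentiable programming frameworks work by constructing a graph that contains the control flow and data structures of the program.[5] Earlier attempts generally fall into two groups: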

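  • Static, compiled-graph approaches, such as TensorFlow 1[note 1], in which the whole computation graph is defined before it is run.[7]
  • Operator-overloading, dynamic-graph approaches, such as PyTorch and NumPy's autograd package. Their dynamic and interactive nature makes most programs easier to write and to reason about. However, they incur interpreter overhead (particularly when a program is composed of many small operations), scale less well, and benefit less readily from compiler optimization.[6][7][4]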

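These earlier approaches can only differentiate code written in the style that the framework expects, which limits interoperability with other programs.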
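The operator-overloading, dynamic-graph style can be sketched in a few lines. The `Var` class, the `sin` wrapper, `power` and `backward` below are hypothetical names chosen for illustration and do not reproduce the API of PyTorch or autograd. Each arithmetic operation records its inputs and local derivatives, so the graph is rebuilt on every run, and a reverse pass then accumulates gradients.

```python
import math


class Var:
    """A value that records the operations applied to it (a dynamic graph)."""

    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents  # pairs of (parent Var, local derivative)
        self.grad = 0.0

    def __add__(self, other):
        other = other if isinstance(other, Var) else Var(other)
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Var) else Var(other)
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))

    __rmul__ = __mul__


def sin(x):
    """Framework-provided sine: computes the value and records cos(x) as local derivative."""
    return Var(math.sin(x.value), ((x, math.cos(x.value)),))


def power(x, n):
    """x ** (n) for a positive integer n, written with an ordinary Python loop."""
    out = x
    for _ in range(n - 1):
        out = out * x
    return out


def backward(output):
    """Reverse pass: propagate d(output)/d(node) to every recorded node."""
    order, seen = [], set()

    def visit(node):
        if id(node) not in seen:
            seen.add(id(node))
            for parent, _ in node.parents:
                visit(parent)
            order.append(node)  # parents are appended before their children

    visit(output)
    output.grad = 1.0
    for node in reversed(order):  # children are processed before their parents
        for parent, local in node.parents:
            parent.grad += local * node.grad
```

After `x, y = Var(2.0), Var(3.0)` and `backward(x * y + sin(x))`, `x.grad` holds `3 + cos(2)` and `y.grad` holds `2`. The sketch also shows the interoperability limit described above: `math.sin(Var(1.0))` raises a `TypeError`, because the standard-library function knows nothing about `Var`, so only code written against the framework's own operations can be differentiated.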

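More recent packages, such as Zygote for the Julia programming language[8] and Swift for TensorFlow for the Swift programming language[9], together with the new programming language Myia[10], resolve the problems faced by the earlier attempts by treating the language's syntax as the graph. The intermediate representation of arbitrary code can then be differentiated, optimized, and compiled directly.[5][11][6]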
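The idea of treating program syntax as the graph can be illustrated with a toy source transformation. The sketch below uses Python's standard `ast` module to turn the syntax tree of an arithmetic expression into the syntax tree of its derivative, which is then compiled like any other code. It is only an analogy: the helper names `d` and `differentiate` are hypothetical, only `+`, `-` and `*` are handled, and real systems such as Zygote work on a compiler intermediate representation and support control flow, which this sketch does not attempt.

```python
import ast


def d(node, wrt):
    """Return an AST for the derivative of the expression `node` with respect to `wrt`."""
    if isinstance(node, ast.Constant):
        return ast.Constant(value=0.0)
    if isinstance(node, ast.Name):
        return ast.Constant(value=1.0 if node.id == wrt else 0.0)
    if isinstance(node, ast.BinOp):
        left, right = node.left, node.right
        dl, dr = d(left, wrt), d(right, wrt)
        if isinstance(node.op, (ast.Add, ast.Sub)):
            # Sum rule: differentiate each operand and keep the operator.
            return ast.BinOp(left=dl, op=node.op, right=dr)
        if isinstance(node.op, ast.Mult):
            # Product rule: (u * v)' = u' * v + u * v'
            return ast.BinOp(
                left=ast.BinOp(left=dl, op=ast.Mult(), right=right),
                op=ast.Add(),
                right=ast.BinOp(left=left, op=ast.Mult(), right=dr),
            )
    raise NotImplementedError(ast.dump(node))


def differentiate(source, wrt):
    """Compile the derivative of a one-line expression into a callable."""
    tree = ast.parse(source, mode="eval")
    dtree = ast.Expression(body=d(tree.body, wrt))
    ast.fix_missing_locations(dtree)  # newly created nodes need line/column info
    code = compile(dtree, "<derivative>", "eval")
    # Variables are supplied as keyword arguments when the derivative is called.
    return lambda **env: eval(code, {}, env)
```

For example, `differentiate("x * x * w + 3 * x", "x")(x=2.0, w=5.0)` evaluates `2*x*w + 3` at the given point. Because the derivative is itself ordinary code, it passes through the same compiler as the original expression, which is the property the newer approaches exploit.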

Applications

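Differentiable programming has been applied in several areas, such as combining deep learning with physics engines in robotics, solving electronic structure problems with differentiable density functional theory, differentiable ray tracing, image processing, and probabilistic programming.[12][13][14][15][16][4]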
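To give a flavour of the physics-engine case, the following sketch differentiates through a very small simulation. It is an assumption-laden toy rather than any cited system: a one-dimensional particle with linear drag is integrated with explicit Euler steps, the derivative of the final position with respect to the initial velocity is propagated alongside the state, and gradient descent uses that derivative to hit a target position. The function names and all parameter values are hypothetical.

```python
def simulate(v0, drag=0.5, dt=0.01, steps=100):
    """Euler-integrate a 1-D particle with linear drag.

    Returns the final position and its derivative with respect to v0,
    obtained by propagating tangents alongside the state (forward mode).
    """
    x, v = 0.0, v0
    dx, dv = 0.0, 1.0  # d(x)/d(v0) and d(v)/d(v0) at time zero
    for _ in range(steps):
        x, dx = x + dt * v, dx + dt * dv
        v, dv = v - dt * drag * v, dv - dt * drag * dv
    return x, dx


def fit_initial_velocity(target, v0=0.0, lr=0.5, iterations=50):
    """Gradient descent on the squared distance between final position and target."""
    for _ in range(iterations):
        x, dx = simulate(v0)
        v0 -= lr * 2.0 * (x - target) * dx  # chain rule on (x - target) ** 2
    return v0
```

With these settings, `simulate(fit_initial_velocity(2.0))` returns a final position very close to 2. In a realistic setting the tangent bookkeeping would be generated by an automatic differentiation system instead of being written by hand.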

Notes

  1. TensorFlow 1 uses the static graph approach, whereas TensorFlow 2 uses the dynamic graph approach by default.

References

  1. Baydin, Atilim Gunes; Pearlmutter, Barak; Radul, Alexey Andreyevich; Siskind, Jeffrey. Automatic differentiation in machine learning: a survey. Journal of Machine Learning Research. 2018, 18: 1–43.
  2. Wang, Fei; Decker, James; Wu, Xilun; Essertel, Gregory; Rompf, Tiark. Backpropagation with Callbacks: Foundations for Efficient and Expressive Differentiable Programming (PDF). In Bengio, S.; Wallach, H.; Larochelle, H.; Grauman, K. (eds.). Advances in Neural Information Processing Systems 31. Curran Associates, Inc. 2018: 10201–10212. Retrieved 2019-02-13. Archived (PDF) from the original on 2021-02-15.
  3. Innes, Mike. On Machine Learning and Programming Languages (PDF). SysML Conference 2018. 2018. Retrieved 2021-01-14. Archived (PDF) from the original on 2020-06-05.
  4. Innes, Mike; Edelman, Alan; Fischer, Keno; Rackauckas, Chris; Saba, Elliot; Shah, Viral B.; Tebbutt, Will. ∂P: A Differentiable Programming System to Bridge Machine Learning and Scientific Computing. 2019. arXiv:1907.07587.
  5. Innes, Michael; Saba, Elliot; Fischer, Keno; Gandhi, Dhairya; Rudilosso, Marco Concetto; Joy, Neethu Mariya; Karmali, Tejan; Pal, Avik; Shah, Viral. Fashionable Modelling with Flux. 2018-10-31. arXiv:1811.01457 [cs.PL].
  6. Automatic Differentiation in Myia. Retrieved 2019-06-24. Archived from the original on 2021-02-24.
  7. TensorFlow: Static Graphs. Retrieved 2019-03-04.
  8. Zygote. Retrieved 2021-01-14. Archived from the original on 2021-02-14.
  9. Swift for TensorFlow. Retrieved 2021-01-14. Archived from the original on 2021-01-21.
  10. Myia. Retrieved 2021-01-14. Archived from the original on 2020-12-31.
  11. Innes, Michael. Don't Unroll Adjoint: Differentiating SSA-Form Programs. 2018-10-18. arXiv:1810.07951 [cs.PL].
  12. Degrave, Jonas; Hermans, Michiel; Dambre, Joni; wyffels, Francis. A Differentiable Physics Engine for Deep Learning in Robotics. 2016-11-05. arXiv:1611.01652 [cs.NE].
  13. Li, Li; Hoyer, Stephan; Pederson, Ryan; Sun, Ruoxi; Cubuk, Ekin D.; Riley, Patrick; Burke, Kieron. Kohn-Sham Equations as Regularizer: Building Prior Knowledge into Machine-Learned Physics. Physical Review Letters. 2021, 126 (3): 036401. doi:10.1103/PhysRevLett.126.036401.
  14. Differentiable Monte Carlo Ray Tracing through Edge Sampling. people.csail.mit.edu. Retrieved 2019-02-13.
  15. SciML Scientific Machine Learning Open Source Software Organization Roadmap. sciml.ai. Retrieved 2020-07-19.
  16. Differentiable Programming for Image Processing and Deep Learning in Halide. people.csail.mit.edu. Retrieved 2019-02-13.