Differentiable programming

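Differentiable programming is a programming paradigm in which a numerical program can be differentiated throughout via automatic differentiation^[1]^[2]^[3]^[4]. This enables gradient-based optimization of the program's parameters, typically by gradient descent. Differentiable programming is widely used in many fields, particularly scientific computing and artificial intelligence^[4].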
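As a rough illustration of differentiating an ordinary numerical program and then optimizing its parameter by gradient descent, the sketch below uses forward-mode automatic differentiation with dual numbers. It is a minimal example under stated assumptions, not the method of any framework cited here: the names `Dual`, `derivative`, `gradient_descent`, and `loss` are hypothetical, and only addition, subtraction, and multiplication are supported.

```python
# Minimal forward-mode automatic differentiation with dual numbers.
# All names here (Dual, derivative, gradient_descent, loss) are illustrative.

class Dual:
    """A value paired with its derivative with respect to one input."""

    def __init__(self, val, dot=0.0):
        self.val = val   # the ordinary value
        self.dot = dot   # the derivative carried alongside it

    def _lift(self, other):
        # Treat plain numbers as constants (derivative 0).
        return other if isinstance(other, Dual) else Dual(other)

    def __add__(self, other):
        other = self._lift(other)
        return Dual(self.val + other.val, self.dot + other.dot)

    __radd__ = __add__

    def __sub__(self, other):
        other = self._lift(other)
        return Dual(self.val - other.val, self.dot - other.dot)

    def __rsub__(self, other):
        return self._lift(other) - self

    def __mul__(self, other):
        other = self._lift(other)
        # Product rule: (uv)' = u'v + uv'
        return Dual(self.val * other.val,
                    self.dot * other.val + self.val * other.dot)

    __rmul__ = __mul__


def derivative(f, x):
    """Derivative of f at x, obtained by seeding dx/dx = 1."""
    return f(Dual(x, 1.0)).dot


def gradient_descent(f, x0, lr=0.1, steps=100):
    """Repeatedly step against the derivative computed by autodiff."""
    x = x0
    for _ in range(steps):
        x = x - lr * derivative(f, x)
    return x


def loss(w):
    # An ordinary numerical program: (w - 3)^2 + 1, minimised at w = 3.
    return (w - 3) * (w - 3) + 1


w_opt = gradient_descent(loss, x0=0.0)
print(derivative(loss, 0.0))  # -6.0
print(round(w_opt, 4))        # 3.0
```

Note that `loss` is written as plain arithmetic. The derivative comes from running the same code on `Dual` values, with no separate hand-derived formula.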

Approaches

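Most differentiable programming frameworks work by constructing a graph that contains the control flow and data structures of the program^[5]. Attempts generally fall into two groups: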

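Static, compiled-graph approaches, such as TensorFlow 1, Theano, and MXNet. They aim to allow good compiler optimization and to scale easily to large systems, but their static nature limits interactivity and the kinds of programs that can be built easily (for example, programs involving loops or recursion), and makes it harder for users to reason effectively about their programs^[5]^[6]^[7].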
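The define-then-run style described above can be sketched in a few lines. This is an illustrative toy, not the API of TensorFlow 1, Theano, or MXNet: the classes `Placeholder`, `Const`, `Add`, `Mul` and the functions `grad` and `run` are hypothetical names. The graph is built first without any numbers, a derivative graph is produced from it, and only then are both evaluated on concrete inputs.

```python
# Define-then-run sketch: the graph is built once, then evaluated many times.
# Node classes and function names are hypothetical, not any framework's API.

class Node:
    def __add__(self, other):
        return Add(self, other)

    def __mul__(self, other):
        return Mul(self, other)


class Placeholder(Node):
    def __init__(self, name):
        self.name = name


class Const(Node):
    def __init__(self, value):
        self.value = value


class Add(Node):
    def __init__(self, a, b):
        self.a, self.b = a, b


class Mul(Node):
    def __init__(self, a, b):
        self.a, self.b = a, b


def grad(node, wrt):
    """Return a new graph representing d(node)/d(wrt)."""
    if isinstance(node, Placeholder):
        return Const(1.0 if node is wrt else 0.0)
    if isinstance(node, Const):
        return Const(0.0)
    if isinstance(node, Add):
        return grad(node.a, wrt) + grad(node.b, wrt)
    if isinstance(node, Mul):
        # Product rule, expressed as graph construction.
        return grad(node.a, wrt) * node.b + node.a * grad(node.b, wrt)
    raise TypeError(f"unsupported node: {node!r}")


def run(node, feed):
    """Evaluate a graph given a dict mapping placeholder names to values."""
    if isinstance(node, Placeholder):
        return feed[node.name]
    if isinstance(node, Const):
        return node.value
    if isinstance(node, Add):
        return run(node.a, feed) + run(node.b, feed)
    if isinstance(node, Mul):
        return run(node.a, feed) * run(node.b, feed)
    raise TypeError(f"unsupported node: {node!r}")


# Phase 1: build the graphs. No numbers flow yet.
x = Placeholder("x")
y = x * x * x + x * Const(2.0)   # y = x^3 + 2x
dy_dx = grad(y, x)               # a graph equivalent to 3x^2 + 2

# Phase 2: run the fixed graphs on concrete inputs.
print(run(y, {"x": 2.0}))       # 12.0
print(run(dy_dx, {"x": 2.0}))   # 14.0
```

Because the whole graph exists before execution, it could in principle be optimized ahead of time. The cost is that a loop whose length depends on runtime data has no direct representation in this toy graph, which hints at the limitation noted above.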

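Operator-overloading, dynamic-graph approaches, such as PyTorch and Autograd for NumPy^[8]; TensorFlow 2 also uses the dynamic-graph approach by default. Their dynamic and interactive nature makes most programs easier to write and reason about. However, they incur interpreter overhead, particularly when a program is composed of many small operations, they scale less well, and they gain less from compiler optimization^[6]^[7]. Flux, written in Julia, uses the automatic differentiation package Zygote^[9], which works directly on Julia's intermediate representation yet can still be optimized by Julia's JIT compiler^[5]^[10]^[4].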
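The operator-overloading, define-by-run style can be sketched as a small reverse-mode tape. This is a simplified illustration, not the implementation of PyTorch or Autograd: the `Var` class and the `backward` and `power` functions are hypothetical names. The graph is recorded as a side effect of running ordinary Python, so a loop needs no special treatment.

```python
# Define-by-run sketch: the graph is recorded while ordinary Python executes.
# Var, backward, and power are hypothetical names, not a real library's API.

class Var:
    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents   # pairs of (parent Var, local derivative)
        self.grad = 0.0

    def _lift(self, other):
        # Treat plain numbers as constants.
        return other if isinstance(other, Var) else Var(other)

    def __add__(self, other):
        other = self._lift(other)
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))

    __radd__ = __add__

    def __mul__(self, other):
        other = self._lift(other)
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))

    __rmul__ = __mul__


def backward(out):
    """Reverse-mode sweep: accumulate d(out)/d(node) into node.grad."""
    order, seen = [], set()

    def visit(node):
        # Post-order traversal, so every node appears after its parents.
        if id(node) not in seen:
            seen.add(id(node))
            for parent, _ in node.parents:
                visit(parent)
            order.append(node)

    visit(out)
    out.grad = 1.0
    # Walk from the output back towards the inputs, applying the chain rule.
    for node in reversed(order):
        for parent, local in node.parents:
            parent.grad += local * node.grad


def power(x, n):
    # Plain Python control flow; the loop length is a runtime value.
    result = Var(1.0)
    for _ in range(n):
        result = result * x
    return result


x = Var(2.0)
y = power(x, 3) + 2 * x   # y = x^3 + 2x
backward(y)
print(y.value)  # 12.0
print(x.grad)   # 14.0
```

Every small operation here allocates a Python object and runs through the interpreter, which gives a feel for the overhead mentioned above when a program consists of many small operations.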

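A limitation of earlier approaches is that they can only differentiate code written in a style suited to the framework, which limits interoperability with other programs. Newer approaches address this by constructing the graph from the language's syntax or intermediate representation, allowing arbitrary code to be differentiated^[5]^[6].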

Applications

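Differentiable programming has been applied in a number of areas, such as combining deep learning with physics engines in robotics, solving electronic structure problems with differentiable density functional theory, differentiable ray tracing, image processing, and probabilistic programming^[11]^[12]^[13]^[14]^[15]^[4].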

References

[1] Baydin, Atilim Gunes; Pearlmutter, Barak; Radul, Alexey Andreyevich; Siskind, Jeffrey. Automatic differentiation in machine learning: a survey. Journal of Machine Learning Research. 2018, 18: 1–43 [2021-01-14]. (Archived from the original on 2022-01-23.)
[2] Wang, Fei; Decker, James; Wu, Xilun; Essertel, Gregory; Rompf, Tiark. Backpropagation with Callbacks: Foundations for Efficient and Expressive Differentiable Programming (PDF). In: Bengio, S.; Wallach, H.; Larochelle, H.; Grauman, K. (eds.), Advances in Neural Information Processing Systems 31. Curran Associates, Inc. 2018: 10201–10212 [2019-02-13]. (Archived (PDF) from the original on 2021-02-15.)
[3] Innes, Mike. On Machine Learning and Programming Languages (PDF). SysML Conference 2018. 2018 [2021-01-14]. (Archived (PDF) from the original on 2020-06-05.)
[4] Innes, Mike; Edelman, Alan; Fischer, Keno; Rackauckas, Chris; Saba, Elliot; Shah, Viral B.; Tebbutt, Will. ∂P: A Differentiable Programming System to Bridge Machine Learning and Scientific Computing. 2019. arXiv:1907.07587.
[5] Innes, Michael; Saba, Elliot; Fischer, Keno; Gandhi, Dhairya; Rudilosso, Marco Concetto; Joy, Neethu Mariya; Karmali, Tejan; Pal, Avik; Shah, Viral. Fashionable Modelling with Flux. 2018-10-31 [2022-08-31]. arXiv:1811.01457 [cs.PL]. (Archived from the original on 2022-08-31.)
[6] Automatic Differentiation in Myia. [2019-06-24]. (Archived from the original on 2021-02-24.)
[7] TensorFlow: Static Graphs. [2019-03-04]. (Archived from the original on 2021-09-02.)
[8] Autograd - Efficiently computes derivatives of numpy code. [2022-08-28]. (Archived from the original on 2022-07-18.)
[9] Zygote. [2021-01-14]. (Archived from the original on 2021-02-14.)
[10] Innes, Michael. Don't Unroll Adjoint: Differentiating SSA-Form Programs. 2018-10-18. arXiv:1810.07951 [cs.PL].
[11] Degrave, Jonas; Hermans, Michiel; Dambre, Joni; wyffels, Francis. A Differentiable Physics Engine for Deep Learning in Robotics. 2016-11-05. arXiv:1611.01652 [cs.NE].
[12] Li, Li; Hoyer, Stephan; Pederson, Ryan; Sun, Ruoxi; Cubuk, Ekin D.; Riley, Patrick; Burke, Kieron. Kohn-Sham Equations as Regularizer: Building Prior Knowledge into Machine-Learned Physics. Physical Review Letters. 2021, 126 (3): 036401. doi:10.1103/PhysRevLett.126.036401.
[13] Differentiable Monte Carlo Ray Tracing through Edge Sampling. people.csail.mit.edu. [2019-02-13]. (Archived from the original on 2021-05-12.)
[14] SciML Scientific Machine Learning Open Source Software Organization Roadmap. sciml.ai. [2020-07-19]. (Archived from the original on 2021-10-17.)
[15] Differentiable Programming for Image Processing and Deep Learning in Halide. people.csail.mit.edu. [2019-02-13]. (Archived from the original on 2021-05-06.)


