Radial basis function kernel

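In machine learning, the (Gaussian) radial basis function kernel, or RBF kernel, is a widely used kernel function. It is the kernel most commonly used in support vector machine classification.^[1]

The RBF kernel on two samples x and x', represented as feature vectors in some input space, is defined as:^[2]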

K(\mathbf {x} ,\mathbf {x'} )=\exp \left(-{\frac {||\mathbf {x} -\mathbf {x'} ||_{2}^{2}}{2\sigma ^{2}}}\right)

$\textstyle ||\mathbf {x} -\mathbf {x'} ||_{2}^{2}$ is the squared Euclidean distance between the two feature vectors, and $\sigma$ is a free parameter. An equivalent but simpler definition introduces a parameter $\gamma$ with $\textstyle \gamma ={\tfrac {1}{2\sigma ^{2}}}$:

K(\mathbf {x} ,\mathbf {x'} )=\exp(-\gamma ||\mathbf {x} -\mathbf {x'} ||_{2}^{2})
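The two parameterisations can be compared numerically. The following is a minimal sketch using only the Python standard library; the function names `squared_distance`, `rbf_kernel_sigma` and `rbf_kernel_gamma` and the sample vectors are illustrative assumptions, not part of any particular library.

```python
import math


def squared_distance(x, y):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def rbf_kernel_sigma(x, y, sigma):
    """RBF kernel in the sigma form: exp(-||x - y||^2 / (2 sigma^2))."""
    return math.exp(-squared_distance(x, y) / (2.0 * sigma ** 2))


def rbf_kernel_gamma(x, y, gamma):
    """RBF kernel in the gamma form: exp(-gamma ||x - y||^2)."""
    return math.exp(-gamma * squared_distance(x, y))


# Hypothetical sample vectors, chosen only for illustration.
x = [1.0, 2.0]
x_prime = [2.0, 0.0]

sigma = 1.5
gamma = 1.0 / (2.0 * sigma ** 2)  # gamma = 1 / (2 sigma^2)

k_sigma = rbf_kernel_sigma(x, x_prime, sigma)
k_gamma = rbf_kernel_gamma(x, x_prime, gamma)
k_self = rbf_kernel_sigma(x, x, sigma)  # kernel of a point with itself
```

With $\gamma$ derived from $\sigma$ as above, `k_sigma` and `k_gamma` agree up to floating-point rounding, and `k_self` equals 1 because the distance from a point to itself is zero.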

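Because the value of the RBF kernel decreases with distance and lies between 0 (in the limit) and 1 (when x = x'), it has a ready interpretation as a similarity measure.^[2] The feature space of the kernel has an infinite number of dimensions; for $\sigma =1$, its expansion is:^[3]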

\exp \left(-{\frac {1}{2}}||\mathbf {x} -\mathbf {x'} ||_{2}^{2}\right)=\sum _{j=0}^{\infty }{\frac {(\mathbf {x} ^{\top }\mathbf {x'} )^{j}}{j!}}\exp \left(-{\frac {1}{2}}||\mathbf {x} ||_{2}^{2}\right)\exp \left(-{\frac {1}{2}}||\mathbf {x'} ||_{2}^{2}\right)
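The expansion follows from writing $||\mathbf {x} -\mathbf {x'} ||_{2}^{2}=||\mathbf {x} ||_{2}^{2}-2\mathbf {x} ^{\top }\mathbf {x'} +||\mathbf {x'} ||_{2}^{2}$ and expanding $\exp(\mathbf {x} ^{\top }\mathbf {x'} )$ as a power series. It can be checked numerically by truncating the sum. The sketch below assumes $\sigma =1$; the helper names and sample vectors are hypothetical.

```python
import math


def rbf_kernel_unit_sigma(x, y):
    """Exact RBF kernel for sigma = 1: exp(-||x - y||^2 / 2)."""
    return math.exp(-0.5 * sum((a - b) ** 2 for a, b in zip(x, y)))


def rbf_series(x, y, num_terms):
    """Partial sum of the expansion, using terms j = 0 .. num_terms - 1."""
    dot = sum(a * b for a, b in zip(x, y))
    # The two norm-dependent factors that multiply every term of the sum.
    envelope = (math.exp(-0.5 * sum(a * a for a in x))
                * math.exp(-0.5 * sum(b * b for b in y)))
    partial = sum(dot ** j / math.factorial(j) for j in range(num_terms))
    return partial * envelope


# Hypothetical sample vectors, chosen only for illustration.
x = [1.0, 0.5]
x_prime = [0.3, -0.2]

exact = rbf_kernel_unit_sigma(x, x_prime)
# Absolute truncation error for an increasing number of terms.
errors = [abs(rbf_series(x, x_prime, n) - exact) for n in (1, 2, 5, 20)]
```

The entries of `errors` shrink as more terms are included. Each power of $\mathbf {x} ^{\top }\mathbf {x'}$ corresponds to a further block of feature dimensions, which is why no finite-dimensional feature map reproduces the kernel exactly.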

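Approximations

Because support vector machines and other models that rely on the kernel trick scale poorly to large numbers of training samples or to samples with many features in the input space, several approximations to the RBF kernel (and to similar kernels) have been devised.^[4] Typically, these take the form of a function z(x) that transforms a single vector independently of any other vectors (for example, the support vectors of a support vector machine), such that: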

\langle z(\mathbf {x} ),z(\mathbf {x'} )\rangle \approx \langle \varphi (\mathbf {x} ),\varphi (\mathbf {x'} )\rangle =K(\mathbf {x} ,\mathbf {x'} )

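where $\textstyle \varphi$ is the implicit mapping embedded in the RBF kernel.

One way to construct such a function z is to take the Fourier transform of the kernel and sample randomly from it.^[5]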
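A minimal sketch of this idea, in the style of the random Fourier features of Rahimi and Recht, is given below. It assumes the $\gamma$ parameterisation of the kernel, for which the Fourier transform is a Gaussian with variance $2\gamma$ per coordinate. The function names, the number of features and the sample vectors are illustrative assumptions, and only the Python standard library is used.

```python
import math
import random


def make_random_fourier_features(dim, num_features, gamma, seed=0):
    """Return a map z with <z(x), z(y)> approximately exp(-gamma ||x - y||^2)."""
    rng = random.Random(seed)
    std = math.sqrt(2.0 * gamma)  # frequencies are drawn from N(0, 2 * gamma)
    weights = [[rng.gauss(0.0, std) for _ in range(dim)]
               for _ in range(num_features)]
    offsets = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_features)]
    scale = math.sqrt(2.0 / num_features)

    def z(x):
        # Each feature depends on x alone, never on any other sample.
        return [scale * math.cos(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
                for w, b in zip(weights, offsets)]

    return z


def rbf_kernel(x, y, gamma):
    """Exact RBF kernel, used here only as a reference value."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


gamma = 0.5
z = make_random_fourier_features(dim=3, num_features=5000, gamma=gamma, seed=0)

# Hypothetical sample vectors, chosen only for illustration.
x = [0.2, -0.4, 1.0]
x_prime = [0.5, 0.1, 0.3]

approx = sum(a * b for a, b in zip(z(x), z(x_prime)))  # <z(x), z(x')>
exact = rbf_kernel(x, x_prime, gamma)
```

The approximation error shrinks roughly in proportion to the inverse square root of the number of random features, so `approx` is close to `exact` but not identical. Once the samples are mapped through `z`, a linear model can be trained on the transformed features in place of a kernel machine.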

References

[1] Yin-Wen Chang, Cho-Jui Hsieh, Kai-Wei Chang, Michael Ringgaard and Chih-Jen Lin (2010). Training and testing low-degree polynomial data mappings via linear SVM. J. Machine Learning Research 11: 1471–1490.
[2] Vert, Jean-Philippe, Koji Tsuda, and Bernhard Schölkopf (2004). "A primer on kernel methods." Kernel Methods in Computational Biology.
[3] Shashua, Amnon. Introduction to Machine Learning: Class Notes 67577. 2009. arXiv:0904.3664 [cs.LG].
[4] Andreas Müller (2012). Kernel Approximations for Efficient SVMs (and other feature extraction methods) (archived copy at the Internet Archive).
[5] Ali Rahimi and Benjamin Recht (2007). Random features for large-scale kernel machines. Neural Information Processing Systems.
