Gaussian pyramid

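A Gaussian pyramid is a technique used in image processing, computer vision, and signal processing. It is essentially a multi-scale representation of a signal: the same signal or image is repeatedly blurred with a Gaussian and downsampled, producing a series of signals or images at different scales for later processing. In image recognition, for example, comparing images across scales allows a target to be found even though it may appear at different sizes in the image. The theoretical basis of the Gaussian pyramid is scale-space theory, from which multiresolution analysis was later developed.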

Scale-space theory

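The theory behind the Gaussian pyramid is scale-space theory. The concept applies to signals of any dimension, but it is most often used on two-dimensional image signals, which are the main subject of the discussion below. Given an image $f(x,y)$, its scale-space representation $L(x,y;t)$ is defined as the convolution of $f(x,y)$ with the Gaussian function $g(x,y;t)={\frac {1}{2\pi t}}e^{-(x^{2}+y^{2})/2t}$. The full expression is: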

$$L(x,y;t)=g(x,y;t)*f(x,y),$$

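The semicolon in the expression indicates that the convolution is taken only over $x,y$, while the $t$ to the right of the semicolon specifies the scale. The definition holds for all $t\geq 0$, although in practice only a discrete set of $t$ values is used. Here $t$ is the variance of the Gaussian function. As $t$ approaches zero, $g$ approaches a unit impulse, so that $L(x,y;0)=f(x,y)$; that is, at $t=0$ the operation returns the image $f$ itself. As $t$ increases, $L$ is the result of passing $f$ through an increasingly wide Gaussian filter, which removes more and more of the image's detail.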
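The definition above can be sketched numerically as follows. This is a minimal illustration, not a reference implementation: the function names, the truncation of the kernel at four standard deviations, and the replicated-border handling are all assumptions made for the example.

```python
import numpy as np

def gaussian_kernel_1d(t, truncate=4.0):
    """Sampled 1-D Gaussian with variance t, normalised to sum to 1."""
    sigma = np.sqrt(t)
    radius = max(1, int(np.ceil(truncate * sigma)))  # truncation is an assumption
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2.0 * t))
    return k / k.sum()

def scale_space(f, t):
    """Approximate L(x, y; t) = g(x, y; t) * f(x, y) by separable convolution."""
    f = np.asarray(f, dtype=float)
    if t == 0:
        return f.copy()  # at t = 0 the representation is the image itself
    k = gaussian_kernel_1d(t)
    r = len(k) // 2
    padded = np.pad(f, r, mode="edge")  # replicated borders (an assumption)
    # The 2-D Gaussian is separable: filter along rows, then along columns.
    rows = np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 0, rows)

# A single bright pixel spreads out and its peak drops as t grows.
image = np.zeros((33, 33))
image[16, 16] = 1.0
for t in (0, 1.0, 4.0):
    print(t, scale_space(image, t).max())
```

The printed peak value decreases with increasing $t$, which is the loss of detail described above.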

Why the Gaussian function is used

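According to scale-space theory, if no new structure may be created in going from a finer scale to a coarser scale, then the Gaussian function has been shown to be the canonical kernel for generating a scale space. ^[1]^[2]^[3]^[4]^[5]^[6]^[7]^[8]^[9]^[10]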

Equivalently, the scale-space representation can be described as the solution of a diffusion equation (the heat equation, for example):

$$\partial _{t}L={\frac {1}{2}}\nabla ^{2}L,$$

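with initial condition $L(x,y;0)=f(x,y)$. This formulation treats the image content as a temperature distribution and the construction of the scale-space representation as heat diffusing over time $t$. A careful analysis of this formulation also unifies the continuous and discrete scale-space theories and extends to nonlinear scale spaces. In this sense, the diffusion formulation can be regarded as the foundation of scale space. The Gaussian function is the Green's function of this diffusion equation, which is why it is central to describing scale-space representations and building Gaussian pyramids.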
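As a quick check that the Gaussian solves this diffusion equation, one can differentiate $g(x,y;t)={\frac {1}{2\pi t}}e^{-(x^{2}+y^{2})/2t}$ directly. The derivation below is a sketch for $t>0$ that uses only this definition:

```latex
\begin{aligned}
\partial_t g &= \left(-\frac{1}{t} + \frac{x^2+y^2}{2t^2}\right) g, \\
\partial_{xx} g &= \left(-\frac{1}{t} + \frac{x^2}{t^2}\right) g, \qquad
\partial_{yy} g = \left(-\frac{1}{t} + \frac{y^2}{t^2}\right) g, \\
\tfrac{1}{2}\nabla^2 g &= \tfrac{1}{2}\left(-\frac{2}{t} + \frac{x^2+y^2}{t^2}\right) g = \partial_t g .
\end{aligned}
```

Because these derivatives commute with convolution by $f$, the representation $L=g*f$ satisfies the same equation.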

Building a Gaussian pyramid

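To build a Gaussian pyramid, the image is first converted to its scale-space representation, that is, convolved with Gaussian functions of different sizes, and then downsampled according to the chosen scale. The Gaussian size and the downsampling factor are usually chosen as powers of 2; in other words, in each iteration the image is convolved with a Gaussian of fixed size and then downsampled by a factor of 0.5 in both width and height. Stacking the successively downsampled images on top of one another produces a pyramid shape, hence the name Gaussian pyramid.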
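The blur-then-downsample iteration can be sketched as below. The 5-tap binomial kernel is a common discrete stand-in for a fixed-size Gaussian and is an assumption of this example, as are the function names and the replicated-border handling.

```python
import numpy as np

# 5-tap binomial kernel, a common discrete approximation of a Gaussian
# (an assumption; the text does not prescribe a particular kernel).
KERNEL = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0

def blur(img):
    """Separable low-pass filtering with replicated borders."""
    padded = np.pad(img, 2, mode="edge")
    rows = np.apply_along_axis(lambda v: np.convolve(v, KERNEL, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda v: np.convolve(v, KERNEL, mode="valid"), 0, rows)

def gaussian_pyramid(img, levels):
    """Return [level 0, level 1, ...]; each new level is blurred, then halved."""
    pyramid = [np.asarray(img, dtype=float)]
    for _ in range(levels - 1):
        smoothed = blur(pyramid[-1])
        pyramid.append(smoothed[::2, ::2])  # keep every second row and column
    return pyramid

image = np.random.default_rng(0).random((64, 48))
for level, layer in enumerate(gaussian_pyramid(image, 4)):
    print(level, layer.shape)
```

Blurring before subsampling is what keeps the halved image from aliasing; dropping the `blur` call would still produce a pyramid of the right shapes, but not a Gaussian pyramid.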

Applications

Image feature point detection

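The Gaussian pyramid concept can be used for edge detection. The Laplacian of Gaussian is an extension of the Gaussian pyramid and acts as a band-pass filter that enhances image edges. It is the result of passing the image through a Gaussian filter and then applying the Laplacian operator: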

$$\nabla ^{2}L=L_{xx}+L_{yy}$$

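However, the Laplacian of Gaussian response depends on the size of the Gaussian function. To remove this dependence, the scale-normalized Laplacian of Gaussian operator can be introduced: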

$$\nabla _{norm}^{2}L(x,y;t)=t(L_{xx}+L_{yy})$$
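To see why the factor $t$ matters, consider as an illustrative example (not taken from the cited sources) an input that is itself a Gaussian blob, $f(x,y)=g(x,y;t_{0})$. Convolving two Gaussians adds their variances, so the response at the blob center can be written in closed form:

```latex
\begin{aligned}
L(x,y;t) &= g(x,y;t_0+t), \\
\nabla^2 L(0,0;t) &= -\frac{1}{\pi (t_0+t)^2}, \\
\nabla^2_{norm} L(0,0;t) &= -\frac{t}{\pi (t_0+t)^2}, \\
\partial_t \nabla^2_{norm} L(0,0;t) &= -\frac{t_0 - t}{\pi (t_0+t)^3} = 0 \iff t = t_0 .
\end{aligned}
```

The magnitude of the unnormalized response only decays as $t$ grows, whereas the normalized response has its extremum at $t=t_{0}$, the scale that matches the size of the blob.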

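Feature points in the image can then be found by searching for points where $\nabla _{norm}^{2}L$ is a local extremum in both space and scale. ^[11] In other words, for an input image $f(x,y)$, building the Gaussian pyramid gives a three-dimensional space made up of the two-dimensional image plane plus one scale dimension, and the points whose response is larger (or smaller) than that of all 26 neighboring points are taken as feature points.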
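The 26-neighbor comparison can be sketched as follows, assuming the responses have already been arranged in an array indexed as (scale, y, x) with every scale level at the same resolution. The function name and the optional `threshold` parameter are hypothetical and exist only for this illustration.

```python
import numpy as np

def scale_space_extrema(stack, threshold=0.0):
    """Return interior (scale, y, x) points that are strictly larger or
    strictly smaller than all 26 neighbours in the 3x3x3 cube around them."""
    stack = np.asarray(stack, dtype=float)
    n_scales, height, width = stack.shape
    found = []
    for s in range(1, n_scales - 1):
        for y in range(1, height - 1):
            for x in range(1, width - 1):
                centre = stack[s, y, x]
                if abs(centre) < threshold:
                    continue  # skip weak responses (optional, hypothetical filter)
                cube = stack[s - 1:s + 2, y - 1:y + 2, x - 1:x + 2].ravel()
                neighbours = np.delete(cube, 13)  # flat index 13 is the centre
                if centre > neighbours.max() or centre < neighbours.min():
                    found.append((s, y, x))
    return found

# One isolated peak in the middle scale level is reported as a feature point.
responses = np.zeros((3, 5, 5))
responses[1, 2, 2] = 2.0
print(scale_space_extrema(responses))
```

The triple loop is written for readability; a practical implementation would vectorize the comparison.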

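To simplify the computation, the diffusion equation above can be substituted into the Laplacian of Gaussian and approximated by a finite difference, which gives the difference of Gaussians operator: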

$$\nabla _{norm}^{2}L(x,y;t)\approx {\frac {t}{\Delta t}}\left(L(x,y;t+\Delta t)-L(x,y;t-\Delta t)\right).$$

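The difference of Gaussians is obtained simply by subtracting images at neighboring scales in scale space. This method is used in the well-known scale-invariant feature transform (SIFT) ^[12], which finds and describes image feature points at different scales in order to match feature points between images.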
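A minimal sketch of the subtraction, assuming a sampled and truncated Gaussian kernel with replicated borders; the helper names are illustrative and are not taken from SIFT or any library:

```python
import numpy as np

def gaussian_blur(f, t, truncate=4.0):
    """Separable convolution with a sampled Gaussian of variance t > 0."""
    radius = max(1, int(np.ceil(truncate * np.sqrt(t))))
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2.0 * t))
    k /= k.sum()
    padded = np.pad(np.asarray(f, dtype=float), radius, mode="edge")
    rows = np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 0, rows)

def difference_of_gaussians(f, t, dt):
    """(t / dt) * (L(t + dt) - L(t - dt)); requires t - dt > 0."""
    return (t / dt) * (gaussian_blur(f, t + dt) - gaussian_blur(f, t - dt))

# A bright spot on a dark background gives a negative response at its centre.
image = np.zeros((41, 41))
image[20, 20] = 1.0
dog = difference_of_gaussians(image, t=4.0, dt=1.0)
print(dog[20, 20])
```

Only two blurred images and one subtraction are needed per scale, which is the computational saving over evaluating the second derivatives $L_{xx}$ and $L_{yy}$ explicitly.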

References

[1] Koenderink, J., "The structure of images", Biological Cybernetics, 50:363–370, 1984.
[2] Lindeberg, T., Scale-Space Theory in Computer Vision, Kluwer Academic Publishers, 1994. ISBN 0-7923-9418-6.
[3] Florack, L., Image Structure, Kluwer Academic Publishers, 1997.
[4] Lindeberg, T., "Scale-space", Encyclopedia of Computer Science and Engineering (Benjamin Wah, ed.), John Wiley and Sons, 2008, IV: 2495–2504. doi:10.1002/9780470050118.ecse609.
[5] Babaud, J., Witkin, A. P., Baudin, M., and Duda, R. O., "Uniqueness of the Gaussian kernel for scale-space filtering", IEEE Transactions on Pattern Analysis and Machine Intelligence, 8(1):26–33, 1986.
[6] Yuille, A., and Poggio, T. A., "Scaling theorems for zero crossings", IEEE Transactions on Pattern Analysis and Machine Intelligence, PAMI-8(1):15–25, 1986.
[7] Lindeberg, T., "Scale-space for discrete signals", IEEE Transactions on Pattern Analysis and Machine Intelligence, 12(3):234–254, 1990.
[8] Pauwels, E., van Gool, L., Fiddelaers, P., and Moons, T., "An extended class of scale-invariant and recursive scale space filters", IEEE Transactions on Pattern Analysis and Machine Intelligence, 17(7):691–701, 1995.
[9] Lindeberg, T., "On the axiomatic foundations of linear scale-space: Combining semi-group structure with causality vs. scale invariance", in J. Sporring et al. (eds.), Gaussian Scale-Space Theory: Proc. PhD School on Scale-Space Theory (Copenhagen, Denmark, May 1996), pp. 75–98, Kluwer Academic Publishers, 1997.
[10] Weickert, J., "Linear scale space has first been proposed in Japan", Journal of Mathematical Imaging and Vision, 10(3):237–252, 1999.
[11] Lindeberg, T., "Feature detection with automatic scale selection", International Journal of Computer Vision, 30(2):77–116, 1998.
[12] Lowe, D. G., "Distinctive image features from scale-invariant keypoints", International Journal of Computer Vision, 60(2):91–110, 2004.
