約翰遜-林登斯特勞斯定理

約翰遜-林登斯特勞斯定理(Johnson–Lindenstrauss theorem)，又稱約翰遜-林登斯特勞斯引理(Johnson–Lindenstrauss lemma)，是由William Johnson（英語：William_B._Johnson_(mathematician)）和Joram Lindenstrauss（英語：Joram_Lindenstrauss）於1984年提出的一個關於降維的著名定理^[1]，在現代機器學習，尤其是壓縮感知、降維、形狀分析（英語：Nonlinear_dimensionality_reduction#Manifold_learning_algorithms）和分佈學習（英語：Density_estimation）等領域中有很重要的應用^[2]^[3]^[4]^[5]。

這個定理告訴我們，一個高維空間中的點集，可以被線性地鑲嵌到低維空間中，同時其空間結構只遭受比較小的形變^[6]。約翰遜-林登斯特勞斯定理的證明，還說明了如何用隨機投影法（英語：Random_projection）明確地求出這個變換，所用的算法只需要隨機多項式時間^[7]。當然，降維不是免費的：如果要求形變很少的話，代價（英語：trade-off）是被嵌入的低維空間維數不能很低；反之亦然，如果要求將點集嵌入很低維的空間，那麼就不能很好地控制結構形變的程度。

因為能將維數下降到樣本量的對數階，更兼所用的變換是線性的、顯式的還可以被快速計算，約翰遜-林登斯特勞斯定理是一個力度非常強的結論。

定理陳述[編輯]

對任何給定的 $\epsilon \in (0,1)$ 以及 $N$ 維歐幾里德空間中的 $m$ 個點 $\{x_{1},\ldots ,x_{m}\}$ ，對於任意滿足條件 $n>(\epsilon ^{2}/2-\epsilon ^{3}/3)^{-1}\log m$ 的正整數 $n$ ，存在一個線性映射 $f:\mathbb {R} ^{N}\to \mathbb {R} ^{n}$ ，將這 $m$ 個點，從 $\mathbb {R} ^{N}$ (維數可能很高的空間)中映射到 $\mathbb {R} ^{n}$ (低維空間)中，同時「基本上」保持了點集成員兩兩之間的距離，即：對於任意兩個點 $(x_{i},x_{j}):1\leq i<j\leq m$ ，都有

(1-\epsilon )\|x_{i}-x_{j}\|_{2}^{2}\leq \|f(x_{i})-f(x_{j})\|_{2}^{2}\leq (1+\epsilon )\|x_{i}-x_{j}\|_{2}^{2}

更進一步地，這個線性映射 $f$ 還可以在隨機多項式時間內求出^[7]。

直觀理解[編輯]

約翰遜-林登斯特勞斯定理揭示了一些關於降維映射深刻事實，其中一些還是違反簡單直覺的^[8]。因此，要想直觀地理解這個定理，對初學者來說，可能比從數學式子上讀懂證明還要難(反而此定理的證明只用到了比較簡單的關於投影的隨機誤差不等式（英語：Concentration of measure）^[9])。舉例來說，定理的結論表明，度量形變程度的誤差參數 $\epsilon$ 以及低維空間的維數 $n$ 這兩個度量降維水準的關鍵量，均與原始數據所在的空間維數 $N$ 或者原始的 $m$ 個點具體為何種空間結構無關，這是很不平凡的^[9]。

最優性[編輯]

約翰遜-林登斯特勞斯定理是不能被本質性地改進的^[10]，即：給定任意正整數 $m$ 和誤差參數 $\epsilon$ ，存在某個 $N$ 以及 $\mathbb {R} ^{N}$ 中的 $m$ 個點，這個點集「難以降維」——也就是說，對任何一個滿足「基本保持點距」要求(即： $(1-\epsilon )\|x_{i}-x_{j}\|_{2}^{2}\leq \|f(x_{i})-f(x_{j})\|_{2}^{2}\leq (1+\epsilon )\|x_{i}-x_{j}\|_{2}^{2}$ 要對任意 $(i,j):1\leq i<j\leq n$ 成立)的線性映射 $f:\mathbb {R} ^{N}\to \mathbb {R} ^{n}$ ，它用來鑲嵌高維數據的那個低維空間(即 $\mathbb {R} ^{n}$ )，至少必須具有

n=\Omega \left({\frac {\log m}{\epsilon ^{2}}}\right)

這麼多的維數^[11]。

證明提要[編輯]

定理可以用高年級本科生容易理解的方法證明^[7]^[12]，其思路是證明如下事實：多次獨立地重複進行隨機投影的試驗，每次試驗中隨機抽取的投影 $f_{0}:\mathbb {R} ^{N}\to \mathbb {R} ^{n}$ 都有至少 $1/n$ 的概率符合定理中對映射 $f$ 全部的要求(顯然，驗證任何一個 $f_{0}$ 是否符合這些要求只需 $O(n^{2})$ 時間)，因此只要重複該獨立實驗 $\omega (n)$ 次就能以逼近100%的概率產生至少一個符合要求的映射 $f$ 。

參考文獻[編輯]

^ Johnson, William B; Lindenstrauss, Joram. Extensions of Lipschitz mappings into a Hilbert space. Contemporary mathematics. 1984, 26 (1): 189-206.
^ Devdatt Dubhashi. Johnson-Lindenstrauss, Concentration and applications to Support Vector Machines and Kernels (PDF). simons.berkeley.edu. [2018-11-13]. （原始內容存檔 (PDF)於2020-10-27）.
^ Suresh Venkat. When to use the Johnson-Lindenstrauss lemma over SVD?. cstheory.stackexchange.com. [2018-11-13]. （原始內容存檔於2020-11-25）.
^ Krahmer, Felix; Ward, Rachel. New and improved Johnson--Lindenstrauss embeddings via the restricted isometry property. SIAM Journal on Mathematical Analysis. 2011, 43 (3): 1269--1281.
^ Akselrod-Ballin, Ayelet; Bock, Davi and Reid, R Clay and Warfield, Simon K. Accelerating image registration with the Johnson--Lindenstrauss lemma: application to imaging 3-D neural ultrastructure with electron microscopy. IEEE transactions on medical imaging. 2011, 30 (7): 1427--1438.
^ Paul Beame. CSE 522: Sublinear (and Streaming) Algorithms Spring 2014 Lecture 10: Johnson-Lindenstrauss Lemmas (PDF). courses.cs.washington.edu. [2018-11-13]. （原始內容存檔 (PDF)於2017-04-01）.
^ ^7.0 ^7.1 ^7.2 Dasgupta, Sanjoy; Gupta, Anupam. An elementary proof of a theorem of Johnson and Lindenstrauss. Random Structures & Algorithms. 2003, 22 (1): 60--65.
^ Sariel Har-Peled. The Johnson-Lindenstrauss Lemma (PDF). sarielhp.org. [2018-11-13].
^ ^9.0 ^9.1 Roman Vershynin. High-dimensional probability: An introduction with applications in data science. Cambridge University Press. 2018-08-01 [2018-11-13]. ISBN 9781108415194.
^ Alon, Noga. Problems and results in extremal combinatorics—I. Discrete Mathematics. 2003, 273 (1-3): 31--53.
^ Larsen, Kasper Green; Nelson, Jelani. Optimality of the johnson-lindenstrauss lemma. 2017 IEEE 58th Annual Symposium on Foundations of Computer Science (FOCS). IEEE: 633––638. 2017.
^ Michael Mahoney. CS369M: Algorithms for Modern Massive Data Set Analysis Lecture 1: The Johnson-Lindenstrauss Lemma (PDF). cs.stanford.edu. [2018-11-13]. （原始內容存檔 (PDF)於2015-12-13）.

[JL_paper-1] Johnson, William B; Lindenstrauss, Joram. Extensions of Lipschitz mappings into a Hilbert space. Contemporary mathematics. 1984, 26 (1): 189-206.

[slides_JL-SVM-Kernels-2] Devdatt Dubhashi. Johnson-Lindenstrauss, Concentration and applications to Support Vector Machines and Kernels (PDF). simons.berkeley.edu. [2018-11-13]. （原始內容存檔 (PDF)於2020-10-27）.

[JL-application_1-3] Suresh Venkat. When to use the Johnson-Lindenstrauss lemma over SVD?. cstheory.stackexchange.com. [2018-11-13]. （原始內容存檔於2020-11-25）.

[JL-application_2-4] Krahmer, Felix; Ward, Rachel. New and improved Johnson--Lindenstrauss embeddings via the restricted isometry property. SIAM Journal on Mathematical Analysis. 2011, 43 (3): 1269--1281.

[JL-application_3-5] Akselrod-Ballin, Ayelet; Bock, Davi and Reid, R Clay and Warfield, Simon K. Accelerating image registration with the Johnson--Lindenstrauss lemma: application to imaging 3-D neural ultrastructure with electron microscopy. IEEE transactions on medical imaging. 2011, 30 (7): 1427--1438.

[slides_JL-intro-6] Paul Beame. CSE 522: Sublinear (and Streaming) Algorithms Spring 2014 Lecture 10: Johnson-Lindenstrauss Lemmas (PDF). courses.cs.washington.edu. [2018-11-13]. （原始內容存檔 (PDF)於2017-04-01）.

[JL_proof-7] 7.0 ^7.1 ^7.2 Dasgupta, Sanjoy; Gupta, Anupam. An elementary proof of a theorem of Johnson and Lindenstrauss. Random Structures & Algorithms. 2003, 22 (1): 60--65.

[JL-understanding-8] Sariel Har-Peled. The Johnson-Lindenstrauss Lemma (PDF). sarielhp.org. [2018-11-13].

[RV-book-9] 9.0 ^9.1 Roman Vershynin. High-dimensional probability: An introduction with applications in data science. Cambridge University Press. 2018-08-01 [2018-11-13]. ISBN 9781108415194.

[Alon_2003-10] Alon, Noga. Problems and results in extremal combinatorics—I. Discrete Mathematics. 2003, 273 (1-3): 31--53.

[Optimality-2017-11] Larsen, Kasper Green; Nelson, Jelani. Optimality of the johnson-lindenstrauss lemma. 2017 IEEE 58th Annual Symposium on Foundations of Computer Science (FOCS). IEEE: 633––638. 2017.

[JL_proof2-12] Michael Mahoney. CS369M: Algorithms for Modern Massive Data Set Analysis Lecture 1: The Johnson-Lindenstrauss Lemma (PDF). cs.stanford.edu. [2018-11-13]. （原始內容存檔 (PDF)於2015-12-13）.

[1]

[2]

[3]

[4]

[5]

[6]

[7]

[8]

[9]

[10]

[11]

[12]