逐次超鬆弛迭代法

數值線性代數中，逐次超鬆弛（successive over-relaxation，SOR）迭代法是高斯-賽德爾迭代的一種變體，用於求解線性方程組。類似方法也可用於任何緩慢收斂的迭代過程。

SOR迭代法由David M. Young Jr.和Stanley P. Frankel在1950年同時獨立提出，目的是在計算機上自動求解線性方程組。之前，人們已經為計算員的計算開發過超鬆弛法，如路易斯·弗萊·理查德森的方法以及R. V. Southwell開發的方法。但這些方法需要一定專業知識確保求解的收斂，不適用於計算機編程。David M. Young Jr.的論文對這些方面進行了探討。^[1]

形式化

給定n個線性方程組成的方系統：

A\mathbf {x} =\mathbf {b}

其中

A={\begin{bmatrix}a_{11}&a_{12}&\cdots &a_{1n}\\a_{21}&a_{22}&\cdots &a_{2n}\\\vdots &\vdots &\ddots &\vdots \\a_{n1}&a_{n2}&\cdots &a_{nn}\end{bmatrix}},\qquad \mathbf {x} ={\begin{bmatrix}x_{1}\\x_{2}\\\vdots \\x_{n}\end{bmatrix}},\qquad \mathbf {b} ={\begin{bmatrix}b_{1}\\b_{2}\\\vdots \\b_{n}\end{bmatrix}}.

則A可分解為對角矩陣D、嚴格上下三角矩陣U、L：

A=D+L+U,

其中

D={\begin{bmatrix}a_{11}&0&\cdots &0\\0&a_{22}&\cdots &0\\\vdots &\vdots &\ddots &\vdots \\0&0&\cdots &a_{nn}\end{bmatrix}},\quad L={\begin{bmatrix}0&0&\cdots &0\\a_{21}&0&\cdots &0\\\vdots &\vdots &\ddots &\vdots \\a_{n1}&a_{n2}&\cdots &0\end{bmatrix}},\quad U={\begin{bmatrix}0&a_{12}&\cdots &a_{1n}\\0&0&\cdots &a_{2n}\\\vdots &\vdots &\ddots &\vdots \\0&0&\cdots &0\end{bmatrix}}.

線性方程組可重寫為

(D+\omega L)\mathbf {x} =\omega \mathbf {b} -[\omega U+(\omega -1)D]\mathbf {x}

其中 $\omega >1$ 是常數，稱作鬆弛因子（relaxation factor）。

逐次超鬆弛迭代法可以通過迭代逼近x的精確解，可分析地寫作

\mathbf {x} ^{(k+1)}=(D+\omega L)^{-1}{\big (}\omega \mathbf {b} -[\omega U+(\omega -1)D]\mathbf {x} ^{(k)}{\big )}=L_{\omega }\mathbf {x} ^{(k)}+\mathbf {c} ,

其中 $\mathbf {x} ^{(k)}$ 是 $\mathbf {x}$ 的第k次迭代值， $\mathbf {x} ^{(k+1)}$ 是 $\mathbf {x}$ 下一次迭代所得的值。利用 $(D+\omega L)$ 的三角形，可用向前替換法依次計算 $\mathbf {x} ^{(k+1)}$ 的元素：

x_{i}^{(k+1)}=(1-\omega )x_{i}^{(k)}+{\frac {\omega }{a_{ii}}}\left(b_{i}-\sum _{j<i}a_{ij}x_{j}^{(k+1)}-\sum _{j>i}a_{ij}x_{j}^{(k)}\right),\quad i=1,2,\ldots ,n.

收斂性

鬆弛因子 $\omega$ 的選擇並不容易，取決於係數矩陣的性質。1947年，亞歷山大·馬雅科維奇·奧斯特洛夫斯基證明，若A是對稱正定矩陣，則 $\forall 0<\omega <2,\ \rho (L_{\omega })<1$ 。因此，迭代過程將收斂，但更高的收斂速度更有價值。

收斂速度

SOR法的收斂速度可通過分析得出。假設^[2]^[3]

鬆弛因子適當： $\omega \in (0,2)$
雅可比法迭代矩陣 $C_{\text{Jac}}:=I-D^{-1}A$ 只有實特徵值
雅可比法收斂： $\mu :=\rho (C_{\text{Jac}})<1$
矩陣分解 $A=D+L+U$ 滿足 $\forall z\in \mathbb {C} \setminus \{0\},\ \lambda \in \mathbb {C} ,\ \operatorname {det} (\lambda D+zL+{\tfrac {1}{z}}U)=\operatorname {det} (\lambda D+L+U)$

則收斂速度可表為

\rho (C_{\omega })={\begin{cases}{\frac {1}{4}}\left(\omega \mu +{\sqrt {\omega ^{2}\mu ^{2}-4(\omega -1)}}\right)^{2}\,,&0<\omega \leq \omega _{\text{opt}}\\\omega -1\,,&\omega _{\text{opt}}<\omega <2\end{cases}}

最優鬆弛因子是

\omega _{\text{opt}}:=1+\left({\frac {\mu }{1+{\sqrt {1-\mu ^{2}}}}}\right)^{2}=1+{\frac {\mu ^{2}}{4}}+O(\mu ^{3})\,.

特別地， $\omega =1$ 時SOR法即退化為高斯-賽德爾迭代，有 $\rho (C_{\omega })=\mu ^{2}=\rho (C_{\text{Jac}})^{2}$ 。對最優的 $\omega$ ，有 $\rho (C_{\omega })={\frac {1-{\sqrt {1-\mu ^{2}}}}{1+{\sqrt {1-\mu ^{2}}}}}={\frac {\mu ^{2}}{4}}+O(\mu ^{3})$ ，表明SOR法的效率約是高斯-賽德爾迭代的4倍。

最後一條假設對三對角矩陣也滿足，因為 $Z(\lambda D+L+U)Z^{-1}=\lambda D+zL+{\tfrac {1}{z}}U$ 對對角陣Z，其元素 $Z_{ii}=z^{i-1}$ 、 $\operatorname {det} (\lambda D+L+U)=\operatorname {det} (Z(\lambda D+L+U)Z^{-1})$ 。

算法

由於此算法中，元素可在迭代過程中被覆蓋，所以只需一個存儲向量，不需要向量索引。

输入： $A$ ,  $b$ ,  $ω$ 
输出： $φ$ 

选择初始解 $φ$ 
repeat until convergence
    for  $i$  from 1 until  $n$  do
        set  $σ$  to 0
        for  $j$  from 1 until  $n$  do
            if  $j$  ≠  $i$  then
                set  $σ$  to  $σ + a ij φ j$ 
            end if
        end ( $j$ -loop)
        set  $φ i$  to  $(1 - ω) φ i + ω (b i - σ) / a ii$ 
    end ( $i$ -loop)
    check if convergence is reached
end (repeat)

注意： $(1-\omega )\phi _{i}+{\frac {\omega }{a_{ii}}}(b_{i}-\sigma )$ 也可寫作 $\phi _{i}+\omega \left({\frac {b_{i}-\sigma }{a_{ii}}}-\phi _{i}\right)$ ，這樣每次外層for循環可以省去一次乘法。

例子

解線性方程組

{\begin{aligned}4x_{1}-x_{2}-6x_{3}+0x_{4}&=2,\\-5x_{1}-4x_{2}+10x_{3}+8x_{4}&=21,\\0x_{1}+9x_{2}+4x_{3}-2x_{4}&=-12,\\1x_{1}+0x_{2}-7x_{3}+5x_{4}&=-6.\end{aligned}}

擇鬆弛因子 $\omega =0.5$ 與初始解 $\phi =(0,0,0,0)$ 。由SOR算法可得下表，在38步取得精確解(3, −2, 2, 1)。

迭代	$x_{1}$	$x_{2}$	$x_{3}$	$x_{4}$
01	0.25	−2.78125	1.6289062	0.5152344
02	1.2490234	−2.2448974	1.9687712	0.9108547
03	2.070478	−1.6696789	1.5904881	0.76172125
...	...	...	...	...
37	2.9999998	−2.0	2.0	1.0
38	3.0	−2.0	2.0	1.0

用Common Lisp的簡單實現：

;; 默认浮点格式设为long-float，以确保在更大范围数字上正确运行
(setf *read-default-float-format* 'long-float)

(defparameter +MAXIMUM-NUMBER-OF-ITERATIONS+ 100
  "The number of iterations beyond which the algorithm should cease its
   operation, regardless of its current solution. A higher number of
   iterations might provide a more accurate result, but imposes higher
   performance requirements.")

(declaim (type (integer 0 *) +MAXIMUM-NUMBER-OF-ITERATIONS+))

(defun get-errors (computed-solution exact-solution)
  "For each component of the COMPUTED-SOLUTION vector, retrieves its
   error with respect to the expected EXACT-SOLUTION vector, returning a
   vector of error values.
   ---
   While both input vectors should be equal in size, this condition is
   not checked and the shortest of the twain determines the output
   vector's number of elements.
   ---
   The established formula is the following:
     Let resultVectorSize = min(computedSolution.length, exactSolution.length)
     Let resultVector     = new vector of resultVectorSize
     For i from 0 to (resultVectorSize - 1)
       resultVector[i] = exactSolution[i] - computedSolution[i]
     Return resultVector"
  (declare (type (vector number *) computed-solution))
  (declare (type (vector number *) exact-solution))
  (map '(vector number *) #'- exact-solution computed-solution))

(defun is-convergent (errors &key (error-tolerance 0.001))
  "Checks whether the convergence is reached with respect to the
   ERRORS vector which registers the discrepancy betwixt the computed
   and the exact solution vector.
   ---
   The convergence is fulfilled if and only if each absolute error
   component is less than or equal to the ERROR-TOLERANCE, that is:
   For all e in ERRORS, it holds: abs(e) <= errorTolerance."
  (declare (type (vector number *) errors))
  (declare (type number            error-tolerance))
  (flet ((error-is-acceptable (error)
          (declare (type number error))
          (<= (abs error) error-tolerance)))
    (every #'error-is-acceptable errors)))

(defun make-zero-vector (size)
  "Creates and returns a vector of the SIZE with all elements set to 0."
  (declare (type (integer 0 *) size))
  (make-array size :initial-element 0.0 :element-type 'number))

(defun successive-over-relaxation (A b omega
                                   &key (phi (make-zero-vector (length b)))
                                        (convergence-check
                                          #'(lambda (iteration phi)
                                              (declare (ignore phi))
                                              (>= iteration +MAXIMUM-NUMBER-OF-ITERATIONS+))))
  "Implements the successive over-relaxation (SOR) method, applied upon
   the linear equations defined by the matrix A and the right-hand side
   vector B, employing the relaxation factor OMEGA, returning the
   calculated solution vector.
   ---
   The first algorithm step, the choice of an initial guess PHI, is
   represented by the optional keyword parameter PHI, which defaults
   to a zero-vector of the same structure as B. If supplied, this
   vector will be destructively modified. In any case, the PHI vector
   constitutes the function's result value.
   ---
   The terminating condition is implemented by the CONVERGENCE-CHECK,
   an optional predicate
     lambda(iteration phi) => generalized-boolean
   which returns T, signifying the immediate termination, upon achieving
   convergence, or NIL, signaling continuant operation, otherwise. In
   its default configuration, the CONVERGENCE-CHECK simply abides the
   iteration's ascension to the ``+MAXIMUM-NUMBER-OF-ITERATIONS+'',
   ignoring the achieved accuracy of the vector PHI."
  (declare (type (array  number (* *)) A))
  (declare (type (vector number *)     b))
  (declare (type number                omega))
  (declare (type (vector number *)     phi))
  (declare (type (function ((integer 1 *)
                            (vector number *))
                           *)
                 convergence-check))
  (let ((n (array-dimension A 0)))
    (declare (type (integer 0 *) n))
    (loop for iteration from 1 by 1 do
      (loop for i from 0 below n by 1 do
        (let ((rho 0))
          (declare (type number rho))
          (loop for j from 0 below n by 1 do
            (when (/= j i)
              (let ((a[ij]  (aref A i j))
                    (phi[j] (aref phi j)))
                (incf rho (* a[ij] phi[j])))))
          (setf (aref phi i)
                (+ (* (- 1 omega)
                      (aref phi i))
                   (* (/ omega (aref A i i))
                      (- (aref b i) rho))))))
      (format T "~&~d. solution = ~a" iteration phi)
      ;; Check if convergence is reached.
      (when (funcall convergence-check iteration phi)
        (return))))
  (the (vector number *) phi))

;; Summon the function with the exemplary parameters.
(let ((A              (make-array (list 4 4)
                        :initial-contents
                        '((  4  -1  -6   0 )
                          ( -5  -4  10   8 )
                          (  0   9   4  -2 )
                          (  1   0  -7   5 ))))
      (b              (vector 2 21 -12 -6))
      (omega          0.5)
      (exact-solution (vector 3 -2 2 1)))
  (successive-over-relaxation
    A b omega
    :convergence-check
    #'(lambda (iteration phi)
        (declare (type (integer 0 *)     iteration))
        (declare (type (vector number *) phi))
        (let ((errors (get-errors phi exact-solution)))
          (declare (type (vector number *) errors))
          (format T "~&~d. errors   = ~a" iteration errors)
          (or (is-convergent errors :error-tolerance 0.0)
              (>= iteration +MAXIMUM-NUMBER-OF-ITERATIONS+))))))

上述偽代碼的簡單Python實現。

import numpy as np
from scipy import linalg

def sor_solver(A, b, omega, initial_guess, convergence_criteria):
    """
    This is an implementation of the pseudo-code provided in the Wikipedia article.
    Arguments:
        A: nxn numpy matrix.
        b: n dimensional numpy vector.
        omega: relaxation factor.
        initial_guess: An initial solution guess for the solver to start with.
        convergence_criteria: The maximum discrepancy acceptable to regard the current solution as fitting.
    Returns:
        phi: solution vector of dimension n.
    """
    step = 0
    phi = initial_guess[:]
    residual = linalg.norm(A @ phi - b)  # Initial residual
    while residual > convergence_criteria:
        for i in range(A.shape[0]):
            sigma = 0
            for j in range(A.shape[1]):
                if j != i:
                    sigma += A[i, j] * phi[j]
            phi[i] = (1 - omega) * phi[i] + (omega / A[i, i]) * (b[i] - sigma)
        residual = linalg.norm(A @ phi - b)
        step += 1
        print("Step {} Residual: {:10.6g}".format(step, residual))
    return phi

# An example case that mirrors the one in the Wikipedia article
residual_convergence = 1e-8
omega = 0.5  # Relaxation factor

A = np.array([[4, -1, -6, 0],
              [-5, -4, 10, 8],
              [0, 9, 4, -2],
              [1, 0, -7, 5]])

b = np.array([2, 21, -12, -6])

initial_guess = np.zeros(4)

phi = sor_solver(A, b, omega, initial_guess, residual_convergence)
print(phi)

對稱逐次超鬆弛

對對稱矩陣A，其中

U=L^{T},\,

有對稱逐次超鬆弛迭代法（SSOR）：

P=\left({\frac {D}{\omega }}+L\right){\frac {\omega }{2-\omega }}D^{-1}\left({\frac {D}{\omega }}+U\right),

迭代法為

\mathbf {x} ^{k+1}=\mathbf {x} ^{k}-\gamma ^{k}P^{-1}(A\mathbf {x} ^{k}-\mathbf {b} ),\ k\geq 0.

SOR與SSOR法都來自David M. Young Jr.

其他應用

任何迭代法都可應用相似技巧。若原迭代的形式為

x_{n+1}=f(x_{n})

則可將其改為

x_{n+1}^{\mathrm {SOR} }=(1-\omega )x_{n}^{\mathrm {SOR} }+\omega f(x_{n}^{\mathrm {SOR} }).

但若將x視作完整向量，則上述解線性方程組的公式不是這種公式的特例。此公式基礎上，計算下一個向量的公式是

\mathbf {x} ^{(k+1)}=(1-\omega )\mathbf {x} ^{(k)}+\omega L_{*}^{-1}(\mathbf {b} -U\mathbf {x} ^{(k)}),

其中 $L_{*}=L+D$ 。 $\omega >1$ 用於加快收斂速度， $\omega <1$ 可使發散的迭代收斂或加快過調（overshoot）過程的收斂。有多種方法可根據觀察到的收斂過程行為，自適應地調整鬆弛因子 $\omega$ 。這些方法通常只對一部分問題有效。

另見

註釋

^ Young, David M., Iterative methods for solving partial difference equations of elliptical type (PDF), PhD thesis, Harvard University, 1950-05-01 [2009-06-15]
^ Hackbusch, Wolfgang. 4.6.2. Iterative Solution of Large Sparse Systems of Equations | SpringerLink. Applied Mathematical Sciences 95. 2016. ISBN 978-3-319-28481-1. doi:10.1007/978-3-319-28483-5 （英國英語）.
^ Greenbaum, Anne. 10.1. Iterative Methods for Solving Linear Systems. Frontiers in Applied Mathematics 17. 1997. ISBN 978-0-89871-396-1. doi:10.1137/1.9781611970937 （英國英語）.

參考文獻

Abraham Berman, Robert J. Plemmons, Nonnegative Matrices in the Mathematical Sciences, 1994, SIAM. ISBN 0-89871-321-8.
Black, Noel. Successive Overrelaxation Method. MathWorld.
A. Hadjidimos, Successive overrelaxation (SOR) and related methods, Journal of Computational and Applied Mathematics 123 (2000), 177–199.
Yousef Saad, Iterative Methods for Sparse Linear Systems, 1st edition, PWS, 1996.
Netlib's copy of "Templates for the Solution of Linear Systems", by Barrett et al.
Richard S. Varga 2002 Matrix Iterative Analysis, Second ed. (of 1962 Prentice Hall edition), Springer-Verlag.
David M. Young Jr. Iterative Solution of Large Linear Systems, Academic Press, 1971. (reprinted by Dover, 2003)

外部連結

Module for the SOR Method
Tridiagonal linear system solver based on SOR, in C++

[1] Young, David M., Iterative methods for solving partial difference equations of elliptical type (PDF), PhD thesis, Harvard University, 1950-05-01 [2009-06-15]

[2] Hackbusch, Wolfgang. 4.6.2. Iterative Solution of Large Sparse Systems of Equations | SpringerLink. Applied Mathematical Sciences 95. 2016. ISBN 978-3-319-28481-1. doi:10.1007/978-3-319-28483-5 （英國英語）.

[3] Greenbaum, Anne. 10.1. Iterative Methods for Solving Linear Systems. Frontiers in Applied Mathematics 17. 1997. ISBN 978-0-89871-396-1. doi:10.1137/1.9781611970937 （英國英語）.

[1]

[2]

[3]