# Acoustic model

## Output probability

${\displaystyle G(x)=\sum _{i=1}^{n}w_{i}\cdot G_{i}(x)}$

${\displaystyle p(x|\lambda )=\sum _{i=1}^{M}w_{i}p_{i}(x)}$

${\displaystyle p_{i}(x)={\frac {1}{(2\pi )^{D/2}|\Sigma _{i}|^{1/2}}}\exp \left\{-{\frac {1}{2}}(x-\mu _{i})'\Sigma _{i}^{-1}(x-\mu _{i})\right\}}$

${\displaystyle \lambda =\left\{w_{i},\mu _{i},\Sigma _{i}\right\}\quad \quad i=1,2,\cdots ,M}$
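As an illustration of the mixture density above, the following is a minimal sketch rather than a reference implementation. It assumes every covariance matrix $\Sigma_i$ is diagonal, so that $|\Sigma_i|$ is the product of the per-dimension variances and no matrix inversion is needed. The function names `gaussian_pdf_diag` and `gmm_pdf` are hypothetical and chosen only for this example.

```python
import math

def gaussian_pdf_diag(x, mean, var):
    """Density p_i(x) of a D-dimensional Gaussian.

    Assumption (not from the article): the covariance is diagonal and
    `var` holds its diagonal entries, all strictly positive.
    """
    d = len(x)
    det = math.prod(var)  # |Sigma_i| for a diagonal covariance
    # Quadratic form (x - mu)' Sigma^{-1} (x - mu) for a diagonal Sigma
    quad = sum((xi - mi) ** 2 / vi for xi, mi, vi in zip(x, mean, var))
    norm = (2.0 * math.pi) ** (d / 2.0) * math.sqrt(det)
    return math.exp(-0.5 * quad) / norm

def gmm_pdf(x, weights, means, variances):
    """Mixture density p(x | lambda) = sum_i w_i * p_i(x)."""
    return sum(
        w * gaussian_pdf_diag(x, m, v)
        for w, m, v in zip(weights, means, variances)
    )

# Example call: a two-component mixture in two dimensions.
density = gmm_pdf(
    [0.5, -0.2],
    weights=[0.3, 0.7],
    means=[[0.0, 0.0], [1.0, 1.0]],
    variances=[[1.0, 1.0], [0.5, 2.0]],
)
```

The weights are expected to be non-negative and to sum to one; the sketch does not check this.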

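The main problem for a GMM is training, that is, parameter estimation: the parameters are chosen so that the model matches the training data as closely as possible. Several estimation methods exist for GMM parameters; the most widely used is maximum likelihood estimation (MLE).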

${\displaystyle p(O|\lambda )=\prod _{t=1}^{T}p(O_{t}|\lambda )}$
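In practice the product over $T$ frames underflows quickly, so the likelihood is usually evaluated as a sum of logarithms. The sketch below shows one way to do this, assuming diagonal covariances and strictly positive weights, with each observation $O_t$ given as a list of floats. The names `log_gaussian_diag` and `gmm_log_likelihood` are hypothetical.

```python
import math

def log_gaussian_diag(x, mean, var):
    """log p_i(x) for a Gaussian with diagonal covariance (an assumption)."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )

def gmm_log_likelihood(data, weights, means, variances):
    """log p(O | lambda) = sum_t log sum_i w_i * p_i(O_t)."""
    total = 0.0
    for x in data:
        # log(w_i * p_i(x)) for every component; requires w_i > 0
        terms = [
            math.log(w) + log_gaussian_diag(x, m, v)
            for w, m, v in zip(weights, means, variances)
        ]
        # Log-sum-exp: subtract the largest term before exponentiating
        peak = max(terms)
        total += peak + math.log(sum(math.exp(t - peak) for t in terms))
    return total
```

Maximizing this log-likelihood is equivalent to maximizing $p(O|\lambda)$ itself, because the logarithm is monotonic.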

${\displaystyle {\hat {w}}_{i}={\frac {1}{T}}\sum _{t=1}^{T}p(i|x_{t},\lambda )}$

${\displaystyle {\hat {\mu }}_{i}={\frac {\sum _{t=1}^{T}p(i|x_{t},\lambda )x_{t}}{\sum _{t=1}^{T}p(i|x_{t},\lambda )}}}$

${\displaystyle {\hat {\sigma }}_{i}^{2}={\frac {\sum _{t=1}^{T}p(i|x_{t},\lambda )x_{t}^{2}}{\sum _{t=1}^{T}p(i|x_{t},\lambda )}}-{\hat {\mu }}_{i}^{2}}$

${\displaystyle p(i|x_{t},\lambda )={\frac {w_{i}p_{i}(x_{t})}{\sum _{k=1}^{M}w_{k}p_{k}(x_{t})}}}$
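The re-estimation formulas and the posterior above can be combined into one iteration of an expectation-maximization style update. The following is a minimal sketch for scalar observations, not a production implementation. The name `em_step` and the `var_floor` parameter are assumptions added for this example; the floor only keeps a variance from collapsing to zero and is not part of the formulas above.

```python
import math

def normal_pdf(x, mean, var):
    """One-dimensional Gaussian density p_i(x)."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def em_step(data, weights, means, variances, var_floor=1e-6):
    """One re-estimation pass over scalar observations x_1..x_T.

    Assumes every x_t has non-zero density under the current model and
    every component receives some posterior mass (no divide-by-zero guard).
    """
    T, M = len(data), len(weights)

    # Posterior p(i | x_t, lambda) for every frame t and component i
    post = []
    for x in data:
        joint = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
        total = sum(joint)
        post.append([j / total for j in joint])

    new_w, new_mu, new_var = [], [], []
    for i in range(M):
        occ = sum(post[t][i] for t in range(T))  # soft count of component i
        mu = sum(post[t][i] * data[t] for t in range(T)) / occ
        second = sum(post[t][i] * data[t] ** 2 for t in range(T)) / occ
        new_w.append(occ / T)
        new_mu.append(mu)
        # var_floor is a hypothetical safeguard, not part of the formulas
        new_var.append(max(second - mu * mu, var_floor))
    return new_w, new_mu, new_var

# Example: repeat the update a fixed number of times on toy data.
samples = [-2.1, -1.9, -2.0, 1.9, 2.0, 2.1]
w, mu, var = [0.5, 0.5], [-1.0, 1.0], [1.0, 1.0]
for _ in range(20):
    w, mu, var = em_step(samples, w, mu, var)
```

A fixed iteration count is used here for brevity; a common alternative is to stop once the log-likelihood no longer increases noticeably.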

## References

1. L. R. Rabiner, “A tutorial on hidden Markov models and selected applications in speech recognition”, Proceedings of the IEEE, vol. 77, pp. 257–287, 1989.
2. D. A. Reynolds and R. C. Rose, “Robust text-independent speaker identification using Gaussian mixture speaker models”, IEEE Transactions on Speech and Audio Processing, vol. 3, pp. 72–83, 1995.
3. K. F. Lee, Large-vocabulary speaker independent continuous speech recognition, the Sphinx system, Ph.D. thesis, Carnegie Mellon University, 1988.