# Acoustic Model

## Output Probability

$G(x) = \sum_{i=1}^{n}w_i\cdot G_i(x)$

$p(x|\lambda) = \sum_{i=1}^{M}w_ip_i(x)$

$p_i(x)=\frac{1}{(2\pi)^{D/2}|\Sigma_i|^{1/2}} \exp\left\{-\frac{1}{2}(x-\mu_i)'\Sigma_i^{-1}(x-\mu_i)\right\}$

$\lambda=\left\{w_i,\mu_i,\Sigma_i\right\} \quad\quad i=1,2,\cdots,M$
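The mixture density can be evaluated numerically as in the minimal sketch below. The function names `gaussian_pdf` and `gmm_pdf` are hypothetical and chosen only for illustration; the code assumes full covariance matrices $\Sigma_i$ that are symmetric positive definite.

```python
import numpy as np

def gaussian_pdf(x, mu, sigma):
    """Density of a D-dimensional Gaussian N(mu, sigma) at a single point x."""
    d = x.shape[0]
    diff = x - mu
    # Normalizer (2*pi)^(D/2) * |Sigma|^(1/2)
    norm = (2.0 * np.pi) ** (d / 2.0) * np.sqrt(np.linalg.det(sigma))
    # Quadratic form (x - mu)' Sigma^{-1} (x - mu), without forming the inverse
    quad = diff @ np.linalg.solve(sigma, diff)
    return np.exp(-0.5 * quad) / norm

def gmm_pdf(x, weights, means, covs):
    """Mixture density p(x | lambda) = sum_i w_i * p_i(x)."""
    return sum(w * gaussian_pdf(x, m, s)
               for w, m, s in zip(weights, means, covs))
```

For high-dimensional features or very small densities, working with log-densities is usually safer than multiplying raw probabilities.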

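The main problem with a GMM is training, that is, parameter estimation: choosing the parameters so that the GMM matches the training data as closely as possible. Several parameter estimation methods exist for GMMs; the most widely used is based on the maximum likelihood criterion (Maximum Likelihood Estimation, MLE).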

For a training sequence $O=\{x_1,x_2,\cdots,x_T\}$ of independent observations, the likelihood is

$p(O|\lambda) = \prod_{t=1}^{T}p(x_t|\lambda)$
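In practice the product over frames underflows quickly, so the logarithm of the likelihood is usually computed instead. The sketch below is one way to do this; it assumes diagonal covariances (stored as per-dimension variances), which is a simplification not stated in the text, and the name `log_likelihood` is hypothetical.

```python
import numpy as np

def log_likelihood(X, weights, means, variances):
    """log p(O | lambda) for a diagonal-covariance GMM.

    X: (T, D) observations; weights: (M,); means, variances: (M, D).
    """
    X = np.asarray(X, dtype=float)
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)

    diff = X[:, None, :] - means[None, :, :]                      # (T, M, D)
    # log p_i(x_t) for every frame t and component i
    log_comp = -0.5 * (np.log(2.0 * np.pi * variances)[None]
                       + diff ** 2 / variances[None]).sum(axis=2)  # (T, M)
    log_joint = np.log(weights)[None, :] + log_comp                # log(w_i p_i(x_t))

    # log-sum-exp over components gives log p(x_t | lambda)
    m = log_joint.max(axis=1, keepdims=True)
    log_px = m[:, 0] + np.log(np.exp(log_joint - m).sum(axis=1))
    # The sum over frames is the log of the product over t
    return log_px.sum()
```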

$\hat{w}_i = \frac{1}{T}\sum_{t=1}^{T}p(i|x_t,\lambda)$

$\hat{\mu}_i = \frac{\sum_{t=1}^{T}p(i|x_t,\lambda)x_t}{\sum_{t=1}^{T}p(i|x_t,\lambda)}$

$\hat{\sigma}_i^2 = \frac{\sum_{t=1}^{T}p(i|x_t,\lambda)x^{2}_t}{\sum_{t=1}^{T}p(i|x_t,\lambda)}-\hat{\mu}_i^2$

$p(i|x_t,\lambda) = \frac{w_ip_i(x_t)}{\sum_{k=1}^M w_kp_k(x_t)}$
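The re-estimation formulas above are typically applied iteratively, in the style of the EM algorithm: compute the posteriors $p(i|x_t,\lambda)$ with the current parameters, then update the weights, means, and variances. The following is a minimal sketch of one such iteration for scalar observations (a 1-D GMM), which is an assumption made to keep the example short; `em_step` is a hypothetical name.

```python
import numpy as np

def em_step(x, w, mu, var):
    """One re-estimation pass for a 1-D GMM.

    x: (T,) observations; w, mu, var: (M,) current weights, means, variances.
    Returns the updated (w, mu, var).
    """
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    mu = np.asarray(mu, dtype=float)
    var = np.asarray(var, dtype=float)

    # Component densities p_i(x_t), shape (T, M)
    dens = (np.exp(-0.5 * (x[:, None] - mu[None, :]) ** 2 / var[None, :])
            / np.sqrt(2.0 * np.pi * var[None, :]))
    # Posterior p(i | x_t, lambda) = w_i p_i(x_t) / sum_k w_k p_k(x_t)
    joint = w[None, :] * dens
    post = joint / joint.sum(axis=1, keepdims=True)

    # Re-estimate the parameters from the posteriors
    n = post.sum(axis=0)                                  # sum_t p(i | x_t, lambda)
    w_new = n / x.shape[0]
    mu_new = (post * x[:, None]).sum(axis=0) / n
    var_new = (post * x[:, None] ** 2).sum(axis=0) / n - mu_new ** 2
    return w_new, mu_new, var_new
```

A full trainer would call `em_step` repeatedly until the likelihood stops improving, and would usually floor the variances to avoid collapse onto a single data point; both are omitted here.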

## References

1. Gao Qin (高勤), "汉语语音文档检索技术研究及系统实现" (Research and System Implementation of Chinese Spoken Document Retrieval), master's thesis, Peking University. http://geek.kyloo.net/public/master-thesis.pdf
2. L.R. Rabiner, "A tutorial on Hidden Markov Models and selected applications in speech recognition", Proceedings of the IEEE, vol. 77, pp. 257–287, 1989.
3. D.A. Reynolds and R.C. Rose, "Robust text-independent speaker identification using Gaussian mixture speaker models", IEEE Transactions on Speech and Audio Processing, vol. 3, pp. 72–83, 1995.
4. K.F. Lee, "Large-vocabulary speaker independent continuous speech recognition, the Sphinx system", Ph.D. thesis, Carnegie Mellon University, 1988.