# Mel-frequency cepstrum

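1. Take the Fourier transform of a signal.
2. Map the spectrum onto the mel scale using triangular overlapping windows.
3. Take the logarithm of the energy in each mel band.
4. Take the discrete cosine transform (DCT) of the resulting log energies.
5. The MFCCs (mel-frequency cepstral coefficients) are the amplitudes of the resulting spectrum.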
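Step 2 relies on a mapping between frequency in hertz and the mel scale, which the list above does not spell out. The sketch below assumes one commonly used formula, mel = 2595 · log10(1 + f / 700); other variants exist, and the function names `hz_to_mel`, `mel_to_hz`, and `mel_band_edges` are illustrative rather than taken from any particular library.

```python
import math

def hz_to_mel(f_hz):
    # Assumed mapping: one widely used mel formula, not the only one.
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(mel):
    # Inverse of hz_to_mel.
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_band_edges(num_filters, f_low_hz, f_high_hz):
    """Return num_filters + 2 frequencies (Hz) equally spaced on the mel scale.

    Consecutive triples of these frequencies can serve as the lower edge,
    centre, and upper edge of the triangular windows.
    """
    mel_low = hz_to_mel(f_low_hz)
    mel_high = hz_to_mel(f_high_hz)
    step = (mel_high - mel_low) / (num_filters + 1)
    return [mel_to_hz(mel_low + i * step) for i in range(num_filters + 2)]

# Hypothetical settings: 10 filters between 0 Hz and 8000 Hz.
edges = mel_band_edges(num_filters=10, f_low_hz=0.0, f_high_hz=8000.0)
```

Because the points are equally spaced in mel, their spacing in hertz widens toward high frequencies, which is what makes the filter bank non-uniform.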

## History

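Bridle and Brown used a set of 19 spectrum-shape coefficients derived by a cosine transform. The input to the transform was the output of a bank of band-pass filters whose bands were spaced non-uniformly in frequency.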

## Derivation of the coefficients

1. Take the Fourier transform of the signal x[n]:

${\displaystyle X[k]=\operatorname {FT} \{x[n]\}}$
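As a small illustration of this step, the sketch below computes X[k] with a direct discrete Fourier transform and then the power spectrum |X[k]|² used in the next step. Treating "FT" as the DFT of a finite frame is an assumption here, and the O(N²) loop is chosen for readability; a practical implementation would normally use an FFT routine. The function names are illustrative.

```python
import cmath

def dft(x):
    """Direct DFT: X[k] = sum_n x[n] * exp(-2j*pi*k*n/N)."""
    N = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
        for k in range(N)
    ]

def power_spectrum(x):
    """Return |X[k]|^2 for every bin k."""
    return [abs(X_k) ** 2 for X_k in dft(x)]

# A unit impulse has a flat spectrum: every bin has power 1 (up to rounding).
flat = power_spectrum([1.0, 0.0, 0.0, 0.0])
```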

2. Compute Y[m] from the following formulas, where B_m[k] is the m-th triangular window:

${\displaystyle Y[m]=\log \left(\sum _{k=f_{m-1}}^{f_{m+1}}\left|X[k]\right|^{2}B_{m}[k]\right)}$

${\displaystyle B_{m}[k]={\begin{cases}0&{\mbox{for }}k<f_{m-1}{\mbox{ or }}k>f_{m+1}\\{\cfrac {k-f_{m-1}}{f_{m}-f_{m-1}}}&{\mbox{for }}f_{m-1}\leq k\leq f_{m}\\{\cfrac {f_{m+1}-k}{f_{m+1}-f_{m}}}&{\mbox{for }}f_{m}\leq k\leq f_{m+1}\end{cases}}}$
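The two formulas above can be sketched as follows. The code assumes that the boundary points f_0, ..., f_{M+1} are strictly increasing integer bin indices and that the input already holds |X[k]|². The `floor` argument is not part of the formula; it is an added safeguard so that an empty band does not produce log(0). All names are illustrative.

```python
import math

def triangular_weight(k, f_prev, f_center, f_next):
    """B_m[k] for a window with boundaries f_{m-1}, f_m, f_{m+1}."""
    if k < f_prev or k > f_next:
        return 0.0
    if k <= f_center:
        return (k - f_prev) / (f_center - f_prev)   # rising edge
    return (f_next - k) / (f_next - f_center)       # falling edge

def log_band_energies(power_spectrum, boundaries, floor=1e-12):
    """Return [Y[1], ..., Y[M]], where boundaries = [f_0, ..., f_{M+1}]."""
    energies = []
    for m in range(1, len(boundaries) - 1):
        f_prev, f_center, f_next = boundaries[m - 1], boundaries[m], boundaries[m + 1]
        total = sum(
            power_spectrum[k] * triangular_weight(k, f_prev, f_center, f_next)
            for k in range(f_prev, f_next + 1)
        )
        # floor is an assumption added to avoid taking log(0).
        energies.append(math.log(max(total, floor)))
    return energies

# Flat power spectrum with three windows, each spanning four bins.
Y = log_band_energies([1.0] * 9, [0, 2, 4, 6, 8])
```

In this example each window has weights 0, 0.5, 1, 0.5, 0 over its five bins, so every band sums to 2 and every Y[m] equals log 2.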

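3. Apply the inverse discrete cosine transform (IDCT) to Y[m] to obtain ${\displaystyle c_{x}[n]}$. Because Y[m] is an even function, the IDCT can be used in place of the inverse discrete Fourier transform (IDFT):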

${\displaystyle c_{x}[n]={\frac {1}{M}}\sum _{m=1}^{M}Y[m]\cos \left({\cfrac {\pi n(m-1/2)}{M}}\right)}$
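The cosine sum for c_x[n] translates directly into code. In this minimal sketch, the range n = 0, ..., `num_coeffs` − 1 is an assumption, since the text does not state which values of n are kept; the function name is illustrative.

```python
import math

def cepstral_coefficients(log_energies, num_coeffs):
    """c_x[n] = (1/M) * sum_{m=1}^{M} Y[m] * cos(pi * n * (m - 1/2) / M)."""
    M = len(log_energies)
    coeffs = []
    for n in range(num_coeffs):
        acc = 0.0
        for m in range(1, M + 1):
            # log_energies[m - 1] holds Y[m] because Python lists are 0-based.
            acc += log_energies[m - 1] * math.cos(math.pi * n * (m - 0.5) / M)
        coeffs.append(acc / M)
    return coeffs

# Constant log energies: only c_x[0] is non-zero (up to rounding).
c = cepstral_coefficients([2.0, 2.0, 2.0, 2.0], num_coeffs=3)
```

With a constant Y[m], the coefficient c_x[0] equals that constant and the higher coefficients vanish, which matches the intuition that the higher coefficients describe how the log spectrum varies across bands.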

## References

1. Min Xu; et al. HMM-based audio keyword generation. In Kiyoharu Aizawa, Yuichi Nakamura, Shin'ichi Satoh (eds.). Advances in Multimedia Information Processing – PCM 2004: 5th Pacific Rim Conference on Multimedia. Springer. 2004. ISBN 3-540-23985-5.
2. Sahidullah, Md.; Saha, Goutam. Design, analysis and experimental evaluation of block based transformation in MFCC computation for speaker recognition. Speech Communication. May 2012, 54 (4): 543–565. doi:10.1016/j.specom.2011.11.004.
3. Meinard Müller. Information Retrieval for Music and Motion. Springer. 2007: 65. ISBN 978-3-540-74047-6.
4. C. H. Chen, Ed., pp. 374–388. Academic, New York.
5. S. B. Davis and P. Mermelstein (1980), "Comparison of Parametric Representations for Monosyllabic Word Recognition in Continuously Spoken Sentences," IEEE Transactions on Acoustics, Speech, and Signal Processing, 28(4), pp. 357–366.
6. J. S. Bridle and M. D. Brown (1974), "An Experimental Automatic Word-Recognition System," JSRU Report No. 1003, Joint Speech Research Unit, Ruislip, England.
7. L. C. W. Pols (1966), "Spectral Analysis and Identification of Dutch Vowels in Monosyllabic Words," Doctoral dissertation, Free University, Amsterdam, The Netherlands.
8. R. Plomp, L. C. W. Pols, and J. P. van de Geer (1967), "Dimensional analysis of vowel spectra," Journal of the Acoustical Society of America, 41(3), pp. 707–712.