# Mel-frequency cepstrum

1. Take the Fourier transform of a signal.
2. Map the spectrum onto the mel scale using triangular overlapping windows.
3. Take the logarithm.
4. Take the discrete cosine transform.
5. The MFCCs are the coefficients of the resulting transformed spectrum.
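Step 2 needs a concrete conversion between hertz and mel, which the outline above does not specify. The following is a minimal sketch that assumes one commonly used convention, m = 2595 log10(1 + f / 700). Other conventions exist, and the function names `hz_to_mel`, `mel_to_hz`, and `mel_spaced_frequencies` are hypothetical.

```python
import math


def hz_to_mel(f_hz):
    """Convert hertz to mel (assumed convention: 2595 * log10(1 + f / 700))."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel under the same assumed convention."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_spaced_frequencies(f_low, f_high, num_filters):
    """Return num_filters + 2 frequencies in Hz, equally spaced on the mel scale.

    Consecutive triples of these values can serve as the lower edge,
    center, and upper edge of each triangular window.
    """
    lo, hi = hz_to_mel(f_low), hz_to_mel(f_high)
    step = (hi - lo) / (num_filters + 1)
    return [mel_to_hz(lo + i * step) for i in range(num_filters + 2)]
```

Because the points are equally spaced in mel, their spacing in hertz grows with frequency, which is what makes the resulting filters narrow at low frequencies and wide at high frequencies.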

## History

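Bridle and Brown used a set of 19 spectrum-shape coefficients obtained from a cosine transform. The input to the transform was the output of the signal after passing through a set of band-pass filters spaced non-uniformly across the frequency band.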

## Derivation of the coefficients

1. Take the Fourier transform of the signal:

${\displaystyle X[k]=\operatorname {FT} \{x[n]\}}$
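As a minimal illustration of this step, the sketch below evaluates the discrete Fourier transform of one short frame directly from its definition and returns the squared magnitudes |X[k]|², the quantity that enters the next step. The function name `power_spectrum` is hypothetical. Practical systems normally use an FFT routine and apply a window to each frame, details that the text does not specify.

```python
import cmath
import math


def power_spectrum(frame):
    """Return |X[k]|^2 for k = 0 .. N/2, where X is the DFT of `frame`.

    Only the non-negative frequencies are kept, because the spectrum of
    a real-valued signal is symmetric.
    """
    n_samples = len(frame)
    spectrum = []
    for k in range(n_samples // 2 + 1):
        x_k = sum(
            x * cmath.exp(-2j * math.pi * k * n / n_samples)
            for n, x in enumerate(frame)
        )
        spectrum.append(abs(x_k) ** 2)
    return spectrum
```

For a frame containing a pure cosine that completes an integer number of cycles, all of the energy lands in a single bin.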

2. Compute Y[m] from the following formula:

${\displaystyle Y[m]=\log \left(\sum _{k=f_{m-1}}^{f_{m+1}}\left|X[k]\right|^{2}B_{m}[k]\right)}$

${\displaystyle B_{m}[k]={\begin{cases}0&{\mbox{for }}k<f_{m-1}{\mbox{ or }}k>f_{m+1}\\{\cfrac {k-f_{m-1}}{f_{m}-f_{m-1}}}&{\mbox{for }}f_{m-1}\leq k\leq f_{m}\\{\cfrac {f_{m+1}-k}{f_{m+1}-f_{m}}}&{\mbox{for }}f_{m}\leq k\leq f_{m+1}\end{cases}}}$
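The two formulas above can be sketched as follows. The sketch assumes that the edges f_0, ..., f_{M+1} are given as strictly increasing integer DFT bin indices. The names `triangular_weight` and `log_filterbank_energies` are hypothetical. The `floor` parameter is not part of the formula. It is an added safeguard against taking the logarithm of zero.

```python
import math


def triangular_weight(k, f_prev, f_center, f_next):
    """B_m[k]: rises linearly from f_prev to f_center, falls to f_next, zero outside."""
    if k < f_prev or k > f_next:
        return 0.0
    if k <= f_center:
        return (k - f_prev) / (f_center - f_prev)
    return (f_next - k) / (f_next - f_center)


def log_filterbank_energies(power, edges, floor=1e-12):
    """Y[m] = log(sum_k |X[k]|^2 * B_m[k]) for m = 1 .. M.

    `power` holds |X[k]|^2 and `edges` holds the bin indices
    [f_0, f_1, ..., f_{M+1}], assumed strictly increasing.
    `floor` is an assumption added here to keep the logarithm finite.
    """
    energies = []
    for m in range(1, len(edges) - 1):
        f_prev, f_center, f_next = edges[m - 1], edges[m], edges[m + 1]
        total = sum(
            power[k] * triangular_weight(k, f_prev, f_center, f_next)
            for k in range(f_prev, f_next + 1)
        )
        energies.append(math.log(max(total, floor)))
    return energies
```

Each window peaks at 1 on its center bin f_m and reaches 0 at the centers of its neighbors, so adjacent windows overlap by half.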

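3. Apply the IDCT to Y[m] to obtain ${\displaystyle c_{x}[n]}$. Because Y[m] is an even function, the IDCT (inverse discrete cosine transform) is used in place of the IDFT (inverse discrete Fourier transform).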

${\displaystyle c_{x}[n]={\frac {1}{M}}\sum _{m=1}^{M}Y[m]\cos \left({\cfrac {\pi n(m-1/2)}{M}}\right)}$
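The sum above translates almost directly into code. The following is a minimal sketch with a hypothetical function name, `cepstral_coefficients`. The number of coefficients to keep, `num_coeffs`, is a free parameter that the formula does not fix.

```python
import math


def cepstral_coefficients(log_energies, num_coeffs):
    """c_x[n] = (1/M) * sum_{m=1}^{M} Y[m] * cos(pi * n * (m - 1/2) / M).

    `log_energies` holds Y[1] .. Y[M]; coefficients n = 0 .. num_coeffs - 1
    are returned.
    """
    num_filters = len(log_energies)
    coeffs = []
    for n in range(num_coeffs):
        total = sum(
            y * math.cos(math.pi * n * (m - 0.5) / num_filters)
            for m, y in enumerate(log_energies, start=1)
        )
        coeffs.append(total / num_filters)
    return coeffs
```

With this normalization, c_x[0] is the mean of the log energies Y[m], so a flat log spectrum yields a nonzero c_x[0] and zeros for the coefficients 1 through M - 1.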

## References

1. Min Xu; et al. HMM-based audio keyword generation. In: Kiyoharu Aizawa, Yuichi Nakamura, Shin'ichi Satoh (eds.). Advances in Multimedia Information Processing – PCM 2004: 5th Pacific Rim Conference on Multimedia. Springer. 2004. ISBN 3-540-23985-5.
2. Sahidullah, Md.; Saha, Goutam. Design, analysis and experimental evaluation of block based transformation in MFCC computation for speaker recognition. Speech Communication. May 2012, 54 (4): 543–565. doi:10.1016/j.specom.2011.11.004.
3. Meinard Müller. Information Retrieval for Music and Motion. Springer. 2007: 65. ISBN 978-3-540-74047-6.
4. P. Mermelstein (1976), "Distance measures for speech recognition, psychological and instrumental," in Pattern Recognition and Artificial Intelligence, C. H. Chen, Ed., pp. 374–388. Academic, New York.
5. S. B. Davis and P. Mermelstein (1980), "Comparison of Parametric Representations for Monosyllabic Word Recognition in Continuously Spoken Sentences," in IEEE Transactions on Acoustics, Speech, and Signal Processing, 28(4), pp. 357–366.
6. J. S. Bridle and M. D. Brown (1974), "An Experimental Automatic Word-Recognition System", JSRU Report No. 1003, Joint Speech Research Unit, Ruislip, England.
7. L. C. W. Pols (1966), "Spectral Analysis and Identification of Dutch Vowels in Monosyllabic Words," Doctoral dissertation, Free University, Amsterdam, The Netherlands.
8. R. Plomp, L. C. W. Pols, and J. P. van de Geer (1967). "Dimensional analysis of vowel spectra." J. Acoustical Society of America, 41(3):707–712.