Letter frequency
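Letter frequency is the analysis of how often each letter occurs in text, used chiefly in cryptography. In the era of movable-type printing, printers worked out from classic texts that English letters rank by frequency as etaoin shrdlu cmfwyp vbgkqj xz.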
Letter frequencies in English
The frequency of each letter is:[1]
| Letter | Frequency |
|---|---|
| a | 8.167% |
| b | 1.492% |
| c | 2.782% |
| d | 4.253% |
| e | 12.702% |
| f | 2.228% |
| g | 2.015% |
| h | 6.094% |
| i | 6.966% |
| j | 0.153% |
| k | 0.772% |
| l | 4.025% |
| m | 2.406% |
| n | 6.749% |
| o | 7.507% |
| p | 1.929% |
| q | 0.095% |
| r | 5.987% |
| s | 6.327% |
| t | 9.056% |
| u | 2.758% |
| v | 0.978% |
| w | 2.360% |
| x | 0.150% |
| y | 1.974% |
| z | 0.074% |
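In English, the space occurs about 107% as often as E, the most frequent letter, while non-alphabetic characters (digits, punctuation, and so on) taken together rank fourth, between T and A.[2]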
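A minimal Python sketch of the kind of count behind such a table follows. It is an illustration only: the function names `letter_frequencies` and `space_to_top_letter_ratio` and the sample sentence are assumptions made here, not taken from the cited sources, and a real study would use a large corpus rather than one sentence.

```python
from collections import Counter


def letter_frequencies(text: str) -> dict[str, float]:
    """Return each letter's share of all letters in `text`, in percent."""
    # Fold case and keep only the 26 basic Latin letters (assumption:
    # accented letters, digits, and punctuation are ignored).
    letters = [ch for ch in text.lower() if "a" <= ch <= "z"]
    total = len(letters)
    if total == 0:
        return {}
    counts = Counter(letters)
    return {ch: 100.0 * n / total for ch, n in counts.items()}


def space_to_top_letter_ratio(text: str) -> float:
    """Return the number of spaces divided by the count of the most common letter."""
    counts = Counter(ch for ch in text.lower() if "a" <= ch <= "z")
    if not counts:
        raise ValueError("text contains no letters")
    top_count = counts.most_common(1)[0][1]
    return text.count(" ") / top_count


# Hypothetical sample text; far too short to reproduce the table's values.
sample = "the quick brown fox jumps over the lazy dog"
freqs = letter_frequencies(sample)

# Rank letters by descending frequency, breaking ties alphabetically.
ranking = "".join(sorted(freqs, key=lambda ch: (-freqs[ch], ch)))
print(ranking[:6])                        # oehrtu
print(space_to_top_letter_ratio(sample))  # 2.0
```

On a sample this small the ranking and the space ratio are far from the published English values; the figures in the table only emerge from much longer texts.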
Letter frequencies in other languages
| Letter | French [3] | German [4] | Spanish [5] | Portuguese [6] | Esperanto [7] | Italian [8] | Turkish | Swedish [9] | Polish [10] | Toki Pona [11] | Dutch [12] |
|---|---|---|---|---|---|---|---|---|---|---|---|
| a | 7.636% | 6.51% | 12.53% | 14.63% | 12.12% | 11.74% | 11.68% | 9.3% | 8.0% | 17.2% | 7.49% |
| b | 0.901% | 1.89% | 1.42% | 1.04% | 0.98% | 0.92% | 2.95% | 1.3% | 1.3% | 0.0% | 1.58% |
| c | 3.260% | 3.06% | 4.68% | 3.88% | 0.78% | 4.5% | 0.97% | 1.3% | 3.8% | 0.0% | 1.24% |
| d | 3.669% | 5.08% | 5.86% | 4.99% | 3.04% | 3.73% | 4.87% | 4.5% | 3.0% | 0.0% | 5.93% |
| e | 14.715% | 17.40% | 13.68% | 12.57% | 8.99% | 11.79% | 9.01% | 9.9% | 6.9% | 7.4% | 18.91% |
| f | 1.066% | 1.66% | 0.69% | 1.02% | 1.03% | 0.95% | 0.44% | 2.0% | 0.1% | 0.0% | 0.81% |
| g | 0.866% | 3.01% | 1.01% | 1.30% | 1.17% | 1.64% | 1.34% | 3.3% | 1.0% | 0.0% | 3.40% |
| h | 0.737% | 4.76% | 0.70% | 1.28% | 0.38% | 1.54% | 1.14% | 2.1% | 1.0% | 0.0% | 2.38% |
| i | 7.529% | 7.55% | 6.25% | 6.18% | 10.01% | 11.28% | 8.27%* | 5.1% | 7.0% | 14.8% | 6.50% |
| j | 0.545% | 0.27% | 0.44% | 0.40% | 3.50% | 0.00% | 0.01% | 0.7% | 1.9% | 3.0% | 1.46% |
| k | 0.049% | 1.21% | 0.01% | 0.02% | 4.16% | 0.00% | 4.71% | 3.2% | 2.7% | 5.1% | 2.25% |
| l | 5.456% | 3.44% | 4.97% | 2.78% | 6.14% | 6.51% | 5.75% | 5.2% | 3.1% | 10.2% | 3.57% |
| m | 2.968% | 2.53% | 3.15% | 4.74% | 2.99% | 2.51% | 3.74% | 3.5% | 2.4% | 4.4% | 2.21% |
| n | 7.095% | 9.78% | 6.71% | 5.05% | 7.96% | 6.88% | 7.23% | 8.8% | 4.7% | 11.6% | 10.03% |
| o | 5.378% | 2.51% | 8.68% | 10.73% | 8.78% | 9.83% | 2.45% | 4.1% | 7.1% | 7.7% | 6.06% |
| p | 3.021% | 0.79% | 2.51% | 2.52% | 2.74% | 3.05% | 0.79% | 1.7% | 2.4% | 3.7% | 1.57% |
| q | 1.362% | 0.02% | 0.88% | 1.20% | 0.00% | 0.51% | 0 | 0.007% | - | 0.0% | 0.009% |
| r | 6.553% | 7.00% | 6.87% | 6.53% | 5.91% | 6.37% | 6.95% | 8.3% | 3.5% | 0.0% | 6.41% |
| s | 7.948% | 7.27% | 7.98% | 7.81% | 6.09% | 4.98% | 2.95% | 6.3% | 3.8% | 4.1% | 3.73% |
| t | 7.244% | 6.15% | 4.63% | 4.74% | 5.27% | 5.62% | 3.09% | 8.7% | 2.4% | 4.6% | 6.79% |
| u | 6.311% | 4.35% | 3.93% | 4.63% | 3.18% | 3.01% | 3.43% | 1.8% | 1.8% | 3.2% | 1.99% |
| v | 1.628% | 0.67% | 0.90% | 1.67% | 1.90% | 2.10% | 0.98% | 2.4% | - | 0.0% | 2.85% |
| w | 0.114% | 1.89% | 0.02% | 0.01% | 0.00% | 0.00% | 0 | 0.03% | 3.6% | 2.8% | 1.52% |
| x | 0.387% | 0.03% | 0.22% | 0.21% | 0.00% | 0.00% | 0 | 0.1% | - | 0.0% | 0.04% |
| y | 0.308% | 0.04% | 0.90% | 0.01% | 0.00% | 0.00% | 3.37% | 0.6% | 3.2% | 0.0% | 0.035% |
| z | 0.136% | 1.13% | 0.52% | 0.47% | 0.50% | 0.49% | 1.50% | 0.02% | 5.1% | 0.0% | 1.39% |
| à | 0.486% | 0 | 0 | see a | 0 | see a | 0 | 0.0% | 0 | - | see a |
| å | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1.6% | 0 | - | - |
| ä | 0 | - | 0 | 0 | 0 | 0 | 0 | 2.1% | 0 | - | see a |
| ą | 0 | - | 0 | 0 | 0 | 0 | 0 | 0 | see a | - | - |
| œ | 0.018% | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | - | - |
| ç | 0.085% | 0 | 0 | see c | 0 | 0 | 1.26% | 0 | 0 | - | - |
| ĉ | 0 | 0 | 0 | 0 | 0.66% | 0 | 0 | 0 | 0 | - | - |
| ć | 0 | - | 0 | 0 | 0 | 0 | 0 | 0 | see c | - | - |
| è | 0.271% | 0 | 0 | 0 | 0 | see e | 0 | 0.0% | 0 | - | see e |
| é | 1.904% | 0 | 0 | see e | 0 | see e | 0 | 0.0% | 0 | - | see e |
| ê | 0.225% | 0 | 0 | see e | 0 | 0 | 0 | 0 | 0 | - | - |
| ë | 0.000% | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | - | see e |
| ę | 0 | - | 0 | 0 | 0 | 0 | 0 | 0 | see e | - | - |
| ĝ | 0 | 0 | 0 | 0 | 0.69% | 0 | 0 | 0 | 0 | - | - |
| ğ | 0 | 0 | 0 | 0 | 0 | 0 | 1.13% | 0 | 0 | - | - |
| ĥ | 0 | 0 | 0 | 0 | 0.02% | 0 | 0 | 0 | 0 | - | - |
| î | 0.045% | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | - |
| ì | 0 | 0 | 0 | 0 | 0 | see i | 0 | 0 | 0 | - | see i |
| ï | 0.005% | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | - | see i |
| ı | 0 | 0 | 0 | 0 | 0 | 0 | 5.20%* | 0 | 0 | - | - |
| ĵ | 0 | 0 | 0 | 0 | 0.12% | 0 | 0 | 0 | 0 | - | - |
| ł | 0 | - | 0 | 0 | 0 | 0 | 0 | 0 | see l | - | - |
| ñ | 0 | 0 | 0.31% | 0 | 0 | 0 | 0 | 0 | 0 | - | - |
| ń | 0 | - | 0 | 0 | 0 | 0 | 0 | 0 | see n | - | - |
| ò | 0 | 0 | 0 | 0 | 0 | see o | 0 | 0 | 0 | - | see o |
| ö | 0 | - | 0 | 0 | 0 | 0 | 0.87% | 1.5% | 0 | - | see o |
| ó | 0 | - | 0 | see o | 0 | 0 | 0 | 0 | see o | - | see o |
| ŝ | 0 | 0 | 0 | 0 | 0.38% | 0 | 0 | 0 | 0 | - | - |
| ş | 0 | 0 | 0 | 0 | 0 | 0 | 1.94% | 0 | 0 | - | - |
| ś | 0 | - | 0 | 0 | 0 | 0 | 0 | 0 | see s | - | - |
| ß | 0 | 0.31% | 0 | 0 | 0 | 0 | 0 | 0 | 0 | - | - |
| ù | 0.058% | 0 | 0 | 0 | 0 | see u | 0 | 0 | 0 | - | see u |
| ŭ | 0 | 0 | 0 | 0 | 0.52% | 0 | 0 | 0 | 0 | - | - |
| ü | 0 | - | 0 | 0 | 0 | 0 | 1.99% | 0 | 0 | - | see u |
| ź | 0 | - | 0 | 0 | 0 | 0 | 0 | 0 | see z | - | - |
| ż | 0 | - | 0 | 0 | 0 | 0 | 0 | 0 | 0.7% | - | - |
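* See dotted and dotless I
Based on these tables, the ten most frequent letters in each language are:
- French: esait nrulo (Indo-European, Romance; traditionally given as the more pronounceable esartinulop[13])
- Spanish: eaosr nidlc (Indo-European, Romance)
- Portuguese: aeosr indmt (Indo-European, Romance)
- Italian: eaion lrtsc (Indo-European, Romance)
- Esperanto: aieon lsrtk (constructed language, influenced mainly by the Romance and Germanic branches of Indo-European)
- German: enisr atdhu (Indo-European, Germanic)
- Swedish: eantr slido (Indo-European, Germanic)
- Turkish: aeinr ldkmu (Turkic)
- Dutch: enati rodsl (Indo-European, Germanic)[14]
- Polish: aoiez nscwr (Indo-European, Slavic)
All of these languages use broadly similar alphabets of 25 or more letters.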
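The ranking step can be reproduced from the table with a short Python sketch. The `FRENCH` and `GERMAN` lists copy the a to z rows of those two columns; the accented rows are left out, on the assumption (which holds for these two columns) that none of them reaches the top ten. The helper name `top_letters` is hypothetical.

```python
from string import ascii_lowercase

# Percentages for a..z, copied from the French and German table columns.
FRENCH = [7.636, 0.901, 3.260, 3.669, 14.715, 1.066, 0.866, 0.737, 7.529,
          0.545, 0.049, 5.456, 2.968, 7.095, 5.378, 3.021, 1.362, 6.553,
          7.948, 7.244, 6.311, 1.628, 0.114, 0.387, 0.308, 0.136]
GERMAN = [6.51, 1.89, 3.06, 5.08, 17.40, 1.66, 3.01, 4.76, 7.55,
          0.27, 1.21, 3.44, 2.53, 9.78, 2.51, 0.79, 0.02, 7.00,
          7.27, 6.15, 4.35, 0.67, 1.89, 0.03, 0.04, 1.13]


def top_letters(percentages: list[float], n: int = 10) -> str:
    """Return the n most frequent letters, most frequent first."""
    if len(percentages) != len(ascii_lowercase):
        raise ValueError("expected one percentage per letter a..z")
    pairs = zip(ascii_lowercase, percentages)
    # sorted() is stable, so letters with equal frequency stay alphabetical.
    ranked = sorted(pairs, key=lambda pair: -pair[1])
    return "".join(letter for letter, _ in ranked[:n])


print(top_letters(FRENCH))  # esaitnrulo
print(top_letters(GERMAN))  # enisratdhu
```

Both results match the French and German entries in the list above once the grouping space is removed.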
References
- ^ Lewand, Robert. Cryptological Mathematics. Mathematical Association of America. 2000: 36. ISBN 978-0883857199. Table also available from [1]
- ^ Lee, E. Stewart; Essays about Computer Security; University of Cambridge Computer Laboratory, p. 181
- ^ CorpusDeThomasTempé [2007-06-15].
- ^ Albrecht Beutelspacher, Kryptologie, 7. Aufl., Wiesbaden: Vieweg Verlagsgesellschaft, 2005, ISBN 3-8348-0014-7, p.10
- ^ Fletcher Pratt, Secret and Urgent: the Story of Codes and Ciphers Blue Ribbon Books, 1939, pp. 254-255.
- ^ Frequência da ocorrência de letras no Português [2009-06-16].
- ^ La Oftecoj de la Esperantaj Literoj [2007-09-14].
- ^ Simon Singh, Codici e Segreti, 1999, RCS, ISBN 88-17-12539-3
- ^ Simon Singh, Kodboken, 1999, Norstedts, ISBN 91-1-1300708-4
- ^ Wstęp do kryptologii, counting [space] 17.2%, [dot point] 0.9%, [comma] 0.9% and [semicolon] 0.5%
- ^ lipu pi jan Jakopo pi toki pona [2007-09-14].
- ^ Letterfrequenties. Genootschap OnzeTaal [2009-05-17].
- ^ Perec, Georges; Alphabets; Éditions Galilée, 1976
- ^ Letterfrequenties. Genootschap OnzeTaal [2008-12-26].