树-邻接文法

树-邻接文法(TAG)是 Aravind Joshi 定义的文法形式化。树-邻接（adjoining）文法在某种意义上类似于上下文无关文法，但是基本的重写单位是树而不是符号。上下文无关文法有把符号重写为其他符号的规则，而树-毗连文法有把树的节点重写为其他树的规则。

文法规则[编辑]

TAG 中的规则是带有叫做“足节点”的特殊叶子的树，它们锚接（anchor）到一个字。在 TAG 中有两个种类的基本树：“初始”树（经常被表示为 ' $\alpha$ '）和“辅助”树（' $\beta$ '）。初始树表示基本的价（valency）关系，而辅助树允许递归。^[1] 辅助树有标记（label）上同样符号的根（顶）节点和足节点。推导开始于初始树，通过要么“代换”要么“附加”来结合。代换把末梢节点替换为其顶节点有同样符号的另一个树。附加把一个辅助树插入到另一个树的中心。^[2] 辅助树的根／足标记必须匹配它所邻接的节点的标记。其他 TAG 的变体允许多种成分的树，带有多个足节点的树，和其他扩展。

树-邻接文法经常被描述为“适度上下文有关的”，这意味着它们有（在弱生成能力方面上）特定性质使其有比上下文无关文法更强力，但有比附标文法或上下文有关文法更弱的能力。适度上下文有关文法被推测为足够强力可以建模自然语言，而仍保持在一般情况下有效解析。^[3] 由于它们的形式特性，TAG 经常被用于计算语言学和自然语言处理。

文法缘起[编辑]

TAG 起源于 Joshi 和他的学生对附加文法（AG）家族和 Zellig Harris 的“字符串文法”的研究 ^[4] 。AG 以自然和高效的方式处理语言的向心（英语：Endocentric and exocentric）性质，但是没有对离心构造的好特征描述；重写文法或短语-结构文法（PSG）正好反过来。在 1969 年，Joshi 通过混合两种类型的规则介入了开拓出这种补足的文法家族。一些非常简单的重写规则足够生成附加规则的字符串的词汇表。这个家族不同于乔姆斯基层级，但是有所交叠。^[5]

TAG 可以描述有平方的语言（在其中某个任意字符串被重复），和语言 $\{a^{n}b^{n}c^{n}d^{n}|n\geq 1\}$ ，有立方的语言（就是三倍的字符串）或有相等长度的多于四个不同字符的字符串的语言不可以被树-邻接文法所生成。为此，树-毗连文法生成的语言被称为“适度上下文有关语言”。

引用[编辑]

^ Jurafsky, Daniel; James H. Martin. Speech and Language Processing. Upper Saddle River, NJ: Prentice Hall. 2000: 354.
^ Joshi, Aravind; Owen Rambow. A Formalism for Dependency Grammar Based on Tree Adjoining Grammar (PDF). Proceedings of the Conference on Meaning-Text Theory. 2003 [2007-10-19]. （原始内容存档 (PDF)于2020-11-29）.
^ Joshi, Aravind. How much context-sensitivity is necessary for characterizing structural descriptions. D. Dowty, L. Karttunen, and A. Zwicky, (eds.) (编). Natural Language Processing: Theoretical, Computational, and Psychological Perspectives. New York, NY: Cambridge University Press. 1985: 206–250.
^ Joshi, Aravind; S. R. Kosaraju, H. Yamada. String Adjunct Grammars. Proceedings Tenth Annual Symposium on Automata Theory, Waterloo, Canada. 1969.
^ Joshi, Aravind. Properties of Formal Grammars with Mixed Types of Rules and Their Linguistic Relevance. Proceedings Third International Symposium on Computational Linguistics, Stockholm, Sweden. 1969.

外部链接[编辑]

The XTAG project （页面存档备份，存于互联网档案馆），which uses a TAG for natural language processing.
A tutorial on TAG （页面存档备份，存于互联网档案馆）
SemConst Documentation （页面存档备份，存于互联网档案馆） A quick survey on Syntax and Semantic Interface problematic within the TAG framework.

[jurafsky-martin2000-1] Jurafsky, Daniel; James H. Martin. Speech and Language Processing. Upper Saddle River, NJ: Prentice Hall. 2000: 354.

[joshi-rambow2003-2] Joshi, Aravind; Owen Rambow. A Formalism for Dependency Grammar Based on Tree Adjoining Grammar (PDF). Proceedings of the Conference on Meaning-Text Theory. 2003 [2007-10-19]. （原始内容存档 (PDF)于2020-11-29）.

[joshi1985-3] Joshi, Aravind. How much context-sensitivity is necessary for characterizing structural descriptions. D. Dowty, L. Karttunen, and A. Zwicky, (eds.) (编). Natural Language Processing: Theoretical, Computational, and Psychological Perspectives. New York, NY: Cambridge University Press. 1985: 206–250.

[JoshiKosarajuYamada1969-4] Joshi, Aravind; S. R. Kosaraju, H. Yamada. String Adjunct Grammars. Proceedings Tenth Annual Symposium on Automata Theory, Waterloo, Canada. 1969.

[Joshi1969-5] Joshi, Aravind. Properties of Formal Grammars with Mixed Types of Rules and Their Linguistic Relevance. Proceedings Third International Symposium on Computational Linguistics, Stockholm, Sweden. 1969.

[1]

[2]

[3]

[4]

[5]