Context-Free Grammar
A Simple Grammar
Let’s start off by looking at a simple context-free grammar (CFG). By convention, the lefthand side of the first production is the start-symbol of the grammar, typically S, and all well-formed trees must have this symbol as their root label. In NLTK, context-free grammars are defined in the nltk.grammar module. In Example 8-9 we define a grammar and show how to parse a simple sentence admitted by the grammar.
Example 8-9. A simple context-free grammar.
import nltk

grammar1 = nltk.CFG.fromstring("""
  S -> NP VP
  VP -> V NP | V NP PP
  PP -> P NP
  V -> "saw" | "ate" | "walked"
  NP -> "John" | "Mary" | "Bob" | Det N | Det N PP
  Det -> "a" | "an" | "the" | "my"
  N -> "man" | "dog" | "cat" | "telescope" | "park"
  P -> "in" | "on" | "by" | "with"
  """)
>>> sent = "Mary saw Bob".split()
>>> rd_parser = nltk.RecursiveDescentParser(grammar1)
>>> for tree in rd_parser.parse(sent):
...     print(tree)
(S (NP Mary) (VP (V saw) (NP Bob)))
The grammar in Example 8-9 contains productions involving various syntactic categories, as laid out in Table 8-1. The recursive descent parser used here can also be inspected via a graphical interface, as illustrated in Figure 8-3; we discuss this parser in more detail in Parsing with Context-Free Grammar.
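To make the idea of recursive descent concrete, here is a minimal pure-Python sketch that parses with the same grammar. It is not NLTK's implementation: the names GRAMMAR, expand, expand_sequence, and parse are invented for illustration, and the sketch assumes that any symbol without a production is a word and that the grammar has no left-recursive productions.

```python
# The grammar of Example 8-9 as a plain dict (an illustrative encoding, not
# NLTK's): each nonterminal maps to a list of alternative right-hand sides.
# Any symbol that is not a key of the dict is treated as a word.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "VP": [["V", "NP"], ["V", "NP", "PP"]],
    "PP": [["P", "NP"]],
    "V": [["saw"], ["ate"], ["walked"]],
    "NP": [["John"], ["Mary"], ["Bob"], ["Det", "N"], ["Det", "N", "PP"]],
    "Det": [["a"], ["an"], ["the"], ["my"]],
    "N": [["man"], ["dog"], ["cat"], ["telescope"], ["park"]],
    "P": [["in"], ["on"], ["by"], ["with"]],
}


def expand(symbol, tokens, pos):
    """Yield (tree, next_pos) for each way symbol can cover tokens[pos:next_pos]."""
    if symbol not in GRAMMAR:
        # Terminal: it must match the next input word exactly.
        if pos < len(tokens) and tokens[pos] == symbol:
            yield symbol, pos + 1
        return
    # Nonterminal: try each alternative production in turn.
    for rhs in GRAMMAR[symbol]:
        for children, end in expand_sequence(rhs, tokens, pos):
            yield "(%s %s)" % (symbol, " ".join(children)), end


def expand_sequence(symbols, tokens, pos):
    """Yield (subtrees, next_pos) for each way to match a sequence of symbols."""
    if not symbols:
        yield [], pos
        return
    for first, mid in expand(symbols[0], tokens, pos):
        for rest, end in expand_sequence(symbols[1:], tokens, mid):
            yield [first] + rest, end


def parse(tokens, start="S"):
    """Return the bracketed trees rooted in start that span every token."""
    return [tree for tree, end in expand(start, tokens, 0) if end == len(tokens)]


for tree in parse("Mary saw Bob".split()):
    print(tree)
```

For "Mary saw Bob" the sketch prints the same bracketing as Example 8-9. A sentence such as "the dog saw a man in the park" yields two trees, because the prepositional phrase can attach either to the verb phrase or to the noun phrase.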
Table 8-1. Syntactic categories
Symbol | Meaning | Example |
---|---|---|
S | sentence | the man walked |
NP | noun phrase | a dog |
VP | verb phrase | saw a park |
PP | prepositional phrase | with a telescope |
Det | determiner | the |
N | noun | dog |
V | verb | walked |
P | preposition | in |
A production like VP -> V NP | V NP PP has a disjunction ...
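In this notation the | symbol separates alternative righthand sides, so a single line can stand for several productions. The sketch below spells that out; expand_disjunction is a hypothetical helper, not an NLTK function, and it assumes that no quoted word in the rule contains "|" or "->".

```python
def expand_disjunction(rule):
    """Split 'LHS -> A | B' into one 'LHS -> ...' string per alternative."""
    lhs, rhs = rule.split("->")
    return ["%s -> %s" % (lhs.strip(), alt.strip()) for alt in rhs.split("|")]


for production in expand_disjunction("VP -> V NP | V NP PP"):
    print(production)
```

Run on the VP rule, this lists VP -> V NP and VP -> V NP PP as separate productions.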