Atlanta to Denver Stanford University

  • Date:04 May 2020
  • Pages:31
  • Size:218.83 KB

Description:

ter 8 showed that part-of-speech categories could act as a kind of equivalence class for words. In this chapter and the next few, we introduce a variety of syntactic phenomena and models for syntax that go well beyond these simpler approaches. The bulk of this chapter is devoted to the topic of context-free grammars. Context

Transcription:

Chapter 11: Constituency Grammars

11.1 Constituency

The fundamental notion underlying the idea of constituency is that of abstraction: groups of words behaving as single units, or constituents. A significant part of developing a grammar involves discovering the inventory of constituents present in the language.

How do words group together in English? Consider the noun phrase, a sequence of words surrounding at least one noun. Here are some examples of noun phrases (thanks to Damon Runyon):

    Harry the Horse              a high-class spot such as Mindy's
    the Broadway coppers         the reason he comes into the Hot Box
    they                         three parties from Brooklyn

What evidence do we have that these words group together (or "form constituents")? One piece of evidence is that they can all appear in similar syntactic environments, for example, before a verb:

    three parties from Brooklyn arrive
    a high-class spot such as Mindy's attracts
    the Broadway coppers love

But while the whole noun phrase can occur before a verb, this is not true of each of the individual words that make up a noun phrase. The following are not grammatical sentences of English (recall that we use an asterisk (*) to mark fragments that are not grammatical English sentences):

    *from arrive        *as attracts
    *the is             *spot sat

Thus, to correctly describe facts about the ordering of these words in English, we must be able to say things like "Noun Phrases can occur before verbs".

Other kinds of evidence for constituency come from what are called preposed or postposed constructions. For example, the prepositional phrase "on September seventeenth" can be placed in a number of different locations in the following examples, including at the beginning (preposed) or at the end (postposed):

    On September seventeenth, I'd like to fly from Atlanta to Denver
    I'd like to fly on September seventeenth from Atlanta to Denver
    I'd like to fly from Atlanta to Denver on September seventeenth

But again, while the entire phrase can be placed differently, the individual words making up the phrase cannot be:

    *On September, I'd like to fly seventeenth from Atlanta to Denver
    *On I'd like to fly September seventeenth from Atlanta to Denver
    *I'd like to fly on September from Atlanta to Denver seventeenth

11.2 Context-Free Grammars

The most widely used formal system for modeling constituent structure in English and other natural languages is the Context-Free Grammar, or CFG. Context-free grammars are also called Phrase-Structure Grammars, and the formalism is equivalent to Backus-Naur Form, or BNF. The idea of basing a grammar on constituent structure dates back to the psychologist Wilhelm Wundt (1900) but was not formalized until Chomsky (1956) and, independently, Backus (1959).

A context-free grammar consists of a set of rules or productions, each of which expresses the ways that symbols of the language can be grouped and ordered together, and a lexicon of words and symbols. For example, the following productions express that an NP (or noun phrase) can be composed of either a ProperNoun or a determiner (Det) followed by a Nominal; a Nominal in turn can consist of one or more Nouns.

    NP → Det Nominal
    NP → ProperNoun
    Nominal → Noun | Nominal Noun

Context-free rules can be hierarchically embedded, so we can combine the previous rules with others, like the following, that express facts about the lexicon:

    Noun → flight

The symbols that are used in a CFG are divided into two classes. The symbols that correspond to words in the language ("the", "nightclub") are called terminal symbols; the lexicon is the set of rules that introduce these terminal symbols. The symbols that express abstractions over these terminals are called non-terminals. In each context-free rule, the item to the right of the arrow (→) is an ordered list of one or more terminals and non-terminals; to the left of the arrow is a single non-terminal symbol expressing some cluster or generalization. Notice that in the lexicon, the non-terminal associated with each word is its lexical category, or part of speech, which we defined in Chapter 8.

A CFG can be thought of in two ways: as a device for generating sentences and as a device for assigning a structure to a given sentence. Viewing a CFG as a generator, we can read the → arrow as "rewrite the symbol on the left with the string of symbols on the right". So starting from the symbol

    NP

we can use our first rule to rewrite NP as

    Det Nominal

and then apply Nominal → Noun, which gives

    Det Noun

and finally rewrite these parts of speech as

    a flight

We say the string "a flight" can be derived from the non-terminal NP. Thus, a CFG can be used to generate a set of strings. This sequence of rule expansions is called a derivation of the string of words. It is common to represent a derivation by a parse tree (commonly shown inverted with the root at the top). Figure 11.1 shows the tree representation of this derivation.

Figure 11.1 A parse tree for "a flight".

In the parse tree shown in Fig. 11.1, we can say that the node NP dominates all the nodes in the tree (Det, Nom, Noun, a, flight). We can say further that it immediately dominates the nodes Det and Nom.

The formal language defined by a CFG is the set of strings that are derivable from the designated start symbol. Each grammar must have one designated start symbol, which is often called S. Since context-free grammars are often used to define sentences, S is usually interpreted as the "sentence" node, and the set of strings that are derivable from S is the set of sentences in some simplified version of English.

Let's add a few additional rules to our inventory. The following rule expresses the fact that a sentence can consist of a noun phrase followed by a verb phrase:

    S → NP VP                    I prefer a morning flight

A verb phrase in English consists of a verb followed by assorted other things;
for example, one kind of verb phrase consists of a verb followed by a noun phrase:

    VP → Verb NP                 prefer a morning flight

Or the verb may be followed by a noun phrase and a prepositional phrase:

    VP → Verb NP PP              leave Boston in the morning

Or the verb phrase may have a verb followed by a prepositional phrase alone:

    VP → Verb PP                 leaving on Thursday

A prepositional phrase generally has a preposition followed by a noun phrase. For example, a common type of prepositional phrase in the ATIS corpus is used to indicate location or direction:

    PP → Preposition NP          from Los Angeles

The NP inside a PP need not be a location; PPs are often used with times and dates, and with other nouns as well; they can be arbitrarily complex. Here are ten examples from the ATIS corpus:

    to Seattle                   on these flights
    in Minneapolis               about the ground transportation in Chicago
    on Wednesday                 of the round trip flight on United Airlines
    in the evening               of the AP fifty seven flight
    on the ninth of July         with a stopover in Nashville

Figure 11.2 gives a sample lexicon, and Fig. 11.3 summarizes the grammar rules we've seen so far, which we'll call L0. Note that we can use the or-symbol | to indicate that a non-terminal has alternate possible expansions.

    Noun → flights | breeze | trip | morning
    Verb → is | prefer | like | need | want | fly
    Adjective → cheapest | non-stop | first | latest | other | direct
    Pronoun → me | I | you | it
    Proper-Noun → Alaska | Baltimore | Los Angeles | Chicago | United | American
    Determiner → the | a | an | this | these | that
    Preposition → from | to | on | near
    Conjunction → and | or | but

Figure 11.2 The lexicon for L0.

    Grammar Rules                Examples
    S → NP VP                    I want a morning flight
    NP → Pronoun                 I
       | Proper-Noun             Los Angeles
       | Det Nominal             a flight
    Nominal → Nominal Noun       morning flight
       | Noun                    flights
    VP → Verb                    do
       | Verb NP                 want a flight
       | Verb NP PP              leave Boston in the morning
       | Verb PP                 leaving on Thursday
    PP → Preposition NP          from Los Angeles

Figure 11.3 The grammar for L0, with example phrases for each rule.

We can use this grammar to generate sentences of this "ATIS-language". We start with S, expand it to NP VP, then choose a random expansion of NP (let's say, to I), and a random expansion of VP (let's say, to Verb NP), and so on until we generate the string "I prefer a morning flight". Figure 11.4 shows a parse tree that represents a complete derivation of "I prefer a morning flight".

It is sometimes convenient to represent a parse tree in a more compact format called bracketed notation; here is the bracketed representation of the parse tree of Fig. 11.4:

(11.1) [S [NP [Pro I]] [VP [V prefer] [NP [Det a] [Nom [N morning] [Nom [N flight]]]]]]

A CFG like that of L0 defines a formal language. We saw in Chapter 2 that a formal language is a set of strings. Sentences (strings of words) that can be derived by a grammar are in the formal language defined by that grammar, and are called grammatical sentences. Sentences that cannot be derived by a given formal grammar are not in the language defined by that grammar and are referred to as ungrammatical. This hard line between "in" and "out" characterizes all formal languages but is only a very simplified model of how natural languages really work. This is because determining whether a given sentence is part of a given natural language (say, English) often depends on the context. In linguistics, the use of formal languages to model natural languages is called generative grammar since the language is defined by the set of possible sentences "generated" by the grammar.

11.2.1 Formal Definition of Context-Free Grammar

We conclude this section with a quick, formal description of a context-free grammar and the language it generates. A context-free grammar $G$ is defined by four parameters: $N$, $\Sigma$, $R$, $S$ (technically, this is a "4-tuple").
    $N$         a set of non-terminal symbols (or variables)
    $\Sigma$    a set of terminal symbols (disjoint from $N$)
    $R$         a set of rules or productions, each of the form $A \rightarrow \beta$, where $A$ is a non-terminal and $\beta$ is a string of symbols from the infinite set of strings $(\Sigma \cup N)^*$
    $S$         a designated start symbol and a member of $N$

Figure 11.4 The parse tree for "I prefer a morning flight" according to grammar L0.

For the remainder of the book we adhere to the following conventions when discussing the formal properties of context-free grammars (as opposed to explaining particular facts about English or other languages).

    Capital letters like $A$, $B$, and $S$                                Non-terminals
    $S$                                                                   The start symbol
    Lower-case Greek letters like $\alpha$, $\beta$, and $\gamma$         Strings drawn from $(\Sigma \cup N)^*$
    Lower-case Roman letters like $u$, $v$, and $w$                       Strings of terminals

A language is defined through the concept of derivation. One string derives another one if it can be rewritten as the second one by some series of rule applications. More formally, following Hopcroft and Ullman (1979), if $A \rightarrow \beta$ is a production of $R$ and $\alpha$ and $\gamma$ are any strings in the set $(\Sigma \cup N)^*$, then we say that $\alpha A \gamma$ directly derives $\alpha \beta \gamma$, or $\alpha A \gamma \Rightarrow \alpha \beta \gamma$.

Derivation is then a generalization of direct derivation. Let $\alpha_1, \alpha_2, \ldots, \alpha_m$ be strings in $(\Sigma \cup N)^*$, $m \geq 1$, such that

$$\alpha_1 \Rightarrow \alpha_2,\ \alpha_2 \Rightarrow \alpha_3,\ \ldots,\ \alpha_{m-1} \Rightarrow \alpha_m$$

We say that $\alpha_1$ derives $\alpha_m$, or $\alpha_1 \stackrel{*}{\Rightarrow} \alpha_m$.

We can then formally define the language $L_G$ generated by a grammar $G$ as the set of strings composed of terminal symbols that can be derived from the designated start symbol $S$:

$$L_G = \{\, w \mid w \text{ is in } \Sigma^* \text{ and } S \stackrel{*}{\Rightarrow} w \,\}$$

The problem of mapping from a string of words to its parse tree is called syntactic parsing; we define algorithms for parsing in Chapter 12.

11.3 Some Grammar Rules for English

In this section, we introduce a few more aspects of the phrase structure of English; for consistency we will continue to focus on sentences from the ATIS domain. Because of space limitations, our discussion is necessarily limited to highlights. Readers are strongly advised to consult a good reference grammar of English, such as Huddleston and Pullum (2002).

11.3.1 Sentence-Level Constructions

In the small grammar L0, we provided only one sentence-level construction for declarative sentences like "I prefer a morning flight". Among the large number of constructions for English sentences, four are particularly common and important: declaratives, imperatives, yes-no questions, and wh-questions.

Sentences with declarative structure have a subject noun phrase followed by a verb phrase, like "I prefer a morning flight". Sentences with this structure have a great number of different uses that we follow up on in Chapter 25. Here are a number of examples from the ATIS domain:

    I want a flight from Ontario to Chicago
    The flight should be eleven a.m. tomorrow
    The return flight should leave at around seven p.m.
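To make the definition of the language generated by a grammar more concrete, here is a minimal Python sketch that encodes L0 as rules plus a lexicon and tests whether a word sequence is derivable from a symbol. It is only an illustration under stated assumptions, not one of the parsing algorithms of Chapter 12: the names `LEXICON`, `RULES`, and `derives` are hypothetical, the key `Det` stands in for the lexicon's Determiner category, the multi-word name "Los Angeles" is left out so that splitting on spaces suffices, and "flight" is added to the nouns from the earlier rule Noun → flight. The recursion works on spans and gives every right-hand-side symbol at least one word, so the left-recursive rule Nominal → Nominal Noun does not loop.

```python
from functools import lru_cache

# Lexicon of L0: pre-terminal category -> words (single-word entries only).
LEXICON = {
    "Noun": {"flight", "flights", "breeze", "trip", "morning"},
    "Verb": {"is", "prefer", "like", "need", "want", "fly"},
    "Pronoun": {"me", "I", "you", "it"},
    "Proper-Noun": {"Alaska", "Baltimore", "Chicago", "United", "American"},
    "Det": {"the", "a", "an", "this", "these", "that"},
    "Preposition": {"from", "to", "on", "near"},
}

# Phrase-structure rules of L0: non-terminal -> list of right-hand sides.
RULES = {
    "S": [("NP", "VP")],
    "NP": [("Pronoun",), ("Proper-Noun",), ("Det", "Nominal")],
    "Nominal": [("Nominal", "Noun"), ("Noun",)],
    "VP": [("Verb",), ("Verb", "NP"), ("Verb", "NP", "PP"), ("Verb", "PP")],
    "PP": [("Preposition", "NP")],
}


def derives(symbol, words):
    """Return True if `symbol` derives exactly the sequence `words`."""
    words = tuple(words)

    @lru_cache(maxsize=None)
    def span(sym, i, j):
        # Does `sym` derive words[i:j]?
        if sym in LEXICON:
            return j - i == 1 and words[i] in LEXICON[sym]
        return any(seq(rhs, i, j) for rhs in RULES.get(sym, ()))

    @lru_cache(maxsize=None)
    def seq(rhs, i, j):
        # Does the symbol sequence `rhs` derive words[i:j]?
        if not rhs:
            return i == j
        first, rest = rhs[0], rhs[1:]
        # `first` takes at least one word and leaves one per remaining symbol.
        for k in range(i + 1, j - len(rest) + 1):
            if span(first, i, k) and seq(rest, k, j):
                return True
        return False

    return span(symbol, 0, len(words))


print(derives("S", "I prefer a morning flight".split()))
print(derives("S", "prefer I".split()))
```

Under these assumptions the first call reports that the declarative example is in the language of L0 and the second reports that the scrambled string is not. The sketch only recognizes strings; it does not build the parse tree.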

Sentences with imperative structure often begin with a verb phrase and have no subject. They are called imperative because they are almost always used for commands and suggestions; in the ATIS domain they are commands to the system.

    Show the lowest fare
    Give me Sunday's flights arriving in Las Vegas from New York City
    List all flights between five and seven p.m.

We can model this sentence structure with another rule for the expansion of S:

    S → VP

Sentences with yes-no question structure are often (though not always) used to ask questions; they begin with an auxiliary verb, followed by a subject NP, followed by a VP. Here are some examples. Note that the third example is not a question at all but a request; Chapter 25 discusses the uses of these question forms to perform different pragmatic functions such as asking, requesting, or suggesting.

    Do any of these flights have stops?
    Does American's flight eighteen twenty five serve dinner?
    Can you give me the same information for United?

Here's the rule:

    S → Aux NP VP

The most complex sentence-level structures we examine here are the various wh-structures. These are so named because one of their constituents is a wh-phrase, that is, one that includes a wh-word (who, whose, when, where, what, which, how, why). These may be broadly grouped into two classes of sentence-level structures. The wh-subject-question structure is identical to the declarative structure, except that the first noun phrase contains some wh-word.

    What airlines fly from Burbank to Denver?
    Which flights depart Burbank after noon and arrive in Denver by six p.m.?
    Whose flights serve breakfast?

Here is a rule. Exercise 11.7 discusses rules for the constituents that make up the Wh-NP.

    S → Wh-NP VP

In the wh-non-subject-question structure, the wh-phrase is not the subject of the sentence, and so the sentence includes another subject. In these types of sentences the auxiliary appears before the subject NP, just as in the yes-no question structures. Here is an example followed by a sample rule:

    What flights do you have from Burbank to Tacoma Washington?

    S → Wh-NP Aux NP VP

Constructions like the wh-non-subject-question contain what are called long-distance dependencies because the Wh-NP "what flights" is far away from the predicate that it is semantically related to, the main verb "have" in the VP. In some models of parsing and understanding compatible with the grammar rule above, long-distance dependencies like the relation between "flights" and "have" are thought of as a semantic relation. In such models, the job of figuring out that "flights" is the argument of "have" is done during semantic interpretation. In other models of parsing, the relationship between "flights" and "have" is considered to be a syntactic relation, and the grammar is modified to insert a small marker called a trace or empty category after the verb. We return to such empty-category models when we introduce the Penn Treebank.

11.3.2 Clauses and Sentences

Before we move on, we should clarify the status of the S rules in the grammars we just described. S rules are intended to account for entire sentences that stand alone as fundamental units of discourse. However, S can also occur on the right-hand side of grammar rules and hence can be embedded within larger sentences. Clearly then, there's more to being an S than just standing alone as a unit of discourse.

What differentiates sentence constructions (i.e., the S rules) from the rest of the grammar is the notion that they are in some sense complete. In this way they correspond to the notion of a clause, which traditional grammars often describe as forming a complete thought. One way of making this notion of "complete thought" more precise is to say an S is a node of the parse tree below which the main verb of the S has all of its arguments. We define verbal arguments later, but for now let's just see an illustration from the tree for "I prefer a morning flight" in Fig. 11.4. The verb "prefer" has two arguments: the subject "I" and the object "a morning flight". One of the arguments appears below the VP node, but the other one, the subject NP, appears
only below the S node.

11.3.3 The Noun Phrase

Our L0 grammar introduced three of the most frequent types of noun phrases that occur in English: pronouns, proper nouns, and the NP → Det Nominal construction. The central focus of this section is on the last type since that is where the bulk of the syntactic complexity resides. These noun phrases consist of a head, the central noun in the noun phrase, along with various modifiers that can occur before or after the head noun. Let's take a close look at the various parts.

The Determiner

Noun phrases can begin with simple lexical determiners, as in the following examples:

    a stop             the flights        this flight
    those flights      any flights        some flights

The role of the determiner in English noun phrases can also be filled by more complex expressions, as follows:

    United's flight
    United's pilot's union
    Denver's mayor's mother's canceled flight

In these examples, the role of the determiner is filled by a possessive expression consisting of a noun phrase followed by an 's as a possessive marker, as in the following rule:

    Det → NP 's

The fact that this rule is recursive (since an NP can start with a Det) helps us model the last two examples above, in which a sequence of possessive expressions serves as a determiner.

Under some circumstances determiners are optional in English. For example,
determiners may be omitted if the noun they modify is plural:

(11.2) Show me flights from San Francisco to Denver on weekdays

As we saw in Chapter 8, mass nouns also don't require determination. Recall that mass nouns often (not always) involve something that is treated like a substance (including e.g., water and snow), don't take the indefinite article "a", and don't tend to pluralize. Many abstract nouns are mass nouns (music, homework). Mass nouns in the ATIS domain include breakfast, lunch, and dinner:

(11.3) Does this flight serve dinner?

The Nominal

The nominal construction follows the determiner and contains any pre- and post-head noun modifiers. As indicated in grammar L0, in its simplest form a nominal can consist of a single noun.

    Nominal → Noun

As we'll see, this rule also provides the basis for the bottom of various recursive rules used to capture more complex nominal constructions.

Before the Head Noun

A number of different kinds of word classes can appear before the head noun (the "postdeterminers") in a nominal. These include cardinal numbers, ordinal numbers, quantifiers, and adjectives. Examples of cardinal numbers:

    two friends        one stop

Ordinal numbers include first, second, third, and so on, but also words like next, last, past, other, and another:

    the first one      the next day       the second leg
    the last flight    the other American flight

Some quantifiers (many, a few, several) occur only with plural count nouns:

    many fares

Adjectives occur after quantifiers but before nouns.

    a first-class fare       a non-stop flight
    the longest layover      the earliest lunch flight

Adjectives can also be grouped into a phrase called an adjective phrase or AP. APs can have an adverb before the adjective (see Chapter 8 for definitions of adjectives and adverbs):

    the least expensive fare

After the Head Noun

A head noun can be followed by postmodifiers. Three kinds of nominal postmodifiers are common in English:

    prepositional phrases    all flights from Cleveland
    non-finite clauses       any flights arriving after eleven a.m.
    relative clauses         a flight that serves breakfast

Prepositional phrases are especially common in the ATIS corpus since they are used to mark the origin and destination of flights.

Here are some examples of prepositional phrase postmodifiers, with brackets inserted to show the boundaries of each PP; note that two or more PPs can be strung together within a single NP:

    all flights [from Cleveland] [to Newark]
    arrival [in San Jose] [before seven p.m.]
    a reservation [on flight six oh six] [from Tampa] [to Montreal]

Here's a new nominal rule to account for postnominal PPs:

    Nominal → Nominal PP

The three most common kinds of non-finite postmodifiers are the gerundive
(-ing), -ed, and infinitive forms.

Gerundive postmodifiers are so called because they consist of a verb phrase that begins with the gerundive (-ing) form of the verb. Here are some examples:

    any of those leaving on Thursday
    any flights arriving after eleven a.m.
    flights arriving within thirty minutes of each other

We can define the Nominals with gerundive modifiers as follows, making use of a new non-terminal GerundVP:

    Nominal → Nominal GerundVP

We can make rules for GerundVP constituents by duplicating all of our VP productions, substituting GerundV for V.

    GerundVP → GerundV NP
       | GerundV PP | GerundV | GerundV NP PP

GerundV can then be defined as

    GerundV → being | arriving | leaving

The bracketed phrases below are examples of the two other common kinds of non-finite clauses, infinitives and -ed forms:

    the last flight [to arrive in Boston]
    I need to have dinner [served]
    Which is the aircraft [used by this flight]?

A postnominal relative clause (more correctly, a restrictive relative clause) is a clause that often begins with a relative pronoun ("that" and "who" are the most common). The relative pronoun functions as the subject of the embedded verb in the following examples:

    a flight that serves breakfast
    flights that leave in the morning
    the one that leaves at ten thirty five

We might add rules like the following to deal with these:

    Nominal → Nominal RelClause
    RelClause → (who | that) VP

The relative pronoun may also function as the object of the embedded verb, as in the following example; we leave for the reader the exercise of writing grammar rules for more complex relative clauses of this kind.

    the earliest American Airlines flight that I can get

Various postnominal modifiers can be combined, as the following examples show:

    a flight from Phoenix to Detroit leaving Monday evening
    evening flights from Nashville to Houston that serve dinner
    a friend living in Denver that would like to visit me in DC

Before the Noun Phrase

Word classes that modify and appear before NPs are called predeterminers. Many of these have to do with number or amount; a common predeterminer is "all":

    all the flights    all flights        all non-stop flights

The example noun phrase given in Fig. 11.5 illustrates some of the complexity that arises when these rules are combined.
Figure 11.5 A parse tree for "all the morning flights from Denver to Tampa leaving before 10".

11.3.4 The Verb Phrase

The verb phrase consists of the verb and a number of other constituents. In the simple rules we have built so far, these other constituents include NPs and PPs and combinations of the two:

    VP → Verb                    disappear
    VP → Verb NP                 prefer a morning flight
    VP → Verb NP PP              leave Boston in the morning
    VP → Verb PP                 leaving on Thursday

Verb phrases can be significantly more complicated than this. Many other kinds of constituents, such as an entire embedded sentence, can follow the verb. These are called sentential complements:

    You [VP [V said] [S you had a two hundred sixty six dollar fare]]
    [VP [V Tell] [NP me] [S how to get from the airport in Philadelphia to downtown]]
    I [VP [V think] [S I would like to take the nine thirty flight]]

Here's a rule for these:

    VP → Verb S

Similarly, another potential constituent of the VP is another VP. This is often the case for verbs like want, would like, try, intend, need:

    I want [VP to fly from Milwaukee to Orlando]
    Hi, I want [VP to arrange three flights]

    Frame          Verb                   Example
    ∅              eat, sleep             I ate
    NP             prefer, find, leave    Find [NP the flight from Pittsburgh to Boston]
    NP NP          show, give             Show [NP me] [NP airlines with flights from Pittsburgh]
    PPfrom PPto    fly, travel            I would like to fly [PP from Boston] [PP to Philadelphia]
    NP PPwith      help, load             Can you help [NP me] [PP with a flight]
    VPto           prefer, want, need     I would prefer [VPto to go by United Airlines]
    VPbrst         can, would, might      I can [VPbrst go from Boston]
    S              mean                   Does this mean [S AA has a hub in Boston]

Figure 11.6 Subcategorization frames for a set of example verbs.

While a verb phrase can have many possible kinds of constituents, not every verb is compatible with every verb phrase. For example, the verb "want" can be used either with an NP complement ("I want a flight") or with an infinitive VP complement ("I want to fly to …"). By contrast, a verb like "find" cannot take this sort of VP complement (*I found to fly to Dallas).

This idea that verbs are compatible with different kinds of complements is a very old one; traditional grammar distinguishes between transitive verbs like "find", which take a direct object NP ("I found a flight"), and intransitive verbs like "disappear", which do not (*I disappeared a flight).

Where traditional grammars subcategorize verbs into these two categories (transitive and intransitive), modern grammars distinguish as many as 100 subcategories. We say that a verb like "find" subcategorizes for an NP, and a verb like "want" subcategorizes for either an NP or a non-finite VP. We also call these constituents the complements of the verb (hence our use of the term sentential complement above). So we say that "want" can take a VP complement. These possible sets of complements are called the subcategorization frame for the verb. Another way of talking about the relation between the verb and these other constituents is to think of the verb as a logical predicate and the constituents as logical arguments of the predicate. So we can think of such predicate-argument relations as FIND(I, A FLIGHT) or WANT(I, TO FLY). We talk more about this view of verbs and arguments in Chapter 15 when we talk about predicate calculus representations of verb semantics. Subcategorization frames for a set of example verbs are given in Fig. 11.6.

We can capture the association between verbs and their complements by making separate subtypes of the class Verb (e.g., Verb-with-NP-complement, Verb-with-Inf-VP-complement, Verb-with-S-complement, and so on):

    Verb-with-NP-complement → find | leave | repeat
    Verb-with-S-complement → think | believe | say
    Verb-with-Inf-VP-complement → want | try | need
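
As an alternative way to picture the association between verbs and their complements, here is a small Python sketch that stores subcategorization frames in a table and checks whether a verb licenses a given sequence of complements. It is a hedged illustration, not the grammar-rule encoding described in the text: the names `SUBCAT_FRAMES` and `licenses` are hypothetical, the frame labels are taken from Fig. 11.6, and the table covers only a handful of the example verbs (with "disappear" given the empty frame, following the intransitive example).

```python
# Each verb maps to the set of complement sequences it accepts.
# The empty tuple () stands for "no complement" (the empty frame).
SUBCAT_FRAMES = {
    "eat": {()},
    "sleep": {()},
    "disappear": {()},
    "find": {("NP",)},
    "leave": {("NP",)},
    "show": {("NP", "NP")},
    "give": {("NP", "NP")},
    "fly": {("PPfrom", "PPto")},
    "help": {("NP", "PPwith")},
    "prefer": {("NP",), ("VPto",)},
    "want": {("NP",), ("VPto",)},
    "think": {("S",)},
    "mean": {("S",)},
}


def licenses(verb, complements):
    """Return True if `verb` subcategorizes for this complement sequence."""
    return tuple(complements) in SUBCAT_FRAMES.get(verb, set())


print(licenses("want", ["VPto"]))       # "I want to fly ..."
print(licenses("find", ["VPto"]))       # "*I found to fly to Dallas"
print(licenses("disappear", ["NP"]))    # "*I disappeared a flight"
```

Keeping the frames in a lookup table leaves a single Verb category in the grammar, whereas the subtype approach multiplies the rules; which trade-off is preferable depends on the formalism in use.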

Each VP rule could then be modified to require the appropriate verb subtype:

    VP → Verb-with-no-complement         disappear
    VP → Verb-with-NP-comp NP            prefer a morning flight
    VP → Verb-with-S-comp S              said there were two flights

A problem with this approach is the significant increase in the number of rules and the associated loss of generality.

11.3.5 Coordination

The major phrase types discussed here can be conjoined with conjunctions like and, or, and but to form larger constructions of the same type. For example, a coordinate noun phrase can consist of two other noun phrases separated by a conjunction:

    Please repeat [NP [NP the flights] and [NP the costs]]
    I need to know [NP [NP the aircraft] and [NP the flight number]]

Here's a rule that allows these structures:

    NP → NP and NP

Note that the ability to form coordinate phrases through conjunctions is often used as a test for constituency. Consider the following examples, which differ from the ones given above in that they lack the second determiner.

    Please repeat the [Nom [Nom flights] and [Nom costs]]
    I need to know the [Nom [Nom aircraft] and [Nom flight number]]

The fact that these phrases can be conjoined is evidence for the presence of the underlying Nominal constituent we have been making use of. Here's a rule for this:

    Nominal → Nominal and Nominal

The following examples illustrate conjunctions involving VPs and Ss.

    What flights do you have [VP [VP leaving Denver] and [VP arriving in San Francisco]]
    [S [S I'm interested in a flight from Dallas to Washington] and [S I'm also interested in going to Baltimore]]

The rules for VP and S conjunctions mirror the NP one given above.

    VP → VP and VP
    S → S and S

Since all the major phrase types can be conjoined in this fashion, it is also possible to represent this conjunction fact more generally; a number of grammar formalisms such as GPSG (Gazdar et al., 1985) do this using metarules such as the following:

    X → X and X

This metarule simply states that any non-terminal can be conjoined with the same non-terminal to yield a constituent of the same type. Of course, the variable X must be designated as a variable that stands for any non-terminal rather than a non-terminal itself.

11.4 Treebanks

Sufficiently robust grammars consisting of context-free grammar rules can be used to assign a parse tree to any sentence. This means that it is possible to build a corpus where every sentence in the collection is paired with a corresponding parse tree. Such a syntactically annotated corpus is called a treebank. Treebanks play an important role in parsing, as we discuss in Chapter 12, as well as in linguistic investigations of syntactic phenomena.

A wide variety of treebanks have been created, generally through the use of parsers (of the sort described in the next few chapters) to automatically parse each sentence, followed by the use of humans (linguists) to hand-correct the parses. The Penn Treebank project (whose POS tagset we introduced in Chapter 8) has produced treebanks from the Brown, Switchboard, ATIS, and Wall Street Journal corpora of English, as well as treebanks in Arabic and Chinese. A number of treebanks use the dependency representation we will introduce in Chapter 14, including many that are part of the Universal Dependencies project (Nivre et al., 2016).

11.4.1 Example: The Penn Treebank Project

Figure 11.7 shows sentences from the Brown and ATIS portions of the Penn Treebank [1]. Note the formatting differences for the part-of-speech tags; such small differences are common and must be dealt with in processing treebanks. The Penn Treebank part-of-speech tagset was defined in Chapter 8. The use of LISP-style parenthesized notation for trees is extremely common and resembles the bracketed notation we saw earlier in (11.1). For those who are not familiar with it, we show a standard node-and-line tree representation in Fig. 11.8.

    (a)
    (S
      (NP-SBJ (DT That)
        (JJ cold)
        (JJ empty) (NN sky))
      (VP (VBD was)
        (ADJP-PRD (JJ full)
          (PP (IN of)
            (NP (NN fire))))))

    (b)
    (S
      (NP-SBJ The/DT flight/NN)
      (VP should/MD
        (VP arrive/VB
          (PP-TMP at/IN
            (NP eleven/CD a.m/RB))
          (NP-TMP tomorrow/NN))))

Figure 11.7 Parsed sentences from the LDC Treebank3 version of the (a) Brown and (b) ATIS corpora.

Figure 11.9 shows a tree from the Wall Street Journal. This tree shows another feature of the Penn Treebanks: the use of traces (-NONE- nodes) to mark long-distance dependencies or syntactic movement. For example, quotations often follow a quotative verb like "say". But in this example, the quotation "We would have to wait until we have collected on those assets" precedes the words "he said". An empty S containing only the node -NONE- marks the position after "said" where the quotation sentence often occurs. This empty node is marked (in Treebanks II and III) with the index 2, as is the quotation S at the beginning of the sentence. Such co-indexing may make it easier for some parsers to recover the fact that this fronted or topicalized quotation is the complement of the verb "said". A similar -NONE- node

[1] The Penn Treebank project released treebanks in multiple languages and in various stages; for example, there were Treebank I (Marcus et al., 1993), Treebank II (Marcus et al., 1994), and Treebank III releases of English treebanks. We use Treebank III for our examples.
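
Since the LISP-style parenthesized notation has to be read back in before a treebank can be processed, a short Python sketch of such a reader may help. It is a minimal illustration under stated assumptions rather than a treebank-processing tool: the names `tokenize`, `parse_tree`, and `leaves` are hypothetical, every node is assumed to have the form (LABEL child ...), and neither the word/TAG style of the ATIS example nor malformed input is handled specially.

```python
import re


def tokenize(text):
    """Split a parenthesized tree string into '(', ')' and symbol tokens."""
    return re.findall(r"\(|\)|[^\s()]+", text)


def parse_tree(text):
    """Read one tree of the form (LABEL child ...) into (label, children)."""
    tokens = tokenize(text)
    pos = 0

    def node():
        nonlocal pos
        if tokens[pos] != "(":
            raise ValueError("expected '(' at token %d" % pos)
        pos += 1
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(node())       # nested constituent
            else:
                children.append(tokens[pos])  # a word (leaf)
                pos += 1
        pos += 1  # consume the closing ')'
        return (label, children)

    tree = node()
    if pos != len(tokens):
        raise ValueError("unexpected tokens after the tree")
    return tree


def leaves(tree):
    """Return the words of the tree in left-to-right order."""
    _, children = tree
    words = []
    for child in children:
        words.extend(leaves(child) if isinstance(child, tuple) else [child])
    return words


example = "(S (NP (Pro I)) (VP (V prefer) (NP (Det a) (Nom (N morning) (Nom (N flight))))))"
print(leaves(parse_tree(example)))
```

The example string is the bracketed tree of (11.1) rewritten with parentheses. Representing each node as a (label, children) pair keeps the sketch small; a real reader would also need to normalize the tag formatting differences noted above.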
