An Efficient Probabilistic Context-Free Parsing Algorithm that Computes Prefix Probabilities

Contents

Slide 2


Overview

What is this paper all about?
Key ideas from the title:
Context-Free Parsing
Probabilistic
Computes Prefix Probabilities
Efficient
Slide 3


Context-Free Parsing

[Diagram: the input sentence “The ball is heavy.” goes into a Parser, which outputs a parse of “The ball is heavy”.]

Slide 4


Context-Free Parsing

[Diagram: the Parser now consults a Grammar while turning “The ball is heavy.” into a parse.]

Slide 5


Context-Free Parsing

[Diagram: the Parser consults both a Grammar and a Lexicon while turning “The ball is heavy.” into a parse.]

Slide 6


Context-Free Parsing

What if there are multiple legal parses?
Example: “Yair looked over the paper.”
How does the word “over” function?

[Diagram: two alternative parse trees for “Yair looked over the paper”: in one, “looked over” acts as the verb (S, NP, VP, V, NP); in the other, “over the paper” forms a prepositional phrase (S, NP, VP, V, NP, PP, P, N, N).]

Slide 7


Probabilistic Parsing

Use probabilities to find the most likely parse
Store typical probabilities for words and rules
In this case:

[Diagram: the same two parse trees for “Yair looked over the paper”, one labeled P = 0.99 and the other P = 0.01.]

Slide 8


Prefix Probabilities

How likely is a partial parse?

[Diagram: two partial parse trees for the prefix “Yair looked over …”, one for each of the competing analyses.]

Slide 9


Efficiency

The Earley algorithm (upon which Stolcke builds) is one of the most efficient known parsing algorithms
Probabilities allow intelligent pruning of the developing parse tree(s)
Slide 10


Parsing Algorithms

How do we construct a parse tree?
Work from grammar to sentence (top-down)
Work from sentence to grammar (bottom-up)
Work from both ends at once (Earley)
Predictably, Earley works best
Slide 11


Earley Parsing Overview

Start with a root constituent, e.g. sentence
Until the sentence is complete, repeatedly:
Predict: expand constituents as specified in the grammar
Scan: match constituents with words in the input
Complete: propagate constituent completions up to their parents
Prediction is top-down, while scanning and completion are bottom-up
Slide 12


Earley Parsing Overview

Earley parsing uses a chart rather than a tree to develop the parse
Constituents are stored independently, indexed by word positions in the sentence
Why do this?
Eliminate recalculation when tree branches are abandoned and later rebuilt
Concisely represent multiple parses
Slide 13


Earley Parsing Example

S → NP VP
NP → ART N
VP → V ADJ
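The predict/scan/complete loop described in the overview can be sketched as a minimal Earley recognizer for this grammar. The lexicon entries below are assumed for the example sentence “The ball is heavy”; the slides show only the grammar rules.

```python
# Minimal Earley recognizer sketch for the toy grammar on this slide.
# The lexicon (which words are ART/N/V/ADJ) is an assumption for the demo.

GRAMMAR = {
    "S":  [["NP", "VP"]],
    "NP": [["ART", "N"]],
    "VP": [["V", "ADJ"]],
}
LEXICON = {
    "ART": {"the"}, "N": {"ball"}, "V": {"is"}, "ADJ": {"heavy"},
}

def earley_recognize(words):
    # chart[i] holds states (lhs, rhs, dot, origin) after reading i words
    n = len(words)
    chart = [set() for _ in range(n + 1)]
    chart[0].add(("GAMMA", ("S",), 0, 0))  # dummy start state
    for i in range(n + 1):
        agenda = list(chart[i])
        while agenda:
            lhs, rhs, dot, origin = agenda.pop()
            if dot < len(rhs):
                sym = rhs[dot]
                if sym in GRAMMAR:                       # predict
                    for prod in GRAMMAR[sym]:
                        st = (sym, tuple(prod), 0, i)
                        if st not in chart[i]:
                            chart[i].add(st)
                            agenda.append(st)
                elif i < n and words[i] in LEXICON.get(sym, ()):  # scan
                    chart[i + 1].add((lhs, rhs, dot + 1, origin))
            else:                                        # complete
                for plhs, prhs, pdot, porig in list(chart[origin]):
                    if pdot < len(prhs) and prhs[pdot] == lhs:
                        st = (plhs, prhs, pdot + 1, porig)
                        if st not in chart[i]:
                            chart[i].add(st)
                            agenda.append(st)
    return ("GAMMA", ("S",), 1, 0) in chart[n]

print(earley_recognize("the ball is heavy".split()))  # True
```

States scanned at position i are placed in chart[i + 1] and processed when the loop reaches that position, which is exactly the chart-not-tree bookkeeping the previous slide motivates.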
Slides 14-21

Earley Parsing Example (continued)

S → NP VP
NP → ART N
VP → V ADJ

[Chart construction shown graphically, step by step, on each slide.]
Slide 22


Probabilistic Parsing

How do we parse probabilistically?
Assign probabilities to grammar rules and words in lexicon
Grammar and lexicon “randomly” generate all possible sentences in the language
P(parse tree) = P(sentence generation)
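The “random generation” view can be sketched directly: a PCFG sampler that expands S top-down, choosing productions by probability. All probabilities and lexicon entries below are hypothetical placeholders, since the slides do not list them.

```python
import random

# Hypothetical PCFG: each symbol maps to (rhs, probability) pairs.
# The probabilities for each symbol sum to 1, so the grammar defines a
# distribution over the sentences of the language.
RULES = {
    "S":   [(("NP", "VP"), 1.0)],
    "NP":  [(("ART", "N"), 1.0)],
    "VP":  [(("V", "ADJ"), 1.0)],
    "ART": [(("the",), 0.6), (("a",), 0.4)],
    "N":   [(("ball",), 0.5), (("paper",), 0.5)],
    "V":   [(("is",), 1.0)],
    "ADJ": [(("heavy",), 0.7), (("light",), 0.3)],
}

def generate(sym="S"):
    """Sample one derivation from sym; return (words, P(derivation))."""
    if sym not in RULES:                       # terminal word
        return [sym], 1.0
    rhs, p = random.choices(RULES[sym],
                            weights=[w for _, w in RULES[sym]])[0]
    words, prob = [], p
    for child in rhs:
        w, q = generate(child)
        words += w
        prob *= q                              # tree probability multiplies up
    return words, prob

words, prob = generate()
print(" ".join(words), prob)   # e.g. "the ball is heavy" with prob 0.21
```

The probability returned for a sampled sentence is exactly the probability of its parse tree, which is the P(parse tree) = P(sentence generation) identity on this slide.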
Slide 23


Probabilistic Parsing

Terminology
Earley state: each step of the processing that a constituent undergoes. Examples:
Starting sentence
Half-finished sentence
Complete sentence
Half-finished noun phrase
etc.
Earley path: a sequence of linked states
Example: the complete parse just described
Slide 24


Probabilistic Parsing

Can represent the parse as a Markov chain:
Markov assumption (state probability is independent of path) applies, due to CFG

[Diagram: a Markov chain of Earley states. Predict S leads to “S → NP VP, begin”; Predict NP leads to “NP → ART N, begin” and “NP → PN, begin” (a path later abandoned); Scan “the” and Scan “ball” advance the NP to half done and then done; Complete NP leaves S half done.]

Slide 25


Probabilistic Parsing

Every Earley path corresponds to a parse tree
P(tree) = P(path)
Assign a probability to each state transition
Prediction: probability of grammar rule
Scanning: probability of word in lexicon
Completion: accumulated (“inner”) probability of the finished constituent
P(path) = product of P(transition)s
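The product in the last bullet can be checked on a small tree. The sketch below computes the inner probability of a parse tree by multiplying the probability of every rule and lexical choice it uses; all probabilities are hypothetical, as the slides give only the 0.99/0.01 pair for the earlier ambiguity example.

```python
from math import prod  # Python 3.8+

# Hypothetical rule and word probabilities for the running example grammar.
RULE_P = {("S", ("NP", "VP")): 1.0,
          ("NP", ("ART", "N")): 1.0,
          ("VP", ("V", "ADJ")): 1.0}
WORD_P = {("ART", "the"): 0.6, ("N", "ball"): 0.5,
          ("V", "is"): 1.0, ("ADJ", "heavy"): 0.7}

def tree_prob(tree):
    """Inner probability of a parse tree: the probability of the rule at
    the root times the inner probabilities of all subtrees."""
    label, *children = tree
    if isinstance(children[0], str):           # preterminal over a word
        return WORD_P[(label, children[0])]
    rhs = tuple(child[0] for child in children)
    return RULE_P[(label, rhs)] * prod(tree_prob(c) for c in children)

t = ("S",
     ("NP", ("ART", "the"), ("N", "ball")),
     ("VP", ("V", "is"), ("ADJ", "heavy")))
print(tree_prob(t))   # 0.6 * 0.5 * 1.0 * 0.7, i.e. about 0.21
```

Because every transition on a path contributes exactly one rule or word probability, this product is the same number as P(path) in the bullet above.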
Slide 26


Probabilistic Parse Example

[The example Grammar and Lexicon are shown graphically.]

Slides 27-37

Probabilistic Parse Example

[Worked chart steps shown graphically on each slide.]
Slide 38


Prefix Probabilities

Current algorithm reports parse tree probability when the sentence is completed
What if we don’t have a full sentence?
Probability is tracked by constituent (“inner”), rather than by path (“forward”)
Slide 39


Prefix Probabilities

Solution: add a separate path probability
Same as before, but propagate down on prediction step
This is the missing link to chain the path probabilities together
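For a toy grammar whose language is finite, the quantity the forward probabilities deliver can be cross-checked by brute force: enumerate every sentence with its probability and sum over the sentences that start with the prefix. Stolcke's algorithm computes the same number incrementally; the grammar and probabilities below are hypothetical.

```python
from itertools import product as cross

# Hypothetical PCFG with a finite language (no recursion).
RULES = {
    "S":   [(("NP", "VP"), 1.0)],
    "NP":  [(("ART", "N"), 1.0)],
    "VP":  [(("V", "ADJ"), 1.0)],
    "ART": [(("the",), 0.6), (("a",), 0.4)],
    "N":   [(("ball",), 0.5), (("paper",), 0.5)],
    "V":   [(("is",), 1.0)],
    "ADJ": [(("heavy",), 0.7), (("light",), 0.3)],
}

def expansions(sym):
    """Every (sentence, probability) pair derivable from sym."""
    if sym not in RULES:                       # terminal word
        return [([sym], 1.0)]
    out = []
    for rhs, p in RULES[sym]:
        for combo in cross(*(expansions(s) for s in rhs)):
            words = [w for ws, _ in combo for w in ws]
            prob = p
            for _, q in combo:
                prob *= q
            out.append((words, prob))
    return out

def prefix_prob(prefix):
    """P(a random sentence starts with the given words)."""
    return sum(p for words, p in expansions("S")
               if words[:len(prefix)] == prefix)

print(prefix_prob(["the"]))           # about 0.6
print(prefix_prob(["the", "ball"]))   # about 0.3
```

Enumeration is exponential in general and impossible for recursive grammars; the point of the forward-probability bookkeeping on this slide is to get these sums in a single left-to-right pass.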
Slides 40-48

Prefix Probability Example

[Worked prefix-probability chart steps shown graphically on each slide.]