The Proposition Bank: An Annotated Corpus of Semantic Roles

We define a set of underlying semantic roles for each verb and annotate each occurrence in the text of the original Penn Treebank. We begin the article by providing examples of the variation in the syntactic realization of semantic arguments and drawing connections to previous research on verb alternation behavior. In addition to the semantic roles described in the role sets, verbs can take any of a set of general, adjunct-like arguments (ArgMs), characterized by one of the function tags shown in Table 1.
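For instance, a PropBank-style labeling of a simple sentence (an illustrative example of ours, not one drawn from the corpus) marks the verb's numbered arguments together with any ArgMs:

    [Arg0 John] broke [Arg1 the window] [ArgM-TMP yesterday]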

For some of these verbs, the instrument of hitting is necessarily incorporated in the semantics of the verb itself. For example, kick shows 28 instances in the Treebank, but only one instance of a (somewhat marginal) instrument. It is of course perfectly possible to name both participants in the same sentence.

The Proposition Bank assigns semantic roles to nodes in Penn Treebank syntax trees.

The PropBank Development Process

Framing

Split Constituents: In this case, a single semantic role label refers to multiple nodes in the original Treebank structure. In the flat structure we used, this looks like a case of repeated role labels. Internally, however, there is one role label that refers to multiple constituents of the tree, as shown in Figure 1.
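As an illustrative case (our own hedged example, not necessarily the one in Figure 1): in "The economy, analysts say, is slowing," the Arg1 of say is the discontinuous string "The economy ... is slowing." The flat annotation appears to repeat the Arg1 label on the two pieces, while the internal representation has a single Arg1 that points to both tree nodes.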

Annotation

Thus, for the role identification kappa, the inter-annotator agreement probability P(A) is the number of nodes on which the annotators agree, divided by the total number of nodes considered, which is the number of nodes in each parse tree multiplied by the number of predicates annotated in the sentence. All the PropBank data were annotated by two people, and when calculating kappa we compare these two annotations, ignoring the specific identities of the annotators for each predicate (in practice, agreement varied with individual annotators' training and skill). In the second scenario, we ignore ArgM labels, treat them as unlabeled nodes, and compute agreement only for the identification and classification of numbered arguments.
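A minimal sketch of this computation (our own illustration; the data layout is an assumption, not PropBank's actual file format) derives kappa from two annotators' decisions over the same candidate nodes:

    from collections import Counter

    def kappa(ann1, ann2):
        # ann1, ann2: dicts mapping each (node, predicate) pair to a role
        # label, with None meaning "not an argument of this predicate"
        n = len(ann1)
        # P(A): observed agreement over all (node, predicate) decisions
        p_a = sum(ann1[k] == ann2[k] for k in ann1) / n
        # P(E): chance agreement from each annotator's label distribution
        d1, d2 = Counter(ann1.values()), Counter(ann2.values())
        p_e = sum((d1[lab] / n) * (d2[lab] / n) for lab in d1)
        return (p_a - p_e) / (1 - p_e)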

Kappa on the combined identification and classification decision, calculated over all nodes in the tree, is .91 including all subtypes of ArgM and .93 over numbered arguments only. Disagreements between annotators tended to be less frequent on numbered arguments than on the choice of function tags, as shown in the confusion matrices in Tables 3 and 4. Certain types of function tags, especially ADV, MNR, and DIS, can be difficult to distinguish. Both of these resources will be available through the Linguistic Data Consortium, although because stand-off notation is used, prior possession of the Treebank is also required.

The example sentences were mainly chosen to ensure coverage of all the syntactic realizations of the frame elements, and simple examples of these realizations were preferred over those involving complex syntactic structure not immediately relevant to the lexical predicate itself. It is interesting to note that the semantic frames are a useful way to generalize between predicates; words in the same frame have frequently been found to share the same syntactic argument structure (Gildea and Jurafsky, 2002). However, there is much less emphasis on defining the semantics of the class with which the verbs are associated, although for the verbs in question additional semantic information is provided by the mapping to VerbNet.

PropBank requires an additional level of inference to determine who owns the car in either case. Parse trees are not used in FrameNet; annotators mark the start and end points of frame elements in the text, and add a grammatical function marker that expresses the frame element's syntactic relationship to the predicate. NYU is currently annotating nominalizations in the Penn Treebank using the PropBank Frames files and annotation interface, creating a resource known as NomBank.

A Quantitative Analysis of the Semantic Role Labels

Associating role labels with specific syntactic constructions

Associating verb classes with specific syntactic constructions

[Tables: semantic roles of subjects of verbs, for the verb classes of Merlo and Stevenson (2001); for each verb, the count and the relative frequency of Arg0, Arg1, Arg2, ArgA, and TMP as the semantic role of the subject.]

Statistics for all framesets of kick are shown in Table 10; the first row in Table 10 corresponds to the kick entry in Table 9.

Automatic Determination of Semantic Role Labels

System Description

The head word is a lexical feature and provides information about the semantic type of the role filler. Due to sparse data, this probability cannot be estimated directly from counts in the training data; a backoff sketch follows below.

Results

We used the same system, with the same features, as for the previous release of the PropBank data.
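As a minimal sketch of the backoff idea just mentioned (our own illustration; the feature names and data layout are assumptions, not the paper's implementation), counts over the full feature conjunction are used when available, falling back to a smaller feature set otherwise:

    from collections import Counter, defaultdict

    class RoleModel:
        def __init__(self):
            self.full = defaultdict(Counter)  # (predicate, path, head word) -> role counts
            self.back = defaultdict(Counter)  # (predicate, path) -> role counts

        def train(self, examples):
            for pred, path, head, role in examples:
                self.full[(pred, path, head)][role] += 1
                self.back[(pred, path)][role] += 1

        def predict(self, pred, path, head):
            # use the full feature conjunction when it was seen in training,
            # otherwise back off to the smaller feature set
            dist = self.full.get((pred, path, head)) or self.back.get((pred, path))
            return dist.most_common(1)[0][0] if dist else None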

To provide results comparable to the statistical parsing literature, Treebank section 23 annotations were used as the test set; all other sections were included in the training set.

[Table: semantic role prediction accuracy (in percent) for known boundaries, where the system is given the argument constituents and must only classify them.]

Results for PropBank are comparable to those for FrameNet, despite the smaller number of training examples for many predicates.

The FrameNet data contained at least ten examples of each predicate, while 12% of the PropBank data had fewer than ten training examples.

[Table: accuracy of semantic role prediction (in percent) for unknown boundaries, where the system must both identify the correct constituents as arguments and assign them the correct roles.]

When annotating the syntactic trees, the PropBank annotators marked the traces, along with their antecedents, as arguments of the relevant verbs.

One reason for the relatively small impact may be the rarity of the feature: 7% of the paths in the test set are not seen in the training data. The most common feature values are shown in Table 13, where the first two rows correspond to standard subject and object positions. As a measure of how closely the PropBank semantic role labels correspond to the path feature in general, we note that by always assigning the most common role to each path, for example always assigning Arg0 to the subject position, and using no other features, we get the role correct 64.0% of the time, well short of the full system's accuracy.
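To make the path feature itself concrete, here is a sketch of ours (the Node class and the "^"/"v" arrow notation are assumptions for readability): the feature is the chain of upward moves from the constituent to the lowest common ancestor, followed by downward moves to the predicate.

    class Node:
        """A bare parse-tree node: a label plus a parent pointer."""
        def __init__(self, label, parent=None):
            self.label, self.parent = label, parent

    def ancestors(node):
        chain = []
        while node is not None:
            chain.append(node)
            node = node.parent
        return chain

    def path_feature(constituent, predicate):
        up, down = ancestors(constituent), ancestors(predicate)
        lca = next(n for n in up if n in down)      # lowest common ancestor
        up_part = up[: up.index(lca) + 1]           # constituent up to the LCA
        down_part = down[: down.index(lca)][::-1]   # LCA down to the predicate
        return ("^".join(n.label for n in up_part)
                + "".join("v" + n.label for n in down_part))

    # The canonical subject position yields NP^SvVPvVB:
    s = Node("S"); np = Node("NP", s); vp = Node("VP", s); vb = Node("VB", vp)
    print(path_feature(np, vb))  # NP^SvVPvVB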

The Relation of Syntactic Parsing and Semantic Role Labeling

Given the gold-standard trees, including trace information, the system produced exactly the correct labels. A chunk-based system (using gold-standard chunks) assigns the following semantic role labels:

(52) Big investment banks refused to step up to [Arg0 the plate] to support [Arg1 the beleaguered floor traders] by buying big blocks of stock, traders say.

Here, as before, the true Arg0 relation is not found, and it would be difficult to imagine identifying it without constructing a complete syntactic parse of the sentence. But now, unlike in the tree-based output, the Arg0 label is mistakenly attached to the noun phrase just before the predicate. The Arg1 relation in the direct object position is quite easily recognizable in the chunked representation as a noun phrase immediately following the verb.

The chunk-based system, however, does not identify the prepositional phrase expressing the manner relation; it sees only a prepositional phrase that appears as the second chunk after the predicate. Participants in the CoNLL 2004 shared task on semantic role labeling (Carreras and Màrquez, 2004) have reported improved scores for chunk-based systems, but to date chunk-based systems have not closed the gap with state-of-the-art results based on parser output.
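A toy sketch (ours, not one of the systems evaluated in the paper) shows why a flat chunk sequence invites exactly the error above: with no tree structure, the nearest noun phrase on each side of the predicate looks like a subject or object.

    def naive_chunk_labeler(chunks, pred_index):
        """chunks: list of (chunk_type, text); pred_index: predicate position."""
        labels = {}
        for i in range(pred_index - 1, -1, -1):       # scan left for a subject-like NP
            if chunks[i][0] == "NP":
                labels["Arg0"] = chunks[i][1]
                break
        for i in range(pred_index + 1, len(chunks)):  # scan right for an object-like NP
            if chunks[i][0] == "NP":
                labels["Arg1"] = chunks[i][1]
                break
        return labels

    chunks = [("NP", "Big investment banks"), ("VP", "refused to step up"),
              ("PP", "to"), ("NP", "the plate"), ("VP", "to support"),
              ("NP", "the beleaguered floor traders")]
    print(naive_chunk_labeler(chunks, 4))
    # {'Arg0': 'the plate', 'Arg1': 'the beleaguered floor traders'}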

Specifically, long-range dependencies indicated by traces in the Treebank are crucial for semantic interpretation, but do not affect the constituent recall and precision metrics most commonly used to evaluate parsers, and are not included in the output of standard parsers. Gildea and Hockenmaier (2003) present a system for labeling PropBank semantic roles based on a statistical parser for Combinatory Categorial Grammar (CCG) (Steedman, 2000). The parser, described in detail in Hockenmaier and Steedman (2002), is trained on a version of the Penn Treebank automatically converted to CCG representations.

Conclusion

Despite the complex relationship between syntactic and semantic structures, we find that statistical parsers, although computationally expensive, do a good job of providing information relevant to this level of semantic interpretation. Our preliminary linguistic analyses have only scratched the surface of what is possible with the current annotation, which is itself only a first approximation to capturing the richness of semantic representation. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.

We are indebted to Scott Cotton, our resident programmer, for his tireless efforts and his wonderful annotation tool, to Joseph Rosenzweig for the initial automatic semantic role labeling of the Penn Treebank, and to Ben Snyder for programming support and for computing the similarity figures.

References

In Proceedings of the 1st International Conference on Language Resources and Evaluation, pages 447–454, Granada, Spain.
In Proceedings of the 1st Annual Meeting of the North American Chapter of the ACL (NAACL), pages 132–139, Seattle, Washington.

In Michael Collins and Mark Steedman, editors, Proceedings of the 2003 Conference on Empirical Methods in Natural Language Processing, pages 41–48.
In Proceedings of the First International Conference on Language Resources and Evaluation, Granada, Spain.
In Proceedings of the Seventh National Conference on Artificial Intelligence (AAAI-2000), Austin, TX, July-August.

In Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics (ACL-03), Sapporo, Japan.
In Proceedings of the 31st Annual Meeting of the Association for Computational Linguistics, pages 235–242, Ohio State University, Columbus, Ohio.
In Proceedings of the 1st Annual Meeting of the North American Chapter of the ACL (NAACL), pages 256–263, Seattle, Washington.
