Construction and evaluation of event graphs

  • GORAN GLAVAŠ and JAN ŠNAJDER
Abstract

Events play an important role in natural language processing and information retrieval due to numerous event-oriented texts and information needs. Many natural language processing and information retrieval applications could benefit from a structured event-oriented document representation. In this paper, we propose event graphs as a novel way of structuring event-based information from text. Nodes in event graphs represent the individual mentions of events, whereas edges represent the temporal and coreference relations between mentions. Contrary to previous natural language processing research, which has mainly focused on individual event extraction tasks, we describe a complete end-to-end system for event graph extraction from text. Our system is a three-stage pipeline that performs anchor extraction, argument extraction, and relation extraction (temporal relation extraction and event coreference resolution), each at a performance level comparable with the state of the art. We present EvExtra, a large newspaper corpus annotated with event mentions and event graphs, on which we train and evaluate our models. To measure the overall quality of the constructed event graphs, we propose two metrics based on the tensor product between automatically and manually constructed graphs. Finally, we evaluate the overall quality of event graphs with the proposed evaluation metrics and perform a headroom analysis of the system.
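
As a rough illustration of the representation described above, the following Python sketch models an event graph whose nodes are event mentions (an anchor plus its arguments) and whose edges carry either a temporal or a coreference relation. It is not the authors' implementation: the class names, the `BEFORE`/`AFTER` labels, the argument roles, and the example sentences are all assumptions made for illustration, since the abstract names only the two relation families.

```python
from dataclasses import dataclass, field
from enum import Enum


class Relation(Enum):
    # Assumed label set: the abstract only says "temporal" and "coreference".
    BEFORE = "before"
    AFTER = "after"
    COREFERENCE = "coreference"


@dataclass(frozen=True)
class EventMention:
    mention_id: str
    anchor: str            # the word that anchors the event mention
    arguments: tuple = ()  # (role, text) pairs; role names are hypothetical


@dataclass
class EventGraph:
    mentions: dict = field(default_factory=dict)  # mention_id -> EventMention
    edges: dict = field(default_factory=dict)     # (source_id, target_id) -> Relation

    def add_mention(self, mention):
        self.mentions[mention.mention_id] = mention

    def add_relation(self, source_id, target_id, relation):
        # Edges may only connect mentions that are already nodes of the graph.
        if source_id not in self.mentions or target_id not in self.mentions:
            raise KeyError("both mentions must be added before relating them")
        self.edges[(source_id, target_id)] = relation

    def coreferent_with(self, mention_id):
        # Coreference is symmetric, so look at edges in both directions.
        linked = set()
        for (source, target), relation in self.edges.items():
            if relation is Relation.COREFERENCE:
                if source == mention_id:
                    linked.add(target)
                elif target == mention_id:
                    linked.add(source)
        return linked


def build_example_graph():
    # Hypothetical text: "A bombing shook the city on Monday.
    # Police arrested two suspects after the attack."
    graph = EventGraph()
    graph.add_mention(EventMention("e1", "bombing", (("location", "the city"), ("time", "Monday"))))
    graph.add_mention(EventMention("e2", "arrested", (("agent", "Police"), ("target", "two suspects"))))
    graph.add_mention(EventMention("e3", "attack"))
    graph.add_relation("e1", "e2", Relation.BEFORE)
    graph.add_relation("e1", "e3", Relation.COREFERENCE)
    return graph
```

In this sketch the three pipeline stages would map onto filling in `anchor`, then `arguments`, then `edges`; keeping edges in a dictionary keyed by mention pairs is one simple choice and allows at most one relation per ordered pair.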

Natural Language Engineering
  • ISSN: 1351-3249
  • EISSN: 1469-8110
  • URL: /core/journals/natural-language-engineering