ERD-MedLDA: Entity relation detection using supervised topic models with maximum margin learning

Published online by Cambridge University Press: 14 March 2012

DINGCHENG LI
Affiliation:
Liberal Arts–TC, University of Minnesota, Twin Cities, MN 55455, USA email: lixxx345@umn.edu
SWAPNA SOMASUNDARAN
Affiliation:
Siemens Corporate Research, Princeton, NJ 08540, USA email: swapna.somasundaran@siemens.com
AMIT CHAKRABORTY
Affiliation:
Siemens Corporate Research, Princeton, NJ 08540, USA email: swapna.somasundaran@siemens.com

Abstract

This paper proposes a novel application of topic models to entity relation detection (ERD). To make use of the latent semantics of text, we formulate relation detection as a topic modeling problem. The motivation is to find underlying topics that are indicative of relations between named entities (NEs). Our approach treats each pair of NEs, together with the features associated with it, as a mini document, and uses the underlying topic distributions as indicators of the types of relations that may exist between the NE pair. Our system, ERD-MedLDA, adapts Maximum Entropy Discrimination Latent Dirichlet Allocation (MedLDA) with mixed membership for relation detection. By using supervision, ERD-MedLDA is able to learn topic distributions indicative of relation types. Further, ERD-MedLDA is a topic model that combines the benefits of both maximum likelihood estimation (MLE) and maximum margin estimation (MME), and the mixed-membership formulation enables the system to incorporate heterogeneous features. We incorporate different features into the system and perform experiments on the ACE 2005 corpus. Our approach achieves better overall precision, recall, and F-measure than baseline SVM-based and LDA-based models. We also find that our system improves more, and more consistently, than the baseline systems as complex informative features are added.
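
As a rough illustration of the mini-document idea described in the abstract, the sketch below turns one NE pair into a bag of feature tokens that a topic model could consume. The `EntityPair` type, its fields, and the feature-token names (`bw=`, `head1=`, `head2=`, `types=`) are assumptions made for this example; they are not the paper's feature set, and the sketch stops short of any topic inference or max-margin training.

```python
from collections import Counter
from typing import List, NamedTuple


class EntityPair(NamedTuple):
    """A candidate NE pair; every field name here is hypothetical."""
    head1: str          # head word of the first entity mention
    type1: str          # entity type of the first mention
    head2: str          # head word of the second entity mention
    type2: str          # entity type of the second mention
    between: List[str]  # words occurring between the two mentions


def mini_document(pair: EntityPair) -> Counter:
    """Turn one NE pair into a bag of feature tokens (a 'mini document')."""
    # Lexical features: one token per word between the two mentions.
    tokens = [f"bw={w.lower()}" for w in pair.between]
    # Entity-level features: the two head words and the joint type pair.
    tokens.append(f"head1={pair.head1.lower()}")
    tokens.append(f"head2={pair.head2.lower()}")
    tokens.append(f"types={pair.type1}-{pair.type2}")
    return Counter(tokens)


# Example pair for a phrase such as "president of France".
pair = EntityPair("president", "PER", "France", "GPE", ["of"])
doc = mini_document(pair)
```

Each such bag plays the role that a word-count vector plays for an ordinary document, so heterogeneous features (lexical, entity type, and so on) can share one representation.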

Type
Articles
Copyright
Copyright © Cambridge University Press 2012
