CCG supertagging with bidirectional long short-term memory networks*

Rekia Kadari, Yu Zhang, Weinan Zhang and Ting Liu
Abstract

Neural network-based approaches have recently produced strong results on natural language processing tasks such as supertagging. In supertagging, a supertag (lexical category) is assigned to each word in an input sequence. Combinatory Categorial Grammar (CCG) supertagging is more challenging than other sequence-tagging problems, such as part-of-speech (POS) tagging and named entity recognition, because of the large number of lexical categories. Simple recurrent neural networks (RNNs) have been shown to significantly outperform the previous state-of-the-art feed-forward neural networks on this task, but it is well known that standard recurrent networks struggle to learn long-range dependencies. In this paper, we introduce a new neural network architecture based on backward and Bidirectional Long Short-Term Memory (BLSTM) networks, which can retain information over long distances and exploit both past and future context. Whereas previous state-of-the-art methods rely only on preceding context, a BLSTM has access to information in both directions. Our main findings are that bidirectional networks outperform unidirectional ones, and that LSTM networks are more accurate than both unidirectional and bidirectional standard RNNs. Experimental results demonstrate the effectiveness of the proposed method on both in-domain and out-of-domain datasets, with an improvement of about 1.2 per cent over a standard RNN.
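To make the architecture concrete, the sketch below shows a bidirectional LSTM tagger of the kind the abstract describes. It is written in Keras purely for illustration; the vocabulary size, layer sizes, optimizer and the 425-category tag set (the lexical categories commonly used for CCGbank) are assumptions, not the configuration reported in the paper.

```python
# Minimal sketch of a bidirectional LSTM supertagger, assuming a Keras-style
# setup. The vocabulary size, embedding dimension, hidden size and optimizer
# are illustrative placeholders, not the paper's reported configuration.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (Embedding, Bidirectional, LSTM,
                                     Dense, TimeDistributed)

VOCAB_SIZE = 50000     # number of word types (placeholder)
EMBED_DIM = 100        # word embedding dimension (placeholder)
HIDDEN_DIM = 200       # LSTM units per direction (placeholder)
NUM_SUPERTAGS = 425    # lexical categories commonly used for CCGbank

model = Sequential([
    # Map each word index in the sentence to a dense vector; index 0 is
    # reserved for padding and masked out downstream.
    Embedding(input_dim=VOCAB_SIZE, output_dim=EMBED_DIM, mask_zero=True),
    # The forward LSTM reads the sentence left-to-right and the backward
    # LSTM right-to-left, so every position sees past and future context.
    Bidirectional(LSTM(HIDDEN_DIM, return_sequences=True)),
    # Predict a distribution over supertags at every time step.
    TimeDistributed(Dense(NUM_SUPERTAGS, activation="softmax")),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```

Given padded word-index inputs of shape (batch, max_len) and integer supertag labels of the same shape, model.fit(inputs, labels) trains the tagger; at test time, the argmax over each time step yields the predicted supertag sequence.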

Footnotes
*

We thank the anonymous reviewers for their valuable comments. This work was supported by the Natural Science Foundation of China (Grant Nos. 61472105 and 61472107) and the High Technology Research and Development Program of China (Grant No. 2015AA015407).
