Analyzing and interpreting neural networks for NLP: A report on the first BlackboxNLP workshop

Afra Alishahi, Grzegorz Chrupała and Tal Linzen

Abstract

The Empirical Methods in Natural Language Processing (EMNLP) 2018 workshop BlackboxNLP was dedicated to resources and techniques specifically developed for analyzing and understanding the inner workings of neural models of language and the representations they acquire. Approaches included systematically manipulating the input to neural networks and investigating the impact on their performance; testing whether interpretable knowledge can be decoded from intermediate representations; proposing modifications to neural network architectures to make their knowledge state or generated output more explainable; and examining the performance of networks on simplified or formal languages. Here we review a number of representative studies in each category.
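
As a rough illustration of the second kind of approach (decoding interpretable knowledge from intermediate representations), the sketch below trains a simple probing classifier on hidden-state vectors to predict a linguistic property. The data are simulated, and the choice of a logistic-regression probe, the dimensions, and the binary label are illustrative assumptions, not the method of any particular paper reviewed here.

# Minimal sketch of a probing / diagnostic-classifier experiment (assumptions noted above).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

n_sentences, hidden_dim = 1000, 64
# Stand-in for hidden states extracted from a trained neural model.
hidden_states = rng.normal(size=(n_sentences, hidden_dim))
# Stand-in for annotations of the property of interest (e.g., singular vs. plural subject).
labels = rng.integers(0, 2, size=n_sentences)
# Inject a weak linear signal so the probe has something to recover in this toy setup.
hidden_states[:, 0] += 1.5 * labels

X_train, X_test, y_train, y_test = train_test_split(
    hidden_states, labels, test_size=0.2, random_state=0)

probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)
# Above-chance test accuracy suggests the property is linearly decodable from the
# representations; chance-level accuracy suggests it is not.
print(f"probe accuracy: {probe.score(X_test, y_test):.3f}")

In an actual probing study the hidden states would come from a trained network run over annotated sentences, and accuracy would typically be compared against baselines (e.g., probes trained on random representations) before drawing conclusions.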

Corresponding author

Email: A.Alishahi@uvt.nl
