
Digitization of the Canadian Parliamentary Debates

Published online by Cambridge University Press: 18 January 2017

Kaspar Beelen* (University of Amsterdam)
Timothy Alberdingk Thijm (University of Toronto)
Christopher Cochrane* (University of Toronto)
Kees Halvemaan (University of Amsterdam)
Graeme Hirst* (University of Toronto)
Michael Kimmins (University of Toronto)
Sander Lijbrink (University of Amsterdam)
Maarten Marx* (University of Amsterdam)
Nona Naderi* (University of Toronto)
Ludovic Rheault* (University of Toronto)
Roman Polyanovsky (University of Toronto)
Tanya Whyte* (University of Toronto)
*
Informatics Institute, University of Amsterdam, Science Park 904, Amsterdam, 1098 XH, email: k.beelen@uva.nl
Department of Political Science, University of Toronto, 100 St. George Street, Toronto, Ontario, M5S 3G3, email: christopher.cochrane@utoronto.ca
Department of Computer Science, University of Toronto, 10 King's College Road, Toronto, Ontario, M5S 3G4, email: gh@cs.toronto.edu
Informatics Institute, University of Amsterdam, Science Park 904, Amsterdam, 1098 XH, email: maartenmarx@uva.nl
Department of Computer Science, University of Toronto, 10 King's College Road, Toronto, Ontario, M5S 3G4, email: nona@cs.toronto.edu
Department of Political Science, University of Toronto, 100 St. George Street, Toronto, Ontario, M5S 3G3, email: ludovic.rheault@utoronto.ca
Department of Political Science, University of Toronto, 100 St. George Street, Toronto, Ontario, M5S 3G3, email: tanya.whyte@mail.utoronto.ca

Abstract

This paper describes the digitization and enrichment of the Canadian House of Commons English Debates from 1901 to the present. We start by laying out the general framework in which this project took place, then present the structure of the database and provide guidelines for prospective users. The paper concludes by introducing www.lipad.ca, an online platform designed as a hub for archiving Canadian political data, with the parliamentary proceedings at the centre of its architecture.

Résumé

Cet article décrit la numérisation et l'enrichissement de la publication parlementaire Débats de la Chambre des communes du Canada en langue anglaise, de 1901 à nos jours. Nous commençons par exposer le cadre général dans lequel ce projet s'est inscrit pour présenter ensuite la structure de la base de données et fournir des lignes directrices aux utilisateurs potentiels. L'article se conclut par la présentation de www.lipad.ca, une plateforme en ligne conçue pour être un carrefour d'archivage des données politiques canadiennes, avec les débats parlementaires au centre de son architecture.

Type
Research Note
Copyright
Copyright © Canadian Political Science Association (l'Association canadienne de science politique) and/et la Société québécoise de science politique 2017 
