IRT Models for Expert-Coded Panel Data

Published online by Cambridge University Press:  03 September 2018

Kyle L. Marquardt*
Affiliation:
V-Dem Institute, Department of Political Science, University of Gothenburg, Gothenburg, Sweden. Email: kyle.marquardt@gu.se
Daniel Pemstein
Affiliation:
Department of Criminal Justice and Political Science, North Dakota State University, Fargo, ND 58105, USA. Email: daniel.pemstein@ndsu.edu

Abstract

Data sets quantifying phenomena of social-scientific interest often use multiple experts to code latent concepts. While it remains standard practice to report the average score across experts, experts likely vary in both their expertise and their interpretation of question scales. As a result, the mean may be an inaccurate statistic. Item-response theory (IRT) models provide an intuitive method for taking these forms of expert disagreement into account when aggregating ordinal ratings produced by experts, but they have rarely been applied to cross-national expert-coded panel data. We investigate the utility of IRT models for aggregating expert-coded data by comparing the performance of various IRT models to the standard practice of reporting average expert codes, using both data from the V-Dem data set and ecologically motivated simulated data. We find that IRT approaches outperform simple averages when experts vary in reliability and exhibit differential item functioning (DIF). IRT models are also generally robust even in the absence of simulated DIF or varying expert reliability. Our findings suggest that producers of cross-national data sets should adopt IRT techniques to aggregate expert-coded data measuring latent concepts.

Type
Articles
Copyright
Copyright © The Author(s) 2018. Published by Cambridge University Press on behalf of the Society for Political Methodology. 

Footnotes

Authors’ note: Earlier drafts were presented at the 2016 MPSA Annual Convention, the 2016 IPSA World Convention and the 2016 V-Dem Latent Variable Modeling Week Conference. We thank Chris Fariss, Juraj Medzihorsky, Pippa Norris, Jon Polk, Shawn Treier, Carolien van Ham and Laron Williams for their comments on earlier drafts of this paper, as well as V-Dem Project members for their suggestions and assistance. We are also grateful to the editor and two anonymous reviewers for their detailed suggestions. This material is based upon work supported by the National Science Foundation under Grant No. SES-1423944 (PI: Daniel Pemstein); the Riksbankens Jubileumsfond, Grant M13-0559:1 (PI: Staffan I. Lindberg); the Swedish Research Council, 2013.0166 (PI: Staffan I. Lindberg and Jan Teorell); the Knut and Alice Wallenberg Foundation (PI: Staffan I. Lindberg); the University of Gothenburg, Grant E 2013/43; and internal grants from the Vice-Chancellor’s office, the Dean of the College of Social Sciences, and the Department of Political Science at the University of Gothenburg. We performed simulations and other computational tasks using resources provided by the Notre Dame Center for Research Computing (CRC) through the High Performance Computing section and the Swedish National Infrastructure for Computing (SNIC) at the National Supercomputer Centre in Sweden (SNIC 2016/1-382, 2017/1-407 and 2017/1-68). We specifically acknowledge the assistance of In-Saeng Suh at CRC and Johan Raber at SNIC in facilitating our use of their respective systems. Replication materials are available in Marquardt and Pemstein (2018).

Contributing Editor: R. Michael Alvarez

References

Aldrich, John H., and McKelvey, Richard D. 1977. A method of scaling with applications to the 1968 and 1972 Presidential elections. American Political Science Review 71(1):111–130.
Bakker, R., de Vries, C., Edwards, E., Hooghe, L., Jolly, S., Marks, G., Polk, J., Rovny, J., Steenbergen, M., and Vachudova, M. A. 2012. Measuring party positions in Europe: The Chapel Hill expert survey trend file, 1999–2010. Party Politics 21(1):143–152.
Bakker, Ryan, Jolly, Seth, Polk, Jonathan, and Poole, Keith. 2014. The European common space: Extending the use of anchoring vignettes. The Journal of Politics 76(4):1089–1101.
Boyer, K. K., and Verma, R. 2000. Multiple raters in survey-based operations management research: A review and tutorial. Production and Operations Management 9(2):128–140.
Brady, Henry E. 1985. The perils of survey research: Inter-personally incomparable responses. Political Methodology 11(3/4):269–291.
Buttice, Matthew K., and Stone, Walter J. 2012. Candidates matter: Policy and quality differences in congressional elections. Journal of Politics 74(3):870–887.
Clinton, Joshua D., and Lewis, David E. 2008. Expert opinion, agency characteristics, and agency preferences. Political Analysis 16(1):3–20.
Coppedge, Michael, Gerring, John, Lindberg, Staffan I., Teorell, Jan, Pemstein, Daniel, Tzelgov, Eitan, Wang, Yi-ting, Glynn, Adam, Altman, David, Bernhard, Michael, Fish, M. Steven, Hicken, Allen, McMann, Kelly, Paxton, Pamela, Reif, Megan, Skaaning, Svend-Erik, and Staton, Jeffrey. 2014. V-Dem: A new way to measure democracy. Journal of Democracy 25(3):159–169.
Coppedge, Michael, Gerring, John, Lindberg, Staffan I., Skaaning, Svend-Erik, Teorell, Jan, Altman, David, Bernhard, Michael, Fish, M. Steven, Glynn, Adam, Hicken, Allen, Knutsen, Carl Henrik, McMann, Kelly, Paxton, Pamela, Pemstein, Daniel, Staton, Jeffrey, Zimmerman, Brigitte, Andersson, Frida, Mechkova, Valeriya, and Miri, Farhad. 2016. Varieties of democracy codebook v6. Technical report. Varieties of Democracy Project: Project Documentation Paper Series.
Coppedge, Michael, Gerring, John, Lindberg, Staffan I., Skaaning, Svend-Erik, Teorell, Jan, Altman, David, Bernhard, Michael, Fish, M. Steven, Glynn, Adam, Hicken, Allen, Knutsen, Carl Henrik, Marquardt, Kyle L., McMann, Kelly, Miri, Farhad, Paxton, Pamela, Pemstein, Daniel, Staton, Jeffrey, Tzelgov, Eitan, Wang, Yi-ting, and Zimmerman, Brigitte. 2016. V-Dem Dataset v6.2. Technical report. Varieties of Democracy Project. https://ssrn.com/abstract=2968289.
Coppedge, Michael, Gerring, John, Lindberg, Staffan I., Skaaning, Svend-Erik, Teorell, Jan, Andersson, Frida, Marquardt, Kyle L., Mechkova, Valeriya, Miri, Farhad, Pemstein, Daniel, Pernes, Josefine, Stepanova, Natalia, Tzelgov, Eitan, and Wang, Yi-ting. 2016. Varieties of Democracy Methodology v5. Technical report. Varieties of Democracy Project: Project Documentation Paper Series.
Hare, Christopher, Armstrong, David A., Bakker, Ryan, Carroll, Royce, and Poole, Keith T. 2015. Using Bayesian Aldrich-McKelvey Scaling to study citizens’ ideological preferences and perceptions. American Journal of Political Science 59(3):759–774.
Johnson, Valen E., and Albert, James H. 1999. Ordinal Data Modeling. New York: Springer.
Jones, Bradford S., and Norrander, Barbara. 1996. The reliability of aggregated public opinion measures. American Journal of Political Science 40(1):295–309.
King, Gary, Murray, Christopher J. L., Salomon, Joshua A., and Tandon, Ajay. 2004. Enhancing the validity and cross-cultural comparability of measurement in survey research. The American Political Science Review 98(1):191–207.
King, Gary, and Wand, Jonathan. 2007. Comparing incomparable survey responses: Evaluating and selecting anchoring vignettes. Political Analysis 15(1):46–66.
Konig, T., Marbach, M., and Osnabrugge, M. 2013. Estimating party positions across countries and time–a dynamic latent variable model for manifesto data. Political Analysis 21(4):468–491.
Kozlowski, Steve W., and Hattrup, Keith. 1992. A disagreement about within-group agreement: Disentangling issues of consistency versus consensus. Journal of Applied Psychology 77(2):161–167.
Lebreton, J. M., and Senter, J. L. 2007. Answers to 20 questions about interrater reliability and interrater agreement. Organizational Research Methods 11(4):815–852.
Lindstädt, Rene, Proksch, Sven-Oliver, and Slapin, Jonathan B. 2016. When experts disagree: Response aggregation and its consequences in expert surveys.
Maestas, Cherie D., Buttice, Matthew K., and Stone, Walter J. 2014. Extracting wisdom from experts and small crowds: Strategies for improving informant-based measures of political concepts. Political Analysis 22(3):354–373.
Marquardt, Kyle, and Pemstein, Daniel. 2018. Replication Data for: IRT models for expert-coded panel data, https://doi.org/10.7910/DVN/KGP01E, Harvard Dataverse, V1.
Norris, Pippa, Frank, Richard W., and Martínez I Coma, Ferran. 2013. Assessing the quality of elections. Journal of Democracy 24(4):124–135.
Pemstein, Daniel, Seim, Brigitte, and Lindberg, Staffan I. 2016. Anchoring vignettes and item response theory in cross-national expert surveys.
Pemstein, Daniel, Tzelgov, Eitan, and Wang, Yi-ting. 2015. Evaluating and improving item response theory models for cross-national expert surveys. Varieties of Democracy Institute Working Paper 1(March):1–53.
Pemstein, Daniel, Marquardt, Kyle L., Tzelgov, Eitan, Wang, Yi-ting, and Miri, Farhad. 2015. The V-Dem measurement model: Latent variable analysis for cross-national and cross-temporal expert-coded data. Varieties of Democracy Institute Working Paper 21.
Ramey, Adam. 2016. Vox populi, vox dei? Crowdsourced ideal point estimation. The Journal of Politics 78(1):281–295.
Stan Development Team. 2015. Stan: A C++ Library for Probability and Sampling, Version 2.9.0. http://mc-stan.org/.
Teorell, Jan, Dahlström, Carl, and Dahlberg, Stefan. 2011. The QoG expert survey dataset. Technical report. University of Gothenburg: The Quality of Government Institute, http://www.qog.pol.gu.se.
Treier, Shawn, and Jackman, Simon. 2008. Democracy as a latent variable. American Journal of Political Science 52(1):201–217.
Van Bruggen, Gerrit H., Lilien, Gary L., and Kacker, Manish. 2002. Informants in organizational marketing research: Why use multiple informants and how to aggregate responses. Journal of Marketing Research 39(4):469–478.
von Davier, Matthias, Shin, Hyo-Jeong, Khorramdel, Lale, and Stankov, Lazar. 2017. The effects of vignette scoring on reliability and validity of self-reports. Applied Psychological Measurement 42(4):291–306.
Supplementary material: File

Marquardt and Pemstein supplementary material

Online Appendix

