Wordbank: an open repository for developmental vocabulary data*

MICHAEL C. FRANK; MIKA BRAGINSKY; DANIEL YUROVSKY; VIRGINIA A. MARCHMAN

doi:10.1017/S0305000916000209

Published online by Cambridge University Press: 18 May 2016

MICHAEL C. FRANK*, Stanford University, USA
MIKA BRAGINSKY, Stanford University, USA
DANIEL YUROVSKY, Stanford University, USA
VIRGINIA A. MARCHMAN, Stanford University, USA
* Address for correspondence: Michael C. Frank, Department of Psychology, Jordan Hall (Bldg. 420), 450 Serra Mall, Stanford, CA 94305; tel: (650) 724-4003; e-mail: mcfrank@stanford.edu

Abstract

The MacArthur-Bates Communicative Development Inventories (CDIs) are a widely used family of parent-report instruments for easy and inexpensive data-gathering about early language acquisition. CDI data have been used to explore a variety of theoretically important topics, but, with few exceptions, researchers have had to rely on data collected in their own lab. In this paper, we remedy this issue by presenting Wordbank, a structured database of CDI data combined with a browsable web interface. Wordbank archives CDI data across languages and labs, providing a resource for researchers interested in early language, as well as a platform for novel analyses. The site allows interactive exploration of patterns of vocabulary growth at the level of both individual children and particular words. We also introduce wordbankr, a software package for connecting to the database directly. Together, these tools extend the abilities of students and researchers to explore quantitative trends in vocabulary development.

Type: Articles
Information: Journal of Child Language, Volume 44, Issue 3, May 2017, pp. 677–694

DOI: https://doi.org/10.1017/S0305000916000209
Copyright: Copyright © Cambridge University Press 2016

Footnotes

[*]

This work was supported by a John Merck Scholars award and NSF BCS-1528526. Thanks to Ranjay Krishna for contributions to the initial development of the site, to Rune Nørgaard Jørgensen for helping port data from CLEX, to all of the contributors listed at <http://wordbank.stanford.edu/contributors> for generously sharing their data, and to the Advisory Board of the MacArthur-Bates Communicative Development Inventories, especially Philip Dale and Larry Fenson, for their support.
