
What can Neighbourhood Density effects tell us about word learning? Insights from a connectionist model of vocabulary development*


In this paper, we investigate the effect of neighbourhood density (ND) on vocabulary size in a computational model of vocabulary development. A word has a high ND if there are many words phonologically similar to it. High ND words are more easily learned by infants of all abilities (e.g. Storkel, 2009; Stokes, 2014). We present a neural network model that learns general phonotactic patterns in the exposure language, as well as specific word forms and, crucially, mappings between word meanings and word forms. Like human word learners, the network is faster at learning frequent words and words containing high-probability phoneme sequences. Independently of these effects, it is also faster at learning words with high ND and, when its capacity is reduced, it learns high ND words in preference to other words, as late talkers do. We analyze the model and propose a novel explanation of the ND effect, in which word meanings play an important role in generating word-specific biases on general phonological trajectories. This explanation leads to a new prediction about the origin of the ND effect in infants.
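As an illustration of the ND measure described above, the sketch below counts each word's phonological neighbours in a toy lexicon. It assumes a common convention from the ND literature, in which two words are neighbours if one can be turned into the other by substituting, adding, or deleting a single phoneme; the abstract itself does not specify the similarity criterion used in the paper. The function names, the ASCII phoneme symbols, and the toy lexicon are all hypothetical.

```python
# Minimal sketch: neighbourhood density under an ASSUMED one-phoneme-edit
# definition of phonological similarity (not taken from the paper).

def is_neighbour(a, b):
    """Return True if phoneme tuples a and b differ by exactly one
    substitution, insertion, or deletion."""
    if a == b:
        return False
    if abs(len(a) - len(b)) > 1:
        return False
    if len(a) == len(b):
        # Same length: exactly one position may differ (substitution).
        return sum(x != y for x, y in zip(a, b)) == 1
    # Lengths differ by one: make `a` the shorter sequence.
    if len(a) > len(b):
        a, b = b, a
    # Deleting one phoneme from the longer word must yield the shorter one.
    return any(b[:i] + b[i + 1:] == a for i in range(len(b)))


def neighbourhood_density(lexicon):
    """Map each word to the number of other words that are its neighbours.

    `lexicon` maps an orthographic label to a tuple of phoneme symbols.
    """
    return {
        word: sum(
            is_neighbour(phonemes, other_phonemes)
            for other, other_phonemes in lexicon.items()
            if other != word
        )
        for word, phonemes in lexicon.items()
    }


# Hypothetical toy lexicon with ASCII stand-ins for phonemes.
TOY_LEXICON = {
    "cat": ("k", "ae", "t"),
    "bat": ("b", "ae", "t"),
    "hat": ("h", "ae", "t"),
    "cap": ("k", "ae", "p"),
    "at": ("ae", "t"),
    "dog": ("d", "o", "g"),
}

if __name__ == "__main__":
    for word, nd in neighbourhood_density(TOY_LEXICON).items():
        print(word, nd)
```

In this toy lexicon "cat" has four neighbours ("bat", "hat", "cap", "at") and "dog" has none, so "cat" is the high ND word in the sense used above.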

Corresponding author
Address for correspondence: Martin Takac, Comenius University – Centre for Cognitive Science, Mlynská dolina, Bratislava 84248, Slovakia. e-mail:

We are grateful to the Marsden fund of New Zealand for Grant 13-UOO-048 and Slovak VEGA agency for grant 1/0898/14 (Martin Takac). Big thanks to Jen Hay and Pat LaShell for the statistical analyses in this paper. We would also like to thank Igor Farkaš for valuable discussions on neural network issues. We also thank the anonymous reviewers of this paper for their numerous helpful comments.

Baayen, R. H., Piepenbrock, R. & van Rijn, H. (1995). The CELEX lexical database (CD-ROM). Philadelphia, PA: Linguistic Data Consortium, University of Pennsylvania.
Chang, F., Dell, G. & Bock, K. (2006). Becoming syntactic. Psychological Review 113(2), 234–72.
Christiansen, M., Allen, J. & Seidenberg, M. (1998). Learning to segment speech using multiple cues: a connectionist model. Language and Cognitive Processes 13, 221–68.
Cottrell, G. & Plunkett, K. (1994). Acquiring the mapping from meaning to sounds. Connection Science 6, 379–412.
De Cara, B. & Goswami, U. (2002). Statistical analysis of similarity relations among spoken words: evidence for the special status of rimes in English. Behavior Research Methods, Instruments, and Computers 34(3), 416–23.
Dell, G., Juliano, C. & Govindjee, A. (1993). Structure and content in language production: a theory of frame constraints in phonological speech errors. Cognitive Science 17(2), 149–95.
Dziak, J. J., Coffman, D. L., Lanza, S. T. & Li, R. (2012). Sensitivity and specificity of information criteria. Technical Report #12-119, College of Health and Human Development, The Pennsylvania State University, State College, PA.
Elman, J. (1990). Finding structure in time. Cognitive Science 14, 179–211.
Fenson, L., Dale, P., Reznick, J. S., Thal, D., Bates, E., Hartung, J. & Reilly, J. (1994). Variability in early communicative development. Monographs of the Society for Research in Child Development 59(5), i–185.
Frisch, S. A., Large, N. R. & Pisoni, D. B. (2000). Perception of wordlikeness: effects of segment probability and length on the processing of nonwords. Journal of Memory and Language 42, 481–496.
Gaskell, M. & Marslen-Wilson, W. (1997). Integrating form and meaning: a distributed model of speech perception. Language and Cognitive Processes 12, 613–656.
Hay, J., Pierrehumbert, J. & Beckman, M. (2003). Speech perception, well-formedness, and the statistics of the lexicon. In Local, J., Ogden, R. & Temple, R. (eds), Papers in laboratory phonology VI, 58–74. Cambridge: Cambridge University Press.
Klee, T. & Harrison, C. (2001). CDI words and sentences validity and preliminary norms for British English. Paper presented at Child Language Seminar, University of Hertfordshire, England.
Kruskal, J. B. (1964). Multidimensional scaling by optimizing goodness of fit to a nonmetric hypothesis. Psychometrika 29(1), 1–27.
Li, P. & MacWhinney, B. (2002). PatPho: a phonological pattern generator for neural networks. Behavior Research Methods, Instruments, and Computers 34, 408–15.
Magnuson, J. S., Dixon, J. A., Tanenhaus, M. K. & Aslin, R. N. (2007). The dynamics of lexical competition during spoken word recognition. Cognitive Science 31, 133–56.
Miikkulainen, R. (1997). Dyslexic and category-specific aphasic impairments in a self-organizing feature map model of the lexicon. Brain and Language 59, 334–66.
Moyle, J., Stokes, S. & Klee, T. (2011). Early language delay and specific language impairment. Developmental Disabilities Research Reviews 17, 160–69.
Rumelhart, D., McClelland, J. & the PDP research group. (1986). Parallel distributed processing: explorations in the microstructure of cognition, Vol. 1. Cambridge, MA: MIT Press.
Servan-Schreiber, D., Cleeremans, A. & McClelland, J. L. (1991). Graded state machines: the representation of temporal contingencies in simple recurrent networks. Machine Learning 7(2/3), 161–93.
Shillcock, R., Cairns, P., Chater, N. & Levy, J. (2000). Statistical and connectionist modelling of the development of speech segmentation. In Broeder, P. & Murre, J. (eds), Models of language learning, 103–20. Oxford: Oxford University Press.
Sibley, D., Kello, C., Plaut, D. & Elman, J. (2008). Large-scale modeling of wordform learning and representation. Cognitive Science 32, 741–54.
Stokes, S. (2010). Neighborhood density and word frequency predict vocabulary size in toddlers. Journal of Speech, Language, and Hearing Research 53, 670–83.
Stokes, S. (2014). The impact of phonological neighbourhood density on typical and atypical emerging lexicons. Journal of Child Language 41(3), 634–57.
Stokes, S., Bleses, D., Basbøll, H. & Lambertsen, C. (2012). Statistical learning in emerging lexicons: the case of Danish. Journal of Speech, Language, and Hearing Research 55, 1265–73.
Stokes, S., Kern, S. & dos Santos, C. (2012). Extended statistical learning as an account for slow vocabulary growth. Journal of Child Language 39(1), 105–29.
Stokes, S. & Klee, T. (2009). Factors that influence vocabulary development in two-year-old children. Journal of Child Psychology and Psychiatry 50, 498–505.
Storkel, H. L. (2004). Do children acquire dense neighborhoods? An investigation of similarity neighborhoods in lexical acquisition. Applied Psycholinguistics 25, 201–21.
Storkel, H. L. (2008). First utterances. In Rickheit, G. & Strohner, H. (eds), The balancing act: combining symbolic and statistical approaches to language, 125–47. Berlin: Mouton de Gruyter.
Storkel, H. L. (2009). Developmental differences in the effects of phonological, lexical and semantic variables on word learning by infants. Journal of Child Language 36, 291–321.
Storkel, H. L. & Lee, S.-Y. (2011). The independent effects of phonotactic probability and neighborhood density on lexical acquisition by preschool children. Language and Cognitive Processes 26(2), 191–211.
Takac, M. & Knott, A. (2015). A neural network model of episode representations in working memory. Cognitive Computation 7(5), 509–25.
Vitevitch, M. & Storkel, H. (2012). Examining the acquisition of phonological word forms with computational experiments. Language and Speech 56(4), 493–527.
Werbos, P. J. (1990). Backpropagation through time: what it does and how to do it. Proceedings of the IEEE 78(10), 1550–60.
Journal of Child Language
  • ISSN: 0305-0009
  • EISSN: 1469-7602
  • URL: /core/journals/journal-of-child-language