The study of cognition is challenged by the difficulty of inferring representations and processes in a system as complex as the brain. The field of cognitive science has met this challenge by borrowing and developing research tools with which to study the brain. The term "tools" is meant broadly, to include not just the hardware used for data collection (e.g., computers, eye trackers, imaging equipment) but also the quantitative tools used to guide inference, including statistical methods (frequentist and Bayesian) and cognitive modeling.
Cognitive modeling assists in scientific inference by, among other things, assessing the plausibility of an explanation (e.g., theory, process). It achieves this by instantiating a version of the explanation in some quantitative form (i.e., the mathematical model), thereby demonstrating its plausibility (Polk and Seifert, 2002; Busemeyer and Diederich, 2010; Lewandowsky and Farrell, 2011; Lee and Wagenmakers, 2014). However, as with theories, not all quantitative explanations are equally good or convincing, so what criteria should be used to evaluate models? What makes a model a good explanation from which it is reasonable to draw inferences, and what signs indicate that a model is poor? These questions are the focus of this chapter. Like modeling itself, the field is still very much in its infancy. Progress has been made, but many challenges remain. Before reviewing the state of the art, we first provide a broader context in which to situate the enterprise of model evaluation.
Although cognitive modeling has been around since the 1950s, its popularity increased once computers became cheap and fast. User-friendly software has also accelerated its adoption, to the point where more and more researchers recognize the value of models and their usefulness for knowledge discovery (Shiffrin and Nobel, 1997; Fum et al., 2007; McClelland, 2009). Theories in much of the field tend to be broad claims about foundational issues in cognition (e.g., representations are distributed rather than local; grammar acquisition is probabilistic rather than rule-based; category learning is Bayesian). Instantiating these claims in a model makes the theory more viable and, as a consequence, more persuasive, especially when the model's performance is shown to mimic that of individuals. In addition, it can be difficult to develop a theory in much depth without formalizing it quantitatively in some way.