Adults see the human body shape in all manner of stimuli, including highly abstract stick figures that barely specify the human body configuration and point-light displays (PLDs) that specify only the characteristic motion patterns of the human form. This capacity to quickly and effortlessly perceive the human form reflects our expertise at visually processing human bodies. In this chapter we will argue that expertise in perceiving bodies arises by virtue of their ubiquity and social significance, not because of any kind of innate representation or privileged learning mechanism. We base this claim on two observations: visual discrimination of human bodies is slow to develop in infancy, and it is initially stimulus-dependent, becoming increasingly generalisable over time, as is typical of a learning trajectory.
Body perception involves several processing steps. First, viewers detect that a visual object is a human body, as distinct from other object classes such as cars or dogs. At later stages of processing, viewers may identify features of an individual body, such as its posture, gender or attractiveness, and they may also recognise the body’s personal identity. All of this information is ultimately interpreted for its social-communicative relevance and meaning in context. In this chapter, we focus on the initial step of body processing: body detection, here defined as the capacity to visually discriminate bodies from other objects. We will describe a series of experiments in which we have investigated infants’ responses to typical human bodies versus scrambled bodies. The typical body stimuli portray the human form in various postures (e.g. arms raised above the head, legs spread wide). To create the scrambled body stimuli, we move the arms and/or legs to non-canonical locations (e.g. arms coming out of the head, legs and arms switched on the torso). We compare these two kinds of stimuli because scrambling bodies preserves the low-level visual elements of typical bodies, including total contour, contrast and visual detail, and distorts only the configural properties, that is, the unique overall shape by which viewers detect that a visual object is a human body as opposed to something else (see Figure 5.1 for examples). This technique has been used widely to investigate perception of faces (Johnson et al., 1991) as well as bodies (Peelen and Downing, 2005; Reed et al., 2006).