The way the cognitive system scans the visual environment for relevant information – in short, visual search – has been a long-standing central topic in vision science. From its inception as a research topic, and despite a number of promising alternative perspectives, the study of visual search has been governed by the assumption that search proceeds on the basis of individual items (whether processed in parallel or not). This has led to the additional assumptions that shallow search slopes (at most a few tens of milliseconds per item for target-present trials) are most informative about the underlying process, and that eye movements are an epiphenomenon that can be safely ignored. We argue that the evidence now overwhelmingly favours an approach that takes fixations, not individual items, as its central unit. Within a fixation, items are processed in parallel, and the functional field of view determines how many fixations are needed. In a framework of this type, there is a direct connection between target discrimination difficulty, fixations, and reaction time (RT) measures. It therefore promises a more fundamental understanding of visual search by offering a unified account of both eye movement and manual response behaviour across the entire range of observed search efficiency, and it opens up new directions for research. A high-level conceptual simulation with just one free and four fixed parameters shows the viability of this approach.
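To make the core idea concrete, here is a minimal sketch (in Python) of a fixation-based account of the kind described above; it is an illustration, not the authors' reported simulation. All specifics are our own assumptions: the functional field of view is reduced to a count of items covered per fixation (the lone free parameter, standing in for discrimination difficulty), items within it are assumed to be processed in parallel, and RT is assumed to be a fixed base time plus a fixed cost per fixation.

```python
import random

# Fixed parameters of the sketch (illustrative values we chose, not
# parameter fits reported in the paper).
FIXATION_DURATION_MS = 250   # assumed cost of one fixation
MOTOR_BASE_MS = 350          # assumed non-search component of the RT

def simulate_trial(set_size, ffv_items, rng=random):
    """One target-present trial under a fixation-based account.

    ffv_items is the number of items the functional field of view
    covers per fixation; harder target discrimination means a smaller
    field of view. All items inside the field of view are processed in
    parallel, so the trial's cost is counted in fixations, not in
    individual items.
    """
    target_slot = rng.randrange(set_size)  # target hidden among the items
    inspected = 0
    fixations = 0
    while True:
        fixations += 1
        covered = min(ffv_items, set_size - inspected)
        if inspected <= target_slot < inspected + covered:
            break  # target fell inside this fixation's field of view
        inspected += covered
    return MOTOR_BASE_MS + fixations * FIXATION_DURATION_MS

# Search slopes emerge from fixation counts: a wide field of view takes
# in the whole display at once (flat slope), a narrow one forces many
# fixations (steep slope).
for ffv in (30, 7, 2):          # easy, intermediate, hard discrimination
    mean_rt = {n: sum(simulate_trial(n, ffv) for _ in range(5000)) / 5000
               for n in (6, 12, 18)}
    slope = (mean_rt[18] - mean_rt[6]) / 12
    print(f"FFV = {ffv:2d} items: search slope ~ {slope:5.1f} ms/item")
```

Run as-is, the sketch reproduces the qualitative pattern the framework predicts: near-zero slopes when the field of view spans the display, and steep, item-like slopes when difficult discrimination shrinks it to a couple of items per fixation, even though no individual item is ever the unit of processing.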