This paper reconciles the standpoint that language users do not aim at improving their sound systems with the observation that languages seem to improve their sound systems. If learners optimise their perception by gradually ranking their cue constraints, and reuse the resulting ranking in production, they automatically introduce a prototype effect, which can be counteracted by an articulatory effect. If the two effects are of unequal size, the learner will end up with a sound system auditorily different from that of her language environment. Computer simulations of sibilant inventories show that, independently of the initial auditory sound system, a stable equilibrium is reached within a small number of generations. In this stable state, the dispersion of the sibilants of the language strikes an optimal balance between articulatory ease and auditory contrast. Crucially, these results are derived within a model without any goal-oriented elements such as dispersion constraints.