A number of methodological papers published in recent years testify that a need for a thorough revision of the research methodology is felt by the operations research community – see, for example, [Barr et al., J. Heuristics 1 (1995) 9–32; Eiben and Jelasity, Proceedings of the 2002 Congress on Evolutionary Computation (CEC'2002) 582–587; Hooker, J. Heuristics 1 (1995) 33–42; Rardin and Uzsoy, J. Heuristics 7 (2001) 261–304]. In particular, the performance evaluation of nondeterministic methods, including widely studied metaheuristics such as evolutionary computation and ant colony optimization, requires the definition of new experimental protocols. A careful and thorough analysis of the problem of evaluating metaheuristics reveals strong similarities between this problem and the problem of evaluating learning methods in the machine learning field. In this paper, we show that several conceptual tools commonly used in machine learning – such as the probabilistic notion of a class of instances and the separation between the training and the testing datasets – fit naturally in the context of metaheuristics evaluation. Accordingly, we propose and discuss some principles inspired by the experimental practice in machine learning for guiding the performance evaluation of optimization algorithms. Among these principles, a clear separation between the instances that are used for tuning algorithms and those that are used in the actual evaluation is particularly important for a proper assessment.
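To make the last principle concrete, the following is a minimal sketch, not taken from the paper, of how tuning and evaluation can be kept separate: instances are drawn from a common probabilistic class, a tunable parameter of a toy stochastic optimizer (`stochastic_search`, with hypothetical names and cost function chosen purely for illustration) is selected on a tuning set, and performance is then estimated only on held-out instances.

```python
import random

def sample_instance(rng, n=20):
    """Draw one instance from the class: here, a random target vector."""
    return [rng.uniform(-5, 5) for _ in range(n)]

def cost(solution, instance):
    """Toy objective: squared distance to the instance's target vector."""
    return sum((s - t) ** 2 for s, t in zip(solution, instance))

def stochastic_search(instance, step_size, iterations=200, seed=0):
    """A toy nondeterministic optimizer with one tunable parameter."""
    rng = random.Random(seed)
    current = [0.0] * len(instance)
    best = cost(current, instance)
    for _ in range(iterations):
        candidate = [x + rng.gauss(0, step_size) for x in current]
        c = cost(candidate, instance)
        if c < best:
            current, best = candidate, c
    return best

rng = random.Random(42)

# Two disjoint samples from the same class of instances.
tuning_instances = [sample_instance(rng) for _ in range(30)]
test_instances = [sample_instance(rng) for _ in range(30)]

def mean_cost(step, instances):
    """Average cost of the optimizer, with the given parameter, over a set of instances."""
    return sum(stochastic_search(inst, step, seed=i)
               for i, inst in enumerate(instances)) / len(instances)

# Tuning phase: select the parameter on the tuning instances only.
candidate_step_sizes = [0.01, 0.1, 0.5, 1.0, 2.0]
best_step = min(candidate_step_sizes, key=lambda s: mean_cost(s, tuning_instances))

# Evaluation phase: report performance only on the held-out test instances.
print(f"selected step size: {best_step}")
print(f"estimated expected cost on the class: {mean_cost(best_step, test_instances):.3f}")
```

Reporting the cost on `test_instances` rather than on the instances used for tuning avoids the optimistic bias that would otherwise be introduced, in direct analogy with the train/test split used when assessing learning methods.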