Evaluating sets of design ideas using Shah et al.’s criteria of quality, quantity, novelty and variety can help design teams understand the thoroughness of their ideation work and can help design researchers compare the performance of different ideation methods. However, existing methods for aggregating these metrics into total set scores are problematic. The present paper proposes axioms for the desired behavior of aggregation functions for quality, quantity, novelty and variety, then defines functions that satisfy these axioms. The axioms are intended to ensure that scoring methods reflect best practices in ideation and appropriately reward preferred ideation behavior, such as contributing all ideas. Further, this paper provides operational definitions for evaluating the quality, novelty and quantity of ideas and draws on previous methods to provide expedient methods for scoring individual ideas. Evaluation mechanics are presented that allow repeatable evaluation of idea sets containing thousands of ideas. Software tools are provided that automatically calculate the aggregation functions for ideas evaluated according to these mechanics. Finally, a method is proposed for evaluating both the variety of a complete set of ideas and the contribution of each individual idea to the overall set variety. The variety evaluation is defined precisely enough that it can be computed automatically for any genealogy tree of ideas. The operational definitions for evaluating quality, novelty and quantity are suitable for adoption in artificial intelligence tools, allowing automated evaluation of idea sets against these criteria.