Adopting “blackbox” engineering advice: the influence of imperfect suggestions during AI-assisted decision-making with multiple objectives

Published online by Cambridge University Press:  10 March 2025

Ananya Nandy*
Affiliation:
Department of Mechanical Engineering, University of California, Berkeley, Berkeley, CA, USA
David Antonio Herrera
Affiliation:
Department of Mechanical Engineering, University of California, Berkeley, Berkeley, CA, USA
Kosa Goucher-Lambert
Affiliation:
Department of Mechanical Engineering, University of California, Berkeley, Berkeley, CA, USA
*
Corresponding author: Ananya Nandy; Email: ananyan@berkeley.edu
Abstract

Engineering design requires humans to make complex, multi-objective decisions involving trade-offs where it is challenging to identify the best solution. AI-embedded computational support tools are increasingly used to aid in such scenarios, enhancing the design decision-making process. However, over- or under-reliance on imperfect “blackbox” models may prevent optimal outcomes. To investigate AI-assisted decision-making in engineering design, two complementary experiments (N = 90) were conducted. Participants chose between pairs of aircraft jet engine brackets and were tasked with selecting the better design based on two (Experiment 1) or three (Experiment 2) competing objectives. Participants received simulated AI suggestions, which correctly suggested a better design, incorrectly suggested a worse design, or arbitrarily suggested an approximately equivalent design. At times, these suggestions were accompanied by an example-based explanation. Results demonstrate that participants follow suggestions less than expected when the model can objectively determine the better-performing alternative, often underutilizing the model’s advice to their detriment. When the “better” choice is uncertain, the tendency to follow an arbitrary suggestion differs, with overutilization occurring only in the bi-objective case. There is no evidence that providing an explanation of the model’s suggestion impacts decision-making. The results provide valuable insights into how engineering designers’ multi-objective decisions may be affected – positively, negatively, or not at all – by computational tools meant to assist them.

Information

Type
Research Article
Creative Commons
Creative Commons License - CC BY
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2025. Published by Cambridge University Press
Figure 1. Non-dominated sorting, an iterative process of calculating rankings based on Pareto optimality, was used to quantify comparisons between pairs of designs based on multi-objective criteria. Iteration 0 refers to the globally optimal set of designs.
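The iterative ranking described in this caption can be sketched as follows. This is a minimal illustration assuming minimization objectives; the function names and sample points are mine, not taken from the paper:

```python
def dominates(a, b):
    """a dominates b if it is no worse in every objective and strictly
    better in at least one (lower values assumed better)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def non_dominated_sort(points):
    """Assign each design a Pareto rank: iteration 0 is the globally
    optimal (non-dominated) set, which is peeled off before the next
    iteration's non-dominated set is identified, and so on."""
    remaining = set(range(len(points)))
    ranks = {}
    front = 0
    while remaining:
        nondominated = {
            i for i in remaining
            if not any(dominates(points[j], points[i]) for j in remaining if j != i)
        }
        for i in nondominated:
            ranks[i] = front
        remaining -= nondominated
        front += 1
    return ranks


# Example: two objectives, lower is better.
ranks = non_dominated_sort([(1, 2), (2, 1), (3, 3), (4, 4)])
print(ranks)  # {0: 0, 1: 0, 2: 1, 3: 2}
```

Designs with the same rank are mutually non-dominated, which is how "approximately equivalent" pairs can be constructed for the arbitrary-suggestion condition.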

Table 1. Experiments conducted and the number of participants in each

Table 2. Conditions and trials in each experiment

Figure 2. Participants were instructed to select designs using the interface shown.

Figure 3. Performance values of bracket stimuli in each condition.

Figure 4. Probability of selecting the suggested design across all participants (median = 0.57) vs. the expected proportion (0.71) (W = 24.0, p < 0.001).
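The W statistics reported in this and the following captions are consistent with Wilcoxon signed-rank tests of per-participant selection proportions against a fixed expected proportion. A sketch of such a test (the data here are synthetic placeholders, not the study's; the paper's actual proportions come from participant trials):

```python
import numpy as np
from scipy.stats import wilcoxon

# Hypothetical per-participant proportions of trials in which the
# suggested design was chosen (synthetic, for illustration only).
rng = np.random.default_rng(0)
observed = rng.uniform(0.3, 0.9, size=30)

expected = 0.71  # expected proportion under appropriate reliance

# One-sample Wilcoxon signed-rank test: do observed proportions
# differ systematically from the expected proportion?
stat, p = wilcoxon(observed - expected)
print(f"W = {stat:.1f}, p = {p:.4g}")
```

A median below the expected value with a significant test indicates under-reliance; a median above it, over-reliance.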

Figure 5. The proportion of trials in the equivalent condition where participants chose what the model suggested (median = 0.78) vs. the expected proportion (0.50) (W = 5.0, p < 0.001).

Figure 6. The difference in stimuli properties along each objective (a positive value indicates a higher value for the suggested design): (a) for all conditions and (b) as a visualized interpolation for the “equivalent” condition only.

Figure 7. Interface for three objective decisions with an example-based explanation. The examples (highlighted on the left) are present only in Experiment 2b, while the additional objective (highlighted on the right) is present in both Experiment 2a and 2b.

Figure 8. Performance values of stimuli in each condition across three objectives.

Figure 9. Probability of selecting the suggested design across all participants (median = 0.62 for both Experiment 2a and 2b) vs. the expected proportion (0.71) (W = 40.0, p = 0.00056, W = 35.0, p = 0.00013).

Figure 10. The proportion of trials in the equivalent condition where participants chose what the model suggested (median = 0.45 and median = 0.44) vs. the expected proportion (0.50) (W = 165.0, p = 0.56 and W = 160.0, p = 0.22).

Figure 11. The difference in stimuli properties along each objective (a positive value indicates a higher value for the suggested design): (a) for all three objectives and (b) as pairwise comparisons of each objective.

Table 3. Suggestion selection models for Experiments 1 and 2

Figure 12. Group-level predicted probabilities from GLMMs compared to real participant selection data.

Table 4. Accuracy models for Experiments 1 and 2

Figure 13. Group-level predicted probabilities from GLMMs compared to real participant accuracy data.

Figure 14. Distribution of participants’ answers to the question “What percentage of the time do you think the model suggested the better design?” (median = 70%). Answer choices ranged from 0% to 100% in increments of 10%.

Table 5. Strategies and reasoning from open-ended survey questions

Figure 15. Probability of selecting the suggested design vs. the expected proportion for the correct and incorrect conditions using reclassification under the 1% and 5% thresholds in Experiment 1.

Figure 16. Probability of selecting the suggested design vs. the expected proportion for the equivalent conditions using reclassification under the 1% and 5% thresholds in Experiment 1.

Figure 17. Probability of selecting the suggested design vs. the expected proportion (0.5) for the correct and incorrect conditions using reclassification under the 1% and 5% thresholds in Experiment 2.

Figure 18. Probability of selecting the suggested design vs. the expected proportion for the equivalent conditions using reclassification under the 1% and 5% thresholds in Experiment 2.