
Enabling multi-modal search for inspirational design stimuli using deep learning

Published online by Cambridge University Press:  27 July 2022

Elisa Kwon
Affiliation:
Department of Mechanical Engineering, University of California, Berkeley, CA, USA
Forrest Huang
Affiliation:
Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, CA, USA
Kosa Goucher-Lambert*
Affiliation:
Department of Mechanical Engineering, University of California, Berkeley, CA, USA
Author for correspondence: Kosa Goucher-Lambert, E-mail: kosa@berkeley.edu

Abstract

Inspirational stimuli are known to be effective in supporting ideation during early-stage design. However, prior work has predominantly constrained designers to text-only queries when searching for stimuli, which is inconsistent with real-world design behavior, where fluidity across modalities (e.g., visual and semantic) is standard practice. In the current work, we introduce a multi-modal search platform that retrieves inspirational stimuli in the form of 3D-model parts using text-, appearance-, and function-based search inputs. Computational methods leveraging a deep-learning approach are presented for designing and supporting this platform, which relies on deep neural networks trained on a large dataset of 3D-model parts. This work further presents the results of a cognitive study (n = 21) in which the search platform was used to find parts to inspire solutions to a design challenge. Participants engaged with three different search modalities: by keywords, by 3D parts, and by user-assembled 3D parts in their workspace. When searching by selected parts or by parts in their workspace, participants had additional control over how similar the retrieved results were to the input in appearance and function. The results of this study demonstrate that the search modality impacts search behavior, including search frequency, how retrieved results are engaged with, and how broadly the search space is covered. Specific results link interactions with the interface to search strategies participants may have used during the task. Findings suggest that when searching for inspirational stimuli, desired results can be achieved both through direct search inputs (e.g., by keyword) and through more serendipitously discovered examples, where no specific goal was defined. Enabling both search processes is found to be important when designing search platforms for inspirational stimuli retrieval.
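The abstract describes retrieval of 3D-model parts by nearness in learned embedding spaces, with user control over appearance versus function similarity. Below is a minimal sketch of how such weighted retrieval might work; the function name, slider weights, and min-max normalization are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def retrieve_parts(query_app, query_fn, app_bank, fn_bank,
                   w_app=0.5, w_fn=0.5, k=10):
    """Rank dataset parts by a weighted blend of appearance and
    functional embedding distances to a query part.

    query_app: (128,) appearance embedding of the query part.
    query_fn:  (64,) functional embedding of the query part.
    app_bank:  (N, 128) appearance embeddings of all dataset parts.
    fn_bank:   (N, 64) functional embeddings of all dataset parts.
    w_app, w_fn: hypothetical user-controlled similarity weights.
    """
    d_app = np.linalg.norm(app_bank - query_app, axis=1)
    d_fn = np.linalg.norm(fn_bank - query_fn, axis=1)
    # Rescale each distance to [0, 1] so the two weights are comparable.
    d_app = (d_app - d_app.min()) / (d_app.max() - d_app.min() + 1e-8)
    d_fn = (d_fn - d_fn.min()) / (d_fn.max() - d_fn.min() + 1e-8)
    combined = w_app * d_app + w_fn * d_fn
    return np.argsort(combined)[:k]  # indices of the k closest parts
```

Raising `w_app` relative to `w_fn` would favor visually similar parts, mirroring the similarity controls participants used during part and workspace searches.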

Information

Type
Research Article
Creative Commons
CC BY
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
Copyright © The Author(s), 2022. Published by Cambridge University Press

Fig. 1. Overview of the transformation of appearance-network embeddings into functional embeddings. The appearance embedding of an input part (scissor blade) was used by the functional network to generate a predicted functional transformation. The functional network was then trained by treating this prediction as similar to the appearance embedding of a neighboring part (scissor handle) and dissimilar to the appearance embedding of an unrelated part (chair leg). The intermediate representation within the functional network was used as the functional embedding of each model part in the dataset.
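Figure 1 describes a triplet-style objective: the functional network's prediction is pulled toward the appearance embedding of a functionally neighboring part and pushed away from that of an unrelated part. A minimal PyTorch sketch of one such training step follows; the embedding sizes match the 128-dimensional appearance and 64-dimensional functional embeddings reported later, but the layer structure, margin, and optimizer settings are assumptions.

```python
import torch
import torch.nn as nn

# Hypothetical functional network: maps a 128-d appearance embedding to a
# predicted appearance embedding of a functionally neighboring part. The
# 64-d intermediate layer serves as the part's functional embedding.
class FunctionalNet(nn.Module):
    def __init__(self, app_dim=128, fn_dim=64):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(app_dim, fn_dim), nn.ReLU())
        self.decoder = nn.Linear(fn_dim, app_dim)

    def forward(self, app_emb):
        fn_emb = self.encoder(app_emb)       # functional embedding, kept for retrieval
        return self.decoder(fn_emb), fn_emb  # predicted transformation, embedding

net = FunctionalNet()
triplet = nn.TripletMarginLoss(margin=1.0)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

# One illustrative step with placeholder embeddings: the prediction for the
# input part (scissor blade) is pulled toward a neighboring part's appearance
# embedding (scissor handle) and pushed away from an unrelated part's (chair leg).
blade, handle, chair_leg = (torch.randn(1, 128) for _ in range(3))
pred, _ = net(blade)
opt.zero_grad()
loss = triplet(pred, handle, chair_leg)
loss.backward()
opt.step()
```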

Fig. 2. (a) Search results for a keyword search of the term “container”; (b) search results for a part search using a result from the keyword search for “container”.

Fig. 3. Interactions with the selected part in Figure 2: (a) adding the part to the workspace; (b) viewing the part in context by seeing related parts, with text labels, in the same object assembly; and (c) adding the part to a gallery of saved 3D parts.

Fig. 4. Overview of flow between subtasks: training on search types and features in the interface preceded presentation of instructions and completion of each subtask.

Table 1. Overview of search types and inputs specified for each subtask of the cognitive study

Table 2. Summary of rank-based accuracies for similarity measures of test set data describing retrieval behavior of neural networks

Table 3. Frequencies of new and modified part searches with changes in functional and/or appearance similarity (+: increasing similarity, −: decreasing similarity)

Table 4. Frequencies of new and modified workspace searches with changes in functional and/or appearance similarity (+: increasing similarity, −: decreasing similarity)

Fig. 5. Differences between observed and expected values of parts engaged with and not engaged with, shown by search type.

Fig. 6. Differences between the observed and expected values of parts viewed in context and added to the workspace, shown by search type.

Fig. 7. 2D visualization of the 128-dimensional appearance-based embedding space: parts retrieved during the study using each search modality are represented by their appearance embeddings. Examples closely related (yellow, chair seat) and distantly related (green, lamp shade) in appearance to the reference part (black, tabletop) are shown.
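Figures 7 and 9 plot high-dimensional embeddings in two dimensions. This excerpt does not name the projection method, so the sketch below assumes t-SNE via scikit-learn; the input file name is hypothetical.

```python
import numpy as np
from sklearn.manifold import TSNE

# Hypothetical file holding the (N, 128) appearance embeddings of all
# parts retrieved during the study.
app_embeddings = np.load("appearance_embeddings.npy")

# Project to 2D for plotting. t-SNE preserves local neighborhoods, so parts
# that are close in the 128-d space tend to land near each other in the plot.
coords_2d = TSNE(n_components=2, perplexity=30,
                 random_state=0).fit_transform(app_embeddings)
```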

Fig. 8. Expanded view of a cluster of parts in the 2D visualization of the appearance-based embeddings. Parts 1–4 are the top four nearest neighbors, by Euclidean distance in the full 128-dimensional embedding space, to a trash can lid. Part * appears close to the trash can lid in the 2D projection based on visual inspection.
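As the caption notes, proximity in the 2D projection does not guarantee proximity in the full 128-dimensional space, which is why Part * need not appear among Parts 1–4. A sketch of the underlying lookup, assuming a simple brute-force Euclidean nearest-neighbor search:

```python
import numpy as np

def nearest_neighbors(query_idx, embeddings, k=4):
    """Indices of the k nearest parts to embeddings[query_idx], measured by
    Euclidean distance in the full embedding space, not the 2D projection."""
    dists = np.linalg.norm(embeddings - embeddings[query_idx], axis=1)
    dists[query_idx] = np.inf  # exclude the query part itself
    return np.argsort(dists)[:k]
```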

Fig. 9. 2D visualization of the 64-dimensional function-based embedding space: parts retrieved during the study using each search modality are represented by their functional embeddings. Examples closely related (yellow, sink drawer face) and distantly related (green, chair legs) in function to the reference part (black, cabinet door) are shown.

Table 5. Total variation and highest variance of a single variable in appearance (128-dimensional) and functional (64-dimensional) embedding spaces by search type
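Table 5 reports two spread statistics per search type over each embedding space. A sketch of one plausible computation, assuming "total variation" means the sum of per-dimension variances and "highest variance of a single variable" the maximum per-dimension variance; the exact statistics used in the article may differ.

```python
import numpy as np

def embedding_spread(embs):
    """embs: (N, D) embeddings of parts retrieved under one search type.
    Returns (total variation, highest single-dimension variance)."""
    per_dim_var = embs.var(axis=0)  # variance along each embedding dimension
    return per_dim_var.sum(), per_dim_var.max()

# Usage with hypothetical data: spread of (N, 128) appearance embeddings of
# parts retrieved via keyword search.
# total_var, peak_var = embedding_spread(keyword_appearance_embs)
```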