Multimodal generative AI for conceptual design: enabling text-based and sketch-based human-AI conversations

Published online by Cambridge University Press: 27 August 2025

Gaelle Baudoux*
Affiliation:
University of California, Berkeley, USA
Chenjun Guo
Affiliation:
University of California, Berkeley, USA
Kosa Goucher-Lambert
Affiliation:
University of California, Berkeley, USA

Abstract:

Recent advances in AI offer promising opportunities for creative design, particularly through the generation of inspirational images. While prior research has explored the general benefits and limitations of text-to-image tools, agile, multimodal prompting could overcome some of these limitations by enabling human-AI interaction that is better suited to the project at hand. We present a system that supports both text-based and sketch-based image generation and serves as a research artefact for studying creativity support through multimodal generative AI. The system enables dynamic dialogue-based interaction and visualizes the respective contributions of the designer and the AI. This paper focuses on the system's development, with the aim of enabling future research through design on how multimodal prompting can influence the design process.

Information

Type
Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives licence (http://creativecommons.org/licenses/by-nc-nd/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is unaltered and is properly cited. The written permission of Cambridge University Press must be obtained for commercial re-use or in order to create a derivative work.
Copyright
© The Author(s) 2025
Figure 1. Extract of conversational co-creative loops between a designer and our AI system

Figure 2. Software architecture and interface visuals

Figure 3. Diagram of the prompt structure

Table 1. Prompt’s textual fixed parts

Figure 4. User tester scoring on the Creativity Support Index evaluation scale