
Multi-modal sensing and analysis of poster conversations with smart posterboard

Published online by Cambridge University Press: 02 March 2016

Tatsuya Kawahara*
Affiliation:
Academic Center for Computing and Media Studies, Kyoto University, Sakyo-ku, Kyoto 606-8501, Japan
Takuma Iwatate
Affiliation:
Academic Center for Computing and Media Studies, Kyoto University, Sakyo-ku, Kyoto 606-8501, Japan
Koji Inoue
Affiliation:
Academic Center for Computing and Media Studies, Kyoto University, Sakyo-ku, Kyoto 606-8501, Japan
Soichiro Hayashi
Affiliation:
Academic Center for Computing and Media Studies, Kyoto University, Sakyo-ku, Kyoto 606-8501, Japan
Hiromasa Yoshimoto
Affiliation:
Academic Center for Computing and Media Studies, Kyoto University, Sakyo-ku, Kyoto 606-8501, Japan
Katsuya Takanashi
Affiliation:
Academic Center for Computing and Media Studies, Kyoto University, Sakyo-ku, Kyoto 606-8501, Japan
*
Corresponding author: T. Kawahara. Email: kawahara@i.kyoto-u.ac.jp

Abstract

Conversations in poster sessions at academic events, referred to as poster conversations, pose interesting and challenging problems for multi-modal signal and information processing. We have developed a smart posterboard for multi-modal recording and analysis of poster conversations. The smart posterboard is equipped with multiple sensing devices to record poster conversations, allowing us to review who came to the poster and what kinds of questions or comments they made. The conversation analysis incorporates face and eye-gaze tracking for effective speaker diarization. We demonstrate that eye-gaze information is useful both for predicting turn-taking and for improving speaker diarization. Moreover, high-level indexing of the audience's interest and comprehension level is explored based on multi-modal behaviors during the conversation. This is realized by predicting the audience's speech acts, such as questions and reactive tokens.
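The abstract states that eye-gaze information helps predict turn-taking. A minimal sketch of that idea is shown below: gaze durations observed near the end of an utterance are combined into a linear score, and a speaker change is predicted when the score exceeds a threshold. All feature names, weights, and the threshold here are illustrative assumptions, not the authors' actual model, which is trained from the statistics reported in Tables 1–5.

```python
# Hypothetical sketch: predicting speaker change from joint eye-gaze events.
# Weights and threshold are made-up illustrations, not the paper's trained model.
from dataclasses import dataclass


@dataclass
class GazeWindow:
    """Gaze durations (in seconds) observed near the end of an utterance."""
    presenter_at_audience: float  # presenter looks at the audience member
    audience_at_presenter: float  # audience member looks at the presenter
    mutual_gaze: float            # joint eye-gaze event (both look at each other)


def speaker_change_score(w: GazeWindow) -> float:
    # Mutual gaze and the presenter gazing at the audience are treated as
    # floor-yielding cues; the audience watching the presenter suggests the
    # presenter keeps the floor.
    return (1.5 * w.mutual_gaze
            + 0.8 * w.presenter_at_audience
            - 0.5 * w.audience_at_presenter)


def predict_speaker_change(w: GazeWindow, threshold: float = 1.0) -> bool:
    """True if a speaker change (turn-taking) is predicted for this window."""
    return speaker_change_score(w) > threshold
```

For example, a window with strong mutual gaze (`GazeWindow(0.6, 0.2, 0.9)`) scores above the threshold and predicts a turn change, whereas one where the audience only watches the presenter (`GazeWindow(0.0, 1.0, 0.0)`) does not.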

Information

Type
Original Paper
Creative Commons
CC BY
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.
Copyright
Copyright © The Authors, 2016
Fig. 1. Proposed scheme of multi-modal sensing and analysis.

Fig. 2. Overview of the smart posterboard.

Fig. 3. Process flow of multi-modal sensing.

Fig. 4. Statistics of eye-gaze and its relationship with turn-taking (ratio).

Table 1. Duration of eye-gaze and its relationship with turn-taking (s).

Table 2. Definition of joint eye-gaze events by presenter and audience.

Table 3. Statistics of joint eye-gaze events by presenter and audience in relation to turn-taking (ratio of occurrence frequency).

Table 4. Prediction results for speaker change.

Table 5. Prediction results for the next speaker.

Table 6. Evaluation of speaker diarization (DER (%)).

Fig. 5. Distribution of interest and comprehension level according to question type.

Table 7. Relationship between the audience's eye-gaze at the presenter (count/utterance and duration ratio) and questions (by type).

Table 8. Prediction results for topic segments involving questions and/or reactive tokens.

Table 9. Identification results for confirming or substantive questions.

Fig. 6. Poster conversation browser.