
When to stop social learning from a predecessor in an information-foraging task

Published online by Cambridge University Press:  20 January 2025

Hidezo Suganuma*
Affiliation:
Department of Social Psychology, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-0033, Japan
Aoi Naito
Affiliation:
School of Environmental Society, Institute of Science Tokyo, 3-3-6 Shibaura, Minato-ku, Tokyo 108-0023, Japan; Japan Society for the Promotion of Science, 5-3-1 Kojimachi, Chiyoda-ku, Tokyo 102-0083, Japan
Kentaro Katahira
Affiliation:
Human Informatics and Interaction Research Institute, National Institute of Advanced Industrial Science and Technology, Tsukuba, Ibaraki 305-8566, Japan
Tatsuya Kameda
Affiliation:
Faculty of Mathematical Informatics, Meiji Gakuin University, 1518 Kamikuratachou, Totsuka-ku, Yokohama 244-8539, Japan; Center for Interdisciplinary Informatics, Meiji Gakuin University, 1-2-37 Shirokanedai, Minato-ku, Tokyo 108-8636, Japan; Center for Experimental Research in Social Sciences, Hokkaido University, N10W7, Kita-ku, Sapporo, Hokkaido 060-0810, Japan; Brain Science Institute, Tamagawa University, 6-1-1 Tamagawagakuen, Machida, Tokyo 194-8610, Japan
*Corresponding author: Hidezo Suganuma; Email: suganuma.hiz@gmail.com

Abstract

Striking a balance between individual and social learning is one of the key capabilities that support adaptation under uncertainty. Although intergenerational transmission of information is ubiquitous, little is known about when and how newcomers switch from learning loyally from preceding models to exploring independently. Using a behavioural experiment, we investigated how social information available from a preceding demonstrator affects the timing of becoming independent and individual performance thereafter. Participants worked on a 30-armed bandit task for 100 trials. For the first 15 trials, participants simply observed the choices of a demonstrator who had accumulated more knowledge about the environment and passively received rewards from the demonstrator's choices. Thereafter, participants could switch to making independent choices at any time. We had three conditions differing in the social information available from the demonstrator: choice only, reward only or both. Results showed that both participants’ strategies about when to stop observational learning and their behavioural patterns after independence depended on the available social information. Participants generally failed to make the best use of previously observed social information in their subsequent independent choices, suggesting the importance of direct communication beyond passive observation for better intergenerational transmission under uncertainty. Implications for cultural evolution are discussed.

Information

Type
Research Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
Copyright © The Author(s), 2025. Published by Cambridge University Press

Figure 1. Overview of the experiment. Thirty options were arrayed in a 5 × 6 grid and displayed as 30 tiles with unique labels (e.g., P, K, Q) on a computer screen. Participants worked on the 30-armed bandit task for a total of 100 trials. The gray arrow illustrates trials in which participants observed the demonstrator's choice (the Choice-only condition), reward amount (the Reward-only condition), or both (the Choice-plus-reward condition) in each trial, without making their own choices. For those observational trials, the option selected by the demonstrator (tile U in this example) was revealed in the Choice-only and Choice-plus-reward conditions, while in the Reward-only condition, just the reward amount the demonstrator obtained was shown at the center. Observation was mandatory for the first 15 trials, but after this “mandatory observation” phase, participants could switch to independence at any trial (the “optional independence” phase: 16th-60th trial). After deciding to become independent or entering the “mandatory independence” phase (61st-100th trial), participants made their own choices without observation (the black arrow). The option that the participant chose in the previous trial was highlighted in pink.


Figure 2. Timing of independence and behavioral performance thereafter. (a) Survival curves for the decision to continue observing the demonstrator's behaviors. The horizontal axis corresponds to the elapsed trial after the 15th trial (i.e., the last trial of the mandatory observation phase). The curve for the Reward-only condition does not reach zero, because there were participants who continued observation to the limit (the 60th trial), whereas all participants in the other two conditions switched to independence before the limit. (b) Behavioral performance after independence. We used the mean of the quality of options chosen by participants as a performance index, which ranges from 1 (choosing only the worst-category options) to 6 (choosing only the best-category option).
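As a rough illustration of the two quantities plotted in Figure 2, the following Python sketch computes an empirical survival curve for continued observation and the mean-quality performance index. The function names, the encoding of switch times (`None` for a participant who observed to the limit), and the toy data are assumptions made for illustration; they are not taken from the authors' analysis code.

```python
def survival_curve(switch_times, max_elapsed=45):
    """Proportion of participants still observing after each elapsed trial.

    switch_times: one entry per participant, giving the elapsed trial
    (counted from the 15th trial) at which the participant became
    independent, or None if they kept observing to the limit.
    This encoding is an assumption for the sketch.
    """
    n = len(switch_times)
    curve = []
    for t in range(max_elapsed + 1):
        # A participant is still observing at elapsed trial t if they
        # never switched (None) or switched later than t.
        still_observing = sum(1 for s in switch_times if s is None or s > t)
        curve.append(still_observing / n)
    return curve


def performance_index(option_qualities):
    """Mean quality category of the chosen options (1 = worst, 6 = best)."""
    return sum(option_qualities) / len(option_qualities)


# Toy data (hypothetical): four participants, one of whom never switched.
toy_switch_times = [1, 3, 3, None]
toy_curve = survival_curve(toy_switch_times, max_elapsed=4)
toy_index = performance_index([1, 6, 6, 3])
```

With the toy data, the curve stays above zero at the final elapsed trial because one participant never switched, mirroring the pattern described for the Reward-only condition.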


Figure 3. Pair-level comparison of behavioral performance between demonstrators and participants. The x-axis shows the paired demonstrator's performance (averaged across trials) after the participant switched to independence. The y-axis shows the participant's performance during the same independent trials. Dot colors indicate the number of observational trials for each participant, with lighter colors meaning longer observation. Diagonal lines mark the cases where performance is equal within the pair.


Figure 4. Participants generally exhibited exploratory behavior after independence. (a) Cosine similarity of the choice proportions among the 30 options between each pair. This value represents overall “option-by-option” choice similarity between the demonstrator and the participant (i.e., participant's after-independence imitation of the demonstrator's choices during the observation trials), ranging from 0 (not similar at all) to 1 (highly similar). (b) Behavioral similarity between a participant and the demonstrator during the first 40 trials after independence. We checked whether a participant chose any of the options that the demonstrator had chosen during the observation period in each trial. The graph shows the proportions of the participants who exhibited such behavioral similarity in each trial. (c) Exploration rates. Each dot refers to the proportion of exploratory choices in which a participant selected options other than the best-known option (i.e., the option with which she/he had directly experienced the largest mean reward up to the preceding trial) after independence. (d) Change in the proportion of participants who chose the optimal (single-best) option. The proportion of demonstrators choosing the optimal option (i.e., the demonstrator's continued improvement during the mandatory independence phase, with the time lag adjusted) is also shown for comparison (black line).
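The measures in panels (a) and (c) of Figure 4 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, options are indexed 0 to 29, ties for the best-known option are broken in favor of the option experienced first, and the first independent trial is left unscored because no option has been directly experienced yet.

```python
import math
from collections import Counter


def choice_proportions(choices, n_options=30):
    """Proportion of choices allocated to each of the n_options options."""
    counts = Counter(choices)
    total = len(choices)
    return [counts.get(i, 0) / total for i in range(n_options)]


def cosine_similarity(p, q):
    """Cosine similarity of two non-negative vectors (0 to 1)."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm > 0 else 0.0


def exploration_rate(choices, rewards):
    """Share of choices that differ from the best-known option.

    The best-known option is the one with the largest mean reward
    directly experienced up to the preceding trial. Tie-breaking and
    skipping the first trial are assumptions of this sketch.
    """
    totals, counts = {}, {}
    exploratory = scored = 0
    for choice, reward in zip(choices, rewards):
        if counts:  # at least one option has been experienced
            best = max(counts, key=lambda o: totals[o] / counts[o])
            scored += 1
            if choice != best:
                exploratory += 1
        totals[choice] = totals.get(choice, 0) + reward
        counts[choice] = counts.get(choice, 0) + 1
    return exploratory / scored if scored else 0.0


# Toy data (hypothetical): demonstrator choices during observation and
# participant choices after independence.
demo = choice_proportions([20, 20, 20, 7])
part = choice_proportions([20, 3, 3, 3])
similarity = cosine_similarity(demo, part)
```

Because choice proportions are non-negative, the cosine similarity falls between 0 and 1, matching the range stated in the caption.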

Supplementary material: File

Suganuma et al. supplementary material (File, 21.2 MB)