How open science can benefit bilingualism research: A lesson in six tales

Abstract
Bilingualism is hard to define, measure, and study. Sparked by the “replication crisis” in the social sciences, a recent discussion on the advantages of open science is gaining momentum. Here, we join this debate to argue that bilingualism research would greatly benefit from embracing open science. We do so in a unique way, by presenting six fictional stories that illustrate how open science practices – sharing preprints, materials, code, and data; pre-registering studies; and joining large-scale collaborations – can strengthen bilingualism research and further improve its quality.


Introduction
Cutting-edge research pushes the frontiers of the unknown and provides more complete scientific explanations of our world (Scheel, Tiokhin, Isager & Lakens, 2020). At the same time, no matter how impressive or popular a study is, replication by independent researchers is the ultimate test of scientific knowledge (Nosek & Errington, 2020). In this way, science advances in an interplay of innovation and confirmation, improving with each iteration. However, recent assessments of the scientific literature have revealed that this interplay may be unbalanced (Ioannidis, 2005; Simmons, Nelson & Simonsohn, 2011). Some research findings formerly thought to provide solid evidence have been shown to be hardly replicable, calling established social science phenomena into question (Dailey & Bergelson, 2021; Ebersole et al., 2020; Klein et al., 2013; Open Science Collaboration, 2015).
In response, researchers have turned to a set of practices (sharing preprints, materials, code, and data; pre-registering studies; and joining large-scale collaborations) that is commonly known as OPEN SCIENCE and is increasingly embraced by many disciplines (Nosek et al., 2015). Discussions of open science in bilingualism research in the published literature have been limited (although see Leivada et al., 2020), yet we believe that these practices can be especially beneficial to bilingualism research, in particular by improving how researchers approach and understand the rich variation inherent to bilingualism (Marian & Hayakawa, 2020). One example of how open science can bolster bilingualism research comes from Byers-Heinlein and colleagues (2021), as part of the ManyBabies Consortium. The Consortium set out to replicate a robust finding from the developmental literature: infants' preference for infant-directed over adult-directed speech. Using open science practices, 17 labs from 7 countries collected data from 333 bilinguals and 385 monolinguals between the ages of 6 and 15 months and found that bilinguals and monolinguals showed a comparable preference for infant-directed over adult-directed speech. Having such an unusually large sample of bilingual infants also allowed the researchers to demonstrate a continuous relationship between exposure to the stimulus language and the magnitude of infants' preference. Critically, these effects were found after controlling for variance arising from labs and infants, either by entering them as random effects in a mixed-effects model or by estimating their variation using a meta-analytic approach, making results more generalizable and helping to disentangle competing explanations. Other researchers have also shown how open science can benefit bilingualism research on more disputed topics, such as bilingual cognitive advantages (Leivada et al., 2020). However, while some bilingualism researchers have begun to adopt open science practices, these are not yet the norm everywhere in bilingualism research. This might be due to lack of awareness as well as real or perceived challenges in changing the way that research is conducted.
We believe that the adoption and consistent use of open science can lead us to a much stronger science of bilingualism (Ioannidis, 2005; John et al., 2012; Simmons et al., 2011; Wicherts et al., 2016). Yet, the transition to open science can sometimes be fraught with feelings of anxiety and uncertainty. We suggest imagining the researcher who is new to open science as the protagonist at the outset of an adventure story. The tale usually begins with the researcher safe and warm in the lab, cozying up to their traditional research practices, not knowing that an adventure awaits. But then it happens: a mysterious journal article mentions the words "Registered Report", an enthusiastic colleague pleads for them to join their own open science journey to pre-register their collaborative study, or perhaps an all-powerful funding agency decrees that open science shall be the rule of the land. Will the researcher brave the unknown and embark on an adventure towards open science practices?
This paper is aimed primarily at those researchers who are wondering if they should embark on the journey and what they might encounter along the way, although it may also be of interest to others who have begun to venture forth but could use a little encouragement. Here, we have drawn from our lab's open science experiences to craft six tales of bilingualism researchers discovering different components of open science: The Legend of the Preprint; A Fairy Tale About Pre-Registration; Open Materials: A Memoir; Open Data and Analysis Code: Mystery Solved; An Epic Tale of Large-Scale Collaboration; and The Story of the Adventurous Dr. All-in-One. For readers who are short on time or less into storytelling, we refer you to Table 1, which summarizes the open science practices illustrated in the stories and provides further readings. For those who do read on, we hope that our stories resonate with your experiences, as they have with our own, and provide you with encouragement to begin or continue your open science adventure.

The legend of the preprint
Dr. File-Drawer was frustrated. It was a feeling they had experienced before when submitting to their favorite and most well-respected journal in their field, The International Journal of Esperanto and Exercise. This journal was famous for its 17-round peer-review process. Dr. File-Drawer was reading the first round of 7 reviews that each had divergent points of view on their study. The research had compared monolinguals and bilinguals on their ability to learn Esperanto while running on a treadmill. The study was methodologically sound, but there was no difference in performance between the two groups: the dreaded null result. Despite knowing its importance to other researchers (especially for meta-analyses), Dr. File-Drawer feared that it would be months, nay years, before the paper would reach the end of the peer review process, which was particularly arduous for papers that reported null results. And indeed, if Dr. File-Drawer's resolve faltered, this study could end up like so many others, in the file drawer.
But this time there was hope. There were legends circulating of a mythical place where research could be shared promptly and openly: The Land of the Preprint. Although this land had odd place names (e.g., PsyArXiv, bioRxiv, arXiv, MetaArXiv), it seemed that anyone around the world, from inside or outside of academia, could enter this land and learn about the latest scientific findings, even before the peer-review process was complete. After a quick search on the internet, Dr. File-Drawer was mesmerized. The Land of the Preprint was real! Wandering throughout this land, Dr. File-Drawer found many interesting projects that they had never seen before, and marveled at such an inclusive, diverse scientific oasis. Indeed, as they explored further, Dr. File-Drawer found one interesting study that had already tested differences between monolinguals and bilinguals in learning Esperanto while swimming, which appeared to also be under review at The International Journal of Esperanto and Exercise: perfect to cite in their own paper! The manuscripts varied in quality, but Dr. File-Drawer was able to use an open comments feature to publicly participate in discussions of different papers, and in a few cases they e-mailed the authors to schedule a more in-depth video chat.
Despite the expected years-long wait until publication, Dr. File-Drawer did not feel quite so daunted anymore. Even while the paper underwent the journal's extensive review process, Dr. File-Drawer posted their paper as a citable preprint, uploading updated versions from time to time that incorporated feedback from both peer-reviewers and others who happened upon the preprint. Indeed, Dr. File-Drawer noticed that some researchers posted their preprints for feedback even prior to journal submission.
After the expected 17 rounds of revisions over 7 years, Dr. File-Drawer's paper was finally accepted for publication. As a last step, Dr. File-Drawer uploaded a final version of their manuscript to The Land of the Preprint in accordance with the journal's open-access policies (see https://v2.sherpa.ac.uk/romeo). Dr. File-Drawer kept the copyright to their paper, and scientists around the world were able to read their important work without paying the exorbitant access fees charged by The International Journal of Esperanto and Exercise.
Dr. File-Drawer appreciated the peer review process for the valuable feedback that reviewers provide, and for its role in verifying the scientific quality of contributions prior to their entry into a field's literature, but had always felt frustrated by its glacial pace. At the end, they felt they had the best of both worlds: a peer-reviewed publication in a well-regarded journal, which was also freely available to other researchers as a preprint.

A fairy tale about pre-registration
Once upon a time, while watching a track and field competition, Dr. Fortune-Teller noticed that bilingual competitors were unusually likely to win the walking-backwards event. Could monolingual-bilingual differences in walking backwards be a real effect? To test this idea more formally, he designed an elegant study, with rigorous analyses, and happily found that his hypothesis was supported by the data! But reviewers were in disbelief that Dr. Fortune-Teller had accurately predicted the (admittedly unusual) effect he found and suspected he had Hypothesized After the Results were Known (HARKing; Kerr, 1998). Because of these doubts, Dr. Fortune-Teller was unable to publish his groundbreaking findings. He was so downtrodden that he vowed to never let this happen again.
Remembering a piece of advice from his trusted colleague Dr. Early-Bird, an advocate for open science practices such as pre-registration, the next time that Dr. Fortune-Teller made an unusual observation about bilinguals that he planned to test formally (that they could learn Klingon faster than monolinguals), he decided to pre-register his study. Dr. Fortune-Teller wrote up his hypothesis, data collection plan, and data analysis plan as a pre-registration and registered it on the Open Science Framework (https://osf.io). This created an eternal, crystal clear record of his hypotheses, predictions, methods, and data analysis plan, stamped indelibly with the exact moment of its creation.
His predictions again proved to be remarkably accurate, and the results of this study supported his hypothesis! This time, when the reviewers questioned whether he had engaged in HARKing, Dr. Fortune-Teller was ready. His pre-registration demonstrated that he had in fact hypothesized the effect before running the study. He also didn't despair when some unexpected findings that even he couldn't predict occurred: he simply wrote them up as exploratory results. Clever Dr. Fortune-Teller! Excited about the feedback he was getting on his new publication, Dr. Fortune-Teller reached out to Dr. Early-Bird to thank her for the advice. Dr. Early-Bird was thrilled but then said with a cryptic grin, "But, of course, pre-registration is not the entire story. Registered Reports are also changing research pipelines. [Dramatic pause.] They have all the benefits of pre-registration, with the added advantage of a two-stage peer review!" Dr. Early-Bird went on to explain that in the first stage, reviewers evaluate the research idea and design and eventually grant "in-principle" acceptance. With this initial blessing, the author collects and analyzes the data, then writes the final manuscript. In the second stage of review, the reviewers check everything to make sure the authors followed their original analysis plan, but they also welcome additional exploratory analyses (as long as they are labeled as such). Critically, the statistical significance of the results does not affect whether or not the report is published. Dr. Fortune-Teller could not help but think that the publication bias that had long loomed over bilingualism research would start to dissipate as pre-registrations and Registered Reports became more common. Indeed, Dr. Fortune-Teller's hope was not misplaced. He soon learned that more and more journals in his field were accepting Registered Reports submissions (https://www.cos.io/initiatives/registered-reports) and that despite requiring some up-front preparation, Registered Reports lead to more reproducible and transparent research (Chambers & Tzavella, 2021). Dr. Fortune-Teller was glad to see that his friend got it right again; open science could indeed lead to a brighter and more reliable science of bilingualism.

Open materials: A memoir
Future-Dr. Brouillard (yes, a co-author of this paper!) woke up on a beautiful summer morning and decided out of the blue to move data collection for her in-person study online. Okay, maybe it was not quite like that, let's try again. It was early 2020 and the world had been overtaken by a pandemic. Chaos and uncertainty abounded, but one thing remained certain for Future-Dr. Brouillard: data collection must continue.
Though many of the tasks she was using in her experiment were standardized measures, already administered online by other researchers with published results, Future-Dr. Brouillard did not have much luck finding existing code for programming these tasks to be presented online. She felt like she was re-inventing the wheel, having to create most of her tasks from scratch. It took her weeks to record the audio stimuli alone. However, the internet is a big place. After tireless searching, Future-Dr. Brouillard discovered another researcher's previously written code for one of the tasks and was quickly able to adapt it to her needs. Using previously developed (and tested) materials not only saved Future-Dr. Brouillard weeks of work and troubleshooting, but also improved the standardization, and, most likely, replicability of her experimental tasks (Hales et al., 2019). She could be confident her results were comparable to other researchers' without worrying that small differences in stimuli or task presentation were affecting her findings.
In the hopes that it might save other researchers time and effort, as the code she had found had done for her, Future-Dr. Brouillard uploaded her own materials, including code and stimuli, to the Open Science Framework (https://osf.io; see also Brouillard & Byers-Heinlein, 2021 for the repository mentioned in this tale). Now, should anyone wish to replicate or expand on Future-Dr. Brouillard's work, they could use the very same stimuli that she used and also consult her code to answer any questions not addressed in her manuscript. If, instead, a researcher is studying a population in a different cultural context, Future-Dr. Brouillard's open materials could make it easier to adapt them to be culturally-sensitive resources for another study. Future-Dr. Brouillard would of course be credited for her contributions via citation.
Future-Dr. Brouillard felt confident that openly sharing materials could make a big difference for the field of bilingualism. For instance, one of the most notorious debates in bilingualism research regards the very definition of bilingualism (early/late, simultaneous/sequential, balanced/unbalanced, categorical/continuous…; Kremin & Byers-Heinlein, 2021; Surrain & Luk, 2019). By openly sharing questionnaires and scripts used during a study, colleagues (or virtual colleagues, to be more exact) could quickly identify which definition was used to guide inclusion and exclusion criteria. Moreover, they could examine the effects of using different criteria in analyzing the dataset. In addition, by openly sharing materials, hidden moderators such as stimulus complexity and differences in task programming are placed out in the open, and these topics are particularly important in bilingualism research. Future researchers, particularly those interested in directly replicating the original results, could build on previous work instead of starting from scratch every time, increasing the reliability of bilingualism science.
After uploading the last of her materials to the Open Science Framework, Future-Dr. Brouillard sighed contentedly. Soon those beautiful days when she could run her studies in person again would return. Whether her studies would be online or in person, she knew that she would always be part of the open materials community, working to create more interpretable, reliable, and replicable research.

Open data and analysis code: Mystery solved
The Bilingual Research Convention was about to start. Dr. Analysis II, herself fluent in 26 languages, was early, as usual. The eerie light of the vending machine mixed with the smell of fresh coffee filled the room with a sense of anticipation. While waiting for the talks to start, Dr. Analysis II headed toward the baked goods, following the unmistakable scent of a fresh cherry danish. As she reached for a pastry, she caught a glimpse of a poster booth buzzing with activity. The gasps and whispers coming from the crowd became louder as she approached the poster. It all made sense when she read the title: "Bilinguals are 10 times more likely to break their legs: a study spanning 3 skiing seasons" by Dr. Sharemuch and colleagues. She must have spent close to an hour examining the poster in silence. How can this be? she thought. All those years skiing by day and studying bilinguals by night, and I never noticed we were at such risk! The more she read, the more questions she had. The worst part was that the poster author would arrive late to the conference, and two days of waiting was too much for Dr. Analysis II. Enough was enough! Just as she felt consumed by frustration, something caught her eye. It was a small detail that made a big difference: two colorful badges, the open materials and open data badges, next to a checkered QR code. She was so happy she almost hugged a crowd of undergraduate students who had stopped to peek at the poster everyone at the conference was talking about.
After a full day of interesting talks and multilingual networking, Dr. Analysis II rode the crystal elevator towards her hotel room. She was weary, but her thoughts never left that one intriguing poster. She poured herself a glass of white wine and used her laptop to access the website associated with the QR code she had snagged at the poster session. She shrieked in delight as the online repository for the research appeared on her screen. It was better than she could have ever imagined! Row after row of glorious data, and folders containing other goodies: a "readme" file, variable codebooks, and full data analysis scripts.
Dr. Analysis II delved into the scripts, looking for clues to explain Dr. Sharemuch's contentious findings. After grasping the main ideas, she downloaded the data and replicated the results with her own analysis script. How could this be? How could bilingualism have anything to do with bone-breaking risk? Driven by curiosity, she delved deeper and deeper into the data for an explanation. After hours of investigation, long into the night, she found it: a third variable that could explain the pattern! The mystery was solved. Of course, she thought, bilinguals may break their legs more often, but they also tend to go down harder routes (e.g., black diamonds) than monolinguals! Finally satisfied, she fell asleep.
Two days later, when Dr. Sharemuch finally arrived at the conference, Dr. Analysis II was waiting for him with a cup of fresh coffee and the news of her discovery. After a few cups and two delicious slices of cake, Dr. Sharemuch was delighted. To see someone so eager to make use of his open data and code was a joy, and he invited Dr. Analysis II to collaborate on a new project investigating bilinguals' risk-taking compared to monolinguals. Dr. Analysis II was overwhelmed with excitement. Who would have thought that the simple act of sharing a data spreadsheet would be the key to solving mysteries, unlocking amazing discoveries, and embarking on new collaborative adventures?
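For readers who would like to see how a third variable can produce a pattern like the one in this tale, the following is a minimal sketch in Python. All counts are invented purely for illustration (as is everything else in the tale), and the names `counts`, `injury_rate`, `overall_rates`, and `rates_by_route` are hypothetical. In this toy dataset, bilinguals break their legs more often overall only because more of them ski hard routes; within each route type, the two groups have identical injury rates.

```python
from fractions import Fraction

# Hypothetical counts, invented for illustration only.
# Each cell is (number of skiers, number of broken legs).
counts = {
    "monolingual": {"easy": (900, 9), "hard": (100, 10)},
    "bilingual": {"easy": (100, 1), "hard": (900, 90)},
}


def injury_rate(cells):
    """Pooled injury rate over an iterable of (skiers, injuries) cells."""
    cells = list(cells)
    skiers = sum(n for n, _ in cells)
    injuries = sum(k for _, k in cells)
    return Fraction(injuries, skiers)


def overall_rates(data):
    """Injury rate per language group, ignoring route difficulty."""
    return {group: injury_rate(routes.values()) for group, routes in data.items()}


def rates_by_route(data):
    """Injury rate per language group within each route difficulty."""
    return {
        group: {route: injury_rate([cell]) for route, cell in routes.items()}
        for group, routes in data.items()
    }


if __name__ == "__main__":
    for group, rate in overall_rates(counts).items():
        print(f"{group:12s} overall: {float(rate):.1%}")
    for group, routes in rates_by_route(counts).items():
        for route, rate in routes.items():
            print(f"{group:12s} {route:5s}: {float(rate):.1%}")
```

A check of this kind is only possible when the raw data, including the candidate third variable, are shared alongside the paper.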

An epic tale of large-scale collaboration
Dr. Small-Town had a destiny, revealed to him in an ancient text passed down from his grandmother: to bring knowledge to his town of Elbit. Elbit was a scarcely populated area with two prominent populations, elves and hobbits. Some families had roots in both lineages and spoke both Elvish and Hobbitish at home. Dr. Small-Town was deeply interested in bilingualism and strove to recruit every bilingual he could track down, to broaden his knowledge of the Elbitian people and fulfill his destiny. Despite his efforts, he was aware that the statistical inferences he could make with a sample size as small as he had were very limited, and he felt powerless. After sharing his feelings and concerns on an appropriate online forum (i.e., Twitter), he was all but ready to give up on his passion for bilingualism research and study something else. Just then, a tweet appeared on his timeline, announcing a large-scale collaboration being carried out by infant bilingualism researchers. Against all odds, Dr. Small-Town felt his heart beating again: there was hope! Clicking the link in the tweet, Dr. Small-Town learned that large-scale collaborations could be an effective solution to the problem of small samples. He knew that this could be just the thing for him to make a contribution to bilingualism research! In the following weeks, he joined forces with other labs (who had access not only to elves and hobbits, but also centaurs and dragons), and together with researchers from these labs, he drafted a solid, pre-registered research protocol. In the next few months, Dr. Small-Town followed the protocol as closely as possible and collected data with as many Elbitian bilinguals as he could find, resulting in (drumroll, please) 17 participants! Finally, data from all the labs were pooled and analyzed using a single analytic framework. With such combined statistical power, sophisticated modeling was possible, and Dr. Small-Town was able to draw some exciting conclusions about bilinguals.
But it wasn't just the unexpected and nuanced results that excited Dr. Small-Town. By collaborating with researchers from diverse locations and backgrounds, he learned a great deal about the importance of diversity in research, with respect to both researchers and participants. For example, researchers in centaur-dominant areas, which tended to be less advantaged, faced bigger hurdles in participating in the collaboration, due to the lack of materials translated into Centaurish and older lab equipment. The consortium of labs taking part in the collaboration pooled together their resources to help pay a translator and upgrade the centaurs' equipment, so that all labs could take part as much as possible. Participants were also affected by certain research choices. Dragons seemed to have more difficulty with the demographic questionnaire than other participants, and Dr. Small-Town realized that it might have been due to dragons' cultural norms surrounding wealth and privacy. He thought back to previous studies of his where there were some confusing answers about socio-economic status and now saw this was due to a lack of understanding of participants' cultural backgrounds and the sensitivity of such information. With his new-found cultural awareness, thanks to the large-scale collaboration, he was able to redesign his questionnaires to be more widely applicable and inclusive.
Dr. Small-Town's experience in this large-scale collaboration was also invaluable in another way, for he was getting on in years, and needed to find the next-generation knowledge-seeker to whom he would pass down his ancient prophetic text. He carefully selected a protégée from his keen cohort of undergraduates, and used the lessons learned from his collaboration to introduce the standard methods, materials, and scripts that would make knowledge-generation swifter. His protégée, Holly, was eager to take part in such collaborative projects, which would give her access to real (and huge!) datasets (e.g., Hardwicke, Bohn, MacDonald, Hembacher, Nuijten, Peloquin, DeMayo, Long, Yoon & Frank, 2021) on which she could hone her skills, before her time came to take Dr. Small-Town's place as Chief Knowledge-Seeker. She took her training very seriously and innovated a new way to test even more participants: transforming in-person studies to online versions, building on the standardized methods and measures that were openly shared by the large-scale consortiums. As an old man looking back, Dr. Small-Town is happy that he did not give up on his passion for bilingualism and that he learned how to fulfill his destiny as a knowledge-seeker in an inclusive and impactful way.

The story of the adventurous Dr. All-in-One
Dr. All-in-One was excited about open science. She knew the advantages of collaborating on large-scale and online projects, pre-registering a study, openly sharing materials, data and analysis code, and posting preprints. She also knew that any additional effort to embrace open science would be worth it in the end. But Dr. All-in-One also felt uneasy. She was on the tenure track and adopting these practices would mean changing the way she had done things since her early years in bilingualism research. Her thoughts were spinning in endless loops of worry: If I post a preprint, will a potential reviewer read it and compromise the anonymous review process? Will I need to test hundreds of participants? What about data scooping? If I do a pre-registration and… These were very reasonable concerns. As if someone had heard her thoughts, a listserv email popped up in her mailbox: an online symposium on open science and bilingualism would take place in the coming weeks, and guess who was on the panel of speakers? Dr. File-Drawer, Dr. Small-Town, Dr. Fortune-Teller, Future-Dr. Brouillard, and Dr. Sharemuch. Dr. All-in-One registered at once.
At the symposium, Dr. File-Drawer was the first speaker. They explained how preprints make science more accessible, help to reduce publication bias against null results, and work in parallel to the peer-review process. Dr. All-in-One was relieved to hear that Dr. File-Drawer had not had any difficulty getting manuscripts peer-reviewed when they had also been posted as preprints. In one instance, a reviewer had read the preprint, but contacted the editor to let them know and was still allowed to review with that information disclosed. Dr. All-in-One also realized that she already shared her research prior to publication with potential reviewers, at conferences and seminars. This had never gotten in the way of the peer review process before, so she was heartened that publishing preprints wouldn't either.
Next, Dr. Small-Town shared his story of how he dealt with his sample size limitations and how he incorporated online testing. Dr. All-in-One felt more at ease. It was now clear that although testing hundreds of participants can be desirable, it is not the only way to do good science. Open science practices highlight the importance of reproducibility and adequate statistical power, which can be achieved by using more sensitive methods and analyses, studying phenomena with larger effect sizes, and testing more trials per participant, not just by collecting huge datasets.
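The point about trials per participant can be illustrated with a rough calculation. The sketch below, in Python, uses a normal approximation to the power of a two-sided, two-group comparison of participant means, under the simplifying assumption that each participant's score is the average of several equally noisy trials. The function name `approx_power` and every numeric value are hypothetical choices made for illustration; they do not come from any study discussed here, and a real power analysis would be tailored to the planned design and model.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(diff, sd_between, sd_trial, n_per_group, n_trials, alpha=0.05):
    """Approximate power of a two-sided, two-group z-test on participant means.

    Assumes each participant's score is the mean of n_trials trials, so the
    variance of a score is sd_between**2 + sd_trial**2 / n_trials.
    """
    std_normal = NormalDist()
    sd_score = sqrt(sd_between**2 + sd_trial**2 / n_trials)
    # Standard error of the difference between two independent group means.
    se_diff = sd_score * sqrt(2 / n_per_group)
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = diff / se_diff
    # Probability of landing in either rejection region.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


if __name__ == "__main__":
    # Hypothetical scenario: 20 participants per group, noisy single trials.
    for n_trials in (1, 10, 100):
        power = approx_power(
            diff=0.5, sd_between=1.0, sd_trial=3.0,
            n_per_group=20, n_trials=n_trials,
        )
        print(f"{n_trials:3d} trials per participant: power = {power:.2f}")
```

Under these assumptions, adding trials raises power without recruiting anyone new, although the gain levels off once trial noise is small relative to the variation between participants.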
While Dr. All-in-One was encouraged by Dr. Small-Town's experience, the symposium continued and started to touch on one of her deepest fears: data scooping. Dr. Fortune-Teller began to share his struggles when trying to publish his most peculiar research on bilinguals being better than monolinguals at walking backwards. After hearing Dr. Fortune-Teller's story, Dr. All-in-One recognized how pre-registration (peer reviewed via a Registered Report or not) might have prevented all those publishing struggles. Dr. Fortune-Teller concluded his talk with a discussion of how pre-registering a study decreases the chances of data being scooped. While open science repositories allow embargos that can serve as protective mechanisms, a pre-registration also serves as time-stamped public proof of when the idea, methods, and analysis plan were conceived, and it by no means limits exploratory data analyses.
Future-Dr. Brouillard spoke next on open materials. Dr. All-in-One, being new to coding reproducible analyses, was a little anxious about sharing her materials. She was reassured to hear that Future-Dr. Brouillard had received valuable and constructive feedback from other researchers on her task code, which helped streamline it for future studies and even led to new research questions and collaborations.
The session ended with a commentary by the moderator Dr. Sharemuch, who discussed the benefits of data sharing. One comment really resonated in Dr. All-in-One's mind: when it comes to sharing, more is usually better, but it is also perfectly fine to start small and share only the parts that are critical for reproducibility (i.e., to replicate the analyses presented in a paper). Dr. Sharemuch also mentioned that in bilingualism research, the language characteristics of the population studied should be shared as a raw dataset with an explanation document, or data dictionary. On the other hand, any personal health information or personal identifier should always be kept private.
Dr. All-in-One would still need to figure out the ins and outs for her own research, but after the symposium she was reassured: open science is not impossible; quite the opposite, it is feasible and inclusive. Open science comes in many shapes and forms but always has at its core one common feature: improving science.

Final remarks
Many discoveries and invaluable eureka moments await those who bravely venture into bilingualism research. Bilingualism research has the power to contribute to a better understanding of human development, learning, and behavior. But researchers need to be particularly diligent when dealing with the variability inherent to bilingualism. Open science practices are great tools for doing so. By embracing transparency, avoiding questionable research practices, striving for reasonable statistical power, and being ready to collaborate and increase diversity, we can build a much stronger science of bilingualism. While each of the open science practices we have discussed also has its downsides and particular challenges (admittedly our stories were optimistic in tone, and we leave further discussion of concerns about embracing open science practices to the technical papers referred to in Table 1), we believe that the benefits of open science strongly outweigh the limitations.
Of course, new adventures can bring discomfort, but they also create unique opportunities for innovation and growth. We hope that our stories have resonated with your personal experiences as a researcher and that, by doing so, they have convinced you that the costs of learning open science practices are easily surpassed by the benefits for individual scientists and for the scientific community. This is the time to band together on the adventure towards open science. Will you come?
Conflicts of interest. None.