How researchers can solve the bottle-opener problem with compute capsules
Imagine a group of people playing a sport together on a hot day. Although it’s a friendly match, they play vigorously and at the end of their game they’re hot and thirsty. A few people in the group brought a cooler filled with ice and a variety of bottled beverages (among them our favourite, a tangy non-alcoholic ginger beer). They pass out the bottles and everyone is looking forward to a refreshing cool drink and trying the different flavours. But, disaster! No-one has a bottle-opener! The players are stuck with a delicious drink that is trapped in the bottle. They press the cool glass to their foreheads to get some relief and cool down, and laugh about the misfortune of having the refreshing, cool drink so close, but not being able to drink it.
If you can briefly suspend your disbelief that no-one in this imaginary group would have any kind of tool that could be used as a bottle-opener, consider this scenario as a rough metaphor for scientific research and publication. Playing sport is like doing research, and sharing the drinks is like publishing the research: it is what we do near the end of the project to share the results with everyone so they can benefit from our findings. A bottled drink that you cannot open is like a research publication because a typical publication is like the label on the bottle – a brief description of some interesting results. The delicious drink inside the bottle is like the substance of the research – the data, the code used to analyse the data, and other research materials. These are the details that we often want to use in our own research, and that we really need to see to properly assess the reliability of the claims made in the publication. As researchers, we lack a bottle-opener: a convenient and efficient way to routinely access the substance of the research we read about in publications.
In our recent paper in Advances in Archaeological Practice we investigate this problem of accessing the research materials behind publications in the archaeological literature. We found that archaeologists could really benefit from some more bottle-openers! Our pilot studies show that only 20% of our private requests to authors of journal articles for the data behind their papers resulted in data being shared, and while 53% of sampled journal articles contained openly available data, those data were in a wide variety of formats and locations. This indicates substantial potential for journal editors and funders to improve the availability of research data by requiring authors to deposit the data and materials behind their publications in publicly accessible, trustworthy repositories. We also found a lot of unrealised potential to make archaeological research data easier to reuse by sharing data in plain-text structured formats such as CSV, rather than as PDF files. We identify uncertainty about receiving credit for sharing data as a key obstacle to sharing, and propose a citation standard for data to encourage researchers to allocate credit to data providers by citing their datasets.
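To illustrate why plain-text formats such as CSV are easier to reuse than PDF tables, here is a minimal sketch in Python (the code in our paper is written in R; Python is used here purely for illustration). The column names and values are invented for this example and are not taken from our study. The point is only that a few lines of standard-library code can read a CSV table and start summarising it, with no retyping or copy-pasting from a PDF.

```python
import csv
import io

# Hypothetical CSV content, invented for illustration only.
csv_text = """artefact_id,material,length_mm
A1,chert,34.5
A2,quartz,21.0
A3,chert,40.5
"""

def mean_length_by_material(text):
    """Return the mean length_mm for each material found in the CSV text."""
    totals = {}
    counts = {}
    for row in csv.DictReader(io.StringIO(text)):
        material = row["material"]
        totals[material] = totals.get(material, 0.0) + float(row["length_mm"])
        counts[material] = counts.get(material, 0) + 1
    return {m: totals[m] / counts[m] for m in totals}

means = mean_length_by_material(csv_text)
print(means)
```

With a real dataset, the in-memory string would be replaced by a file downloaded from a repository and opened with `open(path, newline="")`.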
In our paper we are excited to be the first archaeological publication to showcase a special kind of bottle-opener: a Code Ocean compute capsule. This capsule gives the reader immediate and complete access to the data and R code behind the figures and tables in our paper – you can browse the spreadsheets right in the middle of the paper. This means you can easily see every step and decision of our data analysis, a level of detail that is rarely possible to convey within the word limits of a typical article. But the most remarkable feature is that the capsule allows you to run the code and reproduce the graphs and tables in the paper. This means you can see exactly how the code works, and you can even change our code to see how adjusting the parameters affects the results. If you’re curious about an assumption we made in our analysis, you can see the effect of different assumptions by editing our code, running the code with our data, and inspecting the results. And all of this happens in the capsule in your web browser, so you don’t need to download or install any software.
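As a rough sketch of what "changing an assumption and rerunning" can look like, consider the toy example below. It is written in Python rather than the R used in our capsule, and the measurements, the function name `proportion_above`, and both cut-off values are hypothetical, chosen only to show the idea: the same data, analysed under two different thresholds, gives two different results that a reader can compare directly.

```python
# Hypothetical measurements, invented for illustration only.
lengths_mm = [12.0, 18.5, 25.0, 31.0, 44.5]

def proportion_above(values, threshold):
    """Return the share of values strictly greater than threshold."""
    return sum(1 for v in values if v > threshold) / len(values)

# The "published" assumption in this toy example: a 20 mm cut-off.
baseline = proportion_above(lengths_mm, threshold=20.0)

# A reader's alternative assumption: a 30 mm cut-off.
alternative = proportion_above(lengths_mm, threshold=30.0)

print(baseline, alternative)
```

Keeping the assumption as a named argument, rather than burying it in the body of the analysis, is one simple way to make this kind of exploration easy for readers.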
Below you can see how the capsule looks in our paper, with the code and data files listed on the left, immediately accessible to the reader:
Below is an example of inspecting raw data in the capsule, in this case a spreadsheet:
Below we can see the R code embedded in our paper, which any reader can edit to explore, and run the edited code. Edits made by readers do not change the published code, which is part of the version of record and can only be changed by submitting a correction to the journal.
We’re excited to see compute capsules like this in more archaeological publications because of how easily they solve the bottle-opener problem by making the substance of the research – the code and data – directly available to readers. There’s no need to email the author and wait for a reply that may turn out to be a refusal to share anything. We’re also excited by the interactivity and dynamism that compute capsules enable for scholarly articles. For the first time since the earliest academic journals in the 17th century, compute capsules allow us to directly and instantly act on the impulse to wonder ‘what would those results look like if we tweaked this variable?’, to zoom into complex plots, and to drill down into large datasets. We see compute capsules as a vital tool in a fundamental change to improve the efficiency and impact of scientific communication. We encourage archaeologists to enhance the transparency and openness of their research by including compute capsules in their publications.
Ben Marwick and Suzanne E. Pilaar Birch’s paper, ‘A Standard for the Scholarly Citation of Archaeological Data as an Incentive to Data Sharing’, is freely available until 1 September.