
Introduction
Shipwrecks are witnesses to the life history of waterborne craft, of their loading, departure, voyage, sinking, ruin formation and eventual reintroduction to the world. Shipwreck sites act as time capsules, preserving an array of cultural artefacts due to the unique conditions of the underwater environment (Adams 2001). In particular, shipwrecks provide a glimpse into the lives of sailors and help shed light on various aspects of human history, including shipbuilding technologies, trade voyages and the origins of cargo, as well as highlighting the interconnectedness of different geographic regions (Law 1984; Hamilton 1996; Caple 2006). For instance, analysis of a vessel that sank off the Turkish coast at Uluburun (c. 1320 BC) and its cargo helped elucidate exchange across Late Bronze Age Eurasia (Pulak 1988, 1998; Cline 2021; Powell et al. 2022). Similarly, the Brunei shipwreck (c. AD 1500) in the South China Sea provided evidence on the nature of contemporary international trade (Pirazzoli-t’Serstevens 2011).
China, with its rich history of waterborne transportation, boasts numerous historical shipwrecks (Green 1983). Well-known cases include the ‘Nanhai I’ and ‘Huaguangjiao No. 1’ shipwrecks from the Southern Song Dynasty (AD 1127–1279), the ‘Nan’ao I’ shipwreck from the late Ming Dynasty (AD 1573–1620) and the ‘Xiaobaijiao I’ shipwreck from the Qing Dynasty (AD 1636–1912). These shipwrecks have provided valuable information concerning the production of Chinese porcelain, shipbuilding techniques, ocean navigation technology and maritime trade in their respective periods. Previous archaeological research on these shipwrecks primarily focused on the hull, cargo and utensils, or on their preservation and restoration, as well as the wreck environment and biotaxonomy (Hao et al. 2018; Ma et al. 2022).
Shipwreck sites also serve as valuable repositories of biological material, including organisms that were originally onboard (Bass & Van Doorninck 1982) or subsequently colonised the wreck (Mallefet et al. 2008). Most bioarchaeological research on ancient shipwrecks has focused on macro-remains (bones, teeth, seeds, etc.) or microfossils (pollen, starch granules, phytoliths) (Allevato et al. 2016; Pearsall 2018; Briggs 2020). For example, pollen assemblages have been used to identify the geographical origins of shipwrecks (Couillebault et al. 2023) and combined stable isotope and palaeogenomic analyses helped trace the potential source of elephant tusks from a southern African shipwreck (De Flamingh et al. 2021). While some organisms can be identified based on preserved morphological features, many require ancient DNA analysis for precise species identification. Ancient DNA has been successfully sequenced from various materials such as bones, teeth and plants (Poinar et al. 2006; Massilani et al. 2020), as well as sediments (i.e. environmental DNA (eDNA); Willerslev et al. 2003). In fact, a wealth of biological remains not visible to the naked eye are contained in water, soil and sediments (Thomsen & Willerslev 2015; Kjær et al. 2022). Metagenomic analysis of eDNA allows the reconstruction of biological communities of environmental bacteria, archaea and eukaryotes (Y. Wang et al. 2021).
Shipwrecks, being mostly found underwater where temperature, UV radiation and oxygen and light levels are low, provide excellent conditions for the preservation of genetic material and are thus an ideal target for ancient DNA studies. Yet, few studies of shipwrecks include an analysis of sediment eDNA.
The ‘Yangzi Estuary II’ shipwreck (abbreviated here as YEII shipwreck), dating to the Tongzhi period (同治, AD 1862–1875) of the Qing Dynasty, preserves the remains of a junk vessel that sank in the zone of maximum water turbidity in the Yangzi River Estuary. Initial underwater archaeological surveys discovered a large amount of porcelain inside the vessel, including a well-preserved celadon-ground blue and white ‘figural’ amphora found in an undisturbed burial environment inside the vessel. This amphora is the largest porcelain vessel (about 0.60m high) uncovered at the YEII shipwreck so far and was successfully salvaged along with all the sediments inside it. Despite the abundance of historical documents from this period in history, detailed accounts of shipping and trade, particularly concerning the Yangzi River Estuary as a crucial maritime gateway connecting northern and southern China, are relatively scarce. Therefore, this study aims to help reconstruct the life history of the YEII shipwreck, to independently verify the nature and origin of its cargo, and to gain a more comprehensive understanding of the economic and cultural dynamics of the Yangzi River Estuary in the 1860s through the analysis of metagenomic eDNA and sedimentological features of the amphora sediments.
Material and methods
Sampling and sub-sampling
The amphora was salvaged from the YEII shipwreck in the subaqueous North Channel of the Yangzi River mouth in 2021 (Figure 1). After being removed from the water, it was curated at the Shanghai Cultural Heritage Conservation and Research Center under low-temperature conditions. To minimise the risk of contamination, the amphora was kept from contact with external animals and plants. The contained sediment was divided into sterile sampling bags using a sterile spoon, forming 15 ‘layers’ at approximately 40mm intervals. One sample and a corresponding replicate were taken from each layer (total of 30 samples), one for environmental parameter analysis (e.g. lithology, geochemistry and grain size) and the other for DNA analyses.

Figure 1. a) Distribution map of rice geographical sources, the loading port and the sinking site. Grey circles represent the province distribution, blue circles represent possible loading port, red circles represent the sinking site, and red lines represent possible voyage route; b) shipwreck in the dock (figure by authors).
All samples were stored in freezers at −20°C pending subsequent analysis. For environmental parameter analysis, samples were shipped to the State Key Laboratory of Estuarine and Coastal Research, East China Normal University. To avoid contamination, samples for DNA extraction were subsampled under controlled clean-room conditions in a dedicated ancient eDNA laboratory at Fudan University, where researchers wore full-body suits, gloves, shoe covers, masks and hairnets. Sediment from each layer was sub-sampled three times (to produce laboratory replicates, total of 45 samples), and each 1–2g sample was placed into a sterile 15ml spin tube that was immediately closed.
Radiocarbon ages, lithology, geochemistry and grain size
Two samples of plant fragments were obtained from the amphora for radiocarbon dating. These were processed by Beta Analytic (Miami, USA) and the resultant radiocarbon ages were calibrated with Calib 8.2 (Stuiver et al. 2021). Sedimentary particle size analysis was performed per layer using a Beckman Coulter LS13320 laser diffraction particle size analyser. The percentages of clay, silt and sand components and compiled parameters, including the median value (Md), standard deviation (σ) and skewness (SK), were calculated based on grain-size data. Levels of alkaline earth metals, such as strontium (Sr) and barium (Ba), were also analysed in each layer as these are sensitive to salinity changes in the land-sea transitional zone (A. Wang et al. 2021).
DNA extraction, library preparation and sequencing
Ancient eDNA was extracted from the 45 sediment samples following an InhibitEx-based protocol (Y. Wang et al. 2021) with proteinase K (Merck, Germany). Macro-remains were not separated from sediments during DNA extraction. Humic acids and inhibitors were then removed using a OneStep PCR Inhibitor Removal Kit (Zymo Research). In addition, five extraction negative controls (at least one negative control for every nine extracts) were prepared. All DNA extracts and negative controls were converted into double-stranded libraries following the protocol described by Meyer & Kircher (2010). Libraries were amplified with indexing primers in two parallel PCRs (polymerase chain reactions) using Q5 High-Fidelity DNA Polymerase (NEB (New England Biolabs), USA) and indexed products were purified using AMPure XP beads (Beckman Coulter, USA). Following this, all libraries were shotgun sequenced as dual-indexed libraries on the DNBSEQ-T7 platform (PE100) (Zhu et al. 2022). Five extraction negative controls and six library negative controls (one for each batch) were sequenced alongside the 45 samples as controls for contaminants.
Data processing
Taxa identification was performed by matching sequencing reads against a database of reads annotated with taxonomic information. Raw reads were quality controlled to make sure each had an equal chance to be aligned against all entries in the database. AdapterRemoval2 v.2.3.0 (Schubert et al. 2016) was used for trimming, retaining reads of ≥30 base pairs (bp) (--minlength 30). Then, fastq-tools v.0.8.3 (https://github.com/dcjones/fastq-tools) was applied for removing trailing adenine/thymine base pairs and trimming adapters. Contigs were then assembled de novo from the remaining high-quality reads using MEGAHIT v.1.2.9 (Li et al. 2015), with the parameter ‘--min-contig-len 500’ filtering out contigs shorter than 500bp.
Taxa identification
The National Center for Biotechnology Information (NCBI) nucleotide collection (nt, ftp://ftp.ncbi.nlm.nih.gov/blast/db/FASTA/nt.gz) was used to create the reference database for taxonomic classification. The assembled contigs were aligned with the reference database using the Basic Local Alignment Search Tool (BLAST) v.2.13.0 (Camacho et al. 2009). Assembled sequences were annotated using a threshold of ≥97% sequence identity and the lowest E-value, before the three biological replicates from each layer were merged. To ensure the reliability of the annotated taxa and minimise false positives, only those taxa that appeared more than three times in the combined dataset were retained.
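The annotation filter described above can be illustrated with a minimal Python sketch. It assumes the BLAST hits have already been parsed into tuples of (contig, taxon, percent identity, E-value); the function names and the input layout are hypothetical and are not taken from the study's own scripts.

```python
from collections import Counter

def best_hits(hits, min_identity=97.0):
    """Keep, per contig, the hit with the lowest E-value among those
    reaching the identity threshold.
    hits: iterable of (contig_id, taxon, percent_identity, evalue)."""
    best = {}
    for contig, taxon, pident, evalue in hits:
        if pident < min_identity:
            continue  # below the >=97% identity threshold
        if contig not in best or evalue < best[contig][1]:
            best[contig] = (taxon, evalue)
    return {contig: taxon for contig, (taxon, _) in best.items()}

def retained_taxa(assignments, min_occurrences=4):
    """Retain taxa seen more than three times (i.e. at least four times).
    assignments: dict mapping contig_id -> taxon."""
    counts = Counter(assignments.values())
    return {taxon: n for taxon, n in counts.items() if n >= min_occurrences}
```

In this sketch, an occurrence is one annotated contig; the paper does not state the unit it counted, so that choice is an assumption.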
DNA damage estimation
To confirm that the investigated signal derived from the ancient DNA of past organisms and to exclude modern contamination, mapDamage v.2.0.8 was used to calculate the frequencies of 5′ C-to-T (cytosine to thymine) and 3′ G-to-A (guanine to adenine) deamination, and their standard deviations, by comparing sequenced reads with reference genomes. Generating reliable damage models requires taxa with sufficient reads and an available reference genome. Here, plant macro-remains (rice chaff/straw) were suitable for confirming the presence of DNA damage using mapDamage. The database was then constructed based on the Oryza sativa Japonica Group reference genome IRGSP-1.0 (GCF_001433935.1).
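As an illustration of the quantity being estimated, the sketch below computes the raw C-to-T mismatch frequency at each of the first few positions from the 5′ end of aligned reads. It is a simplified stand-in and not mapDamage's own statistical model. It assumes gap-free (reference, read) string pairs oriented 5′ to 3′, and the function name is hypothetical.

```python
def c_to_t_by_position(pairs, n_positions=5):
    """Fraction of reference cytosines read as thymine at each of the
    first n_positions from the 5' end.
    pairs: iterable of (reference, read) strings, aligned without gaps."""
    c_total = [0] * n_positions
    c_to_t = [0] * n_positions
    for ref, read in pairs:
        for i in range(min(n_positions, len(ref), len(read))):
            if ref[i] == "C":
                c_total[i] += 1
                if read[i] == "T":
                    c_to_t[i] += 1
    # None marks positions where no reference cytosine was observed
    return [c_to_t[i] / c_total[i] if c_total[i] else None
            for i in range(n_positions)]
```

In authentic ancient DNA this frequency is expected to be highest at the first position and to decline further into the read; the mirrored G-to-A pattern at the 3′ end can be computed in the same way on reversed strings.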
Alpha and beta diversity
Alpha diversity describes the taxonomic diversity within individual samples, whereas beta diversity describes differences in community composition between samples. To visually demonstrate the differences in alpha diversity within the amphora sediment, samples were divided into two groups based on the results of the environmental parameter analysis. Taxonomic alpha diversity indices, including Shannon’s diversity index, Simpson’s diversity index and the Evenness index, were measured in R v.3.3.2 (R Core Team 2021). Then, an unpaired t-test was used to validate the statistical differences in each index between the two groups. Beta diversity was estimated by computing Bray–Curtis distances between abundance profiles. Bray–Curtis distances per layer were then used in a principal co-ordinates analysis (PCoA) to project the samples, based on genus abundance, into two-dimensional Euclidean space and visualise their clustering.
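For readers unfamiliar with these indices, the following Python sketch shows one common definition of each. The paper does not state which variants were computed in R, so the specific forms used here are assumptions: Shannon with natural logarithms, Simpson as one minus the sum of squared proportions, and Pielou's evenness.

```python
import math

def proportions(counts):
    """Relative abundances of the taxa actually present in a sample."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("sample has no reads")
    return [c / total for c in counts if c > 0]

def shannon(counts):
    """Shannon index H = -sum(p * ln p)."""
    return -sum(p * math.log(p) for p in proportions(counts))

def simpson(counts):
    """Simpson diversity, here taken as 1 - sum(p^2)."""
    return 1.0 - sum(p * p for p in proportions(counts))

def evenness(counts):
    """Pielou's evenness H / ln(S), with S the number of taxa present."""
    s = len(proportions(counts))
    return shannon(counts) / math.log(s) if s > 1 else 0.0

def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    num = sum(abs(x - y) for x, y in zip(a, b))
    den = sum(x + y for x, y in zip(a, b))
    return num / den if den else 0.0
```

A matrix of pairwise `bray_curtis` values between layers is the kind of input that a PCoA takes.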
Taxonomy and phylogeny of plant macro-remains
The genus Oryza (the visible macro-remains) exhibits high genetic variability, encompassing multiple species and subspecies. Visual identification alone may not be sufficient to determine the specific variety or subspecies of rice, while DNA analysis offers precise taxonomic identification. Sequencing of the chloroplast genome from rice chaff/straw in the amphora permits taxonomic and phylogenetic identification, while also providing significant information for further studies. Thirteen chloroplast reference sequences were selected (O. sativa Japonica, O. sativa Indica, O. rufipogon and wild rice variants; see online supplementary material (OSM) Table S4.1). The selected 13 chloroplast whole-genome reference sequences were then aligned using MAFFT v.6 (multiple alignment using fast Fourier transform; Katoh et al. 2002) with default parameters. Consensus sequences were generated in Geneious R10 v.10.2.3 with a 75% majority rule threshold from all clade-consensus chloroplast references. Pre-processed data were then aligned with the constructed continuous reference sequence using the aln algorithm of Burrows–Wheeler Aligner (BWA) v.0.7.5a. Samtools v.0.1.19 was used to filter out unmapped reads (-F 4) and to retain reads with a mapping quality of at least 25 (-q 25). ANGSD (Korneliussen et al. 2014) was used to remove low-quality data (coverage depth <5) from the resulting BAM files. Sequence alignments were then performed using MAFFT with reference sequences. Finally, a phylogenetic tree was constructed in MEGA6 (molecular evolutionary genetics analysis version 6.0; Tamura et al. 2013). Leersia japonica (Japanese cutgrass) (NC_034766.1) was used as the outgroup.
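The idea behind a 75% majority-rule consensus can be sketched in a few lines of Python. This is a simplification written for illustration and is not Geneious's implementation: it assumes pre-aligned sequences of equal length, ignores gaps when counting, and writes ‘N’ wherever no single base reaches the threshold. Geneious may instead report ambiguity codes in such columns.

```python
from collections import Counter

def majority_consensus(aligned, threshold=0.75):
    """Column-wise consensus of aligned sequences ('-' marks a gap).
    A base is called only if it makes up at least `threshold` of the
    non-gap characters in its column; otherwise 'N' is written."""
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    consensus = []
    for column in zip(*aligned):
        bases = [b for b in column if b != "-"]
        if not bases:
            consensus.append("-")  # column consists only of gaps
            continue
        base, n = Counter(bases).most_common(1)[0]
        consensus.append(base if n / len(bases) >= threshold else "N")
    return "".join(consensus)
```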
Biogeography of plant macro-remains
To explore the geographical origin of rice chaff/straw in the amphora, a database containing our data and reference genomes with clear geographical information was constructed. Reference genomes were taken from the International Rice Genebank Collection at the International Rice Research Institute, which had been included in the 3K-RG project (Li et al. 2014). Accession passport data were used from the International Rice Information System (IRIS; http://iris.irri.org/) (McLaren et al. 2005; Gutaker et al. 2020). As the Oryza samples in this study were more than 100 years old, ‘improved’ varieties were removed from the database. Wild and weedy plant remains may exhibit high genetic diversity and gene introgression from multiple sources, making it challenging to discern clear genetic patterns. Additionally, these types do not provide sufficient biogeographical information, and thus ‘wild’ and ‘weedy’ accessions were also excluded. ‘Traditional variety/landrace’ accessions and ‘breeding/inbred line’ accessions were kept if these were pure lines directly derived from ‘traditional varieties/landraces’ or classic breeding lines from before the Green Revolution in the 1960s (Gutaker et al. 2020). The geographical origins of the reference genomes of Oryza from China used in this study therefore included Gansu, Hunan, Guangxi, Guangdong, Jiangxi and Shanghai (Table S5.1). Analysis of single nucleotide polymorphisms (SNPs) was conducted using a core SNP set. These SNPs provide comprehensive genome-wide coverage and have a minor allele frequency (MAF) above 5% to ensure they are both informative and common.
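The MAF criterion mentioned above can be made concrete with a small Python sketch. It assumes haploid-style allele calls coded 0 (reference) or 1 (alternative), with None for missing data; this data layout and the function names are hypothetical and are used only to show how a 5% cut-off would be applied.

```python
def minor_allele_frequency(alleles):
    """MAF of one biallelic SNP.
    alleles: list of 0/1 calls, with None for missing genotypes."""
    called = [a for a in alleles if a is not None]
    if not called:
        return 0.0
    alt = sum(called)
    minor = min(alt, len(called) - alt)  # count of the rarer allele
    return minor / len(called)

def filter_by_maf(snps, threshold=0.05):
    """Return the IDs of SNPs whose MAF is strictly above the threshold.
    snps: dict mapping snp_id -> list of allele calls."""
    return [snp_id for snp_id, alleles in snps.items()
            if minor_allele_frequency(alleles) > threshold]
```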
ZS97RS3 from the Oryza sativa Indica Group genome assembly (GCA_001623345.3) was selected for genomic analysis. Overlapping paired-end reads were collapsed, merged across all samples and then aligned with the reference genome using bwa aln. Subsequently, sorting and indexing were performed using Samtools and duplicates were removed using the ‘dedup’ tool. Reads with quality scores below 30 and bases with base quality below 30 were removed using Samtools mpileup. PileupCaller v.1.4.0.2 (https://github.com/stschiff/sequenceTools.git) was used to randomly call alleles for each SNP. Finally, selected Chinese reference SNPs of Oryza were merged. Based on the merged data, smartpca (lsqproject: YES) was applied for principal component analysis (PCA). Then, based on pairwise FST (a fixation index used as a measure of genetic differentiation), the genetic distance within O. sativa Indica in China was constructed as a neighbour-joining tree using the R package ‘ape’.
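Random allele calling of this kind, often called pseudo-haploid calling, is commonly used for low-coverage ancient DNA because too few reads cover each site to call diploid genotypes reliably. A minimal Python sketch of the idea follows. It is not PileupCaller's code; the function name and the input layout, a list of (base, base quality) observations per SNP, are assumptions.

```python
import random

def call_random_allele(pileup, min_base_quality=30, rng=None):
    """Pick one observed base at random from those passing the quality
    filter, or return None if no base qualifies.
    pileup: list of (base, base_quality) tuples observed at one SNP."""
    rng = rng or random.Random()
    passing = [base for base, quality in pileup
               if quality >= min_base_quality]
    return rng.choice(passing) if passing else None
```

Passing a seeded `random.Random` instance makes the calls reproducible between runs.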
Results
Evidence for two sedimentary environments within the amphora
Sediments from the amphora were divided into 15 sampling depths (indicated by the purple lines in Figure 2a, see also Table S1.1). Two groups of cups were placed horizontally and vertically within the amphora between depths of 0.05 and 0.30m (five cups per group), while four groups of cups were positioned vertically between depths of 0.35 and 0.60m (10 cups per group). Two types of macro-remains were found: shells/shell fragments (located above 0.30m) and rice chaff/straw (found between some cups in the 0.35–0.60m-depth range within the body of the amphora), as shown in Figure S1. Morphological traits indicate that the well-preserved shells were mostly derived from brackish water environments. Two rice chaff/straw samples from different cups at a depth of 0.30–0.40m were radiocarbon dated at 180 calibrated years before present (Figures 2b & S2). Sediment can be divided into eight units based on vertical change in lithology (Figure 2b). Grain size changes significantly around the 0.30m depth boundary, with a larger median grain size below 0.30m compared with above (p<0.05, Figure 2c). Additionally, the Sr and Ba contents and the Sr/Ba ratio are significantly higher above 0.30m (p<0.05), indicating different sedimentary environments for the layers above and below 0.30m: brackish and terrestrial, respectively.

Figure 2. a) Outer appearance and display of sampling points inside the amphora; b) sedimentary log for lithology, geochemistry and grain size; c) violin plots of median grain size and Sr/Ba ratio between the dataset above and below 0.30m (figure by authors).
Sedimentary differences corresponded to disturbed and original layers
After sequencing and quality-control filtering, reads from the negative controls revealed low quality scores and failed to assemble for further analysis, indicating no contamination. Excluding these, a total of 7 492 024 894 raw reads were obtained from the 45 sediment samples, resulting in 3 498 014 728 collapsed reads after filtration (Tables S1.2 & S1.3). The read length distribution was concentrated between 40 and 80bp, both for all species (Figure 3a) and for reads mapping only to rice chaff/straw (Figure 3b), further supporting the authenticity of the ancient DNA. Additionally, C-to-T deamination rates at the 5′ end of reads were typical for ancient DNA. Although these features were less pronounced than in older samples, owing to the relatively recent age of these specimens, an excess of C-to-T and G-to-A mismatches was observed in the rice chaff/straw data (Figures 3c, S3 & S4).

Figure 3. a) Read length distribution on all taxa; b) distribution of DNA damage for Oryza using sequence length; c) DNA damage patterns for Oryza (figure by authors).
Use of the NCBI-nt databases with BLAST revealed that Bacteria are the most abundant genetic component (62.5%), followed by Eukaryotes (27.4%) (Figure 4a). The proportions of individual genera differed above and below the 0.30m mark, with Paraburkholderia (a genus of gram-negative rod-shaped bacteria) dominant above 0.30m, while Oryza was the dominant taxon below 0.30m (Figure 4b). Additionally, the obligate anaerobe Methanothrix (belonging to the Archaea domain) was mostly detected at depths below 0.30m. When considering only Eukaryotes, taxa cluster into two major branches defined by layer depth (see Figure 4c for groupings by phylum). Among these, Streptophyta, Chordata and Arthropoda are the three most frequent taxa, with the highest percentage of reads in both branches. Hierarchical clustering at the phylum level further validates this stratification, revealing clear separation between the two branches (Figure 4d). Invertebrates may be genetically distinguished between brackish and terrestrial origins, as invertebrates of the class Insecta primarily inhabit terrestrial environments. The layers below 0.30m contain more terrestrial invertebrate DNA reads, with Insecta occupying the highest proportion at the class level (Figure 4e). Marine invertebrates such as Demospongiae, Echinoidea and Gastropoda were only detected above 0.30m. Altogether, this indicates that sediments below 0.30m mainly consist of material from terrestrial origins (referred to as the ‘original layer’), while above 0.30m sediments contain DNA from both brackish and terrestrial sources (referred to as the ‘disturbed layer’).

Figure 4. a) Relative proportion of Archaea, Bacteria and Eukaryotes at the phylum level using shotgun profiling data; b) column stacking diagram of relative abundance at the genus level; c) heatmap of phylum level ranked according to their percentage reads; d) hierarchical clustering of phylum level; e) bubble chart showing the top 20 number of reads by class level of invertebrates (figure by authors).
Shannon’s index and the Evenness index show that the disturbed layer exhibits higher diversity and species richness than the original layer (p<0.05; Figure 5a & b). However, no significant difference in Simpson’s index is observed between the two layers (p=0.11; Figure 5c). PCoA of beta diversity using Bray–Curtis distances reveals distinct clustering corresponding to the disturbed and original layers (Figure 5d).

Figure 5. a–c) Box plots of Shannon, Evenness and Simpson indices; the latter two of which were calculated based on the genus-level abundance of the disturbed and original layer, respectively. P-values were calculated using the Wilcoxon rank sum test; d) PCoA plot of samples from the disturbed (green) and original (purple) layers based on Bray–Curtis distance calculated from the genus level (figure by authors).
The original layer contains abundant eukaryotic information
Rice chaff/straw, mainly concentrated in the original layer, are the most abundant plant macro-remains and provide the highest number of reads. Genome coverage from the sediment environment was good, with an average depth of 3.51× (Table S2.1). Reconstruction of the phylogenetic tree based on the chloroplast genome shows that the rice chaff/straw in the amphora clusters with O. sativa Indica (Figure S5).
To investigate the geographical origin of this strain of O. sativa Indica, a genome-wide panel was constructed using 47 indica and nine japonica samples selected from the Rice 3K-RG database after screening for high-quality standards (high sequencing depth, low missing data rates and minimal contamination). Around 285 350 SNPs were obtained from the merged dataset. The overall population structure of O. sativa was evaluated through a PCA, which separated indica and japonica based on these SNPs (Figure 6a, Table S3.1). The data from the YEII shipwreck cluster with O. sativa Indica, consistent with the result obtained from the chloroplast genome analysis. The sample strongly resembles data from Jiangxi Province, along the middle Yangzi in south China (Figure 1a). A neighbour-joining tree of O. sativa Indica based on pairwise FST also reveals that the rice chaff/straw from the YEII shipwreck has the closest genetic distance to Jiangxi Province (Figure 6b, Table S3.2). Furthermore, our analysis also detected the presence of various organisms, such as Diptera (two-winged flying insects, e.g. Anopheles), Lepidoptera (moths, butterflies and skippers, e.g. Heliconius), Hemiptera (bugs, e.g. Pentatomidae) and Poales (flowering plants, e.g. Phyllostachys) in the original layer, even in the absence of visible remains (Figure S6).

Figure 6. a) PCA of SNPs in DNA sequences from Oryza sativa Japonica and O. sativa Indica grown in different locations in China, coloured by location; b) neighbour-joining tree using genetic distance within Chinese samples of O. sativa Indica based on pairwise FST (figure by authors).
Discussion
Sinking date and last sailing season
In addition to the large amphora, the YEII shipwreck contained porcelain inscribed with the characters 同治 (‘Tongzhi’ reign era, late Qing) on the base, providing evidence that the vessel sank during the period AD 1862–1875. Radiocarbon dates obtained from plant macro-remains from the porcelain amphora are consistent with this period, confirming the contemporaneity of the rice chaff/straw and amphora. Analyses of sedimentary records from the site of the shipwreck have led to speculation that the ship sank during a typhoon (Niu et al. 2021). Local chronicles in Shanghai (Continuation of the Shanghai County Gazetteer 上海县志续编; Continuation of the Songjiang Prefecture Gazetteer 松江州志续编) record numerous instances of storm surges and seawater inundation, with typhoons typically occurring in summer and early autumn. In Jiangxi and its surrounding areas, rice was often double-cropped. Historical records from the Qing Dynasty, specifically He Dejun’s (1855–1938) An investigation of agricultural products in Fu Commandery (抚郡农产考略), state that O. sativa Indica was harvested in the summer, while O. sativa Japonica was harvested in the autumn. As the rice chaff/straw in the amphora is identified here as Jiangxi O. sativa Indica, this suggests that the cargo was packaged during the summer. The presence of certain insects, such as Lepidoptera (Heliconius, etc.) and Diptera (mosquitoes, etc.), in the original layer within the amphora also suggests that the final voyage occurred during summer or early autumn, based on peak mosquito densities in July (Guo et al. 2018). This sailing season aligns with the possibility that a typhoon resulted in the vessel’s sinking.
Sedimentary mechanisms for the amphora deposits
Ba and Sr concentrations in the amphora are consistent with those found in the wider shipwreck and the surrounding sedimentary environment. This could be attributed to the shipwreck’s location in the subaqueous distributary channel between major shoals of the Yangzi River Estuary (Niu et al. 2021). The area belongs to a delta front platform setting, characterised by strong hydrodynamic processes such as storm waves (Goodbred & Saito 2011). Turbulent currents probably infilled and stirred the contents of the amphora, forming the coarse-grained sediments with the rice chaff/straw in the lower, original layer. The presence of the stacked cups within the amphora probably prevented much of the chaff from floating out. Later, suspended sediments settled in the amphora under weaker currents, and the amphora became a habitat for aquatic organisms. This explains the higher Sr/Ba ratios and DNA enrichment of marine invertebrates in the upper, disturbed layer. Hydrobiotic species, such as those belonging to the Gastropoda, Echinoidea and Demospongiae classes, were only found in the disturbed layer. Arachnida, Collembola and other species that inhabit terrestrial environments (Dunlop & Selden 1998; Selden et al. 1998; Dunlop & Webster 1999; Coddington et al. 2004) were only detected in the original layer.
Replicating the packaging process
The original layer of the amphora contains a substantial amount of plant macro-remains, specifically rice chaff/straw. This material was likely used to protect the cups inside the amphora from collisions associated with packaging and transport, as mentioned in the Records of Jingdezhen ceramics (景德镇陶录). Methanothrix, commonly found in rice paddies, was also identified in the original layer, indicating a low-oxygen or anoxic environment. This environment likely promoted better preservation of chaff morphology and DNA (Cadet & Wagner 2013). Although no physical remains were found, DNA from Phyllostachys (a genus of Asian bamboo) was detected in the original layer. A packaging technique that uses bamboo strips to wrap around the outside has been observed in various shipwreck studies (Kim & Moon 2011).
Ancient texts, such as Ode to ceramics (陶说) and Records of Jingdezhen ceramics, indicate that the packaging process for exporting porcelain from folk kilns in the Qing Dynasty involved placing rice chaff between small cups to prevent collisions, stacking the cups in piles of five or 10, and packing them together with rice straw and bamboo strips. Assembled cups were then placed inside an amphora, with any gaps filled with rice chaff. This packaging method reduced costs by using rice chaff/straw, utilised the toughness and versatility of bamboo strips in securing the cargo and helped economise on space during shipping (Jiang 1959; Carlson 2004).
The ship’s final loading port
Both rice chaff/straw and bamboo were typically sourced locally (Records of Jingdezhen ceramics). Porcelain typology and x-ray fluorescence composition analysis indicate that the porcelains associated with the YEII shipwreck originated in Jingdezhen, Jiangxi Province (pers. comm.). Our analyses indicate a close genetic relationship between the rice chaff/straw used for packaging and the native O. sativa Indica from Jiangxi, confirming Jiangxi was not only the place of production for the porcelain but also the probable loading port for the cargo.
Conclusion
Metagenomic analysis of sedimentary eDNA from a well-preserved amphora enhances our understanding of the life history of the YEII shipwreck. Elucidation of two distinct layers of sediment, differentiated by the proportional representation of genera and by sediment grain size and corresponding to terrestrial and brackish environments, helps create a narrative of before, during and after the fateful voyage. Before sinking, the junk vessel was loaded with a cargo of porcelain in Jiangxi. Stacked cups, packed with rice chaff and wrapped with bamboo, were placed inside amphorae and packed with more rice chaff/straw. Sailing in summer or early autumn, evidenced by the presence of summer-harvested Oryza sativa Indica and certain insects, the ship likely encountered a typhoon and sank in the river estuary, where strong currents buffeted its cargo. Following submersion, the wreck became a habitat for estuary-dwelling organisms, including marine bacteria. These findings highlight the power of ancient environmental metagenomic analyses for enhancing our understanding not only of the biological components of past environments, but of associated economic and cultural dynamics where these environments were, at least partly, created by human actions. The wider application of sedimentary metagenomic analyses at shipwreck sites around the world can help us build richer narratives of the past.
Funding statement
This work was sponsored by the National Natural Science Foundation of China (32371692) to XM, the Lantai Youth Scholar Program (2022LTQN602) to SW and the Natural Science Foundation of Shanghai (23ZR1458500) to LZ.
Online supplementary material (OSM)
To view supplementary material for this article, please visit https://doi.org/10.15184/aqy.2025.10271 and select the supplementary materials tab.
Author contributions: CRediT Taxonomy
Xiaolin Ma: Conceptualization-Equal, Data curation-Equal, Writing-Equal. Zhihang Ma: Formal analysis-Equal, Methodology-Equal, Software-Equal, Visualization-Equal. Panxin Du: Methodology-Equal. Haixia Wen: Investigation-Equal. Nan Hu: Formal analysis-Equal. Yiran Xu: Data curation-Equal. Edward Allen: Visualization-Equal. Luo Zhao: Resources-Equal. Yan Ge: Resources-Equal. Xin Wei: Investigation-Equal. Zhanghua Wang: Conceptualization-Equal. Yang Zhai: Conceptualization-Equal, Resources-Equal. Shaoqing Wen: Conceptualization-Equal, Writing-Equal.






