Holocene selection for variants associated with cognitive ability: Comparing ancient and modern genomes

Human populations living in Eurasia during the Holocene experienced significant evolutionary change. It has been predicted that the transition of Holocene populations into agrarianism and urbanization brought about culture-gene co-evolution that favoured via directional selection genetic variants associated with higher general cognitive ability (GCA). Population expansion and replacement has also been proposed as an important source of GCA gene-frequency change during this time period. To examine whether GCA might have risen during the Holocene, we compare a sample of 99 ancient Eurasian genomes (ranging from 4,557 to 1,208 years of age) with a sample of 503 modern European genomes, using three different cognitive polygenic scores. Significant differences favouring the modern genomes were found for all three polygenic scores (Odds Ratio=0.92, p=0.037; 0.81, p=0.001 and 0.81, p=0.02). Furthermore, a significant increase in positive allele count over 3,249 years was found using a sample of 66 ancient genomes (r=0.217, pone-tailed=0.04). These observations are consistent with the expectation that GCA rose during the Holocene.


Introduction
The Holocene (11,700 ybp to the present) was a time of significant evolutionary change, especially among the populations of Europe and Asia. Based on the degree to which new haplotypes were arising and diverging, it has been estimated that the rate of adaptive evolution among these populations may have been 100 times greater than during the preceding Pleistocene (1). Novel adaptations that are known to have arisen and spread during this period include lactase persistence (2) and alterations in haemoglobin permitting enhanced tolerance to diminished oxygen levels among populations living at altitude (3).
Other significant adaptations that may have been favoured by selection during this period are the general learning and problem solving mechanisms that give rise to General Cognitive Ability (GCA). At the level of individual differences in cognitive performance, GCA is associated with the positive manifold that exists among performance variances on different measures of cognitive abilities. In addition to GCA, cognitive ability measures also tap specialized sources of ability variance, which are associated with performance in specific domains (e.g. verbal, spatial and perceptual abilities) (4). GCA is highly heritable with behaviour genetic twin studies indicating that between 22% and 88% of the variance in GCA, depending on age, may be due to the action of additive genetic effects (5). Consistent with this, studies employing genome-wide complex trait analysis (GCTA) GREML have found that a significant amount of the variance in GCA can be directly attributed to large numbers of variants with individually small, but collectively large (additive) effects on the phenotype (6). 4 GCA is not just present in human populations, but is present (at lower levels) also in a variety of animal taxa, including chimpanzees and other primates, racoons, mice, rats and ravens (7). The existence of these species differences furthermore indicates that GCA has been under directional selection, particularly in the primate clade. This seems to be especially true in the case of abilities associated with tool-use, which in primates are among the most strongly associated with GCA, in addition to being the most heritable and are also associated with the strongest signals of recent directional selection (8,9). This is consistent with the theoretical expectation (10) that GCA is an adaptation to coping with evolutionarily novel challenges -these being challenges that occur infrequently throughout the phylogeny of a lineage, thus cannot be solved with recourse to specialized cognitive systems (such as those associated with cheater detection or language acquisition). Instead such problems require generalized and open-ended problem solving systems, such as learning and working memory, in order to tailor solutions to them. The ability to innovate a solution to such a problem (via the development of a tool) is a key manifestation of the action of these generalized problem-solving systems (10).
The Holocene is believed to have been an environment of evolutionary relevance for GCA as it was characterized by major transitions among human populations, especially in Eurasia, away from a hunter-gatherer mode of subsistence, towards a sedentary agriculture-based one and beyond that to urbanization (1,11). This bought with it many evolutionarily novel problems, such as having to cope with increased population densities and new forms of warfare (1,11). Innovations that played a major role in facilitating this transition would have included the domestication of 5 cultivars and animals, the development of novel tools for raising the productivity of land (such as the plough), and the development of novel weapons of war (11).
Cultural innovations such as monotheism, monarchy, aristocracy, feudalism and currency-based economics arose in response to the need for coping with the hierarchical power distribution characteristic of large, static populations (11). Those populations that were successful in using innovations to solve novel problems would furthermore have had an advantage in warfare, allowing them to replace less successful populations. Population growth would have increased the chances of rare GCA enhancing mutations arising, which would have favoured the aggregate fitness of those populations permitting them to expand to the greatest extent (1,11). Thus positive culture-gene co-evolutionary feedback might have rapidly increased GCA (11).
As was mentioned previously, GCTA GREML has demonstrated that GCA arises from the action of many variants each with small, but additively large effects on the phenotype (6). High-resolution genome-wide association studies (GWAS) have identified a number single nucleotide polymorphisms (SNPs), primarily associated with the development of the central nervous system, which predict variance in both GCA and educational attainment -with which GCA shares approximately 60% of its linkage pruned genetic variance (12,13,14,15,16). These SNPs can be concatenated into cognitive polygenic scores (henceforth POLY COG ), which are normally distributed and can be used as a molecular index of GCA and educational attainment in standard regression-type analyses. In the present analysis, a sample of 99 ancient Holocene genomes sourced from Europe and West Asia, aged between 4,557 and 1,208 years (mean age=3,443 years, SD=623.52), with reasonable (approximately 30%) coverage depth (17) will be used to test the hypothesis of gene frequency change, favouring higher POLY COG -and therefore higher GCA. This will be achieved by comparing the aggregate POLY COG levels of these ancient genomes with a genetically matched modern population, specifically the 503 individuals comprising the 1000 Genomes Phase3 EUR sample (for full details on these samples see the Materials and Methods section). Such an analysis builds on previous comparative analyses utilizing these same samples for tracking the changes in single SNPs associated with specific phenotypic traits (such as skin reflectance, eye colour and lactase persistence) (17).

Results
The data were categorical (positive vs. negative allele/ancient vs. modern population) hence a 2x2 contingency table was created for each of the three POLY COG (for details of how each polygenic score was computed, see the Materials and Methods section) assigning allele counts by GWAS effect status (positive or negative effect) to ancient and modern genomes. Odds ratios (OR) of the difference in POLY COG between the ancient and modern genomes were computed and are reported in Table 1. As Fisher's Exact Tests and χ 2 Tests showed nearly identical results, only the former are reported.
In addition the G-Test (log-likelihood ratio) is utilized as an alternative test of the significance of the difference. As a further robustness test, the difference in POLY COG between the two sets of genomes was examined using the unconditional Barnard's and Boschloo's Exact Tests. These statistics are extremely computationally intensive when large numbers of pair-wise comparisons are employed. Therefore they could only be used for the two smaller POLY COG (9 and 11 SNP). The results were identical to those obtained via the conditional techniques. A forest plot of the OR values for the three POLY COG scores is presented in Figure 1. These results indicate that modern European genomes have higher POLY COG relative to ancient ones (sourced from Europe and West Asia).
A second analysis was conducted to investigate the association between POLY COG and sample year within the subset of 66 ancient genomes for which age estimates were available (17). If selection is operating on these variants throughout the 3,249 years covered by this sample then less ancient genomes should have higher POLY COG relative to more ancient ones. The correlation of positive allele frequency with the sample year (scaled using the BCE/CE calendar eras) was in the expected positive direction (samples from more recent years had higher positive allele counts) the result is significant when a one-tailed test is used (r=0.217, 95% CI=-0.026 to 0.043, p one-9 tailed =0.04), which is theoretically justified on the basis that the direction of the effect was anticipated (18). The scatter plot is presented in Figure 2.  CI.

Discussion
Consistent with theoretical expectations, the Holocene appears to have been an environment of evolutionary relevance for GCA, as modern European genomes have higher POLY COG , relative to those sourced (predominantly) from Bronze Age Europe and West Asia. A second analysis revealed that within a subset of the ancient genome sample, there was a (one-tailed) significant trend towards increased positive allele counts in the less ancient samples.

0
This increase in POLY COG can be theoretically accounted for in four different ways: i) the appearance of new mutations, ii) genetic drift involving already existing alleles, iii) selection pressure on extant alleles, and iv) population expansion, replacement and admixture due to migration.
The first two theories can be discounted, as the POLY COG scores utilized here are comprised of SNPs that are present (either directly or in strong linkage) in both the ancient and modern populations. Therefore these particular SNPs did not arise de novo in this period. Genetic drift based models would make the assumption that the variants comprising POLY COG are selectively neutral, and that their distributions should therefore correspond to the action of genetic diffusion and population bottlenecks. It is unclear however why this would lead to modern genomes being higher in POLY COG relative to ancient ones, and is furthermore at odds with the observation from studies utilizing various POLY COG that these do not appear to be selectively neutral in contemporary human populations (19,20,21,22,23,24). This leaves selection pressure acting on already existing alleles and population expansion, replacement and admixture.
The idea that selection takes many millennia to produce noticeable changes in a population is at odds with data indicating that within human populations, changes in the means of traits such as height and age at menarche can occur due to selection in as little as five decades (25). Selection operating over the course of millennia (as in the present case) would be expected to produce quite considerable change. As was discussed in the Introduction, Holocene populations, especially those in Eurasia, appear to have undergone accelerated adaptive evolution relative to those living in the 1 1 Pleistocene ( It has been noted that concomitant in time with the apparent increase in POLY COG observed presently is a parallel selection trend favouring smaller brains (33). These trends may at first appear to be contradictory, as GMA is positively associated with brain volume (34). It must be noted however that meta-analyses reveal that the association is weak (ρ=0. 26), indicating that the majority of the variance in brain volume is unrelated to variance in GMA (34). Furthermore studies utilizing POLY COG have found that it does not predict variation in brain volume, despite both POLY COG and brain volume making independent contributions to GMA (35). These findings suggest that the decline in brain volume during the Holocene may have been a consequence of enhanced brain efficiency stemming from increased corticalization and neuronal connectivity, with more bioenergetically optimized brains simply requiring less mass to achieve greater processing power. It may therefore be this process that the increase in POLY COG is tracking in our samples. A prediction stemming from (33) is that variants that are uniquely predictive of brain volume should show the opposite trend in time to POLY COG when examined in the context of the present samples.
A limitation on the present work that needs to be discussed is linkage disequilibrium Since most GWAS hits are actually tag SNPs, decay in LD implies that the causal SNPs will be further removed from the tag SNPs. However, the tag SNPs will 1 4 resemble a sample of random SNPs. Since the frequency of the average SNP allele is 50%, the GWAS hits will tend to converge towards an average frequency of 50%, with increasing LD decay. The implication of this for our analysis is that our estimates of the difference in POLY COG are therefore conservative, when these have a frequency lower than 50%, whilst the opposite is true when they have higher The EUR sample, which is the most closely related to the ancient genomes (17)

Read realignment and base recalibration.
After removing duplicate reads the reads were re-aligned around the known indels from the 1000-genome sample using the GenomeAnalysisTKLite-2.3-9 toolkit. The known indels set was obtained from the GATK resource page (42). After performing realignment the base re-calibration step is performed, as is recommended (43,44). After recalibration, the quality score of 1 7 each base becomes more accurate. Known variant position is taken into account to recalibrate the quality score.
Variant calling and computing POLY COG . After performing re-alignment the GenomeAnalysisTKLite-2.3-9 toolkit UnifiedGenotyper was used to identify the single nucleotide variants (SNPs) and short indels. The positive allele counts and allele frequencies of the target SNPs were calculated from the Ancient genome VCF files and also from the 1000Genome VCF files using a custom made PERL script (available on request). Variant calling was used to construct a POLY COG score from a list of nucleotides that were found to be genome wide significant 'hits' for educational attainment in a recent large-scale meta-analytic GWAS study, which predicted 3.6% of the variance in GCA (14). After excluding those variants that were completely absent from the ancient genomes (likely due to coverage artefacts) the resultant POLY COG score consisted of 130 'hits' that were present to some degree in both the ancient and modern genomes. A second POLY COG score was constructed using data from a GCTA GREML study of GCA and related phenotypes (6). In this study 1,115 SNPs reached GWAS significance, of which 15 were independent signals. Four SNPs were absent from the ancient genomes thus a POLY COG was calculated using 11 SNPs. A third POLY COG score was constructed utilizing a technique developed in (18,19), employing only the nine GWAS hits from (14) that were in close linkage disequilibrium (linkage cut-off r>0.8, 500kb linkage window) with 'hits' predicting educational attainment across two other large GWAS studies (6,12). Linkage was determined using the NIH LDLink program (45) with the 1000 Genomes Phase3 CEU population as a reference group. These are presented in Table 2. By focussing on only the smaller numbers of linked replicates, the third POLY COG score was developed as a 1 8 means of negating the likely high numbers of false positive SNPs present in larger POLY COG scores. Our approach is based on maximizing the signal to noise ratio of directional selection by focusing on the GWAS hits with the highest significance (or replication rate across publications), as opposed to maximizing the amount of variance explained, which instead is the aim of the classical, within population GWAS design. Table 2.
Nine linked replicated SNPs with linkage disequilibrium (D') and R 2 coefficients.