Genetic architecture of schizophrenia: a review of major advancements

Abstract
Schizophrenia is a severe psychiatric disorder with high heritability. Consortia efforts and technological advancements have led to a substantial increase in knowledge of the genetic architecture of schizophrenia over the past decade. In this article, we provide an overview of the current understanding of the genetics of schizophrenia, outline remaining challenges, and summarise future directions of research. World-wide collaborations have resulted in genome-wide association studies (GWAS) in over 56 000 schizophrenia cases and 78 000 controls, which identified 176 distinct genetic loci. The latest GWAS from the Psychiatric Genomics Consortium, available as a pre-print, indicates that 270 distinct common genetic loci have now been associated with schizophrenia. Polygenic risk scores can currently explain around 7.7% of the variance in schizophrenia case-control status. Rare variant studies have implicated eight rare copy-number variants, and an increased burden of loss-of-function variants in SETD1A, as increasing the risk of schizophrenia. The latest exome sequencing study, available as a pre-print, implicates a burden of rare coding variants in a further nine genes. Gene-set analyses have demonstrated significant enrichment of both common and rare genetic variants associated with schizophrenia in synaptic pathways. To address current challenges, future genetic studies of schizophrenia need increased sample sizes from more diverse populations. Continued expansion of international collaboration will likely identify new genetic regions, improve fine-mapping to identify causal variants, and increase our understanding of the biology and mechanisms of schizophrenia.


Introduction
Schizophrenia is a severe and often chronic psychiatric disorder causing substantial personal and societal burden from severe and long-term disability. Schizophrenia is characterised by positive (e.g. hallucinations, delusions), negative (e.g. alogia, avolition, anhedonia), and disorganized symptoms (e.g. speech, behaviour) (Dollfus & Lyne, 2017), and is also associated with cognitive impairment. The typical onset of schizophrenia occurs in late adolescence to early adulthood, with a peak in prevalence around 40 years of age (Charlson et al., 2018). Individuals with schizophrenia have a 15-20 years shorter life expectancy in comparison to the general population (Tanskanen, Tiihonen, & Taipale, 2018).
Schizophrenia is highly heritable, and our understanding of its genetic architecture has greatly increased over the past decade. This progress has arisen primarily through advances in molecular genetics technology, which have made large-scale genotyping and sequencing feasible and affordable, and through world-wide collaboration amalgamating research samples, exemplified by the Psychiatric Genomics Consortium (PGC). This collaborative effort has led to the identification of hundreds of common risk variants, rare variants and copy number variants (CNVs), providing key insights into the biological basis of schizophrenia (Fig. 1).
triggered by high population growth in those areas (Charlson et al., 2018). The incidence of schizophrenia has been reported to be 15.2/100 000 persons, with a median male:female rate ratio of 1.4 (McGrath, Saha, Chant, & Welham, 2008).

Morbidity and mortality
Although a low-prevalence disorder, schizophrenia plays a major role in the global burden of disease, contributing approximately 13 million years of life lived with disability (Charlson et al., 2018). Approximately 1 in 7 schizophrenia cases recover, based on clinical and social functioning indicators (Jaaskelainen et al., 2013). Poor outcomes are common and include premature mortality, long-term hospitalisation, treatment resistance, and poor quality of life. The risk of suicide in those with schizophrenia is very high, with estimates of ∼1 in 3 people with schizophrenia attempting suicide during their lifetime (Pompili et al., 2007). Individuals with schizophrenia are also subject to social disability (i.e. in relationships with others and in self-care); a study with over 15 years of follow-up found that 25% have a severe social disability (Wiersma et al., 2000).

Environmental factors
Environmental factors act throughout life to influence the likelihood of disease. They are not necessarily causative, but can be broadly grouped by timing into three categories: early development, proximal, and onset factors (Stilo & Murray, 2019). Early development factors include obstetric complications (Cannon, Jones, & Murray, 2002) and advanced paternal age (Sipos et al., 2004). Proximal factors include social adversity, migration, and urbanicity. Meta-analyses have demonstrated migration as an important risk factor for schizophrenia in first- and second-generation migrants (Bourque, Van Der Ven, & Malla, 2011; Cantor-Graae & Selten, 2005), with a greater risk in migrants from developing countries (Cantor-Graae & Selten, 2005), pointing towards a role for psychosocial adversity in schizophrenia aetiology. Lastly, factors occurring around the time of schizophrenia onset primarily include drug abuse, trauma, and social adversity (Stilo & Murray, 2019).

Heritability
Heritability is a measure of the proportion of variation of a trait that is attributable to genetic inheritance (Young et al., 2018). Heritability estimates of schizophrenia are based on family studies (e.g. familial aggregation and twin studies) and vary based on the study methodology. While estimates have ranged from 41 to 87% (Chou et al., 2017), the current estimate of heritability of schizophrenia is approximately 80% (Owen, Sawa, & Mortensen, 2016).
Fig. 1. Genetic architecture of schizophrenia. The risk allele frequency of SNPs and CNVs identified in Psychiatric Genomics Consortium (PGC) schizophrenia datasets are shown on the x-axis. The effect sizes of risk allele are shown on the y-axis. Many common alleles with small effect sizes have been identified (shown by diamonds), consistent with the common disease common variant hypothesis. Using genome-wide approaches, some rare copy number variants have also been identified (shown by squares), consistent with common disease rare variant hypothesis. Only a few less-common variants (0.01 < MAF ⩽ 0.05) have been identified due to large sample sizes. The effect size of identified risk allele is approximately inversely proportional to allele frequency.

Common genetic variation
Genome-wide association studies (GWAS) investigate millions of common genetic variants (single nucleotide polymorphisms, SNPs) simultaneously to determine their association with a trait. Many of the early GWAS of schizophrenia were underpowered to identify common SNPs, which have typically been shown to have small effects in schizophrenia. Including the largest GWAS to date from the PGC, which is available as a pre-print, over 300 independent genome-wide significant variants (p ⩽ 5 × 10⁻⁸) have been associated with schizophrenia. Figure 2 visualizes the history of schizophrenia GWAS reported by key publications. In European populations, common variant associations identified from GWAS explain around one-third (24%) of genetic liability to schizophrenia (Lee, DeCandia, Ripke, Yang, & Wray, 2012; Ripke et al., 2013; Schizophrenia Working Group of the Psychiatric Genomics Consortium, 2014).
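To make the single-variant test concrete, the following is a minimal Python sketch of an allelic chi-square test for one SNP, compared against the genome-wide significance threshold quoted in the text. The allele counts are invented for illustration and the function name `allelic_association` is hypothetical. Published GWAS typically fit regression models with covariates rather than this bare 2 × 2 test. The threshold is commonly motivated as a Bonferroni correction of 0.05 for roughly one million independent common-variant tests.

```python
import math

# Genome-wide significance threshold quoted in the text.
GENOME_WIDE_ALPHA = 5e-8


def allelic_association(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Pearson chi-square test (1 df) on a 2x2 table of allele counts.

    Returns (odds_ratio, chi2, p_value).
    """
    n = case_alt + case_ref + ctrl_alt + ctrl_ref
    cross_diff = case_alt * ctrl_ref - case_ref * ctrl_alt
    chi2 = n * cross_diff ** 2 / (
        (case_alt + case_ref) * (ctrl_alt + ctrl_ref)
        * (case_alt + ctrl_alt) * (case_ref + ctrl_ref)
    )
    odds_ratio = (case_alt * ctrl_ref) / (case_ref * ctrl_alt)
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return odds_ratio, chi2, p_value


# Hypothetical counts: 20 000 cases and 30 000 controls (two alleles each),
# risk allele frequency 0.32 in cases versus 0.30 in controls.
odds_ratio, chi2, p_value = allelic_association(12800, 27200, 18000, 42000)
print(f"OR = {odds_ratio:.3f}, chi2 = {chi2:.1f}, p = {p_value:.2e}")
print("genome-wide significant:", p_value <= GENOME_WIDE_ALPHA)
```

With these toy numbers, an odds ratio of about 1.1 passes the threshold only because the sample is large, which illustrates why smaller early studies were underpowered for variants of small effect.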
The first reported genome-wide significant loci associated with schizophrenia came in 2009 (International Schizophrenia Consortium et al., 2009; Shi et al., 2009; Stefansson et al., 2009). In the subsequent decade, efforts by the PGC have spearheaded many of the common variant discoveries in schizophrenia genetics. The first GWAS by the PGC identified significant associations in seven loci, five of which were novel and two of which had been previously implicated (Schizophrenia Psychiatric Genome-Wide Association Study (GWAS) Consortium, 2011). The major breakthrough for schizophrenia GWAS came from the second GWAS by the PGC, which identified 128 independent associations in 108 loci, 83 of which had not been reported previously (Schizophrenia Working Group of the Psychiatric Genomics Consortium, 2014). A subsequent meta-analysis identified 179 independent associations mapping to 145 independent loci, 52 of which were not previously reported by the PGC. Utilizing GWAS fine mapping, chromosome conformation capture, and summary-data-based Mendelian randomisation analyses, potential causal genes were mapped to 33 loci. Among these potentially causal genes is SLC39A8, deficiency of which has also been associated with severe neurodevelopmental disorders, supposedly via impaired manganese transport and glycosylation. As oral galactose supplementation is a treatment option for complete normalization of glycosylation (Park et al., 2015), this finding suggests therapeutic potential in schizophrenia. The largest GWAS to date is the third effort from the PGC, currently available as a pre-print, in 69 369 people with schizophrenia and 236 642 controls, identifying a total of 329 linkage disequilibrium-independent significant SNPs mapping to 270 distinct loci (Schizophrenia Working Group of the Psychiatric Genomics Consortium, Ripke, Walters, & O'Donovan, 2020). Fine-mapping analyses refined these signals to 130 genes that are likely to explain these associations.
One of the most robust common variant associations with schizophrenia, among European populations, has been the 6p22.1 locus. This locus was first reported in 2009 in three independent studies as a genome-wide significant association (Purcell, Wray, Stone, Visscher, O'Donovan, & Sullivan, 2009; Shi et al., 2009; Stefansson et al., 2009), and has been repeatedly confirmed since then in the major studies reported in Fig. 2. The association at this locus, in an extended region around the Major Histocompatibility Complex, is likely to reflect causal variants located in at least three independent regions, one of which involves alleles of complement component 4 (C4) (Sekar et al., 2016). The associated alleles promote greater expression of C4A in the brain. This locus was also reported in studies in Chinese populations (Yue et al., 2011; Li et al., 2017); however, a subsequent meta-analysis of the largest East Asian schizophrenia GWAS (Lam et al., 2019) did not replicate this finding. Many genetic studies have demonstrated population-specific characteristics due to differences in underlying genetic architecture and environmental exposures, highlighting the importance of investigating trait/disease heterogeneity in population genetic studies. GWAS have been reported in Indian (Periyasamy et al., 2019), African American (Fiorica & Wheeler, 2019) and Latin American (Bigdeli et al., 2019) populations but, to date, large-scale schizophrenia GWAS have primarily been conducted in people of European ancestry and, more recently, East Asian ancestries. The largest schizophrenia GWAS in individuals of East Asian ancestries reported 21 independent associations in 19 loci, which included the top three associations shared with the European studies, and an additional 14 associations compared to the previous study of Chinese ancestry (Lam et al., 2019).
The subsequent meta-analysis of East Asian and European ancestries reported 208 independent associations in 176 independent genetic loci, 53 of which were novel. These findings suggest that although the common genetic basis for schizophrenia is largely shared between populations, there are also likely to be population-specific risk variants driven by underlying differences in linkage disequilibrium and/or allele frequency.

Polygenic risk score (PRS) prediction of schizophrenia
PRSs have emerged as an informative tool for studying the effects of genetic liability and may potentially be a clinically useful application of GWAS results. PRS is a single measure of the cumulative effects of common variants associated with a disorder, with higher scores indicating higher genetic liability (Lewis & Vassos, 2020). The variance explained varies depending on the GWAS p value threshold used for calculating the PRS (from all SNPs to only genome-wide significant SNPs); a p value threshold of <0.05 from currently powered GWAS explains the greatest amount of variance in schizophrenia case-control status (Schizophrenia Working Group of the Psychiatric Genomics Consortium, 2014; Schizophrenia Working Group of the Psychiatric Genomics Consortium et al., 2020). First demonstrated in 2009, PRS can currently explain around 7.7% of the variance in schizophrenia case-control status (Schizophrenia Working Group of the Psychiatric Genomics Consortium et al., 2020). However, the ability of PRS to explain schizophrenia case-control status is reduced in samples from a health care setting compared with the original GWAS research cohorts (Zheutlin et al., 2019). Although the ability of PRS to predict schizophrenia is currently insufficient for diagnostic purposes, there may be greater promise in sampling individuals at the extreme ends of the PRS distribution. How schizophrenia PRS could be applied clinically remains unclear and, in contrast to other common diseases that have benefited from PRS, such as coronary artery disease and type 2 diabetes, there is currently no preventative strategy in place for schizophrenia (Lewis & Vassos, 2020).
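As an illustration of how such a score is assembled, the sketch below computes a PRS as the sum of risk-allele dosages weighted by GWAS effect sizes, keeping only SNPs that pass a chosen p value threshold. The SNP identifiers, effect sizes, p values and the function name `polygenic_risk_score` are hypothetical, and steps used in practice (for example, pruning correlated SNPs and standardising the score) are deliberately omitted.

```python
def polygenic_risk_score(dosages, sumstats, p_threshold=0.05):
    """Weighted sum of allele dosages for SNPs with GWAS p < p_threshold.

    dosages:  {snp_id: number of effect alleles carried (0, 1 or 2)}
    sumstats: {snp_id: (effect_size, p_value)} from a discovery GWAS
    """
    score = 0.0
    for snp, (effect_size, p_value) in sumstats.items():
        # Skip SNPs above the threshold or not genotyped in this individual.
        if p_value < p_threshold and snp in dosages:
            score += effect_size * dosages[snp]
    return score


# Hypothetical summary statistics (log odds ratios and p values).
toy_sumstats = {
    "rs_a": (0.08, 1e-9),
    "rs_b": (-0.05, 0.01),
    "rs_c": (0.03, 0.20),
}
# Hypothetical genotype of one individual.
toy_dosages = {"rs_a": 2, "rs_b": 1, "rs_c": 2}

for threshold in (5e-8, 0.05, 1.0):
    prs = polygenic_risk_score(toy_dosages, toy_sumstats, threshold)
    print(f"p < {threshold:g}: PRS = {prs:.2f}")
```

Relaxing the threshold brings more SNPs into the sum, which is why the variance explained depends on the threshold chosen.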
A key challenge in PRS analysis is their application across different major ancestral populations. PRS derived from alleles discovered in GWAS of European ancestries explain less variance when applied to African and Asian populations than in European ancestry samples, likely due to differences in allele frequencies and linkage disequilibrium structures (Curtis, 2018; Martin et al., 2019). PRS prediction of schizophrenia computed from European ancestry populations has been reported to be only 45% as accurate in East Asians compared to European individuals, despite a broadly shared genetic aetiology (Lam et al., 2019). The performance of PRS derived from a European ancestry GWAS applied to individuals with an African ancestry has been demonstrated to be particularly poor (Curtis, 2018; Vassos et al., 2017). Given that the vast majority of genetic studies are conducted in populations of European ancestry, greater diversity in GWAS must be prioritised to realise the potential of PRS.
Dissecting the clinical heterogeneity of schizophrenia using PRS
PRS has been used to disentangle the clinical heterogeneity of schizophrenia by investigating the phenotypic markers of genetic liability, particularly in relation to treatment outcomes, symptom severity, and cognitive ability. A higher PRS for schizophrenia has been associated with a more chronic illness course, indexed by the number and length of hospital admissions (Meier et al., 2016). Results from studies investigating an association between schizophrenia PRS and treatment-resistance to antipsychotics are conflicting, indicating that outcome-specific PRSs are likely to be required (Frank et al., 2015; Horsdal et al., 2017; Kowalec et al., 2019; Legge et al., 2020; Zhang et al., 2019).
Schizophrenia PRS has been associated with negative and disorganised symptom dimensions in individuals with schizophrenia, although the reported variance explained is small (Bipolar Disorder and Schizophrenia Working Group of the Psychiatric Genomics Consortium, 2018; Fanous et al., 2012; Jonas et al., 2019). There is no evidence to suggest that schizophrenia PRS is associated with the severity of positive symptoms in individuals with schizophrenia, but it has been associated with the presence of psychotic symptoms in first-episode samples and bipolar disorder (Allardyce et al., 2018; Ruderfer et al., 2014). There have been conflicting findings from studies investigating the association between schizophrenia PRS and the cognitive deficits observed in individuals with schizophrenia (Dickinson et al., 2020; Richards et al., 2020).
A further area of interest has been the relationship between PRS and intermediate phenotypes relevant to schizophrenia such as neuroimaging measures. A higher schizophrenia PRS has been associated with lower connectivity in areas disrupted in individuals with schizophrenia (Cao, Zhou, & Cannon, 2020). There does not currently appear to be any strong evidence for an association between PRS and brain structural changes relevant to schizophrenia (van der Merwe et al., 2019).

Shared common genetic heritability with other psychiatric disorders
Common genetic liability for schizophrenia has been robustly found to have pleiotropic effects on related psychiatric disorders (Fig. 3). Schizophrenia has significant genetic correlations with bipolar disorder (r_g = 0.68), major depressive disorder (r_g = 0.34), obsessive compulsive disorder (r_g = 0.33), ADHD (r_g = 0.22), anorexia nervosa (r_g = 0.22), and autism spectrum disorder (r_g = 0.21) (Brainstorm Consortium et al., 2018). In addition, schizophrenia had a positive genetic correlation with neuroticism (r_g = 0.19), and negative correlations with subjective well-being (r_g = −0.30), body mass index (r_g = −0.10), and intelligence (r_g = −0.20) (Brainstorm Consortium et al., 2018). These results are also consistent in non-European populations (Lam et al., 2019), and in a PRS analysis of a large real-world sample of patients from four US-based healthcare systems (Zheutlin et al., 2019). These findings indicate that a substantial proportion of the common genetic architecture among psychiatric disorders and cognition is shared and has implications for the validity of current clinical diagnostic boundaries. These findings are supported by the observation that schizophrenia is often comorbid with other psychiatric disorders, with the highest rates for substance use disorders (Regier et al., 1990) and mood disorders (Buckley, Miller, Lehrer, & Castle, 2009). Possible explanations for this pleiotropy include the presence of a general genetic psychopathology factor that increases the risk for multiple psychiatric disorders or that additional genetic and environmental factors influence the eventual clinical presentation.

SNP-based heritability estimates
Current estimates place the SNP-based heritability, or the phenotypic variance due to genetic variation tagged by polymorphisms derived from array genotyping, at approximately 24% (Schizophrenia Working Group of the Psychiatric Genomics Consortium et al., 2020). The disparity in heritability estimates from family studies and GWAS indicates there are susceptibility genes yet to be identified and has led to a search for the 'missing' heritability. There are several potential sources of missing heritability including common variants that current studies are not yet powered to detect, but also gene-environment interactions, epigenetic variation, and rare genetic variation.

Rare genetic variation
Rare variants are typically defined as those with a minor allele frequency <1% and include single nucleotide variants (SNVs), altering one or a small number of bases, and insertion-deletion variants, which can vary in size from those affecting single nucleotides to those classified as CNVs that affect thousands or millions of bases. CNV studies to date are typically based on the same genotyping platforms used to conduct GWAS, while other rare variants have largely been studied using whole exome sequencing (WES). Whole genome sequencing can capture both types of variants as well as other forms of structural changes, but to date, such studies in schizophrenia are small in scale. Rare variant studies are poised to be highly informative by pinpointing causal genes and understanding molecular and cellular aspects of the disease process (Sullivan & Geschwind, 2019). Including the most recent exome sequencing effort reported in pre-peer reviewed format, 10 genes have been implicated through exome sequencing studies to date and eight rare CNVs have been associated with schizophrenia.

SNVs and small insertion deletion (indel) variants
The current sample sizes included in WES cohorts are insufficiently powered to detect significant associations at the single variant level. However, an alternative statistical strategy exists, whereby variants are pooled together at the gene or genome level and then the burden of those variants can be compared between cases and controls. A WES study demonstrated an enrichment of damaging ultra-rare variants in 2536 schizophrenia cases, compared with 2543 controls (Purcell et al., 2014). Similar findings were reported from an extension of the same Swedish cohort of 12 332 individuals (4946 schizophrenia cases, 6242 controls) analysed along with 45 376 whole exomes from the Exome Aggregation Consortium (ExAC), whereby an excess burden of disruptive and damaging (as predicted by bioinformatics) ultra-rare variants was found in cases [odds ratio = 1.07, 95% confidence interval (CI) = 1.05-1.09] (Genovese et al., 2016). However, no individual gene was found to have a significant excess of damaging ultra-rare variants, demonstrating the highly polygenic nature of schizophrenia. Another WES study identified a significant association between SETD1A loss of function (LoF: variants predicted to result in the loss of protein-coding function) variants and schizophrenia by combining a whole-exome case-control sequencing study of 4264 schizophrenia cases, and 9343 controls, with de novo mutation analysis in 1077 parent-proband trios (Singh et al., 2016). A subsequent follow-up study found the burden of LoF variants was enriched in LoF intolerant gene sets in schizophrenia cases, compared to controls (odds ratio = 1.24, 95% CI 1.16-1.31), although no individual gene was implicated (Singh et al., 2017). Similarly, the recent population-specific WES studies of South African (Gulsuner et al., 2020) and Taiwanese (Howrigan et al., 2020) ancestry did not identify any implicated gene but found the burden of LoF variants to be enriched in highly brain-expressed and evolutionarily constrained genes. Currently, the largest rare exome sequencing effort is from the Schizophrenia Exome Sequencing Meta-Analysis (SCHEMA) Consortium, which pooled data from 24 248 schizophrenia cases and 97 322 controls. As reported in pre-peer reviewed format, 10 genes (including SETD1A) reach genome-wide significance for an excess of ultra-rare variants predicted to be damaging in cases compared to controls (p < 2 × 10⁻⁶) (Singh, Neale, & Daly, 2020), and a further 22 reached suggestive levels of significance as defined by a false discovery rate of 0.05.
Fig. 3. Genetic correlations (r_g) of psychiatric disorders and related phenotypes with schizophrenia (Anttila et al., 2018). Error bars represent 95% confidence intervals.
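The burden comparison described in this section, in which carriers of qualifying rare variants are counted in cases and in controls, can be sketched as an odds ratio with a Wald 95% confidence interval. This is a simplified illustration only: the counts and the function name `burden_odds_ratio` are hypothetical, and the cited studies used more elaborate models than a single 2 × 2 table.

```python
import math


def burden_odds_ratio(case_carriers, case_total, ctrl_carriers, ctrl_total):
    """Odds ratio and Wald 95% CI for carrying a qualifying rare variant.

    Returns (odds_ratio, ci_low, ci_high).
    """
    a = case_carriers
    b = case_total - case_carriers   # non-carrier cases
    c = ctrl_carriers
    d = ctrl_total - ctrl_carriers   # non-carrier controls
    odds_ratio = (a * d) / (b * c)
    # Standard error of the log odds ratio.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(odds_ratio)
    ci_low = math.exp(log_or - 1.96 * se_log_or)
    ci_high = math.exp(log_or + 1.96 * se_log_or)
    return odds_ratio, ci_low, ci_high


# Hypothetical counts: 300 carriers among 5000 cases, 250 among 5000 controls.
odds_ratio, ci_low, ci_high = burden_odds_ratio(300, 5000, 250, 5000)
print(f"OR = {odds_ratio:.2f} (95% CI {ci_low:.2f}-{ci_high:.2f})")
```

Pooling variants across a gene or gene set increases the carrier counts in each cell, which narrows the interval relative to testing any single ultra-rare variant.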

De novo variants (DNVs)
DNVs are variants that are present in offspring as a result of a new mutation event and are therefore absent in the parents. Schizophrenia is associated with a reduction in reproductive fecundity, leading to the hypothesis that DNVs with large effect sizes play a role in the genetic aetiology. In a recent WES study of 3444 schizophrenia parent-proband trios, LoF DNVs were found to be significantly enriched in LoF intolerant genes, but no gene individually achieved exome-wide significance for the enrichment of LoF DNVs. However, the burden of DNVs was significantly higher in genes previously associated with neurodevelopmental disorders, among which SLC6A1, encoding a gamma-aminobutyric acid (GABA) transporter, was significantly enriched with missense variants. In gene-set analysis, DNVs were enriched in evolutionarily constrained genes and in those implicated in multiple neurodevelopmental disorders. A separate study also noted that the DNV burden in schizophrenia is smaller than in early-onset neurodevelopmental disorders (Howrigan et al., 2020).

Copy number variations (CNVs)
CNVs are either duplications or deletions, ranging from 50 base pairs to megabases in the genome, and can span a whole gene or multiple genes in a region. CNVs have been consistently implicated in the aetiology of schizophrenia, with the first associated CNV for schizophrenia being a large deletion on chromosome 22q11.2, which confers a 20-fold increased risk, with approximately 25% of carriers developing schizophrenia. The largest CNV study to date in 21 094 cases and 20 227 controls found eight CNVs (six deletions and two duplications) to be significantly associated with schizophrenia (Fig. 1) (CNV and Schizophrenia Working Groups of the Psychiatric Genomics Consortium, 2017). These eight loci collectively explain 0.85% of the variance in liability to schizophrenia, with 1.4% of the cases carrying these risk CNVs. This is consistent with other studies demonstrating large effect sizes of CNVs, even though the absolute number of cases with CNVs is small. Studies have shown that CNV penetrance (i.e. the proportion of CNV carriers demonstrating the phenotype of interest) in schizophrenia is lower compared with other neurodevelopmental disorders (Vassos et al., 2010). The penetrance of CNVs in schizophrenia seems to be additionally influenced by the burden of common risk alleles (Tansey et al., 2016), with evidence supporting an additive joint effect between the two classes of risk variants (Bergen et al., 2019).
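Penetrance, as defined above, can be estimated from case-control data with Bayes' rule: the probability of disease given the CNV follows from the CNV frequency in cases, its frequency in controls, and the lifetime risk of the disorder. The sketch below is one simple way to do this under those assumptions; the input values and the function name `penetrance` are hypothetical and are not taken from the studies cited.

```python
def penetrance(freq_in_cases, freq_in_controls, lifetime_risk):
    """P(disease | carrier) from carrier frequencies via Bayes' rule."""
    # Joint probability of being affected and a carrier.
    affected_carriers = freq_in_cases * lifetime_risk
    # Joint probability of being unaffected and a carrier.
    unaffected_carriers = freq_in_controls * (1 - lifetime_risk)
    return affected_carriers / (affected_carriers + unaffected_carriers)


# Hypothetical inputs: CNV seen in 0.3% of cases and 0.02% of controls,
# with an assumed lifetime risk of 1%.
estimate = penetrance(0.003, 0.0002, 0.01)
print(f"estimated penetrance = {estimate:.1%}")
```

The calculation shows why a CNV can be strongly enriched in cases and still have incomplete penetrance: most carriers in the population are unaffected because the disorder itself is uncommon.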

Functional annotation of genetic variants
The polygenic nature of schizophrenia imposes a challenge to understand how, where, and when genetic variation acts to increase the vulnerability of developing the disorder. Following up on the findings from genome-wide association approaches, the aim is to then gain biological insights by identifying genes, tissues, cells, and biological pathways associated with schizophrenia.

Enrichment of gene-sets
Gene-set analysis investigates whether sets of genes, grouped by biological pathway or expression in particular tissues, are enriched for variants associated with schizophrenia. Consistent with the hypothesis that schizophrenia is primarily a disorder of neuronal dysfunction, genes highly expressed in the brain, mainly in cortical inhibitory interneurons and excitatory neurons from cerebral cortex and hippocampus, are strongly enriched for SNPs associated with schizophrenia (Schizophrenia Working Group of the Psychiatric Genomics Consortium et al., 2020). Other notable gene-set findings included an association within a target gene of antipsychotic drugs (DRD2), genes involved in glutamatergic neurotransmission and synaptic plasticity (e.g. GRM3, GRIN2A, SRR, GRIA1) and additional genes encoding voltage-gated calcium channel subunits (CACNA1C, CACNB2 and CACNA1I) (Schizophrenia Working Group of the Psychiatric Genomics Consortium, 2014).
Likewise, analyses of gene ontology classifications for schizophrenia have demonstrated enrichment in histone methylation and synaptic pathways; in particular, postsynaptic proteins and structures have been enriched for all classes of risk variants (Kirov et al., 2012; Schizophrenia Working Group of the Psychiatric Genomics Consortium et al., 2020; Singh et al., 2020). Notably, as the sample sizes have increased in genetic studies, enrichment analyses have shown convergence of rare and common variant findings, pointing to concordant genes (e.g. GRIN2A, SP4, STAG1, and FAM120A) and pathways, which strongly supports their relevance in schizophrenia.
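The idea of gene-set enrichment can be illustrated with a hypergeometric test, which asks how likely it is to see at least the observed overlap between a gene set and a list of associated genes if genes were drawn at random. This is a deliberately simple stand-in, not the method of the cited studies, which generally rely on approaches that also account for factors such as gene size and linkage disequilibrium. All counts and the function name `enrichment_p_value` are hypothetical.

```python
from math import comb


def enrichment_p_value(n_genes, n_in_set, n_associated, n_overlap):
    """Upper-tail hypergeometric probability of an overlap >= n_overlap.

    n_genes:      total genes tested
    n_in_set:     genes belonging to the gene set
    n_associated: genes associated with the trait
    n_overlap:    associated genes that fall in the gene set
    """
    total = comb(n_genes, n_associated)
    largest_overlap = min(n_in_set, n_associated)
    tail = sum(
        comb(n_in_set, k) * comb(n_genes - n_in_set, n_associated - k)
        for k in range(n_overlap, largest_overlap + 1)
    )
    return tail / total


# Hypothetical example: 1000 genes, a 50-gene synaptic set, 100 associated
# genes, 15 of which fall in the set (5 would be expected by chance).
p_value = enrichment_p_value(1000, 50, 100, 15)
print(f"enrichment p = {p_value:.2e}")
```

Exact integer arithmetic via `math.comb` keeps the sketch free of external dependencies; for genome-scale inputs a statistics library would be the more practical choice.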

Transcriptome-wide association studies (TWAS)
TWAS use expression quantitative trait loci (eQTLs; variants that affect expression) of genes in a specific tissue to predict the expression of that gene.

Psychological Medicine 2173
Upon merging this information with GWAS summary statistics, we can infer which genes might be causally related to schizophrenia and whether they are up or downregulated. A TWAS using genetic data from the second PGC GWAS and expression data from brain, blood, and adipose tissues identified 157 genes to be associated with schizophrenia, with 42 associated with chromatin organization, highlighting that regions responsible for gene expression regulation can be potential targets (Gusev et al., 2018). This was subsequently followed by another study reporting 107 genes, of which 11 were also differentially expressed and in the same direction as found previously in schizophrenia brain samples (Gandal et al., 2018). A more recent TWAS evaluated 5301 genes with cis-heritable expression (variance explained by SNPs close to the gene) in the dorsolateral prefrontal cortex and identified 89 genic associations, 20 of which were novel (Hall et al., 2020).
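The two steps described here, predicting expression from eQTL weights and then combining those weights with GWAS summary statistics, can be sketched as follows. The second function uses a commonly described summary-statistic form in which the weighted sum of GWAS z-scores is scaled by the linkage disequilibrium among the SNPs; it is offered as an assumption-laden illustration rather than the exact procedure of the cited studies. All weights, z-scores, correlations and function names are hypothetical.

```python
import math


def predicted_expression(weights, dosages):
    """Genetically predicted expression: eQTL weights times allele dosages."""
    return sum(w * g for w, g in zip(weights, dosages))


def twas_z_score(weights, gwas_z, ld):
    """Gene-level z-score from eQTL weights, GWAS z-scores and an LD matrix.

    ld[i][j] is the correlation between SNP i and SNP j.
    """
    n = len(weights)
    numerator = sum(w * z for w, z in zip(weights, gwas_z))
    variance = sum(
        weights[i] * ld[i][j] * weights[j] for i in range(n) for j in range(n)
    )
    return numerator / math.sqrt(variance)


# Hypothetical gene with two eQTL SNPs.
toy_weights = [0.5, -0.2]            # effect of each allele on expression
toy_gwas_z = [4.0, -1.0]             # GWAS z-scores for the same alleles
toy_ld = [[1.0, 0.3], [0.3, 1.0]]    # correlation between the two SNPs

gene_z = twas_z_score(toy_weights, toy_gwas_z, toy_ld)
print(f"predicted expression = {predicted_expression(toy_weights, [2, 1]):.2f}")
print(f"gene-level z = {gene_z:.2f}")
```

A positive gene-level z-score in this sketch would be read as higher predicted expression being associated with higher risk, and a negative one as lower expression, which corresponds to the up or downregulation mentioned in the text.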

Transcriptomic and single-cell studies
Although TWAS are able to indicate possible causal genes from GWAS results, they cannot directly show when during the course of disease a gene acts or which cell types are involved. As an initial step to address this gap, the CommonMind Consortium generated transcriptomic data from dorsolateral prefrontal cortex (258 schizophrenia cases, 279 controls) and intersected these with 142 GWAS associations, demonstrating an overlap of 20 variants potentially influencing gene expression of one or more genes (Wang et al., 2018).
Recently, single cell RNAseq data have offered a new interpretation of GWAS results. For schizophrenia, PsychENCODE-generated single cell RNAseq data identified spatiotemporal loci (Li et al., 2018) and mapped the associated genomic loci to pyramidal cells, medium spiny neurons, and interneurons in adult cortical cells (Skene et al., 2018) and to neural progenitors (Li et al., 2018), oligodendrocyte precursors and fetal microglia (Polioudakis et al., 2019).

Future perspectives and increasing diversity
In the next few years, we should expect that larger sample size GWAS and rare variant studies will identify more genes and refine the biological processes implicated in schizophrenia. Similarly, efforts to improve functional annotations provided by newer iterations of consortia such as PsychENCODE, GTEx, CommonMind Consortium and others will aid further interpretation of GWAS results. It is hoped that these insights could contribute to drug discovery and personalised medicine efforts. Future research is also likely to focus on investigating how the different elements of genetic risk combine and how genetic risk interacts with environmental factors to ultimately cause schizophrenia.
As mentioned previously, it is of paramount importance to improve the diversity of schizophrenia genetic studies by including non-European populations. As of 2019, 79% of all GWAS samples (including non-schizophrenia studies) were from individuals of European descent. For schizophrenia, the latest GWAS pre-print included ∼20% non-European samples, the majority of which were from East Asia (Schizophrenia Working Group of the Psychiatric Genomics Consortium et al., 2020). Increasing genetic diversity will ensure emergent clinical applications are applicable to all human populations, and at the same time, will increase the power to identify genetic associations and aid fine mapping (Li & Keating, 2014). As new consortia and biobanks are established, it is essential that ancestry disparities are addressed.

Conclusions
In the last two decades, the sample size for genetic studies of schizophrenia has increased from hundreds to hundreds of thousands of individuals, resulting in the identification of >300 common variants, 10 genes with a burden of rare coding variants, and at least 8 CNVs (Fig. 4).

2174
Sophie E. Legge et al.
Genetic studies of schizophrenia provided the first compelling evidence of the value of GWAS in investigating common genetic risk for psychiatric disorders, which has led to an expansion of sample sizes and subsequently the identification of many genetic variants for other psychiatric disorders (Schizophrenia Working Group of the Psychiatric Genomics Consortium, 2014). Moreover, GWAS demonstrated the polygenic nature of psychiatric disorders and how hundreds or thousands of variants contribute to liability to schizophrenia. Rare variant studies in schizophrenia have also made important achievements, including identification of overlap of implicated genes and gene sets with other neurodevelopmental disorders. This pleiotropy further adds to the existing complex picture emerging out of these studies (Sullivan & Geschwind, 2019). For the coming years, we should expect not only an increase in the sample sizes but also an increase in diversity of genetic studies, which should identify new regions, improve fine mapping and increase our understanding of the biology and mechanisms of schizophrenia. Likewise, as sequencing prices decrease, larger whole genome sequencing studies should identify other rare variants that play a role in the disease. Now that many genes have been implicated in schizophrenia, functional studies will be critical to better understand how and when schizophrenia vulnerability acts during neurodevelopment.