ReMo-SNPs: a new software tool for identification of polymorphisms in regions and motifs genome-wide

Published online by Cambridge University Press: 17 April 2015

LISETTE GRAAE
Affiliation:
Department of Neuroscience, Karolinska Institutet, Retzius väg 8, 171 77 Stockholm
SILVIA PADDOCK
Affiliation:
Rose Li and Associates, Inc., Bethesda, MD, USA
ANDREA CARMINE BELIN*
Affiliation:
Department of Neuroscience, Karolinska Institutet, Retzius väg 8, 171 77 Stockholm
* Corresponding author: Andrea.Carmine.Belin@ki.se

Summary

Studies of complex genetic diseases have revealed many risk factors of small effect, but the combined amount of heritability explained is still low. Genome-wide association studies are often underpowered to identify true effects because of the very large number of parallel tests. There is, therefore, a great need to generate data sets that are enriched for markers with an increased a priori chance of being functional, such as markers in genomic regions involved in gene regulation. ReMo-SNPs is a computational program developed to help researchers select functional SNPs for association analyses in user-specified regions and/or motifs genome-wide. Because the program automatically selects the markers that are genotyped in the user-provided material, the output data are ready to be used in a subsequent association study. In this article we describe the program and its functions. We also validate the program with an example study on three different transcription factors and with results from an association study on two psychiatric phenotypes. The flexibility of the ReMo-SNPs program enables the user to study any region or sequence of interest, without limitation to transcription factor binding regions and motifs. The program is freely available at: http://www.neuro.ki.se/ReMo-SNPs/
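The selection idea described in the summary can be illustrated with a minimal Python sketch. It is not the ReMo-SNPs implementation: the names `Snp`, `Region`, `motif_hits`, `snps_in_regions` and `keep_genotyped` are hypothetical, coordinates are assumed to be 1-based and inclusive, and only the degenerate nucleotide codes mentioned in this article (n, R, K, S) are handled.

```python
import re
from typing import NamedTuple


class Snp(NamedTuple):
    rsid: str
    chrom: str
    pos: int  # assumed 1-based coordinate


class Region(NamedTuple):
    chrom: str
    start: int  # assumed 1-based, inclusive
    end: int    # inclusive


# Degenerate codes used in this article: N = any, R = A/G, K = T/G, S = C/G.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "[AG]", "K": "[GT]", "S": "[CG]", "N": "[ACGT]"}


def motif_hits(sequence, seq_start, chrom, motif):
    """Return every (possibly overlapping) motif occurrence as a Region."""
    body = "".join(IUPAC[base] for base in motif.upper())
    pattern = re.compile("(?=(" + body + "))")  # lookahead keeps overlaps
    return [Region(chrom, seq_start + m.start(),
                   seq_start + m.start() + len(motif) - 1)
            for m in pattern.finditer(sequence.upper())]


def snps_in_regions(snps, regions):
    """Keep SNPs that fall inside at least one region."""
    return [s for s in snps
            if any(r.chrom == s.chrom and r.start <= s.pos <= r.end
                   for r in regions)]


def keep_genotyped(snps, genotyped_ids):
    """Keep only SNPs present in the user's genotyped marker set."""
    return [s for s in snps if s.rsid in genotyped_ids]


# Toy example: one sequence fragment starting at chr1:100.
hits = motif_hits("TTAGGTCATT", 100, "chr1", "RGKTCA")
snps = [Snp("rs1", "chr1", 104), Snp("rs2", "chr1", 110),
        Snp("rs3", "chr2", 104), Snp("rs4", "chr1", 107)]
motif_snps = snps_in_regions(snps, hits)
selected = keep_genotyped(motif_snps, {"rs2", "rs4"})
```

The linear scan over regions is chosen for readability; a genome-wide run would more plausibly use sorted intervals or an interval tree.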

Information

Type
Research Papers
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/3.0/), which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.
Copyright
Copyright © Cambridge University Press 2015

Table 1. Number of individuals included in the final data sets for the association analyses.

Fig. 1. Flowchart illustrating an overview of the input, action and output parts of the ReMo-SNPs program.

Table 2. Results from the long run with the ReMo-SNPs program.

Table 3. Summary of individuals and SNPs excluded in each quality control step.

Table 4. Association results showing the top associated SNP for each data set.

Fig. 2. Motif-SNPs placed within vs. outside experimentally verified transcription factor binding regions for (a) Regulome, (b) SNP Function Annotation Portal and (c) SNP Function Prediction. The score values on the y-axis are unique to each program and therefore cannot be compared between programs. Data are presented as mean ± standard error of the mean; ** = p < 0·01, *** = p < 0·0001.

Fig. 3. Assessment of SNP densities in regions and motifs of interest compared to the genome at large. Average SNP density in the human genome of the CEU population and in the binding regions and motifs for the three transcription factors, GR, PPAR and VDR. *** = p < 0·0001.

Fig. 4. The distribution of SNPs found at different positions within the motif. The bars represent the six nucleotide positions within the motifs, and the y-axis shows the percentage of SNPs found at each position, normalized to the total number of SNPs found for each motif. n = any nucleotide (A, T, G or C); R = A or G; K = T or G; and S = C or G.
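The normalization described in the Fig. 4 caption can be sketched as follows. This is an illustrative assumption about how such percentages might be computed, not the authors' code; the function name `position_percentages` and the input format (a list of 1-based SNP offsets within a six-nucleotide motif) are hypothetical.

```python
from collections import Counter


def position_percentages(offsets, motif_length=6):
    """Percentage of SNPs at each motif position (1-based offsets).

    Each count is divided by the total number of SNPs found for the
    motif, so the returned values sum to 100 when any SNP is present.
    """
    counts = Counter(offsets)
    total = sum(counts[i] for i in range(1, motif_length + 1))
    if total == 0:
        return [0.0] * motif_length  # no SNPs found for this motif
    return [100.0 * counts[i] / total for i in range(1, motif_length + 1)]


# Toy example: four SNPs, two at position 1, one each at positions 3 and 6.
percentages = position_percentages([1, 1, 3, 6])
```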

Fig. 5. The number of SNPs found per motif for each transcription factor (a) GR, (b) PPAR and (c) VDR.