-
American Journal of Human Genetics Jan 2019
Functional genomics data has the potential to increase GWAS power by identifying SNPs that have a higher prior probability of association. Here, we introduce a method that leverages polygenic functional enrichment to incorporate coding, conserved, regulatory, and LD-related genomic annotations into association analyses. We show via simulations with real genotypes that the method, functionally informed novel discovery of risk loci (FINDOR), correctly controls the false-positive rate at null loci and attains a 9%-38% increase in the number of independent associations detected at causal loci, depending on trait polygenicity and sample size. We applied FINDOR to 27 independent complex traits and diseases from the interim UK Biobank release (average N = 130K). Averaged across traits, we attained a 13% increase in genome-wide significant loci detected (including a 20% increase for disease traits) compared to unweighted raw p values that do not use functional data. We replicated the additional loci in independent UK Biobank and non-UK Biobank data, yielding a highly statistically significant replication slope (0.66-0.69) in each case. Finally, we applied FINDOR to the full UK Biobank release (average N = 416K), attaining smaller relative improvements (consistent with simulations) but larger absolute improvements, detecting an additional 583 GWAS loci. In conclusion, leveraging functional enrichment using our method robustly increases GWAS power.
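The general idea of letting a functional prior shift the significance threshold can be sketched with weighted p values. This is a minimal illustration under stated assumptions, not FINDOR's actual procedure: the weights, the SNP values, and the function name `weighted_p_values` are hypothetical, and the abstract does not say how the weights are derived from annotations. The one property the sketch relies on is that weights are rescaled to average 1, so the overall multiple-testing budget is unchanged.

```python
from typing import List, Sequence


def weighted_p_values(p_values: Sequence[float],
                      raw_weights: Sequence[float]) -> List[float]:
    """Divide each p value by its weight after rescaling weights to mean 1.

    Assumption: raw_weights are positive prior weights (hypothetical here),
    larger for SNPs judged more likely to be associated.
    """
    if len(p_values) != len(raw_weights):
        raise ValueError("p_values and raw_weights must have equal length")
    if any(w <= 0 for w in raw_weights):
        raise ValueError("weights must be positive")
    mean_w = sum(raw_weights) / len(raw_weights)
    weights = [w / mean_w for w in raw_weights]  # now averages 1
    return [min(1.0, p / w) for p, w in zip(p_values, weights)]


# Hypothetical toy input: four SNPs, the second with a stronger functional prior.
p_raw = [4e-8, 8e-8, 0.5, 0.5]
prior = [1.0, 3.0, 1.0, 1.0]
p_weighted = weighted_p_values(p_raw, prior)

threshold = 5e-8  # conventional genome-wide significance level
hits_raw = [i for i, p in enumerate(p_raw) if p < threshold]
hits_weighted = [i for i, p in enumerate(p_weighted) if p < threshold]
```

In this toy input the up-weighted SNP crosses the threshold while a SNP with a below-average weight no longer does, which is the trade-off any mean-1 weighting scheme makes.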
Topics: Calibration; Databases, Genetic; Datasets as Topic; False Positive Reactions; Genome-Wide Association Study; Humans; Multifactorial Inheritance; Polymorphism, Single Nucleotide; Probability; Reproducibility of Results; United Kingdom
PubMed: 30595370
DOI: 10.1016/j.ajhg.2018.11.008 -
Nature Genetics Nov 2020
Comparative Study
Here, we present a joint-tissue imputation (JTI) approach and a Mendelian randomization framework for causal inference, MR-JTI. JTI borrows information across transcriptomes of different tissues, leveraging shared genetic regulation, to improve prediction performance in a tissue-dependent manner. Notably, JTI includes the single-tissue imputation method PrediXcan as a special case and outperforms other single-tissue approaches (the Bayesian sparse linear mixed model and Dirichlet process regression). MR-JTI models variant-level heterogeneity (primarily due to horizontal pleiotropy, addressing a major challenge of transcriptome-wide association study interpretation) and performs causal inference with type I error control. We make explicit the connection between the genetic architecture of gene expression and of complex traits and the suitability of Mendelian randomization as a causal inference strategy for transcriptome-wide association studies. We provide a resource of imputation models generated from GTEx and PsychENCODE panels. Analysis of biobanks and meta-analysis data, and extensive simulations show substantially improved statistical power, replication and causal mapping rate for JTI relative to existing approaches.
Topics: Animals; Gene Expression Profiling; Genetic Association Studies; Humans; Lipoproteins, LDL; Mendelian Randomization Analysis; Mice; Models, Genetic; Multifactorial Inheritance; Predictive Value of Tests
PubMed: 33020666
DOI: 10.1038/s41588-020-0706-2 -
Current Protocols in Human Genetics Dec 2019
Review
Genome-wide variation data with millions of genetic markers have become commonplace. However, the potential for interpretation and application of these data for clinical assessment of outcomes of interest, and prediction of disease risk, is currently not fully realized. Many common complex diseases now have numerous, well-established risk loci and likely harbor many genetic determinants with effects too small to be detected at genome-wide levels of statistical significance. A simple and intuitive approach for converting genetic data to a predictive measure of disease susceptibility is to aggregate the effects of these loci into a single measure, the genetic risk score. Here, we describe some common methods and software packages for calculating genetic risk scores and polygenic risk scores, with focus on studies of common complex diseases. We review the basic information needed, as well as important considerations for constructing genetic risk scores, including specific requirements for phenotypic and genetic data, and limitations in their application. © 2019 by John Wiley & Sons, Inc.
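The aggregation described above is commonly written as a weighted sum of risk-allele counts. The sketch below assumes per-variant weights (for example, log odds ratios from a published association study) and allele dosages of 0, 1, or 2; the variant IDs, weights, and the function name `genetic_risk_score` are hypothetical and not taken from the reviewed software packages.

```python
from typing import Mapping


def genetic_risk_score(dosages: Mapping[str, float],
                       weights: Mapping[str, float]) -> float:
    """Sum weight * dosage over variants present in both inputs.

    Variants missing from either mapping are skipped; a real pipeline
    would also need to handle strand and effect-allele alignment.
    """
    shared = sorted(dosages.keys() & weights.keys())
    return sum(weights[v] * dosages[v] for v in shared)


# Hypothetical effect sizes and one person's risk-allele dosages.
weights = {"rs_a": 0.10, "rs_b": 0.25, "rs_c": -0.05}
person = {"rs_a": 2, "rs_b": 1, "rs_c": 0}
score = genetic_risk_score(person, weights)  # 0.10*2 + 0.25*1 + (-0.05)*0
```

An unweighted score is the special case where every weight is 1, so the result is simply the count of risk alleles carried.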
Topics: Disease; Genetic Markers; Genetic Predisposition to Disease; Genotype; Humans; Multifactorial Inheritance; Phenotype; Risk Factors; Software
PubMed: 31765077
DOI: 10.1002/cphg.95 -
Genetic Epidemiology Sep 2017
Polygenic scores (PGS) summarize the genetic contribution of a person's genotype to a disease or phenotype. They can be used to group participants into different risk categories for diseases, and are also used as covariates in epidemiological analyses. A number of possible ways of calculating PGS have been proposed, and recently there is much interest in methods that incorporate information available in published summary statistics. As there is no inherent information on linkage disequilibrium (LD) in summary statistics, a pertinent question is how we can use LD information available elsewhere to supplement such analyses. To answer this question, we propose a method for constructing PGS using summary statistics and a reference panel in a penalized regression framework, which we call lassosum. We also propose a general method for choosing the value of the tuning parameter in the absence of validation data. In our simulations, we showed that pseudovalidation often resulted in prediction accuracy that is comparable to using a dataset with validation phenotype and was clearly superior to the conservative option of setting the tuning parameter of lassosum to its lowest value. We also showed that lassosum achieved better prediction accuracy than simple clumping and P-value thresholding in almost all scenarios. It was also substantially faster and more accurate than the recently proposed LDpred.
Topics: Case-Control Studies; Computer Simulation; Databases, Genetic; Humans; Models, Genetic; Multifactorial Inheritance; Polymorphism, Single Nucleotide; Regression Analysis; Statistics as Topic
PubMed: 28480976
DOI: 10.1002/gepi.22050 -
Nature Genetics Oct 2022
Single-cell RNA sequencing (scRNA-seq) provides unique insights into the pathology and cellular origin of disease. We introduce single-cell disease relevance score (scDRS), an approach that links scRNA-seq with polygenic disease risk at single-cell resolution, independent of annotated cell types. scDRS identifies cells exhibiting excess expression across disease-associated genes implicated by genome-wide association studies (GWASs). We applied scDRS to 74 diseases/traits and 1.3 million single-cell gene-expression profiles across 31 tissues/organs. Cell-type-level results broadly recapitulated known cell-type-disease associations. Individual-cell-level results identified subpopulations of disease-associated cells not captured by existing cell-type labels, including T cell subpopulations associated with inflammatory bowel disease, partially characterized by their effector-like states; neuron subpopulations associated with schizophrenia, partially characterized by their spatial locations; and hepatocyte subpopulations associated with triglyceride levels, partially characterized by their higher ploidy levels. Genes whose expression was correlated with the scDRS score across cells (reflecting coexpression with GWAS disease-associated genes) were strongly enriched for gold-standard drug target and Mendelian disease genes.
Topics: Gene Expression Profiling; Genome-Wide Association Study; Multifactorial Inheritance; RNA-Seq; Single-Cell Analysis; Triglycerides
PubMed: 36050550
DOI: 10.1038/s41588-022-01167-z -
Human Genomics Mar 2023
BACKGROUND
Congenital hydrocephalus is characterized by ventriculomegaly, defined as a dilatation of cerebral ventricles, and thought to be due to impaired cerebrospinal fluid (CSF) homeostasis. Primary congenital hydrocephalus (PCH) is a subset of cases with prenatal onset and no other primary cause, e.g., brain hemorrhage. Published series report a Mendelian cause in only a minority of cases. In this study, we analyzed exome data of PCH patients in search of novel causal genes and addressed the possibility of an underlying oligogenic mode of inheritance for PCH.
MATERIALS AND METHODS
We sequenced the exome in 28 unrelated probands with PCH, 12 of whom were from families with at least two affected siblings and 9 of whom were consanguineous, thereby increasing the contribution of genetic causes. Patient exome data were first analyzed for rare (MAF < 0.005) transmitted or de novo variants. Population stratification of unrelated PCH patients and controls was determined by principal component analysis, and outliers were identified using a Mahalanobis distance cutoff of 5%. Patient and control exome data for genes biologically related to cilia (SYScilia database) were analyzed with a mutation burden test.
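The outlier step can be illustrated with a small sketch. The abstract gives only "Mahalanobis distance" and "5%", so everything else here is an assumption: two principal components per sample, a sample covariance estimated from all samples, and a cutoff at the upper 5% quantile of a chi-square distribution with 2 degrees of freedom (which has the closed form -2 ln 0.05). The coordinates and function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def mahalanobis_sq_2d(points: Sequence[Point]) -> List[float]:
    """Squared Mahalanobis distance of each 2-D point from the sample mean."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    # Unbiased sample covariance entries.
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    det = sxx * syy - sxy * sxy
    if det <= 0:
        raise ValueError("covariance matrix is singular")
    # d^2 = v^T S^{-1} v, with the 2x2 inverse written out explicitly.
    return [
        (syy * (x - mx) ** 2 - 2 * sxy * (x - mx) * (y - my) + sxx * (y - my) ** 2) / det
        for x, y in points
    ]


def flag_outliers(points: Sequence[Point], alpha: float = 0.05) -> List[bool]:
    """Flag points beyond the upper-alpha chi-square(2 df) quantile (assumed rule)."""
    cutoff = -2.0 * math.log(alpha)
    return [d2 > cutoff for d2 in mahalanobis_sq_2d(points)]


# Hypothetical PC1/PC2 coordinates: a 3x3 cluster plus one distant sample.
cluster = [(float(x), float(y)) for x in (-1, 0, 1) for y in (-1, 0, 1)]
samples = cluster + [(10.0, 10.0)]
flags = flag_outliers(samples)
```

With more than two components the same rule applies with a chi-square quantile for the matching degrees of freedom, which would need a statistics library rather than the closed form used here.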
RESULTS
In 18% of probands, we identify a causal (pathogenic or likely pathogenic) variant of a known hydrocephalus gene, including genes for postnatal, syndromic hydrocephalus, not previously reported in isolated PCH. In a further 11%, we identify mutations in novel candidate genes. Through mutation burden tests, we demonstrate a significant burden of genetic variants in genes coding for proteins of the primary cilium in PCH patients compared to controls.
CONCLUSION
Our study confirms the low contribution of Mendelian mutations in PCH and reports PCH as a phenotypic presentation of some genes known for syndromic, postnatal hydrocephalus. Furthermore, this study identifies novel Mendelian candidate genes, and provides evidence for oligogenic inheritance implicating primary cilia in PCH.
Topics: Female; Pregnancy; Humans; Multifactorial Inheritance; Mutation; Hydrocephalus; Consanguinity; Databases, Factual
PubMed: 36859317
DOI: 10.1186/s40246-023-00464-w -
Nature Genetics Sep 2022
The genetic etiology of autism spectrum disorder (ASD) is multifactorial, but how combinations of genetic factors determine risk is unclear. In a large family sample, we show that genetic loads of rare and polygenic risk are inversely correlated in cases and greater in females than in males, consistent with a liability threshold that differs by sex. De novo mutations (DNMs), rare inherited variants and polygenic scores were associated with various dimensions of symptom severity in children and parents. Parental age effects on risk for ASD in offspring were attributable to a combination of genetic mechanisms, including DNMs that accumulate in the paternal germline and inherited risk that influences behavior in parents. Genes implicated by rare variants were enriched in excitatory and inhibitory neurons compared with genes implicated by common variants. Our results suggest that a phenotypic spectrum of ASD is attributable to a spectrum of genetic factors that impact different neurodevelopmental processes.
Topics: Autism Spectrum Disorder; Autistic Disorder; Child; Family; Female; Genetic Predisposition to Disease; Humans; Male; Multifactorial Inheritance
PubMed: 35654974
DOI: 10.1038/s41588-022-01064-5 -
Proceedings of the National Academy of... Aug 2023
Autism spectrum disorder (ASD) has a complex genetic architecture involving contributions from both de novo and inherited variation. Few studies have been designed to address the role of rare inherited variation or its interaction with common polygenic risk in ASD. Here, we performed whole-genome sequencing of the largest cohort of multiplex families to date, consisting of 4,551 individuals in 1,004 families having two or more autistic children. Using this study design, we identify seven previously unrecognized ASD risk genes supported by a majority of rare inherited variants, finding support for a total of 74 genes in our cohort and a total of 152 genes after combined analysis with other studies. Autistic children from multiplex families demonstrate an increased burden of rare inherited protein-truncating variants in known ASD risk genes. We also find that ASD polygenic score (PGS) is overtransmitted from nonautistic parents to autistic children who also harbor rare inherited variants, consistent with combinatorial effects in the offspring, which may explain the reduced penetrance of these rare variants in parents. We also observe that in addition to social dysfunction, language delay is associated with ASD PGS overtransmission. These results are consistent with an additive complex genetic risk architecture of ASD involving rare and common variation and further suggest that language delay is a core biological feature of ASD.
Topics: Child; Humans; Autism Spectrum Disorder; Multifactorial Inheritance; Parents; Whole Genome Sequencing; Language Development Disorders; Genetic Predisposition to Disease
PubMed: 37506195
DOI: 10.1073/pnas.2215632120 -
BMC Musculoskeletal Disorders Apr 2020
BACKGROUND
Klippel-Feil syndrome (KFS) represents a rare anomaly characterized by congenital fusion of the cervical vertebrae. The underlying molecular etiology remains largely unknown because of the genetic and phenotypic heterogeneity.
METHODS
We consecutively recruited a Chinese cohort of 37 patients with KFS. The clinical manifestations and radiological assessments were analyzed and whole-exome sequencing (WES) was performed. Additionally, rare variants in KFS cases and controls were compared using genetic burden analysis.
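A case-control burden comparison can take several forms, and the abstract does not state which test was used. As one simple possibility, the sketch below collapses rare variants in a gene to a carrier / non-carrier status per person and applies a one-sided Fisher's exact test. The counts in the usage example and the function name `fisher_one_sided` are hypothetical.

```python
from math import comb


def fisher_one_sided(case_carriers: int, case_total: int,
                     control_carriers: int, control_total: int) -> float:
    """P(cases carry at least the observed number of variants), margins fixed.

    Computed as the upper tail of the hypergeometric distribution.
    """
    carriers = case_carriers + control_carriers
    total = case_total + control_total
    denom = comb(total, carriers)
    upper = min(carriers, case_total)
    tail = sum(
        comb(case_total, k) * comb(control_total, carriers - k)
        for k in range(case_carriers, upper + 1)
    )
    return tail / denom


# Hypothetical counts: 6 of 40 cases vs 10 of 400 controls carry a rare variant.
p_enriched = fisher_one_sided(6, 40, 10, 400)
# Same carrier rate in both groups, for contrast.
p_balanced = fisher_one_sided(4, 40, 40, 400)
```

Collapsing to carrier status ignores variant-level weights and covariates such as ancestry, which regression-based burden tests can accommodate.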
RESULTS
We primarily examined rare variants in five reported genes (GDF6, MEOX1, GDF3, MYO18B and RIPPLY2) associated with KFS and detected three variants of uncertain significance in MYO18B. Based on rare variant burden analysis of 96 candidate genes related to vertebral segmentation defects, we identified BAZ1B as having the highest probability of association with KFS, followed by FREM2, SUFU, VANGL1 and KMT2D. In addition, seven patients were proposed to show potential oligogenic inheritance involving more than one variant in candidate genes, a frequency significantly higher than that in the in-house controls.
CONCLUSIONS
Our study presents an exome-sequenced cohort and identifies five novel genes potentially associated with KFS, extending the spectrum of known mutations contributing to this syndrome. Furthermore, the genetic burden analysis provides further evidence for potential oligogenic inheritance of KFS.
Topics: Adolescent; Adult; Case-Control Studies; Cervical Vertebrae; Child; Child, Preschool; Female; Humans; Klippel-Feil Syndrome; Male; Multifactorial Inheritance; Mutation; Pedigree; Radiography; Transcription Factors; Young Adult
PubMed: 32278351
DOI: 10.1186/s12891-020-03229-x -
Trends in Genetics : TIG Sep 2022
Review
Most large-scale genetic studies of autism have focused on the discovery of genes by proving an enrichment of de novo mutations (DNMs) in autism probands or characterizing polygenic risk based on the association of common variants. We present evidence in support of an oligogenic model where two or more ultrarare mutations of more modest effect are preferentially transmitted to children with autism. Such private gene-disruptive mutations are enriched in families where there are multiple affected individuals, emerged two or three generations ago, and map to genes not previously associated with autism. Although no single gene has reached statistical significance, this class of variation should be considered along with genetic and nongenetic factors to better explain the etiology of this complex trait.
Topics: Autistic Disorder; Child; Genetic Predisposition to Disease; Humans; Multifactorial Inheritance; Mutation
PubMed: 35410794
DOI: 10.1016/j.tig.2022.03.009