-
Alzheimer's & Dementia : the Journal of... Jun 2024
BACKGROUND
Alzheimer's disease (AD) prevalence increases with age, yet a small fraction of the population reaches ages > 100 years without cognitive decline. We studied the genetic factors associated with such resilience against AD.
METHODS
Genome-wide association studies identified 86 single nucleotide polymorphisms (SNPs) associated with AD risk. We estimated SNP frequency in 2281 AD cases, 3165 age-matched controls, and 346 cognitively healthy centenarians. We calculated a polygenic risk score (PRS) for each individual and investigated the functional properties of SNPs enriched/depleted in centenarians.
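A per-individual PRS of the kind described above is commonly computed as a weighted sum of allele dosages. The following is a minimal sketch under that assumption; the function name, the weights, and the genotypes are hypothetical illustrations and are not taken from the study.

```python
def polygenic_risk_score(dosages, weights):
    """Weighted sum of risk-allele dosages (0, 1, or 2 copies per SNP)."""
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


# Hypothetical per-SNP weights (for example, log odds ratios from a GWAS).
# Negative weights stand for alleles associated with lower risk.
weights = [0.10, -0.05, 0.30]

# Hypothetical genotypes: number of copies of the scored allele at each SNP.
person_a = [2, 0, 1]
person_b = [0, 2, 0]

prs_a = polygenic_risk_score(person_a, weights)
prs_b = polygenic_risk_score(person_b, weights)
print(f"PRS A = {prs_a:.2f}, PRS B = {prs_b:.2f}")
```

In this toy example, person B carries only the protective allele and so receives the lower score; group-level comparisons such as cases versus centenarians would then compare the distributions of these scores.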
RESULTS
Cognitively healthy centenarians were enriched with the protective alleles of the SNPs associated with AD risk. The protective effect concentrated on the alleles in/near ANKH, GRN, TMEM106B, SORT1, PLCG2, RIN3, and APOE genes. This translated to >5-fold lower PRS in centenarians compared to AD cases (P = 7.69 × 10), and 2-fold lower compared to age-matched controls (P = 5.83 × 10).
DISCUSSION
Maintaining cognitive health until extreme ages requires complex genetic protection against AD, which concentrates on the genes associated with the endolysosomal and immune systems.
HIGHLIGHTS
Cognitively healthy centenarians are enriched with the protective alleles of genetic variants associated with Alzheimer's disease (AD). The protective effect is concentrated on variants involved in the immune and endolysosomal systems. Combining variants into a polygenic risk score (PRS) translated to > 5-fold lower PRS in centenarians compared to AD cases, and ≈ 2-fold lower compared to middle-aged healthy controls.
Topics: Humans; Alzheimer Disease; Polymorphism, Single Nucleotide; Female; Male; Aged, 80 and over; Genome-Wide Association Study; Genetic Predisposition to Disease; Multifactorial Inheritance; Alleles; Case-Control Studies
PubMed: 38634500
DOI: 10.1002/alz.13810 -
Nature Genetics May 2024
We report a multi-ancestry genome-wide association study on liver cirrhosis and its associated endophenotypes, alanine aminotransferase (ALT) and γ-glutamyl transferase. Using data from 12 cohorts, including 18,265 cases with cirrhosis, 1,782,047 controls, up to 1 million individuals with liver function tests and a validation cohort of 21,689 cases and 617,729 controls, we identify and validate 14 risk associations for cirrhosis. Many variants are located near genes involved in hepatic lipid metabolism. One of these, PNPLA3 p.Ile148Met, interacts with alcohol intake, obesity and diabetes on the risk of cirrhosis and hepatocellular carcinoma (HCC). We develop a polygenic risk score that associates with the progression from cirrhosis to HCC. By focusing on prioritized genes from common variant analyses, we find that rare coding variants in GPAM associate with lower ALT, supporting GPAM as a potential target for therapeutic inhibition. In conclusion, this study provides insights into the genetic underpinnings of cirrhosis.
Topics: Humans; Liver Cirrhosis; Genome-Wide Association Study; Genetic Predisposition to Disease; Liver Neoplasms; Carcinoma, Hepatocellular; Alanine Transaminase; Polymorphism, Single Nucleotide; Male; Lipase; Female; gamma-Glutamyltransferase; Membrane Proteins; Cohort Studies; Case-Control Studies; Multifactorial Inheritance; Risk Factors; Genetic Variation
PubMed: 38632349
DOI: 10.1038/s41588-024-01720-y -
PLoS Genetics Apr 2024
Population differences in risk of disease are common, but the potential genetic basis for these differences is not well understood. A standard approach is to compare genetic risk across populations by testing for mean differences in polygenic scores, but existing studies that use this approach do not account for statistical noise in effect estimates (i.e., the GWAS betas) that arises due to the finite sample size of GWAS training data. Here, we show using Bayesian polygenic score methods that the level of uncertainty in estimates of genetic risk differences across populations is highly dependent on the GWAS training sample size, the polygenicity (number of causal variants), and genetic distance (FST) between the populations considered. We derive a Wald test for formally assessing the difference in genetic risk across populations, which we show to have calibrated type 1 error rates under a simplified assumption that all SNPs are independent, which we achieve in practice using linkage disequilibrium (LD) pruning. We further provide closed-form expressions for assessing the uncertainty in estimates of relative genetic risk across populations under the special case of an infinitesimal genetic architecture. We suggest that for many complex traits and diseases, particularly those with more polygenic architectures, current GWAS sample sizes are insufficient to detect moderate differences in genetic risk across populations, though more substantial differences in relative genetic risk (relative risk > 1.5) can be detected. We show that conventional approaches that do not account for sampling error from the training sample, such as using a simple t-test, have very high type 1 error rates. When applying our approach to prostate cancer, we demonstrate a higher genetic risk in African Ancestry men, with lower risk in men of European followed by East Asian ancestry.
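To illustrate the idea of a Wald test that propagates GWAS sampling error, here is a generic sketch and not the authors' derivation. It assumes independent SNPs, allele frequencies known without error, and effect estimates with known standard errors; the function name and all numbers are hypothetical.

```python
import math


def wald_test_mean_prs_difference(betas, ses, freqs_pop1, freqs_pop2):
    """Wald test for a difference in expected polygenic score between two populations.

    Assumptions of this sketch: SNPs are independent, allele frequencies are
    treated as fixed, and the only uncertainty is the sampling error (ses)
    of the GWAS effect estimates (betas).
    """
    diff = 0.0
    var = 0.0
    for beta, se, p1, p2 in zip(betas, ses, freqs_pop1, freqs_pop2):
        # Expected dosage is 2 * allele frequency, so the difference is 2 * (p1 - p2).
        delta = 2.0 * (p1 - p2)
        diff += beta * delta
        var += (se * delta) ** 2
    if var == 0.0:
        raise ValueError("variance is zero; frequencies are identical or ses are zero")
    z = diff / math.sqrt(var)
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return diff, z, p_value


# Hypothetical inputs for two SNPs.
diff, z, p_value = wald_test_mean_prs_difference(
    betas=[0.1, 0.2],
    ses=[0.02, 0.05],
    freqs_pop1=[0.3, 0.5],
    freqs_pop2=[0.2, 0.4],
)
print(f"difference = {diff:.3f}, z = {z:.2f}, p = {p_value:.2e}")
```

Because the variance term grows with the standard errors of the betas, smaller GWAS training samples widen the uncertainty on the estimated difference, which is the dependence the abstract describes.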
Topics: Male; Humans; Bayes Theorem; Risk Factors; Linkage Disequilibrium; Multifactorial Inheritance; Prostatic Neoplasms; Genome-Wide Association Study; Genetic Predisposition to Disease; Polymorphism, Single Nucleotide
PubMed: 38630784
DOI: 10.1371/journal.pgen.1011212 -
Cancer Epidemiology, Biomarkers &... Jun 2024
BACKGROUND
Previous studies have demonstrated that incorporating a polygenic risk score (PRS) to existing risk prediction models for breast cancer improves model fit, but to determine its clinical utility the impact on risk categorization needs to be established. We add a PRS to two well-established models and quantify the difference in classification using the net reclassification improvement (NRI).
METHODS
We analyzed data from 126,490 post-menopausal women of "White British" ancestry, aged 40 to 69 years at baseline from the UK Biobank prospective cohort. The breast cancer outcome was derived from linked registry data and hospital records. We combined a PRS for breast cancer with 10-year risk scores from the Tyrer-Cuzick and Gail models, and compared these to the risk scores from the models using phenotypic variables alone. We report metrics of discrimination and classification, and consider the importance of the risk threshold selected.
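The net reclassification improvement at a single risk threshold can be sketched as follows. This is the standard two-category definition, offered as an illustration; the function name, threshold, and risk values are hypothetical and do not come from the study.

```python
def net_reclassification_improvement(old_risk, new_risk, is_case, threshold):
    """Two-category NRI at one risk threshold.

    Returns (nri_cases, nri_controls):
      nri_cases    = P(moved up | case)      - P(moved down | case)
      nri_controls = P(moved down | control) - P(moved up | control)
    """
    up_case = down_case = n_case = 0
    up_ctrl = down_ctrl = n_ctrl = 0
    for old, new, case in zip(old_risk, new_risk, is_case):
        old_high = old >= threshold
        new_high = new >= threshold
        moved_up = new_high and not old_high
        moved_down = old_high and not new_high
        if case:
            n_case += 1
            up_case += moved_up
            down_case += moved_down
        else:
            n_ctrl += 1
            up_ctrl += moved_up
            down_ctrl += moved_down
    if n_case == 0 or n_ctrl == 0:
        raise ValueError("need at least one case and one control")
    return (up_case - down_case) / n_case, (down_ctrl - up_ctrl) / n_ctrl


# Hypothetical 10-year risks before and after adding a PRS, with a 5% threshold.
old_risk = [0.02, 0.04, 0.01, 0.06, 0.02, 0.01]
new_risk = [0.06, 0.04, 0.01, 0.03, 0.02, 0.07]
is_case = [True, True, True, False, False, False]

nri_cases, nri_controls = net_reclassification_improvement(
    old_risk, new_risk, is_case, threshold=0.05
)
print(f"NRI cases = {nri_cases:.3f}, NRI controls = {nri_controls:.3f}")
```

The result depends on the threshold passed in, which is why the choice of risk threshold matters when interpreting an NRI.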
RESULTS
The Harrell's C statistic of the 10-year risk from the Tyrer-Cuzick and Gail models was 0.57 and 0.54, respectively, increasing to 0.67 when the PRS was included. Inclusion of the PRS gave a positive NRI for cases in both models [0.080 (95% confidence interval (CI), 0.053-0.104) and 0.051 (95% CI, 0.030-0.073), respectively], with negligible impact on controls.
CONCLUSIONS
The addition of a PRS for breast cancer to the well-established Tyrer-Cuzick and Gail models provides a substantial improvement in the prediction accuracy and risk stratification.
IMPACT
These findings could have important implications for the ongoing discussion about the value of PRS in risk prediction models and screening.
Topics: Humans; Female; Breast Neoplasms; Middle Aged; United Kingdom; Aged; Adult; Biological Specimen Banks; Risk Assessment; Prospective Studies; Risk Factors; Multifactorial Inheritance; Genetic Predisposition to Disease; Genetic Risk Score; UK Biobank
PubMed: 38630597
DOI: 10.1158/1055-9965.EPI-23-1432 -
PloS One 2024
Detecting epistatic drivers of human phenotypes is a considerable challenge. Traditional approaches use regression to sequentially test multiplicative interaction terms involving pairs of genetic variants. For higher-order interactions and genome-wide large-scale data, this strategy is computationally intractable. Moreover, multiplicative terms used in regression modeling may not capture the form of biological interactions. Building on the Predictability, Computability, Stability (PCS) framework, we introduce the epiTree pipeline to extract higher-order interactions from genomic data using tree-based models. The epiTree pipeline first selects a set of variants derived from tissue-specific estimates of gene expression. Next, it uses iterative random forests (iRF) to search training data for candidate Boolean interactions (pairwise and higher-order). We derive significance tests for interactions, based on a stabilized likelihood ratio test, by simulating Boolean tree-structured null (no epistasis) and alternative (epistasis) distributions on hold-out test data. Finally, our pipeline computes PCS epistasis p-values that probabilistically quantify improvement in prediction accuracy via bootstrap sampling on the test set. We validate the epiTree pipeline in two case studies using data from the UK Biobank: predicting red hair and multiple sclerosis (MS). In the case of predicting red hair, epiTree recovers known epistatic interactions surrounding MC1R and novel interactions, representing non-linearities not captured by logistic regression models. In the case of predicting MS, a more complex phenotype than red hair, epiTree rankings prioritize novel interactions surrounding HLA-DRB1, a variant previously associated with MS in several populations. Taken together, these results highlight the potential for epiTree rankings to help reduce the design space for follow up experiments.
Topics: Humans; Epistasis, Genetic; Genome-Wide Association Study; Phenotype; Multifactorial Inheritance; Logistic Models; Polymorphism, Single Nucleotide
PubMed: 38625909
DOI: 10.1371/journal.pone.0298906 -
Nature Communications Apr 2024
Great efforts are being made to develop advanced polygenic risk scores (PRS) to improve the prediction of complex traits and diseases. However, most existing PRS are primarily trained on European ancestry populations, limiting their transferability to non-European populations. In this article, we propose a novel method for generating multi-ancestry Polygenic Risk scOres based on enSemble of PEnalized Regression models (PROSPER). PROSPER integrates genome-wide association studies (GWAS) summary statistics from diverse populations to develop ancestry-specific PRS with improved predictive power for minority populations. The method uses a combination of L1 (lasso) and L2 (ridge) penalty functions, a parsimonious specification of the penalty parameters across populations, and an ensemble step to combine PRS generated across different penalty parameters. We evaluate the performance of PROSPER and other existing methods on large-scale simulated and real datasets, including those from 23andMe Inc., the Global Lipids Genetics Consortium, and All of Us. Results show that PROSPER can substantially improve multi-ancestry polygenic prediction compared to alternative methods across a wide variety of genetic architectures. In real data analyses, for example, PROSPER increased out-of-sample prediction R² for continuous traits by an average of 70% compared to a state-of-the-art Bayesian method (PRS-CSx) in the African ancestry population. Further, PROSPER is computationally highly scalable for the analysis of large SNP contents and many diverse populations.
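As a generic illustration of how lasso and ridge penalties combine, the sketch below solves the one-coefficient penalized problem in closed form. It is a textbook elastic-net-style update and is not PROSPER's actual objective, which specifies penalties across populations; the function names and numbers are hypothetical.

```python
def soft_threshold(z, lam):
    """Shrink z toward zero by lam, returning exactly zero when |z| <= lam."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def penalized_effect(z, lam_lasso, lam_ridge):
    """Minimizer of 0.5 * (b - z)**2 + lam_lasso * |b| + 0.5 * lam_ridge * b**2.

    The lasso term sets small effects exactly to zero (sparsity);
    the ridge term shrinks the surviving effects proportionally.
    """
    return soft_threshold(z, lam_lasso) / (1.0 + lam_ridge)


# Hypothetical unpenalized effect estimates for three SNPs.
large_positive = penalized_effect(1.0, lam_lasso=0.2, lam_ridge=1.0)
small_effect = penalized_effect(0.1, lam_lasso=0.2, lam_ridge=1.0)
large_negative = penalized_effect(-1.0, lam_lasso=0.2, lam_ridge=1.0)
print(large_positive, small_effect, large_negative)
```

Running such an update over a grid of penalty values yields a family of candidate scores, which is the kind of collection an ensemble step could then combine.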
Topics: Humans; Bayes Theorem; Genome-Wide Association Study; Multifactorial Inheritance; Population Health; Black People; Genetic Risk Score; Risk Factors
PubMed: 38622117
DOI: 10.1038/s41467-024-47357-7 -
Genome Biology and Evolution Apr 2024
Most traits are polygenic, and the contributing loci can be identified by genome-wide association studies. The genetic basis of adaptation (adaptive architecture) is, however, difficult to characterize. Here, we propose to study the adaptive architecture of traits by monitoring the evolution of their phenotypic variance during adaptation to a new environment in well-defined laboratory conditions. Extensive computer simulations show that the evolution of phenotypic variance in a replicated experimental evolution setting can distinguish between oligogenic and polygenic adaptive architectures. We compared gene expression variance in male Drosophila simulans before and after 100 generations of adaptation to a novel hot environment. The variance change in gene expression was indistinguishable for genes with and without a significant change in mean expression after 100 generations of evolution. We suggest that the majority of adaptive gene expression evolution can be explained by a polygenic architecture. We propose that tracking the evolution of phenotypic variance across generations can provide an approach to characterize the adaptive architecture.
Topics: Animals; Phenotype; Male; Multifactorial Inheritance; Adaptation, Physiological; Evolution, Molecular; Drosophila simulans; Drosophila; Biological Evolution; Computer Simulation
PubMed: 38620076
DOI: 10.1093/gbe/evae077 -
Diabetes & Metabolic Syndrome Apr 2024
AIMS
We evaluated whether incorporating information on ethnic background and polygenic risk enhanced the Leicester Risk Assessment (LRA) score for predicting 10-year risk of type 2 diabetes.
METHODS
The sample included 202,529 UK Biobank participants aged 40-69 years. We computed the LRA score, and developed two new risk scores using training data (80% sample): LRArev, which incorporated additional information on ethnic background, and LRAprs, which incorporated polygenic risk for type 2 diabetes. We assessed discriminative and reclassification performance in a test set (20% sample). Type 2 diabetes was ascertained using primary care, hospital inpatient and death registry records.
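Discriminative performance of a risk score on time-to-event data is often summarized with Harrell's C index. The following is a minimal quadratic-time sketch of the usual definition, ignoring tied event times; the function name and the toy data are hypothetical.

```python
def harrell_c_index(times, events, risk_scores):
    """Harrell's C: fraction of comparable pairs ranked correctly by the risk score.

    A pair (i, j) is comparable when subject i had the event and did so
    strictly before subject j's follow-up time. The pair is concordant when
    the earlier-event subject has the higher risk score; ties in score count 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # censored subjects cannot be the earlier-event member
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical follow-up times (years), event indicators, and predicted risks.
times = [2, 5, 7, 9]
events = [True, False, True, False]
risk_scores = [0.9, 0.4, 0.3, 0.5]

c_index = harrell_c_index(times, events, risk_scores)
print(f"C index = {c_index:.2f}")
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives context for reading reported C indexes.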
RESULTS
Over 10 years, 7,476 participants developed type 2 diabetes. The Harrell's C indexes were 0.796 (95% Confidence Interval [CI] 0.785, 0.806), 0.802 (95% CI 0.792, 0.813), and 0.829 (95% CI 0.820, 0.839) for the LRA, LRArev and LRAprs scores, respectively. The LRAprs score significantly improved the overall reclassification compared to the LRA (net reclassification index [NRI] = 0.033, 95% CI 0.015, 0.049) and LRArev (NRI = 0.040, 95% CI 0.024, 0.055) scores.
CONCLUSIONS
Polygenic risk moderately improved the performance of the existing LRA score for 10-year risk prediction of type 2 diabetes.
Topics: Humans; Diabetes Mellitus, Type 2; Middle Aged; Female; Male; Risk Assessment; Adult; Aged; Follow-Up Studies; Risk Factors; Prognosis; Multifactorial Inheritance; Genetic Predisposition to Disease
PubMed: 38608567
DOI: 10.1016/j.dsx.2024.102996 -
Cell Genomics Apr 2024
Polygenic risk scores (PRSs) are now showing promising predictive performance on a wide variety of complex traits and diseases, but there exists a substantial performance gap across populations. We propose MUSSEL, a method for ancestry-specific polygenic prediction that borrows information in summary statistics from genome-wide association studies (GWASs) across multiple ancestry groups via Bayesian hierarchical modeling and ensemble learning. In our simulation studies and data analyses across four distinct studies, totaling 5.7 million participants with a substantial ancestral diversity, MUSSEL shows promising performance compared to alternatives. For example, MUSSEL has an average gain in prediction R² across 11 continuous traits of 40.2% and 49.3% compared to PRS-CSx and CT-SLEB, respectively, in the African ancestry population. The best-performing method, however, varies by GWAS sample size, target ancestry, trait architecture, and linkage disequilibrium reference samples; thus, ultimately a combination of methods may be needed to generate the most robust PRSs across diverse populations.
Topics: Humans; Animals; Multifactorial Inheritance; Genome-Wide Association Study; Bayes Theorem; Phenotype; Genetic Risk Score; Bivalvia
PubMed: 38604127
DOI: 10.1016/j.xgen.2024.100539 -
Bioinformatics (Oxford, England) Mar 2024
MOTIVATION
Whole exome sequencing (WES) has emerged as a powerful tool for genetic research, enabling the collection of a tremendous amount of data about human genetic variation. However, properly identifying which variants are causative of a genetic disease remains an important challenge, often due to the number of variants that need to be screened. Expanding the screening to combinations of variants in two or more genes, as would be required under the oligogenic inheritance model, simply blows this problem out of proportion.
RESULTS
We present here the High-throughput oligogenic prioritizer (Hop), a novel prioritization method that uses direct oligogenic information at the variant, gene and gene pair level to detect digenic variant combinations in WES data. This method leverages information from a knowledge graph, together with specialized pathogenicity predictions in order to effectively rank variant combinations based on how likely they are to explain the patient's phenotype. The performance of Hop is evaluated in cross-validation on 36 120 synthetic exomes for training and 14 280 additional synthetic exomes for independent testing. Whereas the known pathogenic variant combinations are found in the top 20 in approximately 60% of the cross-validation exomes, 71% are found in the same ranking range when considering the independent set. These results provide a significant improvement over alternative approaches that depend simply on a monogenic assessment of pathogenicity, including early attempts for digenic ranking using monogenic pathogenicity scores.
AVAILABILITY AND IMPLEMENTATION
Hop is available at https://github.com/oligogenic/HOP.
Topics: Humans; Exome; Exome Sequencing; Genetic Variation; High-Throughput Nucleotide Sequencing; Computational Biology
PubMed: 38603604
DOI: 10.1093/bioinformatics/btae184