-
Bioinformatics (Oxford, England) Mar 2024
MOTIVATION
Whole exome sequencing (WES) has emerged as a powerful tool for genetic research, enabling the collection of a tremendous amount of data about human genetic variation. However, properly identifying which variants are causative of a genetic disease remains an important challenge, often due to the number of variants that need to be screened. Expanding the screening to combinations of variants in two or more genes, as would be required under the oligogenic inheritance model, greatly magnifies this problem.
RESULTS
We present here the High-throughput oligogenic prioritizer (Hop), a novel prioritization method that uses direct oligogenic information at the variant, gene and gene pair level to detect digenic variant combinations in WES data. This method leverages information from a knowledge graph, together with specialized pathogenicity predictions in order to effectively rank variant combinations based on how likely they are to explain the patient's phenotype. The performance of Hop is evaluated in cross-validation on 36 120 synthetic exomes for training and 14 280 additional synthetic exomes for independent testing. Whereas the known pathogenic variant combinations are found in the top 20 in approximately 60% of the cross-validation exomes, 71% are found in the same ranking range when considering the independent set. These results provide a significant improvement over alternative approaches that depend simply on a monogenic assessment of pathogenicity, including early attempts for digenic ranking using monogenic pathogenicity scores.
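The "found in the top 20" figures reported above are a top-k hit rate. A minimal sketch of how such a metric could be computed, assuming each synthetic exome contributes the 1-based rank of its known pathogenic combination; the function name `top_k_rate` and the example ranks are hypothetical and are not taken from the Hop code base.

```python
def top_k_rate(ranks, k=20):
    """Fraction of exomes whose known pathogenic combination ranks within the top k.

    ranks: 1-based rank of the known combination in each exome (hypothetical input).
    """
    if not ranks:
        raise ValueError("ranks must not be empty")
    hits = sum(1 for rank in ranks if rank <= k)
    return hits / len(ranks)


# Illustrative ranks only, not data from the study.
example_ranks = [1, 5, 20, 21, 100]
print(top_k_rate(example_ranks, k=20))
```

A rank of exactly k counts as a hit here; whether the study treats the boundary the same way is an assumption.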
AVAILABILITY AND IMPLEMENTATION
Hop is available at https://github.com/oligogenic/HOP.
Topics: Humans; Exome; Exome Sequencing; Genetic Variation; High-Throughput Nucleotide Sequencing; Computational Biology
PubMed: 38603604
DOI: 10.1093/bioinformatics/btae184
-
PLoS Biology Apr 2024
A central aim of genome-wide association studies (GWASs) is to estimate direct genetic effects: the causal effects on an individual's phenotype of the alleles that they carry. However, estimates of direct effects can be subject to genetic and environmental confounding and can also absorb the "indirect" genetic effects of relatives' genotypes. Recently, an important development in controlling for these confounds has been the use of within-family GWASs, which, because of the randomness of mendelian segregation within pedigrees, are often interpreted as producing unbiased estimates of direct effects. Here, we present a general theoretical analysis of the influence of confounding in standard population-based and within-family GWASs. We show that, contrary to common interpretation, family-based estimates of direct effects can be biased by genetic confounding. In humans, such biases will often be small per-locus, but can be compounded when effect-size estimates are used in polygenic scores (PGSs). We illustrate the influence of genetic confounding on population- and family-based estimates of direct effects using models of assortative mating, population stratification, and stabilizing selection on GWAS traits. We further show how family-based estimates of indirect genetic effects, based on comparisons of parentally transmitted and untransmitted alleles, can suffer substantial genetic confounding. We conclude that, while family-based studies have placed GWAS estimation on a more rigorous footing, they carry subtle issues of interpretation that arise from confounding.
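To make the within-family idea concrete, here is a minimal sketch of one common design, assumed purely for illustration since the abstract does not specify one: regress sibling phenotype differences on sibling genotype differences, so that anything shared by both siblings cancels. The function name and input layout are hypothetical. The abstract's point is that estimates of this kind can still be biased by genetic confounding; the toy data below contain none.

```python
def sibling_difference_effect(pairs):
    """Least-squares slope (through the origin) of phenotype difference on genotype difference.

    pairs: list of ((g1, y1), (g2, y2)), where g is an allele count and y a
    phenotype value for each sibling (hypothetical input layout).
    """
    numerator = 0.0
    denominator = 0.0
    for (g1, y1), (g2, y2) in pairs:
        dg = g1 - g2
        dy = y1 - y2
        numerator += dg * dy
        denominator += dg * dg
    if denominator == 0:
        raise ValueError("no sibling pair differs in genotype")
    return numerator / denominator


# Toy data: effect of 0.5 per allele plus a family-level offset
# (1.0 in the first family, 3.0 in the second) that cancels in the differences.
toy_pairs = [((0, 1.0), (1, 1.5)), ((2, 4.0), (0, 3.0))]
print(sibling_difference_effect(toy_pairs))
```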
Topics: Humans; Genome-Wide Association Study; Genotype; Phenotype; Multifactorial Inheritance; Alleles; Polymorphism, Single Nucleotide
PubMed: 38603516
DOI: 10.1371/journal.pbio.3002511
-
PLoS Computational Biology Apr 2024
Prostate cancer is a heritable disease with ancestry-biased incidence and mortality. Polygenic risk scores (PRSs) offer promising advancements in predicting disease risk, including prostate cancer. While their accuracy continues to improve, research aimed at enhancing their effectiveness within African and Asian populations remains key for equitable use. Recent algorithmic developments for PRS derivation have resulted in improved pan-ancestral risk prediction for several diseases. In this study, we benchmark the predictive power of six widely used PRS derivation algorithms, four of which adjust for ancestry, against prostate cancer cases and controls from the UK Biobank and All of Us cohorts. We find modest improvement in discriminatory ability when compared with a simple method that prioritizes variants, clumping, and published polygenic risk scores. Our findings underscore the importance of improving upon risk prediction algorithms and the sampling of diverse cohorts.
Topics: Humans; Prostatic Neoplasms; Male; Benchmarking; Genetic Predisposition to Disease; Algorithms; Multifactorial Inheritance; Cohort Studies; Risk Factors; Polymorphism, Single Nucleotide; Genome-Wide Association Study; Computational Biology; Risk Assessment; Case-Control Studies; Genetic Risk Score
PubMed: 38598551
DOI: 10.1371/journal.pcbi.1011990
-
Journal of Human Genetics Jul 2024
Populations that have experienced a bottleneck are regularly used in genome-wide association studies (GWAS) to investigate variants associated with complex traits. It is generally understood that these isolated sub-populations may carry otherwise rare variants of large effect size at high frequency, and therefore provide a unique opportunity to study such traits. However, the demographic history of the population under investigation affects all SNPs that determine the complex trait genome-wide, changing its heritability and genetic architecture. We use a simulation-based approach to identify the impact of the demographic processes of drift, expansion, and migration on the heritability of a complex trait. We show that demography has considerable impact on complex traits. We then investigate the power to resolve the heritability of complex traits in GWAS subjected to demographic effects. We find that demography is an important component for interpreting inference of complex traits and has a nuanced impact on the power of GWAS. We conclude that demographic histories need to be explicitly modelled to properly quantify the history of selection on a complex trait.
Topics: Humans; Genome-Wide Association Study; Multifactorial Inheritance; Polymorphism, Single Nucleotide; Models, Genetic; Genetics, Population; Quantitative Trait, Heritable; Computer Simulation; Phenotype; Selection, Genetic
PubMed: 38589509
DOI: 10.1038/s10038-024-01249-2
-
Journal of Dental Research May 2024
Caries is a partially heritable disease, raising the possibility that a polygenic score (PS, a summary of an individual's genetic propensity for disease) might be a useful tool for risk assessment. To date, PS for some diseases have shown clinical utility, although no PS for caries has been evaluated. The objective of the study was to test whether a PS for caries is associated with disease experience or increment in a cohort of Swedish adults. A genome-wide PS for caries was trained using the results of a published genome-wide association meta-analysis and constructed in an independent cohort of 15,460 Swedish adults. Electronic dental records from the Swedish Quality Registry for Caries and Periodontitis (SKaPa) were used to compute the decayed, missing, and filled tooth surfaces (DMFS) index and the number of remaining teeth. The performance of the PS was evaluated by testing the association between the PS and DMFS at a single dental examination, as well as between the PS and the rate of change in DMFS. Participants in the highest and lowest deciles of PS had a mean DMFS of 63.5 and 46.3, respectively. A regression analysis confirmed this association: a 1-standard-deviation increase in PS was associated with an approximately 4-unit higher DMFS ( < 2 × 10). Participants in the highest decile of PS also had a greater change in DMFS during follow-up. Results were robust to sensitivity analysis, which adjusted for age, age squared, sex, and the first 20 genetic principal components. Mediation analysis suggested that tooth loss was a strong mediating factor in the association between PS and DMFS but also supported a direct genetic effect on caries. In this cohort, there are clinically meaningful differences in DMFS between participants with high and low PS for caries. The results highlight the potential role of genomic data in improving caries risk assessment.
Topics: Humans; Sweden; Dental Caries; Male; Female; Aged; Multifactorial Inheritance; Genome-Wide Association Study; DMF Index; Risk Assessment; Middle Aged; Genetic Predisposition to Disease; Registries
PubMed: 38584306
DOI: 10.1177/00220345241232330
-
European Heart Journal May 2024
BACKGROUND AND AIMS
It is not clear how a polygenic risk score (PRS) can be best combined with guideline-recommended tools for cardiovascular disease (CVD) risk prediction, e.g. SCORE2.
METHODS
A PRS for coronary artery disease (CAD) was calculated in participants of UK Biobank (n = 432 981). Within each tenth of the PRS distribution, the odds ratios (ORs) for CVD (i.e. CAD or stroke), referred to as the PRS-factor, were compared between the entire population and subgroups representing the spectrum of clinical risk. Replication was performed in the combined Framingham/Atherosclerosis Risk in Communities (ARIC) populations (n = 10 757). The clinical suitability of a multiplicative model 'SCORE2 × PRS-factor' was tested by risk reclassification.
RESULTS
In subgroups with highly different clinical risks, CVD ORs were stable within each PRS tenth. SCORE2 and PRS showed no significant interactive effects on CVD risk, which qualified them as multiplicative factors: SCORE2 × PRS-factor = total risk. In UK Biobank, the multiplicative model moved 9.55% of the intermediate-risk group (n = 145 337) to the high-risk group, increasing the number of individuals in this category by 56.6%. Incident CVD occurred in 8.08% of individuals reclassified by the PRS-factor from intermediate to high risk, about twofold the rate among those who remained at intermediate risk (4.08%). Likewise, the PRS-factor shifted 8.29% of individuals from moderate to high risk in Framingham/ARIC.
CONCLUSIONS
This study demonstrates that absolute CVD risk, determined by a clinical risk score, and relative genetic risk, determined by a PRS, provide independent information. The two components may form a simple multiplicative model improving precision of guideline-recommended tools in predicting incident CVD.
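The multiplicative model stated above (SCORE2 × PRS-factor = total risk) can be sketched in a few lines. This is an illustration under stated assumptions, not the study's implementation: the function names, the cap at 1.0, and the single 10% high-risk cut-off are hypothetical, and guideline risk categories are not reproduced here.

```python
HIGH_RISK_THRESHOLD = 0.10  # hypothetical cut-off, for illustration only


def total_risk(score2_risk, prs_factor):
    """Absolute clinical risk (a fraction in [0, 1]) scaled by the relative PRS-factor."""
    if not 0.0 <= score2_risk <= 1.0:
        raise ValueError("score2_risk must be a fraction between 0 and 1")
    if prs_factor <= 0:
        raise ValueError("prs_factor must be positive")
    # Cap at 1.0 so the product remains a valid probability (an assumption).
    return min(score2_risk * prs_factor, 1.0)


def is_reclassified_to_high(score2_risk, prs_factor, threshold=HIGH_RISK_THRESHOLD):
    """True if the clinical score alone is below the threshold but the combined risk reaches it."""
    return score2_risk < threshold <= total_risk(score2_risk, prs_factor)


# Illustrative values only: an 8% clinical risk and a PRS-factor of 1.5.
print(total_risk(0.08, 1.5))
print(is_reclassified_to_high(0.08, 1.5))
```

With a PRS-factor of 1.0 the clinical estimate is unchanged, which matches the reading of the PRS as a purely relative modifier of absolute risk.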
Topics: Humans; Female; Male; Middle Aged; Risk Assessment; Cardiovascular Diseases; Practice Guidelines as Topic; Aged; United Kingdom; Coronary Artery Disease; Multifactorial Inheritance; Genetic Predisposition to Disease; Risk Factors; Adult
PubMed: 38551411
DOI: 10.1093/eurheartj/ehae048
-
Nature Genetics Apr 2024
Common genetic variants confer substantial risk for chronic lung diseases, including pulmonary fibrosis. Defining the genetic control of gene expression in a cell-type-specific and context-dependent manner is critical for understanding the mechanisms through which genetic variation influences complex traits and disease pathobiology. To this end, we performed single-cell RNA sequencing of lung tissue from 66 individuals with pulmonary fibrosis and 48 unaffected donors. Using a pseudobulk approach, we mapped expression quantitative trait loci (eQTLs) across 38 cell types, observing both shared and cell-type-specific regulatory effects. Furthermore, we identified disease interaction eQTLs and demonstrated that this class of associations is more likely to be cell-type-specific and linked to cellular dysregulation in pulmonary fibrosis. Finally, we connected lung disease risk variants to their regulatory targets in disease-relevant cell types. These results indicate that cellular context determines the impact of genetic variation on gene expression and implicates context-specific eQTLs as key regulators of lung homeostasis and disease.
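The pseudobulk approach mentioned above collapses single-cell counts into one expression profile per donor and cell type before eQTL mapping. A minimal sketch, assuming aggregation by summing raw counts (a common choice; the abstract does not say which aggregation the authors used). The function name, input layout, and labels are hypothetical.

```python
def pseudobulk(cells):
    """Sum per-gene counts over all cells sharing a (donor, cell_type) pair.

    cells: iterable of (donor, cell_type, counts), where counts is a list of
    per-gene counts in a fixed gene order (hypothetical input layout).
    """
    totals = {}
    for donor, cell_type, counts in cells:
        key = (donor, cell_type)
        if key not in totals:
            totals[key] = list(counts)
            continue
        if len(counts) != len(totals[key]):
            raise ValueError("all cells must report the same number of genes")
        totals[key] = [a + b for a, b in zip(totals[key], counts)]
    return totals


# Illustrative cells only: two cells from donor d1 and one from donor d2.
example_cells = [
    ("d1", "AT2", [1, 0, 2]),
    ("d1", "AT2", [0, 3, 1]),
    ("d2", "AT2", [5, 5, 5]),
]
print(pseudobulk(example_cells))
```

Each resulting profile can then be treated like a bulk sample, so genotype-expression association is tested across donors within each cell type.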
Topics: Humans; Quantitative Trait Loci; Pulmonary Fibrosis; Gene Expression Regulation; Lung; Multifactorial Inheritance; Genome-Wide Association Study; Polymorphism, Single Nucleotide
PubMed: 38548990
DOI: 10.1038/s41588-024-01702-0
-
Genes Feb 2024
Inherited cardiomyopathies represent a highly heterogeneous group of cardiac diseases. DNA variants in genes expressed in cardiomyocytes cause a diverse spectrum of cardiomyopathies, ultimately leading to heart failure, arrhythmias, and sudden cardiac death. We applied massively parallel DNA sequencing using a 72-gene panel for studying inherited cardiomyopathies. We report on variants in 25 families, where pathogenicity was predicted by different computational approaches, databases, and an in-house filtering analysis. All variants were validated using Sanger sequencing. Familial segregation was tested when possible. We identified 41 different variants in 26 genes. Analytically, we identified fifteen variants previously reported in the Human Gene Mutation Database: twelve mentioned as disease-causing mutations (DM) and three as probable disease-causing mutations (DM?). Additionally, we identified 26 novel variants. We classified the forty-one variants as follows: twenty-eight (68.3%) as variants of uncertain significance, eight (19.5%) as likely pathogenic, and five (12.2%) as pathogenic. We genetically characterized families with a cardiac phenotype. The genetic heterogeneity and the multiplicity of candidate variants make a definitive molecular diagnosis challenging, especially when there is a suspicion of incomplete penetrance or digenic-oligogenic inheritance. This is the first systematic study of inherited cardiac conditions in Cyprus, enabling us to develop a genetic baseline and precision cardiology.
Topics: Humans; Multifactorial Inheritance; Cyprus; Cardiomyopathies; Mutation; Sequence Analysis, DNA
PubMed: 38540378
DOI: 10.3390/genes15030319
-
BMC Musculoskeletal Disorders Mar 2024
BACKGROUND
Individuals with osteoarthritis present with comorbidities, and the potential causal associations remain incompletely elucidated. The present study undertook a large-scale investigation of the causality between osteoarthritis and a wide variety of traits, using summary-level data from genome-wide association studies (GWAS).
METHODS
The present study included summary-level GWAS data for knee osteoarthritis, hip osteoarthritis, hip or knee osteoarthritis, hand osteoarthritis, and 1355 other traits. Genetic correlation analysis was conducted between osteoarthritis and other traits through cross-trait bivariate linkage disequilibrium score regression. Subsequently, latent causal variable analysis was performed to explore the causal association when there was a significant genetic correlation. Genetic correlation and latent causal variable analysis were conducted on the Complex Traits Genomics Virtual Lab platform (https://vl.genoma.io/).
RESULTS
We found 133 unique phenotypes showing causal relationships with osteoarthritis. Our results confirmed several well-established risk factors of osteoarthritis, such as obesity, weight, BMI, and meniscus derangement. Additionally, our findings suggested putative causal links between osteoarthritis and multiple factors. Socioeconomic determinants such as occupational exposure to dust and diesel exhaust, extended work hours exceeding 40 per week, and unemployment status were implicated. Furthermore, our analysis revealed causal associations with cardiovascular and metabolic disorders, including heart failure, deep venous thrombosis, type 2 diabetes mellitus, and elevated cholesterol levels. Soft tissue and musculoskeletal disorders, such as hallux valgus, internal derangement of the knee, and spondylitis, were also identified as causally related to osteoarthritis. The study also identified putative causal associations of osteoarthritis with digestive and respiratory diseases, such as Barrett's esophagus, esophagitis, and asthma, as well as psychiatric conditions including panic attacks and manic or hyperactive episodes. Additionally, we observed that osteoarthritis was causally related to pharmacological treatments, such as the use of antihypertensive medications, anti-asthmatic drugs, and antidepressants.
CONCLUSION
Our study uncovered a wide range of traits causally associated with osteoarthritis. Further studies are needed to validate and illustrate the detailed mechanism of those causal associations.
Topics: Humans; Diabetes Mellitus, Type 2; Osteoarthritis, Knee; Osteoarthritis, Hip; Genome-Wide Association Study; Multifactorial Inheritance; Polymorphism, Single Nucleotide
PubMed: 38532343
DOI: 10.1186/s12891-024-07360-x
-
Genetics in Medicine : Official Journal... Mar 2024
PURPOSE
DISP1 encodes a transmembrane protein that regulates the secretion of the morphogen, Sonic hedgehog, a deficiency of which is a major cause of holoprosencephaly (HPE). This disorder covers a spectrum of brain and midline craniofacial malformations. The objective of the present study was to better delineate the clinical phenotypes associated with division transporter dispatched-1 (DISP1) variants.
METHODS
This study was based on the identification of at least 1 pathogenic variant of the DISP1 gene in individuals for whom detailed clinical data were available.
RESULTS
A total of 23 DISP1 variants were identified in heterozygous, compound heterozygous or homozygous states in 25 individuals with midline craniofacial defects. Most cases were minor forms of HPE, with craniofacial features such as orofacial cleft, solitary median maxillary central incisor, and congenital nasal pyriform aperture stenosis. These individuals had either monoallelic loss-of-function variants or biallelic missense variants in DISP1. In individuals with severe HPE, the DISP1 variants were commonly found associated with a variant in another HPE-linked gene (ie, oligogenic inheritance).
CONCLUSION
The genetic findings we have acquired demonstrate a significant involvement of DISP1 variants in the phenotypic spectrum of midline defects. This underlines its importance as a crucial element in the efficient secretion of Sonic hedgehog. We also demonstrated that the very rare solitary median maxillary central incisor and congenital nasal pyriform aperture stenosis combination is part of the DISP1-related phenotype. The present study highlights the clinical risks to be flagged up during genetic counseling after the discovery of a pathogenic DISP1 variant.
PubMed: 38529886
DOI: 10.1016/j.gim.2024.101126