Annual Review of Vision Science, Sep 2020 (Review)
Keratoconus, a progressive corneal ectasia, is a complex disease with both genetic and environmental risk factors. The exact etiology is not known and is likely variable between individuals. Conditions such as hay fever and allergy are associated with increased risk, while diabetes may be protective. Behaviors such as eye rubbing are also implicated, but direct causality has not been proven. Genetics plays a major role in risk for some individuals, with many large pedigrees showing autosomal inheritance patterns. Several genes have been implicated using linkage and follow-up sequencing in these families. Genome-wide association studies for keratoconus and for quantitative traits such as central corneal thickness have identified several genetic loci that contribute to a cumulative risk for keratoconus, even in people without a family history of the disease. Identification of risk genes for keratoconus is improving our understanding of the biology of this complex disease.
Topics: Cornea; Disease Progression; Genome-Wide Association Study; Humans; Keratoconus; Pedigree; Risk Factors
PubMed: 32320633
DOI: 10.1146/annurev-vision-121219-081723

Molecular Ecology Resources, Feb 2022
In genomic-scale data sets, loci are closely packed within chromosomes and hence provide correlated information. Averaging across loci as if they were independent creates pseudoreplication, which reduces the effective degrees of freedom (df') compared to the nominal degrees of freedom, df. This issue has been known for some time, but consequences have not been systematically quantified across the entire genome. Here, we measured pseudoreplication (quantified by the ratio df'/df) for a common metric of genetic differentiation (F_ST) and a common measure of linkage disequilibrium between pairs of loci (r²). Based on data simulated using models (SLiM and msprime) that allow efficient forward-in-time and coalescent simulations while precisely controlling population pedigrees, we estimated df' and df'/df by measuring the rate of decline in the variance of mean F_ST and mean r² as more loci were used. For both indices, df' increases with N_e and genome size, as expected. However, even for large N_e and large genomes, df' for mean r² plateaus after a few thousand loci, and a variance components analysis indicates that the limiting factor is uncertainty associated with sampling individuals rather than genes. Pseudoreplication is less extreme for F_ST, but df'/df ≤ 0.01 can occur in data sets using tens of thousands of loci. Commonly used block-jackknife methods consistently overestimated var(F_ST), producing very conservative confidence intervals. Predicting df' based on our modelling results as a function of N_e, L, S, and genome size provides a robust way to quantify precision associated with genomic-scale data sets.
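The idea of reading an effective number of independent loci off the variance of a mean can be illustrated with a toy example. The sketch below is not the authors' estimator and does not use SLiM or msprime; it assumes a hypothetical block model in which loci within a block share a common random component, and it defines the effective number of loci as the per-locus variance divided by the variance of the mean across loci. All function names and parameters are illustrative.

```python
import random
import statistics

def simulate_replicate(n_loci, block_size, rng):
    """Per-locus values; loci in the same block share one random component."""
    values = []
    for start in range(0, n_loci, block_size):
        shared = rng.gauss(0.0, 1.0)  # component common to the whole block
        for _ in range(min(block_size, n_loci - start)):
            values.append(shared + rng.gauss(0.0, 1.0))  # plus locus-specific noise
    return values

def effective_ratio(n_loci, block_size, n_reps=2000, seed=1):
    """Estimate (effective number of loci) / (nominal number of loci)."""
    rng = random.Random(seed)
    means, single_locus = [], []
    for _ in range(n_reps):
        vals = simulate_replicate(n_loci, block_size, rng)
        means.append(statistics.fmean(vals))
        single_locus.append(vals[0])
    var_single = statistics.variance(single_locus)  # variance of one locus
    var_mean = statistics.variance(means)           # variance of the mean over loci
    n_eff = var_single / var_mean                   # loci "worth" of independent information
    return n_eff / n_loci

if __name__ == "__main__":
    print("independent loci:", round(effective_ratio(100, 1), 2))
    print("blocks of 10 loci:", round(effective_ratio(100, 10), 2))
```

Under this toy model the ratio is close to 1 when every locus is its own block and falls well below 1 once loci are correlated within blocks, which is the qualitative pattern the abstract describes as pseudoreplication.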
Topics: Genome Size; Genomics; Linkage Disequilibrium; Models, Genetic; Pedigree; Population Density
PubMed: 34351073
DOI: 10.1111/1755-0998.13482

Genetics in Medicine : Official Journal... Jun 2022
PURPOSE
This study aimed to provide comprehensive diagnostic and candidate analyses in a pediatric rare disease cohort through the Genomic Answers for Kids program.
METHODS
Extensive analyses of 960 families with suspected genetic disorders included short-read exome sequencing and short-read genome sequencing (srGS); PacBio HiFi long-read genome sequencing (HiFi-GS); variant calling for single nucleotide variants (SNVs), structural variants (SVs), and repeat variants; and machine-learning variant prioritization. Structured phenotypes, prioritized variants, and pedigrees were stored in the PhenoTips database, with data shared through controlled access via the database of Genotypes and Phenotypes.
RESULTS
Diagnostic rates ranged from 11% in patients with prior negative genetic testing to 34.5% in naive patients. Incorporating SVs from genome sequencing added up to 13% of new diagnoses in previously unsolved cases. HiFi-GS yielded an increased discovery rate, with >4-fold more rare coding SVs compared with srGS. Variants and genes of unknown significance remain the most common finding (58% of nondiagnostic cases).
CONCLUSION
Computational prioritization is efficient for diagnostic SNVs. Thorough identification of non-SNVs remains challenging and is partly mitigated using HiFi-GS sequencing. Importantly, community research is supported by sharing real-time data to accelerate gene validation and by providing HiFi variant (SNV/SV) resources from >1000 human alleles to facilitate implementation of new sequencing platforms for rare disease diagnoses.
Topics: Child; Genome; Genomics; High-Throughput Nucleotide Sequencing; Humans; Pedigree; Rare Diseases; Sequence Analysis, DNA
PubMed: 35305867
DOI: 10.1016/j.gim.2022.02.007

Trends in Genetics : TIG, Oct 2022 (Review)
Some rare genetic disorders, such as retinitis pigmentosa or Alport syndrome, are caused by the co-inheritance of DNA variants at two different genetic loci (digenic inheritance). To capture the effects of these disease-causing variants and their possible interactive effects, various statistical methods have been developed in human genetics. Analogous developments have taken place in machine learning, particularly in the area now called Big Data. In the past, these two areas grew independently, and they have started to converge only in recent years. We provide an overview of each of the two fields, paying special attention to machine learning methods for uncovering the combined effects of pairs of variants on human disease.
Topics: Humans; Inheritance Patterns; Machine Learning; Multifactorial Inheritance; Mutation; Pedigree
PubMed: 35581032
DOI: 10.1016/j.tig.2022.04.009

Fa Yi Xue Za Zhi, Jun 2023 (Review)
Kinship testing is widely needed in forensic science practice. This paper reviews the definitions of common concepts and summarizes the basic principles, advantages and disadvantages, and application scope of kinship analysis methods, including the identity by state (IBS) method, the likelihood ratio (LR) method, the method of moments (MoM), and the identity by descent (IBD) segment method. It also discusses current research hotspots: challenging kinship testing, complex kinship testing, forensic genetic genealogy analysis, and non-human biological samples.
Topics: DNA Fingerprinting; Forensic Genetics; Forensic Sciences; Pedigree; Humans
PubMed: 37517010
DOI: 10.12116/j.issn.1004-5619.2023.530208

American Journal of Human Genetics, Jan 2023
Pedigree analysis showed that a large proportion of Leber hereditary optic neuropathy (LHON) family members who carry a mitochondrial risk variant never lose vision. Mitochondrial haplotype appears to be a major factor influencing the risk of vision loss from LHON. Mitochondrial variants, including m.14484T>C and m.11778G>A, have been added to gene arrays, and thus many patients and research participants are tested for LHON mutations. Analysis of the UK Biobank and Australian cohort studies found more than 1 in 1,000 people in the general population carry either the m.14484T>C or the m.11778G>A LHON variant. None of the subset of carriers examined had visual acuity at 20/200 or worse, suggesting a very low penetrance of LHON. Haplogroup analysis of m.14484T>C carriers showed a high rate of haplogroup U subclades, previously shown to have low penetrance in pedigrees. Penetrance calculations of the general population are lower than pedigree calculations, most likely because of modifier genetic factors. This Matters Arising Response paper addresses the Watson et al. (2022) Matters Arising paper, published concurrently in The American Journal of Human Genetics.
Topics: Humans; Penetrance; DNA, Mitochondrial; Optic Atrophy, Hereditary, Leber; Australia; Mutation; Pedigree
PubMed: 36565701
DOI: 10.1016/j.ajhg.2022.11.014

Blood, Mar 2023
Familial aggregation of Hodgkin lymphoma (HL) has been demonstrated in large population studies, pointing to genetic predisposition to this hematological malignancy. To understand the genetic variants associated with the development of HL, we performed whole genome sequencing on 234 individuals with and without HL from 36 pedigrees that had 2 or more first-degree relatives with HL. Our pedigree selection criteria also required at least 1 affected individual aged <21 years, with the median age at diagnosis of 21.98 years (3-55 years). Family-based segregation analysis was performed for the identification of coding and noncoding variants using linkage and filtering approaches. Using our tiered variant prioritization algorithm, we identified 44 HL-risk variants in 28 pedigrees, of which 33 are coding and 11 are noncoding. The top 4 recurrent risk variants are a coding variant in KDR (rs56302315), a 5' untranslated region variant in KLHDC8B (rs387906223), a noncoding variant in an intron of PAX5 (rs147081110), and another noncoding variant in an intron of GATA3 (rs3824666). A newly identified splice variant in KDR (c.3849-2A>C) was observed for 1 pedigree, and high-confidence stop-gain variants affecting IRF7 (p.W238*) and EEF2KMT (p.K116*) were also observed. Multiple truncating variants in POLR1E were found in 3 independent pedigrees as well. Whereas KDR and KLHDC8B have previously been reported, PAX5, GATA3, IRF7, EEF2KMT, and POLR1E represent novel observations. Although there may be environmental factors influencing lymphomagenesis, we observed segregation of candidate germline variants likely to predispose to HL in most of the pedigrees studied.
Topics: Humans; Young Adult; Adult; Hodgkin Disease; Genetic Predisposition to Disease; Germ-Line Mutation; Codon, Nonsense; Whole Genome Sequencing; Pedigree; Cell Cycle Proteins
PubMed: 35977101
DOI: 10.1182/blood.2022016056

Journal of Mathematical Biology, May 2022
Our goal is to study the genetic composition of a population in which each individual has 2 parents, who contribute equally to the genome of their offspring. We use a biparental Moran model, which is characterized by its fixed number N of individuals. We fix an individual and consider the proportions of the genomes of all individuals living n time steps later, that come from this individual. When n goes to infinity, these proportions all converge almost surely towards the same random variable. When N then goes to infinity, this random variable multiplied by N (i.e. the stationary weight of any ancestor in the whole population) converges in law towards the mixture of a Dirac measure in 0 and an exponential law with parameter 1/2, and the weights of several given ancestors are independent. This gives an explicit formula for the limiting (deterministic) distribution of all ancestors' weights.
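The convergence of ancestor weights can be explored numerically. The sketch below is a minimal simulation under assumed update rules that the abstract does not spell out: at each step two distinct parents are drawn uniformly at random, one uniformly chosen individual is replaced by their offspring, and the offspring's share of the focal ancestor's genome is the average of its parents' shares. The function name and these rules are illustrative assumptions, not the paper's exact specification.

```python
import random

def ancestor_weights(n_individuals, n_steps, focal=0, seed=0):
    """Share of each living individual's genome that traces back to one focal ancestor."""
    rng = random.Random(seed)
    weights = [0.0] * n_individuals
    weights[focal] = 1.0  # at time 0 only the focal individual carries its own genome
    for _ in range(n_steps):
        # Assumed rule: two distinct parents drawn uniformly at random.
        mother, father = rng.sample(range(n_individuals), 2)
        child_weight = 0.5 * (weights[mother] + weights[father])
        # Assumed rule: the offspring replaces a uniformly chosen individual,
        # keeping the population size fixed at N.
        replaced = rng.randrange(n_individuals)
        weights[replaced] = child_weight
    return weights

if __name__ == "__main__":
    n = 20
    w = ancestor_weights(n, 20000)
    print("spread across individuals:", max(w) - min(w))
    print("N times the common weight:", n * w[0])
```

After enough steps every individual carries essentially the same share, matching the almost-sure convergence stated in the abstract. Repeating the run over many seeds gives samples of N times the common weight, which could be compared informally with the limiting mixture the abstract describes.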
Topics: Genetics, Population; Genome; Humans; Models, Genetic; Pedigree
PubMed: 35532838
DOI: 10.1007/s00285-022-01752-0

Journal of Thrombosis and Haemostasis :... Mar 2024
BACKGROUND
Most family studies on venous thromboembolism (VTE) have focused on first-degree relatives.
OBJECTIVES
We took a pedigree-based approach and examined the risk of VTE and cardiometabolic disorders in offspring from extended pedigrees according to the densities of VTE in pedigrees.
METHODS
From the Swedish population, we identified a total of 482 185 pedigrees containing a mean of 14.2 parents, aunts/uncles, grandparents, and cousins of a core full sibship that we termed the pedigree offspring (n = 751 060). We then derived 8 empirical classes of these pedigrees based on the density of cases of VTE. The risk was determined in offspring for VTE and cardiometabolic disorders as a function of VTE density in their pedigrees. Bonferroni correction for multiple comparisons was performed.
RESULTS
VTE was unevenly distributed in the population; the Gini coefficient was 0.59. Higher VTE density in pedigrees was associated in the offspring with a higher risk of different VTE manifestations (deep venous thrombosis, pulmonary embolism, pregnancy-related VTE, unusual thrombosis, and superficial thrombophlebitis), thrombophilia, and lower age of first VTE event. Moreover, VTE density in pedigrees was significantly associated in the offspring with obesity, diabetes, gout, varicose veins, and arterial embolism and thrombosis (excluding brain and heart). No significant associations were observed for retinal vein occlusion, hypercholesterolemia, hypertension, coronary heart disease, myocardial infarction, ischemic stroke, atrial fibrillation, heart failure, primary pulmonary hypertension, cerebral hemorrhage, aortic aneurysm, peripheral artery disease, and overall mortality.
CONCLUSION
Offspring of pedigrees with a high density of VTE are disadvantaged regarding VTE manifestations and certain cardiometabolic disorders.
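For readers unfamiliar with the Gini coefficient reported in the results, the sketch below computes it with the standard sorted-rank formula, G = 2 Σ i·x_(i) / (n Σ x) − (n + 1)/n, where x_(i) are the values in ascending order. The abstract does not state how the authors computed it over pedigrees, so the input here (a hypothetical list of VTE case counts per pedigree) is purely illustrative.

```python
def gini(values):
    """Gini coefficient of non-negative values (0 = perfectly even distribution)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        raise ValueError("need at least one value and a positive total")
    # Rank-weighted sum over the ascending-sorted values, with ranks starting at 1.
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1) / n

if __name__ == "__main__":
    # Hypothetical VTE case counts in eight pedigrees (illustrative data only).
    cases_per_pedigree = [0, 0, 0, 1, 1, 2, 3, 6]
    print(round(gini(cases_per_pedigree), 3))
```

A value of 0 means cases are spread evenly across pedigrees, and values approaching 1 mean cases are concentrated in a few pedigrees.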
Topics: Humans; Venous Thromboembolism; Pedigree; Risk Factors; Thrombophlebitis; Pulmonary Embolism
PubMed: 38072377
DOI: 10.1016/j.jtha.2023.11.024

BMC Bioinformatics, May 2021
BACKGROUND
Statistical geneticists employ simulation to estimate the power of proposed studies, test new analysis tools, and evaluate properties of causal models. Although there are existing trait simulators, there is ample room for modernization. For example, most phenotype simulators are limited to Gaussian traits or traits transformable to normality, while ignoring qualitative traits and realistic, non-normal trait distributions. Also, modern computer languages, such as Julia, that accommodate parallelization and cloud-based computing are now mainstream but rarely used in older applications. To meet the challenges of contemporary big studies, it is important for geneticists to adopt new computational tools.
RESULTS
We present TraitSimulation, an open-source Julia package that makes it trivial to quickly simulate phenotypes under a variety of genetic architectures. This package is integrated into our OpenMendel suite for easy downstream analyses. Julia was purpose-built for scientific programming and provides tremendous speed and memory efficiency, easy access to multi-CPU and GPU hardware, and to distributed and cloud-based parallelization. TraitSimulation is designed to encourage flexible trait simulation, including via the standard devices of applied statistics, generalized linear models (GLMs) and generalized linear mixed models (GLMMs). TraitSimulation also accommodates many study designs: unrelateds, sibships, pedigrees, or a mixture of all three. (Of course, for data with pedigrees or cryptic relationships, the simulation process must include the genetic dependencies among the individuals.) We consider an assortment of trait models and study designs to illustrate integrated simulation and analysis pipelines. Step-by-step instructions for these analyses are available in our electronic Jupyter notebooks on Github. These interactive notebooks are ideal for reproducible research.
CONCLUSION
The TraitSimulation package has three main advantages. (1) It leverages the computational efficiency and ease of use of Julia to provide extremely fast, straightforward simulation of even the most complex genetic models, including GLMs and GLMMs. (2) It can be operated entirely within, but is not limited to, the integrated analysis pipeline of OpenMendel. And finally (3), by allowing a wider range of more realistic phenotype models, TraitSimulation brings power calculations and diagnostic tools closer to what investigators might see in real-world analyses.
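To make the idea of GLM-based trait simulation concrete, here is a minimal sketch of a binary trait simulated through a logistic link for unrelated individuals. It is written in plain Python rather than Julia and does not use or imitate the TraitSimulation or OpenMendel API; the function names, the single-variant additive model, and the parameter values are assumptions chosen for illustration.

```python
import math
import random

def simulate_binary_trait(n, maf, intercept, effect, seed=0):
    """Simulate genotypes (0/1/2 minor alleles) and a binary trait via a logistic GLM."""
    rng = random.Random(seed)
    genotypes, traits = [], []
    for _ in range(n):
        # Two independent allele draws, i.e. Hardy-Weinberg proportions are assumed.
        g = sum(rng.random() < maf for _ in range(2))
        # Logistic link: log-odds of being a case are linear in the genotype.
        p = 1.0 / (1.0 + math.exp(-(intercept + effect * g)))
        genotypes.append(g)
        traits.append(1 if rng.random() < p else 0)
    return genotypes, traits

def case_rate_by_genotype(genotypes, traits):
    """Observed proportion of cases within each genotype class."""
    rates = {}
    for g in sorted(set(genotypes)):
        ys = [y for gi, y in zip(genotypes, traits) if gi == g]
        rates[g] = sum(ys) / len(ys)
    return rates

if __name__ == "__main__":
    geno, trait = simulate_binary_trait(20000, maf=0.3, intercept=-1.0, effect=1.0)
    print(case_rate_by_genotype(geno, trait))
```

As the abstract notes, this kind of independent sampling is only appropriate for unrelated individuals; pedigree data would also require modelling the genetic dependencies among relatives, which this sketch omits.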
Topics: Aged; Cloud Computing; Computer Simulation; Genetic Testing; Humans; Pedigree; Phenotype
PubMed: 33941078
DOI: 10.1186/s12859-021-04086-8