-
Bioinformatics (Oxford, England) Jun 2016
MOTIVATION
Read-based phasing deduces the haplotypes of an individual from sequencing reads that cover multiple variants, while genetic phasing takes only genotypes as input and applies the rules of Mendelian inheritance to infer haplotypes within a pedigree of individuals. Combining both into an approach that uses these two independent sources of information, reads and pedigree, has the potential to deliver better results than either source individually.
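The genetic-phasing half of this idea can be illustrated for a single biallelic site in a mother-father-child trio. The sketch below is a minimal illustration of Mendelian phasing only, not the fixed-parameter algorithm of the paper; the function name `phase_child` and the 0/1 allele encoding are assumptions made for this example.

```python
from typing import Optional, Tuple

# Unordered pair of alleles at one site: 0 = reference, 1 = alternate (assumed encoding).
Genotype = Tuple[int, int]


def phase_child(child: Genotype, mother: Genotype, father: Genotype) -> Optional[Tuple[int, int]]:
    """Return (maternal_allele, paternal_allele) for the child, or None when
    Mendelian rules alone cannot decide (ambiguous or inconsistent trio)."""
    # Enumerate every way the child could inherit one allele from each parent
    # and keep the assignments that reproduce the child's observed genotype.
    candidates = {
        (m, f)
        for m in mother
        for f in father
        if sorted((m, f)) == sorted(child)
    }
    # Exactly one consistent assignment means the phase is determined.
    return next(iter(candidates)) if len(candidates) == 1 else None


# Mother is homozygous reference, so the child's alternate allele came from the father.
print(phase_child((0, 1), (0, 0), (0, 1)))
# All three individuals are heterozygous: genotypes alone leave the phase open.
print(phase_child((0, 1), (0, 1), (0, 1)))
```

Sites where every trio member is heterozygous are the ones that genotypes alone cannot resolve, which is where reads spanning neighboring variants can supply the missing information.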
RESULTS
We provide a theoretical framework combining read-based phasing with genetic haplotyping, and describe a fixed-parameter algorithm and its implementation for finding an optimal solution. We show that leveraging reads of related individuals jointly in this way yields more phased variants and at a higher accuracy than when phased separately, both in simulated and real data. Coverages as low as 2× for each member of a trio yield haplotypes that are as accurate as when analyzed separately at 15× coverage per individual.
AVAILABILITY AND IMPLEMENTATION
https://bitbucket.org/whatshap/whatshap
Topics: Algorithms; Genotype; Haplotypes; Pedigree; Polymorphism, Single Nucleotide; Sequence Analysis, DNA
PubMed: 27307622
DOI: 10.1093/bioinformatics/btw276 -
Yi Chuan = Hereditas Apr 2021
The accuracy of genetic evaluations across herds is affected by the degree of genetic connectedness among herds. In this study, we explored the application of high-density SNP markers in the assessment of genetic connectedness by comparing the genetic connectedness based on pedigree data and genomic data. Six methods, namely PEVD (prediction error variance of differences between estimated breeding values), PEVD (x), VED (variance of estimated difference between the herd effects), CD (generalized coefficient of determination), r (prediction error correlation) and CR (connectedness rating), were implemented to measure the genetic connectedness based on different relationship matrices (A, G, G, G and H). Our results from both simulated data and SNP chip data indicated that, except for the PEVD (x) and VED methods, the genetic connectedness obtained by PEVD, CD, r and CR based on the G, G and G matrices (using genome information only) was superior to that based on the A matrix (using pedigree information only). For most approaches, the genetic connectedness based on the H matrix (using both pedigree and genome information) fell between the values based on the A matrix and the G matrices. CD could overestimate the degree of genetic connectedness, as it remained very high when CR and r were close to 0. The method r could not accurately reflect the true genetic connectedness of the populations: it yielded a genetic connectedness of 0.01 for all three pig breeding farms, which were in fact genetically different from each other. As heritability increased, the degree of genetic connectedness obtained by all methods increased as well. However, at a heritability of 0.1, PEVD based on the A matrix performed better than PEVD based on the G matrix, suggesting that traits with medium and high heritability are more suitable than traits with low heritability for the assessment of genetic connectedness.
Our findings indicated that high-density SNP markers have advantages over pedigree analysis for the measurement of genetic connectedness, and that CR is a robust and reliable method to assess genetic connectedness. Further, CR is easily calculated and less affected by the heritability of the trait. PEVD is a good supplement for quantifying the prediction errors of estimated breeding values under a specific degree of genetic connectedness. In comparison, the G matrix can reflect genetic connectedness better than its extensions, the G and G matrices.
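Two of the measures named above have simple pairwise forms that can be computed directly from a prediction error (co)variance matrix. The sketch below uses the textbook definitions, PEVD as the variance of a difference and r as a correlation; the matrix values are hypothetical, and the herd-level summarization used in the study is not reproduced here.

```python
import math
from typing import List

Matrix = List[List[float]]


def pevd(pev: Matrix, i: int, j: int) -> float:
    """Prediction error variance of the difference between effects i and j:
    PEV_ii + PEV_jj - 2 * PEC_ij."""
    return pev[i][i] + pev[j][j] - 2.0 * pev[i][j]


def prediction_error_correlation(pev: Matrix, i: int, j: int) -> float:
    """Prediction error correlation r_ij = PEC_ij / sqrt(PEV_ii * PEV_jj)."""
    return pev[i][j] / math.sqrt(pev[i][i] * pev[j][j])


# Hypothetical prediction error (co)variance matrix for two effects.
pev_matrix = [
    [0.5, 0.1],
    [0.1, 0.4],
]

print(pevd(pev_matrix, 0, 1))
print(prediction_error_correlation(pev_matrix, 0, 1))
```

A larger prediction error covariance lowers PEVD and raises r, so both statistics move toward "better connected" as the off-diagonal term grows.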
Topics: Animals; Genome; Genomics; Genotype; Models, Genetic; Pedigree; Phenotype; Polymorphism, Single Nucleotide; Swine
PubMed: 33972208
DOI: 10.16288/j.yczz.20-351 -
American Journal of Human Genetics Jan 2021
The proportion of samples with one or more close relatives in a genetic dataset increases rapidly with sample size, necessitating relatedness modeling and enabling pedigree-based analyses. Despite this, relatives are generally unreported and current inference methods typically detect only the degree of relatedness of sample pairs and not pedigree relationships. We developed CREST, an accurate and fast method that identifies the pedigree relationships of close relatives. CREST utilizes identity by descent (IBD) segments shared between a pair of samples and their mutual relatives, leveraging the fact that sharing rates among these individuals differ across pedigree configurations. Furthermore, CREST exploits the profound differences in sex-specific genetic maps to classify pairs as maternally or paternally related (e.g., paternal half-siblings) using the locations of autosomal IBD segments shared between the pair. In simulated data, CREST correctly classifies 91.5%-100% of grandparent-grandchild (GP) pairs, 80.0%-97.5% of avuncular (AV) pairs, and 75.5%-98.5% of half-sibling (HS) pairs, compared to PADRE's rates of 38.5%-76.0% of GP, 60.5%-92.0% of AV, and 73.0%-95.0% of HS pairs. Turning to the real 20,032-sample Generation Scotland (GS) dataset, CREST identified seven pedigrees with incorrect relationship types or maternal/paternal parent sexes, five of which we confirmed as mistakes, and two with uncertain relationships. After correcting these, CREST correctly determines relationship types for 93.5% of GP, 97.7% of AV, and 92.2% of HS pairs that have sufficient mutual relative data, and the parent sex in 100% of HS and 99.6% of GP pairs; it completes this analysis in 2.8 h, including IBD detection, using eight threads.
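Why degree-only inference cannot separate GP, AV and HS pairs can be seen from their expected kinship coefficients, which are identical. The sketch below is not CREST; it applies the standard recursive kinship calculation to a small hypothetical pedigree (the individual labels and the `PEDIGREE` dictionary are assumptions for this example).

```python
from functools import lru_cache

# Hypothetical pedigree: individual -> (father, mother); founders map to (None, None).
PEDIGREE = {
    "GF": (None, None), "GM": (None, None),  # grandparents (founders)
    "S1": (None, None), "S2": (None, None),  # unrelated spouses of P (founders)
    "P": ("GF", "GM"),                       # parent
    "U": ("GF", "GM"),                       # full sibling of P
    "C": ("P", "S1"),                        # child of P with S1
    "H": ("P", "S2"),                        # child of P with S2
}


def depth(ind: str) -> int:
    """Generation depth: 0 for founders, else 1 + the deepest known parent."""
    parents = [p for p in PEDIGREE[ind] if p is not None]
    return 0 if not parents else 1 + max(depth(p) for p in parents)


@lru_cache(maxsize=None)
def kinship(a, b) -> float:
    """Expected kinship coefficient, assuming founders are unrelated and non-inbred."""
    if a is None or b is None:
        return 0.0
    if a == b:
        father, mother = PEDIGREE[a]
        return 0.5 * (1.0 + kinship(father, mother))
    # Expand the deeper individual so an ancestor of the other is never expanded.
    if depth(a) < depth(b):
        a, b = b, a
    father, mother = PEDIGREE[a]
    if father is None and mother is None:
        return 0.0  # two distinct founders
    return 0.5 * (kinship(father, b) + kinship(mother, b))


print(kinship("GF", "C"))  # grandparent-grandchild
print(kinship("U", "C"))   # avuncular
print(kinship("C", "H"))   # half-siblings
```

All three pairs come out with the same expected kinship, so a pairwise relatedness estimate alone cannot tell them apart; additional signals such as sharing with mutual relatives are needed, as the abstract describes.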
Topics: Female; Genetic Linkage; Genome, Human; Genotype; Humans; Male; Models, Genetic; Pedigree; Scotland
PubMed: 33385324
DOI: 10.1016/j.ajhg.2020.12.004 -
Human Mutation May 2022
Clinical genetic sequencing tests often identify variants of uncertain significance. One source of data that can help classify the pathogenicity of variants is familial cosegregation analysis. Identifying and genotyping relatives for cosegregation analysis can be time-consuming and costly. We propose an algorithm that describes a single measure of expected variant information gain from genotyping a single additional relative in a family. We then explore the performance of this algorithm by comparing the actual recruitment strategies used in 35 families who had pursued cosegregation analysis with synthetic pedigrees of possible testing outcomes had the families pursued an optimized testing strategy instead. For each actual and synthetic pedigree, we calculated the likelihood ratio of pathogenicity as each successive test was added to the pedigree. We analyzed the differences in cosegregation likelihood ratio over time resulting from actual versus optimized testing approaches. Employing the testing strategy indicated by the algorithm would have led to maximal information more rapidly in 30 of the 35 pedigrees (86%). Many clinical and research laboratories are involved in targeted cosegregation analysis. The algorithm we present can facilitate a data-driven approach to optimal relative recruitment and genotyping for cosegregation analysis and more efficient variant classification.
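How a cosegregation likelihood ratio can grow as successive relatives are tested can be sketched with a deliberately simplified model. The code below is not the algorithm proposed in the paper: it assumes a rare, fully penetrant dominant variant with no phenocopies, so that each informative meiosis in which the variant tracks with disease doubles the likelihood ratio. The function names and the per-test meiosis counts are hypothetical.

```python
from itertools import accumulate
from typing import Iterable, List


def simplified_coseg_lr(informative_meioses: int) -> float:
    """Likelihood ratio under the simplifying assumptions above:
    chance cosegregation has probability (1/2)**m, so LR = 2**m."""
    if informative_meioses < 0:
        raise ValueError("number of meioses must be non-negative")
    return 2.0 ** informative_meioses


def lr_trajectory(meioses_per_test: Iterable[int]) -> List[float]:
    """Cumulative likelihood ratio after each successive concordant test."""
    return [simplified_coseg_lr(total) for total in accumulate(meioses_per_test)]


# Hypothetical family: three relatives tested in order, adding 1, 2 and 1
# informative meioses respectively.
print(lr_trajectory([1, 2, 1]))
```

Under this toy model, testing a more distant affected relative early adds more meioses per test and raises the likelihood ratio faster, which is the kind of ordering effect an optimized recruitment strategy aims to exploit.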
Topics: Algorithms; Genetic Testing; Genetic Variation; Humans; Pedigree
PubMed: 35225377
DOI: 10.1002/humu.24363 -
Molecular Ecology Jan 2022
Over the past 50 years conservation genetics has developed a substantive toolbox to inform species management. One of the most long-standing tools available to manage genetics, the pedigree, has been widely used to characterize diversity and maximize evolutionary potential in threatened populations. Now, with the ability to use high-throughput sequencing to estimate relatedness, inbreeding, and genome-wide functional diversity, some have asked whether it is warranted for conservation biologists to continue collecting and collating pedigrees for species management. In this perspective, we argue that pedigrees remain a relevant tool and, when combined with genomic data, create an invaluable resource for conservation genomic management. Genomic data can address pedigree pitfalls (e.g., founder relatedness, missing data, uncertainty), and in return robust pedigrees allow for more nuanced research design, including well-informed sampling strategies and quantitative analyses (e.g., heritability, linkage) to better inform genomic inquiry. We further contend that building and maintaining pedigrees provides an opportunity to strengthen trusted relationships among conservation researchers, practitioners, Indigenous Peoples, and Local Communities.
Topics: Conservation of Natural Resources; Genetics, Population; Genome; Genomics; Inbreeding; Pedigree
PubMed: 34553796
DOI: 10.1111/mec.16192 -
American Journal of Medical Genetics.... Nov 2022
Topics: Humans; Pedigree; Exome Sequencing
PubMed: 36209347
DOI: 10.1002/ajmg.a.62935 -
Indian Journal of Dermatology,... 2022
Topics: Humans; Keratoderma, Palmoplantar; Pedigree
PubMed: 33871192
DOI: 10.25259/IJDVL_759_20 -
The Journal of Headache and Pain Dec 2017
Review
INTRODUCTION
Migraine has long been known as a common complex disease caused by genetic and environmental factors. The pathophysiology and the specific genetic susceptibility are poorly understood. Common variants explain only a small part of the heritability of migraine. It is thought that rare genetic variants with larger effect sizes may be involved in the disease. Since migraine tends to cluster in families, a family approach might be the way to find these variants. This is also indicated by the identification of migraine-associated loci in classical linkage analyses of migraine families. A single migraine study using a candidate-gene approach, performed in 2010, identified a rare mutation in the TRESK potassium channel segregating in a large family with migraine with aura, but this finding has since been questioned. Next-generation sequencing (NGS) technologies now provide an affordable tool to investigate the genetic variation in the entire exome or genome. The family-based study design using NGS is described in this paper. We also review family studies using NGS that have been successful in finding rare variants in other common complex diseases, in order to argue for the promising application of a family approach to migraine.
METHOD
PubMed was searched to find studies that looked for rare genetic variants in common complex diseases through a family-based design using NGS, excluding studies looking for de novo mutations, studies using a candidate-gene approach, and studies on cancer. All issues of Nature Genetics and PLOS Genetics from 2014, 2015 and 2016 (until June) were screened for relevant papers. Reference lists from included and other relevant papers were also searched. For the description of the family-based study design using NGS, an in-house protocol was used.
RESULTS
Thirty-two successful studies, which covered 16 different common complex diseases, were included in this paper. We also found a single migraine study. Twenty-three studies found one or a few family-specific variants (fewer than five), while the other studies found several possible variants. Not all of them were genome-wide significant. Four studies performed follow-up analyses in unrelated cases and controls and calculated odds ratios that supported an association between detected variants and risk of disease. Studies of 11 diseases identified rare variants that segregated fully or to a large degree with the disease in the pedigrees.
CONCLUSION
It is possible to find rare high-risk variants for common complex diseases through a family-based approach. One study using a family approach and NGS to find rare variants in migraine has already been published, but with strong limitations. More studies are under way.
Topics: Genetic Predisposition to Disease; Humans; Migraine Disorders; Pedigree
PubMed: 28255817
DOI: 10.1186/s10194-017-0729-y -
Journal of Equine Veterinary Science May 2021
Review
The sheer diversity of heritable physiological traits, and the ingenuity of genome-derived research technologies, extends the study of genetics to impact diverse scientific fields. Equine science is no exception, experiencing a number of genome-enabled discoveries that spur further research in areas like nutrition, reproduction, and exercise physiology. Yet unexpected findings, especially those that overturn commonly held beliefs in the horse industry, can create challenges in outreach, education and communication with stakeholders. For example, studies of ancient DNA revealed that the oldest domesticated equids in the archeological record were in fact another species, the Przewalski's horse, leaving the origins of our modern horses a mystery yet to be solved. Genomic analysis of ancestry can illuminate relationships older than our prized pedigree records and, in some cases, identify unexpected inconsistencies in those pedigrees. Even our interpretation of what constitutes a genetic disease is changing, as we re-examine common disease alleles, how these alleles impact equine physiology, and how they are perceived by breeders and professionals in the industry. Effectively translating genetic tools for use in horse management, and preparing our community for the debate surrounding ethical questions that may arise from genomic studies, may be the next great challenges we face as scientists and educators.
Topics: Animals; DNA, Ancient; Genome; Genomics; Horses; Pedigree
PubMed: 34030792
DOI: 10.1016/j.jevs.2021.103456 -
Journal of Neurosurgery Jul 2023
OBJECTIVE
Inherited variants predisposing patients to type 1 or 1.5 Chiari malformation (CM) have been hypothesized but have proven difficult to confirm. The authors used a unique high-risk pedigree population resource and approach to identify rare candidate variants that likely predispose individuals to CM, and used protein structure prediction tools to identify pathogenicity mechanisms.
METHODS
By using the Utah Population Database, the authors identified pedigrees with significantly increased numbers of members with CM diagnosis. From a separate DNA biorepository of 451 samples from CM patients and families, 32 CM patients belonging to 1 or more of 24 high-risk Chiari pedigrees were identified. Two high-risk pedigrees had 3 CM-affected relatives, and 22 pedigrees had 2 CM-affected relatives. To identify rare candidate predisposition gene variants, whole-exome sequence data from these 32 CM patients belonging to 24 CM-affected related pairs from high-risk pedigrees were analyzed. The I-TASSER package for protein structure prediction was used to predict the structures of both the wild-type and mutant proteins found here.
RESULTS
Sequence analysis of the 24 affected relative pairs identified 38 rare candidate Chiari predisposition gene variants that were shared by at least 1 CM-affected pair from a high-risk pedigree. The authors found a candidate variant in HOXC4 that was shared by 2 CM-affected patients in 2 independent pedigrees. All 4 of these CM cases, 2 in each pedigree, exhibited a specific craniocervical bony phenotype defined by a clivoaxial angle less than 125°. The protein structure prediction results suggested that the mutation considered here may reduce the binding affinity of HOXC4 to DNA.
CONCLUSIONS
Analysis of unique and powerful Utah genetic resources allowed identification of 38 strong candidate CM predisposition gene variants. These variants should be pursued in independent populations. One of the candidates, a rare HOXC4 variant, was identified in 2 high-risk CM pedigrees, with this variant possibly predisposing patients to a Chiari phenotype with craniocervical kyphosis.
Topics: Humans; Genetic Predisposition to Disease; Genotype; Homeodomain Proteins; Mutation; Pedigree; Phenotype; Risk Factors; Brain
PubMed: 36433874
DOI: 10.3171/2022.10.JNS22956