-
PloS One 2021
Most traits in livestock, crops and humans are polygenic; that is, a large number of loci contribute to genetic variation. Effects at these loci lie along a continuum ranging from common low-effect to rare high-effect variants that cumulatively contribute to the overall phenotype. Statistical methods to calculate the effect of these loci have been developed and can be used to predict phenotypes in new individuals. In agriculture, these methods are used to select superior individuals using genomic breeding values; in humans, they are used to quantify an individual's disease risk through polygenic risk scores. Both fields typically use SNP array genotypes for the analysis. Recently, genotyping-by-sequencing has become popular due to lower cost and greater genome coverage (including structural variants). Oxford Nanopore Technologies' (ONT) portable sequencers have the potential to combine the benefits of genotyping-by-sequencing with portability and decreased turn-around time. This introduces the potential for in-house clinical genetic disease risk screening in humans or calculating genomic breeding values on-farm in agriculture. Here we demonstrate the potential of the latter by calculating genomic breeding values for four traits in cattle using low-coverage ONT sequence data and comparing these breeding values to breeding values calculated from SNP arrays. At sequencing coverages between 2X and 4X, the correlation between ONT breeding values and SNP array-based breeding values was > 0.92 when imputation was used and > 0.88 when no imputation was used. With an average sequencing coverage of 0.5X, the correlation between the two methods was between 0.85 and 0.92 using imputation, depending on the trait. This suggests that ONT sequencing has potential for in-clinic or on-farm genomic prediction; however, further work is needed to validate these findings in a larger population.
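The comparison described in this abstract can be illustrated with a minimal sketch. It assumes a purely additive model in which a breeding value is the sum of allele dosages multiplied by pre-estimated marker effects; the abstract does not state the prediction model, and the effects, genotypes and dosages below are hypothetical toy values, not data from the study.

```python
from math import sqrt


def breeding_value(dosages, effects):
    """Additive breeding value: sum of allele dosage times marker effect.

    Dosages are 0, 1 or 2 for array genotypes and may be fractional
    when they come from low-coverage sequence data or imputation.
    """
    if len(dosages) != len(effects):
        raise ValueError("dosages and effects must have the same length")
    return sum(d * e for d, e in zip(dosages, effects))


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical marker effects and genotypes for four animals.
effects = [0.5, -0.2, 0.1, 0.3]
array_genotypes = [[0, 1, 2, 1], [2, 0, 1, 0], [1, 1, 0, 2], [0, 2, 2, 1]]
sequence_dosages = [
    [0.1, 0.9, 1.8, 1.0],
    [1.9, 0.2, 1.0, 0.1],
    [1.0, 1.2, 0.0, 1.7],
    [0.0, 1.8, 2.0, 1.2],
]

array_gebv = [breeding_value(g, effects) for g in array_genotypes]
sequence_gebv = [breeding_value(d, effects) for d in sequence_dosages]
r = pearson(array_gebv, sequence_gebv)
print(f"correlation between array and sequence breeding values: {r:.3f}")
```

The reported correlations (for example > 0.92 at 2X to 4X with imputation) are values of this kind of statistic, computed across animals for each trait.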
Topics: Animals; Cattle; Genome; Genomics; Genotype; Genotyping Techniques; High-Throughput Nucleotide Sequencing; Livestock; Nanopore Sequencing; Phenotype; Polymorphism, Single Nucleotide; Sequence Analysis, DNA
PubMed: 34910782
DOI: 10.1371/journal.pone.0261274 -
Molecular Cell May 2022
Review
Next-generation sequencing techniques have led to a new quantitative dimension in the biological sciences. In particular, integrating sequencing techniques with biophysical tools allows sequence-dependent mechanistic studies. Using the millions of DNA clusters that are generated during sequencing to perform high-throughput binding affinity and kinetics measurements enabled the construction of energy landscapes in sequence space, uncovering relationships between sequence, structure, and function. Here, we review the approaches to perform ensemble fluorescence experiments on next-generation sequencing chips for variations of DNA, RNA, and protein sequences. As the next step, we anticipate that these fluorescence experiments will be pushed to the single-molecule level, which can directly uncover kinetics and molecular heterogeneity in an unprecedented high-throughput fashion. Molecular biophysics in sequence space, both at the ensemble and single-molecule level, leads to new mechanistic insights. The wide spectrum of applications in biology and medicine ranges from the fundamental understanding of evolutionary pathways to the development of new therapeutics.
Topics: Biophysics; DNA; High-Throughput Nucleotide Sequencing; Molecular Biology; Sequence Analysis, DNA
PubMed: 35561688
DOI: 10.1016/j.molcel.2022.04.024 -
Methods in Molecular Biology (Clifton,... 2022
Geneticists approach biology with a simple question: which genes are required for the pathway or process of interest? Classical genetic screens (aka forward genetics) in model organisms such as Caenorhabditis elegans have been the method of choice for answering that question. Next-generation sequencing provides the means to generate a comprehensive list of sequence variants, including the mutation of interest. Herein is described a workflow for sample preparation and data analysis to allow the simultaneous mapping and identification of candidate mutations by whole-genome sequencing in Caenorhabditis elegans.
Topics: Animals; Caenorhabditis elegans; DNA Mutational Analysis; High-Throughput Nucleotide Sequencing; Mutation; Whole Genome Sequencing
PubMed: 35320569
DOI: 10.1007/978-1-0716-2181-3_13 -
Current Oncology (Toronto, Ont.) Apr 2024
Review
Myelodysplastic neoplasm (MDS) is a heterogeneous group of clonal hematological disorders that originate from the hematopoietic and progenitor cells and present with cytopenias and morphologic dysplasia with a propensity to progress to bone marrow failure or acute myeloid leukemia (AML). Genetic evolution plays a critical role in the pathogenesis, progression, and clinical outcomes of MDS. This process involves the acquisition of genetic mutations in stem cells that confer a selective growth advantage, leading to clonal expansion and the eventual development of MDS. With the advent of next-generation sequencing (NGS) assays, an increasing number of molecular aberrations have been discovered in recent years. The knowledge of molecular events in MDS has led to an improved understanding of the disease process, including the evolution of the disease and prognosis, and has paved the way for targeted therapy. The 2022 World Health Organization (WHO) Classification and the International Consensus Classification (ICC) have incorporated the molecular signature into the classification system for MDS. In addition, specific germline mutations are associated with MDS development, especially in pediatrics and young adults. This article reviews the genetic abnormalities of MDS in adults with a brief review of germline predisposition syndromes.
Topics: Humans; Myelodysplastic Syndromes; Mutation; High-Throughput Nucleotide Sequencing
PubMed: 38785456
DOI: 10.3390/curroncol31050175 -
Sheng Wu Gong Cheng Xue Bao = Chinese... Sep 2023
Review
As central players in cellular structure and function, proteins have long been a central theme in life science research. Analyzing the impact of protein sequence variation on structure and function is an important way to study proteins. In recent years, a technology called deep mutational scanning (DMS) has been widely used in protein research. It introduces thousands of mutations in parallel in specific regions of proteins through high-abundance DNA libraries. After screening, high-throughput sequencing is employed to score each mutation, revealing sequence-function correlations. Because it is high-throughput, fast, simple, and labor-saving, DMS has become an important method for protein function research and protein engineering. This review briefly summarizes the principle of DMS technology, highlighting its applications in mammalian cells. Moreover, this review analyzes the current technical bottlenecks, aiming to facilitate relevant research.
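The abstract does not give a scoring formula. As an illustration only, the sketch below uses one common convention: the log2 ratio of a variant's read counts after and before screening, normalized to the wild-type ratio. The pseudocount, the function name and the variant labels are assumptions made for this example.

```python
from math import log2


def enrichment_score(pre, post, wt_pre, wt_post, pseudocount=0.5):
    """log2 enrichment of a variant relative to wild type.

    pre / post       : variant read counts before and after screening
    wt_pre / wt_post : wild-type read counts in the same two libraries
    pseudocount      : small constant that avoids division by zero
    """
    variant_ratio = (post + pseudocount) / (pre + pseudocount)
    wt_ratio = (wt_post + pseudocount) / (wt_pre + pseudocount)
    return log2(variant_ratio / wt_ratio)


# Hypothetical read counts: variant -> (before screening, after screening).
counts = {"A24G": (100, 400), "L57P": (120, 10), "S88T": (90, 95)}
wt_before, wt_after = 1000, 1000

scores = {
    variant: enrichment_score(pre, post, wt_before, wt_after)
    for variant, (pre, post) in counts.items()
}
for variant, score in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{variant}: {score:+.2f}")
```

Under this convention a positive score marks a variant that was enriched by the screen relative to wild type, and a negative score marks one that was depleted.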
Topics: Animals; Mutation; Proteins; Protein Engineering; High-Throughput Nucleotide Sequencing; Mammals
PubMed: 37805848
DOI: 10.13345/j.cjb.221050 -
Bioinformatics (Oxford, England) Mar 2024
MOTIVATION
Intra-host variants are genetic variations or mutations that occur within an individual host organism. These variants are typically studied in viruses, bacteria, or other pathogens to understand pathogen evolution. Intra-host variants are also explored in tumor biology and mitochondrial biology to characterize somatic mutations and inherited heteroplasmic mutations. Intra-host variants can involve long insertions, deletions, and combinations of different mutation types, which makes them difficult to identify. The performance of current methods in detecting complex intra-host variants is unknown.
RESULTS
First, we simulated a dataset comprising 10 samples with 1869 intra-host variants involving various mutation patterns and benchmarked current variant detection software. The results indicated that though current software can detect most variants with F1-scores between 0.76 and 0.97, their performance in detecting long indels and low frequency variants was limited. Thus, we developed a new software, PySNV, for the detection of complex intra-host variations. On the simulated dataset, PySNV successfully detected 1863 variant cases (F1-score: 0.99) and exhibited the highest Pearson correlation coefficient (PCC: 0.99) to the ground truth in predicting variant frequencies. The results demonstrated that PySNV delivered promising performance even for long indels and low frequency variants, while maintaining computational speed comparable to other methods. Finally, we tested its performance on SARS-CoV-2 replicate sequencing data and found that it reported 21% more variants compared to LoFreq, the best-performing benchmarked software, while showing higher consistency (62% over 54%) within replicates. The discrepancies mostly exist in low-depth regions and low frequency variants.
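For readers unfamiliar with the metric, the sketch below shows how an F1-score can be computed from a set of called variants and a ground-truth set. It assumes a call counts as correct only when position, reference allele and alternate allele all match exactly; the abstract does not state the matching criterion used in the benchmark, and the variants listed are hypothetical.

```python
def f1_score(called, truth):
    """F1-score of a call set against a truth set.

    Each variant is a hashable record such as (position, ref, alt).
    Exact matching is an assumption made for this sketch.
    """
    called, truth = set(called), set(truth)
    true_pos = len(called & truth)
    false_pos = len(called - truth)
    false_neg = len(truth - called)
    if true_pos == 0:
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy variants: three correct calls, one false positive, one miss.
truth = [(241, "C", "T"), (3037, "C", "T"), (14408, "C", "T"), (23403, "A", "G")]
called = [(241, "C", "T"), (3037, "C", "T"), (14408, "C", "T"), (28881, "G", "A")]

print(f1_score(called, truth))
```

Because F1 is the harmonic mean of precision and recall, a caller that misses long indels loses recall and a caller that reports spurious low-frequency variants loses precision.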
AVAILABILITY AND IMPLEMENTATION
https://github.com/bnuLyndon/PySNV/.
Topics: High-Throughput Nucleotide Sequencing; Software; Mutation; INDEL Mutation; Genetic Variation
PubMed: 38426352
DOI: 10.1093/bioinformatics/btae116 -
International Journal of Molecular... Nov 2020
Review
Aptamers are nucleic acid ligands that bind specifically to a target of interest. Aptamers have gained in popularity due to their high potential for different applications in analysis, diagnostics, and therapeutics. The procedure called systematic evolution of ligands by exponential enrichment (SELEX) is used for aptamer isolation from large nucleic acid combinatorial libraries. The huge number of unique sequences implemented in the in vitro evolution in the SELEX process imposes the necessity of performing extensive sequencing of the selected nucleic acid pools. High-throughput sequencing (HTS) meets this demand of SELEX. Analysis of the data obtained from sequencing of the libraries produced during and after aptamer isolation provides an informative basis for precise aptamer identification and for examining the structure and function of nucleic acid ligands. This review discusses the technical aspects and the potential of the integration of HTS with SELEX.
Topics: Aptamers, Nucleotide; Base Sequence; Benchmarking; Gene Library; High-Throughput Nucleotide Sequencing; Humans; Ligands; Nucleic Acid Conformation; Nucleic Acids; Precision Medicine; SELEX Aptamer Technique
PubMed: 33233573
DOI: 10.3390/ijms21228774 -
Bioinformatics (Oxford, England) Jan 2023
MOTIVATION
High-throughput sequencing technologies have greatly facilitated microbiome research and have generated a large volume of microbiome data with the potential to answer key questions regarding microbiome assembly, structure and function. Cluster analysis aims to group features that behave similarly across treatments; such grouping helps to highlight the functional relationships among features and may provide biological insights into microbiome networks. However, clustering microbiome data is challenging because the data are sparse and high-dimensional.
RESULTS
We propose a model-based clustering method based on Poisson hurdle models for sparse microbiome count data. We describe an expectation-maximization algorithm and a modified version using simulated annealing to conduct the cluster analysis. Moreover, we provide algorithms for initialization and choosing the number of clusters. Simulation results demonstrate that our proposed methods provide better clustering results than alternative methods under a variety of settings. We also apply the proposed method to a sorghum rhizosphere microbiome dataset that results in interesting biological findings.
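As background, a Poisson hurdle distribution models zeros with one probability and positive counts with a zero-truncated Poisson. The sketch below implements that textbook form only. The parameter names are assumptions, and the PHclust package may parametrize the model differently (for example with normalization factors or cluster-specific terms).

```python
from math import exp, lgamma, log, log1p


def hurdle_log_pmf(k, zero_prob, lam):
    """Log-probability of count k under a Poisson hurdle distribution.

    zero_prob : probability of observing a zero (0 < zero_prob < 1)
    lam       : rate of the zero-truncated Poisson for positive counts
    """
    if k < 0 or not 0.0 < zero_prob < 1.0 or lam <= 0.0:
        raise ValueError("need k >= 0, 0 < zero_prob < 1 and lam > 0")
    if k == 0:
        return log(zero_prob)
    log_poisson = k * log(lam) - lam - lgamma(k + 1)
    log_truncation = log1p(-exp(-lam))  # log P(Poisson count > 0)
    return log(1.0 - zero_prob) + log_poisson - log_truncation


def log_likelihood(counts, zero_prob, lam):
    """Log-likelihood of independent counts under one hurdle component."""
    return sum(hurdle_log_pmf(k, zero_prob, lam) for k in counts)


# Hypothetical sparse count vector for one microbial feature.
counts = [0, 0, 0, 3, 0, 5, 0, 0, 2, 0]
print(log_likelihood(counts, zero_prob=0.7, lam=3.0))
```

In a model-based clustering setting, log-likelihoods of this kind would be evaluated per cluster inside the expectation-maximization iterations; that part is not shown here.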
AVAILABILITY AND IMPLEMENTATION
R package is freely available for download at https://cran.r-project.org/package=PHclust.
SUPPLEMENTARY INFORMATION
Supplementary data are available at Bioinformatics online.
Topics: Algorithms; Computer Simulation; Microbiota; Cluster Analysis; High-Throughput Nucleotide Sequencing; Software
PubMed: 36469352
DOI: 10.1093/bioinformatics/btac782 -
Cell Reports Methods Oct 2022
In a recent issue, Chen et al. present Live-seq, a single-cell transcriptomic profiling method that uses picoliter-scale single-cell cytoplasmic biopsies instead of complete cell lysis. Because the cells recover quickly and remain largely unaffected after the cytoplasmic extraction, the authors transform single-cell RNA sequencing (scRNA-seq) from an end-point assay into a temporal analysis platform.
Topics: Transcriptome; Sequence Analysis, RNA; High-Throughput Nucleotide Sequencing; Gene Expression Profiling; Biopsy
PubMed: 36313799
DOI: 10.1016/j.crmeth.2022.100319 -
Genome Biology Jun 2023
Review
It has been over a decade since the first publication of a method dedicated entirely to mapping long reads. Because each long read contains an abundance of seeds, methods moved from the seed-and-extend framework used for short reads to a seed-and-chain framework. The main novelties are based on alternative seed constructs or chaining formulations. Dozens of tools now exist, whose heuristics have evolved considerably. We provide an overview of the methods used in long-read mappers. Since they are driven by implementation-specific parameters, we develop an original visualization tool to understand the parameter settings (http://bcazaux.polytech-lille.net/Minimap2/).
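The seed-and-chain idea can be sketched in a few lines. This is a deliberately simplified illustration, not the algorithm of any particular mapper: it uses every exact k-mer match as a seed (real tools typically subsample seeds, for example with minimizers) and scores a chain by its number of anchors alone, without the gap costs used in practice. The sequences are hypothetical.

```python
from collections import defaultdict


def find_anchors(read, ref, k):
    """Return sorted (read_pos, ref_pos) pairs where a k-mer matches exactly."""
    index = defaultdict(list)
    for i in range(len(ref) - k + 1):
        index[ref[i:i + k]].append(i)
    anchors = []
    for j in range(len(read) - k + 1):
        for i in index.get(read[j:j + k], ()):
            anchors.append((j, i))
    return sorted(anchors)


def best_chain(anchors):
    """Longest chain of anchors increasing in both read and reference.

    Quadratic dynamic programming over anchors sorted by read position.
    """
    if not anchors:
        return []
    score = [1] * len(anchors)
    prev = [-1] * len(anchors)
    for b in range(len(anchors)):
        for a in range(b):
            colinear = (anchors[a][0] < anchors[b][0]
                        and anchors[a][1] < anchors[b][1])
            if colinear and score[a] + 1 > score[b]:
                score[b] = score[a] + 1
                prev[b] = a
    end = max(range(len(anchors)), key=score.__getitem__)
    chain = []
    while end != -1:  # walk the back-pointers from the best chain end
        chain.append(anchors[end])
        end = prev[end]
    return chain[::-1]


ref = "ACGTTGCAAGGCTTAC"
read = "TTGCATGGCTT"  # matches ref[3:14] except for one substitution
chain = best_chain(find_anchors(read, ref, k=4))
print(chain)
```

The substitution destroys the seeds that overlap it, but the chain still links the anchors on either side, which is why chaining tolerates the higher error rates of long reads.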
Topics: Software; Sequence Analysis, DNA; High-Throughput Nucleotide Sequencing; Algorithms
PubMed: 37264447
DOI: 10.1186/s13059-023-02972-3