-
Journal of Food Protection Mar 2022
Review
ABSTRACT
Advancements in next-generation sequencing technology have dramatically reduced the cost and increased the ease of microbial whole genome sequencing. This approach is revolutionizing the identification and analysis of foodborne microbial pathogens, facilitating expedited detection and mitigation of foodborne outbreaks, improving public health outcomes, and limiting costly recalls. However, next-generation sequencing is still anchored in the traditional laboratory practice of the selection and culture of a single isolate. Metagenomic-based approaches, including metabarcoding and shotgun and long-read metagenomics, are part of the next disruptive revolution in food safety diagnostics and offer the potential to directly identify entire microbial communities in a single food, ingredient, or environmental sample. In this review, metagenomic-based approaches are introduced and placed within the context of conventional detection and diagnostic techniques, and essential considerations for undertaking metagenomic assays and data analysis are described. Recent applications of the use of metagenomics for food safety are discussed alongside current limitations and knowledge gaps and new opportunities arising from the use of this technology.
Topics: Food Safety; High-Throughput Nucleotide Sequencing; Metagenome; Metagenomics; Whole Genome Sequencing
PubMed: 34706052
DOI: 10.4315/JFP-21-301
-
Trends in Microbiology Nov 2022
Review
Viruses are key members of Earth's microbiomes, shaping microbial community composition and metabolism. Here, we describe recent advances in 'soil viromics', that is, virus-focused metagenome and metatranscriptome analyses that offer unprecedented windows into the soil virosphere. Given the emerging picture of high soil viral activity, diversity, and dynamics over short spatiotemporal scales, we then outline key eco-evolutionary processes that we hypothesize are the major diversity drivers for soil viruses. We argue that a community effort is needed to establish a 'global soil virosphere atlas' that can be used to address the roles of viruses in soil microbiomes and terrestrial biogeochemical cycles across spatiotemporal scales.
Topics: Metagenome; Metagenomics; Soil; Soil Microbiology; Viruses
PubMed: 35644779
DOI: 10.1016/j.tim.2022.05.003
-
Bioinformatics (Oxford, England) Sep 2022
MOTIVATION
Despite recent advancements in sequencing technologies and assembly methods, obtaining high-quality microbial genomes from metagenomic samples is still not a trivial task. Current metagenomic binners do not take full advantage of assembly graphs and are not optimized for long-read assemblies. Deep graph learning algorithms have been proposed in other fields to deal with complex graph data structures. The graph structure generated during the assembly process could be integrated with contig features to obtain better bins with deep learning.
RESULTS
We propose GraphMB, which uses graph neural networks to incorporate the assembly graph into the binning process. We test GraphMB on long-read datasets of different complexities, and compare the performance with other binners in terms of the number of High Quality (HQ) genome bins obtained. With our approach, we were able to obtain unique bins on all real datasets, and obtain more bins on most datasets. In particular, we obtained on average 17.5% more HQ bins when compared with state-of-the-art binners and 13.7% when aggregating the results of our binner with the others. These results indicate that a deep learning model can integrate contig-specific and graph-structure information to improve metagenomic binning.
AVAILABILITY AND IMPLEMENTATION
GraphMB is available from https://github.com/MicrobialDarkMatter/GraphMB.
SUPPLEMENTARY INFORMATION
Supplementary data are available at Bioinformatics online.
Topics: Sequence Analysis, DNA; Metagenomics; Metagenome; Genome, Microbial; Algorithms
PubMed: 35972375
DOI: 10.1093/bioinformatics/btac557
-
Communications Biology Oct 2023
Assembly of reads from metagenomic samples is a hard problem, often resulting in highly fragmented genome assemblies. Metagenomic binning allows us to reconstruct genomes by re-grouping the sequences by their organism of origin, thus representing a crucial processing step when exploring the biological diversity of metagenomic samples. Here we present Adversarial Autoencoders for Metagenomics Binning (AAMB), an ensemble deep learning approach that integrates sequence co-abundances and tetranucleotide frequencies into a common denoised space that enables precise clustering of sequences into microbial genomes. When benchmarked, AAMB presented similar or better results compared with the state-of-the-art reference-free binner VAMB, reconstructing ~7% more near-complete (NC) genomes across simulated and real data. In addition, genomes reconstructed using AAMB had higher completeness and greater taxonomic diversity compared with VAMB. Finally, we implemented a pipeline integrating VAMB and AAMB that enabled improved binning, recovering 20% and 29% more simulated and real NC genomes, respectively, compared to VAMB, with moderate additional runtime.
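As a rough illustration of one input feature named in the abstract above, the following sketch computes tetranucleotide (4-mer) frequencies for a single contig. It is a simplified assumption of how such a feature might be derived, not AAMB's or VAMB's implementation; the function name `tetranucleotide_frequencies` is hypothetical, and real binners typically apply further normalisation.

```python
from collections import Counter
from itertools import product


def tetranucleotide_frequencies(seq: str) -> dict[str, float]:
    """Return the relative frequency of each of the 256 possible 4-mers in seq.

    Hypothetical helper for illustration only.
    """
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=4)]
    # Count every overlapping window of length 4.
    counts = Counter(seq[i:i + 4] for i in range(len(seq) - 3))
    # Only windows made of A/C/G/T contribute; windows with N etc. are ignored.
    total = sum(counts[k] for k in kmers)
    if total == 0:
        return {k: 0.0 for k in kmers}
    return {k: counts[k] / total for k in kmers}


freqs = tetranucleotide_frequencies("ACGTACGT")
print(len(freqs), freqs["ACGT"])
```

Each contig then becomes a 256-dimensional vector that can be combined with per-sample abundance values before clustering.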
Topics: Metagenome; Genome, Microbial; Metagenomics; Cluster Analysis; Benchmarking
PubMed: 37865678
DOI: 10.1038/s42003-023-05452-3
-
mSphere May 2020
Metagenome-assembled genomes (MAGs) expand our understanding of microbial diversity, evolution, and ecology. Concerns have been raised on how sequencing, assembly, binning, and quality assessment tools may result in MAGs that do not reflect single populations in nature. Here, we reflect on another issue, i.e., how to handle highly similar MAGs assembled from independent data sets. Obtaining multiple genomic representatives for a species is highly valuable, as it allows for population genomic analyses; however, when retaining genomes of closely related populations, it complicates MAG quality assessment and abundance inferences. We show that (i) published data sets contain a large fraction of MAGs sharing >99% average nucleotide identity, (ii) different software packages and parameters used to resolve this redundancy remove very different numbers of MAGs, and (iii) the removal of closely related genomes leads to losses of population-specific auxiliary genes. Finally, we highlight some approaches that can infer strain-specific dynamics across a sample series without dereplication.
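To make the dereplication issue above concrete, here is a minimal sketch of greedy dereplication at an average nucleotide identity (ANI) threshold. It is an assumed, simplified scheme and not the algorithm of any specific tool discussed in the paper; the function name, the quality scores, and the ANI values are all hypothetical.

```python
def dereplicate(genomes, ani, threshold=99.0):
    """Greedily keep one representative per group of highly similar MAGs.

    genomes: list of (name, quality) tuples; higher quality is preferred.
    ani: dict mapping frozenset({a, b}) to pairwise ANI in percent.
    A MAG is dropped if its ANI to an already kept MAG exceeds threshold.
    Pairs missing from ani are treated as unrelated (ANI 0).
    """
    representatives = []
    for name, _quality in sorted(genomes, key=lambda g: g[1], reverse=True):
        if all(ani.get(frozenset((name, rep)), 0.0) <= threshold
               for rep in representatives):
            representatives.append(name)
    return representatives


# Hypothetical MAGs and pairwise ANI values.
mags = [("A", 95.0), ("B", 90.0), ("C", 80.0)]
pairwise_ani = {
    frozenset(("A", "B")): 99.5,
    frozenset(("A", "C")): 97.0,
    frozenset(("B", "C")): 96.9,
}
print(dereplicate(mags, pairwise_ani, threshold=99.0))  # ['A', 'C']
print(dereplicate(mags, pairwise_ani, threshold=95.0))  # ['A']
```

Changing only the threshold changes how many MAGs survive, which is the kind of parameter sensitivity the abstract describes.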
Topics: Metagenome; Metagenomics; Microbiota; Phylogeny; Software
PubMed: 32434845
DOI: 10.1128/mSphere.00971-19
-
Scientific Data Feb 2023
Common culturing techniques and priorities bias our discovery towards specific traits that may not be representative of microbial diversity in nature. So far, these biases have not been systematically examined. To address this gap, here we use 116,884 publicly available metagenome-assembled genomes (MAGs, completeness ≥80%) from 203 surveys worldwide as a culture-independent sample of bacterial and archaeal diversity, and compare these MAGs to the popular RefSeq genome database, which heavily relies on cultures. We compare the distribution of 12,454 KEGG gene orthologs (used as trait proxies) in the MAGs and RefSeq genomes, while controlling for environment type (ocean, soil, lake, bioreactor, human, and other animals). Using statistical modeling, we then determine the conditional probabilities that a species is represented in RefSeq depending on its genetic repertoire. We find that the majority of examined genes are significantly biased for or against in RefSeq. Our systematic estimates of gene prevalences across bacteria and archaea in nature and gene-specific biases in reference genomes constitute a resource for addressing these issues in the future.
Topics: Animals; Archaea; Bacteria; Genome, Microbial; Metagenome; Metagenomics
PubMed: 36759614
DOI: 10.1038/s41597-023-01994-7
-
Microbial Genomics Apr 2024
Review
The ever-decreasing cost of sequencing and the growing potential applications of metagenomics have led to an unprecedented surge in data generation. One of the most prevalent applications of metagenomics is the study of microbial environments, such as the human gut. The gut microbiome plays a crucial role in human health, providing vital information for patient diagnosis and prognosis. However, analysing metagenomic data remains challenging due to several factors, including reference catalogues, sparsity and compositionality. Deep learning (DL) enables novel and promising approaches that complement state-of-the-art microbiome pipelines. DL-based methods can address almost all aspects of microbiome analysis, including novel pathogen detection, sequence classification, patient stratification and disease prediction. Beyond generating predictive models, a key aspect of these methods is also their interpretability. This article reviews DL approaches in metagenomics, including convolutional networks, autoencoders and attention-based models. These methods aggregate contextualized data and pave the way for improved patient care and a better understanding of the microbiome's key role in our health.
Topics: Humans; Deep Learning; Microbiota; Metagenome; Gastrointestinal Microbiome; Metagenomics
PubMed: 38630611
DOI: 10.1099/mgen.0.001231
-
STAR Protocols Sep 2022
Homology-based search is commonly used to uncover mobile genetic elements (MGEs) from metagenomes, but it heavily relies on reference genomes in the database. Here we introduce a protocol to extract CRISPR-targeted sequences from the assembled human gut metagenomic sequences without using a reference database. We describe the assembling of metagenome contigs, the extraction of CRISPR direct repeats and spacers, the discovery of protospacers, and the extraction of protospacer-enriched regions using the graph-based approach. This protocol could extract numerous characterized/uncharacterized MGEs. For complete details on the use and execution of this protocol, please refer to Sugimoto et al. (2021).
Topics: Base Sequence; Clustered Regularly Interspaced Short Palindromic Repeats; Humans; Metagenome; Metagenomics
PubMed: 35780428
DOI: 10.1016/j.xpro.2022.101525
-
Applied Microbiology and Biotechnology Oct 2020
Review
Single-cell genomics and transcriptomics can provide reliable context for assembled genome fragments and gene expression activity on the level of individual prokaryotic genomes. These methods are rapidly emerging as an essential complement to cultivation-based, metagenomics, metatranscriptomics, and microbial community-focused research approaches by allowing direct access to information from individual microorganisms, even from deep-branching phylogenetic groups that currently lack cultured representatives. Their integration and binning with environmental 'omics data already provides unprecedented insights into microbial diversity and metabolic potential, enabling us to provide information on individual organisms and the structure and dynamics of natural microbial populations in complex environments. This review highlights the pitfalls and recent advances in the field of single-cell omics and its importance in microbiological and biotechnological studies.
KEY POINTS
• Single-cell omics expands the tree of life through the discovery of novel organisms, genes, and metabolic pathways.
• Disadvantages of metagenome-assembled genomes are overcome by single-cell omics.
• Functional analysis of single cells explores the heterogeneity of gene expression.
• Technical challenges still limit this field, thus prompting new method developments.
Topics: Genomics; Metagenome; Metagenomics; Microbiota; Phylogeny
PubMed: 32845367
DOI: 10.1007/s00253-020-10844-0
-
Microbiology Spectrum Feb 2023
Lower respiratory infection (LRI) is the most fatal communicable disease, with only a few pathogens identified. Metagenomic next-generation sequencing (mNGS), as an unbiased, hypothesis-free, and culture-independent method, theoretically enables the detection of all pathogens in a single test. In this study, we developed and validated a DNA-based mNGS method for the diagnosis of LRIs from bronchoalveolar lavage fluid (BALF). We prepared simulated data sets and published raw data sets from patients to evaluate the performance of our in-house bioinformatics pipeline and compared it with the popular metagenomics pipeline Kraken2-Bracken. In addition, a series of biological microbial communities were used to comprehensively validate the performance of our mNGS assay. Sixty-nine clinical BALF samples were used for clinical validation to determine the accuracy. The in-house bioinformatics pipeline validation showed a recall of 88.03%, precision of 99.14%, and F1 score of 92.26% via single-genome simulated data. Mock microbial community and clinical metagenomic data showed that the in-house pipeline has a stricter cutoff value than Kraken2-Bracken, which could prevent false-positive detection by the bioinformatics pipeline. The validation for the whole mNGS pipeline revealed that overwhelming human DNA, long-term storage at 4°C, and repeated freezing-thawing reduced the analytical sensitivity of the assay. The mNGS assay showed a sensitivity of 95.18% and specificity of 91.30% for pathogen detection from BALF samples. This study comprehensively demonstrated the analytical performance of this laboratory-developed mNGS assay for pathogen detection from BALF, which contributed to the standardization of this technology. To our knowledge, this study is the first to comprehensively validate the mNGS assay for the diagnosis of LRIs from BALF. 
This study exhibited a ready-made example for clinical laboratories to prepare reference materials and develop comprehensive validation schemes for their in-house mNGS assays, which would accelerate the standardization of mNGS testing.
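For readers less familiar with the validation metrics quoted above (recall, precision, F1 score, sensitivity, specificity), the sketch below shows how they are conventionally derived from confusion-matrix counts. The function name and the counts are hypothetical and are not taken from the study; the code assumes every denominator is non-zero.

```python
def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> dict[str, float]:
    """Standard detection metrics from true/false positive/negative counts.

    Hypothetical helper; assumes tp + fn, tn + fp and tp + fp are all > 0.
    """
    sensitivity = tp / (tp + fn)   # also called recall
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


# Made-up counts, for illustration only.
metrics = diagnostic_metrics(tp=80, fp=5, tn=95, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Note that a metric averaged per genome or per sample can differ from the same metric computed on pooled counts, so how results were aggregated matters when comparing pipelines.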
Topics: Humans; Metagenome; Respiratory Tract Infections; Microbiota; High-Throughput Nucleotide Sequencing; Metagenomics
PubMed: 36507666
DOI: 10.1128/spectrum.03812-22