Bioinformatics (Oxford, England) Dec 2022
MOTIVATION
Recovery of metagenome-assembled genomes (MAGs) from shotgun metagenomic data is an important task for the comprehensive analysis of microbial communities from variable sources. Single binning tools differ in their ability to leverage specific aspects in MAG reconstruction, while the use of ensemble binning refinement tools is often time-consuming, and computational demand increases with community complexity. We introduce MAGScoT, a fast, lightweight and accurate implementation for the reconstruction of highest-quality MAGs from the output of multiple genome-binning tools.
RESULTS
MAGScoT outperforms popular bin-refinement solutions in terms of quality and quantity of MAGs as well as computation time and resource consumption.
AVAILABILITY AND IMPLEMENTATION
MAGScoT is available via GitHub (https://github.com/ikmb/MAGScoT) and as an easy-to-use Docker container (https://hub.docker.com/repository/docker/ikmb/magscot).
SUPPLEMENTARY INFORMATION
Supplementary data are available at Bioinformatics online.
Topics: Algorithms; Metagenomics; Metagenome; Microbiota
PubMed: 36264141
DOI: 10.1093/bioinformatics/btac694

BMC Bioinformatics May 2022
BACKGROUND
FragGeneScan is currently the most accurate and popular tool for gene prediction in short and error-prone reads, but its execution speed is insufficient for use on larger data sets. The parallelization which should have addressed this is inefficient. Its alternative implementation FragGeneScan+ is faster, but introduced a number of bugs related to memory management, race conditions and even output accuracy.
RESULTS
This paper introduces FragGeneScanRs, a faster Rust implementation of the FragGeneScan gene prediction model. Its command line interface is backward compatible and adds extra features for more flexible usage. Its output is equivalent to the original FragGeneScan implementation.
CONCLUSIONS
Compared to the current C implementation, shotgun metagenomic reads are processed up to 22 times faster using a single thread, with better scaling for multithreaded execution. The Rust code of FragGeneScanRs is freely available from GitHub under the GPL-3.0 license with instructions for installation, usage and other documentation (https://github.com/unipept/FragGeneScanRs).
Topics: Algorithms; Metagenome; Metagenomics; Software
PubMed: 35643462
DOI: 10.1186/s12859-022-04736-5

Bioinformatics (Oxford, England) May 2021
MOTIVATION
The microbes that live in an environment can be identified from the combined genomic material, also referred to as the metagenome. Sequencing a metagenome can result in large volumes of sequencing reads. A promising approach to reduce the size of metagenomic datasets is to cluster reads into groups based on their overlaps. Clustering reads is valuable for facilitating downstream analyses, including computationally intensive strain-aware assembly. As current read clustering approaches cannot handle the large datasets arising from high-throughput metagenome sequencing, a novel read clustering approach is needed. In this article, we propose OGRE, an Overlap Graph-based Read clustEring procedure for high-throughput sequencing data, with a focus on shotgun metagenomes.
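To make the idea of clustering reads by their overlaps concrete, the following is a toy sketch only. It is not OGRE's algorithm or code: it assumes exact suffix-prefix matches, compares all read pairs, and takes connected components of the resulting overlap graph with a union-find structure. The names `overlap_len`, `cluster_reads` and `min_len` are hypothetical and chosen for illustration.

```python
from collections import defaultdict


def overlap_len(a: str, b: str, min_len: int) -> int:
    """Length of the longest suffix of `a` equal to a prefix of `b`.

    Returns 0 when no exact overlap of at least `min_len` bases exists.
    """
    if min_len < 1:
        raise ValueError("min_len must be at least 1")
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a[-k:] == b[:k]:
            return k
    return 0


def cluster_reads(reads: list[str], min_len: int = 4) -> list[list[str]]:
    """Group reads into connected components of a toy overlap graph."""
    parent = list(range(len(reads)))

    def find(i: int) -> int:
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All-pairs comparison: quadratic, so only suitable for tiny examples.
    for i in range(len(reads)):
        for j in range(len(reads)):
            if i != j and overlap_len(reads[i], reads[j], min_len) > 0:
                parent[find(i)] = find(j)

    clusters = defaultdict(list)
    for i, read in enumerate(reads):
        clusters[find(i)].append(read)
    return sorted(clusters.values())


if __name__ == "__main__":
    toy_reads = ["ACGTACGA", "ACGATTTT", "GGGGCCCC", "CCCCAAAA"]
    for cluster in cluster_reads(toy_reads):
        print(cluster)
```

The all-pairs comparison is exactly what does not scale to high-throughput data, which is the limitation the abstract says a practical read clustering method has to overcome.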
RESULTS
We show that for small datasets OGRE outperforms other read binners in terms of the number of species included in a cluster, also referred to as cluster purity, and the fraction of all reads that is placed in one of the clusters. Furthermore, OGRE is able to process metagenomic datasets that are too large for other read binners into clusters with high cluster purity.
CONCLUSION
OGRE is the only method that can successfully cluster reads in species-specific clusters for large metagenomic datasets without running into computation time or memory issues.
AVAILABILITY AND IMPLEMENTATION
Code is made available on GitHub (https://github.com/Marleen1/OGRE).
SUPPLEMENTARY INFORMATION
Supplementary data are available at Bioinformatics online.
Topics: Algorithms; Cluster Analysis; High-Throughput Nucleotide Sequencing; Metagenome; Metagenomics; Sequence Analysis, DNA; Software
PubMed: 32871010
DOI: 10.1093/bioinformatics/btaa760

Microbiological Research Jul 2022 (Review)
Reference genomes are essential for analyzing the metabolic and functional potentials of microbiomes. However, microbial genome resources are limited because most microorganisms are difficult to culture. Genome binning is a culture-independent approach that can recover a vast number of microbial genomes from short-read high-throughput shotgun metagenomic sequencing data. In this review, we summarize methods commonly used for reconstructing metagenome-assembled genomes (MAGs) to provide a reference for researchers choosing appropriate software among the numerous and complicated tools and pipelines that are available for these analyses. In addition, we discuss application prospects, challenges, and opportunities for recovering MAGs from metagenomic sequencing data.
Topics: High-Throughput Nucleotide Sequencing; Metagenome; Metagenomics; Microbiota
PubMed: 35430490
DOI: 10.1016/j.micres.2022.127023

Cell Reports May 2023 (Meta-Analysis)
Mouse models are key tools for investigating host-microbiome interactions. However, shotgun metagenomics can only profile a limited fraction of the mouse gut microbiome. Here, we employ a metagenomic profiling method, MetaPhlAn 4, which exploits a large catalog of metagenome-assembled genomes (including 22,718 metagenome-assembled genomes from mice) to improve the profiling of the mouse gut microbiome. We combine 622 samples from eight public datasets and an additional cohort of 97 mouse microbiomes, and we assess the potential of MetaPhlAn 4 to better identify diet-related changes in the host microbiome using a meta-analysis approach. We find multiple, strong, and reproducible diet-related microbial biomarkers, substantially more than are identifiable by other available methods that rely only on reference information. The strongest drivers of the diet-induced changes are uncharacterized and previously undetected taxa, confirming the importance of adopting metagenomic methods that integrate metagenomic assemblies for comprehensive profiling.
Topics: Animals; Mice; Microbiota; Metagenome; Gastrointestinal Microbiome; Diet; Metagenomics
PubMed: 37141097
DOI: 10.1016/j.celrep.2023.112464

Current Opinion in Virology Jun 2022 (Review)
Despite the growing interest in the microbiome in recent years, the study of the virome, the major part of which is made up of bacteriophages, is relatively underdeveloped compared with that of its bacterial counterpart. This is due in part to the lack of a universally conserved marker such as the 16S rRNA gene. For this reason, the development of metagenomic approaches was a major milestone in the study of the viruses in the microbiome, or virome. However, it has become increasingly clear that these wet-lab methods have not yet been able to detect the full range of viruses present, and our understanding of the composition of the virome remains incomplete. In recent years, a range of new technologies has been developed to further our understanding. Direct RNA-Seq technologies bypass the need for cDNA synthesis, thus avoiding the biases introduced by this step, which further expands our understanding of RNA viruses. The new generation of amplification methods could solve the low-biomass issue relevant to most virome samples while reducing the error rate and biases caused by whole genome amplification. The application of long-read sequencing to virome samples can resolve the shortcomings of short-read sequencing in generating complete viral genomes and avoid the biases introduced by assembly. Novel experimental methods developed to measure viruses' host range can help overcome the challenges of assigning hosts to many phages, specifically unculturable ones.
Topics: Bacteriophages; Metagenome; Metagenomics; RNA, Ribosomal, 16S; Virome; Viruses
PubMed: 35643020
DOI: 10.1016/j.coviro.2022.101231

Bioinformatics (Oxford, England) Oct 2022
SUMMARY
Shotgun metagenomic sequencing provides the capacity to understand microbial community structure and function at unprecedented resolution; however, the current analytical methods are constrained by a focus on taxonomic classifications that may obfuscate functional relationships. Here, we present expam, a tree-based, taxonomy-agnostic tool for the identification of biologically relevant clades from shotgun metagenomic sequencing.
AVAILABILITY AND IMPLEMENTATION
expam is an open-source Python application released under the GNU General Public Licence v3.0. expam installation instructions, source code and tutorials can be found at https://github.com/seansolari/expam.
SUPPLEMENTARY INFORMATION
Supplementary data are available at Bioinformatics online.
Topics: Metagenome; Metagenomics; Microbiota; Software
PubMed: 36029242
DOI: 10.1093/bioinformatics/btac591

F1000Research 2021
Metagenomic sequencing allows large-scale identification and genomic characterization. Binning is the process of recovering genomes from complex mixtures of sequence fragments (metagenome contigs) of unknown bacterial and archaeal species. Assessing the quality of genomes recovered from metagenomes requires the use of complex pipelines involving many independent steps, which are often difficult to reproduce and maintain. A comprehensive, automated and easy-to-use computational workflow for the quality assessment of draft prokaryotic genomes, based on container technology, would greatly improve reproducibility and reusability of published results. We present metashot/prok-quality, a container-enabled Nextflow pipeline for quality assessment and genome dereplication. The metashot/prok-quality tool produces genome quality reports that are compliant with the Minimum Information about a Metagenome-Assembled Genome (MIMAG) standard, and can run out-of-the-box on any platform that supports Nextflow, Docker or Singularity, including computing clusters or batch infrastructures in the cloud. metashot/prok-quality is part of the metashot collection of analysis pipelines. Workflow and documentation are available under GPL3 licence on GitHub.
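As an illustration of what a MIMAG-style quality tier looks like in practice, here is a minimal sketch. It is not the metashot/prok-quality implementation, and the abstract itself does not state any thresholds; the cut-offs below are the ones commonly cited from the MIMAG standard (high quality: more than 90% completeness, less than 5% contamination, 5S, 16S and 23S rRNA genes present, at least 18 tRNAs; medium quality: at least 50% completeness and less than 10% contamination; low quality: below 50% completeness and less than 10% contamination). The `GenomeStats` type, the `mimag_tier` function and the "unclassified" label are hypothetical names introduced for this example.

```python
from dataclasses import dataclass


@dataclass
class GenomeStats:
    completeness: float   # estimated completeness, in percent
    contamination: float  # estimated contamination, in percent
    has_rrnas: bool       # True if 5S, 16S and 23S rRNA genes were all found
    trna_count: int       # number of distinct tRNAs detected


def mimag_tier(g: GenomeStats) -> str:
    """Assign a MIMAG-style quality tier (assumed thresholds, see lead-in)."""
    if g.contamination >= 10:
        # Outside all three tiers; this label is this sketch's own convention.
        return "unclassified"
    if (g.completeness > 90 and g.contamination < 5
            and g.has_rrnas and g.trna_count >= 18):
        return "high"
    if g.completeness >= 50:
        return "medium"
    return "low"


if __name__ == "__main__":
    draft = GenomeStats(completeness=95.2, contamination=1.8,
                        has_rrnas=True, trna_count=20)
    print(mimag_tier(draft))
```

Note that a genome with high completeness and low contamination but missing rRNA genes falls back to the medium tier under these assumed rules.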
Topics: Archaea; Metagenome; Metagenomics; Prokaryotic Cells; Reproducibility of Results
PubMed: 35136576
DOI: 10.12688/f1000research.54418.1

Methods in Molecular Biology (Clifton,... 2024
Viral metagenomics is one of the most widely used approaches to study viral population genomics. With the recent development of bioinformatic tools, the number of molecular biological methods, programs, and software to analyze viral metagenome data has greatly increased. Here, we describe the basic analysis workflow along with bioinformatic tools that can be used to analyze viral metagenome data. Although this chapter assumes that the viral metagenome data are prepared from freshwater samples and are subjected to dsDNA sequencing, the protocol can be applied and modified for other types of metagenome data collected from a variety of sources.
Topics: Metagenome; Genome, Viral; Metagenomics; Fresh Water; Viruses
PubMed: 38060116
DOI: 10.1007/978-1-0716-3515-5_3

Microbiology Spectrum Dec 2022
Antibiotic resistance genes (ARGs) pose a serious threat to public health and ecological security in the 21st century. However, the resistome only accounts for a tiny fraction of metagenomic content, which makes it difficult to investigate low-abundance ARGs in various environmental settings. Thus, a highly sensitive, accurate, and comprehensive method is needed to describe ARG profiles in complex metagenomic samples. In this study, we established a high-throughput sequencing method based on targeted amplification, which could simultaneously detect ARGs (n = 251), mobile genetic element genes (n = 8), and metal resistance genes (n = 19) in metagenomes. The performance of amplicon sequencing was compared with traditional metagenomic shotgun sequencing (MetaSeq). A total of 1421 primer pairs were designed, achieving extremely high coverage of target genes. The amplicon sequencing significantly improved the recovery of target ARGs (~9 × 10-fold), with higher sensitivity and diversity, lower cost, and a lower computational burden. Furthermore, targeted enrichment allows deep scanning of single nucleotide polymorphisms (SNPs), and elevated SNP detection was shown in this study. We further applied this approach to 48 environmental samples (37 feces, 20 soils, and 7 sewage) and 16 clinical samples. All samples tested in this study showed high diversity and recovery of targeted genes. Our results demonstrated that the approach could be applied to various metagenomic samples and serve as an efficient tool in the surveillance and evolution assessment of ARGs. Access to the resistome using the enrichment method validated in this study enabled the capture of low-abundance resistomes while being less costly and time-consuming, which can greatly advance our understanding of local and global resistome dynamics.
ARGs, an increasing global threat to human health, can be transferred into health-related microorganisms in the environment by horizontal gene transfer, posing a serious threat to public health. Advanced profiling methods are needed for monitoring and predicting the potential risks of ARGs in metagenomes. Our study described a customized amplicon sequencing assay that could enable a high-throughput, targeted, in-depth analysis of ARGs and detect a low-abundance portion of resistomes. This method could serve as an efficient tool to assess the variation and evolution of specific ARGs in clinical and natural environments.
Topics: Humans; Metagenome; Genes, Bacterial; Anti-Bacterial Agents; Drug Resistance, Microbial; Sewage; Metagenomics
PubMed: 36287061
DOI: 10.1128/spectrum.02297-22