-
Current Issues in Molecular Biology 2017 (Review)
Surveys of environmental microbial communities using a metagenomic approach produce vast volumes of multidimensional data on the phylogenetic and functional composition of the microbiota. Faced with such complex data, a metagenomic researcher needs to select appropriate means of data analysis. Data visualization has become an indispensable part of exploratory data analysis and serves as a key to discovery. While the molecular-genetic analysis of even a single bacterium presents multiple layers of data to be properly displayed and perceived, studies of microbiota are significantly more challenging. Here we present a review of state-of-the-art methods for the visualization of metagenomic data in a multi-level manner: from methods applicable to an in-depth analysis of a single metagenome to techniques appropriate for large-scale studies containing hundreds of environmental samples.
Topics: Bacteria; Computer Graphics; Databases, Genetic; Metagenome; Metagenomics; Microbiota
PubMed: 28686567
DOI: 10.21775/cimb.024.037 -
mSystems Aug 2022
Metagenome-assembled genomes (MAGs) represent individual genomes recovered from metagenomic data. MAGs are extremely useful to analyze uncultured microbial genomic diversity, as well as to characterize associated functional and metabolic potential in natural environments. Recent computational developments have considerably improved MAG reconstruction but also emphasized several limitations, such as the nonbinning of sequence regions with repetitions or distinct nucleotidic composition. Different assembly and binning strategies are often used; however, it still remains unclear which assembly strategy, in combination with which binning approach, offers the best performance for MAG recovery. Several workflows have been proposed in order to reconstruct MAGs, but users are usually limited to single-metagenome assembly or need to manually define sets of metagenomes to coassemble prior to genome binning. Here, we present MAGNETO, an automated workflow dedicated to MAG reconstruction, which includes a fully automated coassembly step informed by optimal clustering of metagenomic distances and implements complementary genome binning strategies for improving MAG recovery. MAGNETO is implemented as a Snakemake workflow and is available at: https://gitlab.univ-nantes.fr/bird_pipeline_registry/magneto.
Genome-resolved metagenomics has led to the discovery of previously untapped biodiversity within the microbial world. As the development of computational methods for the recovery of genomes from metagenomes continues, existing strategies need to be evaluated and compared to eventually lead to standardized computational workflows. In this study, we compared commonly used assembly and binning strategies and assessed their performance using both simulated and real metagenomic data sets. We propose a novel approach to automate coassembly, avoiding the requirement for a priori knowledge to combine metagenomic information. The comparison against a previous coassembly approach demonstrates a strong impact of this step on genome binning results, but also the benefits of informing coassembly for improving the quality of recovered genomes. MAGNETO integrates complementary assembly-binning strategies to optimize genome reconstruction and provides a complete reads-to-genomes workflow for the growing microbiome research community.
Topics: Workflow; Metagenomics; Metagenome; Microbiota; Genome, Microbial
PubMed: 35703559
DOI: 10.1128/msystems.00432-22 -
Annual Review of Virology Sep 2021
Viral metagenomics has expanded our knowledge of the ecology of uncultured viruses, within both environmental (e.g., terrestrial and aquatic) and host-associated (e.g., plants and animals, including humans) contexts. Here, we emphasize the implementation of an ecological framework in viral metagenomic studies to address questions in virology rarely considered ecological, which can change our perception of viruses and how they interact with their surroundings. An ecological framework explicitly considers diverse variants of viruses in populations that make up communities of interacting viruses, with ecosystem-level effects. It provides a structure for the study of the diversity, distributions, dynamics, and interactions of viruses with one another, hosts, and the ecosystem, including interactions with abiotic factors. An ecological framework in viral metagenomics stands poised to broadly expand our knowledge in basic and applied virology. We highlight specific fundamental research needs to capitalize on its potential and advance the field.
Topics: Animals; Ecosystem; Genome, Viral; Humans; Metagenome; Metagenomics; Plants; Viruses
PubMed: 34033501
DOI: 10.1146/annurev-virology-010421-053015 -
MicrobiologyOpen Jun 2022 (Review)
The rise of metagenomics offers a leap forward for understanding the genetic diversity of microorganisms in many different complex environments by providing a platform that can identify potentially unlimited numbers of known and novel microorganisms. As such, it is impossible to imagine new major initiatives without metagenomics. Nevertheless, it represents a relatively new discipline with various levels of complexity and demands on bioinformatics. The underlying principles and methods used in metagenomics are often treated as common knowledge and are therefore frequently left undetailed or presented only in fragments. We therefore reviewed them to guide microbiologists in taking the first steps into metagenomics. We specifically focus on a workflow aimed at reconstructing individual genomes, that is, metagenome-assembled genomes, integrating DNA sequencing, assembly, binning, identification, and annotation.
Topics: Computational Biology; Metagenome; Metagenomics; Sequence Analysis, DNA
PubMed: 35765182
DOI: 10.1002/mbo3.1298 -
Communications Biology Oct 2023
Assembly of reads from metagenomic samples is a hard problem, often resulting in highly fragmented genome assemblies. Metagenomic binning allows us to reconstruct genomes by re-grouping the sequences by their organism of origin, thus representing a crucial processing step when exploring the biological diversity of metagenomic samples. Here we present Adversarial Autoencoders for Metagenomics Binning (AAMB), an ensemble deep learning approach that integrates sequence co-abundances and tetranucleotide frequencies into a common denoised space that enables precise clustering of sequences into microbial genomes. When benchmarked, AAMB presented similar or better results compared with the state-of-the-art reference-free binner VAMB, reconstructing ~7% more near-complete (NC) genomes across simulated and real data. In addition, genomes reconstructed using AAMB had higher completeness and greater taxonomic diversity compared with VAMB. Finally, we implemented a pipeline integrating VAMB and AAMB that enabled improved binning, recovering 20% and 29% more simulated and real NC genomes, respectively, compared to VAMB, with moderate additional runtime.
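As an illustration of one of the two input signals named in the abstract above, the following minimal Python sketch computes tetranucleotide frequencies for a single contig. It is not taken from AAMB or VAMB; the function name `tetranucleotide_frequencies` is hypothetical, and the sketch assumes plain relative 4-mer frequencies without the reverse-complement merging or other normalization that production binners may apply.

```python
from collections import Counter
from itertools import product


def tetranucleotide_frequencies(seq: str) -> dict[str, float]:
    """Return the relative frequency of each of the 256 DNA 4-mers in seq.

    Hypothetical helper for illustration only. Windows containing any
    character other than A, C, G, or T (for example N) are ignored.
    """
    seq = seq.upper()
    # All 256 possible 4-mers over the DNA alphabet, in a fixed order.
    kmers = ["".join(p) for p in product("ACGT", repeat=4)]
    # Count every overlapping window of length 4.
    counts = Counter(seq[i:i + 4] for i in range(len(seq) - 3))
    # Normalize only over windows that are valid DNA 4-mers.
    total = sum(counts[k] for k in kmers)
    if total == 0:
        return {k: 0.0 for k in kmers}
    return {k: counts[k] / total for k in kmers}


if __name__ == "__main__":
    tnf = tetranucleotide_frequencies("ACGTACGT")
    print(len(tnf), tnf["ACGT"])  # 256 features; ACGT fills 2 of 5 windows
```

Each contig becomes a fixed-length vector regardless of its length, which is what allows composition to be combined with per-sample abundance before clustering.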
Topics: Metagenome; Genome, Microbial; Metagenomics; Cluster Analysis; Benchmarking
PubMed: 37865678
DOI: 10.1038/s42003-023-05452-3 -
Scientific Data Oct 2023
Biofloc technology is increasingly recognised as a sustainable aquaculture method. In this technique, bioflocs are generated as microbial aggregates that play pivotal roles in assimilating toxic nitrogenous substances, thereby ensuring high water quality. Despite the crucial roles of the floc-associated bacterial (FAB) community in pathogen control and animal health, earlier microbiota studies have primarily relied on metataxonomic approaches. Here, we employed shotgun sequencing on eight biofloc metagenomes from a commercial aquaculture system. This resulted in the generation of 106.6 Gbp and the reconstruction of 444 metagenome-assembled genomes (MAGs). Among the recovered MAGs, 230 were high-quality (≥90% completeness, ≤5% contamination), and 214 were medium-quality (≥50% completeness, ≤10% contamination). Phylogenetic analysis unveiled Rhodobacteraceae as dominant members of the FAB community. The reported metagenomes and MAGs are crucial for elucidating the roles of diverse microorganisms and their functional genes in key processes such as nitrification, denitrification, and remineralization. This study will contribute to scientific understanding of the phylogenetic diversity and metabolic capabilities of microbial taxa in aquaculture environments.
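The quality tiers quoted in the abstract above can be expressed as a small classifier. This is a minimal Python sketch using only the completeness and contamination thresholds stated there; the function name `classify_mag` and the "low" label for everything else are assumptions made for illustration, not part of the study.

```python
def classify_mag(completeness: float, contamination: float) -> str:
    """Assign a quality tier from completeness and contamination (both in %).

    Thresholds follow the abstract: high is >=90% complete and <=5%
    contaminated; medium is >=50% complete and <=10% contaminated.
    The "low" fallback label is an assumption for illustration.
    """
    if completeness >= 90 and contamination <= 5:
        return "high"
    if completeness >= 50 and contamination <= 10:
        return "medium"
    return "low"


if __name__ == "__main__":
    # Hypothetical example values, not data from the study.
    for comp, cont in [(97.3, 1.2), (72.0, 6.5), (41.0, 0.8)]:
        print(comp, cont, classify_mag(comp, cont))
```

The high-quality test runs first, so a genome that satisfies both sets of thresholds is reported only once, in the stricter tier.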
Topics: Animals; Aquaculture; Bacteria; Metagenome; Metagenomics; Microbiota; Phylogeny
PubMed: 37848477
DOI: 10.1038/s41597-023-02622-0 -
metaMIC: reference-free misassembly identification and correction of de novo metagenomic assemblies. Genome Biology Nov 2022
Evaluating the quality of metagenomic assemblies is important for constructing reliable metagenome-assembled genomes and downstream analyses. Here, we present metaMIC (https://github.com/ZhaoXM-Lab/metaMIC), a machine learning-based tool for identifying and correcting misassemblies in metagenomic assemblies. Benchmarking results on both simulated and real datasets demonstrate that metaMIC outperforms existing tools when identifying misassembled contigs. Furthermore, metaMIC is able to localize the misassembly breakpoints, and the correction of misassemblies by splitting at misassembly breakpoints can improve downstream scaffolding and binning results.
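The correction step described above, splitting a contig at its misassembly breakpoints, can be sketched in a few lines of Python. This is not metaMIC's code: the function name `split_at_breakpoints` is hypothetical, and the sketch assumes the breakpoints have already been located and are given as 0-based positions at which a new fragment starts.

```python
def split_at_breakpoints(contig: str, breakpoints: list[int]) -> list[str]:
    """Split a contig sequence into fragments at the given positions.

    Hypothetical helper for illustration only. Positions are 0-based
    indices at which a new fragment begins; duplicates and positions
    outside the interior of the contig are ignored.
    """
    # Keep only unique interior cut points, in ascending order.
    cuts = sorted({b for b in breakpoints if 0 < b < len(contig)})
    # Pair consecutive edges to form half-open fragment intervals.
    edges = [0, *cuts, len(contig)]
    return [contig[start:end] for start, end in zip(edges, edges[1:])]


if __name__ == "__main__":
    print(split_at_breakpoints("AAAACCCCGGGG", [4, 8]))
```

Concatenating the returned fragments always reproduces the input contig, so the split discards no sequence; it only removes the joins flagged as erroneous.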
Topics: Metagenome; Sequence Analysis, DNA; Metagenomics; Machine Learning; Benchmarking; Software; Algorithms
PubMed: 36376928
DOI: 10.1186/s13059-022-02810-y -
Microbiome Apr 2018
BACKGROUND
The characterization of microbial communities based on sequencing and analysis of their genetic information, also referred to as metagenomics, has become a popular approach; in particular, recent advances in sequencing technologies have enabled researchers to study even the most complex communities. Metagenome analysis, the assignment of sequences to taxonomic and functional entities, however, remains a tedious task because large amounts of data need to be processed. There are a number of approaches addressing particular aspects, but scientific questions are often too specific to be answered by a general-purpose method.
RESULTS
We present MGX, a flexible and extensible client/server framework for the management and analysis of metagenomic datasets; MGX features a comprehensive set of adaptable workflows required for taxonomic and functional metagenome analysis, combined with an intuitive and easy-to-use graphical user interface offering customizable result visualizations. At the same time, MGX allows users to include their own data sources and devise custom analysis pipelines, thus enabling researchers to perform basic as well as highly specific analyses within a single application.
CONCLUSIONS
With MGX, we provide a novel metagenome analysis platform giving researchers access to the most recent analysis tools. MGX covers taxonomic and functional metagenome analysis, statistical evaluation, and a wide range of visualizations easing data interpretation. Its default taxonomic classification pipeline provides equivalent or superior results in comparison to existing tools.
Topics: Database Management Systems; Metagenome; Metagenomics; Microbiota; Reproducibility of Results; User-Computer Interface; Workflow
PubMed: 29690922
DOI: 10.1186/s40168-018-0460-1 -
Nucleic Acids Research Aug 2022
Genome binning has been essential for the characterization of bacteria, archaea, and even eukaryotes from metagenomes. Yet, few approaches exist for viruses. We developed vRhyme, fast and precise software for the construction of viral metagenome-assembled genomes (vMAGs). vRhyme utilizes single- or multi-sample coverage effect size comparisons between scaffolds and employs supervised machine learning to identify nucleotide feature similarities, which are compiled into iterations of weighted networks and refined bins. To refine bins, vRhyme utilizes unique features of viral genomes, namely a protein redundancy scoring mechanism based on the observation that viruses seldom encode redundant genes. Using simulated viromes, we demonstrated superior performance of vRhyme compared to available binning tools in constructing more complete and uncontaminated vMAGs. When applied to 10,601 viral scaffolds from human skin, vRhyme advanced our understanding of resident viruses, highlighted by the identification of a Herelleviridae vMAG comprising 22 scaffolds and another vMAG encoding a nitrate reductase metabolic gene, both representing near-complete genomes post-binning. vRhyme will enable a convention of binning uncultivated viral genomes and has the potential to transform metagenome-based viral ecology.
Topics: Genome, Viral; High-Throughput Nucleotide Sequencing; Humans; Metagenome; Metagenomics; Sequence Analysis, DNA; Software
PubMed: 35544285
DOI: 10.1093/nar/gkac341 -
Microbiology Spectrum Aug 2023
Microbial secondary metabolites play crucial roles in microbial competition, communication, resource acquisition, antibiotic production, and a variety of other biotechnological processes. The retrieval of full-length BGC (biosynthetic gene cluster) sequences from uncultivated bacteria is difficult due to the technical constraints of short-read sequencing, making it impossible to determine BGC diversity. Using long-read sequencing and genome mining, 339 mainly full-length BGCs were recovered in this study, illuminating the wide range of BGCs from uncultivated lineages discovered in seawater from Aoshan Bay, Yellow Sea, China. Many extremely diverse BGCs were discovered in bacterial phyla such as , , , and as well as the previously uncultured archaeal phylum " Thermoplasmatota." The data from metatranscriptomics showed that 30.1% of secondary metabolic genes were being expressed, and they also revealed the expression pattern of BGC core biosynthetic genes and tailoring enzymes. Taken together, our results demonstrate that long-read metagenomic sequencing combined with metatranscriptomic analysis provides a direct view into the functional expression of BGCs in environmental processes.
Genome mining of metagenomic data has become the preferred method for the bioprospecting of novel compounds by cataloguing secondary metabolite potential. However, the accurate detection of BGCs requires unfragmented genomic assemblies, which have been technically difficult to obtain from metagenomes until recently with new long-read technologies. We used high-quality metagenome-assembled genomes generated from long-read data to determine the biosynthetic potential of microbes found in the surface water of the Yellow Sea. We recovered 339 highly diverse and mostly full-length BGCs from largely uncultured and underexplored bacterial and archaeal phyla. Additionally, we present long-read metagenomic sequencing combined with metatranscriptomic analysis as a potential method for gaining access to the largely underutilized genetic reservoir of specialized metabolite gene clusters in the majority of microbes that are not cultured. The combination of long-read metagenomic and metatranscriptomic analyses is significant because it can more accurately assess the mechanisms of microbial adaptation to the environment through BGC expression based on metatranscriptomic data.
Topics: Metagenomics; Bacteria; Metagenome; Archaea; Bacteroidetes
PubMed: 37409950
DOI: 10.1128/spectrum.01501-23