-
Nature Biotechnology Nov 2023
Metagenomic assembly enables new organism discovery from microbial communities, but it can capture only a few abundant organisms from most metagenomes. Here we present MetaPhlAn 4, which integrates information from metagenome assemblies and microbial isolate genomes for more comprehensive metagenomic taxonomic profiling. From a curated collection of 1.01 M prokaryotic reference and metagenome-assembled genomes, we define unique marker genes for 26,970 species-level genome bins, 4,992 of them taxonomically unidentified at the species level. MetaPhlAn 4 explains ~20% more reads in most international human gut microbiomes and >40% more in less-characterized environments such as the rumen microbiome, and it proves more accurate than available alternatives on synthetic evaluations while also reliably quantifying organisms with no cultured isolates. Application of the method to >24,500 metagenomes highlights previously undetected species as strong biomarkers for host conditions and lifestyles in human and mouse microbiomes and shows that even previously uncharacterized species can be genetically profiled at the resolution of single microbial strains.
Topics: Humans; Animals; Mice; Metagenome; Microbiota; Gastrointestinal Microbiome; Metagenomics; Phylogeny
PubMed: 36823356
DOI: 10.1038/s41587-023-01688-w -
Pharmacological Research Mar 2013
Review
The microbes residing in and on the human body influence human physiology in many ways, particularly through their impact on the metabolism of xenobiotic compounds, including therapeutic drugs, antibiotics, and diet-derived bioactive compounds. Despite the importance of these interactions and the many possibilities for intervention, microbial xenobiotic metabolism remains a largely underexplored component of pharmacology. Here, we discuss the emerging evidence for both direct and indirect effects of the human gut microbiota on xenobiotic metabolism, and the initial links that have been made between specific compounds, diverse members of this complex community, and the microbial genes responsible. Furthermore, we highlight the many parallels to the now well-established field of environmental bioremediation, and the vast potential to leverage emerging metagenomic tools to shed new light on these important microbial biotransformations.
Topics: Animals; Biotransformation; Gastrointestinal Tract; Humans; Metagenome; Metagenomics; Xenobiotics
PubMed: 22902524
DOI: 10.1016/j.phrs.2012.07.009 -
Molecules (Basel, Switzerland) May 2021
Review
Microorganisms are highly regarded as a prominent source of natural products (NPs) that have significant importance in many fields such as medicine, farming, environmental safety, and material production. However, only a tiny fraction of microorganisms can be cultivated under standard laboratory conditions, and the bulk of microorganisms in ecosystems are still unidentified, which restricts our knowledge of uncultured microbial metabolism. These uncultured microorganisms could nevertheless provide a large collection of innovative natural products. Culture-independent metagenomic studies can address core questions about the potential for NP production by cloning and analyzing microbial DNA derived directly from environmental samples. The latest advancements in next-generation sequencing and genetic engineering tools for genome assembly have broadened the scope of metagenomics to offer perspectives into the life of uncultured microorganisms. In this review, we cover the methods of metagenomic library construction and heterologous expression for the exploration and development of the environmental metabolome, and we focus on the function-based metagenomics, sequencing-based metagenomics, and single-cell metagenomics of uncultured microorganisms.
Topics: Bacteria; Biological Products; Ecosystem; High-Throughput Nucleotide Sequencing; Metagenome; Metagenomics
PubMed: 34067778
DOI: 10.3390/molecules26102977 -
FEMS Microbiology Ecology Jul 2016
The advent of next-generation sequencing has allowed huge amounts of DNA sequence data to be produced, advancing the capabilities of microbial ecosystem studies. The current challenge is to identify from which microorganisms and genes the DNA originated. Several tools and databases are available for annotating DNA sequences. The tools, databases and parameters used can have a significant impact on the results: naïve choice of these factors can result in a false representation of community composition and function. We use a simulated metagenome to show how different parameters affect annotation accuracy by evaluating the sequence annotation performances of MEGAN, MG-RAST, One Codex and Megablast. This simulated metagenome allowed the recovery of known organism and function abundances to be quantitatively evaluated, which is not possible for environmental metagenomes. The performance of each program and database varied, e.g. One Codex correctly annotated many sequences at the genus level, whereas MG-RAST RefSeq produced many false positive annotations. This effect decreased as the taxonomic level investigated increased. Selecting more stringent parameters decreases the annotation sensitivity, but increases precision. Ultimately, there is a trade-off between taxonomic resolution and annotation accuracy. These results should be considered when annotating metagenomes and interpreting results from previous studies.
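The trade-off between sensitivity and precision described in this abstract can be illustrated with a minimal sketch. It assumes a simulated metagenome in which the true taxon of every read is known; the function name, the read identifiers, and the two toy annotation sets are hypothetical and are not taken from the study or from any of the tools it evaluates.

```python
def precision_sensitivity(truth, predicted):
    """Score taxonomic annotations against a simulated ground truth.

    truth:     dict mapping read ID -> true taxon
    predicted: dict mapping read ID -> assigned taxon, or None if unannotated
    """
    true_pos = 0
    false_pos = 0
    for read_id, taxon in predicted.items():
        if taxon is None:
            continue  # unannotated reads lower sensitivity, not precision
        if truth.get(read_id) == taxon:
            true_pos += 1
        else:
            false_pos += 1
    annotated = true_pos + false_pos
    precision = true_pos / annotated if annotated else 0.0
    sensitivity = true_pos / len(truth) if truth else 0.0
    return precision, sensitivity


# Hypothetical toy data: four simulated reads with known genus of origin.
truth = {"r1": "Escherichia", "r2": "Escherichia", "r3": "Bacillus", "r4": "Bacillus"}

# Lenient parameters annotate every read but misassign one of them.
lenient = {"r1": "Escherichia", "r2": "Escherichia", "r3": "Bacillus", "r4": "Escherichia"}

# Stringent parameters leave ambiguous reads unannotated.
stringent = {"r1": "Escherichia", "r2": None, "r3": "Bacillus", "r4": None}

lenient_scores = precision_sensitivity(truth, lenient)
stringent_scores = precision_sensitivity(truth, stringent)
```

In this toy case the stringent setting removes the false positive at the cost of annotating fewer reads, which is the pattern the abstract reports.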
Topics: Bacteria; Environmental Microbiology; High-Throughput Nucleotide Sequencing; Metagenome; Metagenomics; Molecular Sequence Annotation; Software
PubMed: 27162180
DOI: 10.1093/femsec/fiw095 -
mSystems Aug 2022
Metagenome-assembled genomes (MAGs) represent individual genomes recovered from metagenomic data. MAGs are extremely useful to analyze uncultured microbial genomic diversity, as well as to characterize associated functional and metabolic potential in natural environments. Recent computational developments have considerably improved MAG reconstruction but also emphasized several limitations, such as the nonbinning of sequence regions with repetitions or distinct nucleotidic composition. Different assembly and binning strategies are often used; however, it still remains unclear which assembly strategy, in combination with which binning approach, offers the best performance for MAG recovery. Several workflows have been proposed in order to reconstruct MAGs, but users are usually limited to single-metagenome assembly or need to manually define sets of metagenomes to coassemble prior to genome binning. Here, we present MAGNETO, an automated workflow dedicated to MAG reconstruction, which includes a fully-automated coassembly step informed by optimal clustering of metagenomic distances, and implements complementary genome binning strategies, for improving MAG recovery. MAGNETO is implemented as a Snakemake workflow and is available at: https://gitlab.univ-nantes.fr/bird_pipeline_registry/magneto.
IMPORTANCE
Genome-resolved metagenomics has led to the discovery of previously untapped biodiversity within the microbial world. As the development of computational methods for the recovery of genomes from metagenomes continues, existing strategies need to be evaluated and compared to eventually lead to standardized computational workflows. In this study, we compared commonly used assembly and binning strategies and assessed their performance using both simulated and real metagenomic data sets. We propose a novel approach to automate coassembly, avoiding the requirement for prior knowledge to combine metagenomic information. The comparison against a previous coassembly approach demonstrates a strong impact of this step on genome binning results, but also the benefits of informing coassembly for improving the quality of recovered genomes. MAGNETO integrates complementary assembly-binning strategies to optimize genome reconstruction and provides a complete reads-to-genomes workflow for the growing microbiome research community.
Topics: Workflow; Metagenomics; Metagenome; Microbiota; Genome, Microbial
PubMed: 35703559
DOI: 10.1128/msystems.00432-22 -
Bioinformatics (Oxford, England) May 2020
MOTIVATION
Methodological advances in metagenome assembly are rapidly increasing the number of published metagenome assemblies. However, identifying misassemblies is challenging due to a lack of closely related reference genomes that can act as pseudo ground truth. Existing reference-free methods are no longer maintained, can make strong assumptions that may not hold across a diversity of research projects, and have not been validated on large-scale metagenome assemblies.
RESULTS
We present DeepMAsED, a deep learning approach for identifying misassembled contigs without the need for reference genomes. Moreover, we provide an in silico pipeline for generating large-scale, realistic metagenome assemblies for comprehensive model training and testing. DeepMAsED accuracy substantially exceeds the state-of-the-art when applied to large and complex metagenome assemblies. Our model estimates a 1% contig misassembly rate in two recent large-scale metagenome assembly publications.
CONCLUSIONS
DeepMAsED accurately identifies misassemblies in metagenome-assembled contigs from a broad diversity of bacteria and archaea without the need for reference genomes or strong modeling assumptions. Running DeepMAsED is straightforward, as is model re-training with our dataset generation pipeline. Therefore, DeepMAsED is a flexible misassembly classifier that can be applied to a wide range of metagenome assembly projects.
AVAILABILITY AND IMPLEMENTATION
DeepMAsED is available from GitHub at https://github.com/leylabmpi/DeepMAsED.
SUPPLEMENTARY INFORMATION
Supplementary data are available at Bioinformatics online.
Topics: Bacteria; Computer Simulation; Metagenome; Metagenomics; Sequence Analysis, DNA; Software
PubMed: 32096824
DOI: 10.1093/bioinformatics/btaa124 -
Gut Jan 2013
Review
Advances in sequencing technology and the development of metagenomic and bioinformatics methods have opened up new ways to investigate the 10^14 microorganisms inhabiting the human gut. The gene composition of the human gut microbiome in a large and deeply sequenced cohort highlighted an overall non-redundant genome size 150 times larger than the human genome. The in silico predictions based on metagenomic sequencing are now actively followed, compared and challenged using additional 'omics' technologies. Interactions between the microbiota and its host are of key interest in several pathologies, and applying meta-omics to describe the human gut microbiome will give a better understanding of this crucial crosstalk at mucosal interfaces. Adding to the growing appreciation of the importance of the microbiome is the discovery that numerous phages, that is, viruses of prokaryotes infecting bacteria (bacteriophages) or archaea with a high host specificity, inhabit the human gut and impact microbial activity. In addition, gene exchanges within the gut microbiota have proved to be more frequent than anticipated. Taken together, these innovative exploratory technologies are expected to unravel new information networks critical for gut homeostasis and human health. Among the challenges faced, the in vivo validation of these networks, together with their integration into the prediction and prognosis of disease, may require further working hypotheses and collaborative efforts.
Topics: Bacteriophages; Humans; Inflammatory Bowel Diseases; Intestines; Metagenome; Metagenomics
PubMed: 22525886
DOI: 10.1136/gutjnl-2011-301805 -
NeoReviews May 2019
Review
The human microbiota includes the trillions of microorganisms living in the human body, whereas the human microbiome includes the genes and gene products of this microbiota. Bacteria were historically largely considered to be pathogens that inevitably led to human disease. However, because of advances in cultivation-based methods and the advent of metagenomics, bacteria are now recognized to be largely beneficial commensal organisms and thus key to normal and healthy human development. This relatively new area of medical research has yielded insights into diseases such as inflammatory bowel disease and obesity, as well as metabolic and atopic disorders. However, much remains unknown about the complexity of microbe-microbe and microbe-host interactions. Future efforts aimed at answering key questions pertaining to the early establishment of the microbiome, alongside what defines its dysbiosis, will likely lead to long-term health and mitigation of disease. Here, we review the relevant literature pertaining to modulations in the perinatal and neonatal microbiome, the impact of environmental and maternal factors in shaping the neonatal microbiome, and future questions and directions in the exciting emerging arena of metagenomic medicine.
Topics: Female; Forecasting; Humans; Infant Health; Infant, Newborn; Metagenome; Metagenomics; Microbiota; Pregnancy
PubMed: 31261078
DOI: 10.1542/neo.20-5-e258 -
PLoS One 2016
Ever-increasing affordability of next-generation sequencing makes whole-metagenome sequencing an attractive alternative to traditional 16S rDNA, RFLP, or culturing approaches for the analysis of microbiome samples. The advantage of whole-metagenome sequencing is that it allows direct inference of the metabolic capacity and physiological features of the studied metagenome without reliance on the knowledge of genotypes and phenotypes of the members of the bacterial community. It also makes it possible to overcome problems of 16S rDNA sequencing, such as unknown copy number of the 16S gene and lack of sufficient sequence similarity of the "universal" 16S primers to some of the target 16S genes. On the other hand, next-generation sequencing suffers from biases resulting in non-uniform coverage of the sequenced genomes. To overcome this difficulty, we present a model of GC-bias in sequencing metagenomic samples as well as filtration and normalization techniques necessary for accurate quantification of microbial organisms. While there has been substantial research in normalization and filtration of read-count data in such techniques as RNA-seq or ChIP-seq, to our knowledge, this has not been the case for the field of whole-metagenome shotgun sequencing. The presented methods assume that complete genome references are available for most microorganisms of interest present in metagenomic samples. This is often a valid assumption in such fields as medical diagnostics of patient microbiota. Testing the model on two validation datasets showed a four-fold reduction in root-mean-square error compared to non-normalized data in both cases. The presented methods can be applied to any pipeline for whole-metagenome sequencing analysis relying on complete microbial genome references. We demonstrate that such pre-processing reduces the number of false positive hits and increases accuracy of abundance estimates.
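The idea of correcting read counts for GC bias and scoring the result by root-mean-square error can be sketched as follows. This is a generic bin-wise correction, not the model presented in the paper: it assumes fixed-size genomic windows from a single reference genome whose true coverage is uniform, and the function names, the number of GC bins, and the toy counts are all hypothetical.

```python
import math
from collections import defaultdict


def gc_normalize(windows, n_bins=10):
    """Rescale per-window read counts so every GC bin has the same mean.

    windows: list of (gc_fraction, read_count) tuples for fixed-size windows.
    Returns a list of corrected counts in the same order.
    """
    overall_mean = sum(count for _, count in windows) / len(windows)

    def bin_of(gc):
        # Clamp so that gc == 1.0 falls in the last bin.
        return min(int(gc * n_bins), n_bins - 1)

    counts_by_bin = defaultdict(list)
    for gc, count in windows:
        counts_by_bin[bin_of(gc)].append(count)
    bin_mean = {b: sum(c) / len(c) for b, c in counts_by_bin.items()}

    corrected = []
    for gc, count in windows:
        mean = bin_mean[bin_of(gc)]
        corrected.append(count * overall_mean / mean if mean else 0.0)
    return corrected


def rmse(values, expected):
    """Root-mean-square error of values around one expected coverage."""
    return math.sqrt(sum((v - expected) ** 2 for v in values) / len(values))


# Hypothetical toy data: low-GC windows are under-covered, mid-GC over-covered.
windows = [(0.35, 50), (0.35, 50), (0.55, 100), (0.55, 100)]
raw_counts = [count for _, count in windows]
corrected_counts = gc_normalize(windows)

raw_error = rmse(raw_counts, 75)
corrected_error = rmse(corrected_counts, 75)
```

The toy data are constructed so that the correction removes the bias entirely; real data would leave residual error, and the four-fold reduction reported in the abstract refers to the authors' own model and validation datasets.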
Topics: Base Composition; Chromosome Mapping; High-Throughput Nucleotide Sequencing; Humans; Metagenome; Metagenomics; Sequence Analysis, RNA; Software
PubMed: 27760173
DOI: 10.1371/journal.pone.0165015 -
Microbiome Feb 2019
BACKGROUND
Shotgun metagenome data sets of microbial communities are highly diverse, not only due to the natural variation of the underlying biological systems, but also due to differences in laboratory protocols, replicate numbers, and sequencing technologies. Accordingly, to effectively assess the performance of metagenomic analysis software, a wide range of benchmark data sets are required.
RESULTS
We describe the CAMISIM microbial community and metagenome simulator. The software can model different microbial abundance profiles, multi-sample time series, and differential abundance studies, includes real and simulated strain-level diversity, and generates second- and third-generation sequencing data from taxonomic profiles or de novo. Gold standards are created for sequence assembly, genome binning, taxonomic binning, and taxonomic profiling. CAMISIM generated the benchmark data sets of the first CAMI challenge. For two simulated multi-sample data sets of the human and mouse gut microbiomes, we observed high functional congruence to the real data. As further applications, we investigated the effect of varying evolutionary genome divergence, sequencing depth, and read error profiles on two popular metagenome assemblers, MEGAHIT and metaSPAdes, on several thousand small data sets generated with CAMISIM.
CONCLUSIONS
CAMISIM can simulate a wide variety of microbial communities and metagenome data sets together with standards of truth for method evaluation. All data sets and the software are freely available at https://github.com/CAMI-challenge/CAMISIM.
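Simulating an abundance profile together with its standard of truth can be sketched in a few lines. This is not CAMISIM's code or interface: it assumes log-normally distributed abundances, ignores genome length when drawing reads, and uses hypothetical function names, parameter defaults, and genome labels purely for illustration.

```python
import random
from collections import Counter


def simulate_profile(genomes, mu=1.0, sigma=2.0, seed=0):
    """Draw a relative-abundance profile from a log-normal distribution.

    mu and sigma are illustrative defaults (an assumption of this sketch).
    Returns a dict mapping genome -> relative abundance; values sum to 1.
    """
    rng = random.Random(seed)
    raw = [rng.lognormvariate(mu, sigma) for _ in genomes]
    total = sum(raw)
    return {genome: value / total for genome, value in zip(genomes, raw)}


def sample_read_origins(profile, n_reads, seed=0):
    """Assign each simulated read to a source genome.

    Reads are drawn in proportion to relative abundance only; a fuller
    simulator would also weight by genome length. The returned Counter is
    the ground truth against which a profiler or binner can be scored.
    """
    rng = random.Random(seed)
    genomes = list(profile)
    weights = [profile[genome] for genome in genomes]
    return Counter(rng.choices(genomes, weights=weights, k=n_reads))


# Hypothetical community of five genomes and 1,000 simulated reads.
genomes = ["genome_A", "genome_B", "genome_C", "genome_D", "genome_E"]
profile = simulate_profile(genomes)
read_counts = sample_read_origins(profile, n_reads=1000)
```

Fixing the seeds makes the data set reproducible, which matters when the same benchmark has to be shared across several tools.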
Topics: Algorithms; Animals; Computer Simulation; Gastrointestinal Microbiome; Humans; Metagenome; Metagenomics; Mice; Models, Biological; Sequence Analysis, DNA; Software
PubMed: 30736849
DOI: 10.1186/s40168-019-0633-6