metagenomics - OpenMD.com Journal Search

Metagenome Proteins and Database Contamination.

MSphere Nov 2020

Authors: Irina R Arkhipova

Continued influx of metagenome-derived proteins with misannotated taxonomy into conventional databases, including RefSeq, threatens to eliminate the value of taxonomy identifiers. To prevent this, urgent efforts should be undertaken by submitters of metagenomic data sets as well as by database managers.

Topics: Algorithms; Databases, Genetic; Metagenome; Metagenomics; Proteins

PubMed: 33148820
DOI: 10.1128/mSphere.00854-20

MGnify Genomes: A Resource for Biome-specific Microbial Genome Catalogues.

Journal of Molecular Biology Jul 2023

Authors: Tatiana A Gurbich, Alexandre Almeida, Martin Beracochea...

An increasingly common output arising from the analysis of shotgun metagenomic datasets is the generation of metagenome-assembled genomes (MAGs), with tens of thousands of MAGs now described in the literature. However, the discovery and comparison of these MAG collections is hampered by the lack of uniformity in their generation, annotation and storage. To address this, we have developed MGnify Genomes, a growing collection of biome-specific non-redundant microbial genome catalogues generated using MAGs and publicly available isolate genomes. Genomes within a biome-specific catalogue are organised into species clusters. For species that contain multiple conspecific genomes, the highest quality genome is selected as the representative, always prioritising an isolate genome over a MAG. The species representative sequences and annotations can be visualised on the MGnify website and the full catalogue and associated analysis outputs can be downloaded from MGnify servers. A suite of online search tools is provided allowing users to compare their own sequences, ranging from a gene to sets of genomes, against the catalogues. Seven biomes are available currently, comprising over 300,000 genomes that represent 11,048 non-redundant species, and include 36 taxonomic classes not currently represented by cultured genomes. MGnify Genomes is available at https://www.ebi.ac.uk/metagenomics/browse/genomes/.
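The representative-selection rule described above (highest-quality genome per species cluster, with an isolate always preferred over a MAG) can be sketched as follows. This is a minimal illustration, not the MGnify pipeline: the `Genome` fields, the `quality_score` formula (completeness minus five times contamination), and the accessions are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Genome:
    accession: str        # hypothetical identifier
    is_isolate: bool      # True for an isolate genome, False for a MAG
    completeness: float   # percent
    contamination: float  # percent


def quality_score(genome):
    # Assumed scoring formula for illustration only; the abstract does not
    # state how "highest quality" is computed.
    return genome.completeness - 5.0 * genome.contamination


def pick_representative(cluster):
    # Sort key is (is_isolate, score): any isolate outranks any MAG,
    # and quality only breaks ties within the same genome type.
    return max(cluster, key=lambda g: (g.is_isolate, quality_score(g)))


cluster = [
    Genome("MAG_A", False, 98.0, 0.5),
    Genome("ISO_B", True, 95.0, 1.0),
    Genome("MAG_C", False, 90.0, 2.0),
]
print(pick_representative(cluster).accession)
```

Here the isolate is chosen even though one MAG has a higher score, which mirrors the stated priority rule.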

Topics: Genome, Microbial; Metagenome; Metagenomics

PubMed: 36806692
DOI: 10.1016/j.jmb.2023.168016

Target-enriched long-read sequencing (TELSeq) contextualizes antimicrobial resistance genes in metagenomes.

Microbiome Nov 2022

Authors: Ilya B Slizovskiy, Marco Oliva, Jonathen K Settle...

BACKGROUND

Metagenomic data can be used to profile high-importance genes within microbiomes. However, current metagenomic workflows produce data that suffer from low sensitivity and an inability to accurately reconstruct partial or full genomes, particularly those in low abundance. These limitations preclude colocalization analysis, i.e., characterizing the genomic context of genes and functions within a metagenomic sample. Genomic context is especially crucial for functions associated with horizontal gene transfer (HGT) via mobile genetic elements (MGEs), for example antimicrobial resistance (AMR). To overcome this current limitation of metagenomics, we present a method for comprehensive and accurate reconstruction of antimicrobial resistance genes (ARGs) and MGEs from metagenomic DNA, termed target-enriched long-read sequencing (TELSeq).

RESULTS

Using technical replicates of diverse sample types, we compared TELSeq performance to that of non-enriched PacBio and short-read Illumina sequencing. TELSeq achieved much higher ARG recovery (>1,000-fold) and sensitivity than the other methods across diverse metagenomes, revealing an extensive resistome profile comprising many low-abundance ARGs, including some with public health importance. Using the long reads generated by TELSeq, we identified numerous MGEs and cargo genes flanking the low-abundance ARGs, indicating that these ARGs could be transferred across bacterial taxa via HGT.

CONCLUSIONS

TELSeq can provide a nuanced view of the genomic context of microbial resistomes and thus has wide-ranging applications in public, animal, and human health, as well as environmental surveillance and monitoring of AMR. Thus, this technique represents a fundamental advancement for microbiome research and application.

Topics: Animals; Humans; Metagenome; Anti-Bacterial Agents; Genes, Bacterial; Drug Resistance, Bacterial; Metagenomics

PubMed: 36324140
DOI: 10.1186/s40168-022-01368-y

TAMPA: interpretable analysis and visualization of metagenomics-based taxon abundance profiles.

GigaScience Dec 2022

Authors: Varuni Sarwal, Jaqueline Brito, Serghei Mangul...

BACKGROUND

Metagenomic taxonomic profiling aims to predict the identity and relative abundance of taxa in a given whole-genome sequencing metagenomic sample. A recent surge in computational methods that aim to accurately estimate taxonomic profiles, called taxonomic profilers, has motivated community-driven efforts to create standardized benchmarking datasets and platforms, standardized taxonomic profile formats, and a benchmarking platform to assess tool performance. While this standardization is essential, there is currently a lack of tools to visualize the standardized output of the many existing taxonomic profilers. Thus, benchmarking studies rely on single-value metrics to compare the performance of tools against benchmarking datasets. This is one of the major problems in analyzing metagenomic profiling data, since single metrics, such as the F1 score, fail to capture the biological differences between the datasets.

FINDINGS

Here we report the development of TAMPA (Taxonomic metagenome profiling evaluation), a robust and easy-to-use method that allows scientists to easily interpret and interact with taxonomic profiles produced by the many different taxonomic profiler methods beyond the standard metrics used by the scientific community. We demonstrate the unique ability of TAMPA to generate a novel biological hypothesis by highlighting the taxonomic differences between samples otherwise missed by commonly utilized metrics.

CONCLUSION

In this study, we show that TAMPA can help visualize the output of taxonomic profilers, enabling biologists to effectively choose the most appropriate profiling method to use on their metagenomics data. TAMPA is available on GitHub, Bioconda, and Galaxy Toolshed at https://github.com/dkoslicki/TAMPA and is released under the MIT license.

Topics: Metagenomics; Benchmarking; Metagenome; Whole Genome Sequencing

PubMed: 36852763
DOI: 10.1093/gigascience/giad008

metaMIC: reference-free misassembly identification and correction of de novo metagenomic assemblies.

Genome Biology Nov 2022

Authors: Senying Lai, Shaojun Pan, Chuqing Sun...

Evaluating the quality of metagenomic assemblies is important for constructing reliable metagenome-assembled genomes and downstream analyses. Here, we present metaMIC (https://github.com/ZhaoXM-Lab/metaMIC), a machine learning-based tool for identifying and correcting misassemblies in metagenomic assemblies. Benchmarking results on both simulated and real datasets demonstrate that metaMIC outperforms existing tools when identifying misassembled contigs. Furthermore, metaMIC is able to localize the misassembly breakpoints, and the correction of misassemblies by splitting at misassembly breakpoints can improve downstream scaffolding and binning results.
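The correction step mentioned above, splitting a contig at its misassembly breakpoints, can be illustrated with a short sketch. This is not metaMIC's code or interface: the function name, the 0-based breakpoint convention, and the `_partN` naming scheme are assumptions for the example, and breakpoint detection itself (the machine-learning part) is not shown.

```python
def split_at_breakpoints(contig_id, sequence, breakpoints):
    """Split a contig into pieces at the given 0-based cut positions.

    A cut at position b ends one piece at b and starts the next at b.
    Positions outside the open interval (0, len(sequence)) are ignored.
    """
    cuts = sorted({b for b in breakpoints if 0 < b < len(sequence)})
    edges = [0] + cuts + [len(sequence)]
    return [
        (f"{contig_id}_part{i + 1}", sequence[start:end])
        for i, (start, end) in enumerate(zip(edges, edges[1:]))
    ]


pieces = split_at_breakpoints("contig_1", "ACGTACGTAC", [4, 7])
for name, seq in pieces:
    print(name, seq)
```

Each resulting piece can then be passed to scaffolding or binning as an independent contig.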

Topics: Metagenome; Sequence Analysis, DNA; Metagenomics; Machine Learning; Benchmarking; Software; Algorithms

PubMed: 36376928
DOI: 10.1186/s13059-022-02810-y

Assembling Reads Improves Taxonomic Classification of Species.

Genes Aug 2020

Authors: Quang Tran, Vinhthuy Phan

Most current approaches to metagenomic classification employ short next-generation sequencing (NGS) reads that are present in metagenomic samples to identify unique genomic regions. NGS reads, however, might not be long enough to differentiate similar genomes. This suggests a potential for using longer reads to improve classification performance. Presently, longer reads tend to have a higher rate of sequencing errors. Thus, given the pros and cons, it remains unclear which type of read is better for metagenomic classification. We compared two taxonomic classification protocols: a traditional assembly-free protocol and a novel assembly-based protocol. The novel assembly-based protocol consists of assembling short reads into longer reads, which are subsequently classified by a traditional taxonomic classifier. We discovered that most classifiers made fewer predictions with longer reads and that they achieved higher classification performance on synthetic metagenomic data. Generally, we observed a significant increase in precision with similar recall rates. On real data, we observed similar characteristics, which suggest that the classifiers might likewise achieve higher precision with similar recall on longer reads. We have shown a noticeable difference in performance between assembly-based and assembly-free taxonomic classification. This finding strongly suggests that classifying species in metagenomic environments can be achieved with higher overall performance simply by assembling short reads. Further, it also suggests that long-read technologies might be better for species classification.
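The pattern reported above (fewer predictions, higher precision, similar recall) follows directly from how the two metrics are defined over sets of predicted species. The sketch below uses made-up species labels purely for illustration; they are not data from the study.

```python
def precision_recall(predicted, truth):
    """Return (precision, recall) for a set of predicted species labels."""
    predicted, truth = set(predicted), set(truth)
    true_positives = len(predicted & truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Hypothetical example: the sample truly contains four species.
truth = {"sp_A", "sp_B", "sp_C", "sp_D"}

# Assembly-free run: many predictions, three of them spurious.
assembly_free = {"sp_A", "sp_B", "sp_C", "sp_X", "sp_Y", "sp_Z"}

# Assembly-based run: fewer predictions, same true hits.
assembly_based = {"sp_A", "sp_B", "sp_C"}

print(precision_recall(assembly_free, truth))
print(precision_recall(assembly_based, truth))
```

Dropping the three spurious calls raises precision from 0.5 to 1.0 while recall stays at 0.75, because recall depends only on how many true species were found.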

Topics: Computational Biology; DNA Barcoding, Taxonomic; Metagenome; Metagenomics; Reproducibility of Results; Workflow

PubMed: 32824429
DOI: 10.3390/genes11080946

Major New Microbial Groups Expand Diversity and Alter our Understanding of the Tree of Life.

Cell Mar 2018

Review

Authors: Cindy J Castelle, Jillian F Banfield

The recent recovery of genomes for organisms from phyla with no isolated representative (candidate phyla) via cultivation-independent genomics enabled delineation of major new microbial lineages, namely the bacterial candidate phyla radiation (CPR), DPANN archaea, and Asgard archaea. CPR and DPANN organisms are inferred to be mostly symbionts, and some are episymbionts of other microbial community members. Asgard genomes encode typically eukaryotic systems, and their inclusion in phylogenetic analyses results in placement of eukaryotes as a branch within Archaea. Here, we illustrate how new genomes have changed the structure of the tree of life and altered our understanding of biology, evolution, and metabolic roles in biogeochemical processes.

Topics: Archaea; Bacteria; Genetic Variation; Metagenome; Metagenomics; Phylogeny; RNA, Ribosomal, 16S; Species Specificity

PubMed: 29522741
DOI: 10.1016/j.cell.2018.02.016

Benchmarking microbial growth rate predictions from metagenomes.

The ISME Journal Jan 2021

Authors: Andrew M Long, Shengwei Hou, J Cesar Ignacio-Espinoza...

Growth rates are central to understanding microbial interactions and community dynamics. Metagenomic growth estimators have been developed, specifically codon usage bias (CUB) for maximum growth rates and "peak-to-trough ratio" (PTR) for in situ rates. Both were originally tested with pure cultures, but natural populations are more heterogeneous, especially in individual cell histories pertinent to PTR. To test these methods, we compared predictors with observed growth rates of freshly collected marine prokaryotes in unamended seawater. We prefiltered and diluted samples to remove grazers and greatly reduce virus infection, so net growth approximated gross growth. We sampled over 44 h for abundances and metagenomes, generating 101 metagenome-assembled genomes (MAGs), including Actinobacteria, Verrucomicrobia, SAR406, MGII archaea, etc. We tracked each MAG population by cell-abundance-normalized read recruitment, finding growth rates of 0 to 5.99 per day, the first reported rates for several groups, and used these rates as benchmarks. PTR, calculated by three methods, rarely correlated to growth (r ≈ -0.26 to 0.08), except for rapidly growing γ-Proteobacteria (r ≈ 0.63 to 0.92), while CUB correlated moderately well to observed maximum growth rates (r = 0.57). This suggests that current PTR approaches poorly predict actual growth of most marine bacterial populations, but maximum growth rates can be approximated from genomic characteristics.
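A per-day growth rate of the kind used as a benchmark above can be derived from population abundance at two time points. The sketch below assumes simple exponential growth between the two samples; the abstract does not give the authors' exact calculation, and the function name and example abundances are hypothetical.

```python
import math


def specific_growth_rate(n_start, n_end, hours):
    """Net specific growth rate per day, assuming exponential growth.

    n_start, n_end: population abundance (e.g. cells per mL) at the two
    time points; hours: elapsed time between them.
    """
    if n_start <= 0 or n_end <= 0 or hours <= 0:
        raise ValueError("abundances and elapsed time must be positive")
    return math.log(n_end / n_start) / (hours / 24.0)


# Hypothetical population that doubles in 24 h.
rate = specific_growth_rate(1.0e5, 2.0e5, 24.0)
print(round(rate, 3))
```

Under this assumption one doubling per day corresponds to a rate of ln 2, about 0.693 per day, so the reported upper value of 5.99 per day would imply several doublings per day.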

Topics: Archaea; Bacteria; Benchmarking; Metagenome; Metagenomics

PubMed: 32939027
DOI: 10.1038/s41396-020-00773-1

Benchmarking second and third-generation sequencing platforms for microbial metagenomics.

Scientific Data Nov 2022

Authors: Victoria Meslier, Benoit Quinquis, Kévin Da Silva...

Shotgun metagenomic sequencing is a common approach for studying the taxonomic diversity and metabolic potential of complex microbial communities. Current methods primarily use second generation short read sequencing, yet advances in third generation long read technologies provide opportunities to overcome some of the limitations of short read sequencing. Here, we compared seven platforms, encompassing second generation sequencers (Illumina HiSeq 3000, MGI DNBSEQ-G400 and DNBSEQ-T7, ThermoFisher Ion GeneStudio S5 and Ion Proton P1) and third generation sequencers (Oxford Nanopore Technologies MinION R9 and Pacific Biosciences Sequel II). We constructed three uneven synthetic microbial communities composed of genomic DNA from up to 87 microbial strains per mock, spanning 29 bacterial and archaeal phyla, and representing the most complex and diverse synthetic communities used for sequencing technology comparisons. Our results demonstrate that third generation sequencers have advantages over second generation platforms in analyzing complex microbial communities, but require careful sequencing library preparation for optimal quantitative metagenomic analysis. Our sequencing data also provide a valuable resource for testing and benchmarking bioinformatics software for metagenomics.

Topics: Benchmarking; High-Throughput Nucleotide Sequencing; Metagenome; Metagenomics; Microbiota; Sequence Analysis, DNA

PubMed: 36369227
DOI: 10.1038/s41597-022-01762-z

Fast and sensitive taxonomic assignment to metagenomic contigs.

Bioinformatics (Oxford, England) Sep 2021

Authors: M Mirdita, M Steinegger, F Breitwieser...

SUMMARY

MMseqs2 taxonomy is a new tool to assign taxonomic labels to metagenomic contigs. It extracts all possible protein fragments from each contig, quickly retains those that can contribute to taxonomic annotation, assigns them with robust labels and determines the contig's taxonomic identity by weighted voting. Its fragment extraction step is suitable for the analysis of all domains of life. MMseqs2 taxonomy is 2-18× faster than state-of-the-art tools and also contains new modules for creating and manipulating taxonomic reference databases as well as reporting and visualizing taxonomic assignments.

AVAILABILITY AND IMPLEMENTATION

MMseqs2 taxonomy is part of the MMseqs2 free open-source software package available for Linux, macOS and Windows at https://mmseqs.com.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
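The weighted voting step mentioned in the summary can be illustrated with a simplified sketch. This is not the MMseqs2 implementation or its API: the lineage representation, the weights, the `min_support` threshold, and the function name are assumptions chosen to show the general idea of letting each protein fragment vote for its taxon and all of that taxon's ancestors.

```python
from collections import defaultdict


def assign_by_weighted_vote(fragment_hits, min_support=0.5):
    """Pick the most specific taxon supported by enough vote weight.

    fragment_hits: list of (lineage, weight) pairs, where lineage is a
    tuple of taxon names from the root to the fragment's assigned taxon
    and weight is a positive score (hypothetical units).
    Returns a taxon name, or None if nothing reaches min_support.
    """
    total = sum(weight for _, weight in fragment_hits)
    if total <= 0:
        return None

    # Each fragment supports its own taxon and every ancestor of it.
    # Lineage prefixes are used as keys so equal names in different
    # branches are not merged.
    support = defaultdict(float)
    for lineage, weight in fragment_hits:
        for depth in range(1, len(lineage) + 1):
            support[lineage[:depth]] += weight

    eligible = [node for node, w in support.items() if w / total >= min_support]
    if not eligible:
        return None

    # Prefer the deepest eligible node; break ties by support.
    best = max(eligible, key=lambda node: (len(node), support[node]))
    return best[-1]


hits = [
    (("Bacteria", "Proteobacteria", "Escherichia"), 30.0),
    (("Bacteria", "Proteobacteria", "Salmonella"), 25.0),
    (("Bacteria", "Firmicutes", "Bacillus"), 10.0),
]
print(assign_by_weighted_vote(hits))
```

In this example no single genus holds half of the total weight, so the contig is labelled at the phylum level, where the two Proteobacteria fragments together carry about 85% of the votes.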

Topics: Software; Metagenome; Metagenomics; Databases, Factual

PubMed: 33734313
DOI: 10.1093/bioinformatics/btab184