metagenome - OpenMD.com Journal Search

Extending and improving metagenomic taxonomic profiling with uncharacterized species using MetaPhlAn 4.

Nature Biotechnology Nov 2023

Authors: Aitor Blanco-Míguez, Francesco Beghini, Fabio Cumbo...

Metagenomic assembly enables new organism discovery from microbial communities, but it can only capture few abundant organisms from most metagenomes. Here we present MetaPhlAn 4, which integrates information from metagenome assemblies and microbial isolate genomes for more comprehensive metagenomic taxonomic profiling. From a curated collection of 1.01 M prokaryotic reference and metagenome-assembled genomes, we define unique marker genes for 26,970 species-level genome bins, 4,992 of them taxonomically unidentified at the species level. MetaPhlAn 4 explains ~20% more reads in most international human gut microbiomes and >40% in less-characterized environments such as the rumen microbiome and proves more accurate than available alternatives on synthetic evaluations while also reliably quantifying organisms with no cultured isolates. Application of the method to >24,500 metagenomes highlights previously undetected species to be strong biomarkers for host conditions and lifestyles in human and mouse microbiomes and shows that even previously uncharacterized species can be genetically profiled at the resolution of single microbial strains.

Topics: Humans; Animals; Mice; Metagenome; Microbiota; Gastrointestinal Microbiome; Metagenomics; Phylogeny

PubMed: 36823356
DOI: 10.1038/s41587-023-01688-w

Standardizing translational microbiome studies and metagenomic analyses.

Cardiovascular Research Feb 2021

Authors: Jessica Gambardella, Vanessa Castellanos, Gaetano Santulli...

Topics: Metagenome; Metagenomics; Microbiota

PubMed: 32569375
DOI: 10.1093/cvr/cvaa175

Metagenome Proteins and Database Contamination.

mSphere Nov 2020

Authors: Irina R Arkhipova

Continued influx of metagenome-derived proteins with misannotated taxonomy into conventional databases, including RefSeq, threatens to eliminate the value of taxonomy identifiers. To prevent this, urgent efforts should be undertaken by submitters of metagenomic data sets as well as by database managers.

Topics: Algorithms; Databases, Genetic; Metagenome; Metagenomics; Proteins

PubMed: 33148820
DOI: 10.1128/mSphere.00854-20

Terabase-Scale Coassembly of a Tropical Soil Microbiome.

Microbiology Spectrum Aug 2023

Authors: Robert Riley, Robert M Bowers, Antonio Pedro Camargo...

Petabases of environmental metagenomic data are publicly available, presenting an opportunity to characterize complex environments and discover novel lineages of life. Metagenome coassembly, in which many metagenomic samples from an environment are simultaneously analyzed to infer the underlying genomes' sequences, is an essential tool for achieving this goal. We applied MetaHipMer2, a distributed metagenome assembler that runs on supercomputing clusters, to coassemble 3.4 terabases (Tbp) of metagenome data from a tropical soil in the Luquillo Experimental Forest (LEF), Puerto Rico. The resulting coassembly yielded 39 high-quality (>90% complete, <5% contaminated, with predicted 23S, 16S, and 5S rRNA genes and ≥18 tRNAs) metagenome-assembled genomes (MAGs), including two from the candidate phylum . Another 268 medium-quality (≥50% complete, <10% contaminated) MAGs were extracted, including the candidate phyla , , and . In total, 307 medium- or higher-quality MAGs were assigned to 23 phyla, compared to 294 MAGs assigned to nine phyla in the same samples individually assembled. The low-quality (<50% complete, <10% contaminated) MAGs from the coassembly revealed a 49% complete rare biosphere microbe from the candidate phylum FCPU426 among other low-abundance microbes, an 81% complete fungal genome from the phylum Ascomycota, and 30 partial eukaryotic MAGs with ≥10% completeness, possibly representing protist lineages. A total of 22,254 viruses, many of them low abundance, were identified. Estimation of metagenome coverage and diversity indicates that we may have characterized ≥87.5% of the sequence diversity in this humid tropical soil and indicates the value of future terabase-scale sequencing and coassembly of complex environments.

IMPORTANCE

Petabases of reads are being produced by environmental metagenome sequencing. An essential step in analyzing these data is metagenome assembly, the computational reconstruction of genome sequences from microbial communities. "Coassembly" of metagenomic sequence data, in which multiple samples are assembled together, enables more complete detection of microbial genomes in an environment than "multiassembly," in which samples are assembled individually. To demonstrate the potential for coassembling terabases of metagenome data to drive biological discovery, we applied MetaHipMer2, a distributed metagenome assembler that runs on supercomputing clusters, to coassemble 3.4 Tbp of reads from a humid tropical soil environment. The resulting coassembly, its functional annotation, and analysis are presented here. The coassembly yielded more, and phylogenetically more diverse, microbial, eukaryotic, and viral genomes than the multiassembly of the same data. Our resource may facilitate the discovery of novel microbial biology in tropical soils and demonstrates the value of terabase-scale metagenome sequencing.
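
The MAG quality tiers quoted in this abstract can be written as a small decision rule. The following is a minimal sketch that uses only the thresholds stated above. The `Mag` record, its field names, and the `classify_mag` function are hypothetical and do not come from the paper or from MetaHipMer2, and the "unclassified" label for MAGs with 10% or more contamination is an assumption, since the abstract does not name that case.

```python
from dataclasses import dataclass


@dataclass
class Mag:
    """Hypothetical summary record for one metagenome-assembled genome."""
    completeness: float    # estimated completeness, in percent
    contamination: float   # estimated contamination, in percent
    has_rrna_set: bool     # True if 23S, 16S and 5S rRNA genes were all predicted
    trna_count: int        # number of predicted tRNAs


def classify_mag(mag: Mag) -> str:
    """Assign a quality tier using the thresholds quoted in the abstract."""
    # Every tier in the abstract requires <10% contamination; anything else
    # falls outside the stated tiers (label chosen here as an assumption).
    if mag.contamination >= 10:
        return "unclassified"
    # High quality: >90% complete, <5% contaminated, full rRNA set, >=18 tRNAs.
    if (mag.completeness > 90 and mag.contamination < 5
            and mag.has_rrna_set and mag.trna_count >= 18):
        return "high"
    # Medium quality: >=50% complete, <10% contaminated.
    if mag.completeness >= 50:
        return "medium"
    # Low quality: <50% complete, <10% contaminated.
    return "low"
```

Under this rule, a MAG that is more than 90% complete but lacks the full rRNA gene set falls into the medium tier rather than the high tier.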

Topics: Soil; Microbiota; Bacteria; Metagenome; Genome, Viral; Metagenomics

PubMed: 37310219
DOI: 10.1128/spectrum.00200-23

Towards the biogeography of prokaryotic genes.

Nature Jan 2022

Authors: Luis Pedro Coelho, Renato Alves, Álvaro Rodríguez Del Río...

Microbial genes encode the majority of the functional repertoire of life on earth. However, despite increasing efforts in metagenomic sequencing of various habitats, little is known about the distribution of genes across the global biosphere, with implications for human and planetary health. Here we constructed a non-redundant gene catalogue of 303 million species-level genes (clustered at 95% nucleotide identity) from 13,174 publicly available metagenomes across 14 major habitats and use it to show that most genes are specific to a single habitat. The small fraction of genes found in multiple habitats is enriched in antibiotic-resistance genes and markers for mobile genetic elements. By further clustering these species-level genes into 32 million protein families, we observed that a small fraction of these families contain the majority of the genes (0.6% of families account for 50% of the genes). The majority of species-level genes and protein families are rare. Furthermore, species-level genes, and in particular the rare ones, show low rates of positive (adaptive) selection, supporting a model in which most genetic variability observed within each protein family is neutral or nearly neutral.
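
The concentration statistic in this abstract (0.6% of families accounting for 50% of the genes) is the kind of figure that can be derived from a list of family sizes. The following is a minimal sketch with a hypothetical function name and toy inputs. It is not the authors' pipeline and does not reproduce their data.

```python
def fraction_of_families_covering(family_sizes, target=0.5):
    """Return the smallest fraction of families (taken largest first)
    whose members together make up at least `target` of all genes."""
    sizes = sorted(family_sizes, reverse=True)
    total = sum(sizes)
    if total == 0:
        return 0.0
    running = 0
    for n_families, size in enumerate(sizes, start=1):
        running += size
        if running >= target * total:
            return n_families / len(sizes)
    return 1.0
```

For a toy input such as `[50, 10, 10, 10, 10, 10]`, the single largest family already holds half of the genes, so the function returns one sixth.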

Topics: Anti-Bacterial Agents; Drug Resistance, Microbial; Ecosystem; Humans; Metagenome; Metagenomics

PubMed: 34912116
DOI: 10.1038/s41586-021-04233-4

Metagenomics Approaches for Improving Food Safety: A Review.

Journal of Food Protection Mar 2022

Review

Authors: Craig Billington, Joanne M Kingsbury, Lucia Rivas...

ABSTRACT

Advancements in next-generation sequencing technology have dramatically reduced the cost and increased the ease of microbial whole genome sequencing. This approach is revolutionizing the identification and analysis of foodborne microbial pathogens, facilitating expedited detection and mitigation of foodborne outbreaks, improving public health outcomes, and limiting costly recalls. However, next-generation sequencing is still anchored in the traditional laboratory practice of the selection and culture of a single isolate. Metagenomic-based approaches, including metabarcoding and shotgun and long-read metagenomics, are part of the next disruptive revolution in food safety diagnostics and offer the potential to directly identify entire microbial communities in a single food, ingredient, or environmental sample. In this review, metagenomic-based approaches are introduced and placed within the context of conventional detection and diagnostic techniques, and essential considerations for undertaking metagenomic assays and data analysis are described. Recent applications of the use of metagenomics for food safety are discussed alongside current limitations and knowledge gaps and new opportunities arising from the use of this technology.

Topics: Food Safety; High-Throughput Nucleotide Sequencing; Metagenome; Metagenomics; Whole Genome Sequencing

PubMed: 34706052
DOI: 10.4315/JFP-21-301

Phigaro: high-throughput prophage sequence annotation.

Bioinformatics (Oxford, England) Jun 2020

Authors: Elizaveta V Starikova, Polina O Tikhonova, Nikita A Prianichnikov...

SUMMARY

Phigaro is a standalone command-line application that is able to detect prophage regions taking raw genome and metagenome assemblies as an input. It also produces dynamic annotated 'prophage genome maps' and marks possible transposon insertion spots inside prophages. It is applicable for mining prophage regions from large metagenomic datasets.

AVAILABILITY AND IMPLEMENTATION

Source code for Phigaro is freely available for download at https://github.com/bobeobibo/phigaro along with test data. The code is written in Python.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

Topics: High-Throughput Nucleotide Sequencing; Metagenome; Metagenomics; Prophages; Software

PubMed: 32311023
DOI: 10.1093/bioinformatics/btaa250

Fishing for phages in metagenomes: what do we catch, what do we miss?

Current Opinion in Virology Aug 2021

Review

Authors: Sean Benler, Eugene V Koonin

Metagenomics and metatranscriptomics have become the principal approaches for discovery of novel bacteriophages and preliminary characterization of their ecology and biology. Metagenomic sequencing dramatically expanded the known diversity of tailed and non-tailed phages with double-stranded DNA genomes and those with single-stranded DNA genomes, whereas metatranscriptomics led to the discovery of thousands of new single-stranded RNA phages. Apart from expanding phage diversity, metagenomics studies discover major novel groups of phages with unique features of genome organization, expression strategy and virus-host interaction, such as the putative order 'crAssvirales', which includes the most abundant human-associated viruses. The continued success of metagenomics hinges on the combination of the most powerful computational methods for phage genome assembly and analysis including harnessing CRISPR spacers for the discovery of novel phages and host assignment. Together, these approaches could make a comprehensive characterization of the earth phageome a realistic goal.

Topics: Animals; Bacteria; Bacteriophages; CRISPR-Cas Systems; Genome, Viral; Host Specificity; Humans; Metagenome; Metagenomics; Transcriptome

PubMed: 34139668
DOI: 10.1016/j.coviro.2021.05.008

Adversarial and variational autoencoders improve metagenomic binning.

Communications Biology Oct 2023

Authors: Pau Piera Líndez, Joachim Johansen, Svetlana Kutuzova...

Assembly of reads from metagenomic samples is a hard problem, often resulting in highly fragmented genome assemblies. Metagenomic binning allows us to reconstruct genomes by re-grouping the sequences by their organism of origin, thus representing a crucial processing step when exploring the biological diversity of metagenomic samples. Here we present Adversarial Autoencoders for Metagenomics Binning (AAMB), an ensemble deep learning approach that integrates sequence co-abundances and tetranucleotide frequencies into a common denoised space that enables precise clustering of sequences into microbial genomes. When benchmarked, AAMB presented similar or better results compared with the state-of-the-art reference-free binner VAMB, reconstructing ~7% more near-complete (NC) genomes across simulated and real data. In addition, genomes reconstructed using AAMB had higher completeness and greater taxonomic diversity compared with VAMB. Finally, we implemented a pipeline integrating VAMB and AAMB that enabled improved binning, recovering 20% and 29% more simulated and real NC genomes, respectively, compared to VAMB, with moderate additional runtime.
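
Tetranucleotide frequencies, one of the two input signals named in this abstract, can be illustrated with a few lines of code. The following is a minimal sketch of a plain normalized 4-mer count over a single contig. The function name is hypothetical, and this is not the AAMB or VAMB implementation, which may treat reverse complements, ambiguous bases, and normalization differently.

```python
from itertools import product

# All 256 tetranucleotides over A, C, G, T in a fixed lexicographic order.
TETRAMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def tetranucleotide_frequencies(seq):
    """Return a 256-element list of 4-mer frequencies for one sequence.

    Windows containing characters other than A, C, G or T (for example N)
    are skipped. The frequencies sum to 1 unless no valid window exists,
    in which case all entries are 0.0.
    """
    counts = dict.fromkeys(TETRAMERS, 0)
    seq = seq.upper()
    for i in range(len(seq) - 3):
        kmer = seq[i:i + 4]
        if kmer in counts:
            counts[kmer] += 1
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(TETRAMERS)
    return [counts[k] / total for k in TETRAMERS]
```

A binner would typically compute one such vector per contig and combine it with per-sample abundance values before clustering. That combination step is not shown here.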

Topics: Metagenome; Genome, Microbial; Metagenomics; Cluster Analysis; Benchmarking

PubMed: 37865678
DOI: 10.1038/s42003-023-05452-3

Illuminating the Virosphere Through Global Metagenomics.

Annual Review of Biomedical Data Science Jul 2021

Authors: Lee Call, Stephen Nayfach, Nikos C Kyrpides...

Viruses are the most abundant biological entity on Earth, infect cellular organisms from all domains of life, and are central players in the global biosphere. Over the last century, the discovery and characterization of viruses have progressed steadily alongside much of modern biology. In terms of outright numbers of novel viruses discovered, however, the last few years have been by far the most transformative for the field. Advances in methods for identifying viral sequences in genomic and metagenomic datasets, coupled to the exponential growth of environmental sequencing, have greatly expanded the catalog of known viruses and fueled the tremendous growth of viral sequence databases. Development and implementation of new standards, along with careful study of the newly discovered viruses, have transformed and will continue to transform our understanding of microbial evolution, ecology, and biogeochemical cycles, leading to new biotechnological innovations across many diverse fields, including environmental, agricultural, and biomedical sciences.

Topics: Ecology; Genome, Viral; Metagenome; Metagenomics; Viruses

PubMed: 34465172
DOI: 10.1146/annurev-biodatasci-012221-095114