metagenome - OpenMD.com Journal Search

Introduction to the principles and methods underlying the recovery of metagenome-assembled genomes from metagenomic data.

MicrobiologyOpen Jun 2022

Review

Authors: Gleb Goussarov, Mohamed Mysara, Peter Vandamme...

The rise of metagenomics offers a leap forward for understanding the genetic diversity of microorganisms in many different complex environments by providing a platform that can identify potentially unlimited numbers of known and novel microorganisms. As such, it is impossible to imagine new major initiatives without metagenomics. Nevertheless, it represents a relatively new discipline with various levels of complexity and demands on bioinformatics. The underlying principles and methods used in metagenomics are often treated as common knowledge and are frequently either not detailed or presented in fragmented form. Therefore, we reviewed these to guide microbiologists in taking the first steps into metagenomics. We specifically focus on a workflow aimed at reconstructing individual genomes, that is, metagenome-assembled genomes, integrating DNA sequencing, assembly, binning, identification and annotation.

Topics: Computational Biology; Metagenome; Metagenomics; Sequence Analysis, DNA

PubMed: 35765182
DOI: 10.1002/mbo3.1298

Extraction of CRISPR-targeted sequences from the metagenome.

STAR Protocols Sep 2022

Authors: Ryota Sugimoto, Luca Nishimura, Phuong Thanh Nguyen...

Homology-based search is commonly used to uncover mobile genetic elements (MGEs) from metagenomes, but it heavily relies on reference genomes in the database. Here we introduce a protocol to extract CRISPR-targeted sequences from the assembled human gut metagenomic sequences without using a reference database. We describe the assembling of metagenome contigs, the extraction of CRISPR direct repeats and spacers, the discovery of protospacers, and the extraction of protospacer-enriched regions using the graph-based approach. This protocol could extract numerous characterized/uncharacterized MGEs. For complete details on the use and execution of this protocol, please refer to Sugimoto et al. (2021).

Topics: Base Sequence; Clustered Regularly Interspaced Short Palindromic Repeats; Humans; Metagenome; Metagenomics

PubMed: 35780428
DOI: 10.1016/j.xpro.2022.101525

Extensive Unexplored Human Microbiome Diversity Revealed by Over 150,000 Genomes from Metagenomes Spanning Age, Geography, and Lifestyle.

Cell Jan 2019

Authors: Edoardo Pasolli, Francesco Asnicar, Serena Manara...

The body-wide human microbiome plays a role in health, but its full diversity remains uncharacterized, particularly outside of the gut and in international populations. We leveraged 9,428 metagenomes to reconstruct 154,723 microbial genomes (45% of high quality) spanning body sites, ages, countries, and lifestyles. We recapitulated 4,930 species-level genome bins (SGBs), 77% without genomes in public repositories (unknown SGBs [uSGBs]). uSGBs are prevalent (in 93% of well-assembled samples), expand underrepresented phyla, and are enriched in non-Westernized populations (40% of the total SGBs). We annotated 2.85 M genes in SGBs, many associated with conditions including infant development (94,000) or Westernization (106,000). SGBs and uSGBs permit deeper microbiome analyses and increase the average mappability of metagenomic reads from 67.76% to 87.51% in the gut (median 94.26%) and 65.14% to 82.34% in the mouth. We thus identify thousands of microbial genomes from yet-to-be-named species, expand the pangenomes of human-associated microbes, and allow better exploitation of metagenomic technologies.

Topics: Big Data; Genetic Variation; Geography; Humans; Life Style; Metagenome; Metagenomics; Microbiota; Phylogeny; Sequence Analysis, DNA

PubMed: 30661755
DOI: 10.1016/j.cell.2019.01.001

StrainXpress: strain aware metagenome assembly from short reads.

Nucleic Acids Research Sep 2022

Authors: Xiongbin Kang, Xiao Luo, Alexander Schönhuth...

Next-generation sequencing-based metagenomics has made it possible to identify microorganisms in characteristic habitats without the need for lengthy cultivation. Importantly, clinically relevant phenomena such as resistance to medication, virulence or interactions with the environment can vary even within species. Therefore, a major current challenge is to reconstruct individual genomes from the sequencing reads at the level of strains, and not just the level of species. However, strains of one species may differ by only a small number of variants, which makes them difficult to distinguish. Despite considerable recent progress, related approaches have remained fragmentary so far. Here, we present StrainXpress as a comprehensive solution to the problem of strain aware metagenome assembly from next-generation sequencing reads. In experiments, StrainXpress reconstructs strain-specific genomes from metagenomes that involve up to >1000 strains and proves to successfully deal with poorly covered strains. The amount of reconstructed strain-specific sequence exceeds that of the current state-of-the-art approaches by on average 26.75% across all data sets (first quartile: 18.51%, median: 26.60%, third quartile: 35.05%).

Topics: High-Throughput Nucleotide Sequencing; Metagenome; Metagenomics; Sequence Analysis, DNA

PubMed: 35776122
DOI: 10.1093/nar/gkac543

How to Obtain and Compare Metagenome-Assembled Genomes.

Methods in Molecular Biology (Clifton,... 2024

Authors: Fabio Beltrame Sanchez, Suzana Eiko Sato Guima, João Carlos Setubal...

Metagenome-assembled genomes, or MAGs, are genomes retrieved from metagenome datasets. In the vast majority of cases, MAGs are genomes from prokaryotic species that have not been isolated or cultivated in the lab. They therefore provide us with information on these species that is impossible to obtain otherwise, at least until new cultivation methods are devised. Thanks to improvements and cost reductions of DNA sequencing technologies and growing interest in microbial ecology, the rise in the number of MAGs in genome repositories has been exponential. This chapter covers the basics of MAG retrieval and processing and provides a practical step-by-step guide using a real dataset and state-of-the-art tools for MAG analysis and comparison.

Topics: Metagenome; Metagenomics; Software; Computational Biology; Databases, Genetic; Sequence Analysis, DNA; Genome, Bacterial

PubMed: 38819559
DOI: 10.1007/978-1-0716-3838-5_6

Recovering complete and draft population genomes from metagenome datasets.

Microbiome Mar 2016

Review

Authors: Naseer Sangwan, Fangfang Xia, Jack A Gilbert...

Assembly of metagenomic sequence data into microbial genomes is of fundamental value to improving our understanding of microbial ecology and metabolism by elucidating the functional potential of hard-to-culture microorganisms. Here, we provide a synthesis of available methods to bin metagenomic contigs into species-level groups and highlight how genetic diversity, sequencing depth, and coverage influence binning success. Despite the computational cost on application to deeply sequenced complex metagenomes (e.g., soil), covarying patterns of contig coverage across multiple datasets significantly improve the binning process. We also discuss and compare current genome validation methods and reveal how these methods tackle the problem of chimeric genome bins, i.e., sequences from multiple species. Finally, we explore how population genome assembly can be used to uncover biogeographic trends and to characterize the effect of in situ functional constraints on the genome-wide evolution.

Topics: Contig Mapping; Datasets as Topic; Genome, Microbial; Metagenome; Metagenomics; Sequence Analysis, DNA

PubMed: 26951112
DOI: 10.1186/s40168-016-0154-5

Freshwater Viral Metagenome Analyses Targeting dsDNA Viruses.

Methods in Molecular Biology (Clifton,... 2024

Authors: Kira Moon, Jang-Cheon Cho

Viral metagenomics is one of the most widely used approaches to study viral population genomics. With the recent development of bioinformatic tools, the number of molecular biological methods, programs, and software to analyze viral metagenome data has greatly increased. Here, we describe the basic analysis workflow along with bioinformatic tools that can be used to analyze viral metagenome data. Although this chapter assumes that the viral metagenome data are prepared from freshwater samples and are subjected to dsDNA sequencing, the protocol can be applied and modified for other types of metagenome data collected from a variety of sources.

Topics: Metagenome; Genome, Viral; Metagenomics; Fresh Water; Viruses

PubMed: 38060116
DOI: 10.1007/978-1-0716-3515-5_3

Unraveling the functional dark matter through global metagenomics.

Nature Oct 2023

Authors: Georgios A Pavlopoulos, Fotis A Baltoumas, Sirui Liu...

Metagenomes encode an enormous diversity of proteins, reflecting a multiplicity of functions and activities. Exploration of this vast sequence space has been limited to a comparative analysis against reference microbial genomes and protein families derived from those genomes. Here, to examine the scale of yet untapped functional diversity beyond what is currently possible through the lens of reference genomes, we develop a computational approach to generate reference-free protein families from the sequence space in metagenomes. We analyse 26,931 metagenomes and identify 1.17 billion protein sequences longer than 35 amino acids with no similarity to any sequences from 102,491 reference genomes or the Pfam database. Using massively parallel graph-based clustering, we group these proteins into 106,198 novel sequence clusters with more than 100 members, doubling the number of protein families obtained from the reference genomes clustered using the same approach. We annotate these families on the basis of their taxonomic, habitat, geographical and gene neighbourhood distributions and, where sufficient sequence diversity is available, predict protein three-dimensional models, revealing novel structures. Overall, our results uncover an enormously diverse functional space, highlighting the importance of further exploring the microbial functional dark matter.

Topics: Cluster Analysis; Metagenome; Metagenomics; Proteins; Databases, Protein; Protein Conformation; Microbiology

PubMed: 37821698
DOI: 10.1038/s41586-023-06583-7

CheckM: assessing the quality of microbial genomes recovered from isolates, single cells, and metagenomes.

Genome Research Jul 2015

Authors: Donovan H Parks, Michael Imelfort, Connor T Skennerton...

Large-scale recovery of genomes from isolates, single cells, and metagenomic data has been made possible by advances in computational methods and substantial reductions in sequencing costs. Although this increasing breadth of draft genomes is providing key information regarding the evolutionary and functional diversity of microbial life, it has become impractical to finish all available reference genomes. Making robust biological inferences from draft genomes requires accurate estimates of their completeness and contamination. Current methods for assessing genome quality are ad hoc and generally make use of a limited number of "marker" genes conserved across all bacterial or archaeal genomes. Here we introduce CheckM, an automated method for assessing the quality of a genome using a broader set of marker genes specific to the position of a genome within a reference genome tree and information about the collocation of these genes. We demonstrate the effectiveness of CheckM using synthetic data and a wide range of isolate-, single-cell-, and metagenome-derived genomes. CheckM is shown to provide accurate estimates of genome completeness and contamination and to outperform existing approaches. Using CheckM, we identify a diverse range of errors currently impacting publicly available isolate genomes and demonstrate that genomes obtained from single cells and metagenomic data vary substantially in quality. In order to facilitate the use of draft genomes, we propose an objective measure of genome quality that can be used to select genomes suitable for specific gene- and genome-centric analyses of microbial communities.

Topics: Genome, Microbial; Metagenome; Metagenomics

PubMed: 25977477
DOI: 10.1101/gr.186072.114

Community-scale models of microbiomes: Articulating metabolic modelling and metagenome sequencing.

Microbial Biotechnology Jan 2024

Review

Authors: Klara Cerk, Pablo Ugalde-Salas, Chabname Ghassemi Nedjad...

Building models is essential for understanding the functions and dynamics of microbial communities. Metabolic models built on genome-scale metabolic network reconstructions (GENREs) are especially relevant as a means to decipher the complex interactions occurring among species. Model reconstruction increasingly relies on metagenomics, which permits direct characterisation of naturally occurring communities that may contain organisms that cannot be isolated or cultured. In this review, we provide an overview of the field of metabolic modelling and its increasing reliance on and synergy with metagenomics and bioinformatics. We survey the means of assigning functions and reconstructing metabolic networks from (meta-)genomes, and present the variety and mathematical fundamentals of metabolic models that foster the understanding of microbial dynamics. We emphasise the characterisation of interactions and the scaling of model construction to large communities, two important bottlenecks in the applicability of these models. We give an overview of the current state of the art in metagenome sequencing and bioinformatics analysis, focusing on the reconstruction of genomes in microbial communities. Metagenomics benefits tremendously from third-generation sequencing, and we discuss the opportunities of long-read sequencing, strain-level characterisation and eukaryotic metagenomics. We aim at providing algorithmic and mathematical support, together with tool and application resources, that permit bridging the gap between metagenomics and metabolic modelling.

Topics: Metagenome; Microbiota; Metagenomics; Sequence Analysis, DNA; Computational Biology

PubMed: 38243750
DOI: 10.1111/1751-7915.14396