metagenome - OpenMD.com Journal Search

Large-scale quality assessment of prokaryotic genomes with metashot/prok-quality.

F1000Research 2021

Authors: Davide Albanese, Claudio Donati

Metagenomic sequencing allows large-scale identification and genomic characterization. Binning is the process of recovering genomes from complex mixtures of sequence fragments (metagenome contigs) of unknown bacterial and archaeal species. Assessing the quality of genomes recovered from metagenomes requires the use of complex pipelines involving many independent steps, which are often difficult to reproduce and maintain. A comprehensive, automated and easy-to-use computational workflow for the quality assessment of draft prokaryotic genomes, based on container technology, would greatly improve reproducibility and reusability of published results. We present metashot/prok-quality, a container-enabled Nextflow pipeline for quality assessment and genome dereplication. The metashot/prok-quality tool produces genome quality reports that are compliant with the Minimum Information about a Metagenome-Assembled Genome (MIMAG) standard, and can run out-of-the-box on any platform that supports Nextflow, Docker or Singularity, including computing clusters or batch infrastructures in the cloud. metashot/prok-quality is part of the metashot collection of analysis pipelines. Workflow and documentation are available under the GPL3 licence on GitHub.
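As an illustration of what a MIMAG-compliant quality report encodes, the sketch below assigns a draft-quality tier from completeness and contamination estimates. It is not code from metashot/prok-quality. The function name `mimag_quality`, its parameters and the "unclassified" label are hypothetical, and the thresholds are those commonly quoted from the MIMAG standard, so they should be checked against the standard before use.

```python
def mimag_quality(completeness, contamination,
                  has_5s_16s_23s_rrna=False, trna_count=0):
    """Assign a MIMAG-style draft quality tier (illustrative sketch).

    completeness and contamination are percentages (0-100).
    Thresholds are assumptions based on the commonly quoted MIMAG tiers.
    """
    # MIMAG tiers all require contamination below 10%; anything else gets
    # a label invented for this sketch.
    if contamination >= 10:
        return "unclassified"
    # High-quality drafts also need the rRNA genes and at least 18 tRNAs.
    if (completeness > 90 and contamination < 5
            and has_5s_16s_23s_rrna and trna_count >= 18):
        return "high-quality draft"
    if completeness >= 50:
        return "medium-quality draft"
    return "low-quality draft"


if __name__ == "__main__":
    print(mimag_quality(95.0, 2.0, has_5s_16s_23s_rrna=True, trna_count=20))
    print(mimag_quality(95.0, 2.0))  # missing rRNA/tRNA evidence
    print(mimag_quality(30.0, 3.0))
```

Note that a genome with high completeness but no detected rRNA or tRNA genes falls into the medium tier under these assumptions, which is why such a report needs gene-level annotations as well as completeness estimates.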

Topics: Archaea; Metagenome; Metagenomics; Prokaryotic Cells; Reproducibility of Results

PubMed: 35136576
DOI: 10.12688/f1000research.54418.1

Freshwater Viral Metagenome Analyses Targeting dsDNA Viruses.

Methods in Molecular Biology (Clifton,... 2024

Authors: Kira Moon, Jang-Cheon Cho

Viral metagenomics is one of the most widely used approaches to study viral population genomics. With the recent development of bioinformatic tools, the number of molecular biological methods, programs, and software packages available to analyze viral metagenome data has greatly increased. Here, we describe the basic analysis workflow along with bioinformatic tools that can be used to analyze viral metagenome data. Although this chapter assumes that the viral metagenome data are prepared from freshwater samples and subjected to dsDNA sequencing, the protocol can be modified and applied to other types of metagenome data collected from a variety of sources.

Topics: Metagenome; Genome, Viral; Metagenomics; Fresh Water; Viruses

PubMed: 38060116
DOI: 10.1007/978-1-0716-3515-5_3

Multiplexed Target Enrichment Enables Efficient and In-Depth Analysis of Antimicrobial Resistome in Metagenomes.

Microbiology Spectrum Dec 2022

Authors: Yiming Li, Xiaomin Shi, Yang Zuo...

Antibiotic resistance genes (ARGs) pose a serious threat to public health and ecological security in the 21st century. However, the resistome only accounts for a tiny fraction of metagenomic content, which makes it difficult to investigate low-abundance ARGs in various environmental settings. Thus, a highly sensitive, accurate, and comprehensive method is needed to describe ARG profiles in complex metagenomic samples. In this study, we established a high-throughput sequencing method based on targeted amplification, which could simultaneously detect ARGs (n = 251), mobile genetic element genes (n = 8), and metal resistance genes (n = 19) in metagenomes. The performance of amplicon sequencing was compared with traditional metagenomic shotgun sequencing (MetaSeq). A total of 1421 primer pairs were designed, achieving extremely high coverage of target genes. The amplicon sequencing significantly improved the recovery of target ARGs (~9 × 10-fold), with higher sensitivity and diversity, lower cost, and a lower computational burden. Furthermore, targeted enrichment allows deep scanning of single nucleotide polymorphisms (SNPs), and elevated SNP detection was shown in this study. We further applied this approach to 48 environmental samples (37 feces, 20 soils, and 7 sewage) and 16 clinical samples. All samples tested in this study showed high diversity and recovery of targeted genes. Our results demonstrated that the approach could be applied to various metagenomic samples and serve as an efficient tool in the surveillance and evolution assessment of ARGs. Access to the resistome using the enrichment method validated in this study enabled the capture of low-abundance resistomes while being less costly and time-consuming, which can greatly advance our understanding of local and global resistome dynamics.
ARGs, an increasing global threat to human health, can be transferred into health-related microorganisms in the environment by horizontal gene transfer, posing a serious threat to public health. Advancing profiling methods are needed for monitoring and predicting the potential risks of ARGs in metagenomes. Our study described a customized amplicon sequencing assay that could enable a high-throughput, targeted, in-depth analysis of ARGs and detect a low-abundance portion of resistomes. This method could serve as an efficient tool to assess the variation and evolution of specific ARGs in the clinical and natural environment.

Topics: Humans; Metagenome; Genes, Bacterial; Anti-Bacterial Agents; Drug Resistance, Microbial; Sewage; Metagenomics

PubMed: 36287061
DOI: 10.1128/spectrum.02297-22

A survey on computational strategies for genome-resolved gut metagenomics.

Briefings in Bioinformatics May 2023

Authors: Longhao Jia, Yingjian Wu, Yanqi Dong...

Recovering high-quality metagenome-assembled genomes (HQ-MAGs) is critical for exploring microbial compositions and microbe-phenotype associations. However, the multiple sequencing platforms and computational tools available for this purpose may confuse researchers and thus call for extensive evaluation. Here, we systematically evaluated a total of 40 combinations of popular computational tools and sequencing platforms (i.e. strategies), involving eight assemblers, eight metagenomic binners and four sequencing technologies, including short-read, long-read and metaHiC sequencing. We identified the best tools for the individual tasks (e.g. the assembly and binning) and combinations (e.g. generating more HQ-MAGs) depending on the availability of the sequencing data. We found that the combination of the hybrid assemblies and metaHiC-based binning performed best, followed by the hybrid and long-read assemblies. More importantly, both long-read and metaHiC sequencing link more mobile elements and antibiotic resistance genes to bacterial hosts and improve the quality of public human gut reference genomes with 32% (34/105) HQ-MAGs that were either of better quality than those in the Unified Human Gastrointestinal Genome catalog version 2 or novel.

Topics: Humans; Metagenomics; Sequence Analysis, DNA; Metagenome; Bacteria; Gastrointestinal Tract

PubMed: 37114640
DOI: 10.1093/bib/bbad162

Microbial characterization based on multifractal analysis of metagenomes.

Frontiers in Cellular and Infection... 2023

Authors: Xian-Hua Xie, Yu-Jie Huang, Guo-Sheng Han...

INTRODUCTION

The species diversity of microbiomes is a cutting-edge concept in metagenomic research. In this study, we propose a multifractal analysis for metagenomic research.

METHOD AND RESULTS

First, we visualized the chaos game representation (CGR) of simulated and real metagenomes and found that the resulting plots exhibit self-similarity. We then defined and calculated the multifractal dimension of the CGR plots of the simulated and real metagenomes. By analyzing the Pearson correlation coefficients between the multifractal dimension and traditional species diversity indices, we found that the correlations with the species richness index and the Shannon diversity index reached their maximum at q = 0 and q = 1, respectively, and that the correlation with the Simpson diversity index reached its maximum at q = 5. Finally, we applied our method to real metagenomes of the gut microbiota of 100 infants sampled as newborns and at 4 and 12 months of age. The results show that the multifractal dimensions of an infant's gut microbiome can distinguish age differences.

CONCLUSION AND DISCUSSION

The CGRs of WGS metagenomes exhibit self-similarity, and the multifractal spectrum is an important characteristic of metagenomes. The traditional diversity indicators can be unified under the framework of multifractal analysis. These results are consistent with similar findings in macrobial ecology. The multifractal spectrum of an infant's gut microbiome is related to the development of the infant.
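The two ingredients named in this abstract, a chaos game representation and a multifractal (generalized) dimension, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the corner assignment for A, C, G and T is one common convention, the function names are hypothetical, and the dimension is a crude single-scale box-counting estimate rather than a fit across several scales.

```python
import math
from collections import Counter

# Assumed corner convention for the unit square (not taken from the paper).
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(seq):
    """Return the chaos game representation of a DNA sequence as (x, y) points."""
    x, y = 0.5, 0.5
    points = []
    for base in seq.upper():
        if base not in CORNERS:
            continue  # skip ambiguous bases such as N
        cx, cy = CORNERS[base]
        # Move halfway from the current point towards the base's corner.
        x, y = (x + cx) / 2, (y + cy) / 2
        points.append((x, y))
    return points


def box_probabilities(points, k):
    """Fraction of points falling in each occupied cell of a 2^k x 2^k grid."""
    n = 2 ** k
    counts = Counter(
        (min(int(x * n), n - 1), min(int(y * n), n - 1)) for x, y in points
    )
    total = sum(counts.values())
    return [c / total for c in counts.values()]


def generalized_dimension(points, k, q):
    """Single-scale estimate of the generalized dimension D_q at box size 2^-k."""
    probs = box_probabilities(points, k)
    log_eps = math.log(2.0 ** -k)
    if q == 1:
        # D_1 is the limit q -> 1, which reduces to an entropy ratio.
        return sum(p * math.log(p) for p in probs) / log_eps
    return math.log(sum(p ** q for p in probs)) / ((q - 1) * log_eps)


if __name__ == "__main__":
    pts = cgr_points("ACGTTGCAACGTAGCT")
    for q in (0, 1, 2, 5):
        print(q, round(generalized_dimension(pts, k=2, q=q), 3))
```

At q = 0 the sum counts occupied boxes, and at q = 1 the estimate is an entropy ratio of the same form as the Shannon index, which gives some intuition for why classical diversity indices can be related to D_q.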

Topics: Humans; Infant; Infant, Newborn; Metagenome; Microbiota; Gastrointestinal Microbiome; Metagenomics; Ecology

PubMed: 36779183
DOI: 10.3389/fcimb.2023.1117421

Functional relevance of microbiome signatures: The correlation era requires tools for consolidation.

The Journal of Allergy and Clinical... Apr 2017

Review

Authors: Ludovica F Buttó, Dirk Haller

Compelling research over the past decade identified a fundamental role of the intestinal microbiome in human health. Compositional and functional changes of this microbial ecosystem are correlated with a variety of human pathologies. Metagenomic resolution and bioinformatic tools have considerably improved, allowing even strain-level analysis. However, the search for microbial risk patterns in human cohorts is often confounded by environmental factors (eg, medication) and host status (eg, disease relapse), calling into question the prognostic and therapeutic value of the currently available information. In addition to a better stratification of human phenotypes, the implementation of standardized protocols for sampling and analysis is needed to improve the reproducibility and comparability of microbiome signatures at a meaningful taxonomic resolution. At the level of mechanistic understanding, the molecular integration of pleiotropic signals coming from this complex and dynamically changing ecosystem is one of the biggest challenges in this field. The first successful attempts to apply reverse genetics based on the available metagenomic information yielded identification of small molecules and metabolites with functional relevance for microbe-host interactions. Further expansion on the isolation of bacteria from the "unculturable biomass" will help characterize microbiome signatures in model systems, finally aiming at the development of clinically relevant synthetic consortia with safe and functionally well-defined strains. In conclusion and beyond reasonable enthusiasm, the mechanistic implementation and clinical relevance of microbiome alterations on disease susceptibility is still in its infancy, but the integration of all the above-mentioned strategies will help overcome the correlation era in microbiome research and lead to a rational evaluation of clinical strategies relevant for targeted microbial intervention.

Topics: Animals; Gastrointestinal Microbiome; Humans; Metagenome; Metagenomics; Models, Biological

PubMed: 28390576
DOI: 10.1016/j.jaci.2017.02.010

MEGAHIT v1.0: A fast and scalable metagenome assembler driven by advanced methodologies and community practices.

Methods (San Diego, Calif.) Jun 2016

Review

Authors: Dinghua Li, Ruibang Luo, Chi-Man Liu...

The study of metagenomics has benefited greatly from low-cost and high-throughput sequencing technologies, yet the tremendous amount of data generated makes analyses such as de novo assembly consume too many computational resources. In late 2014 we released MEGAHIT v0.1 (together with a brief note of Li et al. (2015) [1]), which is the first NGS metagenome assembler that can assemble genome sequences from metagenomic datasets of hundreds of Giga base-pairs (bp) in a time- and memory-efficient manner on a single server. The core of MEGAHIT is an efficient parallel algorithm for constructing succinct de Bruijn Graphs (SdBG), implemented on a graphics processing unit (GPU). The software has been well received by the assembly community, and there is interest in how to adapt the algorithms to integrate popular assembly practices so as to improve the assembly quality, as well as how to speed up the software using better CPU-based algorithms (instead of GPU). In this paper we first describe the details of the core algorithms in MEGAHIT v0.1, and then we show the new modules to upgrade MEGAHIT to version v1.0, which gives better assembly quality, runs faster and uses less memory. For the Iowa Prairie Soil dataset (252 Gbp after quality trimming), the assembly quality of MEGAHIT v1.0, when compared with v0.1, shows a significant improvement, namely a 36% increase in assembly size and a 23% increase in N50. More interestingly, MEGAHIT v1.0 is no slower than before (even running with the extra modules). This is primarily due to a new CPU-based algorithm for SdBG construction that is faster and requires less memory. Using CPU only, MEGAHIT v1.0 can assemble the Iowa Prairie Soil sample in about 43 h, reducing the running time of v0.1 by at least 25% and memory usage by up to 50%. MEGAHIT v1.0, exhibiting a smaller memory footprint, can process even larger datasets. The Kansas Prairie Soil sample (484 Gbp), the largest publicly available dataset, can now be assembled using no more than 500 GB of memory in 7.5 days. The assemblies of these datasets (and other large metagenomic datasets), as well as the software, are available at the website https://hku-bal.github.io/megabox.
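The N50 statistic quoted in this abstract is the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembly length. A minimal sketch of that standard definition is given below; the function name `n50` is hypothetical and the code is not taken from MEGAHIT.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly given its contig lengths.

    Contigs are taken from longest to shortest until their cumulative
    length reaches at least half of the total; the length of the last
    contig taken is the N50.
    """
    if not contig_lengths:
        raise ValueError("contig_lengths must not be empty")
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        # Compare doubled running sum to avoid fractional halves.
        if running * 2 >= total:
            return length


if __name__ == "__main__":
    print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))
```

Because N50 depends on the total assembly size as well as on contig lengths, it is usually read together with the assembly size, as the abstract does.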

Topics: Algorithms; Datasets as Topic; Metagenome; Metagenomics; Sequence Analysis; Software; Soil

PubMed: 27012178
DOI: 10.1016/j.ymeth.2016.02.020

CheckV assesses the quality and completeness of metagenome-assembled viral genomes.

Nature Biotechnology May 2021

Authors: Stephen Nayfach, Antonio Pedro Camargo, Frederik Schulz...

Millions of new viral sequences have been identified from metagenomes, but the quality and completeness of these sequences vary considerably. Here we present CheckV, an automated pipeline for identifying closed viral genomes, estimating the completeness of genome fragments and removing flanking host regions from integrated proviruses. CheckV estimates completeness by comparing sequences with a large database of complete viral genomes, including 76,262 identified from a systematic search of publicly available metagenomes, metatranscriptomes and metaviromes. After validation on mock datasets and comparison to existing methods, we applied CheckV to large and diverse collections of metagenome-assembled viral sequences, including IMG/VR and the Global Ocean Virome. This revealed 44,652 high-quality viral genomes (that is, >90% complete), although the vast majority of sequences were small fragments, which highlights the challenge of assembling viral genomes from short-read metagenomes. Additionally, we found that removal of host contamination substantially improved the accurate identification of auxiliary metabolic genes and interpretation of viral-encoded functions.

Topics: Genome, Viral; Metagenome; Metagenomics; Molecular Sequence Annotation; Software

PubMed: 33349699
DOI: 10.1038/s41587-020-00774-7

Web Resources for Metagenomics Studies.

Genomics, Proteomics & Bioinformatics Oct 2015

Review

Authors: Pravin Dudhagara, Sunil Bhavsar, Chintan Bhagat...

The development of next-generation sequencing (NGS) platforms spawned an enormous volume of data. This explosion in data has unearthed new scalability challenges for existing bioinformatics tools. The analysis of metagenomic sequences using bioinformatics pipelines is complicated by the substantial complexity of these data. In this article, we review several commonly used online tools for metagenomics data analysis with respect to their quality and detail of analysis using simulated metagenomics data. There are at least a dozen such software tools presently available in the public domain. Among them, MG-RAST, IMG/M, and METAVIR are the most well-known tools according to the number of citations in peer-reviewed publications up to mid-2015. Here, we describe 12 online tools with respect to their web link, annotation pipelines, clustering methods, online user support, and availability of data storage. We also rated each tool to identify the most promising ones and evaluated the five best tools using a synthetic metagenome. The article comprehensively deals with the contemporary problems and the prospects of metagenomics from a bioinformatics viewpoint.

Topics: Cluster Analysis; Computational Biology; High-Throughput Nucleotide Sequencing; Humans; Information Storage and Retrieval; Internet; Metagenome; Metagenomics; Software

PubMed: 26602607
DOI: 10.1016/j.gpb.2015.10.003

MetaQUAST: evaluation of metagenome assemblies.

Bioinformatics (Oxford, England) Apr 2016

Authors: Alla Mikheenko, Vladislav Saveliev, Alexey Gurevich...

In recent years we have witnessed the rapid development of new metagenome assembly methods. Although there are many benchmark utilities designed for single-genome assemblies, there is no well-recognized evaluation and comparison tool for their metagenomic counterparts. In this article, we present MetaQUAST, a modification of QUAST, the state-of-the-art tool for genome assembly evaluation based on alignment of contigs to a reference. MetaQUAST addresses such features of metagenome datasets as (i) unknown species content, by detecting and downloading reference sequences, (ii) huge diversity, by giving comprehensive reports for multiple genomes, and (iii) the presence of closely related species, by detecting chimeric contigs. We demonstrate MetaQUAST performance by comparing several leading assemblers on one simulated and two real datasets.

AVAILABILITY AND IMPLEMENTATION

http://bioinf.spbau.ru/metaquast

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

Topics: Algorithms; Genomic Structural Variation; Metagenome; Metagenomics; Software

PubMed: 26614127
DOI: 10.1093/bioinformatics/btv697