metagenome - OpenMD.com Journal Search

Microbial characterization based on multifractal analysis of metagenomes.

Frontiers in Cellular and Infection... 2023

Authors: Xian-Hua Xie, Yu-Jie Huang, Guo-Sheng Han...

INTRODUCTION

The species diversity of microbiomes is a cutting-edge concept in metagenomic research. In this study, we propose a multifractal analysis for metagenomic research.

METHOD AND RESULTS

First, we visualized the chaos game representation (CGR) of simulated and real metagenomes and found that the resulting plots exhibit self-similarity. We then defined and calculated the multifractal dimension of the CGR plots of the simulated and real metagenomes. Analyzing the Pearson correlation coefficients between the multifractal dimension and traditional species diversity indices, we found that the correlations with the species richness index and the Shannon diversity index reached their maximum at q = 0, 1, and that the correlation with the Simpson diversity index reached its maximum at q = 5. Finally, we applied our method to real gut microbiota metagenomes from 100 infants at birth and at 4 and 12 months of age. The results show that the multifractal dimensions of an infant's gut microbiome can distinguish age differences.
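As a rough illustration of the two ingredients named above, the following Python sketch builds CGR points from a DNA string and computes a single-scale box-counting estimate of the generalized dimension D_q. It is a minimal sketch under stated assumptions, not the authors' implementation: the corner assignment for A, C, G, T, the function names, and the single-scale estimate (rather than a fit across several box sizes) are all choices made here, since the abstract does not specify them.

```python
import math
from collections import Counter

# Assumed corner layout; several conventions exist and the abstract names none.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(seq):
    """Return CGR points: each point lies halfway between the previous
    point and the corner assigned to the current base."""
    x, y = 0.5, 0.5  # start at the centre of the unit square
    points = []
    for base in seq.upper():
        if base not in CORNERS:
            continue  # skip ambiguous bases such as N
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2.0, (y + cy) / 2.0
        points.append((x, y))
    return points


def box_masses(points, k):
    """Fraction of points falling in each occupied cell of a 2^k x 2^k grid."""
    n = 2 ** k
    counts = Counter(
        (min(int(x * n), n - 1), min(int(y * n), n - 1)) for x, y in points
    )
    total = len(points)
    return [c / total for c in counts.values()]


def generalized_dimension(points, q, k):
    """Single-scale estimate of D_q at box size 2^-k (hypothetical helper)."""
    if k < 1:
        raise ValueError("k must be at least 1")
    p = box_masses(points, k)
    log_eps = -k * math.log(2)  # logarithm of the box size
    if q == 1:
        # Information dimension: the q -> 1 limit of the general formula.
        return sum(pi * math.log(pi) for pi in p) / log_eps
    return math.log(sum(pi ** q for pi in p)) / ((q - 1) * log_eps)
```

In the standard definition, q = 0 only counts occupied boxes and q = 1 is an entropy ratio, which parallels the roles of richness and Shannon entropy among diversity indices. For a point set spread evenly over all four cells of a 2 x 2 grid, every D_q computed this way equals 2.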

CONCLUSION AND DISCUSSION

There is self-similarity among the CGRs of WGS of metagenomes, and the multifractal spectrum is an important characteristic of metagenomes. The traditional diversity indicators can be unified under the framework of multifractal analysis. These results coincide with similar results in macrobial ecology. The multifractal spectrum of infants' gut microbiomes is related to the development of the infants.

Topics: Humans; Infant; Infant, Newborn; Metagenome; Microbiota; Gastrointestinal Microbiome; Metagenomics; Ecology

PubMed: 36779183
DOI: 10.3389/fcimb.2023.1117421

Accurate and sensitive detection of microbial eukaryotes from whole metagenome shotgun sequencing.

Microbiome Mar 2021

Authors: Abigail L Lind, Katherine S Pollard

BACKGROUND

Microbial eukaryotes are found alongside bacteria and archaea in natural microbial systems, including host-associated microbiomes. While microbial eukaryotes are critical to these communities, they are challenging to study with shotgun sequencing techniques and are therefore often excluded.

RESULTS

Here, we present EukDetect, a bioinformatics method to identify eukaryotes in shotgun metagenomic sequencing data. Our approach uses a database of 521,824 universal marker genes from 241 conserved gene families, which we curated from 3713 fungal, protist, non-vertebrate metazoan, and non-streptophyte archaeplastida genomes and transcriptomes. EukDetect has a broad taxonomic coverage of microbial eukaryotes, performs well on low-abundance and closely related species, and is resilient against bacterial contamination in eukaryotic genomes. Using EukDetect, we describe the spatial distribution of eukaryotes along the human gastrointestinal tract, showing that fungi and protists are present in the lumen and mucosa throughout the large intestine. We discover that there is a succession of eukaryotes that colonize the human gut during the first years of life, mirroring patterns of developmental succession observed in gut bacteria. By comparing DNA and RNA sequencing of paired samples from human stool, we find that many eukaryotes continue active transcription after passage through the gut, though some do not, suggesting they are dormant or nonviable. We analyze metagenomic data from the Baltic Sea and find that eukaryotes differ across locations and salinity gradients. Finally, we observe eukaryotes in Arabidopsis leaf samples, many of which are not identifiable from public protein databases.

CONCLUSIONS

EukDetect provides an automated and reliable way to characterize eukaryotes in shotgun sequencing datasets from diverse microbiomes. We demonstrate that it enables discoveries that would be missed or clouded by false positives with standard shotgun sequence analysis. EukDetect will greatly advance our understanding of how microbial eukaryotes contribute to microbiomes.

Topics: Animals; Eukaryota; Humans; Metagenome; Metagenomics; Sequence Analysis, DNA

PubMed: 33658077
DOI: 10.1186/s40168-021-01015-y

Functional assignment of metagenomic data: challenges and applications.

Briefings in Bioinformatics Nov 2012

Review

Authors: Tulika Prakash, Todd D Taylor

Metagenomic sequencing provides a unique opportunity to explore earth's limitless environments harboring scores of yet unknown and mostly unculturable microbes and other organisms. Functional analysis of the metagenomic data plays a central role in projects aiming to explore the most essential questions in microbiology, namely 'In a given environment, among the microbes present, what are they doing, and how are they doing it?' Toward this goal, several large-scale metagenomic projects have recently been conducted or are currently underway. Functional analysis of metagenomic data mainly suffers from the vast amount of data generated in these projects. The sheer amount of data requires much computational time and storage space. These problems are compounded by other factors potentially affecting the functional analysis, including sample preparation, sequencing method and average genome size of the metagenomic samples. In addition, the read lengths generated during sequencing influence sequence assembly, gene prediction and subsequently the functional analysis. The level of confidence for functional predictions increases with increasing read length. Usually, the most reliable functional annotations for metagenomic sequences are achieved using homology-based approaches against publicly available reference sequence databases. Here, we present an overview of the current state of functional analysis of metagenomic sequence data, bottlenecks frequently encountered and possible solutions in light of currently available resources and tools. Finally, we provide some examples of applications from recent metagenomic studies which have been successfully conducted in spite of the known difficulties.

Topics: Algorithms; Metagenome; Metagenomics; Sequence Analysis, DNA

PubMed: 22772835
DOI: 10.1093/bib/bbs033

Web Resources for Metagenomics Studies.

Genomics, Proteomics & Bioinformatics Oct 2015

Review

Authors: Pravin Dudhagara, Sunil Bhavsar, Chintan Bhagat...

The development of next-generation sequencing (NGS) platforms spawned an enormous volume of data. This explosion in data has unearthed new scalability challenges for existing bioinformatics tools. The analysis of metagenomic sequences using bioinformatics pipelines is complicated by the substantial complexity of these data. In this article, we review several commonly used online tools for metagenomics data analysis with respect to their quality and detail of analysis using simulated metagenomics data. There are at least a dozen such software tools presently available in the public domain. Among them, MG-RAST, IMG/M, and METAVIR are the most well-known tools according to the number of citations by peer-reviewed scientific media up to mid-2015. Here, we describe 12 online tools with respect to their web link, annotation pipelines, clustering methods, online user support, and availability of data storage. We also rated each tool to identify the most promising and preferable ones, and evaluated the five best tools using a synthetic metagenome. The article comprehensively deals with the contemporary problems and the prospects of metagenomics from a bioinformatics viewpoint.

Topics: Cluster Analysis; Computational Biology; High-Throughput Nucleotide Sequencing; Humans; Information Storage and Retrieval; Internet; Metagenome; Metagenomics; Software

PubMed: 26602607
DOI: 10.1016/j.gpb.2015.10.003

Computational tools for viral metagenomics and their application in clinical research.

Virology Dec 2012

Review

Authors: L Fancello, D Raoult, C Desnues...

There are 100 times more virions than eukaryotic cells in a healthy human body. The characterization of human-associated viral communities in a non-pathological state and the detection of viral pathogens in cases of infection are essential for medical care and epidemic surveillance. Viral metagenomics, the sequence-based analysis of the complete collection of viral genomes directly isolated from an organism or an ecosystem, bypasses the "single-organism-level" point of view of clinical diagnostics and thus the need to isolate and culture the targeted organism. The first part of this review is dedicated to a presentation of past research in viral metagenomics with an emphasis on human-associated viral communities (eukaryotic viruses and bacteriophages). In the second part, we review more precisely the computational challenges posed by the analysis of viral metagenomes, and we illustrate the problem of sequences that do not have homologs in public databases and the possible approaches to characterize them.

Topics: Biomedical Research; Biota; Computational Biology; Environmental Microbiology; Humans; Metagenome; Metagenomics; Viruses

PubMed: 23062738
DOI: 10.1016/j.virol.2012.09.025

MGnify Genomes: A Resource for Biome-specific Microbial Genome Catalogues.

Journal of Molecular Biology Jul 2023

Authors: Tatiana A Gurbich, Alexandre Almeida, Martin Beracochea...

An increasingly common output arising from the analysis of shotgun metagenomic datasets is the generation of metagenome-assembled genomes (MAGs), with tens of thousands of MAGs now described in the literature. However, the discovery and comparison of these MAG collections is hampered by the lack of uniformity in their generation, annotation and storage. To address this, we have developed MGnify Genomes, a growing collection of biome-specific non-redundant microbial genome catalogues generated using MAGs and publicly available isolate genomes. Genomes within a biome-specific catalogue are organised into species clusters. For species that contain multiple conspecific genomes, the highest quality genome is selected as the representative, always prioritising an isolate genome over a MAG. The species representative sequences and annotations can be visualised on the MGnify website and the full catalogue and associated analysis outputs can be downloaded from MGnify servers. A suite of online search tools is provided allowing users to compare their own sequences, ranging from a gene to sets of genomes, against the catalogues. Seven biomes are available currently, comprising over 300,000 genomes that represent 11,048 non-redundant species, and include 36 taxonomic classes not currently represented by cultured genomes. MGnify Genomes is available at https://www.ebi.ac.uk/metagenomics/browse/genomes/.
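The representative-selection rule described above (highest-quality genome per species cluster, with an isolate always preferred over a MAG) can be sketched in Python as follows. This is only an illustration under assumptions: the `Genome` fields, the accession strings, and the `quality_score` formula are hypothetical, because the abstract does not state how genome quality is measured.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Genome:
    accession: str        # hypothetical identifier
    is_isolate: bool      # True for an isolate genome, False for a MAG
    completeness: float   # percent, assumed quality input
    contamination: float  # percent, assumed quality input


def quality_score(genome):
    """Hypothetical quality score; the abstract does not define the criterion."""
    return genome.completeness - 5.0 * genome.contamination


def pick_representative(cluster):
    """Pick the species representative: any isolate outranks any MAG,
    and ties within each group are broken by the quality score."""
    return max(cluster, key=lambda g: (g.is_isolate, quality_score(g)))
```

Sorting on the tuple `(is_isolate, score)` encodes the priority directly: the boolean is compared first, so a lower-scoring isolate still wins over a higher-scoring MAG.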

Topics: Genome, Microbial; Metagenome; Metagenomics

PubMed: 36806692
DOI: 10.1016/j.jmb.2023.168016

SCAPP: an algorithm for improved plasmid assembly in metagenomes.

Microbiome Jun 2021

Authors: David Pellow, Alvah Zorea, Maraike Probst...

BACKGROUND

Metagenomic sequencing has led to the identification and assembly of many new bacterial genome sequences. These bacteria often contain plasmids: usually small, circular double-stranded DNA molecules that may transfer across bacterial species and confer antibiotic resistance. These plasmids are generally less studied and understood than their bacterial hosts. Part of the reason for this is insufficient computational tools enabling the analysis of plasmids in metagenomic samples.

RESULTS

We developed SCAPP (Sequence Contents-Aware Plasmid Peeler), an algorithm and tool to assemble plasmid sequences from metagenomic sequencing. SCAPP builds on some key ideas from the Recycler algorithm while improving plasmid assemblies by integrating biological knowledge about plasmids. We compared the performance of SCAPP to Recycler and metaplasmidSPAdes on simulated metagenomes, real human gut microbiome samples, and a human gut plasmidome dataset that we generated. We also created plasmidome and metagenome data from the same cow rumen sample and used the parallel sequencing data to create a novel assessment procedure. Overall, SCAPP outperformed Recycler and metaplasmidSPAdes across this wide range of datasets.

CONCLUSIONS

SCAPP is an easy-to-use Python package that enables the assembly of full plasmid sequences from metagenomic samples. It outperformed existing metagenomic plasmid assemblers in most cases and assembled novel and clinically relevant plasmids in samples we generated, such as a human gut plasmidome. SCAPP is open-source software available from https://github.com/Shamir-Lab/SCAPP.

Topics: Algorithms; Humans; Metagenome; Metagenomics; Plasmids; Sequence Analysis, DNA; Software

PubMed: 34172093
DOI: 10.1186/s40168-021-01068-z

Genomics: Resident risks.

Nature Oct 2012

Authors: Julia Oh, Julia A Segre

An innovative method for probing the genomes of the vast community of microorganisms that inhabit the human gut provides an alternative approach to identifying risk factors for type 2 diabetes.

Topics: Diabetes Mellitus, Type 2; Genome-Wide Association Study; Humans; Intestines; Metagenome; Metagenomics

PubMed: 23038462
DOI: 10.1038/490044a

Assembling Reads Improves Taxonomic Classification of Species.

Genes Aug 2020

Authors: Quang Tran, Vinhthuy Phan

Most current approaches to metagenomic classification employ short next-generation sequencing (NGS) reads that are present in metagenomic samples to identify unique genomic regions. NGS reads, however, might not be long enough to differentiate similar genomes. This suggests a potential for using longer reads to improve classification performance. Presently, longer reads tend to have a higher rate of sequencing errors. Thus, given the pros and cons, it remains unclear which type of read is better for metagenomic classification. We compared two taxonomic classification protocols: a traditional assembly-free protocol and a novel assembly-based protocol. The novel assembly-based protocol consists of assembling short reads into longer reads, which are subsequently classified by a traditional taxonomic classifier. We discovered that most classifiers made fewer predictions with longer reads and that they achieved higher classification performance on synthetic metagenomic data. Generally, we observed a significant increase in precision with similar recall rates. On real data, we observed similar characteristics, which suggest that the classifiers might likewise achieve higher precision with similar recall when given longer reads. We have shown a noticeable difference in performance between assembly-based and assembly-free taxonomic classification. This finding strongly suggests that classifying species in metagenomic environments can be achieved with higher overall performance simply by assembling short reads. Further, it also suggests that long-read technologies might be better for species classification.
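To make the precision and recall comparison concrete, the Python sketch below scores a set of predicted species against a known truth set. It is a minimal sketch under assumptions: the set-based definitions and the function name are chosen here for illustration, and the abstract does not say exactly how the authors computed these metrics.

```python
def precision_recall(predicted, truth):
    """Set-based precision and recall for species-level predictions.

    precision = correctly predicted species / all predicted species
    recall    = correctly predicted species / all species truly present
    """
    predicted, truth = set(predicted), set(truth)
    true_positives = len(predicted & truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall
```

Under these definitions, a classifier that makes fewer predictions while still recovering the same true species gains precision without losing recall. With a toy truth set of four species, predicting three of them plus three false positives gives precision 0.5 and recall 0.75; dropping two of the false positives raises precision to 0.75 while recall stays at 0.75.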

Topics: Computational Biology; DNA Barcoding, Taxonomic; Metagenome; Metagenomics; Reproducibility of Results; Workflow

PubMed: 32824429
DOI: 10.3390/genes11080946

Shotgun metagenomes from productive lakes in an urban region of Sweden.

Scientific Data Nov 2023

Authors: Alejandro Rodríguez-Gijón, Justyna J Hampel, Jennah Dharamshi...

Urban lakes provide multiple benefits to society while influencing life quality. Moreover, lakes and their microbiomes are sentinels of anthropogenic impact and can be used for natural resource management and planning. Here, we release original metagenomic data from several well-characterized and anthropogenically impacted eutrophic lakes in the vicinity of Stockholm (Sweden). Our goal was to collect representative microbial community samples and use shotgun sequencing to provide a broad view on microbial diversity of productive urban lakes. Our dataset has an emphasis on Lake Mälaren as a major drinking water reservoir under anthropogenic impact. This dataset includes short-read sequence data and metagenome assemblies from each of 17 samples collected from eutrophic lakes near the greater Stockholm area. We used genome-resolved metagenomics and obtained 2378 metagenome assembled genomes that de-replicated into 514 species representative genomes. This dataset adds new datapoints to previously sequenced lakes and it includes the first sequenced set of metagenomes from Lake Mälaren. Our dataset serves as a baseline for future monitoring of drinking water reservoirs and urban lakes.

Topics: Bacteria; Drinking Water; Lakes; Metagenome; Metagenomics; Sweden

PubMed: 37978200
DOI: 10.1038/s41597-023-02722-x