-
Briefings in Bioinformatics May 2023
Recovering high-quality metagenome-assembled genomes (HQ-MAGs) is critical for exploring microbial compositions and microbe-phenotype associations. However, multiple sequencing platforms and computational tools for this purpose may confuse researchers and thus call for extensive evaluation. Here, we systematically evaluated a total of 40 combinations of popular computational tools and sequencing platforms (i.e. strategies), involving eight assemblers, eight metagenomic binners and four sequencing technologies, including short-, long-read and metaHiC sequencing. We identified the best tools for the individual tasks (e.g. the assembly and binning) and combinations (e.g. generating more HQ-MAGs) depending on the availability of the sequencing data. We found that the combination of the hybrid assemblies and metaHiC-based binning performed best, followed by the hybrid and long-read assemblies. More importantly, both long-read and metaHiC sequencings link more mobile elements and antibiotic resistance genes to bacterial hosts and improve the quality of public human gut reference genomes with 32% (34/105) HQ-MAGs that were either of better quality than those in the Unified Human Gastrointestinal Genome catalog version 2 or novel.
Topics: Humans; Metagenomics; Sequence Analysis, DNA; Metagenome; Bacteria; Gastrointestinal Tract
PubMed: 37114640
DOI: 10.1093/bib/bbad162 -
Frontiers in Cellular and Infection... 2023
INTRODUCTION
The species diversity of microbiomes is a cutting-edge concept in metagenomic research. In this study, we propose a multifractal analysis for metagenomic research.
METHOD AND RESULTS
First, we visualized the chaos game representation (CGR) of simulated and real metagenomes and found that the resulting plots are self-similar. We then defined and calculated the multifractal dimension of the CGR plot for each simulated and real metagenome. Analyzing the Pearson correlation coefficients between the multifractal dimension and traditional species diversity indices, we found that the correlations with the species richness index and the Shannon diversity index reached their maxima at q = 0 and q = 1, respectively, and that the correlation with the Simpson diversity index reached its maximum at q = 5. Finally, we applied our method to real gut metagenomes from 100 infants sampled as newborns and at 4 and 12 months of age. The results show that the multifractal dimensions of infant gut microbiomes can distinguish between ages.
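The CGR and multifractal-dimension steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the corner assignment for A/C/G/T, the grid resolution `k`, and the function names are assumptions chosen for illustration, and the dimension is a single-scale box-counting estimate of the generalized (Rényi) dimension D_q rather than a fitted slope over several scales.

```python
import math
from collections import Counter

# Corner assignment is one common convention; it is an assumption here.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(seq):
    """Return the chaos game representation points of a DNA sequence."""
    x, y = 0.5, 0.5  # start at the centre of the unit square
    points = []
    for base in seq.upper():
        if base not in CORNERS:
            continue  # skip ambiguous bases such as N
        cx, cy = CORNERS[base]
        # Move halfway towards the corner of the current base.
        x, y = (x + cx) / 2, (y + cy) / 2
        points.append((x, y))
    return points


def box_probabilities(points, k):
    """Fraction of points in each occupied cell of a 2^k x 2^k grid."""
    n = 2 ** k
    counts = Counter(
        (min(int(px * n), n - 1), min(int(py * n), n - 1)) for px, py in points
    )
    total = sum(counts.values())
    return [c / total for c in counts.values()]


def generalized_dimension(probs, q, k):
    """Single-scale estimate of D_q for boxes of side 2^-k."""
    log_inv_eps = k * math.log(2)
    if q == 1:
        # Information dimension: Shannon entropy of box occupancies.
        return -sum(p * math.log(p) for p in probs) / log_inv_eps
    return math.log(sum(p ** q for p in probs)) / ((1 - q) * log_inv_eps)


probs = box_probabilities(cgr_points("ACGT"), k=1)
d0 = generalized_dimension(probs, q=0, k=1)
d1 = generalized_dimension(probs, q=1, k=1)
```

At q = 0 the estimate depends only on the number of occupied boxes, and at q = 1 it is the Shannon entropy of the box occupancies divided by the log of the inverse box size, which mirrors how richness and Shannon diversity are computed from abundance tables.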
CONCLUSION AND DISCUSSION
The CGRs of whole-genome shotgun (WGS) metagenomes are self-similar, and the multifractal spectrum is an important characteristic of metagenomes. The traditional diversity indicators can be unified under the framework of multifractal analysis. These results agree with similar findings in macrobial ecology. The multifractal spectrum of infants' gut microbiomes is related to the development of the infants.
Topics: Humans; Infant; Infant, Newborn; Metagenome; Microbiota; Gastrointestinal Microbiome; Metagenomics; Ecology
PubMed: 36779183
DOI: 10.3389/fcimb.2023.1117421 -
Biotechnology and Applied Biochemistry Oct 2022 (Review)
Esterase enzymes are a family of hydrolases that catalyze the breakdown and formation of ester bonds. Esterases have gained a prominent position in today's industrial enzyme market. Owing to their unique biocatalytic attributes, esterases contribute to environmentally sustainable design approaches in areas including biomass degradation, the food and feed industry, dairy, clothing, agrochemicals (herbicides, insecticides), bioremediation, biosensor development, anticancer and antitumor applications, gene therapy, and diagnostics. Esterases can be isolated from a diverse range of mammalian tissues, animals, and microorganisms. The isolation of extremophilic esterases has increased researchers' interest in extracting and using these enzymes at the industrial level. Genomic, metagenomic, and immobilization techniques have opened innovative ways to extract esterases and to use them for longer periods, taking advantage of their beneficial activities. The current study discusses the types of esterases, metagenomic studies for exploring new esterases, and their biomedical applications in different industrial sectors.
Topics: Animals; Esterases; Metagenomics; Metagenome; Biotechnology; Biocatalysis; Mammals
PubMed: 34699092
DOI: 10.1002/bab.2277 -
Nature Biotechnology May 2021
Millions of new viral sequences have been identified from metagenomes, but the quality and completeness of these sequences vary considerably. Here we present CheckV, an automated pipeline for identifying closed viral genomes, estimating the completeness of genome fragments and removing flanking host regions from integrated proviruses. CheckV estimates completeness by comparing sequences with a large database of complete viral genomes, including 76,262 identified from a systematic search of publicly available metagenomes, metatranscriptomes and metaviromes. After validation on mock datasets and comparison to existing methods, we applied CheckV to large and diverse collections of metagenome-assembled viral sequences, including IMG/VR and the Global Ocean Virome. This revealed 44,652 high-quality viral genomes (that is, >90% complete), although the vast majority of sequences were small fragments, which highlights the challenge of assembling viral genomes from short-read metagenomes. Additionally, we found that removal of host contamination substantially improved the accurate identification of auxiliary metabolic genes and interpretation of viral-encoded functions.
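The >90% completeness cutoff for high-quality genomes mentioned above can be illustrated with a deliberately simplified sketch. This is not CheckV's algorithm or API: it assumes the expected genome length has already been obtained from a matched complete reference genome, and the function names are hypothetical.

```python
def estimate_completeness(contig_length, expected_genome_length):
    """Percent completeness of a contig relative to an expected genome length.

    Assumption: expected_genome_length comes from a matched complete
    reference genome; how that match is found is outside this sketch.
    """
    if expected_genome_length <= 0:
        raise ValueError("expected_genome_length must be positive")
    # Cap at 100 so contigs longer than the reference are not over-reported.
    return min(100.0, 100.0 * contig_length / expected_genome_length)


def is_high_quality(completeness):
    """Apply the >90% completeness cutoff quoted in the abstract."""
    return completeness > 90.0


fragment = estimate_completeness(12_000, 50_000)      # short fragment
near_complete = estimate_completeness(48_000, 50_000)  # nearly full genome
```

Under these assumptions the 12 kb fragment scores 24% and is not high quality, while the 48 kb contig scores 96% and passes the cutoff.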
Topics: Genome, Viral; Metagenome; Metagenomics; Molecular Sequence Annotation; Software
PubMed: 33349699
DOI: 10.1038/s41587-020-00774-7 -
Briefings in Bioinformatics Nov 2022
Pan-genome analyses of metagenome-assembled genomes (MAGs) may suffer from the known issues with MAGs: fragmentation, incompleteness and contamination. Here, we conducted a critical assessment of pan-genomics of MAGs by comparing pan-genome analysis results of complete bacterial genomes and simulated MAGs. We found that incompleteness led to significant core gene (CG) loss. The CG loss remained when using different pan-genome analysis tools (Roary, BPGA, Anvi'o) and when using a mixture of MAGs and complete genomes. Contamination had little effect on core genome size (except for Roary, due to its gene clustering issue) but had a major influence on accessory genomes. Importantly, the CG loss was partially alleviated by lowering the CG threshold and using gene prediction algorithms that consider fragmented genes, but to a lesser degree when incompleteness was higher than 5%. The CG loss also led to incorrect pan-genome functional predictions and inaccurate phylogenetic trees. Our main findings were supported by a study of real MAG-isolate genome data. We conclude that lowering the CG threshold and predicting genes in metagenome mode (as Anvi'o does with Prodigal) are necessary in pan-genome analysis of MAGs. Development of new pan-genome analysis tools specifically for MAGs is needed in future studies.
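The effect of lowering the CG threshold can be shown with a small sketch. It is not taken from Roary, BPGA or Anvi'o: the gene-family names, the toy genomes and the threshold values (1.0 and 0.75) are assumptions for illustration only.

```python
from collections import Counter


def core_genes(presence, threshold=1.0):
    """Return gene families present in at least `threshold` of the genomes.

    presence: dict mapping genome name -> set of gene family identifiers.
    threshold: fraction of genomes that must carry a family (assumed default
    of 1.0 means a strict core; real tools use their own defaults).
    """
    n_genomes = len(presence)
    if n_genomes == 0:
        return set()
    counts = Counter(gene for genes in presence.values() for gene in genes)
    return {gene for gene, c in counts.items() if c / n_genomes >= threshold}


# Toy data: "famB" is a true core gene that is missing from one incomplete MAG.
presence = {
    "isolate_1": {"famA", "famB", "famC"},
    "isolate_2": {"famA", "famB"},
    "isolate_3": {"famA", "famB", "famD"},
    "mag_1": {"famA"},  # incomplete MAG lost famB
}

strict_core = core_genes(presence, threshold=1.0)
relaxed_core = core_genes(presence, threshold=0.75)
```

With the strict threshold, a single incomplete MAG removes `famB` from the core; relaxing the threshold to 0.75 recovers it, which is the kind of partial alleviation the abstract describes.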
Topics: Metagenome; Phylogeny; Genome, Bacterial; Genomics; Sequence Analysis, DNA; Metagenomics
PubMed: 36124775
DOI: 10.1093/bib/bbac413 -
Letters in Applied Microbiology Feb 2023 (Review)
The term endosphere refers to the internal tissues of plants, which harbor diverse microbes capable of producing biologically active products for various biotechnological and agricultural applications. The discrete standalone genes of microbial endophytes and their interdependent association with plants can be an underlying factor in predicting their ecological functions. Yet-to-be-cultured endophytic microbes have spurred the use of metagenomics in various environmental studies to determine their structural diversity and functional genes with novel attributes. This review presents an overview of the general concept of metagenomics in microbial endophytic studies. First, the endosphere microbial communities are introduced, followed by metagenomic insights into endosphere biology, a promising technology. The major applications of metagenomics and a brief note on DNA stable isotope probing for determining the functions and metabolic pathways of the microbial metagenome are also highlighted. The use of metagenomics therefore promises to shed light on yet-to-be-cultured microbes by unraveling their diversity, functional attributes, and metabolic pathways, with prospects in integrated and sustainable agriculture.
Topics: Metagenome; Metagenomics; Microbiota; Endophytes; Plants
PubMed: 36794885
DOI: 10.1093/lambio/ovac030 -
metaMIC: reference-free misassembly identification and correction of de novo metagenomic assemblies.
Genome Biology Nov 2022
Evaluating the quality of metagenomic assemblies is important for constructing reliable metagenome-assembled genomes and downstream analyses. Here, we present metaMIC (https://github.com/ZhaoXM-Lab/metaMIC), a machine learning-based tool for identifying and correcting misassemblies in metagenomic assemblies. Benchmarking results on both simulated and real datasets demonstrate that metaMIC outperforms existing tools when identifying misassembled contigs. Furthermore, metaMIC is able to localize the misassembly breakpoints, and the correction of misassemblies by splitting at misassembly breakpoints can improve downstream scaffolding and binning results.
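Correction by splitting at breakpoints can be sketched as follows. This is not metaMIC's code or interface: it assumes breakpoints have already been predicted and are given as 0-based offsets into the contig, and the function name and `min_length` parameter are hypothetical.

```python
def split_at_breakpoints(contig, breakpoints, min_length=1):
    """Split a contig sequence at predicted misassembly breakpoints.

    contig: the contig sequence as a string.
    breakpoints: iterable of 0-based offsets (assumed to be given) where
        the contig should be cut; offsets at or beyond the ends are ignored.
    min_length: pieces shorter than this are dropped (hypothetical filter).
    """
    # Keep only unique interior cut positions, in order.
    cuts = sorted({b for b in breakpoints if 0 < b < len(contig)})
    edges = [0] + cuts + [len(contig)]
    pieces = [contig[start:end] for start, end in zip(edges, edges[1:])]
    return [piece for piece in pieces if len(piece) >= min_length]


pieces = split_at_breakpoints("AAAACCCCGGGG", [4, 8])
```

With the default `min_length`, concatenating the returned pieces reproduces the original contig, so no sequence is lost; the split pieces can then be passed on to scaffolding or binning as separate contigs.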
Topics: Metagenome; Sequence Analysis, DNA; Metagenomics; Machine Learning; Benchmarking; Software; Algorithms
PubMed: 36376928
DOI: 10.1186/s13059-022-02810-y -
Methods in Molecular Biology (Clifton,... 2023
Viral metagenomics enables the detection, characterization, and quantification of viral sequences present in shotgun-sequenced datasets of purified virus-like particles and whole metagenomes. Short single- or paired-end reads from next-generation sequencing (Illumina) are a principal data type for metagenomics, and assembly of short reads allows for the identification of distinguishing viral signatures and complex genomic features for taxonomy and functional annotation. Here we describe the identification and characterization of viral genome sequences, bacteriophages, and eukaryotic viruses from a cohort of human stool samples, using multiple methods. Following the purification of virus-like particles, sequencing, quality refinement, and genome assembly, we begin the protocol with raw short reads deposited in an open-source nucleotide archive. We highlight the use of VIBRANT, an automated computational tool for the characterization of microbial viruses and their viral community function. Finally, we describe an alternative, assembly-free option of mapping reads to established databases of reference genomes and previously characterized metagenome-assembled viral genomes.
Topics: Humans; Metagenome; Genomics; Metagenomics; Viruses; Bacteriophages; High-Throughput Nucleotide Sequencing
PubMed: 37258871
DOI: 10.1007/978-1-0716-3072-3_17 -
Scientific Data Nov 2023
Urban lakes provide multiple benefits to society while influencing quality of life. Moreover, lakes and their microbiomes are sentinels of anthropogenic impact and can be used for natural resource management and planning. Here, we release original metagenomic data from several well-characterized and anthropogenically impacted eutrophic lakes in the vicinity of Stockholm (Sweden). Our goal was to collect representative microbial community samples and use shotgun sequencing to provide a broad view of the microbial diversity of productive urban lakes. Our dataset has an emphasis on Lake Mälaren as a major drinking water reservoir under anthropogenic impact. This dataset includes short-read sequence data and metagenome assemblies from each of 17 samples collected from eutrophic lakes near the greater Stockholm area. We used genome-resolved metagenomics and obtained 2378 metagenome-assembled genomes that de-replicated into 514 species-representative genomes. This dataset adds new datapoints to previously sequenced lakes and includes the first sequenced set of metagenomes from Lake Mälaren. Our dataset serves as a baseline for future monitoring of drinking water reservoirs and urban lakes.
Topics: Bacteria; Drinking Water; Lakes; Metagenome; Metagenomics; Sweden
PubMed: 37978200
DOI: 10.1038/s41597-023-02722-x -
Journal of Microbiology (Seoul, Korea) Mar 2021 (Review)
The environment is under siege from a variety of pollution sources. Fecal pollution is especially harmful as it disperses pathogenic bacteria into waterways. Unraveling the origins of mixed sources of fecal bacteria is difficult, and microbial source tracking (MST) in complex environments is still a daunting task. Despite the challenges, the need for answers far outweighs the difficulties experienced. Advancements in qPCR and next-generation sequencing (NGS) technologies have shifted traditional culture-based MST approaches towards culture-independent technologies, where community-based MST is becoming a method of choice. Metagenomic tools may be useful to overcome some of the limitations of community-based MST methods, as they can give deep insight into host-specific fecal markers and their association with different environments. Adoption of machine learning (ML) algorithms, along with metagenomic-based MST approaches, will also provide a statistically robust and automated platform. To complement that, ML-based approaches provide accurate optimization of resources. Given the successful application of ML-based models in disease prediction, outbreak investigation and medicine prescription, these methods could serve as a better surrogate for traditional MST approaches in the future.
Topics: Animals; Bacteria; Feces; High-Throughput Nucleotide Sequencing; Humans; Metagenome; Metagenomics
PubMed: 33565053
DOI: 10.1007/s12275-021-0668-9