-
Methods in Molecular Biology (Clifton,... 2022
Most microbial groups have not been cultivated yet, and the only way to approach the enormous diversity of rhodopsins that they contain in a sensible timeframe is through the analysis of their genomes. High-throughput sequencing technologies have allowed the release of community genomes (metagenomes) from many habitats in the photic zones of the ocean and lakes. The harvest is already impressive, ranging from the first bacterial rhodopsin (proteorhodopsin) to the recent discovery of heliorhodopsin by functional metagenomics. The search continues, however, using both bioinformatic and biochemical routes.
Topics: Metagenome; Metagenomics; Phylogeny; Rhodopsins, Microbial
PubMed: 35857224
DOI: 10.1007/978-1-0716-2329-9_4 -
Briefings in Bioinformatics Mar 2023
Metagenome assembly is an efficient approach to reconstruct microbial genomes from metagenomic sequencing data. Although short-read sequencing has been widely used for metagenome assembly, linked- and long-read sequencing have advanced assembly by providing long-range DNA connectedness. Many metagenome assembly tools were developed to simplify the assembly graphs and resolve the repeats in microbial genomes. However, there remains no comprehensive evaluation of metagenomic sequencing technologies, and there is a lack of practical guidance on selecting the appropriate metagenome assembly tools. This paper presents a comprehensive benchmark of 19 commonly used assembly tools applied to metagenomic sequencing datasets obtained from simulation, mock communities or human gut microbiomes. These datasets were generated using mainstream sequencing platforms, such as Illumina and BGISEQ short-read sequencing, 10x Genomics linked-read sequencing, and PacBio and Oxford Nanopore long-read sequencing. The assembly tools were extensively evaluated against many criteria, which revealed that long-read assemblers generated high contig contiguity but failed to reveal some medium- and high-quality metagenome-assembled genomes (MAGs). Linked-read assemblers obtained the highest number of overall near-complete MAGs from the human gut microbiomes. Hybrid assemblers using both short- and long-read sequencing were promising methods to improve both total assembly length and the number of near-complete MAGs. This paper also discusses the running time and peak memory consumption of these assembly tools and provides practical guidance on selecting them.
Topics: Humans; Metagenome; Benchmarking; Microbiota; Metagenomics; Genomics; High-Throughput Nucleotide Sequencing; Sequence Analysis, DNA
PubMed: 36917471
DOI: 10.1093/bib/bbad087 -
Briefings in Bioinformatics Sep 2023
Microbial genome recovery from metagenomes can further explain microbial ecosystem structures, functions and dynamics. Thus, this study developed the Additional Clustering Refiner (ACR) to enhance the recovery of high-purity prokaryotic and eukaryotic metagenome-assembled genomes (MAGs). ACR refines low-quality MAGs by subjecting them to iterative k-means clustering based on contig abundance and by increasing bin purity through validated universal marker genes. ACR's effectiveness was evaluated on synthetic and real-world metagenomic datasets, including short- and long-read sequences. The results demonstrated improved MAG purity and a significant increase in high- and medium-quality MAG recovery rates. In addition, ACR integrates with various binning algorithms, augmenting their strengths without modifying core features. Furthermore, its compatibility with multiple sequencing technologies expands its applicability. By efficiently recovering high-quality prokaryotic and eukaryotic genomes, ACR is a promising tool for deepening our understanding of microbial communities through genome-centric metagenomics.
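To make the clustering idea in this abstract concrete, the following is a minimal sketch of splitting one mixed bin by contig abundance with k-means. It is not ACR's implementation: the function name `kmeans_1d`, the single-sample abundance values, and the choice of k = 2 are illustrative assumptions, and the marker-gene purity check that the abstract describes is omitted.

```python
from statistics import mean

def kmeans_1d(values, k=2, max_iter=100):
    """Cluster scalar values (e.g. contig abundances) with Lloyd's k-means.

    Hypothetical helper for illustration only; not part of ACR.
    """
    ordered = sorted(values)
    # Deterministic initialisation: spread starting centroids over the sorted values.
    if k == 1:
        centroids = [mean(ordered)]
    else:
        centroids = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(max_iter):
        # Assignment step: each value joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: abs(v - centroids[c])) for v in values
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(values, new_labels) if lab == c]
            if members:
                centroids[c] = mean(members)
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
    return labels, centroids

# Made-up mean coverage values for five contigs that ended up in one bin.
abundances = [10.2, 9.8, 10.5, 48.9, 51.3]
labels, centroids = kmeans_1d(abundances, k=2)
print(labels)
```

In this toy input the low-coverage and high-coverage contigs fall into separate clusters, which is the kind of split a refiner could then accept or reject using marker genes.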
Topics: Metagenome; Eukaryota; Microbiota; Algorithms; Metagenomics; Cluster Analysis
PubMed: 37889119
DOI: 10.1093/bib/bbad381 -
Scientific Data Feb 2023
Common culturing techniques and priorities bias our discovery towards specific traits that may not be representative of microbial diversity in nature. So far, these biases have not been systematically examined. To address this gap, here we use 116,884 publicly available metagenome-assembled genomes (MAGs, completeness ≥80%) from 203 surveys worldwide as a culture-independent sample of bacterial and archaeal diversity, and compare these MAGs to the popular RefSeq genome database, which heavily relies on cultures. We compare the distribution of 12,454 KEGG gene orthologs (used as trait proxies) in the MAGs and RefSeq genomes, while controlling for environment type (ocean, soil, lake, bioreactor, human, and other animals). Using statistical modeling, we then determine the conditional probabilities that a species is represented in RefSeq depending on its genetic repertoire. We find that the majority of examined genes are significantly biased for or against in RefSeq. Our systematic estimates of gene prevalences across bacteria and archaea in nature and gene-specific biases in reference genomes constitute a resource for addressing these issues in the future.
Topics: Animals; Archaea; Bacteria; Genome, Microbial; Metagenome; Metagenomics
PubMed: 36759614
DOI: 10.1038/s41597-023-01994-7 -
Bioinformatics (Oxford, England) Sep 2022
MOTIVATION
Despite recent advancements in sequencing technologies and assembly methods, obtaining high-quality microbial genomes from metagenomic samples is still not a trivial task. Current metagenomic binners do not take full advantage of assembly graphs and are not optimized for long-read assemblies. Deep graph learning algorithms have been proposed in other fields to deal with complex graph data structures. The graph structure generated during the assembly process could be integrated with contig features to obtain better bins with deep learning.
RESULTS
We propose GraphMB, which uses graph neural networks to incorporate the assembly graph into the binning process. We test GraphMB on long-read datasets of different complexities, and compare the performance with other binners in terms of the number of High Quality (HQ) genome bins obtained. With our approach, we were able to obtain unique bins on all real datasets, and obtain more bins on most datasets. In particular, we obtained on average 17.5% more HQ bins when compared with state-of-the-art binners and 13.7% when aggregating the results of our binner with the others. These results indicate that a deep learning model can integrate contig-specific and graph-structure information to improve metagenomic binning.
AVAILABILITY AND IMPLEMENTATION
GraphMB is available from https://github.com/MicrobialDarkMatter/GraphMB.
SUPPLEMENTARY INFORMATION
Supplementary data are available at Bioinformatics online.
Topics: Sequence Analysis, DNA; Metagenomics; Metagenome; Genome, Microbial; Algorithms
PubMed: 35972375
DOI: 10.1093/bioinformatics/btac557 -
Microbial Genomics Apr 2024
Review
The ever-decreasing cost of sequencing and the growing potential applications of metagenomics have led to an unprecedented surge in data generation. One of the most prevalent applications of metagenomics is the study of microbial environments, such as the human gut. The gut microbiome plays a crucial role in human health, providing vital information for patient diagnosis and prognosis. However, analysing metagenomic data remains challenging due to several factors, including reference catalogues, sparsity and compositionality. Deep learning (DL) enables novel and promising approaches that complement state-of-the-art microbiome pipelines. DL-based methods can address almost all aspects of microbiome analysis, including novel pathogen detection, sequence classification, patient stratification and disease prediction. Beyond generating predictive models, a key aspect of these methods is also their interpretability. This article reviews DL approaches in metagenomics, including convolutional networks, autoencoders and attention-based models. These methods aggregate contextualized data and pave the way for improved patient care and a better understanding of the microbiome's key role in our health.
Topics: Humans; Deep Learning; Microbiota; Metagenome; Gastrointestinal Microbiome; Metagenomics
PubMed: 38630611
DOI: 10.1099/mgen.0.001231 -
STAR Protocols Sep 2022
Homology-based search is commonly used to uncover mobile genetic elements (MGEs) from metagenomes, but it heavily relies on reference genomes in the database. Here we introduce a protocol to extract CRISPR-targeted sequences from the assembled human gut metagenomic sequences without using a reference database. We describe the assembling of metagenome contigs, the extraction of CRISPR direct repeats and spacers, the discovery of protospacers, and the extraction of protospacer-enriched regions using the graph-based approach. This protocol could extract numerous characterized/uncharacterized MGEs. For complete details on the use and execution of this protocol, please refer to Sugimoto et al. (2021).
Topics: Base Sequence; Clustered Regularly Interspaced Short Palindromic Repeats; Humans; Metagenome; Metagenomics
PubMed: 35780428
DOI: 10.1016/j.xpro.2022.101525 -
Molecules (Basel, Switzerland) May 2021
Review
Microorganisms are highly regarded as a prominent source of natural products that have significant importance in many fields such as medicine, farming, environmental safety, and material production. However, only a small fraction of microorganisms can be cultivated under standard laboratory conditions, and the bulk of microorganisms in ecosystems remain unidentified, which restricts our knowledge of uncultured microbial metabolism. These uncultured organisms could nevertheless provide a large collection of novel natural products. Culture-independent metagenomics can address core questions about the potential for natural product (NP) production by cloning and analysing microbial DNA derived directly from environmental samples. Recent advances in next-generation sequencing and in genetic engineering tools for genome assembly have broadened the scope of metagenomics to offer perspectives into the life of uncultured microorganisms. In this review, we cover the methods of metagenomic library construction and heterologous expression for the exploration and development of the environmental metabolome, and focus on the function-based metagenomics, sequencing-based metagenomics, and single-cell metagenomics of uncultured microorganisms.
Topics: Bacteria; Biological Products; Ecosystem; High-Throughput Nucleotide Sequencing; Metagenome; Metagenomics
PubMed: 34067778
DOI: 10.3390/molecules26102977 -
Journal of Microbiological Methods Mar 2020
Review
Plant microbiota have different effects on the plant, which can be beneficial or pathogenic. In this study, we concentrated on beneficial microbes associated with plants, using endophytic microbes as a case study. Detailed knowledge of the microbial diversity, abundance, composition, functional gene patterns, and metabolic pathways at the genome level could assist in understanding the contributions of the microbial community towards plant growth and health. Recently, the study of microbial communities has improved greatly with the advent of next-generation sequencing and bioinformatics technologies. Analysis of next-generation sequencing data with a proper computational method plays a key role in examining the microbial metagenome. This review presents the general metagenomics and computational methods used in processing plant-associated metagenomes, with a focus on endophytes. It covers 1) an introduction to plant-associated microbiota and the factors driving their diversity; 2) the plant metagenome, focusing on DNA extraction, verification and quality control; 3) metagenomics methods used in community analysis of endophytes, focusing on the maize plant; and 4) computational methods used in the study of endophytic microbiomes. Limitations and future prospects of metagenomics and computational methods for the analysis of plant-associated metagenomes (endophytic metagenomes) are also discussed with the aim of fostering their development. We conclude that there is a need to adopt advanced genomic features such as k-mers of random size, which do not depend on annotation and can represent other sequence alternatives.
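As a brief illustration of the annotation-free k-mer features mentioned in the conclusion, the sketch below counts overlapping k-mers in a DNA sequence. It is a generic example rather than a method from the review: the function name `kmer_counts` is hypothetical, k is left as a free parameter, and windows containing ambiguous bases such as N are simply skipped.

```python
from collections import Counter

def kmer_counts(sequence, k):
    """Count overlapping k-mers in a DNA sequence, skipping ambiguous bases.

    Hypothetical helper for illustration only.
    """
    seq = sequence.upper()
    valid = set("ACGT")
    counts = Counter()
    # Slide a window of width k along the sequence, one base at a time.
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= valid:  # ignore windows containing N or other symbols
            counts[kmer] += 1
    return counts

# Toy sequence; the trailing N makes the last window ambiguous.
profile = kmer_counts("ACGTACGN", 3)
print(profile.most_common())
```

Because the counts depend only on the raw sequence, a profile like this can be computed for any contig or read without gene prediction or a reference database.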
Topics: Computational Biology; Endophytes; High-Throughput Nucleotide Sequencing; Metagenome; Metagenomics; Microbiota; Sequence Analysis, DNA; Zea mays
PubMed: 32027927
DOI: 10.1016/j.mimet.2020.105860 -
Methods in Molecular Biology (Clifton,... 2024
Review
Bacteriophage diversity is a relatively unknown frontier that is rapidly being explored, leading to a wealth of new information. New bacteriophages are being discovered at an astounding rate via both phage isolation studies and metagenomic analyses. In addition, a nucleotide sequence-based viral taxonomic system has been developed to better handle this wealth of new information. As a result of these developments, phage scientists are transitioning from knowing that there must be huge numbers of diverse kinds of phage particles in natural environments to identifying the actual abundance and phage diversity that is present in specific environments. This review documents the beginning of this transition, offering a glimpse into the magnitude of change unfolding in the field. It stands as a testament to the expanding frontiers of phage research, illuminating the remarkable progress made in unraveling the intricate world of bacteriophage diversity and advancing our understanding of these enigmatic viral entities.
Topics: Bacteriophages; Genomics; Metagenomics; Metagenome; Genome, Viral
PubMed: 37966589
DOI: 10.1007/978-1-0716-3549-0_1