- Applied Biochemistry and Biotechnology Oct 2017 (Review)
Microorganisms are found in every corner of nature, and a vast number of them are difficult to cultivate by classical microbiological techniques. The advent of metagenomics has revolutionized the field of microbial biotechnology. Metagenomics allows the recovery of genetic material directly from environmental niches without any cultivation techniques. Currently, metagenomic tools are widely employed to isolate and identify enzymes with novel biocatalytic activities from the uncultivable component of microbial communities. The employment of next-generation sequencing techniques for metagenomics has resulted in the generation of large sequence data sets derived from various environments, such as soil, the human body and ocean water. This review article describes the state-of-the-art techniques and tools in metagenomics and discusses the potential of metagenomic approaches for the bioprospecting of industrial enzymes from various environmental samples. We also describe the unusual novel enzymes discovered via metagenomic approaches and discuss the future prospects for metagenome technologies.
Topics: Enzymes; Metagenome; Metagenomics
PubMed: 28815469
DOI: 10.1007/s12010-017-2568-3
- Communications Biology Oct 2023
Assembly of reads from metagenomic samples is a hard problem, often resulting in highly fragmented genome assemblies. Metagenomic binning allows us to reconstruct genomes by re-grouping the sequences by their organism of origin, thus representing a crucial processing step when exploring the biological diversity of metagenomic samples. Here we present Adversarial Autoencoders for Metagenomics Binning (AAMB), an ensemble deep learning approach that integrates sequence co-abundances and tetranucleotide frequencies into a common denoised space that enables precise clustering of sequences into microbial genomes. When benchmarked, AAMB presented similar or better results compared with the state-of-the-art reference-free binner VAMB, reconstructing ~7% more near-complete (NC) genomes across simulated and real data. In addition, genomes reconstructed using AAMB had higher completeness and greater taxonomic diversity compared with VAMB. Finally, we implemented a pipeline integrating VAMB and AAMB that enabled improved binning, recovering 20% and 29% more simulated and real NC genomes, respectively, compared to VAMB, with moderate additional runtime.
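As a rough illustration of one of the two input signals named in this abstract, the sketch below computes tetranucleotide frequencies for a single contig. It is not AAMB's implementation: the function name `tetranucleotide_frequencies` is hypothetical, and the sketch assumes a plain 256-dimensional profile without the reverse-complement merging that binning tools commonly apply.

```python
from collections import Counter
from itertools import product


def tetranucleotide_frequencies(sequence: str) -> dict[str, float]:
    """Return the relative frequency of each of the 256 possible 4-mers.

    Hypothetical helper for illustration; windows containing ambiguous
    bases (for example N) are ignored.
    """
    sequence = sequence.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=4)]
    # Count every overlapping window of length 4.
    counts = Counter(sequence[i:i + 4] for i in range(len(sequence) - 3))
    # Normalise only over windows made of unambiguous bases.
    total = sum(counts[k] for k in kmers)
    if total == 0:
        return {k: 0.0 for k in kmers}
    return {k: counts[k] / total for k in kmers}


profile = tetranucleotide_frequencies("ACGTACGT")
print(len(profile))      # number of 4-mer features
print(profile["ACGT"])   # share of windows equal to ACGT
```

Contigs from the same genome tend to have similar profiles, which is why such a vector can be combined with abundance information before clustering.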
Topics: Metagenome; Genome, Microbial; Metagenomics; Cluster Analysis; Benchmarking
PubMed: 37865678
DOI: 10.1038/s42003-023-05452-3
- Annual Review of Biomedical Data Science Jul 2021
Viruses are the most abundant biological entity on Earth, infect cellular organisms from all domains of life, and are central players in the global biosphere. Over the last century, the discovery and characterization of viruses have progressed steadily alongside much of modern biology. In terms of outright numbers of novel viruses discovered, however, the last few years have been by far the most transformative for the field. Advances in methods for identifying viral sequences in genomic and metagenomic datasets, coupled to the exponential growth of environmental sequencing, have greatly expanded the catalog of known viruses and fueled the tremendous growth of viral sequence databases. Development and implementation of new standards, along with careful study of the newly discovered viruses, have transformed and will continue to transform our understanding of microbial evolution, ecology, and biogeochemical cycles, leading to new biotechnological innovations across many diverse fields, including environmental, agricultural, and biomedical sciences.
Topics: Ecology; Genome, Viral; Metagenome; Metagenomics; Viruses
PubMed: 34465172
DOI: 10.1146/annurev-biodatasci-012221-095114
- Microbiome Mar 2016 (Review)
Assembly of metagenomic sequence data into microbial genomes is of fundamental value to improving our understanding of microbial ecology and metabolism by elucidating the functional potential of hard-to-culture microorganisms. Here, we provide a synthesis of available methods to bin metagenomic contigs into species-level groups and highlight how genetic diversity, sequencing depth, and coverage influence binning success. Despite the computational cost of applying them to deeply sequenced complex metagenomes (e.g., soil), covarying patterns of contig coverage across multiple datasets significantly improve the binning process. We also discuss and compare current genome validation methods and reveal how these methods tackle the problem of chimeric genome bins, i.e., bins containing sequences from multiple species. Finally, we explore how population genome assembly can be used to uncover biogeographic trends and to characterize the effect of in situ functional constraints on genome-wide evolution.
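The idea of covarying coverage can be sketched with a small example: contigs from the same genome should rise and fall in coverage together across samples, so their coverage profiles correlate strongly. The contig names and coverage values below are made up for illustration, and Pearson correlation is only one assumed choice of similarity measure; the review surveys several methods.

```python
import math


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation of two equal-length coverage profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a flat profile carries no covariation signal
    return cov / (sd_x * sd_y)


# Hypothetical mean coverage of three contigs in four samples.
coverage = {
    "contig_a": [10.0, 40.0, 5.0, 22.0],
    "contig_b": [11.0, 38.0, 6.0, 20.0],
    "contig_c": [30.0, 4.0, 25.0, 9.0],
}

same_genome = pearson(coverage["contig_a"], coverage["contig_b"])
different_genome = pearson(coverage["contig_a"], coverage["contig_c"])
print(same_genome > different_genome)
```

With more samples the profiles become more discriminating, which is consistent with the abstract's point that multiple datasets improve binning at added computational cost.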
Topics: Contig Mapping; Datasets as Topic; Genome, Microbial; Metagenome; Metagenomics; Sequence Analysis, DNA
PubMed: 26951112
DOI: 10.1186/s40168-016-0154-5
- Methods in Molecular Biology (Clifton,... 2022
Most microbial groups have not been cultivated yet, and the only way to approach the enormous diversity of rhodopsins that they contain in a sensible timeframe is through the analysis of their genomes. High-throughput sequencing technologies have allowed the release of community genomic (metagenomic) datasets from many habitats in the photic zones of the ocean and lakes. Already the harvest is impressive, ranging from the first bacterial rhodopsin (proteorhodopsin) to the recent discovery of heliorhodopsin by functional metagenomics. The search continues via bioinformatic and biochemical routes.
Topics: Metagenome; Metagenomics; Phylogeny; Rhodopsins, Microbial
PubMed: 35857224
DOI: 10.1007/978-1-0716-2329-9_4
- NeoReviews May 2019 (Review)
The human microbiota includes the trillions of microorganisms living in the human body, whereas the human microbiome includes the genes and gene products of this microbiota. Bacteria were historically largely considered to be pathogens that inevitably led to human disease. However, because of advances in cultivation-based methods and the advent of metagenomics, bacteria are now recognized to be largely beneficial commensal organisms and, thus, key to normal and healthy human development. This relatively new area of medical research has yielded insights into diseases such as inflammatory bowel disease and obesity, as well as metabolic and atopic disorders. However, much remains unknown about the complexity of microbe-microbe and microbe-host interactions. Future efforts aimed at answering key questions pertaining to the early establishment of the microbiome, alongside what defines its dysbiosis, will likely lead to long-term health and mitigation of disease. Here, we review the relevant literature pertaining to modulations in the perinatal and neonatal microbiome, the impact of environmental and maternal factors in shaping the neonatal microbiome, and future questions and directions in the exciting emerging arena of metagenomic medicine.
Topics: Female; Forecasting; Humans; Infant Health; Infant, Newborn; Metagenome; Metagenomics; Microbiota; Pregnancy
PubMed: 31261078
DOI: 10.1542/neo.20-5-e258
- Briefings in Bioinformatics Sep 2023
Microbial genome recovery from metagenomes can further explain microbial ecosystem structures, functions and dynamics. Thus, this study developed the Additional Clustering Refiner (ACR) to enhance the recovery of high-purity prokaryotic and eukaryotic metagenome-assembled genomes (MAGs). ACR refines low-quality MAGs by subjecting them to iterative k-means clustering based on contig abundance and increases bin purity through validated universal marker genes. ACR's effectiveness was evaluated on synthetic and real-world metagenomic datasets, including short- and long-read sequences. The results demonstrated improved MAG purity and a significant increase in high- and medium-quality MAG recovery rates. In addition, ACR seamlessly integrates with various binning algorithms, augmenting their strengths without modifying core features. Furthermore, its compatibility with multiple sequencing technologies expands its applicability. By efficiently recovering high-quality prokaryotic and eukaryotic genomes, ACR is a promising tool for deepening our understanding of microbial communities through genome-centric metagenomics.
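To make the clustering step concrete, here is a minimal sketch of k-means applied to contig abundance vectors from one mixed bin. It is not ACR's code: the function names, the deterministic initialisation, and the example abundances are all assumptions for illustration, and the marker-gene purity check described in the abstract is omitted.

```python
def squared_distance(p: tuple[float, ...], q: tuple[float, ...]) -> float:
    """Squared Euclidean distance between two abundance vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points: list[tuple[float, ...]], k: int, iterations: int = 20):
    """Plain Lloyd's algorithm; requires 1 <= k <= len(points)."""
    # Deterministic start (an assumption): spread centroids over the sorted points.
    ordered = sorted(points)
    step = (len(ordered) - 1) / max(k - 1, 1)
    centroids = [ordered[round(i * step)] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each contig to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        updated = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                updated.append(tuple(sum(v) / len(members) for v in zip(*members)))
            else:
                updated.append(centroids[c])  # keep an empty cluster in place
        if updated == centroids:
            break  # assignments are stable
        centroids = updated
    return labels, centroids


# Hypothetical abundances of six contigs in two samples.
abundances = [
    (10.0, 12.0), (11.0, 13.0), (9.5, 11.0),
    (50.0, 48.0), (52.0, 47.0), (49.0, 51.0),
]
labels, centroids = kmeans(abundances, k=2)
print(labels)
```

In a refinement setting, a split like this would be accepted only if it improves bin quality by some independent check, which is the role the abstract assigns to universal marker genes.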
Topics: Metagenome; Eukaryota; Microbiota; Algorithms; Metagenomics; Cluster Analysis
PubMed: 37889119
DOI: 10.1093/bib/bbad381
- Scientific Data Feb 2023
Common culturing techniques and priorities bias our discovery towards specific traits that may not be representative of microbial diversity in nature. So far, these biases have not been systematically examined. To address this gap, here we use 116,884 publicly available metagenome-assembled genomes (MAGs, completeness ≥80%) from 203 surveys worldwide as a culture-independent sample of bacterial and archaeal diversity, and compare these MAGs to the popular RefSeq genome database, which heavily relies on cultures. We compare the distribution of 12,454 KEGG gene orthologs (used as trait proxies) in the MAGs and RefSeq genomes, while controlling for environment type (ocean, soil, lake, bioreactor, human, and other animals). Using statistical modeling, we then determine the conditional probabilities that a species is represented in RefSeq depending on its genetic repertoire. We find that the majority of examined genes are significantly biased for or against in RefSeq. Our systematic estimates of gene prevalences across bacteria and archaea in nature and gene-specific biases in reference genomes constitute a resource for addressing these issues in the future.
Topics: Animals; Archaea; Bacteria; Genome, Microbial; Metagenome; Metagenomics
PubMed: 36759614
DOI: 10.1038/s41597-023-01994-7
- Genomics, Proteomics & Bioinformatics Dec 2018 (Review)
Metagenomes from uncultured microorganisms are rich resources for novel enzyme genes. The methods used to screen metagenomic libraries fall into two categories, based on either the sequence or the function of the enzymes. The sequence-based approaches rely on the known sequences of the target gene families. In contrast, the function-based approaches do not involve the incorporation of metagenomic sequencing data and, therefore, may lead to the discovery of novel gene sequences with desired functions. In this review, we discuss the function-based screening strategies that have been used in the identification of enzymes from metagenomes. Because of its simplicity, agar plate screening is most commonly used in the identification of novel enzymes with diverse functions. Other screening methods with higher sensitivity are also employed, such as microtiter plate screening. Furthermore, several ultra-high-throughput methods were developed to deal with large metagenomic libraries. Among these are FACS-based, droplet-based, and in vivo reporter-based screening methods. The application of these novel screening strategies has increased the chance of discovering novel enzyme genes.
Topics: Animals; Bacteria; Enzymes; Gene Library; High-Throughput Screening Assays; Metagenome; Metagenomics; Plants
PubMed: 30597257
DOI: 10.1016/j.gpb.2018.01.002
- Bioinformatics (Oxford, England) Sep 2022
MOTIVATION
Despite recent advancements in sequencing technologies and assembly methods, obtaining high-quality microbial genomes from metagenomic samples is still not a trivial task. Current metagenomic binners do not take full advantage of assembly graphs and are not optimized for long-read assemblies. Deep graph learning algorithms have been proposed in other fields to deal with complex graph data structures. The graph structure generated during the assembly process could be integrated with contig features to obtain better bins with deep learning.
RESULTS
We propose GraphMB, which uses graph neural networks to incorporate the assembly graph into the binning process. We test GraphMB on long-read datasets of different complexities, and compare the performance with other binners in terms of the number of High Quality (HQ) genome bins obtained. With our approach, we were able to obtain unique bins on all real datasets, and obtain more bins on most datasets. In particular, we obtained on average 17.5% more HQ bins when compared with state-of-the-art binners and 13.7% when aggregating the results of our binner with the others. These results indicate that a deep learning model can integrate contig-specific and graph-structure information to improve metagenomic binning.
AVAILABILITY AND IMPLEMENTATION
GraphMB is available from https://github.com/MicrobialDarkMatter/GraphMB.
SUPPLEMENTARY INFORMATION
Supplementary data are available at Bioinformatics online.
Topics: Sequence Analysis, DNA; Metagenomics; Metagenome; Genome, Microbial; Algorithms
PubMed: 35972375
DOI: 10.1093/bioinformatics/btac557