-
Genome Biology Feb 2022
Recovering high-quality metagenome-assembled genomes (MAGs) from complex microbial ecosystems remains challenging. Recently, high-throughput chromosome conformation capture (Hi-C) has been applied to simultaneously study multiple genomes in natural microbial communities. We develop HiCBin, a novel open-source pipeline, to resolve high-quality MAGs utilizing Hi-C contact maps. HiCBin employs the HiCzin normalization method and the Leiden clustering algorithm and, for the first time, incorporates spurious contact detection into a binning pipeline. HiCBin is validated on one synthetic and two real metagenomic samples and is shown to outperform existing Hi-C-based binning methods. HiCBin is available at https://github.com/dyxstat/HiCBin.
Topics: Algorithms; Cluster Analysis; Metagenome; Metagenomics; Microbiota
PubMed: 35227283
DOI: 10.1186/s13059-022-02626-w -
Briefings in Bioinformatics Nov 2012 (Review)
Metagenomics has become an indispensable tool for studying the diversity and metabolic potential of environmental microbes, whose bulk is as yet non-cultivable. Continual progress in next-generation sequencing allows for generating increasingly large metagenomes and studying multiple metagenomes over time or space. Recently, a new type of holistic ecosystem study has emerged that seeks to combine metagenomics with biodiversity, meta-expression and contextual data. Such 'ecosystems biology' approaches bear the potential to not only advance our understanding of environmental microbes to a new level but also impose challenges due to increasing data complexities, in particular with respect to bioinformatic post-processing. This mini review aims to address selected opportunities and challenges of modern metagenomics from a bioinformatics perspective and hopefully will serve as a useful resource for microbial ecologists and bioinformaticians alike.
Topics: Biodiversity; Computational Biology; Genome, Archaeal; Genome, Bacterial; Metagenome; Metagenomics
PubMed: 22966151
DOI: 10.1093/bib/bbs039 -
Nature Biotechnology May 2021
Millions of new viral sequences have been identified from metagenomes, but the quality and completeness of these sequences vary considerably. Here we present CheckV, an automated pipeline for identifying closed viral genomes, estimating the completeness of genome fragments and removing flanking host regions from integrated proviruses. CheckV estimates completeness by comparing sequences with a large database of complete viral genomes, including 76,262 identified from a systematic search of publicly available metagenomes, metatranscriptomes and metaviromes. After validation on mock datasets and comparison to existing methods, we applied CheckV to large and diverse collections of metagenome-assembled viral sequences, including IMG/VR and the Global Ocean Virome. This revealed 44,652 high-quality viral genomes (that is, >90% complete), although the vast majority of sequences were small fragments, which highlights the challenge of assembling viral genomes from short-read metagenomes. Additionally, we found that removal of host contamination substantially improved the accurate identification of auxiliary metabolic genes and interpretation of viral-encoded functions.
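The ">90% complete" criterion mentioned above can be illustrated with a minimal sketch. It assumes that the length of the closest complete reference genome is already known; the function names and the simple length ratio are hypothetical illustrations, not CheckV's actual API or estimation procedure.

```python
def estimate_completeness(contig_length: int, reference_length: int) -> float:
    """Estimate completeness (%) as contig length over reference genome length.

    Assumption: reference_length is the length of the closest complete
    viral genome, obtained by some upstream matching step not shown here.
    """
    if reference_length <= 0:
        raise ValueError("reference_length must be positive")
    # Cap at 100% because a contig can be longer than its reference.
    return min(100.0, 100.0 * contig_length / reference_length)


def is_high_quality(completeness: float, threshold: float = 90.0) -> bool:
    """Apply the >90% completeness cutoff quoted in the abstract."""
    return completeness > threshold


if __name__ == "__main__":
    pct = estimate_completeness(contig_length=47_500, reference_length=50_000)
    print(f"{pct:.1f}% complete, high quality: {is_high_quality(pct)}")
```

The cutoff is strict, so a fragment estimated at exactly 90% would not be counted as high quality under this reading of the abstract.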
Topics: Genome, Viral; Metagenome; Metagenomics; Molecular Sequence Annotation; Software
PubMed: 33349699
DOI: 10.1038/s41587-020-00774-7 -
PloS One 2021
Characterizing the gut microbiota in terms of their capacity to interfere with drug metabolism is necessary to achieve drug efficacy and safety. Although examples of drug-microbiome interactions are well-documented, little has been reported about a computational pipeline for systematically identifying and characterizing bacterial enzymes that process particular classes of drugs. The goal of our study is to develop a computational approach that compiles drugs whose metabolism may be influenced by a particular class of microbial enzymes and that quantifies the variability in the collective level of those enzymes among individuals. The present paper describes this approach, using as an example microbial β-glucuronidases, which break down drug-glucuronide conjugates and reactivate the drugs or their metabolites. We identified 100 medications that may be metabolized by β-glucuronidases from the gut microbiome. These medications included morphine, estrogen, ibuprofen, midazolam, and their structural analogues. The analysis of metagenomic data available through the Sequence Read Archive (SRA) showed that the level of β-glucuronidase in the gut metagenomes was higher in males than in females, which provides a potential explanation for the sex-based differences in efficacy and toxicity reported for several drugs in previous studies. Our analysis also showed that infant gut metagenomes at birth and 12 months of age have higher levels of β-glucuronidase than the metagenomes of their mothers; the implication of this observed variability is discussed in the context of breastfeeding as well as infant hyperbilirubinemia. Overall, despite important limitations discussed in this paper, our analysis provided useful insights on the role of the human gut metagenome in the variability in drug response among individuals. Importantly, this approach exploits drug and metagenome data available in public databases as well as open-source cheminformatics and bioinformatics tools to predict drug-metagenome interactions.
Topics: Adult; Bacteria; Computational Biology; Data Management; Female; Forecasting; Gastrointestinal Microbiome; Glucuronidase; Humans; Infant, Newborn; Male; Metagenome; Metagenomics; Microbiota; Mothers
PubMed: 33411719
DOI: 10.1371/journal.pone.0244876 -
Letters in Applied Microbiology Mar 2022
The most alarming aspect of the Sudanese toombak smokeless tobacco is that it contains high levels of highly toxic tobacco-specific nitrosamines (TSNAs). Understanding the microbiology of toombak is of relevance because TSNAs are an indirect result of microbial-mediated nitrate reductions. We conducted shotgun metagenomic sequencing on a toombak product for which relevant features are presented here. The microbiota was composed of over 99% Bacteria. The most abundant taxa included Actinobacteria, specifically the genera Enteractinococcus and Corynebacterium, while Firmicutes were represented by the family Bacillaceae and the genus Staphylococcus. Selected gene targets were nitrate reduction and transport, antimicrobial resistance, and other genetic transference mechanisms. Canonical nitrate reduction and transport genes (i.e. nar) were found for Enteractinococcus and Corynebacterium, while various species of Staphylococcus exhibited a notable number of antimicrobial resistance and genetic transference genes. The nitrate reduction activity of the microbiota in toombak is suspected to be a contributing factor to its high levels of TSNAs. Additionally, the presence of antimicrobial resistance and transference genes could contribute to deleterious effects on oral and gastrointestinal health of the end user. Overall, the high toxicity and increased incidences of cancer and oral disease among toombak users warrant further investigation into the microbiology of toombak.
Topics: Metagenome; Metagenomics; Nitrosamines; Nicotiana; Tobacco, Smokeless
PubMed: 34862647
DOI: 10.1111/lam.13623 -
Annual Review of Virology Sep 2022 (Review)
Over the past 20 years, our knowledge of virus diversity and abundance in subsurface environments has expanded dramatically through application of quantitative metagenomic approaches. In most subsurface environments, viral diversity and abundance rival viral diversity and abundance observed in surface environments. Most of these viruses are uncharacterized in terms of their hosts and replication cycles. Analysis of accessory metabolic genes encoded by subsurface viruses indicates that they evolved to replicate within the unique features of their environments. The key question remains: What role do these viruses play in the ecology and evolution of the environments in which they replicate? Undoubtedly, as more virologists examine the role of viruses in subsurface environments, new insights will emerge.
Topics: Ecology; Metagenome; Metagenomics; Viruses
PubMed: 36173700
DOI: 10.1146/annurev-virology-093020-015957 -
Current Opinion in Virology Apr 2022 (Review)
Viruses are diverse biological entities that influence all life. Even with limited genome sizes, viruses can manipulate, drive, steal from, and kill their hosts. The field of virus genomics, using sequencing data to understand viral capabilities, has seen significant innovations in recent years. However, with advancements in metagenomic sequencing and related technologies, the bottleneck to discovering and employing the virosphere has become the analysis of genomes rather than generation. With metagenomics rapidly expanding available data, vital components of virus genomes and features are being overlooked, with the issue compounded by lagging databases and bioinformatics methods. Despite the field moving in a positive direction, there are noteworthy points to keep in mind, from how software-based virus genome predictions are interpreted to what information is overlooked by current standards. In this review, we discuss conventions and ideologies that likely need to be revised while continuing forward in the study of virus genomics.
Topics: Genome, Viral; Metagenome; Metagenomics; Software; Viruses
PubMed: 35051682
DOI: 10.1016/j.coviro.2022.101200 -
Food Research International (Ottawa,... Oct 2023
Jalebi is one of the oldest traditional Indian fermented wheat-based confectioneries. Since jalebi is prepared by natural fermentation, a diverse microbial community is expected to perform bio-functional activities. Owing to limited studies, the microbial community structure of jalebi is unknown. Hence, the present study aimed to profile the microbial community in jalebi by shotgun metagenomics and to predict putative probiotic and functional genes from metagenome-assembled genomes (MAGs). Bacteria were the most abundant domain (91.91%), within which Bacillota was the most abundant phylum (82%). The most abundant species was Lapidilactobacillus dextrinicus, followed by several species of lactic acid bacteria and acetic acid bacteria and a few yeasts. Lap. dextrinicus was also significantly more abundant in jalebi than in similar fermented wheat-based sourdough. Additionally, Lap. bayanensis, Pediococcus stilesii, Candida glabrata, Gluconobacter japonicus, Pichia kudriavzevii, and Wickerhamomyces anomalus were detected only in jalebi and not in sourdough. A few viruses and archaea were detected at <1% abundance. Genes from the abundant species were screened in silico against both the KEGG and EggNOG databases for putative health-beneficial attributes. Circular genomes of five high-quality MAGs, identified as Lapidilactobacillus dextrinicus, Enterococcus hirae, Pediococcus stilesii, Acetobacter indonesiensis and Acetobacter cibinongensis, were constructed separately, and putative genes were mapped and annotated. CRISPR/Cas gene clusters were detected in the genomes of all MAGs except Acetobacter cibinongensis. The MAGs also showed several secondary metabolites. Since the identified MAGs carry different putative genes for bio-functional properties, this may pave the way to selectively culture the uncultivated putative microbes for jalebi production. We believe this is the first report on the metagenome and MAGs of jalebi.
Topics: Metagenome; Edible Grain; Metagenomics; India
PubMed: 37689895
DOI: 10.1016/j.foodres.2023.113130 -
Bioinformatics (Oxford, England) Sep 2021
MOTIVATION
Metagenomic approaches hold the potential to characterize microbial communities and unravel the intricate link between the microbiome and biological processes. Assembly is one of the most critical steps in metagenomics experiments. It consists of transforming overlapping DNA sequencing reads into sufficiently accurate representations of the community's genomes. This process is computationally difficult and commonly results in genomes fragmented across many contigs. Computational binning methods are used to mitigate fragmentation by partitioning contigs based on their sequence composition, abundance or chromosome organization into bins representing the community's genomes. Existing binning methods have been principally tuned for bacterial genomes and do not perform favorably on viral metagenomes.
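As a rough illustration of the "sequence composition" signal that binning methods draw on, the sketch below computes a tetranucleotide frequency profile for each contig and compares two profiles by cosine similarity. The function names are hypothetical, and this is a generic composition feature rather than CoCoNet's deep learning model, which is not reproduced here.

```python
import math
from collections import Counter
from itertools import product


def kmer_profile(seq: str, k: int = 4) -> list[float]:
    """Return normalized k-mer frequencies over ACGT in a fixed order.

    K-mers containing other symbols (e.g. N) are ignored.
    """
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[m] for m in kmers)
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[m] / total for m in kmers]


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    contig_a = "ACGTACGTACGTACGTTTGA"
    contig_b = "ACGTACGTACGTACGAACGT"
    sim = cosine_similarity(kmer_profile(contig_a), kmer_profile(contig_b))
    print(f"composition similarity: {sim:.3f}")
```

Contigs from the same genome tend to have similar profiles, so a binning method can combine a score like this with abundance (coverage) information when grouping contigs.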
RESULTS
We propose Composition and Coverage Network (CoCoNet), a new binning method for viral metagenomes that leverages the flexibility and the effectiveness of deep learning to model the co-occurrence of contigs belonging to the same viral genome and provide a rigorous framework for binning viral contigs. Our results show that CoCoNet substantially outperforms existing binning methods on viral datasets.
AVAILABILITY AND IMPLEMENTATION
CoCoNet was implemented in Python and is available for download on PyPI (https://pypi.org/). The source code is hosted on GitHub at https://github.com/Puumanamana/CoCoNet and the documentation is available at https://coconet.readthedocs.io/en/latest/index.html. CoCoNet does not require extensive resources to run. For example, binning 100k contigs took about 4 h on 10 Intel CPU cores (2.4 GHz), with a memory peak at 27 GB (see Supplementary Fig. S9). To process a large dataset, CoCoNet may need to be run on a high RAM capacity server. Such servers are typically available in high-performance or cloud computing settings.
SUPPLEMENTARY INFORMATION
Supplementary data are available at Bioinformatics online.
Topics: Metagenome; Algorithms; Deep Learning; Software; Microbiota; Sequence Analysis, DNA; Metagenomics
PubMed: 33822891
DOI: 10.1093/bioinformatics/btab213 -
The Lancet. Microbe Nov 2022 (Review)
Measurement and manipulation of the microbiome is generally considered to have great potential for understanding the causes of complex diseases in humans, developing new therapies, and finding preventive measures. Many studies have found significant associations between the microbiome and various diseases; however, Koch's classical postulates remind us about the importance of causative reasoning when considering the relationship between microbes and a disease manifestation. Although causal discovery in observational microbiome data faces many challenges, methodological advances in causal structure learning have improved the potential of data-driven prediction of causal effects in large-scale biological systems. In this Personal View, we show the capability of existing methods for inferring causal effects from metagenomic data, and we highlight ways in which the introduction of causal structures that are more flexible than existing structures offers new opportunities for causal reasoning. Our observations suggest that microbiome research can further benefit from tools developed in the past 5 years in causal discovery and learn from their applications elsewhere.
Topics: Humans; Microbiota; Metagenomics; Causality; Metagenome
PubMed: 36152674
DOI: 10.1016/S2666-5247(22)00186-0