- Annual Review of Virology Sep 2021
Viral metagenomics has expanded our knowledge of the ecology of uncultured viruses, within both environmental (e.g., terrestrial and aquatic) and host-associated (e.g., plants and animals, including humans) contexts. Here, we emphasize the implementation of an ecological framework in viral metagenomic studies to address questions in virology rarely considered ecological, which can change our perception of viruses and how they interact with their surroundings. An ecological framework explicitly considers diverse variants of viruses in populations that make up communities of interacting viruses, with ecosystem-level effects. It provides a structure for the study of the diversity, distributions, dynamics, and interactions of viruses with one another, hosts, and the ecosystem, including interactions with abiotic factors. An ecological framework in viral metagenomics stands poised to broadly expand our knowledge in basic and applied virology. We highlight specific fundamental research needs to capitalize on its potential and advance the field.
Topics: Animals; Ecosystem; Genome, Viral; Humans; Metagenome; Metagenomics; Plants; Viruses
PubMed: 34033501
DOI: 10.1146/annurev-virology-010421-053015
- MicrobiologyOpen Jun 2022
Review
The rise of metagenomics offers a leap forward for understanding the genetic diversity of microorganisms in many different complex environments by providing a platform that can identify potentially unlimited numbers of known and novel microorganisms. As such, it is impossible to imagine new major initiatives without metagenomics. Nevertheless, it represents a relatively new discipline with various levels of complexity and demands on bioinformatics. The underlying principles and methods used in metagenomics are often seen as common knowledge and often not detailed or fragmented. Therefore, we reviewed these to guide microbiologists in taking the first steps into metagenomics. We specifically focus on a workflow aimed at reconstructing individual genomes, that is, metagenome-assembled genomes, integrating DNA sequencing, assembly, binning, identification and annotation.
Topics: Computational Biology; Metagenome; Metagenomics; Sequence Analysis, DNA
PubMed: 35765182
DOI: 10.1002/mbo3.1298
- Communications Biology Oct 2023
Assembly of reads from metagenomic samples is a hard problem, often resulting in highly fragmented genome assemblies. Metagenomic binning allows us to reconstruct genomes by re-grouping the sequences by their organism of origin, thus representing a crucial processing step when exploring the biological diversity of metagenomic samples. Here we present Adversarial Autoencoders for Metagenomics Binning (AAMB), an ensemble deep learning approach that integrates sequence co-abundances and tetranucleotide frequencies into a common denoised space that enables precise clustering of sequences into microbial genomes. When benchmarked, AAMB presented similar or better results compared with the state-of-the-art reference-free binner VAMB, reconstructing ~7% more near-complete (NC) genomes across simulated and real data. In addition, genomes reconstructed using AAMB had higher completeness and greater taxonomic diversity compared with VAMB. Finally, we implemented a pipeline integrating VAMB and AAMB that enabled improved binning, recovering 20% and 29% more simulated and real NC genomes, respectively, compared to VAMB, with moderate additional runtime.
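The tetranucleotide frequencies mentioned in this abstract can be illustrated with a minimal sketch. This is not the AAMB or VAMB implementation: the function name `tetranucleotide_frequencies` is hypothetical, and the sketch counts all 256 plain 4-mers, whereas real binners typically apply further steps (for example, merging reverse complements) that are not described in the abstract.

```python
from collections import Counter
from itertools import product
from typing import Dict

# All 256 possible 4-mers over the DNA alphabet, in a fixed order.
ALL_TETRAMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def tetranucleotide_frequencies(seq: str) -> Dict[str, float]:
    """Return the relative frequency of each 4-mer in a contig.

    Hypothetical helper for illustration only. Windows containing
    characters other than A, C, G, T (such as N) are ignored.
    """
    seq = seq.upper()
    # Count every overlapping window of length 4.
    counts = Counter(seq[i:i + 4] for i in range(len(seq) - 3))
    # Normalise by the number of valid (unambiguous) windows only.
    valid_total = sum(counts[k] for k in ALL_TETRAMERS)
    if valid_total == 0:
        return {k: 0.0 for k in ALL_TETRAMERS}
    return {k: counts[k] / valid_total for k in ALL_TETRAMERS}
```

Each contig then yields a fixed-length vector of 256 values, which is the kind of composition signal a binner could combine with co-abundance across samples before clustering.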
Topics: Metagenome; Genome, Microbial; Metagenomics; Cluster Analysis; Benchmarking
PubMed: 37865678
DOI: 10.1038/s42003-023-05452-3
- Scientific Data Oct 2023
Biofloc technology is increasingly recognised as a sustainable aquaculture method. In this technique, bioflocs are generated as microbial aggregates that play pivotal roles in assimilating toxic nitrogenous substances, thereby ensuring high water quality. Despite the crucial roles of the floc-associated bacterial (FAB) community in pathogen control and animal health, earlier microbiota studies have primarily relied on the metataxonomic approaches. Here, we employed shotgun sequencing on eight biofloc metagenomes from a commercial aquaculture system. This resulted in the generation of 106.6 Gbp, and the reconstruction of 444 metagenome-assembled genomes (MAGs). Among the recovered MAGs, 230 were high-quality (≥90% completeness, ≤5% contamination), and 214 were medium-quality (≥50% completeness, ≤10% contamination). Phylogenetic analysis unveiled Rhodobacteraceae as dominant members of the FAB community. The reported metagenomes and MAGs are crucial for elucidating the roles of diverse microorganisms and their functional genes in key processes such as nitrification, denitrification, and remineralization. This study will contribute to scientific understanding of phylogenetic diversity and metabolic capabilities of microbial taxa in aquaculture environments.
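The quality tiers quoted in this abstract can be written as a small classification rule. The sketch below is illustrative only: the function name `mag_quality` and the label "low" for everything outside the two stated tiers are assumptions, and only the completeness and contamination thresholds come from the abstract.

```python
def mag_quality(completeness: float, contamination: float) -> str:
    """Classify a MAG by completeness and contamination (both in percent).

    Thresholds follow the abstract: high quality is >=90% completeness
    and <=5% contamination; medium quality is >=50% completeness and
    <=10% contamination. The "low" label is an assumption for anything
    that meets neither tier.
    """
    if completeness >= 90 and contamination <= 5:
        return "high"
    if completeness >= 50 and contamination <= 10:
        return "medium"
    return "low"
```

Under this rule a bin with 95% completeness and 2% contamination is "high", while one with 95% completeness and 8% contamination falls to "medium" because it fails the stricter contamination limit.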
Topics: Animals; Aquaculture; Bacteria; Metagenome; Metagenomics; Microbiota; Phylogeny
PubMed: 37848477
DOI: 10.1038/s41597-023-02622-0
- metaMIC: reference-free misassembly identification and correction of de novo metagenomic assemblies. Genome Biology Nov 2022
Evaluating the quality of metagenomic assemblies is important for constructing reliable metagenome-assembled genomes and downstream analyses. Here, we present metaMIC (https://github.com/ZhaoXM-Lab/metaMIC), a machine learning-based tool for identifying and correcting misassemblies in metagenomic assemblies. Benchmarking results on both simulated and real datasets demonstrate that metaMIC outperforms existing tools when identifying misassembled contigs. Furthermore, metaMIC is able to localize the misassembly breakpoints, and the correction of misassemblies by splitting at misassembly breakpoints can improve downstream scaffolding and binning results.
Topics: Metagenome; Sequence Analysis, DNA; Metagenomics; Machine Learning; Benchmarking; Software; Algorithms
PubMed: 36376928
DOI: 10.1186/s13059-022-02810-y
- Nucleic Acids Research Aug 2022
Genome binning has been essential for characterization of bacteria, archaea, and even eukaryotes from metagenomes. Yet, few approaches exist for viruses. We developed vRhyme, a fast and precise software for construction of viral metagenome-assembled genomes (vMAGs). vRhyme utilizes single- or multi-sample coverage effect size comparisons between scaffolds and employs supervised machine learning to identify nucleotide feature similarities, which are compiled into iterations of weighted networks and refined bins. To refine bins, vRhyme utilizes unique features of viral genomes, namely a protein redundancy scoring mechanism based on the observation that viruses seldom encode redundant genes. Using simulated viromes, we displayed superior performance of vRhyme compared to available binning tools in constructing more complete and uncontaminated vMAGs. When applied to 10,601 viral scaffolds from human skin, vRhyme advanced our understanding of resident viruses, highlighted by identification of a Herelleviridae vMAG comprised of 22 scaffolds, and another vMAG encoding a nitrate reductase metabolic gene, representing near-complete genomes post-binning. vRhyme will enable a convention of binning uncultivated viral genomes and has the potential to transform metagenome-based viral ecology.
Topics: Genome, Viral; High-Throughput Nucleotide Sequencing; Humans; Metagenome; Metagenomics; Sequence Analysis, DNA; Software
PubMed: 35544285
DOI: 10.1093/nar/gkac341
- Microbiology Spectrum Aug 2023
Microbial secondary metabolites play crucial roles in microbial competition, communication, resource acquisition, antibiotic production, and a variety of other biotechnological processes. The retrieval of full-length BGC (biosynthetic gene cluster) sequences from uncultivated bacteria is difficult due to the technical constraints of short-read sequencing, making it impossible to determine BGC diversity. Using long-read sequencing and genome mining, 339 mainly full-length BGCs were recovered in this study, illuminating the wide range of BGCs from uncultivated lineages discovered in seawater from Aoshan Bay, Yellow Sea, China. Many extremely diverse BGCs were discovered in bacterial phyla such as , , , and as well as the previously uncultured archaeal phylum " Thermoplasmatota." The data from metatranscriptomics showed that 30.1% of secondary metabolic genes were being expressed, and they also revealed the expression pattern of BGC core biosynthetic genes and tailoring enzymes. Taken together, our results demonstrate that long-read metagenomic sequencing combined with metatranscriptomic analysis provides a direct view into the functional expression of BGCs in environmental processes.
Genome mining of metagenomic data has become the preferred method for the bioprospecting of novel compounds by cataloguing secondary metabolite potential. However, the accurate detection of BGCs requires unfragmented genomic assemblies, which have been technically difficult to obtain from metagenomes until recently with new long-read technologies. We used high-quality metagenome-assembled genomes generated from long-read data to determine the biosynthetic potential of microbes found in the surface water of the Yellow Sea. We recovered 339 highly diverse and mostly full-length BGCs from largely uncultured and underexplored bacterial and archaeal phyla. Additionally, we present long-read metagenomic sequencing combined with metatranscriptomic analysis as a potential method for gaining access to the largely underutilized genetic reservoir of specialized metabolite gene clusters in the majority of microbes that are not cultured. The combination of long-read metagenomic and metatranscriptomic analyses is significant because it can more accurately assess the mechanisms of microbial adaptation to the environment through BGC expression based on metatranscriptomic data.
Topics: Metagenomics; Bacteria; Metagenome; Archaea; Bacteroidetes
PubMed: 37409950
DOI: 10.1128/spectrum.01501-23
- Microbiology Spectrum Apr 2022
The reproductive tract metagenome plays a significant role in the various reproductive system functions, including reproductive cycles, health, and fertility. One of the major challenges in bovine vaginal metagenome studies is host DNA contamination, which limits the sequencing capacity for metagenomic content and reduces the accuracy of untargeted shotgun metagenomic profiling. This is the first study comparing the effectiveness of different host depletion and DNA extraction methods for bovine vaginal metagenomic samples. The host depletion methods evaluated were slow centrifugation (Soft-spin), NEBNext Microbiome DNA Enrichment kit (NEBNext), and propidium monoazide (PMA) treatment, while the extraction methods were DNeasy Blood and Tissue extraction (DNeasy) and QIAamp DNA Microbiome extraction (QIAamp). Soft-spin and QIAamp were the most effective host depletion and extraction methods, respectively, in reducing the amount of cattle genomic content in bovine vaginal samples. The reduced host-to-microbe ratio in the extracted DNA increased the sequencing depth for microbial reads in untargeted shotgun sequencing. Bovine vaginal samples extracted with QIAamp presented taxonomical profiles which closely resembled the mock microbial composition, especially for the recovery of Gram-positive bacteria. Additionally, samples extracted with QIAamp presented extensive functional profiles with deep coverage. Overall, a combination of Soft-spin and QIAamp provided the most robust representation of the vaginal microbial community in cattle while minimizing host DNA contamination.
In addition to the host tissue collected during the sampling process, bovine vaginal samples are saturated with large amounts of extracellular DNA and secreted proteins that are essential for physiological purposes, including the reproductive cycle and immune defense. Due to the high host-to-microbe genome ratio, which hampers the sequencing efficacy for metagenome samples and the recovery of the actual metagenomic profiles, bovine vaginal samples cannot benefit from the full potential of shotgun sequencing. This is the first investigation on the most effective host depletion and extraction methods for bovine vaginal metagenomic samples. This study demonstrated an effective combination of host depletion and extraction methods, which harvested higher percentages of 16S rRNA genes and microbial reads, which subsequently led to a taxonomical profile that resembled the actual community and a functional profile with deeper coverage. A representative metagenomic profile is essential for investigating the role of the bovine vaginal metagenome for both reproductive function and susceptibility to infections.
Topics: Animals; Cattle; DNA; Female; Metagenome; Metagenomics; RNA, Ribosomal, 16S; Sequence Analysis, DNA
PubMed: 35404108
DOI: 10.1128/spectrum.00412-21
- Current Opinion in Microbiology Dec 2022
Review
While they are the most abundant biological entities on the planet, the role of bacteriophages (phages) in the microbiome remains enigmatic and understudied. With a rise in the number of metagenomics studies and the publication of highly efficient phage mining programmes, we now have extensive data on the genomic and taxonomic diversity of (mainly) DNA bacteriophages in a wide range of environments. In addition, the higher throughput and quality of sequencing is allowing for strain-level reconstructions of phage genomes from metagenomes. These factors will ultimately help us to understand the role these phages play as part of specific microbial communities, enabling the tracking of individual virus genomes through space and time. Using lessons learned from the latest metagenomic studies, we focus on two explicit aspects of the role bacteriophages play within the microbiome, their ecological role in structuring bacterial populations, and their contribution to microbiome functioning by encoding auxiliary metabolism genes.
Topics: Humans; Bacteriophages; Metagenomics; Metagenome; Genome, Viral; Bacteria
PubMed: 36347213
DOI: 10.1016/j.mib.2022.102229
- Genes Oct 2022
The recent increase in publicly available metagenomic datasets with geospatial metadata has made it possible to determine location-specific, microbial fingerprints from around the world. Such fingerprints can be useful for comparing microbial niches for environmental research, as well as for applications within forensic science and public health. To determine the regional specificity for environmental metagenomes, we examined 4305 shotgun-sequenced samples from the MetaSUB Consortium dataset, the most extensive public collection of urban microbiomes, spanning 60 different cities, 30 countries, and 6 continents. We were able to identify city-specific microbial fingerprints using supervised machine learning (SML) on the taxonomic classifications, and we also compared the performance of ten SML classifiers. We then further evaluated the five algorithms with the highest accuracy, with the city and continental accuracy ranging from 85-89% to 90-94%, respectively. Thereafter, we used these results to develop Cassandra, a random-forest-based classifier that identifies bioindicator species to aid in fingerprinting and can infer higher-order microbial interactions at each site. We further tested the Cassandra algorithm on the Tara Oceans dataset, the largest collection of marine-based microbial genomes, where it classified the oceanic sample locations with 83% accuracy. These results and code show the utility of SML methods and Cassandra to identify bioindicator species across both oceanic and urban environments, which can help guide ongoing efforts in biotracing, environmental monitoring, and microbial forensics (MF).
Topics: Metagenomics; Metagenome; Microbiota; Supervised Machine Learning; Cities
PubMed: 36292799
DOI: 10.3390/genes13101914