metagenome - OpenMD.com Journal Search

Scientific Data Jun 2022

Authors: Ruirui Hu, Rui Yao, Lei Li...

With the rapid development of high-throughput sequencing technology, the amount of metagenomic data (including both 16S and whole-genome sequencing data) in public repositories is increasing exponentially. However, owing to the large and decentralized nature of the data, it is still difficult for users to mine, compare, and analyze the data. The animal metagenome database (AnimalMetagenome DB) integrates metagenomic sequencing data with host information, making it easier for users to find data of interest. The AnimalMetagenome DB is designed to contain all public metagenomic data from animals, and the data are divided into domestic and wild animal categories. Users can browse, search, and download animal metagenomic data of interest based on different attributes of the metadata such as animal species, sample site, study purpose, and DNA extraction method. The AnimalMetagenome DB version 1.0 includes metadata for 82,097 metagenomes from 4 domestic animals (pigs, bovines, horses, and sheep) and 540 wild animals. These metagenomes cover 15 years of experiments, 73 countries, 1,044 studies, 63,214 amplicon sequencing datasets, and 10,672 whole-genome sequencing datasets. All data in the database are hosted and available on figshare at https://doi.org/10.6084/m9.figshare.19728619.

Topics: Animals; Cattle; Databases, Factual; High-Throughput Nucleotide Sequencing; Horses; Metadata; Metagenome; Metagenomics; Sheep; Swine

PubMed: 35710683
DOI: 10.1038/s41597-022-01444-w

Combined assembly of long and short sequencing reads improve the efficiency of exploring the soil metagenome.

BMC Genomics Jan 2022

Authors: Guoshun Xu, Liwen Zhang, Xiaoqing Liu...

BACKGROUND

Advances in DNA sequencing technologies have transformed our capacity to perform life science research, decipher the dynamics of complex soil microbial communities and exploit them for plant disease management. However, soil is a complex conglomerate, which makes functional metagenomics studies very challenging.

RESULTS

Metagenomes assembled from long-read (PacBio, PB), short-read (Illumina, IL), and a mixture of PB and IL (PI) sequencing of soil DNA samples were compared. Ortholog analyses and functional annotation revealed that the PI approach significantly increased the contig length of the metagenomic sequences compared to IL and enlarged the gene pool compared to PB. The PI approach also offered comparable or higher species abundance than either PB or IL alone, and showed significant advantages for studying natural product biosynthetic genes in the soil microbiomes.

CONCLUSION

Our results provide an effective strategy for combining long and short-read DNA sequencing data to explore and distill the maximum information out of soil metagenomics.

Topics: High-Throughput Nucleotide Sequencing; Metagenome; Metagenomics; Sequence Analysis, DNA; Soil

PubMed: 34996356
DOI: 10.1186/s12864-021-08260-3

Aquatic viral metagenomics: Lights and shadows.

Virus Research Jul 2017

Review

Authors: Alberto Rastrojo, Antonio Alcamí

Viruses are the most abundant biological entities on Earth, exceeding bacteria in most ecosystems. Especially in oceans, viruses are thought to be the major planktonic predators shaping microorganism communities and controlling ocean biological capacity. Plankton lysis by viruses plays an important role in ocean nutrient and energy cycles. Viral metagenomics has emerged as a powerful tool to uncover viral diversity in aquatic ecosystems through the use of Next Generation Sequencing. However, many of the commonly used viral sample preparation steps have several important biases that must be considered to avoid misinterpretation of the results. In addition to biases caused by the purification of virus particles, viral DNA/RNA amplification and the preparation of genomic libraries could also introduce biases, and detailed knowledge of such protocols is required. In this review, the main steps in the viral metagenomic workflow are described, with special attention paid to the potential biases introduced by each one.

Topics: Genetic Variation; Genome, Viral; Geography; Metagenome; Metagenomics; Viruses; Water Microbiology

PubMed: 27889617
DOI: 10.1016/j.virusres.2016.11.021

Long-Read Metagenomics and CAZyme Discovery.

Methods in Molecular Biology (Clifton,... 2023

Authors: Alessandra Ferrillo, Carl Mathias Kobel, Arturo Vera-Ponce de León...

Microorganisms play a primary role in regulating biogeochemical cycles and are a valuable source of enzymes that have biotechnological applications, such as carbohydrate-active enzymes (CAZymes). However, the inability to culture the majority of microorganisms that exist in natural ecosystems restricts access to potentially novel bacteria and beneficial CAZymes. While commonplace molecular-based culture-independent methods such as metagenomics enable researchers to study microbial communities directly from environmental samples, recent progress in long-read sequencing technologies is advancing the field. We outline the key methodological stages that are required and describe specific protocols that are currently used for long-read metagenomic projects dedicated to CAZyme discovery.

Topics: Metagenomics; Microbiota; Metagenome; Carbohydrates; High-Throughput Nucleotide Sequencing

PubMed: 37149537
DOI: 10.1007/978-1-0716-3151-5_19

MIDAS2: Metagenomic Intra-species Diversity Analysis System.

Bioinformatics (Oxford, England) Jan 2023

Authors: Chunyu Zhao, Boris Dimitrov, Miriam Goldman...

SUMMARY

The Metagenomic Intra-Species Diversity Analysis System (MIDAS) is a scalable metagenomic pipeline that identifies single nucleotide variants (SNVs) and gene copy number variants in microbial populations. Here, we present MIDAS2, which addresses the computational challenges presented by increasingly large reference genome databases, while adding functionality for building custom databases and leveraging paired-end reads to improve SNV accuracy. This fast and scalable reengineering of the MIDAS pipeline enables thousands of metagenomic samples to be efficiently genotyped.

AVAILABILITY AND IMPLEMENTATION

The source code is available at https://github.com/czbiohub/MIDAS2.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

Topics: Metagenome; Software; Metagenomics; Genotype; Databases, Factual

PubMed: 36321886
DOI: 10.1093/bioinformatics/btac713

Metagenomic binning with assembly graph embeddings.

Bioinformatics (Oxford, England) Sep 2022

Authors: Andre Lamurias, Mantas Sereika, Mads Albertsen...

MOTIVATION

Despite recent advancements in sequencing technologies and assembly methods, obtaining high-quality microbial genomes from metagenomic samples is still not a trivial task. Current metagenomic binners do not take full advantage of assembly graphs and are not optimized for long-read assemblies. Deep graph learning algorithms have been proposed in other fields to deal with complex graph data structures. The graph structure generated during the assembly process could be integrated with contig features to obtain better bins with deep learning.

RESULTS

We propose GraphMB, which uses graph neural networks to incorporate the assembly graph into the binning process. We test GraphMB on long-read datasets of different complexities, and compare the performance with other binners in terms of the number of High Quality (HQ) genome bins obtained. With our approach, we were able to obtain unique bins on all real datasets, and obtain more bins on most datasets. In particular, we obtained on average 17.5% more HQ bins when compared with state-of-the-art binners and 13.7% when aggregating the results of our binner with the others. These results indicate that a deep learning model can integrate contig-specific and graph-structure information to improve metagenomic binning.

AVAILABILITY AND IMPLEMENTATION

GraphMB is available from https://github.com/MicrobialDarkMatter/GraphMB.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

Topics: Sequence Analysis, DNA; Metagenomics; Metagenome; Genome, Microbial; Algorithms

PubMed: 35972375
DOI: 10.1093/bioinformatics/btac557

An Improved Approach to Identify Bacterial Pathogens to Human in Environmental Metagenome.

Journal of Microbiology and... Sep 2020

Authors: Jihoon Yang, Adina Howe, Jaejin Lee...

The identification of bacterial pathogens to humans is critical for environmental microbial risk assessment. However, current methods for identifying pathogens in environmental samples are limited in their ability to detect highly diverse bacterial communities and accurately differentiate pathogens from commensal bacteria. In the present study, we suggest an improved approach that combines identification results obtained from multiple databases, including the multilocus sequence typing (MLST) database, the virulence factor database (VFDB), and the pathosystems resource integration center (PATRIC) databases, to resolve current challenges. By integrating the identification results from multiple databases, potential bacterial pathogens in metagenomes were identified and classified into eight different groups. Based on the distribution of genes in each group, we proposed an equation to calculate the metagenomic pathogen identification index (MPII) of each metagenome based on the weighted abundance of identified sequences in each database. We found that the accuracy of pathogen identification was improved by using combinations of multiple databases compared to that of individual databases. When the approach was applied to environmental metagenomes, metagenomes associated with activated sludge had higher estimated MPII values than those from other environments (drinking water, ocean water, ocean sediment, and freshwater sediment). The calculated MPII values were statistically distinguishable among different environments (p < 0.05). These results demonstrate that the suggested approach allows for more accurate identification of the pathogens associated with metagenomes.

Topics: Bacteria; Databases, Genetic; Environmental Microbiology; Humans; Metagenome; Metagenomics; Microbiota; Systems Integration

PubMed: 32627750
DOI: 10.4014/jmb.2005.05033

OGRE: Overlap Graph-based metagenomic Read clustEring.

Bioinformatics (Oxford, England) May 2021

Authors: Marleen Balvert, Xiao Luo, Ernestina Hauptfeld...

MOTIVATION

The microbes that live in an environment can be identified from the combined genomic material, also referred to as the metagenome. Sequencing a metagenome can result in large volumes of sequencing reads. A promising approach to reduce the size of metagenomic datasets is to cluster reads into groups based on their overlaps. Clustering reads is valuable for facilitating downstream analyses, including computationally intensive strain-aware assembly. As current read clustering approaches cannot handle the large datasets arising from high-throughput metagenome sequencing, a novel read clustering approach is needed. In this article, we propose OGRE, an Overlap Graph-based Read clustEring procedure for high-throughput sequencing data, with a focus on shotgun metagenomes.
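The general idea of clustering reads by their overlaps can be illustrated with a toy sketch. This is not OGRE's implementation: it assumes the overlaps have already been computed and are given as pairs of read IDs, and it simply groups reads into connected components of the overlap graph with a union-find structure. The function name `cluster_reads` and the read IDs are hypothetical.

```python
from collections import defaultdict


def cluster_reads(read_ids, overlaps):
    """Group reads into connected components of an overlap graph.

    read_ids: iterable of read identifiers (hypothetical names).
    overlaps: iterable of (read_a, read_b) pairs known to overlap.
    Returns a sorted list of clusters, each a sorted list of read IDs.
    """
    parent = {r: r for r in read_ids}

    def find(r):
        # Walk up to the root, halving the path along the way.
        while parent[r] != r:
            parent[r] = parent[parent[r]]
            r = parent[r]
        return r

    # Merge the components of every overlapping pair.
    for a, b in overlaps:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    clusters = defaultdict(list)
    for r in parent:
        clusters[find(r)].append(r)
    return sorted(sorted(members) for members in clusters.values())


reads = ["r1", "r2", "r3", "r4", "r5"]
pairs = [("r1", "r2"), ("r2", "r3"), ("r4", "r5")]
clusters = cluster_reads(reads, pairs)
```

Here `r1`, `r2`, and `r3` end up in one cluster because they are linked by a chain of overlaps, while `r4` and `r5` form a second cluster. A real tool would also need to filter spurious overlaps, which this sketch does not attempt.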

RESULTS

We show that for small datasets OGRE outperforms other read binners in terms of the number of species included in a cluster, also referred to as cluster purity, and the fraction of all reads that is placed in one of the clusters. Furthermore, OGRE is able to process metagenomic datasets that are too large for other read binners into clusters with high cluster purity.
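The two evaluation measures named above can be sketched as follows, assuming a benchmark in which the true species of each read is known. The function `cluster_metrics` and the toy data are hypothetical and only illustrate the definitions given in the abstract: the number of species per cluster and the fraction of all reads placed in a cluster.

```python
def cluster_metrics(clusters, species_of, total_reads):
    """Return (species count per cluster, fraction of reads clustered).

    clusters: list of lists of read IDs.
    species_of: dict mapping read ID -> true species label
        (assumed known, e.g. from simulated data).
    total_reads: number of reads in the full dataset.
    """
    # Fewer distinct species in a cluster means a purer cluster.
    species_per_cluster = [len({species_of[r] for r in c}) for c in clusters]
    clustered = sum(len(c) for c in clusters)
    return species_per_cluster, clustered / total_reads


toy_clusters = [["r1", "r2", "r3"], ["r4", "r5"]]
toy_species = {"r1": "A", "r2": "A", "r3": "B", "r4": "C", "r5": "C"}
species_counts, fraction = cluster_metrics(toy_clusters, toy_species, total_reads=8)
```

In this toy case the first cluster mixes two species and the second contains one, and five of the eight reads are clustered.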

CONCLUSION

OGRE is the only method that can successfully cluster reads into species-specific clusters for large metagenomic datasets without running into computation-time or memory issues.

AVAILABILITY AND IMPLEMENTATION

Code is made available on GitHub (https://github.com/Marleen1/OGRE).

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

Topics: Algorithms; Cluster Analysis; High-Throughput Nucleotide Sequencing; Metagenome; Metagenomics; Sequence Analysis, DNA; Software

PubMed: 32871010
DOI: 10.1093/bioinformatics/btaa760

Recovery of nearly 8,000 metagenome-assembled genomes substantially expands the tree of life.

Nature Microbiology Nov 2017

Authors: Donovan H Parks, Christian Rinke, Maria Chuvochina...

Challenges in cultivating microorganisms have limited the phylogenetic diversity of currently available microbial genomes. This is being addressed by advances in sequencing throughput and computational techniques that allow for the cultivation-independent recovery of genomes from metagenomes. Here, we report the reconstruction of 7,903 bacterial and archaeal genomes from >1,500 public metagenomes. All genomes are estimated to be ≥50% complete and nearly half are ≥90% complete with ≤5% contamination. These genomes increase the phylogenetic diversity of bacterial and archaeal genome trees by >30% and provide the first representatives of 17 bacterial and three archaeal candidate phyla. We also recovered 245 genomes from the Patescibacteria superphylum (also known as the Candidate Phyla Radiation) and find that the relative diversity of this group varies substantially with different protein marker sets. The scale and quality of this data set demonstrate that recovering genomes from metagenomes provides an expedient path forward to exploring microbial dark matter.
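The completeness and contamination thresholds quoted above can be applied as a simple filter. The sketch below uses only the cut-offs stated in the abstract; the tier names and the function `quality_tier` are hypothetical labels chosen for illustration, and the study's full selection criteria may differ.

```python
def quality_tier(completeness, contamination):
    """Classify a genome by the thresholds quoted in the abstract.

    Both arguments are percentages in the range 0-100.
    Tier names are illustrative, not taken from the study.
    """
    if completeness >= 90 and contamination <= 5:
        return "near-complete"   # >=90% complete with <=5% contamination
    if completeness >= 50:
        return "retained"        # meets the >=50% completeness floor
    return "excluded"


examples = [(95.0, 2.0), (72.5, 4.0), (93.0, 8.0), (40.0, 1.0)]
tiers = [quality_tier(comp, cont) for comp, cont in examples]
```

A genome that is 93% complete but 8% contaminated falls into the middle tier here, since it misses the contamination cut-off for the top tier.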

Topics: Archaea; Bacteria; Genome, Archaeal; Genome, Bacterial; Metagenome; Metagenomics; Phylogeny; Sequence Analysis, DNA

PubMed: 28894102
DOI: 10.1038/s41564-017-0012-7

Metagenomics Approaches for Improving Food Safety: A Review.

Journal of Food Protection Mar 2022

Review

Authors: Craig Billington, Joanne M Kingsbury, Lucia Rivas...

ABSTRACT

Advancements in next-generation sequencing technology have dramatically reduced the cost and increased the ease of microbial whole genome sequencing. This approach is revolutionizing the identification and analysis of foodborne microbial pathogens, facilitating expedited detection and mitigation of foodborne outbreaks, improving public health outcomes, and limiting costly recalls. However, next-generation sequencing is still anchored in the traditional laboratory practice of the selection and culture of a single isolate. Metagenomic-based approaches, including metabarcoding and shotgun and long-read metagenomics, are part of the next disruptive revolution in food safety diagnostics and offer the potential to directly identify entire microbial communities in a single food, ingredient, or environmental sample. In this review, metagenomic-based approaches are introduced and placed within the context of conventional detection and diagnostic techniques, and essential considerations for undertaking metagenomic assays and data analysis are described. Recent applications of the use of metagenomics for food safety are discussed alongside current limitations and knowledge gaps and new opportunities arising from the use of this technology.

Topics: Food Safety; High-Throughput Nucleotide Sequencing; Metagenome; Metagenomics; Whole Genome Sequencing

PubMed: 34706052
DOI: 10.4315/JFP-21-301