sequence analysis - OpenMD.com Journal Search

Applied and Environmental Microbiology Jan 2020

Review

Authors: Patrick D Schloss

More than 10 years ago, we published the paper describing the mothur software package in Applied and Environmental Microbiology. Our goal was to create a comprehensive package that allowed users to analyze amplicon sequence data using the most robust methods available. mothur has helped lead the community through the ongoing sequencing revolution and continues to provide this service to the microbial ecology community. Beyond its success and impact on the field, mothur's development exposed a series of observations that are generally translatable across science. Perhaps the observation that stands out the most is that all science is done in the context of prevailing ideas and available technologies. Although it is easy to criticize choices that were made 10 years ago through a modern lens, if we were to wait for all of the possible limitations to be solved before proceeding, science would stall. Even preceding the development of mothur, it was necessary to address the most important problems and work backwards to other problems that limited access to robust sequence analysis tools. At the same time, we strive to expand mothur's capabilities in a data-driven manner to incorporate new ideas and accommodate changes in data and desires of the research community. It has been edifying to see the benefit that a simple set of tools can bring to so many other researchers.

Topics: Environmental Microbiology; Sequence Analysis; Software

PubMed: 31704678
DOI: 10.1128/AEM.02343-19

The Versatility of SMRT Sequencing.

Genes Jan 2019

Authors: Matthew S Hestand, Adam Ameur

The adoption of single molecule real-time (SMRT) sequencing [...].

Topics: Animals; High-Throughput Nucleotide Sequencing; Humans; Plants; Sensitivity and Specificity; Sequence Analysis, DNA

PubMed: 30621217
DOI: 10.3390/genes10010024

New developments of alignment-free sequence comparison: measures, statistics and next-generation sequencing.

Briefings in Bioinformatics May 2014

Review

Authors: Kai Song, Jie Ren, Gesine Reinert...

With the development of next-generation sequencing (NGS) technologies, a large amount of short read data has been generated. Assembly of these short reads can be challenging for genomes and metagenomes without template sequences, making alignment-based genome sequence comparison difficult. In addition, sequence reads from NGS can come from different regions of various genomes and they may not be alignable. Sequence signature-based methods for genome comparison based on the frequencies of word patterns in genomes and metagenomes can potentially be useful for the analysis of short reads data from NGS. Here we review the recent development of alignment-free genome and metagenome comparison based on the frequencies of word patterns with emphasis on the dissimilarity measures between sequences, the statistical power of these measures when two sequences are related and the applications of these measures to NGS data.
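As a rough illustration of the word-frequency idea described above, the sketch below counts overlapping k-letter words in two sequences and compares their relative frequencies. The Euclidean distance used here is only one simple choice of dissimilarity and is an assumption of this example, not a measure taken from the review; the function names and the default k are likewise hypothetical.

```python
from collections import Counter
from math import sqrt


def kmer_frequencies(seq, k):
    """Return the relative frequency of every overlapping k-letter word in seq."""
    seq = seq.upper()
    if len(seq) < k:
        raise ValueError("sequence is shorter than the word length k")
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}


def euclidean_dissimilarity(seq_a, seq_b, k=3):
    """Euclidean distance between two k-mer frequency profiles.

    Assumption: plain Euclidean distance, chosen for simplicity only.
    No alignment is computed; only word frequencies are compared.
    """
    freq_a = kmer_frequencies(seq_a, k)
    freq_b = kmer_frequencies(seq_b, k)
    words = set(freq_a) | set(freq_b)
    return sqrt(sum((freq_a.get(w, 0.0) - freq_b.get(w, 0.0)) ** 2 for w in words))
```

Because only word counts are used, the two inputs need not be alignable or even of similar length, which is the property that makes this family of methods attractive for short reads.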

Topics: Algorithms; Computational Biology; Genomics; High-Throughput Nucleotide Sequencing; Markov Chains; Models, Statistical; Sequence Alignment; Sequence Analysis

PubMed: 24064230
DOI: 10.1093/bib/bbt067

High-Throughput Selection and Characterisation of Aptamers on Optical Next-Generation Sequencers.

International Journal of Molecular... Aug 2021

Review

Authors: Alissa Drees, Markus Fischer

Aptamers feature a number of advantages, compared to antibodies. However, their application has been limited so far, mainly because of the complex selection process. 'High-throughput sequencing fluorescent ligand interaction profiling' (HiTS-FLIP) significantly increases the selection efficiency and is consequently a very powerful and versatile technology for the selection of high-performance aptamers. It is the first experiment to allow the direct and quantitative measurement of the affinity and specificity of millions of aptamers simultaneously by harnessing the potential of optical next-generation sequencing platforms to perform fluorescence-based binding assays on the clusters displayed on the flow cells and determining their sequence and position in regular high-throughput sequencing. Many variants of the experiment have been developed that allow automation and in situ conversion of DNA clusters into base-modified DNA, RNA, peptides, and even proteins. In addition, the information from mutational assays, performed with HiTS-FLIP, provides deep insights into the relationship between the sequence, structure, and function of aptamers. This enables a detailed understanding of the sequence-specific rules that determine affinity, and thus, supports the evolution of aptamers. Current variants of the HiTS-FLIP experiment and its application in the field of aptamer selection, characterisation, and optimisation are presented in this review.

Topics: Aptamers, Nucleotide; Automation, Laboratory; High-Throughput Nucleotide Sequencing; Mutagenesis; Optical Devices; Sequence Analysis, DNA

PubMed: 34502110
DOI: 10.3390/ijms22179202

Sequence analysis by iterated maps, a review.

Briefings in Bioinformatics May 2014

Review

Authors: Jonas S Almeida

Among alignment-free methods, Iterated Maps (IMs) are on a particular extreme: they are also scale free (order free). The use of IMs for sequence analysis is also distinct from other alignment-free methodologies in being rooted in statistical mechanics instead of computational linguistics. Both of these roots go back over two decades to the use of fractal geometry in the characterization of phase-space representations. The time series analysis origin of the field is betrayed by the title of the manuscript that started this alignment-free subdomain in 1990, 'Chaos Game Representation'. The clash between the analysis of sequences as continuous series and the better established use of Markovian approaches to discrete series was almost immediate, with a defining critique published in the same journal 2 years later. The rest of that decade would go by before the scale-free nature of the IM space was uncovered. The ensuing decade saw this scalability generalized for non-genomic alphabets as well as an interest in its use for graphic representation of biological sequences. Finally, in the past couple of years, in step with the emergence of BigData and MapReduce as a new computational paradigm, there is a surprising third act in the IM story. Multiple reports have described gains in computational efficiency of multiple orders of magnitude over more conventional sequence analysis methodologies. The stage appears to be now set for a recasting of IMs with a central role in processing nextgen sequencing results.
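For readers unfamiliar with the Chaos Game Representation named above, a minimal sketch of the iterated map follows: each nucleotide is assigned a corner of the unit square, and every step moves halfway from the current point toward the corner of the next base. The particular corner assignment, the starting point, and the function name are assumptions made for this example.

```python
# Assumed corner assignment for the four nucleotides on the unit square.
CORNERS = {
    "A": (0.0, 0.0),
    "C": (0.0, 1.0),
    "G": (1.0, 1.0),
    "T": (1.0, 0.0),
}


def chaos_game_representation(seq, start=(0.5, 0.5)):
    """Map a DNA sequence to a list of (x, y) points in the unit square.

    Each point is the midpoint between the previous point and the corner
    assigned to the current base, so a point's position encodes the
    suffix of the sequence read so far.
    """
    x, y = start
    points = []
    for base in seq.upper():
        if base not in CORNERS:
            raise ValueError(f"unsupported symbol: {base!r}")
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2.0, (y + cy) / 2.0
        points.append((x, y))
    return points
```

Sequences that end in the same word land in the same sub-square, with longer shared suffixes giving smaller sub-squares; this is one way to see why the representation is not tied to a fixed word length.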

Topics: Computational Biology; Fractals; Models, Statistical; Nonlinear Dynamics; Sequence Alignment; Sequence Analysis

PubMed: 24162172
DOI: 10.1093/bib/bbt072

Platforms and Analytical Tools Used in Nucleic Acid Sequence-Based Microbial Genotyping Procedures.

Microbiology Spectrum Jan 2019

Review

Authors: Duncan MacCannell

In the decade and a half since the introduction of next-generation sequencing (NGS), the technical feasibility, cost, and overall utility of sequencing have changed dramatically, including applications for infectious disease epidemiology. Massively parallel sequencing technologies have decreased the cost of sequencing by more than 6 orders of magnitude over this time, with a corresponding increase in data generation and complexity. This review provides an overview of the basic principles, chemistry, and operational mechanics of current sequencing technologies, including both conventional Sanger and NGS approaches. As the generation of large amounts of sequence data becomes increasingly routine, the role of bioinformatics in data analysis and reporting becomes all the more critical, and the successful deployment of NGS in public health settings requires careful consideration of changing information technology, bioinformatics, workforce, and regulatory requirements. While there remain important challenges to the sustainable implementation of NGS in public health, in terms of both laboratory and bioinformatics capacity, the impact of these technologies on infectious disease surveillance and outbreak investigations has been nothing short of revolutionary. Understanding the important role that NGS plays in modern public health laboratory practice is critical, as is the need to ensure appropriate workforce, infrastructure, facilities, and funding consideration for routine NGS applications, future innovation, and rapidly scaling NGS-based infectious disease surveillance and outbreak response activities. *This article is part of a curated collection.

Topics: Computational Biology; DNA; Data Analysis; Gene Library; High-Throughput Nucleotide Sequencing; Humans; Sequence Analysis, DNA

PubMed: 30737915
DOI: 10.1128/microbiolspec.AME-0005-2018

how_are_we_stranded_here: quick determination of RNA-Seq strandedness.

BMC Bioinformatics Jan 2022

Authors: Brandon Signal, Tim Kahlke

BACKGROUND

Quality control checks are the first step in RNA-Sequencing analysis, which enable the identification of common issues that occur in the sequenced reads. Checks for sequence quality, contamination, and complexity are commonplace, and allow users to implement steps downstream which can account for these issues. Strand-specificity of reads is frequently overlooked and is often unavailable even in published data, yet when unknown or incorrectly specified can have detrimental effects on the reproducibility and accuracy of downstream analyses.

RESULTS

To address these issues, we developed how_are_we_stranded_here, a Python library that helps to quickly infer strandedness of paired-end RNA-Sequencing data. Testing on both simulated and real RNA-Sequencing reads showed that it correctly measures strandedness, and measures outside the normal range may indicate sample contamination.
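The general idea of measuring strandedness can be sketched as follows. This is not the code or the interface of how_are_we_stranded_here; it is a hypothetical illustration that assumes the alignment strand of read 1 and the strand of the gene it overlaps are already known for each read pair. The function names and both cutoffs are assumptions chosen for the example.

```python
def strandedness_fraction(pairs):
    """Fraction of read pairs whose read 1 lies on the same strand as its gene.

    pairs: iterable of (read1_strand, gene_strand) tuples, each '+' or '-'.
    """
    same = 0
    total = 0
    for read1_strand, gene_strand in pairs:
        total += 1
        if read1_strand == gene_strand:
            same += 1
    if total == 0:
        raise ValueError("no read pairs supplied")
    return same / total


def classify_strandedness(fraction, stranded_cutoff=0.9, unstranded_band=0.1):
    """Label a fraction; both thresholds are illustrative assumptions."""
    if fraction >= stranded_cutoff:
        return "forward"      # read 1 mostly matches the gene strand
    if fraction <= 1 - stranded_cutoff:
        return "reverse"      # read 1 mostly opposes the gene strand
    if abs(fraction - 0.5) <= unstranded_band:
        return "unstranded"   # roughly an even split
    return "ambiguous"        # between the expected ranges; worth a closer look
```

A value that falls in the "ambiguous" range corresponds to the situation the abstract describes, where a measurement outside the normal range may point to a problem with the sample.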

CONCLUSIONS

how_are_we_stranded_here is fast and user friendly, making it easy to implement in quality control pipelines prior to analysing RNA-Sequencing data. how_are_we_stranded_here is freely available at https://github.com/betsig/how_are_we_stranded_here .

Topics: High-Throughput Nucleotide Sequencing; RNA-Seq; Reproducibility of Results; Sequence Analysis, DNA; Sequence Analysis, RNA; Software

PubMed: 35065593
DOI: 10.1186/s12859-022-04572-7

Sequencing revolution.

Developmental Medicine and Child... Aug 2011

Authors: Yanick J Crow

Topics: Genomics; Humans; Sequence Analysis, DNA

PubMed: 21679360
DOI: 10.1111/j.1469-8749.2011.04016.x

NetSeekR: a network analysis pipeline for RNA-Seq time series data.

BMC Bioinformatics Jan 2022

Authors: Himangi Srivastava, Drew Ferrell, George V Popescu...

BACKGROUND

Recent development of bioinformatics tools for Next Generation Sequencing data has facilitated complex analyses and prompted large scale experimental designs for comparative genomics. When combined with the advances in network inference tools, this can lead to powerful methodologies for mining genomics data, allowing development of pipelines that stretch from sequence reads mapping to network inference. However, integrating various methods and tools available over different platforms requires a programmatic framework to fully exploit their analytic capabilities. Integrating multiple genomic analysis tools faces challenges from standardization of input and output formats, normalization of results for performing comparative analyses, to developing intuitive and easy to control scripts and interfaces for the genomic analysis pipeline.

RESULTS

We describe here NetSeekR, a network analysis R package that includes the capacity to analyze time series of RNA-Seq data, to perform correlation and regulatory network inferences and to use network analysis methods to summarize the results of a comparative genomics study. The software pipeline includes alignment of reads, differential gene expression analysis, correlation network analysis, regulatory network analysis, gene ontology enrichment analysis and network visualization of differentially expressed genes. The implementation provides support for multiple RNA-Seq read mapping methods and allows comparative analysis of the results obtained by different bioinformatics methods.

CONCLUSION

Our methodology increases the level of integration of genomics data analysis tools to network inference, facilitating hypothesis building, functional analysis and genomics discovery from large scale NGS data. When combined with network analysis and simulation tools, the pipeline allows for developing systems biology methods using large scale genomics data.

Topics: Computational Biology; High-Throughput Nucleotide Sequencing; RNA-Seq; Sequence Analysis, RNA; Time Factors

PubMed: 35090393
DOI: 10.1186/s12859-021-04554-1

iGenomics: Comprehensive DNA sequence analysis on your Smartphone.

GigaScience Dec 2020

Authors: Aspyn Palatnick, Bin Zhou, Elodie Ghedin...

BACKGROUND

Following the miniaturization of integrated circuitry and other computer hardware over the past several decades, DNA sequencing is on a similar path. Leading this trend is the Oxford Nanopore sequencing platform, which currently offers the hand-held MinION instrument and even smaller instruments on the horizon. This technology has been used in several important applications, including the analysis of genomes of major pathogens in remote stations around the world. However, despite the simplicity of the sequencer, an equally simple and portable analysis platform is not yet available.

RESULTS

iGenomics is the first comprehensive mobile genome analysis application, with capabilities to align reads, call variants, and visualize the results entirely on an iOS device. Implemented in Objective-C using the FM-index, banded dynamic programming, and other high-performance bioinformatics techniques, iGenomics is optimized to run in a mobile environment. We benchmark iGenomics using a variety of real and simulated Nanopore sequencing datasets of viral and bacterial genomes and show that iGenomics has performance comparable to the popular BWA-MEM/SAMtools/IGV suite, without necessitating a laptop or server cluster.
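Banded dynamic programming, mentioned above, limits an alignment table to cells near the main diagonal so that far fewer cells are computed. The sketch below applies that idea to plain edit distance in Python. It is an illustration of the technique only, not iGenomics code (which the abstract says is written in Objective-C); the function name and the unit-cost scoring are assumptions.

```python
def banded_edit_distance(a, b, band):
    """Edit distance computed only over cells with |i - j| <= band.

    Returns None when the length difference alone exceeds the band.
    The result equals the true edit distance whenever that distance is
    at most `band`; otherwise it is an upper bound.
    """
    n, m = len(a), len(b)
    if abs(n - m) > band:
        return None
    inf = float("inf")
    # Row 0: turning the empty prefix of `a` into b[:j] costs j insertions.
    prev = [j if j <= band else inf for j in range(m + 1)]
    for i in range(1, n + 1):
        # Rows are full length for clarity; only in-band cells are filled.
        curr = [inf] * (m + 1)
        lo = max(0, i - band)
        hi = min(m, i + band)
        for j in range(lo, hi + 1):
            if j == 0:
                curr[j] = i  # delete the first i characters of `a`
                continue
            substitute = prev[j - 1] + (a[i - 1] != b[j - 1])
            delete = prev[j] + 1
            insert = curr[j - 1] + 1
            curr[j] = min(substitute, delete, insert)
        prev = curr
    return prev[m]
```

For example, `banded_edit_distance("kitten", "sitting", 3)` fills at most seven cells per row instead of eight here, a small saving that grows with sequence length when the band stays narrow.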

CONCLUSIONS

iGenomics is available open source (https://github.com/stuckinaboot/iGenomics) and for free on Apple's App Store (https://apple.co/2HCplzr).

Topics: Computational Biology; Genome, Bacterial; High-Throughput Nucleotide Sequencing; Nanopore Sequencing; Nanopores; Sequence Analysis, DNA; Smartphone

PubMed: 33284326
DOI: 10.1093/gigascience/giaa138