-
Genome Biology 2010
The fine detail provided by sequencing-based transcriptome surveys suggests that RNA-seq is likely to become the platform of choice for interrogating steady state RNA. In order to discover biologically important changes in expression, we show that normalization continues to be an essential step in the analysis. We outline a simple and effective method for performing normalization and show dramatically improved results for inferring differential expression in simulated and publicly available data sets.
Topics: Base Sequence; Computer Simulation; Gene Expression Profiling; Gene Library; Models, Statistical; RNA
PubMed: 20196867
DOI: 10.1186/gb-2010-11-3-r25
-
Molecular Microbiology Apr 2000
Comparative Study
Topics: Archaea; Bacteria; Base Sequence; Binding Sites; Conserved Sequence; DNA-Binding Proteins; Mitochondria; Repetitive Sequences, Nucleic Acid
PubMed: 10760181
DOI: 10.1046/j.1365-2958.2000.01838.x
-
Cell May 2012
Review
Mobile DNAs have had a central role in shaping our genome. More than half of our DNA is comprised of interspersed repeats resulting from replicative copy and paste events of retrotransposons. Although most are fixed, incapable of templating new copies, there are important exceptions to retrotransposon quiescence. De novo insertions cause genetic diseases and cancers, though reliably detecting these occurrences has been difficult. New technologies aimed at uncovering polymorphic insertions reveal that mobile DNAs provide a substantial and dynamic source of structural variation. Key questions going forward include how and how much new transposition events affect human health and disease.
Topics: Alu Elements; Animals; Base Sequence; Biological Evolution; DNA Transposable Elements; Genome, Human; Humans; Molecular Sequence Data
PubMed: 22579280
DOI: 10.1016/j.cell.2012.04.019
-
The EMBO Journal May 2021
Eukaryotic transcription factors recognize specific DNA sequence motifs, but are also endowed with generic, non-specific DNA-binding activity. How these binding modes are integrated to determine select transcriptional outputs remains unresolved. We addressed this question by site-directed mutagenesis of the Myc transcription factor. Impairment of non-specific DNA backbone contacts caused pervasive loss of genome interactions and gene regulation, associated with increased intra-nuclear mobility of the Myc protein in murine cells. In contrast, a mutant lacking base-specific contacts retained DNA-binding and mobility profiles comparable to those of the wild-type protein, but failed to recognize its consensus binding motif (E-box) and could not activate Myc-target genes. Incidentally, this mutant gained weak affinity for an alternative motif, driving aberrant activation of different genes. Altogether, our data show that non-specific DNA binding is required to engage onto genomic regulatory regions; sequence recognition in turn contributes to transcriptional activation, acting at distinct levels: stabilization and positioning of Myc onto DNA, and, unexpectedly, promotion of its transcriptional activity. Hence, seemingly pervasive genome interaction profiles, as detected by ChIP-seq, actually encompass diverse DNA-binding modalities, driving defined, sequence-dependent transcriptional responses.
Topics: Base Sequence; Binding Sites; DNA; Gene Expression Regulation; Protein Stability; Proto-Oncogene Proteins c-myc; Transcription Factors
PubMed: 33792944
DOI: 10.15252/embj.2020105464
-
Nucleic Acids Research May 2016
Sequence Logos and their variants are the most commonly used methods for visualization of multiple sequence alignments (MSAs) and sequence motifs. They provide consensus-based summaries of the sequences in the alignment. Consequently, individual sequences cannot be identified in the visualization and covariant sites are not easily discernible. We recently proposed Sequence Bundles, a motif visualization technique that maintains a one-to-one relationship between sequences and their graphical representation and visualizes covariant sites. We here present Alvis, an open-source platform for the joint explorative analysis of MSAs and phylogenetic trees, employing Sequence Bundles as its main visualization method. Alvis combines the power of the visualization method with an interactive toolkit allowing detection of covariant sites, annotation of trees with synapomorphies and homoplasies, and motif detection. It also offers numerical analysis functionality, such as dimension reduction and classification. Alvis is user-friendly, highly customizable and can export results in publication-quality figures. It is available as a full-featured standalone version (http://www.bitbucket.org/rfs/alvis) and its Sequence Bundles visualization module is further available as a web application (http://science-practice.com/projects/sequence-bundles).
Topics: Base Sequence; Computational Biology; Sequence Alignment; Sequence Analysis, DNA
PubMed: 26819408
DOI: 10.1093/nar/gkw022
-
Genome Biology 2004
Comparative Study Review
The availability of the rat genome sequence, and detailed three-way comparison of the rat, mouse and human genomes, is revealing a great deal about mammalian genome evolution. Together with recent developments in cloning technologies, this heralds an important phase in rat research.
Topics: Animals; Base Sequence; Genome; Humans; Rats
PubMed: 15128437
DOI: 10.1186/gb-2004-5-5-221
-
Genome Research Nov 2021
Eukaryotic genomes typically show a uniform G + C content among chromosomes, but on smaller scales, many species have a G + C density that fluctuates with a characteristic wavelength. This oscillation is evident in many insect species, with wavelengths ranging between 700 bp and 4 kb. Measures of evolutionary conservation oscillate in phase with G + C content, with conserved regions having higher G + C. Loci with large regulatory regions show more regular oscillations; coding sequences and heterochromatic regions show little or no oscillation. There is little oscillation in vertebrate genomes in regions with densely distributed mobile repetitive elements. However, species with few repeats show oscillation in both G + C density and sequence conservation. These oscillations may reflect optimal spacing of cis-regulatory elements.
Topics: Base Sequence; Biological Evolution; Conserved Sequence; Evolution, Molecular; Genome; Regulatory Sequences, Nucleic Acid; Repetitive Sequences, Nucleic Acid
PubMed: 34649930
DOI: 10.1101/gr.274332.120
-
BioEssays : News and Reviews in... Apr 2022
Review
In human languages, a palindrome reads the same forward as backward (e.g., 'madam'). In regulatory DNA, a palindrome is an inverted sequence repeat that allows a transcription factor to bind as a homodimer or as a heterodimer with another type of transcription factor. Regulatory palindromes are typically imperfect, that is, the repeated sequences differ in at least one base pair, but the functional significance of this asymmetry remains poorly understood. Here, we review the use of imperfect palindromes in Drosophila photoreceptor differentiation and mammalian steroid receptor signaling. Moreover, we discuss mechanistic explanations for the predominance of imperfect palindromes over perfect palindromes in these two gene regulatory contexts. Lastly, we propose to elucidate whether specific imperfectly palindromic variants have specific regulatory functions in steroid receptor signaling and whether such variants can help predict transcriptional outcomes as well as the response of individual patients to drug treatments.
Topics: Animals; Base Sequence; Gene Expression Regulation; Humans; Mammals; Repetitive Sequences, Nucleic Acid; Transcription Factors
PubMed: 35195290
DOI: 10.1002/bies.202100191
-
Nucleic Acids Research Jan 2022
Repeats are prevalent in the genomes of all bacteria, plants and animals, and they cover nearly half of the human genome. They play indispensable roles in evolution, inheritance, variation and genomic instability, and serve as substrates for chromosomal rearrangements that include disease-causing deletions, inversions, and translocations. Comprehensive identification, classification and annotation of repeats in genomes can provide accurate and targeted solutions towards understanding and diagnosis of complex diseases, optimization of plant properties and development of new drugs. RepBase and Dfam are the two most frequently used repeat databases, but they are not sufficiently complete. Due to the lack of a comprehensive multi-species repeat database, current research in this field is far from satisfactory. LongRepMarker is a new framework developed recently by our group for comprehensive identification of genomic repeats. We here propose msRepDB, based on LongRepMarker, which is currently the most comprehensive multi-species repeat database, covering >80 000 species. Comprehensive evaluations show that msRepDB contains more species, and more complete repeats and families, than the RepBase and Dfam databases (https://msrepdb.cbrc.kaust.edu.sa/pages/msRepDB/index.html).
Topics: Animals; Base Sequence; DNA Transposable Elements; Databases, Nucleic Acid; Genome; Humans; Internet; Plants; Repetitive Sequences, Nucleic Acid; Retroelements; Sequence Analysis, DNA; User-Computer Interface
PubMed: 34850956
DOI: 10.1093/nar/gkab1089
-
Human Mutation Oct 2011
Review
On the sequence-directed nature of human gene mutation: the role of genomic architecture and the local DNA sequence environment in mediating gene mutations underlying human inherited disease.
Different types of human gene mutation may vary in size, from structural variants (SVs) to single base-pair substitutions, but what they all have in common is that their nature, size and location are often determined either by specific characteristics of the local DNA sequence environment or by higher order features of the genomic architecture. The human genome is now recognized to contain "pervasive architectural flaws" in that certain DNA sequences are inherently mutation prone by virtue of their base composition, sequence repetitivity and/or epigenetic modification. Here, we explore how the nature, location and frequency of different types of mutation causing inherited disease are shaped in large part, and often in remarkably predictable ways, by the local DNA sequence environment. The mutability of a given gene or genomic region may also be influenced indirectly by a variety of noncanonical (non-B) secondary structures whose formation is facilitated by the underlying DNA sequence. Since these non-B DNA structures can interfere with subsequent DNA replication and repair and may serve to increase mutation frequencies in generalized fashion (i.e., both in the context of subtle mutations and SVs), they have the potential to serve as a unifying concept in studies of mutational mechanisms underlying human inherited disease.
Topics: Alu Elements; Base Sequence; DNA Copy Number Variations; DNA Repeat Expansion; Genes; Genetic Diseases, Inborn; Genetic Predisposition to Disease; Genome, Human; Genome, Mitochondrial; Humans; Microsatellite Repeats; Mutation; Polymorphism, Genetic; Recombination, Genetic; Retroelements
PubMed: 21853507
DOI: 10.1002/humu.21557