-
Proceedings of the National Academy of... Mar 2019
R-loops are abundant three-stranded nucleic-acid structures that form during transcription. Experimental evidence suggests that R-loop formation is affected by DNA sequence and topology. However, the exact manner by which these factors interact to determine R-loop susceptibility is unclear. To investigate this, we developed a statistical mechanical equilibrium model of R-loop formation in superhelical DNA. In this model, the energy involved in forming an R-loop includes four terms: junctional and base-pairing energies, and energies associated with superhelicity and with the torsional winding of the displaced DNA single strand around the RNA:DNA hybrid. This model shows that the significant energy barrier imposed by the formation of junctions can be overcome in two ways. First, base-pairing energy can favor RNA:DNA over DNA:DNA duplexes in favorable sequences. Second, R-loops, by absorbing negative superhelicity, partially or fully relax the rest of the DNA domain, thereby returning it to a lower energy state. In vitro transcription assays confirmed that R-loops cause plasmid relaxation and that negative superhelicity is required for R-loops to form, even in a favorable region. Single-molecule R-loop footprinting following in vitro transcription showed a strong agreement between theoretical predictions and experimental mapping of stable R-loop positions and further revealed the impact of DNA topology on the R-loop distribution landscape. Our results clarify the interplay between base sequence and DNA superhelicity in controlling R-loop stability. They also reveal R-loops as powerful and reversible topology sinks that cells may use to nonenzymatically relieve superhelical stress during transcription.
Topics: Base Sequence; DNA; DNA, Single-Stranded; DNA, Superhelical; Models, Genetic; Nucleic Acid Conformation; Nucleic Acid Hybridization; Plasmids; RNA; Transcription, Genetic
PubMed: 30850542
DOI: 10.1073/pnas.1819476116 -
Bio-protocol Oct 2020
Biochemical investigations into DNA-binding and DNA-cutting proteins often benefit from the specific attachment of a radioactive label to one of the two DNA termini. In many cases, it is essential to perform two versions of the same experiment: one with the 5' DNA end labeled and one with the 3' DNA end labeled. While homogeneous 5'-radiolabeling can be accomplished using a single kinase-catalyzed phosphorylation step, existing procedures for 3'-radiolabeling often result in probe heterogeneity, prohibiting precise DNA fragment identification in downstream experiments. We present here a new protocol to efficiently attach a ³²P-phosphate to the 3' end of a DNA oligonucleotide of arbitrary sequence, relying on inexpensive DNA oligonucleotide modifications (2'-O-methylribonucleotide and ribonucleotide sugar substitutions), two enzymes (T4 polynucleotide kinase and T4 RNA ligase 2), and the differential susceptibility of DNA and RNA to hydroxide treatment. Radioactive probe molecules produced by this protocol are homogeneous and oxidant-compatible, and they can be used for precise cleavage-site mapping in the context of both DNase enzyme characterization and DNA footprinting assays.
PubMed: 33659442
DOI: 10.21769/BioProtoc.3787 -
Nature Methods Mar 2016
High-throughput sequencing technologies have allowed many gene locus-level molecular biology assays to become genome-wide profiling methods. DNA-cleaving enzymes such as DNase I have been used to probe accessible chromatin. The accessible regions contain functional regulatory sites, including promoters, insulators and enhancers. Deep sequencing of DNase-seq libraries and computational analysis of the cut profiles have been used to infer protein occupancy in the genome at the nucleotide level, a method introduced as 'digital genomic footprinting'. The approach has been proposed as an attractive alternative to the analysis of transcription factors (TFs) by chromatin immunoprecipitation followed by sequencing (ChIP-seq), and in theory it should overcome antibody issues, poor resolution and batch effects. Recent reports point to limitations of the DNase-based genomic footprinting approach and call into question the scope of detectable protein occupancy, especially for TFs with short-lived chromatin binding. The genomics community is grappling with issues concerning the utility of genomic footprinting and is reassessing the proposed approaches in terms of robust deliverables. Here we summarize the consensus as well as different views emerging from recent reports, and we describe the remaining issues and hurdles for genomic footprinting.
Topics: Algorithms; Chromosome Mapping; DNA; DNA Footprinting; Genome, Human; High-Throughput Nucleotide Sequencing; Humans
PubMed: 26914206
DOI: 10.1038/nmeth.3766 -
Cell Genomics Nov 2022
Gene expression is controlled by transcription factors (TFs) that bind cognate DNA motif sequences in cis-regulatory elements (CREs). The combinations of DNA motifs acting within homeostasis and disease, however, are unclear. Gene expression, chromatin accessibility, TF footprinting, and H3K27ac-dependent DNA looping data were generated and a random-forest-based model was applied to identify 7,531 cell-type-specific cis-regulatory modules (CRMs) across 15 diploid human cell types. A co-enrichment framework within CRMs nominated 838 cell-type-specific, recurrent heterotypic DNA motif combinations (DMCs), which were functionally validated using massively parallel reporter assays. Cancer cells engaged DMCs linked to neoplasia-enabling processes operative in normal cells while also activating new DMCs only seen in the neoplastic state. This integrative approach identifies cell-type-specific cis-regulatory combinatorial DNA motifs in diverse normal and diseased human cells and represents a general framework for deciphering cis-regulatory sequence logic in gene regulation.
PubMed: 36742369
DOI: 10.1016/j.xgen.2022.100191 -
Molecular Microbiology Apr 2020
CodY is a global transcriptional regulator that controls, directly or indirectly, the expression of dozens of genes and operons in Listeria monocytogenes. We used in vitro DNA affinity purification combined with massively parallel sequencing (IDAP-Seq) to identify genome-wide L. monocytogenes chromosomal DNA regions that CodY binds in vitro. The total number of CodY-binding regions exceeded 2,000, but they varied significantly in their strengths of binding at different CodY concentrations. The 388 strongest CodY-binding regions were chosen for further analysis. A strand-specific analysis of the data allowed pinpointing CodY-binding sites at close to single-nucleotide resolution. Gel shift and DNase I footprinting assays confirmed the presence and locations of several CodY-binding sites. Surprisingly, most of the sites were located within genes' coding regions. The binding site within the beginning of the coding sequence of the prfA gene, which encodes the master regulator of virulence genes, has been previously implicated in regulation of prfA, but this site was weaker in vitro than hundreds of other sites. The L. monocytogenes CodY protein was functionally similar to Bacillus subtilis CodY when expressed in B. subtilis cells. Based on the sequences of the CodY-binding sites, a model of CodY interaction with DNA is proposed.
Topics: Bacterial Proteins; Binding Sites; DNA, Bacterial; DNA-Binding Proteins; Gene Expression Regulation, Bacterial; Listeria monocytogenes; Protein Binding; Transcription Factors; Virulence Factors
PubMed: 31944451
DOI: 10.1111/mmi.14449 -
Bioinformatics (Oxford, England) Jul 2021
MOTIVATION
High-throughput chromatin immunoprecipitation (ChIP) sequencing-based assays capture genomic regions associated with the profiled transcription factor (TF). ChIP-exo is a modified protocol, which uses lambda exonuclease to digest DNA close to the TF-DNA complex, in order to improve on the positional resolution of the TF-DNA contact. Because the digestion occurs in the 5'-3' orientation, the protocol produces directional footprints close to the complex, on both sides of the double-stranded DNA. Like all ChIP-based methods, ChIP-exo reports a mixture of different regions associated with the TF: those bound directly to the TF as well as via intermediaries. However, the distribution of footprints is likely to be indicative of the complex forming at the DNA.
RESULTS
We present ExoDiversity, which uses a model-based framework to learn a joint distribution over footprints and motifs, thus resolving the mixture of ChIP-exo footprints into diverse binding modes. It uses no prior motif or TF information and automatically learns the number of different modes from the data. We show its application on a wide range of TFs and organisms/cell-types. Because its goal is to explain the complete set of reported regions, it is able to identify co-factor TF motifs that appear in a small fraction of the dataset. Further, ExoDiversity discovers small nucleotide variations within and outside canonical motifs, which co-occur with variations in footprints, suggesting that the TF-DNA structural configuration at those regions is likely to be different. Finally, we show that detected modes have specific DNA shape features and conservation signals, giving insights into the structure and function of the putative TF-DNA complexes.
AVAILABILITY AND IMPLEMENTATION
The code for ExoDiversity is available on https://github.com/NarlikarLab/exoDIVERSITY.
SUPPLEMENTARY INFORMATION
Supplementary data are available at Bioinformatics online.
Topics: Binding Sites; Chromatin Immunoprecipitation; DNA; DNA Footprinting; Exonucleases; Protein Binding; Sequence Analysis, DNA
PubMed: 34252930
DOI: 10.1093/bioinformatics/btab274 -
International Journal of Molecular... Jan 2023
Transcription through nucleosomes by RNA polymerases (RNAP) is accompanied by formation of small intranucleosomal DNA loops (i-loops). The i-loops form more efficiently in the presence of single-strand breaks or gaps in a non-template DNA strand (NT-SSBs) and induce arrest of transcribing RNAP, thus allowing detection of NT-SSBs by the enzyme. Here we examined the role of histone tails and extranucleosomal NT-SSBs in i-loop formation and arrest of RNAP during transcription of the promoter-proximal region of nucleosomal DNA. NT-SSBs present in linker DNA induce arrest of RNAP +1 to +15 bp in the nucleosome, suggesting formation of the i-loops; the arrest is more efficient in the presence of the histone tails. Consistently, DNA footprinting reveals formation of an i-loop after stalling RNAP at the position +2 and backtracking to position +1. The data suggest that histone tails and NT-SSBs present in linker DNA strongly facilitate formation of the i-loops during transcription through the promoter-proximal region of nucleosomal DNA.
Topics: Nucleosomes; Histones; Transcription, Genetic; RNA Polymerase II; DNA Breaks, Single-Stranded; DNA-Directed RNA Polymerases; DNA; DNA, Single-Stranded
PubMed: 36768621
DOI: 10.3390/ijms24032295 -
The Journal of Biological Chemistry Nov 2018
G-quadruplexes (G4s) are four-stranded DNA structures comprising stacks of four guanines, are prevalent in genomes, and have diverse biological functions in various chromosomal structures. A conserved protein, Rap1-interacting factor 1 (Rif1) from fission yeast (Schizosaccharomyces pombe), binds to Rif1-binding sequence (Rif1BS) and regulates DNA replication timing. Rif1BS is characterized by the presence of multiple G-tracts, often on both strands, and their unusual spacing. Although previous studies have suggested generation of G4-like structures on duplex Rif1BS, its precise molecular architecture remains unknown. Using gel-shift DNA binding assays and DNA footprinting with various nuclease probes, we show here that both of the Rif1BS strands adopt specific higher-order structures upon heat denaturation. We observed that the structure generated on the G-strand is consistent with a G4 having unusually long loop segments and that the structure on the complementary C-strand does not have an intercalated motif (i-motif). Instead, we found that the formation of the C-strand structure depends on the G4 formation on the G-strand. Thus, the higher-order structure generated at Rif1BS involved both DNA strands, and in some cases, G4s may form on both of these strands. The presence of multiple G-tracts permitted the formation of alternative structures when some G-tracts were mutated or disrupted by deazaguanine replacement, indicating the robust nature of DNA higher-order structures generated at Rif1BS. Our results provide general insights into DNA structures generated at G4-forming sequences on duplex DNA.
Topics: Base Sequence; Binding Sites; DNA Footprinting; DNA Replication; DNA, Fungal; DNA, Single-Stranded; G-Quadruplexes; Molecular Sequence Data; Nucleic Acid Conformation; Schizosaccharomyces; Schizosaccharomyces pombe Proteins; Telomere-Binding Proteins
PubMed: 30217821
DOI: 10.1074/jbc.RA118.005240 -
Molecules (Basel, Switzerland) Aug 2020
Radiotherapy, the most common therapy for the treatment of solid tumors, exerts its effects by inducing DNA damage. To fully understand the extent and nature of this damage, DNA models that mimic the in vivo situation should be utilized. In a cellular context, genomic DNA constantly interacts with proteins and these interactions could influence both the primary radical processes (triggered by ionizing radiation) and secondary reactions, ultimately leading to DNA damage. However, this is seldom addressed in the literature. In this work, we propose a general approach to tackle these shortcomings. We synthesized a protein-DNA complex that more closely represents DNA in the physiological environment than an oligonucleotide solution alone, while being sufficiently simple to permit further chemical analyses. Using click chemistry, we obtained an oligonucleotide-peptide conjugate, which, if annealed with the complementary oligonucleotide strand, forms a complex that mimics the specific interactions between the GCN4 protein and DNA. The covalent bond connecting the oligonucleotide and peptide is part of a substituted triazole, which forms through the click reaction between the short peptide corresponding to the specific amino acid sequence of the GCN4 protein (a yeast transcription factor) and a DNA fragment that is recognized by the protein. DNase footprinting demonstrated that the part of the DNA fragment that specifically interacts with the peptide in the complex is protected from DNase activity. Moreover, the thermodynamic characteristics obtained using differential scanning calorimetry (DSC) are consistent with the interaction energies calculated at the level of metadynamics. Thus, we present an efficient approach to generate a well-defined DNA-peptide conjugate that mimics a real DNA-peptide complex. These complexes can be used to investigate DNA damage under conditions very similar to those present in the cell.
Topics: Amino Acid Sequence; Basic-Leucine Zipper Transcription Factors; Binding Sites; Calorimetry, Differential Scanning; Catalysis; Chromatography, High Pressure Liquid; Click Chemistry; Copper; DNA; DNA Damage; DNA, Single-Stranded; Molecular Dynamics Simulation; Nucleic Acid Conformation; Peptides; Protein Domains; Saccharomyces cerevisiae Proteins; Spectrometry, Mass, Electrospray Ionization; Transition Temperature
PubMed: 32784992
DOI: 10.3390/molecules25163630 -
Bioinformatics (Oxford, England) Jun 2017
MOTIVATION
The computational investigation of DNA binding motifs from binding sites is one of the classic tasks in bioinformatics and a prerequisite for understanding gene regulation as a whole. Due to the development of sequencing technologies and the increasing number of available genomes, approaches based on phylogenetic footprinting become increasingly attractive. Phylogenetic footprinting requires phylogenetic trees with attached substitution probabilities for quantifying the evolution of binding sites, but these trees and substitution probabilities are typically not known and cannot be estimated easily.
RESULTS
Here, we investigate the influence of phylogenetic trees with different substitution probabilities on the classification performance of phylogenetic footprinting using synthetic and real data. For synthetic data we find that the classification performance is highest when the substitution probability used for phylogenetic footprinting is similar to that used for data generation. For real data, however, we typically find that the classification performance of phylogenetic footprinting surprisingly increases with increasing substitution probabilities and is often highest for unrealistically high substitution probabilities close to one. This finding suggests that choosing realistic model assumptions might not always yield optimal predictions in general and that choosing unrealistically high substitution probabilities close to one might actually improve the classification performance of phylogenetic footprinting.
AVAILABILITY AND IMPLEMENTATION
The proposed PF is implemented in JAVA and can be downloaded from https://github.com/mgledi/PhyFoo.
SUPPLEMENTARY INFORMATION
Supplementary data are available at Bioinformatics online.
Topics: Animals; Binding Sites; Computational Biology; DNA-Binding Proteins; Gene Regulatory Networks; Humans; Phylogeny; Sequence Analysis, DNA; Sequence Analysis, Protein; Software
PubMed: 28130227
DOI: 10.1093/bioinformatics/btx033