-
Nature Jul 2020
The human and mouse genomes contain instructions that specify RNAs and proteins and govern the timing, magnitude, and cellular context of their production. To better delineate these elements, phase III of the Encyclopedia of DNA Elements (ENCODE) Project has expanded analysis of the cell and tissue repertoires of RNA transcription, chromatin structure and modification, DNA methylation, chromatin looping, and occupancy by transcription factors and RNA-binding proteins. Here we summarize these efforts, which have produced 5,992 new experimental datasets, including systematic determinations across mouse fetal development. All data are available through the ENCODE data portal (https://www.encodeproject.org), including phase II ENCODE and Roadmap Epigenomics data. We have developed a registry of 926,535 human and 339,815 mouse candidate cis-regulatory elements, covering 7.9 and 3.4% of their respective genomes, by integrating selected datatypes associated with gene regulation, and constructed a web-based server (SCREEN; http://screen.encodeproject.org) to provide flexible, user-defined access to this resource. Collectively, the ENCODE data and registry provide an expansive resource for the scientific community to build a better understanding of the organization and function of the human and mouse genomes.
Topics: Animals; Chromatin; DNA; DNA Footprinting; DNA Methylation; DNA Replication Timing; Databases, Genetic; Deoxyribonuclease I; Genome; Genome, Human; Genomics; Histones; Humans; Mice; Mice, Transgenic; Molecular Sequence Annotation; RNA-Binding Proteins; Registries; Regulatory Sequences, Nucleic Acid; Transcription, Genetic; Transposases
PubMed: 32728249
DOI: 10.1038/s41586-020-2493-4 -
Nature Aug 2021
The majority of gene transcripts generated by RNA polymerase II in mammalian genomes initiate at CpG island (CGI) promoters, yet our understanding of their regulation remains limited. This is in part due to the incomplete information that we have on transcription factors, their DNA-binding motifs and which genomic binding sites are functional in any given cell type. In addition, there are orphan motifs without known binders, such as the CGCG element, which is associated with highly expressed genes across human tissues and enriched near the transcription start site of a subset of CGI promoters. Here we combine single-molecule footprinting with interaction proteomics to identify BTG3-associated nuclear protein (BANP) as the transcription factor that binds this element in the mouse and human genome. We show that BANP is a strong CGI activator that controls essential metabolic genes in pluripotent stem and terminally differentiated neuronal cells. BANP binding is repelled by DNA methylation of its motif in vitro and in vivo, which epigenetically restricts most binding to CGIs and accounts for differential binding at aberrantly methylated CGI promoters in cancer cells. Upon binding to an unmethylated motif, BANP opens chromatin and phases nucleosomes. These findings establish BANP as a critical activator of a set of essential genes and suggest a model in which the activity of CGI promoters relies on methylation-sensitive transcription factors that are capable of chromatin opening.
Topics: Animals; Base Sequence; Cell Cycle Proteins; Cell Line, Tumor; Chromatin; Chromatin Assembly and Disassembly; CpG Islands; DNA Methylation; DNA-Binding Proteins; Gene Expression Regulation; Genes, Essential; Humans; Mice; Nuclear Proteins; Single Molecule Imaging
PubMed: 34234345
DOI: 10.1038/s41586-021-03689-8 -
Methods in Molecular Biology (Clifton,... 2022
In-gel footprinting enables the precise identification of protein binding sites on the DNA after separation of free and protein-bound DNA molecules by gel electrophoresis in native conditions and subsequent digestion by the nuclease activity of the 1,10-phenanthroline-copper ion [(OP)-Cu] within the gel matrix. Hence, the technique combines the resolving power of protein-DNA complexes in the electrophoretic mobility shift assay (EMSA) with the precision of target site identification by chemical footprinting. This approach is particularly well suited to characterize distinct molecular assemblies in a mixture of protein-DNA complexes and to identify individual binding sites within composite operators, when the concentration-dependent occupation of binding sites, with a different affinity, results in the generation of complexes with a distinct stoichiometry and migration velocity in gel electrophoresis.
Topics: Binding Sites; DNA; Electrophoretic Mobility Shift Assay; Protein Binding; Proteins
PubMed: 35922628
DOI: 10.1007/978-1-0716-2413-5_11 -
Plant Methods Jul 2021 (Review)
DNA-protein interactions are essential for several molecular and cellular mechanisms, such as transcription, transcriptional regulation and DNA modification. For many decades, scientists have tried to unravel how DNA links to proteins, forming complex and vital interactions. However, the high number of techniques developed for the study of these interactions has made the choice of the appropriate technique a difficult task. This review intends to provide a historical context and compile the methods that describe DNA-protein interactions according to the purpose of each approach, summarise the respective advantages and disadvantages and give some examples of recent uses for each technique. The final aim of this work is to help in deciding which technique to perform according to the objectives and capacities of each research team. For characterising DNA-binding proteins, filter binding assay and EMSA are easy in vitro methods that rapidly identify nucleic acid-protein binding interactions. To find DNA-binding sites, DNA footprinting is an easier, faster and reliable approach; however, techniques involving base analogues and base-site selection are more precise. Concerning binding kinetics and affinities, filter binding assay and EMSA are useful and easy methods, although SPR and spectroscopy techniques are more sensitive. Finally, for genome-wide studies, ChIP-seq is the preferred method, given the coverage and resolution of the technique. In conclusion, although some experiments are easier and faster than others, when designing a DNA-protein interaction study several concerns should be taken into account and different techniques may need to be considered, since different methods confer different precisions and accuracies.
PubMed: 34301293
DOI: 10.1186/s13007-021-00780-z -
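The review above notes that EMSA and filter binding assays can be used to estimate binding affinities. As a minimal sketch of how that might look in practice, assuming a simple 1:1 binding model with protein in large excess over DNA (fraction bound = [P] / (Kd + [P])), a dissociation constant could be estimated from quantified band fractions by a least-squares grid search. The model, the function names and the synthetic titration below are illustrative assumptions and are not taken from the review.

```python
import math


def fraction_bound(protein_nM, kd_nM):
    """Fraction of DNA bound under an assumed 1:1 model with protein in excess."""
    return protein_nM / (kd_nM + protein_nM)


def fit_kd(protein_nM, observed, kd_min=0.1, kd_max=10_000.0, steps=2000):
    """Return the Kd (nM) on a log-spaced grid that minimizes the squared error
    between the model and the observed bound fractions."""
    best_kd, best_sse = None, math.inf
    for i in range(steps + 1):
        # Log-spaced candidate between kd_min and kd_max.
        kd = kd_min * (kd_max / kd_min) ** (i / steps)
        sse = sum(
            (fraction_bound(p, kd) - f) ** 2
            for p, f in zip(protein_nM, observed)
        )
        if sse < best_sse:
            best_kd, best_sse = kd, sse
    return best_kd


if __name__ == "__main__":
    # Synthetic, noise-free titration generated with a hypothetical Kd of 25 nM.
    conc = [1, 3, 10, 30, 100, 300]
    obs = [fraction_bound(c, 25.0) for c in conc]
    print(f"estimated Kd: {fit_kd(conc, obs):.1f} nM")
```

A grid search is used here only to keep the sketch free of dependencies; with real gel data a nonlinear least-squares routine and replicate measurements would normally be preferred, and cooperative or multi-site binding would need a different model.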
ELife Dec 2020
Our understanding of the beads-on-a-string arrangement of nucleosomes has been built largely on high-resolution sequence-agnostic imaging methods and sequence-resolved bulk biochemical techniques. To bridge the divide between these approaches, we present the single-molecule adenine methylated oligonucleosome sequencing assay (SAMOSA). SAMOSA is a high-throughput single-molecule sequencing method that combines adenine methyltransferase footprinting and single-molecule real-time DNA sequencing to natively and nondestructively measure nucleosome positions on individual chromatin fibres. SAMOSA data allows unbiased classification of single-molecular 'states' of nucleosome occupancy on individual chromatin fibres. We leverage this to estimate nucleosome regularity and spacing on single chromatin fibres genome-wide, at predicted transcription factor binding motifs, and across human epigenomic domains. Our analyses suggest that chromatin is comprised of both regular and irregular single-molecular oligonucleosome patterns that differ subtly in their relative abundance across epigenomic domains. This irregularity is particularly striking in constitutive heterochromatin, which has typically been viewed as a conformationally static entity. Our proof-of-concept study provides a powerful new methodology for studying nucleosome organization at a previously intractable resolution and offers up new avenues for modeling and visualizing higher order chromatin structure.
Topics: Acetylation; Binding Sites; Chromatin; DNA; Epigenesis, Genetic; High-Throughput Nucleotide Sequencing; Histones; Humans; K562 Cells; Nucleic Acid Conformation; Nucleosomes; Proof of Concept Study; Protein Conformation; Protein Processing, Post-Translational; Single Molecule Imaging; Site-Specific DNA-Methyltransferase (Adenine-Specific); Transcription Factors
PubMed: 33263279
DOI: 10.7554/eLife.59404 -
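The SAMOSA abstract above describes estimating nucleosome regularity and spacing on single chromatin fibres. One generic way to illustrate that idea, assuming each molecule has already been reduced to a per-base binary track (1 = accessible/methylated, 0 = protected), is to take the autocorrelation of the track and report the lag with the strongest signal. This is a sketch of a standard signal-processing approach, not the authors' pipeline; the function names, lag window and synthetic fibre are assumptions.

```python
def autocorrelation(signal, max_lag):
    """Autocorrelation of a numeric track for lags 0..max_lag,
    scaled by the lag-0 sum of squares (so lag 0 equals 1.0)."""
    n = len(signal)
    if n <= max_lag:
        raise ValueError("signal must be longer than max_lag")
    mean = sum(signal) / n
    centered = [x - mean for x in signal]
    var = sum(c * c for c in centered)
    if var == 0:
        return [0.0] * (max_lag + 1)
    return [
        sum(centered[i] * centered[i + lag] for i in range(n - lag)) / var
        for lag in range(max_lag + 1)
    ]


def estimate_repeat_length(signal, min_lag=100, max_lag=300):
    """Lag (in bp) within [min_lag, max_lag] with the highest autocorrelation."""
    ac = autocorrelation(signal, max_lag)
    return max(range(min_lag, max_lag + 1), key=lambda lag: ac[lag])


if __name__ == "__main__":
    # Synthetic fibre: 147 bp protected + 43 bp accessible linker, ten repeats.
    fibre = ([0] * 147 + [1] * 43) * 10
    print(estimate_repeat_length(fibre))
```

The lag window of 100 to 300 bp is chosen only so that the first repeat, rather than a multiple of it, is picked up on the synthetic example; irregular fibres would give a weak or ambiguous peak.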
BioRxiv : the Preprint Server For... Mar 2023
Cis-regulatory elements control gene expression and are dynamic in their structure, reflecting changes to the composition of diverse effector proteins over time. Here we sought to connect the structural changes at regulatory elements to alterations in cellular fate and function. To do this we developed PRINT, a computational method that uses deep learning to correct sequence bias in chromatin accessibility data and identifies multi-scale footprints of DNA-protein interactions. We find that multi-scale footprints enable more accurate inference of TF and nucleosome binding. Using PRINT with single-cell multi-omics, we discover widespread changes to the structure and function of candidate cis-regulatory elements (cCREs) across hematopoiesis, wherein nucleosomes slide, expose DNA for TF binding, and promote gene expression. Activity segmentation using the co-variance across cell states identifies "sub-cCREs" as modular cCRE subunits of regulatory DNA. We apply this single-cell and PRINT approach to characterize the age-associated alterations to cCREs within hematopoietic stem cells (HSCs). Remarkably, we find a spectrum of aging alterations among HSCs corresponding to a global gain of sub-cCRE activity while preserving cCRE accessibility. Collectively, we reveal the functional importance of cCRE structure across cell states, highlighting changes to gene regulation at single-cell and single-base-pair resolution.
PubMed: 37034577
DOI: 10.1101/2023.03.28.533945 -
Bio-protocol Dec 2020
DNA footprinting is a classic technique to investigate protein-DNA interactions. However, traditional footprinting protocols can be unsuccessful or difficult to interpret if the binding of the protein to the DNA is weak, the protein has a fast off-rate, or if several different protein-DNA complexes are formed. Our protocol differs from traditional footprinting protocols, because it provides a method to isolate the protein-DNA complex from a native gel after treatment with the footprinting agent, thus removing the bound DNA from the free DNA or other protein-DNA complexes. The DNA is then extracted from the isolated complex before electrophoresis on a sequencing gel to determine the footprinting pattern. This analysis provides a possible solution for those who have been unable to use traditional footprinting methods to determine protein-DNA contacts.
PubMed: 33659492
DOI: 10.21769/BioProtoc.3843 -
Methods in Molecular Biology (Clifton,... 2021
This chapter provides an overview of different methods for the characterization of RNAs in Trichoderma reesei. In the first section, protocols for the extraction of total RNA from fungal mycelia and the identification of 5' and 3' ends of certain RNAs of interest via rapid amplification of cDNA ends (RACE) are presented. In the next section, this knowledge of the transcriptional start and end points is used for in vitro synthesis and fluorescence labeling of the RNA of interest. The in vitro synthesized RNA can then be applied for in vitro analyses such as RNA electrophoretic mobility shift assays (RNA-EMSA) and RNA in vitro footprinting. RNA-EMSA is a method suitable for the identification and characterization of RNA-protein interactions or interactions of an RNA with other nucleic acids. RNA in vitro footprinting allows exact mapping of protein-binding sites on RNA molecules and also the determination of RNA secondary and tertiary structures at single-nucleotide resolution. All protocols presented in this chapter are optimized for the analysis of noncoding RNAs (ncRNAs), especially long ncRNAs (lncRNAs) or other specific RNA species of more than 200 nt in length.
Topics: Chemical Precipitation; DNA, Complementary; Data Analysis; Electrophoresis, Capillary; Electrophoresis, Polyacrylamide Gel; Electrophoretic Mobility Shift Assay; Hypocreales; RNA, Fungal; RNA-Binding Proteins; Reverse Transcription; Ribonucleases
PubMed: 33165790
DOI: 10.1007/978-1-0716-1048-0_16 -
Chemistry (Weinheim An Der Bergstrasse,... Oct 2021
Fluoro-substitution on the ribose moiety (e. g., 2'-F-deoxyribonucleotide) represents a popular way to modulate the ribose conformation and, hence, the structure and function of nucleic acids. In the present study, we synthesized 4'-F-deoxythymidine ( T) and introduced it to oligodeoxyribonucleotides (ODNs). Though scission of the glycosylic bond of T followed by strand cleavage occurred to some extent under alkaline conditions, the T-modified ODNs were rather stable in neutral buffers. NMR studies showed that like 2'-F-deoxyribonucleoside, T exists predominantly in the North conformation not only in the nucleoside form but also in the context of ODN strands. Circular dichroism spectroscopy, thermal denaturing and RNase H1 footprinting studies of T-modified ODN/cDNA and ODN/cRNA duplexes indicated that the North conformation tendency of T is maintained in the duplexes, leading to a local structural perturbation. Collectively, 4'-F-deoxyribonucleotide structurally resembles the 2'-F-deoxyribonucleotide but imparts less structural perturbation to the duplex than the latter.
Topics: Circular Dichroism; Molecular Conformation; Nucleic Acid Conformation; Nucleosides; Oligodeoxyribonucleotides
PubMed: 34432342
DOI: 10.1002/chem.202102561 -
Methods in Molecular Biology (Clifton,... 2021
The in vivo footprinting method identifies protein-targeted DNA regions under different conditions such as carbon sources. Dimethyl sulfate (DMS) generates methylated purine bases at DNA sites which are not bound by proteins or transcription factors. The DNA is cleaved by HCl, and the resulting DNA fragments are 5'-end [6-FAM]-labeled by a linker-mediated PCR (LM-PCR). Fluorescent fragments are separated and analyzed on a capillary sequencer, followed by automated data analysis using the software tool ivFAST.
Topics: Base Sequence; DNA Footprinting; DNA, Fungal; Electrophoresis, Capillary; Hypocreales; Methylation; Polymerase Chain Reaction; Promoter Regions, Genetic
PubMed: 33165789
DOI: 10.1007/978-1-0716-1048-0_15
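The in vivo footprinting abstract above describes comparing capillary-separated fragment signals across conditions such as carbon sources. As a rough sketch of that kind of comparison, assuming each condition has been reduced to a mapping from promoter position to peak intensity, the peaks can be normalized to the total signal and positions flagged where the log2 ratio between conditions exceeds a threshold. This is not the ivFAST algorithm; the function names, threshold, pseudocount and example values are hypothetical.

```python
import math


def normalize(peaks):
    """Scale peak intensities so that they sum to 1 within one condition."""
    total = sum(peaks.values())
    if total <= 0:
        raise ValueError("total peak intensity must be positive")
    return {pos: value / total for pos, value in peaks.items()}


def compare_footprints(cond_a, cond_b, threshold=1.0, pseudo=1e-6):
    """Flag positions whose normalized signal differs between two conditions.

    A lower signal in condition A is read as stronger protection in A,
    a higher signal as hypersensitivity in A (assumed interpretation).
    """
    a, b = normalize(cond_a), normalize(cond_b)
    calls = {}
    for pos in sorted(set(a) & set(b)):
        log_ratio = math.log2((a[pos] + pseudo) / (b[pos] + pseudo))
        if log_ratio <= -threshold:
            calls[pos] = "protected_in_a"
        elif log_ratio >= threshold:
            calls[pos] = "hypersensitive_in_a"
    return calls


if __name__ == "__main__":
    # Hypothetical peak intensities keyed by position relative to the start codon.
    inducing = {-120: 100, -119: 20, -118: 100, -117: 100}
    repressing = {-120: 100, -119: 100, -118: 100, -117: 100}
    print(compare_footprints(inducing, repressing))
```

Normalizing to the total signal is a simplification: a large footprint lowers the total and slightly inflates the other positions, so a real analysis would more likely normalize against a set of reference peaks.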