- The Journal of Biological Chemistry Sep 2015
ETS1 is the archetype of the ETS transcription factor (TF) family. ETS TFs share a DNA-binding domain, the ETS domain. All ETS TFs recognize a core GGA(A/T) binding site, and thus ETS TFs are found to redundantly regulate the same genes. However, each ETS TF has unique targets as well. One prevailing hypothesis explaining this duality is that protein-protein interactions, including homodimerization, allow each ETS TF to display distinct behavior. The behavior of ETS1 is further regulated by autoinhibition. Autoinhibition apparently modulates ETS1 DNA binding affinity, but the mechanism of this inhibition is not completely understood. We sought to characterize the relationship between DNA binding and ETS1 homodimer formation. We find that ETS1 interrogates DNA and forms dimers even when the DNA does not contain an ETS recognition sequence. Mutational studies also link nonspecific DNA backbone contacts with dimer formation, in addition to providing a new role for the recognition helix of ETS1 in maintaining the autoinhibited state. Finally, in showing that residues in the DNA recognition helix affect autoinhibition, we define a new function of ETS1 autoinhibition: maintenance of a monomeric state in the absence of DNA. The conservation of relevant amino acid residues across all ETS TFs indicates that the mechanisms of nonspecific DNA interrogation and protein oligomer formation elucidated here may be common to all ETS proteins that autoinhibit.
Topics: Binding Sites; Binding, Competitive; Circular Dichroism; DNA; DNA Footprinting; Deoxyribonuclease I; Humans; Mutant Proteins; Mutation; Oligonucleotides; Protein Binding; Protein Multimerization; Proto-Oncogene Protein c-ets-1; Transcription Factors
PubMed: 26195629
DOI: 10.1074/jbc.M115.671339
- Nature Jul 2020
Combinatorial binding of transcription factors to regulatory DNA underpins gene regulation in all organisms. Genetic variation in regulatory regions has been connected with diseases and diverse phenotypic traits, but it remains challenging to distinguish variants that affect regulatory function. Genomic DNase I footprinting enables the quantitative, nucleotide-resolution delineation of sites of transcription factor occupancy within native chromatin. However, only a small fraction of such sites have been precisely resolved on the human genome sequence. Here, to enable comprehensive mapping of transcription factor footprints, we produced high-density DNase I cleavage maps from 243 human cell and tissue types and states and integrated these data to delineate about 4.5 million compact genomic elements that encode transcription factor occupancy at nucleotide resolution. We map the fine-scale structure within about 1.6 million DNase I-hypersensitive sites and show that the overwhelming majority are populated by well-spaced sites of single transcription factor-DNA interaction. Cell-context-dependent cis-regulation is chiefly executed by wholesale modulation of accessibility at regulatory DNA rather than by differential transcription factor occupancy within accessible elements. We also show that the enrichment of genetic variants associated with diseases or phenotypic traits in regulatory regions is almost entirely attributable to variants within footprints, and that functional variants that affect transcription factor occupancy are nearly evenly partitioned between loss- and gain-of-function alleles. Unexpectedly, we find increased density of human genetic variation within transcription factor footprints, revealing an unappreciated driver of cis-regulatory evolution. Our results provide a framework for both global and nucleotide-precision analyses of gene regulatory mechanisms and functional genetic variation.
Topics: Consensus Sequence; DNA; DNA Footprinting; Deoxyribonuclease I; Genetics, Population; Genome, Human; Genome-Wide Association Study; Humans; Models, Molecular; Polymorphism, Single Nucleotide; Regulatory Sequences, Nucleic Acid; Transcription Factors
PubMed: 32728250
DOI: 10.1038/s41586-020-2528-x
- Nucleic Acids Research Apr 1997
To clarify the role of the purine 2-amino group in the recognition of DNA by small molecules, we have examined the binding of actinomycin D and echinomycin to artificial DNA molecules asymmetrically substituted with inosine and/or 2,6-diaminopurine (DAP) in one of the complementary strands. These DNAs, prepared by a method based upon PCR, present various potential sites for antibiotic binding, including several containing only a single purine 2-amino group in different configurations. The results show unambiguously that the presence of two 2-amino groups is mandatory for binding of actinomycin D to double-stranded DNA. In the case of echinomycin only one purine 2-amino group is required for remarkably strong binding to the asymmetric TpDAP.TpA dinucleotide step, but the CpDAP.TpI step (which also contains only a single purine 2-amino group) does not afford a binding site. Evidently, removing a 2-amino group (G→I substitution) is dominant over adding one (A→DAP substitution). No sequences containing just a single guanine residue are acceptable. The possibility is raised that replacing guanosine with inosine may do more than remove a group endowed with hydrogen bonding capability and interfere with ligand binding in other ways. The new methodology developed to construct asymmetrically substituted DNA substrates for this work provides a novel strategy that should be generally applicable for studying ligand-DNA interactions, beyond the specific interest in drug binding to DNA, and may help to elucidate how proteins and oligonucleotides recognize their target sites.
Topics: 2-Aminopurine; Base Composition; Binding Sites; Cytosine; DNA; DNA Footprinting; DNA Primers; Dactinomycin; Echinomycin; Hydrogen Bonding; Hypoxanthine; Inosine; Molecular Sequence Data; Nucleic Acid Heteroduplexes; Oligodeoxyribonucleotides; Polymerase Chain Reaction; Thymine
PubMed: 9092655
DOI: 10.1093/nar/25.8.1502
- Genome Biology Feb 2019 (Comparative Study)
BACKGROUND
DNase-seq and ATAC-seq are broadly used methods to assay open chromatin regions genome-wide. The single nucleotide resolution of DNase-seq has been further exploited to infer transcription factor binding sites (TFBSs) in regulatory regions through footprinting. Recent studies have demonstrated the sequence bias of DNase I and its adverse effects on footprinting efficiency. However, footprinting and the impact of sequence bias have not been extensively studied for ATAC-seq.
RESULTS
Here, we undertake a systematic comparison of the two methods and show that a modification to the ATAC-seq protocol increases its yield and its agreement with DNase-seq data from the same cell line. We demonstrate that the two methods have distinct sequence biases and correct for these protocol-specific biases when performing footprinting. Despite the differences in footprint shapes, the locations of the inferred footprints in ATAC-seq and DNase-seq are largely concordant. However, the protocol-specific sequence biases in conjunction with the sequence content of TFBSs impact the discrimination of footprint from the background, which leads to one method outperforming the other for some TFs. Finally, we address the depth required for reproducible identification of open chromatin regions and TF footprints.
CONCLUSIONS
We demonstrate that the impact of bias correction on footprinting performance is greater for DNase-seq than for ATAC-seq and that DNase-seq footprinting leads to better performance. It is possible to infer concordant footprints by using replicates, highlighting the importance of reproducibility assessment. The results presented here provide an overview of the advantages and limitations of footprinting analyses using ATAC-seq and DNase-seq.
Topics: DNA Footprinting; Gene Library; Genomics; HEK293 Cells; Humans; K562 Cells; Sequence Analysis, DNA; Transcription Factors
PubMed: 30791920
DOI: 10.1186/s13059-019-1654-y
- Epigenetics & Chromatin Jun 2019
BACKGROUND
As the cost of high-throughput sequencing technologies decreases, genome-wide chromatin accessibility profiling methods such as the assay of transposase-accessible chromatin using sequencing (ATAC-seq) are employed widely, with data accumulating at an unprecedented rate. However, accurate inference of protein occupancy requires higher-resolution footprinting analysis where major hurdles exist, including the sequence bias of nucleases and the short-lived chromatin binding of many transcription factors (TFs) with consequent lack of footprints.
RESULTS
Here we introduce an assay termed cross-link (XL)-DNase-seq, designed to capture chromatin interactions of dynamic TFs. Mild cross-linking improved the detection of DNase-based footprints of dynamic TFs but interfered with ATAC-based footprinting of the same TFs.
CONCLUSIONS
XL-DNase-seq may help extract novel gene regulatory circuits involving previously undetectable TFs. The DNase-seq and ATAC-seq data generated in our systematic comparison of various cross-linking conditions also represent an unprecedented-scale resource derived from activated mouse macrophage-like cells which share many features of inflammatory macrophages.
Topics: Animals; Chromatin; Chromatin Immunoprecipitation; Chromatin Immunoprecipitation Sequencing; DNA Footprinting; Deoxyribonuclease I; Deoxyribonucleases; Genomics; High-Throughput Nucleotide Sequencing; Humans; Mice; Sequence Analysis, DNA; Transcription Factors
PubMed: 31164146
DOI: 10.1186/s13072-019-0277-6
- FEBS Letters May 1998 (Review)
Up to 1% of the human genome is represented by human endogenous retroviruses (HERVs) and their fragments, which are likely footprints of ancient primate germ-cell infections by retroviruses that occurred 10-60 million years ago. HERV solitary long terminal repeats (LTRs) are often found in close vicinity to functional genes. The LTRs comprise a set of regulatory sequences such as promoters, enhancers, hormone-responsive elements and polyadenylation signals that might serve as new regulatory signals for resident genes and thus change their regulation in evolution. Moreover, the LTRs have a potential for chromatin remodeling that can also modulate gene expression. This review describes the integration specificity and distribution of the HERVs and LTRs in the human genome and discusses possible functional consequences of their integration in the vicinity of genes.
Topics: Animals; DNA Footprinting; Evolution, Molecular; Gene Expression Regulation; Genome, Human; Humans; Repetitive Sequences, Nucleic Acid; Retroviridae; Retroviridae Infections; Virus Integration
PubMed: 9645463
DOI: 10.1016/s0014-5793(98)00478-5
- The Journal of Biological Chemistry May 1999
The 298-amino acid ATP-dependent DNA ligase of Chlorella virus PBCV-1 is the smallest eukaryotic DNA ligase known. The enzyme has intrinsic specificity for binding to nicked duplex DNA. To delineate the ligase-DNA interface, we have footprinted the enzyme binding site on DNA and the DNA binding site on ligase. The size of the exonuclease III footprint of ligase bound at a single nick in duplex DNA is 19-21 nucleotides. The footprint is asymmetric, extending 8-9 nucleotides on the 3'-OH side of the nick and 11-12 nucleotides on the 5'-phosphate side. The 5'-phosphate moiety is essential for the binding of Chlorella virus ligase to nicked DNA. Here we show that the 3'-OH moiety is not required for nick recognition. The Chlorella virus ligase binds to a nicked ligand containing 2',3'-dideoxy and 5'-phosphate termini, but cannot catalyze adenylation of the 5'-end. Hence, the 3'-OH is important for step 2 chemistry even though it is not itself chemically transformed during DNA-adenylate formation. A 2'-OH cannot substitute for the essential 3'-OH in adenylation at a nick or even in strand closure at a preadenylated nick. The protein side of the ligase-DNA interface was probed by limited proteolysis of ligase with trypsin and chymotrypsin in the presence and absence of nicked DNA. Protease-accessible sites are clustered within a short segment from amino acids 210-225 located distal to conserved motif V. The ligase is protected from proteolysis by nicked DNA. Protease cleavage of the native enzyme prior to DNA addition results in loss of DNA binding. These results suggest a bipartite domain structure in which the interdomain segment either comprises part of the DNA binding site or undergoes a conformational change upon DNA binding. The domain structure of Chlorella virus ligase inferred from the solution experiments is consistent with the structure of T7 DNA ligase determined by x-ray crystallography.
Topics: Amino Acid Sequence; Base Sequence; Binding Sites; Catalysis; Chymotrypsin; DNA; DNA Footprinting; DNA Ligases; Exodeoxyribonucleases; Molecular Sequence Data; Nucleic Acid Conformation; Protein Conformation; Surface Properties; Trypsin; Viral Proteins
PubMed: 10318816
DOI: 10.1074/jbc.274.20.14032
- Cell Reports Oct 2017
Protein-DNA interactions provide the basis for chromatin structure and gene regulation. Comprehensive identification of protein-occupied sites is thus vital to an in-depth understanding of genome function. Dimethyl sulfate (DMS) is a chemical probe that has long been used to detect footprints of DNA-bound proteins in vitro and in vivo. Here, we describe a genomic footprinting method, dimethyl sulfate sequencing (DMS-seq), which exploits the cell-permeable nature of DMS to obviate the need for nuclear isolation. This feature makes DMS-seq simple in practice and removes the potential risk of protein re-localization during nuclear isolation. DMS-seq successfully detects transcription factors bound to cis-regulatory elements and non-canonical chromatin particles in nucleosome-free regions. Furthermore, an unexpected preference of DMS confers on DMS-seq a unique potential to directly detect nucleosome centers without using genetic manipulation. We expect that DMS-seq will serve as a characteristic method for genome-wide interrogation of in vivo protein-DNA interactions.
Topics: Cell Line; Chromosome Mapping; DNA; DNA Footprinting; DNA-Binding Proteins; Gene Expression Regulation; Gene Library; Genetic Loci; Genome, Human; Hepatocytes; High-Throughput Nucleotide Sequencing; Histones; Humans; Nucleosomes; Regulatory Sequences, Nucleic Acid; Saccharomyces cerevisiae; Sequence Analysis, DNA; Sulfuric Acid Esters
PubMed: 28978481
DOI: 10.1016/j.celrep.2017.09.035
- Developmental Biology May 1999 (Review)
A long-standing problem in developmental biology has been to understand how the embryonic germ layers gain the competence to differentiate into distinct cell types. Genetic studies have shown that members of the GATA and HNF3/fork head transcription factor families are essential for the formation and differentiation of gut endoderm tissues in worms, flies, and mammals. Recent in vivo footprinting studies have shown that GATA and HNF3 binding sites in chromatin are occupied on a silent gene in endoderm that has the potential to be activated solely in that germ layer. These and other data indicate that these evolutionarily conserved factors help impart the competence of a gene to be activated in development, a phenomenon called genetic potentiation. The mechanistic implications of genetic potentiation and its general significance are discussed.
Topics: Animals; DNA Footprinting; DNA-Binding Proteins; Endoderm; Evolution, Molecular; Forkhead Transcription Factors; Gene Expression Regulation, Developmental; Intestines; Mice; Models, Biological; Morphogenesis; Trans-Activators; Transcription Factors
PubMed: 10208738
DOI: 10.1006/dbio.1999.9228
- Biochemistry Aug 2012
Crystal structures of the GCN4 bZIP (basic region/leucine zipper) with the AP-1 or CRE site show how each GCN4 basic region binds to a 4 bp cognate half-site as a single DNA target; however, this may not always fully describe how bZIP proteins interact with their target sites. Previously, we showed that the GCN4 basic region interacts with all 5 bp in half-site TTGCG (termed 5H-LR) and that 5H-LR comprises two 4 bp subsites, TTGC and TGCG, which individually are also target sites of the basic region. In this work, we explore how the basic region interacts with 5H-LR when the bZIP dimer localizes to full-sites. Using AMBER molecular modeling, we simulated GCN4 bZIP complexes with full-sites containing 5H-LR to investigate in silico the interface between the basic region and 5H-LR. We also performed in vitro investigation of bZIP-DNA interactions at a number of full-sites that contain 5H-LR versus either subsite: we analyzed results from DNase I footprinting and electrophoretic mobility shift assay (EMSA) and from EMSA titrations to quantify binding affinities. Our computational and experimental results together support a highly dynamic DNA-binding model: when a bZIP dimer localizes to its target full-site, the basic region can alternately recognize either subsite as a distinct target at 5H-LR and translocate between the subsites, potentially by sliding and hopping. This model provides added insights into how α-helical DNA-binding domains of transcription factors can localize to their gene regulatory sequences in vivo.
Topics: Base Sequence; Basic-Leucine Zipper Transcription Factors; Binding Sites; CCAAT-Enhancer-Binding Proteins; DNA Footprinting; DNA-Binding Proteins; Hydrogen Bonding; Saccharomyces cerevisiae Proteins
PubMed: 22856882
DOI: 10.1021/bi300718f