-
ELife Jun 2019
The existence of discrete phenotypic traits suggests that the complex regulatory processes which produce them are functionally modular. These processes are usually represented by networks. Only modular networks can be partitioned into intelligible subcircuits able to evolve relatively independently. Traditionally, functional modularity is approximated by detection of modularity in network structure. However, the correlation between structure and function is loose. Many regulatory networks exhibit modular behaviour without structural modularity. Here we partition an experimentally tractable regulatory network, the gap gene system of dipteran insects, using an alternative approach. We show that this system, although not structurally modular, is composed of dynamical modules driving different aspects of whole-network behaviour. All these subcircuits share the same regulatory structure, but differ in components and sensitivity to regulatory interactions. Some subcircuits are in a state of criticality, while others are not, which explains the observed differential evolvability of the various expression features in the system.
Topics: Animals; Gene Expression Regulation, Developmental; Gene Regulatory Networks; Genes, Developmental; Insecta
PubMed: 31169494
DOI: 10.7554/eLife.42832 -
BMC Evolutionary Biology Sep 2005
BACKGROUND
Translation initiation in eukaryotes involves the recruitment of mRNA to the ribosome, which is controlled by the translation factor eIF4E. eIF4E binds to the 5'-m7Gppp cap-structure of mRNA. Three-dimensional structures of eIF4Es bound to cap-analogues resemble 'cupped-hands' in which the cap-structure is sandwiched between two conserved Trp residues (Trp-56 and Trp-102 of H. sapiens eIF4E). A third conserved Trp residue (Trp-166 of H. sapiens eIF4E) recognizes the 7-methyl moiety of the cap-structure. Assessment of GenBank NR and dbEST databases reveals that many organisms encode a number of proteins with homology to eIF4E. Little is understood about the relationships of these structurally related proteins to each other.
RESULTS
By combining sequence data deposited in the GenBank databases, we have identified sequences encoding 411 eIF4E-family members from 230 species. These sequences have been deposited into an internet-accessible database designed for sequence comparisons of eIF4E-family members. Most members can be grouped into one of three classes. Class I members carry Trp residues equivalent to Trp-43 and Trp-56 of H. sapiens eIF4E and appear to be present in all eukaryotes. Class II members possess Trp→Tyr/Phe/Leu and Trp→Tyr/Phe substitutions relative to Trp-43 and Trp-56 of H. sapiens eIF4E, respectively, and can be identified in Metazoa, Viridiplantae, and Fungi. Class III members possess a Trp residue equivalent to Trp-43 of H. sapiens eIF4E but carry a Trp→Cys/Tyr substitution relative to Trp-56 of H. sapiens eIF4E, and can be identified in Coelomata and Cnidaria. Some eIF4E-family members from Protista show extension or compaction relative to prototypical eIF4E-family members.
CONCLUSION
The expansion of sequenced cDNAs and genomic DNAs from all eukaryotic kingdoms has revealed a variety of proteins related in structure to eIF4E. Evolutionarily it seems that a single early eIF4E gene has undergone multiple gene duplications generating multiple structural classes, such that it is no longer possible to predict function from the primary amino acid sequence of an eIF4E-family member. The variety of eIF4E-family members provides a source of alternatives on the eIF4E structural theme that will benefit structure/function analyses and therapeutic drug design.
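The three-class grouping given in the results of this abstract can be written as a small decision rule. The sketch below is only an illustration of that rule, not the authors' method: the function name `classify_eif4e` and the use of one-letter residue codes are assumptions, and it presumes the residues aligned to Trp-43 and Trp-56 of H. sapiens eIF4E have already been extracted from an alignment.

```python
# Illustrative rule-based classifier (hypothetical helper, not from the paper).
# Inputs are the one-letter residues aligned to Trp-43 and Trp-56 of
# H. sapiens eIF4E, as described in the abstract.
def classify_eif4e(res43: str, res56: str) -> str:
    """Return 'I', 'II', 'III', or 'unclassified'."""
    res43, res56 = res43.upper(), res56.upper()
    # Class I: both positions carry Trp.
    if res43 == "W" and res56 == "W":
        return "I"
    # Class II: Trp-43 -> Tyr/Phe/Leu and Trp-56 -> Tyr/Phe.
    if res43 in {"Y", "F", "L"} and res56 in {"Y", "F"}:
        return "II"
    # Class III: Trp retained at 43, Trp-56 -> Cys/Tyr.
    if res43 == "W" and res56 in {"C", "Y"}:
        return "III"
    # "Most members" fit a class; anything else is left unassigned.
    return "unclassified"


if __name__ == "__main__":
    for pair in [("W", "W"), ("Y", "F"), ("W", "C"), ("A", "A")]:
        print(pair, classify_eif4e(*pair))
```

The fall-through branch reflects the abstract's statement that most, but not all, members fit one of the three classes.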
Topics: Amino Acid Sequence; Animals; Conserved Sequence; Cysteine; DNA; DNA, Complementary; Drug Design; Eukaryotic Initiation Factor-4E; Evolution, Molecular; Genes, MHC Class II; Humans; Leucine; Molecular Sequence Data; Multigene Family; Phylogeny; Protein Biosynthesis; Protein Structure, Tertiary; RNA Caps; Sequence Analysis, DNA; Sequence Homology, Amino Acid; Structure-Activity Relationship; Tryptophan
PubMed: 16191198
DOI: 10.1186/1471-2148-5-48 -
Journal of Virology Nov 2014
Ebola virus (EBOV) belongs to the group of nonsegmented negative-sense RNA viruses. The seven EBOV genes are separated by variable gene borders, including short (4- or 5-nucleotide) intergenic regions (IRs), a single long (144-nucleotide) IR, and gene overlaps, where the neighboring gene end and start signals share five conserved nucleotides. The unique structure of the gene overlaps and the presence of a single long IR are conserved among all filoviruses. Here, we sought to determine the impact of the EBOV gene borders during viral transcription. We show that readthrough mRNA synthesis occurs in EBOV-infected cells irrespective of the structure of the gene border, indicating that the gene overlaps do not promote recognition of the gene end signal. However, two consecutive gene end signals at the VP24 gene might improve termination at the VP24-L gene border, ensuring efficient L gene expression. We further demonstrate that the long IR is not essential for but regulates transcription reinitiation in a length-dependent but sequence-independent manner. Mutational analysis of bicistronic minigenomes and recombinant EBOVs showed no direct correlation between IR length and reinitiation rates but demonstrated that specific IR lengths not found naturally in filoviruses profoundly inhibit downstream gene expression. Intriguingly, although truncation of the 144-nucleotide-long IR to 5 nucleotides did not substantially affect EBOV transcription, it led to a significant reduction of viral growth.
IMPORTANCE
Our current understanding of EBOV transcription regulation is limited due to the requirement for high-containment conditions to study this highly pathogenic virus. EBOV is thought to share many mechanistic features with well-analyzed prototype nonsegmented negative-sense RNA viruses. A single polymerase entry site at the 3' end of the genome determines that transcription of the genes is mainly controlled by gene order and cis-acting signals found at the gene borders. Here, we examined the regulatory role of the structurally unique EBOV gene borders during viral transcription. Our data suggest that transcriptional regulation in EBOV is highly complex and differs from that in prototype viruses and further the understanding of this most fundamental process in the filovirus replication cycle. Moreover, our results with recombinant EBOVs suggest a novel role of the long IR found in all filovirus genomes during the viral replication cycle.
Topics: Animals; Cell Line; DNA, Intergenic; Ebolavirus; Gene Expression Regulation, Viral; Genes, Overlapping; Genes, Viral; Humans; Transcription Termination, Genetic; Transcription, Genetic
PubMed: 25142600
DOI: 10.1128/JVI.01863-14 -
European Journal of Biochemistry Mar 1989
A yeast cDNA genetic library in a bacteriophage expression vector was screened using an antiserum reacting with fructose 1,6-bisphosphate aldolase from Saccharomyces cerevisiae. Radio-labelled probes of selected immunopositive clones were used for screening of a yeast genomic library. From the genomic clones, a yeast/Escherichia coli shuttle plasmid was constructed that contains, on a 1990-base-pair fragment, the entire structural gene FBA1 coding for yeast aldolase. The primary structure of the FBA1 gene was determined. An open reading frame comprises 1077 base pairs coding for a protein of 359 amino acids with a predicted molecular mass of 39,608 Da. As observed for other strongly expressed yeast genes, codon usage is extremely biased. The 810 base pairs at the 5' end and the 90 base pairs at the 3' end of the coding region of the cloned FBA1 gene are sufficient for normal expression and show characteristic elements present in the noncoding sequences of other yeast genes. Aldolase is the major protein in yeast cells transformed with a high-copy-number plasmid containing the FBA1 gene. The aldolase gene was disrupted by insertion of the yeast URA3 gene into the coding region of one FBA1 allele in a homozygous diploid ura3 strain. The haploid offspring with the defective aldolase allele fba1::URA3 lack aldolase enzymatic activity and fail to grow in media containing, as a carbon source, metabolites of only one side of the aldolase reaction.
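The figures quoted in this abstract are internally consistent, which a short calculation makes explicit. This is a minimal arithmetic sketch using only the numbers stated above; the variable names are assumptions, and the abstract does not say whether a stop codon is counted in the 1077 base pairs.

```python
# Consistency check on the values reported in the abstract.
ORF_BP = 1077        # open reading frame length in base pairs
N_RESIDUES = 359     # reported protein length in amino acids
MASS_DA = 39_608     # predicted molecular mass in daltons

# Three bases per codon: the ORF length should divide evenly by 3.
assert ORF_BP % 3 == 0
codons = ORF_BP // 3

# Average mass per residue implied by the reported totals.
mean_residue_mass = MASS_DA / N_RESIDUES

print(codons, round(mean_residue_mass, 1))  # prints: 359 110.3
```

The 1077 base pairs correspond to exactly 359 codons, matching the reported protein length, and the implied mean residue mass of about 110 Da is in the range typical of proteins.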
Topics: Amino Acid Sequence; Base Sequence; Chromosome Deletion; Cloning, Molecular; Codon; DNA, Fungal; Fructose-Bisphosphate Aldolase; Genes; Genes, Fungal; Molecular Sequence Data; Restriction Mapping; Saccharomyces cerevisiae
PubMed: 2647491
DOI: 10.1111/j.1432-1033.1989.tb14648.x -
PloS One 2013
Cone snails, which are predatory marine gastropods, produce a cocktail of venoms used for predation, defense and competition. The major venom component, conotoxin, has received significant attention because it is useful in neuroscience research, drug development and molecular diversity studies. In this study, we report the genomic characterization of nine conotoxin gene superfamilies from 18 Conus species and investigate the relationships among conotoxin gene structure, molecular evolution and diversity. The I1, I2, M, O2, O3, P, S, and T superfamily precursors all contain three exons and two introns, while A superfamily members contain two exons and one intron. The introns are conserved within a certain gene superfamily, and also conserved across different Conus species, but divergent among different superfamilies. The intronic sequences contain many simple repeat sequences and regulatory elements that may influence conotoxin gene expression. Furthermore, due to the unique gene structure of conotoxins, the base substitution rates and the number of positively selected sites vary greatly among exons. Many more point mutations and trinucleotide indels were observed in the mature peptide exon than in the other exons. In addition, the first example of alternative splicing in conotoxin genes was found. These results suggest that the diversity of conotoxin genes has been shaped by point mutations and indels, as well as rare gene recombination or alternative splicing events, and that the unique gene structures could have made a contribution to the evolution of conotoxin genes.
Topics: Alternative Splicing; Animals; Base Composition; Conus Snail; Evolution, Molecular; Exons; Gene Expression Profiling; Gene Expression Regulation; Gene Order; Introns; Multigene Family; Open Reading Frames; Peptides; Sequence Analysis, DNA; Snake Venoms
PubMed: 24349297
DOI: 10.1371/journal.pone.0082495 -
Proceedings of the National Academy of... Jan 1981
M and N are the two common ("normal") alleles at the MN locus of the MNSs blood group system. The antigens M and N that they determine are located within the amino-terminal region of glycophorin A. In the serologically active and glycosylated (*) fragment of glycophorin AN the sequence is Leu-Ser*-Thr*-Thr*-Glu-, and in that of glycophorin AM it is Ser-Ser*-Thr*-Thr*-Gly-. Mg and Mc are very rare ("variant") alleles of M and N; as to the corresponding antigens, Mg is serologically quite distinct from M and N, while Mc is a compound of both. Erythrocytes of genotypes MgN, MgM, MgMg, and McM, which were the object of the present study, contain normal amounts of glycophorin A in their membrane. In glycophorin AMg the amino-terminal sequence is related to that of glycophorin AN by substitution of asparagine for threonine in position 4, and it is nonglycosylated: Leu-Ser-Thr-Asn-Glu-. The corresponding structure of glycophorin AMc is Ser-Ser*-Thr*-Thr*-Glu-; it is thus closely related to that of glycophorin AN and AM, by substitution of the amino acids in positions 1 or 5, respectively. All of these substitutions can be explained by single base changes. The distinctions in chemical structure not only confirm the location of M and N in this region of glycophorin A, because they are the only differences observed, but also indicate, because they are correlated with the distinctions in antigenic specificity, that M and N are structural genes coding for amino acid sequences. The finding that Mc contains structural features of both M and N suggests that these two forms of glycophorin A have evolved from a common ancestral gene by single base substitutions at sites in the genome coding for amino acids in positions 1 and 5 of the sequence. Carbohydrate structures, however, are also necessary for full expression of antigens M and N. Glycosylation during biosynthesis of residues within the polypeptide appears to depend on a particular protein structure.
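The statement that all of these substitutions can be explained by single base changes can be checked against the standard genetic code. The sketch below is an illustration, not part of the original study: the helper `single_base_routes` is hypothetical, and it assumes the standard codon table written with DNA bases.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (third base fastest).
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO)}


def single_base_routes(aa_from: str, aa_to: str) -> list[tuple[str, str]]:
    """List codon pairs (from, to) that differ at exactly one base."""
    routes = []
    for c1, a1 in CODON_TABLE.items():
        if a1 != aa_from:
            continue
        for c2, a2 in CODON_TABLE.items():
            if a2 == aa_to and sum(x != y for x, y in zip(c1, c2)) == 1:
                routes.append((c1, c2))
    return routes


if __name__ == "__main__":
    # Substitutions named in the abstract: position 1 (Ser/Leu),
    # position 5 (Gly/Glu), and position 4 in the Mg variant (Thr/Asn).
    for a, b in [("S", "L"), ("G", "E"), ("T", "N")]:
        print(a, "->", b, single_base_routes(a, b))
```

Each of the three pairs has at least one pair of codons one base apart (for example TCA/TTA for Ser/Leu), which is what the abstract's claim requires.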
Topics: Alleles; Amino Acid Sequence; Epitopes; Genes; Genetic Variation; Genotype; Glycophorins; Humans; MNSs Blood-Group System; Mutation; Sialoglycoproteins
PubMed: 6166001
DOI: 10.1073/pnas.78.1.631 -
PloS One 2017
Soybean (Glycine max) is one of the major crops worldwide, and flooding stress affects the production and expansion of cultivated areas. Oxygen is essential for mitochondrial aerobic respiration to supply the energy demand of plant cells. Because oxygen diffusion in water is 10,000 times lower than in air, partial (hypoxic) or total (anoxic) oxygen deficiency is an important component of flooding. Even when oxygen is externally available, oxygen deficiency frequently occurs in bulky, dense or metabolically active tissues such as phloem, meristems, seeds, and fruits. In this study, we analyzed conserved and divergent root transcriptional responses between flood-tolerant Embrapa 45 and flood-sensitive BR 4 soybean cultivars under hypoxic stress conditions with RNA-seq. To understand how soybean genes evolve and respond to hypoxia, stable and differentially expressed genes were characterized structurally and compositionally, comparing their mechanistic relationship. Between cultivars, Embrapa 45 showed fewer up-regulated and more down-regulated genes, and stronger induction of phosphoglucomutase (Glyma05g34790), unknown protein related to N-terminal protein myristoylation (Glyma06g03430), protein suppressor of phyA-105 (Glyma06g37080), and fibrillin (Glyma10g32620). RNA-seq and qRT-PCR analysis of non-symbiotic hemoglobin (Glyma11g12980) indicated divergence in gene structure between cultivars. Transcriptional changes for genes in amino acids and derivative metabolic process suggest involvement of amino acid metabolism in tRNA modifications, translation accuracy/efficiency, and endoplasmic reticulum stress in both cultivars under hypoxia. Gene groups differed in promoter TATA box, ABREs (ABA-responsive elements), and CRT/DREs (C-repeat/dehydration-responsive elements) frequency. Gene groups also differed in structure, composition, and codon usage, indicating biological significance. Additional data suggest that cis-acting ABRE elements can mediate gene expression independent of ABA in soybean roots under hypoxia.
Topics: Gene Expression Regulation, Plant; Genes, Plant; Oxygen; Glycine max; Stress, Physiological; Transcriptome
PubMed: 29145496
DOI: 10.1371/journal.pone.0187920 -
Nucleic Acids Research Jan 2010
Sets of genes expressed in the same tissue are believed to be under the regulation of a similar set of transcription factors, and can thus be assumed to contain similar structural patterns in their regulatory regions. Here we present a study of the structural patterns in promoters of genes expressed specifically in 26 human and 34 mouse tissues. For each tissue we constructed promoter structure models, taking into account the presence of motifs, their positioning relative to the transcription start site, and pairwise positioning of motifs. We found that 35 out of 60 models (58%) were able to distinguish positive test promoter sequences from control promoter sequences with statistical significance. Models with high performance include those for liver, skeletal muscle, kidney and tongue. Many of the important structural patterns in these models involve transcription factors of known importance in the tissues in question, and structural patterns tend to be conserved between human and mouse. In addition, promoter models for related tissues tend to have high inter-tissue performance, indicating that their promoters share common structural patterns. Together, these results illustrate the validity of our models, but also indicate that the promoter structures for some tissues are easier to model than those of others.
Topics: Animals; Binding Sites; Gene Expression Regulation; Humans; Liver; Mice; Models, Genetic; Promoter Regions, Genetic; Sequence Analysis, DNA; Transcription Factors; Transcription Initiation Site
PubMed: 19850720
DOI: 10.1093/nar/gkp866 -
BMC Evolutionary Biology Aug 2007
BACKGROUND
Histidine biosynthesis is one of the best characterized anabolic pathways. There is a large body of genetic and biochemical information available, including operon structure, gene expression, and increasingly larger sequence databases. For over forty years this pathway has been the subject of extensive studies, mainly in Escherichia coli and Salmonella enterica, in both of which details of histidine biosynthesis appear to be identical. In these two enterobacteria the pathway is unbranched, includes a number of unusual reactions, and consists of nine intermediates; his genes are arranged in a compact operon (hisGDC[NB]HAF[IE]), with three of them (hisNB, hisD and hisIE) coding for bifunctional enzymes. We performed a detailed analysis of his gene fusions in available genomes to understand the role of gene fusions in shaping this pathway.
RESULTS
The analysis of HisA structures revealed that several gene elongation events are at the root of this protein family: internal duplications have been identified by structural superposition of the modules composing the TIM-barrel protein. Several his gene fusions happened in distinct taxonomic lineages; hisNB originated within gamma-proteobacteria, and after its appearance it was transferred to Campylobacter species (epsilon-proteobacteria) and to some Bacteria belonging to the CFB group. The transfer involved the entire his operon. The hisIE gene fusion was found in several taxonomic lineages, and our results suggest that it probably happened several times in distinct lineages. Gene fusions involving hisIE and hisD genes (HIS4) and hisH and hisF genes (HIS7) took place in the Eukarya domain; the latter has been transferred to some delta-proteobacteria.
CONCLUSION
Gene duplication is the most widely known mechanism responsible for the origin and evolution of metabolic pathways; however, several other mechanisms might contribute to the process of pathway assembly, and gene fusion appears to be one of the most important and common.
Topics: Amino Acid Sequence; Archaea; Bacteria; Base Sequence; Eukaryotic Cells; Evolution, Molecular; Gene Duplication; Gene Fusion; Genes, Archaeal; Genes, Bacterial; Histidine; Metabolic Networks and Pathways; Phylogeny; Sequence Alignment
PubMed: 17767732
DOI: 10.1186/1471-2148-7-S2-S4 -
ELife Jun 2022
Splicing is highly regulated and is modulated by numerous factors. Quantitative predictions for how a mutation will affect precursor mRNA (pre-mRNA) structure and downstream function are particularly challenging. Here, we use a novel chemical probing strategy to visualize endogenous precursor and mature mRNA structures in cells. We used these data to estimate Boltzmann suboptimal structural ensembles, which were then analyzed to predict consequences of mutations on pre-mRNA structure. Further analysis of recent cryo-EM structures of the spliceosome at different stages of the splicing cycle revealed that the footprint of the B complex with pre-mRNA best predicted alternative splicing outcomes for exon 10 inclusion of the alternatively spliced gene, achieving 74% accuracy. We further developed a β-regression weighting framework that incorporates splice site strength, RNA structure, and exonic/intronic splicing regulatory elements capable of predicting, with 90% accuracy, the effects of 47 known and 6 newly discovered mutations on inclusion of exon 10 of . This combined experimental and computational framework represents a path forward for accurate prediction of splicing-related disease-causing variants.
Topics: Alternative Splicing; Exons; Introns; Mutation; RNA Precursors; RNA Splice Sites; RNA Splicing; RNA, Messenger
PubMed: 35695373
DOI: 10.7554/eLife.73888