-
Genome Biology and Evolution 2011
Organisms show striking differences in genome structure; however, the functional implications and fundamental forces that govern these differences remain obscure. The intron-exon organization of nuclear genes is involved in a particularly large variety of structures and functional roles. We performed a 22-species study of Meis/hth genes, intron-rich homeodomain-containing transcription factors involved in a wide range of developmental processes. Our study revealed three surprising results that suggest important and very different functions for Meis intron-exon structures. First, we find unexpected conservation across species of intron positions and lengths along most of the Meis locus. This contrasts with the high degree of structural divergence found in genome-wide studies and may attest to conserved regulatory elements residing within these conserved introns. Second, we find very different evolutionary histories for the 5' and 3' regions of the gene. The 5'-most 10 exons, which encode the highly conserved Meis domain and homeodomain, show striking conservation. By contrast, the 3' of the gene, which encodes several domains implicated in transcriptional activation and response to cell signaling, shows a remarkably active evolutionary history, with diverse isoforms and frequent creation and loss of new exons and splice sites. This region-specific diversity suggests evolutionary "tinkering," with alternative splicing allowing for more subtle regulation of protein function. Third, we find a large number of cases of convergent evolution in the 3' region, including 1) parallel losses of ancestral coding sequence, 2) parallel gains of external and internal splice sites, and 3) recurrent truncation of C-terminal coding regions. These results attest to the importance of locus-specific splicing functions in differences in structural evolution across genes, as well as to commonalities of forces shaping the evolution of individual genes along different lineages.
Topics: 3' Untranslated Regions; 5' Untranslated Regions; Alternative Splicing; Animals; Base Sequence; Conserved Sequence; Evolution, Molecular; Exons; Homeodomain Proteins; Humans; Introns; Invertebrates; Vertebrates
PubMed: 21680890
DOI: 10.1093/gbe/evr056 -
Nature Immunology Feb 2009
Review
The elusive etiology of germline bias of the T cell receptor (TCR) for major histocompatibility complex (MHC) has been clarified by recent 'proof-of-concept' structural results demonstrating the conservation of specific TCR-MHC interfacial contacts in complexes bearing common variable segments and MHC allotypes. We suggest that each TCR variable-region gene product engages each type of MHC through a 'menu' of structurally coded recognition motifs that have arisen through coevolution. The requirement for MHC-restricted T cell recognition during thymic selection and peripheral surveillance has necessitated the existence of such a coded recognition system. Given these findings, a reconsideration of the TCR-peptide-MHC structural database shows that not only have the answers been there all along but also they were predictable by the first principles of physical chemistry.
Topics: Animals; Genes, T-Cell Receptor; Humans; Major Histocompatibility Complex; Protein Structure, Quaternary; Receptors, Antigen, T-Cell
PubMed: 19148199
DOI: 10.1038/ni.f.219 -
PloS One 2016
In the modern era of post-genomics and transcriptomics, non-coding RNAs and the non-coding regions of many RNAs remain a major puzzle when we try to decipher their role in specific gene function. Gene function assessment is a main task in which high-throughput technologies provide an impressive body of data that enables the design of hypotheses linking genes to phenotypes. Gene knockdown technologies and RNA-dependent gene silencing are the most frequent approaches for assessing the role of key effectors in a particular scenario. Ribozymes are effective modulators of gene expression because of their simple structure, site-specific cleavage activity, and catalytic potential. In our study, after an extensive search of the Leishmania major transcriptome, we found that the 3' UTR of a putative ATP-dependent DNA helicase (Lmjf_09_0590) has a structural signature similar to that of the well-known HDV hammerhead ribozyme, even though the two have variable sequence motifs. To determine its structural stability, we analyzed our predicted structural model of this 3' UTR with a 30 ns MD simulation, further confirmed with a 100 ns MD simulation in a 5 mM MgCl2 ionic environment. In this environment, structural stability was significantly improved by bonded interactions between the RNA backbone and Mg2+ ions. These predictions were further validated in silico using RNA normal mode analysis and anisotropic network modelling (ANM) studies. The study may help significantly in understanding the functional importance of many such 3' UTRs and in predicting their roles mechanistically.
Topics: 3' Untranslated Regions; Base Sequence; Computational Biology; Gene Expression Profiling; Leishmania major; Molecular Dynamics Simulation; Molecular Sequence Data; Nucleic Acid Conformation; RNA, Catalytic; Regulatory Elements, Transcriptional; Sequence Alignment; Transcriptome
PubMed: 26901858
DOI: 10.1371/journal.pone.0148909 -
The Anatomical Record Feb 1999
Comparative Study Review
The cloning of genes involved in pathways fundamental to morphogenesis has opened the door to visualizing expression of developmental regulatory genes in many organisms. Expression data have become a technical commonplace in the analysis of mutants of Drosophila melanogaster and a handful of other genetic model systems. Many researchers have used probes and extended the logic from studies of D. melanogaster to compare expression patterns and infer developmental bases for homologous structures among animals with diverse body plans. This research program has led to exciting but sweeping generalizations about how development evolves. Here we examine several underlying assumptions of this approach in terms of comparative and historical biology. First, we evaluate the logic that underlies the equation of gene expression similarity with homologous morphology. Second, we examine epistemological issues surrounding the descriptive visualization of gene expression patterns. We conclude by examining the role of phylogenetic coding and mapping of these patterns in studying the evolution of complex gene regulatory networks.
Topics: Animals; Evolution, Molecular; Gene Expression Regulation, Developmental; Genes; Genotype; Phenotype; Research
PubMed: 10333399
DOI: 10.1002/(SICI)1097-0185(19990215)257:1<6::AID-AR4>3.0.CO;2-I -
The Journal of Biological Chemistry Jul 1994
The cell-surface glycoprotein CD36 interacts with a large variety of ligands, including collagen types I and IV, thrombospondin, erythrocytes parasitized with Plasmodium falciparum, platelet-agglutinating protein p37, oxidized low density lipoprotein, and long-chain fatty acids. Its expression is restricted to platelets, monocytes, adipocytes, and some endothelial and epithelial cells and is regulated during cell activation, differentiation, and development. CD36 belongs to a novel gene family of structurally related glycoproteins that includes CLA-1 and the lysosomal membrane glycoprotein LIMPII. To advance our knowledge on the genomic organization and the regulation of the cellular expression of the genes of this family, we have investigated the structural organization of the human CD36 gene and of its 5'-proximal flanking region. The CD36 gene is encoded by 15 exons that extend more than 32 kilobases on the human genome. Interestingly, the CD36 mRNA 5'-untranslated region is encoded by three exons. The 3'-untranslated region is contained in two exons, whose expression pattern can originate two mRNA forms. The cytoplasmic and transmembrane regions predicted at both terminal ends of the polypeptide chain are encoded by single exons, while the extracellular domain is encoded by 11 exons. The transcription initiation site of the CD36 gene is located 289 nucleotides upstream from the translational start codon. Sequence analysis of the proximal 5'-flanking region of the gene reveals the existence of a TATA box appropriately located with respect to the transcription initiation site and several potential cis-regulatory elements that might contribute to the transcriptional regulation of the CD36 gene. Delineation of the structural organization of the CD36 gene may help in defining the boundaries of relevant structural and/or functional domains in CD36 and, by extension, in the other members of the family.
Topics: Antigens, CD; Base Sequence; CD36 Antigens; Exons; Genes; Humans; Molecular Sequence Data; Oligodeoxyribonucleotides; Promoter Regions, Genetic; RNA, Messenger; Restriction Mapping; Transcription, Genetic
PubMed: 7518447
DOI: No ID Found -
Developmental Biology Sep 2021
Previous studies on mouse embryo limbs have established that interzone mesenchymal progenitor cells emerging at each prescribed joint site give rise to joint tissues over fetal time. These incipient tissues undergo structural maturation and morphogenesis postnatally, but underlying mechanisms of regulation remain unknown. Hox11 genes dictate overall zeugopod musculoskeletal patterning and skeletal element identities during development. Here we asked where these master regulators are expressed in developing limb joints and whether they are maintained during postnatal zeugopod joint morphogenesis. We found that Hoxa11 was predominantly expressed and restricted to incipient wrist and ankle joints in E13.5 mouse embryos, and became apparent in medial and central regions of knees by E14.5, though remaining continuously dormant in elbow joints. Closer examination revealed that Hoxa11 initially characterized interzone and neighboring cells and was then restricted to nascent articular cartilage, intra joint ligaments and structures such as meniscal horns over prenatal time. Postnatally, articular cartilage progresses from a nondescript cell-rich, matrix-poor tissue to a highly structured, thick, zonal and mechanically competent tissue with chondrocyte columns over time, most evident at sites such as the tibial plateau. Indeed, Hox11 expression (primarily Hoxa11) was intimately coupled to such morphogenetic processes and, in particular, to the topographical rearrangement of chondrocytes into columns within the intermediate and deep zones of tibial plateau that normally endures maximal mechanical loads. Revealingly, these expression patterns were maintained even at 6 months of age. In sum, our data indicate that Hox11 genes remain engaged well beyond embryonic synovial joint patterning and are specifically tied to postnatal articular cartilage morphogenesis into a zonal and resilient tissue. 
The data demonstrate that Hox11 genes characterize adult, terminally differentiated, articular chondrocytes and maintain region-specificity established in the embryo.
Topics: Animals; Cartilage, Articular; Chondrogenesis; Extremities; Gene Expression Regulation, Developmental; Genes, Homeobox; Genes, Reporter; Green Fluorescent Proteins; Mice; Synovial Membrane
PubMed: 34010606
DOI: 10.1016/j.ydbio.2021.05.007 -
BMC Genomics Sep 2013
BACKGROUND
Divergence in gene structure following gene duplication is not well understood. Gene duplication can occur via whole-genome duplication (WGD) and single-gene duplications including tandem, proximal and transposed duplications. Different modes of gene duplication may be associated with different types, levels, and patterns of structural divergence.
RESULTS
In Arabidopsis thaliana, we denote levels of structural divergence between duplicated genes by differences in coding-region lengths and average exon lengths, and the number of insertions/deletions (indels) and maximum indel length in their protein sequence alignment. Among recent duplicates of different modes, transposed duplicates diverge most dramatically in gene structure. In transposed duplications, parental loci tend to have longer coding-regions and exons, and smaller numbers of indels and maximum indel lengths than transposed loci, reflecting biased structural changes in transposed duplications. Structural divergence increases with evolutionary time for WGDs, but not transposed duplications, possibly because of biased gene losses following transposed duplications. Structural divergence has heterogeneous relationships with nucleotide substitution rates, but is consistently positively correlated with gene expression divergence. The NBS-LRR gene family shows higher-than-average levels of structural divergence.
CONCLUSIONS
Our study suggests that structural divergence between duplicated genes is greatly affected by the mechanisms of gene duplication and may not be proportional to evolutionary time, and that certain gene families are under selection for rapid evolution of gene structure.
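The four divergence measures named in the RESULTS paragraph (difference in coding-region length, difference in average exon length, number of indels, and maximum indel length in the protein alignment) can be illustrated with a minimal sketch. The `Gene` container, the function names, and the toy sequences below are hypothetical and are not taken from the paper. The sketch also assumes that an indel is a run of consecutive gap characters in one sequence of a pairwise alignment, which may differ from the authors' exact definition.

```python
from __future__ import annotations

from dataclasses import dataclass
from itertools import groupby


@dataclass
class Gene:
    """Hypothetical container holding the coding exon lengths (bp) of one gene."""
    exon_lengths: list[int]

    @property
    def cds_length(self) -> int:
        # Coding-region length is the sum of the coding exon lengths.
        return sum(self.exon_lengths)

    @property
    def mean_exon_length(self) -> float:
        return self.cds_length / len(self.exon_lengths)


def indel_stats(aligned_a: str, aligned_b: str) -> tuple[int, int]:
    """Return (number of indels, longest indel) for a gapped pairwise alignment.

    Assumption: an indel is a maximal run of '-' in exactly one sequence.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    states = []
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" and y != "-":
            states.append("A")  # gap in the first sequence
        elif y == "-" and x != "-":
            states.append("B")  # gap in the second sequence
        else:
            states.append(".")  # aligned residues (or a shared gap column)
    gap_runs = [
        sum(1 for _ in group)
        for state, group in groupby(states)
        if state != "."
    ]
    return len(gap_runs), max(gap_runs, default=0)


def structural_divergence(g1: Gene, g2: Gene, aligned_1: str, aligned_2: str) -> dict:
    """Collect the four divergence measures for one duplicate pair."""
    n_indels, max_indel = indel_stats(aligned_1, aligned_2)
    return {
        "cds_length_diff": abs(g1.cds_length - g2.cds_length),
        "mean_exon_length_diff": abs(g1.mean_exon_length - g2.mean_exon_length),
        "n_indels": n_indels,
        "max_indel_length": max_indel,
    }


# Toy duplicate pair (made-up numbers, for illustration only).
parental = Gene([300, 150, 450])
transposed = Gene([300, 420])
print(structural_divergence(parental, transposed, "MKT--LLAV", "MKTAALL--"))
```

Taking absolute differences treats the pair symmetrically. A signed difference (parental minus transposed) would instead be needed to test the directional bias the abstract reports for transposed duplicates.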
Topics: Arabidopsis; Arabidopsis Proteins; DNA Transposable Elements; Gene Dosage; Gene Duplication; Gene Expression Regulation, Plant; Genes, Duplicate; Genes, Plant; Genetic Variation; Multigene Family; Nucleotides
PubMed: 24063813
DOI: 10.1186/1471-2164-14-652 -
BMC Plant Biology Jul 2008
BACKGROUND
The MYB superfamily constitutes the most abundant group of transcription factors described in plants. Members control processes such as epidermal cell differentiation, stomatal aperture, flavonoid synthesis, cold and drought tolerance and pathogen resistance. No genome-wide characterization of this family has been conducted in a woody species such as grapevine. In addition, previous analysis of the recently released grape genome sequence suggested expansion events of several gene families involved in wine quality.
RESULTS
We describe and classify 108 members of the grape R2R3 MYB gene subfamily in terms of their genomic gene structures and similarity to their putative Arabidopsis thaliana orthologues. Seven gene models were derived and analyzed in terms of gene expression and their DNA binding domain structures. Despite low overall sequence homology in the C-terminus of all proteins, even in those with similar functions across Arabidopsis and Vitis, highly conserved motif sequences and exon lengths were found. The grape epidermal cell fate clade is expanded when compared with the Arabidopsis and rice MYB subfamilies. Two anthocyanin MYBA related clusters were identified in chromosomes 2 and 14, one of which includes the previously described grape colour locus. Tannin related loci were also detected with eight candidate homologues in chromosomes 4, 9 and 11.
CONCLUSION
This genome-wide transcription factor analysis in Vitis suggests that clade-specific grape R2R3 MYB genes are expanded while other MYB genes could be well conserved compared to Arabidopsis. MYB gene abundance, homology and orientation within particular loci also suggest that expanded MYB clades conferring quality attributes of grapes and wines, such as colour and astringency, could possess redundant, overlapping and cooperative functions.
Topics: Arabidopsis; Conserved Sequence; Gene Expression Profiling; Gene Expression Regulation, Plant; Genes, Plant; Genome, Plant; Multigene Family; Phylogeny; Vitis; Wine
PubMed: 18647406
DOI: 10.1186/1471-2229-8-83 -
Sheng Wu Gong Cheng Xue Bao = Chinese... May 2008
In this work, we used computational methods to analyze genetic variation that can alter the expression and function of the BRCA2 gene. Of the total 534 SNPs, 101 were found to be non-synonymous (nsSNPs). Of the 7 SNPs in the untranslated regions (UTRs), 3 were found in the 5' UTR and 4 in the 3' UTR. Of the 101 nsSNPs investigated, 20.7% were found to be damaging by both the SIFT and PolyPhen servers. The UTR resource tool suggested that 2 SNPs in the 5' UTR and 4 SNPs in the 3' UTR might change protein expression levels. The mutation from asparagine to isoleucine at position 3124 of the native BRCA2 protein was the most deleterious according to both the SIFT and PolyPhen servers. A structural comparison of this mutated protein with the native protein gave an RMSD value of 0.301 nm. Based on this work, we propose that this most deleterious nsSNP, with SNP ID rs28897759, is an important candidate for a causal role in breast cancer involving the BRCA2 gene.
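The abstract reports an RMSD value of 0.301 nm between the mutated and native structures but does not say how the two structures were superimposed or which atoms were compared. As a minimal sketch of the quantity itself, the function below computes the root-mean-square deviation between two equally sized sets of coordinates that are assumed to be already superimposed. The function name and the toy coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two matched coordinate lists.

    Assumptions: both inputs list (x, y, z) tuples for the same atoms in the
    same order, in the same units, and the structures are already superimposed.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equally long")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared Euclidean distance between each matched atom pair.
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


# Toy example: every atom displaced by 0.3 nm along z (made-up coordinates).
native = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
mutant = [(0.0, 0.0, 0.3), (1.0, 0.0, 0.3)]
print(round(rmsd(native, mutant), 3))
```

In practice an optimal superposition step (for example a least-squares fit) usually precedes this calculation; it is omitted here to keep the sketch short.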
Topics: 3' Untranslated Regions; 5' Untranslated Regions; Amino Acid Substitution; Apoptosis Regulatory Proteins; BRCA2 Protein; Breast Neoplasms; Computational Biology; Genes, BRCA2; Humans; Models, Molecular; Mutation; Polymorphism, Single Nucleotide; Structure-Activity Relationship
PubMed: 18724707
DOI: 10.1016/s1872-2075(08)60042-4 -
BMC Bioinformatics Jun 2014
BACKGROUND
Accurate computational identification of eukaryotic gene organization is a long-standing problem. Despite the fundamental importance of precise annotation of genes encoded in newly sequenced genomes, the accuracy of predicted gene structures has not been critically evaluated, mostly due to the scarcity of proper assessment methods.
RESULTS
We present a gene-structure-aware multiple sequence alignment method for gene prediction using amino acid sequences translated from homologous genes from many genomes. The approach provides rich information concerning the reliability of each predicted gene structure. We have also devised an iterative method that attempts to improve the structures of suspiciously predicted genes based on a spliced alignment algorithm using consensus sequences or reliable homologs as templates. Application of our methods to cytochrome P450 and ribosomal proteins from 47 plant genomes indicated that 50-60% of the annotated gene structures are likely to contain some defects. Whereas more than half of the defect-containing genes may be intrinsically broken, i.e. they are pseudogenes or gene fragments, located in unfinished sequencing areas, or corresponding to non-productive isoforms, the defects found in a majority of the remaining gene candidates can be remedied by our iterative refinement method.
CONCLUSIONS
Refinement of eukaryotic gene structures mediated by gene-structure-aware multiple protein sequence alignment is a useful strategy to dramatically improve the overall prediction quality of a set of homologous genes. Our method will be applicable to various families of protein-coding genes if their domain structures are evolutionarily stable. It is also feasible to apply our method to gene families from all kingdoms of life, not just plants.
Topics: Algorithms; Genome, Plant; High-Throughput Nucleotide Sequencing; Introns; Plant Proteins; Pseudogenes; Reproducibility of Results; Sequence Alignment
PubMed: 24927652
DOI: 10.1186/1471-2105-15-189