-
Genes & Development Dec 1987
We have determined the nucleotide sequence and transforming activity of the human L-myc gene and a processed L-myc pseudogene (L-myc psi). We demonstrate by cotransformation assays that a 10.6-kb EcoRI fragment derived from a human placental library contains a complete and functional L-myc gene including transcriptional regulatory sequences sufficient for expression in rat embryo fibroblasts. Organization of the L-myc gene was determined by comparing its sequence to those of the L-myc psi gene and an L-myc cDNA clone derived from a human small cell lung carcinoma. Our results show that L-myc has a three-exon organization similar to that of the c-myc and N-myc genes. The putative L-myc gene product consists of 364 amino acids and contains five of the seven homology regions highly conserved between c-myc and N-myc. These conserved regions are located along the entire length of the putative L-myc protein and are interspersed among nonconserved regions. While the putative L-myc gene product is smaller than the c- and N-myc proteins, certain conserved residues occur in corresponding locations along the peptide backbone of the three proteins. In addition, comparison of the human and murine L-myc gene sequences indicates that the relatively large 5' and 3' untranslated regions are evolutionarily conserved, but that these sequences are totally divergent between the L-, c-, and N-myc genes. Finally, we demonstrate that, like the N- and c-myc genes, the L-myc gene can cooperate with a mutant Ha-ras gene to cause malignant transformation of rat embryo fibroblasts in culture. Our analyses clearly prove that L-myc represents a functional member of the myc oncogene family and further delineate structural features that may be important for the common and divergent functions of the members of this gene family.
Topics: Amino Acid Sequence; Animals; Base Sequence; Cell Transformation, Neoplastic; Cells, Cultured; Genes, ras; Humans; Molecular Sequence Data; Multigene Family; Proto-Oncogene Proteins; Proto-Oncogenes; Pseudogenes; Rats
PubMed: 3322939
DOI: 10.1101/gad.1.10.1311 -
PloS One 2014
BACKGROUND
NAD(H) kinase (NADK) is the key enzyme that catalyzes de novo synthesis of NADP(H) from NAD(H) for NADP(H)-based metabolic pathways. In plants, NADKs form functional subfamilies. Studies of these families in Arabidopsis thaliana indicate that they have undergone considerable evolutionary selection; however, the detailed evolutionary history and functions of the various NADKs in plants are not clearly understood.
PRINCIPAL FINDINGS
We performed a comparative genomic analysis that identified 74 NADK gene homologs from 24 species representing the eight major plant lineages within the supergroup Plantae: glaucophytes, rhodophytes, chlorophytes, bryophytes, lycophytes, gymnosperms, monocots and eudicots. Phylogenetic and structural analysis classified these NADK genes into four well-conserved subfamilies with considerable variety in the domain organization and gene structure among subfamily members. In addition to the typical NAD_kinase domain, additional domains, such as adenylate kinase, dual-specificity phosphatase, and protein tyrosine phosphatase catalytic domains, were found in subfamily II. Interestingly, NADKs in subfamily III exhibited low sequence similarity (∼30%) in the kinase domain within the subfamily and with the other subfamilies. These observations suggest that gene fusion and exon shuffling may have occurred after gene duplication, leading to the specific domain organization seen in subfamilies II and III, respectively. Further analysis of the exon/intron structures showed that single intron loss and gain occurred during the structural evolution of NADK family genes, yielding the diversified gene structures. Finally, both analysis of available global microarray data and qRT-PCR experiments revealed that the NADK genes in Arabidopsis and Oryza sativa show different expression patterns in different developmental stages and under several different abiotic/biotic stresses and hormone treatments, underscoring the functional diversity and functional divergence of the NADK family in plants.
CONCLUSIONS
These findings will facilitate further studies of the NADK family and provide valuable information for functional validation of this family in plants.
Topics: Arabidopsis; Evolution, Molecular; Exons; Gene Expression Regulation, Plant; Genes, Plant; Genome-Wide Association Study; Introns; Multigene Family; Organ Specificity; Oryza; Phosphotransferases (Alcohol Group Acceptor); Phylogeny; Plants; Promoter Regions, Genetic; Response Elements; Stress, Physiological
PubMed: 24968225
DOI: 10.1371/journal.pone.0101051 -
Nature Communications Sep 2021
Bacteria of the genus Streptomyces are prolific producers of specialized metabolites, including antibiotics. The linear chromosome includes a central region harboring core genes, as well as extremities enriched in specialized metabolite biosynthetic gene clusters. Here, we show that chromosome structure in Streptomyces ambofaciens correlates with genetic compartmentalization during exponential phase. Conserved, large and highly transcribed genes form boundaries that segment the central part of the chromosome into domains, whereas the terminal ends tend to be transcriptionally quiescent compartments with different structural features. The onset of metabolic differentiation is accompanied by a rearrangement of chromosome architecture, from a rather 'open' to a 'closed' conformation, in which highly expressed specialized metabolite biosynthetic genes form new boundaries. Thus, our results indicate that the linear chromosome of S. ambofaciens is partitioned into structurally distinct entities, suggesting a link between chromosome folding, gene expression and genome evolution.
Topics: Anti-Bacterial Agents; Chromosome Structures; Chromosomes, Bacterial; Gene Expression Regulation, Bacterial; Genome, Bacterial; Multigene Family; Streptomyces; Transcriptome
PubMed: 34471117
DOI: 10.1038/s41467-021-25462-1 -
BMC Bioinformatics Mar 2019
BACKGROUND
The inference of splicing orthology relationships between gene transcripts is a basic step for the prediction of transcripts and the annotation of gene structures in genomes. The splicing structure of a sequence refers to the exon extremity information in a CDS or the exon-intron extremity information in a gene sequence. Splicing orthologous CDS are pairs of CDS with similar sequences and conserved splicing structures from orthologous genes. Spliced alignment, which consists of aligning a spliced cDNA sequence against an unspliced genomic sequence, constitutes a promising, yet unexplored approach for the identification of splicing orthology relationships. Existing spliced alignment algorithms do not exploit the information on the splicing structure of the input sequences, namely the exon structure of the cDNA sequence and the exon-intron structure of the genomic sequences. Yet, this information is often available for coding DNA sequences (CDS) and gene sequences annotated in databases, and it can help improve the accuracy of the computed spliced alignments. To address this issue, we introduce a new spliced alignment problem and a method called SplicedFamAlign (SFA) for computing the alignment of a spliced CDS against a gene sequence while accounting for the splicing structures of the input sequences, and then inferring transcript splicing orthology groups in a gene family based on spliced alignments.
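The notion of a CDS splicing structure as "exon extremity information" can be illustrated with a minimal sketch. This is not SFA's data model; the function name `exon_extremities`, the 0-based half-open coordinates, and the example exon lengths are assumptions chosen for illustration only.

```python
from itertools import accumulate


def exon_extremities(exon_lengths):
    """Return (start, end) pairs for each exon in CDS coordinates.

    Coordinates are 0-based and half-open, so consecutive exons share
    a boundary: the end of one exon is the start of the next.
    """
    ends = list(accumulate(exon_lengths))
    starts = [0] + ends[:-1]
    return list(zip(starts, ends))


# Hypothetical CDS made of three exons of 120, 90 and 210 nucleotides.
structure = exon_extremities([120, 90, 210])
print(structure)
```

Two CDS could then be compared by checking whether their exon boundaries fall at corresponding positions of an alignment, which is the kind of information the abstract says existing spliced aligners leave unused.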
RESULTS
The experimental results show that SFA outperforms existing spliced alignment methods in terms of accuracy and execution time for CDS-to-gene alignment. We also show that the performance of SFA remains high for various levels of sequence similarity between input sequences, thanks to accounting for the splicing structure of the input sequences. It is important to notice that unlike all current spliced alignment methods that are meant for cDNA-to-genome alignments and can be used for CDS-to-gene alignments, SFA is the first method specifically designed for CDS-to-gene alignments.
CONCLUSION
We show the usefulness of SFA for the comparison of genes and transcripts within a gene family for the purpose of analyzing splicing orthologies. It can also be used for gene structure annotation and alternative splicing analyses. SplicedFamAlign was implemented in Python. Source code is freely available at https://github.com/UdeS-CoBIUS/SpliceFamAlign .
Topics: Algorithms; Alternative Splicing; Base Sequence; Computer Simulation; Exons; Introns; Molecular Sequence Annotation; Open Reading Frames; RNA, Messenger; Sequence Alignment
PubMed: 30925859
DOI: 10.1186/s12859-019-2647-2 -
Nucleic Acids Research Jan 2013
Gene families often show degrees of differences in terms of exon-intron structures depending on their distinct evolutionary histories. Comparative analysis of gene structures is important for understanding their evolutionary and functional relationships within plant species. Here, we present a comparative genomics database named PIECE (http://wheat.pw.usda.gov/piece) for Plant Intron and Exon Comparison and Evolution studies. The database contains all the annotated genes extracted from 25 sequenced plant genomes. These genes were classified based on Pfam motifs. Phylogenetic trees were pre-constructed for each gene category. PIECE provides a user-friendly interface for different types of searches and a graphical viewer for displaying a gene structure pattern diagram linked to the resulting bootstrapped dendrogram for each gene family. The gene structure evolution of orthologous gene groups was determined using the GLOOME, Exalign and GECA software programs that can be accessed within the database. PIECE also provides a web server version of the software, GSDraw, for drawing schematic diagrams of gene structures. PIECE is a powerful tool for comparing gene sequences and provides valuable insights into the evolution of gene structure in plant genomes.
Topics: Databases, Genetic; Evolution, Molecular; Exons; Genes, Plant; Genome, Plant; Internet; Introns; Multigene Family; Phylogeny; Plant Proteins; Sequence Alignment
PubMed: 23180792
DOI: 10.1093/nar/gks1109 -
Blood Jul 1999
Comparative Study
Structural and functional implications of the intron/exon organization of the human endothelial cell protein C/activated protein C receptor (EPCR) gene: comparison with the structure of CD1/major histocompatibility complex alpha1 and alpha2 domains.
The endothelial cell protein C/activated protein C receptor (EPCR) is located primarily on the surface of the large vessels of the vasculature. In vitro studies suggest that it is involved in the protein C anticoagulant pathway. We report the organization and nucleotide sequence of the human EPCR gene. It spans approximately 6 kbp of genomic DNA, with a transcription initiation point 79 bp upstream of the translation initiation (Met) codon in close proximity to a TATA box and other promoter element consensus sequences. The human EPCR gene has been localized to 20q11.2 and consists of four exons interrupted by three introns, all of which obey the GT-AG rule. Exon I encodes the 5' untranslated region and the signal peptide, and exon IV encodes the transmembrane domain, the cytoplasmic tail, and the 3' untranslated region. Exons II and III encode most of the extracellular region of the EPCR. These exons have been found to correspond to those encoding the alpha1 and alpha2 domains of the CD1/major histocompatibility complex (MHC) class I superfamily. Flanking and intervening introns are of the same phase (phase I) and the position of the intervening intron is identically located. Secondary structure prediction for the amino acid sequence of exons II and III corresponds well with the actual secondary structure elements determined for the alpha1 and alpha2 domains of HLA-A2 and murine CD1.1 from crystal structures. These findings suggest that the EPCR folds with a beta-sheet platform supporting two alpha-helical regions collectively forming a potential binding pocket for protein C/activated protein C.
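Two of the properties mentioned above, the GT-AG rule and intron phase, are easy to state in code. The sketch below is illustrative only: the function names are hypothetical, and the exon lengths and intron sequences are made-up values, not the actual EPCR gene data.

```python
def obeys_gt_ag(intron_seq):
    """True if the intron starts with GT and ends with AG (case-insensitive)."""
    s = intron_seq.upper()
    return len(s) >= 4 and s.startswith("GT") and s.endswith("AG")


def intron_phases(coding_exon_lengths):
    """Phase of each intron, given the coding length of each exon (5' to 3').

    Phase is the number of coding bases preceding the intron, modulo 3:
    0 = between codons, 1 = after the first base, 2 = after the second.
    """
    phases = []
    coding_bases = 0
    for length in coding_exon_lengths[:-1]:  # no intron after the last exon
        coding_bases += length
        phases.append(coding_bases % 3)
    return phases


# Hypothetical four-exon gene (coding lengths only, not EPCR's real values).
print(intron_phases([70, 252, 279, 60]))
print(obeys_gt_ag("gtaagtttcag"))
```

With these made-up lengths all three introns fall in phase 1, the situation the abstract describes for introns that flank and separate exons II and III.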
Topics: Amino Acid Sequence; Antigens, CD1; Base Sequence; Blood Coagulation Factors; Chromosome Mapping; Chromosomes, Human, Pair 20; Cloning, Molecular; Codon; DNA, Complementary; Exons; Genes; Genes, MHC Class I; HLA-A2 Antigen; Humans; Introns; Molecular Sequence Data; Receptors, Cell Surface; Regulatory Sequences, Nucleic Acid; Sequence Alignment; Sequence Homology; Structure-Activity Relationship; Transcription, Genetic
PubMed: 10397730
DOI: No ID Found -
Proceedings of the National Academy of... Jan 2012
Gene duplication plays key roles in organismal evolution. Duplicate genes, if they survive, tend to diverge in regulatory and coding regions. Divergences in coding regions, especially those that can change the function of the gene, can be caused by amino acid-altering substitutions and/or alterations in exon-intron structure. Much has been learned about the mode, tempo, and consequences of nucleotide substitutions, yet relatively little is known about structural divergences. In this study, by analyzing 612 pairs of sibling paralogs from seven representative gene families and 300 pairs of one-to-one orthologs from different species, we investigated the occurrence and relative importance of structural divergences during the evolution of duplicate and nonduplicate genes. We found that structural divergences have been very prevalent in duplicate genes and, in many cases, have led to the generation of functionally distinct paralogs. Comparisons of the genomic sequences of these genes further indicated that the differences in exon-intron structure were actually accomplished by three main types of mechanisms (exon/intron gain/loss, exonization/pseudoexonization, and insertion/deletion), each of which contributed differently to structural divergence. Like nucleotide substitutions, insertion/deletion and exonization/pseudoexonization occurred more or less randomly, with the number of observable mutational events per gene pair being largely proportional to evolutionary time. Notably, however, compared with paralogs with similar evolutionary times, orthologs have accumulated significantly fewer structural changes, whereas the amounts of amino acid replacements accumulated did not show clear differences. This finding suggests that structural divergences have played a more important role during the evolution of duplicate than nonduplicate genes.
Topics: Alternative Splicing; Computational Biology; Evolution, Molecular; Exons; Frameshift Mutation; Genes, Duplicate; Genetic Structures; Introns; Multigene Family
PubMed: 22232673
DOI: 10.1073/pnas.1109047109 -
Genetics Nov 1983
Comparative Study
Mutants of Saccharomyces cerevisiae lacking glucokinase (EC 2.7.1.2) have no discernible phenotypic difference from the wild-type strain; in a hexokinaseless background, however, they are unable to grow on any sugar except galactose. Reversion studies with glucokinase mutants indicate that the yeast S. cerevisiae has no other enzyme for phosphorylating glucose except the two hexokinases, P1 and P2, and glucokinase. Spontaneous revertants of hxk1 hxk2 glk1 strains collected on glucose regain any one of these three enzymes. The majority of glucokinase revertants synthesize species of enzyme activity that are kinetically or otherwise indistinguishable from the wild-type enzyme. In a few cases the reverted enzyme is very perceptibly altered in properties with a Km for glucose two orders of magnitude higher than that of the enzyme from the wild-type parent. These recessive, noncomplementing mutants, thus, define a single structural gene GLK1 of glucokinase. Yeast diploids lacking all of the three enzymes for glucose phosphorylation fail to sporulate. Heterozygosity of either of the hexokinase genes HXK1 or HXK2, but not GLK1, restores sporulation. The location of GLK1 on chromosome III was indicated by loss of this chromosome when hexokinaseless diploids heterozygous for glk1 were selected for resistance to 2-deoxyglucose; the homologue of chromosome III carrying GLK1, the mating-type allele and other nutritional markers on this chromosome was lost. Meiotic mapping of glucokinase executed with heterozygosity of one of the hexokinases indicated that the gene GLK1 defining the structure of glucokinase protein is located on the left arm of chromosome III 24 cM to the left of his4 in the order: leu2--his4--glk1.
Only two of 206 independent glucokinase mutants are nonsense ochre, both of which map at one end of the gene. In hxk1 only one of 130 isolates is a nonsense mutation, whereas in hxk2 none has been found among 220 independent mutants. These results raise the possibility that the protein products of these genes have some other essential function.
An earlier mapping result for hxk2 has been corrected. The new location is on the left arm of chromosome VII, 17 cM distal to ade5 in the order: lys5--ade5--hxk2.
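Map distances such as the 24 cM and 17 cM reported above are typically estimated from tetrad data in yeast. The abstract does not say which estimator was used; the sketch below assumes the Perkins formula, a common choice for tetrad analysis, and the tetrad counts in the example are invented for illustration.

```python
def perkins_map_distance(pd, npd, tt):
    """Map distance in centimorgans from tetrad counts (Perkins formula).

    pd  = parental ditype tetrads
    npd = nonparental ditype tetrads
    tt  = tetratype tetrads
    """
    total = pd + npd + tt
    if total == 0:
        raise ValueError("at least one tetrad is required")
    return 100.0 * (tt + 6 * npd) / (2 * total)


# Invented counts: 60 PD, 1 NPD and 39 TT tetrads out of 100.
print(perkins_map_distance(60, 1, 39))
```

The factor of 6 on the NPD class corrects for double crossovers, so the estimate is most reliable for moderately linked markers where NPD tetrads are rare.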
Topics: Chromosome Mapping; Genes; Genes, Fungal; Genes, Mating Type, Fungal; Glucokinase; Hexokinase; Mutation; Saccharomyces cerevisiae
PubMed: 6357942
DOI: 10.1093/genetics/105.3.501 -
PloS One 2012
MYB proteins comprise a large family of plant transcription factors, members of which perform a variety of functions in plant biological processes. To date, no genome-wide characterization of this gene family has been conducted in maize (Zea mays). In the present study, we performed a comprehensive computational analysis to yield a complete overview of the R2R3-MYB gene family in maize, including the phylogeny, expression patterns, and also its structural and functional characteristics. The MYB gene structures in maize and Arabidopsis were highly conserved, indicating that they were originally compact in size. Subgroup-specific conserved motifs outside the MYB domain may reflect functional conservation. The genome distribution strongly supports the hypothesis that segmental and tandem duplication contribute to the expansion of maize MYB genes. We also performed an updated and comprehensive classification of the R2R3-MYB gene families in maize and other plant species. The result revealed that the functions were conserved between maize MYB genes and their putative orthologs, demonstrating the origin and evolutionary diversification of plant MYB genes. Species-specific groups/subgroups may evolve or be lost during evolution, resulting in functional divergence. Expression profile study indicated that maize R2R3-MYB genes exhibit a variety of expression patterns, suggesting diverse functions. Furthermore, computational prediction of potential targets of maize microRNAs (miRNAs) revealed that miR159, miR319, and miR160 may be implicated in regulating maize R2R3-MYB genes, suggesting roles of these miRNAs in post-transcriptional regulation and transcription networks. Our comparative analysis of R2R3-MYB genes in maize confirms and extends the sequence and functional characteristics of this gene family, and will facilitate future functional analysis of the MYB gene family in maize.
Topics: Amino Acid Sequence; Gene Expression Profiling; Genes, Plant; Genes, myb; Molecular Sequence Data; Multigene Family; Phylogeny; Sequence Homology, Amino Acid; Transcription Factors; Zea mays
PubMed: 22719841
DOI: 10.1371/journal.pone.0037463 -
Genes May 2023
The tonoplast monosaccharide transporter (TMT) family plays essential roles in sugar transport and plant growth. However, there is limited knowledge about the evolutionary dynamics of this important gene family in important Gramineae crops and the putative function of rice TMT genes under external stresses. Here, the gene structural characteristics, chromosomal location, evolutionary relationship, and expression patterns of TMT genes were analyzed at a genome-wide scale. We identified six, three, six, six, four, six, and four TMT genes, respectively, in (Bd), (Hv), (Or), ssp. (Os), (Sb), (Si), and (Zm). All TMT proteins were divided into three clades based on the phylogenetic tree, gene structures, and protein motifs. The transcriptome data and qRT-PCR experiments suggested that members of each clade had different expression patterns in various tissues and multiple reproductive tissues. In addition, the microarray datasets of rice indicated that different rice subspecies responded differently to the same intensity of salt or heat stress. The Fst value results indicated that the TMT gene family in rice was under different selection pressures in the process of rice subspecies differentiation and later selection breeding. Our findings pave the way for further insights into the evolutionary patterns of the TMT gene family in the important Gramineae crops and provide important references for characterizing the functions of rice TMT genes.
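The Fst statistic mentioned above measures how strongly allele frequencies differ between populations. The abstract does not specify the estimator used; the sketch below assumes the textbook Wright definition for a biallelic site with equally weighted subpopulations, and the allele frequencies are invented for illustration.

```python
def wright_fst(allele_freqs):
    """Wright's Fst for one biallelic site.

    allele_freqs: frequency of one allele in each subpopulation
    (subpopulations are weighted equally, an assumption of this sketch).
    Fst = (H_T - H_S) / H_T, where H_T is the expected heterozygosity of
    the pooled population and H_S the mean within-subpopulation value.
    """
    if not allele_freqs:
        raise ValueError("at least one subpopulation is required")
    n = len(allele_freqs)
    p_bar = sum(allele_freqs) / n
    h_t = 2 * p_bar * (1 - p_bar)
    h_s = sum(2 * p * (1 - p) for p in allele_freqs) / n
    if h_t == 0:
        return 0.0  # site is monomorphic overall, no differentiation to measure
    return (h_t - h_s) / h_t


# Invented frequencies for two subpopulations, e.g. two rice subspecies.
print(wright_fst([0.9, 0.1]))
```

Values near 0 indicate similar allele frequencies across subpopulations, while values near 1 indicate strong differentiation, which is the kind of signal the abstract interprets as differing selection pressure.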
Topics: Oryza; Multigene Family; Phylogeny; Genes, Plant; Membrane Transport Proteins
PubMed: 37372320
DOI: 10.3390/genes14061140