-
Frontiers in Bioscience (Landmark... Jun 2013
Review
Breast cancer (BC) is a heterogeneous disease. The majority of breast cancer cases (about 70 percent) are considered sporadic. Familial breast cancer (about 30 percent of patients), often seen in families with a high incidence of BC, has been associated with a number of high-, moderate-, and low-penetrance susceptibility genes. Family linkage studies have identified high-penetrance genes, BRCA1, BRCA2, PTEN and TP53, that are responsible for inherited syndromes. Moreover, a combination of family-based and population-based approaches indicated that genes involved in DNA repair, such as CHEK2, ATM, BRIP1 (FANCJ), PALB2 (FANCN) and RAD51C (FANCO), are associated with moderate BC risk. Genome wide association studies (GWAS) in BC revealed a number of common low penetrance alleles associated with a slightly increased or decreased risk of BC. Currently, only high penetrance genes are used in clinical practice on a wide scale. Due to the development of next generation sequencing technologies, it is envisaged that all familial breast cancer genes will be included in the genetic test. However, additional research in clinical management of moderate and low-risk variants is needed before full implementation of multi-gene panel testing into clinical work-flows. In this review, we focus on the different components of familial breast cancer risk.
Topics: Alleles; Breast Neoplasms; Female; Genes, BRCA1; Genes, BRCA2; Genetic Predisposition to Disease; Humans; Multigene Family
PubMed: 23747889
DOI: 10.2741/4185 -
Philosophical Transactions of the Royal... Feb 2017
Review
Gene duplications and gene losses have been frequent events in the evolution of animal genomes, with the balance between these two dynamic processes contributing to major differences in gene number between species. After gene duplication, it is common for both daughter genes to accumulate sequence change at approximately equal rates. In some cases, however, the accumulation of sequence change is highly uneven with one copy radically diverging from its paralogue. Such 'asymmetric evolution' seems commoner after tandem gene duplication than after whole-genome duplication, and can generate substantially novel genes. We describe examples of asymmetric evolution in duplicated homeobox genes of moths, molluscs and mammals, in each case generating new homeobox genes that were recruited to novel developmental roles. The prevalence of asymmetric divergence of gene duplicates has been underappreciated, in part, because the origin of highly divergent genes can be difficult to resolve using standard phylogenetic methods. This article is part of the themed issue 'Evo-devo in the genomics era, and the origins of morphological diversity'.
Topics: Animals; Biological Evolution; Evolution, Molecular; Genes, Duplicate; Genes, Homeobox; Growth and Development
PubMed: 27994121
DOI: 10.1098/rstb.2015.0480 -
The Plant Journal : For Cell and... Feb 2018
Pseudogenes have a reputation of being 'evolutionary relics' or 'junk DNA'. While they are well characterized in mammals, studies in more complex plant genomes have so far been hampered by the absence of reference genome sequences. Barley is one of the economically most important cereals and has a genome size of 5.1 Gb. With the first high-quality genome reference assembly available for a Triticeae crop, we conducted a whole-genome assessment of pseudogenes on the barley genome. We identified, characterized and classified 89 440 gene fragments and pseudogenes scattered along the chromosomes, with occasional hotspots and higher densities at the chromosome ends. Full-length pseudogenes (11 015) have preferentially retained their exon-intron structure. Retrotransposition of processed mRNAs only plays a marginal role in their creation. However, the distribution of retroposed pseudogenes reflects the Rabl configuration of barley chromosomes and thus hints at founding mechanisms. While parent genes related to the defense-response were found to be under-represented in cultivated barley, we detected several defense-related pseudogenes in wild barley accessions. The percentage of transcriptionally active pseudogenes is 7.2%, and these may potentially adopt new regulatory roles. The barley genome is rich in pseudogenes and small gene fragments mainly located towards chromosome tips or as tandemly repeated units. Our results indicate non-random duplication and pseudogenization preferences and improve our understanding of the dynamics of gene birth and death in large plant genomes and the mechanisms that lead to evolutionary innovations.
Topics: Chromosome Mapping; Chromosomes, Plant; Gene Duplication; Genes, Plant; Hordeum; Multigene Family; Pseudogenes; Selection, Genetic; Synteny
PubMed: 29205595
DOI: 10.1111/tpj.13794 -
Nucleic Acids Research Feb 1987
Comparative Study
A simple, effective measure of synonymous codon usage bias, the Codon Adaptation Index, is detailed. The index uses a reference set of highly expressed genes from a species to assess the relative merits of each codon, and a score for a gene is calculated from the frequency of use of all codons in that gene. The index assesses the extent to which selection has been effective in moulding the pattern of codon usage. In that respect it is useful for predicting the level of expression of a gene, for assessing the adaptation of viral genes to their hosts, and for making comparisons of codon usage in different organisms. The index may also give an approximate indication of the likely success of heterologous gene expression.
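As an illustration of the calculation described above, the following is a minimal sketch of a codon adaptation score computed as the geometric mean of per-codon weights, where each weight is a codon's count in the reference set divided by the count of the most-used synonymous codon. The three-family `SYNONYMS` table, the function names, and the 0.5 pseudocount for unobserved codons are assumptions made for this example and are not taken from the abstract.

```python
import math
from collections import Counter

# Hypothetical subset of synonymous codon families, for illustration only.
SYNONYMS = {
    "Lys": ["AAA", "AAG"],
    "Glu": ["GAA", "GAG"],
    "Phe": ["TTT", "TTC"],
}

def codons(seq):
    """Split a coding sequence into complete triplets."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]

def relative_adaptiveness(reference_genes):
    """Weight of each codon relative to the most-used codon in its family."""
    counts = Counter(c for gene in reference_genes for c in codons(gene))
    weights = {}
    for family in SYNONYMS.values():
        if all(counts[c] == 0 for c in family):
            continue  # family never seen in the reference set: no weight
        # Assumption: unobserved codons get a pseudocount to avoid log(0).
        adjusted = {c: counts[c] or 0.5 for c in family}
        top = max(adjusted.values())
        for c in family:
            weights[c] = adjusted[c] / top
    return weights

def cai(gene, weights):
    """Geometric mean of the weights of the scored codons in a gene."""
    logs = [math.log(weights[c]) for c in codons(gene) if c in weights]
    return math.exp(sum(logs) / len(logs)) if logs else float("nan")

# Toy reference set: AAA is used three times, AAG once.
w = relative_adaptiveness(["AAAAAAAAAAAG"])
print(round(cai("AAAAAG", w), 3))  # 0.577
```

A gene built only from the preferred codon scores 1.0 in this sketch, and the score falls as less-used synonymous codons appear.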
Topics: Animals; Base Sequence; Biological Evolution; Cattle; Codon; Escherichia coli; Genes; Genes, Bacterial; Genes, Fungal; Humans; Mathematics; Models, Genetic; RNA, Messenger; Saccharomyces cerevisiae
PubMed: 3547335
DOI: 10.1093/nar/15.3.1281 -
Nucleic Acids Research Jan 2002
PlantCARE is a database of plant cis-acting regulatory elements, enhancers and repressors. Regulatory elements are represented by positional matrices, consensus sequences and individual sites on particular promoter sequences. Links to the EMBL, TRANSFAC and MEDLINE databases are provided when available. Data about the transcription sites are extracted mainly from the literature, supplemented with an increasing number of in silico predicted data. Apart from a general description for specific transcription factor sites, levels of confidence for the experimental evidence, functional information and the position on the promoter are given as well. New features have been implemented to search for plant cis-acting regulatory elements in a query sequence. Furthermore, links are now provided to a new clustering and motif search method to investigate clusters of co-expressed genes. New regulatory elements can be sent automatically and will be added to the database after curation. The PlantCARE relational database is available via the World Wide Web at http://sphinx.rug.ac.be:8080/PlantCARE/.
Topics: Consensus Sequence; Databases, Nucleic Acid; Enhancer Elements, Genetic; Gene Expression Regulation, Plant; Genes, Plant; Genome, Plant; Information Storage and Retrieval; Internet; Multigene Family; Promoter Regions, Genetic; Regulatory Sequences, Nucleic Acid; Systems Integration; Transcription, Genetic
PubMed: 11752327
DOI: 10.1093/nar/30.1.325 -
Microbiology Spectrum Jul 2018
Review
Genetic coding in bacteria largely operates via the "one gene-one protein" paradigm. However, the peculiarities of the mRNA structure, the versatility of the genetic code, and the dynamic nature of translation sometimes allow organisms to deviate from the standard rules of protein encoding. Bacteria can use several unorthodox modes of translation to express more than one protein from a single mRNA cistron. One such alternative path is the use of additional translation initiation sites within the gene. Proteins whose translation is initiated at different start sites within the same reading frame will differ in their N termini but will have identical C-terminal segments. On the other hand, alternative initiation of translation in a register different from the frame dictated by the primary start codon will yield a protein whose sequence is entirely different from the one encoded in the main frame. The use of internal mRNA codons as translation start sites is controlled by the nucleotide sequence and the mRNA folding. The proteins of the alternative proteome generated via the "genes-within-genes" strategy may carry important functions. In this review, we summarize the currently known examples of bacterial genes encoding more than one protein due to the utilization of additional translation start sites and discuss the known or proposed functions of the alternative polypeptides in relation to the main protein product of the gene. We also discuss recent proteome- and genome-wide approaches that will allow the discovery of novel translation initiation sites in a systematic fashion.
Topics: Bacteria; Bacterial Proteins; Codon, Initiator; Gene Expression Regulation, Bacterial; Genes, Overlapping; Genetic Code; Genome, Bacterial; Open Reading Frames; Peptide Chain Initiation, Translational
PubMed: 30003865
DOI: 10.1128/microbiolspec.RWR-0020-2018 -
Genomics Sep 2015
Review
Differential gene expression is the basis for cell type diversity in multicellular organisms and the driving force of development and differentiation. It is achieved by cell type-specific transcriptional enhancers, which are genomic DNA sequences that activate the transcription of their target genes. Their identification and characterization is fundamental to our understanding of gene regulation. Features that are associated with enhancer activity, such as regulatory factor binding or histone modifications can predict the location of enhancers. Nonetheless, enhancer activity can only be assessed by transcriptional reporter assays. Over the past years massively parallel reporter assays have been developed for large scale testing of enhancers. In this review we focus on the principles and applications of STARR-seq, a functional assay that quantifies enhancer strengths in complex candidate libraries and thus allows activity-based enhancer identification in entire genomes. We explain how STARR-seq works, discuss current uses and give an outlook to future applications.
Topics: Chromosome Mapping; Enhancer Elements, Genetic; Gene Expression Regulation; Genes, Reporter; Genome, Human; High-Throughput Nucleotide Sequencing; Humans; Promoter Regions, Genetic; Sequence Analysis, DNA
PubMed: 26072434
DOI: 10.1016/j.ygeno.2015.06.001 -
BMC Genomics May 2020
BACKGROUND
The evolutionary radiation of animals was accompanied by extensive expansion of gene and genome sizes, increased isoform diversity, and complexity of regulation.
RESULTS
Here we show that the longest genes are enriched for expression in neuronal tissues of diverse vertebrates and of invertebrates. Additionally, we show that neuronal gene size expansion occurred predominantly through net gains in intron size, with a positional bias toward the 5' end of each gene.
CONCLUSIONS
We find that intron and gene size expansion is a feature of many genes whose expression is enriched in nervous systems. We speculate that unique attributes of neurons may subject neuronal genes to evolutionary forces favoring net size expansion. This process could be associated with tissue-specific constraints on gene function and/or the evolution of increasingly complex gene regulation in nervous systems.
Topics: Animals; Evolution, Molecular; Gene Expression Regulation; Genes; Genome; Introns; Mutation; Nervous System; Organ Specificity; Phylogeny
PubMed: 32410625
DOI: 10.1186/s12864-020-6760-4 -
BMC Bioinformatics Oct 2011
BACKGROUND
Comparison of complete genomes of Bacteria and Archaea shows that gene content varies considerably and that genomes evolve quite rapidly via gene duplication and deletion and horizontal gene transfer. We analyze a diverse set of 92 Bacteria and 79 Archaea in order to investigate the processes governing the origin and evolution of families of related genes within genomes.
RESULTS
Genes were clustered into related groups using similarity criteria derived from BLAST. Most clusters contained genes from only one or a small number of genomes, and relatively few core clusters were found that spanned all genomes. Gene clusters found in larger numbers of genomes tended to have larger numbers of genes per genome; however, clusters with unusually large numbers of genes per genome were found among both narrowly and widely distributed clusters. Larger genomes were found to have larger mean gene family sizes and a greater proportion of families of very large size. We used a model of birth, death, and innovation to predict the distribution of gene family sizes. The key parameter is r, the ratio of duplications to deletions. It was found that the model can give a good fit to the observed distribution only if there are several classes of genes with different values of r. The preferred model in most cases had three classes of genes.
CONCLUSIONS
There appears to be a rapid rate of origination of new gene families within individual genomes. Most of these gene families are deleted before they spread to large numbers of genomes, which suggests that they may not be generally beneficial to the organisms. The family size distribution is best described by a large fraction of families that tend to have only one or two genes and a small fraction of families of multi-copy genes that are highly prone to duplication. Larger families occur more frequently in larger genomes, indicating higher r in these genomes, possibly due to a greater tolerance for non-beneficial gene duplicates. The smallest genomes contain very few multi-copy families, suggesting a high rate of deletion of all but the most beneficial genes in these genomes.
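The role of r can be illustrated with a minimal sketch. It assumes the simplest linear form of a birth-death model, in which a family of size n duplicates at a rate proportional to n and loses genes at a rate proportional to n, so that balancing the two gives stationary probabilities proportional to r^(n-1)/n. This specific form, the truncation at `max_size`, and the three class fractions and r values below are hypothetical choices for illustration; the abstract does not state the exact model or the fitted parameters.

```python
def family_size_distribution(r, max_size):
    """Stationary probabilities for family sizes 1..max_size.

    Assumes a linear birth-death model: p_n is proportional to r**(n-1) / n,
    where r is the ratio of duplications to deletions.
    """
    weights = [r ** (n - 1) / n for n in range(1, max_size + 1)]
    total = sum(weights)
    return [x / total for x in weights]

def mixture_distribution(classes, max_size):
    """Mix several gene classes, each given as a (fraction, r) pair."""
    mixed = [0.0] * max_size
    for fraction, r in classes:
        for i, p in enumerate(family_size_distribution(r, max_size)):
            mixed[i] += fraction * p
    return mixed

# Hypothetical three-class example: most families rarely duplicate,
# while a small fraction is highly prone to duplication.
classes = [(0.80, 0.2), (0.15, 0.7), (0.05, 0.95)]
dist = mixture_distribution(classes, max_size=50)
for size in (1, 2, 5, 20):
    print(size, dist[size - 1])
```

In this sketch a single class with small r concentrates almost all families at one or two genes, while adding a class with r close to 1 produces the heavier tail of large multi-copy families.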
Topics: Cluster Analysis; Evolution, Molecular; Gene Duplication; Genes, Archaeal; Genes, Bacterial; Genome Size; Genome, Archaeal; Genome, Bacterial; Genomics; Multigene Family
PubMed: 22151831
DOI: 10.1186/1471-2105-12-S9-S14 -
Aging Jan 2018
Topics: Genes, rRNA; Methylation; Promoter Regions, Genetic; RNA, Ribosomal
PubMed: 29365326
DOI: 10.18632/aging.101369