- BMC Medical Genetics Feb 2011
BACKGROUND
Common single-nucleotide polymorphisms (SNPs) in ten chromosomal loci have been shown to predispose to colorectal cancer (CRC) in genome-wide association studies. A plausible biological mechanism of CRC susceptibility associated with genetic variation has so far only been proposed for three loci, each pointing to variants that affect gene expression through distant regulatory elements. In this study, we aimed to gain insight into the molecular basis of seven low-penetrance CRC loci tagged by rs4779584 at 15q13, rs10795668 at 10p14, rs3802842 at 11q23, rs4444235 at 14q22, rs9929218 at 16q22, rs10411210 at 19q13, and rs961253 at 20p12.
METHODS
Possible somatic gain of the risk allele or loss of the protective allele was studied by analyzing allelic imbalance in tumour and corresponding normal tissue samples of heterozygous patients. Functional variants were searched for in in silico predicted enhancer elements located inside the CRC-associated linkage-disequilibrium regions.
RESULTS
No allelic imbalance targeting the SNPs was observed at any of the seven loci. Altogether, 12 SNPs that were predicted to disrupt potential transcription factor binding sequences were genotyped in the same population-based case-control series in which the seven tagging SNPs were originally genotyped. None showed association with CRC.
CONCLUSIONS
The results of the allelic imbalance analysis suggest that the seven CRC risk variants are not somatically selected for in neoplastic progression. The bioinformatic approach was unable to pinpoint cancer-causing variants at any of the seven loci. While it is possible that many of the predisposition loci for CRC are involved in control of gene expression by targeting transcription factor binding sites, other possibilities, such as regulatory RNAs, should also be considered.
Topics: Allelic Imbalance; Carcinoma; Case-Control Studies; Cell Transformation, Neoplastic; Colorectal Neoplasms; Computational Biology; Enhancer Elements, Genetic; Genetic Loci; Genetic Predisposition to Disease; Humans; Linkage Disequilibrium; Penetrance; Polymorphism, Single Nucleotide; Transcription Factors
PubMed: 21314996
DOI: 10.1186/1471-2350-12-23
- PloS One 2019
Comparative Study
BACKGROUND
Of the 108 schizophrenia (SZ) risk loci discovered through genome-wide association studies (GWAS), 96 do not alter the sequence of any protein. Evidence linking non-coding risk-SNPs and genes may be established using expression quantitative trait loci (eQTL). However, other approaches, such as allelic expression quantitative trait loci (aeQTL), may also be of use.
METHODS
We applied both the eQTL and aeQTL analysis to a biobank of deeply sequenced RNA from 680 dorso-lateral pre-frontal cortex (DLPFC) samples. For each of 340 genes proximal to the SZ risk-SNPs, we asked how much SNP-genotype affected total expression (eQTL), as well as how much the expression ratio between the two alleles differed from 1:1 as a consequence of the risk-SNP genotype (aeQTL).
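The allelic question posed above (does the expression ratio between two alleles differ from 1:1?) can be illustrated with a minimal sketch. This is not the study's aeQTL model; it is an exact two-sided binomial test on hypothetical read counts for a single heterozygous site, and the function names and example counts are assumptions made for illustration.

```python
from math import comb


def allelic_ratio(ref_reads: int, alt_reads: int) -> float:
    """Fraction of reads carrying the reference allele."""
    return ref_reads / (ref_reads + alt_reads)


def binomial_two_sided_p(ref_reads: int, alt_reads: int, p: float = 0.5) -> float:
    """Exact two-sided binomial p-value for H0: P(ref read) = p (1:1 when p = 0.5)."""
    n = ref_reads + alt_reads
    # Probability of every possible reference-read count under H0.
    pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
    observed = pmf[ref_reads]
    # Sum the outcomes that are no more likely than the observed one.
    # The small tolerance guards against floating-point ties.
    total = sum(q for q in pmf if q <= observed * (1 + 1e-9))
    return min(1.0, total)


# Hypothetical counts (assumed, not from the study): 70 reference vs 30 alternate reads.
ratio = allelic_ratio(70, 30)
p_value = binomial_two_sided_p(70, 30)
print(f"ref fraction = {ratio:.2f}, p = {p_value:.2e}")
```

A plain binomial test ignores technical overdispersion in RNA-seq counts, so in practice it tends to be anti-conservative; it is shown here only to make the 1:1 null hypothesis concrete.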
RESULTS
We analyzed overlap with comparable eQTL-findings: 16 of the 30 risk-SNPs known to have gene-level eQTL also had gene-level aeQTL effects. 6 of 21 risk-SNPs with known splice-eQTL had exon-aeQTL effects. 12 novel potential risk genes were identified with the aeQTL approach, while 55 tested SNP-pairs were found as eQTL but not aeQTL. Of the tested 108 loci we could find at least one gene to be associated with 21 of the risk-SNPs using gene-level aeQTL, and with an additional 18 risk-SNPs using exon-level aeQTL.
CONCLUSION
Our results suggest that the aeQTL strategy complements the eQTL approach to susceptibility gene identification.
Topics: Adolescent; Adult; Aged; Aged, 80 and over; Alleles; Allelic Imbalance; Brain; Child; Child, Preschool; Female; Genetic Predisposition to Disease; Genome-Wide Association Study; Genotype; Humans; Infant; Infant, Newborn; Male; Middle Aged; Polymorphism, Single Nucleotide; Quantitative Trait Loci; RNA-Seq; Schizophrenia; Exome Sequencing; Young Adult
PubMed: 31206532
DOI: 10.1371/journal.pone.0217765
- Mathematical Biosciences and... Oct 2019
Epigenetics is the study of heritable changes in gene expression or cellular phenotype caused by mechanisms other than changes in the underlying DNA sequence. Genomic imprinting is an epigenetically regulated process by which imprinted genes are expressed in a parent-of-origin-specific manner. It can be confounded with a phenomenon, allelic expression imbalance (AEI), which, in this paper, refers to asymmetric expression of the two alleles of a heterozygous subject at a single nucleotide polymorphism not caused by imprinting (non-imprinting AEI). Since existing methods in the literature are not amenable to distinguishing imprinting from non-imprinting AEI for data without replicates, we propose AIJ, a joint test for simultaneous detection of imprinting and non-imprinting AEI that accounts for potential confounding using RNA-seq data based on a reciprocal cross design. Through a simulation study, we show that AIJ is more powerful than two frequently used methods that do not account for confounding. To illustrate the practical utility of AIJ, we applied the method to a mouse dataset and identified genes with the imprinting effect and/or non-imprinting AEI phenomenon, with some already confirmed in an existing database. The results are also largely consistent with a study on human data for a set of orthologous genes, affirming an earlier conclusion in the literature that non-imprinting AEI events are evolutionarily conserved.
Topics: Allelic Imbalance; Animals; Computer Simulation; Epigenesis, Genetic; Female; Gene Expression Regulation; Genomic Imprinting; Genomics; Heterozygote; Humans; Male; Mice; Mice, Inbred C57BL; Phenotype; Polymorphism, Single Nucleotide; RNA-Seq; Species Specificity
PubMed: 31731356
DOI: 10.3934/mbe.2020020
- Laboratory Investigation; a Journal of... Oct 2001
Reactive oxygen species produced by aerobic cellular metabolism or through exposure to environmental carcinogens can cause oxidative DNA damage by generating DNA base lesions and strand breakage. Prime among these base lesions is the conversion of guanine to 8-oxoguanine. Among 20 or so oxidative DNA base lesions, 8-oxoguanine is the most abundant and is critical in terms of mutagenesis because it is capable of mispairing with adenine, which, if not sufficiently repaired, may lead to G:C to T:A transversion upon DNA replication. The gene encoding human 8-oxoguanine DNA glycosylase 1 (hOGG1), capable of excision repair of 8-oxoguanine, has been recently cloned, characterized, and mapped to the short arm of chromosome 3 (3p25-26), a region showing frequent loss of heterozygosity (LOH) in head and neck squamous cell carcinoma (HNSCC). In the present study, we developed a tissue microdissection approach designed for use with formalin-fixed, paraffin-embedded specimens, which is capable of detecting and characterizing hOGG1 allelic loss using two highly informative, intragenic single nucleotide polymorphisms. Among 45 cases of HNSCC, 18 cases were informative. We analyzed these 18 cases and found that 11 showed evidence of hOGG1 allelic loss. By immunohistochemical staining on a total of 71 HNSCC cases using a commercially available anti-hOGG1 antibody, we showed that hOGG1 gene expression was markedly suppressed in up to 38% of the cases. The frequent allelic imbalance and suppression of the hOGG1 gene thus imply that repair of oxidative DNA damage may be relevant in future studies on head and neck squamous carcinogenesis.
Topics: Carcinoma, Squamous Cell; DNA Repair; DNA-Formamidopyrimidine Glycosylase; Gene Frequency; Head and Neck Neoplasms; Humans; Loss of Heterozygosity; N-Glycosyl Hydrolases
PubMed: 11598155
DOI: 10.1038/labinvest.3780356
- Nature Communications Jun 2021
A sensitive approach to quantitative analysis of transcriptional regulation in diploid organisms is analysis of allelic imbalance (AI) in RNA sequencing (RNA-seq) data. A near-universal practice in such studies is to prepare and sequence only one library per RNA sample. We present theoretical and experimental evidence that data from a single RNA-seq library is insufficient for reliable quantification of the contribution of technical noise to the observed AI signal; consequently, reliance on one-replicate experimental design can lead to unaccounted-for variation in error rates in allele-specific analysis. We develop a computational approach, Qllelic, that accurately accounts for technical noise by making use of replicate RNA-seq libraries. Testing on new and existing datasets shows that application of Qllelic greatly decreases false positive rate in allele-specific analysis while conserving appropriate signal, and thus greatly improves reproducibility of AI estimates. We explore sources of technical overdispersion in observed AI signal and conclude by discussing design of RNA-seq studies addressing two biologically important questions: quantification of transcriptome-wide AI in one sample, and differential analysis of allele-specific expression between samples.
Topics: Algorithms; Alleles; Allelic Imbalance; Animals; Female; Gene Library; Mice, 129 Strain; Models, Genetic; Polymorphism, Single Nucleotide; RNA; Sequence Analysis, RNA; Transcriptome; Mice
PubMed: 34099647
DOI: 10.1038/s41467-021-23544-8
- EBioMedicine Apr 2019
BACKGROUND
Genomic investigation of atypical adenomatous hyperplasia (AAH), the only known precursor lesion to lung adenocarcinomas (LUAD), presents challenges due to the low mutant cell fractions. This necessitates sensitive methods for detection of chromosomal aberrations to better study the role of critical alterations in early lung cancer pathogenesis and the progression from AAH to LUAD.
METHODS
We applied a sensitive haplotype-based statistical technique to detect chromosomal alterations leading to allelic imbalance (AI) from genotype array profiling of 48 matched normal lung parenchyma, AAH and tumor tissues from 16 stage-I LUAD patients. To gain insights into shared developmental trajectories among tissues, we performed phylogenetic analyses and integrated our results with point mutation data, highlighting significantly-mutated driver genes in LUAD pathogenesis.
FINDINGS
AI was detected in nine AAHs (56%). Six cases exhibited recurrent loss of 17p. AI and the enrichment of 17p events were predominantly identified in patients with smoking history. Among the nine AAH tissues with detected AI, seven exhibited evidence for shared chromosomal aberrations with matched LUAD specimens, including losses harboring tumor suppressors on 17p, 8p, 9p, 9q, 19p, and gains encompassing oncogenes on 8q, 12p and 1q.
INTERPRETATION
Chromosomal aberrations, particularly 17p loss, appear to play critical roles early in AAH pathogenesis. Genomic instability in AAH, as well as truncal chromosomal aberrations shared with LUAD, provide evidence for mutation accumulation and are suggestive of a cancerized field contributing to the clonal selection and expansion of these premalignant lesions.
FUND
Supported in part by Cancer Prevention and Research Institute of Texas (CPRIT) grant RP150079 (PS and HK), NIH grant R01HG005859 (PS) and The University of Texas MD Anderson Cancer Center Core Support Grant.
Topics: Adenocarcinoma; Adult; Aged; Aged, 80 and over; Alleles; Allelic Imbalance; Cell Transformation, Neoplastic; Chromosomal Instability; Disease Progression; Female; Genetic Heterogeneity; Genome-Wide Association Study; Haplotypes; Humans; Hyperplasia; Lung; Lung Neoplasms; Male; Middle Aged; Models, Statistical; Mutation; Neoplasm Staging; Phylogeny; Polymorphism, Single Nucleotide; Precancerous Conditions; Young Adult
PubMed: 30905849
DOI: 10.1016/j.ebiom.2019.03.020
- G3 (Bethesda, Md.) May 2021
Allelic imbalance (AI) occurs when alleles in a diploid individual are differentially expressed and indicates cis-acting regulatory variation. What is the distribution of allelic effects in a natural population? Are all alleles the same? Are all alleles distinct? The approach described applies to any technology generating allele-specific sequence counts, for example for chromatin accessibility, and can be applied generally, including to comparisons between tissues or environments for the same genotype. Tests of allelic effect are generally performed by crossing individuals and comparing expression between alleles directly in the F1. However, a crossing scheme that compares alleles pairwise is a prohibitive cost for more than a handful of alleles, as the number of crosses is at least (n^2-n)/2, where n is the number of alleles. We show here that a testcross design followed by a hypothesis test of AI between testcrosses can be used to infer differences between nontester alleles, allowing n alleles to be compared with n crosses. Using a mouse data set where both testcrosses and direct comparisons have been performed, we show that the predicted differences between nontester alleles are validated at levels of over 90% when a parent-of-origin effect is present and of 60%-80% overall. Power considerations for a testcross are similar to those in a reciprocal cross. In all applications, the testing for AI involves several complex bioinformatics steps. BayesASE is a complete bioinformatics pipeline that incorporates state-of-the-art error reduction techniques and a flexible Bayesian approach to estimating AI and formally comparing levels of AI between conditions. The modular structure of BayesASE has been packaged in Galaxy and made available in Nextflow and as a collection of scripts for the SLURM workload manager on GitHub (https://github.com/McIntyre-Lab/BayesASE).
Topics: Alleles; Allelic Imbalance; Bayes Theorem; Genotype; Polymorphism, Single Nucleotide
PubMed: 33772539
DOI: 10.1093/g3journal/jkab096
- PLoS Genetics Oct 2021
Chromatin accessibility and gene expression in relevant cell contexts can guide identification of regulatory elements and mechanisms at genome-wide association study (GWAS) loci. To identify regulatory elements that display differential activity across adipocyte differentiation, we performed ATAC-seq and RNA-seq in a human cell model of preadipocytes and adipocytes at days 4 and 14 of differentiation. For comparison, we created a consensus map of ATAC-seq peaks in 11 human subcutaneous adipose tissue samples. We identified 58,387 context-dependent chromatin accessibility peaks and 3,090 context-dependent genes between all timepoint comparisons (log2 fold change>1, FDR<5%) with 15,919 adipocyte- and 18,244 preadipocyte-dependent peaks. Adipocyte-dependent peaks showed increased overlap (60.1%) with Roadmap Epigenomics adipocyte nuclei enhancers compared to preadipocyte-dependent peaks (11.5%). We linked context-dependent peaks to genes based on adipocyte promoter capture Hi-C data, overlap with adipose eQTL variants, and context-dependent gene expression. Of 16,167 context-dependent peaks linked to a gene, 5,145 were linked by two or more strategies to 1,670 genes. Among GWAS loci for cardiometabolic traits, adipocyte-dependent peaks, but not preadipocyte-dependent peaks, showed significant enrichment (LD score regression P<0.005) for waist-to-hip ratio and modest enrichment (P < 0.05) for HDL-cholesterol. We identified 659 peaks linked to 503 genes by two or more approaches and overlapping a GWAS signal, suggesting a regulatory mechanism at these loci. To identify variants that may alter chromatin accessibility between timepoints, we identified 582 variants in 454 context-dependent peaks that demonstrated allelic imbalance in accessibility (FDR<5%), of which 55 peaks also overlapped GWAS variants. 
At one GWAS locus for palmitoleic acid, rs603424 was located in an adipocyte-dependent peak linked to SCD and exhibited allelic differences in transcriptional activity in adipocytes (P = 0.003) but not preadipocytes (P = 0.09). These results demonstrate that context-dependent peaks and genes can guide discovery of regulatory variants at GWAS loci and aid identification of regulatory mechanisms.
Topics: Adipocytes; Adipose Tissue; Alleles; Allelic Imbalance; Binding Sites; Cardiovascular Diseases; Cell Differentiation; Chromatin; Chromatin Immunoprecipitation Sequencing; Epigenomics; Gene Expression; Genetic Techniques; Genome-Wide Association Study; Humans; Metabolic Diseases; Promoter Regions, Genetic; Quantitative Trait Loci; Regulatory Sequences, Nucleic Acid
PubMed: 34699533
DOI: 10.1371/journal.pgen.1009865
- Oncotarget Jan 2014
Topics: Animals; Chromosome Deletion; Haploinsufficiency; Humans; Leukemia, Myeloid; Myelodysplastic Syndromes; Tumor Suppressor Proteins
PubMed: 24473927
DOI: 10.18632/oncotarget.1763
- PloS One Jun 2010
BACKGROUND
Genotyping platforms such as single nucleotide polymorphism (SNP) arrays are powerful tools to study genomic aberrations in cancer samples. Allele-specific information from SNP arrays provides valuable information for interpreting copy number variation (CNV) and allelic imbalance, including loss-of-heterozygosity (LOH), beyond that obtained from the total DNA signal available from array comparative genomic hybridization (aCGH) platforms. Several algorithms based on hidden Markov models (HMMs) have been designed to detect copy number changes and copy-neutral LOH, making use of the allele information on SNP arrays. However, heterogeneity in clinical samples, due to stromal contamination and somatic alterations, complicates analysis and interpretation of these data.
METHODS
We have developed MixHMM, a novel hidden Markov model using hidden states based on chromosomal structural aberrations. MixHMM allows CNV detection for copy numbers up to 7 and allows more complete and accurate description of other forms of allelic imbalance, such as increased copy number LOH or imbalanced amplifications. MixHMM also incorporates a novel sample mixing model that allows detection of tumor CNV events in heterogeneous tumor samples, where cancer cells are mixed with a proportion of stromal cells.
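To make the idea of a sample mixing model concrete, here is a minimal sketch of how stromal contamination dilutes the expected signal at a heterozygous SNP. It is not the MixHMM model itself, whose details are not given in the abstract. It assumes a simple linear mixture of tumor and normal (stromal) cells, that the stromal cells are diploid and heterozygous (one A and one B allele), and the function name and example values are hypothetical.

```python
def mixed_signal(tumor_fraction: float, tumor_total: int, tumor_b: int,
                 normal_total: int = 2, normal_b: int = 1) -> tuple[float, float]:
    """Expected average copy number and B-allele frequency in a mixed sample.

    tumor_fraction: proportion of cells in the sample that are tumor cells.
    tumor_total / tumor_b: total copies and B-allele copies in tumor cells.
    normal_total / normal_b: the same for stromal cells (assumed diploid, heterozygous).
    """
    if not 0.0 <= tumor_fraction <= 1.0:
        raise ValueError("tumor_fraction must be between 0 and 1")
    # Linear mixture of the two cell populations (an assumption of this sketch).
    total = tumor_fraction * tumor_total + (1 - tumor_fraction) * normal_total
    b_copies = tumor_fraction * tumor_b + (1 - tumor_fraction) * normal_b
    baf = b_copies / total if total > 0 else float("nan")
    return total, baf


# Hypothetical case: 80% stromal cells (tumor fraction 0.2) and a hemizygous
# deletion in the tumor that removed the B allele (1 total copy, 0 B copies).
total, baf = mixed_signal(0.2, tumor_total=1, tumor_b=0)
print(f"average copy number = {total:.2f}, expected BAF = {baf:.3f}")
```

Under these assumptions, high stromal content pulls both values toward the normal diploid expectation (copy number 2, BAF 0.5), which is why a detection method has to model the mixing proportion explicitly.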
CONCLUSIONS
We validate MixHMM and demonstrate its advantages with simulated samples, clinical tumor samples and a dilution series of mixed samples. We have shown that the CNVs of cancer cells in a tumor sample contaminated with up to 80% of stromal cells can be detected accurately using Illumina BeadChip and MixHMM.
AVAILABILITY
MixHMM is available as a Python package, provided with some other useful tools, at http://genecube.med.yale.edu:8080/MixHMM.
Topics: Gene Dosage; Humans; Loss of Heterozygosity; Markov Chains; Models, Genetic; Neoplasms; Polymorphism, Single Nucleotide; Stromal Cells
PubMed: 20532221
DOI: 10.1371/journal.pone.0010909