-
BMC Genetics Apr 2016
BACKGROUND
Multiple sclerosis is a chronic inflammatory, demyelinating disease of the central nervous system. Recent genome-wide studies have revealed more than 110 single nucleotide polymorphisms as associated with susceptibility to multiple sclerosis, but their functional contribution to disease development is mostly unknown.
RESULTS
Consistent allelic imbalance was observed for rs907091 in IKZF3 and rs11609 in IQGAP1, which are in strong linkage disequilibrium with the multiple sclerosis associated single nucleotide polymorphisms rs12946510 and rs8042861, respectively. Using multiple sclerosis patients and healthy controls heterozygous for rs907091 and rs11609, we showed that the multiple sclerosis risk alleles at IKZF3 and IQGAP1 are expressed at higher levels as compared to the protective allele. Furthermore, individuals homozygous for the multiple sclerosis risk allele at IQGAP1 had a significantly higher total expression of IQGAP1 compared to individuals homozygous for the protective allele.
CONCLUSIONS
Our data indicate a possible regulatory role for the multiple sclerosis-associated IKZF3 and IQGAP1 variants. We suggest that such cis-acting mechanisms may contribute to the multiple sclerosis association of single nucleotide polymorphisms at IKZF3 and IQGAP1.
Topics: Adolescent; Adult; Aged; Aged, 80 and over; Alleles; Allelic Imbalance; Case-Control Studies; Female; Gene Expression Regulation; Genetic Predisposition to Disease; Genotyping Techniques; Humans; Ikaros Transcription Factor; Linkage Disequilibrium; Male; Middle Aged; Multiple Sclerosis; Polymorphism, Single Nucleotide; Sensitivity and Specificity; Young Adult; ras GTPase-Activating Proteins
PubMed: 27080863
DOI: 10.1186/s12863-016-0367-4 -
PLoS Genetics Mar 2021
Allelic expression imbalance (AEI), quantified by the relative expression of two alleles of a gene in a diploid organism, can help explain phenotypic variations among individuals. Traditional methods detect AEI using bulk RNA sequencing (RNA-seq) data, a data type that averages out cell-to-cell heterogeneity in gene expression across cell types. Since the patterns of AEI may vary across different cell types, it is desirable to study AEI in a cell-type-specific manner. Although this can be achieved by single-cell RNA sequencing (scRNA-seq), it requires full-length transcripts to be sequenced in single cells of a large number of individuals, and such data are still cost-prohibitive to generate. To overcome this limitation and utilize the vast amount of existing disease-relevant bulk tissue RNA-seq data, we developed BSCET, which enables the characterization of cell-type-specific AEI in bulk RNA-seq data by integrating cell type composition information inferred from a small set of scRNA-seq samples, possibly obtained from an external dataset. By modeling covariate effects, BSCET can also detect genes whose cell-type-specific AEI is associated with clinical factors. Through extensive benchmark evaluations, we show that BSCET correctly detected genes with cell-type-specific AEI and differential AEI between healthy and diseased samples using bulk RNA-seq data. BSCET also uncovered cell-type-specific AEIs that were missed in bulk data analysis when the directions of AEI are opposite in different cell types. We further applied BSCET to two pancreatic islet bulk RNA-seq datasets, and detected genes showing cell-type-specific AEI that are related to the progression of type 2 diabetes. Since bulk RNA-seq data are easily accessible, BSCET provides a convenient tool to integrate information from scRNA-seq data to gain insight into AEI with cell type resolution. Results from such analysis will advance our understanding of cell type contributions in human diseases.
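As a minimal illustration of how AEI can be quantified at a single heterozygous site, the sketch below computes the reference-allele fraction from read counts and tests it against the balanced expectation of 0.5 with an exact two-sided binomial test. This is not the BSCET model, whose details the abstract does not give; the function names and read counts are hypothetical and chosen only for illustration.

```python
from math import comb


def allelic_ratio(ref_reads: int, alt_reads: int) -> float:
    """Fraction of allele-informative reads that carry the reference allele."""
    total = ref_reads + alt_reads
    if total == 0:
        raise ValueError("no allele-informative reads at this site")
    return ref_reads / total


def binomial_two_sided_p(ref_reads: int, alt_reads: int, p: float = 0.5) -> float:
    """Exact two-sided binomial p-value for the null of balanced expression.

    Sums the probabilities of all outcomes that are no more likely than the
    observed one (the usual 'minimum likelihood' two-sided definition).
    """
    n = ref_reads + alt_reads
    pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
    observed = pmf[ref_reads]
    # Small relative tolerance guards against floating-point ties.
    return min(1.0, sum(x for x in pmf if x <= observed * (1 + 1e-7)))


if __name__ == "__main__":
    # Hypothetical counts: 30 reference reads and 10 alternative reads.
    print(allelic_ratio(30, 10))         # 0.75
    print(binomial_two_sided_p(30, 10))  # small p-value suggests imbalance
```

A bulk-level test of this kind averages over cell types, which is why opposite-direction imbalances in different cell types can cancel out and go undetected.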
Topics: Alleles; Allelic Imbalance; Biomarkers; Gene Expression Profiling; Gene Expression Regulation; High-Throughput Nucleotide Sequencing; Humans; Organ Specificity; Sequence Analysis, RNA; Single-Cell Analysis
PubMed: 33661921
DOI: 10.1371/journal.pgen.1009080 -
Nature Communications Sep 2018
Point mutations in cancer have been extensively studied but chromosomal gains and losses have been more challenging to interpret due to their unspecific nature. Here we examine high-resolution allelic imbalance (AI) landscape in 1699 colorectal cancers, 256 of which have been whole-genome sequenced (WGSed). The imbalances pinpoint 38 genes as plausible AI targets based on previous knowledge. Unbiased CRISPR-Cas9 knockout and activation screens identified in total 79 genes within AI peaks regulating cell growth. Genetic and functional data implicate loss of TP53 as a sufficient driver of AI. The WGS highlights an influence of copy number aberrations on the rate of detected somatic point mutations. Importantly, the data reveal several associations between AI target genes, suggesting a role for a network of lineage-determining transcription factors in colorectal tumorigenesis. Overall, the results unravel the contribution of AI in colorectal cancer and provide a plausible explanation why so few genes are commonly affected by point mutations in cancers.
Topics: Allelic Imbalance; CRISPR-Cas Systems; Chromosome Aberrations; Chromosomes, Human, Pair 8; Colorectal Neoplasms; DNA Copy Number Variations; Denmark; Gene Expression Profiling; Genetic Predisposition to Disease; Genomics; Genotype; Humans; Loss of Heterozygosity; Microsatellite Repeats; Phenotype; Point Mutation; Proto-Oncogene Proteins p21(ras); RNA, Small Interfering; Transcription Factors; Tumor Suppressor Protein p53; Whole Genome Sequencing
PubMed: 30202008
DOI: 10.1038/s41467-018-06132-1 -
Prostate Cancer and Prostatic Diseases Sep 2010
Four independent regions within 8q24 near the MYC gene are associated with risk for prostate cancer (Pca). Here, we investigated allelic imbalance (AI) at 8q24 risk variants and MYC gene DNA copy number (CN) in 27 primary Pcas. Heterozygotes were observed in 24 of 27 patients at one or more 8q24 markers and 27% of the loci exhibited AI in tumor DNA. The 8q24 risk alleles were preferentially favored in the tumors. Increased MYC gene CN was observed in 33% of tumors, and the co-existence of increased MYC gene CN with AI at risk loci was observed in 86% (P<0.004, exact binomial test) of the informative tumors. No AI was observed in tumors that did not show increased MYC gene CN. Higher Gleason score was associated with tumors exhibiting AI (P=0.04) and also with increased MYC gene CN (P=0.02). Our results suggest that AI at 8q24 and increased MYC gene CN may both be related to high Gleason score in Pca. Our findings also suggest that these two somatic alterations may be due to the same preferential chromosomal duplication event during prostate tumorigenesis.
Topics: Adult; Aged; Allelic Imbalance; Chromosome Aberrations; Chromosomes, Human, Pair 8; DNA Primers; Gene Dosage; Genes, myc; Humans; In Situ Hybridization, Fluorescence; Male; Middle Aged; Neoplasm Staging; Polymorphism, Single Nucleotide; Prostatic Neoplasms
PubMed: 20634801
DOI: 10.1038/pcan.2010.20 -
Bioinformatics (Oxford, England) May 2022
MOTIVATION
Allelic expression analysis aids in detection of cis-regulatory mechanisms of genetic variation, which produce allelic imbalance (AI) in heterozygotes. Measuring AI in bulk data lacking time or spatial resolution has the limitation that cell-type-specific (CTS), spatial- or time-dependent AI signals may be dampened or not detected.
RESULTS
We introduce a statistical method airpart for identifying differential CTS AI from single-cell RNA-sequencing data, or dynamics AI from other spatially or time-resolved datasets. airpart outputs discrete partitions of data, pointing to groups of genes and cells under common mechanisms of cis-genetic regulation. In order to account for low counts in single-cell data, our method uses a Generalized Fused Lasso with Binomial likelihood for partitioning groups of cells by AI signal, and a hierarchical Bayesian model for AI statistical inference. In simulation, airpart accurately detected partitions of cell types by their AI and had lower Root Mean Square Error (RMSE) of allelic ratio estimates than existing methods. In real data, airpart identified differential allelic imbalance patterns across cell states and could be used to define trends of AI signal over spatial or time axes.
AVAILABILITY AND IMPLEMENTATION
The airpart package is available as an R/Bioconductor package at https://bioconductor.org/packages/airpart.
SUPPLEMENTARY INFORMATION
Supplementary data are available at Bioinformatics online.
Topics: Alleles; Allelic Imbalance; Bayes Theorem; Computer Simulation; Models, Statistical; Software
PubMed: 35561168
DOI: 10.1093/bioinformatics/btac212 -
Genetics in Medicine : Official Journal... Mar 2022
PURPOSE
Monogenic disorders can present clinically heterogeneous symptoms. We hypothesized that in patients with a monogenic disorder caused by a large deletion, frequently additional loss-of-function (LOF)-intolerant genes are affected, potentially contributing to the phenotype.
METHODS
We investigated the LOF-intolerant gene distribution across the genome and its association with benign population and pathogenic classified deletions from individuals with presumably monogenic disorders. For people with presumably monogenic epilepsy, we compared Human Phenotype Ontology terms in people with large and small deletions.
RESULTS
We identified LOF-intolerant gene dense regions that were enriched for ClinVar and depleted for population copy number variants. Analysis of data from >143,000 individuals with a suspected monogenic disorder showed that 2.5% of haploinsufficiency disorder-associated deletions can affect at least 1 other LOF-intolerant gene. Focusing on epilepsy, we observed that 13.1% of pathogenic and likely pathogenic ClinVar deletions <3 megabase pair, covering the diagnostically most relevant genes, affected at least 1 additional LOF-intolerant gene. Those patients have potentially more complex phenotypes with increasing deletion size.
CONCLUSION
We could systematically show that large deletions frequently affected additional LOF-intolerant genes besides the established disease gene. Further research is needed to understand how additional potential disease-relevant genes influence monogenic disorders to improve clinical care and the efficacy of targeted therapies.
Topics: DNA Copy Number Variations; Genome; Haploinsufficiency; Humans; Phenotype
PubMed: 34906500
DOI: 10.1016/j.gim.2021.10.026 -
Nature Communications Sep 2020
Haplotype reconstruction of distant genetic variants remains an unsolved problem due to the short-read length of common sequencing data. Here, we introduce HapTree-X, a probabilistic framework that utilizes latent long-range information to reconstruct unspecified haplotypes in diploid and polyploid organisms. It introduces the observation that differential allele-specific expression can link genetic variants from the same physical chromosome, thus even enabling using reads that cover only individual variants. We demonstrate HapTree-X's feasibility on in-house sequenced Genome in a Bottle RNA-seq and various whole exome, genome, and 10X Genomics datasets. HapTree-X produces more complete phases (up to 25%), even in clinically important genes, and phases more variants than other methods while maintaining similar or higher accuracy and being up to 10× faster than other tools. The advantage of HapTree-X's ability to use multiple lines of evidence, as well as to phase polyploid genomes in a single integrative framework, substantially grows as the amount of diverse data increases.
Topics: Algorithms; Allelic Imbalance; Databases, Genetic; Diploidy; Haplotypes; Humans; K562 Cells; Models, Genetic; Models, Statistical; Polymorphism, Single Nucleotide; Polyploidy; RNA-Seq; Sequence Analysis, RNA
PubMed: 32938926
DOI: 10.1038/s41467-020-18320-z -
Cold Spring Harbor Molecular Case... Jun 2020
Proteus syndrome is a mosaic disorder that can cause progressive postnatal overgrowth of nearly any organ or tissue. To date, Proteus syndrome has been exclusively associated with the mosaic c.49G>A p.(Glu17Lys) pathogenic variant in AKT1, a variant that is also present in many cancers. Here we describe an individual with severe Proteus syndrome who died at 7.5 yr of age from combined parenchymal and restrictive pulmonary disease. Remarkably, this individual was found to harbor a mosaic c.49_50delinsAG p.(Glu17Arg) variant in AKT1 at a variant allele fraction that ranged from <0.01 to 0.46 in fibroblasts established from an overgrown digit. This variant was demonstrated to be constitutively activating by phosphorylation of AKT(S473). These data document allelic heterogeneity for Proteus syndrome. We recommend that individuals with a potential clinical diagnosis of Proteus syndrome who are negative for the p.(Glu17Lys) variant be tested for other variants in AKT1.
Topics: Alleles; Allelic Imbalance; Amino Acid Substitution; Cervical Vertebrae; Genetic Association Studies; Genetic Heterogeneity; Genetic Predisposition to Disease; Genetic Testing; Humans; Infant; Magnetic Resonance Imaging; Male; Medical History Taking; Mutation; Phenotype; Proteus Syndrome; Proto-Oncogene Proteins c-akt; Radiography; Symptom Assessment
PubMed: 32327430
DOI: 10.1101/mcs.a005181 -
Genetics Mar 2021
Somatic copy number alterations (SCNAs) serve as hallmarks of tumorigenesis and often result in deviations from one-to-one allelic ratios at heterozygous loci, leading to allelic imbalance (AI). The Cancer Genome Atlas (TCGA) reports SCNAs identified using a circular binary segmentation algorithm, providing segment mean copy number estimates from single-nucleotide polymorphism DNA microarray total intensities (log R ratio), but not allele-specific intensities ("B allele" frequencies) that inform of AI. Our approach provides more sensitive identification of SCNAs by modeling the "B allele" frequencies jointly, thereby bolstering the catalog of chromosomal alterations in this widely utilized resource. Here we present AI summaries for all 33 tumor sites in TCGA, including those induced by SCNAs and copy-neutral loss-of-heterozygosity (cnLOH). We identified AI in 94% of the tumors, higher than in previous reports. Recurrent events included deletions of 17p, 9q, 3p, amplifications of 8q, 1q, 7p, as well as mixed event types on 8p and 13q. We also observed both site-specific and pan-cancer (spanning 17p) cnLOH, patterns which have not been comprehensively characterized. The identification of such cnLOH events elucidates tumor suppressors and multi-hit pathways to carcinogenesis. We also contrast the landscapes inferred from AI- and total intensity-derived SCNAs and propose an automated procedure to improve and adjust SCNAs in TCGA for cases where high levels of aneuploidy obscured baseline intensity identification. Our findings support the exploration of additional methods for robust automated inference procedures and to aid empirical discoveries across TCGA.
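To illustrate why "B allele" frequencies carry information that total intensity alone misses, the sketch below computes the expected B allele frequency (BAF) and total copy number at a heterozygous locus for a tumor sample mixed with normal cells. It uses a generic textbook mixture formula, not the authors' joint model; the function names and the `purity` parameter are assumptions introduced here.

```python
def expected_total_cn(n_a: int, n_b: int, purity: float = 1.0) -> float:
    """Average copy number in a tumor/normal mixture (normal cells carry 2 copies)."""
    return purity * (n_a + n_b) + (1.0 - purity) * 2.0


def expected_baf(n_a: int, n_b: int, purity: float = 1.0) -> float:
    """Expected B allele frequency at a germline-heterozygous locus.

    n_a, n_b: tumor copies of the A and B alleles.
    purity:   fraction of tumor cells in the sample (assumed known here).
    Normal cells contribute one A copy and one B copy each.
    """
    if not 0.0 <= purity <= 1.0:
        raise ValueError("purity must lie in [0, 1]")
    b_copies = purity * n_b + (1.0 - purity) * 1.0
    return b_copies / expected_total_cn(n_a, n_b, purity)


if __name__ == "__main__":
    # Balanced heterozygous locus: BAF 0.5, total copy number 2.
    print(expected_baf(1, 1), expected_total_cn(1, 1))
    # Copy-neutral LOH at 50% purity: total copy number is still 2,
    # but BAF shifts to 0.75, so only the allele-specific signal reveals it.
    print(expected_baf(0, 2, purity=0.5), expected_total_cn(0, 2, purity=0.5))
```

In the cnLOH case the total copy number is unchanged, so a segmentation based only on the log R ratio would show no event, whereas the BAF departs clearly from 0.5.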
Topics: Chromosome Aberrations; Chromosomes, Human; DNA Copy Number Variations; Databases, Genetic; Gene Frequency; Humans; Loss of Heterozygosity; Neoplasms
PubMed: 33683368
DOI: 10.1093/genetics/iyaa021 -
BMC Genomics Aug 2017
BACKGROUND
Coding/functional SNPs change the biological function of a gene and, therefore, could serve as "large-effect" genetic markers. In this study, we used two bioinformatics pipelines, GATK and SAMtools, for discovering coding/functional SNPs with allelic-imbalances associated with total body weight, muscle yield, muscle fat content, shear force, and whiteness. Phenotypic data were collected for approximately 500 fish, representing 98 families (5 fish/family), from a growth-selected line, and the muscle transcriptome was sequenced from 22 families with divergent phenotypes (4 low- versus 4 high-ranked families per trait).
RESULTS
GATK detected 59,112 putative SNPs; of these SNPs, 4798 showed allelic imbalances (>2.0 as an amplification and <0.5 as loss of heterozygosity). SAMtools detected 87,066 putative SNPs; and of them, 4962 had allelic imbalances between the low- and high-ranked families. Only 1829 SNPs with allelic imbalances were common between the two datasets, indicating significant differences in algorithms. The two datasets contained 7930 non-redundant SNPs of which 4439 mapped to 1498 protein-coding genes (with 6.4% non-synonymous SNPs) and 684 mapped to 295 lncRNAs. Validation of a subset of 92 SNPs revealed 1) 86.7-93.8% success rate in calling polymorphic SNPs and 2) 95.4% consistent matching between DNA and cDNA genotypes indicating a high rate of identifying SNPs with allelic imbalances. In addition, 4.64% SNPs revealed random monoallelic expression. Genome distribution of the SNPs with allelic imbalances exhibited high density for all five traits in several chromosomes, especially chromosome 9, 20 and 28. Most of the SNP-harboring genes were assigned to important growth-related metabolic pathways.
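The cutoffs quoted above (>2.0 as an amplification, <0.5 as loss of heterozygosity) can be written as a simple classifier. This is only a sketch: the abstract does not state exactly how the allelic ratio is computed, so the input `ratio` is assumed to be already available, and the function name and the "balanced" label are hypothetical.

```python
def classify_imbalance(ratio: float, upper: float = 2.0, lower: float = 0.5) -> str:
    """Label an allelic ratio using the cutoffs reported in the abstract.

    ratio > upper  -> "amplification"
    ratio < lower  -> "loss of heterozygosity"
    otherwise      -> "balanced" (label assumed here; not named in the abstract)
    """
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    if ratio > upper:
        return "amplification"
    if ratio < lower:
        return "loss of heterozygosity"
    return "balanced"


if __name__ == "__main__":
    # Hypothetical ratios for three SNPs.
    for snp, ratio in [("snp_1", 2.5), ("snp_2", 0.4), ("snp_3", 1.1)]:
        print(snp, classify_imbalance(ratio))
```

The two cutoffs are reciprocal, so a twofold shift in either direction is treated symmetrically.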
CONCLUSION
These results demonstrate utility of RNA-Seq in assessing phenotype-associated allelic imbalances in pooled RNA-Seq samples. The SNPs identified in this study were included in a new SNP-Chip design (available from Affymetrix) for genomic and genetic analyses in rainbow trout.
Topics: Allelic Imbalance; Animals; Food Quality; Genomics; Molecular Sequence Annotation; Muscle Development; Oncorhynchus mykiss; Phenotype; Polymorphism, Single Nucleotide; Sequence Analysis, RNA
PubMed: 28784089
DOI: 10.1186/s12864-017-3992-z