-
Epilepsy Research Nov 2021
Previous findings have suggested that a preictal state might precede epileptic seizure onset, which is the basis for seizure prediction attempts. Preictal states can be understood as outliers that differ from an interictal baseline and display clinical changes. We collected daily clinical scores from patients with epilepsy who underwent continuous video-EEG and assessed the ability of several outlier detection methods to identify preictal states. Results from 24 patients suggested that outlying clinical features were indicative of preictal states and could be identified by statistical methods: AUC = 0.71, 95% CI = [0.63 - 0.79]; PPV = 0.77, 95% CI = [0.70 - 0.84]; FPR = 0.31, 95% CI = [0.21 - 0.44]; and F1 score = 0.74, 95% CI = [0.64 - 0.81]. Such algorithms could be straightforwardly implemented on a mobile device (e.g., tablet or smartphone), which would allow longer data collection that could improve prediction performance. Additional clinical - and even multimodal - parameters could identify more subtle physiological modifications.
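As a rough illustration of the idea in this abstract, the sketch below flags outlying daily scores against a patient's own baseline and then computes PPV, FPR, and F1 from the flags. The abstract does not name the outlier detection methods that were compared, so the modified z-score (median and MAD), the 3.5 cutoff, the function names, and the example scores are all assumptions made for illustration. AUC is omitted because it requires a continuous score rather than a binary flag.

```python
from statistics import median


def mad_outlier_flags(scores, threshold=3.5):
    """Flag scores that deviate strongly from the series median.

    Uses the modified z-score, 0.6745 * deviation / MAD. The 3.5 cutoff is a
    common convention and is NOT a value taken from the study.
    """
    med = median(scores)
    mad = median(abs(s - med) for s in scores)
    if mad == 0:  # no spread at all: nothing can be called an outlier
        return [False] * len(scores)
    return [abs(0.6745 * (s - med) / mad) > threshold for s in scores]


def detection_metrics(flags, preictal):
    """Compute PPV, FPR, recall and F1 from binary flags and true labels."""
    tp = sum(f and p for f, p in zip(flags, preictal))
    fp = sum(f and not p for f, p in zip(flags, preictal))
    fn = sum(p and not f for f, p in zip(flags, preictal))
    tn = sum(not f and not p for f, p in zip(flags, preictal))
    ppv = tp / (tp + fp) if tp + fp else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * ppv * recall / (ppv + recall) if ppv + recall else 0.0
    return {"ppv": ppv, "fpr": fpr, "recall": recall, "f1": f1}


# Hypothetical daily clinical scores for one patient; the last day is preictal.
daily_scores = [2, 3, 2, 3, 2, 3, 2, 12]
is_preictal = [False] * 7 + [True]
flags = mad_outlier_flags(daily_scores)
metrics = detection_metrics(flags, is_preictal)
```

A median-based rule is used here only because it is not distorted by the outliers it is trying to find; any of the usual alternatives could be substituted without changing the metric calculations.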
Topics: Algorithms; Biomarkers; Electroencephalography; Epilepsy; Humans; Seizures
PubMed: 34571459
DOI: 10.1016/j.eplepsyres.2021.106774 -
NeuroImage Dec 2023
Diffusion-weighted MRI (dMRI) is a medical imaging method that can be used to investigate the brain microstructure and structural connections between different brain regions. The method, however, requires relatively complex data processing frameworks and analysis pipelines. Many of these approaches are vulnerable to signal dropout artefacts that can originate from subjects moving their head during the scan. To combat these artefacts and eliminate such outliers, researchers have proposed two approaches: to replace outliers or to downweight outliers during modelling and analysis. With the rising interest in dMRI for clinical research, these types of corrections are increasingly important. Therefore, we set out to investigate the differences between outlier replacement and weighting approaches to help the dMRI community select the best tool for their data processing pipelines. We evaluated dMRI motion correction registration and single tensor model fit pipelines using Gaussian Process and Spherical Harmonic based replacement approaches and outlier downweighting using highly realistic whole-brain simulations. As a proof of concept, we applied these approaches to dMRI infant data sets that contained varying numbers of dropout artefacts. Based on our results, we concluded that the Gaussian Process based outlier replacement provided similar tensor fit results to Gaussian Process based outlier detection and downweighting. Therefore, if only the least-squares estimate of the single tensor model is of interest, our recommendation is to use outlier replacement. However, outlier downweighting can potentially provide a more accurate estimate of the model precision, which could be relevant for applications such as probabilistic tractography.
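The contrast between replacing and downweighting an outlier can be sketched with a toy weighted least-squares fit. This is not the study's pipeline: a straight line stands in for the single tensor model, a neighbour average stands in for the Gaussian Process prediction, and all names and numbers are hypothetical.

```python
def weighted_line_fit(x, y, w):
    """Closed-form weighted least-squares fit of y = intercept + slope * x."""
    sw = sum(w)
    sx = sum(wi * xi for wi, xi in zip(w, x))
    sy = sum(wi * yi for wi, yi in zip(w, y))
    sxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    sxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    slope = (sw * sxy - sx * sy) / (sw * sxx - sx * sx)
    intercept = (sy - slope * sx) / sw
    return intercept, slope


# Toy signal y = 1 + 2x with one dropout-like outlier at index 3.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.0, 3.0, 5.0, -20.0, 9.0, 11.0]
bad = 3

# Approach 1: replacement. The outlier is overwritten with a prediction
# (here simply the average of its two neighbours) and fitted with full weight.
y_replaced = list(y)
y_replaced[bad] = (y[bad - 1] + y[bad + 1]) / 2
fit_replaced = weighted_line_fit(x, y_replaced, [1.0] * len(x))

# Approach 2: downweighting. The measurement is kept but given zero weight.
weights = [1.0] * len(x)
weights[bad] = 0.0
fit_downweighted = weighted_line_fit(x, y, weights)

# For comparison: an ordinary fit that ignores the problem.
fit_naive = weighted_line_fit(x, y, [1.0] * len(x))
```

In this toy case both corrections recover the same line, which mirrors the abstract's point about similar least-squares estimates. The difference is in the bookkeeping: replacement treats six points as if all were measured, while downweighting fits on five, which is one intuition for why the two can give different precision estimates.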
Topics: Humans; Algorithms; Diffusion Magnetic Resonance Imaging; Brain; Artifacts; Least-Squares Analysis
PubMed: 37820862
DOI: 10.1016/j.neuroimage.2023.120397 -
Molecules (Basel, Switzerland) Jun 2021
In this paper, we report comprehensive experimental and chemoinformatics analyses of the solubility of small organic molecules ("fragments") in dimethyl sulfoxide (DMSO) in the context of their ability to be tested in screening experiments. Here, DMSO solubility of 939 fragments has been measured experimentally using an NMR technique. A Support Vector Classification model was built on the obtained data using the ISIDA fragment descriptors. The analysis revealed 34 outliers: experimental issues were retrospectively identified for 28 of them. The updated model performs well in 5-fold cross-validation (balanced accuracy = 0.78). The datasets are available on the Zenodo platform (DOI:10.5281/zenodo.4767511) and the model is available on the website of the Laboratory of Chemoinformatics.
PubMed: 34203441
DOI: 10.3390/molecules26133950 -
Molecular Oncology Jun 2024
Multiple strategies are continuously being explored to expand the drug target repertoire in solid tumors. We devised a novel computational workflow for transcriptome-wide gene expression outlier analysis that allows the systematic identification of both overexpression and underexpression events in cancer cells. Here, it was applied to expression values obtained through RNA sequencing in 226 colorectal cancer (CRC) cell lines that were also characterized by whole-exome sequencing and microarray-based DNA methylation profiling. We found cell models displaying an abnormally high or low expression level for 3533 and 965 genes, respectively. Gene expression abnormalities that have been previously associated with clinically relevant features of CRC cell lines were confirmed. Moreover, by integrating multi-omics data, we identified both genetic and epigenetic alterations underlying outlier expression values. Importantly, our atlas of CRC gene expression outliers can guide the discovery of novel drug targets and biomarkers. As a proof of concept, we found that CRC cell lines lacking expression of the MTAP gene are sensitive to treatment with a PRMT5-MTA inhibitor (MRTX1719). Finally, other tumor types may also benefit from this approach.
Topics: Humans; Colorectal Neoplasms; Gene Expression Regulation, Neoplastic; Cell Line, Tumor; Transcriptome; Gene Expression Profiling; DNA Methylation
PubMed: 38468448
DOI: 10.1002/1878-0261.13622 -
The Science of the Total Environment Sep 2023
We demonstrate the benefits of using Riemannian geometry in the analysis of multi-site, multi-pollutant atmospheric monitoring data. Our approach uses covariance matrices to encode spatio-temporal variability and correlations of multiple pollutants at different sites and times. A key property of covariance matrices is that they lie on a Riemannian manifold, and one can exploit this property to facilitate dimensionality reduction, outlier detection, and spatial interpolation. Specifically, the transformation of data using Riemannian geometry provides a better data surface for interpolation and assessment of outliers compared to traditional data analysis tools that assume Euclidean geometry. We demonstrate the utility of using Riemannian geometry by analyzing a full year of atmospheric monitoring data collected from 34 monitoring stations in Beijing, China.
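To make the geometric idea concrete, the sketch below computes the affine-invariant Riemannian distance between two symmetric positive-definite (SPD) covariance matrices and compares it with the Euclidean (Frobenius) distance. The abstract does not state which Riemannian metric the authors used, so the choice of the affine-invariant metric, the use of NumPy, and the example matrices are assumptions.

```python
import numpy as np


def spd_inv_sqrt(a):
    """Inverse square root of an SPD matrix via its eigendecomposition."""
    w, v = np.linalg.eigh(a)
    return (v / np.sqrt(w)) @ v.T


def riemannian_distance(a, b):
    """Affine-invariant distance: sqrt(sum(log(lambda_i)^2)), where lambda_i
    are the eigenvalues of a^(-1/2) @ b @ a^(-1/2)."""
    s = spd_inv_sqrt(a)
    lam = np.linalg.eigvalsh(s @ b @ s)
    return float(np.sqrt(np.sum(np.log(lam) ** 2)))


def euclidean_distance(a, b):
    """Frobenius norm of the difference, i.e. the flat-space distance."""
    return float(np.linalg.norm(a - b))


# Hypothetical 2x2 covariance matrices for two pollutants at two sites.
cov_a = np.array([[1.0, 0.2], [0.2, 1.0]])
cov_b = np.array([[4.0, 0.8], [0.8, 4.0]])  # same correlation, 4x the variance

d_riem = riemannian_distance(cov_a, cov_b)
d_eucl = euclidean_distance(cov_a, cov_b)
```

One property that distinguishes this metric from the Euclidean one is invariance under a common change of units: rescaling both matrices by the same factor leaves `riemannian_distance` unchanged, while `euclidean_distance` scales with it.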
Topics: Algorithms; Environmental Pollutants; Data Analysis; Beijing; China
PubMed: 37230339
DOI: 10.1016/j.scitotenv.2023.164064 -
Journal of Affective Disorders Feb 2022
Symptom manifestations in affective disorders can be subtle. Small imprecisions in measurement can lead to incorrect estimation of change. Previously, expert-derived scoring inconsistency flags were developed for MADRS. Here, we derive empirically based outlier-pattern flags to further detect imprecisions in ratings. The NEWMEDS data repository of almost 25,000 MADRS administrations from 11 registration trials of antidepressants was used to identify outlier response patterns reflecting potentially careless responses. Coverage of these flags was compared to previously published expert-derived flags. Both sets of flags were also further tested in Monte Carlo simulated data as a proxy for applying flags under conditions of known inconsistency. The outlier flags derived provide cutting points to identify: (1) under- and overuse of values (e.g., scoring "1" on 6 or more items), (2) disproportionate use of even or odd response choices (e.g., 8 or more odd values), (3) longest consecutive use of a value (e.g., more than 5 items in a row scored with the same value), (4) high variability within an administration (standard deviation greater than 1.8), (5) outlier responses on multiple items (i.e., multivariate outliers), and (6) outlier scoring (e.g., scoring 4, 5, or 6 on item 1). Outlier response flags were raised in 26% of the MADRS administrations and in 97% of the Monte Carlo data. Of administrations with no expert flag, 21.7% had an outlier flag, and of administrations with at least one expert flag, 27.7% also had an outlier flag. Outlier-pattern flags appear to be a useful adjunct to expert-derived flags in the quest to improve measurement in clinical trials.
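Several of the flags listed above are simple rules over the ten MADRS item scores (each rated 0 to 6), so they can be sketched directly. The thresholds below are the examples quoted in the abstract; the function and key names are hypothetical, the use of the sample standard deviation is an assumption, and the multivariate flag (5) is left out because the abstract does not describe how it is computed.

```python
from itertools import groupby
from statistics import stdev


def madrs_outlier_flags(items):
    """Return a dict of rule-based outlier-pattern flags for one administration.

    `items` is the list of 10 MADRS item scores, each an integer from 0 to 6.
    """
    if len(items) != 10 or any(not 0 <= v <= 6 for v in items):
        raise ValueError("expected 10 item scores in the range 0-6")
    # Length of the longest run of identical consecutive scores.
    longest_run = max(len(list(group)) for _, group in groupby(items))
    return {
        "overuse_of_1": items.count(1) >= 6,                # flag (1) example
        "mostly_odd": sum(v % 2 == 1 for v in items) >= 8,  # flag (2) example
        "long_run": longest_run > 5,                        # flag (3) example
        "high_variability": stdev(items) > 1.8,             # flag (4); sample SD assumed
        "item1_high": items[0] in (4, 5, 6),                # flag (6) example
    }


# Two hypothetical administrations.
flat_rating = madrs_outlier_flags([1, 1, 1, 1, 1, 1, 2, 3, 0, 2])
zigzag_rating = madrs_outlier_flags([6, 0, 6, 0, 6, 0, 6, 0, 6, 0])
```

Returning one named boolean per rule, rather than a single combined flag, keeps it visible which pattern triggered, which matters when flags are used to prompt rater feedback.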
Topics: Antidepressive Agents; Depression; Humans; Mood Disorders; Psychiatric Status Rating Scales; Reproducibility of Results
PubMed: 34952105
DOI: 10.1016/j.jad.2021.12.076 -
Clinical Epigenetics Mar 2020
BACKGROUND
DNA methylation outlier burden has been suggested as a potential marker of biological age. An outlier is typically defined as DNA methylation levels at any one CpG site that are three times beyond the inter-quartile range from the 25th or 75th percentiles compared to the rest of the population. DNA methylation outlier burden (the number of such outlier sites per individual) increases exponentially with age. However, these findings have been observed in small samples.
RESULTS
Here, we showed an association between age and log-transformed DNA methylation outlier burden in a large cross-sectional cohort, the Generation Scotland Family Health Study (N = 7010, β = 0.0091, p < 2 × 10), and in two longitudinal cohort studies, the Lothian Birth Cohorts of 1921 (N = 430, β = 0.033, p = 7.9 × 10) and 1936 (N = 898, β = 0.0079, p = 0.074). Significant confounders of both cross-sectional and longitudinal associations between outlier burden and age included white blood cell proportions, body mass index (BMI), smoking, and batch effects. In Generation Scotland, the increase in epigenetic outlier burden with age was not purely an artefact of an increase in DNA methylation level variability with age (epigenetic drift). Log-transformed DNA methylation outlier burden in Generation Scotland was not related to self-reported, or family history of, age-related diseases, and it was not heritable (SNP-based heritability of 4.4%, p = 0.18). Finally, DNA methylation outlier burden was not significantly related to survival in either of the Lothian Birth Cohorts individually or in the meta-analysis after correction for multiple testing (HR = 1.12; 95% CI = [1.02; 1.21]; p = 0.021).
CONCLUSIONS
These findings suggest that, while it does not associate with ageing-related health outcomes, DNA methylation outlier burden does track chronological ageing and may also relate to survival. DNA methylation outlier burden may thus be useful as a marker of biological ageing.
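The outlier definition given in the background (a value more than three inter-quartile ranges beyond the 25th or 75th percentile of the population at a given CpG site) and the resulting per-individual burden can be sketched as follows. The data layout, the function name, and the quantile interpolation method are assumptions; the study's own preprocessing and the log transformation of the burden are not reproduced.

```python
from statistics import quantiles


def outlier_burden(beta):
    """Count outlier CpG sites per individual.

    `beta[i][j]` is the methylation level of individual i at CpG site j.
    A value is an outlier if it lies more than 3 * IQR below the 25th
    percentile or above the 75th percentile of that site across individuals.
    """
    n_individuals = len(beta)
    n_sites = len(beta[0])
    burden = [0] * n_individuals
    for j in range(n_sites):
        site = [row[j] for row in beta]
        q1, _, q3 = quantiles(site, n=4, method="inclusive")
        iqr = q3 - q1
        low, high = q1 - 3 * iqr, q3 + 3 * iqr
        for i, value in enumerate(site):
            if value < low or value > high:
                burden[i] += 1
    return burden


# Hypothetical methylation levels: 5 individuals x 2 CpG sites.
levels = [
    [0.10, 0.50],
    [0.11, 0.52],
    [0.12, 0.51],
    [0.13, 0.49],
    [0.90, 0.50],  # extreme value at the first site
]
burden = outlier_burden(levels)
```

Because the thresholds are computed per site from the whole population, the burden of one individual depends on who else is in the cohort, which is one reason batch effects appear among the confounders reported in the results.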
Topics: Adult; Age Factors; Aging; Confounding Factors, Epidemiologic; CpG Islands; Cross-Sectional Studies; DNA Methylation; Epigenesis, Genetic; Female; Humans; Longitudinal Studies; Male; Middle Aged; Risk Factors; Scotland
PubMed: 32216821
DOI: 10.1186/s13148-020-00838-0 -
IEEE Transactions on Image Processing :... 2021
Outlier handling has attracted considerable attention recently but remains challenging for image deblurring. Existing approaches mainly depend on iterative outlier detection steps to explicitly or implicitly reduce the influence of outliers on image deblurring. However, these outlier detection steps usually involve heuristic operations and iterative optimization processes, which are complex and time-consuming. In contrast, we propose to learn a deep convolutional neural network to directly estimate the confidence map, which can identify reliable inliers and outliers from the blurred image and thus facilitates the subsequent deblurring process. Our analysis shows that the proposed algorithm, combined with the learned confidence map, handles outliers effectively and does not require the ad-hoc outlier detection steps that are critical to existing outlier handling methods. Compared to existing approaches, the proposed algorithm is more efficient and can be applied to both non-blind and blind image deblurring. Extensive experimental results demonstrate that the proposed algorithm performs favorably against state-of-the-art methods in terms of accuracy and efficiency.
PubMed: 33417555
DOI: 10.1109/TIP.2020.3048679 -
Proceedings of SPIE--the International... 2020
Abdominal multi-organ segmentation of computed tomography (CT) images has been the subject of extensive research interest. It presents a substantial challenge in medical image processing, as the shape and distribution of abdominal organs can vary greatly among the population and within an individual over time. While continuous integration of novel datasets into the training set provides potential for better segmentation performance, collection of data at scale is not only costly, but also impractical in some contexts. Moreover, it remains unclear what marginal value additional data have to offer. Herein, we propose a single-pass active learning method through human quality assurance (QA). We built on a pre-trained 3D U-Net model for abdominal multi-organ segmentation and augmented the dataset either with outlier data (e.g., exemplars for which the baseline algorithm failed) or inliers (e.g., exemplars for which the baseline algorithm worked). The new models were trained using the augmented datasets with 5-fold cross-validation (for outlier data) and withheld outlier samples (for inlier data). Manual labeling of outliers increased Dice scores with outliers by 0.130, compared to an increase of 0.067 with inliers (p<0.001, two-tailed paired t-test). By adding 5 to 37 inliers or outliers to training, we find that the marginal value of adding outliers is higher than that of adding inliers. In summary, improvement on single-organ performance was obtained without diminishing multi-organ performance or significantly increasing training time. Hence, identification and correction of baseline failure cases present an effective and efficient method of selecting training data to improve algorithm performance.
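The Dice score used to report these improvements is the standard overlap measure, twice the intersection divided by the sum of the two mask sizes. A minimal sketch on flattened binary masks follows; the function name, the convention of returning 1.0 for two empty masks, and the example masks are assumptions.

```python
def dice_score(pred, truth):
    """Dice coefficient 2|A n B| / (|A| + |B|) for two binary masks.

    Masks are flat sequences of 0/1 (or False/True) of equal length.
    Two empty masks are treated as a perfect match (a convention, assumed here).
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same length")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    return 1.0 if total == 0 else 2 * intersection / total


# Hypothetical flattened organ masks (1 = voxel labelled as the organ).
predicted = [1, 1, 0, 0, 1, 0]
manual = [1, 0, 0, 0, 1, 1]
score = dice_score(predicted, manual)
```

On this scale, the reported gain of 0.130 from adding outliers is a change in the same 0-to-1 overlap quantity computed here.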
PubMed: 33907347
DOI: 10.1117/12.2549365 -
PloS One 2023
This study proposes a robust outlier detection method based on the circular median for non-parametric linear-circular regression, for cases where the response variable includes outliers and the residuals follow a wrapped Cauchy distribution. Nadaraya-Watson and local linear regression methods were employed to obtain non-parametric regression fits. The proposed method's performance was investigated using a real dataset and a comprehensive simulation study with different sample sizes, contamination levels, and degrees of heterogeneity. The method performs well at medium and higher degrees of contamination, and its performance increases as the sample size and the homogeneity of the data increase. In addition, when the response variable of linear-circular regression contains outliers, the local linear estimation method fits the data better than the Nadaraya-Watson method.
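Two building blocks named in this abstract can be sketched in a few lines: a Nadaraya-Watson fit for a circular response and a sample circular median. This is not the authors' procedure. The Gaussian kernel, the bandwidth, the restriction of the median to sample points, and all names and data are assumptions made for illustration.

```python
import math

TWO_PI = 2 * math.pi


def circ_dist(a, b):
    """Shortest arc length between two angles given in radians."""
    d = abs(a - b) % TWO_PI
    return min(d, TWO_PI - d)


def circular_median(angles):
    """Sample angle minimising the summed arc distance to all other angles."""
    return min(angles, key=lambda c: sum(circ_dist(c, a) for a in angles))


def nw_circular_fit(x0, xs, thetas, bandwidth):
    """Nadaraya-Watson estimate of a circular response at x0.

    Kernel-weighted sums of sin and cos are combined with atan2, so angles
    near 0 and 2*pi average correctly instead of cancelling out.
    """
    weights = [math.exp(-0.5 * ((x0 - x) / bandwidth) ** 2) for x in xs]
    s = sum(w * math.sin(t) for w, t in zip(weights, thetas))
    c = sum(w * math.cos(t) for w, t in zip(weights, thetas))
    return math.atan2(s, c) % TWO_PI


# Hypothetical data: angles clustered around 0 (wrapping past 2*pi), one stray.
angles = [0.1, 6.2, 0.3, 3.0, 0.2]
med = circular_median(angles)

# Fit halfway between two observations whose angles straddle the 0 / 2*pi cut.
fit = nw_circular_fit(1.0, [0.0, 2.0], [6.2, 0.1], bandwidth=1.0)
```

A residual-based outlier rule could then compare each observation's arc distance from its fitted value against the circular median of those distances, but the specific cutoff used in the study is not given in the abstract and is not guessed here.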
Topics: Humans; Linear Models; Computer Simulation; Drug Contamination; Sample Size; Seizures
PubMed: 37307265
DOI: 10.1371/journal.pone.0286448