-
Frontiers in Neuroscience 2023
INTRODUCTION
Numerous studies have suggested a connection between circadian rhythm and neurological disorders with cognitive and consciousness impairments in humans, yet little evidence supports a causal relationship between circadian rhythm and the brain cortex.
METHODS
The top 10,000 morningness-related single-nucleotide polymorphisms from genome-wide association study (GWAS) summary statistics were used to select the instrumental variables. GWAS summary statistics from the ENIGMA Consortium were used to assess the causal relationship between morningness and cortical measures such as cortical thickness (TH) and surface area (SA). Inverse-variance weighted (IVW) and weighted median (WM) estimates were the primary analyses, whereas MR-Egger, MR Pleiotropy RESidual Sum and Outlier (MR-PRESSO), leave-one-out analysis, and funnel plots were used to detect heterogeneity and pleiotropy.
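As an illustration of the IVW estimate named above, the following is a minimal sketch of the standard fixed-effect formula applied to per-variant summary statistics. The function name and the toy numbers are hypothetical, and the sketch is not the authors' pipeline, which would also involve instrument selection and the sensitivity analyses listed in the abstract.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect inverse-variance weighted (IVW) causal estimate.

    Each variant contributes a Wald ratio (outcome effect / exposure effect),
    weighted by the inverse of its approximate variance.
    """
    ratios = [by / bx for bx, by in zip(beta_exposure, beta_outcome)]
    weights = [bx ** 2 / se ** 2 for bx, se in zip(beta_exposure, se_outcome)]
    total_weight = sum(weights)
    estimate = sum(w * r for w, r in zip(weights, ratios)) / total_weight
    std_error = math.sqrt(1.0 / total_weight)
    return estimate, std_error


# Hypothetical toy data: three variants that all imply a ratio of 0.5.
bx = [0.10, 0.20, 0.15]
by = [0.05, 0.10, 0.075]
se = [0.01, 0.01, 0.01]
estimate, std_error = ivw_estimate(bx, by, se)
print(f"IVW estimate = {estimate:.3f}, SE = {std_error:.3f}")
```

Variants with larger exposure effects and smaller outcome standard errors receive more weight, which is why a single strong but pleiotropic variant can dominate the estimate.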
RESULTS
Regionally, morningness decreased the SA of the rostral middle frontal gyrus at nominal significance, both with genomic control (IVW: β = -24.916 mm², 95% CI: -47.342 mm² to -2.490 mm², p = 0.029; WM: β = -33.208 mm², 95% CI: -61.933 mm² to -4.483 mm², p = 0.023; MR-Egger: β < 0) and without genomic control (IVW: β = -24.581 mm², 95% CI: -47.552 mm² to -1.609 mm², p = 0.036; WM: β = -32.310 mm², 95% CI: -60.717 mm² to -3.902 mm², p = 0.026; MR-Egger: β < 0), with no heterogeneity or outliers detected.
CONCLUSIONS AND IMPLICATIONS
Circadian rhythm causally affects the rostral middle frontal gyrus. This finding sheds new light on the potential use of MRI in disease diagnosis, reveals the significance of circadian rhythm in disease progression, and might also suggest a fresh therapeutic approach for disorders related to the rostral middle frontal gyrus.
PubMed: 37547136
DOI: 10.3389/fnins.2023.1222551
-
Frontiers in Nutrition 2022
BACKGROUND
Beef is common in daily diet, but its association with the risk of rheumatoid arthritis (RA) remains uncertain. The objective of this study is to explore the relationship between beef intake and the risk of RA.
MATERIALS AND METHODS
We investigated the association between beef intake and risk of RA by multivariate logistic regression, based on the National Health and Nutrition Examination Survey (NHANES) 1999-2016 involving 9,618 participants. The dose-response relationship between beef intake and RA was explored as well. Furthermore, we performed Mendelian randomization (MR) analysis to examine the causal effect of beef intake on RA. Genetic instruments for beef intake were selected from a genome-wide association study (GWAS) including 335,576 individuals from the UK Biobank study, and summary statistics relating to RA were obtained from a GWAS meta-analysis of 14,361 RA patients and 43,923 controls. The inverse-variance weighted (IVW) approach was used to estimate the causal association, and MR-Egger regression and Mendelian randomization pleiotropy residual sum and outlier (MR-PRESSO) test were applied to evaluate the pleiotropy and outliers.
RESULTS
Compared with the lowest quintile (0 to ≤33.50 g/d), beef intake in the third quintile (50.26 to ≤76.50 g/d) was significantly associated with the risk of RA [odds ratio (OR): 1.94; 95% confidence interval (CI): 1.20-3.12]. Moreover, an inverted U-shaped dose-response relationship between beef intake and RA (p = 0.023) was found. In the MR analysis, beef intake was associated with an increased risk of RA (OR: 3.05; 95% CI: 1.11-8.35; p = 0.030) by the IVW method. MR-Egger regression and the MR-PRESSO test showed no evidence of pleiotropy or outliers.
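The reported MR odds ratio, confidence interval, and p-value can be checked against one another. The sketch below assumes a normal approximation on the log-odds scale and a symmetric 95% interval, which is the usual convention but is not stated in the abstract; the function name is hypothetical.

```python
import math


def p_value_from_or_ci(odds_ratio, ci_low, ci_high, z_crit=1.96):
    """Recover the z statistic and two-sided p-value from an OR and its 95% CI.

    Assumes the interval is symmetric on the log scale
    (log OR +/- z_crit * SE), as in a standard Wald interval.
    """
    log_or = math.log(odds_ratio)
    std_error = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = log_or / std_error
    # Two-sided p-value for a standard normal: 2 * (1 - Phi(|z|)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


# Values taken from the abstract's IVW result.
z, p = p_value_from_or_ci(3.05, 1.11, 8.35)
print(f"z = {z:.2f}, p = {p:.3f}")
```

Under these assumptions the recovered p-value is close to 0.03, in line with the value reported for the IVW method.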
CONCLUSION
This study indicated that there is suggestive evidence to support the causal effect of beef intake on the risk of RA, while further studies are warranted to elucidate the exact association.
PubMed: 36147307
DOI: 10.3389/fnut.2022.923472
-
Cancer Research and Treatment Jan 2021
PURPOSE
To find biomarkers for disease, there have been constant attempts to investigate the genes that differ in the disease groups. However, the values that lie outside the overall pattern of a distribution, the outliers, are frequently excluded in traditional analytical methods because they are considered to be 'some sort of problem.' Such outliers may have a biologic role in the disease group. Thus, this study explored new biomarkers using outlier analysis and verified the therapeutic potential of two genes (TM4SF4 and LRRK2).
MATERIALS AND METHODS
A modified Tukey's fences outlier analysis was carried out to identify new biomarkers using public gene expression datasets. We then verified the presence of the selected biomarkers in other clinical samples via customized gene expression panels and tissue microarrays. Moreover, an siRNA-based knockdown test was performed to evaluate the impact of the biomarkers on oncogenic phenotypes.
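For orientation, here is a minimal sketch of the standard (unmodified) Tukey's fences rule. The abstract does not describe the authors' modification, so this shows only the textbook version; the expression values are hypothetical, and the quartiles use the default method of Python's `statistics.quantiles`.

```python
import statistics


def tukey_fences(values, k=1.5):
    """Return (lower_fence, upper_fence, outliers) using Tukey's rule.

    A value is flagged when it lies more than k * IQR below the first
    quartile or above the third quartile.
    """
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower = q1 - k * iqr
    upper = q3 + k * iqr
    outliers = [v for v in values if v < lower or v > upper]
    return lower, upper, outliers


# Hypothetical expression values for one gene across ten samples.
expression = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
lower, upper, outliers = tukey_fences(expression)
print(lower, upper, outliers)
```

In a biomarker screen of the kind described, the samples above the upper fence are the ones of interest rather than the ones to discard.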
RESULTS
TM4SF4 in lung cancer and LRRK2 in breast cancer were chosen as candidates among the genes derived from the analysis. TM4SF4 and LRRK2 were overexpressed in the small number of samples with lung cancer (4.20%) and breast cancer (2.42%), respectively. Knockdown of TM4SF4 and LRRK2 suppressed the growth of lung and breast cancer cell lines. The LRRK2 overexpressing cell lines were more sensitive to LRRK2-IN-1 than the LRRK2 under-expressing cell lines.
CONCLUSION
Our modified outlier-based analysis method recovered biomarkers previously missed or unnoticed by traditional analysis, showing that TM4SF4 and LRRK2 are novel target candidates for lung and breast cancer, respectively.
Topics: Breast Neoplasms; Female; Humans; Leucine-Rich Repeat Serine-Threonine Protein Kinase-2; Lung Neoplasms; Membrane Glycoproteins; Molecular Targeted Therapy
PubMed: 32972043
DOI: 10.4143/crt.2020.434
-
PloS One 2022
This replication underlines the importance of outlier diagnostics since many researchers have long neglected influential observations in OLS regression analysis. In his article, entitled "Primary Resources, Secondary Labor," Shin finds that advanced democracies with increased natural resource wealth, particularly from oil and natural gas production, are more likely to restrict low-skill immigration policy. By performing outlier diagnostics, this replication shows that Shin's findings are a statistical artifact. When one outlying country, Norway, is removed from the sample data, I observe almost no significant and negative relationship between oil wealth and immigration policy. When two outlying countries are excluded, the effect of oil wealth completely disappears. Robust regression analysis, a widely used remedial method for outlier problems, confirms the results of my outlier diagnostics.
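The kind of diagnostic described above can be sketched as a leave-one-out refit of a simple regression. This is an assumption-laden illustration with one predictor and made-up data, not the replication's actual model, which is not specified in the abstract.

```python
def ols_slope(x, y):
    """Slope of a simple least-squares regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


def leave_one_out_slopes(x, y):
    """Refit the regression once per observation, dropping that observation."""
    slopes = []
    for i in range(len(x)):
        x_rest = x[:i] + x[i + 1:]
        y_rest = y[:i] + y[i + 1:]
        slopes.append(ols_slope(x_rest, y_rest))
    return slopes


# Hypothetical data: four ordinary cases and one high-leverage case (last).
x = [1, 2, 3, 4, 10]
y = [2, 2, 2, 2, -7]
full_slope = ols_slope(x, y)
loo_slopes = leave_one_out_slopes(x, y)
print(full_slope, loo_slopes)
```

If the slope changes sharply when a single case is dropped, as it does for the last case here, the full-sample result rests largely on that one observation.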
Topics: Emigration and Immigration; Least-Squares Analysis; Models, Theoretical; Natural Resources; Norway; Public Policy
PubMed: 35025888
DOI: 10.1371/journal.pone.0261533
-
Journal of Medical Internet Research Jul 2023
Observational Study
BACKGROUND
Reference intervals (RIs) play an important role in clinical decision-making. However, due to the time, labor, and financial costs involved in establishing RIs using direct means, the use of indirect methods, based on big data previously obtained from clinical laboratories, is getting increasing attention. Different indirect techniques combined with different data transformation methods and outlier removal might cause differences in the calculation of RIs. However, there are few systematic evaluations of this.
OBJECTIVE
This study used data derived from direct methods as reference standards and evaluated the accuracy of combinations of different data transformation, outlier removal, and indirect techniques in establishing complete blood count (CBC) RIs for large-scale data.
METHODS
The CBC data of populations aged ≥18 years undergoing physical examination from January 2010 to December 2011 were retrieved from the First Affiliated Hospital of China Medical University in northern China. After exclusion of repeated individuals, we applied parametric, nonparametric, Hoffmann, Bhattacharya, and truncation points and Kolmogorov-Smirnov distance (kosmic) indirect methods, combined with log or BoxCox transformation and Reed-Dixon, Tukey, and iterative mean (3SD) outlier removal methods, to derive the RIs of 8 CBC parameters, and compared the results with those previously established by the direct method. Furthermore, bias ratios (BRs) were calculated to assess which combination of indirect technique, data transformation pattern, and outlier removal method is preferable.
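One of the many combinations described above (log transformation, iterative mean (3SD) outlier removal, and a parametric interval) can be sketched as follows. The function names and toy values are hypothetical, and the mean ± 1.96 SD interval is the usual parametric convention, assumed here rather than taken from the abstract.

```python
import math
import statistics


def iterative_3sd(values):
    """Repeatedly drop values more than 3 SD from the mean until none remain."""
    kept = list(values)
    while len(kept) > 2:
        mean = statistics.fmean(kept)
        sd = statistics.stdev(kept)
        inside = [v for v in kept if abs(v - mean) <= 3 * sd]
        if len(inside) == len(kept):
            break
        kept = inside
    return kept


def indirect_reference_interval(raw_values):
    """Log-transform, remove outliers, then back-transform mean +/- 1.96 SD."""
    logged = [math.log(v) for v in raw_values]
    kept = iterative_3sd(logged)
    mean = statistics.fmean(kept)
    sd = statistics.stdev(kept)
    lower = math.exp(mean - 1.96 * sd)
    upper = math.exp(mean + 1.96 * sd)
    return lower, upper, len(kept)


# Hypothetical laboratory values: 21 typical results and one gross outlier.
results = [4.8, 5.0, 5.2] * 7 + [50.0]
lower, upper, n_kept = indirect_reference_interval(results)
print(f"RI = {lower:.2f} to {upper:.2f} from {n_kept} values")
```

Swapping the outlier step for Tukey's fences or the transformation for BoxCox gives other combinations of the kind the study compares.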
RESULTS
Raw data showed that the degrees of skewness of the white blood cell (WBC) count, platelet (PLT) count, mean corpuscular hemoglobin (MCH), mean corpuscular hemoglobin concentration (MCHC), and mean corpuscular volume (MCV) were much more obvious than those of other CBC parameters. After log or BoxCox transformation combined with Tukey or iterative mean (3SD) processing, the distribution types of these data were close to Gaussian distribution. Tukey-based outlier removal yielded the maximum number of outliers. The lower-limit bias of WBC (male), PLT (male), hemoglobin (HGB; male), MCH (male/female), and MCV (female) was greater than that of the corresponding upper limit for more than half of 30 indirect methods. Computational indirect choices of CBC parameters for males and females were inconsistent. The RIs of MCHC established by the direct method for females were narrow. For this, the kosmic method was markedly superior, which contrasted with the RI calculation of CBC parameters with high |BR| qualification rates for males. Among the top 10 methodologies for the WBC count, PLT count, HGB, MCV, and MCHC with a high-BR qualification rate among males, the Bhattacharya, Hoffmann, and parametric methods were superior to the other 2 indirect methods.
CONCLUSIONS
Compared to results derived by the direct method, outlier removal methods and indirect techniques markedly influence the final RIs, whereas data transformation has negligible effects, except for obviously skewed data. Specifically, the outlier removal efficiency of Tukey and iterative mean (3SD) methods is almost equivalent. Furthermore, the choice of indirect techniques depends more on the characteristics of the studied analyte itself. This study provides scientific evidence for clinical laboratories to use their previous data sets to establish RIs.
Topics: Adolescent; Adult; Female; Humans; Male; Big Data; Blood Cell Count; China; Leukocyte Count; Reference Values; Clinical Decision-Making
PubMed: 37459170
DOI: 10.2196/45651
-
Communications in Statistics:... 2023
A two-stage joint survival model is used to analyse time-to-event outcomes that could be associated with biomarkers that are repeatedly collected over time. A two-stage joint survival model has limited model-checking tools and is usually assessed using standard diagnostic tools for survival models. The diagnostic tools can be improved and implemented. Time-varying covariates in a two-stage joint survival model might contain outlying observations or subjects. In this study we used the variance shift outlier model (VSOM) to detect and down-weight outliers in the first stage of the two-stage joint survival model. This entails fitting a VSOM at the observation level and a VSOM at the subject level, and then fitting a combined VSOM for the identified outliers. The fitted values extracted from the combined VSOM were then used as a time-varying covariate in the extended Cox model. We illustrate this methodology on a dataset from a multi-centre randomised clinical trial. A multi-centre trial showed that a combined VSOM fits the data better than an extended Cox model. We noted that implementing a combined VSOM, when desired, gives a better fit because outliers are down-weighted.
PubMed: 37981985
DOI: 10.1080/03610918.2021.1995751
-
MedRxiv : the Preprint Server For... Jun 2023
INTRODUCTION
Neuroanatomical normative modelling can capture individual variability in Alzheimer's Disease (AD). We used neuroanatomical normative modelling to track individuals' disease progression in people with mild cognitive impairment (MCI) and patients with AD.
METHODS
Cortical thickness and subcortical volume neuroanatomical normative models were generated using healthy controls (n~58k). These models were used to calculate regional Z-scores in 4361 T1-weighted MRI time-series scans. Regions with Z-scores <-1.96 were classified as outliers and mapped on the brain, and also summarised by total outlier count (tOC).
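The thresholding step above can be sketched as follows. This simplified version assumes each region's normative mean and standard deviation are already available as plain numbers, whereas a real normative model would predict them from covariates; the region names and values are hypothetical.

```python
def total_outlier_count(regional_values, norm_mean, norm_sd, threshold=-1.96):
    """Compute regional Z-scores and count regions below the threshold.

    Returns the Z-score per region, the sorted list of outlier regions,
    and the total outlier count (tOC).
    """
    z_scores = {
        region: (value - norm_mean[region]) / norm_sd[region]
        for region, value in regional_values.items()
    }
    outlier_regions = sorted(r for r, z in z_scores.items() if z < threshold)
    return z_scores, outlier_regions, len(outlier_regions)


# Hypothetical measurements for one scan (thickness in mm, volume in mm^3).
scan = {"entorhinal": 2.6, "precuneus": 2.3, "hippocampus": 3000.0}
mean = {"entorhinal": 3.4, "precuneus": 2.4, "hippocampus": 4000.0}
sd = {"entorhinal": 0.3, "precuneus": 0.15, "hippocampus": 400.0}

z_scores, outlier_regions, toc = total_outlier_count(scan, mean, sd)
print(outlier_regions, toc)
```

Repeating this for each scan in a participant's time series gives the per-visit tOC values from which a rate of change can be estimated.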
RESULTS
Rate of change in tOC increased in AD and in people with MCI who converted to AD and correlated with multiple non-imaging markers. Moreover, a higher annual rate of change in tOC increased the risk of MCI progression to AD. Brain Z-score maps showed that the hippocampus had the highest rate of atrophy change.
CONCLUSIONS
Individual-level atrophy rates can be tracked by using regional outlier maps and tOC.
PubMed: 37398392
DOI: 10.1101/2023.06.15.23291418
-
Frontiers in Neuroscience 2022
BACKGROUND
Previous observational studies have shown that low back pain (LBP) often coexists with sleep disturbances; however, the causal relationship remains unclear. The present study investigated the causal relationship between sleep disturbances and LBP and emphasizes the importance of sleep improvement in the comprehensive management of LBP.
METHODS
Genetic variants were extracted as instrumental variables (IVs) from the genome-wide association study (GWAS) of insomnia, sleep duration, short sleep duration, long sleep duration, and daytime sleepiness. Information regarding genetic variants in LBP was obtained from a GWAS dataset comprising 13,178 cases and 164,682 controls. MR-Egger, weighted median, inverse-variance weighted (IVW), penalized weighted median, and maximum likelihood (ML) methods were applied to assess the causal effects. Cochran's Q test and the MR-Egger intercept were used to assess heterogeneity and horizontal pleiotropy, respectively. Outliers were identified and eliminated based on MR-PRESSO analysis to reduce the effect of horizontal pleiotropy on the results. Leave-one-out analysis, in which each genetic variant is removed in turn, was used to evaluate the stability of the results. Finally, reverse causal inference involving the five sleep traits was performed.
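The heterogeneity check mentioned above can be illustrated with Cochran's Q computed around a fixed-effect IVW estimate. This sketch uses first-order weights and hypothetical summary statistics; it is not the study's code, and dedicated MR packages offer more refined weighting.

```python
def cochran_q(beta_exposure, beta_outcome, se_outcome):
    """Cochran's Q for heterogeneity among per-variant Wald ratios.

    Returns (Q, degrees_of_freedom). Under homogeneity, Q approximately
    follows a chi-square distribution with n_variants - 1 degrees of freedom.
    """
    ratios = [by / bx for bx, by in zip(beta_exposure, beta_outcome)]
    weights = [bx ** 2 / se ** 2 for bx, se in zip(beta_exposure, se_outcome)]
    ivw = sum(w * r for w, r in zip(weights, ratios)) / sum(weights)
    q = sum(w * (r - ivw) ** 2 for w, r in zip(weights, ratios))
    return q, len(ratios) - 1


# Hypothetical variants: the first set agrees, the second has one discordant variant.
q_same, df_same = cochran_q([0.1, 0.2, 0.3], [0.05, 0.10, 0.15], [0.01, 0.01, 0.01])
q_mixed, df_mixed = cochran_q([0.1, 0.1, 0.1], [0.05, 0.05, 0.15], [0.01, 0.01, 0.01])
print(q_same, q_mixed)
```

A Q value well above the chi-square critical value for its degrees of freedom (5.99 for 2 degrees of freedom at the 5% level) suggests that the variants do not all point to the same causal effect.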
RESULTS
A causal relationship was observed for insomnia-LBP (OR = 1.954, 95% CI: 1.119-3.411), LBP-daytime sleepiness (OR = 1.011, 95% CI: 1.004-1.017), and LBP-insomnia (OR = 1.015, 95% CI: 1.004-1.026); however, the results of bidirectional MR analysis between other sleep traits and LBP were negative. The results of most heterogeneity tests were stable, and no specific evidence of horizontal pleiotropy was found. Only one outlier was identified based on MR-PRESSO analysis.
CONCLUSION
The main results of our research showed a potential bidirectional causal association of genetically predicted insomnia with LBP. Sleep improvement may be important in comprehensive management of LBP.
PubMed: 36532278
DOI: 10.3389/fnins.2022.1074605
-
Bioinformatics (Oxford, England) Aug 2022
MOTIVATION
It has become routine in neuroscience studies to measure brain networks for different individuals using neuroimaging. These networks are typically expressed as adjacency matrices, with each cell containing a summary of connectivity between a pair of brain regions. There is an emerging statistical literature describing methods for the analysis of such multi-network data in which nodes are common across networks but the edges vary. However, there has been essentially no consideration of the important problem of outlier detection. In particular, for certain subjects, the neuroimaging data are of such poor quality that the network cannot be reliably reconstructed. For such subjects, the resulting adjacency matrix may be mostly zero or exhibit a bizarre pattern not consistent with a functioning brain. These outlying networks may serve as influential points, contaminating subsequent statistical analyses. We propose a simple Outlier DetectIon for Networks (ODIN) method relying on an influence measure under a hierarchical generalized linear model for the adjacency matrices. An efficient computational algorithm is described, and ODIN is illustrated through simulations and an application to data from the UK Biobank.
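To make the "mostly zero" failure mode concrete, the sketch below flags networks whose edge density is very low. This is a naive screen written for illustration only and is not the ODIN method, which relies on an influence measure under a hierarchical generalized linear model; the function names, the threshold, and the toy matrices are hypothetical.

```python
def edge_density(adjacency):
    """Fraction of node pairs with a nonzero edge (upper triangle only)."""
    n = len(adjacency)
    pairs = n * (n - 1) // 2
    nonzero = sum(
        1
        for i in range(n)
        for j in range(i + 1, n)
        if adjacency[i][j] != 0
    )
    return nonzero / pairs


def flag_sparse_networks(networks, min_density=0.2):
    """Return indices of networks whose edge density falls below min_density."""
    return [
        idx for idx, adjacency in enumerate(networks)
        if edge_density(adjacency) < min_density
    ]


# Hypothetical 3-node networks for two subjects.
dense = [[0, 4, 2], [4, 0, 7], [2, 7, 0]]
empty = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
flagged = flag_sparse_networks([dense, empty])
print(flagged)
```

A density screen of this kind cannot catch networks with a plausible number of edges arranged in an implausible pattern, which is the sort of case a model-based influence measure is intended to address.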
RESULTS
ODIN was successful in identifying moderate to extreme outliers. Removing such outliers can significantly change inferences in downstream applications.
AVAILABILITY AND IMPLEMENTATION
ODIN has been implemented in both Python and R and these implementations along with other code are publicly available at github.com/pritamdey/ODIN-python and github.com/pritamdey/ODIN-r, respectively.
SUPPLEMENTARY INFORMATION
Supplementary data are available at Bioinformatics online.
Topics: Humans; Algorithms; Neuroimaging; Brain; Software
PubMed: 35762974
DOI: 10.1093/bioinformatics/btac431
-
Entropy (Basel, Switzerland) Mar 2023
Rate distortion theory was developed for optimizing lossy compression of data, but it also has applications in statistics. In this paper, we illustrate how rate distortion theory can be used to analyze various datasets. The analysis involves testing, identification of outliers, choice of compression rate, calculation of optimal reconstruction points, and assigning "descriptive confidence regions" to the reconstruction points. We study four models or datasets of increasing complexity: clustering, Gaussian models, linear regression, and a dataset describing orientations of early Islamic mosques. These examples illustrate how rate distortion analysis may serve as a common framework for handling different statistical problems.
PubMed: 36981344
DOI: 10.3390/e25030456