-
BMC Medical Research Methodology Oct 2023
BACKGROUND
Growth studies rely on longitudinal measurements, typically represented as trajectories. However, anthropometry is prone to errors that can generate outliers. While various methods are available for detecting outlier measurements, a gold standard has yet to be identified, and there is no established method for outlying trajectories. Thus, outlier types and their effects on growth pattern detection still need to be investigated. This work aimed to assess the performance of six methods at detecting different types of outliers, propose two novel methods for outlier trajectory detection and evaluate how outliers affect growth pattern detection.
METHODS
We included 393 healthy infants from The Applied Research Group for Kids (TARGet Kids!) cohort and 1651 children with severe malnutrition from the co-trimoxazole prophylaxis clinical trial. We injected outliers of three types and six intensities and applied four outlier detection methods for measurements (model-based and World Health Organization cut-offs-based) and two for trajectories. We also assessed growth pattern detection before and after outlier injection using time series clustering and latent class mixed models. Error type, intensity, and population affected method performance.
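As an illustration of the cut-off-based approach mentioned above, the following minimal Python sketch flags z-scores that fall outside fixed bounds. The function name, the indicator keys and the default bounds are assumptions made for this example (the bounds follow commonly cited WHO flag ranges); the abstract does not state the exact cut-offs or implementation used in the study.

```python
# Illustrative fixed z-score cut-offs (lower, upper).
# Assumed defaults for this sketch, not values reported in the abstract.
CUTOFFS = {
    "waz": (-6.0, 5.0),  # weight-for-age
    "haz": (-6.0, 6.0),  # length/height-for-age
    "whz": (-5.0, 5.0),  # weight-for-length/height
}


def flag_outliers(z_scores, indicator, cutoffs=CUTOFFS):
    """Return a list of booleans, True where a z-score lies outside the cut-offs."""
    lower, upper = cutoffs[indicator]
    return [z < lower or z > upper for z in z_scores]


# Hypothetical weight-for-age z-scores for one child's trajectory.
waz = [-1.2, 0.4, 5.6, -6.3, 2.0]
flags = flag_outliers(waz, "waz")
```

A rule of this kind only reacts to values beyond the bounds, which is consistent with the abstract's observation that cut-off-based techniques work mainly for extreme errors of high intensity.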
RESULTS
Model-based outlier detection methods performed best for measurements, with precision of 5.72-99.89%, especially for low and moderate error intensities. The clustering-based outlier trajectory method had high precision (14.93-99.12%). Combining methods improved the detection rate to 21.82% in outlier measurements. Finally, when comparing growth groups with and without outliers, the outliers were shown to alter group membership by 57.9-79.04%.
CONCLUSIONS
World Health Organization cut-off-based techniques were shown to perform well in a few very particular cases (extreme errors of high intensity), while model-based techniques performed well, especially for moderate errors of low intensity. Clustering-based outlier trajectory detection performed exceptionally well across all types and intensities of errors, indicating a potential strategic change in how outliers in growth data are viewed. Finally, the importance of detecting outliers was shown, given its impact on child growth studies, as demonstrated by comparing the results of growth group detection.
Topics: Child; Humans; Cluster Analysis; Research Design; Infant; Child Development
PubMed: 37833647
DOI: 10.1186/s12874-023-02045-w -
Journal of Applied Statistics 2022
Functional box plots satisfy two needs: visualization of functional data and the calculation of important box plot statistics. Data visualization illuminates key characteristics of functional sets missed by statistical tests and summary statistics. The calculation of box plot statistics for functional sets permits a novel comparison more suited to functional data. The functional box plot uses a depth method to visualize and rank smooth functional curves in terms of a mean, box, whiskers, and outliers. The functional box plot improves upon other classic functional data analysis tools such as functional principal components and discriminant analysis for outlier detection. This research adds wavelet analysis as a generating mechanism along with depth for functional box plots to visualize functional data and calculate relevant statistics. The wavelet analysis of variance (WANOVA) box plot tool gives competitive error rates in Gaussian test cases with magnitude outliers and outperforms the functional box plot in Gaussian test cases with shape outliers. Further, we show that wavelet analysis is well suited to approximating irregular and noisy functional data, and we show the enhanced capability of WANOVA box plots to classify shape outliers that follow a different pattern than other functional data, for both simulated and real data instances.
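To make the depth-based ranking concrete, here is a minimal Python sketch of one common depth measure, the modified band depth with bands formed by pairs of curves. The choice of this particular depth and the function name are assumptions for illustration; the abstract only says that "a depth method" is used, and the sketch does not cover the wavelet-based WANOVA variant.

```python
from itertools import combinations


def modified_band_depth(curves):
    """Modified band depth of each curve, using bands spanned by pairs of curves.

    `curves` is a list of equal-length lists sampled on a common grid.
    For each curve, the depth is the average (over all pairs) of the fraction
    of grid points at which the curve lies inside the band of that pair.
    """
    n_points = len(curves[0])
    pairs = list(combinations(range(len(curves)), 2))
    depths = []
    for curve in curves:
        total = 0.0
        for j, k in pairs:
            inside = sum(
                min(curves[j][t], curves[k][t]) <= curve[t] <= max(curves[j][t], curves[k][t])
                for t in range(n_points)
            )
            total += inside / n_points
        depths.append(total / len(pairs))
    return depths


# Three hypothetical curves on a three-point grid.
curves = [[0.0, 0.1, 0.0], [1.0, 1.1, 1.0], [2.0, 2.1, 2.0]]
depths = modified_band_depth(curves)
deepest = depths.index(max(depths))  # index of the most central curve
```

The deepest curve plays the role of the central curve in a functional box plot, and curves with low depth are candidates for the whisker and outlier regions.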
PubMed: 36246858
DOI: 10.1080/02664763.2021.1951685 -
EFORT Open Reviews Jan 2023
Review
Recent concerns surrounding joint replacements that have a higher than expected rate of revision have led to stricter controls by regulatory authorities with regard to the introduction of new devices into the marketplace. Implant post-market surveillance remains important, and joint replacement registries are ideally placed to perform this role. This review examined if and how joint replacement registries identified outlier prostheses, outlined problems and suggested solutions to improve post-market surveillance. A search was performed of all joint replacement registries that had electronic or published reports detailing the outcomes of joint replacement. These reports were examined for registry identification of outlier prostheses. Five registries publicly identified outlier prostheses in their reports and the methods by which this was performed, and three others had internal reports. Identification of outlier prostheses is one area that may improve overall joint replacement outcomes; however, further research is needed to determine the optimum methods for identification, including the threshold, the comparator and the numbers required for notification of devices. Co-operation of registries at a global level may lead to earlier identification of devices and thereby further improve the results of joint replacement.
PubMed: 36705620
DOI: 10.1530/EOR-22-0058 -
Brain and Behavior Nov 2023
BACKGROUND AND AIM
Patients with autism spectrum disorder (ASD) commonly experience aberrant skin sensation sensitivity; however, the causal relationship is not yet clear. This study uses a bidirectional Mendelian randomization (MR) method to explore the relationship between disturbance of skin sensation (DSS) and ASD.
METHODS
Single-nucleotide polymorphisms (SNPs) extracted from the summary data of genome-wide association studies were used as genetic instruments. MR was performed using the inverse-variance-weighted method, with alternate methods (e.g., weighted median, MR-Egger, simple mode, weighted mode, and MR-pleiotropy residual sum and outlier) and multiple sensitivity analyses to assess horizontal pleiotropy and remove outliers.
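The inverse-variance-weighted estimator named above can be sketched in a few lines of Python. This is a minimal fixed-effect version that assumes per-SNP summary statistics (SNP-exposure effects, SNP-outcome effects and outcome standard errors); the function name and the numbers are hypothetical and are not taken from the study.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect inverse-variance-weighted MR estimate and its standard error.

    Equivalent to a weighted regression of SNP-outcome effects on SNP-exposure
    effects through the origin, with weights 1 / se_outcome**2.
    """
    num = sum(bx * by / se**2 for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome))
    den = sum(bx**2 / se**2 for bx, se in zip(beta_exposure, se_outcome))
    return num / den, math.sqrt(1.0 / den)


# Hypothetical summary statistics for three SNPs (log-odds scale for the outcome).
bx = [0.10, 0.20, 0.15]
by = [0.012, 0.024, 0.018]
se = [0.01, 0.01, 0.01]

beta, beta_se = ivw_estimate(bx, by, se)
odds_ratio = math.exp(beta)  # causal estimate expressed as an odds ratio
ci_low = math.exp(beta - 1.96 * beta_se)
ci_high = math.exp(beta + 1.96 * beta_se)
```

In practice such analyses usually rely on dedicated MR software and a random-effects variant; the sketch only shows how the point estimate and a Wald-type interval arise from the summary data.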
RESULTS
The results of the analysis using six SNPs as genetic instruments showed that the DSS is associated with an increased risk of ASD (odds ratio = 1.126, 95% confidence interval = 1.029-1.132; p = .010). The results of the sensitivity analyses were robust with no evidence of pleiotropy. The reverse MR analyses showed no causal effects of ASD on DSS.
CONCLUSION
This study's findings suggest that DSS has potential causal effects on ASD, whereas ASD has no effect on DSS. Thus, skin sensitivity may represent a behavioral marker of ASD, by which some populations could be subtyped in the future.
Topics: Humans; Autism Spectrum Disorder; Genome-Wide Association Study; Mendelian Randomization Analysis; Skin; Sensation
PubMed: 37670485
DOI: 10.1002/brb3.3238 -
Frontiers in Genetics 2023
The causal direction and magnitude of the associations between blood cell count and coronary heart disease (CHD) remain uncertain due to susceptibility to reverse causation and confounding. This study aimed to investigate the associations between blood cell count and CHD using Mendelian randomization (MR). In this two-sample MR study, we identified independent genetic variants associated with blood cell count from a genome-wide association study (GWAS) among individuals of European ancestry. Summary-level data for CHD were obtained from a GWAS consisting of 547261 subjects. Inverse variance weighted (IVW), Mendelian Randomization-Egger (MR-Egger), weighted median, and outlier test (MR-PRESSO) methods were used to investigate the associations between blood cell count and CHD. Among all cardiovascular outcomes of interest, blood cell counts were only associated with CHD. Our findings indicated that white blood cell count and neutrophil cell count were significantly associated with increased risk of CHD [odds ratio (OR) = 1.07, 95% confidence interval (CI), 1.01-1.14; OR = 1.09, 1.02-1.16]. However, there was no significant association between monocyte cell count, basophil cell count, lymphocyte cell count, eosinophil cell count, and CHD (P > 0.05). The results after excluding outliers were consistent with the main results, and the sensitivity analyses showed no evidence of pleiotropy (MR-Egger intercept, P > 0.05). Our MR study suggested that greater white blood cell count and neutrophil cell count were associated with a higher risk of CHD. Future studies are still warranted to validate the results and investigate the mechanisms underlying these associations.
PubMed: 36824433
DOI: 10.3389/fgene.2023.1127820 -
Scientific Reports Mar 2023
Randomized Controlled Trial
There is still some controversy about the relationship between lipids and venous thromboembolism (VTE). A bidirectional Mendelian randomization (MR) study was conducted to clarify the causal relationship between three classical lipids (low-density lipoprotein (LDL), high-density lipoprotein (HDL) and triglycerides (TGs)) and VTE (deep venous thrombosis (DVT) and pulmonary embolism (PE)). We used the random-effects inverse variance weighted (IVW) model as the main analysis model and the weighted median, simple mode, weighted mode and MR-Egger methods as supplementary methods. The leave-one-out test was used to determine the influence of outliers. Heterogeneity was calculated using Cochran's Q statistic in the MR-Egger and IVW methods. The intercept term in the MR-Egger regression was used to indicate whether horizontal pleiotropy affected the results of the MR analysis. In addition, MR-PRESSO was used to identify outlier single-nucleotide polymorphisms (SNPs), and a stable result was obtained by removing the outlier SNPs and repeating the MR analysis. When we used the three classical lipids (LDL, HDL and TGs) as exposure variables, no causal relationship between them and VTE (DVT and PE) was found. In addition, we did not find significant causal effects of VTE on the three classical lipids in the reverse MR analysis. There is no significant causal relationship between the three classical lipids (LDL, HDL and TGs) and VTE (DVT and PE) from a genetic point of view.
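The heterogeneity and leave-one-out checks described above can be sketched as follows. This is a minimal Python illustration built on a fixed-effect IVW estimate with first-order weights; the function names and the summary statistics are hypothetical, and the sketch is not the authors' pipeline.

```python
def ivw(bx, by, se):
    """Fixed-effect IVW estimate from per-SNP summary statistics."""
    num = sum(x * y / s**2 for x, y, s in zip(bx, by, se))
    den = sum(x**2 / s**2 for x, s in zip(bx, se))
    return num / den


def cochran_q(bx, by, se):
    """Cochran's Q: weighted squared deviations of per-SNP ratio estimates from the IVW estimate."""
    beta = ivw(bx, by, se)
    return sum((x**2 / s**2) * (y / x - beta) ** 2 for x, y, s in zip(bx, by, se))


def leave_one_out(bx, by, se):
    """IVW estimate recomputed with each SNP omitted in turn."""
    estimates = []
    for k in range(len(bx)):
        keep = [j for j in range(len(bx)) if j != k]
        estimates.append(ivw([bx[j] for j in keep], [by[j] for j in keep], [se[j] for j in keep]))
    return estimates


# Hypothetical data: the fourth SNP has a ratio estimate far from the others.
bx = [0.10, 0.20, 0.15, 0.10]
by = [0.012, 0.024, 0.018, 0.050]
se = [0.01, 0.01, 0.01, 0.01]

q = cochran_q(bx, by, se)
loo = leave_one_out(bx, by, se)
```

A large Q relative to a chi-squared distribution with (number of SNPs - 1) degrees of freedom suggests heterogeneity, and a leave-one-out estimate that shifts markedly when one SNP is dropped points to that SNP as influential.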
Topics: Humans; Venous Thromboembolism; Venous Thrombosis; Lipids; Pulmonary Embolism; Triglycerides; Lipoproteins, HDL; Mendelian Randomization Analysis; Polymorphism, Single Nucleotide; Genome-Wide Association Study
PubMed: 36890190
DOI: 10.1038/s41598-023-31067-z -
BMC Public Health Jan 2023
BACKGROUND
One of the seminal events since 2019 has been the outbreak of the SARS-CoV-2 pandemic. Countries have adopted various policies to deal with it, but they also differ in their socio-geographical characteristics and public health care facilities. Our study aimed to investigate differences between epidemiological parameters across countries.
METHOD
The analysed data come from the SARS-CoV-2 repository provided by Johns Hopkins University. Separately for each country, we estimated recovery and mortality rates using the SIRD model applied to the first 30, 60, 150, and 300 days of the pandemic. Moreover, a mixture of normal distributions was fitted to the number of confirmed cases and deaths during the first 300 days. The estimates of the peaks' means and variances were used to identify countries with outlying parameters.
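For readers unfamiliar with the SIRD model, the following minimal Python sketch simulates a discrete-time version and then recovers the recovery and mortality rates from the simulated series by least squares through the origin. The daily Euler step, the constant population size, the function names and all parameter values are assumptions for illustration; the abstract does not describe the estimation procedure in this detail.

```python
def simulate_sird(beta, gamma, mu, s0, i0, days):
    """Discrete-time SIRD model with a one-day step.

    beta: transmission rate, gamma: recovery rate, mu: mortality rate.
    Returns a list of (S, I, R, D) tuples, one per day.
    """
    n = s0 + i0  # population size, held fixed in this sketch
    s, i, r, d = float(s0), float(i0), 0.0, 0.0
    rows = [(s, i, r, d)]
    for _ in range(days):
        new_inf = beta * s * i / n
        new_rec = gamma * i
        new_dead = mu * i
        s, i, r, d = s - new_inf, i + new_inf - new_rec - new_dead, r + new_rec, d + new_dead
        rows.append((s, i, r, d))
    return rows


def estimate_rate(active, cumulative):
    """Least-squares slope through the origin of daily increments on active cases."""
    increments = [b - a for a, b in zip(cumulative, cumulative[1:])]
    num = sum(i * inc for i, inc in zip(active, increments))
    den = sum(i * i for i in active[:-1])
    return num / den


# Hypothetical parameters and a 60-day window.
rows = simulate_sird(beta=0.3, gamma=0.1, mu=0.01, s0=9990, i0=10, days=60)
active = [row[1] for row in rows]
gamma_hat = estimate_rate(active, [row[2] for row in rows])  # recovery rate
mu_hat = estimate_rate(active, [row[3] for row in rows])     # mortality rate
```

On real surveillance data the increments are noisy and reporting varies by country, so estimates over different windows (such as 30, 60, 150 and 300 days) can differ considerably.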
RESULTS
For the 300-day timeframe, Belgium, Cyprus, France, the Netherlands, Serbia, and the UK were classified as outliers by all three outlier detection methods. Yemen was classified as an outlier for each of the four considered timeframes, due to high mortality rates. During the first 300 days of the pandemic, the majority of countries underwent three peaks in the number of confirmed cases, except Australia and Kazakhstan, which had two peaks.
CONCLUSIONS
Considering recovery and mortality rates, we observed heterogeneity between countries. Liechtenstein was the "positive" outlier, with low mortality rates and high recovery rates; conversely, Yemen represented a "negative" outlier, with high mortality for all four considered periods and low recovery for 30 and 60 days.
Topics: Humans; SARS-CoV-2; COVID-19; Pandemics; Disease Outbreaks; France
PubMed: 36681790
DOI: 10.1186/s12889-023-15092-1 -
Journal of Applied Statistics 2023
Discriminative subspace clustering (DSC) can make full use of linear discriminant analysis (LDA) to reduce the dimension of data and achieve effective clustering of high-dimensional data by clustering low-dimensional data in the discriminant subspace. However, most existing DSC algorithms do not consider the noise and outliers that data sets may contain, and they often perform poorly when applied to such data sets. In this paper, we address the sensitivity of DSC to noise and outliers. Replacing the Euclidean distance in the objective function of LDA with an exponential non-Euclidean distance, we first develop a noise-insensitive LDA (NILDA) algorithm. Then, combining the proposed NILDA with a noise-insensitive fuzzy clustering algorithm, AFKM, we propose a noise-insensitive discriminative subspace fuzzy clustering (NIDSFC) algorithm. Experiments on some benchmark data sets show the effectiveness of the proposed NIDSFC algorithm.
PubMed: 36819072
DOI: 10.1080/02664763.2021.1937583 -
Frontiers in Immunology 2024
Investigating the causal relationship and potential shared diagnostic genes between primary biliary cholangitis and systemic lupus erythematosus using bidirectional Mendelian randomization and transcriptomic analyses.
BACKGROUND
The co-occurrence of primary biliary cholangitis (PBC) and systemic lupus erythematosus (SLE) has been consistently reported in observational studies. Nevertheless, the underlying causal correlation between these two conditions still needs to be established.
METHODS
We performed a bidirectional two-sample Mendelian randomization (MR) study to assess their causal association. Five MR analysis methods were utilized for causal inference, with inverse-variance weighted (IVW) selected as the primary method. The Mendelian Randomization Pleiotropy RESidual Sum and Outlier (MR-PRESSO) and the IVW Radial method were applied to exclude outlying SNPs. To assess the robustness of the MR results, five sensitivity analyses were carried out. Multivariable MR (MVMR) analysis was also employed to evaluate the effect of possible confounders. In addition, we integrated transcriptomic data from PBC and SLE, employing Weighted Gene Co-expression Network Analysis (WGCNA) to explore shared genes between the two diseases. Then, we performed Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses on the shared genes. The Least Absolute Shrinkage and Selection Operator (LASSO) regression algorithm was utilized to identify potential shared diagnostic genes. Finally, we verified the potential shared diagnostic genes in specific cell populations of peripheral blood mononuclear cells (PBMCs) from SLE patients by single-cell analysis.
RESULTS
Our MR study provided evidence that PBC had a causal relationship with SLE (IVW, OR: 1.347, 95% CI: 1.276 - 1.422, P < 0.001) after removing outliers (MR-PRESSO, rs35464393, rs3771317; IVW Radial, rs11065987, rs12924729, rs3745516). Conversely, SLE also had a causal association with PBC (IVW, OR: 1.225, 95% CI: 1.141 - 1.315, P < 0.001) after outlier correction (MR-PRESSO, rs11065987, rs3763295, rs7774434; IVW Radial, rs2297067). Sensitivity analyses confirmed the robustness of the MR findings. MVMR analysis indicated that body mass index (BMI), smoking and drinking were not confounding factors. Moreover, bioinformatic analysis identified PARP9, ABCA1, CEACAM1, and DDX60L as promising diagnostic biomarkers for PBC and SLE. These four genes are highly expressed in CD14+ monocytes in PBMCs of SLE patients and potentially associated with innate immune responses and immune activation.
CONCLUSION
Our study confirmed the bidirectional causal relationship between PBC and SLE and identified PARP9, ABCA1, CEACAM1, and DDX60L genes as the most potentially shared diagnostic genes between the two diseases, providing insights for the exploration of the underlying mechanisms of these disorders.
Topics: Humans; Leukocytes, Mononuclear; Liver Cirrhosis, Biliary; Mendelian Randomization Analysis; Gene Expression Profiling; CEACAM1 Protein; Lupus Erythematosus, Systemic
PubMed: 38464525
DOI: 10.3389/fimmu.2024.1270401 -
Economics Letters Feb 2022
COVID-19 hit the economy in an unprecedented way, changing the data generating process of many series. We compare different seasonal adjustment methods through simulations, introducing outliers in the trend and seasonality to reproduce the heterogeneity in the series during COVID-19.
PubMed: 34931098
DOI: 10.1016/j.econlet.2021.110206