Intensive Care Medicine, Mar 2020
Meta-Analysis Review
PURPOSE
Early clinical recognition of sepsis can be challenging. With the advancement of machine learning, promising real-time models to predict sepsis have emerged. We assessed their performance by carrying out a systematic review and meta-analysis.
METHODS
A systematic search was performed in PubMed, Embase.com and Scopus. Studies targeting sepsis, severe sepsis or septic shock in any hospital setting were eligible for inclusion. The index test was any supervised machine learning model for real-time prediction of these conditions. Quality of evidence was assessed using the Grading of Recommendations Assessment, Development and Evaluation (GRADE) methodology, with a tailored Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2) checklist to evaluate risk of bias. Models with a reported area under the curve of the receiver operating characteristic (AUROC) metric were meta-analyzed to identify strongest contributors to model performance.
RESULTS
After screening, a total of 28 papers were eligible for synthesis, from which 130 models were extracted. Most papers were developed in the intensive care unit (ICU, n = 15; 54%), followed by hospital wards (n = 7; 25%), the emergency department (ED, n = 4; 14%), and a combination of these settings (n = 2; 7%). For the prediction of sepsis, diagnostic test accuracy assessed by the AUROC ranged from 0.68-0.99 in the ICU, 0.96-0.98 in-hospital, and 0.87-0.97 in the ED. Varying sepsis definitions limited pooling of performance across studies. Only three papers clinically implemented models, with mixed results. In the multivariate analysis, temperature, laboratory values, and model type contributed most to model performance.
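AUROC, the metric reported throughout this review, has a simple rank-based interpretation: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one. A minimal pure-Python sketch (the labels and scores below are invented for illustration, not drawn from any included study):

```python
def auroc(labels, scores):
    """AUROC via the Mann-Whitney rank formulation: the probability that a
    randomly chosen positive case is scored higher than a randomly chosen
    negative one, with ties counting half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy example: a hypothetical model scoring septic patients mostly higher
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2, 0.1]
print(auroc(labels, scores))
```

For realistic dataset sizes the O(P×N) double loop would be replaced by a sort-based implementation or a library routine such as scikit-learn's `roc_auc_score`.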
CONCLUSION
This systematic review and meta-analysis shows that, on retrospective data, individual machine learning models can accurately predict sepsis onset ahead of time. Although they present alternatives to traditional scoring systems, between-study heterogeneity limits the assessment of pooled results. Systematic reporting and clinical implementation studies are needed to bridge the gap between bytes and bedside.
Topics: Diagnostic Tests, Routine; Humans; Machine Learning; Retrospective Studies; Sepsis; Shock, Septic
PubMed: 31965266
DOI: 10.1007/s00134-019-05872-y

The Lancet Digital Health, Jun 2022
Review
Skin cancers occur commonly worldwide. The prognosis and disease burden are highly dependent on the cancer type and disease stage at diagnosis. We systematically reviewed studies on artificial intelligence and machine learning (AI/ML) algorithms that aim to facilitate the early diagnosis of skin cancers, focusing on their application in primary and community care settings. We searched MEDLINE, Embase, Scopus, and Web of Science (from Jan 1, 2000, to Aug 9, 2021) for all studies providing evidence on applying AI/ML algorithms to the early diagnosis of skin cancer, including all study designs and languages. The primary outcome was diagnostic accuracy of the algorithms for skin cancers. The secondary outcomes included an overview of AI/ML methods, evaluation approaches, cost-effectiveness, and acceptability to patients and clinicians. We identified 14 224 studies. Only two studies used data from clinical settings with a low prevalence of skin cancers. We reported data from all 272 studies that could be relevant in primary care. The primary outcomes showed reasonable mean diagnostic accuracy for melanoma (89·5% [range 59·7-100%]), squamous cell carcinoma (85·3% [71·0-97·8%]), and basal cell carcinoma (87·6% [70·0-99·7%]). The secondary outcomes showed a heterogeneity of AI/ML methods and study designs, with high amounts of incomplete reporting (eg, patient demographics and methods of data collection). Few studies used data on populations with a low prevalence of skin cancers to train and test their algorithms; therefore, the widespread adoption into community and primary care practice cannot currently be recommended until efficacy in these populations is shown. We did not identify any health economic, patient, or clinician acceptability data for any of the included studies. We propose a methodological checklist for use in the development of new AI/ML algorithms to detect skin cancer, to facilitate their design, evaluation, and implementation.
Topics: Algorithms; Artificial Intelligence; Early Detection of Cancer; Humans; Machine Learning; Primary Health Care; Skin Neoplasms
PubMed: 35623799
DOI: 10.1016/S2589-7500(22)00023-1

Healthcare (Basel, Switzerland), Dec 2021
Review
Emotional intelligence (EI) refers to the ability to perceive, express, understand, and manage emotions. Current research indicates that it may protect against the emotional burden experienced in certain professions. This article aims to provide an updated systematic review of existing instruments to assess EI in professionals, focusing on the description of their characteristics as well as their psychometric properties (reliability and validity). A literature search was conducted in Web of Science (WoS). A total of 2761 items met the eligibility criteria, from which a total of 40 different instruments were extracted and analysed. Most were based on three main models (i.e., skill-based, trait-based, and mixed), which differ in the way they conceptualize and measure EI. All have been shown to have advantages and disadvantages inherent to the type of tool. The instruments reported in the largest number of studies are the Emotional Quotient Inventory (EQ-i), the Schutte Self-Report Inventory (SSRI), the Mayer-Salovey-Caruso Emotional Intelligence Test 2.0 (MSCEIT 2.0), the Trait Meta-Mood Scale (TMMS), Wong and Law's Emotional Intelligence Scale (WLEIS), and the Trait Emotional Intelligence Questionnaire (TEIQue). The main measure of estimated reliability has been internal consistency, and the construction of EI measures was predominantly based on linear modelling or classical test theory. The study has limitations: only a single database was searched, inter-rater reliability could not be estimated, and some items required by PRISMA could not be complied with.
PubMed: 34946422
DOI: 10.3390/healthcare9121696

Radiology, Jul 2022
Meta-Analysis
Background Patients with fractures are a common emergency presentation and may be misdiagnosed at radiologic imaging. An increasing number of studies apply artificial intelligence (AI) techniques to fracture detection as an adjunct to clinician diagnosis. Purpose To perform a systematic review and meta-analysis comparing the diagnostic performance in fracture detection between AI and clinicians in peer-reviewed publications and the gray literature (ie, articles published on preprint repositories). Materials and Methods A search of multiple electronic databases between January 2018 and July 2020 (updated June 2021) was performed that included any primary research studies that developed and/or validated AI for the purposes of fracture detection at any imaging modality and excluded studies that evaluated image segmentation algorithms. Meta-analysis with a hierarchical model to calculate pooled sensitivity and specificity was used. Risk of bias was assessed by using a modified Prediction Model Study Risk of Bias Assessment Tool, or PROBAST, checklist. Results Included for analysis were 42 studies, with 115 contingency tables extracted from 32 studies (55 061 images). Thirty-seven studies identified fractures on radiographs and five studies identified fractures on CT images. For internal validation test sets, the pooled sensitivity was 92% (95% CI: 88, 93) for AI and 91% (95% CI: 85, 95) for clinicians, and the pooled specificity was 91% (95% CI: 88, 93) for AI and 92% (95% CI: 89, 92) for clinicians. For external validation test sets, the pooled sensitivity was 91% (95% CI: 84, 95) for AI and 94% (95% CI: 90, 96) for clinicians, and the pooled specificity was 91% (95% CI: 81, 95) for AI and 94% (95% CI: 91, 95) for clinicians. There were no statistically significant differences between clinician and AI performance. There were 22 of 42 (52%) studies that were judged to have high risk of bias. 
Meta-regression identified multiple sources of heterogeneity in the data, including risk of bias and fracture type. Conclusion Artificial intelligence (AI) and clinicians had comparable reported diagnostic performance in fracture detection, suggesting that AI technology holds promise as a diagnostic adjunct in future clinical practice. Clinical trial registration no. CRD42020186641 © RSNA, 2022 See also the editorial by Cohen and McInnes in this issue.
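Pooling sensitivity across studies can be illustrated with a simplified fixed-effect, inverse-variance average on the logit scale; note that the review itself used a hierarchical (bivariate) model, which this sketch does not reproduce, and the contingency counts below are hypothetical:

```python
import math

def pooled_sensitivity(studies):
    """Fixed-effect inverse-variance pooling of sensitivity on the logit
    scale. Each study is a (true_positives, false_negatives) pair. A 0.5
    continuity correction guards against zero cells."""
    num = den = 0.0
    for tp, fn in studies:
        tp, fn = tp + 0.5, fn + 0.5
        logit = math.log(tp / fn)       # logit of sensitivity
        var = 1 / tp + 1 / fn           # approximate variance of the logit
        num += logit / var
        den += 1 / var
    pooled_logit = num / den
    return 1 / (1 + math.exp(-pooled_logit))

# Hypothetical contingency data from three fracture-detection studies
print(round(pooled_sensitivity([(90, 10), (85, 15), (180, 20)]), 3))
```

The same transformation applies to specificity with (true_negatives, false_positives); hierarchical bivariate models additionally account for the correlation between the two.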
Topics: Algorithms; Artificial Intelligence; Fractures, Bone; Humans; Sensitivity and Specificity
PubMed: 35348381
DOI: 10.1148/radiol.211785

Journal of Medical Internet Research, Sep 2020
Meta-Analysis
BACKGROUND
Helicobacter pylori plays a central role in the development of gastric cancer, and prediction of H pylori infection by visual inspection of the gastric mucosa is an important function of endoscopy. However, there are currently no established methods of optical diagnosis of H pylori infection using endoscopic images. Definitive diagnosis requires endoscopic biopsy. Artificial intelligence (AI) has been increasingly adopted in clinical practice, especially for image recognition and classification.
OBJECTIVE
This study aimed to evaluate the diagnostic test accuracy of AI for the prediction of H pylori infection using endoscopic images.
METHODS
Two independent evaluators searched core databases. The inclusion criteria included studies with endoscopic images of H pylori infection and with application of AI for the prediction of H pylori infection presenting diagnostic performance. Systematic review and diagnostic test accuracy meta-analysis were performed.
RESULTS
Ultimately, 8 studies were identified. Pooled sensitivity, specificity, diagnostic odds ratio, and area under the curve of AI for the prediction of H pylori infection were 0.87 (95% CI 0.72-0.94), 0.86 (95% CI 0.77-0.92), 40 (95% CI 15-112), and 0.92 (95% CI 0.90-0.94), respectively, in the 1719 patients (385 patients with H pylori infection vs 1334 controls). Meta-regression identified methodological quality and the number of patients included in each study as sources of heterogeneity. There was no evidence of publication bias. The accuracy of the AI algorithm reached 82% for discrimination between noninfected images and posteradication images.
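The diagnostic odds ratio reported above condenses a 2×2 contingency table into one number: the odds of a positive test among infected patients divided by the odds of a positive test among controls. A toy sketch (counts invented to roughly match the pooled sensitivity and specificity, not the review's data):

```python
def diagnostic_odds_ratio(tp, fp, fn, tn):
    """DOR = odds of a positive test in the diseased group divided by the
    odds of a positive test in the non-diseased group."""
    return (tp / fn) / (fp / tn)

# Hypothetical 100-vs-100 cohort with sensitivity 0.87 and specificity 0.86
tp, fn = 87, 13   # infected:   87 detected, 13 missed
fp, tn = 14, 86   # uninfected: 14 false alarms, 86 correctly negative
print(round(diagnostic_odds_ratio(tp, fp, fn, tn), 1))
```

A DOR of 1 means the test carries no diagnostic information; values around 40, as pooled here, indicate strong discrimination.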
CONCLUSIONS
An AI algorithm is a reliable tool for endoscopic diagnosis of H pylori infection. However, the lack of external validation and the fact that all included studies were conducted in Asia are limitations that should be overcome.
TRIAL REGISTRATION
PROSPERO CRD42020175957; https://www.crd.york.ac.uk/prospero/display_record.php?RecordID=175957.
Topics: Artificial Intelligence; Diagnostic Tests, Routine; Endoscopy; Helicobacter Infections; Helicobacter pylori; Humans
PubMed: 32936088
DOI: 10.2196/21983

Clinical Oral Investigations, Jul 2021
Meta-Analysis Review
OBJECTIVES
Deep learning (DL) has been increasingly employed for automated landmark detection, e.g., for cephalometric purposes. We performed a systematic review and meta-analysis to assess the accuracy and underlying evidence for DL for cephalometric landmark detection on 2-D and 3-D radiographs.
METHODS
Diagnostic accuracy studies published in 2015-2020 in Medline/Embase/IEEE/arXiv and employing DL for cephalometric landmark detection were identified and extracted by two independent reviewers. Random-effects meta-analysis, subgroup, and meta-regression were performed, and study quality was assessed using QUADAS-2. The review was registered (PROSPERO no. 227498).
DATA
From 321 identified records, 19 studies (published 2017-2020), all employing convolutional neural networks, mainly on 2-D lateral radiographs (n=15), using data from publicly available datasets (n=12) and testing the detection of a mean of 30 (SD: 25; range: 7-93) landmarks, were included. The reference test was established by two experts (n=11), one expert (n=4), three experts (n=3), or a set of annotators (n=1). Risk of bias was high, and applicability concerns were detected for most studies, mainly regarding the data selection and reference test conduct. Landmark prediction error centered around a 2-mm error threshold (mean error: -0.581 mm; 95% CI: -1.264 to 0.102 mm). The proportion of landmarks detected within this 2-mm threshold was 0.799 (95% CI: 0.770 to 0.824).
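The two summary quantities used here, mean landmark prediction error and the proportion of landmarks detected within a 2-mm threshold, can be computed directly from paired coordinates; a sketch with hypothetical landmark positions:

```python
import math

def landmark_metrics(predicted, truth, threshold_mm=2.0):
    """Mean radial error (mm) and detection rate within a threshold,
    given parallel lists of (x, y) coordinates in millimetres."""
    errors = [math.dist(p, t) for p, t in zip(predicted, truth)]
    mean_error = sum(errors) / len(errors)
    rate = sum(e <= threshold_mm for e in errors) / len(errors)
    return mean_error, rate

# Four hypothetical landmarks: three inside, one outside the 2-mm band
pred  = [(10.0, 10.0), (20.5, 31.0), (40.0, 42.5), (55.0, 50.0)]
truth = [(10.5, 10.0), (20.0, 30.0), (40.0, 40.0), (55.5, 50.5)]
err, rate = landmark_metrics(pred, truth)
print(round(err, 2), rate)
```

On 3-D imagery the same formulation applies with (x, y, z) triples, since `math.dist` accepts coordinates of any dimension.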
CONCLUSIONS
DL shows relatively high accuracy for detecting landmarks on cephalometric imagery. The overall body of evidence is consistent but suffers from high risk of bias. Demonstrating robustness and generalizability of DL for landmark detection is needed.
CLINICAL SIGNIFICANCE
Existing DL models show consistent and largely high accuracy for automated detection of cephalometric landmarks. The majority of studies so far focused on 2-D imagery; data on 3-D imagery are sparse, but promising. Future studies should focus on demonstrating generalizability, robustness, and clinical usefulness of DL for this objective.
Topics: Cephalometry; Deep Learning; Radiography; Reproducibility of Results
PubMed: 34046742
DOI: 10.1007/s00784-021-03990-w

International Journal of Medical..., Jul 2021
Meta-Analysis Review
INTRODUCTION
We aimed to assess whether machine learning models are superior at predicting acute kidney injury (AKI) compared to logistic regression (LR), a conventional prediction model.
METHODS
Eligible studies were identified using PubMed and Embase. A total of 24 studies consisting of 84 prediction models met inclusion criteria. Independent samples t-test was performed to detect mean differences in area under the curve (AUC) between ML and LR models. One-way ANOVA and post-hoc t-tests were performed to assess mean differences in AUC between ML methods.
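The core comparison reduces to an independent-samples t-test on two sets of reported AUCs. A sketch using Welch's unequal-variance variant (the AUC values below are invented; the study's extracted values are not reproduced here):

```python
from statistics import mean, stdev

def welch_t(a, b):
    """Welch's t statistic for two independent samples with possibly
    unequal variances."""
    va = stdev(a) ** 2 / len(a)   # squared standard error, sample a
    vb = stdev(b) ** 2 / len(b)   # squared standard error, sample b
    return (mean(a) - mean(b)) / (va + vb) ** 0.5

ml_aucs = [0.71, 0.85, 0.64, 0.80, 0.77, 0.69]   # hypothetical ML models
lr_aucs = [0.74, 0.76, 0.71, 0.78, 0.75]         # hypothetical LR models
print(round(welch_t(ml_aucs, lr_aucs), 2))
```

In practice a library routine such as `scipy.stats.ttest_ind` would also supply degrees of freedom and a p-value.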
RESULTS
AUC data were similar between ML (0.736 ± 0.116) and LR (0.748 ± 0.057) models (p = 0.538). However, specific ML models, such as gradient boosting (0.838 ± 0.077), exhibited superior performance at predicting AKI as compared to other ML models in the literature (p < 0.05). Creatinine and urine output, standard variables assessed for AKI staging, were classified as significant predictors across multiple ML models, although the majority of significant predictors were unique and study specific.
CONCLUSIONS
These data suggest that ML models perform comparably to LR overall, although their performance varies considerably, with some ML models performing exceptionally well. The variability in ML prediction of AKI can be attributed, in part, to the specific ML model utilized, variable selection and processing, study and subject characteristics, and the steps associated with model training, validation, testing, and calibration.
Topics: Acute Kidney Injury; Creatinine; Humans; Logistic Models; Machine Learning
PubMed: 33991886
DOI: 10.1016/j.ijmedinf.2021.104484

JAMA Network Open, Dec 2023
Meta-Analysis
IMPORTANCE
Contemporary studies raise concerns regarding the implications of excessive screen time on the development of autism spectrum disorder (ASD). However, the existing literature consists of mixed and unquantified findings.
OBJECTIVE
To conduct a systematic review and meta-analysis of the association between screen time and ASD.
DATA SOURCES
A search was conducted in the PubMed, PsycNET, and ProQuest Dissertation & Theses Global databases for studies published up to May 1, 2023.
STUDY SELECTION
The search was conducted independently by 2 authors. Included studies comprised empirical, peer-reviewed articles or dissertations published in English with statistics from which relevant effect sizes could be calculated. Discrepancies were resolved by consensus.
DATA EXTRACTION AND SYNTHESIS
This study followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) reporting guideline. Two authors independently coded all titles and abstracts, reviewed full-text articles against the inclusion and exclusion criteria, and resolved all discrepancies by consensus. Effect sizes were transformed into log odds ratios (ORs) and analyzed using a random-effects meta-analysis and mixed-effects meta-regression. Study quality was assessed using the Grading of Recommendations, Assessment, Development, and Evaluations (GRADE) approach. Publication bias was tested via the Egger z test for funnel plot asymmetry. Data analysis was performed in June 2023.
MAIN OUTCOMES AND MEASURES
The 2 main variables of interest in this study were screen time and ASD. Screen time was defined as hours of screen use per day or per week, and ASD was defined as an ASD clinical diagnosis (yes or no) or ASD symptoms. The meta-regression considered screen type (ie, general use of screens, television, video games, computers, smartphones, and social media), age group (children vs adults or heterogenous age groups), and type of ASD measure (clinical diagnosis vs ASD symptoms).
RESULTS
Of the 4682 records identified, 46 studies with a total of 562 131 participants met the inclusion criteria. The studies were observational (5 were longitudinal and 41 were cross-sectional) and included 66 relevant effect sizes. The meta-analysis resulted in a positive summary effect size (log OR, 0.54 [95% CI, 0.34 to 0.74]). A trim-and-fill correction for a significant publication bias (Egger z = 2.15; P = .03) resulted in a substantially decreased and nonsignificant effect size (log OR, 0.22 [95% CI, -0.004 to 0.44]). The meta-regression results suggested that the positive summary effect size was only significant in studies targeting general screen use (β [SE] = 0.73 [0.34]; t(58) = 2.10; P = .03). This effect size was most dominant in studies of children (log OR, 0.98 [95% CI, 0.66 to 1.29]). Interestingly, a negative summary effect size was observed in studies investigating associations between social media and ASD (log OR, -1.24 [95% CI, -1.51 to -0.96]).
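The random-effects pooling of log odds ratios described above can be sketched with the DerSimonian-Laird estimator, which adds an estimated between-study variance (τ²) to each study's within-study variance before weighting; the effect sizes below are invented for illustration, not the review's data:

```python
def dersimonian_laird(effects, variances):
    """Random-effects pooled estimate of per-study effects (e.g. log ORs)
    via the DerSimonian-Laird method. Returns (pooled_effect, tau_squared)."""
    k = len(effects)
    w = [1 / v for v in variances]                      # fixed-effect weights
    fe = sum(wi * e for wi, e in zip(w, effects)) / sum(w)
    q = sum(wi * (e - fe) ** 2 for wi, e in zip(w, effects))  # Cochran's Q
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)                  # between-study variance
    w_re = [1 / (v + tau2) for v in variances]          # random-effects weights
    pooled = sum(wi * e for wi, e in zip(w_re, effects)) / sum(w_re)
    return pooled, tau2

# Five hypothetical studies: log ORs and their within-study variances
log_ors = [0.80, 0.30, 0.55, 0.10, 0.95]
variances = [0.04, 0.02, 0.06, 0.03, 0.05]
pooled, tau2 = dersimonian_laird(log_ors, variances)
print(round(pooled, 2), round(tau2, 3))
```

When the studies agree perfectly, τ² collapses to zero and the estimate reduces to the fixed-effect inverse-variance average.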
CONCLUSIONS AND RELEVANCE
The findings of this systematic review and meta-analysis suggest that the proclaimed association between screen use and ASD is not sufficiently supported in the existing literature. Although excessive screen use may pose developmental risks, the mixed findings, the small effect sizes (especially when considering the observed publication bias), and the correlational nature of the available research require further scientific investigation. These findings also do not rule out the complementary hypothesis that children with ASD may prioritize screen activities to avoid social challenges.
Topics: Child; Adult; Humans; Autism Spectrum Disorder; Screen Time; Publication Bias
PubMed: 38064216
DOI: 10.1001/jamanetworkopen.2023.46775

International Journal of Molecular..., Mar 2021
BACKGROUND
Alzheimer's disease (AD) is a complex and severe neurodegenerative disease that still lacks effective methods of diagnosis. The current diagnostic methods of AD rely on cognitive tests, imaging techniques and cerebrospinal fluid (CSF) levels of amyloid-β1-42 (Aβ42), total tau protein and hyperphosphorylated tau (p-tau). However, the available methods are expensive and relatively invasive. Artificial intelligence techniques like machine learning tools have been increasingly used in precision diagnosis.
OBJECTIVE
We conducted a meta-analysis to investigate machine learning and novel biomarkers for the diagnosis of AD.
METHODS
We searched PubMed, the Cochrane Central Register of Controlled Trials, and the Cochrane Database of Systematic Reviews for reviews and trials that investigated machine learning and novel biomarkers in the diagnosis of AD.
RESULTS
In addition to Aβ- and tau-related biomarkers, biomarkers reflecting other mechanisms of AD pathology have been investigated. Neuronal injury biomarkers include neurofilament light (NFL). Biomarkers of synaptic dysfunction and/or loss include neurogranin, BACE1, synaptotagmin, SNAP-25, GAP-43, and synaptophysin. Biomarkers of neuroinflammation include sTREM2 and YKL-40. In addition, d-glutamate is one of the co-agonists at the NMDARs. Several machine learning algorithms (including support vector machine, logistic regression, random forest, and naïve Bayes) have been applied to build optimal predictive models to distinguish patients with AD from healthy controls.
CONCLUSIONS
Our results revealed machine learning with novel biomarkers and multiple variables may increase the sensitivity and specificity in diagnosis of AD. Rapid and cost-effective HPLC for biomarkers and machine learning algorithms may assist physicians in diagnosing AD in outpatient clinics.
Topics: Aged; Alzheimer Disease; Biomarkers; Chromatography, High Pressure Liquid; Diagnosis, Computer-Assisted; Female; Humans; Machine Learning; Middle Aged
PubMed: 33803217
DOI: 10.3390/ijms22052761

Journal of Neuroengineering and..., Aug 2022
Meta-Analysis Review
Examining the effectiveness of virtual, augmented, and mixed reality (VAMR) therapy for upper limb recovery and activities of daily living in stroke patients: a systematic review and meta-analysis.
INTRODUCTION
Virtual reality (VR), augmented reality (AR), and mixed reality (MR) are emerging technologies in the field of stroke rehabilitation that have the potential to overcome the limitations of conventional treatment. Enhancing upper limb (UL) function is critical in stroke impairments because the upper limb is involved in the majority of activities of daily living (ADL).
METHODS
This study reviewed the use of virtual, augmented, and mixed reality (VAMR) methods for improving UL recovery and ADL, and compared the effectiveness of VAMR treatment with conventional rehabilitation therapy. The databases ScienceDirect, PubMed, IEEE Xplore, and Web of Science were searched, and 50 randomized controlled trials comparing VAMR treatment with standard therapy were identified. Random-effects or fixed-effects models were applied depending on heterogeneity.
RESULTS
The most frequently used outcomes of UL recovery and ADL in stroke rehabilitation were the Fugl-Meyer Assessment for Upper Extremities (FMA-UE), followed by the Box and Block Test (BBT), the Wolf Motor Function Test (WMFT), and the Functional Independence Measure (FIM). According to the meta-analysis, VR, AR, and MR all had a significant positive effect on improving FMA-UE for UL impairment (36 studies, MD = 3.91, 95% CI = 1.70 to 6.12, P = 0.0005) and FIM for ADL (10 studies, MD = 4.25, 95% CI = 1.47 to 7.03, P = 0.003), but not on BBT and WMFT for the UL function tests (16 studies, MD = 2.07, 95% CI = -0.58 to 4.72, P = 0.13).
CONCLUSIONS
VAMR therapy was superior to conventional treatment for UL impairment and daily function outcomes, but not for UL function measures. Future studies might include further high-quality trials examining the effect of VR, AR, and MR on UL function measures, with an emphasis on subgroup meta-analysis by stroke type and recovery stage.
Topics: Activities of Daily Living; Augmented Reality; Humans; Recovery of Function; Stroke; Stroke Rehabilitation; Upper Extremity
PubMed: 36002898
DOI: 10.1186/s12984-022-01071-x