International Journal of Surgery, Nov 2021
Review
BACKGROUND
Despite the extensive published literature on the significant potential of artificial intelligence (AI), there are no reports on its efficacy in improving patient safety in robot-assisted surgery (RAS). The purposes of this work are to systematically review the published literature on AI in RAS and to identify and discuss current limitations and challenges.
MATERIALS AND METHODS
A literature search was conducted in PubMed, Web of Science, Scopus, and IEEE Xplore according to the PRISMA 2020 statement. Eligible articles were peer-reviewed studies published in English from January 1, 2016 to December 31, 2020. AMSTAR 2 was used for quality assessment. Risk of bias was evaluated with the Newcastle-Ottawa Quality Assessment tool. Study data were presented in tables using the SPIDER tool.
RESULTS
Thirty-five publications, representing 3436 patients, met the search criteria and were included in the analysis. The selected reports concern motion analysis (n = 17), urology (n = 12), gynecology (n = 1), other specialties (n = 1), training (n = 3), and tissue retraction (n = 1). Precision for surgical tool detection varied from 76.0% to 90.6%. Mean absolute error in predicting urinary continence after robot-assisted radical prostatectomy (RARP) ranged from 85.9 to 134.7 days. Accuracy in predicting length of stay after RARP was 88.5%. Accuracy in recognizing the next surgical task during robot-assisted partial nephrectomy (RAPN) reached 75.7%.
CONCLUSION
The reviewed studies were of low quality, and their findings are limited by the small size of the datasets. Comparison between studies on the same topic was restricted by the heterogeneity of algorithms and datasets. There is no proof that AI can currently identify the critical tasks of RAS operations that determine patient outcome. There is an urgent need for studies on large datasets and for external validation of the AI algorithms used. Furthermore, the results should be transparent and meaningful to surgeons, enabling them to inform patients in layman's terms.
REGISTRATION
Review Registry Unique Identifying Number: reviewregistry1225.
Topics: Artificial Intelligence; Humans; Laparoscopy; Male; Prostate; Prostatectomy; Robotic Surgical Procedures
PubMed: 34695601
DOI: 10.1016/j.ijsu.2021.106151
Sensors (Basel, Switzerland), Oct 2021
Review
(1) Background: The rapid pace of digital development in everyday life is also reflected in dentistry, including the emergence of the first systems based on artificial intelligence (AI). This systematic review focused on the recent scientific literature and provides an overview of the application of AI in the dental discipline of prosthodontics. (2) Method: Following a modified PICO strategy, an electronic (MEDLINE, EMBASE, CENTRAL) and manual search up to 30 June 2021 was carried out for literature published in the last five years reporting the use of AI in the field of prosthodontics. (3) Results: 560 titles were screened, of which 30 abstracts and 16 full texts were selected for further review. Seven studies met the inclusion criteria and were analyzed. Most of the identified studies reported the training and application of an AI system (n = 6) or explored the function of an intrinsic AI system in CAD software (n = 1). (4) Conclusions: While the number of included studies reporting the use of AI was relatively low, the summarized findings represent the latest AI developments in prosthodontics, demonstrating its application for automated diagnostics, as a predictive measure, and as a classification or identification tool. In the future, AI technologies will likely be used for collecting, processing, and organizing patient-related datasets to provide patient-centered, individualized dental treatment.
Topics: Artificial Intelligence; Delivery of Health Care; Humans; Prosthodontics
PubMed: 34640948
DOI: 10.3390/s21196628
The Lancet Digital Health, Jun 2022
Review
Skin cancers occur commonly worldwide. The prognosis and disease burden are highly dependent on the cancer type and disease stage at diagnosis. We systematically reviewed studies on artificial intelligence and machine learning (AI/ML) algorithms that aim to facilitate the early diagnosis of skin cancers, focusing on their application in primary and community care settings. We searched MEDLINE, Embase, Scopus, and Web of Science (from Jan 1, 2000, to Aug 9, 2021) for all studies providing evidence on applying AI/ML algorithms to the early diagnosis of skin cancer, including all study designs and languages. The primary outcome was diagnostic accuracy of the algorithms for skin cancers. The secondary outcomes included an overview of AI/ML methods, evaluation approaches, cost-effectiveness, and acceptability to patients and clinicians. We identified 14 224 studies. Only two studies used data from clinical settings with a low prevalence of skin cancers. We reported data from all 272 studies that could be relevant in primary care. The primary outcomes showed reasonable mean diagnostic accuracy for melanoma (89.5% [range 59.7-100%]), squamous cell carcinoma (85.3% [71.0-97.8%]), and basal cell carcinoma (87.6% [70.0-99.7%]). The secondary outcomes showed heterogeneity of AI/ML methods and study designs, with a high amount of incomplete reporting (eg, of patient demographics and methods of data collection). Few studies used data from populations with a low prevalence of skin cancers to train and test their algorithms; therefore, widespread adoption into community and primary care practice cannot currently be recommended until efficacy in these populations has been shown. We did not identify any health economic, patient, or clinician acceptability data for any of the included studies. We propose a methodological checklist for use in the development of new AI/ML algorithms to detect skin cancer, to facilitate their design, evaluation, and implementation.
Topics: Algorithms; Artificial Intelligence; Early Detection of Cancer; Humans; Machine Learning; Primary Health Care; Skin Neoplasms
PubMed: 35623799
DOI: 10.1016/S2589-7500(22)00023-1
Journal of Medical Internet Research, Apr 2021
Review
BACKGROUND
Artificial intelligence (AI) applications are growing at an unprecedented pace in health care, including disease diagnosis, triage or screening, risk analysis, surgical operations, and so forth. Despite a great deal of research on the development and validation of health care AI, only a few applications have actually been implemented at the frontlines of clinical practice.
OBJECTIVE
The objective of this study was to systematically review AI applications that have been implemented in real-life clinical practice.
METHODS
We conducted a literature search in PubMed, Embase, Cochrane Central, and CINAHL to identify relevant articles published between January 2010 and May 2020. We also hand searched premier computer science journals and conferences as well as registered clinical trials. Studies were included if they reported AI applications that had been implemented in real-world clinical settings.
RESULTS
We identified 51 relevant studies that reported the implementation and evaluation of AI applications in clinical practice, of which 13 adopted a randomized controlled trial design and 8 adopted an experimental design. The AI applications targeted various clinical tasks, such as screening or triage (n=16), disease diagnosis (n=16), risk analysis (n=14), and treatment (n=7). The most commonly addressed diseases and conditions were sepsis (n=6), breast cancer (n=5), diabetic retinopathy (n=4), and polyp and adenoma (n=4). Regarding the evaluation outcomes, we found that 26 studies examined the performance of AI applications in clinical settings, 33 studies examined the effect of AI applications on clinician outcomes, 14 studies examined the effect on patient outcomes, and 1 study examined the economic impact associated with AI implementation.
CONCLUSIONS
This review indicates that research on the clinical implementation of AI applications is still at an early stage despite their great potential. More research, using more rigorous methodology, is needed to assess the benefits and challenges associated with clinical AI applications.
Topics: Artificial Intelligence; Humans; Randomized Controlled Trials as Topic; Risk Assessment; Sepsis
PubMed: 33885365
DOI: 10.2196/25759
Journal of Medical Internet Research, Oct 2020
BACKGROUND
The high demand for health care services and the growing capability of artificial intelligence have led to the development of conversational agents designed to support a variety of health-related activities, including behavior change, treatment support, health monitoring, training, triage, and screening support. Automation of these tasks could free clinicians to focus on more complex work and increase the accessibility of health care services for the public. An overarching assessment of the acceptability, usability, and effectiveness of these agents in health care is needed to collate the evidence so that future development can target areas for improvement and potential for sustainable adoption.
OBJECTIVE
This systematic review aims to assess the effectiveness and usability of conversational agents in health care and identify the elements that users like and dislike to inform future research and development of these agents.
METHODS
PubMed, Medline (Ovid), EMBASE (Excerpta Medica dataBASE), CINAHL (Cumulative Index to Nursing and Allied Health Literature), Web of Science, and the Association for Computing Machinery Digital Library were systematically searched for articles published since 2008 that evaluated unconstrained natural language processing conversational agents used in health care. EndNote (version X9, Clarivate Analytics) reference management software was used for initial screening, and full-text screening was conducted by one reviewer. Data were extracted, and the risk of bias was assessed by one reviewer and validated by another.
RESULTS
A total of 31 studies were selected, covering a variety of conversational agents: 14 chatbots (2 of them voice chatbots), 6 embodied conversational agents (including interactive voice response calls, virtual patients, and speech recognition screening systems), 1 contextual question-answering agent, and 1 voice recognition triage system. Overall, the evidence reported was mostly positive or mixed. Usability and satisfaction were rated well (27/30 and 26/31 studies, respectively), and positive or mixed effectiveness was found in three-quarters of the studies (23/30). However, specific qualitative feedback highlighted several limitations of the agents.
CONCLUSIONS
The studies generally reported positive or mixed evidence for the effectiveness, usability, and satisfaction ratings of the conversational agents investigated, but qualitative user perceptions were more mixed. The quality of many of the studies was limited, and improved study design and reporting are necessary to evaluate the usefulness of the agents in health care more accurately and to identify key areas for improvement. Further research should also analyze the cost-effectiveness, privacy, and security of the agents.
INTERNATIONAL REGISTERED REPORT IDENTIFIER (IRRID)
RR2-10.2196/16934.
Topics: Artificial Intelligence; Communication; Delivery of Health Care; Female; Humans; Male
PubMed: 33090118
DOI: 10.2196/20346
International Journal of Medical Informatics, Jul 2021
Meta-Analysis Review
INTRODUCTION
We aimed to assess whether machine learning models are superior at predicting acute kidney injury (AKI) compared to logistic regression (LR), a conventional prediction model.
METHODS
Eligible studies were identified using PubMed and Embase. A total of 24 studies comprising 84 prediction models met the inclusion criteria. An independent-samples t-test was performed to detect mean differences in area under the curve (AUC) between ML and LR models. One-way ANOVA and post-hoc t-tests were performed to assess mean differences in AUC between ML methods.
RESULTS
AUC data were similar between ML (0.736 ± 0.116) and LR (0.748 ± 0.057) models (p = 0.538). However, specific ML models, such as gradient boosting (0.838 ± 0.077), exhibited superior performance at predicting AKI as compared to other ML models in the literature (p < 0.05). Creatinine and urine output, standard variables assessed for AKI staging, were classified as significant predictors across multiple ML models, although the majority of significant predictors were unique and study specific.
CONCLUSIONS
These data suggest that ML models overall perform comparably to LR; however, performance varies considerably, with some ML models performing exceptionally well. The variability in ML prediction of AKI can be attributed, in part, to the specific ML model used, variable selection and processing, study and subject characteristics, and the steps associated with model training, validation, testing, and calibration.
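The statistical comparison this meta-analysis describes (a pooled t-test of ML vs LR AUCs, then a one-way ANOVA across ML families) can be sketched as follows. All AUC values below are invented for illustration; they are not the review's 84 models.

```python
# Sketch of the AUC comparisons described above (hypothetical values).
from scipy import stats

auc_ml = [0.62, 0.71, 0.68, 0.84, 0.90, 0.77, 0.66, 0.81]  # pooled ML models
auc_lr = [0.70, 0.74, 0.76, 0.72, 0.79, 0.75]              # logistic regression

# Overall ML vs LR: two-sided independent-samples t-test on AUC
t, p = stats.ttest_ind(auc_ml, auc_lr)

# Per-method comparison: one-way ANOVA across ML families
auc_gbm = [0.82, 0.86, 0.85]  # gradient boosting
auc_rf = [0.70, 0.73, 0.68]   # random forest
auc_nn = [0.66, 0.72, 0.69]   # neural networks
f, p_anova = stats.f_oneway(auc_gbm, auc_rf, auc_nn)

print(f"ML vs LR: t={t:.3f}, p={p:.3f}")
print(f"Across ML methods: F={f:.3f}, p={p_anova:.3f}")
```

With data shaped like the review's findings, the pooled comparison is non-significant while the between-method ANOVA is, reflecting that the choice of ML family (here, gradient boosting) drives the differences.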
Topics: Acute Kidney Injury; Creatinine; Humans; Logistic Models; Machine Learning
PubMed: 33991886
DOI: 10.1016/j.ijmedinf.2021.104484
International Journal of Environmental Research and Public Health, Oct 2022
Review
The number of studies on the relationship between training and competition load and injury has increased exponentially in recent years, and the topic is widely studied in professional soccer. To provide practical guidance for workload management and injury prevention in professional athletes, this study reviews the literature on the effect of load on injury risk, injury prediction, and the underlying mechanisms. The results show that: (1) short-term fixture congestion may increase match injury incidence, while long-term fixture congestion appears to affect neither overall nor match injury incidence. (2) It cannot be determined conclusively whether any global positioning system (GPS)-derived metrics (total distance, high-speed running distance, and acceleration) are associated with an increased risk of injury. (3) The acute:chronic workload ratio (ACWR) of the session rating of perceived exertion (s-RPE) may be significantly associated with the risk of non-contact injuries, but no ACWR threshold with a minimum injury risk could be established. (4) Based on workload and fatigue recovery factors, artificial intelligence technology may possess good predictive power regarding injury risk.
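The ACWR referenced in point (3) is commonly computed as the acute load (e.g., the last 7 days) divided by the chronic load (e.g., a 28-day rolling average). A minimal sketch under that common rolling-average definition; the daily s-RPE loads are invented:

```python
# Acute:chronic workload ratio (ACWR), rolling-average definition.
def acwr(daily_loads, acute_days=7, chronic_days=28):
    """ACWR for the most recent day: mean acute load / mean chronic load."""
    acute = sum(daily_loads[-acute_days:]) / acute_days
    chronic = sum(daily_loads[-chronic_days:]) / chronic_days
    return acute / chronic

# Three steady weeks at 400 AU/day, then a one-week spike to 600 AU/day
loads = [400] * 21 + [600] * 7
ratio = acwr(loads)  # the spike week pushes the ratio above 1
```

Values above roughly 1.3-1.5 are often discussed as elevated-risk zones in this literature, though, as the review notes, no threshold with a minimum injury risk has been established.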
Topics: Humans; Male; Soccer; Workload; Athletic Injuries; Artificial Intelligence; Risk Factors
PubMed: 36293817
DOI: 10.3390/ijerph192013237
JAMA Network Open, Mar 2023
Meta-Analysis
IMPORTANCE
Artificial intelligence (AI) enables powerful models for establishing clinical diagnostic and prognostic tools for hip fractures; however, the performance and potential impact of these newly developed algorithms are currently unknown.
OBJECTIVE
To evaluate the performance of AI algorithms designed to diagnose hip fractures on radiographs and predict postoperative clinical outcomes following hip fracture surgery relative to current practices.
DATA SOURCES
A systematic review of the literature was performed using the MEDLINE, Embase, and Cochrane Library databases for all articles published from database inception to January 23, 2023. A manual reference search of included articles was also undertaken to identify any additional relevant articles.
STUDY SELECTION
Studies developing machine learning (ML) models for the diagnosis of hip fractures from hip or pelvic radiographs or to predict any postoperative patient outcome following hip fracture surgery were included.
DATA EXTRACTION AND SYNTHESIS
This study followed the Preferred Reporting Items for Systematic Reviews and Meta-analyses and was registered with PROSPERO. Eligible full-text articles were evaluated and relevant data extracted independently using a template data extraction form. For studies that predicted postoperative outcomes, the performance of traditional predictive statistical models, either multivariable logistic or linear regression, was recorded and compared with the performance of the best ML model on the same out-of-sample data set.
MAIN OUTCOMES AND MEASURES
Diagnostic accuracy of AI models was compared with the diagnostic accuracy of expert clinicians using odds ratios (ORs) with 95% CIs. Areas under the curve for postoperative outcome prediction between traditional statistical models (multivariable linear or logistic regression) and ML models were compared.
RESULTS
Of 39 studies that met all criteria and were included in this analysis, 18 (46.2%) used AI models to diagnose hip fractures on plain radiographs and 21 (53.8%) used AI models to predict patient outcomes following hip fracture surgery. A total of 39 598 plain radiographs and 714 939 hip fractures were used for training, validating, and testing ML models specific to diagnosis and postoperative outcome prediction, respectively. Mortality and length of hospital stay were the most commonly predicted outcomes. On pooled data analysis, compared with clinicians, the OR for diagnostic error of ML models was 0.79 (95% CI, 0.48-1.31; P = .36; I2 = 60%) for hip fracture radiographs. For the ML models, the mean (SD) sensitivity was 89.3% (8.5%), specificity was 87.5% (9.9%), and F1 score was 0.90 (0.06). The mean area under the curve for mortality prediction was 0.84 with ML models compared with 0.79 for alternative controls (P = .09).
CONCLUSIONS AND RELEVANCE
The findings of this systematic review and meta-analysis suggest that the potential applications of AI to aid with diagnosis from hip radiographs are promising. The performance of AI in diagnosing hip fractures was comparable with that of expert radiologists and surgeons. However, current implementations of AI for outcome prediction do not seem to provide substantial benefit over traditional multivariable predictive statistics.
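The sensitivity, specificity, and F1 scores pooled in this meta-analysis all derive from 2x2 confusion-matrix counts. A minimal sketch of how each is computed, with invented counts; the diagnostic odds ratio shown is the standard 2x2 definition, not the review's pooled OR for diagnostic error:

```python
# Diagnostic metrics from a 2x2 confusion matrix (hypothetical counts).
def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int):
    sensitivity = tp / (tp + fn)  # true-positive rate
    specificity = tn / (tn + fp)  # true-negative rate
    precision = tp / (tp + fp)    # positive predictive value
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    # Diagnostic odds ratio: odds of a positive test in fracture cases
    # over odds of a positive test in non-fracture cases
    dor = (tp * tn) / (fp * fn)
    return sensitivity, specificity, f1, dor

# Example: 100 fracture and 100 non-fracture radiographs
sens, spec, f1, dor = diagnostic_metrics(tp=90, fp=12, fn=10, tn=88)
```

Counts here were chosen so the resulting sensitivity (0.90), specificity (0.88), and F1 (about 0.89) are close to the means the meta-analysis reports.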
Topics: Humans; Artificial Intelligence; Hip Fractures; Prognosis; Algorithms; Length of Stay
PubMed: 36930153
DOI: 10.1001/jamanetworkopen.2023.3391
The British Journal of Radiology, Sep 2020
In this review, we describe the technical aspects of artificial intelligence (AI) in cardiac imaging, starting with radiomics, basic deep learning algorithms, and the application tasks of these algorithms, through to recently available public databases. Subsequently, we conducted a systematic literature search for recently published, clinically relevant studies on AI in cardiac imaging. As a result, 24 studies using CT and 14 using MRI were included and summarized. From these studies, it can be concluded that AI is widely applied to clinical cardiac imaging, including coronary calcium scoring, coronary CT angiography, fractional flow reserve CT, plaque analysis, left ventricular myocardium analysis, diagnosis of myocardial infarction, prognosis of coronary artery disease, assessment of cardiac function, and diagnosis and prognosis of cardiomyopathy. These advancements show that AI has a promising prospect in cardiac imaging.
Topics: Adipose Tissue; Algorithms; Artificial Intelligence; Cardiomyopathies; Computed Tomography Angiography; Coronary Disease; Coronary Stenosis; Databases, Factual; Deep Learning; Fractional Flow Reserve, Myocardial; Heart; Heart Ventricles; Humans; Magnetic Resonance Imaging; Myocardial Infarction; Prognosis; Vascular Calcification
PubMed: 32017605
DOI: 10.1259/bjr.20190812
Clinical Oral Investigations, Jul 2021
Meta-Analysis Review
OBJECTIVES
Deep learning (DL) has been increasingly employed for automated landmark detection, e.g., for cephalometric purposes. We performed a systematic review and meta-analysis to assess the accuracy and underlying evidence for DL for cephalometric landmark detection on 2-D and 3-D radiographs.
METHODS
Diagnostic accuracy studies published in 2015-2020 in Medline/Embase/IEEE/arXiv and employing DL for cephalometric landmark detection were identified and extracted by two independent reviewers. Random-effects meta-analysis, subgroup, and meta-regression were performed, and study quality was assessed using QUADAS-2. The review was registered (PROSPERO no. 227498).
DATA
From 321 identified records, 19 studies (published 2017-2020) were included, all employing convolutional neural networks, mainly on 2-D lateral radiographs (n=15), using data from publicly available datasets (n=12) and testing the detection of a mean of 30 landmarks (SD: 25; range: 7-93). The reference test was established by two experts (n=11), one expert (n=4), three experts (n=3), or a set of annotators (n=1). Risk of bias was high, and applicability concerns were detected for most studies, mainly regarding data selection and conduct of the reference test. Landmark prediction error centered around the 2-mm error threshold (mean: -0.581 mm; 95% CI: -1.264 to 0.102 mm). The proportion of landmarks detected within this 2-mm threshold was 0.799 (95% CI: 0.770 to 0.824).
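The two accuracy measures this meta-analysis pools (mean landmark prediction error and the proportion of landmarks within the 2-mm threshold) can be sketched as follows. Coordinates are invented, in millimetres, and the per-landmark error is taken as the Euclidean (radial) distance, a common choice in cephalometric benchmarks:

```python
# Landmark prediction error and 2-mm detection rate (hypothetical data).
import math

def landmark_errors(predicted, reference):
    """Euclidean distance between each predicted and reference landmark (mm)."""
    return [math.dist(p, r) for p, r in zip(predicted, reference)]

pred = [(10.2, 31.0), (55.1, 40.3), (70.0, 12.5), (33.3, 25.0)]
ref = [(10.0, 30.0), (54.0, 40.0), (73.5, 12.5), (33.3, 24.1)]

errors = landmark_errors(pred, ref)
mean_error = sum(errors) / len(errors)
# Proportion of landmarks within the clinically accepted 2-mm threshold
within_2mm = sum(e <= 2.0 for e in errors) / len(errors)
```

Here three of the four landmarks fall within 2 mm, so the detection rate is 0.75; the reported pooled rate of 0.799 is this quantity averaged over the included studies.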
CONCLUSIONS
DL shows relatively high accuracy for detecting landmarks on cephalometric imagery. The overall body of evidence is consistent but suffers from a high risk of bias. Further work demonstrating the robustness and generalizability of DL for landmark detection is needed.
CLINICAL SIGNIFICANCE
Existing DL models show consistent and largely high accuracy for automated detection of cephalometric landmarks. The majority of studies so far focused on 2-D imagery; data on 3-D imagery are sparse, but promising. Future studies should focus on demonstrating generalizability, robustness, and clinical usefulness of DL for this objective.
Topics: Cephalometry; Deep Learning; Radiography; Reproducibility of Results
PubMed: 34046742
DOI: 10.1007/s00784-021-03990-w