-
Central European Journal of Urology 2023
Review
INTRODUCTION
Radiomics in uro-oncology is a rapidly evolving science proving to be a novel approach for optimizing the analysis of massive data from medical images to provide auxiliary guidance in clinical issues. This scoping review aimed to identify key aspects wherein radiomics can potentially improve the accuracy of diagnosis, staging, and grading of renal and bladder cancer.
MATERIAL AND METHODS
A literature search was performed in June 2022 using PubMed, Embase, and the Cochrane Central Register of Controlled Trials. Studies were included if radiomics was compared with radiological reports only.
RESULTS
Twenty-two papers were included: 4 pertained to bladder cancer and 18 to renal cancer. Radiomics outperforms visual assessment by radiologists on contrast-enhanced computed tomography (CECT) in predicting muscle invasion but is equivalent to CT reporting by radiologists in predicting lymph node metastasis. Magnetic resonance imaging (MRI) radiomics outperforms radiological reporting for lymph node metastasis. Radiomics performs better than radiologists in reporting the probability of renal cell carcinoma, improving interreader concordance and performance. Radiomics also helps to distinguish between types of renal pathology and between malignant lesions and their benign counterparts. Radiomics can help establish a model for differentiating low-grade from high-grade clear cell renal cancer with high accuracy from contrast-enhanced CT scans alone.
CONCLUSIONS
Our review shows that radiomic models outperform individual reports by radiologists through their ability to incorporate many more complex radiological features.
PubMed: 37064257
DOI: 10.5173/ceju.2023.252 -
Vascular Health and Risk Management 2023
Meta-Analysis
BACKGROUND
In the United States, echocardiography is an essential component of the care of many cardiac patients. Recently, increased attention has been given to the accuracy of interpretation of cardiac-based procedures across specialties, including the interpretation of transesophageal echocardiograms (TEEs) by cardiac anesthesiologists and primary echocardiographers. The purpose of this study was to assess the TEE skills of cardiac anesthesiologists in comparison with primary echocardiographers, either radiologists or cardiologists. In this systematic review, we evaluated the available current literature to determine whether cardiac anesthesiologists interpret TEE procedures at the same level as primary echocardiographers.
METHODS
A systematic review following PRISMA was conducted using PubMed, covering the years 1952-2022. A broad keyword search using "Cardiology Anesthesiology Echocardiogram" and "Echocardiography Anesthesiology" was used to identify the literature. Of 1798 articles reviewed, 9 studies were included in our systematic review, 3 of which yielded quantitative data and 6 of which yielded qualitative data. The mean accuracy from each of the three quantitative studies was calculated and used to represent the overall accuracy of cardiac anesthesiologists.
RESULTS
Through identified studies, a total of 8197 TEEs were interpreted by cardiac anesthesiologists with a concordance rate of 84% to the interpretations of primary echocardiographers. Cardiac anesthesiologists had a concordance rate of 83% when compared to radiologists. On the other hand, cardiac anesthesiologists and cardiologists had a concordance rate of 87% in one study and 79% in another study.
CONCLUSION
Based on these studies, cardiac anesthesiologists are shown to interpret TEEs similarly to primary echocardiographers. At this time, there is no gold standard to evaluate the accuracy of TEE readings. One way to address this is to individually assess the TEE interpretations of anesthesiologists and primary echocardiographers in a double-blind study.
Topics: Humans; Anesthesia, Cardiac Procedures; Anesthesiology; Cardiology; Echocardiography; Echocardiography, Transesophageal; Randomized Controlled Trials as Topic
PubMed: 37056574
DOI: 10.2147/VHRM.S400117 -
Children (Basel, Switzerland) Mar 2023
Review
This study aimed to systematically review the literature to synthesise and summarise the evidence surrounding the efficacy of artificial intelligence (AI) in classifying paediatric pneumonia on chest radiographs (CXRs). Following the initial search for studies that matched the pre-set criteria, their data were extracted using a data extraction tool, and the included studies were assessed using critical appraisal and risk-of-bias tools. Results were accumulated, and the outcome measures analysed included sensitivity, specificity, accuracy, and area under the curve (AUC). Five studies met the inclusion criteria. The highest sensitivity was achieved by an ensemble AI algorithm (96.3%). DenseNet201 obtained the highest specificity and accuracy (94% and 95%, respectively). The highest AUC value was achieved by the VGG16 algorithm (96.2%). Some of the AI models achieved close to 100% diagnostic accuracy. To assess the efficacy of AI in a clinical setting, the performance of these AI models should be compared with that of radiologists. The included and evaluated AI algorithms showed promising results. These algorithms can potentially ease and speed up diagnosis once the studies are replicated and their performance is assessed in clinical settings, potentially saving millions of lives.
PubMed: 36980134
DOI: 10.3390/children10030576 -
Systematic Reviews Mar 2023
Meta-Analysis
Screening for the primary prevention of fragility fractures among adults aged 40 years and older in primary care: systematic reviews of the effects and acceptability of screening and treatment, and the accuracy of risk prediction tools.
BACKGROUND
To inform recommendations by the Canadian Task Force on Preventive Health Care, we reviewed evidence on the benefits, harms, and acceptability of screening and treatment, and on the accuracy of risk prediction tools for the primary prevention of fragility fractures among adults aged 40 years and older in primary care.
METHODS
For screening effectiveness, accuracy of risk prediction tools, and treatment benefits, our search methods involved integrating studies published up to 2016 from an existing systematic review. Then, to locate more recent studies and any evidence relating to acceptability and treatment harms, we searched online databases (2016 to April 4, 2022 [screening] or to June 1, 2021 [predictive accuracy]; 1995 to June 1, 2021, for acceptability; 2016 to March 2, 2020, for treatment benefits; 2015 to June 24, 2020, for treatment harms), trial registries and gray literature, and hand-searched reviews, guidelines, and the included studies. Two reviewers selected studies, extracted results, and appraised risk of bias, with disagreements resolved by consensus or a third reviewer. The overview of reviews on treatment harms relied on one reviewer, with verification of data by another reviewer to correct errors and omissions. When appropriate, study results were pooled using random effects meta-analysis; otherwise, findings were described narratively. Evidence certainty was rated according to the GRADE approach.
RESULTS
We included 4 randomized controlled trials (RCTs) and 1 controlled clinical trial (CCT) for the benefits and harms of screening, 1 RCT for comparative benefits and harms of different screening strategies, 32 validation cohort studies for the calibration of risk prediction tools (26 of these reporting on the Fracture Risk Assessment Tool without [i.e., clinical FRAX], or with the inclusion of bone mineral density (BMD) results [i.e., FRAX + BMD]), 27 RCTs for the benefits of treatment, 10 systematic reviews for the harms of treatment, and 12 studies for the acceptability of screening or initiating treatment. In females aged 65 years and older who are willing to independently complete a mailed fracture risk questionnaire (referred to as "selected population"), 2-step screening using a risk assessment tool with or without measurement of BMD probably (moderate certainty) reduces the risk of hip fractures (3 RCTs and 1 CCT, n = 43,736, absolute risk reduction [ARD] = 6.2 fewer in 1000, 95% CI 9.0-2.8 fewer, number needed to screen [NNS] = 161) and clinical fragility fractures (3 RCTs, n = 42,009, ARD = 5.9 fewer in 1000, 95% CI 10.9-0.8 fewer, NNS = 169). It probably does not reduce all-cause mortality (2 RCTs and 1 CCT, n = 26,511, ARD = no difference in 1000, 95% CI 7.1 fewer to 5.3 more) and may (low certainty) not affect health-related quality of life. Benefits for fracture outcomes were not replicated in an offer-to-screen population where the rate of response to mailed screening questionnaires was low. For females aged 68-80 years, population screening may not reduce the risk of hip fractures (1 RCT, n = 34,229, ARD = 0.3 fewer in 1000, 95% CI 4.2 fewer to 3.9 more) or clinical fragility fractures (1 RCT, n = 34,229, ARD = 1.0 fewer in 1000, 95% CI 8.0 fewer to 6.0 more) over 5 years of follow-up. The evidence for serious adverse events among all patients and for all outcomes among males and younger females (<65 years) is very uncertain. 
We defined overdiagnosis as the identification of high risk in individuals who, if not screened, would never have known that they were at risk and would never have experienced a fragility fracture. This was not directly reported in any of the trials. Estimates using data available in the trials suggest that among "selected" females offered screening, 12% of those meeting age-specific treatment thresholds based on clinical FRAX 10-year hip fracture risk, and 19% of those meeting thresholds based on clinical FRAX 10-year major osteoporotic fracture risk, may be overdiagnosed as being at high risk of fracture. Of those identified as being at high clinical FRAX 10-year hip fracture risk and who were referred for BMD assessment, 24% may be overdiagnosed. One RCT (n = 9268) provided evidence comparing 1-step to 2-step screening among postmenopausal females, but the evidence from this trial was very uncertain. For the calibration of risk prediction tools, evidence from three Canadian studies (n = 67,611) without serious risk of bias concerns indicates that clinical FRAX-Canada may be well calibrated for the 10-year prediction of hip fractures (observed-to-expected fracture ratio [O:E] = 1.13, 95% CI 0.74-1.72, I² = 89.2%), and is probably well calibrated for the 10-year prediction of clinical fragility fractures (O:E = 1.10, 95% CI 1.01-1.20, I² = 50.4%), both leading to some underestimation of the observed risk. Data from these same studies (n = 61,156) showed that FRAX-Canada with BMD may perform poorly to estimate 10-year hip fracture risk (O:E = 1.31, 95% CI 0.91-2.13, I² = 92.7%), but is probably well calibrated for the 10-year prediction of clinical fragility fractures, with some underestimation of the observed risk (O:E = 1.16, 95% CI 1.12-1.20, I² = 0%).
The Canadian Association of Radiologists and Osteoporosis Canada Risk Assessment (CAROC) tool may be well calibrated to predict a category of risk for 10-year clinical fractures (low, moderate, or high risk; 1 study, n = 34,060). The evidence for most other tools was limited, or in the case of FRAX tools calibrated for countries other than Canada, very uncertain due to serious risk of bias concerns and large inconsistency in findings across studies. In postmenopausal females in a primary prevention population defined as <50% prevalence of prior fragility fracture (median 16.9%, range 0 to 48% when reported in the trials) and at risk of fragility fracture, treatment with bisphosphonates as a class (median 2 years, range 1-6 years) probably reduces the risk of clinical fragility fractures (19 RCTs, n = 22,482, ARD = 11.1 fewer in 1000, 95% CI 15.0-6.6 fewer, number needed to treat for an additional beneficial outcome [NNT] = 90), and may reduce the risk of hip fractures (14 RCTs, n = 21,038, ARD = 2.9 fewer in 1000, 95% CI 4.6-0.9 fewer, NNT = 345) and clinical vertebral fractures (11 RCTs, n = 8921, ARD = 10.0 fewer in 1000, 95% CI 14.0-3.9 fewer, NNT = 100); it may not reduce all-cause mortality. There is low certainty evidence of little-to-no reduction in hip fractures with any individual bisphosphonate, but all provided evidence of decreased risk of clinical fragility fractures (moderate certainty for alendronate [NNT = 68] and zoledronic acid [NNT = 50], low certainty for risedronate [NNT = 128]) among postmenopausal females. Evidence for an impact on risk of clinical vertebral fractures is very uncertain for alendronate and risedronate; zoledronic acid may reduce the risk of this outcome (4 RCTs, n = 2367, ARD = 18.7 fewer in 1000, 95% CI 25.6-6.6 fewer, NNT = 54) for postmenopausal females.
Denosumab probably reduces the risk of clinical fragility fractures (6 RCTs, n = 9473, ARD = 9.1 fewer in 1000, 95% CI 12.1-5.6 fewer, NNT = 110) and clinical vertebral fractures (4 RCTs, n = 8639, ARD = 16.0 fewer in 1000, 95% CI 18.6-12.1 fewer, NNT = 62), but may make little-to-no difference in the risk of hip fractures among postmenopausal females. Denosumab probably makes little-to-no difference in the risk of all-cause mortality or health-related quality of life among postmenopausal females. Evidence in males is limited to two trials (1 zoledronic acid, 1 denosumab); in this population, zoledronic acid may make little-to-no difference in the risk of hip or clinical fragility fractures, and evidence for all-cause mortality is very uncertain. The evidence for treatment with denosumab in males is very uncertain for all fracture outcomes (hip, clinical fragility, clinical vertebral) and all-cause mortality. There is moderate certainty evidence that treatment causes a small number of patients to experience a non-serious adverse event, notably non-serious gastrointestinal events (e.g., abdominal pain, reflux) with alendronate (50 RCTs, n = 22,549, ARD = 16.3 more in 1000, 95% CI 2.4-31.3 more, number needed to treat for an additional harmful outcome [NNH] = 61) but not with risedronate; influenza-like symptoms with zoledronic acid (5 RCTs, n = 10,695, ARD = 142.5 more in 1000, 95% CI 105.5-188.5 more, NNH = 7); and non-serious gastrointestinal adverse events (3 RCTs, n = 8454, ARD = 64.5 more in 1000, 95% CI 26.4-13.3 more, NNH = 16), dermatologic adverse events (3 RCTs, n = 8454, ARD = 15.6 more in 1000, 95% CI 7.6-27.0 more, NNH = 64), and infections (any severity; 4 RCTs, n = 8691, ARD = 1.8 more in 1000, 95% CI 0.1-4.0 more, NNH = 556) with denosumab.
For serious adverse events overall and specific to stroke and myocardial infarction, treatment with bisphosphonates probably makes little-to-no difference; evidence for other specific serious harms was less certain or not available. There was low certainty evidence for an increased risk for the rare occurrence of atypical femoral fractures (0.06 to 0.08 more in 1000) and osteonecrosis of the jaw (0.22 more in 1000) with bisphosphonates (most evidence for alendronate). The evidence for these rare outcomes and for rebound fractures with denosumab was very uncertain. Younger (lower risk) females have high willingness to be screened. A minority of postmenopausal females at increased risk for fracture may accept treatment. Further, there is large heterogeneity in the level of risk at which patients may be accepting of initiating treatment, and treatment effects appear to be overestimated.
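The numbers needed to screen, treat, or harm quoted in these results are reciprocals of the corresponding absolute risk differences. The following is a minimal sketch in Python, assuming the ARD is expressed as events per 1000 people; the function name `number_needed` is hypothetical and is not taken from the review. Because the published ARDs are themselves rounded, recomputed values may differ from the reported ones by one unit.

```python
def number_needed(ard_per_1000: float) -> float:
    """Return the number needed to screen, treat, or harm for an
    absolute risk difference given as events per 1000 people."""
    if ard_per_1000 <= 0:
        raise ValueError("ARD must be a positive number of events per 1000")
    return 1000.0 / ard_per_1000


# ARDs quoted in the results above (events fewer per 1000 people).
examples = [
    ("hip fracture, 2-step screening", 6.2),
    ("clinical fragility fracture, 2-step screening", 5.9),
    ("clinical fragility fracture, bisphosphonates", 11.1),
]

for label, ard in examples:
    # Round to the nearest whole person for reporting.
    print(f"{label}: {round(number_needed(ard))}")
```

Rounding to the nearest integer is an assumption made here for illustration; some authors round up instead, which shifts borderline values by one.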
CONCLUSION
An offer of 2-step screening with risk assessment and BMD measurement to selected postmenopausal females with low prevalence of prior fracture probably results in a small reduction in the risk of clinical fragility fracture and hip fracture compared to no screening. These findings were most applicable to the use of clinical FRAX for risk assessment and were not replicated in the offer-to-screen population where the rate of response to mailed screening questionnaires was low. Limited direct evidence on harms of screening was available; using study data to provide estimates, there may be a moderate degree of overdiagnosis of high risk for fracture to consider. The evidence for younger females and males is very limited. The benefits of screening and treatment need to be weighed against the potential for harm; patient views on the acceptability of treatment are highly variable.
SYSTEMATIC REVIEW REGISTRATION
International Prospective Register of Systematic Reviews (PROSPERO): CRD42019123767.
Topics: Adult; Female; Humans; Male; Middle Aged; Alendronate; Canada; Denosumab; Diphosphonates; Hip Fractures; Osteoporotic Fractures; Primary Health Care; Primary Prevention; Risedronic Acid; Systematic Reviews as Topic; Zoledronic Acid
PubMed: 36945065
DOI: 10.1186/s13643-023-02181-w -
JAMA Network Open Mar 2023
Meta-Analysis
IMPORTANCE
Artificial intelligence (AI) enables powerful models for establishment of clinical diagnostic and prognostic tools for hip fractures; however the performance and potential impact of these newly developed algorithms are currently unknown.
OBJECTIVE
To evaluate the performance of AI algorithms designed to diagnose hip fractures on radiographs and predict postoperative clinical outcomes following hip fracture surgery relative to current practices.
DATA SOURCES
A systematic review of the literature was performed using the MEDLINE, Embase, and Cochrane Library databases for all articles published from database inception to January 23, 2023. A manual reference search of included articles was also undertaken to identify any additional relevant articles.
STUDY SELECTION
Studies developing machine learning (ML) models for the diagnosis of hip fractures from hip or pelvic radiographs or to predict any postoperative patient outcome following hip fracture surgery were included.
DATA EXTRACTION AND SYNTHESIS
This study followed the Preferred Reporting Items for Systematic Reviews and Meta-analyses and was registered with PROSPERO. Eligible full-text articles were evaluated and relevant data extracted independently using a template data extraction form. For studies that predicted postoperative outcomes, the performance of traditional predictive statistical models, either multivariable logistic or linear regression, was recorded and compared with the performance of the best ML model on the same out-of-sample data set.
MAIN OUTCOMES AND MEASURES
Diagnostic accuracy of AI models was compared with the diagnostic accuracy of expert clinicians using odds ratios (ORs) with 95% CIs. Areas under the curve for postoperative outcome prediction between traditional statistical models (multivariable linear or logistic regression) and ML models were compared.
RESULTS
Of 39 studies that met all criteria and were included in this analysis, 18 (46.2%) used AI models to diagnose hip fractures on plain radiographs and 21 (53.8%) used AI models to predict patient outcomes following hip fracture surgery. A total of 39 598 plain radiographs and 714 939 hip fractures were used for training, validating, and testing ML models specific to diagnosis and postoperative outcome prediction, respectively. Mortality and length of hospital stay were the most predicted outcomes. On pooled data analysis, compared with clinicians, the OR for diagnostic error of ML models was 0.79 (95% CI, 0.48-1.31; P = .36; I² = 60%) for hip fracture radiographs. For the ML models, the mean (SD) sensitivity was 89.3% (8.5%), specificity was 87.5% (9.9%), and F1 score was 0.90 (0.06). The mean area under the curve for mortality prediction was 0.84 with ML models compared with 0.79 for alternative controls (P = .09).
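For readers less familiar with the metrics reported above, sensitivity, specificity, and the F1 score can all be derived from a 2x2 confusion matrix. The sketch below is illustrative only: the counts are hypothetical and do not come from any included study, and the function name `diagnostic_metrics` is an assumption.

```python
def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute common diagnostic accuracy metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # true-positive rate (recall)
    specificity = tn / (tn + fp)   # true-negative rate
    precision = tp / (tp + fp)     # positive predictive value
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
        "accuracy": accuracy,
    }


# Hypothetical counts for 100 fractured and 100 non-fractured radiographs.
metrics = diagnostic_metrics(tp=90, fp=15, fn=10, tn=85)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

The F1 score is the harmonic mean of precision and sensitivity, so it ignores true negatives; this is why it is usually reported alongside specificity.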
CONCLUSIONS AND RELEVANCE
The findings of this systematic review and meta-analysis suggest that the potential applications of AI to aid with diagnosis from hip radiographs are promising. The performance of AI in diagnosing hip fractures was comparable with that of expert radiologists and surgeons. However, current implementations of AI for outcome prediction do not seem to provide substantial benefit over traditional multivariable predictive statistics.
Topics: Humans; Artificial Intelligence; Hip Fractures; Prognosis; Algorithms; Length of Stay
PubMed: 36930153
DOI: 10.1001/jamanetworkopen.2023.3391 -
Archives of Academic Emergency Medicine 2023
Review
INTRODUCTION
The diagnosis of intussusception can be challenging in children due to the fact that the findings of clinical evaluations are nonspecific and most of the patients present with unclear history. Therefore, in this systematic review and meta-analysis, we aimed to investigate the diagnostic accuracy of ultrasonography for detection of intussusception and also compare the efficacy of point-of-care ultrasound (POCUS) with radiologist-performed ultrasound (RADUS).
METHODS
Two independent reviewers systematically searched different online electronic databases including MEDLINE, Scopus, Web of Science, Google Scholar, Embase, and Cochrane from inception to December 1, 2022 to identify published papers reporting accuracy of ultrasonography for diagnosis of intussusception. The quality assessment of the included studies was investigated using the Quality Assessment of Diagnostic Accuracy Studies (QUADAS)-2 tool.
RESULTS
A total of 1446 records were retrieved in the initial search of databases. After screening the titles, a total of 344 studies were retrieved for detailed full-text assessment. Finally, 37 studies were included in the qualitative and quantitative analysis. The pooled sensitivity and specificity of ultrasonography for diagnosis of intussusception were 0.96 (95% CI: 0.95-0.97) and 0.97 (95% CI: 0.97-0.98), respectively. The pooled positive likelihood ratio (PLR) and negative likelihood ratio (NLR) were 24.57 (95% CI: 8.26-73.03) and 0.05 (95% CI: 0.04-0.08), respectively. The area under the hierarchical summary receiver operating characteristic (HSROC) curve was 0.989. Meta-regression showed no significant difference between the diagnostic performance of POCUS and RADUS (p = 0.06; relative diagnostic odds ratio (rDOR) = 4.38, 95% CI: 0.92-20.89).
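For a single 2x2 table, the likelihood ratios follow directly from sensitivity and specificity. The sketch below applies the standard definitions to the pooled point estimates quoted above; the function name `likelihood_ratios` is hypothetical. Plugging in 0.96 and 0.97 gives a PLR of about 32 and an NLR of about 0.04, which need not match the pooled likelihood ratios reported by the authors, because pooled ratios are typically estimated across studies rather than computed from pooled sensitivity and specificity.

```python
def likelihood_ratios(sensitivity: float, specificity: float) -> tuple:
    """Return (positive LR, negative LR) for a diagnostic test."""
    if not (0.0 < sensitivity < 1.0 and 0.0 < specificity < 1.0):
        raise ValueError("sensitivity and specificity must lie strictly between 0 and 1")
    plr = sensitivity / (1.0 - specificity)   # how much a positive result raises the odds
    nlr = (1.0 - sensitivity) / specificity   # how much a negative result lowers the odds
    return plr, nlr


# Pooled point estimates for ultrasonography quoted in the results above.
plr, nlr = likelihood_ratios(sensitivity=0.96, specificity=0.97)
print(f"PLR = {plr:.1f}, NLR = {nlr:.3f}")
```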
CONCLUSION
This meta-analysis shows that ultrasonography has excellent sensitivity, specificity, and accuracy for diagnosis of intussusception in pediatric patients. Moreover, we found that diagnostic performance of POCUS is similar to that of RADUS for diagnosis of intussusception.
PubMed: 36919137
DOI: 10.22037/aaem.v11i1.1914 -
Asian Pacific Journal of Cancer... Feb 2023
Meta-Analysis
Value of Conventional MRI, DCE-MRI, and DWI-MRI in the Discrimination of Metastatic from Non-Metastatic Lymph Nodes in Rectal Cancer: A Systematic Review and Meta-Analysis Study.
BACKGROUND
Despite many studies on the diagnosis of metastasis to lymph nodes (LNs) in rectal cancer (RC), this diagnosis remains very challenging for radiologists. The purpose of the present study was to assess the diagnostic value of conventional MRI, DCE-MRI, and DWI-MRI in the discrimination of metastatic from non-metastatic lymph nodes in RC.
METHODS
In the present meta-analysis, we searched international databases including PubMed, Scopus, Embase, and Science Direct with appropriate keywords. The variance of each study was calculated using the binomial distribution formula, and the data were analyzed using STATA version 14. The results of the studies were then entered into a random-effects meta-analysis. We used the chi-squared test and the I² index to assess heterogeneity among studies, and funnel plots and Egger tests to evaluate publication bias.
RESULTS
Thirty-one articles published between 2005 and 2021, comprising 2517 patients, were included in the present study. The sensitivity and specificity of DCE-MRI were 83% (74% to 80%) and 86% (80% to 93%), respectively, with PPV 84% (76% to 89%) and NPV 88% (79% to 95%). The sensitivity and specificity of DWI-MRI were 81% (74% to 88%) and 74% (78% to 91%), respectively, with PPV 63% (54% to 74%), NPV 85% (77% to 93%), AUC 80% (75% to 86%), and accuracy 82% (75% to 88%). For conventional MRI, sensitivity was 74% (67% to 80%), specificity 77% (71% to 83%), PPV 62% (48% to 69%), NPV 70% (62% to 77%), AUC 78% (72% to 83%), and accuracy 71% (68% to 78%).
CONCLUSION
Based on our findings, DCE-MRI is the most suitable technique for the discrimination of metastatic lymph nodes in rectal cancer.
Topics: Humans; Databases, Factual; Lymph Nodes; Magnetic Resonance Imaging; Rectal Neoplasms
PubMed: 36853286
DOI: 10.31557/APJCP.2023.24.2.401 -
Diagnostics (Basel, Switzerland) Feb 2023
Review
Limitations of the chest X-ray (CXR) have resulted in attempts to create machine learning systems to assist clinicians and improve interpretation accuracy. An understanding of the capabilities and limitations of modern machine learning systems is necessary for clinicians as these tools begin to permeate practice. This systematic review aimed to provide an overview of machine learning applications designed to facilitate CXR interpretation. A systematic search strategy was executed to identify research, published between January 2020 and September 2022, into machine learning algorithms capable of detecting >2 radiographic findings on CXRs. Model details and study characteristics, including risk of bias and quality, were summarized. Initially, 2248 articles were retrieved, with 46 included in the final review. Published models demonstrated strong standalone performance and were typically as accurate as, or more accurate than, radiologists or non-radiologist clinicians. Multiple studies demonstrated an improvement in the clinical finding classification performance of clinicians when models acted as a diagnostic assistance device. Device performance was compared with that of clinicians in 30% of studies, while effects on clinical perception and diagnosis were evaluated in 19%. Only one study was prospectively run. On average, 128,662 images were used to train and validate models. Most classified fewer than eight clinical findings, while the three most comprehensive models classified 54, 72, and 124 findings. This review suggests that machine learning devices designed to facilitate CXR interpretation perform strongly, improve the detection performance of clinicians, and improve the efficiency of radiology workflow. Several limitations were identified, and clinician involvement and expertise will be key to driving the safe implementation of quality CXR machine learning systems.
PubMed: 36832231
DOI: 10.3390/diagnostics13040743 -
Gland Surgery Dec 2022
BACKGROUND
S-detect is an emerging computer-aided diagnosis (CAD) technique that provides a reference for radiologists to identify breast cancer. Some studies have shown that ultrasound (US) + S-detect improves the diagnostic accuracy of junior radiologists more than that of senior radiologists, but the results are inconsistent across studies. Therefore, this meta-analysis aimed to assess the value of S-detect combined with the US outcomes from senior and junior radiologists for the diagnosis of breast cancer.
METHODS
We searched the PubMed, Cochrane Library, Embase, Web of Science, and Wanfang databases, China Biology Medicine disc, China National Knowledge Infrastructure (CNKI), and VIP database for trials on the diagnostic accuracy of US + S-detect for the diagnosis of breast masses. The search time frame was from the date of establishment of the database to August 20, 2022. Two researchers independently screened the literature, extracted the information, and evaluated the quality of the included literature using the Quality Assessment of Diagnostic Accuracy Studies-2 (QUADAS-2) scale. StataSE 15.1 software was utilized to assess pooled metrics, including sensitivity, specificity, and the area under the curve (AUC).
RESULTS
A total of 19 articles with 3,349 patients and 3,895 breast masses were included in this meta-analysis. Of these, seventeen articles evaluated the diagnostic performance of senior radiologists' US + S-detect for breast cancer, while twelve articles reported junior radiologists' diagnostic performance. The risk of bias was primarily attributed to patient selection, flow and timing. In the senior radiologist group, the pooled sensitivity and specificity of US + S-detect were 0.93 [95% confidence interval (CI): 0.89-0.95] and 0.86 (95% CI: 0.80-0.90), respectively, with an AUC of 0.96. As for the junior radiologist group, the pooled sensitivity and specificity of US + S-detect were 0.89 (95% CI: 0.83-0.93) and 0.79 (95% CI: 0.72-0.84), respectively, and the AUC was 0.91.
CONCLUSIONS
The results of this meta-analysis showed that the pooled sensitivity and the AUC of both the senior and junior radiologist groups were high, indicating good diagnostic efficacy and high clinical applicability. However, the results of this study are highly heterogeneous and need to be validated by collecting more high-quality studies and accumulating a larger sample size.
PubMed: 36654955
DOI: 10.21037/gs-22-643 -
Frontiers in Oncology 2022
BACKGROUND
Here, we conducted a scoping review to (i) establish which machine learning (ML) methods have been applied to hematological malignancy imaging; (ii) establish how ML is being applied to hematological cancer radiology; and (iii) identify addressable research gaps.
METHODS
The review was conducted according to the Preferred Reporting Items for Systematic Reviews and Meta-Analysis Extension for Scoping Reviews guidelines. The inclusion criteria were (i) pediatric and adult patients with suspected or confirmed hematological malignancy undergoing imaging; (ii) any study using ML techniques to derive models using radiological images to apply to the clinical management of these patients; and (iii) original research articles conducted in any setting globally. Quality Assessment of Diagnostic Accuracy Studies 2 criteria were used to assess diagnostic and segmentation studies, while the Newcastle-Ottawa scale was used to assess the quality of observational studies.
RESULTS
Of 53 eligible studies, 33 applied diverse ML techniques to diagnose hematological malignancies or to differentiate them from other diseases, especially discriminating gliomas from primary central nervous system lymphomas (n=18); 11 applied ML to segmentation tasks, while 9 applied ML to prognostication or predicting therapeutic responses, especially for diffuse large B-cell lymphoma. All studies reported discrimination statistics, but no study calculated calibration statistics. Every diagnostic/segmentation study had a high risk of bias due to their case-control design; many studies failed to provide adequate details of the reference standard; and only a few studies used independent validation.
CONCLUSION
To deliver validated ML-based models to radiologists managing hematological malignancies, future studies should (i) adhere to standardized, high-quality reporting guidelines such as the Checklist for Artificial Intelligence in Medical Imaging; (ii) validate models in independent cohorts; (iii) standardize volume segmentation methods for segmentation tasks; (iv) establish comprehensive prospective studies that include different tumor grades, comparisons with radiologists, optimal imaging modalities, sequences, and planes; (v) include side-by-side comparisons of different methods; and (vi) include low- and middle-income countries in multicentric studies to enhance generalizability and reduce inequity.
PubMed: 36605438
DOI: 10.3389/fonc.2022.1080988