-
International Journal of Surgery... Mar 2023 (Review)
INTRODUCTION
Benchmarking, a novel measuring tool for outcome comparisons, is a recent concept in surgery. The objectives of this review are to examine the concept, definition, and evolution of benchmarking and its application in surgery.
METHODS
The literature about benchmarking was reviewed through an ever-narrowing search strategy, commencing from the concept, definition, and evolution of benchmarking to the application of benchmarking and experiences of benchmarking in surgery. PubMed, Web of Science, Embase, and Science Direct databases were searched until 20 September 2022, in the English language according to the PRISMA (Preferred Reporting Items for Systematic reviews and Meta-Analyses) guidelines.
RESULTS
In the first phase of the literature search, the development of benchmarking was identified. The definitions of benchmarking evolved from a surveying term to a novel quality-improvement tool to assess the best achievable results in surgery. In the second phase, a total of 23 studies on benchmarking in surgery were identified, covering esophagectomy, hepatic surgery, pancreatic surgery, rectum resection, and bariatric surgery. All studies were multicenter analyses from national, international, or global expert centers. Most studies (87.0%) defined the benchmark as the 75th percentile of the median values of centers. Performance metrics to define benchmarks were clinically relevant intraoperative and postoperative outcome indicators.
CONCLUSION
Benchmarking in surgery is a novel quality-improvement tool to define and measure the best achievable results, establishing a meaningful reference to evaluate surgical performance.
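The benchmark definition reported in the results (the 75th percentile of the centers' median values) can be illustrated with a small calculation. The following is a minimal sketch, not the method of any reviewed study: the function name `benchmark_cutoff`, the center names, and the length-of-stay values are hypothetical, and the inclusive interpolation rule for the percentile is an assumption, since the abstract does not state one.

```python
from statistics import median, quantiles


def benchmark_cutoff(outcomes_by_center):
    """Return the 75th percentile of the per-center median values.

    outcomes_by_center maps a center name to a list of outcome values
    (for example, length of stay in days) for that center.
    """
    # Step 1: summarize each center by its median outcome.
    center_medians = [median(values) for values in outcomes_by_center.values()]
    # Step 2: quantiles(..., n=4) returns the three quartile cut points;
    # index 2 is the third quartile (75th percentile).
    # method="inclusive" is an assumed interpolation choice.
    return quantiles(center_medians, n=4, method="inclusive")[2]


# Hypothetical data: length of stay (days) at four centers.
example = {
    "center_a": [5, 6, 7],
    "center_b": [8, 9, 10],
    "center_c": [6, 7, 8],
    "center_d": [10, 11, 12],
}

cutoff = benchmark_cutoff(example)
print(cutoff)  # center medians are 6, 9, 7, 11, so the cutoff is 9.5
```

For an outcome where lower values are better, a center whose own median falls at or below this cutoff would meet the benchmark under this definition.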
Topics: Humans; Benchmarking; Postoperative Complications; Esophagectomy; Bariatric Surgery; Multicenter Studies as Topic
PubMed: 37093075
DOI: 10.1097/JS9.0000000000000212
-
Nature Communications Jan 2023
A plethora of software suites and multiple classes of spectral libraries have been developed to enhance the depth and robustness of data-independent acquisition (DIA) data processing. However, how the combination of a DIA software tool and a spectral library impacts the outcome of DIA proteomics and phosphoproteomics data analysis has rarely been investigated using benchmark data that mimic biological complexity. In this study, we create DIA benchmark data sets simulating the regulation of thousands of proteins in a complex background, collected on both an Orbitrap and a timsTOF instrument. We evaluate four commonly used software suites (DIA-NN, Spectronaut, MaxDIA and Skyline) combined with seven different spectral libraries in global proteome analysis. Moreover, we assess their performance in analyzing phosphopeptide standards and TNF-α-induced phosphoproteome regulation. Our study provides practical guidance on how to construct a robust data analysis pipeline for different proteomics studies implementing the DIA technique.
Topics: Proteomics; Benchmarking; Workflow; Mass Spectrometry; Software; Proteome
PubMed: 36609502
DOI: 10.1038/s41467-022-35740-1
-
BMC Health Services Research Apr 2017 (Review)
BACKGROUND
Although benchmarking may improve hospital processes, research on this subject is limited. The aim of this study was to provide an overview of publications on benchmarking in specialty hospitals and a description of study characteristics.
METHODS
We searched PubMed and EMBASE for articles published in English in the last 10 years. Eligible articles described a project stating benchmarking as its objective and involving a specialty hospital or specific patient category; or those dealing with the methodology or evaluation of benchmarking.
RESULTS
Of 1,817 articles identified in total, 24 were included in the study. Articles were categorized into: pathway benchmarking, institutional benchmarking, articles on benchmark methodology or evaluation, and benchmarking using a patient registry. There was a large degree of variability: (1) study designs were mostly descriptive and retrospective; (2) not all studies generated and showed data in sufficient detail; and (3) studies varied in whether a benchmarking model was merely described or whether quality improvement resulting from the benchmark was also reported. Most of the studies that described a benchmark model described the use of benchmarking partners from the same industry category, sometimes from all over the world.
CONCLUSIONS
Benchmarking seems to be more developed in eye hospitals, emergency departments and oncology specialty hospitals. Some studies showed promising improvement effects. However, the majority of the articles lacked a structured design and did not report on benchmark outcomes. To evaluate the effectiveness of benchmarking in improving quality in specialty hospitals, robust and structured designs are needed, including a follow-up to check whether the benchmark study has led to improvements.
Topics: Benchmarking; Emergency Service, Hospital; Hospitals, Special; Humans; Models, Theoretical; Quality Improvement; Retrospective Studies
PubMed: 28372574
DOI: 10.1186/s12913-017-2154-y
-
Neuron Nov 2020 (Review)
A potentially organizing goal of the brain and cognitive sciences is to accurately explain domains of human intelligence as executable, neurally mechanistic models. Years of research have led to models that capture experimental results in individual behavioral tasks and individual brain regions. We here advocate for taking the next step: integrating experimental results from many laboratories into suites of benchmarks that, when considered together, push mechanistic models toward explaining entire domains of intelligence, such as vision, language, and motor control. Given recent successes of neurally mechanistic models and the surging availability of neural, anatomical, and behavioral data, we believe that now is the time to create integrative benchmarking platforms that incentivize ambitious, unified models. This perspective discusses the advantages and the challenges of this approach and proposes specific steps to achieve this goal in the domain of visual intelligence with the case study of an integrative benchmarking platform called Brain-Score.
Topics: Benchmarking; Brain; Humans; Intelligence; Models, Neurological; Neural Networks, Computer
PubMed: 32918861
DOI: 10.1016/j.neuron.2020.07.040
-
Journal of Gastrointestinal Surgery :... Mar 2021 (Review)
BACKGROUND
Pancreatic surgery is performed in relatively few centres. There are validated quality benchmarks for pancreatic surgery, although it remains unclear how published benchmarks compare with each other. This study aimed to systematically review published literature to summarise metrics that define quality benchmarks for pancreatic surgery.
METHOD
A search of MEDLINE, EMBASE and CENTRAL was undertaken until June 2019. Articles that developed or validated published quality benchmarks for pancreatic surgery were included. Benchmarks were classified into three domains using the Donabedian framework, and their quality assessed using the AIRE Instrument.
RESULTS
Nineteen studies covering 55 quality metrics were included; 8 studies developed new metrics and 11 validated previously published metrics. The methodology of metric development was either expert opinion-driven or data-driven. All metrics demonstrated moderate quality scores. There was partial agreement on some metrics (e.g. < 10 h total operative duration), but a lack of consensus on most others (e.g. lymph node yield ≥ 10, ≥ 12, ≥ 15, ≥ 16). No metrics related to patient-reported outcomes.
CONCLUSIONS
Published quality benchmarks for pancreatic surgery predominantly arise from eight studies, with heterogeneity in how the metrics were developed. Not all metrics reached consensus. Metrics need to be reviewed as new data emerge, technologies develop and opinions change.
Topics: Benchmarking; Consensus; Humans
PubMed: 33159243
DOI: 10.1007/s11605-020-04827-9
-
Genome Biology Dec 2019 (Review)
Insufficient performance of optimization-based approaches for the fitting of mathematical models is still a major bottleneck in systems biology. In this article, the reasons and methodological challenges are summarized as well as their impact in benchmark studies. Important aspects for achieving an increased level of evidence for benchmark results are discussed. Based on general guidelines for benchmarking in computational biology, a collection of tailored guidelines is presented for performing informative and unbiased benchmarking of optimization-based fitting approaches. Comprehensive benchmark studies based on these recommendations are urgently required for the establishment of a robust and reliable methodology for the systems biology community.
Topics: Benchmarking; Models, Biological; Systems Biology
PubMed: 31842943
DOI: 10.1186/s13059-019-1887-9
-
Annual Review of Public Health Apr 2021 (Review)
Diet-related noncommunicable diseases (NCDs) and obesity are the leading contributors to poor health worldwide. Efforts to improve population diets need to focus on creating healthy food environments. INFORMAS, established in 2012, is an international network that monitors and benchmarks food environments and related policies. By 2020, INFORMAS was active in 58 countries; national government policies were the most frequent aspect benchmarked. INFORMAS has resulted in the development and widespread application of standardized methods for assessing the characteristics of food environments. The activities of INFORMAS have contributed substantially to capacity building, advocacy, stakeholder engagement, and policy evaluation in relation to creating healthy food environments. Future efforts to benchmark food environments need to incorporate measurements related to environmental sustainability. For sustained impact, INFORMAS activities will need to be embedded within other existing monitoring initiatives. The most value will come from repeated assessments that help drive increased accountability for improving food environments.
Topics: Benchmarking; Diet, Healthy; Environment; Global Health; Health Promotion; Humans; Program Evaluation; Public Health
PubMed: 33351647
DOI: 10.1146/annurev-publhealth-100919-114442
-
Structure (London, England : 1993) Jan 2023
Recent advancements in computational tools have allowed protein structure prediction with high accuracy. Computational prediction methods have been used for modeling many soluble and membrane proteins, but the performance of these methods in modeling peptide structures has not yet been systematically investigated. We benchmarked the accuracy of AlphaFold2 in predicting 588 peptide structures between 10 and 40 amino acids using experimentally determined NMR structures as reference. Our results showed AlphaFold2 predicts α-helical, β-hairpin, and disulfide-rich peptides with high accuracy. AlphaFold2 performed at least as well as, if not better than, alternative methods developed specifically for peptide structure prediction. AlphaFold2 showed several shortcomings in predicting Φ/Ψ angles and disulfide bond patterns, and the lowest RMSD structures failed to correlate with the lowest pLDDT ranked structures. In summary, computation can be a powerful tool to predict peptide structures, but additional steps may be necessary to analyze and validate the results.
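The RMSD (root-mean-square deviation) mentioned above measures how far predicted atom positions lie from reference positions. The following is a minimal sketch of that formula only, not the authors' evaluation pipeline: it assumes the two structures are already superimposed and list the same atoms in the same order, and the function name `rmsd` and the coordinates are hypothetical.

```python
import math


def rmsd(predicted, reference):
    """Root-mean-square deviation between two equal-length coordinate lists.

    Each argument is a list of (x, y, z) tuples in angstroms. The structures
    are assumed to be superimposed already; no alignment is performed here.
    """
    if not predicted or len(predicted) != len(reference):
        raise ValueError("coordinate lists must be non-empty and of equal length")
    # Sum the squared distance between each matched pair of atoms.
    squared_sum = sum(
        (p - r) ** 2
        for atom_pred, atom_ref in zip(predicted, reference)
        for p, r in zip(atom_pred, atom_ref)
    )
    # Average over atoms, then take the square root.
    return math.sqrt(squared_sum / len(predicted))


# Hypothetical two-atom example: every atom is displaced by 1 angstrom along z.
pred = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
ref = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(pred, ref))  # 1.0
```

In practice, structures are usually superimposed before RMSD is computed, which this sketch deliberately leaves out.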
Topics: Protein Structure, Secondary; Benchmarking; Peptides; Membrane Proteins; Disulfides; Protein Conformation
PubMed: 36525975
DOI: 10.1016/j.str.2022.11.012
-
Nature Communications Mar 2023
Integration of single-cell RNA sequencing data between different samples has been a major challenge for analyzing cell populations. However, strategies to integrate differential expression analysis of single-cell data remain underinvestigated. Here, we benchmark 46 workflows for differential expression analysis of single-cell data with multiple batches. We show that batch effects, sequencing depth and data sparsity substantially impact their performance. Notably, we find that the use of batch-corrected data rarely improves the analysis for sparse data, whereas batch covariate modeling improves the analysis for substantial batch effects. We show that for low-depth data, single-cell techniques based on zero-inflation models degrade performance, whereas the analysis of uncorrected data using limmatrend, the Wilcoxon test and a fixed effects model performs well. We suggest several high-performance methods under different conditions based on various simulation and real data analyses. Additionally, we demonstrate that differential expression analysis for a specific cell type outperforms that of large-scale bulk sample data in prioritizing disease-related genes.
Topics: Sequence Analysis, RNA; Benchmarking; Computer Simulation; Workflow; Data Analysis; Single-Cell Analysis; Gene Expression Profiling
PubMed: 36944632
DOI: 10.1038/s41467-023-37126-3
-
RNA (New York, N.Y.) Dec 2023 (Review)
The tremendous rate with which data is generated and analysis methods emerge makes it increasingly difficult to keep track of their domain of applicability, assumptions, limitations, and consequently, of the efficacy and precision with which they solve specific tasks. Therefore, there is an increasing need for benchmarks, and for the provision of infrastructure for continuous method evaluation. APAeval is an international community effort, organized by the RNA Society in 2021, to benchmark tools for the identification and quantification of the usage of alternative polyadenylation (APA) sites from short-read, bulk RNA-sequencing (RNA-seq) data. Here, we reviewed 17 tools and benchmarked eight on their ability to perform APA identification and quantification, using a comprehensive set of RNA-seq experiments comprising real, synthetic, and matched 3'-end sequencing data. To support continuous benchmarking, we have incorporated the results into the OpenEBench online platform, which allows for continuous extension of the set of methods, metrics, and challenges. We envisage that our analyses will assist researchers in selecting the appropriate tools for their studies, while the containers and reproducible workflows could easily be deployed and extended to evaluate new methods or data sets.
Topics: RNA; Benchmarking; RNA-Seq; Polyadenylation; Sequence Analysis, RNA
PubMed: 37816550
DOI: 10.1261/rna.079849.123