benchmarking - OpenMD.com Journal Search

Recommendations for machine learning benchmarks in neuroimaging.

NeuroImage Aug 2022

Authors: Ramona Leenings, Nils R Winter, Udo Dannlowski...

The field of neuroimaging has embraced methods from machine learning in a variety of ways. Although an increasing number of initiatives have published open-access neuroimaging datasets, specifically designed benchmarks are rare in the field. In this article, we first describe how benchmarks in computer science and biomedical imaging have fostered methodological progress in machine learning. Second, we identify the special characteristics of neuroimaging data and outline what researchers have to ensure when establishing a neuroimaging benchmark, how datasets should be composed and how adequate evaluation criteria can be chosen. Based on lessons learned from machine learning benchmarks, we argue for an extended evaluation procedure that, next to applying suitable performance metrics, focuses on scientifically relevant aspects such as explainability, robustness, uncertainty, computational efficiency and code quality. Lastly, we envision a collaborative neuroimaging benchmarking platform that combines the discussed aspects in a collaborative and agile framework, allowing researchers across disciplines to work together on the key predictive problems of the field of neuroimaging and psychiatry.

Topics: Benchmarking; Humans; Machine Learning; Neuroimaging; Psychiatry

PubMed: 35561945
DOI: 10.1016/j.neuroimage.2022.119298

Benchmarking Orthogroup Inference Accuracy: Revisiting Orthobench.

Genome Biology and Evolution Dec 2020

Authors: David M Emms, Steven Kelly

Orthobench is the standard benchmark to assess the accuracy of orthogroup inference methods. It contains 70 expert-curated reference orthogroups (RefOGs) that span the Bilateria and cover a range of different challenges for orthogroup inference. Here, we leveraged improvements in tree inference algorithms and computational resources to reinterrogate these RefOGs and carry out an extensive phylogenetic delineation of their composition. This phylogenetic revision altered the membership of 31 of the 70 RefOGs, with 24 subject to extensive revision and 7 that required minor changes. We further used these revised and updated RefOGs to provide an assessment of the orthogroup inference accuracy of widely used orthogroup inference methods. Finally, we provide an open-source benchmarking suite to support the future development and use of the Orthobench benchmark.
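
One common way to score an inferred orthogroup set against reference orthogroups is pairwise precision and recall over gene pairs. The sketch below illustrates that idea only: the abstract does not state which metrics the Orthobench suite uses, and the function names `gene_pairs` and `pairwise_scores` and the toy gene identifiers are hypothetical.

```python
from itertools import combinations


def gene_pairs(groups):
    """Return the set of unordered gene pairs that share a group."""
    pairs = set()
    for group in groups:
        pairs.update(frozenset(p) for p in combinations(sorted(group), 2))
    return pairs


def pairwise_scores(predicted, reference):
    """Pairwise precision, recall and F-score of predicted vs reference groups."""
    pred_pairs = gene_pairs(predicted)
    ref_pairs = gene_pairs(reference)
    true_pos = len(pred_pairs & ref_pairs)
    precision = true_pos / len(pred_pairs) if pred_pairs else 0.0
    recall = true_pos / len(ref_pairs) if ref_pairs else 0.0
    f_score = (2 * precision * recall / (precision + recall)
               if precision + recall else 0.0)
    return precision, recall, f_score


# Toy data (hypothetical gene IDs, not taken from Orthobench).
reference = [{"a1", "a2", "a3"}, {"b1", "b2"}]
predicted = [{"a1", "a2"}, {"a3", "b1", "b2"}]

precision, recall, f_score = pairwise_scores(predicted, reference)
```

In this toy case the prediction splits one reference group and merges the remainder into another, so it both misses true pairs and adds false ones. That is the kind of error a reference set such as the RefOGs is meant to expose.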

Topics: Benchmarking; Biological Evolution; Computational Biology; Databases, Factual; Genetic Techniques

PubMed: 33022036
DOI: 10.1093/gbe/evaa211

Development and Evaluation of BenchBalance: A System for Benchmarking Balance Capabilities of Wearable Robots and Their Users.

Sensors (Basel, Switzerland) Dec 2021

Authors: Cristina Bayón, Gabriel Delgado-Oleas, Leticia Avellar...

Recent advances in the control of overground exoskeletons are being centered on improving balance support and decreasing the reliance on crutches. However, appropriate methods to quantify the stability of these exoskeletons (and their users) are still under development. A reliable and reproducible balance assessment is critical to enrich exoskeletons' performance and their interaction with humans. In this work, we present the BenchBalance system, which is a benchmarking solution to conduct reproducible balance assessments of exoskeletons and their users. Integrating two key elements, i.e., a hand-held perturbator and a smart garment, BenchBalance is a portable and low-cost system that provides a quantitative assessment related to the reaction and capacity of wearable exoskeletons and their users to respond to controlled external perturbations. A software interface is used to guide the experimenter throughout a predefined protocol of measurable perturbations, taking into account antero-posterior and mediolateral responses. In total, the protocol is composed of sixteen perturbation conditions, which vary in magnitude and location while still controlling their orientation. The data acquired by the interface are classified and saved for a subsequent analysis based on synthetic metrics. In this paper, we present a proof of principle of the BenchBalance system with a healthy user in two scenarios: subject not wearing and subject wearing the H2 lower-limb exoskeleton. After a brief training period, the experimenter was able to provide the manual perturbations of the protocol in a consistent and reproducible way. The balance metrics defined within the BenchBalance framework were able to detect differences in performance depending on the perturbation magnitude, location, and the presence or not of the exoskeleton. The BenchBalance system will be integrated at EUROBENCH facilities to benchmark the balance capabilities of wearable exoskeletons and their users.

Topics: Benchmarking; Crutches; Exoskeleton Device; Humans; Lower Extremity; Wearable Electronic Devices

PubMed: 35009661
DOI: 10.3390/s22010119

Benchmarking microbial growth rate predictions from metagenomes.

The ISME Journal Jan 2021

Authors: Andrew M Long, Shengwei Hou, J Cesar Ignacio-Espinoza...

Growth rates are central to understanding microbial interactions and community dynamics. Metagenomic growth estimators have been developed, specifically codon usage bias (CUB) for maximum growth rates and "peak-to-trough ratio" (PTR) for in situ rates. Both were originally tested with pure cultures, but natural populations are more heterogeneous, especially in individual cell histories pertinent to PTR. To test these methods, we compared predictors with observed growth rates of freshly collected marine prokaryotes in unamended seawater. We prefiltered and diluted samples to remove grazers and greatly reduce virus infection, so net growth approximated gross growth. We sampled over 44 h for abundances and metagenomes, generating 101 metagenome-assembled genomes (MAGs), including Actinobacteria, Verrucomicrobia, SAR406, MGII archaea, etc. We tracked each MAG population by cell-abundance-normalized read recruitment, finding growth rates of 0 to 5.99 per day, the first reported rates for several groups, and used these rates as benchmarks. PTR, calculated by three methods, rarely correlated to growth (r ~ -0.26 to 0.08), except for rapidly growing γ-Proteobacteria (r ~ 0.63 to 0.92), while CUB correlated moderately well to observed maximum growth rates (r = 0.57). This suggests that current PTR approaches poorly predict actual growth of most marine bacterial populations, but maximum growth rates can be approximated from genomic characteristics.
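
As a hedged illustration of the benchmarking logic in this abstract, the sketch below estimates a net specific growth rate from two abundance measurements, assuming exponential growth (rate = ln(N_end / N_start) / elapsed days), and then computes a Pearson correlation between a predictor and the observed rates. The exponential model, the function names and all numbers are assumptions for illustration. They are not the study's data or its exact procedure.

```python
import math


def growth_rate_per_day(n_start, n_end, hours):
    """Net specific growth rate (per day), assuming exponential growth."""
    return math.log(n_end / n_start) / (hours / 24.0)


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical abundances (cells per mL) at 0 h and 44 h for three populations.
observed = [
    growth_rate_per_day(1.0e4, 2.0e4, 44),
    growth_rate_per_day(5.0e3, 4.0e4, 44),
    growth_rate_per_day(2.0e4, 2.0e4, 44),
]
# Hypothetical predictor values (for example a PTR-like index) for the same populations.
predicted = [1.3, 1.9, 1.1]

r = pearson_r(predicted, observed)
```

A population whose abundance does not change over the interval gets a rate of zero, which matches the lower end of the range reported in the abstract.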

Topics: Archaea; Bacteria; Benchmarking; Metagenome; Metagenomics

PubMed: 32939027
DOI: 10.1038/s41396-020-00773-1

Machine learning for RNA 2D structure prediction benchmarked on experimental data.

Briefings in Bioinformatics May 2023

Review

Authors: Marek Justyna, Maciej Antczak, Marta Szachniuk...

Since the 1980s, dozens of computational methods have addressed the problem of predicting RNA secondary structure. Among them are those that follow standard optimization approaches and, more recently, machine learning (ML) algorithms. The former were repeatedly benchmarked on various datasets. The latter, on the other hand, have not yet undergone extensive analysis that could suggest to the user which algorithm best fits the problem to be solved. In this review, we compare 15 methods that predict the secondary structure of RNA, of which 6 are based on deep learning (DL), 3 on shallow learning (SL) and 6 control methods on non-ML approaches. We discuss the ML strategies implemented and perform three experiments in which we evaluate the prediction of (I) representatives of the RNA equivalence classes, (II) selected Rfam sequences and (III) RNAs from new Rfam families. We show that DL-based algorithms (such as SPOT-RNA and UFold) can outperform SL and traditional methods if the data distribution is similar in the training and testing set. However, when predicting 2D structures for new RNA families, the advantage of DL is no longer clear, and its performance is inferior or equal to that of SL and non-ML methods.

Topics: Humans; RNA; Machine Learning; Algorithms; Benchmarking

PubMed: 37096592
DOI: 10.1093/bib/bbad153

DrugMechDB: A Curated Database of Drug Mechanisms.

Scientific Data Sep 2023

Authors: Adriana Carolina Gonzalez-Cavazos, Anna Tanska, Michael Mayers...

Computational drug repositioning methods have emerged as an attractive and effective solution to find new candidates for existing therapies, reducing the time and cost of drug development. Repositioning methods based on biomedical knowledge graphs typically offer useful supporting biological evidence. This evidence is based on reasoning chains or subgraphs that connect a drug to a disease prediction. However, there are no databases of drug mechanisms that can be used to train and evaluate such methods. Here, we introduce the Drug Mechanism Database (DrugMechDB), a manually curated database that describes drug mechanisms as paths through a knowledge graph. DrugMechDB integrates a diverse range of authoritative free-text resources to describe 4,583 drug indications with 32,249 relationships, representing 14 major biological scales. DrugMechDB can be employed as a benchmark dataset for assessing computational drug repositioning models or as a valuable resource for training such models.

Topics: Benchmarking; Databases, Factual; Drug Development; Drug Repositioning; Knowledge

PubMed: 37717042
DOI: 10.1038/s41597-023-02534-z

USP General Chapter <825> Impact on Nuclear Medicine Technology Practice.

Journal of Nuclear Medicine Technology Jun 2020

Review

Authors: George H Hinkle

U.S. Pharmacopeia (USP) general chapter <825>, "Radiopharmaceuticals: Preparation, Compounding, Dispensing, and Repackaging," is a new standard proposed to provide minimum requirements for the preparation, compounding, dispensing, and repackaging of sterile and nonsterile radiopharmaceuticals. This new standard represents endeavors on the part of the USP to respond to appeals by nuclear medicine professionals to move beyond a minimal supplement to USP <797> and provide policies specific to radiopharmaceuticals. USP <825> provides nuclear pharmacies and nuclear medicine departments in hospitals and clinics with the benchmarks to assess current practice activities and integrate needed changes to meet regulatory and accreditation audit reviews. This continuing education article focuses on components of USP <825> specific to the nuclear medicine technologist for a better understanding of obligations when preparing sterile radiopharmaceuticals for clinical use.

Topics: Benchmarking; Humans; Nuclear Medicine; Organizations, Nonprofit; United States

PubMed: 32499321
DOI: 10.2967/jnmt.120.243378

Application of docking methodologies to modeled proteins.

Proteins Sep 2020

Authors: Amar Singh, Taras Dauzhenka, Petras J Kundrotas...

Protein docking is essential for structural characterization of protein interactions. Besides providing the structure of protein complexes, modeling of proteins and their complexes is important for understanding the fundamental principles and specific aspects of protein interactions. The accuracy of protein modeling, in general, is still less than that of the experimental approaches. Thus, it is important to investigate the applicability of docking techniques to modeled proteins. We present new comprehensive benchmark sets of protein models for the development and validation of protein docking, as well as a systematic assessment of free and template-based docking techniques on these sets. As opposed to previous studies, the benchmark sets reflect the real case modeling/docking scenario where the accuracy of the models is assessed by the modeling procedure, without reference to the native structure (which would be unknown in practical applications). We also expanded the analysis to include docking of protein pairs where proteins have different structural accuracy. The results show that, in general, the template-based docking is less sensitive to the structural inaccuracies of the models than the free docking. The near-native docking poses generated by the template-based approach, typically, also have higher ranks than those produced by the free docking (although the free docking is indispensable in modeling the multiplicity of protein interactions in a crowded cellular environment). The results show that docking techniques are applicable to protein models in a broad range of modeling accuracy. The study provides clear guidelines for practical applications of docking to protein models.

Topics: Amino Acid Sequence; Benchmarking; Binding Sites; Databases, Protein; Molecular Docking Simulation; Protein Binding; Protein Structure, Secondary; Proteins; Software

PubMed: 32170770
DOI: 10.1002/prot.25889

Benchmarks in Liver Resection for Intrahepatic Cholangiocarcinoma.

Annals of Surgical Oncology May 2024

Authors: Laura Alaimo, Yutaka Endo, Giovanni Catalano...

INTRODUCTION

Benchmarking in surgery has been proposed as a means to compare results across institutions to establish best practices. We sought to define benchmark values for hepatectomy for intrahepatic cholangiocarcinoma (ICC) across an international population.

METHODS

Patients who underwent liver resection for ICC between 1990 and 2020 were identified from an international database, including 14 Eastern and Western institutions. Patients operated on at high-volume centers who had no preoperative jaundice, ASA class <3, body mass index <35 kg/m², without need for bile duct or vascular resection were chosen as the benchmark group.

RESULTS

Among 1193 patients who underwent curative-intent hepatectomy for ICC, 600 (50.3%) were included in the benchmark group. Among benchmark patients, median age was 58.0 years (interquartile range [IQR] 49.0-67.0), only 28 (4.7%) patients received neoadjuvant therapy, and most patients had a minor resection (n = 499, 83.2%). Benchmark values included ≥3 lymph nodes retrieved when lymphadenectomy was performed, blood loss ≤600 mL, perioperative blood transfusion rate ≤42.9%, and operative time ≤339 min. The postoperative benchmark values included TOO achievement ≥59.3%, positive resection margin ≤27.5%, 30-day readmission ≤3.6%, Clavien-Dindo III or more complications ≤14.3%, and 90-day mortality ≤4.8%, as well as hospital stay ≤14 days.

CONCLUSIONS

Benchmark cutoffs targeting short-term perioperative outcomes can help to facilitate comparisons across hospitals performing liver resection for ICC, assess inter-institutional variation, and identify the highest-performing centers to improve surgical and oncologic outcomes.
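
To illustrate how such cutoffs might be applied, the sketch below compares one center's results against a subset of the benchmark values listed in the results. The cutoffs are copied from the abstract. The center's figures, the dictionary keys and the function name `compare_to_benchmark` are hypothetical.

```python
import operator

# Benchmark cutoffs from the abstract: (comparison, threshold).
# "major_complication_pct" stands for Clavien-Dindo III or more complications.
BENCHMARKS = {
    "blood_loss_ml": (operator.le, 600),
    "transfusion_rate_pct": (operator.le, 42.9),
    "operative_time_min": (operator.le, 339),
    "too_rate_pct": (operator.ge, 59.3),
    "positive_margin_pct": (operator.le, 27.5),
    "readmission_30d_pct": (operator.le, 3.6),
    "major_complication_pct": (operator.le, 14.3),
    "mortality_90d_pct": (operator.le, 4.8),
    "hospital_stay_days": (operator.le, 14),
}


def compare_to_benchmark(center):
    """Map each reported metric to True if it meets the benchmark cutoff."""
    return {
        name: compare(center[name], threshold)
        for name, (compare, threshold) in BENCHMARKS.items()
        if name in center
    }


# Hypothetical center-level results, for illustration only.
center = {
    "blood_loss_ml": 450,
    "transfusion_rate_pct": 30.0,
    "too_rate_pct": 55.0,
    "mortality_90d_pct": 5.2,
}

result = compare_to_benchmark(center)
unmet = sorted(name for name, ok in result.items() if not ok)
```

Storing the comparison direction next to each threshold keeps "at least" metrics such as TOO achievement separate from "at most" metrics such as mortality, so a single loop can handle both.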

Topics: Humans; Middle Aged; Bile Ducts, Intrahepatic; Benchmarking; Hepatectomy; Bile Duct Neoplasms; Cholangiocarcinoma; Retrospective Studies

PubMed: 38214817
DOI: 10.1245/s10434-023-14880-8

Extensible benchmarking of methods that identify and quantify polyadenylation sites from RNA-seq data.

RNA (New York, N.Y.) Dec 2023

Review

Authors: Sam Bryce-Smith, Dominik Burri, Matthew R Gazzara...

The tremendous rate with which data is generated and analysis methods emerge makes it increasingly difficult to keep track of their domain of applicability, assumptions, limitations, and consequently, of the efficacy and precision with which they solve specific tasks. Therefore, there is an increasing need for benchmarks, and for the provision of infrastructure for continuous method evaluation. APAeval is an international community effort, organized by the RNA Society in 2021, to benchmark tools for the identification and quantification of the usage of alternative polyadenylation (APA) sites from short-read, bulk RNA-sequencing (RNA-seq) data. Here, we reviewed 17 tools and benchmarked eight on their ability to perform APA identification and quantification, using a comprehensive set of RNA-seq experiments comprising real, synthetic, and matched 3'-end sequencing data. To support continuous benchmarking, we have incorporated the results into the OpenEBench online platform, which allows for continuous extension of the set of methods, metrics, and challenges. We envisage that our analyses will assist researchers in selecting the appropriate tools for their studies, while the containers and reproducible workflows could easily be deployed and extended to evaluate new methods or data sets.

Topics: RNA; Benchmarking; RNA-Seq; Polyadenylation; Sequence Analysis, RNA

PubMed: 37816550
DOI: 10.1261/rna.079849.123