Genome Research, Feb 2022
Genotyping from sequencing is the basis of emerging strategies in the molecular breeding of polyploid plants. However, compared with the situation for diploids, in which genotyping accuracies are confidently determined with comprehensive benchmarks, polyploids have been neglected; there are no benchmarks measuring genotyping error rates for small variants using real sequencing reads. We previously introduced a variant calling method, Octopus, that accurately calls germline variants in diploids and somatic mutations in tumors. Here, we evaluate Octopus and other popular tools on whole-genome tetraploid and hexaploid data sets created using in silico mixtures of diploid Genome in a Bottle (GIAB) samples. We find that genotyping errors are abundant for typical sequencing depths but that Octopus makes 25% fewer errors than other methods on average. We supplement our benchmarks with concordance analysis in real autotriploid banana data sets.
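As a sketch of the error-rate metric such a benchmark computes (a hypothetical helper, not Octopus's code): a polyploid genotype is an unordered multiset of alleles, so a tetraploid call matches the truth only if its allele counts agree, regardless of order.

```python
from collections import Counter

def genotypes_match(truth, call):
    """Compare two polyploid genotypes as unordered allele multisets."""
    return Counter(truth) == Counter(call)

def genotyping_error_rate(truth_calls, test_calls):
    """Fraction of truth sites where the called genotype disagrees.
    Sites missing from test_calls count as errors."""
    errors = 0
    for site, truth in truth_calls.items():
        call = test_calls.get(site)
        if call is None or not genotypes_match(truth, call):
            errors += 1
    return errors / len(truth_calls)
```

For a tetraploid site with truth A/A/T/T, the call T/A/T/A matches while A/A/A/T does not; aggregating such comparisons over all truth sites yields the error rates the benchmark reports.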
Topics: Benchmarking; Genotype; High-Throughput Nucleotide Sequencing; Humans; Polyploidy
PubMed: 34965940
DOI: 10.1101/gr.275579.121
Neuron, Nov 2020
Review
A potentially organizing goal of the brain and cognitive sciences is to accurately explain domains of human intelligence as executable, neurally mechanistic models. Years of research have led to models that capture experimental results in individual behavioral tasks and individual brain regions. We here advocate for taking the next step: integrating experimental results from many laboratories into suites of benchmarks that, when considered together, push mechanistic models toward explaining entire domains of intelligence, such as vision, language, and motor control. Given recent successes of neurally mechanistic models and the surging availability of neural, anatomical, and behavioral data, we believe that now is the time to create integrative benchmarking platforms that incentivize ambitious, unified models. This perspective discusses the advantages and the challenges of this approach and proposes specific steps to achieve this goal in the domain of visual intelligence with the case study of an integrative benchmarking platform called Brain-Score.
Topics: Benchmarking; Brain; Humans; Intelligence; Models, Neurological; Neural Networks, Computer
PubMed: 32918861
DOI: 10.1016/j.neuron.2020.07.040
Healthcare Policy = Politiques de Santé, May 2012
Review
Benchmarking, a management approach for implementing best practices at best cost, is a recent concept in the healthcare system. The objectives of this paper are to better understand the concept and its evolution in the healthcare sector, to propose an operational definition, and to describe some French and international experiences of benchmarking in the healthcare sector. To this end, we reviewed the literature on this approach's emergence in the industrial sector, its evolution, its fields of application and examples of how it has been used in the healthcare sector. Benchmarking is often thought to consist simply of comparing indicators and is not perceived in its entirety, that is, as a tool based on voluntary and active collaboration among several organizations to create a spirit of competition and to apply best practices. The key feature of benchmarking is its integration within a comprehensive and participatory policy of continuous quality improvement (CQI). Conditions for successful benchmarking focus essentially on careful preparation of the process, monitoring of the relevant indicators, staff involvement and inter-organizational visits. Compared to methods previously implemented in France (CQI and collaborative projects), benchmarking has specific features that set it apart as a healthcare innovation. This is especially true for healthcare or medical-social organizations, as the principle of inter-organizational visiting is not part of their culture. Thus, this approach will need to be assessed for feasibility and acceptability before it is more widely promoted.
Topics: Benchmarking; Humans; Industry; Quality Improvement; Quality Indicators, Health Care; Total Quality Management
PubMed: 23634166
DOI: No ID Found
BMC Bioinformatics, Dec 2023
BACKGROUND
Biclustering is increasingly used in biomedical data analysis, recommendation tasks, and text mining domains, with hundreds of biclustering algorithms proposed. When assessing the performance of these algorithms, real datasets alone are not enough, as they do not offer a solid ground truth. Synthetic data overcome this limitation by producing reference solutions that can be compared with the found patterns. However, generating synthetic datasets is challenging, since the generated data must ensure reproducibility, pattern representativity, and real-data resemblance.
RESULTS
We propose G-Bic, a dataset generator conceived to produce synthetic benchmarks for the normative assessment of biclustering algorithms. Beyond expanding on aspects of pattern coherence, data quality, and positioning properties, it further handles specificities related to mixed-type datasets and time-series data. G-Bic has the flexibility to replicate real data regularities from diverse domains. We provide default configurations to generate reproducible benchmarks for evaluating and comparing diverse aspects of biclustering algorithms. Additionally, we discuss empirical strategies to simulate the properties of real data.
CONCLUSION
G-Bic is a parametrizable generator for biclustering analysis, offering a solid means to assess biclustering solutions according to internal and external metrics robustly.
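A minimal illustration of the idea behind such a generator (a pure-Python sketch; G-Bic's actual API and pattern types differ): plant a known bicluster in a noisy background matrix and return it as the ground truth against which external evaluation metrics can score a biclustering algorithm's output.

```python
import random

def plant_constant_bicluster(n_rows, n_cols, bic_rows, bic_cols,
                             value=5.0, noise=1.0, seed=0):
    """Generate a background matrix of uniform noise and plant one
    constant-value bicluster; return the matrix and the ground truth."""
    rng = random.Random(seed)
    matrix = [[rng.uniform(-noise, noise) for _ in range(n_cols)]
              for _ in range(n_rows)]
    rows = rng.sample(range(n_rows), bic_rows)
    cols = rng.sample(range(n_cols), bic_cols)
    for i in rows:
        for j in cols:
            matrix[i][j] = value
    return matrix, (sorted(rows), sorted(cols))
```

Because the planted row and column sets are returned alongside the data, external metrics (e.g., recovery and relevance scores) can be computed exactly, which is precisely what real datasets cannot provide.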
Topics: Gene Expression Profiling; Reproducibility of Results; Benchmarking; Cluster Analysis; Algorithms
PubMed: 38053078
DOI: 10.1186/s12859-023-05587-4
BMC Bioinformatics, Apr 2022
MOTIVATION
Deep learning has become a prevalent method in identifying genomic regulatory sequences such as promoters. In a number of recent papers, the performance of deep learning models has continually been reported as an improvement over alternatives for sequence-based promoter recognition. However, the performance improvements in these models do not account for the different datasets that models are evaluated on. The lack of a consensus dataset and procedure for benchmarking purposes has made the comparison of each model's true performance difficult to assess.
RESULTS
We present a framework called Supervised Promoter Recognition Framework ('SUPR REF') capable of streamlining the complete process of training, validating, testing, and comparing promoter recognition models in a systematic manner. SUPR REF includes the creation of biologically relevant benchmark datasets to be used in the evaluation process of deep learning promoter recognition models. We showcase this framework by comparing the models' performances on alternative datasets, and properly evaluate previously published models on new benchmark datasets. Our results show that the reliability of deep learning ab initio promoter recognition models on eukaryotic genomic sequences is still not at a sufficient level, as overall performance is still low. These results originate from a subset of promoters, the well-known RNA Polymerase II core promoters. Furthermore, given the observational nature of these data, cross-validation results from small promoter datasets need to be interpreted with caution.
Topics: Benchmarking; Eukaryotic Cells; Genomics; Promoter Regions, Genetic; Reproducibility of Results
PubMed: 35366794
DOI: 10.1186/s12859-022-04647-5
Proteins, Jun 2022
Protein docking protocols typically involve a global docking scan, followed by re-ranking of the scan predictions by more accurate scoring functions that are either computationally too expensive or algorithmically impossible to include in the global scan. Development and validation of scoring methodologies are often performed on scoring benchmark sets (docking decoys), which offer a concise and nonredundant representation of the global docking scan output for a large and diverse set of protein-protein complexes. Two such protein-protein scoring benchmarks were built for the Dockground resource, which contains various datasets for the development and testing of protein docking methodologies. One set was generated based on the Dockground unbound docking benchmark 4, and the other based on protein models from the Dockground model-model benchmark 2. The docking decoys were designed to reflect real-world docking applications (e.g., correct docking predictions defined as near-native rather than native structures) and to minimize the applicability of approaches not directly related to the development of scoring functions (reducing clustering of predictions in the binding funnel and disparity in structural quality of the near-native and nonnative matches). The sets were further characterized by the source organism and the function of the protein-protein complexes. The sets, freely available to the research community on the Dockground webpage, present a unique, user-friendly resource for the development and testing of protein-protein scoring approaches.
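The near-native labeling that such decoy sets rely on can be sketched as follows; the 5 Å ligand-RMSD cutoff and the helper names here are illustrative, not Dockground's exact definition.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equal-length
    lists of (x, y, z) coordinates."""
    assert len(coords_a) == len(coords_b)
    sq = sum((ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
             for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b))
    return math.sqrt(sq / len(coords_a))

def is_near_native(decoy_ligand, native_ligand, cutoff=5.0):
    """Label a docking decoy near-native if its ligand RMSD from the
    native pose (after receptor superposition) is within the cutoff."""
    return rmsd(decoy_ligand, native_ligand) <= cutoff
```

A scoring function under development is then judged by how highly it ranks the near-native decoys relative to the nonnative ones.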
Topics: Benchmarking; Molecular Docking Simulation; Protein Binding; Protein Conformation; Proteins
PubMed: 35072956
DOI: 10.1002/prot.26306
The American Journal of Managed Care, Jun 2020
Review
OBJECTIVES
Poorly defined measurement impairs interinstitutional comparison, interpretation of results, and process improvement in health care operations. We sought to develop a unifying framework that could be used by administrators, practitioners, and investigators to help define and document operational performance measures that are comparable and reproducible.
STUDY DESIGN
Retrospective analysis.
METHODS
Health care operations and clinical investigators used an iterative process consisting of (1) literature review, (2) expert assessment and collaborative design, and (3) end-user feedback. We sampled the literature from the medical, health systems research, and health care operations (business and engineering) disciplines to assemble a representative sample of studies in which outpatient health care performance metrics were used to describe the primary or secondary outcome of the research.
RESULTS
We identified 2 primary deficiencies in outpatient performance metric definitions: incompletion and inconsistency. From our review of performance metrics, we propose the FASStR framework for the Focus, Activity, Statistic, Scale type, and Reference dimensions of a performance metric. The FASStR framework is a method by which performance metrics can be developed and examined from a multidimensional perspective to evaluate their comprehensiveness and clarity. The framework was tested and revised in an iterative process with both practitioners and investigators.
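As an illustration, the five FASStR dimensions could be captured in a simple record type; the field names follow the dimensions listed above, but the waiting-time metric shown is hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PerformanceMetric:
    """A metric documented along the five FASStR dimensions."""
    focus: str       # what entity is measured (e.g., patient, visit)
    activity: str    # the process or event of interest
    statistic: str   # the summary statistic applied (mean, median, rate)
    scale_type: str  # measurement scale (nominal, ordinal, interval, ratio)
    reference: str   # the comparison standard or baseline

    def describe(self):
        return (f"{self.statistic} of {self.activity} per {self.focus}, "
                f"on a {self.scale_type} scale, relative to {self.reference}")

# Hypothetical example: a clinic waiting-time metric
wait_time = PerformanceMetric(
    focus="outpatient visit",
    activity="time from check-in to provider contact",
    statistic="median",
    scale_type="ratio",
    reference="same clinic, prior quarter",
)
```

Requiring every dimension to be filled in makes incomplete or inconsistent metric definitions, the two deficiencies identified above, immediately visible.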
CONCLUSIONS
The FASStR framework can guide the design, development, and implementation of operational metrics in outpatient health care settings. Further, this framework can assist investigators in the evaluation of the metrics that they are using. Overall, the FASStR framework can result in clearer, more consistent use and evaluation of outpatient performance metrics.
Topics: Benchmarking; Data Accuracy; Delivery of Health Care; Efficiency, Organizational; Forecasting; Humans; Quality Indicators, Health Care; Reproducibility of Results; Retrospective Studies; United States
PubMed: 32549066
DOI: 10.37765/ajmc.2020.43492
Psychological Bulletin, Sep 2018
Any mature field of research in psychology, such as short-term/working memory, is characterized by a wealth of empirical findings. It is currently unrealistic to expect a theory to explain them all; theorists must satisfice with explaining a subset of findings. The aim of the present article is to make the choice of that subset less arbitrary and idiosyncratic than is current practice. We propose criteria for identifying benchmark findings that every theory in a field should be able to explain: Benchmarks should be reproducible, generalize across materials and methodological variations, and be theoretically informative. We propose a set of benchmarks for theories and computational models of short-term and working memory. The benchmarks are described in as theory-neutral a way as possible, so that they can serve as empirical common ground for competing theoretical approaches. Benchmarks are rated on three levels according to their priority for explanation. Selection and rating of the benchmarks are based on consensus among the authors, who jointly represent a broad range of theoretical perspectives on working memory, and they are supported by a survey among other experts on working memory. The article is accompanied by a web page providing an open forum for discussion and for submitting proposals for new benchmarks, and a repository of reference data sets for each benchmark.
Topics: Benchmarking; Humans; Memory, Short-Term; Models, Psychological; Psychological Theory
PubMed: 30148379
DOI: 10.1037/bul0000153
Cell Reports Methods, Nov 2023
Molecular representation learning plays an important role in molecular property prediction. Existing molecular property prediction models rely on the de facto standard of covalent-bond-based molecular graphs to represent molecular topology at the atomic level and entirely ignore the non-covalent interactions within the molecule. In this study, we propose a molecular geometric deep learning (GDL) model for predicting molecular properties that aims to comprehensively consider both the covalent and the non-covalent interactions of molecules. The essential idea is to incorporate a more general molecular representation into GDL models. We systematically test molecular GDL (Mol-GDL) on fourteen commonly used benchmark datasets. The results show that Mol-GDL can achieve better performance than state-of-the-art (SOTA) methods. Extensive tests have demonstrated the important role of non-covalent interactions in molecular property prediction and the effectiveness of Mol-GDL models.
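The core representational idea, one interaction graph per distance range so that short cutoffs roughly capture covalent bonds and longer ones add non-covalent contacts, can be sketched as follows (the cutoff values are illustrative, not the paper's):

```python
import math

def build_interaction_graph(atoms, cutoff):
    """Connect every pair of atoms closer than `cutoff` (in angstroms).
    atoms: list of (element, (x, y, z)) tuples. Returns an edge list."""
    edges = []
    for i in range(len(atoms)):
        for j in range(i + 1, len(atoms)):
            (_, a), (_, b) = atoms[i], atoms[j]
            if math.dist(a, b) <= cutoff:
                edges.append((i, j))
    return edges

def multi_range_graphs(atoms, cutoffs=(1.8, 3.5, 5.0)):
    """One graph per cutoff: the shortest range approximates the
    covalent skeleton, longer ranges add non-covalent contacts."""
    return {c: build_interaction_graph(atoms, c) for c in cutoffs}
```

Each graph in the returned set can then be fed to a separate message-passing branch, so the model sees covalent and non-covalent structure rather than the covalent skeleton alone.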
Topics: Deep Learning; Benchmarking; Models, Molecular
PubMed: 37875121
DOI: 10.1016/j.crmeth.2023.100621
Journal of Chemical Information and Modeling, Dec 2020
Macrocycles target proteins that are otherwise considered undruggable because of a lack of hydrophobic cavities and the presence of extended featureless surfaces. Increasing efforts by computational chemists have produced effective software to overcome the restrictions of torsional and conformational freedom that arise as a consequence of macrocyclization. Moloc is an efficient algorithm, with an emphasis on high interactivity, that has been constantly updated since 1986 by drug designers and crystallographers of the Roche biostructural community. In this work, we have benchmarked the shape-guided algorithm using a dataset of 208 macrocycles, carefully selected on the basis of structural complexity. We quantified the accuracy, diversity, speed, exhaustiveness, and sampling efficiency in an automated fashion and compared them with four commercial packages (Prime, MacroModel, Molecular Operating Environment, and molecular dynamics) and four open-access packages (experimental-torsion distance geometry with additional "basic knowledge", alone and with Merck molecular force field or universal force field minimization; the Cambridge Crystallographic Data Centre conformer generator; and Conformator). With three-quarters of the database processed below the threshold of high ring accuracy, Moloc was identified as having the highest sampling efficiency and exhaustiveness, without producing thousands of conformations or randomly splitting the ring into two half-loops, and with the possibility of interactively producing globular or flat conformations with diversity similar to that of Prime, MacroModel, and molecular dynamics. The algorithm and the Python scripts for full automation of these parameters are freely available for academic use.
Topics: Benchmarking; Macrocyclic Compounds; Molecular Conformation; Molecular Dynamics Simulation; Software
PubMed: 33270455
DOI: 10.1021/acs.jcim.0c01038