Genome Research, Feb 2022
Genotyping from sequencing is the basis of emerging strategies in the molecular breeding of polyploid plants. However, compared with the situation for diploids, in which genotyping accuracies are confidently determined with comprehensive benchmarks, polyploids have been neglected; there are no benchmarks measuring genotyping error rates for small variants using real sequencing reads. We previously introduced a variant calling method, Octopus, that accurately calls germline variants in diploids and somatic mutations in tumors. Here, we evaluate Octopus and other popular tools on whole-genome tetraploid and hexaploid data sets created using in silico mixtures of diploid Genome in a Bottle (GIAB) samples. We find that genotyping errors are abundant for typical sequencing depths but that Octopus makes 25% fewer errors than other methods on average. We supplement our benchmarks with concordance analysis in real autotriploid banana data sets.
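As a sketch of the error-rate metric such a benchmark computes (a hypothetical helper, not Octopus's code): a polyploid genotype is an unordered multiset of alleles, so a tetraploid call matches the truth only if its allele counts agree, regardless of order.

```python
from collections import Counter

def genotypes_match(truth, call):
    """Compare two polyploid genotypes as unordered allele multisets."""
    return Counter(truth) == Counter(call)

def genotyping_error_rate(truth_calls, test_calls):
    """Fraction of truth sites where the called genotype disagrees.
    Sites missing from test_calls count as errors."""
    errors = 0
    for site, truth in truth_calls.items():
        call = test_calls.get(site)
        if call is None or not genotypes_match(truth, call):
            errors += 1
    return errors / len(truth_calls)
```

For a tetraploid site with truth A/A/T/T, the call T/A/T/A matches while A/A/A/T does not; aggregating such comparisons over all truth sites yields the error rates the benchmark reports.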
Topics: Benchmarking; Genotype; High-Throughput Nucleotide Sequencing; Humans; Polyploidy
PubMed: 34965940
DOI: 10.1101/gr.275579.121
Neuron, Nov 2020
Review
A potentially organizing goal of the brain and cognitive sciences is to accurately explain domains of human intelligence as executable, neurally mechanistic models. Years of research have led to models that capture experimental results in individual behavioral tasks and individual brain regions. We here advocate for taking the next step: integrating experimental results from many laboratories into suites of benchmarks that, when considered together, push mechanistic models toward explaining entire domains of intelligence, such as vision, language, and motor control. Given recent successes of neurally mechanistic models and the surging availability of neural, anatomical, and behavioral data, we believe that now is the time to create integrative benchmarking platforms that incentivize ambitious, unified models. This perspective discusses the advantages and the challenges of this approach and proposes specific steps to achieve this goal in the domain of visual intelligence with the case study of an integrative benchmarking platform called Brain-Score.
Topics: Benchmarking; Brain; Humans; Intelligence; Models, Neurological; Neural Networks, Computer
PubMed: 32918861
DOI: 10.1016/j.neuron.2020.07.040
Healthcare Policy = Politiques de Santé, May 2012
Review
Benchmarking, a management approach for implementing best practices at best cost, is a recent concept in the healthcare system. The objectives of this paper are to better understand the concept and its evolution in the healthcare sector, to propose an operational definition, and to describe some French and international experiences of benchmarking in the healthcare sector. To this end, we reviewed the literature on this approach's emergence in the industrial sector, its evolution, its fields of application and examples of how it has been used in the healthcare sector. Benchmarking is often thought to consist simply of comparing indicators and is not perceived in its entirety, that is, as a tool based on voluntary and active collaboration among several organizations to create a spirit of competition and to apply best practices. The key feature of benchmarking is its integration within a comprehensive and participatory policy of continuous quality improvement (CQI). Conditions for successful benchmarking focus essentially on careful preparation of the process, monitoring of the relevant indicators, staff involvement and inter-organizational visits. Compared to methods previously implemented in France (CQI and collaborative projects), benchmarking has specific features that set it apart as a healthcare innovation. This is especially true for healthcare or medical-social organizations, as the principle of inter-organizational visiting is not part of their culture. Thus, this approach will need to be assessed for feasibility and acceptability before it is more widely promoted.
Topics: Benchmarking; Humans; Industry; Quality Improvement; Quality Indicators, Health Care; Total Quality Management
PubMed: 23634166
DOI: No ID Found
BMC Bioinformatics, Dec 2023
BACKGROUND
Biclustering is increasingly used in biomedical data analysis, recommendation tasks, and text mining domains, with hundreds of biclustering algorithms proposed. When assessing the performance of these algorithms, real datasets alone are not enough, as they do not offer a solid ground truth. Synthetic data overcome this limitation by producing reference solutions that can be compared with the found patterns. However, generating synthetic datasets is challenging, since the generated data must ensure reproducibility, pattern representativity, and real-data resemblance.
RESULTS
We propose G-Bic, a dataset generator conceived to produce synthetic benchmarks for the normative assessment of biclustering algorithms. Beyond expanding on aspects of pattern coherence, data quality, and positioning properties, it further handles specificities related to mixed-type datasets and time-series data. G-Bic has the flexibility to replicate real data regularities from diverse domains. We provide default configurations to generate reproducible benchmarks for evaluating and comparing diverse aspects of biclustering algorithms. Additionally, we discuss empirical strategies to simulate the properties of real data.
CONCLUSION
G-Bic is a parametrizable generator for biclustering analysis, offering a solid means to assess biclustering solutions according to internal and external metrics robustly.
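A minimal illustration of the idea behind such a generator (a pure-Python sketch; G-Bic's actual API and pattern types differ): plant a known bicluster in a noisy background matrix and return it as the ground truth against which external evaluation metrics can score a biclustering algorithm's output.

```python
import random

def plant_constant_bicluster(n_rows, n_cols, bic_rows, bic_cols,
                             value=5.0, noise=1.0, seed=0):
    """Generate a background matrix of uniform noise and plant one
    constant-value bicluster; return the matrix and the ground truth."""
    rng = random.Random(seed)
    matrix = [[rng.uniform(-noise, noise) for _ in range(n_cols)]
              for _ in range(n_rows)]
    rows = rng.sample(range(n_rows), bic_rows)
    cols = rng.sample(range(n_cols), bic_cols)
    for i in rows:
        for j in cols:
            matrix[i][j] = value
    return matrix, (sorted(rows), sorted(cols))
```

Because the planted row and column sets are returned alongside the data, external metrics (e.g., recovery and relevance scores) can be computed exactly, which is precisely what real datasets cannot provide.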
Topics: Gene Expression Profiling; Reproducibility of Results; Benchmarking; Cluster Analysis; Algorithms
PubMed: 38053078
DOI: 10.1186/s12859-023-05587-4
BMC Bioinformatics, Apr 2022
MOTIVATION
Deep learning has become a prevalent method in identifying genomic regulatory sequences such as promoters. In a number of recent papers, the performance of deep learning models has continually been reported as an improvement over alternatives for sequence-based promoter recognition. However, the performance improvements in these models do not account for the different datasets that models are evaluated on. The lack of a consensus dataset and procedure for benchmarking purposes has made the comparison of each model's true performance difficult to assess.
RESULTS
We present a framework called Supervised Promoter Recognition Framework ('SUPR REF') capable of streamlining the complete process of training, validating, testing, and comparing promoter recognition models in a systematic manner. SUPR REF includes the creation of biologically relevant benchmark datasets to be used in the evaluation process of deep learning promoter recognition models. We showcase this framework by comparing the models' performances on alternative datasets, and properly evaluate previously published models on new benchmark datasets. Our results show that the reliability of deep learning ab initio promoter recognition models on eukaryotic genomic sequences is still not at a sufficient level, as overall performance is still low. These results originate from a subset of promoters, the well-known RNA Polymerase II core promoters. Furthermore, given the observational nature of these data, cross-validation results from small promoter datasets need to be interpreted with caution.
Topics: Benchmarking; Eukaryotic Cells; Genomics; Promoter Regions, Genetic; Reproducibility of Results
PubMed: 35366794
DOI: 10.1186/s12859-022-04647-5
Proteins, Jun 2022
Protein docking protocols typically involve a global docking scan, followed by re-ranking of the scan predictions by more accurate scoring functions that are either computationally too expensive or algorithmically impossible to include in the global scan. Development and validation of scoring methodologies are often performed on scoring benchmark sets (docking decoys), which offer a concise and nonredundant representation of the global docking scan output for a large and diverse set of protein-protein complexes. Two such protein-protein scoring benchmarks were built for the Dockground resource, which contains various datasets for the development and testing of protein docking methodologies. One set was generated based on the Dockground unbound docking benchmark 4, and the other based on protein models from the Dockground model-model benchmark 2. The docking decoys were designed to reflect real-world docking applications (e.g., correct docking predictions defined as near-native rather than native structures) and to minimize the applicability of approaches not directly related to the development of scoring functions (reducing clustering of predictions in the binding funnel and disparity in structural quality of the near-native and nonnative matches). The sets were further characterized by the source organism and the function of the protein-protein complexes. The sets, freely available to the research community on the Dockground webpage, present a unique, user-friendly resource for the development and testing of protein-protein scoring approaches.
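The near-native labeling that such decoy sets rely on can be sketched as follows; the 5 Å ligand-RMSD cutoff and the helper names here are illustrative, not Dockground's exact definition.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equal-length
    lists of (x, y, z) coordinates."""
    assert len(coords_a) == len(coords_b)
    sq = sum((ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
             for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b))
    return math.sqrt(sq / len(coords_a))

def is_near_native(decoy_ligand, native_ligand, cutoff=5.0):
    """Label a docking decoy near-native if its ligand RMSD from the
    native pose (after receptor superposition) is within the cutoff."""
    return rmsd(decoy_ligand, native_ligand) <= cutoff
```

A scoring function under development is then judged by how highly it ranks the near-native decoys relative to the nonnative ones.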
Topics: Benchmarking; Molecular Docking Simulation; Protein Binding; Protein Conformation; Proteins
PubMed: 35072956
DOI: 10.1002/prot.26306
The American Journal of Managed Care, Jun 2020
Review
OBJECTIVES
Poorly defined measurement impairs interinstitutional comparison, interpretation of results, and process improvement in health care operations. We sought to develop a unifying framework that could be used by administrators, practitioners, and investigators to help define and document operational performance measures that are comparable and reproducible.
STUDY DESIGN
Retrospective analysis.
METHODS
Health care operations and clinical investigators used an iterative process consisting of (1) literature review, (2) expert assessment and collaborative design, and (3) end-user feedback. We sampled the literature from the medical, health systems research, and health care operations (business and engineering) disciplines to assemble a representative sample of studies in which outpatient health care performance metrics were used to describe the primary or secondary outcome of the research.
RESULTS
We identified 2 primary deficiencies in outpatient performance metric definitions: incompletion and inconsistency. From our review of performance metrics, we propose the FASStR framework for the Focus, Activity, Statistic, Scale type, and Reference dimensions of a performance metric. The FASStR framework is a method by which performance metrics can be developed and examined from a multidimensional perspective to evaluate their comprehensiveness and clarity. The framework was tested and revised in an iterative process with both practitioners and investigators.
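As an illustration, the five FASStR dimensions could be captured in a simple record type; the field names follow the dimensions listed above, but the waiting-time metric shown is hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PerformanceMetric:
    """A metric documented along the five FASStR dimensions."""
    focus: str       # what entity is measured (e.g., patient, visit)
    activity: str    # the process or event of interest
    statistic: str   # the summary statistic applied (mean, median, rate)
    scale_type: str  # measurement scale (nominal, ordinal, interval, ratio)
    reference: str   # the comparison standard or baseline

    def describe(self):
        return (f"{self.statistic} of {self.activity} per {self.focus}, "
                f"on a {self.scale_type} scale, relative to {self.reference}")

# Hypothetical example: a clinic waiting-time metric
wait_time = PerformanceMetric(
    focus="outpatient visit",
    activity="time from check-in to provider contact",
    statistic="median",
    scale_type="ratio",
    reference="same clinic, prior quarter",
)
```

Requiring every dimension to be filled in makes incomplete or inconsistent metric definitions, the two deficiencies identified above, immediately visible.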
CONCLUSIONS
The FASStR framework can guide the design, development, and implementation of operational metrics in outpatient health care settings. Further, this framework can assist investigators in the evaluation of the metrics that they are using. Overall, the FASStR framework can result in clearer, more consistent use and evaluation of outpatient performance metrics.
Topics: Benchmarking; Data Accuracy; Delivery of Health Care; Efficiency, Organizational; Forecasting; Humans; Quality Indicators, Health Care; Reproducibility of Results; Retrospective Studies; United States
PubMed: 32549066
DOI: 10.37765/ajmc.2020.43492
Psychological Bulletin, Sep 2018
Any mature field of research in psychology, such as short-term/working memory, is characterized by a wealth of empirical findings. It is currently unrealistic to expect a theory to explain them all; theorists must satisfice with explaining a subset of findings. The aim of the present article is to make the choice of that subset less arbitrary and idiosyncratic than is current practice. We propose criteria for identifying benchmark findings that every theory in a field should be able to explain: Benchmarks should be reproducible, generalize across materials and methodological variations, and be theoretically informative. We propose a set of benchmarks for theories and computational models of short-term and working memory. The benchmarks are described in as theory-neutral a way as possible, so that they can serve as empirical common ground for competing theoretical approaches. Benchmarks are rated on three levels according to their priority for explanation. Selection and rating of the benchmarks are based on consensus among the authors, who jointly represent a broad range of theoretical perspectives on working memory, and they are supported by a survey among other experts on working memory. The article is accompanied by a web page providing an open forum for discussion and for submitting proposals for new benchmarks, and a repository of reference data sets for each benchmark.
Topics: Benchmarking; Humans; Memory, Short-Term; Models, Psychological; Psychological Theory
PubMed: 30148379
DOI: 10.1037/bul0000153
Cell Reports Methods, Nov 2023
Molecular representation learning plays an important role in molecular property prediction. Existing molecular property prediction models rely on the de facto standard of covalent-bond-based molecular graphs to represent molecular topology at the atomic level and entirely ignore the non-covalent interactions within the molecule. In this study, we propose a molecular geometric deep learning (GDL) model for predicting molecular properties that aims to comprehensively consider both the covalent and the non-covalent interactions of molecules. The essential idea is to incorporate a more general molecular representation into GDL models. We systematically test molecular GDL (Mol-GDL) on fourteen commonly used benchmark datasets. The results show that Mol-GDL can achieve better performance than state-of-the-art (SOTA) methods. Extensive tests have demonstrated the important role of non-covalent interactions in molecular property prediction and the effectiveness of Mol-GDL models.
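The core representational idea, one interaction graph per distance range so that short cutoffs roughly capture covalent bonds and longer ones add non-covalent contacts, can be sketched as follows (the cutoff values are illustrative, not the paper's):

```python
import math

def build_interaction_graph(atoms, cutoff):
    """Connect every pair of atoms closer than `cutoff` (in angstroms).
    atoms: list of (element, (x, y, z)) tuples. Returns an edge list."""
    edges = []
    for i in range(len(atoms)):
        for j in range(i + 1, len(atoms)):
            (_, a), (_, b) = atoms[i], atoms[j]
            if math.dist(a, b) <= cutoff:
                edges.append((i, j))
    return edges

def multi_range_graphs(atoms, cutoffs=(1.8, 3.5, 5.0)):
    """One graph per cutoff: the shortest range approximates the
    covalent skeleton, longer ranges add non-covalent contacts."""
    return {c: build_interaction_graph(atoms, c) for c in cutoffs}
```

Each graph in the returned set can then be fed to a separate message-passing branch, so the model sees covalent and non-covalent structure rather than the covalent skeleton alone.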
Topics: Deep Learning; Benchmarking; Models, Molecular
PubMed: 37875121
DOI: 10.1016/j.crmeth.2023.100621
Journal of Chemical Information and Modeling, Dec 2020
Macrocycles target proteins that are otherwise considered undruggable because of a lack of hydrophobic cavities and the presence of extended featureless surfaces. Increasing efforts by computational chemists have produced effective software to overcome the restrictions of torsional and conformational freedom that arise as a consequence of macrocyclization. Moloc is an efficient algorithm, with an emphasis on high interactivity, that has been constantly updated since 1986 by drug designers and crystallographers of the Roche biostructural community. In this work, we have benchmarked the shape-guided algorithm using a dataset of 208 macrocycles, carefully selected on the basis of structural complexity. We quantified the accuracy, diversity, speed, exhaustiveness, and sampling efficiency in an automated fashion and compared them with four commercial packages (Prime, MacroModel, Molecular Operating Environment, and molecular dynamics) and four open-access packages (experimental-torsion distance geometry with additional "basic knowledge", alone and with Merck molecular force field or universal force field minimization; the Cambridge Crystallographic Data Centre conformer generator; and Conformator). With three-quarters of the database processed below the threshold of high ring accuracy, Moloc was identified as having the highest sampling efficiency and exhaustiveness, without producing thousands of conformations or randomly splitting the ring into two half-loops, and with the possibility of interactively producing globular or flat conformations with diversity similar to that of Prime, MacroModel, and molecular dynamics. The algorithm and the Python scripts for full automation of these parameters are freely available for academic use.
Topics: Benchmarking; Macrocyclic Compounds; Molecular Conformation; Molecular Dynamics Simulation; Software
PubMed: 33270455
DOI: 10.1021/acs.jcim.0c01038