Clinical and Translational Science Sep 2023
Review
In drug development a frequently used phrase is "data-driven". Just as high-test gas fuels a car, so drug development "runs on" high-quality data; hence, good data management practices, which involve case report form design, data entry, data capture, data validation, medical coding, database closure, and database locking, are critically important. This review covers the essentials of clinical data management (CDM) for the United States. It is intended to demystify CDM, which means nothing more esoteric than the collection, organization, maintenance, and analysis of data for clinical trials. The review is written with those who are new to drug development in mind and assumes only a passing familiarity with the terms and concepts that are introduced. However, its relevance may also extend to experienced professionals who feel the need to brush up on the basics. For added color and context, the review includes real-world examples with RRx-001, a new molecular entity in phase III and with fast-track status in head and neck cancer, and AdAPT-001, an oncolytic adenovirus armed with a transforming growth factor-beta (TGF-β) trap in a phase I/II clinical trial with which the authors, as employees of the biopharmaceutical company, EpicentRx, are closely involved. An alphabetized glossary of key terms and acronyms used throughout this review is also included for easy reference.
Topics: Humans; United States; Data Management
PubMed: 37382299
DOI: 10.1111/cts.13582
Seminars in Oncology Nursing Apr 2023
Review
OBJECTIVES
To provide an overview of three consecutive stages involved in the processing of quantitative research data (ie, data management, analysis, and interpretation) with the aid of practical examples to foster enhanced understanding.
DATA SOURCES
Published scientific articles, research textbooks, and expert advice were used.
CONCLUSION
Typically, a considerable amount of numerical research data is collected that requires analysis. On entry into a data set, data must be carefully checked for errors and missing values, and then variables must be defined and coded as part of data management. Quantitative data analysis involves the use of statistics. Descriptive statistics help summarize the variables in a data set to show what is typical for a sample. Measures of central tendency (ie, mean, median, mode), measures of spread (standard deviation), and parameter estimation measures (confidence intervals) may be calculated. Inferential statistics aid in testing hypotheses about whether or not a hypothesized effect, relationship, or difference is likely true. Inferential statistical tests produce a value for probability, the P value, which indicates whether an effect, relationship, or difference might exist in reality. Crucially, the P value must be accompanied by a measure of magnitude (effect size) to help interpret how small or large this effect, relationship, or difference is. Effect sizes provide key information for clinical decision-making in health care.
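The descriptive measures and the effect size named above can be sketched in a few lines of Python. The pain-score values below are invented for illustration, and the normal-approximation confidence interval is a simplification of the t-based interval typically used for small samples:

```python
# Descriptive statistics and a standardized effect size (Cohen's d),
# illustrating the quantities named in the abstract. Example data are invented.
from math import sqrt
from statistics import mean, median, mode, stdev

def ci95(sample):
    """Approximate 95% confidence interval for the mean (normal approximation)."""
    m = mean(sample)
    se = stdev(sample) / sqrt(len(sample))
    return (m - 1.96 * se, m + 1.96 * se)

def cohens_d(a, b):
    """Standardized mean difference using the pooled standard deviation."""
    na, nb = len(a), len(b)
    pooled = sqrt(((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2)
                  / (na + nb - 2))
    return (mean(a) - mean(b)) / pooled

pain_before = [6, 7, 5, 8, 6, 7, 6, 5]  # hypothetical pre-intervention scores
pain_after = [4, 5, 3, 6, 4, 5, 4, 3]   # hypothetical post-intervention scores

print(mean(pain_before), median(pain_before), mode(pain_before))
print(ci95(pain_before))
print(cohens_d(pain_before, pain_after))
```

In practice the effect size would be reported alongside a P value from an inferential test (eg, `scipy.stats.ttest_ind`); that step is omitted here to keep the sketch dependency-free.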
IMPLICATIONS FOR NURSING PRACTICE
Developing capacity in the management, analysis, and interpretation of quantitative research data can have a multifaceted impact in enhancing nurses' confidence in understanding, evaluating, and applying quantitative evidence in cancer nursing practice.
Topics: Humans; Data Management; Research Design; Data Collection
PubMed: 36868925
DOI: 10.1016/j.soncn.2023.151398
Journal of Assisted Reproduction and... Jul 2021
Topics: Artificial Intelligence; Data Management; Fertilization in Vitro; Humans; Reproductive Medicine
PubMed: 33715133
DOI: 10.1007/s10815-021-02122-3
Methods in Molecular Biology (Clifton,... 2024
This chapter discusses the challenges and requirements of modern Research Data Management (RDM), particularly for biomedical applications in the context of high-performance computing (HPC). The FAIR data principles (Findable, Accessible, Interoperable, Reusable) are of special importance. Data formats, publication platforms, annotation schemata, automated data management and staging, the data infrastructure in HPC centers, file transfer and staging methods in HPC, and the EUDAT components are discussed. Tools and approaches for automated data movement and replication in cross-center workflows are explained, as well as the development of ontologies for structuring and quality-checking of metadata in computational biomedicine. The CompBioMed project is used as a real-world example of implementing these principles and tools in practice. The LEXIS project has built a workflow-execution and data management platform that follows the paradigm of HPC-Cloud convergence for demanding Big Data applications. It is used for orchestrating workflows with YORC, utilizing the data documentation initiative (DDI) and distributed computing resources (DCI). The platform is accessed by a user-friendly LEXIS portal for workflow and data management, making HPC and Cloud Computing significantly more accessible. Checkpointing, duplicate runs, and spare images of the data are used to create resilient workflows. The CompBioMed project is completing the implementation of such a workflow, using data replication and brokering, which will enable urgent computing on exascale platforms.
Topics: Data Management; Big Data; Cloud Computing; Documentation; Movement
PubMed: 37702950
DOI: 10.1007/978-1-0716-3449-3_18
Scientific Data Jun 2022
Data sharing can accelerate scientific discovery while increasing return on investment beyond the researcher or group that produced them. Data repositories enable data sharing and preservation over the long term, but little is known about scientists' perceptions of them and their perspectives on data management and sharing practices. Using focus groups with scientists from five disciplines (atmospheric and earth science, computer science, chemistry, ecology, and neuroscience), we asked questions about data management to lead into a discussion of what features they think are necessary to include in data repository systems and services to help them implement the data sharing and preservation parts of their data management plans. Participants identified metadata quality control and training as problem areas in data management. Additionally, participants discussed several desired repository features, including: metadata control, data traceability, security, stable infrastructure, and data use restrictions. We present their desired repository features as a rubric for the research community to encourage repository utilization. Future directions for research are discussed.
Topics: Data Management; Focus Groups; Humans; Information Dissemination; Metadata; Research Personnel
PubMed: 35715445
DOI: 10.1038/s41597-022-01428-w
Briefings in Bioinformatics Jan 2021
Review
With advances in genomic sequencing technology, a large amount of data is publicly available for the research community to extract meaningful and reliable associations among risk genes and the mechanisms of disease. However, this exponentially growing body of data is spread across more than a thousand heterogeneous repositories, represented in multiple formats and with varying levels of quality, which hinders the differentiation of clinically valid relationships from those that are less well supported and could lead to incorrect diagnoses. This paper presents how conceptual models can play a key role in efficiently managing genomic data. These data must be accessible, informative, and reliable enough to extract valuable knowledge in the context of identifying evidence that supports the relationship between DNA variants and disease. The approach presented in this paper provides a solution that helps researchers organize, store, and process information, focusing only on the data that are relevant and minimizing the impact that information overload has in clinical and research contexts. A case study (epilepsy) is also presented to demonstrate its application in a real context.
Topics: Data Management; Data Systems; Epilepsy; Genetic Predisposition to Disease; Genomics; Humans
PubMed: 32533135
DOI: 10.1093/bib/bbaa100
BMC Bioinformatics Feb 2022
BACKGROUND
As technical developments in omics and biomedical imaging increase the throughput of data generation in the life sciences, the need for information systems capable of managing heterogeneous digital assets is growing; in particular, for systems that support the findability, accessibility, interoperability, and reusability (FAIR) principles of scientific data management.
RESULTS
We propose a Service Oriented Architecture approach for integrated management and analysis of multi-omics and biomedical imaging data. Our architecture introduces an image management system into a FAIR-supporting, web-based platform for omics data management. Interoperable metadata models and middleware components implement the required data management operations. The resulting architecture allows for FAIR management of omics and imaging data, facilitating metadata queries from software applications. The applicability of the proposed architecture is demonstrated using two technical proofs of concept and a use case, aimed at molecular plant biology and clinical liver cancer research, which integrate various imaging and omics modalities.
CONCLUSIONS
We describe a data management architecture for integrated, FAIR-supporting management of omics and biomedical imaging data, and exemplify its applicability for basic biology research and clinical studies. We anticipate that FAIR data management systems for multi-modal data repositories will play a pivotal role in data-driven research, including studies which leverage advanced machine learning methods, as the joint analysis of omics and imaging data, in conjunction with phenotypic metadata, becomes not only desirable but necessary to derive novel insights into biological processes.
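The kind of metadata query such a platform exposes to software applications can be illustrated with a minimal sketch. The records, field names, and values below are invented for illustration and are not the paper's actual metadata model:

```python
# Minimal sketch: filtering FAIR-style metadata records that link omics
# datasets to imaging data. All records and field names are hypothetical.
records = [
    {"id": "ds-01", "modality": "RNA-seq", "organism": "Arabidopsis thaliana",
     "linked_image": "img-11"},
    {"id": "ds-02", "modality": "MRI", "organism": "Homo sapiens",
     "linked_image": None},
    {"id": "ds-03", "modality": "RNA-seq", "organism": "Homo sapiens",
     "linked_image": "img-12"},
]

def find(records, **criteria):
    """Return records whose metadata match all given key/value criteria."""
    return [r for r in records
            if all(r.get(k) == v for k, v in criteria.items())]

# Example: all RNA-seq datasets with a linked image from a human sample.
hits = find(records, modality="RNA-seq", organism="Homo sapiens")
print([r["id"] for r in hits])
```

A real implementation would issue such queries through the platform's middleware against a metadata store rather than an in-memory list; the point of the sketch is the key/value filtering over interoperable metadata.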
Topics: Biological Science Disciplines; Data Management; Information Management; Metadata; Software
PubMed: 35130839
DOI: 10.1186/s12859-022-04584-3
The American Journal of Medicine Apr 2020
Topics: Angina Pectoris, Variant; Data Management; Hospitalization; Humans
PubMed: 32331576
DOI: 10.1016/j.amjmed.2019.09.009
Journal of Chemical Information and... Jul 2023
A great advantage of computational research is its reproducibility and reusability. However, an enormous amount of computational research data in heterogeneous catalysis remains inaccessible due to logistical limitations. Sufficient provenance and characterization of data and computational environment, with uniform organization and easy accessibility, can allow the development of software tools for integration across the multiscale modeling workflow. Here, we develop the Chemical Kinetics Database, CKineticsDB, a state-of-the-art datahub for multiscale modeling, designed to be compliant with the FAIR guiding principles for scientific data management. CKineticsDB utilizes a MongoDB back-end for extensibility and adaptation to varying data formats, with a referencing-based data model to reduce redundancy in storage. We have developed a Python software program for data processing operations and with built-in features to extract data for common applications. CKineticsDB evaluates the incoming data for quality and uniformity, retains curated information from simulations, enables accurate regeneration of publication results, optimizes storage, and allows the selective retrieval of files based on domain-relevant catalyst and simulation parameters. CKineticsDB provides data from multiple scales of theory (ab initio calculations, thermochemistry, and microkinetic models) to accelerate the development of new reaction pathways, kinetic analysis of reaction mechanisms, and catalysis discovery, along with several data-driven applications.
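A referencing-based data model of the kind described can be sketched with plain Python dictionaries standing in for MongoDB documents. The collections, field names, and values below are assumptions for illustration, not CKineticsDB's actual schema:

```python
# Sketch of a referencing-based document model: calculations reference a
# shared species record by id instead of embedding a copy of it, so data
# shared across scales of theory is stored exactly once.
species = {
    "sp-001": {"name": "CO*", "site": "fcc", "formation_energy_eV": -1.23},
}

calculations = [
    {"calc_id": "dft-0042", "level": "ab initio", "species_ref": "sp-001"},
    {"calc_id": "mk-0007", "level": "microkinetic", "species_ref": "sp-001"},
]

def resolve(calc):
    """Join a calculation with its referenced species record (a lookup
    that a document database would perform at query time)."""
    return {**calc, "species": species[calc["species_ref"]]}

print(resolve(calculations[0]))
```

The trade-off mirrors MongoDB's embedded-versus-referenced document choice: referencing deduplicates shared records at the cost of a join-style lookup on retrieval.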
Topics: Data Management; Kinetics; Reproducibility of Results; Software
PubMed: 37436913
DOI: 10.1021/acs.jcim.3c00123
The Annals of Thoracic Surgery Nov 2019
Topics: Data Management; Humans; Surgeons
PubMed: 31653295
DOI: 10.1016/j.athoracsur.2019.04.076