PLoS One 2021
Research data is increasingly viewed as an important scholarly output. While a growing body of studies has investigated researcher practices and perceptions related to data sharing, information about data-related practices throughout the research process (including data collection and analysis) remains largely anecdotal. Building on our previous study of data practices in neuroimaging research, we conducted a survey of data management practices in the field of psychology. Our survey included questions about the type(s) of data collected, the tools used for data analysis, practices related to data organization, maintaining documentation, backup procedures, and long-term archiving of research materials. Our results demonstrate the complexity of managing and sharing data in psychology. Data is collected in multifarious forms from human participants, analyzed using a range of software tools, and archived in formats that may become obsolete. As individuals, our participants demonstrated relatively good data management practices; however, they also indicated that there was little standardization within their research groups. Participants generally indicated that they were willing to change their current practices in light of new technologies, opportunities, or requirements.
Topics: Archives; Bibliometrics; Data Collection; Data Management; Humans; Information Dissemination; Psychology; Software; Surveys and Questionnaires
PubMed: 34019600
DOI: 10.1371/journal.pone.0252047 -
Methods in Molecular Biology (Clifton,...) 2020
PLINK is a versatile program which supports data management, quality control, and common statistical computations on matrices of genomic variant calls, in a computationally efficient manner. In population genomics, it is frequently used to take care of the "basics," so they do not need to be reimplemented when a new type of analysis needs to be performed on such a matrix. I describe several of these basic operations, and discuss uses and pitfalls.
Topics: Algorithms; Computational Biology; Data Management; Gene Frequency; Genetic Variation; Genetics, Population; Genomics; Humans; Linkage Disequilibrium
PubMed: 31975163
DOI: 10.1007/978-1-0716-0199-0_3 -
Clinical and Translational Science Sep 2023
Review
In drug development, a frequently used phrase is "data-driven". Just as high-test gas fuels a car, so drug development "runs on" high-quality data; hence, good data management practices, which involve case report form design, data entry, data capture, data validation, medical coding, database closure, and database locking, are critically important. This review covers the essentials of clinical data management (CDM) for the United States. It is intended to demystify CDM, which means nothing more esoteric than the collection, organization, maintenance, and analysis of data for clinical trials. The review is written with those who are new to drug development in mind and assumes only a passing familiarity with the terms and concepts that are introduced. However, its relevance may also extend to experienced professionals who feel the need to brush up on the basics. For added color and context, the review includes real-world examples with RRx-001, a new molecular entity in phase III and with fast-track status in head and neck cancer, and AdAPT-001, an oncolytic adenovirus armed with a transforming growth factor-beta (TGF-β) trap in a phase I/II clinical trial with which the authors, as employees of the biopharmaceutical company, EpicentRx, are closely involved. An alphabetized glossary of key terms and acronyms used throughout this review is also included for easy reference.
Topics: Humans; United States; Data Management
PubMed: 37382299
DOI: 10.1111/cts.13582 -
Seminars in Oncology Nursing Apr 2023
Review
OBJECTIVES
To provide an overview of three consecutive stages involved in the processing of quantitative research data (ie, data management, analysis, and interpretation) with the aid of practical examples to foster enhanced understanding.
DATA SOURCES
Published scientific articles, research textbooks, and expert advice were used.
CONCLUSION
Typically, a considerable amount of numerical research data is collected that requires analysis. On entry into a data set, data must be carefully checked for errors and missing values, and then variables must be defined and coded as part of data management. Quantitative data analysis involves the use of statistics. Descriptive statistics help summarize the variables in a data set to show what is typical for a sample. Measures of central tendency (ie, mean, median, mode), measures of spread (eg, standard deviation), and parameter estimation measures (eg, confidence intervals) may be calculated. Inferential statistics aid in testing hypotheses about whether a hypothesized effect, relationship, or difference is likely to be true. Inferential statistical tests produce a value for probability, the P value, which indicates whether an effect, relationship, or difference might exist in reality. Crucially, it must be accompanied by a measure of magnitude (effect size) to help interpret how small or large this effect, relationship, or difference is. Effect sizes provide key information for clinical decision-making in health care.
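The descriptive and effect-size steps described in this abstract can be sketched with Python's standard library. The sample values, group labels, and score names below are invented purely for illustration; a real analysis would also derive an exact P value from a t-distribution (e.g. via SciPy), which the standard library does not provide.

```python
import math
import statistics
from statistics import NormalDist

# Hypothetical sample: symptom scores from two illustrative groups
control = [4.1, 5.0, 4.6, 5.3, 4.8, 4.4, 5.1, 4.9]
treated = [3.2, 3.9, 3.5, 4.1, 3.6, 3.3, 3.8, 3.4]

# Descriptive statistics: what is "typical" for each sample
mean_c, mean_t = statistics.mean(control), statistics.mean(treated)
sd_c, sd_t = statistics.stdev(control), statistics.stdev(treated)

# Parameter estimation: approximate 95% CI for the control-group mean
z = NormalDist().inv_cdf(0.975)            # ~1.96 for a 95% interval
half_width = z * sd_c / math.sqrt(len(control))
ci = (mean_c - half_width, mean_c + half_width)

# Effect size (Cohen's d): magnitude of the group difference,
# using the pooled standard deviation of the two samples
pooled_sd = math.sqrt((sd_c ** 2 + sd_t ** 2) / 2)
cohens_d = (mean_c - mean_t) / pooled_sd

print(f"control: mean={mean_c:.2f}, SD={sd_c:.2f}, "
      f"95% CI=({ci[0]:.2f}, {ci[1]:.2f})")
print(f"Cohen's d = {cohens_d:.2f}")
```

As the abstract stresses, the effect size is what makes the difference interpretable: a statistically significant P value alone says nothing about whether the difference is large enough to matter clinically.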
IMPLICATIONS FOR NURSING PRACTICE
Developing capacity in the management, analysis, and interpretation of quantitative research data can have a multifaceted impact in enhancing nurses' confidence in understanding, evaluating, and applying quantitative evidence in cancer nursing practice.
Topics: Humans; Data Management; Research Design; Data Collection
PubMed: 36868925
DOI: 10.1016/j.soncn.2023.151398 -
Journal of Assisted Reproduction and... Jul 2021
Topics: Artificial Intelligence; Data Management; Fertilization in Vitro; Humans; Reproductive Medicine
PubMed: 33715133
DOI: 10.1007/s10815-021-02122-3 -
Methods in Molecular Biology (Clifton,...) 2024
This chapter discusses the challenges and requirements of modern Research Data Management (RDM), particularly for biomedical applications in the context of high-performance computing (HPC). The FAIR data principles (Findable, Accessible, Interoperable, Reusable) are of special importance. Data formats, publication platforms, annotation schemata, automated data management and staging, the data infrastructure in HPC centers, file transfer and staging methods in HPC, and the EUDAT components are discussed. Tools and approaches for automated data movement and replication in cross-center workflows are explained, as well as the development of ontologies for structuring and quality-checking of metadata in computational biomedicine. The CompBioMed project is used as a real-world example of implementing these principles and tools in practice. The LEXIS project has built a workflow-execution and data management platform that follows the paradigm of HPC-Cloud convergence for demanding Big Data applications. It is used for orchestrating workflows with YORC, utilizing the data documentation initiative (DDI) and distributed computing resources (DCI). The platform is accessed by a user-friendly LEXIS portal for workflow and data management, making HPC and Cloud Computing significantly more accessible. Checkpointing, duplicate runs, and spare images of the data are used to create resilient workflows. The CompBioMed project is completing the implementation of such a workflow, using data replication and brokering, which will enable urgent computing on exascale platforms.
Topics: Data Management; Big Data; Cloud Computing; Documentation; Movement
PubMed: 37702950
DOI: 10.1007/978-1-0716-3449-3_18 -
Scientific Data Jun 2022
Data sharing can accelerate scientific discovery while increasing return on investment beyond the researcher or group that produced the data. Data repositories enable data sharing and preservation over the long term, but little is known about scientists' perceptions of them or their perspectives on data management and sharing practices. Using focus groups with scientists from five disciplines (atmospheric and earth science, computer science, chemistry, ecology, and neuroscience), we asked questions about data management to lead into a discussion of what features they think are necessary to include in data repository systems and services to help them implement the data sharing and preservation parts of their data management plans. Participants identified metadata quality control and training as problem areas in data management. Additionally, participants discussed several desired repository features, including metadata control, data traceability, security, stable infrastructure, and data use restrictions. We present their desired repository features as a rubric for the research community to encourage repository utilization. Future directions for research are discussed.
Topics: Data Management; Focus Groups; Humans; Information Dissemination; Metadata; Research Personnel
PubMed: 35715445
DOI: 10.1038/s41597-022-01428-w -
Briefings in Bioinformatics Jan 2021
Review
With advances in genomic sequencing technology, a large amount of data is publicly available for the research community to extract meaningful and reliable associations among risk genes and the mechanisms of disease. However, this exponentially growing body of data is spread across more than a thousand heterogeneous repositories and represented in multiple formats with varying levels of quality, which hinders the differentiation of clinically valid relationships from those that are less well supported and could lead to incorrect diagnoses. This paper presents how conceptual models can play a key role in efficiently managing genomic data. These data must be accessible, informative, and reliable enough to extract valuable knowledge in the context of identifying evidence that supports the relationship between DNA variants and disease. The approach presented in this paper helps researchers organize, store, and process information by focusing only on the data that are relevant, minimizing the impact that information overload has in clinical and research contexts. A case study (epilepsy) is also presented to demonstrate its application in a real context.
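The kind of conceptual model this abstract advocates can be sketched as a small set of typed entities. This is a deliberately simplified, hypothetical schema, not the paper's actual model: the class names, curation levels, and the example variant/gene values are all illustrative.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Variant:
    rsid: str        # variant identifier (illustrative value below)
    gene: str        # gene symbol the variant maps to

@dataclass
class Evidence:
    source: str      # repository or publication the assertion came from
    level: str       # curation level, e.g. "clinically valid" or "preliminary"

@dataclass
class VariantDiseaseRelation:
    variant: Variant
    disease: str
    evidence: List[Evidence] = field(default_factory=list)

    def is_clinically_valid(self) -> bool:
        # A relationship counts as clinically valid only if at least one
        # piece of evidence reaches the "clinically valid" curation level.
        return any(e.level == "clinically valid" for e in self.evidence)

relation = VariantDiseaseRelation(
    variant=Variant(rsid="rs000001", gene="SCN1A"),
    disease="epilepsy",
    evidence=[
        Evidence(source="curated clinical repository", level="clinically valid"),
        Evidence(source="unreviewed repository", level="preliminary"),
    ],
)
print(relation.is_clinically_valid())  # prints True
```

Structuring the data this way is what lets an application filter out the "less well supported" assertions the abstract warns about: evidence quality becomes an explicit, queryable attribute rather than something buried in heterogeneous source formats.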
Topics: Data Management; Data Systems; Epilepsy; Genetic Predisposition to Disease; Genomics; Humans
PubMed: 32533135
DOI: 10.1093/bib/bbaa100 -
BMC Bioinformatics Feb 2022
BACKGROUND
As technical developments in omics and biomedical imaging increase the throughput of data generation in the life sciences, the need for information systems capable of managing heterogeneous digital assets is increasing; in particular, for systems supporting the findability, accessibility, interoperability, and reusability (FAIR) principles of scientific data management.
RESULTS
We propose a Service Oriented Architecture approach for integrated management and analysis of multi-omics and biomedical imaging data. Our architecture introduces an image management system into a FAIR-supporting, web-based platform for omics data management. Interoperable metadata models and middleware components implement the required data management operations. The resulting architecture allows for FAIR management of omics and imaging data, facilitating metadata queries from software applications. The applicability of the proposed architecture is demonstrated using two technical proofs of concept and a use case, aimed at molecular plant biology and clinical liver cancer research, which integrate various imaging and omics modalities.
CONCLUSIONS
We describe a data management architecture for integrated, FAIR-supporting management of omics and biomedical imaging data, and exemplify its applicability for basic biology research and clinical studies. We anticipate that FAIR data management systems for multi-modal data repositories will play a pivotal role in data-driven research, including studies which leverage advanced machine learning methods, as the joint analysis of omics and imaging data, in conjunction with phenotypic metadata, becomes not only desirable but necessary to derive novel insights into biological processes.
Topics: Biological Science Disciplines; Data Management; Information Management; Metadata; Software
PubMed: 35130839
DOI: 10.1186/s12859-022-04584-3 -
The American Journal of Medicine Apr 2020
Topics: Angina Pectoris, Variant; Data Management; Hospitalization; Humans
PubMed: 32331576
DOI: 10.1016/j.amjmed.2019.09.009