Environment International Jul 2022 (Review)
Management of datasets that include health information and other sensitive personal information of European study participants has to comply with the General Data Protection Regulation (GDPR, Regulation (EU) 2016/679). Within scientific research, the widely subscribed 'FAIR' data principles should apply, meaning that research data should be findable, accessible, interoperable and re-usable. Balancing the aim of open-science-driven FAIR data management with GDPR-compliant personal data protection safeguards is now a common challenge for many research projects dealing with (sensitive) personal data. In December 2020 a workshop was held with representatives of several large EU research consortia and of the European Commission to reflect on how to apply the FAIR data principles to environment and health (E&H) research. Several recent data-intensive EU-funded E&H research projects face this challenge and work intensively towards developing solutions to access, exchange, store, handle, share, process and use such sensitive personal data, with the aim of supporting European and transnational collaborations. As a result, several recommendations, opportunities and current limitations were formulated. New technical developments such as federated data management and analysis systems, machine learning together with advanced search software, harmonized ontologies and data quality standards should in principle facilitate the FAIRification of data. To address ethical, legal, political and financial obstacles to the wider re-use of data for research purposes, both specific expertise and underpinning infrastructure are needed. E&H research data need to find their place in the European Open Science Cloud. Communities using health and population data, environmental data and other publicly available data have to interconnect and synergize.
To maximize the use and re-use of environment and health data, a dedicated supporting European infrastructure effort, such as the EIRENE research infrastructure within the ESFRI roadmap 2021, is needed that would interact with existing infrastructures.
Topics: Computer Security; Data Management; Europe; Health Records, Personal; Humans
PubMed: 35696847
DOI: 10.1016/j.envint.2022.107334
Anesthesiology Mar 2019
Topics: Acetaminophen; Administration, Intravenous; Analgesics, Opioid; Colectomy; Data Management
PubMed: 30762645
DOI: 10.1097/ALN.0000000000002570
Seminars in Oncology Nursing Apr 2023 (Review)
OBJECTIVES
To provide an overview of three consecutive stages involved in the processing of quantitative research data (ie, data management, analysis, and interpretation) with the aid of practical examples to foster enhanced understanding.
DATA SOURCES
Published scientific articles, research textbooks, and expert advice were used.
CONCLUSION
Typically, a considerable amount of numerical research data is collected and requires analysis. On entry into a data set, data must be carefully checked for errors and missing values, and then variables must be defined and coded as part of data management. Quantitative data analysis involves the use of statistics. Descriptive statistics help summarize the variables in a data set to show what is typical for a sample. Measures of central tendency (ie, mean, median, mode), measures of spread (standard deviation), and parameter estimation measures (confidence intervals) may be calculated. Inferential statistics aid in testing hypotheses about whether a hypothesized effect, relationship, or difference is likely true. Inferential statistical tests produce a value for probability, the P value, which indicates whether an effect, relationship, or difference might exist in reality. Crucially, it must be accompanied by a measure of magnitude (effect size) to help interpret how small or large this effect, relationship, or difference is. Effect sizes provide key information for clinical decision-making in health care.
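The descriptive and effect-size calculations described above can be sketched with Python's standard library. The two pain-score samples here are purely hypothetical, and the confidence interval uses a simple normal approximation:

```python
import math
import statistics as stats

# Hypothetical samples: pain scores (0-10) from two small patient groups.
control = [6, 7, 5, 8, 6, 7, 6, 5]
treated = [4, 5, 3, 5, 4, 6, 4, 3]

# Descriptive statistics: central tendency and spread for each sample.
mean_c, mean_t = stats.mean(control), stats.mean(treated)
sd_c, sd_t = stats.stdev(control), stats.stdev(treated)

# 95% confidence interval for the control mean (normal approximation).
se = sd_c / math.sqrt(len(control))
ci = (mean_c - 1.96 * se, mean_c + 1.96 * se)

# Effect size (Cohen's d): magnitude of the group difference,
# scaled by the pooled standard deviation.
n_c, n_t = len(control), len(treated)
pooled_sd = math.sqrt(((n_c - 1) * sd_c**2 + (n_t - 1) * sd_t**2) / (n_c + n_t - 2))
d = (mean_c - mean_t) / pooled_sd

print(f"control mean={mean_c:.2f}, SD={sd_c:.2f}, 95% CI={ci[0]:.2f}-{ci[1]:.2f}")
print(f"Cohen's d={d:.2f}")
```

As the abstract stresses, the effect size (here, Cohen's d) conveys the magnitude of the difference, which a P value alone does not.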
IMPLICATIONS FOR NURSING PRACTICE
Developing capacity in the management, analysis, and interpretation of quantitative research data can have a multifaceted impact in enhancing nurses' confidence in understanding, evaluating, and applying quantitative evidence in cancer nursing practice.
Topics: Humans; Data Management; Research Design; Data Collection
PubMed: 36868925
DOI: 10.1016/j.soncn.2023.151398
Journal of Medical Internet Research Jul 2020 (Review)
BACKGROUND
Over the last century, disruptive incidents in the fields of clinical and biomedical research have yielded a tremendous change in health data management systems. This is due to a number of breakthroughs in the medical field and the need for big data analytics and the Internet of Things (IoT) to be incorporated in a real-time smart health information management system. In addition, the requirements of patient care have evolved over time, allowing for more accurate prognoses and diagnoses. In this paper, we discuss the temporal evolution of health data management systems and capture the requirements that led to the development of a given system over a certain period of time. Consequently, we provide insights into those systems and give suggestions and research directions on how they can be improved for a better health care system.
OBJECTIVE
This study aimed to show that there is a need for a secure and efficient health data management system that will allow physicians and patients to update decentralized medical records and to analyze the medical data for supporting more precise diagnoses, prognoses, and public insights. Limitations of existing health data management systems were analyzed.
METHODS
To study the evolution and requirements of health data management systems over the years, a search was conducted to obtain research articles and information on medical lawsuits, health regulations, and acts. These materials were obtained from the Institute of Electrical and Electronics Engineers, the Association for Computing Machinery, Elsevier, MEDLINE, PubMed, Scopus, and Web of Science databases.
RESULTS
Health data management systems have undergone a disruptive transformation over the years from paper to computer, web, cloud, IoT, big data analytics, and finally to blockchain. The requirements of a health data management system revealed from the evolving definitions of medical records and their management are (1) medical record data, (2) real-time data access, (3) patient participation, (4) data sharing, (5) data security, (6) patient identity privacy, and (7) public insights. This paper reviewed health data management systems based on these 7 requirements across studies conducted over the years. To our knowledge, this is the first analysis of the temporal evolution of health data management systems giving insights into the system requirements for better health care.
CONCLUSIONS
There is a need for a comprehensive real-time health data management system that allows physicians, patients, and external users to input their medical and lifestyle data into the system. The incorporation of big data analytics will aid in better prognosis or diagnosis of the diseases and the prediction of diseases. The prediction results will help in the development of an effective prevention plan.
Topics: Biomedical Research; Data Management; Delivery of Health Care; Humans
PubMed: 32348265
DOI: 10.2196/17508
BMC Bioinformatics Feb 2022
BACKGROUND
As technical developments in omics and biomedical imaging increase the throughput of data generation in life sciences, the need for information systems capable of managing heterogeneous digital assets is increasing, in particular for systems that support the findability, accessibility, interoperability, and reusability (FAIR) principles of scientific data management.
RESULTS
We propose a Service Oriented Architecture approach for integrated management and analysis of multi-omics and biomedical imaging data. Our architecture introduces an image management system into a FAIR-supporting, web-based platform for omics data management. Interoperable metadata models and middleware components implement the required data management operations. The resulting architecture allows for FAIR management of omics and imaging data, facilitating metadata queries from software applications. The applicability of the proposed architecture is demonstrated using two technical proofs of concept and a use case, aimed at molecular plant biology and clinical liver cancer research, which integrate various imaging and omics modalities.
CONCLUSIONS
We describe a data management architecture for integrated, FAIR-supporting management of omics and biomedical imaging data, and exemplify its applicability for basic biology research and clinical studies. We anticipate that FAIR data management systems for multi-modal data repositories will play a pivotal role in data-driven research, including studies which leverage advanced machine learning methods, as the joint analysis of omics and imaging data, in conjunction with phenotypic metadata, becomes not only desirable but necessary to derive novel insights into biological processes.
Topics: Biological Science Disciplines; Data Management; Information Management; Metadata; Software
PubMed: 35130839
DOI: 10.1186/s12859-022-04584-3
Scientific Data Jun 2022
Data sharing can accelerate scientific discovery while increasing return on investment beyond the researcher or group that produced the data. Data repositories enable data sharing and preservation over the long term, but little is known about scientists' perceptions of them and their perspectives on data management and sharing practices. Using focus groups with scientists from five disciplines (atmospheric and earth science, computer science, chemistry, ecology, and neuroscience), we asked questions about data management to lead into a discussion of what features they think are necessary in data repository systems and services to help them implement the data sharing and preservation parts of their data management plans. Participants identified metadata quality control and training as problem areas in data management. Additionally, participants discussed several desired repository features, including metadata control, data traceability, security, stable infrastructure, and data use restrictions. We present these desired repository features as a rubric for the research community to encourage repository utilization. Future directions for research are discussed.
Topics: Data Management; Focus Groups; Humans; Information Dissemination; Metadata; Research Personnel
PubMed: 35715445
DOI: 10.1038/s41597-022-01428-w
Journal of Integrative Bioinformatics Dec 2022
Core facilities have to offer technologies that best serve the needs of their users and provide them a competitive advantage in research. They have to set up and maintain on the order of ten to a hundred instruments, which produce large amounts of data and serve thousands of active projects and customers. Particular emphasis has to be given to the reproducibility of results. More and more, the entire process from building the research hypothesis, conducting the experiments and measurements, through to data exploration and analysis is driven by only a few experts in various scientific fields. Still, the ability to perform the entire data exploration in real time on a personal computer is often hampered by the heterogeneity of software, the data structure formats of the output, and the enormous data sizes. These impact the design and architecture of the implemented software stack. At the Functional Genomics Center Zurich (FGCZ), a joint state-of-the-art research and training facility of ETH Zurich and the University of Zurich, we have developed the B-Fabric system, which has for more than a decade provided an entire life sciences community with fundamental data science support. In this paper, we sketch how such a system can be used to glue together data (including metadata), computing infrastructures (clusters and clouds), and visualization software to support instant data exploration and visual analysis. We illustrate our approach, which is in daily use, with visualization applications for mass spectrometry data.
Topics: Data Management; Reproducibility of Results; Software; Genomics
PubMed: 36073980
DOI: 10.1515/jib-2022-0031
Journal of Chemical Information and... Jan 2022
Projects in chemo- and bioinformatics often consist of scattered data of various types that are difficult to access in a meaningful way for efficient data analysis. The data are usually too diverse even to be manipulated effectively. Sdfconf is data manipulation and analysis software that addresses this problem in a logical and robust manner. Other software commonly used for such tasks is either not designed with molecular and/or conformational data in mind or provides only a narrow set of tasks to be accomplished. Furthermore, many tools are only available within commercial software packages. Sdfconf is a flexible, robust, and free-of-charge tool for linking data from various sources for meaningful and efficient manipulation and analysis of molecular data sets. Sdfconf packages molecular structures and metadata into a complete ensemble, from which one can access both the whole data set and individual molecules and/or conformations. In this software note, we offer practical examples of the utilization of sdfconf.
Topics: Computational Biology; Data Analysis; Data Management; Software
PubMed: 34932340
DOI: 10.1021/acs.jcim.1c01051
BMJ Open Aug 2022
OBJECTIVES
This article aims to measure the willingness of the Swiss public to participate in personalised health research, and their preferences regarding data management and governance.
SETTING
Results are presented from a nationwide survey of members of the Swiss public.
PARTICIPANTS
15 106 randomly selected Swiss residents received the survey in September 2019. The response rate was 34.1% (n=5156). Respondent age ranged from 18 to 79 years, with fairly uniform spread across sex and age categories between 25 and 64 years.
PRIMARY AND SECONDARY OUTCOME MEASURES
Willingness to participate in personalised health research and opinions regarding data management and governance.
RESULTS
Most respondents preferred to be contacted and reconsented for each new project using their data (39%, 95% CI: 37.4% to 40.7%), or stated that their preference depends on the project type (29.4%, 95% CI: 27.9% to 31%). Additionally, a majority (52%, 95% CI: 50.3% to 53.8%) preferred that their data or samples be stored anonymously, or in coded form (43.4%, 95% CI: 41.7% to 45.1%). Of those who preferred that their data be anonymised, most also indicated a wish to be recontacted for each new project (36.8%, 95% CI: 34.5% to 39.2%); however, these preferences conflict, since fully anonymised data cannot be linked back to individuals for recontact. Most respondents desired to personally own their data. Finally, most Swiss respondents trust their doctors, along with researchers at universities, to protect their data.
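For context, 95% confidence intervals like those reported above can be approximated from a proportion and the sample size. The sketch below uses the simple Wald (normal-approximation) formula, so it will not exactly reproduce the published intervals, which may reflect a different method or survey weighting:

```python
import math

def wald_ci(p_hat: float, n: int, z: float = 1.96) -> tuple[float, float]:
    """Normal-approximation (Wald) 95% confidence interval for a proportion."""
    se = math.sqrt(p_hat * (1 - p_hat) / n)  # standard error of the proportion
    return p_hat - z * se, p_hat + z * se

# Illustrative: 39% of n = 5156 respondents preferred per-project reconsent.
lo, hi = wald_ci(0.39, 5156)
print(f"95% CI: {lo:.1%} to {hi:.1%}")
```

With a sample this large, the interval is only a few percentage points wide, which is why the survey can distinguish the 39% and 29.4% response groups.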
CONCLUSION
Insight into public preference can enable Swiss biobanks and research institutions to create management and governance strategies that match the expectations and preferences of potential participants. Models allowing participants to choose how to interact with the process, while more complex, may increase individual willingness to provide data to biobanks.
Topics: Adolescent; Adult; Aged; Biological Specimen Banks; Data Management; Humans; Middle Aged; Surveys and Questionnaires; Switzerland; Trust; Young Adult
PubMed: 36028266
DOI: 10.1136/bmjopen-2022-060844
International Journal of Population... 2021
Data pooling from pre-existing datasets can be useful to increase study sample size and statistical power in order to answer a research question. However, individual datasets may contain variables that measure the same construct differently, posing challenges for data pooling. Variable harmonization, an approach that can generate comparable datasets from heterogeneous sources, can address this issue in some circumstances. As an illustrative example, this paper describes the data harmonization strategies that helped generate comparable datasets across two Canadian pregnancy cohort studies: All Our Families and the Alberta Pregnancy Outcomes and Nutrition study. Variables were harmonized considering multiple features across the datasets: the construct measured; the question asked and response options; the measurement scale used; the frequency of measurement; the timing of measurement; and the data structure. Based on these features, variables were classified as completely matching, partially matching, or completely unmatched across the datasets. Variables that were an exact match were pooled as is. Partially matching variables were harmonized or processed into a common format across the datasets considering the frequency of measurement, the timing of measurement, the measurement scale used, and the response options. Variables that were completely unmatched could not be harmonized into a single variable. The variable harmonization strategies used to generate comparable cohort datasets for data pooling are applicable to other data sources. Future studies may employ or evaluate these strategies, which permit researchers to answer novel research questions in a statistically efficient, timely, and cost-efficient manner that could not be achieved using a single data source.
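A toy sketch of the partially matching case: two cohorts record the same construct with different response options, and per-cohort mappings recode both to one shared variable before pooling. All variable names and categories here are hypothetical, not the cohorts' actual codings:

```python
# Hypothetical: harmonize a "smoking during pregnancy" variable measured
# differently in two cohorts into one common coding before pooling.
# Cohort A asked yes/no; cohort B asked a frequency scale.

COHORT_A_MAP = {"yes": "smoker", "no": "non-smoker"}  # direct recode
COHORT_B_MAP = {                                      # collapse levels
    "never": "non-smoker",
    "occasionally": "smoker",
    "daily": "smoker",
}

def harmonize(records, mapping, source):
    """Recode each record's raw value to the shared format and tag its source."""
    return [
        {"id": r["id"], "smoking": mapping[r["raw"]], "cohort": source}
        for r in records
    ]

cohort_a = [{"id": 1, "raw": "yes"}, {"id": 2, "raw": "no"}]
cohort_b = [{"id": 101, "raw": "daily"}, {"id": 102, "raw": "never"}]

# The pooled dataset uses one variable definition across both sources.
pooled = harmonize(cohort_a, COHORT_A_MAP, "A") + harmonize(cohort_b, COHORT_B_MAP, "B")
print(pooled)
```

Completely unmatched variables have no such mapping and, as the abstract notes, cannot be combined into a single pooled variable.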
Topics: Alberta; Cohort Studies; Data Collection; Data Management; Female; Humans; Pregnancy; Sample Size
PubMed: 34888420
DOI: 10.23889/ijpds.v6i1.1680