Health Expectations Aug 2019
BACKGROUND
Numerous frameworks for supporting, evaluating and reporting patient and public involvement in research exist. The literature is diverse and theoretically heterogeneous.
OBJECTIVES
To identify and synthesize published frameworks, consider whether and how these have been used, and apply design principles to improve usability.
SEARCH STRATEGY
Keyword search of six databases; hand search of eight journals; ancestry and snowball search; requests to experts.
INCLUSION CRITERIA
Published, systematic approaches (frameworks) designed to support, evaluate or report on patient or public involvement in health-related research.
DATA EXTRACTION AND SYNTHESIS
Data were extracted on provenance; collaborators and sponsors; theoretical basis; lay input; intended user(s) and use(s); topics covered; examples of use; critiques; and updates. We used the Canadian Centre for Excellence on Partnerships with Patients and Public (CEPPP) evaluation tool and hermeneutic methodology to grade and synthesize the frameworks. In five co-design workshops, we tested evidence-based resources based on the review findings.
RESULTS
Our final data set consisted of 65 frameworks, most of which scored highly on the CEPPP tool. They had different provenances, intended purposes, strengths and limitations. We grouped them into five categories: power-focused; priority-setting; study-focused; report-focused; and partnership-focused. Frameworks were used mainly by the groups who developed them. The empirical component of our study generated a structured format and evidence-based facilitator notes for a "build your own framework" co-design workshop.
CONCLUSION
The plethora of frameworks combined with evidence of limited transferability suggests that a single, off-the-shelf framework may be less useful than a menu of evidence-based resources which stakeholders can use to co-design their own frameworks.
Topics: Community Participation; Empowerment; Group Processes; Humans; Patient Participation; Research
PubMed: 31012259
DOI: 10.1111/hex.12888
Sensors (Basel, Switzerland) Jul 2023
Review
Data provenance means recording the origins of data and the history of its generation and processing. In healthcare, data provenance is an essential process that makes it possible to trace the sources of, and reasons behind, any problem with a user's data. With the emergence of the General Data Protection Regulation (GDPR), data provenance should be implemented in healthcare systems to give users more control over their data. This systematic literature review (SLR) examines the impact of data provenance in healthcare and GDPR-compliance-based data provenance through a review of peer-reviewed articles. The SLR discusses the technologies and methodologies used to achieve data provenance and explores how they are applied in the healthcare domain. Finally, we identify key research gaps and outline future research directions.
Topics: Biomedical Research; Delivery of Health Care
PubMed: 37514788
DOI: 10.3390/s23146495
MSphere Jun 2022
The availability of public genomics data has become essential for modern life sciences research, yet the quality, traceability, and curation of these data have significant impacts on a broad range of microbial genomics research. While microbial genome databases such as NCBI's RefSeq database leverage the scalability of crowdsourcing for growth, genomics data provenance and authenticity of the source materials used to produce data are not strict requirements. Here, we describe the assembly of 1,113 bacterial genome references produced from authenticated materials sourced from the American Type Culture Collection (ATCC), each with full genomics data provenance relating to bioinformatics methods, quality control, and passage history. Comparative genomics analysis of ATCC standard reference genomes (ASRGs) revealed significant issues in NCBI's RefSeq bacterial genome assemblies related to completeness, mutations, structure, strain metadata, and gaps in traceability to the original biological source materials. Nearly half of RefSeq assemblies lack details on sample source information, sequencing technology, or bioinformatics methods. Deep curation of these records is not within the scope of NCBI's core mission in supporting open science, which aims to collect sequence records that are submitted by the public. Nonetheless, we propose that gaps in metadata accuracy and data provenance represent an "elephant in the room" for microbial genomics research. Effectively addressing these issues will require raising the level of accountability for data depositors and acknowledging the need for higher expectations of quality among the researchers whose research depends on accurate and attributable reference genome data. The traceability of microbial genomics data to authenticated physical biological materials is not a requirement for depositing these data into public genome databases.
This creates significant risks for the reliability and data provenance of these important genomics research resources, the impact of which is not well understood. We sought to investigate this by carrying out a comparative genomics study of 1,113 ATCC standard reference genomes (ASRGs) produced by ATCC from authenticated and traceable materials using the latest sequencing technologies. We found widespread discrepancies in genome assembly quality, genetic variability, and the quality and completeness of the associated metadata among hundreds of reference genomes for ATCC strains found in NCBI's RefSeq database. We present a comparative analysis of the assembled ASRGs, their respective metadata, and a variant analysis using RefSeq genomes as a reference. Although assembly quality in RefSeq has generally improved over time, significant quality issues remain, especially with respect to genomic data and metadata provenance. Our work highlights the importance of data authentication and provenance for the microbial genomics community and underscores the risks of ignoring this issue in the future.
Topics: Databases, Genetic; Genome, Bacterial; Genome, Microbial; Genomics; Reproducibility of Results
PubMed: 35491842
DOI: 10.1128/msphere.00077-22
Anatomical Record (Hoboken, N.J. : 2007) Apr 2022
Review
Human fetal and embryo collections (FECs) peaked in the late 19th century, an era before informed consent, and hence have unclear provenance. These collections are not only historical artifacts but also prized resources for education and research. This study aimed to determine, via a narrative review, the present location, status, and profile of reported human fetal and embryonic collections. Twenty-seven articles that reported on collections appropriate to the study were selected from an initial search pool of 120 articles. The reported collections were in Australia (n = 1), Germany (n = 6), Japan (n = 1), Spain (n = 1), and the United States (n = 5). The largest collection is reported to contain 45,000 prenatal remains and the smallest, three remains. The majority of the collections were established for education and research. Eight collections contain both embryos and fetuses, one collection contained embryos exclusively, and another contained only fetuses and one neonatal cadaver. The provenance, where mentioned, specified gynecologists and obstetricians as the main sources of remains (n = 5). Except for the Kyoto Collection, information regarding informed consent from the next of kin was lacking. This paper draws upon the three themes of purpose, provenance, and profile and highlights the need to establish agreed international guidelines for the most appropriate ethical and sustainable practice with respect to the establishment, procurement of remains, access, and maintenance of these collections. Nine domains for these guidelines are recommended: consent, privacy, commercial gain, digital and emerging technologies, commemorations and memorials, destruction and disposal, dignity of donors, global database and collaboration, and sustainability.
Topics: Cadaver; Female; Fetus; Germany; Humans; Infant, Newborn; Informed Consent; Pregnancy; Spain; United States
PubMed: 35099840
DOI: 10.1002/ar.24863
Scientific Data Apr 2023
We present a database resulting from high throughput experimentation, primarily on metal oxide solid state materials. The central relational database, the Materials Provenance Store (MPS), manages the metadata and experimental provenance from acquisition of raw materials, through synthesis, to a broad range of materials characterization techniques. Given the primary research goal of materials discovery of solar fuels materials, many of the characterization experiments involve electrochemistry, along with optical, structural, and compositional characterizations. The MPS is populated with all information required for executing common data queries, which typically do not involve direct query of raw data. The result is a database file that can be distributed to users so that they can independently execute queries and subsequently download the data of interest. We propose this strategy as an approach to manage the highly heterogeneous and distributed data that arises from materials science experiments, as demonstrated by the management of over 30 million experiments run on over 12 million samples in the present MPS release.
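The query-then-download pattern described above can be sketched with a small, distributable SQLite file: metadata and provenance live in the relational database, while raw data stay elsewhere and are fetched only when a query identifies them as relevant. The schema and column names below are illustrative assumptions, not the actual MPS schema.

```python
import sqlite3

# Illustrative sketch of a distributable provenance database in the spirit
# of the MPS. Table and column names are invented for this example.
conn = sqlite3.connect(":memory:")  # a real release would be a .db file on disk
conn.executescript("""
CREATE TABLE samples (
    sample_id   INTEGER PRIMARY KEY,
    composition TEXT            -- e.g. nominal metal-oxide composition
);
CREATE TABLE experiments (
    experiment_id INTEGER PRIMARY KEY,
    sample_id     INTEGER REFERENCES samples(sample_id),
    technique     TEXT,         -- e.g. 'electrochemistry', 'XRD'
    raw_data_uri  TEXT          -- pointer to raw data stored elsewhere
);
""")
conn.execute("INSERT INTO samples VALUES (1, 'BiVO4')")
conn.execute("INSERT INTO experiments VALUES (10, 1, 'electrochemistry', 'file://ec/10.zip')")
conn.execute("INSERT INTO experiments VALUES (11, 1, 'XRD', 'file://xrd/11.zip')")

# A typical metadata query: which characterizations exist for a sample,
# answered without touching any raw data.
rows = conn.execute(
    "SELECT e.technique, e.raw_data_uri FROM experiments e "
    "JOIN samples s ON s.sample_id = e.sample_id "
    "WHERE s.composition = ?", ("BiVO4",)
).fetchall()
print(rows)
```

The design choice this illustrates is separating queryable metadata from bulk raw data, which keeps the distributable file small even when it indexes millions of experiments.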
Topics: Semantics; Databases, Factual; Metadata
PubMed: 37024515
DOI: 10.1038/s41597-023-02107-0
NeuroImage Aug 2008
Provenance, the description of the history of a set of data, has grown more important with the proliferation of research consortia-related efforts in neuroimaging. Knowledge about the origin and history of an image is crucial for establishing data and results quality; detailed information about how it was processed, including the specific software routines and operating systems that were used, is necessary for proper interpretation, high-fidelity replication and re-use. We have drafted a mechanism for describing provenance in a simple and easy-to-use environment, alleviating the burden of documentation from the user while still providing a rich description of an image's provenance. This combination of ease of use and highly descriptive metadata should greatly facilitate the collection of provenance and subsequent sharing of data.
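A minimal sketch of the kind of image-provenance record the abstract describes: each processing step records the tool, its version, and its parameters, while the environment details are captured automatically so the user is not burdened with documenting them. The field names and helper function are illustrative, not the authors' actual schema.

```python
import json
import platform

def record_step(provenance, tool, version, parameters):
    """Append one processing step to an image's provenance record,
    capturing the operating system automatically rather than asking
    the user to document it by hand."""
    provenance["steps"].append({
        "tool": tool,
        "version": version,
        "parameters": parameters,
        "os": platform.system(),  # environment captured without user effort
    })

# Hypothetical two-step pipeline applied to one structural image.
prov = {"subject": "sub-01_T1w", "steps": []}
record_step(prov, "skull_strip", "2.1", {"threshold": 0.5})
record_step(prov, "normalize", "1.4", {"template": "MNI152"})
print(json.dumps(prov, indent=2))
```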
Topics: Database Management Systems; Databases, Factual; Diagnostic Imaging; Documentation; Information Storage and Retrieval; Neurosciences; Ownership; Terminology as Topic
PubMed: 18519166
DOI: 10.1016/j.neuroimage.2008.04.186
Journal of Medical Internet Research Mar 2023
Review
BACKGROUND
Data provenance refers to the origin, processing, and movement of data. Reliable and precise knowledge about data provenance has great potential to improve reproducibility as well as quality in biomedical research and, therefore, to foster good scientific practice. However, despite the increasing interest in data provenance technologies in the literature and their implementation in other disciplines, these technologies have not yet been widely adopted in biomedical research.
OBJECTIVE
The aim of this scoping review was to provide a structured overview of the body of knowledge on provenance methods in biomedical research by systematizing articles covering data provenance technologies developed for or used in this application area; describing and comparing the functionalities as well as the design of the provenance technologies used; and identifying gaps in the literature, which could provide opportunities for future research on technologies that could receive more widespread adoption.
METHODS
Following a methodological framework for scoping studies and the PRISMA-ScR (Preferred Reporting Items for Systematic Reviews and Meta-Analyses Extension for Scoping Reviews) guidelines, articles were identified by searching the PubMed, IEEE Xplore, and Web of Science databases and subsequently screened for eligibility. We included original articles covering software-based provenance management for scientific research published between 2010 and 2021. A set of data items was defined along the following five axes: publication metadata, application scope, provenance aspects covered, data representation, and functionalities. The data items were extracted from the articles, stored in a charting spreadsheet, and summarized in tables and figures.
RESULTS
We identified 44 original articles published between 2010 and 2021. We found that the solutions described were heterogeneous along all axes. We also identified relationships among the motivations for using provenance information, feature sets (capture, storage, retrieval, visualization, and analysis), and implementation details such as the data models and technologies used. An important gap we identified is that only a few publications address the analysis of provenance data or use established provenance standards, such as PROV.
CONCLUSIONS
The heterogeneity of provenance methods, models, and implementations found in the literature points to the lack of a unified understanding of provenance concepts for biomedical data. Providing a common framework, a biomedical reference, and benchmarking data sets could foster the development of more comprehensive provenance solutions.
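As an illustration of the PROV standard the review mentions, a minimal W3C PROV-style document can be expressed in the PROV-JSON layout with nothing but the standard library: an entity generated by an activity that used a source entity and was associated with an agent. All identifiers here are invented for the example.

```python
import json

# Minimal PROV-JSON-style document for a hypothetical biomedical workflow:
# a cohort table derived from a raw EHR extract by a selection activity
# that an analyst carried out. Every "ex:" identifier is invented.
prov_doc = {
    "entity": {
        "ex:raw_ehr_extract": {"prov:type": "dataset"},
        "ex:cohort_table":    {"prov:type": "dataset"},
    },
    "activity": {
        "ex:cohort_selection": {"prov:startTime": "2021-05-01T09:00:00"},
    },
    "agent": {
        "ex:analyst": {"prov:type": "prov:Person"},
    },
    # Relations: used, wasGeneratedBy, wasAssociatedWith.
    "used": {"_:u1": {"prov:activity": "ex:cohort_selection",
                      "prov:entity": "ex:raw_ehr_extract"}},
    "wasGeneratedBy": {"_:g1": {"prov:entity": "ex:cohort_table",
                                "prov:activity": "ex:cohort_selection"}},
    "wasAssociatedWith": {"_:a1": {"prov:activity": "ex:cohort_selection",
                                   "prov:agent": "ex:analyst"}},
}
print(json.dumps(prov_doc, indent=2))
```

Recording relations explicitly like this is what makes provenance analyzable: downstream tools can walk from any result entity back through the activities and agents that produced it.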
Topics: Humans; Biomedical Research; Metadata; PubMed; Reproducibility of Results; Software
PubMed: 36972116
DOI: 10.2196/42289
JMIR Research Protocols Nov 2021
BACKGROUND
Provenance supports the understanding of data genesis and is a key factor in ensuring the trustworthiness of digital objects containing (sensitive) scientific data. Provenance information contributes to a better understanding of scientific results and fosters collaboration on existing data as well as data sharing. This encompasses defining comprehensive concepts and standards for transparency and traceability, reproducibility, validity, and quality assurance during clinical and scientific data workflows and research.
OBJECTIVE
The aim of this scoping review is to investigate existing evidence regarding approaches and criteria for provenance tracking as well as disclosing current knowledge gaps in the biomedical domain. This review covers modeling aspects as well as metadata frameworks for meaningful and usable provenance information during creation, collection, and processing of (sensitive) scientific biomedical data. This review also covers the examination of quality aspects of provenance criteria.
METHODS
This scoping review will follow the methodological framework of Arksey and O'Malley. Relevant publications will be obtained by querying PubMed and Web of Science. All English-language papers published between January 1, 2006, and March 23, 2021, will be included. Database retrieval will be accompanied by a manual search for grey literature. Potential publications will then be exported into a reference management software, and duplicates will be removed. Afterwards, the obtained set of papers will be transferred into a systematic review management tool. All publications will be screened, extracted, and analyzed: title and abstract screening will be carried out by 4 independent reviewers, with a majority vote required for decisions on eligibility based on the defined inclusion and exclusion criteria. Full-text reading will be performed independently by 2 reviewers, and in the last step, key information will be extracted onto a pretested template. If agreement cannot be reached, the conflict will be resolved by a domain expert. Charted data will be analyzed by categorizing and summarizing the individual data items based on the research questions. Tabular or graphical overviews will be given where applicable.
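The screening rule in the methods (four independent reviewers, majority vote on eligibility, unresolved conflicts escalated to a domain expert) can be sketched as a simple decision function. The function name and quorum threshold are an illustrative reading of the protocol, not its actual tooling.

```python
from collections import Counter

def screening_decision(votes, quorum=3):
    """Decide eligibility from independent reviewer votes.

    votes  : list of 'include' / 'exclude' strings, one per reviewer.
    quorum : minimum agreeing votes for a majority among 4 reviewers.
    Returns 'include', 'exclude', or 'escalate' (a tie, e.g. 2-2,
    goes to the domain expert for resolution).
    """
    tally = Counter(votes)
    decision, count = tally.most_common(1)[0]
    if count >= quorum:
        return decision
    return "escalate"

print(screening_decision(["include", "include", "include", "exclude"]))  # include
print(screening_decision(["include", "exclude", "include", "exclude"]))  # escalate
```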
RESULTS
The reporting follows the extension of the Preferred Reporting Items for Systematic reviews and Meta-Analyses statements for Scoping Reviews. Electronic database searches in PubMed and Web of Science resulted in 469 matches after deduplication. As of September 2021, the scoping review is in the full-text screening stage. The data extraction using the pretested charting template will follow the full-text screening stage. We expect the scoping review report to be completed by February 2022.
CONCLUSIONS
Information about the origin of healthcare data has a major impact on the quality and the reusability of scientific results as well as follow-up activities. This protocol outlines plans for a scoping review that will provide information about current approaches, challenges, or knowledge gaps with provenance tracking in biomedical sciences.
INTERNATIONAL REGISTERED REPORT IDENTIFIER (IRRID)
DERR1-10.2196/31750.
PubMed: 34813494
DOI: 10.2196/31750
Patterns (New York, N.Y.) Sep 2021
Review
Reproducible computational research (RCR) is the keystone of the scientific method for analyses, packaging the transformation of raw data to published results. In addition to its role in research integrity, improving the reproducibility of scientific studies can accelerate evaluation and reuse. This potential and wide support for the FAIR principles have motivated interest in metadata standards supporting reproducibility. Metadata provide context and provenance to raw data and methods and are essential to both discovery and validation. Despite this shared connection with scientific data, few studies have explicitly described how metadata enable reproducible computational research. This review employs a functional content analysis to identify metadata standards that support reproducibility across an analytic stack consisting of input data, tools, notebooks, pipelines, and publications. Our review provides background context, explores gaps, and discovers component trends of embeddedness and methodology weight from which we derive recommendations for future work.
PubMed: 34553169
DOI: 10.1016/j.patter.2021.100322
Journal of Biomedical Semantics Jul 2023
BACKGROUND
Clinical decision support systems have been widely deployed to guide healthcare decisions on patient diagnosis, treatment choices, and patient management through evidence-based recommendations. These recommendations are typically derived from clinical practice guidelines created by clinical specialties or healthcare organizations. Although there have been many different technical approaches to encoding guideline recommendations into decision support systems, much of the previous work has not focused on enabling system generated recommendations through the formalization of changes in a guideline, the provenance of a recommendation, and applicability of the evidence. Prior work indicates that healthcare providers may not find that guideline-derived recommendations always meet their needs for reasons such as lack of relevance, transparency, time pressure, and applicability to their clinical practice.
RESULTS
We introduce several semantic techniques that model diseases based on clinical practice guidelines, the provenance of those guidelines, and the study cohorts they are based on, to enhance the capabilities of clinical decision support systems. We have explored ways to equip clinical decision support systems with semantic technologies that can represent and link to details in related items from the scientific literature, quickly adapt to changing information from the guidelines, identify gaps, and support personalized explanations. Previous semantics-driven clinical decision systems offer limited support in all of these aspects, and we present ontologies and semantic web-based software tools in three distinct areas that are unified using a standard set of ontologies and a custom-built knowledge graph framework: (i) guideline modeling to characterize diseases, (ii) guideline provenance to attach evidence to treatment decisions from authoritative sources, and (iii) study cohort modeling to identify relevant research publications for complicated patients.
CONCLUSIONS
We have enhanced existing, evidence-based knowledge by developing ontologies and software that enables clinicians to conveniently access updates to and provenance of guidelines, as well as gather additional information from research studies applicable to their patients' unique circumstances. Our software solutions leverage many well-used existing biomedical ontologies and build upon decades of knowledge representation and reasoning work, leading to explainable results.
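A toy illustration of the guideline-provenance idea described above: if recommendations, guidelines, and evidence are stored as triples in a knowledge graph, a clinician-facing tool can trace any recommendation back to its guideline and supporting study cohort. The triples, identifiers, and predicate names below are invented for illustration and are not the authors' ontologies.

```python
# Toy knowledge graph: (subject, predicate, object) triples linking a
# treatment recommendation to the guideline that issued it and the study
# cohort behind it. All names are invented for this sketch.
TRIPLES = [
    ("rec:first_line_drug",     "derivedFrom",   "guideline:diabetes_2023"),
    ("guideline:diabetes_2023", "citesEvidence", "study:cohort_A"),
    ("guideline:diabetes_2023", "supersedes",    "guideline:diabetes_2021"),
]

def objects_of(subject, predicate):
    """Return every object matching the pattern (subject, predicate, ?)."""
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]

def provenance_chain(recommendation):
    """Walk recommendation -> guideline -> evidence: the kind of
    traceability query a decision-support interface could expose."""
    chain = []
    for guideline in objects_of(recommendation, "derivedFrom"):
        chain.append(guideline)
        chain.extend(objects_of(guideline, "citesEvidence"))
    return chain

print(provenance_chain("rec:first_line_drug"))
```

In a real system the same queries would run over RDF with standard ontologies; the point of the sketch is only that explicit provenance links make recommendations explainable rather than opaque.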
Topics: Humans; Decision Support Systems, Clinical; Software; Knowledge Bases; Biological Ontologies; Publications
PubMed: 37464259
DOI: 10.1186/s13326-023-00285-9