-
Reviews in Fish Biology and Fisheries 2023
Seafood is an important source of protein and micronutrients, but fishery stocks are increasingly under pressure from both legitimate and illegitimate fishing practices. Sustainable management of our oceans is a global responsibility, aligning with United Nations Sustainable Development Goal 14, Life Below Water. In a post-COVID-19 world, there is an opportunity to build back better, where locally sourced food via transparent supply chains is ever more important. This article summarises emerging research from two innovative case studies: detecting and validating seafood provenance, and using alternative supply chains to minimise the opportunity for seafood fraud in a post-COVID-19 world.
PubMed: 36593873
DOI: 10.1007/s11160-022-09747-2
-
Patterns (New York, N.Y.) Sep 2021
Review
Reproducible computational research (RCR) is the keystone of the scientific method for analyses, packaging the transformation of raw data to published results. In addition to its role in research integrity, improving the reproducibility of scientific studies can accelerate evaluation and reuse. This potential and wide support for the FAIR principles have motivated interest in metadata standards supporting reproducibility. Metadata provide context and provenance to raw data and methods and are essential to both discovery and validation. Despite this shared connection with scientific data, few studies have explicitly described how metadata enable reproducible computational research. This review employs a functional content analysis to identify metadata standards that support reproducibility across an analytic stack consisting of input data, tools, notebooks, pipelines, and publications. Our review provides background context, explores gaps, and discovers component trends of embeddedness and methodology weight from which we derive recommendations for future work.
PubMed: 34553169
DOI: 10.1016/j.patter.2021.100322
-
International Journal of Medical... Sep 2020
OBJECTIVE
The creation and exchange of patients' Electronic Healthcare Records have developed significantly in the last decade. Patients' records are however distributed in data silos across multiple healthcare facilities, posing technical and clinical challenges that may endanger patients' safety. Current healthcare sharing systems ensure interoperability of patients' records across facilities, but they have limits in presenting doctors with the clinical context of the data in the records. We design and implement a platform for managing provenance tracking of Electronic Healthcare Records based on blockchain technology, compliant with the latest healthcare standards and following the patient-informed consent preferences.
METHODS
The platform leverages two pillars: the use of international standards such as Integrating the Healthcare Enterprise (IHE), Health Level Seven International (HL7) and Fast Healthcare Interoperability Resources (FHIR) to achieve interoperability, and the use of a provenance creation process that, by design, avoids personal data storage within the blockchain. The platform consists of: (1) a smart contract implemented within the Hyperledger Fabric blockchain that manages provenance according to W3C PROV for medical documents in standardised formats (e.g. a CDA document, a FHIR resource, a DICOM study, etc.); (2) a Java Proxy that intercepts all the document submissions and retrievals for which provenance shall be evaluated; (3) a service used to retrieve the PROV document.
RESULTS
We integrated our decentralised platform with the SpiritEHR engine, an enterprise-grade healthcare system, and we stored and retrieved the available documents in Mandel's sample CDA repository, which contained no protected health information. Using a cloud-based blockchain solution, we observed that the overhead added to the typical processing time of reading and writing medical data is on the order of milliseconds. Moreover, the integration of the Proxy at the level of exchanged messages in EHR systems allows transparent usage of provenance data in multiple health computing domains such as decision making, data reconciliation, and patient consent auditing.
CONCLUSIONS
By using international healthcare standards and a cloud-based blockchain deployment, we delivered a solution that can manage provenance of patients' records via transparent integration within the routine operations on healthcare data.
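The provenance-creation idea described in the methods above (recording who did what to a document without placing personal data on the ledger) can be illustrated with a minimal Python sketch. This is not the authors' Hyperledger Fabric smart contract or Java Proxy: the function names, identifiers, and the dictionary layout are hypothetical, and the layout only borrows W3C PROV terms (entity, activity, agent, wasGeneratedBy, wasAssociatedWith) rather than following an exact PROV serialisation. The sketch assumes the document is referenced only by its SHA-256 digest.

```python
import hashlib
import json
from datetime import datetime, timezone


def make_provenance_record(document_bytes, document_format, agent_id, activity_id):
    """Build a PROV-style record that references a document only by its digest.

    All names here are hypothetical; the document content never enters the record.
    """
    digest = hashlib.sha256(document_bytes).hexdigest()
    entity_id = f"doc:{digest}"
    finished = datetime.now(timezone.utc).isoformat()
    return {
        "entity": {entity_id: {"format": document_format}},
        "activity": {activity_id: {"endedAtTime": finished}},
        "agent": {agent_id: {}},
        "wasGeneratedBy": {"entity": entity_id, "activity": activity_id},
        "wasAssociatedWith": {"activity": activity_id, "agent": agent_id},
    }


def matches(record, document_bytes):
    """Check whether a record refers to exactly these document bytes."""
    digest = hashlib.sha256(document_bytes).hexdigest()
    return f"doc:{digest}" in record["entity"]


# Example with made-up content and identifiers.
cda = b"<ClinicalDocument>sample content</ClinicalDocument>"
record = make_provenance_record(cda, "CDA", "agent:hospital-a", "activity:submit-1")
serialised = json.dumps(record)
```

Because only the digest is stored, a holder of the original document can verify that a record refers to it, while the record alone reveals nothing about the document's content.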
Topics: Delivery of Health Care; Electronic Health Records; Health Facilities; Health Level Seven; Humans; Information Storage and Retrieval
PubMed: 32540775
DOI: 10.1016/j.ijmedinf.2020.104197
-
Seminars in Diagnostic Pathology Sep 2019
Review
From a technical perspective, specimen identity determination in surgical pathology over the last several decades has primarily focused on analysis of repetitive DNA sequences, specifically microsatellite repeats. However, a number of techniques have recently been developed that have similar, if not greater, utility in surgical pathology, most notably analysis of single nucleotide polymorphisms (SNPs) and gene panels by next generation sequencing (NGS). For cases with an extremely limited sample or a degraded sample, sequence analysis of mitochondrial DNA continues to be the method of choice. From a diagnostic perspective, interest in identity determination in surgical pathology is usually centered on resolving issues of specimen provenance due to specimen labeling/accessioning deficiencies and possible contamination, but is also frequently performed in cases for which the patient's clinical course following definitive therapy is remarkably atypical, in cases of an unexpected diagnosis, and by patient request for "peace of mind". However, the methods used for identity determination have a much broader range of applications in surgical pathology beyond tissue provenance analysis. The methods can be used to provide ancillary information for cases in which the histomorphology is not definitively diagnostic, for example for tumors that have a virtually identical microscopic appearance but for which the differential diagnosis includes synchronous/metachronous tumors versus a metastasis, and for the diagnosis of hydropic early gestations versus hydatidiform molar pregnancies. The methods also have utility in several other clinical settings, for example to rule out a donor-transmitted malignancy in a transplant recipient, to monitor bone marrow transplant engraftment, and to evaluate natural chimerism.
Topics: High-Throughput Nucleotide Sequencing; Humans; Pathology, Surgical
PubMed: 31196743
DOI: 10.1053/j.semdp.2019.06.001
-
MSphere Jun 2022
The availability of public genomics data has become essential for modern life sciences research, yet the quality, traceability, and curation of these data have significant impacts on a broad range of microbial genomics research. While microbial genome databases such as NCBI's RefSeq database leverage the scalability of crowd sourcing for growth, genomics data provenance and authenticity of the source materials used to produce data are not strict requirements. Here, we describe the assembly of 1,113 bacterial genome references produced from authenticated materials sourced from the American Type Culture Collection (ATCC), each with full genomics data provenance relating to bioinformatics methods, quality control, and passage history. Comparative genomics analysis of ATCC standard reference genomes (ASRGs) revealed significant issues with regard to NCBI's RefSeq bacterial genome assemblies related to completeness, mutations, structure, strain metadata, and gaps in traceability to the original biological source materials. Nearly half of RefSeq assemblies lack details on sample source information, sequencing technology, or bioinformatics methods. Deep curation of these records is not within the scope of NCBI's core mission in supporting open science, which aims to collect sequence records that are submitted by the public. Nonetheless, we propose that gaps in metadata accuracy and data provenance represent an "elephant in the room" for microbial genomics research. Effectively addressing these issues will require raising the level of accountability for data depositors and acknowledging the need for higher expectations of quality among the researchers whose research depends on accurate and attributable reference genome data.
The traceability of microbial genomics data to authenticated physical biological materials is not a requirement for depositing these data into public genome databases. This creates significant risks for the reliability and data provenance of these important genomics research resources, the impact of which is not well understood. We sought to investigate this by carrying out a comparative genomics study of 1,113 ATCC standard reference genomes (ASRGs) produced by ATCC from authenticated and traceable materials using the latest sequencing technologies. We found widespread discrepancies in genome assembly quality, genetic variability, and the quality and completeness of the associated metadata among hundreds of reference genomes for ATCC strains found in NCBI's RefSeq database. We present a comparative analysis of de novo-assembled ASRGs, their respective metadata, and variant analysis using RefSeq genomes as a reference. Although assembly quality in RefSeq has generally improved over time, we found that significant quality issues remain, especially as related to genomic data and metadata provenance. Our work highlights the importance of data authentication and provenance for the microbial genomics community, and underscores the risks of ignoring this issue in the future.
Topics: Databases, Genetic; Genome, Bacterial; Genome, Microbial; Genomics; Reproducibility of Results
PubMed: 35491842
DOI: 10.1128/msphere.00077-22
-
Studies in Health Technology and... May 2022
The distributed nature of modern research emphasizes the importance of collecting and sharing the history of digital and physical material, to improve the reproducibility of experiments and the quality and reusability of results. Yet the current methodologies for recording provenance information are applied in a largely scattered way, leading to silos of provenance information at different granularities. To tackle this fragmentation, we developed the Common Provenance Model, a set of guidelines for generating interoperable provenance information and for reconstructing and navigating a continuous provenance chain. This work presents the first version of the model, available online, based on the W3C PROV Data Model and the Provenance Composition pattern.
Topics: Biological Science Disciplines; Reproducibility of Results
PubMed: 35612111
DOI: 10.3233/SHTI220489
-
The Analyst Jun 2020
Review
Providing maximum information on the provenance of scientific results in life sciences has received considerable attention since the widely publicized reproducibility crisis. Improving the reproducibility of data processing and analysis workflows is part of this movement and may help achieve clinical deployment more quickly. Scientific workflow managers can be valuable tools towards achieving this goal. Although these platforms are already well established in genomics and other omics fields, in metabolomics, scripts and dedicated software packages are still more popular. However, versatile workflows for metabolomics exist in the KNIME and Galaxy platforms. Here we summarize the available options of scientific workflow managers dedicated to metabolomics analysis.
Topics: Metabolomics; Reproducibility of Results; Software; Workflow
PubMed: 32374793
DOI: 10.1039/d0an00272k
-
Patterns (New York, N.Y.) May 2020
Data provenance is a machine-readable summary of the collection and computational history of a dataset. Data provenance confers or adds value to a dataset, helps reproduce computational analyses, or validates scientific conclusions. The people of the End-to-End Provenance Project are a community of professionals who have developed software tools to collect and use data provenance.
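As a rough illustration of what a machine-readable summary of a dataset's computational history might look like, the following Python sketch logs each processing step together with fingerprints of its input and output. It is not one of the End-to-End Provenance Project's tools; the helper names (`fingerprint`, `run_with_provenance`), the log fields, and the sample data are assumptions made for this example only.

```python
import hashlib
import json
import platform
from datetime import datetime, timezone


def fingerprint(obj):
    """Return a short SHA-256 digest of a JSON-serialisable object."""
    payload = json.dumps(obj, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:12]


def run_with_provenance(step_name, func, data, log):
    """Apply func to data and append a provenance entry for the step to log."""
    result = func(data)
    log.append({
        "step": step_name,
        "input": fingerprint(data),
        "output": fingerprint(result),
        "python": platform.python_version(),
        "finished": datetime.now(timezone.utc).isoformat(),
    })
    return result


# Two chained steps on made-up measurements (None marks a missing value).
log = []
raw = [3.0, 4.5, None, 5.5]
cleaned = run_with_provenance(
    "drop_missing", lambda xs: [x for x in xs if x is not None], raw, log
)
mean = run_with_provenance("mean", lambda xs: sum(xs) / len(xs), cleaned, log)
```

The output fingerprint of one step equals the input fingerprint of the next, so the log can be read as a chain linking the final value back to the raw data.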
PubMed: 33205093
DOI: 10.1016/j.patter.2020.100016
-
The FEBS Journal Jan 2022
The FEBS Journal, a leading multidisciplinary journal in the life sciences, publishes high-impact papers on diverse topics relating to molecular mechanisms underpinning biological processes. Here, Editor-in-Chief Seamus Martin discusses the critical importance of data provenance and data integrity to the scientific method and discusses some of the highlights from 2021 at The FEBS Journal.
PubMed: 34982855
DOI: 10.1111/febs.16332
-
The Behavioral and Brain Sciences Sep 2022
Cognitive scientists and psychometricians are unaccustomed to thinking about culture, often treating their measures - memory, vocabulary, intelligence - as natural kinds. Relying on these measures, behavioral geneticists likewise seem to not wonder about their origin and cultural provenance. I argue that complex human traits - the sort we are most interested in measuring - are cultural products. We can measure them and their heritability, but to conclude that what we have measured is unbound to a time and place is hubris.
Topics: Cognition; Humans; Intelligence
PubMed: 36098407
DOI: 10.1017/S0140525X21001710