Clinical and Translational Science, Aug 2023
The purpose of this article is to propose and provide a blueprint for a graduate-level curriculum in clinical data science, devoted to the measurement, acquisition, care, treatment, and inferencing of clinical research data. The curriculum presented here contains a series of five required core courses, five required research courses, and a list of potential electives. The coursework draws from but does not duplicate content from the foundational areas of biostatistics, clinical medicine, biomedical informatics, and regulatory affairs, and may be reproduced by any institution interested in and capable of offering such a program. This new curriculum in "clinical" data science will prepare students for work in academic, industry, and government research settings as well as offer a unifying knowledge base for the profession.
Topics: Humans; Data Management; Data Science; Models, Educational; Biometry; Curriculum
PubMed: 37587756
DOI: 10.1111/cts.13545
BioRxiv: the Preprint Server For..., Nov 2023
Machine learning approaches have the potential for meaningful impact in the biomedical field. However, there are often challenges unique to biomedical data that prohibit the adoption of these innovations. For example, limited data, data volatility, and data shifts all compromise model robustness and generalizability. Without proper tuning and data management, deploying machine learning models in the presence of unaccounted-for corruptions leads to reduced or misleading performance. This study explores techniques to enhance model generalizability through iterative adjustments. Specifically, we investigate a detection task using electron microscopy images and compare models trained with different normalization and augmentation techniques. We found that models trained with Group Normalization or texture data augmentation outperform other normalization techniques and classical data augmentation, enabling them to learn more generalized features. These improvements persist even when models are trained and tested on disjoint datasets acquired through diverse data acquisition protocols. Results hold true for transformer- and convolution-based detection architectures. The experiments show an impressive 29% boost in average precision, indicating significant enhancements in the model's generalizability. This underscores the models' capacity to effectively adapt to diverse datasets and demonstrates their increased resilience in real-world applications.
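The preprint's code is not reproduced here; as a rough, hypothetical sketch of the normalization choice it highlights, the snippet below (PyTorch, with illustrative layer sizes) shows GroupNorm used in place of BatchNorm in a small convolutional backbone, where per-sample channel-group statistics avoid the batch-composition dependence that can hurt generalization under data shifts.

```python
# Minimal sketch (not from the paper): swapping BatchNorm for GroupNorm
# in a small convolutional feature extractor. Layer sizes are illustrative.
import torch
import torch.nn as nn

def conv_block(in_ch: int, out_ch: int, groups: int = 8) -> nn.Sequential:
    """3x3 conv followed by GroupNorm and ReLU.

    GroupNorm normalizes over channel groups within each sample, so its
    statistics do not depend on batch composition -- one reason it can
    generalize better than BatchNorm under small batches or domain shift.
    """
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.GroupNorm(num_groups=groups, num_channels=out_ch),
        nn.ReLU(inplace=True),
    )

backbone = nn.Sequential(
    conv_block(1, 32),   # single-channel input, e.g. grayscale EM images
    nn.MaxPool2d(2),
    conv_block(32, 64),
    nn.MaxPool2d(2),
)

features = backbone(torch.randn(2, 1, 128, 128))  # works even with batch size 2
print(features.shape)  # torch.Size([2, 64, 32, 32])
```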
PubMed: 38076794
DOI: 10.1101/2023.11.27.568889
Journal of Registry Management, 2023
The past several years have been marked by substantial growth in pediatric cancer data and collection across the world. In the United States, multiple projects and standard setters have laid a foundation for the growth of this data, and the need for an overview and explanation of a few of the programs directly relevant to cancer registrars has become apparent. This article will discuss 3 initiatives that highlight many of the efforts and intricacies involved with the collection of pediatric cancer data in the cancer registry world: the National Childhood Cancer Registry, the Toronto Pediatric Cancer Stage Guidelines, and the Pediatric Site-Specific Data Items Work Group.
Topics: Child; Humans; United States; Neoplasms; Registries; Neoplasm Staging; Data Management; Data Collection
PubMed: 37941745
DOI: No ID Found
Journal of the American Medical..., Sep 2023
OBJECTIVE
Researchers at New York University (NYU) Grossman School of Medicine contacted the Health Sciences Library for help with locating large datasets for reuse. In response, the library developed and maintained the NYU Data Catalog, a public-facing data catalog that has supported not only faculty acquisition of data but also the dissemination of the products of their research in various ways.
MATERIALS AND METHODS
The current NYU Data Catalog is built upon the Symfony framework with a tailored metadata schema reflecting the scope of faculty research areas. The project team curates new resources, including datasets and supporting software code, and conducts quarterly and annual evaluations to assess user interactions with the NYU Data Catalog and opportunities for growth.
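The catalog's actual schema is not reproduced here; as a purely illustrative sketch, the record below shows the kind of fields a tailored data-catalog entry might carry (all field names and values are hypothetical, not the NYU Data Catalog's metadata schema).

```python
# Hypothetical example of a curated data-catalog record; the field names are
# illustrative and are NOT the NYU Data Catalog's actual metadata schema.
catalog_record = {
    "title": "Example longitudinal cohort dataset",
    "description": "De-identified clinical measurements collected 2015-2020.",
    "subject_areas": ["epidemiology", "population health"],  # faculty research areas
    "access_instructions": "Request access via the listed data-use agreement.",
    "associated_software": ["https://example.org/analysis-code"],  # supporting code
    "local_contact": "dataset-owner@example.edu",
    "last_reviewed": "2024-01-15",  # e.g. updated during a quarterly evaluation
}

# A catalog is a discovery layer, not a repository: records point to where the
# data live and how to obtain them rather than storing the data themselves.
for field, value in catalog_record.items():
    print(f"{field}: {value}")
```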
RESULTS
Since its launch in 2015, the NYU Data Catalog underwent a number of changes prompted by an increase in the disciplines represented by faculty contributors. The catalog has also utilized faculty feedback to enhance support of data reuse and researcher collaboration through alterations to its schema, layout, and visibility of records.
DISCUSSION
These findings demonstrate the flexibility of data catalogs as a platform for enabling the discovery of disparate sources of data. While not a repository, the NYU Data Catalog is well-positioned to support mandates for data sharing from study sponsors and publishers.
CONCLUSION
The NYU Data Catalog makes the most of the data that researchers share and can be harnessed as a modular and adaptable platform to promote data sharing as a cultural practice.
Topics: Humans; New York; Universities; Software; Medicine
PubMed: 37414539
DOI: 10.1093/jamia/ocad125
Cells, Dec 2023
The Management of Data for the Banking, Qualification, and Distribution of Induced Pluripotent Stem Cells: Lessons Learned from the European Bank for Induced Pluripotent Stem Cells.
The European Bank for induced pluripotent Stem Cells (EBiSC) was established in 2014 as a non-profit project for the banking, quality control, and distribution of human iPSC lines for research around the world. EBiSC iPSCs are deposited from diverse laboratories internationally and, hence, a key activity for EBiSC is standardising not only the iPSC lines themselves but also the data associated with them. This includes enabling unique nomenclature for the cells, as well as applying uniformity to the data provided by the cell line generator versus quality control data generated by EBiSC, and providing mechanisms to share personal data in a secure and GDPR-compliant manner. A joint approach implemented by EBiSC and the human pluripotent stem cell registry (hPSCreg) has provided a solution that enabled hPSCreg to improve its registration platform for iPSCs and EBiSC to have a pipeline for the import, standardisation, storage, and management of data associated with EBiSC iPSCs. In this work, we describe the experience of cell line data management for iPSC banking throughout the course of EBiSC's development as a central European banking infrastructure and present a model for how this could be implemented by other iPSC repositories to increase the FAIRness of iPSC research globally.
Topics: Humans; Induced Pluripotent Stem Cells; Pluripotent Stem Cells; Cell Line; Registries; Reference Standards
PubMed: 38067184
DOI: 10.3390/cells12232756
JMIR Medical Informatics, Aug 2023
Review
BACKGROUND
With the advent of the digital economy and the aging population, the demand for diversified health care services and innovative care delivery models has been overwhelming. This trend has accelerated the urgency to implement effective and efficient data exchange and service interoperability, which underpins coordinated care services among tiered health care institutions, improves the quality of oversight of regulators, and provides vast and comprehensive data collection to support clinical medicine and health economics research, thus improving the overall service quality and patient satisfaction. To meet this demand and facilitate the interoperability of stakeholders' IT systems, after years of preparation, Health Level 7 formally introduced the Fast Healthcare Interoperability Resources (FHIR) standard in 2014, and it has continued to evolve since. FHIR depends on the Implementation Guide (IG) to ensure feasibility and consistency while developing an interoperable health care service. The IG defines rules with associated documentation on how FHIR resources are used to tackle a particular problem. However, a gap remains between IGs and the process of building actual services because IGs define rules without specifying concrete methods, procedures, or tools. Thus, stakeholders may find it nontrivial to participate in the ecosystem, giving rise to the need for a more actionable practice guideline (PG) for promoting FHIR's fast adoption.
OBJECTIVE
This study aimed to propose a general FHIR PG to facilitate stakeholders in the health care ecosystem to understand FHIR and quickly develop interoperable health care services.
METHODS
We selected a collection of FHIR-related papers about the latest studies or use cases on designing and building FHIR-based interoperable health care services and tagged each use case as belonging to 1 of the 3 dominant innovation feature groups that are also associated with practice stages, that is, data standardization, data management, and data integration. Next, we reviewed each group's detailed process and key techniques to build respective care services and collate a complete FHIR PG. Finally, as an example, we arbitrarily selected a use case outside the scope of the reviewed papers and mapped it back to the FHIR PG to demonstrate the effectiveness and generalizability of the PG.
RESULTS
The FHIR PG includes 2 core elements: one is a practice design that defines the responsibilities of stakeholders and outlines the complete procedure from data to services, and the other is a development architecture for practice design, which lists the available tools for each practice step and provides direct and actionable recommendations.
CONCLUSIONS
The FHIR PG can bridge the gap between IGs and the process of building actual services by proposing actionable methods, procedures, and tools. It assists stakeholders in identifying participants' roles, managing the scope of responsibilities, and developing relevant modules, thus helping promote FHIR-based interoperable health care services.
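The review itself does not include code; as a minimal, hedged sketch of the kind of building block such a practice guideline organizes, the snippet below creates a hand-written FHIR Patient resource on a hypothetical FHIR R4 server via the standard REST create interaction (the base URL and patient details are placeholders).

```python
# Minimal sketch of creating a FHIR resource over the standard REST API.
# The server base URL is a placeholder; any FHIR R4-capable endpoint would do.
import json
import urllib.request

FHIR_BASE = "https://fhir.example.org/r4"  # hypothetical server

patient = {
    "resourceType": "Patient",
    "name": [{"family": "Example", "given": ["Alex"]}],
    "gender": "female",
    "birthDate": "1980-01-01",
}

req = urllib.request.Request(
    url=f"{FHIR_BASE}/Patient",                   # type-level create endpoint
    data=json.dumps(patient).encode("utf-8"),
    headers={"Content-Type": "application/fhir+json"},
    method="POST",
)

with urllib.request.urlopen(req) as resp:         # server assigns the resource id
    created = json.loads(resp.read())
    print(created.get("id"), created.get("resourceType"))
```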
PubMed: 37603388
DOI: 10.2196/44842
BMC Health Services Research, Sep 2023
BACKGROUND
One crucial obstacle to attaining universal immunization coverage in Sub-Saharan Africa is the paucity of timely and high-quality data. This challenge, in part, stems from the fact that many frontline immunization staff in this part of the world are commonly overburdened with multiple data-related responsibilities that often compete with their clinical tasks, which in turn could affect their data collection practices. This study assessed the data management practices of immunization staff and unveiled potential barriers impacting immunization data quality in Cameroon.
METHODS
A descriptive cross-sectional study was conducted, involving health districts and health facilities in all 10 regions of Cameroon, selected through a multi-stage sampling scheme. Structured questionnaires and observation checklists were used to collect data from Expanded Program on Immunization (EPI) staff, and data were analyzed using Stata version 13.0 (StataCorp LP, College Station, TX, 2015).
RESULTS
A total of 265 facilities in 68 health districts were assessed. There was limited availability of some data recording tools, such as vaccination cards (43%), maintenance registers (8%), and stock cards (57%), in most health facilities. Core data collection tools were incompletely filled in a significant proportion of facilities (37% for registers and 81% for tally sheets). Most health facilities (89%) did not adhere to the recommendation of filling in tally sheets during vaccination; instead, filling was done either before (51% of facilities) or after (25% of facilities) vaccinating several children. Moreover, about 8% of facilities did not collect data on vaccine administration. About a third of facilities did not collect data on stock levels (35%), vaccine storage temperatures (21%), and vaccine wastage (39%).
CONCLUSION
Our findings unveil important gaps in data collection practices at the facility level that could adversely affect Cameroon's immunization data quality. They highlight the urgent need for systematic capacity building of frontline immunization staff in data management, standardizing data management processes, and building systems that ensure the constant availability of data recording tools at the facility level.
Topics: Child; Humans; Data Management; Data Accuracy; Cameroon; Cross-Sectional Studies; Vaccination; Immunization; Vaccines; Surveys and Questionnaires; Immunization Programs
PubMed: 37759205
DOI: 10.1186/s12913-023-09965-9
Sensors (Basel, Switzerland), Jan 2024
Review
Cloud computing technology is rapidly becoming ubiquitous and indispensable. However, its widespread adoption also exposes organizations and individuals to a broad spectrum of potential threats. Despite the multiple advantages the cloud offers, organizations remain cautious about migrating their data and applications to the cloud due to fears of data breaches and security compromises. In light of these concerns, this study has conducted an in-depth examination of a variety of articles to enhance the comprehension of the challenges related to safeguarding and fortifying data within the cloud environment. Furthermore, the research has scrutinized several well-documented data breaches, analyzing the financial consequences they inflicted. Additionally, it scrutinizes the distinctions between conventional digital forensics and the forensic procedures specific to cloud computing. As a result of this investigation, the study has concluded by proposing potential opportunities for further research in this critical domain. By doing so, it contributes to our collective understanding of the complex panorama of cloud data protection and security, while acknowledging the evolving nature of technology and the need for ongoing exploration and innovation in this field. This study also helps in understanding the compound annual growth rate (CAGR) of cloud digital forensics, which is found to be quite high at ≈16.53% from 2023 to 2031. Moreover, its market is expected to reach ≈USD 36.9 billion by the year 2031; presently, it is ≈USD 11.21 billion, which shows that there are great opportunities for investment in this area. This study also strategically addresses emerging challenges in cloud digital forensics, providing a comprehensive approach to navigating and overcoming the complexities associated with the evolving landscape of cloud computing.
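As a quick arithmetic check of the market figures quoted above, the snippet below recomputes the compound annual growth rate implied by growth from ≈USD 11.21 billion (2023) to ≈USD 36.9 billion (2031); the small difference from the quoted ≈16.53% is consistent with the rounded inputs.

```python
# Recomputing the compound annual growth rate implied by the quoted figures.
start_value = 11.21   # ≈ USD billion, present (2023)
end_value = 36.9      # ≈ USD billion, projected for 2031
years = 2031 - 2023   # 8-year horizon

cagr = (end_value / start_value) ** (1 / years) - 1
print(f"Implied CAGR: {cagr:.2%}")   # ≈ 16.06%, in line with the quoted ≈16.53%
```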
PubMed: 38257526
DOI: 10.3390/s24020433
Cureus, Apr 2024
Review
The efficacy of immunization programs is critically dependent on robust supply chain management, a complex challenge exacerbated by expanding program scopes and evolving vaccine technologies. This comprehensive review underscores the pivotal role of Resource Centers in fortifying the immunization supply chain, presenting a paradigm shift toward enhanced national and global health outcomes. Through a detailed examination of their key activities, the article elucidates how these centers catalyze improvements across various facets of supply chain management, from the integration of suitable technologies and specialized training programs to the development of sustainable models and advocacy for policy prioritization. The review further explores the multifaceted challenges these centers confront, including funding constraints, capacity building, and infrastructural gaps, alongside the burgeoning opportunities presented by new vaccine introductions, donor interest in health system strengthening, and the potential for a broadened scope beyond immunization. By weaving together examples of existing centers worldwide, the review highlights their contributions toward optimizing vaccine logistics, enhancing data management, and ultimately achieving Sustainable Development Goal 3. The insights provided offer valuable guidance for planning and sustaining resource centers, positioning them as indispensable allies in the global pursuit of universal immunization coverage.
PubMed: 38800200
DOI: 10.7759/cureus.58966
Journal of Pancreatology, Mar 2024
Review
The "omics" revolution has transformed the biomedical research landscape by equipping scientists with the ability to interrogate complex biological phenomena and disease processes at an unprecedented level. The volume of "big" data generated by the different omics studies such as genomics, transcriptomics, proteomics, and metabolomics has led to the concurrent development of computational tools to enable in silico analysis and aid data deconvolution. Considering the intensive resources and high costs required to generate and analyze big data, there have been centralized, collaborative efforts to make the data and analysis tools freely available as "open source" to benefit the wider research community. Pancreatology research studies have contributed to this "big data rush" and have additionally benefitted from utilizing the open source data, as evidenced by the increasing number of new research findings and publications that stem from such data. In this review, we briefly introduce the evolution of open source omics data, data types, the "FAIR" guiding principles for data management and reuse, and centralized platforms that enable free and fair data accessibility and availability and provide tools for omics data analysis. We illustrate, through the case study of our own experience in mining pancreatitis omics data, the power of repurposing open source data to answer translationally relevant questions in pancreas research.
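The FAIR principles referenced above are guidelines rather than code, but as a loose, hypothetical illustration, the sketch below flags metadata fields that commonly support findability, accessibility, interoperability, and reusability; the field names and example record are illustrative and this is not a formal FAIR assessment.

```python
# Loose illustration of FAIR-oriented metadata checks; field names are
# illustrative and this is not a formal FAIR maturity assessment.
REQUIRED_FIELDS = {
    "persistent_id": "Findable: a stable identifier such as a DOI or accession",
    "access_url": "Accessible: where and how the data can be retrieved",
    "format": "Interoperable: a standard, documented data format",
    "license": "Reusable: explicit terms for reuse",
}

def missing_fair_fields(record: dict) -> list[str]:
    """Return the FAIR-relevant fields absent or empty in a metadata record."""
    return [field for field in REQUIRED_FIELDS if not record.get(field)]

example_record = {
    "persistent_id": "GSE00000",          # hypothetical accession
    "access_url": "https://example.org/datasets/GSE00000",
    "format": "FASTQ",
    "license": "",                        # missing license -> flagged below
}

for field in missing_fair_fields(example_record):
    print(f"Missing: {field} -- {REQUIRED_FIELDS[field]}")
```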
PubMed: 38524857
DOI: 10.1097/JP9.0000000000000173