-
IEEE Computer Graphics and Applications 2024
Provenance facts, such as who made an image and how, can provide valuable context for users to make trust decisions about visual content. Against a backdrop of inexorable progress in generative AI for computer graphics, over two billion people will vote in public elections this year. Emerging standards and provenance-enhancing tools promise to play an important role in fighting fake news and the spread of misinformation. In this article, we contrast three provenance-enhancing technologies (metadata, fingerprinting, and watermarking) and discuss how we can build upon the complementary strengths of these three pillars to provide robust trust signals to support stories told by real and generative images. Beyond authenticity, we describe how provenance can also underpin new models for value creation in the age of generative AI. In doing so, we address other risks arising with generative AI, such as ensuring training consent and the proper attribution of credit to creatives who contribute their work to train generative models. We show that provenance may be combined with distributed ledger technology to develop novel solutions for recognizing and rewarding creative endeavor in the age of generative AI.
Topics: Humans; Computer Graphics; Artificial Intelligence
PubMed: 38905025
DOI: 10.1109/MCG.2024.3380168
-
Journal of the American Medical... Jun 2024
OBJECTIVES
We analyzed the degree to which daily documentation patterns in primary care varied and whether specific patterns, consistency over time, and deviations from clinicians' usual patterns were associated with note-writing efficiency.
MATERIALS AND METHODS
We used electronic health record (EHR) active use data from the Oracle Cerner Advance platform capturing hourly active documentation time for 498 physicians and advanced practice clinicians (eg, nurse practitioners) for 65 152 clinic days. We used k-means clustering to identify distinct daily patterns of active documentation time and analyzed the relationship between these patterns and active documentation time per note. We determined each primary care clinician's (PCC) modal documentation pattern and analyzed how consistency and deviations were related to documentation efficiency.
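As an illustration of the clustering step, here is a minimal sketch of k-means (Lloyd's algorithm) applied to daily documentation profiles. It assumes each clinic day is represented as a vector of 24 hourly documentation minutes; that representation, the function names `kmeans` and `make_day`, and the synthetic data are all hypothetical and are not taken from the study.

```python
import random


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's algorithm on equal-length numeric vectors."""
    rng = random.Random(seed)
    # Start from k distinct observed days (hypothetical initialisation choice).
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each day goes to its nearest centroid
        # (squared Euclidean distance).
        new_labels = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            for p in points
        ]
        # Update step: each centroid becomes the mean profile of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
    return labels, centroids


def make_day(active_hours, minutes, rng):
    """Synthetic day: about `minutes` of documentation in each active hour."""
    return [
        minutes + rng.uniform(-1, 1) if h in active_hours else 0.0
        for h in range(24)
    ]


rng = random.Random(1)
# Two made-up patterns: documenting during clinic hours vs. in the evening.
days = [make_day(range(9, 17), 8, rng) for _ in range(20)]
days += [make_day(range(18, 22), 15, rng) for _ in range(20)]

labels, centroids = kmeans(days, k=2)
for c, centroid in enumerate(centroids):
    print(f"pattern {c}: {labels.count(c)} days, "
          f"mean {sum(centroid):.1f} documentation minutes per day")
```

In this toy setup, a clinician's modal pattern would be the most frequent label among that clinician's days, and consistency would be the share of days carrying that label.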
RESULTS
We identified 8 distinct daily documentation patterns; the 3 most common patterns accounted for 80.6% of PCC-days and differed primarily in average volume of documentation time (78.1 minutes per day; 35.4 minutes per day; 144.6 minutes per day); associations with note efficiency were mixed. PCCs with >80% of days attributable to a single pattern demonstrated significantly more efficient documentation than PCCs with lower consistency; for high-consistency PCCs, days that deviated from their usual patterns were associated with less efficient documentation.
DISCUSSION
We found substantial variation in efficiency across daily documentation patterns, suggesting that PCC-level factors like EHR facility and consistency may be more important than when documentation occurs. There were substantial efficiency returns to consistency, and deviations from consistent patterns were costly.
CONCLUSION
Organizational leaders aiming to reduce documentation burden should pay specific attention to the ability of PCCs to execute consistent documentation patterns day-to-day.
PubMed: 38905016
DOI: 10.1093/jamia/ocae156
-
Research in Social & Administrative... Jun 2024
BACKGROUND
The Medical Subject Headings (MeSH) thesaurus is the controlled vocabulary used to index articles in MEDLINE. MeSH terms were mainly selected manually until June 2022, when an automated algorithm, the Medical Text Indexer (MTI), was fully implemented. A selection of automatically indexed articles is then reviewed (curated) by human indexers to ensure the quality of the process.
OBJECTIVE
To describe the association of MEDLINE indexing methods (i.e., manual, automated, and automated + curated) with MeSH assignment in pharmacy practice journals compared with medical journals.
METHODS
Original research articles published between 2016 and 2023 in two groups of journals (i.e., the Big-five general medicine and three pharmacy practice journals) were selected from PubMed using journal-specific search strategies. Metadata of the articles, including MeSH terms and indexing method, was extracted. A list of pharmacy-specific MeSH terms had been compiled from previously published studies, and their presence in pharmacy practice journal records was investigated. Using bivariate and multivariate analyses, as well as effect size measures, the number of MeSH per article was compared between journal groups, geographic origin of the journal, and indexing method.
RESULTS
A total of 8479 original research articles was retrieved: 6254 from the medical journals and 2225 from pharmacy practice journals. The number of articles indexed by the various methods was disproportionate; 77.8 % of medical and 50.5 % of pharmacy practice articles were manually indexed. Among those indexed using the automated system, 51.1 % of medical and 10.9 % of pharmacy practice articles were then curated to ensure the indexing quality. The number of MeSH terms per article varied among the three indexing methods for medical and pharmacy journals, with 15.5 vs. 13.0 in manually indexed, 9.4 vs. 7.4 in automatically indexed, and 12.1 vs. 7.8 in automated and then curated articles, respectively. Multivariate analysis showed a significant effect of indexing method and journal group on the number of MeSH terms assigned, but not of the geographical origin of the journal.
CONCLUSIONS
Articles indexed using automated MTI have fewer MeSH terms than manually indexed articles. Articles published in pharmacy practice journals were indexed with fewer MeSH terms than general medical journal articles, regardless of the indexing method used.
PubMed: 38902136
DOI: 10.1016/j.sapharm.2024.06.003
-
Biology of Reproduction Jun 2024
The Multispecies Ovary Tissue Histology Electronic Repository (MOTHER) is a publicly accessible repository of ovary histology images. MOTHER includes hundreds of images from nonhuman primates, as well as ovary histology images from an expanding range of other species. Along with an image, MOTHER provides metadata about the image, and for selected species, follicle identification annotations. Ongoing work includes assisting scientists with contributing their histology images, creation of manual and automated (via machine learning) processing pipelines to identify and count ovarian follicles in different stages of development, and the incorporation of that data into the MOTHER database (MOTHER-DB). MOTHER will be a critical data repository storing and disseminating high-value histology images that are essential for research into ovarian function, fertility, and intra-species variability.
PubMed: 38900906
DOI: 10.1093/biolre/ioae101
-
IMeta Jun 2024
A large volume of oceanic metagenomic data and environmental metadata has been published. However, most studies focused on limited ecosystems using different analysis tools, making it challenging to integrate these data into robust results and a comprehensive global understanding of the marine microbiome. Here, we constructed a systematic and quantitative analysis platform, the Microbiome Atlas/Sino-Hydrosphere for Ocean Ecosystem (MASH-Ocean: https://www.biosino.org/mash-ocean/), by integrating global marine metagenomic data and a unified data processing flow. MASH-Ocean 1.0 comprises 2147 metagenomic samples with five analysis modules: sample view, diversity, function, biogeography, and interaction network. This platform provides convenient and stable support for researchers in microbiology, environmental science, and biogeochemistry, to ensure the integration of omics data generated from hydrosphere ecosystems, to bridge the gap between elusive omics data and biological, ecological, and geological discovery, and ultimately to foster the formation of a comprehensive atlas for aquatic environments.
PubMed: 38898978
DOI: 10.1002/imt2.201
-
JAMIA Open Jul 2024
OBJECTIVE
To enable reproducible research at scale by creating a platform that enables health data users to find, access, curate, and re-use electronic health record phenotyping algorithms.
MATERIALS AND METHODS
We undertook a structured approach to identifying requirements for a phenotype algorithm platform by engaging with key stakeholders. User experience analysis was used to inform the design, which we implemented as a web application featuring a novel metadata standard for defining phenotyping algorithms, access via Application Programming Interface (API), support for computable data flows, and version control. The application has creation and editing functionality, enabling researchers to submit phenotypes directly.
RESULTS
We created and launched the Phenotype Library in October 2021. The platform currently hosts 1049 phenotype definitions defined against 40 health data sources and >200K terms across 16 medical ontologies. We present several case studies demonstrating its utility for supporting and enabling research: the library hosts curated phenotype collections for the BREATHE respiratory health research hub and the Adolescent Mental Health Data Platform, and it is supporting the development of an informatics tool to generate clinical evidence for clinical guideline development groups.
DISCUSSION
This platform makes an impact by being open to all health data users and accepting all appropriate content, as well as implementing key features that have not been widely available, including managing structured metadata, access via an API, and support for computable phenotypes.
CONCLUSIONS
We have created the first openly available, programmatically accessible resource enabling the global health research community to store and manage phenotyping algorithms. Removing barriers to describing, sharing, and computing phenotypes will help unleash the potential benefit of health data for patients and the public.
PubMed: 38895652
DOI: 10.1093/jamiaopen/ooae049
-
Diagnostics (Basel, Switzerland) Jun 2024
In recent years, Convolutional Neural Network (CNN) models have demonstrated notable advancements in various domains such as image classification and Natural Language Processing (NLP). Despite their success in image classification tasks, their potential impact on medical image retrieval, particularly in text-based medical image retrieval (TBMIR) tasks, has not yet been fully realized. This could be attributed to the complexity of the ranking process, as there is ambiguity in treating TBMIR as an image retrieval task rather than a traditional information retrieval or NLP task. To address this gap, our paper proposes a novel approach to re-ranking medical images using a Deep Matching Model (DMM) and Medical-Dependent Features (MDF). These features incorporate categorical attributes such as medical terminologies and imaging modalities. Specifically, our DMM aims to generate effective representations for query and image metadata using a personalized CNN, facilitating matching between these representations. By using MDF, a semantic similarity matrix based on Unified Medical Language System (UMLS) meta-thesaurus, and a set of personalized filters taking into account some ranking features, our deep matching model can effectively consider the TBMIR task as an image retrieval task, as previously mentioned. To evaluate our approach, we performed experiments on the medical ImageCLEF datasets from 2009 to 2012. The experimental results show that the proposed model significantly enhances image retrieval performance compared to the baseline and state-of-the-art approaches.
PubMed: 38893730
DOI: 10.3390/diagnostics14111204
-
Diagnostics (Basel, Switzerland) May 2024
In order to generate a machine learning algorithm (MLA) that can support ophthalmologists with the diagnosis of glaucoma, a carefully selected dataset that is based on clinically confirmed glaucoma patients as well as borderline cases (e.g., patients with suspected glaucoma) is required. The clinical annotation of datasets is usually performed at the expense of the data volume, which results in poorer algorithm performance. This study aimed to evaluate the application of an MLA for the automated classification of physiological optic discs (PODs), glaucomatous optic discs (GODs), and glaucoma-suspected optic discs (GSODs). Annotation of the data to the three groups was based on the diagnosis made in clinical practice by a glaucoma specialist. Color fundus photographs and 14 types of metadata (including visual field testing, retinal nerve fiber layer thickness, and cup-disc ratio) of 1168 eyes from 584 patients (POD = 321, GOD = 336, GSOD = 310) were used for the study. Machine learning (ML) was performed in the first step with the color fundus photographs only and in the second step with the images and metadata. Sensitivity, specificity, and accuracy of the classification of GSOD vs. GOD and POD vs. GOD were evaluated. Classification of GOD vs. GSOD and GOD vs. POD performed in the first step had AUCs of 0.84 and 0.88, respectively. By combining the images and metadata, the AUCs increased to 0.92 and 0.99, respectively. By combining images and metadata, excellent performance of the MLA can be achieved despite having only a small amount of data, thus supporting ophthalmologists with glaucoma diagnosis.
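For readers unfamiliar with the reported measures, the following minimal sketch shows how sensitivity, specificity, accuracy, and AUC can be computed for one binary comparison such as GOD vs. POD. The labels, classifier scores, and the 0.5 threshold are made-up illustrative values, and the function names are hypothetical; none of this is the study's data or pipeline.

```python
def confusion_metrics(y_true, y_pred):
    """Sensitivity, specificity, and accuracy from binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "sensitivity": tp / (tp + fn),  # share of positives that were detected
        "specificity": tn / (tn + fp),  # share of negatives correctly rejected
        "accuracy": (tp + tn) / len(y_true),
    }


def auc(y_true, scores):
    """AUC as the probability that a positive outscores a negative (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical example: 1 = glaucomatous disc (GOD), 0 = physiological disc (POD).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]  # made-up classifier outputs
y_pred = [1 if s >= 0.5 else 0 for s in scores]    # assumed decision threshold

metrics = confusion_metrics(y_true, y_pred)
print(metrics)              # sensitivity, specificity, accuracy all 0.75
print(auc(y_true, scores))  # 0.875
```

Unlike the three threshold-based measures, AUC depends only on how the scores rank positives against negatives, which is why it is often reported alongside them.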
PubMed: 38893600
DOI: 10.3390/diagnostics14111073
-
Foods (Basel, Switzerland) May 2024
Amplicon-targeted metagenomics is now the standard approach for the study of the composition and dynamics of food microbial communities. Hundreds of papers on this subject have been published in scientific journals and the information is dispersed in a variety of sources, while raw sequences and their metadata are available in public repositories for some, but not all, of the published studies. A limited number of web resources and databases allow scientists to access this wealth of information but their level of annotation on studies and samples varies. Here, we report on the release of FoodMicrobionet v5, a comprehensive database of metataxonomic studies on bacterial and fungal communities of foods. The current version of the database includes 251 published studies (11 focusing on fungal microbiota, 230 on bacterial microbiota, and 10 providing data for both bacterial and fungal microbiota) and 14,035 samples with data on bacteria and 1114 samples with data on fungi. The new structure of the database is compatible with interactive apps and scripts developed for previous versions and allows scientists, R&D personnel in industries and regulators to access a wealth of information on food microbial communities.
PubMed: 38890917
DOI: 10.3390/foods13111689
-
Scientific Data Jun 2024
Air temperature (Ta), snow depth (Sd), and soil temperature (Tg) are crucial variables for studying the above- and below-ground thermal conditions, especially in high latitudes. However, in-situ observations are frequently sparse and inconsistent across various datasets, with a significant amount of missing data. This study has assembled a comprehensive dataset of in-situ observations of Ta, Sd, and Tg for the Northern Hemisphere (higher than 30°N latitude), spanning 1960-2021. This dataset encompasses metadata and daily data time series for 27,768, 32,417, and 659 gages for Ta, Sd, and Tg, respectively. Using the ERA5-Land reanalysis data product, we applied deep learning methodology to reconstruct the missing data that account for 54.5%, 59.3%, and 74.3% of Ta, Sd, and Tg daily time series, respectively. The obtained high temporal resolution dataset can be used to better understand physical phenomena and relevant mechanisms, such as the dynamics of land-surface-atmosphere energy exchange, snowpack, and permafrost.
PubMed: 38890309
DOI: 10.1038/s41597-024-03483-x