metadata - OpenMD.com Journal Search

To Authenticity, and Beyond! Building Safe and Fair Generative AI Upon the Three Pillars of Provenance.

IEEE Computer Graphics and Applications 2024

Authors: John Collomosse, Andy Parsons, Mike Potel...

Provenance facts, such as who made an image and how, can provide valuable context for users to make trust decisions about visual content. Against a backdrop of inexorable progress in generative AI for computer graphics, over two billion people will vote in public elections this year. Emerging standards and provenance-enhancing tools promise to play an important role in fighting fake news and the spread of misinformation. In this article, we contrast three provenance-enhancing technologies (metadata, fingerprinting, and watermarking) and discuss how we can build upon the complementary strengths of these three pillars to provide robust trust signals to support stories told by real and generative images. Beyond authenticity, we describe how provenance can also underpin new models for value creation in the age of generative AI. In doing so, we address other risks arising with generative AI, such as ensuring training consent and the proper attribution of credit to creatives who contribute their work to train generative models. We show that provenance may be combined with distributed ledger technology to develop novel solutions for recognizing and rewarding creative endeavor in the age of generative AI.

Topics: Humans; Computer Graphics; Artificial Intelligence

PubMed: 38905025
DOI: 10.1109/MCG.2024.3380168

Consistency is key: documentation distribution and efficiency in primary care.

Journal of the American Medical... Jun 2024

Authors: Nate C Apathy, Joshua Biro, A Jay Holmgren...

OBJECTIVES

We analyzed the degree to which daily documentation patterns in primary care varied and whether specific patterns, consistency over time, and deviations from clinicians' usual patterns were associated with note-writing efficiency.

MATERIALS AND METHODS

We used electronic health record (EHR) active use data from the Oracle Cerner Advance platform capturing hourly active documentation time for 498 physicians and advanced practice clinicians (eg, nurse practitioners) for 65 152 clinic days. We used k-means clustering to identify distinct daily patterns of active documentation time and analyzed the relationship between these patterns and active documentation time per note. We determined each primary care clinician's (PCC) modal documentation pattern and analyzed how consistency and deviations were related to documentation efficiency.
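As an illustration of the clustering step described above, the following is a minimal sketch of k-means (Lloyd's algorithm) applied to daily documentation profiles. It is not the authors' code: the function names, the four-block day profiles, and the deterministic seeding rule are all assumptions made for this example, and the study itself worked with hourly active documentation time.

```python
def nearest(point, centroids):
    """Index of the centroid closest to `point` (squared Euclidean distance)."""
    dists = [sum((a - b) ** 2 for a, b in zip(point, c)) for c in centroids]
    return dists.index(min(dists))


def kmeans(points, k, max_iters=100):
    """Lloyd's algorithm. Seeding is a simple deterministic rule chosen for
    this sketch (an assumption): sort days by total minutes and pick k
    evenly spaced days as the initial centroids."""
    ordered = sorted(points, key=sum)
    n = len(ordered)
    if k == 1:
        centroids = [list(ordered[n // 2])]
    else:
        centroids = [list(ordered[round(i * (n - 1) / (k - 1))]) for i in range(k)]

    labels = None
    for _ in range(max_iters):
        # Assignment step: each day goes to its nearest centroid.
        new_labels = [nearest(p, centroids) for p in points]
        if new_labels == labels:
            break  # assignments stopped changing, so the algorithm has converged
        labels = new_labels
        # Update step: each centroid becomes the mean profile of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical clinic days: minutes of active documentation in four time blocks.
days = [
    [5, 10, 10, 5], [6, 9, 11, 4], [4, 11, 9, 6],            # low-volume days
    [30, 40, 45, 30], [28, 42, 44, 31], [32, 38, 46, 29],    # high-volume days
]
labels, centroids = kmeans(days, k=2)
print(labels)
print(centroids)
```

Each centroid is the average daily profile of one cluster, which is how a pattern can be summarized by its typical volume of documentation time.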

RESULTS

We identified 8 distinct daily documentation patterns; the 3 most common patterns accounted for 80.6% of PCC-days and differed primarily in average volume of documentation time (78.1 minutes per day; 35.4 minutes per day; 144.6 minutes per day); associations with note efficiency were mixed. PCCs with >80% of days attributable to a single pattern demonstrated significantly more efficient documentation than PCCs with lower consistency; for high-consistency PCCs, days that deviated from their usual patterns were associated with less efficient documentation.
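The consistency measure in these results can be sketched as follows, assuming each clinic day has already been assigned a pattern label. The function name, the labels, and the example data are hypothetical; only the 80% threshold comes from the abstract.

```python
from collections import Counter


def modal_pattern(day_labels):
    """Return the most frequent pattern and the share of days it accounts for."""
    counts = Counter(day_labels)
    pattern, n_days = counts.most_common(1)[0]
    return pattern, n_days / len(day_labels)


# Hypothetical clinician: 9 of 10 days follow pattern "A", one day follows "B".
clinician_days = ["A"] * 9 + ["B"]

pattern, share = modal_pattern(clinician_days)
high_consistency = share > 0.80  # threshold reported in the abstract
# Days that deviate from the clinician's usual (modal) pattern.
deviating_days = [i for i, lab in enumerate(clinician_days) if lab != pattern]

print(pattern, share, high_consistency, deviating_days)
```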

DISCUSSION

We found substantial variation in efficiency across daily documentation patterns, suggesting that PCC-level factors like EHR facility and consistency may be more important than when documentation occurs. There were substantial efficiency returns to consistency, and deviations from consistent patterns were costly.

CONCLUSION

Organizational leaders aiming to reduce documentation burden should pay specific attention to the ability for PCCs to execute consistent documentation patterns day-to-day.

PubMed: 38905016
DOI: 10.1093/jamia/ocae156

Influence of automated indexing in Medical Subject Headings (MeSH) selection for pharmacy practice journals.

Research in Social & Administrative... Jun 2024

Authors: Fernando Fernandez-Llimos, Luciana G Negrão, Christine Bond...

BACKGROUND

The Medical Subject Headings (MeSH) thesaurus is the controlled vocabulary used to index articles in MEDLINE. MeSH terms were mainly selected manually until June 2022, when an automated algorithm, the Medical Text Indexer (MTI), was fully implemented. A selection of automatically indexed articles is then reviewed (curated) by human indexers to ensure the quality of the process.

OBJECTIVE

To describe the association of MEDLINE indexing methods (i.e., manual, automated, and automated + curated) with MeSH assignment in pharmacy practice journals compared with medical journals.

METHODS

Original research articles published between 2016 and 2023 in two groups of journals (i.e., the Big-five general medicine and three pharmacy practice journals) were selected from PubMed using journal-specific search strategies. Metadata of the articles, including MeSH terms and indexing method, was extracted. A list of pharmacy-specific MeSH terms had been compiled from previously published studies, and their presence in pharmacy practice journal records was investigated. Using bivariate and multivariate analyses, as well as effect size measures, the number of MeSH per article was compared between journal groups, geographic origin of the journal, and indexing method.

RESULTS

A total of 8479 original research articles was retrieved: 6254 from the medical journals and 2225 from pharmacy practice journals. The number of articles indexed by the various methods was disproportionate; 77.8 % of medical and 50.5 % of pharmacy practice articles were manually indexed. Among those indexed using the automated system, 51.1 % of medical and 10.9 % of pharmacy practice articles were then curated to ensure the indexing quality. The number of MeSH terms per article varied among the three indexing methods for medical and pharmacy journals, with 15.5 vs. 13.0 in manually indexed, 9.4 vs. 7.4 in automated indexed, and 12.1 vs. 7.8 in automated and then curated articles, respectively. Multivariate analysis showed a significant effect of indexing method and journal group on the number of MeSH terms attributed, but not of the geographical origin of the journal.

CONCLUSIONS

Articles indexed using automated MTI have fewer MeSH terms than manually indexed articles. Articles published in pharmacy practice journals were indexed with fewer MeSH terms than general medical journal articles, regardless of the indexing method used.

PubMed: 38902136
DOI: 10.1016/j.sapharm.2024.06.003

Overview of the multispecies ovary tissue histology electronic repository (MOTHER).

Biology of Reproduction Jun 2024

Authors: Karen H Watanabe, Suzanne B Dietrich, Yian Ding...

The Multispecies Ovary Tissue Histology Electronic Repository (MOTHER) is a publicly accessible repository of ovary histology images. MOTHER includes hundreds of images from nonhuman primates, as well as ovary histology images from an expanding range of other species. Along with an image, MOTHER provides metadata about the image, and for selected species, follicle identification annotations. Ongoing work includes assisting scientists with contributing their histology images, creation of manual and automated (via machine learning) processing pipelines to identify and count ovarian follicles in different stages of development, and the incorporation of that data into the MOTHER database (MOTHER-DB). MOTHER will be a critical data repository storing and disseminating high-value histology images that are essential for research into ovarian function, fertility, and intra-species variability.

PubMed: 38900906
DOI: 10.1093/biolre/ioae101

MASH-Ocean 1.0: Interactive platform for investigating microbial diversity, function, and biogeography with marine metagenomic data.

IMeta Jun 2024

Authors: Yinzhao Wang, Liuyang Li, Qiang Li...

A large volume of oceanic metagenomic data and environmental metadata has been published. However, most studies focused on limited ecosystems using different analysis tools, making it challenging to integrate these data into robust results and a comprehensive global understanding of the marine microbiome. Here, we constructed a systematic and quantitative analysis platform, the Microbiome Atlas/Sino-Hydrosphere for Ocean Ecosystem (MASH-Ocean: https://www.biosino.org/mash-ocean/), by integrating global marine metagenomic data and a unified data processing flow. MASH-Ocean 1.0 comprises 2147 metagenomic samples with five analysis modules: sample view, diversity, function, biogeography, and interaction network. This platform provides convenient and stable support for researchers in microbiology, environmental science, and biogeochemistry. It aims to ensure the integration of omics data generated from hydrosphere ecosystems, to bridge the gap between elusive omics data and biological, ecological, and geological discovery, and ultimately to foster the formation of a comprehensive atlas for aquatic environments.

PubMed: 38898978
DOI: 10.1002/imt2.201

Creating a next-generation phenotype library: the health data research UK Phenotype Library.

JAMIA Open Jul 2024

Authors: Daniel S Thayer, Shahzad Mumtaz, Muhammad A Elmessary...

OBJECTIVE

To enable reproducible research at scale by creating a platform that enables health data users to find, access, curate, and re-use electronic health record phenotyping algorithms.

MATERIALS AND METHODS

We undertook a structured approach to identifying requirements for a phenotype algorithm platform by engaging with key stakeholders. User experience analysis was used to inform the design, which we implemented as a web application featuring a novel metadata standard for defining phenotyping algorithms, access via Application Programming Interface (API), support for computable data flows, and version control. The application has creation and editing functionality, enabling researchers to submit phenotypes directly.

RESULTS

We created and launched the Phenotype Library in October 2021. The platform currently hosts 1049 phenotype definitions defined against 40 health data sources and >200K terms across 16 medical ontologies. We present several case studies demonstrating its utility for supporting and enabling research: the library hosts curated phenotype collections for the BREATHE respiratory health research hub and the Adolescent Mental Health Data Platform, and it is supporting the development of an informatics tool to generate clinical evidence for clinical guideline development groups.

DISCUSSION

This platform makes an impact by being open to all health data users and accepting all appropriate content, as well as implementing key features that have not been widely available, including managing structured metadata, access via an API, and support for computable phenotypes.

CONCLUSIONS

We have created the first openly available, programmatically accessible resource enabling the global health research community to store and manage phenotyping algorithms. Removing barriers to describing, sharing, and computing phenotypes will help unleash the potential benefit of health data for patients and the public.

PubMed: 38895652
DOI: 10.1093/jamiaopen/ooae049

Enhancing Medical Image Retrieval with UMLS-Integrated CNN-Based Text Indexing.

Diagnostics (Basel, Switzerland) Jun 2024

Authors: Karim Gasmi, Hajer Ayadi, Mouna Torjmen...

In recent years, Convolutional Neural Network (CNN) models have demonstrated notable advancements in various domains such as image classification and Natural Language Processing (NLP). Despite their success in image classification tasks, their potential impact on medical image retrieval, particularly in text-based medical image retrieval (TBMIR) tasks, has not yet been fully realized. This could be attributed to the complexity of the ranking process, as there is ambiguity in treating TBMIR as an image retrieval task rather than a traditional information retrieval or NLP task. To address this gap, our paper proposes a novel approach to re-ranking medical images using a Deep Matching Model (DMM) and Medical-Dependent Features (MDF). These features incorporate categorical attributes such as medical terminologies and imaging modalities. Specifically, our DMM aims to generate effective representations for query and image metadata using a personalized CNN, facilitating matching between these representations. By using MDF, a semantic similarity matrix based on Unified Medical Language System (UMLS) meta-thesaurus, and a set of personalized filters taking into account some ranking features, our deep matching model can effectively consider the TBMIR task as an image retrieval task, as previously mentioned. To evaluate our approach, we performed experiments on the medical ImageCLEF datasets from 2009 to 2012. The experimental results show that the proposed model significantly enhances image retrieval performance compared to the baseline and state-of-the-art approaches.

PubMed: 38893730
DOI: 10.3390/diagnostics14111204

Automated Classification of Physiologic, Glaucomatous, and Glaucoma-Suspected Optic Discs Using Machine Learning.

Diagnostics (Basel, Switzerland) May 2024

Authors: Raphael Diener, Alexander W Renz, Florian Eckhard...

In order to generate a machine learning algorithm (MLA) that can support ophthalmologists with the diagnosis of glaucoma, a carefully selected dataset that is based on clinically confirmed glaucoma patients as well as borderline cases (e.g., patients with suspected glaucoma) is required. The clinical annotation of datasets is usually performed at the expense of the data volume, which results in poorer algorithm performance. This study aimed to evaluate the application of an MLA for the automated classification of physiological optic discs (PODs), glaucomatous optic discs (GODs), and glaucoma-suspected optic discs (GSODs). Annotation of the data to the three groups was based on the diagnosis made in clinical practice by a glaucoma specialist. Color fundus photographs and 14 types of metadata (including visual field testing, retinal nerve fiber layer thickness, and cup-disc ratio) of 1168 eyes from 584 patients (POD = 321, GOD = 336, GSOD = 310) were used for the study. Machine learning (ML) was performed in the first step with the color fundus photographs only and in the second step with the images and metadata. Sensitivity, specificity, and accuracy of the classification of GSOD vs. GOD and POD vs. GOD were evaluated. Classification of GOD vs. GSOD and GOD vs. POD performed in the first step had AUCs of 0.84 and 0.88, respectively. By combining the images and metadata, the AUCs increased to 0.92 and 0.99, respectively. By combining images and metadata, excellent performance of the MLA can be achieved despite having only a small amount of data, thus supporting ophthalmologists with glaucoma diagnosis.
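For readers unfamiliar with the AUC values quoted above, the sketch below computes the area under the ROC curve from classifier scores using its pairwise (Mann-Whitney) interpretation: the probability that a randomly chosen positive case scores higher than a randomly chosen negative case, with ties counted as one half. The function name and the scores are hypothetical and are not taken from the study.

```python
def auc(pos_scores, neg_scores):
    """Area under the ROC curve via pairwise comparison of scores."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0      # positive case ranked above negative case
            elif p == n:
                wins += 0.5      # ties count as half
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical classifier scores for glaucomatous (positive) and
# physiologic (negative) optic discs.
god_scores = [0.9, 0.8, 0.4]
pod_scores = [0.7, 0.3, 0.2]

result = auc(god_scores, pod_scores)
print(round(result, 3))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two groups.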

PubMed: 38893600
DOI: 10.3390/diagnostics14111073

A Comprehensive View of Food Microbiota: Introducing FoodMicrobionet v5.

Foods (Basel, Switzerland) May 2024

Authors: Eugenio Parente, Annamaria Ricciardi

Amplicon-targeted metagenomics is now the standard approach for the study of the composition and dynamics of food microbial communities. Hundreds of papers on this subject have been published in scientific journals and the information is dispersed in a variety of sources, while raw sequences and their metadata are available in public repositories for some, but not all, of the published studies. A limited number of web resources and databases allow scientists to access this wealth of information but their level of annotation on studies and samples varies. Here, we report on the release of FoodMicrobionet v5, a comprehensive database of metataxonomic studies on bacterial and fungal communities of foods. The current version of the database includes 251 published studies (11 focusing on fungal microbiota, 230 on bacterial microbiota, and 10 providing data for both bacterial and fungal microbiota) and 14,035 samples with data on bacteria and 1114 samples with data on fungi. The new structure of the database is compatible with interactive apps and scripts developed for previous versions and allows scientists, R&D personnel in industries and regulators to access a wealth of information on food microbial communities.

PubMed: 38890917
DOI: 10.3390/foods13111689

Daily station-level records of air temperature, snow depth, and ground temperature in the Northern Hemisphere.

Scientific Data Jun 2024

Authors: Vinh Ngoc Tran, Wenbo Zhou, Taeho Kim...

Air temperature (Ta), snow depth (Sd), and soil temperature (Tg) are crucial variables for studying the above- and below-ground thermal conditions, especially in high latitudes. However, in-situ observations are frequently sparse and inconsistent across various datasets, with a significant amount of missing data. This study has assembled a comprehensive dataset of in-situ observations of Ta, Sd, and Tg for the Northern Hemisphere (higher than 30°N latitude), spanning 1960-2021. This dataset encompasses metadata and daily data time series for 27,768, 32,417, and 659 gages for Ta, Sd, and Tg, respectively. Using the ERA5-Land reanalysis data product, we applied deep learning methodology to reconstruct the missing data that account for 54.5%, 59.3%, and 74.3% of Ta, Sd, and Tg daily time series, respectively. The obtained high temporal resolution dataset can be used to better understand physical phenomena and relevant mechanisms, such as the dynamics of land-surface-atmosphere energy exchange, snowpack, and permafrost.

PubMed: 38890309
DOI: 10.1038/s41597-024-03483-x