metadata - OpenMD.com Journal Search

The microgeo: an R package rapidly displays the biogeography of soil microbial community traits on maps.

FEMS Microbiology Ecology Jun 2024

Authors: Chaonan Li, Chi Liu, Hankang Li...

Many R packages provide statistical approaches for elucidating the diversity of soil microbes, yet they still struggle to visualize microbial traits on a geographical map. This creates challenges in interpreting microbial biogeography on a regional scale, especially when the spatial scale is large or the distribution of sampling sites is uneven. Here, we developed a lightweight, flexible, and user-friendly R package called microgeo. This package integrates many functions involved in reading, manipulating, and visualizing geographical boundary data; downloading spatial datasets; and calculating microbial traits and rendering them onto a geographical map using grid-based visualization, spatial interpolation, or machine learning. Using this R package, users can visualize any trait calculated by microgeo or other tools on a map and can analyze microbiome data in conjunction with metadata derived from a geographical map. In contrast to other R packages that statistically analyze microbiome data, microgeo provides more-intuitive approaches in illustrating the biogeography of soil microbes on a large geographical scale, serving as an important supplement to statistically driven comparisons and facilitating the biogeographic analysis of publicly accessible microbiome data at a large spatial scale in a more convenient and efficient manner. The microgeo R package can be installed from the Gitee (https://gitee.com/bioape/microgeo) and GitHub (https://github.com/ChaonanLi/microgeo) repositories. Detailed tutorials for the microgeo R package are available at https://chaonanli.github.io/microgeo.

Topics: Soil Microbiology; Microbiota; Software; Bacteria; Phylogeography

PubMed: 38866720
DOI: 10.1093/femsec/fiae087

Partial spike gene sequencing for the identification of SARS-CoV-2 variants circulating in Cameroon in 2021.

Journal of Infection in Developing... May 2024

Authors: Chavely Gwladys Monamele, Serge Alain Sadeuh-Mba, Pauliana Vanessa Ilouga...

INTRODUCTION

Global monitoring of severe acute respiratory syndrome related coronavirus 2 (SARS-CoV-2) genetic sequences and associated metadata is essential for coronavirus disease 2019 (COVID-19) response. Therefore, Sanger's partial genome sequencing technique was used to monitor the circulating variants of SARS-CoV-2 in Cameroon.

METHODOLOGY

Nasopharyngeal specimens were collected from persons with suspected SARS-CoV-2 infection, following the national guidelines, between January and December 2021. All specimens with a cycle threshold (Ct) below 30 after amplification were eligible for sequencing of the partial spike (S) gene of SARS-CoV-2 using the Sanger sequencing method.

RESULTS

During the year 2021, 1481 real time reverse transcriptase polymerase chain reaction (RT-PCR) SARS-CoV-2 positive samples were selected for partial sequencing of the S gene of SARS-CoV-2. Amongst these, 878 yielded good sequencing products. A total of 231 probable variants (26.3%) were identified. The variants were mainly represented by Delta (70.6%), Alpha (15.6%), Omicron (7.4%), Beta (3.5%), Mu (1.7%) and Gamma (0.4%). Phylogenetic analysis of the probable variants from Cameroon with reference strains confirmed that all prior and current variants of concern (VOC) clustered with their respective reference sequences.
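The reported 26.3% can be read against more than one denominator, so a quick check helps. The sketch below recomputes the proportions from the counts given in the abstract; the helper name `pct` is ours, and the sequencing success rate is a derived figure that the abstract does not state.

```python
def pct(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)

selected = 1481           # RT-PCR positive samples selected for sequencing
sequenced = 878           # samples that yielded good sequencing products
probable_variants = 231   # probable variants identified

# Matches the reported 26.3%, so the denominator is the 878 sequenced samples.
print(pct(probable_variants, sequenced))

# Derived here, not reported in the abstract: share of selected samples
# that produced usable sequence.
print(pct(sequenced, selected))
```

Using all 1481 selected samples as the denominator would give about 15.6% instead, which is why the 878 sequenced samples appear to be the intended base.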

CONCLUSIONS

The surveillance strategy implemented in Cameroon, based on partial sequencing of the S gene, enabled identification of the major circulating variants and provided information on their distribution, which contributed to implementing public health measures to control disease spread in the country.

Topics: Humans; Cameroon; SARS-CoV-2; Spike Glycoprotein, Coronavirus; COVID-19; Male; Female; Adult; Adolescent; Child; Middle Aged; Young Adult; Child, Preschool; Nasopharynx; Aged; Phylogeny; Infant

PubMed: 38865404
DOI: 10.3855/jidc.18155

[Structural Topic Modeling Analysis of Patient Safety Interest among Health Consumers in Social Media].

Journal of Korean Academy of Nursing May 2024

Authors: Nari Kim, Nam-Ju Lee

PURPOSE

This study aimed to investigate healthcare consumers' interest in patient safety on social media using structural topic modeling (STM) and to identify changes in interest over time.

METHODS

Analyzing 105,727 posts from Naver news comments, blogs, internet cafés, and Twitter between 2010 and 2022, this study deployed a Python script for data collection and preprocessing. STM analysis was conducted using R, with the documents' publication years serving as metadata to trace the evolution of discussions on patient safety.

RESULTS

The analysis identified a total of 13 distinct topics, organized into three primary communities: (1) "Demand for systemic improvement of medical accidents," underscoring the need for legal and regulatory reform to enhance accountability; (2) "Efforts of the government and organizations for safety management," highlighting proactive risk mitigation strategies; and (3) "Medical accidents exposed in the media," reflecting widespread concerns over medical negligence and its repercussions. These findings indicate pervasive concerns regarding medical accountability and transparency among healthcare consumers.

CONCLUSION

The findings emphasize the importance of transparent healthcare policies and practices that openly address patient safety incidents. There is clear advocacy for policy reforms aimed at increasing the accountability and transparency of healthcare providers. Moreover, this study highlights the significance of educational and engagement initiatives involving healthcare consumers in fostering a culture of patient safety. Integrating consumer perspectives into patient safety strategies is crucial for developing a robust safety culture in healthcare.

Topics: Humans; Social Media; Patient Safety

PubMed: 38863193
DOI: 10.4040/jkan.23156

KoNA: Korean Nucleotide Archive as A New Data Repository for Nucleotide Sequence Data.

Genomics, Proteomics & Bioinformatics May 2024

Authors: Gunhwan Ko, Jae Ho Lee, Young Mi Sim...

During the last decade, the generation and accumulation of petabase-scale high-throughput sequencing data have resulted in great challenges, including access to human data, as well as transfer, storage, and sharing of enormous amounts of data. To promote data-driven biological research, the Korean government announced that all biological data generated from government-funded research projects should be deposited at the Korea BioData Station (K-BDS), which consists of multiple databases for individual data types. Here, we introduce the Korean Nucleotide Archive (KoNA), a repository of nucleotide sequence data. As of July 2022, the Korean Read Archive in KoNA has collected over 477 TB of raw next-generation sequencing data from national genome projects. To ensure data quality and prepare for international alignment, a standard operating procedure was adopted, which is similar to that of the International Nucleotide Sequence Database Collaboration. The standard operating procedure includes quality control processes for submitted data and metadata using an automated pipeline, followed by manual examination. To ensure fast and stable data transfer, a high-speed transmission system called GBox is used in KoNA. Furthermore, the data uploaded to or downloaded from KoNA through GBox can be readily processed using a cloud computing service called Bio-Express. This seamless coupling of KoNA, GBox, and Bio-Express enhances the data experience, including submission, access, and analysis of raw nucleotide sequences. KoNA not only satisfies the unmet needs for a national sequence repository in Korea but also provides datasets to researchers globally and contributes to advances in genomics. The KoNA is available at https://www.kobic.re.kr/kona/.

Topics: Republic of Korea; Databases, Nucleic Acid; Humans; High-Throughput Nucleotide Sequencing

PubMed: 38862433
DOI: 10.1093/gpbjnl/qzae017

Automated selection of abdominal MRI series using a DICOM metadata classifier and selective use of a pixel-based classifier.

Abdominal Radiology (New York) Jun 2024

Authors: Chad M Miller, Zhe Zhu, Maciej A Mazurowski...

Accurate, automated MRI series identification is important for many applications, including display ("hanging") protocols, machine learning, and radiomics. Using either the series description or a pixel-based classifier alone has limitations. We demonstrate a combined approach utilizing a DICOM metadata-based classifier and selective use of a pixel-based classifier to identify abdominal MRI series. The metadata classifier was assessed alone (Group metadata) and combined with selective use of the pixel-based classifier for predictions with less than 70% certainty (Group combined). The overall accuracies (mean, with 95% confidence interval) for Groups metadata and combined on the test dataset were 0.870 (95% CI 0.824, 0.912) and 0.930 (95% CI 0.893, 0.963), respectively. With this combined metadata and pixel-based approach, we demonstrate classification accuracy of 95% or greater for all pre-contrast MRI series and improved performance for some post-contrast series.
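The selective use of the pixel-based classifier can be pictured as a two-stage cascade. The following is a minimal sketch, not the authors' implementation: the function names, the series labels, and the toy classifiers are hypothetical, and it assumes that a low-certainty metadata prediction is simply replaced by the pixel-based one (the abstract does not say how the two are combined).

```python
from typing import Callable, Dict, Tuple

Prediction = Tuple[str, float]  # (series label, certainty in [0, 1])

def classify_series(
    metadata: Dict[str, str],
    pixels: object,
    metadata_clf: Callable[[Dict[str, str]], Prediction],
    pixel_clf: Callable[[object], Prediction],
    threshold: float = 0.70,
) -> Tuple[str, str]:
    """Return (label, source), where source records which classifier decided."""
    label, certainty = metadata_clf(metadata)
    if certainty >= threshold:
        return label, "metadata"
    # Assumption: below the threshold, the pixel-based prediction
    # replaces the metadata-based one outright.
    pixel_label, _ = pixel_clf(pixels)
    return pixel_label, "pixel"

# Toy stand-ins for trained models (hypothetical, for illustration only).
def toy_metadata_clf(metadata: Dict[str, str]) -> Prediction:
    if "T2" in metadata.get("SeriesDescription", ""):
        return "t2", 0.95
    return "unknown", 0.40

def toy_pixel_clf(pixels: object) -> Prediction:
    return "portal_venous", 0.90

print(classify_series({"SeriesDescription": "AX T2 FS"}, None,
                      toy_metadata_clf, toy_pixel_clf))
print(classify_series({"SeriesDescription": "POST"}, None,
                      toy_metadata_clf, toy_pixel_clf))
```

In this arrangement the more expensive pixel-based model runs only for the series the metadata classifier is unsure about, which is one plausible reading of "selective use".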

PubMed: 38860997
DOI: 10.1007/s00261-024-04379-5

Within-host influenza viral diversity in the pediatric population as a function of age, vaccine, and health status.

Virus Evolution 2024

Authors: Ashley Sobel Leonard, Lydia Mendoza, Alexander G McFarland...

Seasonal influenza virus predominantly evolves through antigenic drift, marked by the accumulation of mutations at antigenic sites. Because of antigenic drift, influenza vaccines are frequently updated, though their efficacy may still be limited due to strain mismatches. Despite the high levels of viral diversity observed across populations, most human studies reveal limited intrahost diversity, leaving the origin of population-level viral diversity unclear. Previous studies show host characteristics, such as immunity, might affect within-host viral evolution. Here we investigate influenza A viral diversity in children aged between 6 months and 18 years. Influenza virus evolution in children is less well characterized than in adults, yet may be associated with higher levels of viral diversity given the lower level of pre-existing immunity and longer durations of infection in children. We obtained influenza isolates from banked influenza A-positive nasopharyngeal swabs collected at the Children's Hospital of Philadelphia during the 2017-18 influenza season. Using next-generation sequencing, we evaluated the population of influenza viruses present in each sample. We characterized within-host viral diversity using the number and frequency of intrahost single-nucleotide variants (iSNVs) detected in each sample. We related viral diversity to clinical metadata, including subjects' age, vaccination status, and comorbid conditions, as well as sample metadata such as virus strain and cycle threshold. Consistent with previous studies, most samples contained low levels of diversity, with no clear association with the subjects' age, vaccination status, or health status. Further, there was no enrichment of iSNVs near known antigenic sites. Taken together, these findings are consistent with previous observations that the majority of intrahost influenza virus infection is characterized by low viral diversity without evidence of diversifying selection.

PubMed: 38859985
DOI: 10.1093/ve/veae034

RefAI: a GPT-powered retrieval-augmented generative tool for biomedical literature recommendation and summarization.

Journal of the American Medical... Jun 2024

Authors: Yiming Li, Jeff Zhao, Manqi Li...

OBJECTIVES

Precise literature recommendation and summarization are crucial for biomedical professionals. While the latest iteration of generative pretrained transformer (GPT) incorporates 2 distinct modes (real-time search and pretrained model utilization), it encounters challenges in dealing with these tasks. Specifically, the real-time search can pinpoint some relevant articles but occasionally provides fabricated papers, whereas the pretrained model excels in generating well-structured summaries but struggles to cite specific sources. In response, this study introduces RefAI, an innovative retrieval-augmented generative tool designed to synergize the strengths of large language models (LLMs) while overcoming their limitations.

MATERIALS AND METHODS

RefAI utilized PubMed for systematic literature retrieval, employed a novel multivariable algorithm for article recommendation, and leveraged GPT-4 turbo for summarization. Ten queries under 2 prevalent topics ("cancer immunotherapy and target therapy" and "LLMs in medicine") were chosen as use cases and 3 established counterparts (ChatGPT-4, ScholarAI, and Gemini) as our baselines. The evaluation was conducted by 10 domain experts through standard statistical analyses for performance comparison.

RESULTS

The overall performance of RefAI surpassed that of the baselines across 5 evaluated dimensions (relevance and quality for literature recommendation; accuracy, comprehensiveness, and reference integration for summarization), with the majority exhibiting statistically significant improvements (P-values <.05).

DISCUSSION

RefAI demonstrated substantial improvements in literature recommendation and summarization over existing tools, addressing issues like fabricated papers, metadata inaccuracies, restricted recommendations, and poor reference integration.

CONCLUSION

By augmenting LLM with external resources and a novel ranking algorithm, RefAI is uniquely capable of recommending high-quality literature and generating well-structured summaries, holding the potential to meet the critical needs of biomedical professionals in navigating and synthesizing vast amounts of scientific literature.

PubMed: 38857454
DOI: 10.1093/jamia/ocae129

Creation of Standardized Common Data Elements for Diagnostic Tests in Infectious Disease Studies: Semantic and Syntactic Mapping.

Journal of Medical Internet Research Jun 2024

Authors: Caroline Stellmach, Sina Marie Hopff, Thomas Jaenisch...

BACKGROUND

It is necessary to harmonize and standardize data variables used in case report forms (CRFs) of clinical studies to facilitate the merging and sharing of the collected patient data across several clinical studies. This is particularly true for clinical studies that focus on infectious diseases. Public health may be highly dependent on the findings of such studies. Hence, there is an elevated urgency to generate meaningful, reliable insights, ideally based on a high sample number and quality data. The implementation of core data elements and the incorporation of interoperability standards can facilitate the creation of harmonized clinical data sets.

OBJECTIVE

This study's objective was to compare, harmonize, and standardize variables focused on diagnostic tests used as part of CRFs in 6 international clinical studies of infectious diseases and, ultimately, to make the resulting pan-study common data elements (CDEs) available for ongoing and future studies, fostering interoperability and comparability of collected data across trials.

METHODS

We reviewed and compared the metadata that comprised the CRFs used for data collection in and across all 6 infectious disease studies under consideration in order to identify CDEs. We examined the availability of international semantic standard codes within the Systematized Nomenclature of Medicine - Clinical Terms, the National Cancer Institute Thesaurus, and the Logical Observation Identifiers Names and Codes system for the unambiguous representation of diagnostic testing information that makes up the CDEs. We then proposed 2 data models that incorporate semantic and syntactic standards for the identified CDEs.

RESULTS

Of 216 variables that were considered in the scope of the analysis, we identified 11 CDEs to describe diagnostic tests (in particular, serology and sequencing) for infectious diseases: viral lineage/clade; test date, type, performer, and manufacturer; target gene; quantitative and qualitative results; and specimen identifier, type, and collection date.
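As an illustration only, the 11 CDEs listed above could be carried in a single typed record. This sketch is not one of the 2 data models proposed in the paper: the class name, field names, field types, and the example values are all assumptions made here.

```python
from dataclasses import dataclass, fields
from datetime import date
from typing import Optional

@dataclass
class DiagnosticTestRecord:
    """One diagnostic test result; field names are hypothetical."""
    viral_lineage_clade: Optional[str] = None
    test_date: Optional[date] = None
    test_type: Optional[str] = None
    test_performer: Optional[str] = None
    test_manufacturer: Optional[str] = None
    target_gene: Optional[str] = None
    quantitative_result: Optional[float] = None
    qualitative_result: Optional[str] = None
    specimen_identifier: Optional[str] = None
    specimen_type: Optional[str] = None
    specimen_collection_date: Optional[date] = None

# Illustrative values, not taken from any of the 6 studies.
record = DiagnosticTestRecord(
    test_type="RT-PCR",
    target_gene="S",
    qualitative_result="positive",
    specimen_type="nasopharyngeal swab",
    specimen_collection_date=date(2021, 1, 15),
)

# One field per CDE named in the abstract.
print(len(fields(DiagnosticTestRecord)))  # 11
```

In practice each field would also carry a code from a standard terminology, which is the semantic mapping step the abstract describes; those codes are deliberately left out here rather than guessed.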

CONCLUSIONS

The identification of CDEs for infectious diseases is the first step in facilitating the exchange and possible merging of a subset of data across clinical studies (and with that, large research projects) for possible shared analysis to increase the power of findings. The path to harmonization and standardization of clinical study data in the interest of interoperability can be paved in 2 ways. First, a map to standard terminologies ensures that each data element's (variable's) definition is unambiguous and that it has a single, unique interpretation across studies. Second, the exchange of these data is assisted by "wrapping" them in a standard exchange format, such as Fast Healthcare Interoperability Resources or the Clinical Data Interchange Standards Consortium's Clinical Data Acquisition Standards Harmonization Model.

Topics: Humans; Communicable Diseases; Semantics; Common Data Elements

PubMed: 38857066
DOI: 10.2196/50049

Genomic surveillance during the first two years of the COVID-19 pandemic - country experience and lessons learned from Türkiye.

Frontiers in Public Health 2024

Authors: Süleyman Yalçın, Yasemin Coşgun, Ege Dedeoğlu...

BACKGROUND

Türkiye confirmed its first case of SARS-CoV-2 on March 11, 2020, coinciding with the declaration of the global COVID-19 pandemic. Subsequently, Türkiye swiftly increased testing capacity and implemented genomic sequencing in 2020. This paper describes Türkiye's journey of establishing genomic surveillance as a middle-income country with limited prior sequencing capacity and analyses sequencing data from the first two years of the pandemic. We highlight the achievements and challenges experienced and distill globally relevant lessons.

METHODS

We tracked the evolution of the COVID-19 pandemic in Türkiye from December 2020 to February 2022 through a timeline and analysed epidemiological, vaccination, and testing data. To investigate the phylodynamic and phylogeographic aspects of SARS-CoV-2, we used Nextstrain to analyze 31,629 high-quality genomes sampled from seven regions nationwide.

RESULTS

Türkiye's epidemiological curve, mirroring global trends, featured four distinct waves, each coinciding with the emergence and spread of variants of concern (VOCs). Utilizing locally manufactured kits to expand testing capacity and introducing variant-specific quantitative reverse transcription polymerase chain reaction (RT-qPCR) tests developed in partnership with a private company was a strategic advantage in Türkiye, given the scarcity and fragmented global supply chain early in the pandemic. Türkiye contributed more than 86,000 genomic sequences to global databases by February 2022, ensuring that Turkish data was reflected globally. The synergy of variant-specific RT-qPCR kits and genomic sequencing enabled cost-effective monitoring of VOCs. However, data analysis was constrained by a weak sequencing sampling strategy and fragmented data management systems, limiting the application of sequencing data to guide the public health response. Phylodynamic analysis indicated that Türkiye's geographical position as an international travel hub influenced both national and global transmission of each VOC despite travel restrictions.

CONCLUSION

This paper provides valuable insights into the testing and genomic surveillance systems adopted by Türkiye during the COVID-19 pandemic, proposing important lessons for countries developing national systems. The findings underscore the need for robust testing and sampling strategies, streamlined sample referral, and integrated data management with metadata linkage and data quality crucial for impactful epidemiological analysis. We recommend developing national genomic surveillance strategies to guide sustainable and integrated expansion of capacities built for COVID-19 and to optimize the effective utilization of sequencing data for public health action.

Topics: Humans; COVID-19; SARS-CoV-2; Genomics; Pandemics; Genome, Viral; Male

PubMed: 38855447
DOI: 10.3389/fpubh.2024.1332109

A one health approach for monitoring antimicrobial resistance: developing a national freshwater pilot effort.

Frontiers in Water May 2024

Authors: Alison M Franklin, Daniel L Weller, Lisa M Durso...

Antimicrobial resistance (AMR) is a world-wide public health threat that is projected to lead to 10 million annual deaths globally by 2050. The AMR public health issue has led to the development of action plans to combat AMR, including improved antimicrobial stewardship, development of new antimicrobials, and advanced monitoring. The National Antimicrobial Resistance Monitoring System (NARMS), led by the United States (U.S.) Food and Drug Administration along with the U.S. Centers for Disease Control and U.S. Department of Agriculture, has monitored antimicrobial resistant bacteria in retail meats, humans, and food animals since the mid-1990s. NARMS is currently exploring an integrated One Health monitoring model recognizing that human, animal, plant, and environmental systems are linked to public health. Since 2020, the U.S. Environmental Protection Agency has led an interagency NARMS environmental working group (EWG) to implement a surface water AMR monitoring program (SWAM) at watershed and national scales. The NARMS EWG divided the development of the environmental monitoring effort into five areas: (i) defining objectives and questions, (ii) developing the study/sampling design, (iii) selecting AMR indicators, (iv) establishing analytical methods, and (v) developing data management/analytics/metadata plans. For each of these areas, the consensus among the scientific community and literature was reviewed and carefully considered prior to the development of this environmental monitoring program. The data produced from the SWAM effort will help develop robust surface water monitoring programs with the goal of assessing public health risks associated with AMR pathogens in surface water (e.g., recreational water exposures), provide a comprehensive picture of how resistant strains are related spatially and temporally within a watershed, and help assess how anthropogenic drivers and intervention strategies impact the transmission of AMR within human, animal, and environmental systems.

PubMed: 38855419
DOI: 10.3389/frwa.2024.1359109