Developmental Cognitive Neuroscience, Jun 2024
Integrating developmental neuroscience with community-engaged approaches to address mental health outcomes for housing-insecure youth: Implications for research, practice, and policy.
One in three children in the United States is exposed to insecure housing conditions, including unaffordable, inconsistent, and unsafe housing. These exposures have detrimental impacts on youth mental health. Delineating the neurobehavioral pathways linking exposure to housing insecurity with children's mental health has the potential to inform interventions and policy. However, in approaching this work, carefully considering the lived experiences of youth and families is essential to translating scientific discovery to improve health outcomes in an equitable and representative way. In the current paper, we provide an introduction to the range of stressful experiences that children may face when exposed to insecure housing conditions. Next, we highlight findings from the early-life stress literature regarding the potential neurobehavioral consequences of insecure housing, focusing on how unpredictability is associated with the neural circuitry supporting cognitive and emotional development. We then delineate how community-engaged research (CEnR) approaches have been leveraged to understand the effects of housing insecurity on mental health, and we propose future research directions that integrate developmental neuroscience research and CEnR approaches to maximize the impact of this work. We conclude by outlining practice and policy recommendations that aim to improve the mental health of children exposed to insecure housing.
PubMed: 38875770
DOI: 10.1016/j.dcn.2024.101399
JMIR AI, Feb 2024
BACKGROUND
The use of artificial intelligence (AI) for pain assessment has the potential to address historical challenges in infant pain assessment. There is a dearth of information on the perceived benefits and barriers to the implementation of AI for neonatal pain monitoring in the neonatal intensive care unit (NICU) from the perspective of health care professionals (HCPs) and parents. This qualitative analysis provides novel data obtained from 2 large tertiary care hospitals in Canada and the United Kingdom.
OBJECTIVE
The aim of the study is to explore the perspectives of HCPs and parents regarding the use of AI for pain assessment in the NICU.
METHODS
In total, 20 HCPs and 20 parents of preterm infants were recruited between February 2020 and October 2022 and consented to participate in interviews about the use of AI for pain assessment in the NICU, the potential benefits of the technology, and potential barriers to its use.
RESULTS
The 40 participants included 20 HCPs (17 women and 3 men) with an average of 19.4 (SD 10.69) years of experience in the NICU and 20 parents (mean age 34.4, SD 5.42 years) of preterm infants who were on average 43 (SD 30.34) days old. Six themes from the perspective of HCPs were identified: regular use of technology in the NICU, concerns with regard to AI integration, the potential to improve patient care, requirements for implementation, AI as a tool for pain assessment, and ethical considerations. Seven parent themes included the potential for improved care, increased parental distress, support for parents regarding AI, the impact on parent engagement, the importance of human care, requirements for integration, and the desire for choice in its use. A consistent theme was the importance of AI as a tool to inform clinical decision-making and not replace it.
CONCLUSIONS
HCPs and parents expressed generally positive sentiments about the potential use of AI for pain assessment in the NICU, with HCPs highlighting important ethical considerations. This study identifies critical methodological and ethical perspectives from key stakeholders that should be noted by any team considering the creation and implementation of AI for pain monitoring in the NICU.
PubMed: 38875686
DOI: 10.2196/51535
JMIR AI, Jun 2024
BACKGROUND
The COVID-19 pandemic had a devastating global impact. In the United States, there were >98 million COVID-19 cases and >1 million resulting deaths. One consequence of COVID-19 infection has been post-COVID-19 condition (PCC). People with this syndrome, colloquially called long haulers, experience symptoms that impact their quality of life. The root cause of PCC and effective treatments for it remain unknown. Many long haulers have turned to social media for support and guidance.
OBJECTIVE
In this study, we sought to gain a better understanding of the long hauler experience by investigating what has been discussed and how information about long haulers is perceived on social media. We specifically investigated the following: (1) the range of symptoms that are discussed, (2) the ways in which information about long haulers is perceived, (3) informational and emotional support that is available to long haulers, and (4) discourse between viewers and creators. We selected YouTube as our data source due to its popularity and broad audience reach.
METHODS
We systematically gathered data from 3 different types of content creators: medical sources, news sources, and long haulers. To computationally understand the video content and viewers' reactions, we used Biterm, a topic modeling algorithm created specifically for short texts, to analyze snippets of video transcripts and all top-level comments from the comment section. To triangulate our findings about viewers' reactions, we used the Valence Aware Dictionary and Sentiment Reasoner to conduct sentiment analysis on comments from each type of content creator. We grouped the comments into positive and negative categories and generated topics for these groups using Biterm. We then manually grouped resulting topics into broader themes for the purpose of analysis.
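The comment-grouping step described above can be sketched in a few lines. This is a toy stand-in: the tiny lexicon and scoring function below are illustrative placeholders for the full Valence Aware Dictionary and Sentiment Reasoner (VADER), which produces a compound score per comment that is then thresholded into positive and negative groups before per-group topic modeling.

```python
# Toy stand-in for the VADER step: score each comment, then split the
# comments into positive and negative groups (each group would then be
# passed to Biterm for topic generation). The lexicon here is a small
# illustrative sample, not the real VADER dictionary.

LEXICON = {"helpful": 1.0, "thankful": 0.8, "amazing": 0.9,
           "rude": -0.8, "misinformation": -0.7, "dissatisfied": -0.6}

def toy_compound(comment: str) -> float:
    """Average lexicon valence over matched tokens (0.0 if none match)."""
    hits = [LEXICON[t] for t in comment.lower().split() if t in LEXICON]
    return sum(hits) / len(hits) if hits else 0.0

def split_by_sentiment(comments, threshold=0.05):
    """Split comments into positive and negative groups by compound score."""
    pos = [c for c in comments if toy_compound(c) >= threshold]
    neg = [c for c in comments if toy_compound(c) <= -threshold]
    return pos, neg

comments = ["amazing helpful video", "rude staff and misinformation everywhere"]
pos, neg = split_by_sentiment(comments)
```

In the study's pipeline, each resulting group is modeled separately, so the positive and negative themes emerge from disjoint comment sets.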
RESULTS
We organized the resulting topics into 28 themes across all sources. Examples of medical source transcript themes were Explanations in layman's terms and Biological explanations. Examples of news source transcript themes were Negative experiences and Handling the long haul. The 2 long hauler transcript themes were Taking treatments into own hands and Changes to daily life. News sources received a greater share of negative comments. A few themes of these negative comments included Misinformation and disinformation and Issues with the health care system. Similarly, negative long hauler comments were organized into several themes, including Disillusionment with the health care system and Requiring more visibility. In contrast, positive medical source comments captured themes such as Appreciation of helpful content and Exchange of helpful information. In addition, a positive theme found in long hauler comments was Community building.
CONCLUSIONS
The results of this study could help public health agencies, policy makers, organizations, and health researchers understand symptomatology and experiences related to PCC. They could also help these agencies develop their communication strategy concerning PCC.
PubMed: 38875666
DOI: 10.2196/54501
JMIR AI, May 2024
BACKGROUND
Health care-associated infections due to multidrug-resistant organisms (MDROs), such as methicillin-resistant Staphylococcus aureus (MRSA) and Clostridioides difficile (CDI), place a significant burden on our health care infrastructure.
OBJECTIVE
Screening for MDROs is an important mechanism for preventing spread but is resource intensive. The objective of this study was to develop automated tools that can predict colonization or infection risk using electronic health record (EHR) data, provide useful information to aid infection control, and guide empiric antibiotic coverage.
METHODS
We retrospectively developed a machine learning model to detect MRSA colonization and infection in undifferentiated patients at the time of sample collection from hospitalized patients at the University of Virginia Hospital. We used clinical and nonclinical features derived from on-admission and throughout-stay information from the patient's EHR data to build the model. In addition, we used a class of features derived from contact networks in EHR data; these network features can capture patients' contacts with providers and other patients, improving model interpretability and accuracy for predicting the outcome of surveillance tests for MRSA. Finally, we explored heterogeneous models tailored to different patient subpopulations, for example, those admitted to an intensive care unit or emergency department or those with specific testing histories, which can outperform a single model fit to all patients.
RESULTS
We found that penalized logistic regression performs better than other methods, and this model's performance, measured by the area under the receiver operating characteristic curve (ROC-AUC), improves by nearly 11% when we use a polynomial (second-degree) transformation of the features. Some significant features in predicting MDRO risk include antibiotic use, surgery, use of devices, dialysis, patient's comorbidity conditions, and network features. Among these, network features add the most value and improve the model's performance by at least 15%. The penalized logistic regression model with the same transformation of features also performs better than other models for specific patient subpopulations.
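The modeling step reported above can be sketched as follows: an L2-penalized logistic regression fit on degree-2 polynomial features and scored by ROC-AUC. The synthetic data are an illustrative stand-in for the study's EHR-derived clinical and network features; the label is driven by a feature interaction, which is exactly what the second-degree transformation can capture.

```python
# Sketch: L2-penalized logistic regression on degree-2 polynomial features,
# evaluated with ROC-AUC. Synthetic data, not the study's EHR features.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 5))                      # stand-in features
y = (X[:, 0] * X[:, 1] + X[:, 2] > 0).astype(int)  # interaction-driven label

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

def auc_with_degree(degree: int) -> float:
    feats = PolynomialFeatures(degree=degree, include_bias=False)
    clf = LogisticRegression(penalty="l2", C=1.0, max_iter=2000)
    clf.fit(feats.fit_transform(X_tr), y_tr)
    return roc_auc_score(y_te, clf.predict_proba(feats.transform(X_te))[:, 1])

auc_linear = auc_with_degree(1)   # no interaction terms
auc_poly = auc_with_degree(2)     # degree-2 terms capture x0*x1
```

Because the degree-2 expansion includes the x0*x1 product term, the transformed problem becomes linearly separable, illustrating why a polynomial transformation can lift ROC-AUC substantially.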
CONCLUSIONS
Our study shows that MRSA risk prediction can be conducted quite effectively by machine learning methods using clinical and nonclinical features derived from EHR data. Network features are the most predictive and provide significant improvement over prior methods. Furthermore, heterogeneous prediction models for different patient subpopulations enhance the model's performance.
PubMed: 38875598
DOI: 10.2196/48067
JMIR AI, May 2024
BACKGROUND
Large language models (LLMs) have the potential to support promising new applications in health informatics. However, practical data on sample size considerations for fine-tuning LLMs to perform specific tasks in biomedical and health policy contexts are lacking.
OBJECTIVE
This study aims to evaluate sample size and sample selection techniques for fine-tuning LLMs to support improved named entity recognition (NER) for a custom data set of conflicts of interest disclosure statements.
METHODS
A random sample of 200 disclosure statements was prepared for annotation. All "PERSON" and "ORG" entities were identified by each of the 2 raters, and once appropriate agreement was established, the annotators independently annotated an additional 290 disclosure statements. From the 490 annotated documents, 2500 stratified random samples in different size ranges were drawn. The 2500 training set subsamples were used to fine-tune a selection of language models across 2 model architectures (Bidirectional Encoder Representations from Transformers [BERT] and Generative Pre-trained Transformer [GPT]) for improved NER, and multiple regression was used to assess the relationship between sample size (sentences), entity density (entities per sentence [EPS]), and trained model performance (F-score). Additionally, single-predictor threshold regression models were used to evaluate the possibility of diminishing marginal returns from increased sample size or entity density.
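The two-predictor regression described above relates trained model performance to training-set size and entity density. A minimal numeric sketch, using ordinary least squares on synthetic placeholder data (not the study's measurements), looks like this:

```python
# Sketch of the two-predictor multiple regression: F-score modeled as a
# linear function of training-set size (sentences) and entity density
# (entities per sentence, EPS). The response below is synthetic.
import numpy as np

rng = np.random.default_rng(1)
sentences = rng.uniform(50, 800, size=120)         # training-set sizes
eps = rng.uniform(1.0, 2.0, size=120)              # entity densities
# Synthetic linear response with small noise
f_score = 0.7 + 0.0002 * sentences + 0.05 * eps + rng.normal(0, 0.01, 120)

# Design matrix with intercept column; solve by least squares
X = np.column_stack([np.ones_like(sentences), sentences, eps])
beta, *_ = np.linalg.lstsq(X, f_score, rcond=None)
intercept, b_sentences, b_eps = beta
```

The study additionally fit single-predictor threshold regressions on top of this to locate the sample size (and EPS) beyond which marginal gains diminish; that piecewise step is omitted from this sketch.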
RESULTS
Fine-tuned models ranged in topline NER performance from F-score=0.79 to F-score=0.96 across architectures. Two-predictor multiple linear regression models were statistically significant with multiple R ranging from 0.6057 to 0.7896 (all P<.001). EPS and the number of sentences were significant predictors of F-scores in all cases (P<.001), except for the GPT-2_large model, where EPS was not a significant predictor (P=.184). Model thresholds indicate points of diminishing marginal return from increased training data set sample size measured by the number of sentences, with point estimates ranging from 439 sentences for RoBERTa_large to 527 sentences for GPT-2_large. Likewise, the threshold regression models indicate a diminishing marginal return for EPS with point estimates between 1.36 and 1.38.
CONCLUSIONS
Relatively modest sample sizes can be used to fine-tune LLMs for NER tasks applied to biomedical text, and training data entity density should representatively approximate entity density in production data. Training data quality and a model architecture's intended use (text generation vs text processing or classification) may be as important as, or more important than, training data volume and model parameter size.
PubMed: 38875593
DOI: 10.2196/52095
JMIR AI, Jan 2024
BACKGROUND
The COVID-19 pandemic drove investment and research into medical imaging platforms to provide data to create artificial intelligence (AI) algorithms for the management of patients with COVID-19. Building on the success of England's National COVID-19 Chest Imaging Database, the national digital policy body (NHSX) sought to create a generalized national medical imaging platform for the development, validation, and deployment of algorithms.
OBJECTIVE
This study aims to understand international use cases of medical imaging platforms for the development and implementation of algorithms to inform the creation of England's national imaging platform.
METHODS
The National Health Service (NHS) AI Lab Policy and Strategy Team adopted a multiphased approach: (1) identification and prioritization of national AI imaging platforms; (2) Political, Economic, Social, Technological, Legal, and Environmental (PESTLE) factor analysis deep dive into national AI imaging platforms; (3) semistructured interviews with key stakeholders; (4) workshop on emerging themes and insights with the internal NHSX team; and (5) formulation of policy recommendations.
RESULTS
International use cases of national AI imaging platforms (n=7) were prioritized for PESTLE factor analysis. Stakeholders (n=13) from the international use cases were interviewed. Themes (n=8) from the semistructured interviews, including interview quotes, were analyzed with workshop participants (n=5). The outputs of the deep dives, interviews, and workshop were synthesized thematically into 8 categories with 17 subcategories. On the basis of the insights from the international use cases, policy recommendations (n=12) were developed to support the NHS AI Lab in the design and development of the English national medical imaging platform.
CONCLUSIONS
The creation of AI algorithms and of supporting technology and infrastructure, such as platforms, often occurs in isolation within countries, let alone between countries. This novel policy research project sought to bridge that gap by learning from the challenges, successes, and experience of England's international counterparts. Policy recommendations based on international learnings focused on demonstrating the benefits of the platform to secure sustainable funding, validating algorithms and infrastructure to support in situ deployment, and creating wraparound tools so that nontechnical participants, such as clinicians, can engage with algorithm creation. As health care organizations increasingly adopt technological solutions, policy makers have a responsibility to ensure that initiatives are informed by learnings from both national and international efforts and to disseminate the outcomes of their own work.
PubMed: 38875584
DOI: 10.2196/51168
JMIR AI, Mar 2024
BACKGROUND
Large curated data sets are required to leverage speech-based tools in health care. These are costly to produce, resulting in increased interest in data sharing. As speech can potentially identify speakers (ie, voiceprints), sharing recordings raises privacy concerns. This is especially relevant when working with patient data protected under the Health Insurance Portability and Accountability Act.
OBJECTIVE
We aimed to determine the reidentification risk for speech recordings, without reference to demographics or metadata, in clinical data sets considering both the size of the search space (ie, the number of comparisons that must be considered when reidentifying) and the nature of the speech recording (ie, the type of speech task).
METHODS
Using a state-of-the-art speaker identification model, we modeled an adversarial attack scenario in which an adversary uses a large data set of identified speech (hereafter, the known set) to reidentify as many unknown speakers in a shared data set (hereafter, the unknown set) as possible. We first considered the effect of search space size by attempting reidentification with various sizes of known and unknown sets using VoxCeleb, a data set with recordings of natural, connected speech from >7000 healthy speakers. We then repeated these tests with different types of recordings in each set to examine whether the nature of a speech recording influences reidentification risk. For these tests, we used our clinical data set composed of recordings of elicited speech tasks from 941 speakers.
RESULTS
We found that the risk was inversely related to the number of comparisons an adversary must consider (ie, the search space), with a positive linear correlation between the number of false acceptances (FAs) and the number of comparisons (r=0.69; P<.001). The true acceptances (TAs) stayed relatively stable, and the ratio between FAs and TAs rose from 0.02 at 1 × 10 comparisons to 1.41 at 6 × 10 comparisons, with a near 1:1 ratio at the midpoint of 3 × 10 comparisons. In effect, risk was high for a small search space but dropped as the search space grew. We also found that the nature of a speech recording influenced reidentification risk, with nonconnected speech (eg, vowel prolongation: FA/TA=98.5; alternating motion rate: FA/TA=8) being harder to identify than connected speech (eg, sentence repetition: FA/TA=0.54) in cross-task conditions. The inverse was mostly true in within-task conditions, with the FA/TA ratio for vowel prolongation and alternating motion rate dropping to 0.39 and 1.17, respectively.
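The FA/TA tally at the core of these results can be sketched as follows: compare each "unknown" speaker embedding against a gallery of "known" embeddings by cosine similarity, accept the best match if it clears a threshold, and count whether the accepted match is the correct speaker (TA) or a different one (FA). The random vectors below are illustrative stand-ins for the speaker identification model's actual embeddings, and the threshold is an arbitrary example value.

```python
# Sketch of the adversarial matching step: cosine-similarity matching of
# unknown-set embeddings against a known-set gallery, tallying true
# acceptances (TA) and false acceptances (FA). Embeddings are synthetic.
import numpy as np

rng = np.random.default_rng(2)

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

n_known, dim = 50, 32
known = rng.normal(size=(n_known, dim))            # identified recordings
# Unknown set: same speakers, perturbed embeddings (eg, a re-recording)
unknown = known + rng.normal(scale=0.3, size=known.shape)

threshold = 0.8                                    # example accept threshold
ta = fa = 0
for i, u in enumerate(unknown):
    sims = np.array([cosine(u, k) for k in known])
    best = int(np.argmax(sims))
    if sims[best] >= threshold:
        if best == i:
            ta += 1        # accepted and correct speaker: true acceptance
        else:
            fa += 1        # accepted but wrong speaker: false acceptance
fa_ta_ratio = fa / ta if ta else float("inf")
```

Growing the known-set gallery increases the number of comparisons, which is the mechanism behind the rising FA count (and FA/TA ratio) reported above.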
CONCLUSIONS
Our findings suggest that speaker identification models can be used to reidentify participants in specific circumstances, but in practice, the reidentification risk appears small. The variation in risk due to search space size and type of speech task provides actionable recommendations to further increase participant privacy and considerations for policy regarding public release of speech recordings.
PubMed: 38875581
DOI: 10.2196/52054
JMIR AI, Jun 2023
BACKGROUND
With the growing volume and complexity of laboratory repositories, it has become tedious to parse unstructured data into structured and tabulated formats for secondary uses such as decision support, quality assurance, and outcome analysis. However, advances in natural language processing (NLP) approaches have enabled efficient and automated extraction of clinically meaningful medical concepts from unstructured reports.
OBJECTIVE
In this study, we aimed to determine the feasibility of using the NLP model for information extraction as an alternative approach to a time-consuming and operationally resource-intensive handcrafted rule-based tool. Therefore, we sought to develop and evaluate a deep learning-based NLP model to derive knowledge and extract information from text-based laboratory reports sourced from a provincial laboratory repository system.
METHODS
The NLP model, a hierarchical multilabel classifier, was trained on a corpus of laboratory reports covering testing for 14 different respiratory viruses and viral subtypes. The corpus includes 87,500 unique laboratory reports annotated by 8 subject matter experts (SMEs). The classification task involved assigning the laboratory reports to labels at 2 levels: 24 fine-grained labels in level 1 and 6 coarse-grained labels in level 2. A "label" also refers to the status of a specific virus or strain being tested or detected (eg, influenza A is detected). The model's performance stability and variation were analyzed across all labels in the classification task. Additionally, the model's generalizability was evaluated internally and externally on various test sets.
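A minimal sketch of a two-level multilabel setup of this kind: a TF-IDF one-vs-rest classifier assigns fine-grained labels, and a fixed mapping rolls them up to coarse-grained labels. The toy reports and label names below are illustrative, not the study's 24-label/6-label scheme, and the study's model was a deep learning classifier rather than logistic regression.

```python
# Two-level multilabel sketch for lab-report text: fine-grained labels
# predicted from TF-IDF features, then rolled up to coarse labels.
# Toy data; label names are illustrative placeholders.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.preprocessing import MultiLabelBinarizer

reports = [
    "influenza a rna detected by pcr",
    "influenza a not detected",
    "rsv antigen detected",
    "rsv not detected specimen adequate",
] * 10  # repeat toy rows so each label has enough examples
fine_labels = [
    {"influenza_a_detected"}, {"influenza_a_negative"},
    {"rsv_detected"}, {"rsv_negative"},
] * 10

COARSE = {  # level-1 fine-grained label -> level-2 coarse-grained label
    "influenza_a_detected": "influenza",
    "influenza_a_negative": "influenza",
    "rsv_detected": "rsv",
    "rsv_negative": "rsv",
}

mlb = MultiLabelBinarizer()
Y = mlb.fit_transform(fine_labels)
vec = TfidfVectorizer()
clf = OneVsRestClassifier(LogisticRegression(max_iter=1000))
clf.fit(vec.fit_transform(reports), Y)

# Pick the highest-probability fine label, then roll up to the coarse label
probs = clf.predict_proba(vec.transform(["influenza a rna detected"]))[0]
best_fine = str(mlb.classes_[int(np.argmax(probs))])
coarse = COARSE[best_fine]
```

The roll-up dictionary is what makes the labeling hierarchical: every fine-grained prediction deterministically implies its coarse-grained parent.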
RESULTS
Overall, the NLP model performed well on internal, out-of-time (pre-COVID-19), and external (different laboratories) test sets, with microaveraged F-scores >94% across all classes. Higher precision and recall scores with less variability were observed for the internal and pre-COVID-19 test sets. As expected, the model's performance varied across categories and virus types due to the imbalanced nature of the corpus and the sample sizes per class. Because intrinsically fewer classes of viruses were detected than tested, the model's performance was noticeably lower for detected cases (lowest F-score of 57%).
CONCLUSIONS
We demonstrated that deep learning-based NLP models are promising solutions for information extraction from text-based laboratory reports. These approaches enable scalable, timely, and practical access to high-quality and encoded laboratory data if integrated into laboratory information system repositories.
PubMed: 38875570
DOI: 10.2196/44835
JMIR AI, Apr 2024
Cost, Usability, Credibility, Fairness, Accountability, Transparency, and Explainability Framework for Safe and Effective Large Language Models in Medical Education: Narrative Review and Qualitative Study.
BACKGROUND
The world has witnessed increased adoption of large language models (LLMs) in the last year. Although the products developed using LLMs have the potential to solve accessibility and efficiency problems in health care, there is a lack of available guidelines for developing LLMs for health care, especially for medical education.
OBJECTIVE
The aim of this study was to identify and prioritize the enablers for developing successful LLMs for medical education. We further evaluated the relationships among these identified enablers.
METHODS
A narrative review of the extant literature was first performed to identify the key enablers for LLM development. We additionally gathered the opinions of LLM users to determine the relative importance of these enablers using an analytical hierarchy process (AHP), which is a multicriteria decision-making method. Further, total interpretive structural modeling (TISM) was used to analyze the perspectives of product developers and ascertain the relationships and hierarchy among these enablers. Finally, the cross-impact matrix multiplication applied to classification (MICMAC) approach was used to determine the relative driving and dependence powers of these enablers. A nonprobabilistic purposive sampling approach was used for recruitment of focus groups.
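The AHP step can be sketched numerically: priority weights are derived from a pairwise comparison matrix via its principal eigenvector, and the consistency ratio (CR) checks whether the judgments are acceptably consistent (CR < 0.1, the criterion the results below also apply). The 4x4 matrix here is an illustrative example on Saaty's 1-9 scale, not the study's focus-group data.

```python
# AHP sketch: priority weights from the principal eigenvector of a
# pairwise comparison matrix, plus the consistency ratio check.
# Example judgments only, not the study's data.
import numpy as np

A = np.array([            # pairwise importance judgments (Saaty 1-9 scale)
    [1,   3,   5,   7],
    [1/3, 1,   3,   5],
    [1/5, 1/3, 1,   3],
    [1/7, 1/5, 1/3, 1],
])

eigvals, eigvecs = np.linalg.eig(A)
k = int(np.argmax(eigvals.real))                  # principal eigenvalue
lam_max = eigvals.real[k]
weights = np.abs(eigvecs[:, k].real)
weights /= weights.sum()                          # normalized priorities

n = A.shape[0]
ci = (lam_max - n) / (n - 1)                      # consistency index
RI = {3: 0.58, 4: 0.90, 5: 1.12}[n]               # Saaty's random index
cr = ci / RI                                      # consistency ratio
```

With a perfectly consistent matrix, lam_max equals n and CR is 0; the study's reported CR of 0.084 falls under the same < 0.1 acceptability bound checked here.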
RESULTS
The AHP demonstrated that the most important enabler for LLMs was credibility, with a priority weight of 0.37, followed by accountability (0.27642) and fairness (0.10572). In contrast, usability, with a priority weight of 0.04, showed negligible importance. The results of TISM concurred with the findings of the AHP. The only striking difference between expert perspectives and user preference evaluation was that the product developers indicated that cost has the least importance as a potential enabler. The MICMAC analysis suggested that cost has a strong influence on other enablers. The inputs of the focus group were found to be reliable, with a consistency ratio less than 0.1 (0.084).
CONCLUSIONS
This study is the first to identify, prioritize, and analyze the relationships of enablers of effective LLMs for medical education. Based on the results of this study, we developed a comprehensible prescriptive framework, named CUC-FATE (Cost, Usability, Credibility, Fairness, Accountability, Transparency, and Explainability), for evaluating the enablers of LLMs in medical education. The study findings are useful for health care professionals, health technology experts, medical technology regulators, and policy makers.
PubMed: 38875562
DOI: 10.2196/51834
JMIR AI, Dec 2023
BACKGROUND
Drug-induced mortality across the United States has continued to rise. To date, there are limited measures to evaluate patient preferences and priorities regarding substance use disorder (SUD) treatment, and many patients do not have access to evidence-based treatment options. Patients and their families seeking SUD treatment may begin their search for an SUD treatment facility online, where they can find information about individual facilities, as well as a summary of patient-generated web-based reviews via popular platforms such as Google or Yelp. Web-based reviews of health care facilities may reflect information about factors associated with positive or negative patient satisfaction. The association between patient satisfaction with SUD treatment and drug-induced mortality is not well understood.
OBJECTIVE
The objective of this study was to examine the association between online review content of SUD treatment facilities and drug-induced state mortality.
METHODS
A cross-sectional analysis of online reviews and ratings of Substance Abuse and Mental Health Services Administration (SAMHSA)-designated SUD treatment facilities listed between September 2005 and October 2021 was conducted. The primary outcomes were (1) mean online rating of SUD treatment facilities from 1 star (worst) to 5 stars (best) and (2) average drug-induced mortality rates from the Centers for Disease Control and Prevention (CDC) WONDER Database (2006-2019). Clusters of words with differential frequencies within reviews were identified. A 3-level linear model was used to estimate the association between online review ratings and drug-induced mortality.
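The headline association can be illustrated with a deliberately simplified stand-in for the study's 3-level model: a flat state-level regression of average rating on drug-induced mortality, on synthetic data. The slope of about -0.03 per death per 100,000 corresponds to a 0.30-point lower rating per 10-death increase, matching the scale of the reported estimate.

```python
# Simplified numeric sketch (NOT the study's 3-level model): ordinary
# least squares of average facility rating on state drug-induced
# mortality, using synthetic state-level data.
import numpy as np

rng = np.random.default_rng(3)
mortality = rng.uniform(10, 26, size=47)           # deaths/100,000 by state
rating = 4.0 - 0.03 * mortality + rng.normal(0, 0.05, 47)  # synthetic ratings

X = np.column_stack([np.ones(47), mortality])      # intercept + predictor
(intercept, slope), *_ = np.linalg.lstsq(X, rating, rcond=None)
per_10_deaths = slope * 10    # change in rating per 10-death increase
```

A 3-level model additionally nests reviews within facilities within states, so facility- and state-level variance is separated rather than pooled as here.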
RESULTS
A total of 589 SAMHSA-designated facilities (n=9597 reviews) were included in this study. State drug-induced mortality rates were classified relative to the average: approximately half (24/47, 51%) of states had below-average ("low") mortality rates (mean 13.40, SD 2.45 deaths per 100,000 people), and half (23/47, 49%) had above-average ("high") rates (mean 21.92, SD 3.69 deaths per 100,000 people). The top 5 themes associated with low drug-induced mortality included detoxification and addiction rehabilitation services (r=0.26), gratitude for recovery (r=-0.25), thankful for treatment (r=-0.32), caring staff and amazing experience (r=-0.23), and individualized recovery programs (r=-0.20). The top 5 themes associated with high mortality were care from doctors or providers (r=0.24), rude and insensitive care (r=0.23), medication and prescriptions (r=0.22), front desk and reception experience (r=0.22), and dissatisfaction with communication (r=0.21). In the multilevel linear model, a 10-deaths-per-100,000-people increase in state mortality was associated with a 0.30-point lower average Yelp rating (P=.005).
CONCLUSIONS
Lower online ratings of SUD treatment facilities were associated with higher drug-induced mortality at the state level. Elements of patient experience may be associated with state-level mortality. Identified themes from online, organically derived patient content can inform efforts to improve high-quality and patient-centered SUD care.
PubMed: 38875553
DOI: 10.2196/46317