Journal of Medical Internet Research Sep 2023 (Review)
BACKGROUND
Conversational agents (CAs), also known as chatbots, are digital dialog systems that enable people to hold a text-based, speech-based, or nonverbal natural-language conversation with a computer or other machine via an interface. CAs offer new opportunities and various benefits for health care; however, they are not yet ubiquitous in daily practice. Nevertheless, research on the implementation of CAs in health care has grown tremendously in recent years.
OBJECTIVE
This review aims to present a synthesis of the factors that facilitate or hinder the implementation of CAs from the perspectives of patients and health care professionals. Specifically, it focuses on the early implementation outcomes of acceptability, acceptance, and adoption as cornerstones of later implementation success.
METHODS
We performed an integrative review. To identify relevant literature, a broad literature search was conducted in June 2021 with no date limits and using all fields in PubMed, Cochrane Library, Web of Science, LIVIVO, and PsycINFO. To keep the review current, another search was conducted in March 2022. To identify as many eligible primary sources as possible, we used a snowballing approach by searching reference lists and conducted a hand search. Factors influencing the acceptability, acceptance, and adoption of CAs in health care were coded through parallel deductive and inductive approaches, which were informed by current technology acceptance and adoption models. Finally, the factors were synthesized in a thematic map.
RESULTS
Overall, 76 studies were included in this review. We identified influencing factors related to 4 core Unified Theory of Acceptance and Use of Technology (UTAUT) and Unified Theory of Acceptance and Use of Technology 2 (UTAUT2) factors (performance expectancy, effort expectancy, facilitating conditions, and hedonic motivation), with most studies underlining the relevance of performance and effort expectancy. To meet the particularities of the health care context, we redefined the UTAUT2 factors social influence, habit, and price value. We identified 6 other influencing factors: perceived risk, trust, anthropomorphism, health issue, working alliance, and user characteristics. Overall, we identified 10 factors influencing acceptability, acceptance, and adoption among health care professionals (performance expectancy, effort expectancy, facilitating conditions, social influence, price value, perceived risk, trust, anthropomorphism, working alliance, and user characteristics) and 13 factors influencing acceptability, acceptance, and adoption among patients (additionally hedonic motivation, habit, and health issue).
CONCLUSIONS
This review shows manifold factors influencing the acceptability, acceptance, and adoption of CAs in health care. Knowledge of these factors is fundamental for implementation planning. Therefore, the findings of this review can serve as a basis for future studies to develop appropriate implementation strategies. Furthermore, this review provides an empirical test of current technology acceptance and adoption models and identifies areas where additional research is necessary.
TRIAL REGISTRATION
PROSPERO CRD42022343690; https://www.crd.york.ac.uk/prospero/display_record.php?RecordID=343690.
Topics: Humans; Communication; Language; Habits; Speech; Delivery of Health Care
PubMed: 37751279
DOI: 10.2196/46548
Schizophrenia Research Sep 2023
BACKGROUND
Disorganization, presenting as impairment in thought, language, and goal-directed behavior, is a core multidimensional syndrome of psychotic disorders. This study examined whether scalable computational measures of spoken language and smartphone usage patterns could serve as digital biomarkers of clinical disorganization symptoms.
METHODS
In a longitudinal cohort of adults with a psychotic disorder, we examined the associations between clinical measures of disorganization and computational measures of 1) spoken language derived from monthly, semi-structured, recorded clinical interviews; and 2) smartphone usage patterns derived via passive sensing technologies over the month prior to the interview. The language features included speech quantity, rate, fluency, and semantic regularity. The smartphone features included data missingness and phone usage during sleep time. The clinical measures consisted of the Positive and Negative Syndrome Scale (PANSS) conceptual disorganization, difficulty in abstract thinking, and poor attention items. Mixed linear regression analyses were used to estimate both fixed and random effects.
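The fixed- vs random-effects distinction in the analysis above can be illustrated with a small simulation. This is a minimal sketch, not the study's analysis: it approximates a mixed model with a simple two-stage fit (a per-participant slope, then pooling), and all variable names and numbers are invented.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated monthly data: each participant has an individual baseline
# (random intercept) plus a shared fixed effect of a speech feature
# (e.g., a disfluency rate) on symptom severity.
n_subjects, n_visits = 30, 12
true_slope = 2.0
intercepts = rng.normal(0.0, 1.5, size=n_subjects)  # random effects

rows = []
for i in range(n_subjects):
    x = rng.uniform(0, 1, size=n_visits)            # speech feature
    y = intercepts[i] + true_slope * x + rng.normal(0, 0.3, size=n_visits)
    rows.append((x, y))

def per_subject_slope(x, y):
    """Ordinary least-squares slope for one participant."""
    x_c = x - x.mean()
    return float(np.dot(x_c, y - y.mean()) / np.dot(x_c, x_c))

# Two-stage approximation of a mixed model: fit each participant
# separately, then pool the slopes across participants.
slopes = [per_subject_slope(x, y) for x, y in rows]
pooled_slope = float(np.mean(slopes))
print(f"pooled fixed-effect slope ~ {pooled_slope:.2f} (true 2.0)")
print(f"between-subject slope SD ~ {np.std(slopes):.2f}")
```

A full mixed-model fit (e.g., restricted maximum likelihood) would weight participants by precision rather than averaging naively, but the two-stage version makes the fixed/random split visible.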
RESULTS
Greater severity of clinical symptoms of conceptual disorganization was associated with greater verbosity and more disfluent speech. Greater severity of conceptual disorganization was also associated with greater missingness of smartphone data, and greater smartphone usage during sleep time. While the observed associations were significant across the group, there was also significant variation between individuals.
CONCLUSIONS
The findings suggest that digital measures of speech disfluency may serve as scalable markers of conceptual disorganization. The findings warrant further investigation into the use of recorded interviews and passive sensing technologies to assist in the characterization and tracking of psychotic illness.
Topics: Adult; Humans; Psychotic Disorders; Language; Thinking; Cognition; Speech
PubMed: 36564239
DOI: 10.1016/j.schres.2022.12.003
Memory & Cognition Nov 2023
The changing state effect is the finding that a stream of irrelevant sounds that change more (e.g., different digits in random order) disrupts memory more than a stream of irrelevant sounds that change less (e.g., a single digit repeated over and over). According to the Object-Oriented Episodic Record (O-OER) model, the changing state effect will be observed only in memory tasks that have an order component or that induce serial rehearsal or serial processing. In contrast, other accounts, including the Feature Model, the Primacy Model, and various attentional theories, predict that the changing state effect should be observable when there is no order component. Experiment 1 first demonstrated that the irrelevant stimuli created for the current experiments produced a changing state effect in immediate serial recall in both on-campus and online samples. Then, three experiments assessed whether a changing state effect is observable in a surprise two-alternative forced-choice (2AFC) recognition test. Experiment 2 replicated Stokes and Arnell (2012, Memory & Cognition, 40, 918-931), who found that although irrelevant sounds reduce performance on a surprise recognition test of words presented previously in a lexical decision task, they do not produce a changing state effect. Experiments 3 and 4 used two different encoding tasks (pleasantness and frequency judgment) and also found no changing state effect. The results support the prediction of the O-OER model and provide additional evidence against the other accounts.
Topics: Humans; Speech; Mental Recall; Learning; Speech Perception; Memory, Short-Term
PubMed: 37326785
DOI: 10.3758/s13421-023-01437-z
The Clinical Neuropsychologist Oct 2023
Parkinson's disease (PD) and essential tremor (ET) involve neuroanatomical circuitry that impacts frontal lobe functioning, via the striatum and cerebellum, respectively. The aim of this exploratory study was to investigate quantitative and qualitative performance between and within these groups on measures of verbal fluency. Sixty-three PD and 53 ET patients completed neuropsychological testing. Linear regression models with robust variance estimation compared verbal fluency performance between groups with respect to correct responses and errors. Paired t-tests investigated within-group error rates. PD patients gave more correct responses for phonological (=5.3, =.01) and category fluency (=4.1, =.01) than ET patients; however, when processing speed was added as a covariate, this attenuated performance on both measures and only phonological fluency remained significant (=4.0, =.04). There were no statistical differences in error scores between groups. Error rates within groups suggested that PD patients had higher rates of total errors and perseveration errors on phonological fluency ( = 2.6, =.00; = 1.6, =.00) and higher rates of total errors and set-loss errors on category switching ( = 5.1, <.001; = 4.1, <.001). ET patients had higher rates of total errors and set-loss errors on phonological fluency ( = 2.5, =.00; = 1.5, =.02) and category switching ( = 3.9, =.00; = 3.9, <.001). PD patients performed better than ET patients on phonological fluency. PD patients appear to make more perseveration errors on phonological fluency, while ET patients made more set-loss errors. Implications for frontal lobe dysfunction and clinical impact are discussed.
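The error categories above (perseveration and set-loss errors) can be illustrated with a toy scoring routine. The scoring rules, category set, and word list below are invented for illustration and are not the study's protocol.

```python
# Hypothetical scoring of a category-fluency transcript ("animals").
# Perseveration = a response already given; set-loss = a response
# outside the target category. Rules here are illustrative only.
TARGET_SET = {"dog", "cat", "horse", "cow", "lion", "tiger", "sheep"}

def score_fluency(responses, target_set):
    seen, perseverations, set_losses, correct = set(), 0, 0, 0
    for word in (w.lower() for w in responses):
        if word in seen:
            perseverations += 1       # repeated response
        elif word not in target_set:
            set_losses += 1           # broke the category set
        else:
            correct += 1
        seen.add(word)
    return {"correct": correct, "perseveration": perseverations,
            "set_loss": set_losses}

scores = score_fluency(["dog", "cat", "dog", "apple", "lion"], TARGET_SET)
print(scores)  # {'correct': 3, 'perseveration': 1, 'set_loss': 1}
```

Clinical scoring manuals apply finer rules (e.g., morphological variants, proper nouns), but the repeat-vs-out-of-set distinction is the core of the two error types.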
Topics: Humans; Parkinson Disease; Essential Tremor; Neuropsychological Tests; Processing Speed; Verbal Behavior
PubMed: 36550679
DOI: 10.1080/13854046.2022.2157885
Scientific Reports Dec 2023
While speech biomarkers of disease have attracted increased interest in recent years, a challenge is that features derived from signal processing or machine learning approaches may lack clinical interpretability. As an example, Mel frequency cepstral coefficients (MFCCs) have been identified in several studies as a useful marker of disease, but are regarded as uninterpretable. Here we explore correlations between MFCC coefficients and more interpretable speech biomarkers. In particular we quantify the MFCC2 endpoint, which can be interpreted as a weighted ratio of low- to high-frequency energy, a concept which has been previously linked to disease-induced voice changes. By exploring MFCC2 in several datasets, we show how its sensitivity to disease can be increased by adjusting computation parameters.
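The low-vs-high-frequency interpretation of MFCC2 can be made concrete with a short sketch. This is illustrative only, not the paper's pipeline: it uses equal-width linear bands instead of a true mel filterbank, and `mfcc2` is a hypothetical helper name. "MFCC2" here means the first cosine coefficient (DCT-II index 1) in 1-indexed MFCC numbering, whose cosine basis weights low-frequency bands positively and high-frequency bands negatively.

```python
import numpy as np

def mfcc2(signal, n_bands=20, eps=1e-10):
    """First cosine (DCT-II, index 1) coefficient of log band energies.

    Simplified sketch: equal-width linear bands stand in for a mel
    filterbank, which keeps the low-vs-high weighting visible.
    """
    spectrum = np.abs(np.fft.rfft(signal)) ** 2
    bands = np.array_split(spectrum, n_bands)          # low -> high frequency
    log_e = np.log(np.array([b.sum() for b in bands]) + eps)
    m = np.arange(n_bands)
    weights = np.cos(np.pi * (m + 0.5) / n_bands)      # + at low, - at high
    return float(np.dot(log_e, weights))

sr = 8000
t = np.arange(sr) / sr
low_tone = np.sin(2 * np.pi * 100 * t)    # energy in the lowest bands
high_tone = np.sin(2 * np.pi * 3000 * t)  # energy near Nyquist
print(mfcc2(low_tone), mfcc2(high_tone))  # positive vs negative
```

Because the index-1 cosine basis sums to zero over the bands, a constant log-energy floor cancels and the coefficient reduces to a weighted low-minus-high log-energy contrast, which is the interpretability argument the abstract refers to.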
Topics: Speech; Speech Acoustics; Signal Processing, Computer-Assisted
PubMed: 38123603
DOI: 10.1038/s41598-023-49352-2
The Journal of Neuroscience: the... Aug 2023
Hearing impairment affects many older adults but is often diagnosed decades after speech comprehension in noisy situations has become effortful. Accurate assessment of listening effort may thus help diagnose hearing impairment earlier. However, pupillometry, the most used approach to assess listening effort, has limitations that hinder its use in practice. The current study explores a novel way to assess listening effort through eye movements. Building on cognitive and neurophysiological work, we examine the hypothesis that eye movements decrease when speech listening becomes challenging. In three experiments with human participants of both sexes, we demonstrate, consistent with this hypothesis, that fixation duration increases and spatial gaze dispersion decreases with increasing speech masking. Eye movements decreased during effortful speech listening for different visual scenes (free viewing, object tracking) and speech materials (simple sentences, naturalistic stories). In contrast, pupillometry was less sensitive to speech masking during story listening, suggesting pupillometric measures may not be as effective for the assessment of listening effort in naturalistic speech-listening paradigms. Our results reveal a critical link between eye movements and cognitive load, suggesting that neural activity in the brain regions that support the regulation of eye movements, such as the frontal eye field and superior colliculus, is modulated when listening is effortful. Assessment of listening effort is critical for early diagnosis of age-related hearing loss. Pupillometry is the most used approach but has several disadvantages. The current study explores a novel way to assess listening effort through eye movements. We examine the hypothesis that eye movements decrease when speech listening becomes effortful. We demonstrate, consistent with this hypothesis, that fixation duration increases and gaze dispersion decreases with increasing speech masking.
Eye movements decreased during effortful speech listening for different visual scenes (free viewing, object tracking) and speech materials (sentences, naturalistic stories). Our results reveal a critical link between eye movements and cognitive load, suggesting that neural activity in brain regions that support the regulation of eye movements is modulated when listening is effortful.
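The two eye-movement measures named above, fixation duration and spatial gaze dispersion, can be summarized numerically in several ways. A minimal sketch, assuming RMS distance from the gaze centroid as the dispersion metric (the study's exact definition may differ, and all data below are synthetic):

```python
import numpy as np

def spatial_gaze_dispersion(gaze_xy):
    """RMS distance of gaze samples from their centroid. One common
    definition of spatial dispersion; the study's metric may differ."""
    xy = np.asarray(gaze_xy, dtype=float)
    centroid = xy.mean(axis=0)
    return float(np.sqrt(((xy - centroid) ** 2).sum(axis=1).mean()))

def mean_fixation_duration(fixation_durations_ms):
    """Average duration of detected fixations, in milliseconds."""
    return float(np.mean(fixation_durations_ms))

# Toy comparison: tightly clustered gaze (as under effortful listening)
# vs widely scattered gaze (as under easy listening).
rng = np.random.default_rng(1)
clustered = rng.normal(0.0, 0.5, size=(200, 2))
scattered = rng.normal(0.0, 3.0, size=(200, 2))
print(spatial_gaze_dispersion(clustered), spatial_gaze_dispersion(scattered))
```

Under the abstract's hypothesis, harder masking would push recordings toward the clustered pattern (lower dispersion) and toward longer fixation durations.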
Topics: Male; Female; Humans; Aged; Speech; Eye Movements; Speech Perception; Auditory Perception; Noise; Speech Intelligibility
PubMed: 37491313
DOI: 10.1523/JNEUROSCI.0240-23.2023
Advanced Materials (Deerfield Beach,... Jun 2024
A wearable Braille-to-speech translation system is of great importance for providing auditory feedback to assist blind people and people with speech impairment. However, previously reported Braille-to-speech translation systems still need to be improved in terms of comfort or integration. Here, a Braille-to-speech translation system is reported that uses dual-functional electrostatic transducers made of fabric-based materials and integrable into textiles. Based on electrostatic induction, the electrostatic transducer can serve as either a tactile sensor or a loudspeaker with the same design. The proposed electrostatic transducers have excellent output performance, mechanical robustness, and working stability. By combining the devices with machine learning algorithms, it is possible to translate the Braille alphabet and 40 commonly used words (extensible) into speech with accuracies of 99.09% and 97.08%, respectively. This work demonstrates a new approach for further development of advanced assistive technology toward improving the lives of disabled people.
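To illustrate how a classifier might map tactile readings to Braille characters, here is a minimal nearest-template sketch. The paper's actual machine-learning pipeline is not described in this abstract; the dot templates, noise values, and function names below are invented, and real transducer data would be continuous, noisy time-series signals.

```python
import numpy as np

# Illustrative only: classify a synthetic 6-dot Braille touch pattern
# by nearest-template matching on raised-dot activations.
BRAILLE = {"a": (1, 0, 0, 0, 0, 0),
           "b": (1, 1, 0, 0, 0, 0),
           "c": (1, 0, 0, 1, 0, 0)}

def classify(reading, templates):
    """Return the letter whose dot template is closest (Euclidean)
    to the sensor reading."""
    best, best_dist = None, float("inf")
    for letter, dots in templates.items():
        dist = float(np.linalg.norm(np.asarray(reading, dtype=float)
                                    - np.asarray(dots, dtype=float)))
        if dist < best_dist:
            best, best_dist = letter, dist
    return best

noisy_b = [0.9, 1.1, 0.1, -0.05, 0.0, 0.08]  # a noisy press of "b"
print(classify(noisy_b, BRAILLE))
```

A learned model (as the paper uses) would replace the fixed templates with parameters fitted to recorded sensor data, which is what allows an extensible vocabulary beyond single characters.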
Topics: Textiles; Humans; Static Electricity; Wearable Electronic Devices; Speech; Equipment Design; Sensory Aids; Machine Learning
PubMed: 38502121
DOI: 10.1002/adma.202313518
Proceedings of the National Academy of... Oct 2023
Human cognition is underpinned by structured internal representations that encode relationships between entities in the world (cognitive maps). Clinical features of schizophrenia-from thought disorder to delusions-are proposed to reflect disorganization in such conceptual representations. Schizophrenia is also linked to abnormalities in neural processes that support cognitive map representations, including hippocampal replay and high-frequency ripple oscillations. Here, we report a computational assay of semantically guided conceptual sampling and exploit this to test a hypothesis that people with schizophrenia (PScz) exhibit abnormalities in semantically guided cognition that relate to hippocampal replay and ripples. Fifty-two participants [26 PScz (13 unmedicated) and 26 age-, gender-, and intelligence quotient (IQ)-matched nonclinical controls] completed a category- and letter-verbal fluency task, followed by a magnetoencephalography (MEG) scan involving a separate sequence-learning task. We used a pretrained word embedding model of semantic similarity, coupled to a computational model of word selection, to quantify the degree to which each participant's verbal behavior was guided by semantic similarity. Using MEG, we indexed neural replay and ripple power in a post-task rest session. Across all participants, word selection was strongly influenced by semantic similarity. The strength of this influence showed sensitivity to task demands (category > letter fluency) and predicted performance. In line with our hypothesis, the influence of semantic similarity on behavior was reduced in schizophrenia relative to controls, predicted negative psychotic symptoms, and correlated with an MEG signature of hippocampal ripple power (but not replay). The findings bridge a gap between phenomenological and neurocomputational accounts of schizophrenia.
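The core quantity here, how strongly successive fluency responses track semantic similarity, can be sketched with cosine similarity over word embeddings. The toy vectors below stand in for the pretrained embedding model; they are hand-made for illustration and imply nothing about the study's model or word-selection computation.

```python
import numpy as np

# Toy embedding table standing in for a pretrained word-embedding
# model; the vectors are invented for illustration only.
EMB = {
    "dog":   np.array([0.90, 0.80, 0.10]),
    "cat":   np.array([0.85, 0.75, 0.15]),
    "wolf":  np.array([0.80, 0.90, 0.05]),
    "piano": np.array([0.10, 0.05, 0.95]),
}

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def mean_transition_similarity(words, emb):
    """Average cosine similarity between consecutive fluency responses:
    higher values = more semantically guided word selection."""
    sims = [cosine(emb[a], emb[b]) for a, b in zip(words, words[1:])]
    return float(np.mean(sims))

coherent = mean_transition_similarity(["dog", "cat", "wolf"], EMB)
jumpy = mean_transition_similarity(["dog", "piano", "cat"], EMB)
print(coherent, jumpy)  # semantically guided list scores higher
```

The study's computational model goes further, fitting how much similarity (vs other factors) drives each selection per participant, but transition similarity is the underlying signal.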
Topics: Humans; Schizophrenia; Semantics; Psychotic Disorders; Verbal Behavior; Learning
PubMed: 37816054
DOI: 10.1073/pnas.2305290120
Integrative Psychological & Behavioral... Sep 2023
This article reflects on the analyses and comments of Marioka (2023), Fadeev (2023), and Machková (2023) on the book New Perspectives on Inner Speech (Fossa, 2022a). First, I respond to and expand on the ideas presented by the authors, and then integrate the elements they highlight. Integrating the authors' reflections and comments makes evident that inner speech lies at the intersection of two continua: a control-lack-of-control continuum and a diffuse-clear continuum. The level of clarity and control varies constantly during each act of inner speech, reflecting a phenomenon that moves from infinite interiority to infinite exteriority and vice versa. This complex interaction of two continua, based on the level of control and the level of sharpness, defies empirical application and demands methodological innovation from research centers interested in the Inexhaustible Experience of the Inner Voice.
Topics: Humans; Speech
PubMed: 37199895
DOI: 10.1007/s12124-023-09782-z
American Journal of Speech-language... Jul 2023
PURPOSE
This study investigated perceived speech naturalness estimated by adult listeners in typically developing children and children with dysarthria. We aimed to identify predictors of naturalness among auditory-perceptual parameters and to evaluate the concept of naturalness as a clinical marker of childhood dysarthria.
METHOD
In a listening experiment, naive adult listeners rated speech naturalness of 144 typically developing children (3-9 years old) and 28 children with neurological conditions (5-9 years old) on a visual analog scale. Speech samples were recorded using the materials of the Bogenhausen Dysarthria Scales-Childhood Dysarthria, which also provides for auditory-perceptual judgments covering all speech subsystems.
RESULTS
Children with dysarthria obtained significantly lower naturalness ratings compared to typically developing children. However, there was a substantial age effect observable in the typically developing children; that is, younger typically developing children were also perceived as somewhat unnatural. The ratings of the typically developing children were influenced by the occurrence of developmental speech features; for the children with neurological conditions, specific symptoms of dysarthria had an additional effect. In both groups, the perception of naturalness was predominantly determined by the children's articulation and intelligibility.
CONCLUSIONS
Both symptoms of childhood dysarthria and developmental speech features (e.g., regarding articulation and intelligibility) led listeners to perceive speech as unnatural to some extent. Thus, perceived speech naturalness appears less suitable as a marker of dysarthria in children than in adults.
Topics: Adult; Humans; Child, Preschool; Child; Speech; Dysarthria; Speech Production Measurement; Auditory Perception; Judgment; Speech Intelligibility
PubMed: 37343549
DOI: 10.1044/2023_AJSLP-23-00023