CoDAS 2023
PURPOSE
To validate a proposal for an autobiographical interview oriented to the typical older adult.
METHODS
Questions for a semi-structured autobiographical memory interview were designed and a protocol for its application was developed. Fourteen speech-language pathologist judges and 14 older adults participated, and 2 pilot interviews were conducted. Content validity was then assessed by means of Lawshe's classic procedure, complemented by an evaluation of the comprehensibility and length of the interview and by a data triangulation procedure involving the judges and the participants of the pilot experience.
RESULTS
Of the 22 items evaluated, only 4 were above the critical reference value (0.49).
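Lawshe's classic procedure reduces each item to a content validity ratio, which can be compared against the critical reference value; a minimal sketch (the 11-of-14 judge count below is illustrative, not taken from the study):

```python
def content_validity_ratio(n_essential: int, n_judges: int) -> float:
    """Lawshe's CVR = (n_e - N/2) / (N/2), where n_e is the number of
    judges rating the item "essential" and N is the total judge count."""
    half = n_judges / 2
    return (n_essential - half) / half

# With the study's panel of 14 judges, an item rated "essential" by 11:
cvr = content_validity_ratio(11, 14)
item_retained = cvr > 0.49  # the critical reference value used in the study
```

An item retained under this criterion must be rated essential by a clear majority of the panel; with 14 judges, 11 "essential" ratings give a CVR of about 0.57.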
CONCLUSION
The need to incorporate this measure of analysis in the context of the respect, identity, and agency of older adults is discussed as part of a shift in thinking toward person-centered care and a communicative competence model, as is the need to incorporate different cultural paradigms and the use of digital technologies.
Topics: Humans; Aged; Speech; Language; Surveys and Questionnaires
PubMed: 37820098
DOI: 10.1590/2317-1782/20232022151es
PloS One 2023
Extracting speech information from vibration response signals is a typical system identification problem, and the traditional method is overly sensitive to deviations in model parameters, noise, boundary conditions, and position. In this work, a method was proposed to obtain speech signals by collecting vibration signals of vibroacoustic systems for deep learning training. A vibroacoustic coupling finite element model was first established with the voice signal as the excitation source. The vibration acceleration signals at the vibration response point were used as the training set, and their spectral characteristics were extracted. Two types of networks were trained, fully connected and convolutional; the fully connected network was found to converge faster and to extract speech of higher quality. At test time, the amplitude spectra of the network outputs and the phase of the vibration signals were used to convert the extracted speech signals back to the time domain. The simulation results showed that the positions of the vibration response points had little effect on the quality of speech recognition, and good speech extraction quality could be obtained. Noise in the speech signals influenced extraction quality more than noise in the vibration signals, and extracted speech quality was poor when both were heavily noisy. The method was robust to position deviations of the vibration responses between training and testing. The smaller the structural flexibility, the better the speech extraction quality. Extraction quality in a trained system decreased as node mass increased in the test set, though the differences were negligible, and changes in boundary conditions did not significantly affect it. The speech extraction model proposed in the work is thus robust to deviations in position, mass, and boundary conditions.
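The reconstruction step described above (combining a predicted amplitude spectrum with the phase of the vibration signal, then inverting to the time domain) can be sketched with a standard STFT/ISTFT pair; since the trained network is not available here, the "predicted" magnitude below is a stand-in taken from the vibration signal itself, and the sampling rate and window length are assumptions:

```python
import numpy as np
from scipy.signal import stft, istft

fs = 16000  # assumed sampling rate
rng = np.random.default_rng(0)
vibration = rng.standard_normal(fs)  # stand-in for a measured vibration signal

# STFT of the vibration signal supplies the phase; in the paper's pipeline the
# magnitude would come from the network output, not from the vibration itself.
_, _, Z = stft(vibration, fs=fs, nperseg=512)
predicted_mag = np.abs(Z)   # placeholder for the network's predicted spectrum
phase = np.angle(Z)

# Recombine predicted magnitude with vibration phase and invert to time domain.
_, speech = istft(predicted_mag * np.exp(1j * phase), fs=fs, nperseg=512)
```

Because the placeholder magnitude and the phase come from the same STFT, this round trip reconstructs the input nearly exactly; with a real network output the ISTFT would instead yield the extracted speech estimate.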
Topics: Speech; Vibration; Deep Learning; Noise
PubMed: 37878667
DOI: 10.1371/journal.pone.0288847
JASA Express Letters Sep 2023
This study examined accent ratings of speech samples collected from 12 Mandarin-accented English talkers and two native English talkers. The speech samples were processed with noise- and tone-vocoders at 1, 2, 4, 8, and 16 channels. The accentedness of the vocoded and unprocessed signals was judged by 53 native English listeners on a 9-point scale. The foreign-accented talkers were judged to have a weaker accent in the vocoded conditions than in the unprocessed condition. The native talkers and foreign-accented talkers with varying degrees of accentedness demonstrated different patterns of accent-rating changes as a function of the number of channels.
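A noise vocoder of the kind used here is conventionally built by splitting the signal into N frequency bands, extracting each band's envelope, and re-imposing it on band-limited noise; a minimal sketch (band edges, filter order, and the Hilbert-envelope method are assumptions, not the study's exact parameters):

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def noise_vocode(x, fs, n_channels, f_lo=100.0, f_hi=7000.0):
    """Noise-vocode x: split into log-spaced bands, extract each band's
    envelope, and use it to modulate band-limited noise carriers."""
    edges = np.geomspace(f_lo, f_hi, n_channels + 1)
    rng = np.random.default_rng(0)
    noise = rng.standard_normal(len(x))
    out = np.zeros(len(x))
    for lo, hi in zip(edges[:-1], edges[1:]):
        sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
        band = sosfiltfilt(sos, x)
        env = np.abs(hilbert(band))           # Hilbert envelope of the band
        out += env * sosfiltfilt(sos, noise)  # envelope-modulated noise carrier
    return out

fs = 16000
t = np.arange(fs) / fs
vocoded = noise_vocode(np.sin(2 * np.pi * 440 * t), fs, n_channels=4)
```

Varying `n_channels` from 1 to 16 mirrors the study's manipulation: fewer channels discard more spectral detail, which is presumably what degrades the accent cues listeners rate.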
Topics: Speech; Speech Perception; Language; Noise; Internationality
PubMed: 37747319
DOI: 10.1121/10.0020989
American Journal of Speech-language... Sep 2023
PURPOSE
The aim of this study was to quantify the combined effects of face masks and effortful speech styles on listener intelligibility and perceived listener effort in talkers with and without Parkinson's disease (PD).
METHOD
Ten people with PD and 10 healthy older controls read sentences aloud in two face-mask conditions (no mask and KN95 mask) and three speech-style conditions (habitual, clear, and loud). Listener participants heard each sentence mixed with background noise, transcribed what they heard, and rated how effortful it was to understand. Listener accuracy and effort were each modeled as a function of speaker group, face mask, and speech style using mixed-effects regression models.
RESULTS
Listeners were less accurate and reported greater listening effort for the PD group and for the mask condition. Listeners were more accurate and reported less effort when listening to clear and loud compared to habitual speech. Listener accuracy and listener effort were strongly negatively correlated across all conditions. Face masks were also associated with a steeper decline in speech intelligibility and an increase in listener effort for talkers with PD.
DISCUSSION
Face masks resulted in steeper speech intelligibility decline for talkers with PD compared to controls. Speaking more loudly or more clearly when wearing a face mask improved intelligibility for talkers with PD compared to habitual speech, and both speech styles resulted in speech intelligibility levels that approximated talkers' baseline intelligibility levels without a mask.
Topics: Humans; Speech Intelligibility; Masks; Parkinson Disease; Speech Disorders; Cognition
PubMed: 37625133
DOI: 10.1044/2023_AJSLP-23-00085
Journal of Voice : Official Journal of... Mar 2024
OBJECTIVES
The objective of this study was to investigate the effects of speaking rate (habitual and fast) and speech task (reading and spontaneous speech) on seven dependent variables: breath group size (in syllables), breath group duration (in seconds), lung volume at breath group initiation, lung volume at breath group termination, lung volume excursion for each breath group (in % vital capacity), lung volume excursion per syllable (in % vital capacity), and mean speaking fundamental frequency (f).
METHODS
Ten women and seven men were included as subjects. Lung volume and breathing behaviors were measured by respiratory inductance plethysmography and f was measured from audio recordings by the Praat software. Statistical significance was tested by analysis of variance.
RESULTS
For both reading and spontaneous speech, the group significantly increased mean breath group size and breath group duration in the fast speaking rate condition, and significantly decreased lung volume excursion per syllable in fast speech. Females also showed a significant increase of f in fast speech. The lung volume levels for initiation and termination of breath groups, as well as lung volume excursions in % vital capacity, showed great individual variation and no significant effects of rate. Significant effects of speech task were found for breath group size and lung volume excursion per syllable: reading induced more syllables per breath group and less % vital capacity spent per syllable compared with spontaneous speech. Interaction effects showed that the increases in breath group size and breath group duration associated with fast rate were significantly larger in reading than in spontaneous speech.
CONCLUSION
Our data from 17 vocally untrained, healthy subjects showed great individual variations but still significant group effects regarding increased speaking rate, where the subjects seemed to spend less air per syllable and inhaled less often as a consequence of greater breath group sizes in fast speech. Subjects showed greater changes in breath group patterns as a consequence of fast speech in reading than in spontaneous speech, indicating that effects of speaking rate are dependent on the speech task.
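The breath-group measures studied above are simple ratios over each breath group; a minimal sketch with illustrative numbers (not data from the study):

```python
def breath_group_metrics(lv_init, lv_term, n_syllables, duration_s):
    """Per-breath-group measures in the units used in the abstract:
    lung volumes in % vital capacity (%VC), duration in seconds."""
    excursion = lv_init - lv_term  # %VC spent over the breath group
    return {
        "excursion_pct_vc": excursion,
        "excursion_per_syllable": excursion / n_syllables,
        "syllables_per_second": n_syllables / duration_s,
    }

# Illustrative breath group: initiated at 55 %VC, terminated at 35 %VC,
# 16 syllables over 3.2 s.
m = breath_group_metrics(lv_init=55.0, lv_term=35.0, n_syllables=16,
                         duration_s=3.2)
```

Under the study's fast-rate finding, breath group size would rise while excursion per syllable falls, i.e. more syllables are produced on roughly the same expended %VC.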
Topics: Male; Humans; Female; Respiration; Lung Volume Measurements; Speech; Voice; Cognition
PubMed: 34711460
DOI: 10.1016/j.jvoice.2021.09.005
Disability and Rehabilitation.... Apr 2024
Review
PURPOSE
To identify and describe the aims, methodological approaches, and major findings of studies on the use of STT among secondary pupils (age 12-18) with learning difficulties published from January 2000 to April 2022.
MATERIALS AND METHOD
This scoping review includes empirical studies published in peer-reviewed journals and grey literature between January 2000 and April 2022. Searches were conducted in April 2022 in three databases: ERIC, PsycINFO and Scopus. In addition, related reviews were manually screened for relevant papers.
RESULTS
Eight peer-reviewed studies and five grey-literature publications met the inclusion criteria; two studies employed experimental designs, four quasi-experimental designs, and seven explorative designs. Six studies described STT as an assistive technology (a compensatory aid for poor writing performance); two assessed STT as an instructional technology to determine whether it improves overall writing and related skills (e.g., reading). Results suggest that STT may increase pupils' ability to produce texts with fewer errors, help with spelling, and improve reading comprehension and word recognition. To date, there is a paucity of high-quality research on the use of STT among adolescents with LD.
CONCLUSION
The scoping review shows that very little research has been conducted on the use of STT for adolescents with learning difficulties in secondary education. Findings from the studies identified five areas of interest: writing-related skills, text assessment, writing processes, accuracy of the technology, and participants' experiences. Findings indicate that writing performance among students with learning difficulties improves when using STT. Parents, teachers, and pupils report positive experiences with the technology, particularly for students with severe reading and writing difficulties.
IMPLICATIONS FOR REHABILITATION
There is a great need for more robust research on the use of speech-to-text technology (STT) in educational settings, especially on its effect on writing skills.
Studies describe STT as either an assistive technology (a compensatory aid for poor writing performance) or an instructional technology (aiming to improve learning in general). It is important that practitioners are aware of the different aims and possible consequences of introducing STT to learners with writing difficulties.
STT provides both opportunities and challenges for writers with learning difficulties in secondary education. Findings indicate that writing performance among students with learning difficulties improves when using STT, yet inaccuracy of the technology was presented as one of the main challenges.
Parents, teachers, and pupils report positive experiences with the technology, particularly for students with severe reading and writing difficulties.
Topics: Humans; Adolescent; Child; Speech; Learning; Reading; Writing; Technology
PubMed: 36427182
DOI: 10.1080/17483107.2022.2149865
The Journal of Head Trauma...
Observational Study
OBJECTIVE
As part of a larger study dedicated to identifying speech and language biomarkers of neurological decline associated with repetitive head injury (RHI) in professional boxers and mixed martial artists (MMAs), we examined articulation rate, pausing, and disfluency in passages read aloud by participants in the Professional Athletes Brain Health Study.
SETTING
A large outpatient medical center specializing in neurological care.
PARTICIPANTS, DESIGN, AND MAIN MEASURES
Passages read aloud by 60 boxers, 40 MMAs, and 55 controls were acoustically analyzed to determine articulation rate (the number of syllables produced per second), number and duration of pauses, and number and duration of disfluencies in this observational study.
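Articulation rate as defined above (syllables per second) is straightforward to compute once pauses have been segmented out of a recording; a minimal sketch (excluding pause time from the denominator is a common convention, but the abstract does not state the study's exact criteria):

```python
def articulation_rate(n_syllables, total_dur_s, pause_durs_s):
    """Syllables per second of speaking time, with pauses excluded
    (a common definition; pause-detection criteria are an assumption)."""
    speaking_time = total_dur_s - sum(pause_durs_s)
    return n_syllables / speaking_time

# Illustrative passage: 120 syllables in 30 s with 6 s of detected pauses.
rate = articulation_rate(120, total_dur_s=30.0,
                         pause_durs_s=[1.0, 2.0, 3.0])
```

On this definition, the roughly half-syllable-per-second slowing reported below for boxers and MMA fighters is a change in the speaking-time rate itself, separate from any change in pausing.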
RESULTS
Both boxers and MMAs differed from controls in articulation rate, producing syllables at a slower rate than controls by nearly half a syllable per second on average. Boxers produced significantly more pauses and disfluencies in passages read aloud than MMAs and controls.
CONCLUSIONS
Slower articulation rate in both boxers and MMA fighters compared with individuals with no history of RHI and the increased occurrence of pauses and disfluencies in the speech of boxers suggest changes in speech motor behavior that may relate to RHI. These speech characteristics can be measured in everyday speaking conditions and by automatic recognition systems, so they have the potential to serve as effective, noninvasive clinical indicators for RHI-associated neurological decline.
Topics: Humans; Speech; Brain; Craniocerebral Trauma
PubMed: 36701308
DOI: 10.1097/HTR.0000000000000841
Journal of Neural Engineering Aug 2023
When listening to continuous speech, populations of neurons in the brain track different features of the signal. Neural tracking can be measured by relating the electroencephalography (EEG) to the speech signal. Recent studies using linear models have shown a significant contribution of linguistic features over and above acoustic neural tracking. However, linear models cannot capture the nonlinear dynamics of the brain. To overcome this, we use a convolutional neural network (CNN) that relates EEG to linguistic features, uses phoneme or word onsets as a control, and has the capacity to model nonlinear relations. We integrate phoneme- and word-based linguistic features (phoneme surprisal, cohort entropy (CE), word surprisal (WS), and word frequency (WF)) in our nonlinear CNN model and investigate whether they carry additional information on top of lexical features (phoneme and word onsets). We then compare the performance of our nonlinear CNN with that of a linear encoder and a linearized CNN. For the nonlinear CNN, we found a significant contribution of CE over phoneme onsets and of WS and WF over word onsets. Moreover, the nonlinear CNN outperformed the linear baselines. Measuring the coding of linguistic features in the brain is important for auditory neuroscience research and for applications that involve objectively measuring speech understanding. With linear models this is measurable, but the effects are very small. The proposed nonlinear CNN model yields larger differences between linguistic and lexical models and could therefore reveal effects that would otherwise be unmeasurable, which may in the future lead to improved within-subject measures and shorter recordings.
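The linear encoder that the CNN is compared against is typically a lagged forward model (a temporal response function) predicting the EEG from time-shifted copies of a stimulus feature; a minimal ridge-regularized sketch on synthetic data (the lag count, regularization strength, and single-channel setup are assumptions):

```python
import numpy as np

def linear_encoder(features, eeg, n_lags=32, alpha=1.0):
    """Ridge-regularized forward model: predict one EEG channel from
    time-lagged copies of a stimulus feature; returns the lag weights
    and the prediction-measurement correlation (the usual tracking score)."""
    n = len(eeg)
    X = np.zeros((n, n_lags))
    for lag in range(n_lags):
        X[lag:, lag] = features[: n - lag]  # lagged design matrix
    w = np.linalg.solve(X.T @ X + alpha * np.eye(n_lags), X.T @ eeg)
    pred = X @ w
    r = np.corrcoef(pred, eeg)[0, 1]
    return w, r

# Synthetic check: EEG = feature convolved with a short kernel, plus noise.
rng = np.random.default_rng(0)
feat = rng.standard_normal(1000)
eeg = np.convolve(feat, [0.5, 0.3, 0.2], mode="full")[:1000] \
      + 0.1 * rng.standard_normal(1000)
w, r = linear_encoder(feat, eeg)
```

In the study's framing, gains of the nonlinear CNN are measured relative to exactly this kind of baseline score.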
Topics: Humans; Speech; Neurons; Cochlear Nerve; Linguistics; Neural Networks, Computer
PubMed: 37595606
DOI: 10.1088/1741-2552/acf1ce
Journal of Applied Behavior Analysis Jan 2024
Autoclitics are secondary verbal operants that are controlled by a feature of the conditions that occasion or evoke a primary verbal operant such as a tact or mand. Qualifying autoclitics extend, negate, or assert a speaker's primary verbal response and modify the intensity or direction of the listener's behavior. Howard and Rice (1988) established autoclitics that indicated weak stimulus control (e.g., "like a [primary tact]") with four neurotypical preschool children; however, generalization to newly acquired tacts was limited. In Experiment 1, we targeted behavior similar to that in Howard and Rice, but with autistic children and simultaneous teaching procedures, and we observed generalization across sets and with newly acquired tacts. In Experiment 2, we evaluated the effects of multiple-exemplar training on the generalization of autoclitics across sets of naturalistic stimuli. Across participants, gradual increases in the frequency of autoclitics with untaught stimuli occurred after teaching with one or more sets.
Topics: Child, Preschool; Humans; Autism Spectrum Disorder; Verbal Behavior; Generalization, Psychological; Tellurium
PubMed: 37828795
DOI: 10.1002/jaba.1026
Behavioural Brain Research Aug 2023
A central issue in spoken word production concerns how activation is transmitted from semantic to phonological levels. The current study investigated seriality and cascadedness in Chinese spoken word production via a combined semantic blocking paradigm (with homogeneous and heterogeneous blocks) and picture-word interference paradigm (with phonologically related, mediated, and unrelated distractors). Naming latency data showed a mediated effect when comparing mediated and unrelated distractors in homogeneous blocks, a phonological facilitation effect when comparing phonologically related and unrelated distractors in homogeneous and heterogeneous blocks, and a semantic interference effect when comparing homogeneous and heterogeneous blocks. Critically, cluster-based permutation tests of the ERP data demonstrated a mediated effect around 266-326 ms and an overlapping pattern of a semantic interference effect around 264-418 ms and a phonological facilitation effect around 210-310 ms in homogeneous blocks or around 236-316 ms in heterogeneous blocks. These findings indicate that speakers activate the phonological nodes of non-targets and reveal a cascaded pattern of transmission from semantics to phonology in Chinese spoken production. The study offers new insight into the neural correlates of semantic and phonological effects and provides behavioral and electrophysiological evidence for the cascaded model within a theoretical framework of lexical competition in speech production.
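A cluster-based permutation test of the kind applied to such ERP data can be sketched as: threshold the per-timepoint t values, sum |t| within the largest contiguous cluster, and build a null distribution by sign-flipping each subject's difference wave (the threshold, permutation count, and cluster statistic below are illustrative choices, not the study's):

```python
import numpy as np
from scipy import stats

def cluster_mass_test(cond_a, cond_b, n_perm=500, t_thresh=2.0, seed=0):
    """Minimal cluster-based permutation test over timepoints for two
    within-subject conditions (arrays of shape subjects x timepoints)."""
    rng = np.random.default_rng(seed)
    diff = cond_a - cond_b

    def max_cluster_mass(d):
        # Per-timepoint one-sample t, then largest run of |t| > threshold.
        t = stats.ttest_1samp(d, 0.0, axis=0).statistic
        mask = np.abs(t) > t_thresh
        best = run = 0.0
        for above, tv in zip(mask, np.abs(t)):
            run = run + tv if above else 0.0
            best = max(best, run)
        return best

    observed = max_cluster_mass(diff)
    # Null: randomly flip the sign of each subject's difference wave.
    null = np.array([
        max_cluster_mass(diff * rng.choice([-1.0, 1.0],
                                           size=(diff.shape[0], 1)))
        for _ in range(n_perm)
    ])
    p = (1 + np.sum(null >= observed)) / (1 + n_perm)
    return observed, p

# Synthetic check: 20 subjects, 100 timepoints, effect in samples 40-60.
rng = np.random.default_rng(1)
a = rng.standard_normal((20, 100))
b = rng.standard_normal((20, 100))
a[:, 40:60] += 1.0
mass, p = cluster_mass_test(a, b)
```

Because the statistic is the mass of the largest cluster, inference is over clusters rather than individual timepoints, which is what licenses reporting effects over windows such as 266-326 ms.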
Topics: Humans; East Asian People; Language; Semantics; Speech
PubMed: 37269928
DOI: 10.1016/j.bbr.2023.114523