JASA Express Letters, Aug 2023
Demographic differences in acoustic environments are usually studied using geographic area monitoring. This approach, however, may miss valuable information differentiating cultures. This motivated the current study, which used wearable sound recorders to measure noise levels and speech-to-noise ratios (SNRs) in the immediate acoustic environment of Latinx and European-American college students. Latinx students experienced higher noise levels (64.8 dBC) and lower SNRs (3.7 dB) than European-American students (noise levels, 63 dBC; SNRs, 5.4 dB). This work provides a framework for a larger study on the impact of culture on auditory ecology.
Topics: Humans; Acoustics; Ecology; Sound; Speech; Students
PubMed: 37589565
DOI: 10.1121/10.0020608
Perspectives on Psychological Science, Jul 2023
Review
In this commentary I provide a review of the microaggression construct within a linguistic-pragmatic framework. From this perspective, microaggressions can be viewed as nonconventional indirect speech acts, that is, utterances that, because of their aggressive meaning, require some type of inferential processing on the part of the hearer. This inferential process requires a consideration of the remark in the context within which it occurs, including the prior discourse, as well as the roles and statuses of the interactants. Because microaggressions are indirect, the speaker always has the option, especially if they are higher in power, of denying any aggressive meaning. Focusing on their linguistic/pragmatic features allows for the development of a more principled framework for specifying what constitutes a microaggression, as well as helping to identify the relevant features of the context and the processes involved in the recognition of microaggressions.
Topics: Humans; Microaggression; Linguistics; Aggression; Speech; Recognition, Psychology
PubMed: 36395088
DOI: 10.1177/17456916221133824
American Journal of Law & Medicine, Dec 2023
While physician-assisted suicide legislation is being drafted and passed across the United States, a gray area persists regarding the legality of a lay person's assistance with suicide. Several high-profile cases have received media coverage, namely those of Michelle Carter in Massachusetts and William Melchert-Dinkel in Minnesota, but there is also a growing volume of anonymous pro-suicide material online. Pro-suicide groups fly under the radar and claim to help those desiring to take their own lives. This paper aims to identify the point at which an individual or group can be held civilly or criminally liable for assisting suicide and discusses how the First Amendment can be used to shield authors from such liability.
Topics: Humans; United States; Speech; Suicide, Assisted; Massachusetts
PubMed: 38563271
DOI: 10.1017/amj.2024.2
Neuroscience and Biobehavioral Reviews, Nov 2023
Review
Individuals with autism spectrum disorder (ASD) exhibit atypical speech-in-noise (SiN) perception, but the scope of these impairments has not been clearly defined. We conducted a systematic review of the behavioural research on SiN perception in ASD, using a comprehensive search strategy across databases (Embase, PubMed, Web of Science, APA PsycArticles, LLBA, clinicaltrials.gov, and PsyArXiv). We retained 20 studies, which generally revealed intact speech perception in stationary noise, whereas impairments in speech discrimination were found in temporally modulated noise, concurrent speech, and audiovisual speech perception. The evidence indicates an association with auditory temporal processing deficits, exacerbated by suboptimal language skills. Speech-in-speech perception might be further impaired by deficient top-down processing of speech. Further research is needed to address remaining gaps in our understanding of these impairments, including the developmental aspects of SiN processing in ASD and the impact of gender and social attentional orienting on this ability. Our findings have important implications for improving communication in ASD, both in daily interactions and in clinical and educational settings.
Topics: Humans; Speech Perception; Autistic Disorder; Autism Spectrum Disorder; Speech; Auditory Perception
PubMed: 37797728
DOI: 10.1016/j.neubiorev.2023.105406
International Journal of Neural Systems, Jul 2023
Machine Learning (ML), among other things, facilitates Text Classification, the task of assigning classes to textual items. Classification performance in ML has improved significantly due to recent developments, including the rise of Recurrent Neural Networks (RNNs), Long Short-Term Memory (LSTM), Gated Recurrent Units (GRUs), and Transformer models. These recurrent cells maintain internal memory states with dynamic temporal behavior; in the LSTM cell, this temporal behavior is stored in two different states, "Current" and "Hidden". In this work, we define a modification layer within the LSTM cell that allows us to perform additional adjustments to either state, or even to alter both simultaneously. We perform 17 single-state alteration experiments: 12 involve the Current state, whereas five involve the Hidden one. These alterations are evaluated on seven datasets related to sentiment analysis, document classification, hate speech detection, and human-to-robot interaction. Our results show that the highest-performing alterations for the Current and Hidden states achieve average improvements of 0.5% and 0.3%, respectively. We also compare our modified cell to two Transformer models: our modified LSTM cell is outperformed on classification metrics in 4/6 datasets, but it improves upon the simple Transformer model and has clearly better cost efficiency than both Transformer models.
Topics: Humans; Memory, Short-Term; Memory, Long-Term; Neural Networks, Computer; Machine Learning; Speech
PubMed: 37300815
DOI: 10.1142/S0129065723500399
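The abstract above describes a modification layer that adjusts the LSTM cell's Current (cell) and/or Hidden states between time steps. The paper's specific alterations and weights are not given here, so the following is a minimal pure-Python sketch: a single-unit LSTM cell with hypothetical toy weights and a hypothetical damping alteration as the "modification layer".

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

class TinyLSTMCell:
    """A scalar (single-unit) LSTM cell with an optional state-modification
    hook, illustrating the idea of altering the Current (cell) and/or Hidden
    states after each step. Weights are fixed toy values, not learned."""

    def __init__(self, modify=None):
        # (input weight, recurrent weight, bias) for each gate -- hypothetical
        self.w = {g: (0.5, 0.5, 0.0) for g in ("i", "f", "o", "g")}
        self.modify = modify  # hook: (current, hidden) -> (current, hidden)

    def step(self, x, c, h):
        def gate(name, squash):
            wx, wh, b = self.w[name]
            return squash(wx * x + wh * h + b)

        i = gate("i", sigmoid)       # input gate
        f = gate("f", sigmoid)       # forget gate
        o = gate("o", sigmoid)       # output gate
        g = gate("g", math.tanh)     # candidate values
        c = f * c + i * g            # Current (cell) state update
        h = o * math.tanh(c)         # Hidden state update
        if self.modify is not None:  # the extra modification layer
            c, h = self.modify(c, h)
        return c, h

# Example alteration: damp the Current state, leave the Hidden state intact.
plain = TinyLSTMCell()
damped = TinyLSTMCell(modify=lambda c, h: (0.9 * c, h))
c1 = h1 = c2 = h2 = 0.0
for x in [1.0, -0.5, 0.25]:
    c1, h1 = plain.step(x, c1, h1)
    c2, h2 = damped.step(x, c2, h2)
```

Because the altered Current state feeds back into the next step, even a small per-step adjustment changes the trajectory of both states over a sequence.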
Biological Psychiatry. Cognitive..., Oct 2023
Review
Language has been used as a privileged window to investigate mental processes. More recently, descriptions of psychopathological symptoms have been analyzed with the help of natural language processing tools. An example is the study of speech organization using graph theoretical approaches that began approximately 10 years ago. After its application in different areas, there is a need to better characterize what aspects can be associated with typical and atypical behavior throughout the lifespan, given the variables related to aging as well as biological and social contexts. The precise quantification of mental processes assessed through language may allow us to disentangle biological/social markers by looking at naturalistic protocols in different contexts. In this review, we discuss 10 years of studies in which word recurrence graphs were adopted to characterize the chain of thoughts expressed by individuals while producing discourse. Initially developed to understand formal thought disorder in the context of psychotic syndromes, this line of research has been expanded to understand the atypical development in different stages of psychosis and differential diagnosis (such as dementia) as well as the typical development of thought organization in school-age children/teenagers in naturalistic and school-based protocols. We comment on the effects of environmental factors, such as education and reading habits (in monolingual and bilingual contexts), in clinical and nonclinical populations at different developmental stages (from childhood to older adulthood, considering aging effects on cognition). Looking toward the future, there is an opportunity to use word recurrence graphs to address complex questions that consider biological/social factors within a developmental perspective in typical and atypical contexts.
Topics: Child; Adolescent; Humans; Aged; Speech; Psychotic Disorders; Cognition; Social Environment
PubMed: 37085138
DOI: 10.1016/j.bpsc.2023.04.004
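The word recurrence graphs discussed in the review above can be sketched as follows. The construction here (lowercased tokens as nodes, directed edges between consecutive words) is an illustrative simplification; the exact graph attributes used in the reviewed studies may differ.

```python
from collections import Counter

def word_recurrence_graph(transcript):
    """Build a directed word graph from a speech transcript: each distinct
    word is a node, and each pair of consecutive words adds a directed edge.
    Recurring words create loops, whose structure is what graph-theoretical
    measures of thought organization are computed over."""
    words = transcript.lower().split()
    nodes = set(words)
    edges = Counter(zip(words, words[1:]))  # edge -> occurrence count
    return nodes, edges

nodes, edges = word_recurrence_graph("the dog chased the cat")
# "the" recurs, so the graph has 4 nodes but two distinct edges leaving "the"
```

Graph measures such as node count, edge count, or the size of the largest connected component can then be compared across speakers or time points.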
Journal of Speech, Language, and..., Aug 2023
PURPOSE
Nonnative consonant cluster learning has become a useful experimental approach for learning about speech motor learning, and we sought to enhance our understanding of this area and to establish best practices for this type of research.
METHOD
One hundred twenty individuals completed a nonnative consonant cluster learning task within a speech motor learning paradigm. Following a brief prepractice, participants then practiced the production of eight word-initial nonnative consonant clusters embedded in bisyllabic nonwords (e.g., GD in /gdivu/). The clusters ranged in difficulty according to linguistic typology and sonority sequencing. Acquisition was operationalized as the change across the practice section and learning was assessed with two retention sessions (R1: 30 min after practice; R2: 2 days after practice). We evaluated changes in accuracy as well as in the acoustic details of the cluster production at each time point.
RESULTS
Overall, participants improved in their production of the consonant clusters. Accuracy increased, and duration measures decreased in specific measures associated with cluster production. The change in coordination measured in the acoustics changed both for clusters that were incorrectly produced and for those that were correctly produced, indicating continued motor learning even in accurate tokens.
CONCLUSIONS
These results aid our understanding of the complexity of nonnative consonant cluster learning. In particular, factors related to both phonological and speech motor control properties affect the learning of novel speech sequences.
SUPPLEMENTAL MATERIAL
https://doi.org/10.23641/asha.21844185.
Topics: Humans; Phonetics; Speech; Learning; Acoustics
PubMed: 36634242
DOI: 10.1044/2022_JSLHR-22-00322
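Sonority sequencing, cited above as a predictor of cluster difficulty, can be illustrated with a toy checker. The coarse sonority scale below (stops < fricatives < nasals < liquids < glides) is a common textbook simplification, not the scale used in the study.

```python
# Coarse sonority scale (higher rank = more sonorous) -- a simplification.
SONORITY = {}
for chars, rank in [("ptkbdg", 1),   # stops
                    ("fvsz", 2),     # fricatives
                    ("mn", 3),       # nasals
                    ("lr", 4),       # liquids
                    ("jw", 5)]:      # glides
    for ch in chars:
        SONORITY[ch] = rank

def rises_in_sonority(cluster):
    """True if a word-initial cluster rises strictly in sonority toward the
    vowel -- the profile sonority sequencing predicts to be easiest."""
    ranks = [SONORITY[c] for c in cluster]
    return all(a < b for a, b in zip(ranks, ranks[1:]))

rises_in_sonority("pl")  # stop -> liquid: rising profile
rises_in_sonority("gd")  # stop -> stop: sonority plateau, predicted harder
```

A cluster such as /gd/ (from the abstract's example /gdivu/) violates the preferred rising profile, which is one reason such clusters are harder to acquire.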
Nature Communications, Sep 2023
Imagine being in a crowded room with a cacophony of speakers and having the ability to focus on or remove speech from a specific 2D region. This would require understanding and manipulating an acoustic scene, isolating each speaker, and associating a 2D spatial context with each constituent speech. However, separating speech from a large number of concurrent speakers in a room into individual streams and identifying their precise 2D locations is challenging, even for the human brain. Here, we present the first acoustic swarm that demonstrates cooperative navigation with centimeter resolution using sound, eliminating the need for cameras or external infrastructure. Our acoustic swarm forms a self-distributing wireless microphone array, which, along with our attention-based neural network framework, lets us separate and localize concurrent human speakers in the 2D space, enabling speech zones. Our evaluations showed that the acoustic swarm could localize and separate 3-5 concurrent speech sources in real-world unseen reverberant environments with median and 90-percentile 2D errors of 15 cm and 50 cm, respectively. Our system enables applications like mute zones (parts of the room where sounds are muted), active zones (regions where sounds are captured), multi-conversation separation, and location-aware interaction.
Topics: Humans; Speech; Acoustics; Sound; Communication; Awareness
PubMed: 37735445
DOI: 10.1038/s41467-023-40869-8
PloS One, 2023
Speech deepfakes are artificial voices generated by machine learning models. Previous literature has highlighted deepfakes as one of the biggest security threats arising from progress in artificial intelligence due to their potential for misuse. However, studies investigating human detection capabilities are limited. We presented genuine and deepfake audio to n = 529 individuals and asked them to identify the deepfakes. We ran our experiments in English and Mandarin to understand if language affects detection performance and decision-making rationale. We found that detection capability is unreliable. Listeners only correctly spotted the deepfakes 73% of the time, and there was no difference in detectability between the two languages. Increasing listener awareness by providing examples of speech deepfakes only improves results slightly. As speech synthesis algorithms improve and become more realistic, we can expect the detection task to become harder. The difficulty of detecting speech deepfakes confirms their potential for misuse and signals that defenses against this threat are needed.
Topics: Humans; Speech; Artificial Intelligence; Phonetics; Speech Perception; Language
PubMed: 37531336
DOI: 10.1371/journal.pone.0285333
Assessment, Oct 2023
Category and letter verbal fluency assessment is widely used in basic and clinical research. Yet, the nature of the processes measured by such means remains a matter of debate. To delineate automatic (free-associative) versus controlled (dissociative) retrieval processes involved in verbal fluency tasks, we carried out a psychometric study combining a novel lexical-semantic retrieval paradigm and structural equation modeling. We show that category fluency primarily engages a free-associative retrieval, whereas letter fluency exerts executive suppression of habitual semantic associates. Importantly, the models demonstrated that this dissociation is parametric rather than absolute, exhibiting a degree of unity as well as diversity among the retrieval measures. These findings and further exploratory analyses validate that category and letter fluency tasks reflect partially distinct forms of memory search and retrieval control, warranting different application in basic research and clinical assessment. Finally, we conclude that the novel associative-dissociative paradigm provides straightforward and useful behavioral measures for the assessment and differentiation of automatic versus controlled retrieval ability.
Topics: Humans; Neuropsychological Tests; Semantics; Verbal Behavior
PubMed: 35979927
DOI: 10.1177/10731911221117512