Journal of Psycholinguistic Research, Jun 2024
The present paper examines how English native speakers produce scopally ambiguous sentences and how they make use of gestures and prosody for disambiguation. As a case in point, the participants in the present study produced English negative quantifiers, which appear in two different positions, as in (1) The election of no candidate was a surprise (a: 'for those elected, none of them was a surprise'; b: 'no candidate was elected, and that was a surprise') and (2) No candidate's election was a surprise (a: 'for those elected, none of them was a surprise'; b: # 'no candidate was elected, and that was a surprise'). We were able to investigate the gesture production and the prosodic patterns of the positional effects (i.e., the a-interpretation is available in two different positions in 1 and 2) and the interpretation effects (i.e., two different interpretations are available in the same position in 1). We discovered that the participants tended to launch more head shakes in the (a) interpretation despite the different positions, but more head nods/beats in the (b) interpretation. While there is no prosodic difference on no between the (a) and (b) interpretations in (1), there are pitch and durational differences between the (a) interpretations in (1) and (2). This study points out abstract similarities in gestural movements across languages such as Catalan and Spanish (Prieto et al. in Lingua 131:136-150, 2013. 10.1016/j.lingua.2013.02.008; Tubau et al. in Linguist Rev 32(1):115-142, 2015. 10.1515/tlr-2014-0016), showing that meaning is crucial for gesture patterns. We emphasize that gesture patterns disambiguate ambiguous interpretations when prosody cannot do so.
Topics: Humans; Gestures; Adult; Psycholinguistics; Male; Female; Speech; Language; Young Adult
PubMed: 38926243
DOI: 10.1007/s10936-024-10075-8
Science Advances, Jun 2024
Lip language recognition urgently needs wearable and easy-to-use interfaces for interference-free and high-fidelity lip-reading acquisition and to develop accompanying data-efficient decoder-modeling methods. Existing solutions suffer from unreliable lip reading, are data hungry, and exhibit poor generalization. Here, we propose a wearable lip language decoding technology that enables interference-free and high-fidelity acquisition of lip movements and data-efficient recognition of fluent lip language based on wearable motion capture and continuous lip speech movement reconstruction. The method allows us to artificially generate any wanted continuous speech datasets from a very limited corpus of word samples from users. By using these artificial datasets to train the decoder, we achieve an average accuracy of 92.0% across individuals (n = 7) for actual continuous and fluent lip speech recognition for 93 English sentences, while placing no training burden on users because all training datasets are artificially generated. Our method greatly minimizes users' training/learning load and presents a data-efficient and easy-to-use paradigm for lip language recognition.
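The abstract does not spell out how the continuous-speech training datasets are synthesized from the limited word corpus; as a rough illustration of the general idea only (all names, frame values, and the jitter step below are hypothetical, not the authors' pipeline), a minimal sketch might concatenate word-level motion-capture traces, with small perturbations, into "sentence" traces for decoder training:

```python
import random

# Hypothetical word-level motion-capture samples: each word maps to a short
# sequence of sensor frames (1-D floats stand in for multichannel vectors).
word_samples = {
    "open": [[0.1], [0.4], [0.2]],
    "the":  [[0.0], [0.1]],
    "door": [[0.3], [0.5], [0.2]],
}

def synthesize_sentence(words, samples, jitter=0.02, rng=None):
    """Concatenate per-word samples into one continuous 'sentence' trace,
    adding small noise so each synthetic example differs slightly."""
    rng = rng or random.Random(0)
    trace = []
    for w in words:
        for frame in samples[w]:
            trace.append([x + rng.uniform(-jitter, jitter) for x in frame])
    return trace

# One synthetic training example for the (hypothetical) sentence "open the door".
sentence = synthesize_sentence(["open", "the", "door"], word_samples)
```

Repeating this over many word orderings and jitter draws is one way a large sentence-level dataset could be derived from a small word-level corpus.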
Topics: Humans; Wearable Electronic Devices; Speech; Language; Lip; Movement; Male; Female; Adult; Lipreading; Motion Capture
PubMed: 38924408
DOI: 10.1126/sciadv.ado9576
Cognitive Science, Jun 2024
Experiments on visually grounded, definite reference production often manipulate simple visual scenes in the form of grids filled with objects, for example, to test how speakers are affected by the number of objects that are visible. Regarding the latter, it was found that speech onset times increase along with domain size, at least when speakers refer to nonsalient target objects that do not pop out of the visual domain. This finding suggests that even in the case of many distractors, speakers perform object-by-object scans of the visual scene. The current study investigates whether this systematic processing strategy can be explained by the simplified nature of the scenes that were used, and if different strategies can be identified for photo-realistic visual scenes. In doing so, we conducted a preregistered experiment that manipulated domain size and saturation; replicated the measures of speech onset times; and recorded eye movements to measure speakers' viewing strategies more directly. Using controlled photo-realistic scenes, we find (1) that speech onset times increase linearly as more distractors are present; (2) that larger domains elicit relatively fewer fixation switches back and forth between the target and its distractors, mainly before speech onset; and (3) that speakers fixate the target relatively less often in larger domains, mainly after speech onset. We conclude that careful object-by-object scans remain the dominant strategy in our photo-realistic scenes, to a limited extent combined with low-level saliency mechanisms. A relevant direction for future research would be to employ less controlled photo-realistic stimuli that do allow for interpretation based on context.
Topics: Humans; Speech; Male; Female; Eye Movements; Adult; Young Adult; Visual Perception; Attention; Photic Stimulation
PubMed: 38924126
DOI: 10.1111/cogs.13473
Cognitive Science, Jun 2024
Words that describe sensory perception give insight into how language mediates human experience, and the acquisition of these words is one way to examine how we learn to categorize and communicate sensation. We examine the differential predictions of the typological prevalence hypothesis and embodiment hypothesis regarding the acquisition of perception verbs. Studies 1 and 2 examine the acquisition trajectories of perception verbs across 12 languages using parent questionnaire responses, while Study 3 examines their relative frequencies in English corpus data. We find the vision verbs see and look are acquired first, consistent with the typological prevalence hypothesis. However, for children at 12-23 months, touch-not audition-verbs take precedence in terms of their age of acquisition, frequency in child-produced speech, and frequency in child-directed speech, consistent with the embodiment hypothesis. Later at 24-35 months old, frequency rates are observably different and audition begins to align with what has previously been reported in adult English data. It seems the initial orientation to verbalizing touch over audition in child-caregiver interaction is especially related to the control of physically and socially appropriate behaviors. Taken together, the results indicate children's acquisition of perception verbs arises from the complex interplay of embodiment, language-specific input, and child-directed socialization routines.
Topics: Humans; Language Development; Infant; Female; Male; Language; Child, Preschool; Visual Perception; Speech; Touch; Auditory Perception
PubMed: 38923050
DOI: 10.1111/cogs.13469
Advances in Experimental Medicine and..., 2024
Review
Speech can be defined as the human ability to communicate through a sequence of vocal sounds. Consequently, speech requires an emitter (the speaker) capable of generating the acoustic signal and a receiver (the listener) able to successfully decode the sounds produced by the emitter (i.e., the acoustic signal). Time plays a central role at both ends of this interaction. On the one hand, speech production requires precise and rapid coordination, typically within the order of milliseconds, of the upper vocal tract articulators (i.e., tongue, jaw, lips, and velum), their composite movements, and the activation of the vocal folds. On the other hand, the generated acoustic signal unfolds in time, carrying information at different timescales. This information must be parsed and integrated by the receiver for the correct transmission of meaning. This chapter describes the temporal patterns that characterize the speech signal and reviews research that explores the neural mechanisms underlying the generation of these patterns and the role they play in speech comprehension.
Topics: Humans; Speech; Speech Perception; Speech Acoustics; Periodicity
PubMed: 38918356
DOI: 10.1007/978-3-031-60183-5_14
Nicotine & Tobacco Research : Official..., Jun 2024
INTRODUCTION
Pictorial health warning labels (HWLs) can communicate the harms of tobacco product use, yet little research exists for cigars. We sought to identify the most effective types of images to pair with newly developed cigar HWLs.
AIMS AND METHODS
In September 2021, we conducted an online survey experiment with US adults who reported using little cigars, cigarillos, or large cigars in the past 30 days (n = 753). After developing nine statements about health effects of cigar use, we randomized participants to view one of three levels of harm visibility paired with each statement, either: (1) an image depicting internal harm not visible outside the body, (2) an image depicting external harm visible outside of the body, or (3) two images depicting both internal and external harm. After viewing each image, participants answered questions on perceived message effectiveness (PME), negative affect, and visual-verbal redundancy (VVR). We used linear mixed models to examine the effect of harm visibility on each outcome, controlling for warning statement.
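The linear mixed models described above would normally be fit with a statistics package; as a stdlib-only toy sketch (the records and ratings below are invented for illustration, not study data), this shows the repeated-measures layout and the raw per-condition means that the model's fixed effects estimate, while the model additionally treats participant and warning statement as grouping factors:

```python
from collections import defaultdict

# Toy repeated-measures records: (participant, statement, condition, pme_rating).
# Every participant rates every condition, so ratings are not independent.
records = [
    ("p1", "s1", "internal", 3.0), ("p1", "s1", "external", 3.4),
    ("p1", "s1", "both",     3.5), ("p2", "s1", "internal", 2.8),
    ("p2", "s1", "external", 3.1), ("p2", "s1", "both",     3.2),
]

def condition_means(rows):
    """Raw mean outcome per harm-visibility condition."""
    sums = defaultdict(lambda: [0.0, 0])
    for _, _, cond, y in rows:
        sums[cond][0] += y
        sums[cond][1] += 1
    return {cond: total / n for cond, (total, n) in sums.items()}

means = condition_means(records)
# A mixed model adds random effects for participant (and controls for
# statement), so repeated ratings from the same person are modeled jointly
# rather than treated as independent observations.
```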
RESULTS
Warnings depicting both internal and external harm, and those depicting external harm only, performed significantly better than internal-only harm depictions across all outcomes, including PME (B = 0.21 and B = 0.17), negative affect (B = 0.26 and B = 0.25), and VVR (B = 0.24 and B = 0.17), respectively (all p < .001). Compared to the combined depiction, the external-only depiction did not significantly change PME or negative affect but did significantly lower VVR (B = -0.07, p = .01).
CONCLUSIONS
Future cigar pictorial HWLs may benefit from including images depicting both internal and external harm, or external harm alone. Future research should examine the effect of harm visibility for other tobacco pictorial HWLs.
IMPLICATIONS
The cigar health warning labels (HWLs) proposed by the US Food and Drug Administration are text-only. We conducted an online survey experiment among people who use cigars to examine the effectiveness of warnings with images depicting different levels of harm visibility. We found that HWLs with images depicting both internal and external cigar harm, or external harm alone, performed better overall than images depicting internal harm only. These findings provide important regulatory evidence regarding what type of images may increase warning effectiveness and offer a promising route for future cigar HWL development.
PubMed: 38918001
DOI: 10.1093/ntr/ntae113
Journal of Sex Research, Jun 2024
Review
Coerced condomless sex is a prevalent form of sexual coercion that is associated with severe negative health consequences. This scoping review addresses the current lack of synthesized qualitative evidence on coerced condomless sex. Our systematic literature search yielded 21 articles that met review eligibility criteria. Themes of coerced condomless sex were organized into three categories (tactics, motives, and sequelae) and presented separately for studies based on whether researchers stipulated pregnancy promotion intent as underlying the behavior. Coerced condomless sex perpetration tactics ranged from verbal pressure to physical assault. Besides pregnancy promotion, perpetration motives included control, dominance, entrapment, enhancing sexual experiences, and avoiding conflict. Following coerced condomless sex, victims reported developing protective strategies. They also reported experiencing various negative emotional, relational, and physical health effects. Interventions that specifically address coerced condomless sex perpetration and provide supportive programs for those who have experienced coerced condomless sex may be beneficial.
PubMed: 38913125
DOI: 10.1080/00224499.2024.2365936
Frontiers in Sociology, 2024
With growing commercial, regulatory and scholarly interest in the use of Artificial Intelligence (AI) to profile and interact with human emotion ("emotional AI"), attention is turning to its capacity for manipulating people, relating to factors impacting a person's decisions and behavior. Given prior social disquiet about AI and profiling technologies, surprisingly little is known about people's views on the benefits and harms of emotional AI technologies, especially their capacity for manipulation. This matters because regulators of AI (such as in the European Union and the UK) wish to stimulate AI innovation, minimize harms and build public trust in these systems, but to do so they should understand the public's expectations. Addressing this, we ascertain UK adults' perspectives on the potential of emotional AI technologies for manipulating people through a two-stage study. Stage One (the qualitative phase) uses design fiction principles to generate adequate understanding and informed discussion in 10 focus groups with diverse participants (n = 46) on how emotional AI technologies may be used in a range of mundane, everyday settings. The focus groups primarily flagged concerns about manipulation in two settings: emotion profiling in social media (involving deepfakes, false information and conspiracy theories), and emotion profiling in child-oriented "emotoys" (where the toy responds to the child's facial and verbal expressions). In both these settings, participants express concerns that emotion profiling covertly exploits users' cognitive or affective weaknesses and vulnerabilities; additionally, in the social media setting, participants express concerns that emotion profiling damages people's capacity for rational thought and action. To explore these insights at a larger scale, Stage Two (the quantitative phase) conducts a UK-wide, demographically representative national survey (n = 2,068) on attitudes toward emotional AI.
Taking care to avoid leading and dystopian framings of emotional AI, we find that large majorities express concern about the potential for being manipulated through social media and emotoys. In addition to signaling need for civic protections and practical means of ensuring trust in emerging technologies, the research also leads us to provide a policy-friendly subdivision of what is meant by manipulation through emotional AI and related technologies.
PubMed: 38912311
DOI: 10.3389/fsoc.2024.1339834
Cyberpsychology, Behavior and Social..., Jun 2024
It is well known that social interaction enhances learning processes, improving abilities such as attention and memorization. However, it is not clear whether similar advantages may be obtained even in virtual environments. Here, we investigate whether virtual interactions in a video game, similarly to real-life social interactions, may improve individuals' performance in a subsequent implicit learning task. Twenty-one healthy participants were asked to play a cooperative video game for 20 minutes in three different gaming modalities: alone; together with someone without verbal interactions; and together with someone with verbal interactions. After each gaming session, participants were presented with an EEG paradigm directed to measure mismatch negativity (MMN) responses, a well-validated index of implicit learning. MMN responses were significantly larger following the session with verbal interactions, as compared with the other two conditions. No significant difference was found between the alone and nonverbal conditions. These results indicate that implicit learning processes are enhanced following communicative virtual interactions. Verbal interaction in a virtual environment seems necessary to elicit social copresence and its positive effects on learning performances. This finding may have important implications for the design of virtual rehabilitation protocols and distance learning programs.
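MMN is conventionally computed as a deviant-minus-standard difference wave from averaged EEG responses; as a minimal sketch of that arithmetic (the amplitudes below are invented, not the study's data, and real pipelines average many epochs per condition first):

```python
# Toy averaged ERP traces (microvolts) at successive time points,
# for frequent "standard" tones and rare "deviant" tones.
standard = [0.0, 0.5, 1.0, 0.8, 0.3]
deviant  = [0.0, 0.3, 0.2, -0.9, 0.1]

def mismatch_negativity(dev, std):
    """Return the deviant-minus-standard difference wave and its peak
    (most negative) amplitude, the usual MMN measure."""
    diff = [d - s for d, s in zip(dev, std)]
    return diff, min(diff)

diff_wave, mmn_amp = mismatch_negativity(deviant, standard)
# A larger (more negative) mmn_amp indicates a stronger MMN response.
```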
PubMed: 38905139
DOI: 10.1089/cyber.2023.0336
Noise & Health
AIMS
Digital noise reduction (DNR) minimizes the effect of noise on speech signals by continuously monitoring frequency bands in the presence of noise. In the present study, we explored the effect of DNR technology on speech intelligibility in individuals using hearing aids (HAs) and investigated implications for daily use.
METHODS AND MATERIAL
Eighteen participants with bilateral moderate sensorineural hearing loss (aged 16-45 years) were included. Bilateral receiver-in-the-ear HAs were fitted in the participants. The adaptive and nonadaptive (with a signal-to-noise ratio (SNR) of +5 and -5 dB, respectively) Turkish matrix sentence test (TURMatrix) in noise and free-field hearing assessments, including hearing thresholds with hearing aids, speech recognition thresholds (SRT), and speech discrimination scores, were conducted in two different conditions: HA in the DNR-on and DNR-off conditions.
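The +5 and -5 dB figures above refer to the standard decibel definition of signal-to-noise ratio; a short sketch of the arithmetic (the powers passed in are illustrative values, not measurements from the study):

```python
import math

def snr_db(p_signal, p_noise):
    """Signal-to-noise ratio in decibels, from signal and noise power."""
    return 10.0 * math.log10(p_signal / p_noise)

# At +5 dB SNR the speech power is about 3.16x the noise power;
# at -5 dB the relationship is reversed and noise dominates.
ratio_for_plus5 = 10 ** (5 / 10)  # power ratio corresponding to +5 dB
```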
RESULTS
No significant difference was observed between free-field hearing assessments with the HA in the DNR-off and DNR-on conditions (P > 0.05). Furthermore, the adaptive and nonadaptive TURMatrix revealed significant differences between the scores under the DNR-on and DNR-off conditions (P < 0.05). Nevertheless, under the DNR-on condition, there was no correlation between free-field hearing assessments with HA and TURMatrix results (P > 0.05). However, a significant correlation was observed between SRT scores with HA and TURMatrix scores (adaptive and nonadaptive, +5 and -5 dB SNR, respectively) under the DNR-off condition (P < 0.05).
CONCLUSION
Our study findings suggest that DNR can improve speech intelligibility in noisy environments. Therefore, DNR can enhance an individual's auditory comfort by improving their capacity to understand speech in background noise.
Topics: Humans; Hearing Aids; Adult; Noise; Male; Middle Aged; Hearing Loss, Sensorineural; Female; Young Adult; Adolescent; Speech Intelligibility; Signal-To-Noise Ratio; Auditory Threshold; Speech Perception; Speech Reception Threshold Test
PubMed: 38904826
DOI: 10.4103/nah.nah_67_23