-
Hearing Research Sep 2023
The relative contributions of superior temporal vs. inferior frontal and parietal networks to recognition of speech in a background of competing speech remain unclear, although the contributions themselves are well established. Here, we use fMRI with spectrotemporal modulation transfer function (ST-MTF) modeling to examine the speech information represented in temporal vs. frontoparietal networks for two speech recognition tasks with and without a competing talker. Specifically, 31 listeners completed two versions of a three-alternative forced choice competing speech task: "Unison" and "Competing", in which a female (target) and a male (competing) talker uttered identical or different phrases, respectively. Spectrotemporal modulation filtering (i.e., acoustic distortion) was applied to the two-talker mixtures and ST-MTF models were generated to predict brain activation from differences in spectrotemporal-modulation distortion on each trial. Three cortical networks were identified based on differential patterns of ST-MTF predictions and the resultant ST-MTF weights across conditions (Unison, Competing): a bilateral superior temporal (S-T) network, a frontoparietal (F-P) network, and a network distributed across cortical midline regions and the angular gyrus (M-AG). The S-T network and the M-AG network responded primarily to spectrotemporal cues associated with speech intelligibility, regardless of condition, but the S-T network responded to a greater range of temporal modulations suggesting a more acoustically driven response. The F-P network responded to the absence of intelligibility-related cues in both conditions, but also to the absence (presence) of target-talker (competing-talker) vocal pitch in the Competing condition, suggesting a generalized response to signal degradation. Task performance was best predicted by activation in the S-T and F-P networks, but in opposite directions (S-T: more activation = better performance; F-P: vice versa). 
Moreover, S-T network predictions were entirely ST-MTF mediated while F-P network predictions were ST-MTF mediated only in the Unison condition, suggesting an influence from non-acoustic sources (e.g., informational masking) in the Competing condition. Activation in the M-AG network was weakly positively correlated with performance and this relation was entirely superseded by those in the S-T and F-P networks. Regarding contributions to speech recognition, we conclude: (a) superior temporal regions play a bottom-up, perceptual role that is not qualitatively dependent on the presence of competing speech; (b) frontoparietal regions play a top-down role that is modulated by competing speech and scales with listening effort; and (c) performance ultimately relies on dynamic interactions between these networks, with ancillary contributions from networks not involved in speech processing per se (e.g., the M-AG network).
Topics: Male; Humans; Female; Speech; Speech Perception; Cognition; Cues; Acoustics; Speech Intelligibility; Perceptual Masking
PubMed: 37531847
DOI: 10.1016/j.heares.2023.108856 -
Frontiers in Psychology 2023
It is widely known that a pervasive symptom of anorexia nervosa (AN), among others, is body image overestimation, which largely contributes to the onset and maintenance of eating disorders. In the present study, we investigated the nature of the body image distortion by recording accuracy and reaction times in both a group of healthy controls and AN patients during two validated tasks requiring an implicit or explicit recognition of self/other hand stimuli, in which the perceived size of the stimuli was manipulated. Our results showed that (1) the perceived size of hand stimuli modulated both the implicit and explicit processing of body parts in both groups; (2) the implicit self-advantage emerged in both groups, but the bodily self, at an explicit level (perceptual, psycho-affective, cognitive) together with the integration and the distinction between self and other, was altered only in restrictive anorexia patients. Although further investigations will be necessary, these findings shed new light on the relationship between the different layers of self-experience and bodily self-disorders.
PubMed: 37519354
DOI: 10.3389/fpsyg.2023.1197319 -
IEEE Transactions on Image Processing :... 2023
Perception-based image analysis technologies can be used to help visually impaired people take better quality pictures by providing automated guidance, thereby empowering them to interact more confidently on social media. The photographs taken by visually impaired users often suffer from one or both of two kinds of quality issues: technical quality (distortions), and semantic quality, such as framing and aesthetic composition. Here we develop tools to help them minimize occurrences of common technical distortions, such as blur, poor exposure, and noise. We do not address the complementary problems of semantic quality, leaving that aspect for future work. The problem of assessing, and providing actionable feedback on, the technical quality of pictures captured by visually impaired users is hard enough, owing to the severe, commingled distortions that often occur. To advance progress on the problem of analyzing and measuring the technical quality of visually impaired user-generated content (VI-UGC), we built a very large and unique subjective image quality and distortion dataset. This new perceptual resource, which we call the LIVE-Meta VI-UGC Database, contains 40K real-world distorted VI-UGC images and 40K patches, on which we recorded 2.7M human perceptual quality judgments and 2.7M distortion labels. Using this psychometric resource, we also created an automatic limited vision picture quality and distortion predictor that learns local-to-global spatial quality relationships, achieving state-of-the-art prediction performance on VI-UGC pictures and significantly outperforming existing picture quality models on this unique class of distorted picture data. We also created a prototype feedback system, built on a multi-task learning framework, that helps guide users to mitigate quality issues and take better quality pictures. The dataset and models can be accessed at: https://github.com/mandal-cv/visimpaired.
Topics: Humans; Image Processing, Computer-Assisted; Semantics; Visually Impaired Persons; Color Perception; Visual Acuity
PubMed: 37432828
DOI: 10.1109/TIP.2023.3282067 -
Journal of Affective Disorders Oct 2023
OBJECTIVES
Evaluate differences in sustained attention (SAT) and associated neurofunctional profiles between bipolar disorder type I (BD), attention-deficit/hyperactivity disorder (ADHD), and healthy comparison (HC) youth.
METHODS
Adolescent participants, aged 12-17 years, with BD (n = 30) and ADHD (n = 28) and HC adolescents (n = 26) underwent structural and functional magnetic resonance imaging (fMRI) while completing a modified Continuous Performance Task-Identical Pairs task. Attentional load was modified in this task using three levels of image distortion (0 %, 25 % and 50 % image distortion). Task-related fMRI activation and performance measures (perceptual sensitivity index (PSI), response bias (RB), and response time (RT)) were calculated and compared between groups.
RESULTS
BD participants displayed lower perceptual sensitivity index (0 % p = 0.012; 25 % p = 0.015; 50 % p = 0.036) and higher values of response bias across levels of distortion (0 % p = 0.002, 25 % p = 0.001, and 50 % p = 0.008) as compared to HC. No statistically significant differences were observed for PSI and RB between BD and ADHD groups. No differences in RT were detected. Between-group and within-group differences in task-related fMRI measures were detected in several clusters. A region of interest (ROI) analysis of these clusters comparing BD and ADHD confirmed differences between these two groups.
CONCLUSIONS
Compared with HC, BD participants displayed SAT deficits. Increased attentional load revealed that BD participants had lower activation in brain regions associated with performance and integration of neural processes in SAT. ROI analysis between BD and ADHD participants shows that the differences were likely not attributable to ADHD comorbidity, suggesting SAT deficits were distinct to the BD group.
Topics: Humans; Adolescent; Bipolar Disorder; Mania; Brain; Attention; Attention Deficit Disorder with Hyperactivity; Magnetic Resonance Imaging
PubMed: 37380109
DOI: 10.1016/j.jad.2023.06.030 -
Journal of Imaging Jun 2023
Given the reference (distortion-free) image, full-reference image quality assessment (FR-IQA) algorithms seek to assess the perceptual quality of the test image. Over the years, many effective, hand-crafted FR-IQA metrics have been proposed in the literature. In this work, we present a novel framework for FR-IQA that combines multiple metrics and tries to leverage the strength of each by formulating FR-IQA as an optimization problem. Following the idea of other fusion-based metrics, the perceptual quality of a test image is defined as the weighted product of several already existing, hand-crafted FR-IQA metrics. Unlike other methods, the weights are determined in an optimization-based framework and the objective function is defined to maximize the correlation and minimize the root mean square error between the predicted and ground-truth quality scores. The obtained metrics are evaluated on four popular benchmark IQA databases and compared to the state of the art. This comparison has revealed that the compiled fusion-based metrics are able to outperform other competing algorithms, including deep learning-based ones.
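The weighted-product fusion described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the combined objective (RMSE minus Pearson correlation), the random-search optimizer, and the synthetic data are all assumptions made for the example, and the names `fuse`, `objective`, and `random_search` are hypothetical.

```python
import numpy as np


def fuse(metrics, weights):
    """Weighted product of metric scores: prod_i metrics[:, i] ** weights[i].

    metrics: (n_images, n_metrics) array of positive FR-IQA scores.
    weights: (n_metrics,) array of exponents.
    """
    return np.prod(metrics ** weights, axis=1)


def objective(weights, metrics, mos):
    """Lower is better: RMSE minus Pearson correlation (assumed combination)."""
    pred = fuse(metrics, weights)
    rmse = np.sqrt(np.mean((pred - mos) ** 2))
    corr = np.corrcoef(pred, mos)[0, 1]
    return rmse - corr


def random_search(metrics, mos, n_iter=2000, seed=0):
    """Very simple stand-in optimizer: keep a random perturbation if it improves."""
    rng = np.random.default_rng(seed)
    best_w = np.ones(metrics.shape[1]) / metrics.shape[1]
    best_val = objective(best_w, metrics, mos)
    for _ in range(n_iter):
        cand = best_w + rng.normal(scale=0.1, size=best_w.shape)
        val = objective(cand, metrics, mos)
        if val < best_val:
            best_w, best_val = cand, val
    return best_w, best_val


if __name__ == "__main__":
    # Synthetic stand-in data: three positive "metric" columns and
    # ground-truth scores generated from known exponents.
    rng = np.random.default_rng(42)
    metrics = rng.uniform(0.1, 1.0, size=(200, 3))
    mos = fuse(metrics, np.array([0.5, 1.0, 0.2]))
    weights, value = random_search(metrics, mos)
    print("fitted weights:", weights, "objective:", value)
```

In practice the metric columns would hold scores from existing FR-IQA metrics on a benchmark database, and a more capable optimizer would likely replace the random search.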
PubMed: 37367464
DOI: 10.3390/jimaging9060116 -
Applied Sciences (Basel, Switzerland) May 2023
Speech is a communication method found only in humans that relies on precisely articulated sounds to encode and express thoughts. Anatomical differences in the maxilla, mandible, tooth position, and vocal tract affect tongue placement and broadly influence the patterns of airflow and resonance during speech production. Alterations in these structures can create perceptual distortions in speech known as speech sound disorders (SSDs). As craniofacial development occurs, the vocal tract, jaws, and teeth change in parallel with stages of speech development, from babbling to adult phonation. Alterations from a normal Class 1 dental and skeletal relationship can impact speech. Dentofacial disharmony (DFD) patients have jaw disproportions, with a high prevalence of SSDs, where the severity of malocclusion correlates with the degree of speech distortion. DFD patients often seek orthodontic and orthognathic surgical treatment, but there is limited familiarity among dental providers on the impacts of malocclusion and its correction on speech. We sought to review the interplay between craniofacial and speech development and the impacts of orthodontic and surgical treatment on speech. Shared knowledge can facilitate collaborations between dental specialists and speech pathologists for the proper diagnosis, referral, and treatment of DFD patients with speech pathologies.
PubMed: 37323873
DOI: 10.3390/app13095496 -
BioRxiv : the Preprint Server For... Feb 2024
While humans experience the visual environment in a panoramic 220° view, traditional functional MRI setups are limited to displaying images like postcards in the central 10-15° of the visual field. Thus, it remains unknown how a scene is represented in the brain when perceived across the full visual field. Here, we developed a novel method for ultra-wide angle visual presentation and probed for signatures of immersive scene representation. To accomplish this, we bounced the projected image off angled mirrors directly onto a custom-built curved screen, creating an unobstructed view of 175°. Scene images were created from custom-built virtual environments with a compatible wide field-of-view to avoid perceptual distortion. We found that immersive scene representation drives medial cortex with far-peripheral preferences, but surprisingly had little effect on classic scene regions. That is, scene regions showed relatively minimal modulation over dramatic changes of visual size. Further, we found that scene and face-selective regions maintain their content preferences even under conditions of central scotoma, when only the extreme far-peripheral visual field is stimulated. These results highlight that not all far-peripheral information is automatically integrated into the computations of scene regions, and that there are routes to high-level visual areas that do not require direct stimulation of the central visual field. Broadly, this work provides new clarifying evidence on content vs. peripheral preferences in scene representation, and opens new neuroimaging research avenues to understand immersive visual representation.
PubMed: 37292806
DOI: 10.1101/2023.05.14.540275 -
Journal of Experimental Psychology.... Nov 2023
How people set decision criteria in signal detection models is an important research question. The likelihood ratio (LR) theory, which is one of the most influential theories about criteria setting, typically assumes that (a) decisions are based on the objective LR of the signal and noise distributions, and (b) LR criteria do not change across tasks with various difficulty levels. However, it is often questioned whether people are really able to know the exact shape of signal and noise distributions, and compute the objective LR accordingly. Here we suggest that whether decision criteria are set based on objective LR can be tested in two-condition experiments with different difficulty levels across conditions. We then asked participants in three empirical experiments to perform two-condition perceptual or memory tasks and give their answers using a confidence rating scale. Results revealed that the two assumptions of LR theory contradicted each other: if we assumed decision criteria were based on objective LR, then the estimated LR criteria differed across difficulty levels, and fanned out as task difficulty decreased. We suggest people might inaccurately estimate the LR in signal detection tasks, and several possible explanations for the distortion of LR are discussed. (PsycInfo Database Record (c) 2023 APA, all rights reserved).
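To make the link between a criterion's position and its objective LR concrete, here is a small sketch under the textbook equal-variance Gaussian model (noise N(0, 1), signal N(d', 1)). That model is an assumption for illustration only; the abstract does not state which distributions the authors fitted, and the function names are hypothetical.

```python
import math


def likelihood_ratio(x, d_prime):
    """Objective LR of signal N(d', 1) vs. noise N(0, 1) at evidence value x."""
    return math.exp(d_prime * x - d_prime ** 2 / 2)


def criterion_for_lr(beta, d_prime):
    """Evidence value at which the objective LR equals beta."""
    return math.log(beta) / d_prime + d_prime / 2


if __name__ == "__main__":
    # A criterion held at a fixed offset from the midpoint d'/2 maps to a
    # log LR of d' * offset, so the implied LR criteria spread further from 1
    # as the task gets easier (larger d').
    offset = 0.5
    for d_prime in (1.0, 2.0):
        x = d_prime / 2 + offset
        print(d_prime, math.log(likelihood_ratio(x, d_prime)))
```

Under these assumptions, criteria that stay at similar positions on the evidence axis imply LR values that diverge as d' grows, which is one way LR criteria could appear to fan out across difficulty levels.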
PubMed: 37261799
DOI: 10.1037/xge0001438 -
IEEE Transactions on Pattern Analysis... Sep 2023
Image editing and compositing have become ubiquitous in entertainment, from digital art to AR and VR experiences. To produce beautiful composites, the camera needs to be geometrically calibrated, which can be tedious and requires a physical calibration target. In place of the traditional multi-image calibration process, we propose to infer the camera calibration parameters such as pitch, roll, field of view, and lens distortion directly from a single image using a deep convolutional neural network. We train this network using automatically generated samples from a large-scale panorama dataset, yielding competitive accuracy in terms of standard ℓ2 error. However, we argue that minimizing such standard error metrics might not be optimal for many applications. In this work, we investigate human sensitivity to inaccuracies in geometric camera calibration. To this end, we conduct a large-scale human perception study where we ask participants to judge the realism of 3D objects composited with correct and biased camera calibration parameters. Based on this study, we develop a new perceptual measure for camera calibration and demonstrate that our deep calibration network outperforms previous single-image based calibration methods both on standard metrics as well as on this novel perceptual measure. Finally, we demonstrate the use of our calibration network for several applications, including virtual object insertion, image retrieval, and compositing.
PubMed: 37195850
DOI: 10.1109/TPAMI.2023.3269641 -
Applied Neuropsychology. Adult Apr 2023
There are many commonalities between the clinical symptoms of dementia with Lewy bodies (DLB) and those of Alzheimer's disease (AD). The accurate differentiation of these two diseases is an important neuropsychological issue. The Mini-Mental State Examination (MMSE) is often used as a screening test for dementing disorders. We created evaluation items for the pentagon copy test of MMSE and developed a simple, highly accurate evaluation method for differentiating DLB in combination with conventional evaluation items such as the Qualitative Scoring MMSE Pentagon Test (QSPT). Subjects were divided into three groups: DLB (n = 119), AD (n = 50), and Normal (n = 26). The severities of DLB and AD ranged from mild cognitive impairment (MCI) to mild dementia. We compared the results of the pentagon copy test. We found that the rates of patients with abnormalities in "motor incoordination" and "gestalt destruction" were higher in the DLB group than the AD group. Furthermore, receiver operating characteristic curve analysis suggested the differentiation of DLB with high accuracy (sensitivity: 0.70, specificity: 0.78) using the criterion of patients meeting one of the following three characteristics: "the number of angles on QSPT: scores other than 4," "major tremor (Parkinsonism-related tremor) is present," and "gestalt destruction (distortion in overall coherence) is present." This evaluation method may be clinically useful for evaluating MCI to mild DLB patients because the burden on patients is low.
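The "meets any one of three characteristics" criterion and the resulting sensitivity and specificity can be illustrated with a short sketch. The records below are invented toy data, not the study's data, and the field and function names are hypothetical.

```python
def screens_positive(angles, major_tremor, gestalt_destruction):
    """Positive if any of the three characteristics from the abstract is met."""
    return angles != 4 or major_tremor or gestalt_destruction


def sensitivity_specificity(records):
    """records: iterable of (is_dlb, angles, major_tremor, gestalt_destruction)."""
    tp = fn = tn = fp = 0
    for is_dlb, angles, tremor, gestalt in records:
        positive = screens_positive(angles, tremor, gestalt)
        if is_dlb and positive:
            tp += 1
        elif is_dlb:
            fn += 1
        elif positive:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy records, for illustration only.
TOY_RECORDS = [
    (True, 4, False, True),
    (True, 3, False, False),
    (True, 4, False, False),
    (True, 4, True, False),
    (False, 4, False, False),
    (False, 4, False, False),
    (False, 2, False, False),
    (False, 4, False, False),
]

if __name__ == "__main__":
    print(sensitivity_specificity(TOY_RECORDS))
```

Sensitivity here is the share of DLB cases flagged by the criterion, and specificity is the share of non-DLB cases left unflagged.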
PubMed: 37052204
DOI: 10.1080/23279095.2023.2200948