Computational Intelligence and... 2021
During the past two decades, many remote sensing image fusion techniques have been designed to improve the spatial resolution of low-spatial-resolution multispectral bands. The main objective is to fuse the low-resolution multispectral (MS) image and the high-spatial-resolution panchromatic (PAN) image to obtain a fused image with high spatial and spectral information. Recently, many artificial intelligence-based deep learning models have been designed to fuse remote sensing images, but these models do not consider the inherent difference in image distribution between MS and PAN images. Therefore, the fused images may suffer from gradient and color distortion. To overcome these problems, this paper proposes an efficient artificial intelligence-based deep transfer learning model. The Inception-ResNet-v2 model is improved by using a color-aware perceptual loss (CPL). The fused images are further improved by applying a gradient channel prior as a postprocessing step, which is used to preserve color and gradient information. Extensive experiments are carried out on benchmark datasets. Performance analysis shows that the proposed model preserves color and gradient information in the fused remote sensing images more effectively than existing models.
Topics: Artificial Intelligence; Remote Sensing Technology
PubMed: 34976044
DOI: 10.1155/2021/7615106
Scientific Reports Dec 2021
We introduce the Bioacoustic Cocktail Party Problem Network (BioCPPNet), a lightweight, modular, and robust U-Net-based machine learning architecture optimized for bioacoustic source separation across diverse biological taxa. Employing learnable or handcrafted encoders, BioCPPNet operates directly on the raw acoustic mixture waveform containing overlapping vocalizations and separates the input waveform into estimates corresponding to the sources in the mixture. Predictions are compared to the reference ground truth waveforms by searching over the space of (output, target) source order permutations, and we train using an objective function motivated by perceptual audio quality. We apply BioCPPNet to several species with unique vocal behavior, including macaques, bottlenose dolphins, and Egyptian fruit bats, and we evaluate reconstruction quality of separated waveforms using the scale-invariant signal-to-distortion ratio (SI-SDR) and downstream identity classification accuracy. We consider mixtures with two or three concurrent conspecific vocalizers, and we examine separation performance in open and closed speaker scenarios. To our knowledge, this paper redefines the state-of-the-art in end-to-end single-channel bioacoustic source separation in a permutation-invariant regime across a heterogeneous set of non-human species. This study serves as a major step toward the deployment of bioacoustic source separation systems for processing substantial volumes of previously unusable data containing overlapping bioacoustic signals.
Topics: Acoustics; Animals; Humans; Machine Learning; Neural Networks, Computer; Vocalization, Animal
PubMed: 34873197
DOI: 10.1038/s41598-021-02790-2
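The BioCPPNet abstract above evaluates separated waveforms with SI-SDR and compares predictions to references by searching over (output, target) permutations. The sketch below illustrates both ideas in plain Python under stated assumptions: it uses the commonly cited SI-SDR definition, treats waveforms as short lists of floats, and scores permutations by mean SI-SDR. The function names, the `EPS` constant, and the toy signals are hypothetical and are not taken from the paper, whose training objective is described only as motivated by perceptual audio quality.

```python
import itertools
import math

EPS = 1e-12  # assumed small constant guarding against division by zero and log(0)


def si_sdr(estimate, target):
    """Scale-invariant signal-to-distortion ratio (dB) of two equal-length sequences."""
    dot = sum(e * t for e, t in zip(estimate, target))
    target_energy = sum(t * t for t in target)
    alpha = dot / (target_energy + EPS)  # optimal rescaling of the target
    scaled = [alpha * t for t in target]
    signal_energy = sum(s * s for s in scaled)
    noise_energy = sum((e - s) ** 2 for e, s in zip(estimate, scaled))
    return 10.0 * math.log10((signal_energy + EPS) / (noise_energy + EPS))


def best_permutation(estimates, targets):
    """Return (perm, score) where perm[i] is the target index assigned to estimate i."""
    best_perm, best_score = None, -math.inf
    for perm in itertools.permutations(range(len(targets))):
        score = sum(si_sdr(est, targets[j]) for est, j in zip(estimates, perm)) / len(perm)
        if score > best_score:
            best_perm, best_score = perm, score
    return best_perm, best_score


# Hypothetical toy data: two sources, with the estimates emitted in swapped order.
targets = [[1.0, 0.0, -1.0, 0.0], [0.0, 1.0, 0.0, -1.0]]
estimates = [[0.1, 0.9, 0.0, -1.1], [2.0, 0.1, -2.1, 0.0]]
perm, score = best_permutation(estimates, targets)
```

The exhaustive search grows factorially with the number of sources, which is manageable for the two- and three-vocalizer mixtures the abstract mentions.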
The Journal of the Acoustical Society... Nov 2021
Understanding speech in noisy environments, such as classrooms, is a challenge for children. When a spatial separation is introduced between the target and masker, as compared to when both are co-located, children demonstrate intelligibility improvement of the target speech. Such intelligibility improvement is known as spatial release from masking (SRM). In most reverberant environments, binaural cues associated with the spatial separation are distorted; the extent to which such distortion will affect children's SRM is unknown. Two virtual acoustic environments with reverberation times between 0.4 s and 1.1 s were compared. SRM was measured using a spatial separation with symmetrically displaced maskers to maximize access to binaural cues. The role of informational masking in modulating SRM was investigated through voice similarity between the target and masker. Results showed that, contrary to previous developmental findings on free-field SRM, children's SRM in reverberation has not yet reached maturity in the 7-12 years age range. When reducing reverberation, an SRM improvement was seen in adults but not in children. Our findings suggest that, even though school-age children have access to binaural cues that are distorted in reverberation, they demonstrate immature use of such cues for speech-in-noise perception, even in mild reverberation.
Topics: Acoustics; Adult; Child; Humans; Noise; Perceptual Masking; Schools; Speech Intelligibility; Speech Perception
PubMed: 34852617
DOI: 10.1121/10.0006752
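The abstract above defines SRM as the intelligibility improvement gained from spatially separating target and masker. A minimal sketch of that arithmetic follows, assuming SRM is quantified in the common way as the difference between speech reception thresholds (in dB) in the co-located and separated conditions. The abstract does not state which intelligibility measure the study used, and the threshold values below are hypothetical.

```python
def spatial_release_from_masking(srt_colocated_db, srt_separated_db):
    """SRM in dB; positive values mean separation made the target easier to understand.

    Assumes lower speech reception thresholds indicate better performance.
    """
    return srt_colocated_db - srt_separated_db


# Hypothetical thresholds for one listener, not data from the study.
srm_db = spatial_release_from_masking(srt_colocated_db=-2.0, srt_separated_db=-6.5)
```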
JAMA Psychiatry Feb 2022
IMPORTANCE
Recent accounts suggest that delusions and hallucinations may result from alterations in how prior knowledge is integrated with new information, but experimental evidence supporting this idea has been complex and inconsistent. Evidence from a simpler perceptual task would make clear whether psychotic symptoms are associated with overreliance on prior information and impaired updating.
OBJECTIVE
To investigate whether individuals with schizophrenia or schizoaffective disorder (PSZ) and healthy control individuals (HCs) differ in the ability to update their beliefs based on evidence in a relatively simple perceptual paradigm.
DESIGN, SETTING, AND PARTICIPANTS
This case-control study included individuals who met DSM-IV criteria for PSZ and matched HC participants in 2 independent samples. The PSZ group was recruited from the Maryland Psychiatric Research Center, Yale University, and community clinics, and the HC group was recruited from the community. To test perceptual updating, a random dot kinematogram paradigm was implemented in which dots moving coherently in a single direction were mixed with randomly moving dots. On 50% of trials, the direction of coherent motion changed by 90° midway through the trial. Participants were asked to report the direction perceived at the end of the trial. The Peters Delusions Inventory and Brief Psychiatric Rating Scale (BPRS) were used to quantify the severity of positive symptoms. Data were collected from September 2018 to March 2020 and were analyzed from approximately March 2020 to March 2021.
MAIN OUTCOMES AND MEASURES
Critical measures included the proportion of responses centered around the initial direction vs the subsequent changed direction and the overall precision of motion perception and reaction times.
RESULTS
A total of 48 participants were included in the PSZ group (31 [65%] male; mean [SD] age, 36.56 [9.76] years) and 36 in the HC group (22 [61%] male; mean [SD] age, 35.67 [10.74] years) in the original sample. An independent replication sample included 42 participants in the PSZ group (29 [69%] male; mean [SD] age, 33.98 [11.03] years) and 34 in the HC group (20 [59%] male; mean [SD] age, 34.29 [10.44] years). In line with previous research, patients with PSZ were less precise and had slower reaction times overall. The key finding was that patients with PSZ were significantly more likely (original sample: mean, 27.88 [95% CI, 24.19-31.57]; replication sample: mean, 26.70 [95% CI, 23.53-29.87]) than HC participants (original sample: mean, 18.86 [95% CI, 16.56-21.16]; replication sample: mean, 15.67 [95% CI, 12.61-18.73]) to report the initial motion direction rather than the final one. Moreover, the tendency to report the direction of initial motion correlated with the degree of conviction on the Peters Delusions Inventory (original sample: r = 0.32 [P = .05]; replication sample: r = 0.30 [P = .05]) and the Brief Psychiatric Rating Scale Reality Distortion score (original sample: r = 0.55 [P = .001]; replication sample: r = 0.35 [P = .03]) and severity of hallucinations (original sample: r = 0.39 [P = .02]; replication sample: r = 0.30 [P = .05]).
CONCLUSIONS AND RELEVANCE
The findings of this case-control study suggest that the severity of psychotic symptoms is associated with a tendency to overweight initial information over incoming sensory evidence. These results are consistent with predictive coding accounts of the origins of positive symptoms and suggest that deficits in very elementary perceptual updating may be a critical mechanism in psychosis.
Topics: Adult; Case-Control Studies; Cognition; Female; Humans; Male; Middle Aged; Neuropsychological Tests; Patient Acuity; Psychotic Disorders; Schizophrenic Psychology; Young Adult
PubMed: 34851373
DOI: 10.1001/jamapsychiatry.2021.3482
Frontiers in Medicine 2021
The cochlea plays a key role in the transmission from acoustic vibration to neural stimulation upon which the brain perceives the sound. A cochlear implant (CI) is an auditory prosthesis to replace the damaged cochlear hair cells to achieve acoustic-to-neural conversion. However, the CI is a very coarse bionic imitation of the normal cochlea. The highly resolved time-frequency-intensity information transmitted by the normal cochlea, which is vital to high-quality auditory perception such as speech perception in challenging environments, cannot be guaranteed by CIs. Although CI recipients with state-of-the-art commercial CI devices achieve good speech perception in quiet backgrounds, they usually suffer from poor speech perception in noisy environments. Therefore, noise suppression or speech enhancement (SE) is one of the most important technologies for CI. In this study, we introduce recent progress in deep learning (DL), mostly neural networks (NN)-based SE front ends to CI, and discuss how the hearing properties of the CI recipients could be utilized to optimize the DL-based SE. In particular, different loss functions are introduced to supervise the NN training, and a set of objective and subjective experiments is presented. Results verify that the CI recipients are more sensitive to the residual noise than the SE-induced speech distortion, which has been common knowledge in CI research. Furthermore, speech reception threshold (SRT) in noise tests demonstrate that the intelligibility of the denoised speech is significantly better when the NN is trained with a loss function biased toward greater noise suppression than when it is trained with a loss function that weights noise residue and speech distortion equally.
PubMed: 34820392
DOI: 10.3389/fmed.2021.740123
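The abstract above contrasts a loss that weights noise residue and speech distortion equally with one biased toward noise suppression. The sketch below shows one way such a trade-off can be written, as a weighted sum of a speech distortion term and a residual noise term computed from a per-sample gain. It is an illustration only: the abstract does not give the loss functions used in the study, and the function name, the `alpha` weight, and the toy signals are all hypothetical.

```python
def weighted_se_loss(gain, clean, noise, alpha=0.5):
    """Weighted sum of speech distortion and residual noise for a per-sample gain.

    alpha = 0.5 weights both terms equally; alpha < 0.5 puts more weight on
    suppressing residual noise (hypothetical parameterization).
    """
    # Speech distortion: how far the processed clean speech is from the clean speech.
    distortion = sum((g * s - s) ** 2 for g, s in zip(gain, clean)) / len(clean)
    # Residual noise: energy of the noise that survives the gain.
    residue = sum((g * n) ** 2 for g, n in zip(gain, noise)) / len(noise)
    return alpha * distortion + (1.0 - alpha) * residue


# Hypothetical toy example: a mild gain leaves little distortion but much residual noise.
gain = [0.8, 0.8]
clean = [1.0, 1.0]
noise = [1.0, -1.0]
balanced = weighted_se_loss(gain, clean, noise, alpha=0.5)
noise_biased = weighted_se_loss(gain, clean, noise, alpha=0.2)
```

With these toy values the noise-biased loss penalizes the mild gain more heavily than the balanced loss does, so minimizing it would push toward stronger attenuation.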
Visual Cognition Jan 2021
Studies suggest looming motion represents a special class of attentional capture stimulus due to behavioral urgency: the need to act upon objects moving toward us in an environment. In particular, one theory suggests that faster reaction times to targets cued by looming relative to receding motion are driven by post-attentional, motor-priming processes beyond the attentional capture effects seen with other stimulus qualities such as color pop-out. The present study tested this theory using a relative size judgment task where targets were pre-cued by looming and receding optic flow fields. Results show systematic increases in the perceived size of targets that were cued by looming flow fields, consistent with previous attentional capture studies using onset cues. These results challenge theories attributing behavioral changes from looming motion to motor-priming alone.
PubMed: 34712098
DOI: 10.1080/13506285.2021.1874583
Frontiers in Neuroscience 2021
The Better hEAring Rehabilitation (BEAR) project aims to provide a new clinical profiling tool (a test battery) for hearing loss characterization. Although the loss of sensitivity can be efficiently measured using pure-tone audiometry, the assessment of supra-threshold hearing deficits remains a challenge. In contrast to the classical "attenuation-distortion" model, the proposed BEAR approach is based on the hypothesis that the hearing abilities of a given listener can be characterized along two dimensions, reflecting independent types of perceptual deficits (distortions). A data-driven approach provided evidence for the existence of different auditory profiles with different degrees of distortions. Ten tests were included in a test battery, based on their clinical feasibility, time efficiency, and related evidence from the literature. The tests were divided into six categories: audibility, speech perception, binaural processing abilities, loudness perception, spectro-temporal modulation sensitivity, and spectro-temporal resolution. Seventy-five listeners with symmetric, mild-to-severe sensorineural hearing loss were selected from a clinical population. The analysis of the results showed interrelations among outcomes related to high-frequency processing and outcome measures related to low-frequency processing abilities. The results showed the ability of the tests to reveal differences among individuals and their potential use in clinical settings.
PubMed: 34658768
DOI: 10.3389/fnins.2021.724007
Journal of the Association For Research... Dec 2021
Age-related hearing loss (ARHL) is a devastating public health issue. To successfully address ARHL using existing and future treatments, it is imperative to detect the earliest signs of age-related auditory decline and understand the mechanisms driving it. Here, we explore early signs of age-related auditory decline by characterizing cochlear function in 199 ears aged 10-65 years, all of which had clinically defined normal hearing (i.e., behavioral thresholds ≤ 25 dB HL from .25 to 8 kHz bilaterally) and no history of noise exposure. We characterized cochlear function by measuring behavioral thresholds in two paradigms (traditional audiometric thresholds from .25 to 8 kHz and Békésy tracking thresholds from .125 to 20 kHz) and distortion product otoacoustic emission (DPOAE) growth functions at f = 2, 4, and 8 kHz. Behavioral thresholds through a standard clinical frequency range (up to 8 kHz) showed statistically, but not clinically, significant declines across increasing decades of life. In contrast, DPOAE growth measured in the same frequency range showed clear declines as early as 30 years of age, particularly across moderate stimulus levels (L = 25-45 dB SPL). These substantial declines in DPOAE growth were not fully explained by differences in behavioral thresholds measured in the same frequency region. Additionally, high-frequency Békésy tracking thresholds above ~11.2 kHz showed frank declines with increasing age. Collectively, these results suggest that early age-related cochlear decline (1) begins as early as the third or fourth decade of life, (2) is greatest in the cochlear base but apparent through the length of the cochlear partition, (3) cannot be detected fully by traditional clinical measures, and (4) is likely due to a complex mix of etiologies.
Topics: Acoustic Stimulation; Adolescent; Adult; Aged; Aged, 80 and over; Aging; Audiometry; Auditory Threshold; Child; Cochlea; Female; Hearing Disorders; Humans; Male; Middle Aged; Otoacoustic Emissions, Spontaneous; Perceptual Distortion; Young Adult
PubMed: 34591199
DOI: 10.1007/s10162-021-00805-3
Journal of Imaging Aug 2021
Currently available 360° cameras normally capture several images covering a scene in all directions around a shooting point. The captured images are spherical in nature and are mapped to a two-dimensional plane using various projection methods. Many projection formats have been proposed for 360° videos. However, standards for quality assessment of 360° images are limited. In this paper, various projection formats are compared to explore the problem of distortion caused by the mapping operation, which has been a considerable challenge in recent approaches. The performances of various projection formats, including equi-rectangular, equal-area, cylindrical, cube-map, and their modified versions, are evaluated according to which conversion causes the least distortion when the format is changed. The evaluation is conducted using sample images selected based on several attributes that determine the perceptual image quality. The evaluation results based on the objective quality metrics show that the hybrid equi-angular cube-map format is the most appropriate common format for 360° image services in which format conversions are frequently required. This study presents a ranking of these formats that is useful for identifying the best image format for a future standard.
PubMed: 34460774
DOI: 10.3390/jimaging7080137
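The abstract above concerns distortion introduced when a spherical image is mapped to a plane. As a minimal illustration, the sketch below implements the standard equi-rectangular mapping from a viewing direction (longitude, latitude) to pixel coordinates, along with the horizontal oversampling factor that this projection introduces away from the equator. The function names and image size are hypothetical, and the sketch does not reproduce the paper's evaluation or the hybrid equi-angular cube-map format.

```python
import math


def direction_to_equirect(lon, lat, width, height):
    """Map longitude in [-pi, pi) and latitude in [-pi/2, pi/2] to pixel (u, v).

    u grows eastward from the left edge; v grows downward from the north pole.
    """
    u = (lon / (2.0 * math.pi) + 0.5) * width
    v = (0.5 - lat / math.pi) * height
    return u, v


def horizontal_stretch(lat):
    """Horizontal oversampling of the equi-rectangular format at a given latitude.

    Every image row spans the full width, but the circle of latitude it covers
    shrinks by cos(lat), so content is stretched by 1 / cos(lat). The factor
    grows without bound toward the poles.
    """
    return 1.0 / math.cos(lat)


# Hypothetical 4096 x 2048 panorama.
center = direction_to_equirect(0.0, 0.0, 4096, 2048)
stretch_60 = horizontal_stretch(math.radians(60.0))
```

At 60° latitude the stretch factor is about 2, which is the kind of non-uniform sampling that cube-map style formats aim to reduce.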
Frontiers in Neuroscience 2021
Hallucinogen-persisting perception disorder (HPPD) features as a diagnostic category in the DSM-5, ICD-11, and other major classifications, but our knowledge of the phenomenology of the perceptual symptoms involved and the changes in consciousness during the characteristic "flashbacks" is limited. We systematically evaluated original case reports and case series on HPPD to define its phenomenology, associated (psycho)pathology, and course. Our search of PubMed and Embase yielded 66 relevant publications that described 97 people who, together, experienced 64 unique symptoms of HPPD. Of these, 76% concerned symptoms characteristic of Alice in Wonderland syndrome, over 50% non-visual symptoms, and 38% perceptual symptoms not clearly linked to prior intoxication states. This is in contrast with the DSM-5 diagnostic criteria for HPPD. Even though less than half of the patients showed a protracted disease course of over a year, a third achieved remission. However, in patients with co-occurring depression (with or without anxiety) HPPD symptoms persisted longer and treatment outcomes were more often negative. Thus, unlike the acute stages of psychedelic drug intoxication, which may be accompanied by altered states of consciousness, HPPD is rather characterized by changes in the content of consciousness and an attentional shift from exogenous to endogenous phenomena. Since HPPD is a more encompassing nosological entity than suggested in the DSM-5, we recommend expanding its diagnostic criteria. In addition, we make recommendations for clinical practice and future research.
PubMed: 34456666
DOI: 10.3389/fnins.2021.675768