-
Nature Communications Jun 2024
Humans produce two forms of cognitively complex vocalizations: speech and song. It is debated whether these differ based primarily on culturally specific, learned features, or if acoustical features can reliably distinguish them. We study the spectro-temporal modulation patterns of vocalizations produced by 369 people living in 21 urban, rural, and small-scale societies across six continents. Specific ranges of spectral and temporal modulations, overlapping within categories and across societies, significantly differentiate speech from song. Machine-learning classification shows that this effect is cross-culturally robust, with vocalizations being reliably classified solely from their spectro-temporal features across all 21 societies. Listeners unfamiliar with the cultures classify these vocalizations using spectro-temporal cues similar to those used by the machine-learning algorithm. Finally, spectro-temporal features are better able to discriminate song from speech than a broad range of other acoustical variables, suggesting that spectro-temporal modulation, a key feature of auditory neuronal tuning, accounts for a fundamental difference between these categories.
Topics: Humans; Speech; Male; Female; Machine Learning; Adult; Acoustics; Cross-Cultural Comparison; Auditory Perception; Sound Spectrography; Singing; Music; Middle Aged; Young Adult
PubMed: 38844457
DOI: 10.1038/s41467-024-49040-3 -
Preventive Medicine Reports Jul 2024
Review
BACKGROUND
Noma is a neglected tropical disease and a global health concern.
OBJECTIVES
To elucidate the epidemiology, management, prevention, and public health implications of Noma.
METHODS
PubMed, Scopus, and Web of Science, supplemented by Google Scholar and World Health Organization databases, were searched using keywords to gather both published and grey literature from 1970 to 2023 in English.
RESULTS
Approximately 30,000-40,000 cases occur annually, with incidence varying across African countries such as Nigeria, Niger, and Chad. Incidence in Nigerian and Ethiopian states ranges from 0.6 to 3300 and from 1.64 to 13.4 per 100,000 population, respectively. Mortality is approximately 8.5% in Niger. Risk factors include malnutrition, immunocompromised status, poor dental hygiene, inadequate sanitation, gingival lesions, low socioeconomic status, chronic and infectious diseases, low birth weight, high parity, diarrhoea, and fever. Diagnosis is made primarily on the basis of clinical signs and symptoms, and the disease is staged accordingly. Stages I, II, and III present with acute necrotizing gingivitis, facial edema with halitosis, and necrotizing stomatitis, respectively. If the patient survives the acute stages, progression to Stage IV and Stage V manifests as trismus, difficulty in deglutition and phonation, and facial disfigurement, with increased severity in the last stage. Treatment encompasses antibiotic therapy (amoxicillin, metronidazole, chlorhexidine, ampicillin, gentamicin), surgical interventions, wound management (honey dressing, ketamine), and nutritional support. Prevention strategies include oral hygiene, vaccination, health education, and community-based interventions.
CONCLUSION
Noma's recent inclusion in the WHO list of neglected tropical diseases is a milestone in recognizing the importance of prevention and early intervention to enhance health outcomes globally.
PubMed: 38826589
DOI: 10.1016/j.pmedr.2024.102764 -
PLOS Digital Health May 2024
Detecting voice disorders from voice recordings could allow for frequent, remote, and low-cost screening before costly clinical visits and a more invasive laryngoscopy examination. Our goals were to detect unilateral vocal fold paralysis (UVFP) from voice recordings using machine learning, to identify which acoustic variables were important for prediction to increase trust, and to determine model performance relative to clinician performance. Patients with UVFP confirmed through endoscopic examination (N = 77) and controls with normal voices matched for age and sex (N = 77) were included. Voice samples were elicited by reading the Rainbow Passage and sustaining phonation of the vowel "a". Four machine learning models of differing complexity were used. SHapley Additive exPlanations (SHAP) was used to identify important features. The highest median bootstrapped ROC AUC score was 0.87 and exceeded clinicians' performance (range: 0.74-0.81) based on the recordings. Recording durations differed between UVFP recordings and controls because of how the data were originally processed for storage, and we show that duration alone can separate the two groups. Counterintuitively, many UVFP recordings had higher intensity than controls, although UVFP patients tend to have weaker voices, revealing a dataset-specific bias which we mitigate in an additional analysis. We demonstrate that recording biases in audio duration and intensity created dataset-specific differences between patients and controls, which models used to improve classification. Furthermore, clinicians' ratings provide further evidence that patients were over-projecting their voices and being recorded at a higher signal amplitude than controls. Interestingly, after matching audio duration and removing variables associated with intensity in order to mitigate the biases, the models were able to achieve similarly high performance.
We provide a set of recommendations to avoid bias when building and evaluating machine learning models for screening in laryngology.
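The "median bootstrapped ROC AUC" reported above can be illustrated with a minimal sketch. It assumes the bootstrap resamples per-recording predictions with replacement; the abstract does not say how the authors bootstrapped, so this is only one plausible reading. The function names `roc_auc` and `bootstrap_median_auc`, the number of resamples, and the seed are hypothetical.

```python
import random
from statistics import median


def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_median_auc(labels, scores, n_boot=1000, seed=0):
    """Median AUC over bootstrap resamples of (label, score) pairs.

    Assumption: resampling is done over predictions, not over model refits.
    """
    if len(set(labels)) < 2:
        raise ValueError("both classes must be present")
    rng = random.Random(seed)
    indices = range(len(labels))
    aucs = []
    while len(aucs) < n_boot:
        sample = [rng.choice(indices) for _ in indices]
        ys = [labels[i] for i in sample]
        if len(set(ys)) < 2:
            continue  # a resample with a single class has no defined AUC
        aucs.append(roc_auc(ys, [scores[i] for i in sample]))
    return median(aucs)
```

Reporting the median over resamples, rather than a single AUC, gives a figure that is less sensitive to which recordings happen to fall in one evaluation set, which matters for a dataset of 77 patients and 77 controls.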
PubMed: 38814939
DOI: 10.1371/journal.pdig.0000516 -
Frontiers in Public Health 2024
BACKGROUND
The dominance of the Contemporary Commercial Music (CCM) industry in music markets has led to a significant increase in the number of CCM performers. Performing in a wide variety of singing styles exposes CCM singers to specific risk factors that can lead to voice problems. This, in turn, necessitates the consideration of this particular group of voice users in the Occupational Health framework. The aim of the present research was threefold. First, it sought to profile the group of Polish CCM singers. Second, it was designed to explore the prevalence of self-reported voice problems and voice quality in this population, in both speech and singing. Third, it aimed to explore the relationships between voice problems and lifetime singing involvement, occupational voice use, smoking, alcohol consumption, vocal training, and microphone use, as potential voice risk factors.
MATERIALS AND METHODS
The study was conducted in Poland from January 2020 to April 2023. An online survey included socio-demographic information, singing involvement characteristics, and singers' voice self-assessment. The prevalence of voice problems was assessed by the Polish versions of the Vocal Tract Discomfort Scale (VTDS) and the Singing Voice Handicap Index (SVHI). Also, a self-reported dysphonia symptoms protocol was applied. The perceived overall voice quality was assessed by a Visual Analogue Scale (VAS) of 100 mm.
RESULTS
A total of 412 singers (310 women and 102 men) completed the survey. Nearly half of the studied population declared lifetime singing experience of over 10 years, with an average daily singing time of 1 or 2 h. A total of 283 participants had received vocal training. For 11.4% of respondents, singing was the primary income source, and 42% defined their career goals as voice-related. The median scores of the VTDS were 11.00 (0-44) and 12.00 (0-40) for the Frequency and Severity subscales, respectively. The median SVHI score of 33 (0-139) was significantly higher than the normative values determined in a systematic review and meta-analysis (2018). Strong positive correlations were observed between SVHI and both VTDS subscales: Frequency (correlation coefficient = 0.632, p < 0.001) and Severity (correlation coefficient = 0.611, p < 0.001). The relationships between most of the other variables studied were weak or negligible.
CONCLUSION
The examined CCM singers exhibited substantial diversity with regard to musical genre preferences, aspirations pertaining to singing endeavors, career affiliations, and source of income. Singing voice assessment, based on the SVHI and VTDS, revealed a greater degree of voice problems in the examined cohort than previously reported in the literature.
Topics: Humans; Poland; Singing; Male; Female; Adult; Cross-Sectional Studies; Middle Aged; Music; Voice Quality; Voice Disorders; Self-Assessment; Surveys and Questionnaires; Prevalence; Risk Factors; Young Adult; Speech
PubMed: 38813421
DOI: 10.3389/fpubh.2024.1256152 -
Brain Sciences Apr 2024
Primary progressive apraxia of speech (PPAOS) is a neurodegenerative syndrome characterized by the progressive and initially isolated or predominant onset of difficulties in the planning/programming of movements necessary for speech production and can be accompanied by dysarthria. To date, no study has used an evidence-based treatment to address phonation control in patients with PPAOS. The aim of this study was to evaluate the feasibility and efficacy of LSVT LOUD as a treatment for phonatory control in speakers with PPAOS. Three speakers with PPAOS received LSVT LOUD therapy, and changes in phonatory control, voice quality and prosody were measured immediately, and one, four and eight weeks after the end of the treatment. Overall, the results suggest that the treatment is feasible and could improve voice quality, intensity, and control in some patients with PPAOS. The generalization of the results is also discussed.
PubMed: 38790396
DOI: 10.3390/brainsci14050417 -
Bioengineering (Basel, Switzerland) May 2024
In voice analysis, the electroglottographic (EGG) signal has long been recognized as a useful complement to the acoustic signal, but only when the vocal folds are actually contacting, such that this signal has an appreciable amplitude. However, phonation can also occur without the vocal folds contacting, as in breathy voice, in which case the EGG amplitude is low, but not zero. It is of great interest to identify the transition from non-contacting to contacting, because this will substantially change the nature of the vocal fold oscillations; however, that transition is not in itself audible. The magnitude of the cycle-normalized peak derivative of the EGG signal is a convenient indicator of vocal fold contacting, but no current EGG hardware has a sufficient signal-to-noise ratio of the derivative. We show how the textbook techniques of spectral thresholding and static notch filtering are straightforward to implement, can run in real time, and can mitigate several noise problems in EGG hardware. This can be useful to researchers in vocology.
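As an illustration of the static notch filtering mentioned above, the following is a minimal sketch of a textbook second-order (biquad) notch filter that processes one sample at a time, as a real-time filter would. It is not the authors' implementation: the function names are hypothetical, and the 50 Hz target in the usage example assumes mains hum is the interference, which the abstract does not state.

```python
import math


def notch_coefficients(f0_hz, fs_hz, q):
    """Normalized biquad notch coefficients (standard audio-EQ textbook form)."""
    w0 = 2.0 * math.pi * f0_hz / fs_hz
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b = (1.0 / a0, -2.0 * math.cos(w0) / a0, 1.0 / a0)
    a = (-2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0)
    return b, a


def notch_filter(signal, f0_hz, fs_hz, q=10.0):
    """Apply the notch sample by sample (direct form I)."""
    (b0, b1, b2), (a1, a2) = notch_coefficients(f0_hz, fs_hz, q)
    x1 = x2 = y1 = y2 = 0.0  # previous two inputs and outputs
    out = []
    for x in signal:
        y = b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        out.append(y)
        x2, x1 = x1, x
        y2, y1 = y1, y
    return out


# Usage: suppress an assumed 50 Hz hum in one second of signal sampled at 8 kHz.
fs = 8000
hum = [math.sin(2.0 * math.pi * 50 * n / fs) for n in range(fs)]
cleaned = notch_filter(hum, f0_hz=50, fs_hz=fs)
```

A higher `q` narrows the notch and removes less of the neighbouring signal, at the cost of a longer settling time after the filter starts.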
PubMed: 38790346
DOI: 10.3390/bioengineering11050479 -
Turkish Archives of Otorhinolaryngology Dec 2023
OBJECTIVE
This study aimed to classify the degree of edema in patients with Reinke's edema (RE) and examine its impact on their voice parameters using both objective and subjective assessment methods.
METHODS
Objective and subjective voice data of 104 patients diagnosed with RE between 2018 and 2021 were evaluated retrospectively. RE was classified into 4 groups (types 1, 2, 3, and 4). The evaluation included videolaryngostroboscopic examination, acoustic voice analysis, aerodynamic measurements, GRBAS, Voice Handicap Index-10 (VHI-10), Voice-Related Quality of Life Scale (V-RQOL), and Reflux Symptom Index (RSI).
RESULTS
Patients with type 1 RE had a significantly lower mean age than those with types 3-4. Although there were no significant differences in acoustic and aerodynamic parameters between the groups, F0 and the maximum phonation time decreased as the degree of edema increased. The GRBAS total, G, and R scores of types 1 and 2 were significantly lower than those of types 3 and 4, as was the S score of type 1. There were no statistically significant differences between the RE groups in terms of VHI-10, V-RQOL, and RSI scores.
CONCLUSION
As the severity of RE increases, voice perception and quality are negatively affected, especially in types 3 and 4. Determining the degree of edema will guide the clinician in planning both the intervention phase and the follow-up phase.
PubMed: 38784955
DOI: 10.4274/tao.2023.2023-8-10 -
CoDAS 2024
PURPOSE
To evaluate the immediate effect of the inspiratory exercise with a booster and a respiratory exerciser on the voice of women without vocal complaints.
METHODS
Twenty-five women with no vocal complaints, between 18 and 34 years old, with a score of 1 on the Vocal Disorder Screening Index (ITDV), participated. Data collection was performed before and after the inspiratory exercise and consisted of recording the sustained vowel /a/, connected speech, and maximum phonation times (MPT) of vowels, fricative phonemes, and number counting. In the auditory-perceptual judgment, the Vocal Deviation Scale (VDS) was used to verify the general degree of vocal deviation. Acoustic evaluation was performed using the PRAAT software, with the parameters fundamental frequency (f0), jitter, shimmer, harmonics-to-noise ratio (HNR), Cepstral Peak Prominence Smoothed (CPPS), Acoustic Voice Quality Index (AVQI), and Acoustic Breathiness Index (ABI). To obtain the aerodynamic measurements, the time of each emission was extracted in the Audacity program. Data were statistically analyzed using the Statistica for Windows software, and normality was tested using the Shapiro-Wilk test. To compare the results, Student's t test and the Wilcoxon test were applied, adopting a significance level of 5%.
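Two of the acoustic parameters listed above, jitter and shimmer, can be illustrated with a minimal sketch of their common "local" definitions: the mean absolute difference between consecutive glottal periods (or peak amplitudes), divided by the mean period (or amplitude). This is a textbook formulation under the assumption that the periods and amplitudes have already been extracted; it is not a reproduction of the study's PRAAT settings, and the function names are hypothetical.

```python
def local_perturbation(values):
    """Mean absolute difference of consecutive values, relative to their mean."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(a - b) for a, b in zip(values, values[1:])]
    return (sum(diffs) / len(diffs)) / (sum(values) / len(values))


def local_jitter(periods_s):
    """Cycle-to-cycle variation of glottal period durations (in seconds)."""
    return local_perturbation(periods_s)


def local_shimmer(peak_amplitudes):
    """Cycle-to-cycle variation of per-cycle peak amplitudes."""
    return local_perturbation(peak_amplitudes)


# Usage with made-up values: periods alternating between 10 ms and 11 ms.
jitter = local_jitter([0.010, 0.011, 0.010, 0.011])
print(f"local jitter: {100 * jitter:.2f}%")
```

Both measures are dimensionless ratios, usually reported as percentages, so they can be compared before and after an exercise regardless of the speaker's pitch or recording level.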
RESULTS
There were no significant differences in the auditory-perceptual judgment or the acoustic measures between the pre- and post-exercise moments. As for the aerodynamic measures, there was a significant increase in the MPT of /s/ (p=0.008).
CONCLUSION
There was no change in vocal quality after the inspiratory exercise with a booster and a respiratory exerciser, but an increase in the MPT of the phoneme /s/ was observed after the exercise.
Topics: Humans; Female; Adult; Voice Quality; Young Adult; Adolescent; Breathing Exercises; Speech Acoustics; Voice Disorders; Phonation
PubMed: 38775526
DOI: 10.1590/2317-1782/20242023148pt -
Journal of Voice : Official Journal of... May 2024
OBJECTIVES
The objective of this study was to assess voice changes in patients with nasopharyngeal carcinoma (NPC) using subjective and objective assessment tools and to make inferences regarding the underlying pathological causes for different phases of radiotherapy (RT).
METHODS
A total of 187 (123 males and 64 females) patients with post-RT NPC with no recurrence of malignancy or other voice diseases and 17 (11 males and 6 females) healthy individuals were included in this study. The patients were equally divided into 11 groups according to the number of years after RT. The acoustic analyses, GRBAS (grade, roughness, breathiness, asthenia, and strain) scales, and Voice Handicap Index (VHI)-10 scores were collected and analyzed.
RESULTS
The fundamental frequency (F0) parameters in years 1, 2, and 11 were significantly lower in patients with NPC than in healthy individuals. The maximum phonation times in years 1 and 11 were significantly shorter than those in healthy individuals. The jitter parameters in year 1 and in years 8 to 11 differed significantly from those of healthy individuals, as did the shimmer parameters in year 1 and in years 9 to 11. Hoarseness was the most prominent problem among the GRBAS items. The VHI-10 scores were significantly different between years 1 and 2 and year 11 after RT in patients with NPC.
CONCLUSIONS
Voice quality was worse in the first 2 years and from years 8 to 11 but remained relatively normal from years 3 to 7 after RT. Patient-reported voice handicaps began during year 3 after RT. The most prominent problem was perceived hoarseness, which was evident in the first 2 years and from years 9 to 11 after RT. Radiation-induced mucosal edema, laryngeal intrinsic muscle fibrosis, nerve injuries, upper respiratory tract changes, and decreased lung capacity might be the pathological reasons for voice changes in post-RT patients with NPC.
PubMed: 38772832
DOI: 10.1016/j.jvoice.2024.04.017 -
Journal of Clinical Medicine May 2024: Cervical esophageal reconstruction is vital to improve the quality of life in cancer surgery patients. Microsurgery is crucial in providing vascularized tissue for...
BACKGROUND
Cervical esophageal reconstruction is vital to improve the quality of life in cancer surgery patients. Microsurgery is crucial in providing vascularized tissue for defect repair, particularly in secondary cases with a higher risk of failure due to larger defects and damage from previous surgery and radiotherapy. The purpose of this study was to describe the clinical characteristics of a series of patients who underwent secondary repair of esophageal defects and provide practical information for the management and treatment of such cases based on the authors' experience and the literature review.
METHODS
We retrospectively reviewed the electronic medical records of the Plastic Surgery Clinic at the University of Trieste to identify cases of patients who underwent secondary esophageal microsurgical reconstructions following oncological surgery. Patient demographics, the etiology of esophageal defects, previous surgical history, and preoperative assessments were collected from medical records. Surgical techniques utilized for reconstruction, such as pedicled flaps or free tissue transfers, were documented along with intraoperative information. Postoperative outcomes, including complications, graft viability, and functional outcomes, were evaluated during follow-up.
RESULTS
We treated 13 cases of secondary esophageal reconstruction between 2011 and 2022. Anterolateral thigh (ALT) flaps were the most common choice (10 cases), while 2 cases employed a radial forearm flap (RFF) and 1 case employed a chimeric parascapular flap. No flap failures occurred during a median 50-month follow-up. One ALT flap patient experienced postoperative stricture but maintained swallowing ability. A single tracheoesophageal fistula occurred in an RFF patient with a history of radiotherapy and complete lymph node dissection.
CONCLUSIONS
Cervical esophageal reconstruction significantly impacts patients' quality of life by restoring oral feeding and phonation. When local flaps fall short, microsurgical reconstruction with intestinal flaps is valuable but is burdened by limitations. For challenging secondary cases, ALT or RFF flaps emerge as safer options due to their robust pedicles, yielding low complication rates and positive functional outcomes.
PubMed: 38731255
DOI: 10.3390/jcm13092726