-
IEEE Transactions on Pattern Analysis... Jun 2024
In this paper, we formally address universal object detection, which aims to detect every category in every scene. The dependence on human annotations, the limited visual information, and the novel categories in the open world severely restrict the universality of detectors. We propose UniDetector, a universal object detector that recognizes an enormous number of categories in the open world. The critical points for UniDetector are: 1) it leverages images from multiple sources and heterogeneous label spaces in training through image-text alignment, which guarantees sufficient information for universal representations; 2) it involves heterogeneous supervision training, which alleviates the dependence on the limited fully-labeled images; 3) it generalizes to the open world easily while keeping the balance between seen and unseen classes; 4) it further promotes generalization to novel categories through our proposed decoupling training manner and probability calibration. These contributions allow UniDetector to detect over 7k categories, the largest measurable size so far, with only about 500 classes participating in training. UniDetector exhibits strong zero-shot ability on large-vocabulary datasets: it surpasses supervised baselines by more than 5% without seeing any corresponding images. On 13 detection datasets with various scenes, UniDetector also achieves state-of-the-art performance with only 3% of the training data.
PubMed: 38861430
DOI: 10.1109/TPAMI.2024.3411595 -
International Journal of... Jun 2024
The impact of enhanced Milieu teaching with phonological emphasis (EMT + PE) on the speech and language outcomes for toddlers with cleft palate in Brazil and the United States of America.
PURPOSE
The purpose of this study was to compare the speech and language outcomes of children with cleft palate with or without cleft lip (CP+/-L) in the USA to children with CP+/-L in Brazil who underwent intervention with enhanced Milieu teaching with phonological emphasis (EMT + PE), as there are few cross-country intervention comparisons for children with CP+/-L.
METHOD
This is a retrospective analysis of 29 participants from the USA and 24 participants from Brazil who were matched on age. The US participants were between the ages of 13 and 35 months (M = 23.76), spoke Standard American English in the home, and were recruited from East Tennessee State University and Vanderbilt University. The Brazilian participants were between the ages of 20 and 34 months (M = 25.04), spoke Brazilian Portuguese in the home, and were recruited from the . All treatment participants received EMT + PE from trained speech-language pathologists in hospital-university clinics.
RESULT
The treatment groups demonstrated greater gains than comparison groups in percent consonants correct, number of different words, and expressive/receptive vocabulary. There was neither a main effect of country nor an interaction by country.
CONCLUSION
The application of EMT + PE in a second culture and language is a viable early intervention option for participants with CP+/-L.
PubMed: 38859760
DOI: 10.1080/17549507.2024.2342783 -
Educational Research Review May 2024
Morphemes are the smallest meaningful units of language (e.g., affixes, base words) that express grammatical and semantic information. Additionally, morphological knowledge is significantly related to children's word reading and reading comprehension skills. Researchers have broadly assessed morphological knowledge by using a wide range of tasks and stimuli, which has influenced the interpretation of the relations between morphological knowledge and reading outcomes. This review of 103 studies used meta-analytic structural equation modeling (MASEM) to investigate the relations between commonly occurring morphological knowledge assessment features in the literature (e.g., written versus oral, spelling versus no spelling) and reading outcomes, including word reading and reading comprehension. Meta-regression techniques were used to examine moderators of age and reading ability. Morphological assessments that used a written modality (e.g., reading, writing) were more predictive of word reading outcomes than those administered orally. Assessments of morphological spelling were more predictive of both word reading and reading comprehension outcomes than those that did not examine spelling accuracy. Age was a significant moderator of the relation between morphology and word reading, such that the relation was stronger for the younger than the older children. Younger children also demonstrated stronger relations between multiple task dimensions and reading comprehension, including oral tasks, tasks without decoding, and tasks that provided context clues. These findings have important implications for future morphological intervention studies aimed at improving children's reading outcomes, in particular the use of orthography and spelling within the context of teaching morphology.
PubMed: 38854741
DOI: 10.1016/j.edurev.2024.100602 -
PloS One 2024
The Urdu language is spoken and written on different social media platforms like Twitter, WhatsApp, Facebook, and YouTube. However, due to the lack of Urdu Language Processing (ULP) libraries, it is quite challenging to identify threats from textual and sequential data provided in Urdu on social media. Therefore, Urdu data must be preprocessed as efficiently as English by creating stemming and data cleaning libraries for Urdu. Different lexical and machine learning-based techniques have been introduced in the literature, but all of these are limited by the unavailability of online Urdu vocabulary. This research introduces an Urdu language vocabulary, including a stop words list and a stemming dictionary, to preprocess Urdu data as efficiently as English. This reduced the input size of the Urdu language sentences and removed redundant and noisy information. Finally, a deep sequential model based on Long Short-Term Memory (LSTM) units is trained on the efficiently preprocessed data, then evaluated and tested. Our proposed methodology resulted in good prediction performance, i.e., an accuracy of 82%, which is greater than that of the existing methods.
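The preprocessing idea in this abstract (stop-word removal followed by dictionary-based stemming) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the stop-word set and stemming dictionary are tiny hypothetical samples, the function name `preprocess` is an assumption, and tokenization is a plain whitespace split.

```python
# Hypothetical sample resources; a real system would load full lists from files.
STOP_WORDS = {"اور", "کا", "کی", "ہے", "میں"}
STEMS = {"کتابیں": "کتاب", "لڑکیاں": "لڑکی"}  # inflected form -> stem


def preprocess(sentence: str) -> list[str]:
    """Drop stop words, then map each remaining token to its stem if one is listed."""
    tokens = sentence.split()  # whitespace tokenization (a simplification)
    return [STEMS.get(tok, tok) for tok in tokens if tok not in STOP_WORDS]


# Build a short sample sentence from the hypothetical resources above.
sample = " ".join(["کتابیں", "اور", "لڑکیاں"])
print(preprocess(sample))
```

Tokens that appear in neither resource pass through unchanged, so the output is never longer than the input; this is the sense in which such a step shrinks the sequences fed to a downstream model.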
Topics: Language; Natural Language Processing; Humans; Social Media; Deep Learning; Internet; Machine Learning
PubMed: 38843283
DOI: 10.1371/journal.pone.0290915 -
IEEE Transactions on Pattern Analysis... Jun 2024
Open-world instance-level scene understanding aims to locate and recognize unseen object categories that are not present in the annotated dataset. This task is challenging because the model needs to both localize novel 3D objects and infer their semantic categories. A key factor for the recent progress in 2D open-world perception is the availability of large-scale image-text pairs from the Internet, which cover a wide range of vocabulary concepts. However, this success is hard to replicate in 3D scenarios due to the scarcity of 3D-text pairs. To address this challenge, we propose to harness pre-trained vision-language (VL) foundation models that encode extensive knowledge from image-text pairs to generate captions for multi-view images of 3D scenes. This allows us to establish explicit associations between 3D shapes and semantic-rich captions. Moreover, to enhance the fine-grained visual-semantic representation learning from captions for object-level categorization, we design hierarchical point-caption association methods to learn semantic-aware embeddings that exploit the 3D geometry between 3D points and multi-view images. In addition, to tackle the localization challenge for novel classes in the open-world setting, we develop debiased instance localization, which involves training object grouping modules on unlabeled data using instance-level pseudo supervision. This significantly improves the generalization capabilities of instance grouping and, thus, the ability to accurately locate novel objects. We conduct extensive experiments on 3D semantic, instance, and panoptic segmentation tasks, covering indoor and outdoor scenes across three datasets. Our method outperforms baseline methods by a significant margin in semantic segmentation (e.g., 34.5%∼65.3%), instance segmentation (e.g., 21.8%∼54.0%), and panoptic segmentation (e.g., 14.7%∼43.3%). Code will be available.
PubMed: 38843054
DOI: 10.1109/TPAMI.2024.3410324 -
Advances in Physiology Education Jun 2024
Student engagement while learning a new, unfamiliar vocabulary is challenging in health science courses. A group role-play activity was created to teach students medical terminology and why its correct usage is important. This activity brought engagement and relevance to a topic traditionally taught through lecture and rote memorization and led to the development of an undergraduate and a stand-alone introductory course to teach students medical terminology. The undergraduate course was designed to be a fully online medical terminology course for health science students and a face-to-face course for first-year dental students founded in active learning and group work. The course's centerpiece learning activity focused on using published case studies with role-play. In this group activity, students are challenged to interpret a published patient case study as one of the members of a healthcare team. This course models the group work inherent in modern health care so that students can build community and practice professional skills. This approach gives students the capacity to work asynchronously in a team-based approach using our learning management system's wiki tool and requires students to take responsibility for their learning and group dynamics. Students practice identifying, writing, analyzing, and speaking medical terms while rotating through the roles. In both classes, 92% to 99% of students strongly or somewhat agreed, on a 5-point Likert scale, that the course pedagogy was valued and helpful in their learning of medical terminology. Overall, this method has proven to be an engaging way for students to learn medical terminology.
PubMed: 38841749
DOI: 10.1152/advan.00273.2023 -
Frontiers in Sociology 2024
The present study was based on empirical data collected during the first phase (2016) of Study 1000, part of the 13-November Program: a corpus of 934 individual interviews conducted 6-11 months after the events. To process this empirical material, the authors used integrated TXM software, which provides several classic textometry tools. They mainly used the lexical specificity analysis tool, which statistically measures the irregularity of the word distribution according to the parts of the corpus. They also analyzed the concordances of certain very specific lexical forms. Analysis revealed the important influence of social roles on the construction of memories and narratives of this event. Application of textometry tools highlighted lexical fields specific to the different social roles played by the interviewees in this , and showed that it was through these specific vocabularies that they remembered and recounted this story. Social roles therefore influence the formation of memories both individual and collective, by modulating the way in which individuals select what to remember and what to forget. The article opens up several interesting avenues for future analyses, mainly a longitudinal perspective (including phases 2 and 3 of Study 1000) for the study of flashbulb memories and the gender issue to fine-tune the analysis of social roles.
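As a rough illustration of what a lexical specificity score measures, the sketch below uses a hypergeometric model, which is one common textometric formulation. That choice is an assumption here: the abstract does not state which statistic was computed. The function name `overuse_probability` and all counts are hypothetical. The function returns the probability of seeing a word at least `f` times in one part of the corpus if its occurrences were spread at random; a very small value signals that the word is over-represented in that part.

```python
from math import comb


def overuse_probability(f: int, F: int, t: int, T: int) -> float:
    """P(X >= f) under a hypergeometric model.

    T: tokens in the whole corpus, t: tokens in the part of interest,
    F: occurrences of the word in the corpus, f: occurrences in the part.
    """
    upper = min(F, t)
    total = comb(T, t)
    # Sum the exact integer counts first, then divide once.
    favourable = sum(comb(F, k) * comb(T - F, t - k) for k in range(f, upper + 1))
    return favourable / total


# Hypothetical counts: a word with 10 corpus occurrences, 8 of them in a part
# that holds only 200 of the corpus's 1000 tokens.
print(overuse_probability(f=8, F=10, t=200, T=1000))
```

Under random spread the expected count in that part would be 10 × 200 / 1000 = 2, so observing 8 yields a very small probability.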
PubMed: 38841401
DOI: 10.3389/fsoc.2024.1388380 -
American Journal of Speech-language... Jun 2024
PURPOSE
Prior work has identified weaknesses in commonly used indices of lexical diversity in spoken language samples, such as type-token ratio (TTR), due to sample size and elicitation variation. We explored whether TTR and other diversity measures, such as number of different words/100 (NDW), vocabulary diversity (VocD), and the moving average TTR, would be more sensitive to child age and clinical status (typically developing [TD] or developmental language disorder [DLD]) if samples were obtained from standardized prompts.
METHOD
We utilized archival data from the norming samples of the Test of Narrative Language and the Edmonton Narrative Norms Instrument. We examined lexical diversity and other linguistic properties of the samples, from a total of 1,048 children, ages 4-11 years; 798 of these were considered TD, whereas 250 were categorized as having a language learning disorder.
RESULTS
TTR was the least sensitive to child age or diagnostic group, with good potential to misidentify children with DLD as TD and TD children as having DLD. Growth slopes of NDW were shallow and not very sensitive to diagnostic grouping. The strongest performing measure was VocD. Mean length of utterance, total number of words (TNW), and verbs/utterance did show both good growth trajectories and ability to distinguish between clinical and typical samples.
CONCLUSIONS
This study, the largest and best controlled to date, re-affirms that TTR should not be used in clinical decision making with children. A second popular measure, NDW, is not measurably stronger in terms of its psychometric properties. Because the most sensitive measure of lexical diversity, VocD, is unlikely to gain popularity because of reliance on computer-assisted analysis, we suggest alternatives for the appraisal of children's expressive vocabulary skill.
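The sample-size weakness of TTR mentioned in this abstract can be seen in a small sketch. The functions below are illustrative only, not the study's analysis code: `ndw` assumes that "number of different words/100" means distinct words in the first 100 tokens, the window size for the moving-average TTR is an arbitrary choice, and the toy sentence is invented.

```python
def ttr(tokens):
    """Type-token ratio: distinct words divided by total words."""
    return len(set(tokens)) / len(tokens)


def ndw(tokens, n=100):
    """Number of different words in the first n tokens (assumed reading of NDW/100)."""
    return len(set(tokens[:n]))


def mattr(tokens, window=50):
    """Moving-average TTR: mean TTR over every run of `window` consecutive tokens."""
    if len(tokens) <= window:
        return ttr(tokens)
    scores = [ttr(tokens[i:i + window]) for i in range(len(tokens) - window + 1)]
    return sum(scores) / len(scores)


short = "the dog saw the cat".split()
longer = short * 4  # same vocabulary, four times as many tokens

print(ttr(short), ttr(longer))                      # TTR drops as the sample grows
print(mattr(short, window=5), mattr(longer, window=5))  # windowed measure stays stable
```

Repeating the same five-word sample four times lowers TTR from 0.8 to 0.2 even though the vocabulary is unchanged, while the windowed average stays at about 0.8.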
PubMed: 38838249
DOI: 10.1044/2024_AJSLP-23-00457 -
Beyond words: an investigation of fine motor skills and the verbal communication spectrum in autism.
Frontiers in Psychiatry 2024
INTRODUCTION
This study investigated the associations between fine motor skills and expressive verbal abilities in a group of 97 autistic participants (age 8-17, mean=12.41) and 46 typically developing youth (age 8-17, mean=12.48).
METHODS
Participants completed assessments of motor and verbal communication skills, including finger tapping speed, grooved pegboard, grip strength, visual-motor integration tasks, and measures of speech and communication skills. ASD group performance on motor tests was compared to that of controls. Non-parametric tests were used to analyze group differences and correlations between motor and verbal communication skills. Based on prior research, we hypothesized that individuals on the autism spectrum would exhibit deficits in fine motor speed, dexterity, and pencil motor control, but not in manual motor strength. Additionally, we expected that impaired fine motor skills would be linked to poorer performance on standardized measures of verbal abilities.
RESULTS
The results indicated that 80% of autistic participants demonstrated an impairment on at least one measure of motor skills, and as a group, they exhibited significantly poorer fine motor performance compared to the non-ASD group in dominant hand finger tapping speed, bilateral fine motor dexterity measured via the grooved pegboard task, and pencil motor coordination and visual-motor integration measured on the Beery-Buktenica Developmental Test of Visual-Motor Integration-Sixth Edition. Moreover, impaired fine motor skills were associated with poorer performance on standardized clinical measures of verbal abilities, including articulation errors, receptive and expressive language and vocabulary, rapid naming, oromotor sequencing, and parent-reported functional communication skills and social communication symptoms.
DISCUSSION
Overall, our findings suggest there is a high prevalence of fine motor impairments in ASD, and these impairments are associated with a range of verbal abilities. Further research is warranted to better understand the underlying mechanisms of these associations and develop targeted interventions to address both fine motor and verbal impairments in ASD.
PubMed: 38835552
DOI: 10.3389/fpsyt.2024.1379307 -
Ear and Hearing May 2024
OBJECTIVES
Speech recognition in cochlear implant (CI) recipients is quite variable, particularly in challenging listening conditions. Demographic, audiological, and cognitive factors explain some, but not all, of this variance. The literature suggests that rapid auditory perceptual learning explains unique variance in speech recognition in listeners with normal hearing and those with hearing loss. The present study focuses on the early adaptation phase of task-specific rapid auditory perceptual learning. It investigates whether adult CI recipients exhibit this learning and, if so, whether it accounts for portions of the variance in their recognition of fast speech and speech in noise.
DESIGN
Thirty-six adult CI recipients (ages = 35 to 77, M = 55) completed a battery of general speech recognition tests (sentences in speech-shaped noise, four-talker babble noise, and natural-fast speech), cognitive measures (vocabulary, working memory, attention, and verbal processing speed), and a rapid auditory perceptual learning task with time-compressed speech. Accuracy in the general speech recognition tasks was modeled with a series of generalized mixed models that accounted for demographic, audiological, and cognitive factors before accounting for the contribution of task-specific rapid auditory perceptual learning of time-compressed speech.
RESULTS
Most CI recipients exhibited early task-specific rapid auditory perceptual learning of time-compressed speech within the course of the first 20 sentences. This early task-specific rapid auditory perceptual learning made a unique contribution to the recognition of natural-fast speech in quiet and speech in noise, although the contribution to natural-fast speech may reflect the rapid learning that occurred in this task. When accounting for demographic and cognitive characteristics, an increase of 1 SD in the early task-specific rapid auditory perceptual learning rate was associated with a ~52% increase in the odds of correctly recognizing natural-fast speech in quiet, and a ~19% to 28% increase in the odds of correctly recognizing the different types of speech in noise. Age, vocabulary, attention, and verbal processing speed also made unique contributions to general speech recognition. However, their contribution varied between the different general speech recognition tests.
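An increase in odds is not the same as an increase in the probability of a correct response, and a short sketch may help when reading these figures. The helper name `apply_odds_ratio` and the baseline accuracies are hypothetical and do not come from the study; only the ~52% figure (an odds ratio of about 1.52) is taken from the abstract.

```python
def apply_odds_ratio(p_baseline: float, odds_ratio: float) -> float:
    """Return the probability obtained after multiplying the baseline odds by odds_ratio."""
    odds = p_baseline / (1.0 - p_baseline)
    new_odds = odds * odds_ratio
    return new_odds / (1.0 + new_odds)


# A ~52% increase in odds corresponds to an odds ratio of about 1.52.
for p in (0.2, 0.5, 0.8):  # hypothetical baseline accuracies, not from the study
    print(p, round(apply_odds_ratio(p, 1.52), 3))
```

With a baseline accuracy of 50%, the same odds ratio moves the probability to roughly 60%; the shift in percentage points is smaller when the baseline is closer to 0% or 100%.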
CONCLUSIONS
Consistent with previous findings in other populations, in CI recipients, early task-specific rapid auditory perceptual learning also accounts for some of the individual differences in the recognition of speech in noise and natural-fast speech in quiet. Thus, across populations, the early rapid adaptation phase of task-specific rapid auditory perceptual learning might serve as a skill that supports speech recognition in various adverse conditions. In CI users, the ability to rapidly adapt to ongoing acoustical challenges may be one of the factors associated with good CI outcomes. Overall, CI recipients with higher cognitive resources and faster rapid learning rates had better speech recognition.
PubMed: 38829780
DOI: 10.1097/AUD.0000000000001523