-
Computational and Mathematical Methods... 2022
Text interpretation of public English vocabulary is a critical task in natural language processing, a field that uses technology to let humans and computers communicate effectively in natural language. Text feature extraction is one of the most fundamental and crucial steps in enabling computers to read and understand text. To address the problem of low feature differentiation in high-dimensional data, this paper proposes a text feature extraction method based on wavelet analysis that applies the fast discrete wavelet transform and the inverse discrete wavelet transform to the feature vectors of the traditional TF-IDF vector space model. Because of the design of the Mallat algorithm, frequency aliasing occurs during signal decomposition, a problem that cannot be ignored when wavelet analysis is used for feature extraction. This paper therefore proposes an improved inverse discrete wavelet transform method: the signal is decomposed by the Mallat algorithm to obtain the wavelet coefficients at each scale, the coefficients of the required wavelet space are then rebuilt by the reconstruction method, and these reconstructed coefficients, rather than the wavelet coefficients obtained directly at the corresponding scale, are used to analyze the signal at that scale. Experiments on the public English vocabulary dataset show that the proposed wavelet transform-based strategy outperforms existing feature extraction methods, maintaining higher classification accuracy while reducing the dimensionality of the TF-IDF vector space model.
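The decompose-then-reconstruct idea described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' implementation: it uses the Haar wavelet (the abstract does not name the wavelet used), a hypothetical 8-dimensional TF-IDF vector, and hypothetical function names. It shows a pyramid decomposition in the style of the Mallat algorithm and a single-band reconstruction in which all other bands are zeroed before the inverse transform.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_step(signal):
    """One level of the Haar DWT: returns (approximation, detail)."""
    approx = [(signal[i] + signal[i + 1]) / SQRT2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / SQRT2 for i in range(0, len(signal), 2)]
    return approx, detail

def haar_inverse_step(approx, detail):
    """Invert one Haar level, doubling the length of the signal."""
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / SQRT2)
        out.append((a - d) / SQRT2)
    return out

def decompose(signal, levels):
    """Pyramid decomposition; returns [a_L, d_L, ..., d_1].

    Assumes len(signal) is divisible by 2 ** levels.
    """
    details = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)
    return [approx] + details[::-1]

def reconstruct(coeffs):
    """Full inverse transform from [a_L, d_L, ..., d_1]."""
    approx = coeffs[0]
    for detail in coeffs[1:]:
        approx = haar_inverse_step(approx, detail)
    return approx

def reconstruct_single_band(coeffs, keep):
    """Reconstruct from coeffs[keep] only; every other band is zeroed."""
    masked = [band if i == keep else [0.0] * len(band)
              for i, band in enumerate(coeffs)]
    return reconstruct(masked)

# Hypothetical TF-IDF weights for one document over 8 terms.
tfidf = [0.0, 0.4, 0.1, 0.0, 0.7, 0.2, 0.0, 0.3]
coeffs = decompose(tfidf, levels=2)
reduced = coeffs[0]  # approximation band: 8 / 2**2 = 2 values
band_views = [reconstruct_single_band(coeffs, k) for k in range(len(coeffs))]
print(len(reduced), [len(b) for b in band_views])
```

Because the transform is linear, the single-band reconstructions sum back to the original vector, so each one can be read as that scale's contribution at full length.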
Topics: Algorithms; Computers; Humans; Language; Vocabulary; Wavelet Analysis
PubMed: 35726227
DOI: 10.1155/2022/7125242 -
Journal of Speech, Language, and... Jun 2022
PURPOSE
Measuring the growth of young children's vocabulary is important for researchers seeking to understand language learning as well as for clinicians aiming to identify early deficits. The MacArthur-Bates Communicative Development Inventories (CDIs) are parent report instruments that offer a reliable and valid method for measuring early productive and receptive vocabulary across a number of languages. CDI forms typically include hundreds of words, however, and so the burden of completion is significant. We address this limitation by building on previous work using item response theory (IRT) models to create computer adaptive test (CAT) versions of the CDIs. We created CDI-CATs for both comprehension and production vocabulary, for both American English and Mexican Spanish.
METHOD
Using a data set of 7,633 English-speaking children ages 12-36 months and 1,692 Spanish-speaking children ages 12-30 months, across three CDI forms (Words & Gestures, Words & Sentences, and CDI-III), we found that a 2-parameter logistic IRT model fits well for a majority of the 680 pooled vocabulary items. We conducted CAT simulations on this data set, assessing simulated tests of varying length (25-400 items).
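A computer adaptive test built on a 2-parameter logistic model can be sketched as follows. This is a minimal illustration under assumptions, not the study's procedure: the item parameters are hypothetical (not taken from the published item bank), the function names are invented for the example, items are chosen by maximum Fisher information, and ability is estimated by an expected a posteriori (EAP) grid search with a standard normal prior.

```python
import math

def p_2pl(theta, a, b):
    """2PL probability that a child with ability theta knows an item
    with discrimination a and difficulty b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information of a 2PL item at ability theta."""
    p = p_2pl(theta, a, b)
    return a * a * p * (1.0 - p)

def next_item(theta, items, administered):
    """Index of the unadministered item that is most informative at theta."""
    candidates = [i for i in range(len(items)) if i not in administered]
    return max(candidates, key=lambda i: item_information(theta, *items[i]))

def eap_theta(responses, items):
    """EAP ability estimate on a grid from -4 to 4 (standard normal prior)."""
    grid = [g / 10.0 for g in range(-40, 41)]
    weights = []
    for theta in grid:
        log_w = -0.5 * theta * theta  # unnormalised log prior
        for idx, known in responses.items():
            p = p_2pl(theta, *items[idx])
            log_w += math.log(p if known else 1.0 - p)
        weights.append(math.exp(log_w))
    return sum(t * w for t, w in zip(grid, weights)) / sum(weights)

def run_cat(items, answer, max_items):
    """Administer up to max_items items; answer(idx) returns True or False."""
    responses = {}
    theta = 0.0
    for _ in range(min(max_items, len(items))):
        idx = next_item(theta, items, responses)
        responses[idx] = answer(idx)
        theta = eap_theta(responses, items)
    return theta, responses

# Hypothetical (discrimination, difficulty) pairs for three words.
items = [(1.0, -2.0), (1.5, 0.0), (0.8, 2.0)]
theta, responses = run_cat(items, answer=lambda idx: idx != 2, max_items=3)
print(round(theta, 2), responses)
```

In a sketch like this, test length is fixed by `max_items`; a stopping rule based on the standard error of the estimate would be an alternative design.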
RESULTS
Even very short CATs recovered participant abilities very well with little bias across ages. An empirical validation study with N = 204 children ages 15-36 months showed a correlation of r = .92 between language ability estimated from full CDI versus CDI-CAT forms.
CONCLUSION
We provide our item bank along with fitted parameters and other details, offer recommendations for how to construct CDI-CATs in new languages, and suggest when this type of assessment may or may not be appropriate.
Topics: Child Language; Child, Preschool; Humans; Infant; Internet; Language; Language Development; Language Tests; Vocabulary
PubMed: 35658517
DOI: 10.1044/2022_JSLHR-21-00372 -
The British Journal of Educational... Mar 2021
BACKGROUND
Executive functions have been proposed to account for individual variation in reading comprehension beyond the contributions of decoding skills and language skills. However, insight into the direct and indirect effects of multiple executive functions on fifth-grade reading comprehension, while accounting for decoding and language skills, is limited.
AIM
The present study investigated the direct and indirect effects of fourth-grade executive functions (i.e., working memory, inhibition, and planning) on fifth-grade reading comprehension, after accounting for decoding and language skills.
SAMPLE
The sample included 113 fourth-grade children (including 65 boys and 48 girls; Age M = 9.89; SD = .44 years).
METHODS
The participants were tested on their executive functions (working memory, inhibition, and planning) in fourth grade and, one year later, on their decoding skills, language skills (vocabulary and syntax knowledge), and reading comprehension.
RESULTS
Using structural equation modelling, the results indicated direct effects of working memory and planning on reading comprehension, as well as indirect effects of working memory and inhibition via decoding (χ² = 2.46).
CONCLUSIONS
The results of the present study highlight the importance of executive functions for reading comprehension after taking variance in decoding and language skills into account: Both working memory and planning uniquely contributed to reading comprehension. In addition, working memory and inhibition also supported decoding. As a practical implication, educational professionals should not only consider the decoding and language skills children bring into the classroom, but their executive functions as well.
Topics: Child; Comprehension; Executive Function; Female; Humans; Male; Memory, Short-Term; Reading; Vocabulary
PubMed: 32441782
DOI: 10.1111/bjep.12355 -
Occupational Therapy International 2022
OBJECTIVE
In real communication, context is complex and changeable, and the color and meaning of some words shift with it. The development and change of words are more complex and multidimensional than before. Compared with the rational meaning of words, their color meaning better reflects the psychological mode and way of thinking of the Han nationality, but it is difficult for foreign learners to grasp accurately, and misunderstandings often occur. To solve this problem, it is necessary to summarize and explain the words whose color meanings shift easily, so as to help students grasp these meanings accurately and help learners of Chinese realize the communicative function of the language.
METHOD
This paper takes the scope of emotional words as its starting point and proposes that emotional words are words with emotional colors. It defines the concept of emotional words in terms of four aspects that determine whether a word belongs to this class, and introduces specific methods for judging and extracting emotional words from two angles: dictionary definition and word collocation. Taking foreign students whose native language is English as the research subjects, it uses a questionnaire survey and corpus analysis to investigate how these students use emotional colors and to explore the influence of native-language factors on their acquisition. Building on research into modern Chinese ontology and on existing results in the field of teaching Chinese as a foreign language, the paper takes interlanguage theory and transfer theory as its theoretical basis and mainly uses comparative analysis and error analysis to examine the relationship between the teaching and the acquisition of emotional colors.
RESULTS
The basic pattern and quantity distribution of lexical emotion correction for beginner, intermediate, and advanced learners of Chinese as a second language were analyzed, and the restrictive factors and characteristics were explained. Similarities and differences, and the rationale behind them, were explored. In international Chinese teaching, teachers mostly pay attention to the rational meaning of words while ignoring their emotional meaning. The lack of teaching on the emotional meaning of vocabulary easily leads to errors in students' understanding and use. As the vocabulary of intermediate and advanced learners grows, many words with similar colors and meanings appear, which makes it difficult for students to distinguish between synonyms. If words with emotional meanings are used inaccurately, communication barriers easily arise.
Topics: Emotions; Humans; Language; Occupational Therapy; Students; Vocabulary
PubMed: 35495175
DOI: 10.1155/2022/5203122 -
Proceedings of the National Academy of... Jan 2023
In the second year of life, infants begin to rapidly acquire the lexicon of their native language. A key learning mechanism underlying this acceleration is syntactic bootstrapping: the use of hidden cues in grammar to facilitate vocabulary learning. How infants forge the syntactic-semantic links that underlie this mechanism, however, remains speculative. A hurdle for theories is identifying computationally light strategies that have high precision within the complexity of the linguistic signal. Here, we presented 20-mo-old infants with novel grammatical elements in a complex natural language environment and measured their resultant vocabulary expansion. We found that infants can learn and exploit a natural language syntactic-semantic link in less than 30 min. The rapid speed of acquisition of a new syntactic bootstrap indicates that even emergent syntactic-semantic links can accelerate language learning. The results suggest that infants employ a cognitive network of efficient learning strategies to self-supervise language development.
Topics: Humans; Infant; Semantics; Learning; Language; Vocabulary; Linguistics; Language Development
PubMed: 36574655
DOI: 10.1073/pnas.2209153119 -
PloS One 2022
The ability to predict upcoming information is crucial for efficient language processing and enables more rapid language learning. The present study explored how shared reading experience influenced predictive brain signals and expressive vocabulary of 12-month-old infants. The predictive brain signals were measured by fNIRS responses in the occipital lobe with an unexpected visual-omission task. The amount of shared reading experience was correlated with the strength of this predictive brain signal and with infants' expressive vocabulary. Importantly, the predictive brain signal explained unique variance of expressive vocabulary beyond shared reading experience and maternal education. A further mediation analysis showed that the effect of shared reading experience on expressive vocabulary was explained by the infants' predictive brain signal. This is the first evidence indicating that richer shared reading experience strengthens predictive signals in the infant brain and in turn facilitates expressive vocabulary acquisition.
Topics: Brain; Humans; Infant; Language; Language Development; Reading; Vocabulary
PubMed: 35921370
DOI: 10.1371/journal.pone.0272438 -
Behavioural Neurology 2015 (Review)
Since the earliest days of aphasia research, it has been well established that there are two major aphasic syndromes (Wernicke's-type and Broca's-type aphasia); each is related to a disturbance at a specific linguistic level (lexical/semantic and grammatical) and associated with a particular brain damage localization (temporal and frontal-subcortical). It is proposed that three stages in language evolution could be distinguished: (a) primitive communication systems similar to those observed in other animals, including nonhuman primates; (b) initial communication systems using sound combinations (lexicon) but without relationships among the elements (grammar); and (c) advanced communication systems including word combinations (grammar). It is proposed that grammar probably originated from the internal representation of actions, resulting in the creation of verbs; this is an ability that depends on the so-called Broca's area and related brain networks. It is suggested that grammar is the basic ability for the development of so-called metacognitive executive functions. It is concluded that while the lexical/semantic language system (vocabulary) probably appeared during human evolution long before contemporary man (Homo sapiens sapiens), grammatical language represents a historically recent acquisition and is correlated with the development of complex cognition (metacognitive executive functions).
Topics: Animals; Biological Evolution; Brain; Humans; Language; Linguistics; Semantics; Vocabulary
PubMed: 26124540
DOI: 10.1155/2015/872487 -
Journal of Biomedical Semantics Apr 2021
BACKGROUND
Biomedical ontologies contain a wealth of metadata that constitutes a fundamental infrastructural resource for text mining. For several reasons, redundancies exist in the ontology ecosystem, which lead to the same entities being described by several concepts in the same or similar contexts across several ontologies. While these concepts describe the same entities, they contain different sets of complementary metadata. Linking these definitions to make use of their combined metadata could lead to improved performance in ontology-based information retrieval, extraction, and analysis tasks.
RESULTS
We develop and present an algorithm that expands the set of labels associated with an ontology class using a combination of strict lexical matching and cross-ontology reasoner-enabled equivalency queries. Across all disease terms in the Disease Ontology, the approach found 51,362 additional labels, more than tripling the number defined by the ontology itself. Manual validation by a clinical expert on a random sampling of expanded synonyms over the Human Phenotype Ontology yielded a precision of 0.912. Furthermore, we found that annotating patient visits in MIMIC-III with an extended set of Disease Ontology labels made the semantic similarity score derived from those labels a significantly better predictor of matching first diagnosis, with a mean average precision of 0.88 for the unexpanded set of annotations and 0.913 for the expanded set.
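The strict lexical matching half of such a label expansion can be sketched as below. This is a simplified illustration, not the published algorithm: the ontologies are modelled as plain dictionaries, the class identifiers and function names are hypothetical, matching is exact after case and whitespace normalisation, and the reasoner-enabled equivalency queries are omitted entirely.

```python
def normalise(label):
    """Lower-case a label and collapse runs of whitespace."""
    return " ".join(label.lower().split())

def expand_labels(target, others):
    """Expand each target class with labels of lexically matching classes.

    target: dict mapping class id -> set of labels
    others: list of dicts of the same shape (the other ontologies)
    Returns a dict mapping class id -> expanded set of normalised labels.
    """
    # Index every class in the other ontologies by each of its labels,
    # so a single shared label brings in all sibling labels of that class.
    index = {}
    for ontology in others:
        for labels in ontology.values():
            normalised = {normalise(label) for label in labels}
            for label in normalised:
                index.setdefault(label, set()).update(normalised)

    expanded = {}
    for class_id, labels in target.items():
        result = {normalise(label) for label in labels}
        # Single pass only: newly added labels are not matched again.
        for label in list(result):
            result |= index.get(label, set())
        expanded[class_id] = result
    return expanded

# Hypothetical toy ontologies with made-up class identifiers.
ontology_a = {"A:1": {"Myocardial infarction"}, "A:2": {"Asthma"}}
ontology_b = {"B:9": {"myocardial  infarction", "Heart attack"},
              "B:10": {"Migraine"}}
print(expand_labels(ontology_a, [ontology_b]))
```

The single pass is a design choice made for this sketch: iterating to a fixed point would find more labels but would also let one wrong match pull in further wrong ones.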
CONCLUSIONS
Inter-ontology synonym expansion can lead to a vast increase in the scale of vocabulary available for text mining applications. While the accuracy of the extended vocabulary is not perfect, it nevertheless led to a significantly improved ontology-based characterisation of patients from text in one setting. Furthermore, where run-on error is not acceptable, the technique can be used to provide candidate synonyms which can be checked by a domain expert.
Topics: Biological Ontologies; Data Mining; Ecosystem; Humans; Vocabulary; Vocabulary, Controlled
PubMed: 33845909
DOI: 10.1186/s13326-021-00241-5 -
Journal of Speech, Language, and... May 2023
PURPOSE
The goal of this work was to examine the semantic and syntactic properties of the vocabularies of autistic and non-autistic infants and toddlers to see if children in these two groups know different kinds of words. We focused on both receptive and expressive vocabularies. For expressive vocabulary, we looked only at the "active" lexicon: Of those words that are already in children's receptive vocabulary, we asked which ones they also produce.
METHOD
We used an existing data set of 346 parent report vocabulary checklists (MacArthur-Bates Communicative Development Inventory: Words and Gestures) from 41 autistic and 27 non-autistic children at multiple timepoints between the ages of 6 and 43 months. We coded the words on the checklists for various semantic and syntactic properties and evaluated which properties predicted whether children understood and produced those words.
RESULTS
Overall, we replicated a common finding that autistic children have smaller receptive vocabularies than non-autistic children, but we found that of the words they understand, autistic children produce a similar proportion of those words as non-autistic children. While we found that some syntactic properties are more or less likely to be represented in children's early vocabularies (e.g., nouns are more likely to be understood and produced than words that are not nouns), these patterns did not differ across autistic and non-autistic children.
CONCLUSIONS
The semantic and syntactic compositions of autistic and non-autistic children's vocabularies are similar. Thus, while receptive vocabularies are relatively smaller for autistic children, they do not appear to have specific difficulty with words that have particular syntactic or semantic properties, or with adding words to the expressive vocabulary that they already understand.
Topics: Infant; Humans; Child, Preschool; Semantics; Vocabulary; Language; Language Development; Communication
PubMed: 37137280
DOI: 10.1044/2023_JSLHR-22-00369 -
PloS One 2019
The mature lexicon encodes semantic relations between words, and these connections can alternately facilitate and interfere with language processing. We explore the emergence of these processing dynamics in 18-month-olds (N = 79) using a novel approach that calculates individualized semantic structure at multiple granularities in participants' productive vocabularies. Participants completed two interleaved eye-tracked word recognition tasks involving semantically unrelated and related picture contexts, which sought to measure the impact of lexical facilitation and interference on processing, respectively. Semantic structure and vocabulary size differentially impacted processing in each task. Category-level structure facilitated word recognition in 18-month-olds with smaller productive vocabularies, while overall lexical connectivity interfered with word recognition for toddlers with relatively larger vocabularies. The results suggest that, while semantic structure at multiple granularities is measurable even in small lexicons, mechanisms of semantic interference and facilitation are driven by the development of structure at different granularities. We consider these findings in light of accounts of adult word recognition that posit that different levels of structure index strong and weak activation from nearby and distant semantic neighbors. We also consider further directions for developmental change in these patterns.
Topics: Female; Humans; Infant; Language; Language Development; Male; Pattern Recognition, Visual; Recognition, Psychology; Semantics; Vocabulary
PubMed: 31295282
DOI: 10.1371/journal.pone.0219290