-
European Spine Journal : Official... Apr 2015 (Review)
PURPOSE
One of the objectives of this review is to summarize the important features of a good scale. A second aim is to conduct a systematic review to identify scales that can detect the presence of cervical myelopathy and to determine their psychometric properties including validity, reliability and responsiveness.
METHODS
A thorough literature search was performed using MEDLINE, MEDLINE in process, EMBASE, and Cochrane Central Register of Controlled Trials. Articles were included in this study if they compared scale measurements between a control and a myelopathic patient population or if they discussed any psychometric property of a scale.
RESULTS
An ideal scale should be quantifiable, valid, sensitive, responsive and easy to perform; have high inter- and intra-rater reliability, internal consistency and a suitable distribution; and be one-dimensional and relevant. In the context of cervical spondylotic myelopathy, it is essential that the scale also addresses the pathophysiology, its key signs and symptoms as well as its natural history. For the systematic review, the search yielded 5,745 citations. Of these, 37 met inclusion criteria: 10 explored the ability of a scale to detect myelopathy, 23 examined validity by assessing correlation between scales, 10 reported reliability, 8 analyzed responsiveness, and 6 discussed internal consistency. The most frequently reported scale was the Short Form-36 (n = 16), followed by Nurick grade (n = 14), Japanese Orthopaedic Association (n = 13), (modified) Japanese Orthopaedic Association (n = 7) and the grip and release test (n = 6). Four studies each presented results on the Cooper scale, the Harsh scale and the 30-m walking test.
CONCLUSION
This review summarizes outcome measures used to assess the presence and severity of cervical myelopathy. It includes several validation studies as well as those that have reported the responsiveness and reliability of various measures.
Topics: Female; Humans; Male; Outcome Assessment, Health Care; Psychometrics; Reproducibility of Results; Spinal Cord Diseases; Spondylosis
PubMed: 24005994
DOI: 10.1007/s00586-013-2935-x -
Journal of Neurotrauma Jan 2020
Outcome prognostication in traumatic brain injury (TBI) is important but challenging due to heterogeneity of the disease. The aim of this systematic review is to present the current state-of-the-art on prognostic models for outcome after moderate and severe TBI and evidence on their validity. We searched for studies reporting on the development, validation or extension of prognostic models for functional outcome after TBI with Glasgow Coma Scale (GCS) ≤12 published between 2006 and 2018. Studies with patients age ≥14 years and evaluating a multi-variable prognostic model based on admission characteristics were included. Model discrimination was expressed with the area under the receiver operating characteristic curve (AUC), and model calibration with calibration slope and intercept. We included 58 studies describing 67 different prognostic models, comprising the development of 42 models, 149 external validations of 31 models, and 12 model extensions. The most common predictors were GCS (motor) score (n = 55), age (n = 54), and pupillary reactivity (n = 48). Model discrimination varied substantially between studies. The International Mission for Prognosis and Analysis of Clinical Trials (IMPACT) and Corticoid Randomisation After Significant Head injury (CRASH) models were developed on the largest cohorts (8509 and 10,008 patients, respectively) and were most often externally validated (n = 91), yielding AUCs ranging from 0.65 to 0.90 and from 0.66 to 1.00, respectively. Model calibration was reported with a calibration intercept and slope for seven models in 53 validations, and was highly variable. In conclusion, the discriminatory validity of the IMPACT and CRASH prognostic models is supported across a range of settings. The variation in calibration, reflecting heterogeneity in reliability of predictions, motivates continuous validation and updating if clinical implementation is pursued.
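The two performance measures named in this abstract, AUC for discrimination and calibration intercept and slope, can be illustrated with a minimal Python sketch. It is not the method of any reviewed study: the function names and the toy data are assumptions, and the intercept and slope are fitted jointly by logistic regression of the outcome on the logit of the predicted probability. Validation studies often estimate the intercept separately with the slope fixed at 1, which is not shown here.

```python
import math

def auc(outcomes, probs):
    """AUC as the share of (event, non-event) pairs in which the event
    received the higher predicted probability; ties count as one half."""
    events = [p for y, p in zip(outcomes, probs) if y == 1]
    non_events = [p for y, p in zip(outcomes, probs) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = sum(
        1.0 if e > n else 0.5 if e == n else 0.0
        for e in events
        for n in non_events
    )
    return concordant / (len(events) * len(non_events))

def calibration_intercept_slope(outcomes, probs, max_iter=100, tol=1e-10):
    """Fit outcome ~ a + b * logit(p) by Newton-Raphson.
    Well-calibrated predictions give a close to 0 and b close to 1;
    b < 1 suggests predictions that are too extreme."""
    x = [math.log(p / (1.0 - p)) for p in probs]  # requires 0 < p < 1
    a, b = 0.0, 0.0
    for _ in range(max_iter):
        g_a = g_b = h_aa = h_ab = h_bb = 0.0
        for xi, yi in zip(x, outcomes):
            mu = 1.0 / (1.0 + math.exp(-(a + b * xi)))
            w = mu * (1.0 - mu)
            g_a += yi - mu            # gradient w.r.t. intercept
            g_b += (yi - mu) * xi     # gradient w.r.t. slope
            h_aa += w                 # entries of X^T W X
            h_ab += w * xi
            h_bb += w * xi * xi
        det = h_aa * h_bb - h_ab * h_ab
        if det == 0.0:
            raise ValueError("singular information matrix")
        step_a = (h_bb * g_a - h_ab * g_b) / det
        step_b = (h_aa * g_b - h_ab * g_a) / det
        a += step_a
        b += step_b
        if abs(step_a) + abs(step_b) < tol:
            break
    return a, b

# Toy data (hypothetical): 30 patients in three risk groups whose observed
# event rates match the predicted probabilities 0.2, 0.5 and 0.8.
toy_outcomes = [1] * 2 + [0] * 8 + [1] * 5 + [0] * 5 + [1] * 8 + [0] * 2
toy_probs = [0.2] * 10 + [0.5] * 10 + [0.8] * 10
toy_auc = auc(toy_outcomes, toy_probs)
toy_intercept, toy_slope = calibration_intercept_slope(toy_outcomes, toy_probs)
```

The sketch keeps the two properties apart on purpose: a model can rank patients well (high AUC) and still give probabilities that are systematically too high or too extreme, which is the variation in calibration the abstract describes.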
Topics: Brain Injuries, Traumatic; Humans; Prognosis; Trauma Severity Indices; Validation Studies as Topic
PubMed: 31099301
DOI: 10.1089/neu.2019.6401 -
Systematic Reviews Sep 2017 (Review)
BACKGROUND
The four square step test (FSST) was first validated in healthy older adults to provide a measure of dynamic standing balance and mobility. The FSST has since been used in a variety of patient populations. The purpose of this systematic review is to determine the validity and reliability of the FSST in these different adult patient populations.
METHODS
The literature search was conducted to highlight all the studies that measured validity and reliability of the FSST. Six electronic databases were searched including AMED, CINAHL, MEDLINE, PEDro, Web of Science and Google Scholar. Grey literature was also searched for any documents relevant to the review. Two independent reviewers carried out study selection and quality assessment. The methodological quality was assessed using the QUADAS-2 tool, which is a validated tool for the quality assessment of diagnostic accuracy studies, and the COSMIN four-point checklist, which contains standards for evaluating reliability studies on the measurement properties of health instruments.
RESULTS
Fifteen studies were reviewed, covering community-dwelling older adults and people with Parkinson's disease, Huntington's disease, multiple sclerosis, vestibular disorders, post stroke, post unilateral transtibial amputation, knee pain and hip osteoarthritis. Three of the studies were of moderate methodological quality, scoring low in risk of bias and applicability for all domains in the QUADAS-2 tool. Three studies scored "fair" on the COSMIN four-point checklist for the reliability components. The concurrent validity of the FSST was measured in nine of the studies, with moderate to strong correlations being found. Excellent Intraclass Correlation Coefficients were found between physiotherapists carrying out the tests (ICC = .99), with good to excellent test-retest reliability shown in nine of the studies (ICC = .73-.98).
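For readers unfamiliar with the Intraclass Correlation Coefficient, the following Python sketch shows one common form, ICC(2,1) (two-way random effects, absolute agreement, single measurement), computed from ANOVA mean squares. The abstract does not say which ICC form the reviewed studies used, so the choice of form, the function name `icc_2_1` and the ratings are all assumptions for illustration.

```python
def icc_2_1(ratings):
    """ICC(2,1) from a complete subjects-by-raters table.

    ratings: list of rows, one row per subject, one column per rater
    (or per test occasion for test-retest designs).
    """
    n = len(ratings)        # number of subjects
    k = len(ratings[0])     # number of raters
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need a complete table with >= 2 subjects and raters")

    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares without replication
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in ratings for v in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)                # between-subjects mean square
    ms_cols = ss_cols / (k - 1)                # between-raters mean square
    ms_error = ss_error / ((n - 1) * (k - 1))  # residual mean square

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )

# Illustrative ratings only (6 subjects scored by 4 raters); not FSST data.
example_ratings = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
example_icc = icc_2_1(example_ratings)
```

Because this form penalises systematic differences between raters, it gives a low value for the example table even though the raters order the subjects similarly. A consistency-type ICC would be higher on the same data, which is one reason reported ICCs are hard to compare when the form is not stated.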
CONCLUSIONS
The FSST may be an effective and valid tool for measuring dynamic balance and a participant's falls risk. It has been shown to have strong correlations with other measures of balance and mobility, with good reliability shown in a number of populations. However, the quality of the papers reviewed was variable, with key factors, such as sample size and test set-up, needing to be addressed before the tool can be confidently used in these specified populations.
Topics: Accidental Falls; Exercise Test; Humans; Movement; Patient Selection; Postural Balance; Reproducibility of Results; Risk Assessment
PubMed: 28893312
DOI: 10.1186/s13643-017-0577-5 -
Ultrasound in Medicine & Biology Feb 2021
Panoramic ultrasound (US) is a novel method used to assess linear dimensions, cross-sectional area, fatty infiltrate and echo-intensity features of muscles that cannot be measured with B-mode US. However, a structured overview of its validity and reliability is lacking. MEDLINE, PubMed, SCOPUS and Web of Science databases were systematically searched for studies evaluating reliability or validity data on panoramic US imaging to determine the muscular morphology and/or quality of skeletal muscles. Most studies had acceptable methodological quality. Seventeen studies analyzing reliability (n = 16) or validity (n = 5) were included. Twelve studies assessed cross-sectional area, seven studies assessed echo-intensity, five assessed linear dimensions (fascicle/tendon length, muscle/subcutaneous adipose thickness or between-structure distance) and one assessed intramuscular fat. Panoramic US seems to be a reliable and valid tool for the assessment of muscle morphology and quality in healthy populations at specific locations, particularly the lower extremities. Studies including scanning procedures are needed to confirm these findings in locations not included in this review and in both clinical and healthy populations.
Topics: Abdominal Muscles; Adipose Tissue; Back Muscles; Humans; Leg; Muscle, Skeletal; Reproducibility of Results; Thigh; Ultrasonography; Upper Extremity; Validation Studies as Topic
PubMed: 33189413
DOI: 10.1016/j.ultrasmedbio.2020.10.009 -
The Gerontologist Jan 2023
BACKGROUND AND OBJECTIVES
A valid and reliable assessment of dementia dyadic communication and environment is essential to understand and facilitate social interaction and quality care. This review described the characteristics and evaluated psychometric properties of instruments that assess dyadic communication and environment between persons living with dementia and their caregivers.
RESEARCH DESIGN AND METHODS
A systematic review was conducted following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses guideline. Literature published until June 30, 2021, was searched. Ten psychometric properties and the ratio of sample size to the number of items were evaluated using the Psychometric Assessment for Self-report and Observational Tool.
RESULTS
A total of 3,708 scholarly records were identified, and 24 eligible instruments from 48 scholarly records were evaluated. Twenty-two instruments assessed dyadic communication, and 2 assessed both dyadic communication and environment. Eighteen instruments were developed to assess task-related communication and 15 for paid (professional) caregivers. All instruments were scored as low psychometric quality (score range = 0-7). The Behavioral Observation Scoring System scored highest (total score = 7), followed by the Dyadic Dementia Coding System, the Grid for observation of physical and verbal behaviors of caregiver and resident, and Trouble-Indicating Behaviors and Repair (total score = 6). These instruments had low psychometric evidence for internal consistency, content validity, and structural validity.
DISCUSSION AND IMPLICATIONS
Existing instruments are in the early stages of development and validation in the dementia population. Further testing is needed across diverse communication types in paid and unpaid dementia caregiver populations.
Topics: Humans; Psychometrics; Caregivers; Communication; Self Report; Dementia; Reproducibility of Results
PubMed: 34864998
DOI: 10.1093/geront/gnab178 -
Disability and Rehabilitation Jan 2013 (Review)
PURPOSE
To systematically review and appraise studies examining self-report questionnaires measuring work-related aspects in cancer patients.
METHOD
Searches in Embase, PsycINFO, PSYNDEXplus, PSYNDEXplus Tests and PubMed were completed for the period 1990-2011. Inclusion criteria were as follows: (i) the questionnaire measures work-related aspects; (ii) the questionnaire has been used in at least one study that involved cancer patients as a relevant target group; and (iii) articles were written in English or German.
RESULTS
Twenty-two articles out of 350 records were reviewed and 13 questionnaires identified. The majority of measures cover several dimensions of work-related aspects representing a variety of work-related experiences and constructs such as aspects of the work environment, demands at work and work-related interpersonal relations. Nine of the 13 questionnaires showed good internal consistency whereas subscales of four instruments had fair or poor internal consistency. For 12 out of 13 measures, validity and reliability were tested in non-cancer populations.
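Internal consistency of a questionnaire is commonly summarised with Cronbach's alpha. The abstract does not state which statistic or cut-offs the reviewed studies used, so the following Python sketch is an assumption-laden illustration only; the function name `cronbach_alpha` and the responses are hypothetical.

```python
from statistics import variance

def cronbach_alpha(responses):
    """Cronbach's alpha for a respondents-by-items table of scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total score))
    """
    n = len(responses)
    k = len(responses[0])
    if n < 2 or k < 2 or any(len(row) != k for row in responses):
        raise ValueError("need a complete table with >= 2 respondents and items")

    # Sample variance of each item across respondents
    item_variances = [variance([row[j] for row in responses]) for j in range(k)]
    # Sample variance of each respondent's summed score
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("total scores do not vary; alpha is undefined")

    return k / (k - 1) * (1 - sum(item_variances) / total_variance)

# Hypothetical responses: 4 respondents answering a 2-item subscale.
example_responses = [
    [2, 3],
    [4, 4],
    [3, 5],
    [5, 6],
]
example_alpha = cronbach_alpha(example_responses)
```

Alpha rises with both the correlation between items and the number of items, so a short subscale can show fair or poor internal consistency even when its items are reasonably related.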
CONCLUSIONS
The knowledge about reliability and validity of self-report questionnaires measuring work-related aspects in cancer patients is scarce and more high-quality validation studies are needed. Findings further emphasize the need for the development of valid multidimensional measures that are relevant for both research and rehabilitative occupational interventions.
Topics: Employment; Humans; Neoplasms; Psychometrics; Quality of Life; Reproducibility of Results; Self Report; Surveys and Questionnaires; Survival Rate; Work Capacity Evaluation
PubMed: 22697459
DOI: 10.3109/09638288.2012.688921 -
Epidemiology and Health 2020
OBJECTIVES
To systematically review and identify food frequency questionnaires (FFQs) developed for the Iranian population and their validation and reproducibility in order to determine possible research gaps and needs.
METHODS
Studies were selected by searching for relevant keywords in the PubMed, Scopus, Science Direct, Google Scholar, SID, and Iranmedex databases, unpublished data, and theses in November 2016 (updated in September 2019). All English-language and Persian-language papers were included. Duplicates, articles with unrelated content, and articles only containing a protocol were excluded. The FFQs were categorized based on: (1) the number of food items, into short (≤80 items) and long (>80 items); and (2) the aim of the FFQ, to explore total consumption pattern/nutrients (general) or to detect specific nutrient(s)/food group(s) (specialized).
RESULTS
Sixteen reasonably validated questionnaires were identified. However, only 13 presented a reproducibility assessment. Ten FFQs were categorized as general (7 long, 3 short) and 6 as specialized (3 long, 3 short). The correlation coefficients for nutrient intake between dietary records or recalls and FFQs were 0.07-0.82 for long (general: 0.07-0.82 and specialized: 0.26-0.67) and 0.20-0.67 for short (general: 0.24-0.54 and specialized: 0.20-0.42) FFQs. Long FFQs showed higher validity and reproducibility than short FFQs. Reproducibility of FFQs was acceptable (0.32-0.89). The strongest correlations were reported by studies with shorter intervals between FFQs.
CONCLUSIONS
FFQs designed for the Iranian population appear to be appropriate tools for dietary assessment. Despite their acceptable reproducibility, their validity for assessing specific nutrients and their applicability for populations other than those they were developed for may be questionable.
Topics: Diet Surveys; Humans; Iran; Reproducibility of Results
PubMed: 32229793
DOI: 10.4178/epih.e2020015 -
Expert Review of Pharmacoeconomics &... Oct 2018 (Review)
INTRODUCTION
This study aims to determine methodological variations in the event simulation approaches of published health economic decision models in the field of obesity, and to examine whether their predictiveness and validity were assessed via external event validation techniques, which test how well the model reproduces reality.
AREAS COVERED
A systematic review identified a total of 87 relevant papers, of which 72 that simulated obesity-associated events were included. The most frequently simulated events were coronary heart disease (≈ 83%), type 2 diabetes (≈ 74%), and stroke (≈ 66%). An external event validation was performed for only ten published model-based health economic assessments in obesity (14%; 10 of 72), and for only one was the predictiveness and validity of the event simulation investigated in a cohort of obese subjects.
EXPERT COMMENTARY
We identified a wide range of obesity-related event simulation approaches. Published obesity models lack information on the predictive quality and validity of the applied event simulation approaches. Further work on comparing and validating these event simulation approaches is required to investigate their predictiveness and validity, which will offer guidance for future modelling in the field of obesity.
Topics: Computer Simulation; Coronary Disease; Decision Making; Diabetes Mellitus, Type 2; Humans; Models, Economic; Obesity; Reproducibility of Results; Stroke
PubMed: 30011385
DOI: 10.1080/14737167.2018.1501680 -
Psycho-oncology Jun 2012 (Review)
BACKGROUND
Prior research has shown that many cancer survivors experience ongoing fears of cancer recurrence (FCR) and that this chronic uncertainty of health status during and after cancer treatment can be a significant psychological burden. The field of research on FCR is an emerging area of investigation in the cancer survivorship literature, and several standardised instruments for its assessment have been developed.
AIMS
This review aims to identify all available FCR-specific questionnaires and subscales and critically appraise their properties.
METHODS
A systematic review was undertaken to identify instruments measuring FCR. Relevant studies were identified via Medline (1950-2010), CINAHL (1982-2010), PsycINFO (1967-2010) and AMED (1985-2010) databases, reference lists of articles and reviews, grey literature databases and consultation with experts in the field. The Medical Outcomes Trust criteria were used to examine the psychometric properties of the questionnaires.
RESULTS
A total of 20 relevant multi-item measures were identified. The majority of instruments have demonstrated reliability and preliminary evidence of validity. Relatively few brief measures (2-10 items) were found to have comprehensive validation and reliability data available. Several valid and reliable longer measures (>10 items) are available. Three have developed short forms that may prove useful as screening tools.
CONCLUSIONS
This analysis indicated that further refinement and validation of existing instruments is required. Valid and reliable instruments are needed for both research and clinical care.
Topics: Anxiety; Fear; Humans; Neoplasm Recurrence, Local; Neoplasms; Psychometrics; Quality of Life; Reproducibility of Results; Self Report; Surveys and Questionnaires; Survivors
PubMed: 22021099
DOI: 10.1002/pon.2070 -
Anesthesiology Jan 2024
BACKGROUND
The utilization of artificial intelligence and machine learning as diagnostic and predictive tools in perioperative medicine holds great promise. Indeed, many studies have been performed in recent years to explore the potential. The purpose of this systematic review is to assess the current state of machine learning in perioperative medicine, its utility in prediction of complications and prognostication, and limitations related to bias and validation.
METHODS
A multidisciplinary team of clinicians and engineers conducted a systematic review using the Preferred Reporting Items for Systematic Review and Meta-Analysis (PRISMA) protocol. Multiple databases were searched, including Scopus, Cumulative Index to Nursing and Allied Health Literature (CINAHL), the Cochrane Library, PubMed, Medline, Embase, and Web of Science. The systematic review focused on study design, type of machine learning model used, validation techniques applied, and reported model performance on prediction of complications and prognostication. This review further classified outcomes and machine learning applications using an ad hoc classification system. The Prediction model Risk Of Bias Assessment Tool (PROBAST) was used to assess risk of bias and applicability of the studies.
RESULTS
A total of 103 studies were identified. The models reported in the literature were primarily based on single-center validations (75%), with only 13% being externally validated across multiple centers. Most of the mortality models demonstrated a limited ability to discriminate and classify effectively. The PROBAST assessment indicated a high risk of systematic errors in predicted outcomes and artificial intelligence or machine learning applications.
CONCLUSIONS
The findings indicate that the application of machine learning in perioperative medicine is still at an early stage. While many studies suggest potential utility, several key challenges must first be overcome before such models are introduced into clinical practice.
Topics: Artificial Intelligence; Bias; Databases, Factual; Machine Learning; Perioperative Medicine
PubMed: 37944114
DOI: 10.1097/ALN.0000000000004764