t-Test - OpenMD.com Journal Search

Analysis of Bayesian posterior significance and effect size indices for the two-sample t-test to support reproducible medical research.

BMC Medical Research Methodology Apr 2020

Authors: Riko Kelter

BACKGROUND

The replication crisis hit the medical sciences about a decade ago, yet most of the flaws inherent in null hypothesis significance testing (NHST) remain unsolved. While the drawbacks of p-values have been detailed in countless venues, only a few attractive alternatives have been proposed to replace p-values and NHST in clinical research. Bayesian methods are one of them, and they are gaining increasing attention in medical research because, in contrast to the frequentist framework, they describe model parameters in terms of probability and can incorporate prior information. While Bayesian methods are not the only remedy, there is increasing agreement that they are an essential way to avoid common misconceptions and false interpretations of study results. The requirements for applying Bayesian statistics have shifted from detailed programming knowledge to simple point-and-click programs like JASP. Still, the multitude of Bayesian significance and effect measures that compete with the p-value, the gold standard of significance in medical research, has led to a lack of agreement on which measure to report.

METHODS

Therefore, in this paper, we conduct an extensive simulation study to compare common Bayesian significance and effect measures which can be obtained from a posterior distribution. In it, we analyse the behaviour of these measures for one of the most important statistical procedures in medical research and in particular clinical trials, the two-sample Student's (and Welch's) t-test.

RESULTS

The results show that some measures cannot state evidence for both the null and the alternative. While the different indices behave similarly regarding increasing sample size and noise, the prior modelling influences the obtained results and extreme priors allow for cherry-picking similar to p-hacking in the frequentist paradigm. The indices behave quite differently regarding their ability to control the type I error rates and regarding their ability to detect an existing effect.

CONCLUSION

Based on the results, two of the commonly used indices can be recommended for more widespread use in clinical and biomedical research, as they improve the type I error control compared to the classic two-sample t-test and enjoy multiple other desirable properties.
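
For orientation, the classic two-sample t-test that serves as the baseline in this abstract can be sketched as follows. This is a minimal illustration of the standard Student (pooled-variance) and Welch (unequal-variance) test statistics using only the Python standard library; it does not reproduce the paper's simulation or any of its Bayesian indices, and the function names and sample data are hypothetical.

```python
import math
from statistics import mean, variance


def student_t(a, b):
    """Pooled-variance (Student) t statistic and its degrees of freedom."""
    na, nb = len(a), len(b)
    # Pooled sample variance, assuming both groups share one variance.
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))
    return t, na + nb - 2


def welch_t(a, b):
    """Welch t statistic with Welch-Satterthwaite degrees of freedom."""
    na, nb = len(a), len(b)
    # Per-group squared standard errors; variances may differ.
    va, vb = variance(a) / na, variance(b) / nb
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


# Hypothetical data, for illustration only.
group_a = [1, 2, 3, 4, 5]
group_b = [2, 4, 6, 8, 10]
t_student, df_student = student_t(group_a, group_b)
t_welch, df_welch = welch_t(group_a, group_b)
```

With equal group sizes the two statistics coincide and only the degrees of freedom differ, which is why the choice between the two tests matters most for unbalanced designs with unequal variances.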

Topics: Bayes Theorem; Biomedical Research; Humans; Probability; Research Design; Sample Size

PubMed: 32321438
DOI: 10.1186/s12874-020-00968-2

The case for case-control studies in the field of suicide prevention.

Epidemiology and Psychiatric Sciences Oct 2019

Authors: Jane Pirkis, Angela Nicholas, David Gunnell...

Much of our knowledge about the risk factors for suicide comes from case-control studies that either use a psychological autopsy approach or are nested within large register-based cohort studies. We would argue that case-control studies are appropriate in the context of a rare outcome like suicide, but there are issues with using this design. Some of these issues are common in psychological autopsy studies and relate to the selection of controls (e.g. selection bias caused by the use of controls who have died by other causes, rather than live controls) and the reliance on interviewing informants (e.g. recall bias caused by the loved ones of cases having thought about the events leading up to the suicide in considerable detail). Register-based studies can overcome some of these problems because they draw upon information that is routinely collected for administrative purposes and gathered in the same way for cases and controls. However, they face issues that mean that psychological autopsy studies will still sometimes be the study design of choice for investigating risk factors for suicide. Some countries, particularly low- and middle-income countries, don't have sophisticated population-based registers. Even where they do exist, there will be variables of interest that are not captured by them (e.g. acute stressful life events that may immediately precede a suicide death), or not captured in a comprehensive way (e.g. suicide attempts and mental illness that do not result in hospital admissions). Future studies of risk factors should be designed to progress knowledge in the field and overcome the problems with the existing studies, particularly those using a case-control design. The priority should be pinning down the risk factors that are amenable to modification or mitigation through interventions that can successfully be rolled out at scale.

Topics: Case-Control Studies; Confounding Factors, Epidemiologic; Control Groups; Humans; Registries; Research Design; Suicide Prevention

PubMed: 31571561
DOI: 10.1017/S2045796019000581

Approaches for informing optimal dose of behavioral interventions.

Annals of Behavioral Medicine : a... Dec 2014

Review

Authors: Corrine I Voils, Heather A King, Matthew L Maciejewski...

BACKGROUND

There is little guidance about how to select dose parameter values when designing behavioral interventions.

PURPOSE

The purpose of this study is to present approaches to inform intervention duration, frequency, and amount when (1) the investigator has no a priori expectation and is seeking a descriptive approach for identifying and narrowing the universe of dose values or (2) the investigator has an a priori expectation and is seeking validation of this expectation using an inferential approach.

METHODS

Strengths and weaknesses of various approaches are described and illustrated with examples.

RESULTS

Descriptive approaches include retrospective analysis of data from randomized trials, assessment of perceived optimal dose via prospective surveys or interviews of key stakeholders, and assessment of target patient behavior via prospective, longitudinal, observational studies. Inferential approaches include nonrandomized, early-phase trials and randomized designs.

CONCLUSIONS

By utilizing these approaches, researchers may more efficiently apply resources to identify the optimal values of dose parameters for behavioral interventions.

Topics: Behavior Therapy; Humans; Research Design

PubMed: 24722964
DOI: 10.1007/s12160-014-9618-7

Causal inference and adjustment for reference-arm risk in indirect treatment comparison meta-analysis.

Journal of Comparative Effectiveness... Jul 2020

Review

Authors: Elyse Swallow, Oscar Patterson-Lomba, Rajeev Ayyagari...

To illustrate that bias associated with indirect treatment comparison and network meta-analyses can be reduced by adjusting for outcomes on common reference arms. Approaches to adjusting for reference-arm effects are presented within a causal inference framework. Bayesian and frequentist approaches are applied to three real data examples. Reference-arm adjustment can significantly impact estimated treatment differences, improve model fit and align indirectly estimated treatment effects with those observed in randomized trials. Reference-arm adjustment can possibly reverse the direction of estimated treatment effects. Accumulating theoretical and empirical evidence underscores the importance of adjusting for reference-arm outcomes in indirect treatment comparison and network meta-analyses to make full use of data and reduce the risk of bias in estimated treatment effects.
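
As background for the abstract above, the plain (unadjusted) indirect comparison through a common reference arm can be sketched as below. This is the standard anchored indirect comparison, often attributed to Bucher, and not the reference-arm adjustment the authors propose; the function name and the input values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math


def indirect_comparison(d_ab, se_ab, d_cb, se_cb, z=1.96):
    """Unadjusted indirect estimate of A vs C via common reference arm B.

    d_ab and d_cb are relative effects on an additive scale
    (for example log odds ratios) from separate trials of A vs B and C vs B.
    """
    d_ac = d_ab - d_cb                          # effects subtract
    se_ac = math.sqrt(se_ab ** 2 + se_cb ** 2)  # variances add
    ci = (d_ac - z * se_ac, d_ac + z * se_ac)   # approximate 95% interval
    return d_ac, se_ac, ci


# Hypothetical log odds ratios and standard errors, for illustration only.
effect, se, interval = indirect_comparison(-0.5, 0.3, -0.2, 0.4)
```

The sketch assumes that the reference arms of the two trials are comparable. When reference-arm outcomes differ between trials, that assumption is in doubt, which is the situation the abstract's adjustment is meant to address.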

Topics: Bayes Theorem; Bias; Delivery of Health Care; Humans; Meta-Analysis as Topic; Models, Theoretical; Network Meta-Analysis; Research Design; Treatment Outcome

PubMed: 32490682
DOI: 10.2217/cer-2020-0042

Designed Learning: Missing Data in Clinical Research.

Annals of Internal Medicine May 2018

Authors: Catharine B Stack, Trevor Butterworth, Rebecca Goldin...

Topics: Biomedical Research; Clinical Trials as Topic; Data Interpretation, Statistical; Humans; Information Dissemination; Research Design; Simulation Training

PubMed: 29632949
DOI: 10.7326/M18-0534

Principles for valid histopathologic scoring in research.

Veterinary Pathology Nov 2013

Review

Authors: K N Gibson-Corley, A K Olivier, D K Meyerholz...

Histopathologic scoring is a tool by which semiquantitative data can be obtained from tissues. Initially, a thorough understanding of the experimental design, study objectives, and methods is required for the pathologist to appropriately examine tissues and develop lesion scoring approaches. Many principles go into the development of a scoring system such as tissue examination, lesion identification, scoring definitions, and consistency in interpretation. Masking (aka "blinding") of the pathologist to experimental groups is often necessary to constrain bias, and multiple mechanisms are available. Development of a tissue scoring system requires appreciation of the attributes and limitations of the data (eg, nominal, ordinal, interval, and ratio data) to be evaluated. Incidence, ordinal, and rank methods of tissue scoring are demonstrated along with key principles for statistical analyses and reporting. Validation of a scoring system occurs through 2 principal measures: (1) validation of repeatability and (2) validation of tissue pathobiology. Understanding key principles of tissue scoring can help in the development and/or optimization of scoring systems so as to consistently yield meaningful and valid scoring data.

Topics: Animals; Disease Models, Animal; Pathology; Research Design; Severity of Illness Index; Validation Studies as Topic

PubMed: 23558974
DOI: 10.1177/0300985813485099

Getting started in medical education scholarship.

The Keio Journal of Medicine 2010

Review

Authors: David A Cook

Education scholarship and research are critically important in extending our ability to teach and assess effectively. Those considering a scholarly project in medical education should consider the following tips, learned from personal experience and supported by literature: 1) get some training, 2) find a mentor, 3) ask important questions, 4) start small and grow, 5) aim high, 6) don't wait for the perfect study, 7) plan for adequate time and other resources, 8) attend to ethical issues, 9) network with others in the field, and 10) recognize that this is hard work. By following these steps and planning ahead, scholars will be better poised to make meaningful contributions to the art and science of medical education.

Topics: Education, Medical; Humans; Mentors; Research Design

PubMed: 20881450
DOI: 10.2302/kjm.59.96

Clinimetric evaluation of methods to measure muscle functioning in patients with non-specific neck pain: a systematic review.

BMC Musculoskeletal Disorders Oct 2008

Review

Authors: Chantal H P de Koning, Sylvia P van den Heuvel, J Bart Staal...

BACKGROUND

Neck pain is a significant health problem in modern society. There is evidence to suggest that neck muscle strength is reduced in patients with neck pain. This article provides a critical analysis of the research literature on the clinimetric properties of tests to measure neck muscle strength or endurance in patients with non-specific neck pain, which can be used in daily practice.

METHODS

A computerised literature search was performed in the Medline, CINAHL and Embase databases from 1980 to January 2007. Two reviewers independently assessed the clinimetric properties of identified measurement methods, using a checklist of generally accepted criteria for reproducibility (inter- and intra-observer reliability and agreement), construct validity, responsiveness and feasibility.

RESULTS

The search identified a total of 16 studies. The instruments or tests included were: muscle endurance tests for short neck flexors, craniocervical flexion test with an inflatable pressure biofeedback unit, manual muscle testing of neck musculature, dynamometry and functional lifting tests (the cervical progressive iso-inertial lifting evaluation (PILE) test and the timed weighted overhead test). All the articles included report information on the reproducibility of the tests. Acceptable intra- and inter-observer reliability was demonstrated for the endurance test for short neck flexors and the cervical PILE test. Construct validity and responsiveness have hardly been documented for tests on muscle functioning.

CONCLUSION

The endurance test of the short neck flexors and the cervical PILE test can be regarded as appropriate instruments for measuring different aspects of neck muscle function in patients with non-specific neck pain. Common methodological flaws in the studies were their small sample size and an inappropriate description of the study design.

Topics: Disability Evaluation; Humans; Neck Muscles; Neck Pain; Physical Endurance; Research Design

PubMed: 18928568
DOI: 10.1186/1471-2474-9-142

Evidence-based vector control? Improving the quality of vector control trials.

Trends in Parasitology Aug 2015

Review

Authors: Anne L Wilson, Marleen Boelaert, Immo Kleinschmidt...

Vector-borne diseases (VBDs) such as malaria, dengue, and leishmaniasis cause a high level of morbidity and mortality. Although vector control tools can play a major role in controlling and eliminating these diseases, in many cases the evidence base for assessing the efficacy of vector control interventions is limited or not available. Studies assessing the efficacy of vector control interventions are often poorly conducted, which limits the return on investment of research funding. Here we outline the principal design features of Phase III vector control field studies, highlight major failings and strengths of published studies, and provide guidance on improving the design and conduct of vector control studies. We hope that this critical assessment will increase the impetus for more carefully considered and rigorous design of vector control studies.

Topics: Animals; Disease Vectors; Humans; Pest Control; Research Design

PubMed: 25999026
DOI: 10.1016/j.pt.2015.04.015

Editorial: Opposites Attract at CORR®-Machine Learning and Qualitative Research.

Clinical Orthopaedics and Related... Oct 2020

Authors: Seth S Leopold, Raphaël Porcher, Mark C Gebhardt...

Topics: Humans; Machine Learning; Orthopedics; Periodicals as Topic; Qualitative Research; Research Design

PubMed: 32858722
DOI: 10.1097/CORR.0000000000001466