p-value - OpenMD.com Journal Search

Surprise!

American Journal of Epidemiology Feb 2021

Authors: Stephen R Cole, Jessie K Edwards, Sander Greenland...

Measures of information and surprise, such as the Shannon information value (S value), quantify the signal present in a stream of noisy data. We illustrate the use of such information measures in the context of interpreting P values as compatibility indices. S values help communicate the limited information supplied by conventional statistics and cast a critical light on cutoffs used to judge and construct those statistics. Misinterpretations of statistics may be reduced by interpreting P values and interval estimates using compatibility concepts and S values instead of "significance" and "confidence."

Topics: Confidence Intervals; Data Interpretation, Statistical; Epidemiologic Methods; Humans; Uncertainty

PubMed: 32648906
DOI: 10.1093/aje/kwaa136

P-value, compatibility, and S-value.

Global Epidemiology Dec 2022

Authors: Mohammad Ali Mansournia, Maryam Nazemipour, Mahyar Etminan...

Misinterpretations of P-values and 95% confidence intervals are ubiquitous in medical research. Specifically, the terms significance or confidence, extensively used in medical papers, ignore biases and violations of statistical assumptions and hence should be called overconfidence terms. In this paper, we present the compatibility view of P-values and confidence intervals; the P-value is interpreted as an index of compatibility between data and the model, including the test hypothesis and background assumptions, whereas a confidence interval is interpreted as the range of parameter values that are compatible with the data under background assumptions. We also suggest the use of a surprisal measure, often referred to as the S-value, a novel metric that transforms the P-value, for gauging compatibility in terms of an intuitive experiment of coin tossing.

PubMed: 37637018
DOI: 10.1016/j.gloepi.2022.100085
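
As a rough illustration of the surprisal measure described in this abstract, the sketch below converts a P-value p into an S-value with s = -log2(p), the Shannon information transform usually given for the S-value. The function name `s_value` is hypothetical and is not taken from the paper. The result reads as a number of bits: an S-value of s is about as surprising as seeing s heads in a row from a coin assumed to be fair.

```python
import math


def s_value(p: float) -> float:
    """Return the S-value (surprisal, in bits) for a P-value p, using s = -log2(p)."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in the interval (0, 1]")
    return -math.log2(p)


if __name__ == "__main__":
    # Print the S-value for a few commonly quoted P-values.
    for p in (0.5, 0.25, 0.05, 0.005):
        print(f"p = {p:<6} S = {s_value(p):.2f} bits")
```

Under this transform, p = 0.05 corresponds to about 4.3 bits, a little more surprising than four heads in a row.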

Why not to (over)emphasize statistical significance.

European Journal of Endocrinology Sep 2019

Authors: Olaf M Dekkers

P values should not merely be used to categorize results into significant and non-significant. This practice disregards clinical relevance, confounds non-significance with no effect and underestimates the likelihood of false-positive results. Rather than serving as a dichotomizing instrument, P values and the confidence intervals around effect estimates can be used to put research findings in context, thereby genuinely taking both clinical relevance and uncertainty into account.

Topics: Data Interpretation, Statistical; False Positive Reactions; Humans; Probability; Uncertainty

PubMed: 31330499
DOI: 10.1530/EJE-19-0531

The roles, challenges, and merits of the p value.

Patterns (New York, N.Y.) Dec 2023

Review

Authors: Oliver Y Chén, Julien S Bodelet, Raúl G Saraiva...

Since the 18th century, the p value has been an important part of hypothesis-based scientific investigation. As statistical and data science engines accelerate, questions emerge: to what extent are scientific discoveries based on p values reliable and reproducible? Should one adjust the significance level or find alternatives for the p value? Inspired by these questions and everlasting attempts to address them, here, we provide a systematic examination of the p value from its roles and merits to its misuses and misinterpretations. For the latter, we summarize modest recommendations to handle them. In parallel, we present the Bayesian alternatives for seeking evidence and discuss the pooling of p values from multiple studies and datasets. Overall, we argue that the p value and hypothesis testing form a useful probabilistic decision-making mechanism, facilitating causal inference, feature selection, and predictive modeling, but that the interpretation of the p value must be contextual, considering the scientific question, experimental design, and statistical principles.

PubMed: 38106615
DOI: 10.1016/j.patter.2023.100878

The P value - and its historical underpinnings - pro and con.

Saudi Journal of Anaesthesia 2023

Review

Authors: Victor Grech, Adelazeem A Eldawlatly

The derivation and interpretation of P values derived from inferential testing remain somewhat vague and ambiguous in the minds of some researchers/editors/reviewers/readers. The British polymath Fisher famously averred: "the value for which P = 0.05, or 1 in 20, is 1.96 or nearly 2; it is convenient to take this point as a limit in judging whether a deviation is to be considered significant or not. Deviations exceeding twice the standard deviation are thus formally regarded as significant." This sometimes leads to an almost reductio ad absurdum mindset with an automatic discarding of studies with results where P > 0.05. It must be remembered that results may be negatively impacted by myriad factors that may be out of the researcher's control, such as small sample sizes, small effects, bias, and random error. This paper briefly reviews the historical events leading to the acceptance of P ≤ 0.05 for statistical significance, the rationale behind the null hypothesis (H0), the meaning of (and the potential for Type 1 and 2 Errors), α, β, the possibility of using non-0.05 cut-offs when studies are "trending toward statistical significance," and the importance of including confidence intervals (CIs) in results. P values are vital but must be tempered by judicious consideration of CI and study design. P is a probability spectrum and not simply a binary significant/non-significant statistical metric.

MESH

95% confidence interval, biostatistics, P value.

PubMed: 37601497
DOI: 10.4103/sja.sja_223_23
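
The link between the 1.96 deviate and the 0.05 threshold in the Fisher quotation above can be checked numerically. The sketch below assumes a standard normal test statistic and a two-sided test; the function name `two_sided_p_from_z` is hypothetical and is used only for illustration.

```python
import math


def two_sided_p_from_z(z: float) -> float:
    """Two-sided tail probability P(|Z| >= |z|) for a standard normal deviate z."""
    # For Z ~ N(0, 1), P(|Z| >= |z|) equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2.0))


if __name__ == "__main__":
    # Compare the exact 1.96 cut-off with the "nearly 2" rule of thumb.
    for z in (1.96, 2.0):
        print(f"z = {z:.2f} two-sided P = {two_sided_p_from_z(z):.4f}")
```

A deviate of 1.96 gives a two-sided P of almost exactly 0.05, while "twice the standard deviation" gives a slightly smaller value of roughly 0.0455.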

Single Versus Double-Sided Hypotheses and Probabilities.

Pediatric Emergency Care Jul 2021

Review

Authors: Michelle A Murata, Loren G Yamamoto

Single-sided (1-tailed) and double-sided (2-tailed) probabilities are products of statistical tests that can be crucial to drawing accurate conclusions in scientific studies. In a review of articles published in issues of Pediatric Emergency Care from 2020, we identified 2 where single-sided versus double-sided probability issues potentially reversed a conclusion of study investigators. The purpose of this study is to describe single-sided versus double-sided probability issues found in Pediatric Emergency Care 2020 articles to increase awareness surrounding these issues.

METHODS

This study involved a review of all articles from 2020 issues of the Pediatric Emergency Care journal, examining whether P values from 0.05 to 0.10 inclusive were characterized as not significant when, in fact, they resulted from a double-sided test and arguably should have been halved to yield significant single-sided probabilities less than or equal to 0.05.

RESULTS

Two such studies were identified. In the first study, researchers concluded that their intervention resulted in "no statistically significant improvement," citing a P value of 0.08, but if a single-sided P value was used, it would have been 0.04 and the authors would have instead concluded that their intervention resulted in significant improvement. In the second study, researchers measured resuscitation times in pediatric and adult manikin simulations. They concluded no difference, citing a P value of 0.088, but if a single-sided P value was used, it would have been 0.044, and the authors would have instead concluded that the resuscitation times took longer in the pediatric simulation.

CONCLUSIONS

These articles demonstrate how single-sided versus double-sided probability issues can cause researchers to draw inaccurate conclusions. As such, we would urge that this be more rigorously evaluated when the P values are between 0.05 and 0.10.

Topics: Adult; Child; Computer Simulation; Humans; Probability; Resuscitation

PubMed: 34116549
DOI: 10.1097/PEC.0000000000002477
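
The halving described in this abstract can be sketched as follows. This is a minimal illustration, assuming a test statistic with a symmetric null distribution (such as z or t) and a direction of effect specified before the data were seen; the function name `one_sided_p` and its parameters are hypothetical and do not come from the paper.

```python
def one_sided_p(p_two_sided: float, effect_in_predicted_direction: bool = True) -> float:
    """Convert a two-sided P value to a one-sided one for a symmetric test statistic."""
    if not 0.0 <= p_two_sided <= 1.0:
        raise ValueError("p_two_sided must lie in the interval [0, 1]")
    half = p_two_sided / 2.0
    # If the observed effect points away from the predicted direction,
    # the one-sided P value is the complement of the halved value.
    return half if effect_in_predicted_direction else 1.0 - half


if __name__ == "__main__":
    # The two P values cited in the abstract's results.
    for p in (0.08, 0.088):
        print(f"two-sided P = {p} one-sided P = {one_sided_p(p):.3f}")
```

Halving is only appropriate when the observed effect lies in the predicted direction; otherwise the one-sided value is large, which is why the second argument is included.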

P-Values and Power in Orthopedic Research: Myths and Reality.

The Journal of Arthroplasty Oct 2022

Review

Authors: Isabella Zaniletti, Katrina L Devick, Dirk R Larson...

The results of statistical tests in orthopedic studies are typically reported using P-values. If a P-value is smaller than the pre-determined level of significance (eg, < .05), the null hypothesis is rejected in support of the alternative. This automaticity in interpreting statistical results without consideration of the power of the study has been denounced over the years by statisticians, since it can potentially lead to misinterpretation of the study conclusions. In this paper, we review fundamental misconceptions and misinterpretations of P-values and power, along with their connection with confidence intervals, and we provide guidelines to orthopedic researchers for evaluating and reporting study results. We provide real-world orthopedic examples to illustrate the main concepts. Please visit the following https://youtu.be/bdPU4luYmF0 for videos that explain the highlights of the paper in practical terms.

Topics: Biomedical Research; Humans; Orthopedics; Statistics as Topic

PubMed: 36162927
DOI: 10.1016/j.arth.2022.05.026

p value variability and subgroup testing.

European Journal of Nutrition Dec 2021

Authors: Graham Horgan

This article discusses the variability and randomness of p values, the most widely used currency of evidence in nutritional and health studies. One implication of this, the importance of always testing interaction terms when subgroups are examined and presented separately, is also discussed.

PubMed: 33585951
DOI: 10.1007/s00394-021-02498-z

Picking apart p values: common problems and points of confusion.

Knee Surgery, Sports Traumatology,... Oct 2022

Authors: Sophia J Madjarova, Riley J Williams, Benedict U Nwachukwu...

Due to its frequent misuse, the p value has become a point of contention in the research community. In this editorial, we seek to clarify some of the common misconceptions about p values and the hazardous implications associated with misunderstanding this commonly used statistical concept. This article will discuss issues related to p value interpretation in addition to problems such as p-hacking and statistical fragility; we will also offer some thoughts on addressing these issues. The aim of this editorial is to provide clarity around the concept of statistical significance for those attempting to increase their statistical literacy in Orthopedic research.

Topics: Humans; Orthopedics

PubMed: 35920843
DOI: 10.1007/s00167-022-07083-3

Misleading medical literature: An observational study.

Emergency Medicine Australasia : EMA Feb 2022

Observational Study

Authors: Alexander Olaussen, Jeremy Abetz, Kirby R Qin...

OBJECTIVE

Language that implies a conclusion not supported by the evidence is common in the medical literature. The hypothesis of the present study was that medical journal publications are more likely to use misleading language for the interpretation of a demonstrated null (i.e. chance or not statistically significant) effect than a demonstrated real (i.e. statistically significant) effect.

METHODS

This was an observational study of the medical literature with a systematic sampling method. Articles published in The Journal of the American Medical Association, The Lancet and The New England Journal of Medicine over the last two decades were eligible. The language used around the P-value was assessed for misleadingness (i.e. either suggesting an effect existed when a real effect did not exist or vice versa).

RESULTS

There were 228 unique manuscripts examined, containing 400 statements interpreting a P-value proximate to 0.05. The P-value was between 0.036 and 0.050 for 303 (75.8%) statements and between 0.050 and 0.064 for 97 (24.3%) statements. Forty-four (11%) of the statements were misleading. There were 40 (41.2%) false-positive sentences, implying statistical significance when the P-value was >0.05, and four (1.3%) false-negative sentences, implying no statistical significance when the P-value was <0.05 (relative risk 31.2; 95% confidence interval 11.5-85.1; P < 0.0001). The proportion of included manuscripts containing at least one misleading sentence was 16.2% (95% confidence interval 12.0-21.6).

CONCLUSIONS

Among a random selection of sentences in prestigious journals describing P-values close to 0.05, 1 in 10 are misleading (n = 44, 11%) and this is more prevalent when the P-values are above 0.05 compared to below 0.05. Caution is advised for researchers, clinicians and editors to align with the context and purpose of P-values.

Topics: Humans; Probability; Publishing; Research Design; United States

PubMed: 34355494
DOI: 10.1111/1742-6723.13831
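
The relative risk reported in this abstract can be reproduced from the counts it gives: 40 misleading statements out of 97 with P > 0.05, against 4 out of 303 with P < 0.05. The sketch below assumes a Wald interval on the log scale. The abstract does not say how its interval was computed, so the method is an assumption that happens to reproduce the published figures, and the function name `relative_risk_ci` is hypothetical.

```python
import math


def relative_risk_ci(a: int, n1: int, b: int, n2: int, z: float = 1.96):
    """Relative risk (a/n1) / (b/n2) with a Wald confidence interval on the log scale."""
    rr = (a / n1) / (b / n2)
    # Large-sample standard error of log(RR); assumed method, not stated in the abstract.
    se_log_rr = math.sqrt(1 / a - 1 / n1 + 1 / b - 1 / n2)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


if __name__ == "__main__":
    # Counts taken from the abstract's results section.
    rr, lower, upper = relative_risk_ci(40, 97, 4, 303)
    print(f"RR = {rr:.1f}, 95% CI {lower:.1f}-{upper:.1f}")
```

With these counts the sketch gives a relative risk of about 31.2 with an interval of roughly 11.5 to 85.1, in line with the values quoted in the abstract.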