Indian Journal of Psychological Medicine, 2019 (Review)
The calculation of a P value in research, and especially the use of a threshold to declare the statistical significance of the value, have both been challenged in recent years. There are at least two important reasons for this challenge: research data contain much more meaning than is summarized in a P value and its statistical significance, and these two concepts are frequently misunderstood and consequently inappropriately interpreted. This article considers why 5% may be set as a reasonable cut-off for statistical significance, explains the correct interpretation of P < 0.05 and other values of P, examines arguments for and against the concept of statistical significance, and suggests other and better ways for analyzing data and for presenting, interpreting, and discussing the results.
PubMed: 31142921
DOI: 10.4103/IJPSYM.IJPSYM_193_19
American Journal of Epidemiology, Feb 2021
Measures of information and surprise, such as the Shannon information value (S value), quantify the signal present in a stream of noisy data. We illustrate the use of such information measures in the context of interpreting P values as compatibility indices. S values help communicate the limited information supplied by conventional statistics and cast a critical light on cutoffs used to judge and construct those statistics. Misinterpretations of statistics may be reduced by interpreting P values and interval estimates using compatibility concepts and S values instead of "significance" and "confidence."
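The S value described above is simply the negative base-2 logarithm of the P value, so it measures compatibility in bits of information. A minimal sketch (the function name is illustrative):

```python
import math

def s_value(p):
    """Shannon information (surprisal) of a P value, in bits."""
    return -math.log2(p)

# P = 0.05 carries about 4.3 bits of information against the test
# hypothesis: roughly as surprising as 4 heads in a row from a fair coin.
print(round(s_value(0.05), 2))   # 4.32
print(round(s_value(0.005), 2))  # 7.64
```

An S value of k bits corresponds to the surprise of seeing k heads in k tosses of a fair coin, which is what makes it an intuitive replacement for a raw P value.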
Topics: Confidence Intervals; Data Interpretation, Statistical; Epidemiologic Methods; Humans; Uncertainty
PubMed: 32648906
DOI: 10.1093/aje/kwaa136
Global Epidemiology, Dec 2022
Misinterpretations of P-values and 95% confidence intervals are ubiquitous in medical research. Specifically, the terms significance or confidence, extensively used in medical papers, ignore biases and violations of statistical assumptions and hence should be called overconfidence terms. In this paper, we present the compatibility view of P-values and confidence intervals; the P-value is interpreted as an index of compatibility between data and the model, including the test hypothesis and background assumptions, whereas a confidence interval is interpreted as the range of parameter values that are compatible with the data under background assumptions. We also suggest the use of a surprisal measure, often referred to as the S-value, a novel metric that transforms the P-value, for gauging compatibility in terms of an intuitive experiment of coin tossing.
PubMed: 37637018
DOI: 10.1016/j.gloepi.2022.100085
The Journal of Arthroplasty, Oct 2022 (Review)
The results of statistical tests in orthopedic studies are typically reported using P-values. If a P-value is smaller than the pre-determined level of significance (eg, P < .05), the null hypothesis is rejected in support of the alternative. This automaticity in interpreting statistical results without consideration of the power of the study has been denounced over the years by statisticians, since it can potentially lead to misinterpretation of the study conclusions. In this paper, we review fundamental misconceptions and misinterpretations of P-values and power, along with their connection with confidence intervals, and we provide guidelines to orthopedic researchers for evaluating and reporting study results. We provide real-world orthopedic examples to illustrate the main concepts. Please visit https://youtu.be/bdPU4luYmF0 for videos that explain the highlights of the paper in practical terms.
Topics: Biomedical Research; Humans; Orthopedics; Statistics as Topic
PubMed: 36162927
DOI: 10.1016/j.arth.2022.05.026
Journal of Thoracic Disease, Oct 2017
Biomedical research is seldom done with entire populations but rather with samples drawn from a population. Although we work with samples, our goal is to describe and draw inferences regarding the underlying population. It is possible to use a sample statistic and estimates of error in the sample to get a fair idea of the population parameter, not as a single value, but as a range of values. This range is the confidence interval (CI), which is estimated on the basis of a desired confidence level. Calculation of the CI of a sample statistic takes the general form: CI = Point estimate ± Margin of error, where the margin of error is given by the product of a critical value (z) derived from the standard normal curve and the standard error of the point estimate. Calculation of the standard error varies depending on whether the sample statistic of interest is a mean, proportion, odds ratio (OR), and so on. The factors affecting the width of the CI include the desired confidence level, the sample size, and the variability in the sample. Although the 95% CI is most often used in biomedical research, a CI can be calculated for any level of confidence. A 99% CI will be wider than a 95% CI for the same sample. Conflict between clinical importance and statistical significance is an important issue in biomedical research. Clinical importance is best inferred by looking at the effect size, that is, how much actual change or difference there is. However, statistical significance in terms of P only suggests whether there is any difference in probability terms. Use of the CI supplements the P value by providing an estimate of actual clinical effect. Of late, clinical trials are being designed specifically as superiority, non-inferiority, or equivalence studies. The conclusions from these alternative trial designs are based on CI values rather than the P value from intergroup comparison.
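The general form CI = point estimate ± margin of error can be sketched for a sample mean as follows. This uses the normal critical value z = 1.96 for 95% confidence, as in the abstract; for small samples a t critical value would be more appropriate, and the function name and sample data are illustrative:

```python
import math

def mean_ci(xs, z=1.96):
    """CI = point estimate +/- margin of error, here for a sample mean.
    z = 1.96 gives a 95% CI from the standard normal curve; z = 2.576
    gives a 99% CI, which is wider for the same sample."""
    n = len(xs)
    mean = sum(xs) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in xs) / (n - 1))  # sample SD
    se = sd / math.sqrt(n)                                      # standard error
    return mean - z * se, mean + z * se

# Hypothetical sample of 5 measurements:
lo, hi = mean_ci([1, 2, 3, 4, 5])
print(round(lo, 2), round(hi, 2))  # 1.61 4.39
```

For a proportion the same form applies, with the standard error computed as sqrt(p(1-p)/n) instead of sd/sqrt(n).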
PubMed: 29268424
DOI: 10.21037/jtd.2017.09.14
Patterns (New York, N.Y.), Dec 2023 (Review)
Since the 18th century, the p value has been an important part of hypothesis-based scientific investigation. As statistical and data science engines accelerate, questions emerge: to what extent are scientific discoveries based on p values reliable and reproducible? Should one adjust the significance level or find alternatives for the p value? Inspired by these questions and everlasting attempts to address them, here, we provide a systematic examination of the p value from its roles and merits to its misuses and misinterpretations. For the latter, we summarize modest recommendations to handle them. In parallel, we present the Bayesian alternatives for seeking evidence and discuss the pooling of p values from multiple studies and datasets. Overall, we argue that the p value and hypothesis testing form a useful probabilistic decision-making mechanism, facilitating causal inference, feature selection, and predictive modeling, but that the interpretation of the p value must be contextual, considering the scientific question, experimental design, and statistical principles.
PubMed: 38106615
DOI: 10.1016/j.patter.2023.100878
BMC Medical Research Methodology, Jun 2020
BACKGROUND
In medical research and practice, the p-value is arguably the most often used statistic and yet it is widely misconstrued as the probability of the type I error, which comes with serious consequences. This misunderstanding can greatly affect the reproducibility in research, treatment selection in medical practice, and model specification in empirical analyses. By using plain language and concrete examples, this paper is intended to elucidate the p-value confusion from its root, to explicate the difference between significance and hypothesis testing, to illuminate the consequences of the confusion, and to present a viable alternative to the conventional p-value.
MAIN TEXT
The confusion with p-values has plagued the research community and medical practitioners for decades. However, efforts to clarify it have been largely futile, in part, because intuitive yet mathematically rigorous educational materials are scarce. Additionally, the lack of a practical alternative to the p-value for guarding against randomness also plays a role. The p-value confusion is rooted in the misconception of significance and hypothesis testing. Most, including many statisticians, are unaware that p-values and significance testing formed by Fisher are incomparable to the hypothesis testing paradigm created by Neyman and Pearson. And most otherwise great statistics textbooks tend to cobble the two paradigms together and make no effort to elucidate the subtle but fundamental differences between them. The p-value is a practical tool gauging the "strength of evidence" against the null hypothesis. It informs investigators that a p-value of 0.001, for example, is stronger than 0.05. However, p-values produced in significance testing are not the probabilities of type I errors as commonly misconceived. For a p-value of 0.05, the chance a treatment does not work is not 5%; rather, it is at least 28.9%.
CONCLUSIONS
A long-overdue effort to understand p-values correctly is much needed. However, in medical research and practice, just banning significance testing and accepting uncertainty are not enough. Researchers, clinicians, and patients alike need to know the probability a treatment will or will not work. Thus, the calibrated p-values (the probability that a treatment does not work) should be reported in research papers.
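The abstract quotes a calibrated value of at least 28.9% for P = 0.05 without giving the formula. That figure matches the widely used Sellke-Berger bound, which limits the Bayes factor for the null to -e·p·ln(p); a sketch under that assumption, also assuming equal prior odds (function name illustrative):

```python
import math

def false_positive_lower_bound(p):
    """Lower bound on the probability that the null hypothesis is true,
    given an observed P value p < 1/e and equal prior odds, using the
    Sellke-Berger -e*p*ln(p) bound on the Bayes factor for the null."""
    bound_bf = -math.e * p * math.log(p)  # bound on the Bayes factor
    return bound_bf / (1 + bound_bf)

# For P = 0.05 the bound is about 0.289: the chance the treatment does
# not work is at least 28.9%, the figure quoted in the abstract.
print(round(false_positive_lower_bound(0.05), 3))  # 0.289
```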
Topics: Biomedical Research; Humans; Probability; Reproducibility of Results; Research Design; Research Personnel
PubMed: 32580765
DOI: 10.1186/s12874-020-01051-6
Journal of Evidence-based Medicine, Nov 2018 (Review)
P-values are often calculated when testing hypotheses in quantitative settings, and low P-values are typically used as evidential measures to support research findings in published medical research. This article reviews old and new arguments questioning the evidential value of P-values. Critiques of the P-value include that it is confounded, fickle, and overestimates the evidence against the null. P-values may turn out falsely low in studies due to random or systematic errors. Even correctly low P-values do not logically provide support to any hypothesis. Recent studies show low replication rates of significant findings, questioning the dependability of published low P-values. P-values are poor indicators in support of scientific propositions; they must be interpreted in light of a thorough understanding of the study's question, design, and conduct. Null hypothesis significance testing will likely remain an important method in quantitative analysis but may be complemented with other statistical techniques that more straightforwardly address the size and precision of an effect or the plausibility that a hypothesis is true.
Topics: Biomedical Research; Evidence-Based Medicine; Humans; Research Design; Statistics as Topic
PubMed: 30398018
DOI: 10.1111/jebm.12319
Journal of Investigative Medicine, Oct 2016 (Review)
A threshold probability value of 'p≤0.05' is commonly used in clinical investigations to indicate statistical significance. To allow clinicians to better understand evidence generated by research studies, this review defines the p value, summarizes the historical origins of the p value approach to hypothesis testing, describes various applications of p≤0.05 in the context of clinical research and discusses the emergence of p≤5×10⁻⁸ and other values as thresholds for genomic statistical analyses. Corresponding issues include a conceptual approach of evaluating whether data do not conform to a null hypothesis (ie, no exposure-outcome association). Importantly, and in the historical context of when p≤0.05 was first proposed, the 1-in-20 chance of a false-positive inference (ie, falsely concluding the existence of an exposure-outcome association) was offered only as a suggestion. In current usage, however, p≤0.05 is often misunderstood as a rigid threshold, sometimes with a misguided 'win' (p≤0.05) or 'lose' (p>0.05) approach. Also, in contemporary genomic studies, a threshold of p≤10⁻⁸ has been endorsed as a boundary for statistical significance when analyzing numerous genetic comparisons for each participant. A value of p≤0.05, or other thresholds, should not be employed reflexively to determine whether a clinical research investigation is trustworthy from a scientific perspective. Rather, and in parallel with conceptual issues of validity and generalizability, quantitative results should be interpreted using a combined assessment of strength of association, p values, CIs, and sample size.
Topics: Confidence Intervals; Genomics; Probability; Sample Size; Superstitions
PubMed: 27489256
DOI: 10.1136/jim-2016-000206
Perspectives on Medical Education, 2024
The use of the p-value in quantitative research, particularly its threshold of "P < 0.05" for determining "statistical significance," has long been a cornerstone of statistical analysis in research. However, this standard has been increasingly scrutinized for its potential to mislead findings, especially when the practical significance, the number of comparisons, or the suitability of statistical tests are not properly considered. In response to controversy around use of p-values, the American Statistical Association published a statement in 2016 that challenged the research community to abandon the term "statistically significant". This stance has been echoed by leading scientific journals to urge a significant reduction or complete elimination in the reliance on p-values when reporting results. To provide guidance to researchers in health professions education, this paper provides a succinct overview of the ongoing debate regarding the use of p-values and the definition of p-values. It reflects on the controversy by highlighting the common pitfalls associated with p-value interpretation and usage, such as misinterpretation, overemphasis, and false dichotomization between "significant" and "non-significant" results. This paper also outlines specific recommendations for the effective use of p-values in statistical reporting including the importance of reporting effect sizes, confidence intervals, the null hypothesis, and conducting sensitivity analyses for appropriate interpretation. These considerations aim to guide researchers toward a more nuanced and informative use of p-values.
Topics: Humans; Data Interpretation, Statistical; Research Design
PubMed: 38680196
DOI: 10.5334/pme.1324