-
The Cochrane Database of Systematic... Jun 2014 (Meta-Analysis)
Meta-Analysis Review
BACKGROUND
Bronchiolitis is an acute, viral lower respiratory tract infection affecting infants and is sometimes treated with bronchodilators.
OBJECTIVES
To assess the effects of bronchodilators on clinical outcomes in infants (0 to 12 months) with acute bronchiolitis.
SEARCH METHODS
We searched CENTRAL 2013, Issue 12, MEDLINE (1966 to January Week 2, 2014) and EMBASE (1998 to January 2014).
SELECTION CRITERIA
Randomized controlled trials (RCTs) comparing bronchodilators (other than epinephrine) with placebo for bronchiolitis.
DATA COLLECTION AND ANALYSIS
Two authors assessed trial quality and extracted data. We obtained unpublished data from trial authors.
MAIN RESULTS
We included 30 trials (35 data sets) representing 1992 infants with bronchiolitis. In 11 inpatient and 10 outpatient studies, oxygen saturation did not improve with bronchodilators (mean difference (MD) -0.43, 95% confidence interval (CI) -0.92 to 0.06, n = 1242). Outpatient bronchodilator treatment did not reduce the rate of hospitalization (11.9% in bronchodilator group versus 15.9% in placebo group, odds ratio (OR) 0.75, 95% CI 0.46 to 1.21, n = 710). Inpatient bronchodilator treatment did not reduce the duration of hospitalization (MD 0.06, 95% CI -0.27 to 0.39, n = 349).
Effect estimates for inpatients (MD -0.62, 95% CI -1.40 to 0.16) were slightly larger than for outpatients (MD -0.25, 95% CI -0.61 to 0.11) for oximetry. Oximetry outcomes showed significant heterogeneity (I² statistic = 81%). Including only studies with low risk of bias had little impact on the overall effect size of oximetry (MD -0.38, 95% CI -0.75 to 0.00) but results were close to statistical significance.
In eight inpatient studies, there was no change in average clinical score (standardized MD (SMD) -0.14, 95% CI -0.41 to 0.12) with bronchodilators. In nine outpatient studies, the average clinical score decreased slightly with bronchodilators (SMD -0.42, 95% CI -0.79 to -0.06), a statistically significant finding of questionable clinical importance. The clinical score outcome showed significant heterogeneity (I² statistic = 73%). Including only studies with low risk of bias reduced the heterogeneity but had little impact on the overall effect size of average clinical score (SMD -0.22, 95% CI -0.41 to -0.03).
Sub-analyses limited to nebulized albuterol or salbutamol among outpatients (nine studies) showed no effect on oxygen saturation (MD -0.19, 95% CI -0.59 to 0.21, n = 572), average clinical score (SMD -0.36, 95% CI -0.83 to 0.11, n = 532) or hospital admission after treatment (OR 0.77, 95% CI 0.44 to 1.33, n = 404).
Adverse effects included tachycardia, oxygen desaturation and tremors.
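The pooled estimates in these results are reported with 95% CIs only, without P-values. As a rough illustration, the sketch below back-calculates an approximate standard error and two-sided P-value from a reported difference and its CI. It assumes the interval was built as estimate ± 1.96 × SE on the raw scale, which is an assumption about the review's method rather than something the abstract states; the function name `se_and_p_from_ci` is hypothetical.

```python
from math import erfc, sqrt


def se_and_p_from_ci(estimate, lower, upper, z_crit=1.96):
    """Approximate SE, z and two-sided P from a difference and its 95% CI.

    Assumes a symmetric normal-approximation interval (estimate +/- z_crit * SE).
    """
    se = (upper - lower) / (2 * z_crit)
    z = estimate / se
    p = erfc(abs(z) / sqrt(2))  # two-sided tail area of the standard normal
    return se, z, p


# Figures taken from the abstract: overall oximetry MD and outpatient clinical score SMD
oximetry_se, oximetry_z, oximetry_p = se_and_p_from_ci(-0.43, -0.92, 0.06)
score_se, score_z, score_p = se_and_p_from_ci(-0.42, -0.79, -0.06)

print(f"Oximetry MD: SE={oximetry_se:.3f}, z={oximetry_z:.2f}, P={oximetry_p:.3f}")
print(f"Outpatient score SMD: SE={score_se:.3f}, z={score_z:.2f}, P={score_p:.3f}")
```

Under these assumptions the oximetry interval, which crosses zero, gives a P-value above 0.05, while the outpatient clinical score interval, which excludes zero, gives one below 0.05. This agrees with how the abstract describes the two findings.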
AUTHORS' CONCLUSIONS
Bronchodilators such as albuterol or salbutamol do not improve oxygen saturation, do not reduce hospital admission after outpatient treatment, do not shorten the duration of hospitalization and do not reduce the time to resolution of illness at home. Given the adverse side effects and the expense associated with these treatments, bronchodilators are not effective in the routine management of bronchiolitis. This meta-analysis continues to be limited by the small sample sizes and the lack of standardized study design and validated outcomes across the studies. Future trials with large sample sizes, standardized methodology across clinical sites and consistent assessment methods are needed to answer completely the question of efficacy.
Topics: Acute Disease; Albuterol; Ambulatory Care; Bronchiolitis; Bronchodilator Agents; Hospitalization; Humans; Infant; Infant, Newborn; Oxygen; Randomized Controlled Trials as Topic
PubMed: 24937099
DOI: 10.1002/14651858.CD001266.pub4 -
Biochemia Medica 2013
The Chi-square statistic is a non-parametric (distribution free) tool designed to analyze group differences when the dependent variable is measured at a nominal level. Like all non-parametric statistics, the Chi-square is robust with respect to the distribution of the data. Specifically, it does not require equality of variances among the study groups or homoscedasticity in the data. It permits evaluation of both dichotomous independent variables and multiple group studies. Unlike many other non-parametric and some parametric statistics, the calculations needed to compute the Chi-square provide considerable information about how each of the groups performed in the study. This richness of detail allows the researcher to understand the results and thus to derive more detailed information from this statistic than from many others. The Chi-square is a significance statistic, and should be followed with a strength statistic. Cramér's V is the most common strength test used when a significant Chi-square result has been obtained. Advantages of the Chi-square include its robustness with respect to distribution of the data, its ease of computation, the detailed information that can be derived from the test, its use in studies for which parametric assumptions cannot be met, and its flexibility in handling data from both two group and multiple group studies. Limitations include its sample size requirements, difficulty of interpretation when there are large numbers of categories (20 or more) in the independent or dependent variables, and the tendency of Cramér's V to produce relatively low correlation measures, even for highly significant results.
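To make the two-step procedure concrete (a significance statistic followed by a strength statistic), here is a minimal sketch that computes the Pearson Chi-square statistic and Cramér's V for a contingency table in plain Python. The counts are hypothetical, no continuity correction is applied, and the function names are illustrative rather than taken from the article.

```python
from math import sqrt


def chi_square(table):
    """Pearson Chi-square statistic, degrees of freedom and total n for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            # Each cell's contribution shows how far that group departs from expectation
            chi2 += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df, n


def cramers_v(table):
    """Cramér's V: Chi-square scaled by sample size and table dimensions."""
    chi2, _, n = chi_square(table)
    k = min(len(table), len(table[0])) - 1
    return sqrt(chi2 / (n * k))


# Hypothetical 2 x 2 table: rows are groups, columns are outcome categories
table = [[30, 10], [20, 40]]
chi2, df, n = chi_square(table)
v = cramers_v(table)
print(f"chi2={chi2:.2f}, df={df}, n={n}, V={v:.2f}")
```

With these made-up counts the statistic is large relative to its one degree of freedom, yet V is only about 0.41, which illustrates the abstract's point that Cramér's V can look modest even when the Chi-square is highly significant.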
Topics: Chi-Square Distribution; Data Interpretation, Statistical
PubMed: 23894860
DOI: 10.11613/bm.2013.018 -
Clinical Oral Implants Research Sep 2009 (Meta-Analysis)
Meta-Analysis Review
AIM
The aim of the present review was to systematically assess the dental literature in terms of soft tissue grafting techniques. The focused question was: is one method superior to others for augmentation and stability of the augmented soft tissue in terms of increasing the width of keratinized tissue (part 1) and gain in soft tissue volume (part 2)?
METHODS
A Medline search was performed for human studies focusing on augmentation of keratinized tissue and/or soft tissue volume, and complemented by additional hand searching. Relevant studies were identified and statistical results were reported for meta-analyses including the test minus control weighted mean differences with 95% confidence intervals, the I-squared statistic for tests of heterogeneity, and the number of significant studies.
RESULTS
Twenty-five (part 1) and three (part 2) studies met the inclusion criteria; 14 studies (part 1) were eligible for comparison using meta-analyses. An apically positioned flap/vestibuloplasty (APF/V) procedure resulted in a statistically significantly greater gain in keratinized tissue than untreated controls. APF/V plus autogenous tissue revealed statistically significantly more attached gingiva compared with untreated controls and a borderline statistical significance compared with APF/V plus allogenic tissue. Statistically significantly more shrinkage was observed for the APF/V plus allogenic graft compared with the APF/V plus autogenous tissue. Patient-centered outcomes did not reveal any of the treatment methods to be superior regarding postoperative complications. The three studies reporting on soft tissue volume augmentation could not be compared due to lack of homogeneity. The use of subepithelial connective tissue grafts (SCTGs) resulted in statistically significantly more soft tissue volume gain compared with free gingival grafts (FGGs).
CONCLUSIONS
APF/V is a successful treatment concept to increase the width of keratinized tissue or attached gingiva around teeth. The addition of autogenous tissue statistically significantly increases the width of attached gingiva. For soft tissue volume augmentation, only limited data are available favoring SCTGs over FGGs.
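The methods of this review mention weighted mean differences with 95% confidence intervals and the I-squared statistic. The sketch below shows one common way such quantities are computed: inverse-variance fixed-effect pooling, Cochran's Q, and I² derived from Q. The weighting scheme is an assumption (the abstract does not say which model was used), and the study values are hypothetical, not taken from the review.

```python
from math import sqrt


def pool_fixed_effect(estimates, std_errors, z_crit=1.96):
    """Inverse-variance pooled estimate, 95% CI, Cochran's Q and I-squared (%)."""
    weights = [1 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, estimates)) / total_weight
    pooled_se = sqrt(1 / total_weight)
    # Q sums the weighted squared deviations of each study from the pooled estimate
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, estimates))
    df = len(estimates) - 1
    i_squared = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    ci = (pooled - z_crit * pooled_se, pooled + z_crit * pooled_se)
    return pooled, ci, q, i_squared


# Hypothetical test-minus-control mean differences (mm) and their standard errors
mean_differences = [1.0, 2.0, 3.0]
standard_errors = [0.5, 0.5, 0.5]

pooled, ci, q, i_squared = pool_fixed_effect(mean_differences, standard_errors)
print(f"Pooled MD={pooled:.2f}, 95% CI {ci[0]:.2f} to {ci[1]:.2f}, Q={q:.1f}, I2={i_squared:.0f}%")
```

A high I² in a sketch like this signals that the studies disagree more than chance alone would explain, which is the kind of heterogeneity that led the authors to refrain from pooling the soft tissue volume studies.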
Topics: Collagen; Connective Tissue; Gingiva; Gingivoplasty; Humans; Keratins; Skin, Artificial; Vestibuloplasty
PubMed: 19663961
DOI: 10.1111/j.1600-0501.2009.01784.x -
Reproductive Health Apr 2021
This article challenges the "tyranny of P-value" and promotes more valuable and applicable interpretations of the results of research on health care delivery. We provide here solid arguments to retire statistical significance as the unique way to interpret results, after presenting the current state of the debate inside the scientific community. Instead, we promote reporting the much more informative confidence intervals and eventually adding exact P-values. We also provide some clues to integrate statistical and clinical significance by referring to minimal important differences and integrating the effect size of an intervention and the certainty of evidence ideally using the GRADE approach. We have argued against interpreting or reporting results as statistically significant or statistically non-significant. We recommend showing important clinical benefits with their confidence intervals in cases of point estimates compatible with results benefits and even important harms. It seems fair to report the point estimate and the more likely values along with a very clear statement of the implications of extremes of the intervals. We recommend drawing conclusions considering the multiple factors besides P-values, such as certainty of the evidence for each outcome, net benefit, economic considerations and values and preferences. We use several examples and figures to illustrate different scenarios and further suggest a wording to standardize the reporting. Several statistical measures have a role in the scientific communication of studies, but it is time to understand that there is life beyond statistical significance. There is a great opportunity for improvement towards a more complete interpretation and a more standardized reporting.
Topics: Data Interpretation, Statistical; Decision Making; Humans; Jurisprudence; Statistics as Topic
PubMed: 33865412
DOI: 10.1186/s12978-021-01131-w -
World Journal of Methodology Dec 2017
A statistically significant research finding should not be defined as a P-value of 0.05 or less, because this definition does not take into account study power. Statistical significance was originally defined by Fisher RA as a P-value of 0.05 or less. According to Fisher, any finding that is likely to occur by random variation no more than 1 in 20 times is considered significant. Neyman J and Pearson ES subsequently argued that Fisher's definition was incomplete. They proposed that statistical significance could only be determined by analyzing the chance of incorrectly considering a study finding significant (a Type I error) or incorrectly considering a study finding insignificant (a Type II error). Their definition of statistical significance is also incomplete because the error rates are considered separately, not together. A better definition of statistical significance is the positive predictive value of a P-value, which is equal to the power divided by the sum of power and the P-value. This definition is more complete and relevant than Fisher's or Neyman-Pearson's definitions, because it takes into account both concepts of statistical significance. Using this definition, a statistically significant finding requires a P-value of 0.05 or less when the power is at least 95%, and a P-value of 0.032 or less when the power is 60%. To achieve statistical significance, P-values must be adjusted downward as the study power decreases.
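The definition above is simple arithmetic, so a short sketch may help. It computes the positive predictive value as power / (power + P) and solves the same expression for the largest P-value that still reaches a target value. The 0.95 target is an assumption inferred from the two worked figures in the abstract (0.05 at 95% power, 0.032 at 60% power); the abstract does not state it explicitly, and the function names are hypothetical.

```python
def positive_predictive_value(p_value, power):
    """Positive predictive value of a P-value: power / (power + P)."""
    return power / (power + p_value)


def p_value_threshold(power, target_ppv=0.95):
    """Largest P-value for which power / (power + P) still reaches target_ppv.

    target_ppv=0.95 is an assumed target, inferred from the abstract's examples.
    """
    return power * (1 - target_ppv) / target_ppv


threshold_at_95 = p_value_threshold(0.95)
threshold_at_60 = p_value_threshold(0.60)

print(f"Power 95%: P must be <= {threshold_at_95:.3f}")
print(f"Power 60%: P must be <= {threshold_at_60:.3f}")
print(f"Check: PPV at P=0.032, power 60% = {positive_predictive_value(0.032, 0.60):.3f}")
```

Under this assumption the thresholds come out at about 0.050 and 0.032, matching the figures quoted in the abstract, and the required P-value shrinks in direct proportion to power.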
PubMed: 29354483
DOI: 10.5662/wjm.v7.i4.112 -
Journal of Thoracic Disease Oct 2017
Biomedical research is seldom done with entire populations but rather with samples drawn from a population. Although we work with samples, our goal is to describe and draw inferences regarding the underlying population. It is possible to use a sample statistic and estimates of error in the sample to get a fair idea of the population parameter, not as a single value, but as a range of values. This range is the confidence interval (CI), which is estimated on the basis of a desired confidence level. Calculation of the CI of a sample statistic takes the general form: CI = Point estimate ± Margin of error, where the margin of error is given by the product of a critical value (z) derived from the standard normal curve and the standard error of the point estimate. Calculation of the standard error varies depending on whether the sample statistic of interest is a mean, proportion, odds ratio (OR), and so on. The factors affecting the width of the CI include the desired confidence level, the sample size and the variability in the sample. Although the 95% CI is most often used in biomedical research, a CI can be calculated for any level of confidence. A 99% CI will be wider than a 95% CI for the same sample. Conflict between clinical importance and statistical significance is an important issue in biomedical research. Clinical importance is best inferred by looking at the effect size, that is, how large the actual change or difference is. However, statistical significance in terms of P only suggests whether there is any difference in probability terms. Use of the CI supplements the P value by providing an estimate of actual clinical effect. Of late, clinical trials are being designed specifically as superiority, non-inferiority or equivalence studies. The conclusions from these alternative trial designs are based on CI values rather than the P value from intergroup comparison.
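As an illustration of the general form CI = point estimate ± z × standard error, the sketch below computes 95% and 99% intervals for a sample mean using only the Python standard library. The measurements are hypothetical, and the z-based critical value follows the abstract's description; for a sample this small a t-based critical value would usually be preferred in practice.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(sample, confidence=0.95):
    """CI for a mean: point estimate +/- z * standard error (normal approximation)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    standard_error = stdev(sample) / sqrt(len(sample))
    point_estimate = mean(sample)
    margin = z * standard_error
    return point_estimate - margin, point_estimate + margin


# Hypothetical measurements, for illustration only
sample = [4.8, 5.1, 5.0, 5.3, 4.9, 5.2, 5.0, 5.1]

ci_95 = mean_confidence_interval(sample, 0.95)
ci_99 = mean_confidence_interval(sample, 0.99)
print(f"95% CI: {ci_95[0]:.3f} to {ci_95[1]:.3f}")
print(f"99% CI: {ci_99[0]:.3f} to {ci_99[1]:.3f}")
```

Running this on the same sample shows the 99% interval enclosing the 95% interval, as the abstract states.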
PubMed: 29268424
DOI: 10.21037/jtd.2017.09.14 -
Evidence-based Mental Health Feb 2019
INTRODUCTION
It is difficult to reason correctly when the information available is uncertain. Reasoning under uncertainty is also known as probabilistic reasoning.
METHODS
We discuss probabilistic reasoning in the context of a medical diagnosis or prognosis. The information available is the symptoms (for a diagnosis) or the diagnosis (for a prognosis). We show how probabilities of events are updated in the light of new evidence (conditional probabilities/Bayes' theorem). A resolution is explained in which the support of the information for the diagnosis or prognosis is measured by the comparison of two probabilities, a statistic also known as the likelihood ratio.
RESULTS
The likelihood ratio is a continuous measure of support that is not subject to the discrete nature of statistical significance where a result is either classified as 'significant' or 'not significant'. It updates prior beliefs about diagnoses or prognoses in a coherent manner and enables proper consideration of successive pieces of information.
DISCUSSION
Probabilistic reasoning is not innate and relies on good education. Common mistakes include the 'prosecutor's fallacy' and the interpretation of relative measures without consideration of the actual risks of the outcome, for example, interpretation of a likelihood ratio without taking into account the prior odds.
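The point about prior odds can be shown with the odds form of Bayes' theorem: post-test odds = prior odds × likelihood ratio. The sketch below is a minimal illustration with hypothetical test characteristics and prevalences; the function names are not from the article.

```python
def positive_likelihood_ratio(sensitivity, specificity):
    """LR+ compares P(positive | disease) with P(positive | no disease)."""
    return sensitivity / (1 - specificity)


def post_test_probability(prior_probability, likelihood_ratio):
    """Update a prior probability with a likelihood ratio via the odds form of Bayes' theorem."""
    prior_odds = prior_probability / (1 - prior_probability)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)


# Hypothetical test with 90% sensitivity and 90% specificity
lr_positive = positive_likelihood_ratio(0.90, 0.90)

rare = post_test_probability(0.01, lr_positive)    # prior probability of 1%
common = post_test_probability(0.50, lr_positive)  # prior probability of 50%
print(f"LR+ = {lr_positive:.1f}")
print(f"Post-test probability with 1% prior:  {rare:.3f}")
print(f"Post-test probability with 50% prior: {common:.3f}")
```

The same likelihood ratio yields very different post-test probabilities depending on the prior, which is why interpreting a likelihood ratio on its own is listed among the common mistakes.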
Topics: Clinical Decision-Making; Humans; Models, Theoretical; Thinking; Uncertainty
PubMed: 30679196
DOI: 10.1136/ebmental-2018-300074 -
Human Reproduction (Oxford, England) Nov 2023
STUDY QUESTION
What were the frequency and temporal trends of reporting P-values and effect measures in the abstracts of reproductive medicine studies in 1990-2022, how were reported P-values distributed, and what proportion of articles that present with statistical inference reported statistically significant results, i.e. 'positive' results?
SUMMARY ANSWER
Around one in six abstracts reported P-values alone without effect measures, while the prevalence of effect measures, whether reported alone or accompanied by P-values, has been increasing, especially in meta-analyses and randomized controlled trials (RCTs); the reported P-values were frequently observed around certain cut-off values, notably at 0.001, 0.01, or 0.05, and among abstracts present with statistical inference (i.e. P-value, CIs, or significant terms), a large majority (77%) reported at least one statistically significant finding.
WHAT IS KNOWN ALREADY
Publishing or reporting only results that show a 'positive' finding causes bias in evaluating interventions and risk factors and may incur adverse health outcomes for patients. Despite efforts to minimize publication reporting bias in medical research, it remains unclear whether the magnitude and patterns of the bias have changed over time.
STUDY DESIGN, SIZE, DURATION
We studied abstracts of reproductive medicine studies from 1990 to 2022. The reproductive medicine studies were published in 23 first-quartile journals under the category of Obstetrics and Gynaecology and Reproductive Biology in Journal Citation Reports and 5 high-impact general medical journals (The Journal of the American Medical Association, The Lancet, The BMJ, The New England Journal of Medicine, and PLoS Medicine). Articles without abstracts, animal studies, and non-research articles, such as case reports or guidelines, were excluded.
PARTICIPANTS/MATERIALS, SETTING, METHODS
Automated text-mining was used to extract three types of statistical significance reporting, including P-values, CIs, and text description. Meanwhile, abstracts were text-mined for the presence of effect size metrics and Bayes factors. Five hundred abstracts were randomly selected and manually checked for the accuracy of automatic text extraction. The extracted statistical significance information was then analysed for temporal trends and distribution in general as well as in subgroups of study designs and journals.
MAIN RESULTS AND THE ROLE OF CHANCE
A total of 24 907 eligible reproductive medicine articles were identified from 170 739 screened articles published in 28 journals. The proportion of abstracts not reporting any statistical significance inference halved from 81% (95% CI, 76-84%) in 1990 to 40% (95% CI, 38-44%) in 2021, while reporting P-values alone remained relatively stable, at 15% (95% CI, 12-18%) in 1990 and 19% (95% CI, 16-22%) in 2021. By contrast, the proportion of abstracts reporting effect measures alone increased considerably from 4.1% (95% CI, 2.6-6.3%) in 1990 to 26% (95% CI, 23-29%) in 2021. Similarly, the proportion of abstracts reporting effect measures together with P-values showed substantial growth from 0.8% (95% CI, 0.3-2.2%) to 14% (95% CI, 12-17%) during the same timeframe. Of 30 182 statistical significance inferences, 56% (n = 17 077) conveyed statistical inferences via P-values alone, 30% (n = 8945) via text description alone such as significant or non-significant, 9.3% (n = 2820) via CIs alone, and 4.7% (n = 1340) via both CI and P-values. The reported P-values (n = 18 417), including both a continuum of P-values and dichotomized P-values, were frequently observed around common cut-off values such as 0.001 (20%), 0.05 (16%), and 0.01 (10%). Of the 13 200 reproductive medicine abstracts containing at least one statistical inference, 77% of abstracts made at least one statistically significant statement. Among articles that reported statistical inference, a decline in the proportion of making at least one statistically significant inference was only seen in RCTs, dropping from 71% (95% CI, 48-88%) in 1990 to 59% (95% CI, 42-73%) in 2021, whereas the proportion in the rest of study types remained almost constant over the years. Of abstracts that reported P-value, 87% (95% CI, 86-88%) reported at least one statistically significant P-value; it was 92% (95% CI, 82-97%) in 1990 and reached its peak at 97% (95% CI, 93-99%) in 2001 before declining to 81% (95% CI, 76-85%) in 2021.
LIMITATIONS, REASONS FOR CAUTION
First, our analysis focused solely on reporting patterns in abstracts but not full-text papers; however, in principle, abstracts should include condensed impartial information and avoid selective reporting. Second, while we attempted to identify all types of statistical significance reporting, our text mining was not flawless. However, the manual assessment showed that inaccuracies were not frequent.
WIDER IMPLICATIONS OF THE FINDINGS
There is a welcome trend that effect measures are increasingly reported in the abstracts of reproductive medicine studies, specifically in RCTs and meta-analyses. Publication reporting bias remains a major concern. Inflated estimates of interventions and risk factors could harm decisions built upon biased evidence, including clinical recommendations and planning of future research.
STUDY FUNDING/COMPETING INTEREST(S)
No funding was received for this study. B.W.M. is supported by an NHMRC Investigator grant (GNT1176437); B.W.M. reports research grants and travel support from Merck and consultancy from Merck and ObsEva. W.L. is supported by an NHMRC Investigator Grant (GNT2016729). Q.F. reports receiving a PhD scholarship from Merck. The other author has no conflict of interest to declare.
TRIAL REGISTRATION NUMBER
N/A.
PubMed: 38015794
DOI: 10.1093/humrep/dead248 -
Problems and alternatives of testing significance using null hypothesis and P-value in food research. Food Science and Biotechnology May 2023 (Review)
Review
A testing method to identify statistically significant differences by comparing the significance level and the probability value based on the Null Hypothesis Significance Test (NHST) has been used in food research. However, problems with this testing method have been discussed. Several alternatives to the NHST and the P-value test methods have been proposed, including lowering the P-value threshold and using confidence interval (CI), effect size, and Bayesian statistics. The CI estimates the extent of the effect or difference and determines the presence or absence of statistical significance. The effect size index determines the degree of effect difference and allows for the comparison of various statistical results. Bayesian statistics enable predictions to be made even when only a small amount of data is available. In conclusion, CI, effect size, and Bayesian statistics can complement or replace the traditional use of NHST and the P-value in food research.
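As one concrete example of an effect size index, the sketch below computes Cohen's d, the difference between two group means divided by their pooled standard deviation. Choosing Cohen's d is an assumption made here for illustration; the abstract does not name a specific index, and the scores are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled sample standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_variance = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_variance)


# Hypothetical sensory scores for two product formulations
formulation_a = [2, 4, 6]
formulation_b = [1, 3, 5]

d = cohens_d(formulation_a, formulation_b)
print(f"Cohen's d = {d:.2f}")
```

Because d is expressed in standard deviation units, values from studies that used different measurement scales can be placed side by side, which is the comparability the abstract refers to.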
PubMed: 37363053
DOI: 10.1007/s10068-023-01348-4 -
La Medicina Del Lavoro Oct 2017
BACKGROUND
The P-value is widely used as a summary statistic of scientific results. Unfortunately, there is a widespread tendency to dichotomize its value into "P<0.05" (defined as "statistically significant") and "P>0.05" ("statistically not significant"), with the former implying a "positive" result and the latter a "negative" one.
OBJECTIVE
To show the unsuitability of such an approach when evaluating the effects of environmental and occupational risk factors.
METHODS
We provide examples of distorted use of P-value and of the negative consequences for science and public health of such a black-and-white vision.
RESULTS
The rigid interpretation of the P-value as a dichotomy fosters confusion between health relevance and statistical significance, discourages careful thinking, and diverts attention from what really matters, the health significance.
DISCUSSION
A much better way to express and communicate scientific results involves reporting effect estimates (e.g., risks, risk ratios or risk differences) and their confidence intervals (CI), which summarize and convey both health significance and statistical uncertainty. Unfortunately, many researchers do not usually consider the whole CI but only examine whether it includes the null value, therefore degrading this procedure to the same P-value dichotomy (statistical significance or not).
CONCLUSIONS
In reporting statistical results of scientific research, present effect estimates with their confidence intervals and do not qualify the P-value as "significant" or "not significant".
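A minimal sketch of the recommended reporting style follows: it computes a risk ratio and a risk difference with 95% CIs from a two-group table of counts. The counts are hypothetical, and the interval methods (log-scale normal approximation for the ratio, Wald interval for the difference) are common textbook choices assumed here, not methods named in the abstract.

```python
from math import exp, log, sqrt


def risk_ratio_ci(events_exposed, n_exposed, events_unexposed, n_unexposed, z=1.96):
    """Risk ratio with a CI built on the log scale."""
    rr = (events_exposed / n_exposed) / (events_unexposed / n_unexposed)
    se_log_rr = sqrt(
        1 / events_exposed - 1 / n_exposed + 1 / events_unexposed - 1 / n_unexposed
    )
    return rr, exp(log(rr) - z * se_log_rr), exp(log(rr) + z * se_log_rr)


def risk_difference_ci(events_exposed, n_exposed, events_unexposed, n_unexposed, z=1.96):
    """Risk difference with a Wald CI."""
    risk_e = events_exposed / n_exposed
    risk_u = events_unexposed / n_unexposed
    rd = risk_e - risk_u
    se = sqrt(risk_e * (1 - risk_e) / n_exposed + risk_u * (1 - risk_u) / n_unexposed)
    return rd, rd - z * se, rd + z * se


# Hypothetical cohort: 30/100 cases among exposed workers, 20/100 among unexposed
rr, rr_low, rr_high = risk_ratio_ci(30, 100, 20, 100)
rd, rd_low, rd_high = risk_difference_ci(30, 100, 20, 100)

print(f"Risk ratio {rr:.2f} (95% CI {rr_low:.2f} to {rr_high:.2f})")
print(f"Risk difference {rd:.2f} (95% CI {rd_low:.2f} to {rd_high:.2f})")
```

In this made-up example the interval for the risk ratio includes 1, yet it also extends to more than a doubling of risk. Reading the whole interval, rather than only checking whether it contains the null value, is the practice the authors argue for.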
Topics: Humans; Occupational Health; Uncertainty
PubMed: 29084124
DOI: 10.23749/mdl.v108i5.6603