statistical distribution - OpenMD.com Journal Search

Optimal Stein-type goodness-of-fit tests for count data.

Biometrical Journal. Biometrische... Feb 2023

Authors: Christian H Weiß, Pedro Puig, Boris Aleksandrov...

Common count distributions, such as the Poisson (binomial) distribution for unbounded (bounded) counts considered here, can be characterized by appropriate Stein identities. These identities, in turn, can be used to define a corresponding goodness-of-fit (GoF) test, whose test statistic involves the computation of weighted means for a user-selected weight function f. Here, the choice of f should be made with respect to the relevant alternative scenario, as it will have a great impact on the GoF-test's performance. We derive the asymptotics of both the Poisson and binomial Stein-type GoF-statistic for general count distributions (we also briefly consider the negative-binomial case), such that the asymptotic power is easily computed for arbitrary alternatives. This allows for an efficient implementation of optimal Stein tests, that is, tests that are most powerful within a given class of weight functions. The performance and application of the optimal Stein-type GoF-tests are investigated by simulations and several medical data examples.
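To make the idea of a weighted-mean statistic concrete, here is a minimal sketch built on the standard Poisson Stein identity E[X f(X)] = mu E[f(X+1)]. The ratio form of the statistic and the function names are assumptions made for illustration; the abstract does not state the exact statistic used in the paper.

```python
from math import exp, lgamma, log


def stein_poisson_statistic(counts, f):
    """Sample ratio mean(X f(X)) / (mean(X) * mean(f(X+1))).

    Under a Poisson model the Stein identity implies that this ratio
    should be close to 1. The ratio form is an illustrative assumption.
    """
    n = len(counts)
    mean_x = sum(counts) / n
    mean_xf = sum(x * f(x) for x in counts) / n
    mean_f_shift = sum(f(x + 1) for x in counts) / n
    return mean_xf / (mean_x * mean_f_shift)


def poisson_stein_gap(mu, f, upper=200):
    """E[X f(X)] - mu * E[f(X+1)] under an exact Poisson(mu) pmf.

    The infinite sum is cut off at `upper`; the gap should be ~0.
    """
    gap = 0.0
    for x in range(upper + 1):
        pmf = exp(-mu + x * log(mu) - lgamma(x + 1))
        gap += pmf * (x * f(x) - mu * f(x + 1))
    return gap


if __name__ == "__main__":
    data = [0, 1, 2, 3, 1, 0, 2, 4, 1, 2]  # made-up counts
    print(stein_poisson_statistic(data, lambda x: exp(-x)))
    print(poisson_stein_gap(3.0, lambda x: exp(-x)))
```

Different choices of f weight different parts of the count range, which is why the choice of f matters for power against a given alternative.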

Topics: Binomial Distribution; Models, Statistical

PubMed: 36166681
DOI: 10.1002/bimj.202200073

Tweedie distributions for fitting semicontinuous health care utilization cost data.

BMC Medical Research Methodology Dec 2017

Authors: Christoph F Kurz

BACKGROUND

The statistical analysis of health care cost data is often problematic because these data are usually non-negative, right-skewed and have excess zeros for non-users. This prevents the use of linear models based on the Gaussian or Gamma distribution. A common way to counter this is the use of Two-part or Tobit models, which makes interpretation of the results more difficult. In this study, I explore a statistical distribution from the Tweedie family of distributions that can simultaneously model the probability of zero outcome, i.e. of being a non-user of health care utilization and continuous costs for users.

METHODS

I assess the usefulness of the Tweedie model in a Monte Carlo simulation study that addresses two common situations of low and high correlation of the users and the non-users of health care utilization. Furthermore, I compare the Tweedie model with several other models using a real data set from the RAND health insurance experiment.

RESULTS

I show that the Tweedie distribution fits cost data very well and provides better fit, especially when the number of non-users is low and the correlation between users and non-users is high.

CONCLUSION

The Tweedie distribution provides an interesting solution to many statistical problems in health economic analyses.
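As a rough illustration of how one distribution can carry both a point mass at zero and continuous positive costs, the sketch below uses the standard compound Poisson-gamma representation of a Tweedie distribution with power parameter 1 < p < 2. The function names and the example values are hypothetical, and the abstract does not say which parameterization the study uses.

```python
from math import exp


def tweedie_compound_params(mu, phi, p):
    """Map Tweedie (mean mu, dispersion phi, power p) to compound
    Poisson-gamma parameters: Poisson rate, gamma shape, gamma scale.

    Valid only for 1 < p < 2, where the variance is phi * mu**p.
    """
    if not 1.0 < p < 2.0:
        raise ValueError("compound Poisson-gamma form requires 1 < p < 2")
    rate = mu ** (2.0 - p) / (phi * (2.0 - p))
    shape = (2.0 - p) / (p - 1.0)
    scale = phi * (p - 1.0) * mu ** (p - 1.0)
    return rate, shape, scale


def tweedie_zero_probability(mu, phi, p):
    """P(Y = 0): probability that the Poisson number of gamma-distributed
    cost events is zero, i.e. the modelled share of non-users."""
    rate, _, _ = tweedie_compound_params(mu, phi, p)
    return exp(-rate)


if __name__ == "__main__":
    # Hypothetical values chosen only to make the arithmetic easy to follow.
    rate, shape, scale = tweedie_compound_params(mu=100.0, phi=2.0, p=1.5)
    print(rate, shape, scale, rate * shape * scale)
    print(tweedie_zero_probability(mu=100.0, phi=2.0, p=1.5))
```

The product rate * shape * scale recovers the mean mu, so the zero probability and the continuous part are tied to the same three parameters rather than modelled in two separate parts.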

Topics: Algorithms; Computer Simulation; Health Care Costs; Health Services Research; Humans; Models, Economic; Monte Carlo Method; Patient Acceptance of Health Care; Statistical Distributions

PubMed: 29258428
DOI: 10.1186/s12874-017-0445-y

A Tutorial on Computing Bayes Factors for Single-Subject Designs.

Behavior Therapy Nov 2015

Review

Authors: Rivka M de Vries, Bregje M A Hartogs, Richard D Morey...

When researchers are interested in the effect of certain interventions on certain individuals, single-subject studies are often performed. In their simplest form, such single-subject studies require that a subject is measured on relevant criterion variables several times before an intervention and several times during or after the intervention. Scores from the two phases are then compared in order to investigate the intervention effect. Since observed scores typically consist of a mixture of true scores and random measurement error, simply looking at the difference in scores can be misleading. Hence, de Vries & Morey (2013) developed models and hypothesis tests for single-subject data, quantifying the evidence in data for the size and presence of an intervention effect. In this paper we give a non-technical overview of the models and hypothesis tests and show how they can be applied to real data using the BayesSingleSub R package, with the aid of an empirical data set.

Topics: Bayes Theorem; Data Interpretation, Statistical; Humans; Likelihood Functions; Models, Statistical; Psychology; Research Design; Sample Size; Statistical Distributions

PubMed: 26520223
DOI: 10.1016/j.beth.2014.09.013

Contagious statistical distributions: k-connections and applications in infectious disease environments.

PloS One 2022

Authors: Victoriano García-García, María Martel-Escobar, Francisco-José Vázquez-Polo...

Contagious statistical distributions are a valuable resource for managing contagion by means of k-connected chains of distributions. Binomial, hypergeometric, Pólya, and uniform distributions with the same values for all parameters except sample size n are known to be strongly associated. This paper describes how the relationship can be obtained via factorial moments, simplifying the process by including novel elements. We describe the properties of these distributions and provide examples of their real-world application, and then define a chain of k-connected distributions, which generalises the relationship among samples of any size for a given population and the Pólya urn model.

Topics: Communicable Diseases; Humans; Poly A; Sample Size; Statistical Distributions

PubMed: 35622844
DOI: 10.1371/journal.pone.0268810

Visual statistical learning is facilitated in Zipfian distributions.

Cognition Jan 2021

Authors: Ori Lavi-Rotbain, Inbal Arnon

Humans can extract co-occurrence regularities from their environment, and use them for learning. This statistical learning ability (SL) has been studied extensively as a way to explain how we learn the structure of our environment. These investigations have illustrated the impact of various distributional properties on learning. However, almost all SL studies present the regularities to be learned in uniform frequency distributions where each unit (e.g., image triplet) appears the same number of times: While the regularities themselves are informative, the appearance of the units cannot be predicted. In contrast, real-world learning environments, including the words children hear and the objects they see, are not uniform. Recent research shows that word segmentation is facilitated in a skewed (Zipfian) distribution. Here, we examine the domain-generality of the effect and ask if visual SL is also facilitated in a Zipfian distribution. We use an existing database to show that object combinations have a skewed distribution in children's environment. We then show that children and adults showed better learning in a Zipfian distribution compared to a uniform one, overall, and for low-frequency triplets. These results illustrate the facilitative impact of skewed distributions on learning across modality and age; suggest that the use of uniform distributions may underestimate performance; and point to the possible learnability advantage of such distributions in the real world.

Topics: Adult; Child; Databases, Factual; Hearing; Humans; Intelligence; Spatial Learning; Statistical Distributions

PubMed: 33157380
DOI: 10.1016/j.cognition.2020.104492

Modeling COVID-19 contact-tracing using the ratio regression capture-recapture approach.

Biometrics Dec 2023

Authors: Dankmar Böhning, Rattana Lerdsuwansri, Patarawan Sangnawakij...

Contact-tracing is one of the most effective tools in infectious disease outbreak control. A capture-recapture approach based upon ratio regression is suggested to estimate the completeness of case detection. Ratio regression has been recently developed as a flexible tool for count data modeling and has proved to be successful in the capture-recapture setting. The methodology is applied here to Covid-19 contact tracing data from Thailand. A simple weighted straight line approach is used which includes the Poisson and geometric distribution as special cases. For the case study data of contact tracing for Thailand, a completeness of 83% could be found with a 95% confidence interval of 74%-93%.
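The straight-line idea can be sketched as follows, assuming the ratios r_x = (x + 1) f_{x+1} / f_x are computed from the observed frequencies f_x of counts x >= 1. These ratios are constant in x for a Poisson distribution and linear in x for a geometric one. The weights, the function name and the frequencies below are illustrative assumptions, not the paper's estimator or the Thai data.

```python
def ratio_regression_completeness(freq):
    """Fit a weighted straight line to r_x = (x+1) f_{x+1} / f_x,
    extrapolate to x = 0, and estimate the unobserved frequency f_0.

    `freq` maps each observed count x >= 1 to its frequency f_x.
    Returns (f0_hat, completeness).
    """
    xs = sorted(x for x in freq if freq[x] > 0 and freq.get(x + 1, 0) > 0)
    if len(xs) < 2:
        raise ValueError("need at least two usable ratios to fit a line")
    ratios = [(x + 1) * freq[x + 1] / freq[x] for x in xs]
    # Assumed weights: inverse of 1/f_x + 1/f_{x+1}, so sparse cells count less.
    weights = [1.0 / (1.0 / freq[x] + 1.0 / freq[x + 1]) for x in xs]

    total_w = sum(weights)
    x_bar = sum(w * x for w, x in zip(weights, xs)) / total_w
    r_bar = sum(w * r for w, r in zip(weights, ratios)) / total_w
    sxx = sum(w * (x - x_bar) ** 2 for w, x in zip(weights, xs))
    sxr = sum(w * (x - x_bar) * (r - r_bar)
              for w, x, r in zip(weights, xs, ratios))
    slope = sxr / sxx
    intercept = r_bar - slope * x_bar  # fitted ratio at x = 0

    if intercept <= 0:
        raise ValueError("fitted ratio at x = 0 must be positive")
    f0_hat = freq[1] / intercept  # from r_0 = f_1 / f_0
    observed = sum(freq.values())
    return f0_hat, observed / (observed + f0_hat)


if __name__ == "__main__":
    # Made-up frequencies proportional to a Poisson distribution with mean 2.
    print(ratio_regression_completeness({1: 300, 2: 300, 3: 200, 4: 100}))
```

Completeness is then the observed number of cases divided by the observed number plus the estimated number of missed cases.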

Topics: Humans; COVID-19; Contact Tracing; Disease Outbreaks; Statistical Distributions

PubMed: 36795803
DOI: 10.1111/biom.13842

Statistics Commentary Series: Commentary #16-Regression Toward the Mean.

Journal of Clinical Psychopharmacology Oct 2016

Authors: David L Streiner

Topics: Humans; Psychometrics; Statistical Distributions; Statistics as Topic

PubMed: 27496345
DOI: 10.1097/JCP.0000000000000551

Asymptotic uncertainty of false discovery proportion.

Biometrics Jan 2024

Authors: Meng Mei, Tao Yu, Yuan Jiang...

Multiple testing has been a prominent topic in statistical research. Despite extensive work in this area, controlling false discoveries remains a challenging task, especially when the test statistics exhibit dependence. Various methods have been proposed to estimate the false discovery proportion (FDP) under arbitrary dependencies among the test statistics. One key approach is to transform arbitrary dependence into weak dependence and subsequently establish the strong consistency of FDP and false discovery rate under weak dependence. As a result, FDPs converge to the same asymptotic limit within the framework of weak dependence. However, we have observed that the asymptotic variance of FDP can be significantly influenced by the dependence structure of the test statistics, even when they exhibit only weak dependence. Quantifying this variability is of great practical importance, as it serves as an indicator of the quality of FDP estimation from the data. To the best of our knowledge, there is limited research on this aspect in the literature. In this paper, we aim to fill in this gap by quantifying the variation of FDP, assuming that the test statistics exhibit weak dependence and follow normal distributions. We begin by deriving the asymptotic expansion of the FDP and subsequently investigate how the asymptotic variance of the FDP is influenced by different dependence structures. Based on the insights gained from this study, we recommend that in multiple testing procedures utilizing FDP, reporting both the mean and variance estimates of FDP can provide a more comprehensive assessment of the study's outcomes.

Topics: Uncertainty; Normal Distribution

PubMed: 38497826
DOI: 10.1093/biomtc/ujae015

Multinomial N-mixture models for removal sampling.

Biometrics Jun 2020

Authors: Linda M Haines

Multinomial N-mixture models are commonly used to fit data from a removal sampling protocol. If the mixing distribution is negative binomial, the distribution of the counts does not appear to have been identified, and practitioners approximate the requisite likelihood by placing an upper bound on the embedded infinite sum. In this paper, the distribution which underpins the multinomial N-mixture model with a negative binomial mixing distribution is shown to belong to the broad class of multivariate negative binomial distributions. Specifically, the likelihood can be expressed in closed form as the product of conditional and marginal likelihoods and the information matrix shown to be block diagonal. As a consequence, the nature of the maximum likelihood estimates of the unknown parameters and their attendant standard errors can be examined and tests of the hypothesis of the Poisson against the negative binomial mixing distribution formulated. In addition, appropriate multinomial N-mixture models for data sets which include zero site totals can also be constructed. Two illustrative examples are provided.
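The "upper bound on the embedded infinite sum" can be illustrated with a small sketch. It assumes the usual removal cell probabilities p(1-p)^(j-1) for sampling occasion j and, for simplicity, uses Poisson mixing rather than the negative binomial case treated in the paper, because for Poisson mixing the closed form (independent Poisson counts) is well known and serves as a check. Function names and data are hypothetical.

```python
from math import exp, lgamma, log


def removal_cell_probs(p, occasions):
    """Probability that an animal is first captured on occasion j = 1..J."""
    return [p * (1.0 - p) ** j for j in range(occasions)]


def truncated_sum_likelihood(y, lam, p, upper):
    """Likelihood of removal counts y at one site, summing the latent
    abundance N from sum(y) up to the user-chosen bound `upper`."""
    pis = removal_cell_probs(p, len(y))
    never = 1.0 - sum(pis)  # probability of never being captured
    total_caught = sum(y)
    log_fixed = sum(yj * log(pi) - lgamma(yj + 1) for yj, pi in zip(y, pis))
    likelihood = 0.0
    for n in range(total_caught, upper + 1):
        log_poisson = -lam + n * log(lam) - lgamma(n + 1)
        uncaught = n - total_caught
        log_multinomial = (lgamma(n + 1) - lgamma(uncaught + 1)
                           + log_fixed + uncaught * log(never))
        likelihood += exp(log_poisson + log_multinomial)
    return likelihood


def poisson_closed_form(y, lam, p):
    """Closed form under Poisson mixing: independent Poisson(lam * pi_j)."""
    pis = removal_cell_probs(p, len(y))
    return exp(sum(-lam * pi + yj * log(lam * pi) - lgamma(yj + 1)
                   for yj, pi in zip(y, pis)))


if __name__ == "__main__":
    counts = [5, 3, 1]  # made-up removal counts over three occasions
    print(truncated_sum_likelihood(counts, 20.0, 0.4, 200))
    print(poisson_closed_form(counts, 20.0, 0.4))
```

With a generous bound the truncated sum agrees with the closed form; with a bound that is too small it understates the likelihood, which is the practical risk of the approximation.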

Topics: Animals; Biometry; Computer Simulation; Confidence Intervals; Ecology; Florida; Forests; Likelihood Functions; Maryland; Models, Statistical; Multivariate Analysis; Passeriformes; Perches; Poisson Distribution; Population Density; Population Dynamics; Rivers; Sample Size

PubMed: 31513284
DOI: 10.1111/biom.13147

A family of Gamma-generated distributions: Statistical properties and applications.

Statistical Methods in Medical Research Aug 2021

Authors: Hormatollah Pourreza, Ezzatallah Baloui Jamkhaneh, Einolah Deiri...

In this paper, we concentrate on the statistical properties of the Gamma-X family of distributions. A special case of this family is the Gamma-Weibull distribution. Therefore, the statistical properties of the Gamma-Weibull distribution as a sub-model of the Gamma-X family, such as moments, variance, skewness, kurtosis and Rényi entropy, are discussed. Also, the parameters of the Gamma-Weibull distribution are estimated by the method of maximum likelihood. Some sub-models of the Gamma-X family are investigated, including the cumulative distribution, probability density, survival and hazard functions. A Monte Carlo simulation study is conducted to assess the performance of these estimators. Finally, the adequacy of the Gamma-Weibull distribution in data modeling is verified using two real clinical data sets. Mathematics Subject Classification: 62E99; 62E15.

Topics: Computer Simulation; Likelihood Functions; Models, Statistical; Monte Carlo Method; Statistical Distributions

PubMed: 34006148
DOI: 10.1177/09622802211009262