-
Fertility and Sterility Dec 2020
Topics: COVID-19; Humans; Pandemics; Parents; SARS-CoV-2; Selection Bias
PubMed: 33280724
DOI: 10.1016/j.fertnstert.2020.10.057
-
Journal of the American Geriatrics... Sep 2019
Review
OBJECTIVES
Selection bias is a well-known concern in research on older adults. We discuss two common forms of selection bias in aging research: (1) survivor bias and (2) bias due to loss to follow-up. Our objective was to review these two forms of selection bias in geriatrics research. In clinical aging research, selection bias is a particular concern because all participants must have survived to old age, and be healthy enough, to take part in a research study in geriatrics.
DESIGN
We demonstrate the key issues related to selection bias using three case studies focused on obesity, a common clinical risk factor in older adults. We also created a Selection Bias Toolkit that includes strategies to prevent selection bias when designing a research study in older adults and analytic techniques that can be used to examine, and correct for, the influence of selection bias in geriatrics research.
RESULTS
Survivor bias and bias due to loss to follow-up can distort study results in geriatric populations. Key steps to avoid selection bias at the study design stage include creating causal diagrams, minimizing barriers to participation, and measuring variables that predict loss to follow-up. The Selection Bias Toolkit details several analytic strategies available to geriatrics researchers to examine and correct for selection bias (eg, regression modeling and sensitivity analysis).
CONCLUSION
The toolkit is designed to provide a broad overview of methods available to examine and correct for selection bias. It is specifically intended for use in the context of aging research. J Am Geriatr Soc 67:1970-1976, 2019.
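The survivor bias described above can be illustrated with a small exact calculation. This is a minimal sketch using made-up probabilities that do not come from the review: an exposure (obesity) and an unmeasured frailty factor are independent in the full cohort, both lower the chance of surviving to study entry, and the two become associated once only survivors are enrolled. The names `P_FRAIL`, `P_SURVIVE` and `frail_prevalence` are hypothetical.

```python
from fractions import Fraction as F

# Hypothetical numbers chosen for illustration only (not from the review).
P_FRAIL = F(1, 2)  # P(frail), the same for obese and non-obese in the full cohort

# P(survive to study entry | obese, frail); keys are (obese, frail)
P_SURVIVE = {
    (0, 0): F(9, 10),
    (1, 0): F(6, 10),
    (0, 1): F(6, 10),
    (1, 1): F(2, 10),
}


def frail_prevalence(obese, survivors_only):
    """Return P(frail | obese), in the full cohort or among survivors only."""
    num = den = F(0)
    for frail in (0, 1):
        p = P_FRAIL if frail else 1 - P_FRAIL
        if survivors_only:
            # Restricting to survivors conditions on a common effect
            # of obesity and frailty.
            p *= P_SURVIVE[(obese, frail)]
        den += p
        num += p * frail
    return num / den


for obese in (0, 1):
    print(obese,
          frail_prevalence(obese, survivors_only=False),
          frail_prevalence(obese, survivors_only=True))
```

With these numbers frailty is equally common in both exposure groups before selection, but among survivors the obese group is less frail than the non-obese group, so a survivors-only study could make obesity look more favorable than it is.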
Topics: Aged; Aged, 80 and over; Female; Geriatrics; Humans; Lost to Follow-Up; Male; Patient Selection; Research Design; Selection Bias; Survivors
PubMed: 31211407
DOI: 10.1111/jgs.16022
-
Epidemiology (Cambridge, Mass.) Sep 2021
Confounding, selection bias, and measurement error are well-known sources of bias in epidemiologic research. Methods for assessing these biases have their own limitations. Many quantitative sensitivity analysis approaches consider each type of bias individually, although more complex approaches are harder to implement or require numerous assumptions. By failing to consider multiple biases at once, researchers can underestimate, or overestimate, their joint impact. We show that it is possible to bound the total composite bias owing to these three sources and to use that bound to assess the sensitivity of a risk ratio to any combination of these biases. We derive bounds for the total composite bias under a variety of scenarios, providing researchers with tools to assess their total potential impact. We apply this technique to a study where unmeasured confounding and selection bias are both concerns and to another study in which possible differential exposure misclassification and confounding are concerns. The approach we describe, though conservative, is easier to implement and makes simpler assumptions than quantitative bias analysis. We provide R functions to aid implementation.
Topics: Bias; Confounding Factors, Epidemiologic; Epidemiologic Studies; Humans; Research Design; Selection Bias
PubMed: 34224471
DOI: 10.1097/EDE.0000000000001380
-
Human Molecular Genetics Nov 2019
Review
Mendelian randomization (MR) is increasingly used to make causal inferences in a wide range of fields, from drug development to etiologic studies. Causal inference in MR is possible because of the process of genetic inheritance from parents to offspring. Specifically, at gamete formation and conception, meiosis ensures random allocation to the offspring of one allele from each parent at each locus, and these are unrelated to most of the other inherited genetic variants. To date, most MR studies have used data from unrelated individuals. These studies assume that genotypes are independent of the environment across a sample of unrelated individuals, conditional on covariates. Here we describe potential sources of bias, such as transmission ratio distortion, selection bias, population stratification, dynastic effects and assortative mating that can induce spurious or biased SNP-phenotype associations. We explain how studies of related individuals such as sibling pairs or parent-offspring trios can be used to overcome some of these sources of bias, to provide potentially more reliable evidence regarding causal processes. The increasing availability of data from related individuals in large cohort studies presents an opportunity to both overcome some of these biases and also to evaluate familial environmental effects.
Topics: Family; Family Characteristics; Genetic Association Studies; Genotype; Humans; Mendelian Randomization Analysis; Polymorphism, Single Nucleotide; Population; Reproduction; Selection Bias; Sociobiology
PubMed: 31647093
DOI: 10.1093/hmg/ddz204
-
The Journal of Thoracic and... Apr 2020
Topics: Humans; Lung Neoplasms; Mesothelioma; Selection Bias
PubMed: 32035644
DOI: 10.1016/j.jtcvs.2019.11.083
-
Clinical Trials (London, England) Feb 2022
BACKGROUND
In cluster randomized trials, patients are typically recruited after clusters are randomized, and the recruiters and patients may not be blinded to the assignment. This often leads to differential recruitment and consequently systematic differences in baseline characteristics of the recruited patients between intervention and control arms, inducing post-randomization selection bias. We aim to rigorously define causal estimands in the presence of selection bias. We elucidate the conditions under which standard covariate adjustment methods can validly estimate these estimands. We further discuss the additional data and assumptions necessary for estimating causal effects when such conditions are not met.
METHODS
Adopting the principal stratification framework in causal inference, we clarify that there are two average treatment effect (ATE) estimands in cluster randomized trials: one for the overall population and one for the recruited population. We derive analytical formulas for the two estimands in terms of principal-stratum-specific causal effects. Furthermore, using simulation studies, we assess the empirical performance of the multivariable regression adjustment method under different data generating processes leading to selection bias.
RESULTS
When treatment effects are heterogeneous across principal strata, the average treatment effect on the overall population generally differs from the average treatment effect on the recruited population. A naïve intention-to-treat analysis of the recruited sample leads to biased estimates of both average treatment effects. In the presence of post-randomization selection and without additional data on the non-recruited subjects, the average treatment effect on the recruited population is estimable only when the treatment effects are homogeneous between principal strata, and the average treatment effect on the overall population is generally not estimable. The extent to which covariate adjustment can remove selection bias depends on the degree of effect heterogeneity across principal strata.
CONCLUSION
There is a need and opportunity to improve the analysis of cluster randomized trials that are subject to post-randomization selection bias. For studies prone to selection bias, it is important to explicitly specify the target population that the causal estimands are defined on and adopt design and estimation strategies accordingly. To draw valid inferences about treatment effects, investigators should (1) assess the possibility of heterogeneous treatment effects, and (2) consider collecting data on covariates that are predictive of the recruitment process, and on the non-recruited population from external sources such as electronic health records.
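The divergence between a naive analysis of recruited patients and the underlying causal effects can be sketched with exact arithmetic. The strata, shares and mean potential outcomes below are hypothetical values picked for illustration and are not taken from the study: "always" patients are recruited in either arm, "treat_only" patients are recruited only when their cluster is assigned to intervention, and "never" patients are not recruited at all.

```python
from fractions import Fraction as F

# Hypothetical principal strata (illustrative values only):
# name: (population share, mean Y(1), mean Y(0),
#        recruited if intervention, recruited if control)
STRATA = {
    "always":     (F(1, 2), F(10), F(8), True,  True),
    "treat_only": (F(1, 4), F(6),  F(2), True,  False),
    "never":      (F(1, 4), F(5),  F(4), False, False),
}


def overall_ate():
    """Average treatment effect over the whole population, recruited or not."""
    return sum(share * (y1 - y0) for share, y1, y0, _, _ in STRATA.values())


def recruited_mean(arm):
    """Mean observed outcome among patients recruited in the given arm (1 or 0)."""
    num = den = F(0)
    for share, y1, y0, rec1, rec0 in STRATA.values():
        recruited = rec1 if arm == 1 else rec0
        if recruited:
            num += share * (y1 if arm == 1 else y0)
            den += share
    return num / den


def naive_itt():
    """Difference in observed means between the recruited samples of each arm."""
    return recruited_mean(1) - recruited_mean(0)


print("overall ATE:", overall_ate())
print("naive ITT among recruited:", naive_itt())
```

In this toy setup the stratum-specific effects are 2, 4 and 1, yet the naive contrast is smaller than all of them, because the intervention arm mixes two strata while the control arm contains only one. If the effects were made equal across strata the arms would still differ in baseline composition, which is the kind of imbalance covariate adjustment is meant to address.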
Topics: Bias; Causality; Computer Simulation; Humans; Intention to Treat Analysis; Randomized Controlled Trials as Topic; Research Design; Selection Bias
PubMed: 34894795
DOI: 10.1177/17407745211056875
-
BMC Medical Research Methodology Aug 2023
Review
BACKGROUND
When conducting randomised controlled trials is impractical, an alternative is to carry out an observational study. However, making valid causal inferences from observational data is challenging because of the risk of several statistical biases. In 2016 Hernán and Robins put forward the 'target trial framework' as a guide to best design and analyse observational studies whilst preventing the most common biases. This framework consists of (1) clearly defining a causal question about an intervention, (2) specifying the protocol of the hypothetical trial, and (3) explaining how the observational data will be used to emulate it.
METHODS
The aim of this scoping review was to identify and review all studies that explicitly attempted trial emulation, across all medical fields. Embase, Medline and Web of Science were searched for trial emulation studies published in English from database inception to February 25, 2021. The following information was extracted from studies that were deemed eligible for review: the subject area, the type of observational data that they leveraged, and the statistical methods they used to address the following biases: (A) confounding bias, (B) immortal time bias, and (C) selection bias.
RESULTS
The search resulted in 617 studies, 38 of which we deemed eligible for review. Of those 38 studies, most focused on cardiology, infectious diseases or oncology, and the majority used data from electronic health records/electronic medical records and from cohort studies. Different statistical methods were used to address confounding at baseline and selection bias, predominantly conditioning on the confounders (N = 18/49, 37%) and inverse probability of censoring weighting (N = 7/20, 35%), respectively. Different approaches were used to address immortal time bias: assigning individuals to treatment strategies at start of follow-up based on their data available at that specific time (N = 21, 55%), using the sequential trial emulations approach (N = 11, 29%), or using the cloning approach (N = 6, 16%).
CONCLUSION
Different methods can be leveraged to address (A) confounding bias, (B) immortal time bias, and (C) selection bias. When working with observational data, and if possible, the 'target trial' framework should be used as it provides a structured conceptual approach to observational research.
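Inverse probability of censoring weighting, the most common selection-bias method reported above, can be sketched in a few lines. This is a simplified single-time-point version on invented records, with the probability of remaining uncensored estimated as a plain proportion within each covariate stratum; applied analyses typically model censoring over time with regression. All names and data here are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction as F

# Invented records: (risk stratum, censored?, outcome or None if censored)
RECORDS = [
    ("low", False, 0), ("low", False, 0), ("low", False, 0), ("low", False, 1),
    ("high", False, 1), ("high", False, 1), ("high", True, None), ("high", True, None),
]


def complete_case_risk(records):
    """Outcome risk among uncensored individuals, ignoring who was censored."""
    outcomes = [y for _, censored, y in records if not censored]
    return F(sum(outcomes), len(outcomes))


def ipcw_risk(records):
    """Outcome risk with each uncensored individual weighted by
    1 / P(uncensored | stratum), estimated from stratum proportions."""
    total = defaultdict(int)
    kept = defaultdict(int)
    for stratum, censored, _ in records:
        total[stratum] += 1
        if not censored:
            kept[stratum] += 1

    num = den = F(0)
    for stratum, censored, y in records:
        if censored:
            continue
        weight = F(total[stratum], kept[stratum])  # inverse of P(uncensored)
        num += weight * y
        den += weight
    return num / den


print("complete-case risk:", complete_case_risk(RECORDS))
print("IPCW risk:", ipcw_risk(RECORDS))
```

Here half of the high-risk stratum is censored, so the complete-case estimate under-represents that stratum; the weights let each remaining high-risk individual stand in for two. The correction is only valid if censoring is unrelated to the outcome within strata, which is an assumption of the sketch.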
Topics: Humans; Biomedical Research; Selection Bias; Databases, Factual; MEDLINE; Medical Oncology; Observational Studies as Topic
PubMed: 37587484
DOI: 10.1186/s12874-023-02000-9
-
Bioinformatics (Oxford, England) Sep 2022
MOTIVATION
Synthetic lethality (SL) between two genes occurs when simultaneous loss of function leads to cell death. This holds great promise for developing anti-cancer therapeutics that target synthetic lethal pairs of endogenously disrupted genes. Identifying novel SL relationships through exhaustive experimental screens is challenging, due to the vast number of candidate pairs. Computational SL prediction is therefore sought to identify promising SL gene pairs for further experimentation. However, current SL prediction methods lack consideration for generalizability in the presence of selection bias in SL data.
RESULTS
We show that SL data exhibit considerable gene selection bias. Our experiments designed to assess the robustness of SL prediction reveal that models driven by the topology of known SL interactions (e.g. graph, matrix factorization) are especially sensitive to selection bias. We introduce selection bias-resilient synthetic lethality (SBSL) prediction using regularized logistic regression or random forests. Each gene pair is described by 27 molecular features derived from cancer cell line, cancer patient tissue and healthy donor tissue samples. SBSL models are built and tested using approximately 8000 experimentally derived SL pairs across breast, colon, lung and ovarian cancers. Compared to other SL prediction methods, SBSL showed higher predictive performance, better generalizability and robustness to selection bias. Gene dependency, quantifying the essentiality of a gene for cell survival, contributed most to SBSL predictions. Random forests were superior to linear models in the absence of dependency features, highlighting the relevance of mutual exclusivity of somatic mutations, co-expression in healthy tissue and differential expression in tumour samples.
AVAILABILITY AND IMPLEMENTATION
https://github.com/joanagoncalveslab/sbsl.
SUPPLEMENTARY INFORMATION
Supplementary data are available at Bioinformatics online.
Topics: Humans; Synthetic Lethal Mutations; Selection Bias; Neoplasms; Genes, Synthetic
PubMed: 35876858
DOI: 10.1093/bioinformatics/btac523
-
BMC Medical Research Methodology Sep 2021
BACKGROUND
The lung allocation system in the U.S. prioritizes lung transplant candidates based on estimated pre- and post-transplant survival via the Lung Allocation Scores (LAS). However, these models do not account for selection bias, which results from individuals being removed from the waitlist due to receipt of transplant, as well as transplanted individuals necessarily having survived long enough to receive a transplant. Such selection biases lead to inaccurate predictions.
METHODS
We used a weighted estimation strategy to account for selection bias in the pre- and post-transplant models used to calculate the LAS. We then created a modified LAS using these weights, and compared its performance to that of the existing LAS via time-dependent receiver operating characteristic (ROC) curves, calibration curves, and Bland-Altman plots.
RESULTS
The modified LAS exhibited better discrimination and calibration than the existing LAS, and led to changes in patient prioritization.
CONCLUSIONS
Our approach to addressing selection bias is intuitive and can be applied to any organ allocation system that prioritizes patients based on estimated pre- and post-transplant survival. This work is especially relevant to current efforts to ensure more equitable distribution of organs.
Topics: Humans; Lung Transplantation; Patient Selection; Retrospective Studies; Selection Bias; Tissue and Organ Procurement; Waiting Lists
PubMed: 34548017
DOI: 10.1186/s12874-021-01379-7
-
Epidemiology (Cambridge, Mass.) Jul 2019
When epidemiologic studies are conducted in a subset of the population, selection bias can threaten the validity of causal inference. This bias can occur whether or not that selected population is the target population and can occur even in the absence of exposure-outcome confounding. However, it is often difficult to quantify the extent of selection bias, and sensitivity analysis can be challenging to undertake and to understand. In this article, we demonstrate that the magnitude of the bias due to selection can be bounded by simple expressions defined by parameters characterizing the relationships between unmeasured factor(s) responsible for the bias and the measured variables. No functional form assumptions are necessary about those unmeasured factors. Using knowledge about the selection mechanism, researchers can account for the possible extent of selection bias by specifying the size of the parameters in the bounds. We also show that the bounds, which differ depending on the target population, result in summary measures that can be used to calculate the minimum magnitude of the parameters required to shift a risk ratio to the null. The summary measure can be used to determine the overall strength of selection that would be necessary to explain away a result. We then show that the bounds and summary measures can be simplified in certain contexts or with certain assumptions. Using examples with varying selection mechanisms, we also demonstrate how researchers can implement these simple sensitivity analyses. See video abstract at http://links.lww.com/EDE/B535.
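The abstract does not give the bounding expressions, so the following is only a sketch under an assumed form: each of two bounding factors is taken to be RR_UY × RR_SU / (RR_UY + RR_SU − 1), the shape familiar from E-value style sensitivity analysis, and the overall bound is their product. The function and parameter names are hypothetical. Under that assumption, the minimum common parameter size that would shift an observed risk ratio to the null follows from solving the bound for a single value shared by all four parameters.

```python
import math


def bounding_factor(rr_uy, rr_su):
    """Assumed bounding-factor form; both inputs are risk ratios >= 1."""
    return rr_uy * rr_su / (rr_uy + rr_su - 1)


def selection_bound(rr_uy1, rr_s1u, rr_uy0, rr_s0u):
    """Assumed overall bound: product of one factor per exposure group."""
    return bounding_factor(rr_uy1, rr_s1u) * bounding_factor(rr_uy0, rr_s0u)


def min_common_strength(rr_obs):
    """Smallest value x such that setting all four parameters to x makes the
    assumed bound equal the observed risk ratio.

    With all parameters equal to x the bound is (x**2 / (2*x - 1))**2, so
    x**2 / (2*x - 1) = sqrt(RR), which gives x = s + sqrt(s * (s - 1))
    with s = sqrt(RR).
    """
    if rr_obs < 1:
        rr_obs = 1 / rr_obs  # work on the scale where the ratio exceeds 1
    s = math.sqrt(rr_obs)
    return s + math.sqrt(s * (s - 1))


x = min_common_strength(4.0)
print(round(x, 3), round(selection_bound(x, x, x, x), 3))
```

A reader specifying different sizes for the four parameters would call `selection_bound` directly and compare the result with the observed risk ratio; whether this matches the article's bounds for a given target population should be checked against the article itself.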
Topics: Confounding Factors, Epidemiologic; Data Interpretation, Statistical; Epidemiologic Research Design; Humans; Models, Statistical; Selection Bias
PubMed: 31033690
DOI: 10.1097/EDE.0000000000001032