
Use and misuse of predicted values in epidemiologic data analyses (TG4)

Formal Metadata

Title
Use and misuse of predicted values in epidemiologic data analyses (TG4)
Title of Series
Number of Parts
19
Author
License
CC Attribution - NonCommercial - NoDerivatives 4.0 International:
You are free to use, copy, distribute and transmit the work or content in unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
In many epidemiologic settings, the principal exposure or outcome under study can only be measured imprecisely. To address this errors-in-variables problem, the analyst will sometimes adjust these variables, say through a calibration or prediction equation, and use the resulting predicted value in the analysis in place of the observed value. When a predicted quantity replaces an observed value in a data analysis, the impact of the uncertainty in the predicted quantity on the study results needs to be considered, but this is not always done in practice. Such predicted variables usually have Berkson error. In some settings, ignoring this uncertainty, or prediction error, can result in biased parameter estimates, biased standard errors, or both. We examine three common examples of how predicted values are used in an analysis in place of an error-prone variable: 1) to estimate the distribution of a variable, 2) to compare values of a variable between groups by using the predicted value in a two-group statistic (e.g. t-statistic) or as an outcome variable in a regression, and 3) to estimate the effect of an error-prone variable on an outcome, where the predicted quantity is used as the exposure variable in a regression. For each example, we present an overview of the potential consequences of using a predicted quantity in place of the true value without appropriate statistical adjustment. We further illustrate some concepts with data from a large population-based cohort, the Hispanic Community Health Study/Study of Latinos (HCHS/SOL).
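The first and third examples in the abstract can be made concrete with a small simulation. The sketch below is not taken from the talk or from HCHS/SOL. The data-generating model, the coefficients, the sample sizes, and every variable name are assumptions chosen only for illustration. It fits a linear calibration equation in a subsample, predicts the exposure for everyone, and then compares the predicted values with the true ones.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000          # full cohort size (assumed)
n_calib = 2_000     # calibration subsample with the true exposure observed (assumed)

# Hypothetical data-generating model; all coefficients are illustrative.
z = rng.normal(size=n)                               # covariate observed for everyone
x = 1.0 + 0.8 * z + rng.normal(scale=0.6, size=n)    # true exposure
y = 2.0 + 0.5 * x + rng.normal(scale=1.0, size=n)    # outcome, true slope 0.5

# Calibration equation: least-squares fit of x on z in the subsample only.
slope_c, intercept_c = np.polyfit(z[:n_calib], x[:n_calib], 1)
x_hat = intercept_c + slope_c * z                    # predicted exposure for everyone

# Example 1: the predicted values are less variable than the true values,
# so their empirical distribution understates the spread of the exposure.
var_true = x.var()
var_pred = x_hat.var()

# Berkson structure: the prediction error is (nearly) uncorrelated with
# the predicted value, rather than with the true value.
pred_error = x - x_hat
corr_error_pred = np.corrcoef(pred_error, x_hat)[0, 1]

# Example 3: in this linear model the slope from regressing y on x_hat stays
# close to the true slope, but the residual variance is inflated because
# the prediction error ends up in the residual.
slope_pred, intercept_pred = np.polyfit(x_hat, y, 1)
resid_var_pred = np.var(y - (intercept_pred + slope_pred * x_hat))

print(f"variance of true exposure:      {var_true:.3f}")
print(f"variance of predicted exposure: {var_pred:.3f}")
print(f"corr(prediction error, x_hat):  {corr_error_pred:.3f}")
print(f"slope using x_hat:              {slope_pred:.3f}")
print(f"residual variance using x_hat:  {resid_var_pred:.3f}")
```

The near-unbiased slope here depends on the linear outcome model and a correctly specified calibration equation. It should not be expected to carry over to nonlinear models, and a naive standard error for the slope does not reflect the uncertainty in the estimated calibration coefficients.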