Monday, August 5, 2019

Estimating County-Level Mortality Rates Using Highly Censored Data From CDC WONDER

Harrison Quick, PhD

Suggested citation for this article: Quick H. Estimating County-Level Mortality Rates Using Highly Censored Data From CDC WONDER. Prev Chronic Dis 2019;16:180441. DOI: http://dx.doi.org/10.5888/pcd16.180441.
PEER REVIEWED
Summary
What is already known on this topic?
Ignoring the impact of suppression due to small counts leads to biased inference.
What is added by this report?
This work describes and compares multiple approaches for analyzing highly suppressed data from CDC WONDER. R and WinBUGS code are provided to conduct the analyses.
What are the implications for public health practice?
The use of spatial Bayesian models can yield improved inference from the analysis of highly suppressed data such as those available on CDC WONDER.

Abstract

Introduction
CDC WONDER is a system developed to promote information-driven decision making and provide access to detailed public health information to the general public. Although CDC WONDER contains a wealth of data, any counts fewer than 10 are suppressed for confidentiality reasons, resulting in left-censored data. The objective of this analysis was to describe methods for the analysis of highly censored data.
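To make the idea of left-censoring concrete, the following is a minimal sketch of how a suppressed count could enter a likelihood. It assumes a Poisson model for the county-level death counts, which this abstract does not itself state, and the names `SUPPRESSION_THRESHOLD`, `poisson_pmf`, and `log_likelihood` are hypothetical. It is not the R or WinBUGS code provided with the article.

```python
import math

# CDC WONDER suppresses any count fewer than 10.
SUPPRESSION_THRESHOLD = 10


def poisson_pmf(k, mean):
    """Poisson probability P(Y = k) for a positive mean (assumed model)."""
    return math.exp(k * math.log(mean) - mean - math.lgamma(k + 1))


def log_likelihood(count, mean):
    """Log-likelihood contribution of one county-level count.

    `count` is None when the value was suppressed.
    """
    if count is None:
        # Left-censored: we only know 0 <= Y <= 9, so use P(Y < 10).
        prob = sum(poisson_pmf(k, mean) for k in range(SUPPRESSION_THRESHOLD))
        return math.log(prob)
    # Fully observed count: ordinary Poisson log-probability.
    return math.log(poisson_pmf(count, mean))
```

Under this sketch, a suppressed cell still carries information: it is more plausible when the expected count is small than when it is large, which is what a censored-data model can exploit and a naive analysis ignores.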
Methods
A substitution approach was compared with 1) a simple, nonspatial Bayesian model that smooths rates toward their statewide averages and 2) a more complex Bayesian model that accounts for spatial and between-age sources of dependence. Age group–specific county-level data on heart disease mortality were used for the comparisons.
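As an illustration of the substitution approach, the sketch below replaces each suppressed count with a fixed value and then computes a directly age-standardized rate. The fill value of 5, the function names, and the example weights are assumptions made for illustration; the abstract does not specify the value the author substituted or the standard population used.

```python
def substitute(count, fill_value=5):
    """Replace a suppressed count (None) with a fixed fill value (assumed)."""
    return fill_value if count is None else count


def age_standardized_rate(counts, populations, weights, fill_value=5):
    """Directly age-standardized rate per 100,000 for one county.

    counts, populations, and weights are parallel lists, one entry per
    age group; suppressed counts are passed as None.
    """
    total_weight = sum(weights)
    rate = 0.0
    for count, pop, w in zip(counts, populations, weights):
        # Weighted age-specific rate, with suppressed counts filled in.
        rate += (w / total_weight) * substitute(count, fill_value) / pop
    return rate * 100_000


# Hypothetical county with two age groups, the first one suppressed.
example_rate = age_standardized_rate([None, 20], [10_000, 5_000], [0.6, 0.4])
```

Note that this approach returns a single number per county with no accompanying measure of uncertainty, regardless of which fill value is chosen.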
Results
Although the substitution and nonspatial approaches provided age-standardized rate estimates that were more highly correlated with the true rate estimates, the estimates from the spatial Bayesian model provided a superior compromise between goodness-of-fit and model complexity, as measured by the deviance information criterion. In addition, the spatial Bayesian model provided rate estimates with greater precision than the nonspatial approach; in contrast, the substitution approach did not provide estimates of uncertainty.
Conclusion
Because of the ability to account for multiple sources of dependence and the flexibility to include covariate information, the use of spatial Bayesian models should be considered when analyzing highly censored data from CDC WONDER.
