Bayesian estimation of a cancer population by capture-recapture with individual capture heterogeneity and small sample

BMC Med Res Methodol. 2015 Apr 24:15:39. doi: 10.1186/s12874-015-0029-7.

Abstract

Background: Cancer incidence and prevalence estimates are necessary to inform health policy, to predict public health impact and to identify etiological factors. Registers have been used to estimate the number of cancer cases. To be reliable and useful, cancer registry data should be complete. Capture-recapture, originally developed in ecology to estimate the size of animal populations, is a method for estimating the number of cases missed. Capture-recapture methods in cancer epidemiology involve modelling the overlap between lists of individuals using log-linear models. These models rely on the assumptions of independence between sources and equal catchability of individuals, which are unlikely to be satisfied in a cancer population, as severe cases are more likely to be captured than simple cases.
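The overlap idea can be illustrated with the simplest two-list case. The sketch below uses Chapman's estimator, which is valid only under the independence and equal-catchability assumptions questioned above; it is not the method of this study, and the function name and all counts are hypothetical.

```python
def chapman_estimate(n1, n2, m):
    """Chapman's two-list estimate of total population size.

    n1, n2: number of cases on each list; m: cases found on both lists.
    Assumes independent lists and equal catchability of individuals.
    """
    if min(n1, n2, m) < 0 or m > min(n1, n2):
        raise ValueError("counts must be non-negative and m <= min(n1, n2)")
    return (n1 + 1) * (n2 + 1) / (m + 1) - 1


# Hypothetical counts, for illustration only (not data from the study).
n_registry = 90   # cases on list 1
n_hospital = 80   # cases on list 2
n_both = 72       # cases appearing on both lists

n_hat = chapman_estimate(n_registry, n_hospital, n_both)
n_observed = n_registry + n_hospital - n_both   # distinct cases seen at least once
completeness = n_observed / n_hat               # share of estimated cases captured

print(f"estimated total: {n_hat:.1f}, completeness: {completeness:.1%}")
```

If severe cases are over-represented on both lists, the overlap is inflated and this estimate is biased downwards, which is the motivation for models that allow heterogeneity.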

Methods: To estimate the cancer population and the completeness of the cancer registry, we applied M(th) models, which rely on parameters that influence capture such as time of capture (t) and individual heterogeneity (h), and compared the results with those obtained with classical log-linear models and the sample coverage approach. Three sources collected breast and colorectal cancer cases (a histopathological cancer registry, hospital Multidisciplinary Team Meetings, and cancer screening programmes); individual heterogeneity is suspected in the cancer population due to age, gender, screening history or presence of metastases. Individual heterogeneity is rarely analysed, as classical log-linear models usually pool it with between-"list" dependence. We applied Bayesian Model Averaging, which can be applied to small samples without asymptotic assumptions, unlike maximum likelihood estimation.
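To show how a Bayesian estimate of population size avoids asymptotic approximations, here is a minimal sketch of the simpler M(t) model only (list-specific capture probabilities, no individual heterogeneity and no model averaging). It is not the authors' implementation. The priors (uniform on each capture probability, proportional to 1/N on the population size), the truncation at n_max, the function names and the counts are all assumptions made for illustration.

```python
from math import exp, lgamma, log


def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def posterior_population_size(list_sizes, n_distinct, n_max):
    """Exact posterior of N under M(t) on the grid n_distinct..n_max.

    Capture probabilities are integrated out analytically under
    Uniform(0, 1) priors; the prior on N is proportional to 1/N.
    """
    if n_distinct < max(list_sizes):
        raise ValueError("n_distinct cannot be smaller than any single list")
    log_post = {}
    for n in range(n_distinct, n_max + 1):
        lp = -log(n)                                       # prior on N
        lp += lgamma(n + 1) - lgamma(n - n_distinct + 1)   # N! / (N - n_distinct)!
        for n_j in list_sizes:                             # one Beta integral per list
            lp += log_beta(n_j + 1, n - n_j + 1)
        log_post[n] = lp
    peak = max(log_post.values())
    weights = {n: exp(lp - peak) for n, lp in log_post.items()}
    total = sum(weights.values())
    return {n: w / total for n, w in weights.items()}


def credible_interval(posterior, level=0.95):
    """Equal-tailed credible interval read off the cumulative posterior."""
    lower_tail = (1 - level) / 2
    upper_tail = 1 - lower_tail
    cumulative, lower, upper = 0.0, None, None
    for n in sorted(posterior):
        cumulative += posterior[n]
        if lower is None and cumulative >= lower_tail:
            lower = n
        if upper is None and cumulative >= upper_tail:
            upper = n
            break
    if upper is None:          # guard against floating-point shortfall
        upper = max(posterior)
    return lower, upper


# Hypothetical counts for three lists (not data from the study).
posterior = posterior_population_size(list_sizes=(60, 45, 30), n_distinct=75, n_max=400)
posterior_mean = sum(n * p for n, p in posterior.items())
print(round(posterior_mean, 1), credible_interval(posterior))
```

Because the posterior is computed exactly on a grid, the interval is usable with small counts. Extending the sketch to heterogeneity (h) would require a model for individual capture probabilities, which is beyond this example.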

Results: Cancer population estimates were based on the results of the M(h) model, with averaged estimates of 803 cases of breast cancer and 521 cases of colorectal cancer. With the log-linear models, the estimates were 791 cases of breast cancer and 527 cases of colorectal cancer according to the retained models (729 and 481 histological cases, respectively).
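Registry completeness can be derived from these figures as recorded cases divided by the estimated total. The sketch below assumes that the 729 and 481 "histological cases" are the counts recorded in the histopathological registry, which the abstract does not state explicitly; the variable names are illustrative.

```python
# Counts quoted in the Results; their interpretation as registry counts is an assumption.
recorded = {"breast": 729, "colorectal": 481}
estimated = {
    "M(h), model-averaged": {"breast": 803, "colorectal": 521},
    "log-linear": {"breast": 791, "colorectal": 527},
}

# Completeness = recorded cases / estimated total cases, per model and cancer site.
completeness = {
    model: {site: recorded[site] / total for site, total in totals.items()}
    for model, totals in estimated.items()
}

for model, by_site in completeness.items():
    for site, value in by_site.items():
        print(f"{model:22s} {site:11s} {value:.1%}")
```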

Conclusions: We applied M(th) models and Bayesian population estimation to a small sample of a cancer population. An advantage of M(th) models applied to cancer datasets is the ability to explore individual factors associated with capture heterogeneity, as the assumption of equal capture probability is unlikely to hold. M(th) models and Bayesian population estimation are well suited to capture-recapture in a heterogeneous cancer population.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Bayes Theorem*
  • Breast Neoplasms / epidemiology
  • Colorectal Neoplasms / epidemiology
  • Epidemiologic Methods
  • Female
  • Humans
  • Incidence
  • Linear Models*
  • Male
  • Neoplasms / epidemiology*
  • Population Surveillance / methods
  • Prevalence
  • Registries / statistics & numerical data*
  • Reproducibility of Results
  • Sample Size