Hormone assay results below the assay detection limit (DL) can introduce bias into quantitative analysis. Although complex maximum likelihood estimation methods exist, they are not widely used, whereas simple substitution methods are often used ad hoc to replace the undetectable (UD) results with numeric values to facilitate data analysis with the full data set. However, the bias of substitution methods for steroid measurements is not reported. Using a large data set (n = 2896) of serum testosterone (T), DHT, estradiol (E2) concentrations from healthy men, we created modified data sets with increasing proportions of UD samples (≤40%) to which we applied five different substitution methods (deleting UD samples as missing and substituting UD sample with DL, DL/√2, DL/2, or 0) to calculate univariate descriptive statistics (mean, SD) or bivariate correlations. For all three steroids and for univariate as well as bivariate statistics, bias increased progressively with increasing proportion of UD samples. Bias was worst when UD samples were deleted or substituted with 0 and least when UD samples were substituted with DL/√2, whereas the other methods (DL or DL/2) displayed intermediate bias. Similar findings were replicated in randomly drawn small subsets of 25, 50, and 100. Hence, we propose that in steroid hormone data with ≤40% UD samples, substituting UD with DL/√2 is a simple, versatile, and reasonably accurate method to minimize left censoring bias, allowing for data analysis with the full data set.
© Crown copyright 2019.