Background: Progression-free survival (PFS) is a common endpoint in cancer clinical trials. This study was undertaken to assess the impact of data errors and data handling on the statistical estimation of PFS.
Methods: Data from four trials conducted by the Japan Clinical Oncology Group were examined. Three data handling methods were defined: (1) METHOD-A, in which the collected event data are used as much as possible; (2) METHOD-C, in which only reliable data with firm evidence are used; and (3) METHOD-B, which is intermediate between METHOD-A and METHOD-C. To assess the impact of each method, Kaplan-Meier survival curves, median PFS, proportion of PFS, log-rank p values and hazard ratios were estimated.
Results: In three trials that collected PFS data periodically, no remarkable differences in median PFS and the proportion of PFS were observed. In one trial with non-periodic data cleaning, however, the ratio of median PFS by METHOD-C to that by METHOD-B was 0.85 and the maximum difference in the proportion of PFS between METHOD-C and METHOD-B was 12.0%. The largest spread in PFS curves among the three methods was also observed in this trial. In all trials, log-rank p values and hazard ratios for between-arm comparisons did not differ among the three methods.
Conclusions: Periodic data management can reduce errors in comparisons of PFS and is a critical requirement when using PFS as a major endpoint. Furthermore, proper data handling is essential in the estimation of patient benefit and caution is needed when making clinical decisions based on PFS.
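
As an illustration of why the choice of data handling method can shift the median PFS, the following minimal sketch applies a Kaplan-Meier estimator to the same toy records under two handling rules. Everything here is an assumption made for illustration: the records are invented, the `confirmed` flag is hypothetical, and the "strict" rule (treating an unconfirmed event as censored at its reported time) is only one possible reading of a METHOD-C-like approach, not the definition used in the trials.

```python
from fractions import Fraction
from typing import List, Optional, Sequence, Tuple

def kaplan_meier(times: Sequence[int], events: Sequence[bool]) -> List[Tuple[int, Fraction]]:
    """Return (time, survival probability) at each distinct event time."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = Fraction(1)
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_removed = 0
        # Group all records that share the same time point.
        while i < len(data) and data[i][0] == t:
            n_events += int(data[i][1])
            n_removed += 1
            i += 1
        if n_events > 0:
            surv *= 1 - Fraction(n_events, n_at_risk)
            curve.append((t, surv))
        # Both events and censored records leave the risk set.
        n_at_risk -= n_removed
    return curve

def median_time(curve: List[Tuple[int, Fraction]]) -> Optional[int]:
    """First time at which the survival estimate is at or below 0.5."""
    for t, s in curve:
        if s <= Fraction(1, 2):
            return t
    return None  # median not reached

# Hypothetical toy records: (time in months, event reported, event confirmed).
records = [
    (2, True, True), (3, True, False), (4, True, True),
    (5, True, False), (6, True, True), (8, False, True),
]

times = [t for t, _, _ in records]
# "Liberal" rule (assumed, METHOD-A-like): every reported event counts.
liberal_events = [reported for _, reported, _ in records]
# "Strict" rule (assumed, METHOD-C-like): unconfirmed events are censored.
strict_events = [reported and confirmed for _, reported, confirmed in records]

median_liberal = median_time(kaplan_meier(times, liberal_events))
median_strict = median_time(kaplan_meier(times, strict_events))
print("median PFS, liberal rule:", median_liberal)
print("median PFS, strict rule:", median_strict)
```

Exact fractions are used so that the comparison against 0.5 is not affected by floating-point rounding. In practice, a validated survival analysis package would be used instead of a hand-written estimator.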