We examine the power and sample size requirements for testing an interaction in a 2 x k factorial design with time to failure as the outcome of interest. Using the distribution of a general test statistic, based on a weighted residual sum of squares, for testing a general interaction in a 2 x k factorial experiment, we describe the relationship between the power of the test and the sample size. In a simulation study, we evaluate three commonly used methods for estimating the parameters of the test statistic: the Mantel-Haenszel (MH) method, a method (O/E) based on the ratio of the observed to the expected number of events, and the maximum likelihood (MLE) method. We show that in most cases the nominal test size and the expected power are attained with the MLE and MH methods, whereas the O/E method yields test sizes and powers lower than expected. When both the baseline hazard rates and the differences in relative hazard rates are large, the difference between the simulated and asymptotic powers becomes larger for all three methods; however, this difference is small and unlikely to seriously affect the use of either the MLE or the MH method. The proposed methods could also be used to calculate the power and sample size for testing a treatment-covariate interaction in a stratified data analysis.
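
As an illustration of how power and sample size can be linked for a test of this kind, the sketch below assumes the usual weighted homogeneity form of a residual sum of squares statistic. The notation is ours and is not taken from the text: $\hat\beta_j$ denotes the estimated log relative hazard at level $j$ of the k-level factor, $w_j$ the inverse of its estimated variance, and $\alpha$ the test size. The exact statistic used in the paper may differ.

```latex
\begin{align*}
X^2 &= \sum_{j=1}^{k} w_j \left(\hat\beta_j - \bar\beta\right)^2,
  \qquad
  \bar\beta = \frac{\sum_{j=1}^{k} w_j \hat\beta_j}{\sum_{j=1}^{k} w_j},
  \qquad
  w_j = \frac{1}{\widehat{\operatorname{Var}}(\hat\beta_j)}, \\[4pt]
% Under no interaction (all beta_j equal), X^2 is approximately central chi-square.
X^2 &\;\overset{\text{approx}}{\sim}\; \chi^2_{k-1}
  \quad \text{under } H_0:\ \beta_1 = \dots = \beta_k, \\[4pt]
% Under a fixed alternative, the approximation is noncentral chi-square.
\lambda &= \sum_{j=1}^{k} w_j \left(\beta_j - \bar\beta_w\right)^2,
  \qquad
  \bar\beta_w = \frac{\sum_{j=1}^{k} w_j \beta_j}{\sum_{j=1}^{k} w_j}, \\[4pt]
\text{Power} &\approx \Pr\!\left\{ \chi^2_{k-1}(\lambda) > \chi^2_{k-1;\,1-\alpha} \right\}.
\end{align*}
```

Under these assumptions the weights $w_j$ grow roughly in proportion to the number of observed events, so the noncentrality parameter $\lambda$ increases with the sample size. The required sample size for a target power can then be found by solving the last expression for the value of $\lambda$ that attains it.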