The two-stage design involves sample size recalculation using an interim variance estimate. Stein proposed the design in 1945, and biostatisticians have recently shown renewed interest in it. Wittes and Brittain proposed a modification aimed at greater efficiency; Gould and Shih proposed a similar procedure, but with a different interim variance estimate based on blinded data. We compare the power of Stein's original test, an idealized version of the Wittes-Brittain test, and a theoretically optimal test that can be approximated in practice. We also compare two procedures that control the conditional type I error rate given the actual final sample size: Gould and Shih's procedure and a newly proposed 'second segment' procedure. The comparison among the first three procedures indicates that the Stein test is, unexpectedly, the test of choice under the original design alternative, whereas the approximate-optimal and Wittes-Brittain procedures appear to have superior power for detecting smaller treatment differences. Of the latter two procedures, the second segment procedure is more powerful when many observations are likely to be taken after the interim resizing; otherwise the Gould-Shih procedure is superior.
Copyright 1999 John Wiley & Sons, Ltd.
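As a rough illustration of what "sample size recalculation using an interim variance estimate" can look like, the following minimal sketch resizes a two-arm trial from an unblinded pooled interim variance. It is not the procedure of any of the cited authors: the function names, the default alpha and power, and the use of the standard normal-approximation formula n = 2 s^2 (z_{1-alpha/2} + z_{power})^2 / delta^2 per arm are all assumptions made here for illustration. The only feature taken from the two-stage design itself is that the final sample size is never smaller than the first-stage size.

```python
import math
from statistics import NormalDist, variance


def pooled_variance(x, y):
    """Unblinded pooled sample variance of two interim samples (illustrative helper)."""
    nx, ny = len(x), len(y)
    return ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)


def recalculated_n_per_arm(interim_var, delta, n_first_stage, alpha=0.05, power=0.90):
    """Per-arm sample size recomputed from an interim variance estimate.

    Assumption: two-sided test at level alpha, normal-approximation formula
    rather than the t-based critical values an exact procedure would use.
    """
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    n_required = math.ceil(2 * interim_var * z ** 2 / delta ** 2)
    # Observations already taken in the first stage cannot be discarded,
    # so the final size is at least the first-stage size.
    return max(n_first_stage, n_required)


if __name__ == "__main__":
    # Hypothetical interim data for the two arms.
    arm_a = [1.0, 2.0, 3.0]
    arm_b = [2.0, 4.0, 6.0]
    s2 = pooled_variance(arm_a, arm_b)
    print(recalculated_n_per_arm(s2, delta=1.0, n_first_stage=len(arm_a)))
```

A larger interim variance, or a smaller target difference delta, pushes the recalculated size up; when the interim variance is small, the first-stage size acts as a floor.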