This study investigated the impact of high-speed videoendoscopy (HSV) frame rates on the assessment of nine clinically relevant vocal-fold vibratory features. Fourteen adult patients with voice disorders and 14 adult normal controls were recorded using monochromatic rigid HSV at a rate of 16000 frames per second (fps) and a spatial resolution of 639×639 pixels. The 16000-fps data were downsampled to 16 lower frame rates. Using a paired-comparison design, nine common clinical vibratory features were visually compared between the downsampled and the original images. Three raters reported the thresholds at which (1) a detectable difference between the two videos was first noticed, and (2) differences between the two videos would result in a change of clinical rating. Results indicated that glottal edge, mucosal wave magnitude and extent, aperiodicity, and contact and loss of contact of the vocal folds were the vibratory features most sensitive to frame rate. Of these vibratory features, the glottal edge was selected for further analysis because of its higher rating reliability, universal prevalence, and consistent definition. Rates of 8000 fps were found to be free from visually perceivable feature degradation, and at 5333 fps degradation was minimal. At rates of 4000 fps and higher, clinical assessments of glottal edge were not affected. Rates of 2000 fps changed the clinical ratings in over 16% of the samples, which could lead to inaccurate functional assessment.
Keywords: clinical voice assessment; frame rate; high-speed videoendoscopy; laryngeal imaging.
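
The frame rates named in the abstract (8000, 5333, 4000 and 2000 fps) are consistent with dividing the 16000-fps source by the integer factors 2, 3, 4 and 8. The following is a minimal sketch of that relationship, assuming downsampling by simple frame dropping; the abstract does not state the downsampling method or list all 16 rates, so the function names and the frame-dropping approach are illustrative assumptions rather than the study's procedure.

```python
BASE_FPS = 16000  # source recording rate stated in the abstract


def downsampled_rate(base_fps, factor):
    """Effective frame rate when only every `factor`-th frame is kept."""
    return base_fps / factor


def decimate(frames, factor):
    """Keep every `factor`-th frame, starting from the first one.

    Assumption: plain frame dropping, with no temporal averaging.
    """
    return frames[::factor]


if __name__ == "__main__":
    # Integer factors that reproduce the rates named in the abstract.
    for factor in (2, 3, 4, 8):
        print(factor, round(downsampled_rate(BASE_FPS, factor)))
```

Frame dropping leaves each retained frame's exposure unchanged, whereas a camera natively running at a lower rate could integrate light for longer; a sketch like this therefore models only the temporal sampling, not any change in motion blur.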