Robotic surgical systems such as Intuitive Surgical's da Vinci system provide a rich source of motion and video data from surgical procedures. In principle, this data can be used to evaluate surgical skill, provide surgical training feedback, or document essential aspects of a procedure. If processed online, the data can be used to provide context-specific information or motion enhancements to the surgeon. However, in every case, the key step is to relate recorded motion data to a model of the procedure being performed. This paper examines our progress at developing techniques for "parsing" raw motion data from a surgical task into a labelled sequence of surgical gestures. Our current techniques have achieved >90% fully automated recognition rates on 15 datasets.