Minimally invasive procedures such as transcatheter valve interventions are increasingly replacing conventional surgical techniques. Consequently, novel operating rooms have been designed that augment traditional surgical equipment with advanced imaging systems to guide these procedures. We propose a novel method to fuse pre-operative and intra-operative information by jointly estimating anatomical models from multiple image modalities. Thereby, high-quality patient-specific models are integrated into the imaging environment of operating rooms to guide cardiac interventions. Robust and fast machine learning techniques are utilized to guide the estimation process. Our method integrates both the redundant and the complementary multimodal information to achieve comprehensive modeling while simultaneously reducing the estimation uncertainty. Experiments on 28 patients with pairs of multimodal volumetric data demonstrate high-quality intra-operative patient-specific modeling of the aortic valve with a precision of 1.09 mm in TEE and 1.73 mm in 3D C-arm CT. Within a processing time of 10 seconds, we additionally obtain a model-sensitive mapping between the pre-operative and intra-operative images.