Self-supervised pretext tasks have been introduced as an effective strategy for learning target tasks on small annotated data sets. However, while current research focuses on exploring novel pretext tasks that yield meaningful and reusable representations for the target task, the robustness and generalizability of the resulting models remain relatively under-explored. In medical imaging in particular, it is crucial to proactively investigate performance under different perturbations so that clinical applications can be deployed reliably. In this work, we revisit medical imaging networks pre-trained with self-supervised learning and categorically evaluate their robustness and generalizability compared to vanilla supervised learning. Our experiments on pneumonia detection in X-rays and multi-organ segmentation in CT yield conclusive results exposing the hidden benefits of self-supervised pre-training for learning robust feature representations.