Facial expression morphing: enhancing visual fidelity and preserving facial details in CycleGAN-based expression synthesis

Chayanon Sub-R-Pa; Rung-Ching Chen; Ming-Zhong Fan

doi:10.7717/peerj-cs.2438

PeerJ Comput Sci. 2024 Oct 25:10:e2438. doi: 10.7717/peerj-cs.2438. eCollection 2024.

Authors

Chayanon Sub-R-Pa¹, Rung-Ching Chen¹, Ming-Zhong Fan¹

Affiliation

¹ Department of Information Management, Chaoyang University of Technology, Taichung, Taiwan.

Abstract

Recent advancements in facial expression synthesis using deep learning, particularly with Cycle-Consistent Adversarial Networks (CycleGAN), have led to impressive results. However, a critical challenge persists: the generated expressions often lack the sharpness and fine details of the original face, such as freckles, moles, or birthmarks. To address this issue, we introduce the Facial Expression Morphing (FEM) algorithm, a novel post-processing method designed to enhance the visual fidelity of CycleGAN-based outputs. The FEM method blends the input image with the generated expression, prioritizing the preservation of crucial facial details. We tested our method on the Radboud Faces Database (RaFD) and evaluated it using the Fréchet Inception Distance (FID), a standard benchmark for image-to-image translation, together with a new metric, the Facial Similarity Distance (FSD), which specifically measures the similarity between translated and real images. Our analysis of CycleGAN and of UNet Vision Transformer cycle-consistent GAN versions 1 (UVCGANv1) and 2 (UVCGANv2) reveals a substantial improvement in image clarity and in the preservation of intricate details. The average FID score of 31.92 achieved by our models is roughly a 50% reduction from the previous state-of-the-art model's score of 63.82. This improvement in image quality is further supported by our proposed FSD metric, which shows a closer resemblance between FEM-processed images and the original faces.
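As an illustration only, the idea of blending an input image with a generated expression can be sketched as a per-pixel weighted average. The abstract does not describe how FEM computes its blend, so the function `blend_with_mask`, its `mask` argument, and the helper `relative_reduction` below are hypothetical names introduced for this sketch and should not be read as the authors' algorithm. The second helper simply checks the arithmetic behind the reported FID figures (63.82 versus 31.92).

```python
import numpy as np


def blend_with_mask(source, generated, mask):
    """Generic per-pixel blend (an assumption, not the published FEM method).

    source, generated: float arrays of shape (H, W, C) with values in [0, 1].
    mask: float array of shape (H, W); 1 keeps the generated pixel,
          0 keeps the source pixel. How such a mask would be obtained
          is not stated in the abstract.
    """
    source = np.asarray(source, dtype=np.float64)
    generated = np.asarray(generated, dtype=np.float64)
    if source.shape != generated.shape:
        raise ValueError("source and generated must have the same shape")
    # Clamp the weights to [0, 1] and add a channel axis so they broadcast.
    weight = np.clip(np.asarray(mask, dtype=np.float64), 0.0, 1.0)[..., np.newaxis]
    return weight * generated + (1.0 - weight) * source


def relative_reduction(baseline, improved):
    """Fractional reduction of a lower-is-better score such as FID."""
    return (baseline - improved) / baseline


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    src = rng.random((4, 4, 3))   # stand-in for the input face
    gen = rng.random((4, 4, 3))   # stand-in for the generated expression
    m = np.zeros((4, 4))
    m[2:, :] = 1.0                # take only the lower half from the generator
    out = blend_with_mask(src, gen, m)
    print(np.allclose(out[:2], src[:2]), np.allclose(out[2:], gen[2:]))
    print(f"{relative_reduction(63.82, 31.92):.1%}")  # prints 50.0%
```

With a mask of this kind, regions with weight 0 are copied unchanged from the input, which is one simple way that fine details such as freckles or moles could survive outside the region the generator is allowed to alter.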

Keywords: CycleGAN; Facial expression synthesis; GANs; Image processing; Image translation; Image-to-image.

Grants and funding

This article is supported by the NSTC, Taiwan, under Project Nos. NSTC-112-2221-E-324-003-MY3 and NSTC-112-2221-E-324-011-MY2. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.