Personalized Visual Dubbing through Virtual Dubber and Full Head Reenactment

Jeon, Bobae ORCID: https://orcid.org/0009-0005-6910-7556 (2024) Personalized Visual Dubbing through Virtual Dubber and Full Head Reenactment. Masters thesis, Concordia University.

Text (application/pdf)
Jeon_MCompSC_S2025.pdf - Submitted Version
Available under License Spectrum Terms of Access.
49MB

Video (video/mp4)
supplementary_videos.mp4 - Submitted Version
Available under License Spectrum Terms of Access.
147MB

Abstract

Visual dubbing aims to modify facial expressions to "lip-sync" a new audio track. While person-generic talking head generation methods have made significant progress in expressive lip synchronization across arbitrary identities, they usually lack person-specific details and fail to generate high-quality results. On the other hand, person-specific approaches enable realistic identity preservation and high lip-sync quality but require extensive training, limiting their adaptability in real-world applications.

Our method combines the strengths of both approaches to balance lip-sync accuracy and visual quality while keeping training efficient. To this end, our pipeline uses a virtual dubber, a person-generic talking head, as an intermediate representation. This simplifies identity swapping, enhances efficiency, and improves both visual quality and expression accuracy.

Key innovations include full-head identity swapping and reenactment, eliminating artifacts such as the double chin effect while ensuring temporal stability. Through extensive quantitative and qualitative evaluations, we demonstrate that our approach achieves a superior balance between lip-sync accuracy and realistic facial reenactment.

Furthermore, we validate the robustness of our method with experiments in challenging real-world scenarios, including tilted head poses and facial occlusions. Notably, our pipeline operates effectively with short video clips, emphasizing its efficiency and practicality.

Divisions: Concordia University > Gina Cody School of Engineering and Computer Science > Computer Science and Software Engineering
Item Type: Thesis (Masters)
Authors: Jeon, Bobae
Institution: Concordia University
Degree Name: M. Comp. Sc.
Program: Computer Science
Date: 20 December 2024
Thesis Supervisor(s): Mudur, Sudhir and Popa, Tiberiu
ID Code: 994925
Deposited By: Bobae Jeon
Deposited On: 17 Jun 2025 17:33
Last Modified: 17 Jun 2025 17:33
All items in Spectrum are protected by copyright, with all rights reserved. The use of items is governed by Spectrum's terms of access.