Mr. Haiming Dong Profile

Haiming Dong

at Beijing Univ of Posts and Telecommunications

SPIE Involvement:

Author

Publications (1)

Proceedings Article | 3 January 2020 Paper

A generative-predictive framework used for video conversion

Jinquan Li, Haiming Dong

Proceedings Volume 11373, 113731L (2020) https://doi.org/10.1117/12.2557941

KEYWORDS: Data modeling, Statistical modeling, Head, Neural networks, Video processing, Image processing, Error analysis, Performance modeling, Detection and tracking algorithms

We propose a model framework for video conversion based on a generative adversarial network. Our goal is for two different target videos to synchronize movements (such as a person's head displacement and facial movements) that did not exist in the original video. Our key observation is that adding a video prediction model to the original generative adversarial network framework lets the generated video acquire the temporal characteristics of the target video, which improves action consistency and the stability of temporal synchronization. During training, we obtain and align the spatial position of the action in the video through landmark point detection, so that the generated samples do not exhibit spatial dislocation. Also during training, we generate sample t, obtain sample t+1 through a pre-trained temporal predictor, and compute the loss of the generated sample, which is fed back to the pre-trained generative model. Using this framework, we can: (1) obtain usable training samples more conveniently and broaden the range of application of the model; (2) improve the accuracy of the generated target video.
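
The training step described in the abstract (generate sample t, predict sample t+1, feed a loss back to the generator) can be sketched as follows. This is only one possible reading of the abstract, not the authors' implementation: the names `temporal_consistency_loss`, `toy_generator`, and `toy_predictor` are hypothetical, frames are represented as flat lists of numbers, and the choice of mean squared error between the predicted frame and the next generated frame is an assumption.

```python
from typing import Callable, List, Sequence

# A frame flattened to a list of pixel intensities (toy stand-in for an image).
Frame = List[float]


def mean_squared_error(a: Sequence[float], b: Sequence[float]) -> float:
    """Average squared difference between two equally sized frames."""
    if len(a) != len(b):
        raise ValueError("frames must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def temporal_consistency_loss(
    generator: Callable[[Frame], Frame],
    predictor: Callable[[Frame], Frame],
    source_frames: Sequence[Frame],
) -> float:
    """Assumed loss: predict frame t+1 from generated frame t and compare it
    with the frame the generator actually produces for step t+1."""
    if len(source_frames) < 2:
        raise ValueError("need at least two frames")
    generated = [generator(frame) for frame in source_frames]
    losses = [
        mean_squared_error(predictor(generated[t]), generated[t + 1])
        for t in range(len(generated) - 1)
    ]
    return sum(losses) / len(losses)


def toy_generator(frame: Frame) -> Frame:
    """Hypothetical generator: doubles every pixel value."""
    return [2.0 * p for p in frame]


def toy_predictor(frame: Frame) -> Frame:
    """Hypothetical pre-trained predictor: next frame is the current one plus 1."""
    return [p + 1.0 for p in frame]


# The generated frames advance by exactly 1.0 per step, so the toy predictor
# agrees with the toy generator at every step.
frames = [[0.0, 1.0], [0.5, 1.5], [1.0, 2.0]]
loss = temporal_consistency_loss(toy_generator, toy_predictor, frames)
```

In an actual training loop this scalar would be combined with the adversarial loss and back-propagated into the generator while the predictor stays frozen; how the two terms are weighted is not stated in the abstract.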
