Video quality metrics are essential for improving video processing algorithms. Common video quality metrics are simple averages of independently computed per-frame spatial metrics, but human quality perception is not uniform across frames. In particular, the order of frames matters, as do content complexity and scene changes. In this work, we develop a video quality framework that comprehensively integrates both spatial and temporal metrics at three levels: frame, scene, and full video. Using this framework, we experimentally demonstrate improved correlation of spatial metrics with human evaluation, as well as a new, well-correlated temporal metric (jerkiness).