Three-dimensional light field displays (LFDs) promise realistic and comfortable viewing for one or more users simultaneously, without eyewear, by overcoming the vergence-accommodation conflict. However, LFDs have not yet gained widespread adoption and remain an active area of research. Current LFDs rely on refractive microlens array (MLA) optics, which suffer from inherent limitations including high optical aberrations and/or bulkiness. Metasurfaces are flat optics composed of a distribution of subwavelength nanopillars that can manipulate light wave properties, including phase, amplitude, and polarization, and can be fabricated in a single lithographic step, making them a more compact alternative to refractive MLAs. However, current metasurface designs cannot match the full-color, wide field-of-view imaging achieved by multi-layer refractive lens systems. In this work, we demonstrate a deconvolution neural network model, based on the U-Net architecture and Wiener non-blind deconvolution, that reduces the aberrations introduced by a designed metasurface, enabling 3D LFDs with high image quality. We employ an analytical model to determine the metasurface phase profile and point spread function (PSF) for a five-by-five-view LFD scenario. Our model is trained and evaluated on 52 images of 8.1 megapixels each, drawn from online databases of multiview images. To minimize spatially varying aberration effects, we use a loss function that combines spatial pixel-wise error, structural quality, and angular consistency. Compared with output images rendered through the designed PSFs without preprocessing, our neural network model improves PSNR by 10 dB and MS-SSIM by 2% averaged over all views, and reduces inter-view variation in PSNR and MS-SSIM by 40% and 70%, respectively.
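To make the roles of the three loss terms concrete, the following is a minimal PyTorch sketch, assuming view stacks shaped (batch, views, channels, height, width), with 25 views for the five-by-five scenario. The weights and the variance-based form of the angular-consistency term are illustrative assumptions, not the paper's exact formulation; `ms_ssim` comes from the third-party `pytorch-msssim` package.

```python
import torch
import torch.nn.functional as F
from pytorch_msssim import ms_ssim  # pip install pytorch-msssim

def composite_loss(pred, target, w_pix=1.0, w_struct=0.2, w_ang=0.1):
    """Hedged sketch of a loss combining pixel-wise error, structural
    quality (MS-SSIM), and angular consistency across views.
    pred, target: (batch, views, channels, height, width) in [0, 1].
    The weights w_* are placeholder values, not tuned ones."""
    b, v, c, h, w = pred.shape

    # Spatial pixel-wise term: mean absolute error over all views.
    l_pix = F.l1_loss(pred, target)

    # Structural term: 1 - MS-SSIM, computed with views folded into
    # the batch dimension.
    l_struct = 1.0 - ms_ssim(pred.reshape(b * v, c, h, w),
                             target.reshape(b * v, c, h, w),
                             data_range=1.0)

    # Angular-consistency term (assumed form): variance of the
    # per-view reconstruction error, penalizing quality differences
    # between views.
    per_view_err = (pred - target).abs().mean(dim=(2, 3, 4))  # (b, v)
    l_ang = per_view_err.var(dim=1).mean()

    return w_pix * l_pix + w_struct * l_struct + w_ang * l_ang

# Example: a batch of 5x5-view stacks of 256x256 RGB patches.
views_pred = torch.rand(2, 25, 3, 256, 256)
views_gt = torch.rand(2, 25, 3, 256, 256)
loss = composite_loss(views_pred, views_gt)
```

Penalizing the variance of the per-view error is one simple way to encourage uniform quality across views, consistent with the reported reduction in inter-view PSNR and MS-SSIM variation, though other formulations (e.g., pairwise view differences) would serve the same purpose.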