The resolution of the TPEF and OPEF images is , while the resolution of the wide-field transmission microscopy is . This is roughly equal to the resolution of the images we can expect to obtain from the middle ear space using microendoscopy-based imaging with the same NA. Rows of hair cells are seen with sharp details and a high signal-to-noise ratio (SNR) when TPEF is used [Fig. 3(a)]. The OPEF image shows fewer details and a lower SNR [Fig. 3(b)], because the OPEF signal depends linearly on the excitation intensity, rather than quadratically as in TPEF. The wide-field transmission image is blurry [Fig. 3(c)], mainly due to the lack of axial resolution (optical sectioning) and strong scattering from the surrounding bone. The image contrast, defined as the ratio of the mean intensity of the hair cell region to the mean intensity of the imaging plane 10 μm above the hair cells, is and for the TPEF and OPEF images, respectively. The SNR, defined as the ratio of the mean intensity of the hair cell region to the standard deviation of the background in the peripheral area of the same imaging plane, is and for the TPEF and OPEF images, respectively. The organ of Corti region is about 500 μm below the plane of the round window. The round window opening has a diameter of . The resulting viewing angle onto the organ of Corti through the round window is . The collection angles of the objectives used for TPEF and OPEF are and , respectively. The TPEF objective collects signal more efficiently than the OPEF objective, which might also contribute to the dimmer signal in the OPEF images. However, since the signal levels in all three cases are much stronger than the detector noise, we believe that the optical sectioning capability of TPEF is the major contributor to the enhanced image quality compared to OPEF.
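The contrast and SNR definitions above, together with the round-window geometry, can be written out as a short numerical sketch. This is an illustration only and not the analysis code used for Fig. 3: the function names, the synthetic arrays, and the example aperture diameter are hypothetical, and the viewing-angle formula assumes a circular opening viewed from an on-axis point at the stated depth, which may differ from how the angle was obtained here.

```python
import math

import numpy as np


def image_contrast(hair_cell_region, plane_above_region):
    """Mean intensity of the hair cell region divided by the mean intensity
    of the imaging plane 10 um above it (definition from the text)."""
    return float(np.mean(hair_cell_region)) / float(np.mean(plane_above_region))


def image_snr(hair_cell_region, background_region):
    """Mean intensity of the hair cell region divided by the standard
    deviation of the peripheral background in the same imaging plane."""
    return float(np.mean(hair_cell_region)) / float(np.std(background_region))


def full_viewing_angle_deg(aperture_diameter_um, depth_um=500.0):
    """Full angle subtended by a circular aperture at an on-axis point
    depth_um below it. Assumed geometry, not taken from the text."""
    return math.degrees(2.0 * math.atan(0.5 * aperture_diameter_um / depth_um))


if __name__ == "__main__":
    # Synthetic pixel values, for illustration only (not measured data).
    rng = np.random.default_rng(0)
    hair_cells = np.full((32, 32), 200.0)
    plane_above = np.full((32, 32), 50.0)
    background = rng.normal(loc=10.0, scale=2.0, size=(64, 64))

    print("contrast:", image_contrast(hair_cells, plane_above))  # 4.0
    print("SNR:", image_snr(hair_cells, background))

    # The 1000 um diameter is a placeholder, not the value from the text.
    print("viewing angle (deg):", full_viewing_angle_deg(1000.0))
```

In practice the three regions would be selected as masks on the recorded image stacks; `np.std` uses the population standard deviation by default, so a sample estimate would require `ddof=1`.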