Microcomputer-based artificial vision support system for real-time image processing for camera-driven visual prostheses
Abstract
It is difficult to predict exactly what blind subjects with camera-driven visual prostheses (e.g., retinal implants) can perceive. Thus, it is prudent to offer them a wide variety of image processing filters and the capability to engage these filters repeatedly in any user-defined order to enhance their visual perception. To attain true portability, we employ a commercial off-the-shelf battery-powered general purpose Linux microprocessor platform to create the microcomputer-based artificial vision support system (μAVS2) for real-time image processing. Truly standalone, μAVS2 is smaller than a deck of playing cards, lightweight, fast, and equipped with USB, RS-232 and Ethernet interfaces. Image processing filters on μAVS2 operate in a user-defined linear sequential-loop fashion, resulting in vastly reduced memory and CPU requirements during execution. μAVS2 imports raw video frames from a USB or IP camera, performs image processing, and issues the processed data over an outbound Internet TCP/IP or RS-232 connection to the visual prosthesis system. Hence, μAVS2 affords users of current and future visual prostheses independent mobility and the capability to customize the visual perception generated. Additionally, μAVS2 can easily be reconfigured for other prosthetic systems. Testing of μAVS2 with actual retinal implant carriers is envisioned in the near future.

1. Introduction

Systems providing artificial vision are becoming a reality.1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23 In particular, (extraocular or intraocular) camera-driven epiretinal implants are being used in chronic patient trials already6, 5 (Fig. 1 ). With blind subjects, it is difficult to predict exactly what they can perceive with such camera-driven visual prostheses, especially since they (currently) provide only tens of stimulating retinal electrodes, thereby allowing only for limited visual perception (pixelation). Thus it is important to offer them a wide variety of image processing filters and the capability to engage these filters repeatedly in any user-defined order to enhance their visual perception in daily life.

Fig. 1

One instantiation of an artificial vision prosthesis: an intraocular retinal prosthesis using an external microelectronic system to capture and process image data and transmit the information to an implanted microelectronic system. The implanted system decodes the data and stimulates the retina via an electrode array with a pattern of electrical impulses to generate a visual perception.


Image processing systems (Fig. 2), such as the artificial vision simulator (AVS)24, 25, 26 (Fig. 3), provide these capabilities and perform real-time (i.e., 30 fps) image processing and enhancement of digital camera image streams before they enter the visual prosthesis. AVS, in particular, comprises numerous efficient image manipulation and processing modules, such as user-defined pixelation, contrast and brightness enhancement, grayscale equalization for luminance control under severe contrast and brightness conditions, user-defined grayscale levels for potential reduction of data volume transmitted to the visual prosthesis, blur algorithms, and edge detection.27, 28 AVS provides the unique ability and flexibility for visual prosthesis carriers to further customize the individual visual perception afforded by their respective vision systems by actively manipulating parameters of individual image processing filters, even altering the sequence of these filters.

Fig. 2

Schematic diagram of a real-time image processing system for artificial vision prostheses, which are driven by an external or internal camera system.


Fig. 3

(Top) Typical palette of image processing modules that can be applied in real time to a video camera stream driving an artificial visual prosthesis. (Bottom) A user-defined subset of filters (pixelation plus grayscale equalization plus contrast enhancement) applied in a real-world scenario: recognizing a door and doorknob in a low-contrast environment.


The original implementation of the AVS processing system was laptop based and hence not truly portable. To provide independent mobility for the blind and visually impaired using camera-driven artificial visual prostheses (Fig. 1), we introduce in the following the microcomputer-based artificial vision support system (μAVS2): a stand-alone, battery-powered, portable microcomputing platform and software system for real-time image processing for camera-driven artificial vision prostheses.29

2. Methods

2.1. Hardware Platform

For the purpose of creating a small, stand-alone, battery-operated, and portable version of AVS, namely μAVS2, we employed a commercial off-the-shelf, battery-powered, general purpose miniaturized Linux processing platform: the Gumstix (by Gumstix, Incorporated, Portola Valley, California) microcomputer environment (Fig. 4). It is lightweight (8 g), fast (600-MHz clock speed or better), and equipped with USB, Ethernet, and RS-232 interfaces.

Fig. 4

Gumstix-based μAVS2 , a stand-alone, battery-operated, portable, general purpose microcomputing platform for real-time image processing.


In particular, the Verdex-Pro class Gumstix microcomputer has the following specifications (see http://www.gumstix.net/Hardware/view/Hardware-Specifications/Verdex-Pro-Specifications/112.html):

  • Marvell® PXA270 CPU with XScale

  • 600-MHz clock speed

  • Ethernet

  • USB

  • RS-232

  • dynamic RAM+Flash RAM

  • 80×20 mm, 8 g, −25 to 85 °C, 3.5 to 5.0 VDC

  • battery powered (includes driving USB cameras and onboard USB/Ethernet/RS-232 interfaces).

2.2. Microcomputer-Based Artificial Vision Support System Processing Architecture

For visual processing purposes, μAVS2 imports raw video frames from a camera connected via its built-in USB port (or alternatively via its built-in Ethernet port, see Fig. 5 ). μAVS2 then reduces the resolution of the video frames to match the pixel resolution of the patient’s visual prosthesis (i.e., pixelation or downsampling, see Fig. 5). μAVS2 subsequently processes the downsampled video frames through user-selected image filters in a linear, sequential-loop fashion, resulting in vastly reduced memory and CPU requirements during execution, making the use of a microprocessor possible. The frequency and order of the image processing modules are determined via a predefined, user-selectable “control string” or script without recompilation of the image processing code (Fig. 5). μAVS2 then issues the processed video frame data over an outbound connection (via RS-232, wired Ethernet, or Internet) to the visual prosthesis system in real time (Fig. 5).
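For illustration, the per-frame flow just described can be sketched as follows; this is not the actual μAVS2 implementation, and grab_frame, downsample, apply_filters, and send_frame are hypothetical callables standing in for the stages detailed below.

```python
# Minimal per-frame loop, assuming the four stage functions are supplied
# by the surrounding system (hypothetical names, not the muAVS2 source).
def run_pipeline(grab_frame, downsample, apply_filters, send_frame):
    """Acquire -> downsample -> filter cascade -> transmit, once per frame."""
    while True:
        raw = grab_frame()              # raw grayscale camera frame
        pixels = downsample(raw)        # reduce to the electrode-array resolution
        pixels = apply_filters(pixels)  # user-defined linear filter sequence
        send_frame(pixels)              # hand off to the prosthesis telemetry link
```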

Fig. 5

Schematic of μAVS2 processing architecture.


The control string or script defines the order and frequency that the image filters are applied to the input video frames. Each image filter in a specific sequence is applied to the results of the filter before it, in a net cascading effect. Thus, filter 1 is applied to the raw pixelated video frame; filter 2 is then applied to that result, and so on to the last filter, at which time the processed image is ready to be issued for stimulation of the prosthesis. This cycle repeats anew for each incoming video frame. Before use, the blind subject is tested and the specific sequence of filters that provides him the best results is determined. This sequence is incorporated into the control script, which is then downloaded onto the device. In the current implementation of μAVS2 , the control script is not changeable by the user; however, implementing μAVS2 on an advanced embedded platform such as an Apple iPhone could allow the user the selection of a number of prepared control scripts in real time (e.g., one for daytime, one for nighttime, etc.).
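As an illustration of such a cascade, the sketch below assumes a simple comma-separated "name:parameter" control-string format and three toy filters; the actual μAVS2 control-script syntax and filter implementations are not reproduced here.

```python
# Hypothetical control-string format, for illustration only.
FILTERS = {
    "contrast":   lambda px, k: [min(255, max(0, int((p - 128) * k + 128))) for p in px],
    "brightness": lambda px, k: [min(255, max(0, int(p + k))) for p in px],
    "invert":     lambda px, _: [255 - p for p in px],
}

def apply_control_string(control, pixels):
    """Apply each named filter, in order, to the output of the previous one."""
    for step in control.split(","):
        name, _, arg = step.partition(":")
        pixels = FILTERS[name](pixels, float(arg) if arg else 0.0)
    return pixels

# Example: contrast stretch, then brighten, then invert, on a 16x16 pixel array.
frame = [100] * 256
processed = apply_control_string("contrast:1.5,brightness:20,invert", frame)
```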

In particular μAVS2 adheres to the following modular processing architecture (Fig. 5).

Stage 1: camera image acquisition

μAVS2 requires access to an “image generator,” i.e., a digital video camera. The camera may be local, directly connected via the built-in USB port, or it may be a local or remote IP camera on an accessible network. For a locally connected (hardwired) camera, μAVS2 opens an exclusive read channel to the device over the USB bus to gain access to the camera’s video buffer. For an IP camera, μAVS2 accesses the camera at its IP address, using the access protocol specific to the type of camera in use (e.g., HTTP, UDP).
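A hedged sketch of the IP-camera path only: the camera address and snapshot URL are placeholders, real IP cameras differ in their access protocols, and the USB path instead reads the device's video buffer directly.

```python
# Fetch one encoded frame from an IP camera over HTTP (placeholder URL).
from urllib.request import urlopen

SNAPSHOT_URL = "http://192.168.1.50/snapshot.jpg"  # hypothetical camera address

def grab_ip_frame() -> bytes:
    with urlopen(SNAPSHOT_URL, timeout=2.0) as response:
        return response.read()  # encoded frame bytes, to be decoded downstream
```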

Stage 2: image capture

Once μAVS2 has established a connection to a digital video camera, it must extract discrete image frames from the camera. It does this by reading the camera’s YUV frame buffer.
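For a planar YUV 4:2:0 buffer (a common webcam format; the exact buffer layout used by μAVS2 is not specified here), extracting the grayscale luma plane amounts to taking the first width×height bytes, as in this sketch:

```python
# Pull the Y (luma) plane out of one planar YUV 4:2:0 frame buffer.
def luma_plane(yuv420: bytes, width: int, height: int):
    """Return the grayscale plane as a list of rows of 8-bit values."""
    y = yuv420[: width * height]  # in YUV 4:2:0 planar layout, Y comes first
    return [list(y[row * width:(row + 1) * width]) for row in range(height)]

# Example with a dummy 160x120 frame (w*h luma bytes + w*h/2 chroma bytes).
frame = bytes(160 * 120) + bytes(160 * 120 // 2)
gray = luma_plane(frame, 160, 120)
```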

Stage 3: custom image processing including pixelation

The standardized image representation is reduced in resolution (pixelated) to match the pixel resolution of the patient’s retinal implant electrode array. For example, if the original image representation has a resolution of 640×480 raw pixels, the image is downsampled to the patient’s electrode array size, e.g., 16×16 pixels, thus becoming a 256-byte pixel array. This downsampling affords μAVS2 its real-time capability, as it allows for a dramatic speed-up of otherwise computationally expensive image processing filters, which can now be executed equally efficiently on the reduced-size pixel array rather than on the full-resolution camera video frames (for more detail see Sec. 2.4). Specialized image processing filters (e.g., contrast, brightness, gray equalization, gray levels, inversion, edge detection, and image blur27, 28) are then applied to the resolution-reduced/pixelated image (pixel array) according to a predefined, user-selectable control string or script to enhance its essential characteristics, cleaning up the image before issuance to the retinal implant.
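A sketch of the binning step, assuming block averaging (consistent with the superpixel definition in the Appendix) and a camera resolution that divides evenly into the electrode-array dimensions:

```python
import numpy as np

def bin_to_array(frame: np.ndarray, rows: int, cols: int) -> np.ndarray:
    """Average-bin a grayscale frame (H x W) down to rows x cols superpixels."""
    h, w = frame.shape
    bh, bw = h // rows, w // cols
    blocks = frame[: rows * bh, : cols * bw].reshape(rows, bh, cols, bw)
    return blocks.mean(axis=(1, 3)).astype(np.uint8)  # integer gray values

# Example: 640x480 camera frame -> 16x16 pixel array (256 values).
camera_frame = np.random.randint(0, 256, (480, 640), dtype=np.uint8)
pixel_array = bin_to_array(camera_frame, 16, 16)      # shape (16, 16)
```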

Stage 4: image data transmission

The final stage in μAVS2 processing sends the processed pixel array to a power and data telemetry system (Fig. 5) that subsequently outputs the image data, e.g., by means of electrical current pulses on the patient’s retinal implant electrode array, to generate visual perceptions, i.e., so-called phosphenes.

2.3. Microcomputer-Based Artificial Vision Support System Connectivity

μAVS2 supports a variety of interfaces that enable its connectivity to input devices such as digital cameras, and to output devices such as artificial vision systems. The interfaces currently supported are as follows.

2.3.1. Universal Serial Bus Camera Interface

μAVS2 contains a fully supported, general purpose USB port, allowing for the connection of a wide range of USB devices, including many types of USB web cameras, without the need for an external power supply. Alternatively, an IP camera connected to the onboard Ethernet port can be accessed via HTTP. When using a YUV-capable USB or IP camera, the camera’s grayscale video frame buffer is directly accessible to the μAVS2 image processing filters.

2.3.2. Ethernet-Based (Transmission Control Protocol/Internet Protocol) Data Exchange Protocol

To integrate μAVS2 with artificial vision prostheses, an extensible Ethernet-based (i.e., TCP/IP) data exchange protocol (Fig. 6) has been developed (for details see Ref. 26) to facilitate the transfer of the μAVS2-manipulated (video) output data to, e.g., power and data telemetry systems for retinal prostheses.30 This Ethernet-based protocol allows for maximum flexibility between systems, and is capable of two-way data transfer as well as status message transmission. The interface protocol is sufficiently flexible that a power and data telemetry system can also transmit its own data (e.g., measurement data obtained from the vision implant) back to μAVS2 for feedback control, further processing, and analysis. Additionally, contingency measures are integrated into the protocol, providing for negative acknowledgments and data resend requests should data need to be retransmitted.
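A hedged sketch of one outbound transfer with resend-on-negative-acknowledgment follows; the actual message framing is defined in Ref. 26, so the 4-byte length prefix and the single ACK/NAK status byte below are assumptions for illustration only.

```python
import socket
import struct

def send_pixel_array(sock: socket.socket, pixels: bytes, retries: int = 3) -> bool:
    """Send one length-prefixed pixel array; resend on a negative acknowledgment."""
    message = struct.pack(">I", len(pixels)) + pixels  # hypothetical framing
    for _ in range(retries):
        sock.sendall(message)
        status = sock.recv(1)        # b"\x06" = ACK, b"\x15" = NAK (assumed codes)
        if status == b"\x06":
            return True
    return False

# Usage (host and port are placeholders for the telemetry system):
# with socket.create_connection(("192.168.1.60", 5000)) as s:
#     send_pixel_array(s, bytes(256))   # one 16x16 frame
```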

Fig. 6

Schematic view of Ethernet-based data exchange protocol between μAVS2 and a power and data telemetry system of an artificial vision prosthesis.30


2.3.3. RS-232 Communication Protocol

Using the onboard universal asynchronous receiver/transmitter, it is possible to write the μAVS2-manipulated (video) output data to the Gumstix serial port at speeds of up to 921,600 baud, using an RS-232 8N1 format: one start bit, eight data bits, no parity bit, and one stop bit, i.e., ten bits on the wire per data byte.
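The corresponding serial configuration could look like the following sketch, here using the pySerial package; the actual μAVS2 code is not shown in the paper, and the device node is a placeholder.

```python
import serial  # pySerial

port = serial.Serial(
    "/dev/ttyS0",                    # placeholder device node for the Gumstix UART
    baudrate=921600,                 # maximum rate quoted above
    bytesize=serial.EIGHTBITS,
    parity=serial.PARITY_NONE,
    stopbits=serial.STOPBITS_ONE,    # "8N1" framing
)
port.write(bytes(256))               # one processed 16x16 pixel array
port.close()
```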

2.4. Microcomputer-Based Artificial Vision Support System Real-Time Image Processing Capability

The real-time image processing capability of μAVS2 is enabled by downsampling the native resolution of the artificial vision prosthesis camera system before image processing filters are applied. For example, if the original image representation has a resolution of 640×480 raw pixels (i.e., 307,200 pixels), and if the implanted electrode array of an epiretinal vision implant has a dimension of 16×16 stimulating electrodes (i.e., 256 electrodes), then the downsampling/binning of the 640×480-pixel image to 16×16 pixels affords a speed-up factor of 1200 (i.e., 3 orders of magnitude) in subsequent filter processing. This process allows otherwise computationally expensive image processing filters (e.g., blurring) to be executed in real time (i.e., in excess of 30 fps) without loss of filter validity/efficiency. In the Appendix (Sec. 5) we show mathematically and numerically, for a set of relevant image processing filters, that downsampling first and subsequently filtering the reduced-size image yields the exact or nearly exact same end result (i.e., is commutable or nearly commutable) compared to downsampling after the image processing filters have been applied to the full-resolution camera video frames. This speed-up enables the employment of computationally low-power, portable, and battery-powered microcomputing platforms to perform real-time image processing within an artificial vision system, thereby affording users enhanced mobility and independence.
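The commutability argument can also be checked numerically; the snippet below (not from the paper's code) does so for a simple additive brightness filter, for which the two orderings agree exactly.

```python
import numpy as np

rng = np.random.default_rng(0)
frame = rng.integers(0, 256, (480, 640))            # synthetic 640x480 gray frame
add = 20                                            # additive filter f(x) = x + a

def bin16(img):
    """480x640 -> 16x16 by integer block averaging (as in the Appendix)."""
    return img.reshape(16, 30, 16, 40).mean(axis=(1, 3)).astype(int)

bin_then_filter = bin16(frame) + add
filter_then_bin = bin16(frame + add)
print(np.abs(bin_then_filter - filter_then_bin).max())   # prints 0
```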

3. Results

Table 1 demonstrates the binning/downsampling performance efficiency of μAVS2 for various artificial vision prostheses with electrode array dimensions ranging from 4×4 to 32×32. For example, μAVS2 is capable of binning an in-memory source image (i.e., a frame) of 160×120 pixels down to 8×8 pixels at a rate of 354 frames per second (fps).

Table 1

Binning/downsampling performance efficiency of μAVS2 in frames per second for artificial vision prostheses with various electrode array dimensions and a camera frame resolution of 160×120 pixels.

Camera frame resolution: 160×120 pixels

Electrode array dimension    4×4    8×8    16×16    32×32
Binning performance (fps)    370    354    303      265

Table 2 shows individual and total filter performance efficiencies for various electrode array dimensions, obtained with an in-memory frame of 160×120 pixels. The results show that the application of filters to the workflow impacts the processing only marginally. For example, adding a contrast filter to the 8×8 case reduces the frame rate by only 3, from 354 (no filters) to 351 fps. This demonstrates that binning/downsampling represents the bulk of the processing, whereas the individual filters (or even a sequence of all listed filters) pose a minimal additional processing burden, i.e., the image processing filters are “lightweight.” In practice, the real-time frame rate is much more dependent on the maximum frame rate of the camera used, some of which are limited to 10 fps, such as the camera used for our development.

Table 2

Individual and total filter performance efficiencies (including prior binning/downsampling) of μAVS2 in frames per second for artificial vision prostheses with various electrode array dimensions and a camera frame resolution of 160×120 pixels.

Binning dimension            4×4    8×8    16×16    32×32
Contrast filter              369    351    296      244
Brightness filter            368    349    293      233
Gray equalization filter     300    288    253      221
Gray level filter            370    352    301      256
Inversion filter             371    353    302      262
Edge detection filter        368    341    266      170
Image blur filter            367    349    293      236
All filters engaged          302    280    217      137

As far as the actual CPU utilization of the Gumstix is concerned, using a camera frame resolution of 160×120 pixels with a sustained frame rate of 10 fps (including the bidirectional communication between μAVS2 and the camera to generate the YUV frame buffer) results in only a 10% CPU load (measured using the Unix “top” command). In practice, the CPU load is linearly dependent on the camera frame rate, theoretically allowing for a maximum of 100 fps at a camera frame resolution of 160×120 pixels, whereas many implanted prostheses require only 60 or fewer frames per second for optimal functioning.

Figure 7 displays application of the previous filters to a typical scenario (i.e., a darkened hallway). Here the simulated electrode array dimensions of the artificial vision prosthesis (i.e., binning) are 32×32 .

Fig. 7

Example filters of μAVS2 applied to a typical scenario (i.e., a darkened hallway). The electrode array dimensions of the artificial vision prosthesis are 32×32 .


Furthermore, duration tests with two common battery types were performed. The first test used a single 6-V 2CR5 alkaline battery, resulting in an average duration of 2.7 h. The second test used a 6-V, 5-Ah NiMH rechargeable battery pack, resulting in an average duration of 12 h.

The Gumstix natively supports multiboot operation. If a microSD memory card is inserted into the Gumstix’s card slot, the Gumstix will boot from the system installed on it instead of booting from its own internal flash RAM. Thanks to this capability, we have been able to provide several custom-tailored image filter cascades, each on its own microSD card, such that a change from one setting to another can easily be accomplished by swapping cards. On reboot, μAVS2 automatically executes the new processing cascade established on the microSD card without further user interaction, and continuously performs real-time image processing.

4. Conclusion

To support existing and future camera-driven artificial vision prostheses, we have devised the microcomputer-based artificial vision support system (μAVS2) : a small, stand-alone, battery-operated, customizable image processing system that utilizes a general purpose miniaturized Linux processing platform (Fig. 4). μAVS2 provides real-time image processing for artificial vision systems while maintaining portability and thus independence for its users. Moreover, with its general purpose computing capabilities, it can easily be reconfigured to support prosthetic systems beyond artificial vision, such as control of artificial limbs.

Multi-purpose systems, such as smartphones (e.g., Apple iPhone, BlackBerry, or Android phones), would be ideal platforms to host μAVS2 , as they provide a cell phone link and integrated GPS functionality, in addition to image processing capabilities. It should be noted that at the time of this writing, the application programming interface (API) of the iPhone software development kit (SDK) did not provide for live camera stream access. Such support is anticipated with the next SDK update. Nevertheless, we have proceeded to port a static version of μAVS2 to the iPhone, applying the suite of filters to still-image manipulation and storage functionalities. This will be updated to perform real-time video processing on the iPhone as soon as its SDK supports live camera stream access.

We would like to emphasize that the employment of μAVS2 in visual prosthetics is by no means limited to retinal implants (epi- or subretinal) only.1, 4, 5, 6 On the contrary, μAVS2 is directly and immediately applicable to any (artificial) vision-providing/stimulating system that is based on an external (e.g., eyeglass-mounted) or internal (e.g., intraocular camera) video-camera system as the first step in the stimulation/processing cascade, such as optic nerve implants,10, 11, 12 cortical implants,13, 14, 15, 16 electric tongue stimulators,17, 18, 19, 20 and tactile stimulators (both electrical and mechanical21, 22, 23). In addition, μAVS2 can interface to infrared (IR) camera systems to augment the visual cues with thermal information, allowing for “supervision” at night and during adverse weather conditions such as fog.

Acknowledgments

The work described in this publication was carried out at the California Institute of Technology under support of the National Science Foundation grant EEC-0310723. Fink and Tarbell may have proprietary interest in the technology presented here as a provisional patent has been filed on behalf of Caltech. You has no proprietary interest.

References

1. 

E. Zrenner, “Will retinal implants restore vision?,” Science, 295 (5557), 1022 –1025 (2002). https://doi.org/10.1126/science.1067996 0036-8075 Google Scholar

2. 

S. C. DeMarco, “The architecture, design, and electromagnetic and thermal modeling of a retinal prosthesis to benefit the visually impaired,” North Carolina State Univ., (2001). Google Scholar

3. 

E. Zrenner, K. D. Miliczek, V. P. Gabel, H. G. Graf, E. Guenther, H. Haemmerle, B. Hoefflinger, K. Kohler, W. Nisch, M. Schubert, A. Stett, and S. Weiss, “The development of subretinal microphotodiodes for replacement of degenerated photoreceptors,” Ophthalmic Res., 29 269 –280 (1997). https://doi.org/10.1159/000268025 0030-3747 Google Scholar

4. 

J. F. Rizzo, J. L. Wyatt, “Prospects for a visual prosthesis,” Neuroscientist, 3 (4), 251 –262 (1997). https://doi.org/10.1177/107385849700300413 1073-8584 Google Scholar

5. 

M. S. Humayun, J. Weiland, G. Fujii, R. J. Greenberg, R. Williamson, J. Little, B. Mech, V. Cimmarusti, G. van Boemel, G. Dagnelie, and E. de Juan Jr., “Visual perception in a blind subject with a chronic microelectronic retinal prosthesis,” Vision Res., 43 (24), 2573 –2581 (2003). https://doi.org/10.1016/S0042-6989(03)00457-7 0042-6989 Google Scholar

6. 

W. Liu and M. S. Humayun, “Retinal prosthesis,” 218 –219 (2004). Google Scholar

7. 

P. R. Singh, W. Liu, M. Sivaprakasam, M. S. Humayun, and J. D. Weiland, “A matched biphasic microstimulator for an implantable retinal prosthetic device,” 1 –4 (2004). Google Scholar

8. 

J. D. Weiland, W. Fink, M. Humayun, W. Liu, D. C. Rodger, Y. C. Tai, and M. Tarbell, “Progress towards a high-resolution retinal prosthesis,” 7373 –7375 (2005). Google Scholar

9. 

J. D. Weiland, W. Fink, M. S. Humayun, W. Liu, W. Li, M. Sivaprakasam, Y. C. Tai, and M. A. Tarbell, “System design of a high resolution retinal prosthesis,” (2008). http://dx.doi.org/10.1109/IEDM.2008.4796682 Google Scholar

10. 

C. Veraart, C. Raftopoulos, J. T. Mortimer, J. Delbeke, D. Pins, G. Michaux, A. Vanlierde, S. Parrini, and M. C. Wanet-Defalque, “Visual sensations produced by optic nerve stimulation using an implanted self-sizing spiral cuff electrode,” Brain Res., 813 181 –186 (1998). https://doi.org/10.1016/S0006-8993(98)00977-9 0006-8993 Google Scholar

11. 

C. Veraart, J. Delbeke, M. C. Wanet-Defalque, A. Vanlierde, G. Michaux, S. Parrini, O. Glineur, M. Verleysen, C. Trullemans, and J. T. Mortimer, 57 –59 (1999). Google Scholar

12. 

C. Veraart, M. C. Wanet-Defalque, B. Gerard, A. Vanlierde, and J. Delbeke, “Pattern recognition with the optic nerve visual prosthesis,” Artif. Organs, 27 996 –1004 (2003). https://doi.org/10.1046/j.1525-1594.2003.07305.x 0160-564X Google Scholar

13. 

W. H. Dobelle, M. G. Mladejovsky, and J. P. Girvin, “Artificial vision for the blind: electrical stimulation of visual cortex offers hope for a functional prosthesis,” Science, 183 (4123), 440 –444 (1974). https://doi.org/10.1126/science.183.4123.440 0036-8075 Google Scholar

14. 

W. H. Dobelle, M. G. Mladejovsky, and J. P. Girvin, “Artificial vision for the blind by electrical stimulation of the visual cortex,” Neurosurgery, 5 (4), 521 –527 (1979). https://doi.org/10.1097/00006123-197910000-00021 0148-396X Google Scholar

15. 

W. H. Dobelle, “Artificial vision for the blind: the summit may be closer than you think,” ASAIO J., 40 (4), 919 –922 (1994). 1058-2916 Google Scholar

16. 

W. H. Dobelle, “Artificial vision for the blind by connecting a television camera to the visual cortex,” ASAIO J., 46 (1), 3 –9 (2000). https://doi.org/10.1097/00002480-200001000-00002 1058-2916 Google Scholar

17. 

P. Bach-y-Rita, K. A. Kaczmarek, M. E. Tyler, and M. Garcia-Lara, “Form perception with a 49-point electrotactile stimulus array on the tongue: a technical note,” J. Rehabil. Res. Dev., 35 (4), 427 –430 (1998). 0748-7711 Google Scholar

18. 

K. A. Kaczmarek, P. Bach-y-Rita, and M. E. Tyler, “Electrotactile pattern perception on the tongue,” BMES, 5 –131 (1998) Google Scholar

19. 

R. Kupers and M. Ptito, “Seeing through the tongue: cross-modal plasticity in the congenitally blind,” Intl. Congress Series, 1270 79 –84 (2004) Google Scholar

20. 

M. Ptito, S. Moesgaard, A. Gjedde, and R. Kupers, “Cross-modal plasticity revealed by electrotactile stimulation of the tongue in the congenitally blind,” Brain, 128 (3), 606 –614 (2005). https://doi.org/10.1093/brain/awh380 0006-8950 Google Scholar

21. 

C. C. Collins and F. A. Saunders, “Pictorial display by direct electrical stimulation of the skin,” J. Biomed. Sys., 1 3 –16 (1970). Google Scholar

22. 

C. C. Collins, “On mobility aids for the blind,” Electronic Spatial Sensing for the Blind, 35 –64 Matinus Nijhoff, Dordrecht, The Netherlands (1985). Google Scholar

23. 

K. Kaczmarek, P. Bach-y-Rita, W. J. Tompkins, and J. G. Webster, “A tactile vision-substitution system for the blind: computer-controlled partial image sequencing,” IEEE Trans. Biomed. Eng., BME-32 (8), 602 –608 (1985). https://doi.org/10.1109/TBME.1985.325599 0018-9294 Google Scholar

24. 

W. Fink and M. Tarbell, “Artificial vision simulator (AVS) for enhancing and optimizing visual perception of retinal implant carriers,” Invest. Ophthalmol. Visual Sci., 46 1145 (2005). 0146-0404 Google Scholar

25. 

W. Liu, W. Fink, M. Tarbell, and M. Sivaprakasam, “Image processing and interface for retinal visual prostheses,” 2927 –2930 (2005). http://dx.doi.org/10.1109/ISCAS.2005.1465240 Google Scholar

26. 

W. Fink, M. A. Tarbell, L. Hoang, and W. Liu, “Artificial vision support system (AVS2) for enhancing and optimizing low-resolution visual perception for visual prostheses,” Google Scholar

27. 

J. C. Russ, The Image Processing Handbook, CRC Press, Boca Raton, FL (2002). Google Scholar

28. 

H. R. Myler and A. R. Weeks, The Pocket Handbook of Image Processing Algorithms in C, Prentice Hall PTR, Englewood Cliffs, NJ (1993). Google Scholar

29. 

W. Fink and M. Tarbell, “μAVS2: microcomputer-based artificial vision support system for real-time image processing for camera-driven visual prostheses,” Invest. Ophthalmol. Visual Sci., 50 4748 (2009). 0146-0404 Google Scholar

30. 

G. Wang, W. Liu, M. Sivaprakasam, and G. A. Kendir, “Design and analysis of an adaptive transcutaneous power telemetry for biomedical implants,” IEEE Trans. Circuits Syst., 52 2109 –2117 (2005). https://doi.org/10.1109/TCSI.2005.852923 0098-4094 Google Scholar

Appendix

In the following we show mathematically and numerically, for a set of relevant image processing filters, that downsampling first and subsequent filtering of the reduced size image yields the exact or nearly exact same end result compared to downsampling after image processing filters have been applied to the full-resolution camera video frames. The degree of commutation between both procedures for the respective image processing filters is reported in the following in units of gray value differences.

General Definitions for Image Processing Filter Proofs

In the following, all divisions written in the form $[a/b]$ denote integer divisions, and all divisions not enclosed within brackets represent regular divisions. All $q$ (quotients, with $0 \le q \le 255$ as applied to filters operating on 256 grayscale values) and $r$ (respective remainders) are $\in \mathbb{N}$. Note that, in general, for any integer division $[a/b] = q^*$ with corresponding remainder $r^*$, we can eliminate the integer division notation via

$$q^* + \frac{r^*}{b} = \frac{a}{b}, \qquad 0 \le r^* < b.$$

Let $x_i$, $0 \le x_i \le 255$, denote the gray value of a pixel to be filtered, $f(x_i)$ be the image processing filter applied to the pixel, and $n$ denote the number of pixels to be processed into one superpixel, i.e., downsampling. The downsampling is given by

$$\bar{x} = \left[\frac{\sum_{i=1}^{n} x_i}{n}\right] = \frac{\left(\sum_{i=1}^{n} x_i\right) - r_1}{n},$$

with $r_1$ being the remainder of $\left[\left(\sum_{i=1}^{n} x_i\right)/n\right]$, $0 \le r_1 < n$, and $q_1$ the quotient, $0 \le q_1 \le 255$.

General Observations

Let $a, b, c \neq 0$ with $a, b, c \in \mathbb{Z}$, and let $b/c$ not be an integer. Then $\left[a + \frac{b}{c}\right]$ can be rewritten as

$$\left[a + \frac{b}{c}\right] =
\begin{cases}
a + \left[\frac{b}{c}\right] & \text{if } \left(a > 0 \wedge \frac{b}{c} > 0\right) \text{ or } \left(a < 0 \wedge \frac{b}{c} < 0\right),\\
a + \left[\frac{b}{c}\right] - 1 & \text{if } a > 0 \wedge \frac{b}{c} < 0,\\
a + \left[\frac{b}{c}\right] + 1 & \text{if } a < 0 \wedge \frac{b}{c} > 0.
\end{cases}$$
With the previous definitions, the downsampling-first versus filtering-first procedures are:

Procedure 1: downsample first, filter second.

$$f\!\left(\left[\frac{\sum_{i=1}^{n} x_i}{n}\right]\right) = f(\bar{x}).$$

Procedure 2: filter first, downsample second.

$$\left[\frac{\sum_{i=1}^{n} f(x_i)}{n}\right] = \overline{f(x)}.$$

Type 1: Additive Filters

Filter description

$$f(x_i) = x_i + a, \qquad a \text{ is a constant} \in \mathbb{Z}.$$

Evaluation

Procedure 1:

$$f(\bar{x}) = \bar{x} + a.$$

Procedure 2: since $r_1/n < 1$, and $a$, $q_1$ are both integers,

$$\overline{f(x)} = \left[\frac{\sum_{i=1}^{n} (x_i + a)}{n}\right] = \left[\bar{x} + \frac{r_1}{n} + a\right] = \bar{x} + a.$$

Comparison

$$f(\bar{x}) = \bar{x} + a = \overline{f(x)}.$$
The order of downsampling and filtering is inconsequential with respect to the resulting gray values. Both procedures are commutable.

Type 2: Multiplicative Filters

Filter description

$$f(x_i) = a x_i, \qquad a = \frac{b}{d} \text{ is a rational constant} \in \mathbb{R},\; a > 0.$$

Evaluation

Procedure 1: let $r_2$ be the remainder of the integer division in $f(\bar{x}) = [a\bar{x}] = [b\bar{x}/d]$, $0 \le r_2 < d$, such that

$$f(\bar{x}) = \left[\frac{b\,\bar{x}}{d}\right] = \frac{b\,\bar{x} - r_2}{d} = \frac{b\,\frac{\left(\sum_{i=1}^{n} x_i\right) - r_1}{n} - r_2}{d} = \frac{b \sum_{i=1}^{n} x_i}{dn} - \frac{b\, r_1}{dn} - \frac{r_2}{d}.$$

Procedure 2: let $r_i$ be the remainder of $[b x_i/d]$, $0 \le r_i < d$, such that

$$\overline{f(x)} = \left[\frac{\sum_{i=1}^{n} [a x_i]}{n}\right] = \left[\frac{\sum_{i=1}^{n} \frac{b x_i - r_i}{d}}{n}\right].$$

With $r_3$, $0 \le r_3 < n$, being the remainder of the prior division, it follows that

$$\overline{f(x)} = \frac{\sum_{i=1}^{n} \frac{b x_i - r_i}{d} - r_3}{n} = \frac{b \sum_{i=1}^{n} x_i}{dn} - \frac{\sum_{i=1}^{n} r_i}{dn} - \frac{r_3}{n}.$$

Comparison

The two procedures differ by

$$\left(\frac{\sum_{i=1}^{n} r_i}{dn} + \frac{r_3}{n}\right) - \left(\frac{b\, r_1}{dn} + \frac{r_2}{d}\right).$$

Note that $0 \le r_1 < n$, $0 \le r_2 < d$, $0 \le r_i < d$, and $0 \le r_3 < n$. Hence, the greatest difference with respect to the resulting gray values between the two procedures is

$$\left|\left(\frac{\sum_{i=1}^{n} r_i}{dn} + \frac{r_3}{n}\right) - \left(\frac{b\, r_1}{dn} + \frac{r_2}{d}\right)\right| < 2.$$

Both procedures are nearly commutable.

Type 3: Inversion Filter

Filter description

$$f(x_i) = 255 - x_i.$$

Evaluation

Procedure 1:

$$f(\bar{x}) = 255 - \bar{x}.$$

Procedure 2: using the general observation above (here $255 > 0$ and $-\sum_{i=1}^{n} x_i/n < 0$),

$$\overline{f(x)} = \left[\frac{\sum_{i=1}^{n} (255 - x_i)}{n}\right] = \left[255 - \frac{\sum_{i=1}^{n} x_i}{n}\right] = 255 - \left[\frac{\sum_{i=1}^{n} x_i}{n}\right] - 1 = 255 - \bar{x} - 1.$$

Comparison

$$f(\bar{x}) = 255 - \bar{x} = \overline{f(x)} + 1.$$
Therefore, the respective gray values of the two procedures differ by a constant value of 1. Hence, they are nearly commutable.

Type 4: Gray-Value Reduction Filter

Filter description

Let $v$ denote the new number of gray levels. For practical purposes, $1 < v < 256$, $v \in \mathbb{N}$. Let $c$ be the number of original grayscale values incorporated into each new grayscale value, and $\Delta$ be the interval between gray values of the new grayscale:

$$c = \frac{256}{v}, \qquad \Delta = \frac{255}{v-1}, \qquad c \le \Delta \le 2c \quad (\ast).$$

The gray-value reduction filter function then reads

$$f(x_i) = \left[\frac{x_i}{c}\right]\Delta.$$

Evaluation

Procedure 1: let $r_4$ be the remainder of $[\bar{x}/c]$, $0 \le r_4 < c$. Recall that $\bar{x} = \left(\left(\sum_{i=1}^{n} x_i\right) - r_1\right)/n$. Hence, Procedure 1 yields

$$f(\bar{x}) = \left[\frac{\bar{x}}{c}\right]\Delta = \frac{\bar{x} - r_4}{c}\,\Delta = \frac{\frac{\left(\sum_{i=1}^{n} x_i\right) - r_1}{n} - r_4}{c}\,\Delta = \frac{\Delta \sum_{i=1}^{n} x_i}{cn} - \frac{\Delta\, r_1}{cn} - \frac{\Delta\, r_4}{c}.$$

Procedure 2: let $r_i$ be the remainder of $[x_i/c]$, $0 \le r_i < c$, and let $r_5$ be the remainder of $\left[\sum_{i=1}^{n} \left[\frac{x_i}{c}\right]\Delta \,\big/\, n\right]$, $0 \le r_5 < n$. Then

$$\overline{f(x)} = \left[\frac{\sum_{i=1}^{n} \left[\frac{x_i}{c}\right]\Delta}{n}\right] = \frac{\sum_{i=1}^{n} \left[\frac{x_i}{c}\right]\Delta - r_5}{n} = \frac{\sum_{i=1}^{n} \frac{x_i - r_i}{c}\,\Delta - r_5}{n} = \frac{\Delta \sum_{i=1}^{n} x_i}{cn} - \frac{\Delta \sum_{i=1}^{n} r_i}{cn} - \frac{r_5}{n}.$$

Comparison

The two procedures differ by

$$\left(\frac{\Delta\, r_1}{cn} + \frac{\Delta\, r_4}{c}\right) - \left(\frac{\Delta \sum_{i=1}^{n} r_i}{cn} + \frac{r_5}{n}\right).$$

Recall that $0 \le r_1 < n$, $0 \le r_4 < c$, $0 \le r_i < c$, $0 \le r_5 < n$, and $c \le \Delta \le 2c$. Therefore,

$$0 \le \frac{\Delta\, r_1}{cn} + \frac{\Delta\, r_4}{c} < \frac{\Delta}{c} + \Delta \le 2 + 2c,$$

$$0 \le \frac{\Delta \sum_{i=1}^{n} r_i}{cn} + \frac{r_5}{n} < \frac{\Delta \sum_{i=1}^{n} c}{cn} + 1 = \Delta + 1 \le 2c + 1,$$

and hence

$$\left|\left(\frac{\Delta\, r_1}{cn} + \frac{\Delta\, r_4}{c}\right) - \left(\frac{\Delta \sum_{i=1}^{n} r_i}{cn} + \frac{r_5}{n}\right)\right| < 2c + 2.$$

Thus, the greatest difference between the resulting gray values of the two procedures is less than $2c + 2$. Hence, the lower the number of new gray values (i.e., the larger $c$), the larger the deviation of the resulting gray values, and vice versa. Thus, the degree of commutation between both procedures depends on the number of new gray values.

(∗) Proof of $c \le \Delta \le 2c$

Consider $c = \frac{256}{v} > 1$ and $\Delta = \frac{255}{v-1} = \frac{255v/(v-1)}{v}$. With $1 < v < 256$, $v \in \mathbb{N}$, the numerator of $\Delta$ can take on the values

$$\frac{255^2}{254} \le \frac{255v}{v-1} \le \frac{255 \cdot 2}{1}.$$

Note that the left side of this inequality is $> 256$ but $< 257$ and corresponds to $v = 255$, while the right side corresponds to $v = 2$ and is equal to 510. To the left side of this series of inequalities we can then tag

$$256 < \frac{255^2}{254} \le \frac{255v}{v-1} \le \frac{255 \cdot 2}{1},$$

which, since $c = 256/v$ and $\Delta$ share the common denominator $v$, implies that $c < \Delta$.

Furthermore, $2c = \frac{2 \cdot 256}{v} = \frac{512}{v}$. The numerator of $2c$ is $512 > 510 = \frac{255 \cdot 2}{1}$, the largest possible value of the numerator of $\Delta$. Hence, $\Delta \le 2c$.

Type 5: Blur Filter

For the blurring, we developed and implemented a computationally inexpensive code that consists of two subsequent sweeps across the respective image (downsampled or full resolution): first a vertical sweep (i.e., by columns) followed by a horizontal sweep (i.e., by rows). In both sweeps only NN=1 horizontal or vertical nearest neighbors, respectively, are considered. The nearest neighbors are equally weighted. This blur procedure delivers an effective blur (e.g., Fig. 7) and is applied in lieu of computationally expensive Gaussian blur filters.27, 28 We randomly generated 320×320-pixel images and then proceeded to either: 1. blur the image once with the number of nearest neighbors NN=1, and downsample the image to 4×4, 8×8, 16×16, or 32×32; or 2. downsample the image to 4×4, 8×8, 16×16, or 32×32, and then blur it. Results between the same downsampling dimensions of the two procedures were compared, and the respective average deviation per pixel was recorded. The numerical simulations revealed that the deviation between the two procedures is minimal (i.e., they are nearly commutable): the worst average deviation was 4.4±3.4 on a 32×32 downsampled image, and the best average deviation was 0.58±0.58 on a 4×4 downsampled image. 10,000 simulation runs were conducted for each case.
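A sketch of this two-sweep blur follows; the edge handling (here, clamping at the border) is an assumption, since the text does not specify it.

```python
import numpy as np

def sweep_blur(img: np.ndarray) -> np.ndarray:
    """Vertical then horizontal sweep, averaging each pixel with its NN=1 neighbors."""
    out = img.astype(float)
    for axis in (0, 1):                       # 0: vertical sweep, 1: horizontal sweep
        up = np.roll(out, 1, axis=axis)
        down = np.roll(out, -1, axis=axis)
        if axis == 0:                         # clamp at the borders instead of wrapping
            up[0, :], down[-1, :] = out[0, :], out[-1, :]
        else:
            up[:, 0], down[:, -1] = out[:, 0], out[:, -1]
        out = (up + out + down) / 3.0         # equally weighted nearest neighbors
    return out.astype(np.uint8)

blurred = sweep_blur(np.random.randint(0, 256, (32, 32), dtype=np.uint8))
```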

Type 6: Gray Equalization Filter

For the gray equalization we employed the filter described in Refs. 27, 28. We randomly generated 320×320-pixel images and then proceeded to either: 1. equalize the image once with slope=0, and downsample the image to 4×4, 8×8, 16×16, or 32×32; or 2. downsample the image to 4×4, 8×8, 16×16, or 32×32, and then equalize it (with slope=0, see Refs. 27, 28). Results between the same downsampling dimensions of the two procedures were compared, and the respective average deviation per pixel was recorded. Average deviations were obtained by averaging the differences between corresponding pixels in the images produced by first equalizing then binning and by first binning then equalizing. Increased binning (i.e., fewer output pixels) results in increased deviation: 32×32 resulted in an average deviation of 58±33, 16×16 resulted in 61±35, 8×8 resulted in 64±37, and 4×4 resulted in 72±34. 10,000 simulation runs were conducted for each case. The numerical simulations revealed that strong downsampling (e.g., to 4×4) after equalization merely centered pixel gray values around the middle of the grayscale (i.e., 128), suggesting that it is more effective to perform gray equalization after downsampling to attain meaningful equalization effects, as this utilizes the entire grayscale range of 0 to 255.

©(2010) Society of Photo-Optical Instrumentation Engineers (SPIE)
Wolfgang Fink, Cindy X. You, and Mark A. Tarbell "Microcomputer-based artificial vision support system for real-time image processing for camera-driven visual prostheses," Journal of Biomedical Optics 15(1), 016013 (1 January 2010). https://doi.org/10.1117/1.3292012
Published: 1 January 2010