1 January 2010

Microcomputer-based artificial vision support system for real-time image processing for camera-driven visual prostheses
Abstract
It is difficult to predict exactly what blind subjects with camera-driven visual prostheses (e.g., retinal implants) can perceive. Thus, it is prudent to offer them a wide variety of image processing filters and the capability to engage these filters repeatedly in any user-defined order to enhance their visual perception. To attain true portability, we employ a commercial off-the-shelf battery-powered general purpose Linux microprocessor platform to create the microcomputer-based artificial vision support system (μAVS2) for real-time image processing. Truly standalone, μAVS2 is smaller than a deck of playing cards, lightweight, fast, and equipped with USB, RS-232 and Ethernet interfaces. Image processing filters on μAVS2 operate in a user-defined linear sequential-loop fashion, resulting in vastly reduced memory and CPU requirements during execution. μAVS2 imports raw video frames from a USB or IP camera, performs image processing, and issues the processed data over an outbound Internet TCP/IP or RS-232 connection to the visual prosthesis system. Hence, μAVS2 affords users of current and future visual prostheses independent mobility and the capability to customize the visual perception generated. Additionally, μAVS2 can easily be reconfigured for other prosthetic systems. Testing of μAVS2 with actual retinal implant carriers is envisioned in the near future.

1. Introduction

Systems providing artificial vision are becoming a reality.1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23 In particular, (extraocular or intraocular) camera-driven epiretinal implants are already being used in chronic patient trials5, 6 (Fig. 1). It is difficult to predict exactly what blind subjects can perceive with such camera-driven visual prostheses, especially since current implants provide only tens of stimulating retinal electrodes, thereby allowing only for limited visual perception (pixelation). Thus, it is important to offer subjects a wide variety of image processing filters and the capability to engage these filters repeatedly in any user-defined order to enhance their visual perception in daily life.

Fig. 1

One instantiation of an artificial vision prosthesis: an intraocular retinal prosthesis using an external microelectronic system to capture and process image data and transmit the information to an implanted microelectronic system. The implanted system decodes the data and stimulates the retina via an electrode array with a pattern of electrical impulses to generate a visual perception.


Image processing systems (Fig. 2), such as the artificial vision simulator (AVS)24, 25, 26 (Fig. 3), provide these capabilities and perform real-time (i.e., 30 fps) image processing and enhancement of digital camera image streams before they enter the visual prosthesis. AVS, in particular, comprises numerous efficient image manipulation and processing modules, such as user-defined pixelation, contrast and brightness enhancement, grayscale equalization for luminance control under severe contrast and brightness conditions, user-defined grayscale levels for potential reduction of the data volume transmitted to the visual prosthesis, blur algorithms, and edge detection.27, 28 AVS provides visual prosthesis carriers the unique ability and flexibility to further customize the visual perception afforded by their respective vision systems by actively manipulating parameters of individual image processing filters, and even altering the sequence of these filters.

Fig. 2

Schematic diagram of a real-time image processing system for artificial vision prostheses, which are driven by an external or internal camera system.


Fig. 3

(Top) Typical palette of image processing modules that can be applied in real time to a video camera stream driving an artificial visual prosthesis. (Bottom) A user-defined subset of filters (pixelation plus grayscale equalization plus contrast enhancement) applied in a real-world scenario: recognizing a door and doorknob in a low-contrast environment.


The original implementation of the AVS processing system was laptop based and hence not truly portable. To provide independent mobility for the blind and visually impaired using camera-driven artificial visual prostheses (Fig. 1), we introduce in the following a stand-alone, battery-powered, portable microcomputing platform and software system for real-time image processing for camera-driven artificial vision prostheses: the microcomputer-based artificial vision support system (μAVS2).29

2. Methods

2.1. Hardware Platform

For the purpose of creating a small, stand-alone, battery-operated, and portable version of AVS, namely μAVS2, we employed a commercial off-the-shelf, battery-powered, general purpose miniaturized Linux processing platform: the Gumstix (by Gumstix, Incorporated, Portola Valley, California) microcomputer environment (Fig. 4). It is lightweight (8 g), fast (600-MHz clock speed or better), and equipped with USB, Ethernet, and RS-232 interfaces.

Fig. 4

Gumstix-based μAVS2 , a stand-alone, battery-operated, portable, general purpose microcomputing platform for real-time image processing.


In particular, the Verdex-Pro class Gumstix microcomputer has the following specifications (see http://www.gumstix.net/Hardware/view/Hardware-Specifications/Verdex-Pro-Specifications/112.html):

  • Marvell® PXA270 CPU with XScale

  • 600-MHz clock speed

  • Ethernet

  • USB

  • RS-232

  • dynamic RAM+Flash RAM

  • 80×20 mm, 8 g, −25 to 85 °C, 3.5 to 5.0 VDC

  • battery powered (includes driving USB cameras and onboard USB/Ethernet/RS-232 interfaces).

2.2. Microcomputer-Based Artificial Vision Support System Processing Architecture

For visual processing purposes, μAVS2 imports raw video frames from a camera connected via its built-in USB port (or alternatively via its built-in Ethernet port, see Fig. 5 ). μAVS2 then reduces the resolution of the video frames to match the pixel resolution of the patient’s visual prosthesis (i.e., pixelation or downsampling, see Fig. 5). μAVS2 subsequently processes the downsampled video frames through user-selected image filters in a linear, sequential-loop fashion, resulting in vastly reduced memory and CPU requirements during execution, making the use of a microprocessor possible. The frequency and order of the image processing modules are determined via a predefined, user-selectable “control string” or script without recompilation of the image processing code (Fig. 5). μAVS2 then issues the processed video frame data over an outbound connection (via RS-232, wired Ethernet, or Internet) to the visual prosthesis system in real time (Fig. 5).

Fig. 5

Schematic of μAVS2 processing architecture.


The control string or script defines the order and frequency with which the image filters are applied to the input video frames. Each image filter in a specific sequence is applied to the result of the filter before it, in a net cascading effect. Thus, filter 1 is applied to the raw pixelated video frame; filter 2 is then applied to that result, and so on to the last filter, at which time the processed image is ready to be issued for stimulation of the prosthesis. This cycle repeats anew for each incoming video frame. Before use, the blind subject is tested, and the specific sequence of filters that provides the best results is determined. This sequence is incorporated into the control script, which is then downloaded onto the device. In the current implementation of μAVS2, the control script is not changeable by the user; however, implementing μAVS2 on an advanced embedded platform such as an Apple iPhone could allow the user to select among a number of prepared control scripts in real time (e.g., one for daytime, one for nighttime, etc.).
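The cascading control-string mechanism can be sketched as follows. This is a minimal Python illustration; the filter names, control-string syntax, and dispatch table are assumptions made for the example, not the actual μAVS2 implementation:

```python
# Sketch of a control-string-driven filter cascade: a user-defined string
# selects which filters run, and in which order, without recompiling the
# image processing code. Filter names and string format are illustrative.

def invert(frame):
    # Inversion filter: flip each 8-bit gray value.
    return [255 - p for p in frame]

def contrast(frame, gain=2, pivot=128):
    # Simple contrast stretch about mid-gray, clamped to 0..255.
    return [min(255, max(0, pivot + gain * (p - pivot))) for p in frame]

FILTERS = {"invert": invert, "contrast": contrast}

def run_cascade(frame, control_string):
    # Each named filter consumes the previous filter's output (the
    # "net cascading effect" described above).
    for name in control_string.split(","):
        frame = FILTERS[name.strip()](frame)
    return frame

processed = run_cascade([0, 100, 200], "invert, contrast")
```

Because each filter consumes the previous filter's output, only one small pixel array needs to be held in memory at any time, which is what keeps the memory and CPU footprint low.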

In particular, μAVS2 adheres to the following modular processing architecture (Fig. 5).

Stage 1: camera image acquisition

μAVS2 utilizes access to an “image generator,” i.e., a digital video camera. The camera may be local, directly connected via the built-in USB port, or it may be a local or remote IP camera on an accessible network. For a locally connected (hardwired) camera, μAVS2 will open an exclusive read channel to the device over the USB bus to gain access to the camera’s video buffer. For an IP camera, μAVS2 will access the camera at its IP address, using the access protocol specific to the type of camera in use (e.g., HTTP, UDP).

Stage 2: image capture

Once μAVS2 has established a connection to a digital video camera, it must extract discrete image frames from the camera. It does this by reading the camera’s YUV frame buffer.

Stage 3: custom image processing including pixelation

The standardized image representation is reduced in resolution (pixelated) to match the pixel resolution of the patient's retinal implant electrode array. For example, if the original image representation has a resolution of 640×480 raw pixels, the image is downsampled to the patient's electrode array size, e.g., 16×16 pixels, thus becoming a 256-byte pixel array. This downsampling affords μAVS2 its real-time capability, as it allows for a subsequent dramatic speed-up of otherwise computationally expensive image processing filters, which can now be executed equally efficiently on the reduced-size pixel array rather than the full-resolution camera video frames (for more detail see Sec. 2.4). Specialized image processing filters (e.g., contrast, brightness, gray equalization, gray levels, inversion, edge detection, and image blur27, 28) are then applied to the resolution-reduced/pixelated image (pixel array) according to a predefined, user-selectable control string or script to enhance its essential characteristics, cleaning up the image before issuance to the retinal implant.
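Stage 3 can be illustrated with a block-averaging sketch using integer division, as analyzed in the Appendix; this is an assumed, simplified Python rendering for exposition, not the published μAVS2 code:

```python
# Sketch of Stage 3 pixelation: average non-overlapping blocks (bins) of
# the full-resolution frame down to the electrode-array resolution.
# Integer division mirrors the appendix's x-bar definition; the real
# uAVS2 binning routine is assumed to be equivalent but is not shown here.

def downsample(frame, out_w, out_h):
    in_h, in_w = len(frame), len(frame[0])
    bh, bw = in_h // out_h, in_w // out_w    # bin size in rows/columns
    out = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            # Sum the gray values inside one bin, then integer-divide by
            # the number of pixels per bin.
            s = sum(frame[r * bh + i][c * bw + j]
                    for i in range(bh) for j in range(bw))
            row.append(s // (bh * bw))
        out.append(row)
    return out

# A 4x4 frame binned down to 2x2 superpixels:
frame = [[10, 10, 20, 20],
         [10, 10, 20, 20],
         [30, 30, 40, 41],
         [30, 30, 40, 41]]
```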

Stage 4: image data transmission

The final stage in μAVS2 processing sends the processed pixel array to a power and data telemetry system (Fig. 5), which subsequently outputs the image data, e.g., by means of electrical current pulses on the patient's retinal implant electrode array, to generate visual perceptions, i.e., so-called phosphenes.

2.3. Microcomputer-Based Artificial Vision Support System Connectivity

μAVS2 supports a variety of interfaces that enable its connectivity to input devices such as digital cameras, and to output devices such as artificial vision systems. The interfaces currently supported are as follows.

2.3.1. Universal Serial Bus Camera Interface

μAVS2 contains a fully supported, general purpose USB port, allowing for the connection of a wide range of USB devices, including many types of USB web cameras, without the need for an external power supply. Alternatively, an IP camera can be accessed via an HTTP connection to the onboard Ethernet port. When using a YUV-capable USB or IP camera, the camera's grayscale video frame buffer is directly accessible to the μAVS2 image processing filters.

2.3.2. Ethernet-Based (Transmission Control Protocol/Internet Protocol) Data Exchange Protocol

To integrate μAVS2 with artificial vision prostheses, an extensible Ethernet-based (i.e., TCP/IP) data exchange protocol (Fig. 6) has been developed (for details see Ref. 26) to facilitate the transfer of the μAVS2-manipulated (video) output data to, e.g., power and data telemetry systems for retinal prostheses.30 This Ethernet-based protocol allows for maximum flexibility between systems and is capable of two-way data transfer as well as status message transmission. The interface protocol is sufficiently flexible that a power and data telemetry system can also transmit its own data (e.g., measurement data obtained from the vision implant) back to μAVS2 for feedback control, further processing, and analysis. Additionally, contingency measures are integrated into the protocol: negative acknowledgments and resend requests, should data need to be retransmitted.
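As a purely hypothetical illustration of such framing with acknowledgment support (the actual protocol of Ref. 26 differs in detail and is not reproduced here), a frame could carry a length field and a checksum, with a failed check answered by a NAK that triggers a resend:

```python
# Hypothetical sketch of the data-exchange idea only: each pixel-array
# frame is wrapped with a length field and checksum; the receiver answers
# ACK on success or NAK on corruption, prompting a retransmission.
import struct

def encode_frame(pixels: bytes) -> bytes:
    # Layout (assumed for this sketch): [length: 2 bytes][payload][checksum: 1 byte]
    checksum = sum(pixels) & 0xFF
    return struct.pack(">H", len(pixels)) + pixels + bytes([checksum])

def decode_frame(packet: bytes):
    # Return (pixels, "ACK") on success or (None, "NAK") on corruption.
    (length,) = struct.unpack(">H", packet[:2])
    pixels, checksum = packet[2:2 + length], packet[2 + length]
    if sum(pixels) & 0xFF != checksum:
        return None, "NAK"        # receiver requests a resend
    return pixels, "ACK"

pixels = bytes(range(16))         # e.g., one 4x4 pixel array
packet = encode_frame(pixels)
```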

Fig. 6

Schematic view of Ethernet-based data exchange protocol between μAVS2 and a power and data telemetry system of an artificial vision prosthesis.30


2.3.3. RS-232 Communication Protocol

Using the onboard universal asynchronous receiver/transmitter (UART), it is possible to write the μAVS2-manipulated (video) output data to the Gumstix serial port at speeds of up to 921,600 baud, using the RS-232 8N1 format: each transmitted byte occupies ten bits on the line, i.e., one start bit, eight data bits, no parity bit, and one stop bit.
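Since 8N1 framing costs ten bits per byte on the wire (one start bit, eight data bits, one stop bit), the maximum serial throughput at 921,600 baud can be estimated directly; the 16×16 frame size below is an illustrative example, and the figure is a rough upper bound ignoring any protocol overhead:

```python
# Back-of-the-envelope serial throughput at 921,600 baud with 8N1 framing.
baud = 921_600
bits_per_byte = 10                    # 1 start + 8 data + 1 stop bit
bytes_per_second = baud // bits_per_byte

frame_bytes = 16 * 16                 # one pixel array for a 16x16 electrode array
frames_per_second = bytes_per_second // frame_bytes   # upper bound, no overhead
```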

2.4. Microcomputer-Based Artificial Vision Support System Real-Time Image Processing Capability

The real-time image processing capability of μAVS2 is enabled by downsampling the camera's native resolution to the resolution of the artificial vision prosthesis before image processing filters are applied. For example, if the original image has a resolution of 640×480 raw pixels (i.e., 307,200 pixels), and if the implanted electrode array of an epiretinal vision implant has a dimension of 16×16 stimulating electrodes (i.e., 256 electrodes), then downsampling/binning the 640×480-pixel image to 16×16 pixels affords a speed-up factor of 1200 (i.e., 3 orders of magnitude) in subsequent filter processing. This allows otherwise computationally expensive image processing filters (e.g., blurring) to be executed in real time (i.e., in excess of 30 fps) without loss of filter validity/efficiency. In the Appendix (Sec. 5) we show mathematically and numerically, for a set of relevant image processing filters, that downsampling first and then filtering the reduced-size image yields the exact or nearly exact same end result (i.e., is commutable or nearly commutable) as applying the filters to the full-resolution camera video frames and downsampling afterward. This speed-up enables the employment of computationally low-power, portable, battery-powered microcomputing platforms to perform real-time image processing within an artificial vision system, thereby affording users enhanced mobility and independence.
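The quoted speed-up factor follows directly from the pixel-count ratio, since the per-frame filter cost scales with the number of pixels processed:

```python
# The downsampling speed-up quoted in the text: per-frame filter cost
# drops in proportion to the pixel count.
full_res = 640 * 480      # 307,200 camera pixels
implant  = 16 * 16        # 256 electrodes
speedup  = full_res // implant
```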

3. Results

Table 1 demonstrates the binning/downsampling performance efficiency of μAVS2 for various artificial vision prostheses with electrode array dimensions ranging from 4×4 to 32×32. For example, μAVS2 is capable of binning an in-memory source image (i.e., a frame) of 160×120 pixels down to 8×8 pixels at a rate of 354 frames per second (fps).

Table 1

Binning/downsampling performance efficiency of μAVS2 in frames per second for artificial vision prostheses with various electrode array dimensions and a camera frame resolution of 160×120 pixels.

Camera frame resolution: 160×120 pixels

Electrode array dimension    4×4    8×8    16×16    32×32
Binning performance (fps)    370    354    303      265

Table 2 shows individual and total filter performance efficiencies for various electrode array dimensions, obtained with an in-memory frame of 160×120 pixels. The results show that adding filters to the workflow impacts the processing only marginally. For example, adding a contrast filter to the 8×8 case reduces the frame rate by only 3 fps, from 354 (no filters) to 351 fps. This demonstrates that binning/downsampling represents the bulk of the processing, whereas the individual filters (or even a sequence of all listed filters) pose a minimal additional processing burden, i.e., the image processing filters are "lightweight." In practice, the real-time frame rate depends far more on the maximum frame rate of the camera used; some cameras, such as the one used for our development, are limited to 10 fps.

Table 2

Individual and total filter performance efficiencies (including prior binning/downsampling) of μAVS2 in frames per second for artificial vision prostheses with various electrode array dimensions and a camera frame resolution of 160×120 pixels.

Binning dimension            4×4    8×8    16×16    32×32
Contrast filter              369    351    296      244
Brightness filter            368    349    293      233
Gray equalization filter     300    288    253      221
Gray level filter            370    352    301      256
Inversion filter             371    353    302      262
Edge detection filter        368    341    266      170
Image blur filter            367    349    293      236
All filters engaged          302    280    217      137

As far as the actual CPU utilization of the Gumstix is concerned, using a camera frame resolution of 160×120 pixels with a sustained frame rate of 10 fps (including the bidirectional communication between μAVS2 and the camera to generate the YUV frame buffer) results in only a 10% CPU load (measured using the Unix "top" command). In practice, the CPU load is linearly dependent on the camera frame rate, theoretically allowing for a maximum of 100 fps at a camera frame resolution of 160×120 pixels, whereas many implanted prostheses require only 60 or fewer frames per second for optimal functioning.

Figure 7 displays the application of the filters described previously to a typical scenario (i.e., a darkened hallway). Here the simulated electrode array dimension of the artificial vision prosthesis (i.e., the binning dimension) is 32×32.

Fig. 7

Example filters of μAVS2 applied to a typical scenario (i.e., a darkened hallway). The electrode array dimensions of the artificial vision prosthesis are 32×32 .


Furthermore, duration tests with two common battery types were performed. The first test used a single 6-V 2CR5 lithium battery, resulting in an average duration of 2.7 h. The second test used a 6-V, 5-Ah NiMH rechargeable battery pack, resulting in an average duration of 12 h.

The Gumstix natively supports multibooting: if a microSD memory card is inserted into the Gumstix's card slot, it will boot from the system installed on the card instead of from its own internal flash RAM. Thanks to this capability, we have been able to provide several custom-tailored image filter cascades, each on its own microSD card, so that a change from one setting to another can be accomplished simply by swapping cards. On reboot, μAVS2 automatically executes the new processing cascade on the microSD card without further user interaction and continuously performs real-time image processing.

4. Conclusion

To support existing and future camera-driven artificial vision prostheses, we have devised the microcomputer-based artificial vision support system (μAVS2) : a small, stand-alone, battery-operated, customizable image processing system that utilizes a general purpose miniaturized Linux processing platform (Fig. 4). μAVS2 provides real-time image processing for artificial vision systems while maintaining portability and thus independence for its users. Moreover, with its general purpose computing capabilities, it can easily be reconfigured to support prosthetic systems beyond artificial vision, such as control of artificial limbs.

Multi-purpose systems, such as smartphones (e.g., Apple iPhone, BlackBerry, or Android phones), would be ideal platforms to host μAVS2 , as they provide a cell phone link and integrated GPS functionality, in addition to image processing capabilities. It should be noted that at the time of this writing, the application programming interface (API) of the iPhone software development kit (SDK) did not provide for live camera stream access. Such support is anticipated with the next SDK update. Nevertheless, we have proceeded to port a static version of μAVS2 to the iPhone, applying the suite of filters to still-image manipulation and storage functionalities. This will be updated to perform real-time video processing on the iPhone as soon as its SDK supports live camera stream access.

We would like to emphasize that the employment of μAVS2 in visual prosthetics is by no means limited to retinal implants (epi- or subretinal) only.1, 4, 5, 6 On the contrary, μAVS2 is directly and immediately applicable to any (artificial) vision-providing/stimulating system that is based on an external (e.g., eyeglass-mounted) or internal (e.g., intraocular camera) video-camera system as the first step in the stimulation/processing cascade, such as optic nerve implants,10, 11, 12 cortical implants,13, 14, 15, 16 electric tongue stimulators,17, 18, 19, 20 and tactile stimulators (both electrical and mechanical21, 22, 23). In addition, μAVS2 can interface to infrared (IR) camera systems to augment the visual cues with thermal information, allowing for “supervision” at night and during adverse weather conditions such as fog.

Acknowledgments

The work described in this publication was carried out at the California Institute of Technology under support of the National Science Foundation grant EEC-0310723. Fink and Tarbell may have proprietary interest in the technology presented here as a provisional patent has been filed on behalf of Caltech. You has no proprietary interest.

References

1. 

E. Zrenner, “Will retinal implants restore vision?,” Science, 295 (5557), 1022 –1025 (2002). https://doi.org/10.1126/science.1067996 0036-8075 Google Scholar

2. 

S. C. DeMarco, “The architecture, design, and electromagnetic and thermal modeling of a retinal prosthesis to benefit the visually impaired,” North Carolina State Univ., (2001). Google Scholar

3. 

E. Zrenner, K. D. Miliczek, V. P. Gabel, H. G. Graf, E. Guenther, H. Haemmerle, B. Hoefflinger, K. Kohler, W. Nisch, M. Schubert, A. Stett, and S. Weiss, “The development of subretinal microphotodiodes for replacement of degenerated photoreceptors,” Ophthalmic Res., 29 269 –28 (1997). https://doi.org/10.1159/000268025 0030-3747 Google Scholar

4. 

J. F. Rizzo, J. L. Wyatt, “Prospects for a visual prosthesis,” Neuroscientist, 3 (4), 251 –262 (1997). https://doi.org/10.1177/107385849700300413 1073-8584 Google Scholar

5. 

M. S. Humayun, J. Weiland, G. Fujii, R. J. Greenberg, R. Williamson, J. Little, B. Mech, V. Cimmarusti, G. van Boemel, G. Dagnelie, and E. de Juan Jr., “Visual perception in a blind subject with a chronic microelectronic retinal prosthesis,” Vision Res., 43 (24), 2573 –2581 (2003). https://doi.org/10.1016/S0042-6989(03)00457-7 0042-6989 Google Scholar

6. 

W. Liu and M. S. Humayun, “Retinal prosthesis,” 218 –219 (2004). Google Scholar

7. 

P. R. Singh, W. Liu, M. Sivaprakasam, M. S. Humayun, and J. D. Weiland, “A matched biphasic microstimulator for an implantable retinal prosthetic device,” 1 –4 (2004). Google Scholar

8. 

J. D. Weiland, W. Fink, M. Humayun, W. Liu, D. C. Rodger, Y. C. Tai, and M. Tarbell, “Progress towards a high-resolution retinal prosthesis,” 7373 –7375 (2005). Google Scholar

9. 

J. D. Weiland, W. Fink, M. S. Humayun, W. Liu, W. Li, M. Sivaprakasam, Y. C. Tai, and M. A. Tarbell, “System design of a high resolution retinal prosthesis,” (2008). http://dx.doi.org/10.1109/IEDM.2008.4796682 Google Scholar

10. 

C. Veraart, C. Raftopoulos, J. T. Mortimer, J. Delbeke, D. Pins, G. Michaux, A. Vanlierde, S. Parrini, and M. C. Wanet-Defalque, “Visual sensations produced by optic nerve stimulation using an implanted self-sizing spiral cuff electrode,” Brain Res., 813 181 –186 (1998). https://doi.org/10.1016/S0006-8993(98)00977-9 0006-8993 Google Scholar

11. 

C. Veraart, J. Delbeke, M. C. Wanet-Defalque, A. Vanlierde, G. Michaux, S. Parrini, O. Glineur, M. Verleysen, C. Trullemans, and J. T. Mortimer, 57 –59 (1999). Google Scholar

12. 

C. Veraart, M. C. Wanet-Defalque, B. Gerard, A. Vanlierde, and J. Delbeke, “Pattern recognition with the optic nerve visual prosthesis,” Artif. Organs, 27 996 –1004 (2003). https://doi.org/10.1046/j.1525-1594.2003.07305.x 0160-564X Google Scholar

13. 

W. H. Dobelle, M. G. Mladejovsky, and J. P. Girvin, “Artificial vision for the blind: electrical stimulation of visual cortex offers hope for a functional prosthesis,” Science, 183 (4123), 440 –444 (1974). https://doi.org/10.1126/science.183.4123.440 0036-8075 Google Scholar

14. 

W. H. Dobelle, M. G. Mladejovsky, and J. P. Girvin, “Artificial vision for the blind by electrical stimulation of the visual cortex,” Neurosurgery, 5 (4), 521 –527 (1979). https://doi.org/10.1097/00006123-197910000-00021 0148-396X Google Scholar

15. 

W. H. Dobelle, “Artificial vision for the blind: the summit may be closer than you think,” ASAIO J., 40 (4), 919 –922 (1994). 1058-2916 Google Scholar

16. 

W. H. Dobelle, “Artificial vision for the blind by connecting a television camera to the visual cortex,” ASAIO J., 46 (1), 3 –9 (2000). https://doi.org/10.1097/00002480-200001000-00002 1058-2916 Google Scholar

17. 

P. Bach-y-Rita, K. A. Kaczmarek, M. E. Tyler, and M. Garcia-Lara, “Form perception with a 49-point electrotactile stimulus array on the tongue: a technical note,” J. Rehabil. Res. Dev., 35 (4), 427 –430 (1998). 0748-7711 Google Scholar

18. 

K. A. Kaczmarek, P. Bach-y-Rita, and M. E. Tyler, “Electrotactile pattern perception on the tongue,” BMES, 5 –131 (1998) Google Scholar

19. 

R. Kupers and M. Ptito, “Seeing through the tongue: cross-modal plasticity in the congenitally blind,” Intl. Congress Series, 1270 79 –84 (2004) Google Scholar

20. 

M. Ptito, S. Moesgaard, A. Gjedde, and R. Kupers, “Cross-modal plasticity revealed by electrotactile stimulation of the tongue in the congenitally blind,” Brain, 128 (3), 606 –614 (2005). https://doi.org/10.1093/brain/awh380 0006-8950 Google Scholar

21. 

C. C. Collins and F. A. Saunders, “Pictorial display by direct electrical stimulation of the skin,” J. Biomed. Sys., 1 3 –16 (1970). Google Scholar

22. 

C. C. Collins, “On mobility aids for the blind,” Electronic Spatial Sensing for the Blind, 35 –64 Matinus Nijhoff, Dordrecht, The Netherlands (1985). Google Scholar

23. 

K. Kaczmarek, P. Bach-y-Rita, W. J. Tompkins, and J. G. Webster, “A tactile vision-substitution system for the blind: computer-controlled partial image sequencing,” IEEE Trans. Biomed. Eng., BME-32 (8), 602 –608 (1985). https://doi.org/10.1109/TBME.1985.325599 0018-9294 Google Scholar

24. 

W. Fink and M. Tarbell, “Artificial vision simulator (AVS) for enhancing and optimizing visual perception of retinal implant carriers,” Invest. Ophthalmol. Visual Sci., 46 1145 (2005). 0146-0404 Google Scholar

25. 

W. Liu, W. Fink, M. Tarbell, and M. Sivaprakasam, “Image processing and interface for retinal visual prostheses,” 2927 –2930 (2005). http://dx.doi.org/10.1109/ISCAS.2005.1465240 Google Scholar

26. 

W. Fink, M. A. Tarbell, L. Hoang, and W. Liu, “Artificial vision support system (AVS2) for enhancing and optimizing low-resolution visual perception for visual prostheses,” Google Scholar

27. 

J. C. Russ, The Image Processing Handbook, CRC Press, Boca Raton, FL (2002). Google Scholar

28. 

H. R. Myler and A. R. Weeks, The Pocket Handbook of Image Processing Algorithms in C, Prentice Hall PTR, Englewood Cliffs, NJ (1993). Google Scholar

29. 

W. Fink and M. Tarbell, “μAVS2: microcomputer-based artificial vision support system for real-time image processing for camera-driven visual prostheses,” Invest. Ophthalmol. Visual Sci., 50 4748 (2009). 0146-0404 Google Scholar

30. 

G. Wang, W. Liu, M. Sivaprakasam, and G. A. Kendir, “Design and analysis of an adaptive transcutaneous power telemetry for biomedical implants,” IEEE Trans. Circuits Syst., 52 2109 –2117 (2005). https://doi.org/10.1109/TCSI.2005.852923 0098-4094 Google Scholar

Appendix

In the following we show mathematically and numerically, for a set of relevant image processing filters, that downsampling first and subsequent filtering of the reduced size image yields the exact or nearly exact same end result compared to downsampling after image processing filters have been applied to the full-resolution camera video frames. The degree of commutation between both procedures for the respective image processing filters is reported in the following in units of gray value differences.

General Definitions for Image Processing Filter Proofs

In the following, all divisions written as $\lfloor a/b \rfloor$ denote integer divisions (truncation toward zero), and all divisions not enclosed in floor brackets represent regular divisions. All $q$ (quotients, with $0 \le q \le 255$ as applied to filters operating on 256 grayscale values) and $r$ (respective remainders) are $\in \mathbb{N}$. Note that, in general, for any integer division $\lfloor a/b \rfloor = q^*$ with corresponding remainder $r^*$, we can eliminate the integer division notation via

$$q^* + \frac{r^*}{b} = \frac{a}{b}, \qquad 0 \le r^* < b.$$

Let $x_i$, $0 \le x_i \le 255$, denote the gray value of a pixel to be filtered, $f(x_i)$ the image processing filter applied to the pixel, and $n$ the number of pixels to be processed into one superpixel, i.e., the downsampling. The downsampling is as follows:

$$\bar{x} = \left\lfloor \frac{\sum_{i=1}^{n} x_i}{n} \right\rfloor = \frac{\left(\sum_{i=1}^{n} x_i\right) - r_1}{n},$$

with $r_1$ being the remainder of $\left\lfloor \left(\sum_{i=1}^{n} x_i\right)/n \right\rfloor$, $0 \le r_1 < n$, and $q_1$ the quotient, $0 \le q_1 \le 255$.

General Observations

Let $a, b, c \in \mathbb{Z}$, $b, c \neq 0$, with $b/c$ not an integer. Then $\lfloor a + b/c \rfloor$ can be rewritten as follows:

If $a > 0 \wedge b/c > 0$, or $a < 0 \wedge b/c < 0$: $\quad \left\lfloor a + \frac{b}{c} \right\rfloor = a + \left\lfloor \frac{b}{c} \right\rfloor$.

If $a > 0 \wedge b/c < 0$: $\quad \left\lfloor a + \frac{b}{c} \right\rfloor = a + \left\lfloor \frac{b}{c} \right\rfloor - 1$.

If $a < 0 \wedge b/c > 0$: $\quad \left\lfloor a + \frac{b}{c} \right\rfloor = a + \left\lfloor \frac{b}{c} \right\rfloor + 1$.

With the previous definitions, the downsample-first versus filter-first procedures are:

Procedure 1 (downsample first, filter second):

$$f\!\left(\left\lfloor \frac{\sum_{i=1}^{n} x_i}{n} \right\rfloor\right) = f(\bar{x}).$$

Procedure 2 (filter first, downsample second):

$$\left\lfloor \frac{\sum_{i=1}^{n} f(x_i)}{n} \right\rfloor = \overline{f(x)}.$$

Type 1: Additive Filters

Filter description

$$f(x_i) = x_i + a, \qquad a \ \text{a constant} \in \mathbb{Z}.$$

Evaluation

Procedure 1:

$$f(\bar{x}) = \bar{x} + a.$$

Procedure 2: since $r_1/n < 1$, and $a$, $q_1$ are both integers,

$$\overline{f(x)} = \left\lfloor \frac{\sum_{i=1}^{n} (x_i + a)}{n} \right\rfloor = \left\lfloor \bar{x} + \frac{r_1}{n} + a \right\rfloor = \bar{x} + a.$$

Comparison

$$f(\bar{x}) = \bar{x} + a = \overline{f(x)}.$$

The order of downsampling and filtering is inconsequential with respect to the resulting gray values. Both procedures are commutable.
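This exact commutation is easy to confirm numerically; the following Python sketch (an illustration, not the paper's code) bins $n = 16$ random pixels with integer division and checks that both procedures agree:

```python
# Numerical check of the Type 1 (additive) proof: downsample-then-filter
# equals filter-then-downsample exactly.
import random

def downsample(xs):
    # x-bar = floor(sum(x_i) / n), the integer-division binning above
    return sum(xs) // len(xs)

random.seed(1)
a = 7                                                    # additive constant
for _ in range(1000):
    xs = [random.randrange(0, 249) for _ in range(16)]   # keep x_i + a <= 255
    p1 = downsample(xs) + a                  # Procedure 1: bin, then add
    p2 = downsample([x + a for x in xs])     # Procedure 2: add, then bin
    assert p1 == p2                          # exactly commutable
```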

Type 2: Multiplicative Filters

Filter description

$$f(x_i) = a x_i, \qquad a = \frac{b}{d} \ \text{a rational constant} > 0.$$

Evaluation

Procedure 1: let $r_2$ be the remainder associated with $f(\bar{x}) = \lfloor a\bar{x} \rfloor$, such that

$$f(\bar{x}) = \left\lfloor \frac{b}{d} \cdot \frac{\left(\sum_{i=1}^{n} x_i\right) - r_1}{n} \right\rfloor = \frac{\frac{b}{d}\left[\left(\sum_{i=1}^{n} x_i\right) - r_1\right] - r_2}{n} = \frac{b \sum_{i=1}^{n} x_i}{dn} - \frac{b\,r_1}{dn} - \frac{r_2}{n}.$$

Procedure 2: let $r_i$ be the remainder of $\lfloor b x_i / d \rfloor$, $0 \le r_i < d$, such that

$$\overline{f(x)} = \left\lfloor \frac{\sum_{i=1}^{n} \lfloor a x_i \rfloor}{n} \right\rfloor = \left\lfloor \frac{\sum_{i=1}^{n} \frac{b x_i - r_i}{d}}{n} \right\rfloor.$$

With $r_3$, $0 \le r_3 < n$, being the remainder of the prior division, it follows that

$$\overline{f(x)} = \frac{\left[\sum_{i=1}^{n} \frac{b x_i - r_i}{d}\right] - r_3}{n} = \frac{b \sum_{i=1}^{n} x_i}{dn} - \frac{\sum_{i=1}^{n} r_i}{dn} - \frac{r_3}{n}.$$

Comparison

The two procedures differ by

$$\left( \frac{\sum_{i=1}^{n} r_i}{dn} + \frac{r_3}{n} \right) - \left( \frac{b\,r_1}{dn} + \frac{r_2}{n} \right).$$

Note that $0 \le r_1 < n$, $0 \le r_2 < n$, $0 \le r_i < d$, and $0 \le r_3 < n$. Hence, the greatest difference with respect to the resulting gray values between the two procedures is

$$\left( \frac{\sum_{i=1}^{n} r_i}{dn} + \frac{r_3}{n} \right) - \left( \frac{b\,r_1}{dn} + \frac{r_2}{n} \right) < 2.$$

Both procedures are nearly commutable.
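The near-commutation bound can likewise be spot-checked numerically; the gain $a = 3/2$ below is an assumed illustrative value, and the sketch is not the paper's code:

```python
# Numerical check of the Type 2 (multiplicative) bound: the two procedures
# differ by less than 2 gray values for a rational gain a = b/d.
import random

def downsample(xs):
    # Integer-division binning of n pixels into one superpixel (x-bar)
    return sum(xs) // len(xs)

random.seed(2)
b, d = 3, 2                                   # illustrative gain a = 1.5
worst = 0
for _ in range(1000):
    xs = [random.randrange(0, 170) for _ in range(16)]   # keep a*x_i <= 255
    p1 = (b * downsample(xs)) // d                   # Procedure 1: bin, then scale
    p2 = downsample([(b * x) // d for x in xs])      # Procedure 2: scale, then bin
    worst = max(worst, abs(p1 - p2))
assert worst < 2                                     # the bound derived above
```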

Type 3: Inversion Filter

Filter description

$$f(x_i) = 255 - x_i.$$

Evaluation

Procedure 1:

$$f(\bar{x}) = 255 - \bar{x}.$$

Procedure 2 (using the general observation above, with $a = 255 > 0$, $b/c = -\sum_{i=1}^{n} x_i / n < 0$ assumed noninteger):

$$\overline{f(x)} = \left\lfloor \frac{\sum_{i=1}^{n} (255 - x_i)}{n} \right\rfloor = \left\lfloor 255 - \frac{\sum_{i=1}^{n} x_i}{n} \right\rfloor = 255 - \left\lfloor \frac{\sum_{i=1}^{n} x_i}{n} \right\rfloor - 1 = 255 - \bar{x} - 1.$$

Comparison

$$f(\bar{x}) = 255 - \bar{x} = \overline{f(x)} + 1.$$

Therefore, the respective gray values of the two procedures differ by a constant value of 1. Hence, they are nearly commutable.
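The constant off-by-one is easy to reproduce; in this small sketch (an illustration, not the paper's code) the pixel sum is deliberately chosen not to be divisible by $n$ (when it is divisible, the two procedures agree exactly):

```python
# Numerical check of the Type 3 (inversion) result: when sum(x_i)/n is not
# an integer, Procedure 1 exceeds Procedure 2 by exactly 1 gray value.
def downsample(xs):
    # Integer-division binning (x-bar), as in the appendix
    return sum(xs) // len(xs)

xs = [10, 20, 30, 45]                   # sum = 105, NOT divisible by n = 4
p1 = 255 - downsample(xs)               # Procedure 1: bin first, then invert
p2 = downsample([255 - x for x in xs])  # Procedure 2: invert first, then bin
# p1 == 229, p2 == 228: the procedures differ by exactly 1
```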

Type 4: Gray-Value Reduction Filter

Filter description

Let $v$ denote the new number of gray levels. For practical purposes, $1 < v < 256$, $v \in \mathbb{N}$. Let $c$ be the number of original grayscale values incorporated into each new grayscale value, and $\Delta$ be the interval between gray values of the new grayscale:

$$c = \left\lfloor \frac{256}{v} \right\rfloor, \qquad \Delta = \left\lfloor \frac{255}{v-1} \right\rfloor, \qquad 1 \le c \le \Delta \le 2c \quad (*).$$

The gray-value reduction filter function then reads:

$$f(x_i) = \left\lfloor \frac{x_i}{c} \right\rfloor \Delta.$$

Evaluation

Procedure 1: let $r_4$ be the remainder of $\lfloor \bar{x}/c \rfloor$, $0 \le r_4 < c$. Recall that $\bar{x} = \left[\left(\sum_{i=1}^{n} x_i\right) - r_1\right]/n$. Hence, Procedure 1 yields:

$$f(\bar{x}) = \left\lfloor \frac{\bar{x}}{c} \right\rfloor \Delta = \frac{\bar{x} - r_4}{c}\,\Delta = \frac{\frac{\left(\sum_{i=1}^{n} x_i\right) - r_1}{n} - r_4}{c}\,\Delta = \frac{\Delta \sum_{i=1}^{n} x_i}{cn} - \frac{\Delta r_1}{cn} - \frac{\Delta r_4}{c}.$$

Procedure 2: let $r_i$ be the remainder of $\lfloor x_i/c \rfloor$, $0 \le r_i < c$, and $r_5$ be the remainder of $\left\lfloor \sum_{i=1}^{n} \lfloor x_i/c \rfloor \Delta \,/\, n \right\rfloor$, $0 \le r_5 < n$:

$$\overline{f(x)} = \left\lfloor \frac{\sum_{i=1}^{n} \left\lfloor \frac{x_i}{c} \right\rfloor \Delta}{n} \right\rfloor = \frac{\sum_{i=1}^{n} \left\lfloor \frac{x_i}{c} \right\rfloor \Delta - r_5}{n} = \frac{\sum_{i=1}^{n} \frac{x_i - r_i}{c}\,\Delta - r_5}{n} = \frac{\Delta \sum_{i=1}^{n} x_i}{cn} - \frac{\Delta \sum_{i=1}^{n} r_i}{cn} - \frac{r_5}{n}.$$

Comparison

The two procedures differ by

$$\left( \frac{\Delta r_1}{cn} + \frac{\Delta r_4}{c} \right) - \left( \frac{\Delta \sum_{i=1}^{n} r_i}{cn} + \frac{r_5}{n} \right).$$

Recall that $0 \le r_1 < n$, $0 \le r_4 < c$, $0 \le r_i < c$, $0 \le r_5 < n$, and $1 \le c \le \Delta \le 2c$. Hence,

$$0 \le \frac{\Delta r_1}{cn} + \frac{\Delta r_4}{c} < \frac{\Delta}{c} + \Delta \le 2 + 2c,$$

$$0 \le \frac{\Delta \sum_{i=1}^{n} r_i}{cn} + \frac{r_5}{n} < \frac{\Delta \sum_{i=1}^{n} c}{cn} + 1 = \Delta + 1 \le 2c + 1.$$

Therefore,

$$\left| \left( \frac{\Delta r_1}{cn} + \frac{\Delta r_4}{c} \right) - \left( \frac{\Delta \sum_{i=1}^{n} r_i}{cn} + \frac{r_5}{n} \right) \right| < 2c + 2.$$

Thus, the greatest possible difference between the resulting gray values of the two procedures is $2c + 2$. Hence, the lower the number of new gray values $v$ (and thus the larger $c$), the larger the deviation of resulting gray values, and vice versa. The degree of commutation between both procedures therefore depends on the number of new gray values.
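A numerical spot-check of the $2c + 2$ bound, with an assumed illustrative value of $v = 8$ new gray levels (an illustration, not the paper's code):

```python
# Numerical check of the Type 4 (gray-value reduction) bound: the two
# procedures differ by less than 2c + 2, where c = 256 // v.
import random

random.seed(4)
v = 8                          # illustrative new number of gray levels
c = 256 // v                   # c = 32 original values per new level
delta = 255 // (v - 1)         # delta = 36, spacing of the new grayscale
assert c <= delta <= 2 * c     # the inequality (*) proved in the appendix

def downsample(xs):
    # Integer-division binning (x-bar)
    return sum(xs) // len(xs)

def reduce_gray(x):
    # Gray-value reduction filter f(x) = floor(x / c) * delta
    return (x // c) * delta

worst = 0
for _ in range(1000):
    xs = [random.randrange(0, 256) for _ in range(16)]
    p1 = reduce_gray(downsample(xs))                 # Procedure 1
    p2 = downsample([reduce_gray(x) for x in xs])    # Procedure 2
    worst = max(worst, abs(p1 - p2))
assert worst < 2 * c + 2                             # the bound derived above
```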

(∗) Proof of $c < \Delta \le 2c$

Consider $c = \frac{256}{v}$ and $\Delta = \frac{255}{v-1} = \frac{\frac{255v}{v-1}}{v}$, i.e., both written over the common denominator $v$. With $1 < v < 256$, $v \in \mathbb{N}$, the numerator of $\Delta$ can take on values

$$\frac{255 \cdot 255}{254} \le \frac{255v}{v-1} \le \frac{255 \cdot 2}{1}.$$
Note that the left side of this inequality is $>256$ but $<257$ and corresponds to $v = 255$, while the right side corresponds to $v = 2$ and equals 510. To the left side of this series of inequalities, we can then tag
$$256 < \frac{255 \cdot 255}{254} \le \frac{255v}{v-1} \le \frac{255 \cdot 2}{1}.$$
Since the numerator of $c$ is 256, this implies that $c < \Delta$.

Furthermore, $2c = \frac{2 \cdot 256}{v} \ge \frac{2 \cdot 256 - v}{v} = \frac{512 - v}{v}$. At $v = 2$, the numerator of this lower bound equals $512 - 2 = 510 = \frac{255 \cdot 2}{1}$, matching the largest possible numerator of $\Delta$; for all larger $v$, the numerator of $\Delta$ only decreases, while $512 - v$ remains above it. Hence, $\Delta \le 2c$.
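The claim $c < \Delta \le 2c$ can also be confirmed exhaustively over all admissible $v$. A small sketch using exact rational arithmetic (treating $c$ and $\Delta$ as rationals before any rounding, as the numerator comparison above does):

```python
from fractions import Fraction

# check c < delta <= 2c for every admissible number of new gray levels v
for v in range(2, 256):
    c = Fraction(256, v)
    delta = Fraction(255, v - 1)
    assert c < delta <= 2 * c, v
print("c < delta <= 2c holds for all 2 <= v <= 255")
```

Algebraically, $c < \Delta$ reduces to $v < 256$ and $\Delta \le 2c$ reduces to $257v \ge 512$, both satisfied on the admissible range.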

Type 5: Blur Filter

For the blurring, we developed and implemented a computationally inexpensive procedure that consists of two successive sweeps across the respective image (downsampled or full resolution): first a vertical sweep (i.e., by columns), followed by a horizontal sweep (i.e., by rows). In both sweeps, only the NN=1 nearest neighbors in the vertical or horizontal direction, respectively, are considered, and all neighbors are equally weighted. This procedure delivers an effective blur (e.g., Fig. 7) and is applied in lieu of computationally expensive Gaussian blur filters.27, 28 We randomly generated 320×320pixel images, then proceeded to either: 1. blur the image once with NN=1 nearest neighbors and downsample it to 4×4, 8×8, 16×16, or 32×32; or 2. downsample the image to 4×4, 8×8, 16×16, or 32×32 and then blur it. Results for the same downsampling dimensions of the two procedures were compared, and the respective average deviation per pixel was recorded; 10,000 simulation runs were conducted for each case. The numerical simulations revealed that the deviation between the two procedures is minimal (i.e., they are nearly commutable): the worst average deviation was 4.4±3.4 on a 32×32 downsampled image, and the best was 0.58±0.58 on a 4×4 downsampled image.
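The two-sweep blur and the commutability experiment can be sketched as follows. This is an illustration, not the authors' implementation: we assume block-average downsampling, and use a 32×32 test image downsampled to 8×8 (instead of 320×320) for brevity:

```python
import random

def blur_1d(vals):
    # equally weighted average over each value and its NN=1 nearest neighbors
    return [sum(vals[max(0, i - 1):i + 2]) // len(vals[max(0, i - 1):i + 2])
            for i in range(len(vals))]

def blur(img):
    # vertical sweep (by columns) followed by horizontal sweep (by rows)
    cols = [blur_1d([row[j] for row in img]) for j in range(len(img[0]))]
    img = [[cols[j][i] for j in range(len(cols))] for i in range(len(img))]
    return [blur_1d(row) for row in img]

def downsample(img, m):
    # average non-overlapping blocks down to an m-by-m image
    s = len(img) // m
    return [[sum(img[i*s + a][j*s + b] for a in range(s) for b in range(s)) // (s * s)
             for j in range(m)] for i in range(m)]

random.seed(0)
img = [[random.randint(0, 255) for _ in range(32)] for _ in range(32)]
a = downsample(blur(img), 8)   # Procedure 1: blur first, then downsample
b = blur(downsample(img, 8))   # Procedure 2: downsample first, then blur
dev = sum(abs(a[i][j] - b[i][j]) for i in range(8) for j in range(8)) / 64
print(dev)  # average per-pixel deviation between the two procedures
```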

Type 6: Gray Equalization Filter

For the gray equalization, we employed the filter described in Refs. 27, 28. We randomly generated 320×320pixel images, then proceeded to either: 1. equalize the image once with slope=0 and downsample it to 4×4, 8×8, 16×16, or 32×32; or 2. downsample the image to 4×4, 8×8, 16×16, or 32×32 and then equalize it (with slope=0; see Refs. 27, 28). Results for the same downsampling dimensions of the two procedures were compared, and the respective average deviation per pixel was recorded, obtained by averaging the differences between corresponding pixels of the equalize-then-bin and bin-then-equalize images; 10,000 simulation runs were conducted for each case. Increased binning results in increased deviation: 32×32 yielded an average deviation of 58±33, 16×16 yielded 61±35, 8×8 yielded 64±37, and 4×4 yielded 72±34. The numerical simulations revealed that strong downsampling (e.g., to 4×4) after equalization merely centers pixel gray values around the middle of the grayscale (i.e., 128). This suggests that gray equalization should be performed after downsampling, as only then does it utilize the entire grayscale range of 0 to 255 and attain a meaningful equalization effect.
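The qualitative effect can be illustrated with a generic CDF-based histogram equalization as a stand-in (the actual slope-based filter of Refs. 27, 28 is not reproduced here, and block-average downsampling on a 32×32 test image is our assumption):

```python
import random

def equalize(img):
    # generic histogram equalization via the cumulative distribution;
    # a stand-in for the slope-based filter of Refs. 27, 28
    flat = [p for row in img for p in row]
    hist = [0] * 256
    for p in flat:
        hist[p] += 1
    cdf, total = [], 0
    for h in hist:
        total += h
        cdf.append(total)
    lut = [255 * c // len(flat) for c in cdf]   # map gray value -> equalized value
    return [[lut[p] for p in row] for row in img]

def downsample(img, m):
    # average non-overlapping blocks down to an m-by-m image
    s = len(img) // m
    return [[sum(img[i*s + a][j*s + b] for a in range(s) for b in range(s)) // (s * s)
             for j in range(m)] for i in range(m)]

random.seed(0)
img = [[random.randint(0, 255) for _ in range(32)] for _ in range(32)]
eq_then_bin = downsample(equalize(img), 4)   # equalize first, then bin
bin_then_eq = equalize(downsample(img, 4))   # bin first, then equalize
spread = lambda im: max(p for r in im for p in r) - min(p for r in im for p in r)
print(spread(eq_then_bin), spread(bin_then_eq))
# binning after equalization clusters values near mid-gray, whereas equalizing
# the downsampled image spreads them over the full 0..255 range
```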

©(2010) Society of Photo-Optical Instrumentation Engineers (SPIE)
Wolfgang Fink, Cindy X. You, and Mark A. Tarbell "Microcomputer-based artificial vision support system for real-time image processing for camera-driven visual prostheses," Journal of Biomedical Optics 15(1), 016013 (1 January 2010). https://doi.org/10.1117/1.3292012