Optical Coherence Tomography and the Human Ear
Optical coherence tomography (OCT) is an optical imaging technique that uses a low-coherence interferometer to produce depth-resolved scans in tissue.1 The use of OCT for anatomical middle ear imaging in humans was first suggested by Pitris et al.2 Since then, it has been used as a diagnostic tool for examination of the tympanic membrane (TM) and for use during middle ear and cochlear surgery,3–5 and it has been successfully used to study ear anatomy in several animal models over the last decade.6–10 More recently, Doppler-resolved OCT has been shown to provide useful diagnostic information in both chinchilla11 and cadaveric human middle ear12 models. A competing imaging technology, high-frequency ultrasound, has been used to generate middle ear images comparable to those obtainable with OCT, but with the disadvantage that the ear must be filled with an acoustic coupling medium such as saline in order to be imaged.13
Despite the clear promise of OCT for middle ear diagnostics, no system yet proposed or demonstrated appears to be suitable for real-time clinical imaging of the middle ear in humans without requiring removal of the TM. As compared to rodent models, human middle ears are deep and capturing the entire middle ear in an image frame strains the scanning range capabilities of most OCT technologies. However, there is a compelling case for development of OCT for middle ear use since noninvasive clinical middle ear imaging could reduce the number of unnecessary ear surgeries being performed, improve diagnosis of middle ear disease, and provide better imaging for postoperative follow-up.
In the clinic, ears are typically visualized using surgical microscopes with long working distances of 20–30 cm. The distance from the ear canal entrance to the distal side of the middle ear cavity is about 40 mm and the diameter of the ear canal is typically 7 mm. If the middle ear is to be imaged from outside the ear canal, then the ear canal geometry restricts the numerical aperture (NA) of ear imaging systems to be below 0.09. In order to extend the depth of field and avoid vignetting by the ear canal, the typical NA of surgical microscopes used in the clinic is less than 0.02.14 A low NA reduces lateral resolution and, more importantly for otological OCT, reduces the amount of collected light, and thus reduces the achievable shot-noise-limited signal-to-noise ratio (SNR) and penetration. In a noninvasive middle ear imaging system, the amount of optical power reflected from internal middle ear structures like the ossicles is further reduced by the need for the light to pass twice through the scattering tissue of the TM. Moreover, simultaneous imaging of the relatively bright TM and the much dimmer middle ear structures requires that the imaging system has a large dynamic range.
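The 0.09 bound quoted above can be checked with simple geometry. The following is a minimal sketch, assuming the ear canal is a straight cylinder whose entrance is the limiting aperture for a focus on the far wall of the middle ear; the function name is ours and purely illustrative.

```python
import math

def geometric_na_limit(canal_diameter_mm: float, depth_mm: float) -> float:
    """NA (in air) of the widest cone that clears a straight canal of the
    given diameter when focused at the given depth from the canal entrance."""
    half_angle_rad = math.atan((canal_diameter_mm / 2.0) / depth_mm)
    return math.sin(half_angle_rad)

# Values from the text: 7 mm canal diameter, 40 mm from entrance to far wall.
na_max = geometric_na_limit(canal_diameter_mm=7.0, depth_mm=40.0)
print(f"geometric NA limit: {na_max:.3f}")
```

Under these assumptions the limit comes out just below 0.09. A real ear canal is curved and narrows along its length, so the usable NA would be expected to be lower still.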
From a clinical perspective, any proposed OCT system must fit well within the existing clinical workflow, provide the clinician with obvious diagnostic advantages, and produce images at real-time frame rates. Unlike more commonplace retinal and skin OCT, the structures of interest in the middle ear are millimeter sized and do not require exceptionally high resolution to visualize. However, the fact that they are located behind the eardrum, the fact that they are distributed throughout approximately two cubic centimeters of air-filled space, and the fact that, if they are to be imaged from an external microscope, the NA will be low, conspire to create a set of requirements unlike those of other OCT applications.
In this paper, we present two-dimensional (2-D) and three-dimensional (3-D) imaging results obtained from a time-domain OCT (TD-OCT) system specifically developed to assess these challenges and their associated design constraints for real-time noninvasive clinical middle-ear OCT. While it is well known that TD-OCT suffers from a low frame rate and low sensitivity as compared to the Fourier-domain OCT (FD-OCT) methods,15 namely spectral-domain OCT (SD-OCT) and swept-source OCT (SS-OCT), TD-OCT has advantages over FD-OCT in terms of dynamic range and maximum scanning range and is unaffected by complex-phase ambiguity and aliasing artifacts.16 While TD-OCT is unlikely to form the basis for a real-time clinical system due to the difficulty of achieving real-time frame rates, it presents an excellent vehicle for preclinical imaging aimed at understanding the requirements that a clinical system needs to meet in order to produce acceptable images, which is the main aim of this study.
Middle Ear Anatomy and Pathology
The middle ear, shown in Fig. 1, consists of the eardrum or TM, middle ear bones (ossicles) and associated muscles, tendons, and ligaments. The TM separates the ear canal from the air-filled middle ear cavity containing the three ossicles (malleus, incus, and stapes), which act to mechanically transmit the vibrations of the TM to the cochlea. Clinically, since the middle ear cavity is covered by the optically scattering TM, the ossicles cannot be seen clearly with direct microscopy, but since the TM is thin,17 it is possible to use OCT to image through the TM and into the middle ear cavity.
Pathology of the ossicles is a common cause of hearing loss. Ossicular problems include traumatic fracture or dislocation; fixation due to otosclerosis; and erosion, particularly at the long process of the incus, by TM retraction or an abnormal growth of soft tissue called cholesteatoma. Surgical intervention aimed at restoring normal function to the middle ear can involve the release of fixed ossicles or the reconstruction of parts of the ossicular chain with autologous or artificial prosthetic materials, but accurate in-clinic diagnosis of the pathology underlying a conductive hearing loss is currently limited by the inability to visualize the middle ear directly without taking the patient to the operating room. This need has sparked interest in transtympanic OCT as a possible noninvasive imaging modality for the middle ear.2
Key Challenges of Optical Coherence Tomography-Based Middle Ear Imaging
In conventional OCT systems, a focused beam with a depth of field roughly equal to the scanning range is scanned laterally across the field of view. In order to image the full depth of the middle ear from the TM to the cochlear floor, a depth scanning range of at least 10 mm is required, necessitating an equivalent depth of field. A 10-mm lateral scanning range is sufficient to capture the majority of the middle ear cavity and ossicles.
While broadband light sources are available at visible wavelengths, the fact that scattering losses in the TM are lower at longer wavelengths18 makes operation in the near-infrared (NIR) desirable. For our system, we used a center wavelength of 1310 nm to take advantage of the high-powered, low-noise superluminescent diodes (SLDs) available at that wavelength. Even longer wavelengths could be used to further reduce scattering losses in the TM, but at the expense of lower resolution and a poorer availability of high-powered, broadband optical sources.
For generic OCT sample arm optics, the axial depth of field of the scanning beam is related to the system NA by19
Because the NA is inversely proportional to the square root of the depth of field, the ear’s long depth requires low-NA optics in order to maintain focus throughout the scan. For a depth of field of 10 mm at a wavelength of 1310 nm, NA is limited to . We found that acceptable images could be obtained with an NA as high as 0.022, but any further increase leads to noticeable levels of defocusing at the proximal and distal ends of the images, i.e., at the TM and the cochlear floor, respectively. This NA is an order of magnitude lower than those typically used in ophthalmic OCT scanners.20 With an NA this low, the resulting decrease in lateral resolution may be acceptable in the ear, but the associated loss in light-gathering ability leads to challenges in collecting sufficient backscatter to form an image with adequate SNR for clinical fidelity.
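To make the dependence of depth of field on NA concrete, the sketch below uses the textbook Gaussian-beam convention in which the depth of field is twice the Rayleigh range, DOF = 2λ/(π NA²), with NA defined from the 1/e² far-field divergence. This convention is an assumption on our part and may differ from the one used in Ref. 19; other definitions change the prefactor but not the inverse-square dependence. The function names are illustrative.

```python
import math

def depth_of_field_mm(wavelength_nm: float, na: float) -> float:
    """Confocal parameter (twice the Rayleigh range) of a Gaussian beam in air.
    Assumed convention: DOF = 2 * lambda / (pi * NA**2)."""
    wavelength_mm = wavelength_nm * 1e-6
    return 2.0 * wavelength_mm / (math.pi * na ** 2)

def na_for_depth_of_field(wavelength_nm: float, dof_mm: float) -> float:
    """Invert the relation above: the NA that gives the requested depth of field."""
    wavelength_mm = wavelength_nm * 1e-6
    return math.sqrt(2.0 * wavelength_mm / (math.pi * dof_mm))

na_10mm = na_for_depth_of_field(wavelength_nm=1310.0, dof_mm=10.0)
dof_at_0p022 = depth_of_field_mm(wavelength_nm=1310.0, na=0.022)
print(f"NA for a 10 mm depth of field: {na_10mm:.4f}")
print(f"depth of field at NA = 0.022: {dof_at_0p022:.2f} mm")
```

Under this convention, halving the NA quadruples the depth of field, which is why imaging the full depth of the middle ear pushes the design toward very low NA.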
It is important to consider the metrics of the imaging system’s performance that matter most to clinical otologists: contrast, frame rate, and resolution. Because all important anatomical structures in the middle ear are suspended in air, contrast in otology images is equivalent to SNR. If we also reasonably assume that the field of view is sampled laterally in steps roughly equal to the lateral resolution, then, following a derivation similar to that in Ref. 19 but rewritten in terms of the parameters of interest, the SNR of a feature within a shot-noise-limited TD-OCT image acquired using a Mach–Zehnder interferometer with balanced detection and optimal measurement bandwidth, like the system shown in Fig. 2, is given by
The resulting trade-off between SNR and frame rate is illustrated in Fig. 3, which shows 2-D images containing 200 lines acquired at frame rates of 0.5, 2.5, and 5.0 fps. This trade-off holds even for non-shot-noise-limited imaging so long as the noise is white. Throughout this paper we specify SNR as
With a total of arriving at the detector from the reference optics, and 9.2 mW available at the sample, we have measured our system sensitivity to be at the center of the image when imaging at 50 lines per second over 10 mm (i.e., a reflection in the sample of will give 0 dB SNR). System noise was measured at 3.6 dB above the expected shot-noise floor.
We are able to generate images with of SNR at the incus through the TM. However, for an image containing 100 image lines, this SNR is only achievable at a frame rate of 0.5 fps, significantly slower than real time (). If the frame rate is increased to 5 fps, the sensitivity decreases to approximately 87 dB with an SNR at the incus of just 37 dB.
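The 0.5 fps and 5 fps figures above are consistent with the white-noise trade-off between SNR and imaging speed. A minimal sketch, assuming the detection bandwidth (and hence the noise power) scales linearly with line rate while the signal power is unchanged; the function name is hypothetical.

```python
import math

def snr_change_db(old_rate: float, new_rate: float) -> float:
    """Change in SNR (dB) when the line or frame rate changes, assuming white
    noise and a detection bandwidth proportional to the rate."""
    return -10.0 * math.log10(new_rate / old_rate)

# Going from 0.5 fps to 5 fps at a fixed number of lines per frame.
delta_db = snr_change_db(old_rate=0.5, new_rate=5.0)
print(f"SNR change for a tenfold speed-up: {delta_db:.1f} dB")

# Under this scaling, the 37 dB quoted at 5 fps implies this value at 0.5 fps.
implied_snr_at_0p5_fps = 37.0 - delta_db
print(f"implied incus SNR at 0.5 fps: {implied_snr_at_0p5_fps:.1f} dB")
```

The implied value is an estimate under the stated scaling assumption, not a measured figure.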
We designed a TD-OCT imaging system to determine system requirements needed in order to obtain clinically acceptable middle ear images. We specifically looked at the acceptable numerical aperture, required sensitivity and dynamic range, and the impact of transtympanic acquisition on the image. We also used the system to investigate potential new applications in imaging specific otological pathologies.
Images were obtained in two cadaveric human temporal bones taken from the same head. The bones were obtained fresh-frozen (unfixed) from Anatomy Gifts Registry (Hanover, Maryland), and thawed to room temperature before use. The bones were stored in a refrigerator overnight at 4°C. They were allowed to come up to room temperature before imaging each day and kept moist by periodic spraying with saline. The external ear canal and soft tissue were removed from the bones for convenience, but images were acquired along a line of sight representative of that available to clinicians, along the ear canal. All procedures were undertaken under the oversight of the Dalhousie University Research Ethics Board.
The overall system topology is shown in Fig. 2. The light source is a Denselight Semiconductors (Singapore) DL-CS3504A fiber coupled InP SLD. Its emitted power was measured to be 54.3 mW with a nominal FWHM bandwidth of 56 nm centered at 1310 nm. The source was guided to a 90/10 nonpolarizing beamsplitter (PBS) which directed 90% of the optical power to the sample arm and 10% to the reference arm of a custom built, Mach–Zehnder fiber interferometer. Optical circulators directed incident light to the sample and reference arms and the reflections to a 50/50 fiber beamsplitter with a balanced detector.21
The variable time delay of the sample-arm light required for axial scanning in TD-OCT was generated using a diffraction-grating-based rapid-scanning optical delay-line (RSOD).22 Its design was based on one used previously23 using polarizing elements to perform double-pass beam de-scanning, but it was optimized for scanning speed, scanning range, and insertion loss. In our system, we achieved 12 mm of useful scanning range in air using a lens with a focal length of 100 mm, at an A-line rate of 1 kHz. Total RSOD insertion loss was limited to just 6.9 dB by operating near the Littrow condition using an incidence angle of 3 deg onto a diffraction grating, relative to grating normal, with a pitch of and a blaze angle of 20 deg. Relocating the quarter-wave plate (QWP) beyond the grating ensures that only purely horizontally or vertically polarized light, rather than circularly polarized light, is incident on the grating, avoiding the polarization-dependent losses that would have been seen in Ref. 23. The RSOD was adjusted for dispersion compensation24 to minimize the width of the point spread function.
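As a rough consistency check on the power figures quoted so far, the sketch below chains the source power, the 90/10 split, and the RSOD insertion loss, and compares the result with the 9.2 mW reported at the sample. It assumes the RSOD loss is incurred once between the splitter and the sample, which is our reading of the topology rather than something stated explicitly; any remainder is attributed to circulators, connectors, and sample-arm optics.

```python
import math

def apply_loss_db(power_mw: float, loss_db: float) -> float:
    """Return the power remaining after a loss expressed in dB."""
    return power_mw * 10.0 ** (-loss_db / 10.0)

SOURCE_POWER_MW = 54.3        # measured SLD output (from the text)
SAMPLE_ARM_FRACTION = 0.90    # 90/10 split (from the text)
RSOD_INSERTION_LOSS_DB = 6.9  # from the text
MEASURED_AT_SAMPLE_MW = 9.2   # from the text

sample_arm_mw = SOURCE_POWER_MW * SAMPLE_ARM_FRACTION
after_rsod_mw = apply_loss_db(sample_arm_mw, RSOD_INSERTION_LOSS_DB)
# Loss not explained by the split and the RSOD (assumed: other passive components).
residual_loss_db = 10.0 * math.log10(after_rsod_mw / MEASURED_AT_SAMPLE_MW)

print(f"after 90/10 split: {sample_arm_mw:.1f} mW")
print(f"after RSOD:        {after_rsod_mw:.1f} mW")
print(f"unexplained loss:  {residual_loss_db:.2f} dB")
```

Under these assumptions the budget closes to within a fraction of a decibel, which suggests the RSOD dominates the sample-arm losses.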
In the sample arm, light exited from a fiber collimator (Thorlabs TC18APC-1310) with a 3.2 mm beam diameter. A two-axis galvanometric mirror provided lateral scanning across the field of view. The centroid of the two mirrors’ axes was located at the back focal plane of a 100-mm achromatic doublet objective lens for approximate telecentricity (Thorlabs AC254-100-C). The overall system NA was limited by the size of the collimated beam to . The beam profile was measured to be which agreed with the diffraction limited value to within experimental error. Quarter-wave and half-wave plates were used to control the polarization in the sample arm and maximize fringe visibility.
For better usability in the clinic, a modified otoscope speculum can be rigidly mounted at the objective for aiding with imaging around the slight curvature of the human ear canal. Simultaneous conventional imaging was incorporated using the complementary metal oxide semiconductor (CMOS) sensor from a Logitech c270 HD webcam. Available thin-element dichroic mirrors were found to cause significant ghosting due to multiple internal reflections; therefore, a thick-element PBS was used to separate the NIR and visible light.
The differential optical signal from the interferometer was guided to a Thorlabs PDB145C balanced detector with a transimpedance gain of . A coarse analog bandpass filter with a mid-band gain of for DC blocking and antialiasing provided signal conditioning before digitization of the image line data with a 16-bit PCIe digitizer (Alazar ATS9462). The sampling rate for digitization was locked to the A-line scan rate so that 9760 points were acquired for each A-line regardless of the scan rate. Final filtering of the acquisition data was digitally performed with an eighth-order Butterworth zero-phase-delay bandpass filter with a center frequency and passband width that also scaled with the A-line rate. Image lines were generated from the filtered A-line data by decimation of the absolute value and plotting on a logarithmic scale. Synchronization of the RSOD, lateral scanning mirror control and acquisition triggering was accomplished using a multifunction data acquisition board (National Instruments NI-USB-6259) and controlled using custom scripts written in Python. Scanning parameters were controlled using a custom GUI written in Python that provides real-time display of both the OCT imaging and an en-face view of the TM from the CMOS camera.
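The last processing step described above (rectify, decimate, log-compress) can be sketched as follows. This is an illustration only, not the authors' code: it assumes decimation by taking the peak of each block and a 20·log10 amplitude convention, and it skips the Butterworth bandpass stage by starting from an already band-limited synthetic fringe burst. All names and the decimation factor are hypothetical.

```python
import math

def a_line_to_db(filtered, decimation, floor=1e-12):
    """Convert band-pass-filtered fringe samples into log-scaled image pixels.
    Each output pixel is the peak absolute value of one block of samples,
    expressed in dB; `floor` avoids taking log10 of zero."""
    pixels = []
    for start in range(0, len(filtered) - decimation + 1, decimation):
        block_peak = max(abs(s) for s in filtered[start:start + decimation])
        pixels.append(20.0 * math.log10(max(block_peak, floor)))
    return pixels

# Synthetic A-line: one unit-amplitude reflector at mid-depth, 9760 samples as in the text.
N_SAMPLES = 9760
fringes = [
    math.exp(-((n - 4880) / 200.0) ** 2) * math.cos(2.0 * math.pi * 0.1 * n)
    for n in range(N_SAMPLES)
]

image_line = a_line_to_db(fringes, decimation=10)
brightest_pixel = image_line.index(max(image_line))
print(len(image_line), brightest_pixel)
```

Block-peak decimation keeps narrow reflectors visible after downsampling; averaging within each block would be an equally plausible choice with different speckle behavior.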
Measurement of Optical Loss across the Tympanic Membrane
While tomographic imaging of the TM itself can provide clinical value,3 the present system is primarily designed to image the volume behind the TM and the TM presents an impediment to this in two ways. First, it causes substantial optical losses due to scattering within the TM. Second, it creates a strong reflection which can obscure the weaker reflections of the structures in the middle ear.
One can obtain an estimate of the amount of loss generated in the TM by considering the optical scattering coefficient. While the scattering coefficient in human TM tissue has not been measured to our knowledge, other authors have made the reasonable assumption that scattering in the TM will be similar to scattering in dermis,25 which has been characterized in a number of studies.26,27 The scattering coefficient of dermis at 1310 nm is .26 The probability that a photon incident on the TM will not be scattered in passing through tissue of length $d$ with scattering coefficient $\mu_s$ is $e^{-\mu_s d}$, so for the assumed TM thickness 11% of the incident photons will pass through without scattering. In a low-NA system, the scattered photons have a very small probability of being scattered back into the incident mode and so the 89% of photons that do scatter can be approximately treated as lost. Since any light reflecting off structures distal to the TM must pass through it twice in order to be collected, only about 1.2% ($0.11^2$) of the light passes through the TM without scattering. The TM can, therefore, be expected to act like a roughly 19 dB loss source. Compared with scattering losses, the losses due to specular reflection from the tissue–air boundary and due to tissue absorption in the TM are negligible.
We confirmed this analysis by measuring the loss across a cadaveric TM. The TM was excised and mounted over a clear aperture and placed in front of plain white paper, as seen in Fig. 4(a). This entire assembly was then imaged with the OCT system with sections of both the unobstructed paper and the TM-obstructed paper within the field of view. From the relative intensities observed in the two areas of interest, the two-way optical loss across the TM was calculated by taking the ratio of the averaged peak fringe amplitude in each line within the two areas of interest, giving a loss of 13.5 dB. This is a lower loss than was estimated, indicating that the TM we measured had either a lower scattering coefficient or a smaller thickness than assumed in the estimate. A more thorough study of optical scattering in the TM across multiple specimens is planned for the future.
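The estimate and the measurement can be compared with a few lines of arithmetic. The sketch below takes the 11% single-pass unscattered fraction and the measured 13.5 dB two-way loss from the text, and assumes that scattered light is entirely lost, as argued above; the function names are illustrative.

```python
import math

def two_way_loss_db(single_pass_transmission: float) -> float:
    """Round-trip loss in dB for light crossing the TM twice."""
    return -10.0 * math.log10(single_pass_transmission ** 2)

def single_pass_from_two_way_db(two_way_db: float) -> float:
    """Single-pass unscattered fraction implied by a measured round-trip loss."""
    return 10.0 ** (-two_way_db / 20.0)

estimated_loss_db = two_way_loss_db(0.11)
measured_single_pass = single_pass_from_two_way_db(13.5)

# Optical depth (scattering coefficient times thickness) for each case.
estimated_optical_depth = -math.log(0.11)
measured_optical_depth = -math.log(measured_single_pass)

print(f"estimated two-way loss: {estimated_loss_db:.1f} dB")
print(f"measured single-pass transmission: {measured_single_pass:.2f}")
print(f"optical depth, estimated vs measured: "
      f"{estimated_optical_depth:.2f} vs {measured_optical_depth:.2f}")
```

The ratio of the two optical depths indicates how much smaller the product of scattering coefficient and thickness was in the measured specimen than in the dermis-based estimate.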
Despite the increased use of imaging technologies such as CT and MRI in middle ear imaging, the gold standard for diagnosis remains exploratory tympanotomy.28 This procedure involves the removal or incision of the TM to allow direct visual microscopy of the middle ear. OCT offers a possible tool for enhancing exploratory tympanotomy by providing 3-D images of the middle ear anatomy of ears with the TM removed, as shown in Fig. 5(a). More excitingly, transtympanic OCT enables the possibility of performing digital tympanotomy, in which the TM is removed by postprocessing a 3-D OCT ear image that includes a TM, as shown in Fig. 5(b). This technique could potentially remove the necessity of exploratory tympanotomy in many cases, as the digital tympanotomy image displays all the same features as the exploratory tympanotomy with only a somewhat degraded image quality due to the presence of the TM. Figure 5(c) shows an alternative approach to visualizing the same data, in which different color schemes are used for displaying the TM and middle ear. This could be useful in surgical planning and for highlighting anatomical landmarks.
For preliminary validation of the system as a clinical tool, two human cadaver temporal bones were prepared so as to simulate clinically relevant conditions and were subsequently imaged with OCT.
The first preparation, shown diagrammatically in Fig. 6(a), simulated an eroded long process of the incus, a commonly encountered middle ear disorder in which the tip of the long process of the incus is eroded either by a retracted TM laid onto it or a cholesteatoma, resulting in ossicular discontinuity between the head of the stapes and the incus.
The second preparation, shown diagrammatically in Fig. 7(a), simulated a dislodged partial ossicular reconstruction prosthesis (PORP). This implant is used to reconstruct the ossicular chain when the long process of the incus has been eroded: the remaining incus is removed and the arm of the implant is placed onto the head of the stapes with its head in contact with the medial side of the TM to allow transmission of sound. In this scenario, a gap was left between the TM and the prosthesis, simulating a situation that would result in residual postoperative hearing loss.
In both cases, the temporal bones were prepared by first lifting the posterior edge of the TM. The surgical manipulation was then made to the middle ear and the TM was replaced so that the middle ear could be imaged through it. The resulting 2-D and 3-D images of the eroded incus are shown in Figs. 6(b) and 6(c). The site of the manipulation is indicated.
In the case of the dislodged prosthesis shown in 2-D and 3-D in Figs. 7(b) and 7(c), the titanium prosthesis can be clearly seen under the TM and the gap separating the prosthesis from the membrane can be readily discerned. A surgeon could use this image to diagnose the failure of an implant and to plan a subsequent intervention to replace it. Intraoperatively, this could also be used to detect migration or misplacement of a prosthesis following closure of the TM but prior to bringing the patient out of anesthesia, preventing the need for a separate revision surgery.
The images were taken at a frame rate of 0.5 fps and an NA of 0.022.
Multiple Scattering in the Ossicles
The images shown in Figs. 5–7 all exhibit prominent artifacts related to multiple scattering. While multiple scattering is commonly observed in soft-tissue OCT where it is often the limiting factor on contrast, it is particularly prominent in middle ear images of the bony ossicles and cochlear floor. Photons that are multiply scattered take long paths through tissue and so appear to originate from a deeper depth, often from a depth completely beyond the structure being imaged. This is the source of the long, speckle-filled tails that appear behind the bony structures, even to a degree that may impact one’s ability to visualize bone-air boundaries within the middle ear, and certainly enough to obscure fine detail in nearby structures. In the specific case of Fig. 6(b), where an example of the artifact has been identified trailing behind the malleus, the multiple scattering is sufficiently strong to partially camouflage the discontinuity in the ossicular chain between the incus and stapes in the 2-D B-mode image. The missing bone mass is more obvious in Fig. 6(c) in 3-D, suggesting 3-D renderings of middle ear anatomy may be particularly important for identifying abnormalities.
There are a few strategies for reducing multiple scattering. Some improvement can be obtained through spatial compounding29 and by image processing techniques using wavelet transforms.30 Polarization-mode OCT can also be used to selectively attenuate the multiply scattered light.31 Another approach to improving the system would be to increase the numerical aperture. Endoscopic imaging has been used to obtain a higher NA for TM imaging,3 but this is less convenient and comfortable than a free space approach applied from outside of the ear canal, and would require dynamic focusing32 or synthetic aperture techniques33 to maintain depth of field. These approaches are being evaluated in our lab to assess their potential to improve image quality from a diagnostic perspective.
Clinical Otological Optical Coherence Tomography Design Considerations
Taking the image in Fig. 3(a) as a representative example of the level of image fidelity necessary for real-time diagnostic use, and given the losses present in the TM, estimates of the sensitivity and dynamic range required for a clinical otological OCT system can be made. First, satisfactory bone-to-air contrast can be expected at the ossicles if detection sensitivity can be made to exceed ; a challenging but theoretically achievable goal at real-time rates, assuming the full sensitivity advantages of Fourier-domain methods can be realized.15 Perhaps a more challenging requirement to meet in an FD-OCT system is the required dynamic range. Given of loss at the TM and of SNR desired at the ossicles, of dynamic range is required. TD-OCT is tolerant of saturation from bright reflectors, as saturation at one point in the scan does not prevent valid data being obtained from weaker reflectors elsewhere. However, in FD-OCT, at any one point in time light from all depths is collected, and if the brightest reflector in the image is reflective enough to saturate the detector, then the collected signal is no longer the Fourier transform of the A-line. The image artifacts arising due to saturation by the bright reflectors contaminate the entire A-line, making simultaneous high-dynamic-range and high-sensitivity imaging difficult in FD-OCT, and causing artifacts from the bright TM reflection to obscure weaker reflectors like the ossicles.16 It may be possible to mitigate this problem using logarithmic detection34 or by purposely putting the TM out of focus in order to reduce its contribution to the reflected intensity.
Assuming that the dynamic range issue can be addressed, SS-OCT is likely the most suitable approach to apply to middle ear imaging, owing largely to recent advances in swept-source laser development. Swept lasers that can achieve SNR within a few dB of the shot-noise limit and, at the same time, coherence length sufficient to allow a scanning range in excess of 20 mm35 have only very recently become available. While this much range is of limited use in soft tissue OCT applications where the imaging range is limited by losses or multiple scattering, the long, transparent, and air-filled middle ear requires it. Commercial swept-source systems are now available with A-line rates in excess of 100 kHz, which would be more than adequate for real-time middle ear imaging. A next-generation middle ear imaging system based on SS-OCT is under active development in our lab.
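The difference in imaging speed can be illustrated with a short calculation. The sketch below compares the 1 kHz A-line rate of the TD-OCT system with a 100 kHz swept source, using the 200 lines per frame quoted for Fig. 3; the 200 × 200 line volume is a hypothetical example, and scanner flyback and duty cycle are ignored.

```python
def frame_rate_fps(a_line_rate_hz: float, lines_per_frame: int) -> float:
    """Frames (or volumes) per second, ignoring scanner flyback and duty cycle."""
    return a_line_rate_hz / lines_per_frame

LINES_PER_FRAME = 200  # as in the 2-D images of Fig. 3

td_oct_fps = frame_rate_fps(1_000, LINES_PER_FRAME)     # present TD-OCT system
ss_oct_fps = frame_rate_fps(100_000, LINES_PER_FRAME)   # 100 kHz swept source
# Hypothetical 3-D volume of 200 x 200 lines.
ss_oct_volumes_per_s = frame_rate_fps(100_000, LINES_PER_FRAME * LINES_PER_FRAME)

print(f"TD-OCT at 1 kHz:   {td_oct_fps:.1f} fps")
print(f"SS-OCT at 100 kHz: {ss_oct_fps:.1f} fps, {ss_oct_volumes_per_s:.1f} volumes/s")
```

Whether the higher line rate translates into usable images still depends on sensitivity, since SNR falls as line rate rises for a fixed optical power.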
As our two pathology samples demonstrated, an OCT imaging system for otology could improve the surgeon’s diagnostic capability in the clinic. Diagnosis of middle ear pathology currently uses a combination of optical visualization (otoscopy or microscopy in the clinic), audiometric testing (pure tone audiometry and tympanometry), and radiological imaging (typically with CT). OCT of the middle ear could potentially complement and extend the diagnostic toolset available to clinicians in a number of scenarios. For example, in a patient who has a persistent conductive hearing loss following ossiculoplasty (surgical reconstruction of the ossicular chain), the ability to visualize the reconstruction would allow a surgeon to decide whether any improvement is possible with revision surgery. It would also enable diagnosis of conductive hearing loss in the presence of a normal TM, which may be caused by otosclerosis (fixation of the stapes footplate in the oval window), tympanosclerotic fixation (scarring, usually around the head of the malleus and body of the incus), congenital ossicular abnormality (absence or fixation of part of the ossicular chain from birth), or fracture/dislocation of the chain. The surgical management of each of these conditions differs and carries different success rates and risks. Surgeons are routinely forced to improvise solutions and face problems in the operating room that may not have been discussed as a likely scenario with the patient. The ability to make more accurate diagnoses in the clinic would allow the surgeon to better counsel patients preoperatively about the risks and benefits of intervention and to aid them in their decision-making process as well as reducing the number of unnecessary surgeries.
We have demonstrated an OCT system for noninvasive imaging of the human middle ear to assess the design challenges faced in developing clinically useful otological OCT. Our study assesses the feasibility of bringing OCT-based middle ear imaging into clinical practice. We investigated the deleterious effects of being forced to image at low NA () and from outside the TM (loss ), and determined some requirements for system sensitivity () and dynamic range () in order to obtain acceptable image quality. Preliminary validation in human temporal bones shows that images obtained with our TD-OCT system provide diagnostically useful information on erosion of the long process of the incus and on prosthesis migration, highlighting how OCT may fit within clinical otology and outlining new motivations for bringing OCT into the otology clinical space.
This work was funded by the Natural Sciences and Engineering Council (Award No. 387373-2010) and under the Atlantic Canada Opportunities Agency Atlantic Innovation Fund program (Project No. 197819).
Dan MacDougall is a PhD student in the School of Biomedical Engineering at Dalhousie University, where he also received his bachelor’s degree in electrical engineering.
James Rainsbury is a consultant ENT surgeon in Plymouth, UK, with subspecialty interests in otology and pediatric ENT. Prior to that he was an otology fellow at Dalhousie University, Halifax. His main research interests include Eustachian tube dysfunction and middle ear imaging.
Jeremy Brown received his PhD in applied physics from Queens University in 2005. Currently, he is an associate professor in the Biomedical Engineering Department at Dalhousie University with cross-appointment to Electrical Engineering and the Department of Surgery, and has an affiliated scientist status at Capital Health in the Department of Otolaryngology. His research is focused on the design, fabrication, and testing of both ultrasonic and sonic frequency piezoelectric transducers and associated electronic hardware for medical applications.
Manohar Bance received his degree in medicine from the University of Manchester, UK, in 1985. He completed his residency in otolaryngology-head and neck surgery at the University of Toronto in 1995, followed by a fellowship in otology/neurotology in Manchester, UK. He was an assistant professor in the Otolaryngology Department at the University of Toronto from 1996 to 2001 and then moved to Dalhousie University, where he is now a professor and head of the Division of Otolaryngology-Head and Neck Surgery.
Robert Adamson received his PhD in physics from the University of Toronto in 2008. He has been an assistant professor at the School of Biomedical Engineering at Dalhousie University since 2010. His research interests include high-frequency ultrasound and optical imaging technologies particularly for use in the auditory system, optoacoustics, and novel biomedical uses for ultrasound, including powering of biomedical implants.