Label-free in vivo pathology of human epithelia with a high-speed handheld dual-axis confocal microscope

Abstract. There would be clinical value in a miniature optical-sectioning microscope to enable in vivo interrogation of tissues as a real-time and noninvasive alternative to gold-standard histopathology for early disease detection and surgical guidance. To address this need, a reflectance-based handheld line-scanned dual-axis confocal microscope was developed and fully packaged for label-free imaging of human skin and oral mucosa. This device can collect images at >15 frames/s with an optical-sectioning thickness and lateral resolution of 1.7 and 1.1 μm, respectively. Incorporation of a sterile lens cap design enables pressure-sensitive adjustment of the imaging depth by the user during clinical use. In vivo human images and videos are obtained to demonstrate the capabilities of this high-speed optical-sectioning microscopy device.

The visualization of glandular, cellular, and subcellular features from thinly sectioned tissues mounted on glass slides, known as histopathology, provides a clinical gold standard by which diseases are diagnosed. Since this process is destructive to tissues, time-consuming, and costly, a limited number of sections are typically prepared from each tissue specimen, which leads to severe sampling errors. For rapid intraoperative consultations, frozen sections can be prepared, but this still requires the selective invasive removal of tissues, which is risky, and has the same sampling limitations as histology of formalin-fixed paraffin-embedded tissues.
Portable in vivo optical-sectioning microscopes have the potential to enable real-time noninvasive pathology that can circumvent many of the limitations of conventional histology methods. [1][2][3][4][5][6][7][8][9][10] Confocal microscopy has traditionally been the most popular optical-sectioning technique for imaging tissues, in which a pinhole is typically used as a spatial filter to reject out-of-focus and multiply scattered background light. While the first confocal microscopes were developed in the 1950s and 1960s, originally by Minsky, [11] early systems that relied upon analog detection through an eyepiece were bulky and slow. In the past few decades, rapid advances in lasers, fibers, scanners, detectors, and computers have enabled the development of portable, handheld, and even endoscopic systems for various clinical applications. [1][2][3][4][5][6][7][8][9][10]
Although conventional single-axis confocal (SAC) microscopes have become standard equipment in life-science and clinical laboratories, a dual-axis confocal (DAC) architecture is utilized in this study. [12] Unlike the SAC architecture, which utilizes a single objective and common beam path for illumination and collection, the DAC architecture utilizes off-axis low-numerical-aperture (NA) illumination and collection beams that intersect at their foci. This configuration has been shown, through diffraction-theory analysis and Monte-Carlo scattering simulations, to provide more effective optical sectioning, which in turn enables higher-contrast imaging [improved signal-to-background ratios (SBRs)] and greater imaging depth within biological tissues (compared with SAC microscopy). [12][13][14][15] In addition, the use of low-NA beams provides a long working distance, which can be an advantage for miniaturization. [12]
Handheld devices for the early detection of skin and oral malignancies, and/or surgical guidance of a variety of anatomical sites, should ideally acquire images at a high frame rate in order to minimize motion artifacts during clinical use on patients. While most previous DAC microscopes have utilized point scanning, in which a two-dimensional (2-D) image is constructed by scanning a localized focal volume in two dimensions, the DAC microscope described in this letter utilizes line scanning in order to achieve a high frame rate. In a line-scanned confocal microscope, the illumination beam is focused to a line within the specimen, and a detection slit is used in front of a linear detector array, instead of a pinhole, to reject out-of-focus light. While a line-scanned system sacrifices one dimension of confocality (along the focal line), simulations and experiments have demonstrated that a line-scanned dual-axis confocal (LS-DAC) microscope is capable of achieving adequate contrast (SBR) when imaging near tissue surfaces (∼100-μm depth) in comparison to a point-scanned dual-axis confocal (PS-DAC) microscope. [13][15][16]
In comparison to an earlier proof-of-concept prototype that was not fully packaged for clinical use, [17] a number of technical improvements are reported here: (1) an optimized illumination module has been fabricated to improve the imaging resolution and contrast; (2) a portable detector has been incorporated into a fully packaged handheld device to enable portable clinical use; and (3) a sterile lens cap has been designed to enable pressure-sensitive adjustment of the optical-sectioning depth by the user during imaging. Collectively, these technical advances have allowed us to obtain first-in-human reflectance images of skin and oral mucosa.
The handheld LS-DAC microscope developed in this study consists of three major modules [Fig. 1(a)]: (1) a main body that houses the optics for the illumination (blue) and collection (green) beams, a MEMS scanning mirror, and two alignment mirrors; (2) a custom relay objective lens with a lens cap that provides 3× magnification; and (3) a detector array. A single-mode optical fiber (SM670) is used to couple laser radiation at a wavelength of 660 nm into the illumination path of the main body (Gaussian beams are assumed throughout this work). A newly optimized illumination fiber module, assembled by GRINTECH GmbH (Jena, Germany), consists of two doublet achromat lenses packaged within a stainless steel cylindrical tube with an inner diameter of 3.0 mm and an outer diameter of 3.2 mm. Lens L1 is a spherical achromat from Edmund Optics (Barrington, New Jersey, catalog number 45262) that has been reduced in diameter (to 3.0 mm) by BMV Optical (Ottawa, Ontario, Canada). Lens C is a custom cylindrical lens, fabricated by BMV Optical, that is based on the lens prescription of a spherical achromat from Edmund Optics (catalog number 45090). Compared to the three-lens fiber module utilized in a previous prototype, [17] the new two-lens fiber module is simpler to assemble and exhibits reduced diffraction sidelobes in the profile of the illumination focal line due to reduced clipping of the Gaussian beam (see Fig. 3 and associated text). The MEMS scanning mirror utilized in our device is from Mirrorcle Technologies, Richmond, California [shown in Fig. 1(c)]. The MEMS chip is packaged into an LCC18 package (measuring 8.89 × 7.24 mm) and soldered onto a custom-designed printed circuit board chip (measuring 10.16 × 8.64 mm) manufactured by Advanced Circuits Inc. The detector in Fig. 1(c) is a 2-D detector array (Basler ace acA2000-340km) with 2048 × 1088 pixels, in which each pixel measures 5.5 × 5.5 μm.
To utilize this detector as a linear array, a region of interest of 4 rows × 2048 pixels was binned to generate a 1 × 2048 output. A lens cap was designed to provide a means for adjusting the imaging depth [shown in Fig. 1(d)]. The distal face of the lens cap provides a flat surface that comes into contact with the tissue during imaging. A 3.5-mm-diameter hole at the center of the distal face of the lens cap provides optical access and is covered with a sterile plastic film that wraps around the entire device to maintain sterility. As the user adjusts the pressure of the device against the tissue, the tissue deflects slightly into the hole of the lens cap, which in turn allows the microscope to image more deeply.
The imaging system is controlled by a LabVIEW program that runs on a standard PC [shown in Fig. 2(a)]. The MEMS controller amplifies voltage signals from a USB port on the PC to scan the MEMS mirror with a triangular waveform. A field-programmable gate array (FPGA)-based frame grabber (NI PCIe-1473R) collects video data from the Basler detector and stitches the lines to form 2-D images. When the LabVIEW program is started, the initial data acquisition is triggered by the next available horizontal synchronization (HSYNC) signal from the detector, and the scanning of the MEMS mirror is triggered through software. Since these two tasks are triggered by two unsynchronized trigger sources, a MATLAB script is embedded in the LabVIEW program to provide software-based synchronization to prevent image drifting.
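The row binning and line stitching described above can be illustrated with a short NumPy sketch. This is not the FPGA/LabVIEW implementation used in the device: the function names are hypothetical, and binning by summation (rather than averaging) is an assumption, since the text only states that the 4 × 2048 region of interest is binned to a 1 × 2048 output.

```python
import numpy as np

ROI_ROWS = 4        # rows in the detector region of interest (from the text)
LINE_PIXELS = 2048  # pixels per line (from the text)


def bin_roi_to_line(roi: np.ndarray) -> np.ndarray:
    """Collapse a (4, 2048) region of interest into one 2048-pixel line.

    Assumption: binning is done by summing the rows; the source does not
    say whether the rows are summed or averaged.
    """
    if roi.shape != (ROI_ROWS, LINE_PIXELS):
        raise ValueError(f"expected shape {(ROI_ROWS, LINE_PIXELS)}, got {roi.shape}")
    # Use a wider integer type so the sum of four pixel values cannot overflow.
    return roi.sum(axis=0, dtype=np.uint32)


def stitch_lines(lines) -> np.ndarray:
    """Stack successive binned lines into a 2-D frame, one row per scan position."""
    return np.vstack(list(lines))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic detector readouts standing in for real camera data.
    readouts = (rng.integers(0, 1024, size=(ROI_ROWS, LINE_PIXELS), dtype=np.uint16)
                for _ in range(8))
    frame = stitch_lines(bin_roi_to_line(r) for r in readouts)
    print(frame.shape)
```

In a real system the number of lines per frame would be set by the MEMS scan period and the detector line rate, which is why the two must be synchronized to prevent image drift.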
Fig. 1 (a) Optical circuit of the handheld LS-DAC microscope. The illumination and collection beams are depicted in blue and green, respectively. The mirrors (M1 and M2) are used to align the dual-axis beams such that they intersect at the back focal plane of the custom objective, which relays the beams from the back focal plane (at the left side of the objective) to the front focal plane in tissue (right side) with 3× magnification. The focusing angle of the beams in tissue, α, and crossing angle, θ, enable high-contrast optical sectioning with micron-scale resolution (see text). The lower right inset shows a cross-sectional view of the lens cap.
The image of a reflective 1951 USAF resolution test chart, shown in Fig. 3(a), shows the ability of the microscope to resolve features at the micron scale. Figures 3(b) and 3(c) show the plots of the axial response to a flat mirror on a linear and log scale, respectively. The FWHM optical-sectioning thickness at the center of the field of view (FOV) is measured to be ∼1.72 μm. Figure 3(c) shows that the background signal in the axial-response plot, as the mirror is translated away from the focal plane, is ∼0.01% of the maximum signal from a mirror located at the focal plane. This is a significant improvement from our initial miniature device, [17] which leads to improved contrast and imaging depth in tissues. Note that there is vignetting at the edges of the FOV due to slight field curvature introduced by the scanning MEMS mirror.
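As an illustration of how an FWHM value such as the one above can be extracted from a sampled axial-response curve, the following sketch locates the half-maximum crossings by linear interpolation. It is one common approach and not necessarily the procedure used for Fig. 3; the function name is hypothetical, the background is assumed to be negligible, and the Gaussian test curve is synthetic rather than measured data.

```python
import numpy as np


def fwhm(z, signal) -> float:
    """Full width at half maximum of a single-peaked curve.

    The half-maximum crossings on each side of the peak are found by
    linear interpolation between neighboring samples. The background
    level is assumed to be negligible (no baseline subtraction).
    """
    z = np.asarray(z, dtype=float)
    s = np.asarray(signal, dtype=float)
    half = s.max() / 2.0
    above = np.where(s >= half)[0]
    left, right = above[0], above[-1]
    if left == 0 or right == len(s) - 1:
        raise ValueError("curve does not fall below half maximum on both sides")
    # Rising edge: crossing lies between samples left-1 and left.
    z_left = z[left - 1] + (half - s[left - 1]) * (z[left] - z[left - 1]) / (s[left] - s[left - 1])
    # Falling edge: crossing lies between samples right and right+1.
    z_right = z[right] + (half - s[right]) * (z[right + 1] - z[right]) / (s[right + 1] - s[right])
    return z_right - z_left


if __name__ == "__main__":
    # Synthetic Gaussian axial response with a 1.72-um FWHM (illustrative only).
    target_fwhm_um = 1.72
    sigma = target_fwhm_um / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    z_um = np.linspace(-10.0, 10.0, 2001)
    response = np.exp(-z_um**2 / (2.0 * sigma**2))
    print(f"Estimated FWHM: {fwhm(z_um, response):.2f} um")
```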
To demonstrate the ability of our device to acquire label-free reflectance images in vivo, we imaged human facial skin and oral mucosa from healthy volunteers at the Memorial Sloan-Kettering Cancer Center under IRB approval and with patient consent. All images were acquired in real time at a frame rate of 15 frames/s. For direct comparison to imaging with point scanning, we also imaged with a handheld point-scanned single-axis confocal (PS-SAC) microscope, the VivaScope 3000 (Caliber I.D. Inc., Andover, Massachusetts), which has a frame rate of 7 frames/s. Figures 4(a) and 4(e) show photographs of the two devices as a size comparison. Figures 4(b) and 4(c) show the distinct morphological features in the facial skin, such as the stratum spinosum (red arrow) around a hair follicle (green arrow) in Fig. 4(b) and the epidermis (red arrow) and dermal papillae (green arrow) at the dermal-epidermal junction in Fig. 4(c). Supplemental videos demonstrate the ability to manually adjust the lateral position and imaging depth of the handheld device smoothly in real time (see Videos 1 and 2). Figure 4(d) shows distinct hyperreflective nuclei (red arrow) in the squamous cells of the labial mucosa (see Video 3). Figures 4(f)-4(h) show reflectance images of similar features obtained at the same skin and oral mucosa sites with the PS-SAC microscope. Visual comparison shows that optical sectioning and resolution are preserved for the LS-DAC approach and are comparable to those of the PS-SAC approach down to the basal cell layer (∼50- to 150-μm depth), which confirms our earlier modeling and experimental measurements. [12][13][16]
Compared to the images collected by the PS-SAC (VivaScope 3000) device, speckle noise is more apparent in the images collected by the handheld LS-DAC microscope. This is, in part, due to a narrower confocal slit, which preserves resolution and thin sectioning, but at the cost of higher speckle contrast. [18] Future devices can mitigate speckle noise, if desired, by increasing the physical slit width and/or altering the magnification of the collection optics, with the attendant trade-offs described in the literature. [18] The utilization of an incoherent light source (if bright enough) can also suppress speckle noise, but with some trade-offs in resolution. [19]
As described in previous publications, the SBR of a point-scanned confocal architecture is superior to that of a line-scanned architecture due to the loss of confocality along the focal line of a line-scanned device. [12][13][15] However, the dual-axis configuration partially mitigates this deterioration in SBR because the illumination and collection beams are spatially separated, except where they intersect at their respective foci. [12][13] This is seen in the image comparisons between the LS-DAC device and the PS-SAC VivaScope device, in which the PS-SAC device exhibits slightly improved contrast (SBR) at deeper depths, such as at the dermal-epidermal junction in Figs. 4(c) and 4(g). Since the LS-DAC device is smaller than the VivaScope device, diffraction noise (sidelobes) from lens apertures may be more severe and could be an additional source of minor degradations in contrast (SBR). This is a consequence of the fact that large beam diameters are desired to achieve relatively large NAs (high resolution) with a long working distance but are severely constrained in a miniature system, in which all optical components (including lens apertures) are necessarily small. However, at shallow depths, image quality is comparable between the LS-DAC and PS-SAC approaches, as expected and as previously shown with tabletop systems. [16]
In summary, we have developed a fully packaged handheld LS-DAC microscope, with pressure-sensitive depth control (via a lens cap design), which is the first device of its kind to be used for in vivo imaging of human skin and oral mucosa.
Compared to previous miniature PS-DAC microscopes (4 frames/s), this handheld LS-DAC microscope has a much higher frame rate (15 frames/s), with reduced motion artifacts, and improved axial and lateral resolution (FWHM of 1.7 and 1.1 μm, respectively). [20][21] As mentioned previously, maximizing the frame rate of a handheld device is critical for minimizing motion artifacts during handheld use and also for enabling effective video mosaicking. [22] Note that for the reflectance-based LS-DAC device developed here, the imaging speed is limited not by signal-to-noise ratio (low photon counts) but rather by the limited readout rate of the detector used in the device. In the future, higher imaging speeds of >30 frames/s should be possible with higher-speed linear array detectors.
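To make the readout-rate limit concrete, the sketch below relates frame rate to detector line rate for a line-scanned system. It is a simplified model that assumes every detector line contributes to the image (no mirror turnaround or dead time). The number of lines per frame is a hypothetical placeholder, since the letter does not state it, and the function names are likewise illustrative.

```python
def frame_rate_hz(line_rate_hz: float, lines_per_frame: int) -> float:
    """Frame rate of a line-scanned imager when detector readout is the bottleneck."""
    if lines_per_frame <= 0:
        raise ValueError("lines_per_frame must be positive")
    return line_rate_hz / lines_per_frame


def required_line_rate_hz(target_fps: float, lines_per_frame: int) -> float:
    """Detector line rate needed to reach a target frame rate."""
    if lines_per_frame <= 0:
        raise ValueError("lines_per_frame must be positive")
    return target_fps * lines_per_frame


if __name__ == "__main__":
    LINES_PER_FRAME = 1000  # hypothetical value, not taken from the letter
    for fps in (15, 30):
        rate = required_line_rate_hz(fps, LINES_PER_FRAME)
        print(f"{fps} frames/s needs about {rate:.0f} lines/s")
```

Under this model, doubling the frame rate at a fixed number of lines per frame requires doubling the detector line rate, which is consistent with the expectation that faster linear array detectors would raise the achievable frame rate.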
Ultimately, the LS-DAC design architecture has allowed us to develop a system with a miniature form factor that is conducive to clinical use for imaging skin and oral mucosa, and potentially other exposed tissues, such as during surgical resection procedures. The benefits in terms of imaging speed, while coming at the cost of a slight reduction in contrast, make the LS-DAC approach an attractive design choice. In the future, we plan to implement real-time mosaicking algorithms to provide users with instant feedback to comprehensively image a larger area of interest (e.g., a suspicious oral lesion) while avoiding redundant imaging of certain regions. Large clinical studies are also needed to assess the sensitivity and specificity of our device for detecting various malignancies.

Disclosures
Dr. Milind Rajadhyaksha is a former employee of and owns equity in Caliber ID (formerly, Lucid Inc.), the company that manufactures and sells the VivaScope confocal microscope.