En-face optical coherence tomography/fluorescence endomicroscopy for minimally invasive imaging using a robotic scanner

Abstract. We report a compact rigid instrument capable of delivering en-face optical coherence tomography (OCT) images alongside (epi)-fluorescence endomicroscopy (FEM) images by means of a robotic scanning device. Two working imaging channels are included: one for a one-dimensional scanning, forward-viewing OCT probe and another for a fiber bundle used for the FEM system. The robotic scanning system provides the second axis of scanning for the OCT channel while allowing the field of view (FoV) of the FEM channel to be increased by mosaicking. The OCT channel has resolutions of 25/60 μm (axial/lateral) and can provide en-face images with an FoV of 1.6 × 2.7 mm². The FEM channel has a lateral resolution of better than 8 μm and can generate an FoV of 0.53 × 3.25 mm² through mosaicking. The reproducibility of the scanning was determined using phantoms to be better than the lateral resolution of the OCT channel. Combined OCT and FEM imaging were validated with ex-vivo ovine and porcine tissues, with the instrument mounted on an arm to ensure constant contact of the probe with the tissue. The OCT imaging system alone was validated for in-vivo human dermal imaging with the handheld instrument. In both cases, the instrument was capable of resolving fine features such as the sweat glands in human dermal tissue and the alveoli in porcine lung tissue.


Introduction
Recent advances in surgical techniques, particularly minimally invasive keyhole oncological procedures, have brought significant improvements in patient outcomes, with shorter recovery times and fewer complications. 1 A potential avenue for further improvement is the introduction of intraoperative image guidance, which would allow identification of tumor margins in situ. This could enable a complete resection of cancerous tissue while preserving as much healthy tissue as possible (e.g., Ref. 2). Provided that high-quality rapid diagnosis can be achieved, "optical biopsy" techniques could offer a real-time alternative to current intraoperative techniques such as time-consuming and expensive frozen section biopsy.
To compete effectively with conventional excisional biopsy and histology or frozen section, optical biopsy techniques must be able to provide comparable diagnostic information, in or near real time, and should be minimally invasive and easy to integrate with the clinical workflow. Where morphological imaging is used, the technique must provide sufficient resolution to allow identification of relevant tissue features, and sufficient field of view (FoV) for an effective reading of the images. A number of optical biopsy techniques have been proposed, ranging from low-resolution wide-field fluorescence imaging 2 to point-based spectroscopic measurements. 3 One such promising candidate is optical coherence tomography (OCT). This is an inherently noncontact optical technique using low-power, near-infrared light, allowing full three-dimensional (3-D) imaging of the tissue with a high penetration depth of typically 1 to 2 mm, and having imaging rates of a few kilohertz (with volume rates of a few hertz 4). To allow for in-situ imaging, a wide range of side-viewing and forward-viewing (FV) endoscopic OCT probes have been developed 5 and potential surgical applications have been identified (e.g., Ref. 6). Depending on the kind of tissue being investigated, the imaging contrast provided by OCT might not be sufficient for clinical diagnosis, since it stems solely from single backscattering events and changes in the refractive index. Therefore, it is advisable to combine OCT with other optical and nonoptical modalities, [7][8][9][10][11] to enable more comprehensive characterization of the tissue properties.
Another approach to optical biopsy is the family of endoscopic fluorescence microscopy techniques, including both wide-field (epi)-fluorescence endomicroscopy (FEM) 12,13 and fluorescence confocal laser endomicroscopy (CLE). 14,15 FEM/CLE have demonstrated the potential to distinguish and grade cancerous tissue in situ and have been suggested as potential intraoperative tools for rapid tissue imaging in breast surgery, 16 neurosurgery, 17 and laparoscopic surgery. 18 Fiber imaging bundles, with or without distal micro-optics, can be used as thin, flexible, and passive FEM/CLE probes, removing the need for any distal scanning systems. However, a downside is that the FoV of fiber bundle-based FEM/CLE probes is small, typically 0.25 to 0.8 mm, depending on the selection of the distal optics.
To address this limitation, several systems for mechanical and robotic scanning of optical biopsy probes have been proposed. For example, Rosa et al. 19 developed a system for surgical imaging that increased the effective FoV by scanning a CLE probe in a spiral pattern, using visual servoing to optimize the trajectory. Zhang et al. 20 integrated CLE and OCT probes with the da Vinci surgical robot, demonstrating closed-loop scanning using information from both imaging channels. Zuo et al. 21,22 demonstrated prototypes of several open-loop scanning systems specifically designed for intraoperative breast imaging. The operation of such systems has indicated the potential for a scanning FEM/CLE system to obtain images over significant areas of tissue.
FEM/CLE and OCT offer potentially complementary imaging modes. FEM/CLE provide high-resolution two-dimensional (2-D) surface scans but with little penetration depth and a small FoV. OCT generally has lower resolution but offers 3-D volumetric imaging, with penetration of 1 to 2 mm into tissue (although real-time OCT imaging typically comprises 2-D cross-sectional "B" scans, particularly in endoscopic implementations). Taking full advantage of the respective strengths of OCT and FEM/CLE requires a dual-modality imaging probe, combining OCT with fluorescence imaging. [8][9][10][11] A dual-modality probe would have the benefit of enabling en-face OCT views, in addition to cross-sectional OCT images. This would provide more flexibility in imaging while also permitting coregistration with the inherently en-face fluorescence images. While OCT is capable of delivering low frame rate en-face slices extracted from 3-D volumes, this is particularly challenging for miniaturized surgical imaging systems. In particular, given that en-face images are normally constructed by raster scanning the beam over the sample, two orthogonally placed scanners are necessary. While this is common in bulk, bench-top OCT systems for a variety of applications, ranging from ophthalmic imaging systems to handheld dermal probes, introducing two scanning directions in a small form-factor, minimally invasive probe is not straightforward. Typically, it involves a cantilevered fiber being driven by either a piezoelectric device over a Lissajous or spiral pattern [23][24][25][26] (which however requires high voltages to be operated), or employing small AC motors and exploiting the resonance properties of the cantilevered fiber. 27,28 Approaches using coherent imaging fiber bundles, with no distal scanning, have also been reported, 29 but the image quality is compromised by cross talk between fiber cores, the few-mode behavior of most fiber bundles, and their high numerical apertures (NA).
Although side-viewing geometries in multimodal imaging probes are now commonplace, 8,10,11 there are fewer reports on FV multimodal probes. Ryu et al. 9 have reported a double-clad fiber (DCF)-based probe, which can simultaneously deliver OCT and fluorescence imaging; however, no in-probe scanning optics were used, with the sample being translated relative to the probe in order to construct an OCT/fluorescence en-face image.
In this paper, we attempt to address the issue of en-face OCT imaging in a small-scale, minimally invasive probe combined with a fiber bundle-based FEM system. This is achieved by the use of a rigid robotic scanning device, originally designed for CLE and laser ablation, 30 into which a one-dimensional (1-D) FV OCT probe 27,31 and an FEM fiber bundle are introduced. The robotic device enables en-face OCT imaging by providing a second, orthogonal scanning direction for the OCT probe.
In addition, to compensate for the small FoV of the FEM probe, a mosaicking algorithm 32 is used to fuse the FEM images obtained during scanning of the robotic device. Since the robotic scanning device has a relatively small physical footprint, it can be used handheld, which brings benefits in terms of cost and surgical workflow. 33 The system is described in Sec. 2, which details (i) the robotic scanning device, (ii) the OCT 1-D FV probe and system, (iii) the FEM fiber bundle probe, and (iv) system integration and software. The system is fully characterized in Sec. 3, and examples of images acquired with both arm-mounted and handheld operation are presented and discussed in Sec. 4.

Experimental Setup
A schematic representation of the robotic scanning OCT/FEM system is shown in Fig. 1(a). The computer workstation (PC Specialist custom build, Intel i7-5960X octo-core processor, 16 GB RAM) is used for the acquisition and control of the three subsystems: motor control for the robotic scanning device, the OCT endoscopic scanning system, and the FEM imaging system. These are detailed in Secs. 2.1-2.3, respectively. Custom LabVIEW™ (National Instruments, Austin, Texas) software was devised to control the three subsystems and to acquire and process the data, as described in Sec. 2.4.
The two fiber probes (the OCT 1-D FV probe and the fiber bundle for the FEM imaging system) are placed side by side inside the tip of the robotic scanning device [inner diameter of 2.7 mm; outer diameter (OD) of 3.3 mm along the shaft; and 3.7 mm at the tip], as shown in Fig. 1(c), and are shown individually in the photograph in Fig. 1(b). In Figs. 1(b) and 1(c), (i) indicates the robotic scanning device shaft, (ii) indicates the OCT probe [projecting a linear scanning pattern (iv) on the infrared detection card], and (iii) indicates the FEM fiber bundle [projecting a divergent blue/violet beam (v)].

Robotic Scanning Device
The robotic scanning device, which may be mounted on a robotic arm for autonomous or hands-on operation, or used in a handheld fashion, is described in Refs. 30 and 34. In summary, it consists of a 58-mm-long hollow steel tube, with 3.3 mm OD, through which imaging probes can be passed, mounted inside a 3-D printed case. The tube is fixed in place at the back of the case while the distal (tissue) end protrudes from the case and is free to move. Partway along the length of the tube, at the front of the case, the tube is fixed to a cam-roller assembly, which allows it to be deflected in two dimensions by the rotation of two brushless DC servomotors. This allows the tip of the tube to be moved over a 2-D workspace of up to 14 mm² with an absolute positioning accuracy of better than 30 μm. The tube is offset within the assembly so that it is always under load throughout the entire workspace, minimizing backlash and hysteresis effects.
The robotic scanner is driven via two analog input signals applied to the motor controllers, which are mounted in a separate casing to minimize the weight of the scanner. For the study reported here, only 1-D motion was required. For ease of alignment of the OCT scanning probe direction, this motion was not necessarily along any of the intrinsic axes of the scanner. The scan was, therefore, performed by sending linear ramps to both motors, scaled by cos θ and sin θ, where θ is the angle of the OCT 1-D scanning line with respect to the axis of the scanner.
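The angled 1-D scan described above can be sketched in a few lines. This is a minimal illustration rather than the LabVIEW implementation used in the study: the function name `motor_ramps`, the normalized drive units, and the number of samples are all assumptions introduced here.

```python
import numpy as np

def motor_ramps(theta_rad, amplitude, n_samples):
    """Return the two drive ramps for a linear scan along angle theta.

    amplitude is the full excursion along the scan line, in arbitrary
    (hypothetical) drive units; the real controller scaling is not modeled.
    """
    ramp = np.linspace(0.0, amplitude, n_samples)  # common linear ramp
    # Scale the same ramp by cos(theta) and sin(theta) for the two motors
    return ramp * np.cos(theta_rad), ramp * np.sin(theta_rad)

# Example: 400 steps along a line at 2.7 rad
x_drive, y_drive = motor_ramps(theta_rad=2.7, amplitude=1.0, n_samples=400)
```

Because both motors receive the same ramp with different scale factors, the tip moves along a straight line whose direction is set by θ alone; replacing θ with θ + π reverses the direction of travel along the same line.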
Journal of Biomedical Optics 066006-2 June 2019 • Vol. 24 (6)
At the tip of the tube, a holder with an OD of 3.7 mm is fixed, originally designed to hold a Mauna Kea Cellvizio Gastroflex UHD endomicroscopy probe (diameter of 2.6 mm) and a laser ablation fiber (diameter of approximately 0.7 mm), as seen in the photograph in Fig. 1(c). For this study, the OCT probe was passed through the endomicroscopy channel and the FEM probe was passed through the laser ablation channel of the original design. The axial positions of the probes were fixed at the rear end of the device case. The OCT probe was mounted with its tip slightly (∼1 to 2 mm) behind the FEM fiber bundle tip, which was placed flush against the distal end of the 3.7 mm OD holder.

Optical Coherence Tomography Endoscopic Scanning Subsystem
The OCT endoscopic scanning subsystem shown in Fig. 1(e) is a swept-source-based system (SS-OCT) that incorporates an FV, 1-D scanning endoscopic probe in the object arm. This probe, the principle of which has been described elsewhere, 31,35 is based on the voice coil principle, employing the optical fiber as a cantilever, which is attached to the electrical coil, as pictured in Fig. 2. The electrical coil is placed around a magnetic system comprising two magnets with the same poles facing each other. The probe operates in an open-loop configuration, with no position sensing. The fiber tip is imaged onto the sample by a gradient-index (GRIN) lens with a magnification of 5×, with a working distance of 1 mm. By applying an alternating electrical current through the coil, a force is generated that laterally shifts the fiber tip, creating a raster scan pattern of up to 2 mm on the sample. The mechanical dimensions of the probe are provided in Fig. 1(b). The part labeled as (ii) has an OD of 1.81 mm at the metal tip, which has a rigid length of 13.20 mm. Elsewhere, the probe tubing has an OD of 1.71 mm.
Briefly, the SS-OCT system is driven by a 100 kHz A-scan rate, 1310 nm wavelength swept-source, with a bandwidth of 110 nm (Axsun Technologies, Billerica, Massachusetts), the output of which is directed to a fiber-based Mach-Zehnder interferometer, whose arms include the 1-D scanning endoscopic probe and a fiber-based reference arm. The optical power incident on the sample is ∼2.8 mW. Detection is performed by a balanced InGaAs photodetector, BPD (PDB435C-AC, 350 MHz cut-off frequency, from Thorlabs, Newton, New Jersey), whose electrical signal is digitized after a passive high-pass filter (10 MHz cut-off frequency) by a 12-bit high-speed PCIe digitizer (AlazarTech ATS9350, up to 500 MS/s sampling rate, Pointe-Claire, Quebec, Canada), configured with a voltage range of ±1 V and digitizing 2048 points per spectral sweep at 500 MS/s.
To produce OCT images, data are processed using the Complex Master-Slave (CMS) method. 36 As the CMS method does not need a clock, the maximum sampling speed of the digitizer can be used. 37 No hardware-based dispersion compensation is required because CMS is inherently tolerant to dispersion mismatches in the interferometer. 38 The CMS method replaces the conventional Fourier-transform processing with comparison operations between the channeled spectrum from the sample and a set of masks. These masks are channeled spectra delivered by the same interferometer for different optical path differences, obtained using a mirror instead of the sample. As the channeled spectra involved in the comparison operation are produced by the same interferometer, such a procedure is tolerant to chirped channeled spectra, irrespective of whether such chirp is due to nonlinear sweeping or dispersion in the interferometer. 39 The B-scan and C-scan images produced via CMS processing were displayed as 8-bit integers with full dynamic range conversion in the LabVIEW IMAQ display control.
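The mask-comparison idea can be illustrated with a toy simulation. The sketch below is not the published CMS algorithm: the masks are synthesized analytically from an assumed chirp function `g`, whereas in the method described above they are derived from channeled spectra measured with a mirror. The depth units, the chirp coefficient, and all function names are hypothetical.

```python
import numpy as np

N = 2048                          # samples per sweep (value quoted in the text)
t = np.linspace(0.0, 1.0, N)      # normalized sweep time
g = t + 0.3 * t**2                # assumed nonlinear (chirped) sweep function

def channeled_spectrum(depth):
    """Real channeled spectrum of a single reflector at `depth` (arbitrary units)."""
    return np.cos(2 * np.pi * depth * g)

def make_masks(depths):
    """One complex mask per depth, carrying the same chirp as the spectra."""
    return np.exp(1j * 2 * np.pi * np.outer(depths, g))

def reconstruct(spectrum, masks):
    """Compare the spectrum with every mask; one amplitude per mask depth."""
    return np.abs(masks.conj() @ spectrum) / len(spectrum)

# Only the depths of interest are reconstructed, not the full range
depths = np.arange(20, 200, dtype=float)
masks = make_masks(depths)
a_scan = reconstruct(channeled_spectrum(75.0), masks)
peak_depth = depths[np.argmax(a_scan)]
```

Because the masks share the chirp of the measured spectrum, the reflector is recovered at the correct depth without resampling, and the cost scales with the number of mask rows, so restricting `depths` to a short interval reduces the processing load accordingly.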

Fiber Bundle Fluorescence Endomicroscopy Imaging Subsystem
The FEM uses a flexible fiber bundle (Fujikura FIGH-30-650S) comprising ∼30,000 cores to deliver and collect light from a sample, as shown diagrammatically in Fig. 1(f). As no distal optics were used, the fiber bundle needed to be in direct contact with the sample. A Texas Instruments Lightcrafter 3000 digital micromirror device (DMD) was used as an illumination source to project uniform blue light onto the sample. The blue light is produced by a light-emitting diode (LED) incorporated within the DMD (with central wavelength of 450 nm and optical power of 32 mW at 450 nm). The output beam from the DMD traverses a low-pass filter (with a 450 nm cut-off), a dichroic mirror, and a 10× microscope objective to deliver 4.3 mW incident on the proximal end of the fiber bundle. While the DMD was intended to generate structured light patterns, for this study it was simply used to deliver flood illumination. The magnification between the DMD pixels and the fiber bundle was sufficient so that the illumination was effectively uniform, as though the bundle were directly illuminated by the LED. The returning fluorescence is imaged onto a CMOS camera (PointGrey Flea 3) via the dichroic and an emission filter (cut-off >500 nm). The raw image is cropped to a circular diameter of 525 μm, and Gaussian-filtered (σ = 1.2 μm) to remove pixelation due to the fiber bundle cores. Images were acquired at 20 fps.

System Integration and Real-Time Display
As mentioned earlier, the three subsystems are controlled by a single workstation PC. The acquired data from the OCT and FEM subsystems are processed by this PC, with the option of displaying both OCT and FEM en-face frames in real-time during acquisition through a custom-made LabVIEW™ virtual instrument (VI) interface. A screenshot of the graphical user interface of this VI is presented in Fig. 3, with a video demonstration of the real-time operation in Video 1. A number of parameters and settings are indicated in Fig. 3. The OCT channel output can be displayed as a live B-scan image by reconstructing each depth using the CMS algorithm, 36 which is equivalent to the conventional approach of performing a fast-Fourier transform (FFT) plus any additional dispersion compensation/resampling of the raw B-scan data. This is computationally expensive and unnecessary if an en-face OCT image is preferred (which would only require a few depths out of the whole depth range). To mitigate this issue, the "fast scan" mode in Fig. 3(vi) makes use of the unique property of the CMS method whereby it can reconstruct a subset of the depths with a reduced processing time. The position of this depth interval can be specified by the depth position control, as shown in Fig. 3(iii), and its range can be adjusted by the average control, as shown in Fig. 3(ix). In this way, rather than generating an entire B-scan for each position of the robotic scanner, which would be the conventional approach using FFT-based reconstruction, only a small subset of the B-scan is generated at the required depth. When the "average" parameter is set to 1, this is simply a 1-D array (a lateral T-scan). When the average value is >1 (i.e., the reconstructed interval has an axial range >1 pixel), the T-scan is created by averaging over this number of depth points. The full OCT en-face image in Fig. 3(ii) at this depth is then constructed from these T-scans. 
In parallel, the entire dataset can be recorded, allowing subsequent offline reconstruction of volumes, B-scans, and en-face images at any depth. The 1-D OCT scanner is driven by a sinusoidal waveform; therefore, each raw T-scan is sampled nonlinearly along the lateral scanning function. This was corrected by a resampling algorithm, which can be toggled by control, as represented in Fig. 3(vii).
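A possible form of the sinusoidal resampling step is sketched below. It assumes that each T-scan spans exactly one half-period of the drive waveform with no phase lag between drive and fiber motion, which is an idealization; the function name `resample_sinusoidal` and the test profile are introduced here for illustration only.

```python
import numpy as np

def resample_sinusoidal(t_scan):
    """Map a T-scan sampled uniformly in time over one half-period of a
    sinusoidal sweep onto a grid that is uniform in lateral position."""
    n = len(t_scan)
    phase = np.linspace(-np.pi / 2, np.pi / 2, n)  # drive phase at each sample
    x_actual = np.sin(phase)                       # normalized position, -1..1
    x_uniform = np.linspace(-1.0, 1.0, n)          # desired uniform positions
    return np.interp(x_uniform, x_actual, t_scan)

def true_profile(x):
    """Hypothetical reflectivity profile, linear in lateral position."""
    return 2.0 * x + 1.0

n = 500                                            # A-scans per B-scan, as in the text
raw = true_profile(np.sin(np.linspace(-np.pi / 2, np.pi / 2, n)))
corrected = resample_sinusoidal(raw)
```

In this idealized case the corrected T-scan reproduces the profile on a uniform grid. Any phase offset or amplitude error in the assumed drive function would leave residual distortion, most visibly near the turning points of the sweep where the samples are most densely packed.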
To improve the small FoV allowed by the fiber bundle [0.275 mm², as shown in Fig. 3(iv)], a mosaicking algorithm similar to previously reported approaches 32,40 was employed to stitch frames acquired during a single scan of the robotic device [as shown in Fig. 3(v)]. For real-time display, frames were registered pairwise using normalized cross correlation (NCC) (i.e., the shift of each frame was calculated relative to the previous frame). The mosaic was then formed by placing the frames dead-leaf (i.e., without blending) in their estimated shifted positions. For off-line reconstruction of saved datasets, the two-way registration described in Ref. 32 was employed and the frames were stitched using alpha blending to remove any visible seams between the images.
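The pairwise registration and dead-leaf placement can be sketched as follows. This is a simplified stand-in for the algorithms of Refs. 32 and 40, not a reproduction of them: it assumes rectangular frames, integer-pixel shifts, and an exhaustive search over a small window, and the names `ncc`, `estimate_shift`, and `dead_leaf_mosaic` are hypothetical.

```python
import numpy as np

def ncc(a, b):
    """Normalized cross correlation of two equally sized arrays."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def estimate_shift(prev, curr, max_shift):
    """Search integer offsets (dy, dx) such that curr[i, j] matches prev[i+dy, j+dx]."""
    h, w = prev.shape
    best = (-2.0, 0, 0)
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            # Overlapping regions of the two frames for this candidate offset
            p = prev[max(dy, 0):h + min(dy, 0), max(dx, 0):w + min(dx, 0)]
            c = curr[max(-dy, 0):h + min(-dy, 0), max(-dx, 0):w + min(-dx, 0)]
            score = ncc(p, c)
            if score > best[0]:
                best = (score, dy, dx)
    return best[1], best[2], best[0]

def dead_leaf_mosaic(frames, max_shift):
    """Place frames at cumulative pairwise shifts; later frames overwrite earlier ones."""
    pos = [(0, 0)]
    for prev, curr in zip(frames, frames[1:]):
        dy, dx, _ = estimate_shift(prev, curr, max_shift)
        pos.append((pos[-1][0] + dy, pos[-1][1] + dx))
    ys, xs = zip(*pos)
    h, w = frames[0].shape
    canvas = np.zeros((max(ys) - min(ys) + h, max(xs) - min(xs) + w))
    for (y, x), frame in zip(pos, frames):
        y0, x0 = y - min(ys), x - min(xs)
        canvas[y0:y0 + h, x0:x0 + w] = frame  # no blending
    return canvas, pos

# Synthetic check: frames cut from one scene with a 5-pixel step
rng = np.random.default_rng(0)
scene = rng.random((30, 60))
frames = [scene[:, 5 * i:5 * i + 30] for i in range(4)]
mosaic, positions = dead_leaf_mosaic(frames, max_shift=8)
```

Because each shift is accumulated from the previous one, registration errors add up along the mosaic, which is one reason a two-way registration and blending step may be preferred for off-line reconstruction.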
The robotic scanning device can potentially be run at high speeds, and the scan velocity can be specified, as shown in Fig. 3(ix). However, since both the OCT and the FEM channels have a limited acquisition rate, a compromise must be made between imaging speed and scan density. At a higher imaging speed (assuming the same scan range), the limited acquisition rate of the channels introduces undersampling in the OCT image along the scan direction (for the FEM channel, it is also necessary to ensure sufficient overlap between image frames, although in practice the OCT sampling requirement is the dominant factor). For real-time display, there was no buffering between the OCT acquisition system and the en-face imaging display system, which runs asynchronously. The sampling along the robotic scan direction was, therefore, not always linear, and so linear interpolation was used between T-scans to generate a uniformly sampled en-face image [activated by the checkbox in Fig. 3(x)].
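The interpolation between T-scans can be sketched as below, assuming that the slow-axis position of each acquired T-scan is known (for example, from a timestamp multiplied by the scan velocity). The function name `regrid_slow_axis` and the example positions are hypothetical.

```python
import numpy as np

def regrid_slow_axis(t_scans, y_positions, n_rows):
    """Interpolate T-scans recorded at non-uniform slow-axis positions
    onto a uniformly spaced grid.

    t_scans: array of shape (n_acquired, n_lateral)
    y_positions: increasing slow-axis positions, one per acquired T-scan
    """
    t_scans = np.asarray(t_scans, dtype=float)
    y_uniform = np.linspace(y_positions[0], y_positions[-1], n_rows)
    # Interpolate each lateral column independently along the slow axis
    cols = [np.interp(y_uniform, y_positions, t_scans[:, j])
            for j in range(t_scans.shape[1])]
    return np.stack(cols, axis=1), y_uniform

# Example: five T-scans at irregular positions (mm), regridded to 11 rows
y = np.array([0.0, 0.1, 0.35, 0.5, 1.0])
data = 2.0 * np.outer(y, np.ones(3))
image, y_grid = regrid_slow_axis(data, y, n_rows=11)
```

Linear interpolation restores uniform spacing but cannot recover detail lost to undersampling, so the scan velocity still has to be chosen with the acquisition rate in mind.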

System Validation

System Resolution and Imaging Range
The main parameters of the two imaging subsystems are shown in Table 1. The axial resolution of the OCT subsystem was measured to be ∼24 μm in air. By performing an analysis on the raw spectral shape of the spectra acquired, similar to what is presented in Appendix C of Rivet et al., 36 we have estimated the theoretical axial resolution to be about 16 μm. The poorer axial resolution obtained experimentally can be attributed to the spectral window which has been applied during the signal processing stage. This step was necessary to increase the signal-to-noise ratio (SNR) of the system, thus improving image quality. The OCT subsystem sensitivity was measured to be 84 dB. The sensitivity measurement procedure is a variation of that described by Bradu et al. 41 and Leitgeb et al. 42 Briefly, the endoscopic OCT probe was set so as to maximize the recoupling of the optical power returned from a mirror into the probe. The power P_high was then measured at one of the detection ports of the optical interferometer (into the BPD). Then, the mirror was replaced by a low-reflectance target (a block of brushed aluminum), and the optical power measurement was repeated, obtaining P_low. With this target in place, an A-scan was obtained, and the peak value p and the noise floor value n were extracted from it, leading to a peak-to-noise ratio value of 50 dB. From the measurements taken, the sensitivity S was calculated as

$$S = 20 \log_{10}\left(\frac{p}{n}\right) + 10 \log_{10}\left(\frac{P_{\mathrm{high}}}{P_{\mathrm{low}}}\right). \tag{1}$$

The lateral resolution of the OCT subsystem was measured using a United States Air Force (USAF) resolution target (Thorlabs R1DS1N). At the lateral range used for this study (1.6 × 2.7 mm), it was possible to resolve element 4.1, which is highlighted in Fig. 4(a). This puts the lateral resolution of the OCT subsystem at roughly 60 μm, which is expected due to the probe's limited NA.
For system characterization, the robotic scanning device was mounted on an adjustable arm and fixed in place over a table. Test samples were placed horizontally on the table and, except where described otherwise, fixed in place. For all the experiments reported here, the imaging size was kept constant. With the 1-D OCT probe being driven by a 5 V amplitude sine wave at 100 Hz (500 A-scans per B-scan acquired) and the robotic scanning device having a velocity setting of ∼0.76 mm/s for 400 buffered B-scans, the en-face image dimension was measured to be 1.6 × 2.7 mm² (x × y).
The FEM resolution is limited by the sampling density of the fiber bundle cores. For the fiber bundle used, the core spacing is ∼3 μm, leading to a Nyquist-limited resolution of approximately 6 μm (although this is shift and rotationally variant). In the image of the USAF target shown in Fig. 4(b), both orientations of element 7.1 can be clearly resolved, conservatively indicating a resolution of 7.8 μm or better. The diameter of each image is 525 μm and the nominal length of the mosaic, for a scan distance of 2.7 mm, is ∼3.25 mm.
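The resolution figures quoted for the USAF target follow from the standard 1951 USAF chart relation, in which the spatial frequency in line pairs per millimeter is 2^(group + (element − 1)/6). The short sketch below evaluates it, assuming that "element G.E" denotes group G, element E, and that the quoted resolution corresponds to the period of one line pair; the function names are introduced here for illustration.

```python
def usaf_line_pairs_per_mm(group, element):
    """Spatial frequency of a 1951 USAF target element, in line pairs per mm."""
    return 2.0 ** (group + (element - 1) / 6.0)

def usaf_period_um(group, element):
    """Width of one line pair (one bar plus one space), in micrometers."""
    return 1000.0 / usaf_line_pairs_per_mm(group, element)

fem_period_um = usaf_period_um(7, 1)   # element 7.1, FEM channel
oct_period_um = usaf_period_um(4, 1)   # element 4.1, OCT channel
nyquist_um = 2 * 3.0                   # two core spacings of ~3 um
```

Under these assumptions, element 7.1 corresponds to 128 line pairs/mm (a period of about 7.8 μm) and element 4.1 to 16 line pairs/mm (a period of 62.5 μm), consistent with the values of 7.8 μm and roughly 60 μm quoted for the two channels.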
Considering the separate nature of the two scanning directions of the OCT subsystem (one provided by the 1-D OCT probe and the other by the robotic scanner), it is important to understand whether there are significant differences in terms of resolution or distortion. To this end, a resolution target card (Edmund Optics Pocket USAF Optical Test Pattern) featuring a Ronchi Ruling Pattern with 1 line∕mm was imaged and an en-face representation was constructed at a depth just beneath the laminated layer. The card was manually oriented in such a way that the patterned lines were perpendicular to the scan direction of the 1-D OCT probe, as shown in Fig. 4(c). The card was then rotated by 90 deg and the same scanning protocol was carried out, yielding the en-face image shown in Fig. 4(d).
To examine whether the spacing between lines was maintained, Fig. 4(e) was constructed from Figs. 4(c) and 4(d).
The lines in the images were thresholded and colored; Fig. 4(c) is shown in red in its original orientation. Figure 4(d) was rotated by 90 deg and stretched to fit the vertical dimension of Fig. 4(c). Two versions were produced by translating the figure horizontally in order to match the lines with those of Fig. 4(c), shown in blue and gold.
While the spacing between lines does not seem to change appreciably between the two scanning directions, the behavior around the edges of the image in Fig. 4(d) is not linear; moreover, the bars themselves are slightly curved, as suggested by Fig. 4(d). This may be due to imperfect corrections in the sinusoidal resampling function along the lateral scanning direction, as mentioned in Sec. 2.4 (with some deviations at the edges of the image frame), and also to the operating principle of the 1-D OCT probe, 35 which necessarily introduces off-axis behavior and therefore a degree of curvature into the imaging plane.
In addition, some artifacts are present in the OCT en-face images [most evidently in Figs. 4(a) and 4(d)], in the form of lines and streaks along the robotic scan direction (the y-axis). These are caused by back-reflections in the OCT endoscope introduced by small imperfections (e.g., scratches) on the GRIN optics.
For the study presented in Figs. 4(c)-4(e), and to obtain any of the subsequent OCT en-face images, it is essential to ensure that the two independent scanning directions are approximately orthogonal to each other, as otherwise shear would affect the OCT images and the total imaged area would be reduced. While approximate alignment can be achieved by manually rotating the probe prior to fixing it in place, fine adjustment was achieved via an iterative procedure using a suitable resolution target. Since the robotic scanning device is able to scan in two dimensions (albeit not as fast as the 1-D OCT probe), the scanning protocol allows for a linear scan along any arbitrary angle θ. By changing this angle and observing the corresponding en-face OCT image, it is possible to calibrate the robotic scanning direction, as shown in Fig. 5. In practice, for a clinical device, the OCT probe would need to be modified so that it could only be inserted in the correct orientation.
In Figs. 5(a) and 5(b), two OCT en-face images are shown, acquired with different θ settings on the robotic scanning device (2.7 and 2.4 radians, respectively). For this particular case, it was found that the optimum setting for θ is 2.7 radians; it is clear from Fig. 5(b) that the USAF pattern is deformed, presenting some shearing. Figure 5(c) presents this effect more clearly, with four OCT en-face images superposed together (after the USAF patterns were thresholded out), with their corresponding initial angle θ setting ranging from 2.7 rad (red) to 2.4 rad (gold). It becomes clear that the larger elements present a larger shift (up to ∼280 μm) than the smaller elements in the bottom of the image, indicating a shear-based distortion.
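The growth of the shift with distance along the scan can be understood with a simple geometric model. The sketch below assumes that the image is displayed as if the two scan axes were orthogonal while the robotic axis actually deviates from orthogonality by a small angle; a feature at a given distance along the true slow axis then appears displaced laterally by that distance multiplied by the tangent of the deviation. This model and the function name are assumptions made for illustration, not an analysis taken from the study.

```python
import math

def shear_shift_um(slow_axis_distance_um, misalignment_rad):
    """Apparent lateral displacement of a feature located at the given
    distance along the slow axis, for a given deviation from orthogonality."""
    return slow_axis_distance_um * math.tan(misalignment_rad)

# Displacement at several distances for a hypothetical 0.3 rad deviation
shifts = [shear_shift_um(d, 0.3) for d in (0.0, 500.0, 1000.0)]
```

In this model the displacement is zero at the start of the scan and increases linearly with distance, which is the signature of a shear rather than a uniform translation.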
The robotic scanning instrument also presents good repeatability in terms of the scanning direction; in Fig. 5(d), an OCT en-face image for a setting of 2.7 + π radians is shown. Since the initial angle is varied by π, the B-scans making up the OCT volume are now buffered in reverse order, which explains why the image is flipped along the horizontal axis in relation to that in Fig. 5(a). To show that the angle adjustment does not appreciably affect the image, Fig. 5(e) shows a thresholded version of Fig. 5(a) (in red) overlaid on a thresholded and flipped version of Fig. 5(d) (in blue). It is clear that, apart from a slight vertical shift, amounting to ∼40 μm on average (which could be attributed to triggering and timing issues in the acquisition itself), the two images are well matched, which shows that the scanner is invariant to the direction of travel.

Repeatability Studies
To assess the scanning repeatability of the system, a procedure was devised involving an imaging phantom, which is schematically represented in Fig. 6(a). This phantom comprises a piece of fluorescent-stained lens tissue paper
Two different datasets were acquired, each having five volumes (five runs of the robotic scanning probe). In the first set, layer (i) was removed and the probe was allowed to run in a contactless manner (therefore, no FEM images were produced, only en-face OCT images). In the second set, both layers were present, and the probe was in contact with layer (i) throughout the whole span of the robotic scanning probe movement, and both FEM and en-face OCT images were then obtained.
To visually compare the different volumes acquired within each dataset, the ImageJ Colocalization Colormap plug-in 43 was used. This plug-in assigns a separate color (either red or green) to each of the two volumes, with any overlapping points shown in yellow. The results from the first dataset (comparing the first with the last volume) are shown in Figs. 6(b1)-6(b2) (OCT B-scans) and Fig. 6(b3) (en-face OCT), and the results from the second dataset (also comparing the first with the last volume) are shown in Figs. 6(c1)-6(c3) (FEM) and Figs. 6(d1)-6(d2) (en-face OCT).
In Fig. 6(b3), the composite en-face OCT image appears to have a gradient of color, ranging from red at the top to green at the bottom. We believe that this is caused by an axial shift, rather than lateral shift, as shown by the two B-scans in Figs. 6(b1) and 6(b2). Across the five volumes, this axial shift ranges from 50 to 100 μm, and it seems to be more pronounced in the case when the probe is run without contact, i.e., when layer (i) is removed.
To quantify the lateral pixel shifts, the NCC was computed across the en-face OCT images [Figs. 6(b3) and 6(d1)-6(d2)] extracted from the five volumes for each of the two datasets. In addition, in the dataset where the two layers are present, the NCC was computed across the FEM data [Figs. 6(c1)-6(c2)]. Figure 7 depicts 2-D maps of the NCC values obtained when each image in a set is taken in turn as the reference and compared with the other four (template) images. As expected, comparing each frame to itself yields an NCC value of 1, as shown by the values along the diagonals in all plots. It is also worth noting that the NCC values in all plots are symmetric about the diagonals, since all five images are considered both as references and as templates.
It was found that, for the OCT en-face frames, the average lateral pixel shift is less than the lateral resolution (60 μm). It was also found that the average of the NCC values (without including the values on the diagonal) is lower (∼0.76) for the layer depicted in Fig. 6(d1) than the average for the layers of Figs. 6(b3) and 6(d2) (∼0.83 and ∼0.88, respectively), possibly due to the probe dragging and deforming layer (i) during each scan. The higher average NCC value in Fig. 6(d2) may be due to the relative stability conferred by the probe contact in terms of axial movements. Moreover, the NCC analysis for the FEM data [ Fig. 7(b)] yielded the highest NCC values (∼0.93), with average lateral shifts of around 10 to 11 μm. This seems to be consistent with the standard deviation of the measured mosaic lengths (∼13 μm).
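The structure of the NCC maps and of the off-diagonal averages can be illustrated with the following sketch. It computes only the zero-shift NCC between pairs of images and does not estimate lateral shifts, so it is a simplification of the analysis described above; the synthetic images and all function names are hypothetical.

```python
import numpy as np

def ncc(a, b):
    """Zero-shift normalized cross correlation of two equally sized images."""
    a = a - a.mean()
    b = b - b.mean()
    return float((a * b).sum() / np.sqrt((a * a).sum() * (b * b).sum()))

def ncc_matrix(images):
    """Pairwise NCC map: ones on the diagonal, symmetric off the diagonal."""
    n = len(images)
    m = np.ones((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            m[i, j] = m[j, i] = ncc(images[i], images[j])
    return m

def mean_off_diagonal(m):
    """Average of all NCC values excluding the diagonal."""
    n = m.shape[0]
    return float((m.sum() - np.trace(m)) / (n * (n - 1)))

# Five noisy repeats of one synthetic image stand in for five scan runs
rng = np.random.default_rng(1)
base = rng.random((16, 16))
runs = [base + 0.1 * rng.standard_normal((16, 16)) for _ in range(5)]
ncc_map = ncc_matrix(runs)
average_ncc = mean_off_diagonal(ncc_map)
```

Excluding the diagonal matters because the self-comparisons always equal 1 and would otherwise inflate the average for a set of only five images.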

Coregistration between Channels
As shown in Fig. 8(a), the two fiber probes are laterally separated by ∼1.6 mm inside the robotic scanning device tube. Therefore, at any given moment, the OCT and the FEM subsystems image different regions of the sample. However, given the considerable size of the lateral scanning, especially in the y-direction (∼2.7 mm), it is possible to identify a region of the sample covered by both imaging subsystems. This overlapping region depends on the initial setup and calibration of the scanning procedure, particularly the rotation of the OCT probe and hence the scan angle required to generate an orthogonal scan with the robotic scanner. When placing the OCT and FEM probes inside the robotic scanner, we optimized the relative position of the probes so that the separation was roughly along the diagonal of the OCT image, in order to maximize the overlap between the two channels. Figures 8(b)-8(e) depict a study carried out with the Edmund Optics USAF test pattern. In this case, the surface of the card was stained with a fluorescent marker and the probe was allowed to run in near contact with the surface of the card. In this way, images of the USAF patterns were obtained in both channels.

Figure caption (fragment): (c) Comparison between the five en-face OCT frames whose depth is set to cover layer (i), as shown in Fig. 6(d1). (d) Comparison between the five en-face OCT frames whose depth is set to cover layer (ii), as shown in Fig. 6(d2).
The images were scaled so that the pixel sizes were the same, and then the FEM image was translated so that it would match the pattern observed on the OCT en-face image. The overlapped region is shaded in red in all images in Fig. 8. Approximately 7% to 8% of the OCT images were also included in the FEM mosaic. The same probe configuration was used for all other datasets presented in this paper, and so the overlap region remains the same.
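As a rough illustration of how such an overlap percentage can be estimated once both images share a pixel size, the sketch below treats the OCT en-face image and the FEM mosaic as axis-aligned rectangles and computes the fraction of the OCT area covered by the mosaic. The rectangle model, the function name `overlap_fraction`, and the offset used in the example are assumptions made for illustration; they are not the procedure used to obtain the 7% to 8% figure.

```python
def overlap_fraction(oct_size_mm, fem_size_mm, fem_offset_mm):
    """Fraction of the OCT en-face area that is also covered by the FEM mosaic.

    Both fields of view are modeled as axis-aligned rectangles given as
    (width_x, height_y) in mm. The OCT rectangle has its lower-left corner
    at the origin; fem_offset_mm is the lower-left corner of the FEM
    rectangle in the OCT frame (found in practice by translating the FEM
    image until the features match).
    """
    ow, oh = oct_size_mm
    fw, fh = fem_size_mm
    fx, fy = fem_offset_mm
    if ow <= 0 or oh <= 0:
        raise ValueError("OCT field of view must have positive size")
    # Extent of the intersection along each axis, clamped at zero.
    dx = max(0.0, min(ow, fx + fw) - max(0.0, fx))
    dy = max(0.0, min(oh, fy + fh) - max(0.0, fy))
    return (dx * dy) / (ow * oh)

# Field-of-view sizes are those quoted in the abstract; the offset is a
# made-up value and the result is not meant to reproduce the reported overlap.
example = overlap_fraction((1.6, 2.7), (0.53, 3.25), (1.45, -0.3))
```

A pixel-wise mask comparison would be needed for rotated or irregular mosaics; the rectangle model is only a first approximation.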
There is one important difference between the two acquired datasets shown in Figs. 8(b)/8(c) and 8(d)/8(e). In the latter, it seems that the mosaicked FEM image (d) has a shorter vertical span than the former (b). In fact, when the FEM image is translated and placed above the OCT en-face image (e), the top end of the shaded region does not agree with the features shown in (e), namely element 3.6. This is due to the mosaicking algorithm requiring features in the image to detect the movement. Since in (b) the probe was slightly more shifted to the right, it was able to cover features throughout the whole scanning range. In (d), on the other hand, there is a sizeable gap with no features between elements 3.6 and 2.1, which impacted the mosaicking algorithm by not registering motion where there should have been some. In principle, this problem could be mitigated by using the known velocity of the robotic scanner when there are insufficient image details for registration, although this procedure would not then deal with any tissue deformation.
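The mitigation suggested above can be sketched as follows: when the registration between two consecutive FEM frames is unreliable, the frame-to-frame displacement is replaced by the displacement expected from the scanner velocity. The confidence score, its threshold, and the function name `mosaic_positions` are hypothetical; the mosaicking algorithm used in this work is not reproduced here, and, as noted, such a fallback would not account for tissue deformation.

```python
def mosaic_positions(registered_shifts_um, confidences,
                     scanner_speed_um_s, frame_rate_hz,
                     min_confidence=0.5):
    """Cumulative frame positions along the robotic scan direction.

    registered_shifts_um[i] is the shift between frames i and i+1 found by
    image registration; confidences[i] is a registration quality score in
    [0, 1] (a hypothetical metric, e.g., a correlation peak value).
    When the score falls below min_confidence, the shift expected from the
    known scanner velocity is used instead.
    """
    if len(registered_shifts_um) != len(confidences):
        raise ValueError("one confidence value is needed per shift")
    if frame_rate_hz <= 0:
        raise ValueError("frame rate must be positive")
    expected_shift = scanner_speed_um_s / frame_rate_hz
    positions = [0.0]
    for shift, conf in zip(registered_shifts_um, confidences):
        step = shift if conf >= min_confidence else expected_shift
        positions.append(positions[-1] + step)
    return positions
```

For a featureless gap such as the one between elements 3.6 and 2.1, the registered shifts would be close to zero with low confidence, so the fallback keeps the mosaic length consistent with the commanded scan distance.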

Animal Tissue Experiments
To test the system in a biomedical imaging environment, ex-vivo animal tissue imaging was performed. Ovine kidney and porcine lung/esophagus were stained using the topical staining agent acriflavine hydrochloride (0.01% in water) for 60 s. The samples were then rinsed with water, as described in Ref. 16. During these imaging trials, the robotic scanning device was mounted on an arm attached to the table, and the probe was in contact with the sample, yielding OCT volumes and FEM mosaics, as shown in Fig. 9.
In Fig. 9(a1), a 3-D render of the full OCT volume is shown, with a good depiction of the tissue texture being present. Figure 9(a2) employs the same data but in a color-coded depth projection.
As discussed in Sec. 3.3, there is a region that is common to both the OCT images and the FEM mosaic. A correspondence in features can be observed in the boxed region in both Figs. 9(a2) and 9(a3), with a crevice in the surface of the tissue being visible on both images.
If the probe is prevented from running smoothly across the surface of the sample being imaged (due to snagging or extreme tissue deformation), as shown in some of the repeatability studies in Sec. 3.2, then artifacts will arise in both the OCT en-face image and the FEM mosaic. This can be observed in Figs. 9(a3)-9(a4), showing two image datasets collected from the same area of tissue; Fig. 9(a4) is evidently much shorter than Fig. 9(a3), despite the fact that the probe was run across exactly the same distance and at the same speed. This is because the probe dragged the tissue during scanning along the robotic device scanning direction. When the tissue moves with the probe, the length of the FEM mosaics, which are generated purely based on image registration, is reduced. Artifacts are also observed in the OCT en-face image in Fig. 9(a5) but of a different form. Because the OCT image is assembled based on the expected scanning motion, tissue deformation is exhibited as elongation of tissue features along the y-scanning direction, as shown in the regions delimited by the yellow dashed boxes.
Figures 9(b1)-9(b4) all correspond to the same section of porcine lung tissue acquired with both OCT and FEM subsystems. It is possible to identify individual alveoli (yellow arrows) in this particular piece of lung tissue in both the B-scan along the robotic scan direction (b2) and the en-face (b3) visualizations, with the individual widths ranging between 100 and 130 μm, which is consistent with the values presented in the literature. 44 Similar to (a4), the FEM mosaic does not have the full length consistent with the actual probe scan, but unlike the previous case, the probe does not appear to have caught on the tissue and dragged it along; instead, it appears that, due to the topography of the tissue, the fiber bundle of the FEM subsystem lost contact with the tissue in parts of the scan, consequently yielding a shorter mosaic. Finally, in Figs. 9(c1)-9(c4), a section of porcine esophageal tissue has been analyzed, with the OCT volume allowing for clear distinction between the different layers in both the 3-D volume render (c1) and the B-scan along the robotic scan direction (c2). Unlike the previous two cases, the FEM mosaic (c4) corresponding to this tissue section is complete [with no scanning artifacts present in the en-face visualization in (c3)], although some of the structures appear to have moved with the probe during scanning, which, when combined with the mosaic reconstruction, yields some artifacts in the image, such as the spiral structure present in the middle of the FEM mosaic (arrowed).

Handheld Operation
The robotic scanning device has a relatively small footprint, making handheld operation possible. In Fig. 10(a), a photograph depicts the robotic scanning device in handheld operation, imaging the skin of a volunteer's index finger. However, in practice we were only able to generate good quality handheld scans when the device was not in direct contact with the tissue. Handheld operation is, therefore, shown only for the OCT imaging modality, where the probe could be held a small distance above the tissue.
Owing to the limited speed and the increased possibility of motion artifacts affecting the images, the 1-D OCT scanning frequency was increased to 250 Hz and the number of lateral points in the OCT volumes was reduced to 200 × 200 (x × y). A full volume was then acquired in 0.8 s.
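The quoted volume time follows directly from the scan parameters, as the short check below illustrates. It assumes that one B-scan line is acquired per period of the fast scan, which is consistent with the stated 0.8 s but is not spelled out in the text; the function name `volume_time_s` and the `lines_per_cycle` parameter are introduced here for illustration.

```python
def volume_time_s(n_slow_lines, fast_scan_hz, lines_per_cycle=1):
    """Time to acquire one OCT volume.

    n_slow_lines: number of B-scans along the slow (robotic, y) axis.
    fast_scan_hz: frequency of the 1-D fast (x) scan.
    lines_per_cycle: B-scans recorded per fast-scan period (assumed to be 1;
    it would be 2 if both sweep directions were used).
    """
    if fast_scan_hz <= 0 or lines_per_cycle <= 0:
        raise ValueError("scan frequency and lines per cycle must be positive")
    return n_slow_lines / (fast_scan_hz * lines_per_cycle)

# 200 lines along y at a 250 Hz fast scan, one line per period.
handheld_volume_time = volume_time_s(200, 250)
```

Under this assumption, 200 lines at 250 Hz take 200/250 = 0.8 s, matching the acquisition time given above.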
Despite some motion artifacts, it is possible to resolve some fine features from in-vivo skin samples, as shown in Figs. 10(b) and 10(c), where an OCT volume was obtained from a volunteer's thumb. It is possible to distinguish the stratum corneum from the stratum germinativum, and as shown in Fig. 10(c) [which is a B-scan taken from the full OCT volume in (b)], it is possible to visualize sweat glands [arrowed, labeled (i)].
In an en-face visualization of the same region [Fig. 10(d)], it is possible to recognize the dendrite-like structure of the stratum germinativum [arrowed, labeled (ii)].
As mentioned earlier in Sec. 3.1, the OCT images present a slight curvature along the fast (x) scanning axis, due to the construction and operating principle of the 1-D OCT probe and the fact that we had to employ a sine wave function to perform the scanning. This can be seen in the 3-D render shown in Fig. 10(b), with the curvature being more evident along the x-axis than along the y-axis, for a lateral scanning range of the same order of magnitude in each direction.

Discussion
In this paper, we have presented a combined OCT/FEM system that is capable of bi-dimensional lateral scans by means of a robotic scanning device. The 1-D, FV OCT scanning probe allows a lateral FoV of ∼1.6 mm with a resolution of 60 μm; the OCT subsystem is capable of resolving ∼24 μm in depth (in air). The fiber bundle-based FEM subsystem has a lateral resolution of better than 8 μm, limited by the core spacing of the imaging bundle. The scanning device is capable of scanning ∼2.7 mm, generating images in as little as 0.8 s.
Video caption: Screencast of the acquisition UI in real-time, handheld operation, with the OCT en-face frames refreshing at 1 Hz (sped up by a factor of 2). Owing to the limited en-face imaging rate and the fact that a single en-face image (albeit averaged over ∼200 μm) is being displayed, there are some unavoidable motion artifacts both axially and laterally.

Scanning-wise, the lateral deviations between successive scans (average pixel shifts) are lower than the lateral resolution of the OCT subsystem and were quantified at about 10 to 11 μm in the FEM subsystem, which is in line with the deviation measured across mosaic length, 13 μm. The most significant deviations seem to occur in the axial direction (OCT only), particularly when the probe is not in contact with the sample (and therefore no FEM imaging can take place). Other studies 19,20 have also reported this dependence of repeatability on the sample and imaging procedure (contact versus noncontact); therefore, this is not specific to our implementation. Our implementation does have some shortcomings, which are summarized below.
• The robotic scanning device is rigid, which limits the scope of applications.
• Owing to the separate nature of the two channels and probes, there is only a very small overlap between the areas covered by the two subsystems, as shown in Sec. 3.3.
• While still able to deliver images of biological tissue (as shown in Fig. 9 in Sec. 4), the lateral resolution of the OCT subsystem is relatively poor (especially in comparison with its axial resolution). Lateral resolution was also affected by imperfections on the GRIN optics, whose effects could not be easily compensated for.
• Comparing the system sensitivity with that of the other equivalent systems we assembled in the past with different interface optics, we can assume that the OCT FV probe employed in this study contributed to losses in the system that led to lower sensitivity.
• Owing to the fact that we are employing a fiber bundle for imaging with no distal imaging optics, when imaging with the FEM subsystem, the probe must remain in contact with the tissue/sample being investigated, with the undesirable effect of some tissue dragging.
Because it was not possible to decouple the effect of each scanning subsystem on the overall instrument performance, the repeatability study could only be conducted on the combined effect of the robotic scanner and that of the raster scanner in the OCT scanning probe. Given that the OCT scanning probe runs in open loop, the system repeatability is sensitive to external factors.
Some of these shortcomings could be overcome by improved mechanical design of the robotic scanning device or by merging the two subsystems into a single optical probe, by means of a DCF 9 inside a 1-D, FV scanning probe, such as the one employed in the OCT subsystem. However, such an approach would introduce a penalty in the imaging resolution for the FEM subsystem, as a cantilevered optical fiber introduces off-axis scanning aberrations, and the low numerical aperture of the miniature lens in the probe prevents matching of the imaging resolution to that of a fiber bundle.
Handheld operation was not possible using both OCT and FEM channels, as this requires contact between the probe tip and the tissue. It is extremely difficult to maintain sufficient pressure to obtain an image while avoiding pressing too hard and generating severe tissue deformations. One possibility to address this, which has been explored to some extent by Zhang et al., 20 would be to use an FEM probe with a finite working distance and employ the OCT system to estimate the distance of the probe from the tissue, and actuate the fiber bundle axially using a high-speed motor to maintain the correct working distance. Alternatively, a force sensor 34 could be used to ensure optimal contact between the probe and the sample.
Handheld operation was shown to be possible for the OCT imaging channel alone, including generating real-time en-face slices using a 1-D scanning probe, as shown in Sec. 4.2. In this mode of operation it would still be possible to then bring the probe into contact with the tissue to obtain a single FEM image or a small manually formed FEM mosaic.

Conclusions and Future Outlook
In this report, we have presented a combined OCT/FEM system that is capable of bidimensional lateral scans by means of a robotic scanning device. The whole probe is housed in a compact, lightweight package with a minimal footprint, with a scanning end of 3.7 mm OD. This makes it suitable for in-situ tissue investigations, presenting some advantages over conventional robotic approaches in terms of footprint, cost, and surgical workflow. 33 The device can be operated handheld or supported by an articulated arm. It is possible to obtain higher speed (∼1 fps), direct en-face OCT images in conjunction with the FEM mosaics by sacrificing some lateral resolution in the OCT channel along the robotic scanning direction. While the device only offers two degrees of freedom, it is the simplicity of the mechanism that enables a level of reproducibility suitable for assembling both FEM mosaics and OCT volumes. More complex and expensive devices, such as robotic arms, provide more flexibility and have been used for FEM mosaicking (e.g., Ref. 34), but the assembly of OCT volumes from B-scans is more sensitive to small errors than FEM mosaicking, where the errors are automatically corrected by the mosaicking registration algorithm.
As a rigid instrument, the scanner is unsuitable for general endoscopic applications. However, there are a number of suitable interventional procedures, such as in breast surgery and neurosurgery, where a direct line of sight is available, and there is a need to determine tissue characteristics, and in particular to identify tumor margins, in real time. Probe-based imaging has been proposed for these applications (e.g., Refs. 16 and 45), but without a scanning mechanism the field of view (FoV) is inevitably small compared to the region of interest. 21 The type of device presented here could, therefore, provide a good compromise between ease of deployment and imaged area. In particular, even though the scanner is rigid, both of the probes are flexible along the remainder of their length, allowing the bulk of the optical systems to be sited a convenient distance away from the patient. A modified device with a longer scanning tube could also be considered for more general laparoscopic surgical applications, although further work would be required to confirm that the scanning performance is not degraded.
Despite the limitations described above, the results presented here demonstrate the feasibility of using a compact robotic scanner to provide one direction of scanning for en-face and volumetric OCT imaging and of combining high-resolution FEM and OCT imaging in a minimally invasive probe. These promising results will support further development of laparoscopic and endoscopic multimodal, en-face imaging systems.

Disclosures
The authors declare that there are no conflicts of interest related to this article.