Retina-simulating phantom for optical coherence tomography

Abstract. Optical coherence tomography (OCT) is a rapidly growing imaging modality, particularly in the field of ophthalmology. Accurate early diagnosis of diseases requires consistent and validated imaging performance. In contrast to more well-established medical imaging modalities, no standardized test methods currently exist for OCT quality assurance. We developed a retinal phantom which mimics the thickness and near-infrared optical properties of each anatomical retinal layer as well as the surface topography of the foveal pit. The fabrication process involves layer-by-layer spin coating of nanoparticle-embedded silicone films followed by laser micro-etching to modify the surface topography. The thickness of each layer and dimensions of the foveal pit are measured with high precision. The phantom is embedded into a commercially available, water-filled model eye to simulate ocular dispersion and emmetropic refraction, and for ease of use with clinical OCT systems. The phantom was imaged with research and clinical OCT systems to assess image quality and software accuracy. Our results indicate that this phantom may serve as a useful tool to evaluate and standardize OCT performance.


Introduction
Optical coherence tomography (OCT) has improved our ability to detect a variety of retinal diseases. 1 Although numerous advances in OCT technology have been made over the past 20 years, there has been relatively little progress in the development of standardized test methods to characterize OCT system performance. Test methods for established imaging modalities [e.g., computed tomography (CT), ultrasound] often involve the measurement of well-validated physical models known as phantoms. These methods are used to establish initial device performance, ensure quality control over time in clinical settings, perform quality assurance testing for manufacturing, and provide researchers and developers with a consistent method for device-to-device performance comparison and assessment of device and software modifications. Due to their widespread significance, phantom-based test methods often form the foundation of international consensus standards and medical professional society guidelines. For example, the American College of Radiology (ACR) has accredited an X-ray CT metric-specific phantom 2 designed to determine critical characteristics such as resolution, slice width, measurement accuracy, and noise.
The growing impact of OCT on the management of ocular diseases emphasizes the need for developing similar consensus test methods. Currently, it is estimated that clinics across the United States perform over 52,000 OCT scans/day 3 using instruments from more than a dozen OCT manufacturers. 4 Furthermore, a number of ocular conditions are evaluated through the use of OCT, 5 and it has become a key diagnostic tool in a number of retinal diseases. 6 Diagnostic decisions are often guided by quantitative measurements obtained via OCT. For example, visualization of macular edema is used as a diagnostic guide for age-related macular degeneration patients, 7 while parameters such as increased macular thickness are often used to assess central vision deterioration. 8 Glaucoma diagnosis has been significantly enhanced through nerve fiber layer (NFL) thickness measurements 1,9 obtained by OCT, and by quantitative monitoring of the cup-to-disc ratio. 10 The growing clinical use calls for better validation of OCT measurements, as does the increasing subset of literature highlighting inconsistencies across systems. Wolf-Schnurrbusch et al. 11 have reported central retinal thickness measurement variations across six different clinical systems; and a study conducted by Hatef et al. 12 reported low agreement of macular thickness from retinal vein occlusions and diabetic retinopathy patients. A similar study by Matt et al. 13 noted severe retinal segmentation failures due to vein occlusions and macular edema. These studies illustrate the variability and nonrepeatability associated with OCT measurements collected from different systems. In addition to the clinical impact of OCT, such variability presents significant challenges when pooling patient data across devices, such as during multicenter clinical trials. The sources of these inconsistencies are difficult to isolate without the use of a controlled test object, such as a phantom.
Recently, several groups have been working toward developing phantoms for performance evaluation of clinical OCT devices for retinal imaging. Agrawal et al. 14 presented a nanoparticle-embedded phantom designed to evaluate the three-dimensional point spread function (PSF) and its variation across the image volume produced by retinal OCT devices. Retinal tissue-mimicking phantoms have been reported by two groups. Rowe and Zawadzki 15 developed a phantom consisting of five transparent 60-μm layers with differing refractive indices and a realistic fovea. De Kinkelder et al. also demonstrated a phantom designed for NFL thickness measurement accuracy. This phantom consisted of five 50-μm layers embedded with varying amounts of scattering particles in each layer. 16 However, both of these phantoms are limited in the accuracy with which they model the layered morphology and optical properties of retinal tissue.
We have developed a retina-mimicking phantom to assess OCT image quality and software accuracy. The retinal phantom is the first to incorporate all retinal layers visible with current clinical OCT systems. Furthermore, each layer is designed to emulate the optical properties, namely scattering, and the thickness of the corresponding anatomical layer while mimicking the surface topography at the foveal pit. For ease of use with clinical systems, the phantom is embedded into a water-filled model eye, which accurately reproduces the dispersion and refraction of the human eye. Imaging with bench top and clinical grade OCT systems was then performed to evaluate the phantom as a tool for standardized assessment of system performance.

Materials and Methods
Tissue-mimicking phantoms must accurately model the biological morphology of the target tissue and the relevant physical properties for the imaging modality. Each layer of the retina has a unique thickness and optical scattering which the retinal phantom should accurately portray. The overall design flow of the retinal phantom development is shown in Fig. 1.
The phantom is constructed of thin scattering films of polydimethylsiloxane (PDMS). Much work has been done on the development of silicone phantoms for optical imaging systems. 17 Silicone allows for formation of phantoms with complex shapes and varying optical properties due to its low viscosity prior to curing. There is also established precedent for spin coating silicone into micron-scale thin films with strong adhesion and index matching between stacked layers. These versatile and highly tunable features allow for phantoms which represent real biological structure and optical properties. 18 Nano- and microparticles are embedded in PDMS to mimic the effective scattering of each retinal layer. The target parameters of each phantom layer, such as thickness and particle concentration, were determined through analysis of human retinal OCT images as described below. The phantom was then fabricated layer-by-layer through spin coating followed by stylus profilometry to obtain the thickness of each layer with micron precision. These two steps were repeated to replicate the layered structure of the retina. The surface was then laser microetched to create a structure similar to the foveal pit. After fabrication, the phantom was placed into a model eye.

Human Retinal Analysis
Retinal OCT images (50) from five normal eyes were provided through the courtesy of Physical Sciences Inc. (PSI, Andover, MA). The images were collected using a research grade adaptive optics spectral-domain OCT (AO-SDOCT) system, which operates at a center wavelength of 855 nm with a 56-nm full-width at half-maximum (FWHM) spectral bandwidth. AO-SDOCT has a narrow depth-of-focus and in this case the focus was set in the retinal-pigmented epithelium (RPE)-photoreceptor region. Each image was filtered with a 10 × 10 (σ = 2.5) Gaussian filter to remove high frequency components. Images exhibiting significant eye movement were discarded. The RPE was used as a landmark for image coregistration. Using customized algorithms, the RPE was aligned throughout the image stack to obtain an average retinal OCT image for each eye.
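The coregistration step can be illustrated with a deliberately simplified sketch. It assumes that the RPE position in each A-scan has already been located (the detection itself is part of the customized algorithms and is not reproduced here) and that alignment reduces to an integer axial shift. The function name `align_and_average` and the toy data are hypothetical.

```python
def align_and_average(ascans, rpe_indices):
    """Average A-scans after shifting each so its RPE sample lines up.

    ascans      : list of equal-depth intensity lists (one per A-scan)
    rpe_indices : index of the RPE landmark in each A-scan (assumed known)
    """
    if not ascans or len(ascans) != len(rpe_indices):
        raise ValueError("need one RPE index per A-scan")

    # Use the shallowest RPE position as the common reference depth.
    ref = min(rpe_indices)

    # Drop leading samples so every RPE lands on index `ref`.
    shifted = [scan[idx - ref:] for scan, idx in zip(ascans, rpe_indices)]

    # Keep only the depth range shared by all shifted A-scans.
    length = min(len(s) for s in shifted)
    return [sum(s[k] for s in shifted) / len(shifted) for k in range(length)]


# Toy example: the same profile recorded with a one-sample axial offset.
averaged = align_and_average([[0, 1, 5, 1], [1, 5, 1, 0]], [2, 1])
```

A real implementation would likely use sub-pixel shifts and two-dimensional registration; the integer shift is used here only to keep the idea visible.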
By averaging a series of image A-scans in a planar region, we developed a cross-sectional retinal profile from the NFL to the choroid as shown in Fig. 2(b). The optical thickness of each retinal layer was obtained from the FWHM of each peak or valley in the axial profile. Optical thickness was then divided by the refractive index (n_retina) to obtain physical thickness. We assume n_retina = 1.36 for our analysis. 19 The OCT signal intensity is given by the mean value of the A-scan profile between the edge points of the FWHM, as shown by the dots in Fig. 2(b).
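The thickness calculation above can be sketched as follows. This is a minimal illustration, not the analysis code used in the study: it handles a single peak on a zero baseline, locates the half-maximum crossings by linear interpolation, and uses hypothetical function names and sample data. Only the value n_retina = 1.36 is taken from the text.

```python
def fwhm_width(profile, peak_index, spacing_um):
    """FWHM of the peak at peak_index, in micrometres of optical path.

    Assumes a zero baseline and finds the half-maximum crossings by
    linear interpolation between neighbouring samples.
    """
    half = profile[peak_index] / 2.0

    # Walk left while the profile stays at or above half maximum.
    i = peak_index
    while i > 0 and profile[i - 1] >= half:
        i -= 1
    if i == 0:
        left = 0.0
    else:
        left = (i - 1) + (half - profile[i - 1]) / (profile[i] - profile[i - 1])

    # Walk right while the profile stays at or above half maximum.
    j = peak_index
    while j < len(profile) - 1 and profile[j + 1] >= half:
        j += 1
    if j == len(profile) - 1:
        right = float(j)
    else:
        right = j + (profile[j] - half) / (profile[j] - profile[j + 1])

    return (right - left) * spacing_um


def physical_thickness(optical_thickness_um, n_retina=1.36):
    """Convert optical thickness to physical thickness."""
    return optical_thickness_um / n_retina


# Hypothetical axial profile sampled every 3 um of optical path.
optical = fwhm_width([0, 0, 4, 0, 0], peak_index=2, spacing_um=3.0)
physical = physical_thickness(optical)
```

Valleys can be handled the same way after inverting the profile about a chosen reference level.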

Preparation and Characterization of Phantom Materials
The Sigma-Aldrich, St. Louis, MO), and silica microspheres (SiO2, 24327, Polysciences, Inc., Warrington, PA). Particles were dispersed throughout PDMS using a probe tip sonicator for 10 h. To allow for heat dissipation, samples were sonicated with a 30-s on and 60-s off periodic cycle. Afterwards, the sample was placed in a vacuum chamber to remove air pockets. Each stock sample was further diluted with additional PDMS base to create samples with a particle concentration ranging from 1% to 10% by mass. To characterize the relationship between particle concentration and OCT signal intensity, small droplets of each sample were imaged with a similar PSI SDOCT instrument in our laboratory. Care was taken to fix the reference arm length and beam focus depth in the sample during each recording session. Each sample's signal intensity was extracted near the surface just beneath the air-PDMS specular reflection signal. Using the linear relationship between PDMS-particle concentration and OCT signal intensity, the target concentration of each phantom layer was established.
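One way to turn the linear concentration-intensity relationship into a target concentration is an ordinary least-squares fit followed by inversion. The sketch below assumes that approach; the function names and the calibration numbers are illustrative placeholders, not measured values from this study.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def target_concentration(target_intensity, slope, intercept):
    """Invert the fitted line to get the concentration for a target intensity."""
    return (target_intensity - intercept) / slope


# Illustrative calibration points (percent by mass, normalized intensity).
concentration = [1.0, 2.0, 5.0, 10.0]
intensity = [0.1, 0.2, 0.5, 1.0]

slope, intercept = fit_line(concentration, intensity)
needed = target_concentration(0.35, slope, intercept)
```

The inversion is only meaningful inside the range where the response is linear, so a fit of this kind should exclude concentrations at which the signal saturates.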

Fabrication
The phantom fabrication is divided into several different stages, as shown in Fig. 1. The first of these stages involves spin coating thin films of PDMS to achieve a layered structure. Our group previously presented an approach involving spin coating silicone films to generate robust, stable phantoms with appropriate microscale layer geometries. 20 The retina phantom was constructed in a similar layer-by-layer fashion; however, we expanded this protocol to fabricate near-micron layers (<10 μm) and layers with different scattering levels. For a given layer, the PDMS-particulate mixture with the appropriate scattering characteristics was mixed with the curing agent in a 10:1 ratio, followed by a final degassing step in the vacuum chamber prior to spin coating and curing. Our previous fabrication method was limited by a minimum layer thickness; layers <10 μm were difficult to achieve. However, retinal tissue structures, such as the external limiting membrane (ELM), are below this minimum. A convenient solution is to use tert-butyl alcohol (TBA) as a solvent to reduce viscosity during spin coating. 21 TBA (tert-butanol, A401-500, Fisher Scientific, Pittsburgh, PA), which is solid at room temperature, was softened by heating to 45°C. After degassing, liquid TBA was added to the diluted PDMS-curing agent sample and mixed until homogeneous, usually within 1 to 2 min.

Spin coating procedure
The substrate for the phantom was a 1-mm-thick glass microscope slide. Silane (SIO6715.5, Gelest, Morrisville, PA) was applied to the glass substrate to provide a hydrophobic coating, allowing easy removal of cured PDMS from glass. After further cleaning and treatment, the slide was affixed to a spin coater (WS-650Mz-23NPP, Laurell Technologies, North Wales, PA), and a 1 g drop of the PDMS-particulate sample was placed on the center of the slide.
PDMS film thickness depends upon rotational speed, spin time, and TBA dilution. 21 After spinning, the PDMS was cured in a laboratory oven at 150°C for an hour. The phantom surface was profiled with a stylus profilometer (Dektak 150, Veeco Instruments Inc, Plainview, NY) along the length of the phantom. After surface measurements, an uncured PDMS sample for the next layer was deposited on top. The phantom was again spun with the appropriate settings, cured, and profiled. This process was repeated for all layers.

Laser etching of foveal pit
To create a foveal pit in the phantom, we used a custom laser microetching technique involving a femtosecond fiber laser with a central wavelength of 1060 nm and 300 fs pulse width. After fabrication of the final layer, a fovea-like structure was etched into the phantom surface based on nominal foveal dimensions of 1.5-mm diameter and 125-μm depth. 22 Laser parameters (e.g., laser power, substrate speed, and number of etching passes) were optimized to inscribe a 2-mm-long trench with a fovea-like cross-sectional profile suitable for imaging with horizontal OCT scans across the trench.

Phantom Assembly
After fabrication, the phantom was cut into a 10-mm-diameter circular section for placement into a model eye (OEMI-7, Ocular Instruments, Inc., Bellevue, WA). This model eye was used previously with a phantom to measure the OCT PSF. 14 The finalized phantom was positioned with the foveal pit at the visual axis (∼4 to 8 deg from the optical axis) onto a molded surface in the posterior segment to match the retinal curvature. The anterior and posterior chambers were filled with water to match the refractive index of the aqueous and vitreous humor (n ∼ 1.33). 23

Clinical Data Collection
The assembled phantom was imaged with a commercially available clinical OCT system. We collected several 3-mm-wide images centered along the foveal trench with a B-scan and A-scan spacing of 10.97 and 11.16 μm, respectively, and with a 3.87 μm axial resolution. Clear visualization of the RPE photoreceptors served as an image quality assessment. Using the manufacturer's proprietary clinical software, each B-scan was segmented from the internal limiting membrane to the RPE to measure the total retinal thickness (TRT). As the clinical software reports physical TRT, we applied a correction to the thickness measurements to account for the difference between n_retina and n_PDMS. For comparison, the phantom was also imaged and segmented manually with the research-grade PSI SDOCT system mentioned previously. We collected 2-mm-wide images centered along the foveal trench with a B-scan and A-scan spacing of 1.95 and 20 μm, respectively, and with a 1.90-μm axial resolution. The acquisition parameters differ for each device due to limitations set by manufacturer software. Measurements were compared to profilometry to assess TRT measurement accuracy.
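The refractive index correction can be sketched under two stated assumptions: the clinical software converts optical path length to physical thickness using a retinal index of 1.36, and the phantom has a PDMS index of 1.41. The second value is a placeholder typical of PDMS in the near-infrared and is not reported in this paper, and the software's internal index is not confirmed either. The function name is hypothetical.

```python
def correct_trt_for_phantom(reported_trt_um, n_assumed=1.36, n_phantom=1.41):
    """Rescale a software-reported thickness to the phantom material.

    n_assumed : index the software is assumed to use for retina
    n_phantom : ASSUMED index of PDMS (placeholder, not from the paper)
    """
    # Recover the optical path length the instrument actually measured.
    optical_path_um = reported_trt_um * n_assumed
    # Convert that optical path to physical thickness in the phantom.
    return optical_path_um / n_phantom


corrected = correct_trt_for_phantom(141.0)
```

Because the correction is a simple ratio, its sensitivity to the assumed retinal index can be checked by re-running the function with a different `n_assumed`.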

Characterization of Materials
The relationship between OCT intensity and particle concentration is shown in Fig. 3. The intensity results are normalized to a 10% BaSO4-PDMS sample to study the resulting contrast between the PDMS-particulate samples and for comparison with human OCT data. The results show that SiO2 and BaSO4 follow a linear trend providing low and moderate levels of OCT signal intensity. TiO2 provides a large dynamic range of OCT signal intensity; however, the signal saturates with highly concentrated samples.
To design the target parameters for each phantom layer, these data are compared to the human retinal intensity analysis (Fig. 2) to determine layer concentration and particle type. This comparison shows that SiO2 and BaSO4 are ideal for low and moderately scattering layers such as the inner retinal layers [NFL, ganglion cell layer (GCL), outer plexiform layer (OPL), outer nuclear layer (ONL), inner nuclear layer (INL), and inner plexiform layer (IPL)], whereas TiO2 is ideal for the highly scattering region of the RPE-photoreceptor complex.

Fabrication Results
A profilometry-based surface map was created after spin coating of each layer. To correct for misalignments between profilometric maps, a rigid body transformation was used to align each surface map. The difference between the profilometric maps yields the resulting thickness between the given layers. TRT can be determined by subtracting the NFL and choroidal surface maps, as shown in Fig. 4. The TRT map shows that spin coating allows for a high degree of lateral uniformity. Figure 5(a) shows a B-scan taken across the laser-etched trench representing the foveal pit. This current fabrication protocol allows for a realistic visual representation of the retina with a dynamic range of backscattering and layer thicknesses close to anatomy, which is shown in Fig. 5(b). Qualitatively, the phantom exhibits excellent structural similarity with tissue. Currently, the RPE, outer segment layer (OSL), inner segments/outer segments (IS/OS), inner segment layer (ISL), ELM, ONL, OPL, INL, IPL, GCL, and NFL are readily identifiable with OCT systems, as shown in Fig. 5(b). This phantom is the first to incorporate all of these layers and each layer is visually distinguishable, as shown in Fig. 5(a). Furthermore, very fine layers such as the photoreceptors and the ELM are represented for the first time with visually realistic thickness and intensity. The phantom, however, does show a lower dynamic range of tissue structures, and the converging layers and smooth curvature of the foveal pit seen in retinal tissue are not mimicked.
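The map subtraction described above can be sketched as follows, assuming the two surface maps have already been brought onto the same grid (the rigid body alignment is not reproduced here). The function names and the height values are hypothetical.

```python
def thickness_map(upper_surface, lower_surface):
    """Point-by-point height difference between two aligned surface maps."""
    same_rows = len(upper_surface) == len(lower_surface)
    same_cols = same_rows and all(
        len(u) == len(l) for u, l in zip(upper_surface, lower_surface)
    )
    if not same_cols:
        raise ValueError("surface maps must share the same grid")
    return [
        [u - l for u, l in zip(row_u, row_l)]
        for row_u, row_l in zip(upper_surface, lower_surface)
    ]


def map_mean(values):
    """Mean of all entries in a 2-D map."""
    flat = [v for row in values for v in row]
    return sum(flat) / len(flat)


# Hypothetical heights in micrometres on a 2 x 2 grid.
top = [[310.0, 311.0], [309.0, 310.0]]
bottom = [[10.0, 10.0], [10.0, 10.0]]
trt = thickness_map(top, bottom)
mean_trt = map_mean(trt)
```

The spread of the resulting map gives a direct check on the lateral uniformity of the spin-coated stack.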
The thicknesses of most phantom layers are within one standard deviation of the human measurements, as shown in Fig. 6(a). The two exceptions are the ISL and the IS/OS where thickness discrepancy was 7 and 3 μm, respectively. The OCT intensities of the phantom and retinal tissue are compared in Fig. 6(b). All intensity values are normalized to the RPE's intensity. The results in Fig. 6(b) indicate that a majority of the retinal phantom layers accurately mimic the relative intensities of their anatomical counterparts. The IPL, OPL, ELM, IS/OS, OSL, and RPE are within 10% of the human measurements. The largest discrepancies occur in the GCL, INL, ONL, and ISL, for which the phantom exhibits 20%, 28%, 47%, and 22% lower intensity, respectively.

Clinical Device Assessment
An example of a B-scan recorded by the clinical OCT system is shown in Fig. 7(a). The blue and red lines show the automated TRT segmentation by this device. Figure 7(b) shows the phantom imaged and segmented with the research-grade PSI SDOCT. A total of six B-scans from the clinical system and laboratory system were averaged reflecting regions of ∼3 mm × 55 μm and 2 mm × 100 μm, respectively. Figure 8(a) compares the TRT measured by each system as a function of position across the phantom. Figure 8(b) shows the standard deviation within the region of interest for each TRT measurement point. The profilometry, clinical system, and laboratory system TRT measurements have mean values of σ = 0.86 ± 0.75 μm, 2.24 ± 1.27 μm, and 3.68 ± 1.16 μm, respectively. Figure 8(c) shows the difference between the OCT and profilometry measurements. When compared to profilometry, the clinical device has a mean discrepancy of 19.12 ± 6.31 μm and our laboratory system shows a 14.02 ± 11.08 μm mean absolute discrepancy with profilometry. The low thickness standard deviation in the clinical system across the B-scan indicates a systematic measurement difference rather than a random difference caused by the subtle variations in the thickness of the phantom.
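A discrepancy of the form "mean ± standard deviation" can be computed from paired TRT profiles as sketched below. The sketch assumes the OCT and profilometry profiles are sampled at the same lateral positions and uses the sample standard deviation; whether the reported figures use signed or absolute differences is left as a switch. The function name and the numbers are illustrative, not data from this study.

```python
import statistics


def discrepancy_stats(oct_trt_um, profilometry_trt_um, absolute=True):
    """Mean and sample standard deviation of OCT minus profilometry TRT."""
    if len(oct_trt_um) != len(profilometry_trt_um) or len(oct_trt_um) < 2:
        raise ValueError("need at least two paired measurements")

    diffs = [o - p for o, p in zip(oct_trt_um, profilometry_trt_um)]
    if absolute:
        # Absolute differences hide the sign of a systematic offset.
        diffs = [abs(d) for d in diffs]
    return statistics.mean(diffs), statistics.stdev(diffs)


# Hypothetical paired profiles in micrometres.
mean_diff, sd_diff = discrepancy_stats([320.0, 318.0, 322.0], [300.0, 300.0, 300.0])
```

A large mean with a small standard deviation, as in this toy case, is the signature of a systematic offset rather than random scatter.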

Discussion
We have developed and validated a protocol to produce an OCT phantom that accurately replicates the optical properties and morphology of the retina. Due to our ability to spin coat scatter-embedded thin silicone layers as thin as 5 μm, the results presented here represent a major advancement over prior retinal phantoms which were limited to thicknesses of 50 μm or greater. 15,16 These thin layers are critical for representing regions such as the OSL, IS/OS, and ELM, which have thicknesses close to the axial resolution of OCT systems. Laser microetching was demonstrated as a novel approach for machining a foveal pit, a key morphological feature in retinal OCT images. Furthermore, through stylus profilometry, the thickness of each phantom layer is known with micron accuracy. This high degree of precision allows for a more rigorous evaluation of OCT system performance than current methods allow. The completed phantom is embedded into a water-filled optomechanical model with dimensions that simulate the human eye. Previous work has shown that optomechanical eye models replicate refractive errors. 24 The water-filled eye allows the model to match the  dispersive characteristics of the vitreous and aqueous humors of the human eye, 25 allowing for a more realistic study of OCT performance. Currently, this phantom does not mimic the irregular layer boundaries observed in retinal tissue. Additionally, in retinal tissue, the NFL, GCL, IPL, and OPL become thin as they approach the foveal pit, as shown in Fig. 5(b). The spin coating technique is limited to thin flat layers, and replicating such exact morphology still remains a challenge. However, our approach provides significant advantages over previous fabrication methods and allows for a more detailed evaluation of OCT performance than previously achieved. The data generated in this study provide new insights into fabrication of cutting-edge phantoms as well as the OCT system performance.
The finalized phantom was imaged with two different OCT systems, which demonstrated its ability to assess measurement variations with high precision. Overall quantitative agreement in TRT measurement was quite good, with the mean discrepancy between profilometry and the clinical and research OCT systems being 19.12 ± 6.31 μm and 14.02 ± 11.08 μm, respectively. Our research machine showed a strong lateral ΔTRT variation, which can be attributed to the change in path length as a function of scan angle. The clinical device variability was within the range of the previous NFL thickness study reported by de Kinkelder et al., who found a mean disagreement of 18 ± 4 μm across multiple systems. 16 The exact source of this disagreement is not known, but we can rule out several possible sources. An immediate source of error might be the lack of proper definition of TRT. Our analysis employs the distance between the top of the NFL and the bottom of the RPE as the TRT, which may not be consistent with the proprietary clinical software. As shown in Fig. 8(a), the RPE boundary does not lie entirely along the RPE-choroid boundary. It is unclear whether this is a device assumption for the retinal boundary or software inaccuracy. However, modifying the segmentation for consistency with our analysis would increase the measured TRT and further amplify the disparity. Similarly, our assumption for n_retina = 1.36 could be inconsistent with the clinical software. A higher refractive index assumption would only increase the disparity with profilometry, and a low-end estimate of n_retina = 1.35 corresponds to only a 2.38 μm mean TRT decrease. Given that this adjustment is much lower than the actual disagreement of ∼20 μm, errors due to variations in the refractive index are assumed to be minimal. Another source of the deviations could be a lack of coregistration between the different data sets.
Due to the placement of the model eye and each device's unique acquisition parameters, each measurement shown in Fig. 8(a) represents a slightly different region on the phantom. As previously mentioned, the clinical and laboratory OCT data represent 3 mm × 55 μm and 2 mm × 100 μm regions, respectively. To ensure overlap of both regions, we selected a 1.23 mm × 3 mm region from the profilometry map. Profilometry indicates that there is a minimal thickness variation within this region, with a mean σ of 0.86 μm. The most significant thickness deviations measured by profilometry occur in or around the fovea where the mean σ value is 1.43 μm. This increased variability may be induced during the topographic surface modification. Nonetheless, this is much smaller than the ∼20 μm discrepancy in question, indicating that any lack of coregistration is not a critical factor.
Quantitative comparison between the retina phantom and biological tissue (Fig. 6) indicates that each layer matches its anatomic counterpart well. Some exceptions are the intensities in the GCL, INL, ONL, and ISL phantom layers where the RPE-relative intensities are lower than tissue. These lower intensities exhibited by our phantom may be due to a combination of factors. A significant contributor to the intensity mismatch might be the fact that the human images were captured with a system employing adaptive optics, which has a narrower depth-of-focus than a standard clinical system. The effect of adaptive optics generally is to increase the intensity of the layer where the focus resides while lowering the intensity for other layers. In this case, the focus was set near the photoreceptors/RPE, and so we would expect reduced signal in the superficial layers. 26 Additionally, the GCL, INL, ONL, and ISL are embedded with silica microspheres which exhibit minimal backscattering, as shown in Fig. 3, and the signal is close to the noise floor. As shown in Fig. 6, the mismatch of these layers becomes more pronounced with increasing depth, suggesting that the silica layers are especially sensitive to the attenuation effect from superficial layers. In any case, the simplicity of the phantom design allows for straightforward adjustment of particle type and concentration to further refine the OCT intensities yielded by the phantom.
Future investigative work will include fabrication of other retinal structures in the healthy eye such as the optic nerve head and a circular fovea. Our current fabrication method limits our ability to embed any distinct features that would help to coregister data between profilometry and OCT. Constructing these more realistic topographies, however, will provide a better basis for coregistration. For diagnosis, these enhancements provide alternate disease-specific targets for the imaging system and also for their associated algorithms. Such phantoms could be used in round-robin studies providing additional insight into the diagnostic capabilities of several OCT systems. In addition, this phantom may be used to bench test OCT system performance under a multitude of imaging conditions. Studies have highlighted the impact of signal attenuation on retinal thickness measurements 27 and the effect of vitreous opacity on image quality. 28 This phantom can help to evaluate the effect of these specific parameters and allow for a more thorough understanding of the elements influencing OCT performance.

Conclusion
We have presented a highly novel OCT retinal phantom with realistic optical properties and morphology. This phantom may serve as a convenient tool to evaluate and standardize OCT image quality and measurement accuracy. Such performance standardization can lead to improved repeatability and reproducibility of measurements from clinical and research OCT devices, thereby enhancing reliability and consistency of diagnostic decisions in retinal disease and facilitating the development of innovative diagnostic technologies.