With the increasing amount of patient information that is being collected today, the idea of using this information to inform future patient care has gained momentum. In many cases, this information comes in the form of medical images. Several algorithms have been presented to automatically segment these images and to extract structures relevant to different diagnostic or surgical procedures. This allows us to obtain large datasets of shapes, in the form of triangular meshes, segmented from these images. Given correspondences between these shapes, statistical shape models (SSMs) can be built using methods like Principal Component Analysis (PCA). Often, the initial correspondences between the shapes need to be improved, and SSMs can be used to improve these correspondences. However, just as often, initial segmentations also need to be improved. Unlike many correspondence improvement algorithms, which do not affect segmentation, many segmentation improvement algorithms negatively affect correspondences between shapes. We present a method that iteratively improves both segmentation and correspondence by using SSMs not only to improve correspondence, but also to constrain the movement of vertices during segmentation improvement. We show that our method is able to maintain correspondence while achieving segmentations as good as or better than those produced by methods that improve segmentation without maintaining correspondence. We are additionally able to achieve segmentations with better triangle quality than segmentations produced without correspondence improvement.
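As a rough illustration of how a PCA-based SSM can be built from shapes in correspondence, and how it can constrain vertex positions, the following minimal sketch may help. It is not the method of the abstract above: the function names `build_ssm` and `project`, the clamp of three standard deviations per mode, and the use of NumPy are all assumptions made for the example.

```python
import numpy as np

def build_ssm(shapes):
    """Build a PCA shape model.

    shapes: array of shape (n_shapes, n_vertices, 3); vertex i is assumed
    to correspond across all shapes (hypothetical input layout).
    Returns the mean vector, the PCA modes (rows), and per-mode variances.
    """
    n = shapes.shape[0]
    X = shapes.reshape(n, -1)              # flatten each shape to one row
    mean = X.mean(axis=0)
    # SVD of the centered data yields the PCA modes without forming
    # the (3V x 3V) covariance matrix explicitly.
    _, s, vt = np.linalg.svd(X - mean, full_matrices=False)
    variances = s ** 2 / (n - 1)
    return mean, vt, variances

def project(shape, mean, modes, variances, n_modes, max_sd=3.0):
    """Constrain a shape to the leading modes of the model.

    Each mode coefficient is clamped to +/- max_sd standard deviations
    (the value 3.0 is an assumption, not taken from the abstract).
    """
    b = modes[:n_modes] @ (shape.reshape(-1) - mean)
    limit = max_sd * np.sqrt(variances[:n_modes])
    b = np.clip(b, -limit, limit)
    return (mean + b @ modes[:n_modes]).reshape(-1, 3)
```

In a segmentation-refinement loop, a step like `project` could be applied after each vertex update so that the mesh stays within the space of plausible shapes; how the actual method couples the two steps is not specified here.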
In this work we present a method for dense reconstruction of anatomical structures using white light endoscopic imagery based on a learning process that estimates a mapping between light reflectance and surface geometry. Our method is unique in that it makes few unrealistic assumptions (specifically, we assume neither a Lambertian reflectance model nor a point light source) and we learn a model on a per-patient basis, thus increasing the accuracy and extensibility to different endoscopic sequences. The proposed method assumes accurate video-CT registration through a combination of Structure-from-Motion (SfM) and Trimmed-ICP, and then uses the registered 3D structure and motion to generate training data with which to learn a multivariate regression of observed pixel values to known 3D surface geometry. We demonstrate the approach with a non-linear regression technique, a neural network, that estimates depth images and surface normal maps, resulting in high-resolution spatial 3D reconstructions with average errors ranging from 0.53 mm (on the low side, when anatomy matches the CT precisely) to 1.12 mm (on the high side, when the presence of liquids causes scene geometry that is not present in the CT for evaluation). Our results are exhibited on patient data and validated with associated CT scans. In total, we processed 206 endoscopic images from patient data, each yielding approximately 1 million reconstructed 3D points.
We present an automatic segmentation and statistical shape modeling system for the paranasal sinuses which allows us to locate structures in and around the sinuses, as well as to observe the variability in these structures. This system involves deformably registering a given patient image to a manually segmented template image, and using the resulting deformation field to transfer labels from the template to the patient image. We use 3D snake splines to correct errors in this initial segmentation. Once we have several accurately segmented images, we build statistical shape models to observe the population mean and variance for each structure. These shape models are useful to us in several ways. Regular registration methods are insufficient to accurately register pre-operative computed tomography (CT) images with intra-operative endoscopy video of the sinuses. This is because of deformations that occur in structures containing erectile tissue. Our aim is to estimate these deformations using our shape models in order to improve video-CT registration, as well as to distinguish normal variations in anatomy from abnormal variations, and automatically detect and stage pathology. We can also compare the mean shapes and variances in different populations, such as different genders or ethnicities, in order to observe differences and similarities, as well as in different age groups in order to observe the developmental changes that occur in the sinuses.
Functional Endoscopic Sinus Surgery (FESS) is a challenging procedure for otolaryngologists and is the main surgical approach for treating chronic sinusitis, used to remove nasal polyps and open up passageways. To reach the source of the problem and to ultimately remove it, the surgeons must often remove several layers of cartilage and tissues. Often, the cartilage occludes or is within a few millimeters of critical anatomical structures such as nerves, arteries and ducts. To make FESS safer, surgeons use navigation systems that register a patient to his or her CT scan and track the position of the tools inside the patient. Current navigation systems, however, suffer from tracking errors greater than 1 mm, which is large when compared to the scale of the sinus cavities, and errors of this magnitude prevent accurate overlay of virtual structures on the endoscope images. In this paper, we present a method to facilitate this task by 1) registering endoscopic images to CT data and 2) overlaying areas of interest on endoscope images to improve the safety of the procedure. First, our system uses structure from motion (SfM) to generate a small cloud of 3D points from a short video sequence. Then, it uses the iterative closest point (ICP) algorithm to register the points to a 3D mesh that represents a section of a patient's sinuses. The scale of the point cloud is approximated by measuring the magnitude of the endoscope's motion during the sequence. We have recorded several video sequences from five patients and, given a reasonable initial registration estimate, our results demonstrate an average registration error of 1.21 mm when the endoscope is viewing erectile tissues and an average registration error of 0.91 mm when the endoscope is viewing non-erectile tissues. Our SfM + ICP implementation can execute in less than 7 seconds and can use as few as 15 frames (0.5 seconds of video).
Future work will involve clinical validation of our results and improving robustness to initial guesses and erectile tissues.
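The ICP registration step described above can be sketched as follows. This is a minimal point-to-point ICP under stated assumptions, not the authors' implementation: the point cloud is assumed to be already scaled, matches are made to mesh vertices rather than to the mesh surface, the nearest-neighbor search is brute force, and the function names `best_rigid_transform` and `icp` are hypothetical.

```python
import numpy as np

def best_rigid_transform(src, dst):
    """Least-squares rotation R and translation t such that
    R @ src[i] + t approximates dst[i] (Kabsch algorithm)."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)       # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    # Guard against a reflection: force det(R) = +1.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t

def icp(points, target, n_iters=30):
    """Rigidly align `points` (N x 3) to `target` (M x 3).

    Returns the accumulated rotation, translation, and moved points.
    Assumes a reasonable initial alignment, as the abstract also does.
    """
    R_total, t_total = np.eye(3), np.zeros(3)
    moved = points.copy()
    for _ in range(n_iters):
        # Brute-force nearest neighbor: squared distances, N x M.
        d2 = ((moved[:, None, :] - target[None, :, :]) ** 2).sum(axis=2)
        matches = target[d2.argmin(axis=1)]
        R, t = best_rigid_transform(moved, matches)
        moved = moved @ R.T + t
        # Compose the incremental update with the running transform.
        R_total, t_total = R @ R_total, R @ t_total + t
    return R_total, t_total, moved
```

For point clouds of realistic size, a spatial index such as a k-d tree would replace the brute-force distance matrix, and a trimmed or otherwise robust variant would be needed to tolerate outliers and deforming tissue.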
The observation and 3D quantification of arbitrary scenes using optical imaging systems is challenging, but increasingly necessary in many fields. This paper provides a technical basis for the application of plenoptic cameras in medical and medical robotics applications, and rigorously evaluates camera integration and performance in the clinical setting. It discusses plenoptic camera calibration and setup, and assesses plenoptic imaging both in a clinically relevant context and in the context of other quantitative imaging technologies. We report the methods used for camera calibration, as well as precision and accuracy results in ideal and simulated surgical settings. Afterwards, we report performance during a surgical task. Test results showed the average precision of the plenoptic camera to be 0.90 mm, increasing to 1.37 mm for tissue across the calibrated FOV. The ideal accuracy was 1.14 mm. The camera showed submillimeter error during a simulated surgical task.
We present a system for registering the coordinate frame of an endoscope to pre- or intra-operatively acquired CT data based on optimizing a similarity metric between an endoscopic image and an image predicted via rendering of CT. Our method is robust and semi-automatic because it takes into account physical constraints, specifically collisions between the endoscope and the anatomy, to initialize and constrain the search. The proposed optimization method is based on a stochastic optimization algorithm that evaluates a large number of similarity metric functions in parallel on a graphics processing unit. Images from a cadaver and a patient were used for evaluation. The registration error was 0.83 mm and 1.97 mm for cadaver and patient images, respectively. The average registration time for 60 trials was 4.4 seconds. The patient study demonstrated robustness of the proposed algorithm against a moderate anatomical deformation.
Automating surgery using robots requires robust visual tracking. The surgical environment often has poor lighting
conditions, and several organs have similar visual appearances. In addition, the field of view might be occluded
by blood or tissue. In this paper, the feasibility of near-infrared (NIR) fluorescent marking and imaging for
vision-based robot control is studied. The NIR region of the spectrum has several useful properties including
deep tissue penetration. We study the optical properties of a clinically approved NIR fluorescent dye, indocyanine
green (ICG), at different concentrations and quantify the image positioning error of an ICG marker when obstructed