**0.48±0.13 mm**for femur and

**0.53±0.18 mm**for tibia. The results demonstrate that this method outperforms previous segmentation methods in capturing anatomical shape variations, accounting for image quality differences and guiding accurate segmentation.

## 1.

## Introduction

The segmentation of knee x-ray images has found wide applications in the analysis of anatomical structure and kinematics,^{1}2.3.^{–}^{4} the assessment of loss of joint space width (JSW), the diagnosis of osteoarthritis (OA) and osteoporosis in terms of fracture detection and bone density measurement, and the planning for joint replacement.^{5} Manual segmentation of these anatomical structures is time consuming and subjective. Our aim is to design a robust automatic segmentation method to extract the distal femur and proximal tibia contours from x-ray images. However, automatic segmentation is challenging due to the complex structure of the knee in x-ray images. Images may be contaminated by noise, sampling artifacts, spatial aliasing, insufficient contrast or resolution, or luminous intensities depending on the x-ray equipment such that the boundaries of the regions of interest become indistinct or disconnected. Interpatient bone shape, size, and deformation variations, such as osteophytes, add complexity to structure identification. Moreover, the loss of key features of the femur contour may occur from overlapping neighboring bones, such as the tibia, and occlusion by implants or operational tools.

Previous work on automatic femur and tibia segmentation has primarily focused on using the active shape model (ASM).^{6}7.8.^{–}^{9} ASM was first developed in the mid-1990s by Cootes et al.^{10} performing principal component analysis (PCA) on a series of manually marked training data to generate a statistical shape model. There are several advantages in using ASM: ASM has good performance for low contrast images and is based on user-defined forces. In addition, ASM is guided by prior knowledge of shape from manually annotated training dataset, making it less susceptible to imaging artifacts. Historically, the ASM has been modified and improved. Behiels et al.^{6} added a regulation term in the cost function to impose smoothness of shape changes. Lindner et al.^{7} integrated random forest regression voting to ASM to segment proximal femur. Seise et al.^{9} used a double contour active shape model to segment tibia and femur from knee x-ray images.

Original ASM using Mahalanobis distance (MD) to find the object of interest has some limitations. MD^{11} is a statistical measurement to determine similarity between sets that are assumed to have normal distribution. In an x-ray image, the point with the minimum MD between its surrounding intensity values and the mean lies on the border of the object. The MD calculation involves the inverse of a covariance matrix of the training data. If the covariance matrix is sparse, its inverse will be undefined and consequently, MD will be undefined. The covariance matrix becomes sparse when the number of pixels sampled is greater than the number of training images. Therefore, either having a limited training dataset or attempting to sample a large number of pixels would prove problematic.

The active appearance model (AAM)^{12} is another segmentation method that assumes the appearance spaces to be Gaussian. However, this assumption fails in real patients as the interpatient bone structure varies significantly, especially for bones with osteophytes. In such cases, a single mean mode is insufficient to capture the variations of appearance spaces with a single Gaussian distribution. A further limitation of AAM is that x-ray images of degenerated bones create outliers, which may lead to incorrect segmentation.

We present a robust segmentation method integrating spectral clustering and ASM to accurately extract bone contours in knee radiographs. Spectral clustering^{13} is an unsupervised learning algorithm to extract global information from an image so that only the most salient edge points remain and the original image has been denoised. Active contour search is then performed on the denoised x-ray image to obtain the candidate contour points with the highest intensity gradient. ASM constrains the contour in feasible shapes, leading to an accurate and robust segmentation result. This method captures large appearance variation, thus solving the limitations of ASM using MD and AAM.

The remainder of the paper is organized as follows: Sec. 2 provides a description of the segmentation framework, followed by the experiments and evaluation in Sec. 3 and conclusions in Sec. 4.

## 2.

## Approach

The proposed method consists of two parts: spectral clustering for image denoising and ASM for segmentation. In the following section, we will discuss spectral clustering in Sec. 2.1, ASM in Sec. 2.2, and the overall segmentation algorithm in Sec. 2.3.

## 2.1.

### Spectral Clustering

The traditional AAM^{12} is constructed from the intensity distribution of the region of interest. However, the intensities of the femur and tibia in x-ray images may vary significantly with different equipment and intensity settings, leading to an inaccurate appearance model. For images with large, interdataset intensity variations, spectral clustering was employed to determine the similarity between image pixels in a similar manner to the “normalized cuts” approach by Shi and Malik.^{13} Spectral clustering relies on the eigenvalues and eigenvectors of an affinity matrix defined in Eq. (1) to partition points into disjoint clusters with high intracluster similarity and high intercluster dissimilarity. The eigenvectors obtained from spectral partitioning are then used as weighted candidates for the following oriented segmentation to create a direct partition of the image, rather than using a clustering algorithm, such as $k$-means clustering.^{13} Our method differs from the method of Shi and Malik in that eigenvectors obtained from spectral partitioning are treated as denoised images for the following ASM-based segmentation, rather than being directly partitioned by $k$-means clustering.

As an input to spectral clustering, a sparse symmetric affinity matrix $W$ is constructed using the intervening contour cue,^{14} which is the maximal value of intensity (Im) along a line connecting two pixels in the image. The two pixels have a low affinity as a strong boundary separates them from a high affinity in the same region. The affinity matrix is formed by connecting all pixels $i$ and $j$ within a fixed radius $r$ with affinity:

Shi and Malik^{13} proposed a normalized cut, $N\mathit{cut}(A,B)$, to describe the degree of dissimilarity between two sets, $A$ and $B$. The normalized cut can be minimized using the condition:

## (2)

$${\mathrm{min}}_{x}\text{\hspace{0.17em}}N\mathit{cut}(x)={\mathrm{min}}_{v}\frac{{v}^{T}(D-W)v}{{v}^{T}Dv},$$Equation (2) is the Rayleigh quotient,^{15} which can be minimized by solving the generalized eigenvalue problem (generalized SVD):

^{16}themselves. Thus, each eigenvector, ${\mathbf{v}}_{i}$, is treated as an image, which is then filtered with Gaussian directional derivative filters at multiple orientations and weighted by ${\lambda}_{i}$ in Eq. (4):

## (4)

$$\mathrm{SIm}(x,y)=\underset{\theta}{\mathrm{max}}[{\mathrm{\Sigma}}_{i}{\nabla}_{\theta}{v}_{i}(x,y)].$$^{17}

To demonstrate the spectral clustering’s ability to denoise the original x-ray image, the first seven generalized eigenvectors are depicted compared to the same image filtered by a $3\times 3$ Sobel operator,^{18} as shown in Fig. 1. In practice, 16 eigenvectors were used for spectral clustering.^{16} To qualitatively compare the results from spectral clustering and Sobel operator, the intensity of both images is normalized to [0,1], as shown in Figs. 1(i) and 1(j). Furthermore, we compared the signal-to-noise ratio (SNR) of these two images; SNR is 16.05 for the spectral clustering image in Fig. 1(i) and 10.04 for the Sobel filtered image in Fig. 1(j). Both qualitative and quantitative results show that spectral clustering outperforms the conventional Sobel operator, because it extracts only the most salient edges in the image while the latter extracts data at all the edges.

## 2.2.

### Active Shape Model

Two commonly used segmentation strategies after performing spectral clustering are k-way segmentation^{13} and hierarchical segmentation.^{16} K-way segmentation can be improved by performing a recursive two-way cut, using a clustering algorithm, such as k-means to create a hard partition of the image. Unfortunately, this can lead to an incorrect segmentation as large uniform regions in which the eigenvectors vary smoothly are broken up. Hierarchical segmentation^{16} calculates nonoverlapped regions using an oriented watershed transform (OWT)^{19} producing an ultrametric contour map^{16} for segmentation. However, this method is time consuming as it searches the whole image, and infeasible segmentation cannot be avoided.

Using prior knowledge captured in ASM for segmentation is both more computationally efficient and robust to image inconsistencies. ASM represents the global shape information with the dominant shape variations in the training dataset. In this study, the training dataset includes a set of bone contours aligned to each other with the following method. First, a manual segmentation is performed of the objects of interest in the original x-ray images. The manually segmented landmarks are defined on the bone boundary that is segmented from the x-ray images by an operator. Ten equally spaced points were sampled between successive manual landmarks. The number of manually segmented landmarks is determined by the operator, with a range of the number of points from 92 to 150. These manually segmented landmarks are then interpolated by cubic spline to get contours. Finally, corresponding points between contours are identified from the interpolated contour points by first aligning centroids, then using rigid alignment using a standard vertex-to-vertex iterative closest point algorithm. This is followed by a general affine transformation to align the training contour to the new contour using 12 degrees of freedom (rotations, translations, scaling, and shear). New points on the new contour are created with similar local spatial characteristics to the template contour.^{20} Thus, the generated training set consists of $N$ training contours denoted as $S=\{{C}^{\alpha}|\alpha \in \{1,\dots ,N\}\}$, each contour ${C}^{\alpha}\subset S$ has a set of landmark points ${X}^{\alpha}\subset {C}^{\alpha}$, where ${X}^{\alpha}=\{{C}_{n}^{\alpha}|n\in \{1,\dots ,M\}\}$ and where ${\mathrm{c}}_{n}^{\alpha}=({x}_{n}^{\alpha},{y}_{n}^{\alpha})$ denotes the coordinates of the $n$’th landmark point in ${C}_{\alpha}$ The contours before and after alignment are shown in Fig. 2. The number of landmarks is 603 points for femur and 1470 points for tibia.

Once the training shapes have been aligned, the mean shape is calculated using $\overline{X}=\{{\overline{c}}_{n}|n\in \{1,\dots ,N\}\}$, where ${\overline{c}}_{n}=({\overline{x}}_{n},{\overline{y}}_{n})$ and ${\overline{x}}_{n}=(1/K)\sum _{\alpha}{x}_{m}^{\alpha}$ and ${\overline{y}}_{n}=(1/K)\sum _{\alpha}{y}_{m}^{\alpha}$, where $K$ is the number of points in each contour.

Subsequently, PCA is applied to the training dataset,^{10} so that any valid bone shape $\mathbf{x}$ can be represented as

^{21}

Figure 3 depicts the statistics of shape deformation, showing that after the first few eigenvectors, the variance becomes very small; thus, the first $m$ dominant eigenvectors are adequate to build the active shape model that accounts for the variability in the contour shapes.

To test how ASM can model the variation of bone shape, the shape parameter was limited to fall within the range $[-3\sqrt{{\lambda}_{i}},3\sqrt{{\lambda}_{i}}]$. The eigenvector is a set of displacement vectors along, which the mean shape is deformed. To illustrate this point, the first six eigenvectors have been plotted on the mean shape in Fig. 4, resulting in the deformation from the mean shape. Due to the limitations on the number of eigenvectors and the value of shape parameters, we cannot expect ASM based on remaining examples to model the test case perfectly. The average error is $0.27\pm 0.15\text{\hspace{0.17em}\hspace{0.17em}}\mathrm{mm}$ and the maximal error is 1.10 mm. This comparison was performed by computing the RMS error of the closest corresponding points between the reconstructed contour from ASM and the original contour. The maximum error is the maximum RMS error between corresponding points of reconstructed contour from ASM and the original contour. These values represent lower limits on the errors obtained in the segmentation.

## 2.3.

### Algorithm

The overall framework consists of two parts: spectral clustering for image denoising and ASM for segmentation. Spectral clustering first generates an affinity matrix from the original image.

This is followed by a solution to the generalized eigenvalue problem using a linear combination of the first $n$ eigenvectors of the affinity matrix. The output of spectral clustering is a linear combination of the eigenvectors filtered by Gaussian directional derivative filters. The segmentation part has four steps: (1) the initial contours were placed near the actual boundary by manually adjusting translation, rotation, and uniform scale. (2) An active contour search is performed to obtain the maximal gradient response in the normal profile, as shown in Fig. 5. The calculated direction of normal profile is filtered by Savitzky–Golay filter^{22} to get a smooth profile. (3) ASM segmentation is used to constrain the contour points in a feasible shape. (4) Relaxation is performed by the active contour search in the local neighboring area to relax the constraint from ASM so that the final boundary is on the local gradient maximum. The overall algorithm is shown in Fig. 6.

## 3.

## Experiments and Evaluation

All images were obtained from public use datasets of the Osteoarthritis Initiative (OAI, version 0.E.1 clinical dataset; see the website at Ref. 23) of randomly selected $\sim 400$ anteroposterior knee radiographs for subjects suffering from OA. The images have been collected from different radiographic centers, resulting in large intensity differences due to the use of different x-ray and recording equipment. In addition, the presence of neighboring bones and soft tissue may result in incorrect segmentation when these artifacts have stronger edge intensities in the images than the bone of interest. Therefore, a robust automatic segmentation algorithm is important for accurate segmentation in a wide variation of x-ray images.

Experiments were designed to validate the proposed automatic segmentation algorithm. For the femur and tibia bone contours analyzed in this study, the testing images are left out and the statistical shape model is computed based on the training contours. Contours extracted from 328 x-ray images were used as the training dataset, and 80 femur and 20 tibia x-ray images were used as the testing set.

Segmentation results for each step are shown in Fig. 7. The bottom portion of the femur in Fig. 7(c) illustrates how the mistakenly segmented points in the tibia image are corrected by the ASM constraint.

Qualitative results from four examples of the validation test of the proposed automatic segmentation algorithm with respect to its capability for finding the correct femur and tibia contours are shown in Fig. 8. In these images, the ground truth (manual segmentation by an expert) is in green and the automatic segmentation is in red.

We used dice similarity coefficient (DSC) and the root mean square (RMS) error as quantitative evaluation with respect to the capability of finding the correct object boundary. DSC is used to compare the similarity between segmented and ground truth contours.^{24} The range of DSC is [0,1], with 0 indicating no overlapping and 1 indicating complete congruence of the two sets. DSC of the segmented femur contour is $0.97\pm 0.01$ with the worst case 0.93. DSC of the segmented tibia contour is $0.98\pm 0.01$ with the worst case 0.95. Because DSC measures the overlap ratio between the automatically and manually segmented contours, the above results indicate good overlapping of the automatic segmentation with the manual segmentation.

We compared RMS error between the original ASM algorithms, ASM with relaxation, ASM combined with MD, and the proposed algorithm (spectral clustering-based ASM). This comparison was performed by computing RMS error of the closest corresponding points between manually and automatically segmented contours. The maximum error is the maximum RMS error between corresponding points of aligned contours from automatic and manual segmentation. Totally, 80 images were used for femur testing and 20 images for tibia. Ideally, these distances are zero for a perfect fit when the automatic segmentation is the same as the manual segmentation. Table 1 shows the RMS error, its standard deviation and maximum error in millimeters.

## Table 1

Comparison of performance: original ASM, ASM with relaxation, ASM with MD, and spectral clustering-based ASM.

RMS error (mm) | Standard deviation (mm) | Maximum error (mm) | |
---|---|---|---|

ASM, femur | 0.62 | 0.17 | 1.17 |

ASM with relaxation, femur | 0.60 | 0.21 | 1.13 |

ASM with MD, femur | 1.01 | 0.23 | 1.49 |

Spectral clustering-based ASM, femur | 0.48 | 0.13 | 0.90 |

Spectral clustering-based ASM, tibia | 0.53 | 0.18 | 1.13 |

The robustness of the proposed automatic segmentation algorithm was examined with respect to the presence of neighboring bones and low image contrast. The segmentations with and without ASM, shown in Figs. 9(b) and 9(c), demonstrate that ASM avoids infeasible points when incorrect segmentation occurs at the neighboring bone.

## 4.

## Conclusions

This research presents a robust automatic segmentation algorithm combining statistical shape models and spectral clustering with the objective of accurately segmenting distal femur and proximal tibia in x-ray images of varying quality. The results demonstrate a significant improvement in addressing previously unanswered problems, such as the existence of neighboring bones, image artifacts or inconsistencies, significant contrast variations, and interpatient anatomical femur/tibia shape variations.

The proposed method has significant implications for JSW assessment. Incorrect segmentation primarily occurs in the femoral condyle and tibia plateau, both of which impact the measurement of JSW significantly. The JSW measurement uses the interior contour of the tibia,^{25} but the output of the proposed segmentation algorithm is the exterior contour of the tibia. However, the interior contour can be calculated as a postsegmentation step, using our result as an initial contour. Evaluation of JSW calculations is the subject of future work.

## Acknowledgments

The OAI is a public-private partnership comprised of five contracts (Grant Nos. N01-AR-2-2258; N01-AR-2-2259; N01-AR-2-2260; N01-AR-2-2261; N01-AR-2-2262) funded by the National Institutes of Health, a branch of the Department of Health and Human Services, and conducted by the OAI Study Investigators. Private funding partners include Merck Research Laboratories; Novartis Pharmaceuticals Corporation, GlaxoSmithKline; and Pfizer, Inc. Private sector funding for the OAI is managed by the Foundation for the National Institutes of Health. This manuscript was prepared using an OAI public use data set and does not necessarily reflect the opinions or views of the OAI investigators, the NIH, or the private funding partners. The authors would like to thank Dr. Michael Johnson, Dr. Hatem El Dakhakhni, and Jayne Dadmun for their help in preparing the paper. The authors would also like to thank the anonymous reviewers for their helpful comments.

## References

J. B. Stiehl et al., “Fluoroscopic analysis of kinematics after posterior-cruciate-retaining knee arthroplasty,” Bone Joint J. 77(6), 884–889 (1995).Google Scholar

M. R. Mahfouz et al., “A robust method for registration of three-dimensional knee implant models to two-dimensional fluoroscopy images,” IEEE Trans. Med. Imaging 22(12), 1561–1574 (2003).ITMID40278-0062http://dx.doi.org/10.1109/TMI.2003.820027Google Scholar

A. Sharma, R. D. Komistek and M. R. Mahfouz, “*In vivo* kinematics evaluation in flexion of patients implanted with primary TKA,” Tech. Knee Surg. 10(2), 66–72 (2011).http://dx.doi.org/10.1097/BTK.0b013e31821cabb2Google Scholar

J. Wu et al., “Fully automatic initialization of two-dimensional-three-dimensional medical image registration using hybrid classifier,” J. Med. Imaging 2(2), 024007 (2015).JMEIET0920-5497http://dx.doi.org/10.1117/1.JMI.2.2.024007Google Scholar

T. E. McAlindon et al., “Knee pain and disability in the community,” Br. J. Rheumatol. 31(3), 189–192 (1992).BJRHDF1460-2172http://dx.doi.org/10.1093/rheumatology/31.3.189Google Scholar

G. Behiels et al., “Active shape model-based segmentation of digital X-ray images,” in Medical Image Computing and Computer-Assisted Intervention, pp. 128–137 (1999).Google Scholar

C. Lindner et al., “Fully automatic segmentation of the proximal femur using random forest regression voting,” IEEE Trans. Med. Imaging 32(8), 1462–1472 (2013).http://dx.doi.org/10.1109/TMI.2013.2258030Google Scholar

J. Mu et al., “Segmentation of knee joints in x-ray images using decomposition-based sweeping and graph search,” Proc. SPIE 7962, 79620H (2011).PSISDG0277-786Xhttp://dx.doi.org/10.1117/12.878414Google Scholar

M. Seise et al., “Segmenting tibia and femur from knee x-ray images,” in Proc. of Medical Image Understanding and Analysis, pp. 103–106, Bristol, United Kingdom (2005).Google Scholar

T. F. Cootes et al., “Active shape models—their training and application,” Comput. Vision Image Underst. 61(1), 38–59 (1995).http://dx.doi.org/10.1006/cviu.1995.1004Google Scholar

P. C. Mahalanobis, “On the generalised distance in statistics,” Proc. Nat. Inst. Sci. India 2(1), 49–55 (1936).Google Scholar

T. F. Cootes et al., “Active appearance models,” IEEE Trans. Pattern Anal. Mach. Intell. 23(6), 681–685 (2001).ITPIDJ0162-8828http://dx.doi.org/10.1109/34.927467Google Scholar

J. B. Shi and J. Malik, “Normalized cuts and image segmentation,” IEEE Trans. Pattern Anal. Mach. Intell. 22(8), 888–905 (2000).ITPIDJ0162-8828http://dx.doi.org/10.1109/34.868688Google Scholar

C. Fowlkes, D. Martin and J. Malik, “Learning affinity functions for image segmentation: combining patch-based and gradient-based approaches,” in IEEE Computer Society Conf. on Computer Vision and Pattern Recognition, Vol. 2, pp. 54–61 (2003).http://dx.doi.org/10.1109/CVPR.2003.1211452Google Scholar

G. H. Golub and C. F. Van Loan, Matrix Computations, John Hopkins University Press, Baltimore, Maryland (1989).Google Scholar

P. Arbeláez et al., “Contour detection and hierarchical image segmentation,” IEEE Trans. Pattern Anal. Mach. Intell. 33(5), 898–916 (2011).ITPIDJ0162-8828http://dx.doi.org/10.1109/TPAMI.2010.161Google Scholar

S. Belongie and J. Malik, “Finding boundaries in natural images: a new method using point descriptors and area completion,” in Computer Vision—ECCV 1998, pp. 751–766, Springer, Berlin/Heidelberg (1998).Google Scholar

R. C. Gonzalez and R. E. Woods. Digital Image Processing, Addison-Wesley Longman Publishing, Co., Inc., Boston, MA (1992).Google Scholar

L. Najman and M. Schmitt, “Geodesic saliency of watershed contours and hierarchical segmentation,” IEEE Trans. Pattern Anal. Mach. Intell. 18(12), 1163–1173 (1996).ITPIDJ0162-8828http://dx.doi.org/10.1109/34.546254Google Scholar

B. C. M. Merkl and M. R. Mahfouz, “Unsupervised three-dimensional segmentation of medical images using an anatomical bone atlas,” in 12th Int. Conf. Biomed. Engineering, Singapore, Singapore (2005).Google Scholar

T. F. Cootes and C. J. Taylor, “Statistical models of appearance for computer vision,” in Technical Report, University of Manchester, United Kingdom (2000).Google Scholar

A. Savitzky and M. J. E. Golay, “Smoothing and differentiation of data by simplified least squares procedures analytical chemistry,” Anal. Chem. 36(8), 1627–1639 (1964).ANCHAM0003-2700http://dx.doi.org/10.1021/ac60214a047Google Scholar

The Osteoarthritis Initiative (OAI) database provided the data used in the preparation of this article, which is available for public access at http://www.oai.ucsf.edu/ (January 2016).Google Scholar

K. H. Zou et al., “Statistical validation of image segmentation quality based on a spatial overlap index—scientific reports,” Acad. Radiol. 11, 178–189 (2004).http://dx.doi.org/10.1016/S1076-6332(03)00671-8Google Scholar

J. Duryea et al., “Trainable rule-based algorithm for the measurement of joint space width in digital radiographic images of the knee,” Med. Phys. 27, 580–591 (2000).MPHYA60094-2405http://dx.doi.org/10.1118/1.598897Google Scholar

## Biography

**Jing Wu** earned her bachelor’s and master’s (2005 and 2008) degrees in material science and engineering (MSE) from Shanghai Jiao Tong University, and a PhD in biomedical engineering (2016) from the University of Tennessee. She is currently a junior research engineer at TechMah LLC. She has 20+ publications in journal and conference proceedings. Her current research interests include medical imaging, computer vision, signal processing, machine learning, and optimization.

**Mohamed R. Mahfouz** is a professor of biomedical engineering at Mechanical Aerospace and Biomedical Engineering Department and director of the Center of Musculoskeletal Research at the University of Tennessee, Knoxville. He has authored book chapters, over 150 articles, and numerous abstracts and has given over 40 guest lectureships. His interests include ultra-wide band microwave, biomedical instrumentation, medical imaging, motion kinematic analysis, surgical navigation, biosensors, and 3-D bone and soft tissue reconstruction. He was the recipient of numerous NIJ, NIH, and NSF grants. He was awarded the titles of career development professor and research fellow by UT’s College of Engineering.