Sound evokes complex patterns of cellular motion within the hearing organ. A detailed characterization of these motions is essential in order to understand the way cochlear hair cells, the sensory cells of the hearing organ, are stimulated in response to sound.1, 2 The small size, the anatomical location inside thick bone, the fragility of its cells, and the complex three-dimensional (3-D) geometry, combined with a requirement for probing motion at the submicron scale makes it challenging to characterize hearing organ vibrations. Although a number of techniques3, 4, 5, 6 have allowed obtaining a great amount of information, no technique can be said to fully solve the problem.
The most sensitive method is laser interferometry, which allows in vivo measurements of cochlear vibrations down to the nanoscale (for reviews, see Refs. 7, 8). A major limitation of this approach is that it only permits measurements at one point at a time, making it hard to assess cellular deformations and the way different structures are mechanically coupled. An alternative approach is to use optical microscopy combined with motion analysis algorithms. This has permitted a more detailed monitoring of the motion patterns of the hearing organ, but has thus far been restricted to two dimensions.9, 10, 11, 12
Many algorithms useful for extracting object motion from an image scene have been described (for a review of the older literature, see Ref. 13; more recent developments are reviewed in Ref. 14). A key assumption in many of these algorithms is that brightness changes originate solely from motion. Horn and Schunck15 formulated this mathematically in the brightness constancy constraint equation [Eq. 1] which states that the sum of the spatial and temporal derivatives of the image sequence is zero. This implies that all changes in pixel brightness are due to motion alone. Because of detector noise, changes in cellular properties, and in the case of fluorescence experiments, bleaching, this constraint can never be fully satisfied in any biological experiment. Also, in the case of two-dimensional image sequences, the algorithm will only detect the projection of the motion on the image plane. Cells are usually part of 3-D structures that do not respect the two-dimensional (2-D) boundaries of an image. Consequently, it is desirable to develop algorithms that can assess the full 3-D motion pattern while providing a degree of immunity to unavoidable brightness variations. Such an algorithm is described here. The algorithm is an extension of the wavelet-based approach described previously.16
Methods and Materials
Optical Flow Formulations
Our approach to optical flow estimation is based on applying a differential constancy constraint equation15 to the imagesdenotes image intensity ( and being its temporal derivative and spatial gradient, respectively) and is the unknown displacement vector, at a particular position and time . For Eq. 1 to be useful, the motion must be sampled at a high enough rate and with a voxel size small enough for the motion vector to vary little among neighboring voxels. However, even under these conditions, Eq. 1 is generally not exactly satisfied in biological experiments. It may however be assumed to be valid after a proper normalization of the image sequence.17 To implement this idea, let us assume that the image is given by , where satisfies 1 and is a multiplication factor taking into account intensity variations not caused by motion of objects in the observed scene. The image then satisfies is a function of position and time, which we refer to as the brightness variation rate, because it determines the intensity variation of a given image particle along its trajectory. This multiplicative model is convenient for the case of the positive images generated by our confocal microscope, but additive models are, in principle, equivalent and could also be used.
The optic flow components and the brightness variation factor are now to be estimated altogether. To this end, we apply a discrete wavelet transform (DWT) of the image: , where runs over the number of scales used in the DWT, and labels the polarizations of the wavelet filters (there are eight such filters in three dimensions). Equation 2 then yields a set of constraints (one constraint for each wavelet filter),, , , and vary little over the supports of the wavelet filters . This set of equations can be written in the matrix form and are matrices of sizes and , respectively ( being the number of wavelet components used), defined by3 is found by least-squares inversion, i.e., we choose to minimize ) is given by , where and denote the transpose and inverse of the matrix , respectively. The above least-squares estimation procedure is local in the sense that the total least-squares error is obtained by minimizing the local errors with respect to independently for each position and time.
In practice, local irregularities sometimes occur even in areas where the aperture problem is expected not to affect results. To reduce such irregularities, the optical flow map was low-pass filtered with a 3-D Gaussian kernel with a standard deviation of . For our images, this kernel size turned out to be a good compromise, allowing irregularities to be substantially reduced without spoiling the optical flow accuracy by oversmoothing.
All computations were performed using Matlab (the Mathworks, Natick, Massachusetts), running on a Linux PC with of random access memory.
Animal Preparation and Visualization
The preparation used in these experiments has previously been described in detail.18, 11, 19 In brief, temporal bones from young anesthetized guinea pigs were excised, using procedures approved by the local ethics committee (permit no. N319/06). The bulla was opened, and the preparation attached to a holder in a chamber containing tissue culture medium (Minimum Essential Medium with Hank’s salts, Gibco, Paisley, Scotland). A small opening was made in the basal turn of the cochlea [see schematic drawing in Fig. 1 ], and a thin piece of plastic tubing inserted in scala tympani (ST) to allow perfusion of oxygenated medium and fluorescent dyes, which were used to label the inner ear structures. The sensory hair cells and auditory nerve fibers were labeled by the styryl dye RH795. Supporting cells were stained by the cytoplasmic dye calcein/acetoxymethyl ester (both from Molecular Probes, Leiden, the Netherlands). An additional opening in the apical part of the cochlea allowed visualization of the organ of Corti while providing exit for the medium.
The preparation was visualized with a water-immersion objective (Achroplan , numerical aperture 0.8, Zeiss, Jena, Germany) using a laser-scanning confocal microscope (LSM 510, Zeiss) equipped with a krypton/argon laser and a helium/neon laser. To avoid photobleaching and consequent cellular damage, all stacks were acquired at the minimum laser power compatible with an acceptable signal-to-noise ratio in the images. In most experiments, this resulted in a laser setting of of the maximum power for the laser line and for the line.
Organ of Corti Motion Evoked by Pressure Changes
ST pressure was changed by moving the perfusion reservoir above and below the fluid level of the preparation in a sequence forming a cycle, typically 0, , 0, , , where corresponded to the surface of the liquid surrounding the preparation. Each position was maintained for a few seconds to allow the hearing organ to reach a new equilibrium position, and a stack of confocal images acquired. The typical pixel format of the stacks was , requiring about to acquire at a pixel dwell time of . The spacing between the sections was .
The pressure changes in the ST are linearly related to the changes in height of the reservoir. They remain within the physiological range and may be assumed a fraction of or less.20 These changes induce reproducible, micron-sized movements of the cochlear partition.
In these experiments, the orientation of the cochlea varied and to compare results across experiments, it is necessary to use a standardized coordinate system. The reference frame that we used is an orthogonal coordinate system whose three axes at any particular point along the cochlea are defined as follows. The -axis, or longitudinal axis, is parallel to the coil of the cochlea and points toward its base; the -axis, or radial axis, is taken parallel to the plane of the reticular lamina and oriented from the inner hair cells to the outer hair cells; the -axis, or transverse axis, is taken orthogonal to and . Figure 1 shows the coordinate system superimposed on a maximum brightness projection of a stack of images acquired at different focal planes within the organ of Corti.
Validation of the Optical Flow Algorithm
The optical flow algorithm was evaluated on two different synthetic image sequences, one showing a sine pattern and the other a more complex template image showing cochlear structures. Both images had a maximal amplitude of 1 and minimum amplitude of zero. A uniform translation was applied to each pattern, using bicubic interpolation. The exact motion is therefore known, and a direct measure of the error in the optical flow estimation is possible. Because noise is unavoidably present in real image sequences, Gaussian or Poisson-distributed noise was added to each frame following interpolation. No systematic difference between the two types of noise was noted. The Gaussian noise had a mean of zero and standard deviation ranging between 0.01 and 0.04.
Results are illustrated in Figs. 2, 2, 2 for the sine sequence. The standard deviation of the Gaussian noise in Fig. 2 was 2.5%, representing an image with noise levels typical of those found experimentally. In the absence of added noise, the magnitude error, which is defined as the difference in Euclidean vector magnitude of the estimated and true motion vectors, was found to be for all displacements of voxels [Fig. 2 solid line]. A systematic downward translation of the entire curve was observed when the noise level increased. At intermediate noise levels, this downward translation resulted in curves with magnitude error throughout the range tested.
It is also important to determine the direction of motion. The angular error [Fig. 2] is a measure of the difference in orientation of the estimated and true motion vectors in the plane that they span. The angular error was found to be at all tested noise levels and displacements. For this artificial image, the direction of motion was therefore more precisely estimated than its size.
The above computations were performed on an image sequence that lacked variations in brightness over time. To simulate an actual experiment with bleaching due to laser exposure, additional tests were performed on sequences where the image amplitude was reduced by 2% for each motion step. After interpolation and brightness reduction, Gaussian random noise was added as described above. The results are shown in Figs. 2, 2, 2. The left half of Fig. 2 is identical to the one shown in Fig. 2. To give a measure of the magnitude of the brightness variations, the right half of Fig. 2 shows the final image in the series, where brightness was reduced to 56% of the original. Apparently, the algorithm can handle such brightness variation. The main effect is a reduction in the slope of the magnitude error curves [Fig. 2], which results in larger magnitude errors at high noise levels. However, angular errors were almost unchanged [Fig. 2]. All the curves remain , except for large displacements at the highest noise level. To estimate the improvement from using the brightness-compensated algorithm, we also performed computations where brightness compensation was removed from the algorithm. When using sequences with the above reduction in brightness, we found a substantially reduced performance for small motions ( voxels): The magnitude errors increased and so did angular errors. However, for larger displacements, the results were similar to those obtained with the brightness-compensated algorithm. The performance did not depend on the pattern of brightness reductions: A linear decrease in brightness over time gave the same results as a monoexponential one.
The moving sine pattern is an unrealistic test case because it contains only a single spatial frequency. Sharp boundaries are absent, making this sequence difficult for any gradient-based method. A more realistic test is achieved by using real images, which contain larger spatial variations in brightness.
Figure 2 shows a group of hair cells in situ. Important cellular structures such as the stereocilia bundles at the top of the outer hair cells are clearly visible. Noise levels, brightness variations and imposed motions were identical to those used in Fig. 2, 2, 2. From Fig. 2, it is evident that small displacements are detected with high fidelity when noise levels are low , but there is an overestimation of motion for the range between 3 and 5 voxels. At higher noise levels, good motion detection is seen at voxels, but results deteriorate for displacements larger than this. Thus, in terms of the magnitude error, some noise seems to produce better results than no noise. The reason for this unusual, but quite desirable behavior is not currently known. However, the angular error behaves differently. Near-perfect estimation of vector directions was achieved for artificial motions that lacked noise. Results get systematically worse with increasing noise, but the error remains at at all noise levels if the motion is voxels.
The factor in Eq. 2 provides an estimate of the brightness changes in the image sequence. To evaluate the accuracy of this estimate, we first assessed its variability in image sequences that lacked variations in brightness. In this case, the algorithm returned a faithful estimate, provided that motion was voxels [Fig. 3 ]. The most precise estimates were obtained when a moderate amount of noise was present. The “no-noise” condition as well as the extreme case of noise with a standard deviation of 0.05 both caused performance deterioration. The brightness estimates therefore appear to behave similarly to the motion estimates in Fig. 2. When actual brightness variations were present, a similar pattern was seen [Fig. 3]. The estimated brightness tracked the real brightness change, and the best estimates were obtained at intermediate noise levels.
Validation of the Experimental Setup
To assess the stability of our experimental setup, we estimated the random motion of the table and scanning mirrors of the confocal microscope by imaging a test sample repeatedly under identical conditions. The sample was a slide containing roots from convallaria majalis, which have a stable structure with bright fluorescence. A series of 11 three-dimensional image stacks were acquired, at two different magnifications, with a pause between each acquisition. Note that averaging was not used, and that the pixel dwell time was kept brief, resulting in images with substantial noise [Fig. 4 ]. The values for the stage motion given below therefore represent worst-case estimates because the noise in the images will lead to increased errors when calculating motion.
The root-mean-square displacement per step, , estimated from selected trajectories on the image, was computed asis the number of steps and represents the ’th displacement along a given trajectory . Similarly, the mean cumulative displacement after steps is defined by is the cumulative displacement after steps over the trajectory .
The displacement of the scanned volume was estimated with the optical flow algorithm. As seen in Fig. 4, the motion of the microscope stage from one stack to the next appeared to be random, with a root-mean-square displacement per step and a cumulative displacement after 10 steps of (pixel size ). Note that motion along the -axis, the position of which changes during acquisition of a focus series, was most pronounced. This probably relates to slight imperfections in the focus motor controlling the position of the microscope’s table. We may define a quantity, , which is interpreted as an effective number of steps, which should be approximately equal to if the stage was undergoing unconstrained Brownian motion. In Table 1 , it is seen that is substantially smaller than , which indicates that the stage and scanning system is in fact physically constrained. Obviously, this is a very desirable feature. During , the system does not wander around its mean position by more than a fraction of , a prerequisite for its use for motion measurements. Note that this constitutes an upper bound on stage motion, as this value will be influenced by any error present in the optical flow calculation in addition to the actual motion of the stage. We conclude that errors caused by random shifts within the microscope were small and affected the optical flow estimates made in the cochlear preparation by less than a few percent.
Estimates of the step size along the trajectory shown in Fig. 4. np is the number of image points over which averages are taken; θ1 is the root-mean-square displacement of the stage for a duration of 60s ; θn is the cumulative rms displacement after a duration of n×60s , here for n=10 ; neff=(θn∕θ1)2 is to be interpreted as an effective number of steps, which should be approximately equal to n if the stage was undergoing unconstrained Brownian motion. The small neff indicates that the stage is physically constrained and does not wander around its mean position by more than a fraction of a micron.
Optical Flow Estimate of 3-D Organ of Corti Motion
The main motivation for the development of our optical flow algorithm was to analyze the 3-D motion patterns of the complex cellular structures of the hearing organ in response to different kinds of mechanical stimulation, including sound. As a first step, we analyzed the 3-D displacement evoked by quasi-static ST pressure changes. Such pressure changes were used in several previous studies;20, 18 this type of stimulation leads to a consistent motion pattern within the organ of Corti, the magnitude of which depends on the quality of the seal between the cochlea and the perfusion tubing, the size of the opening at the apex, and the ST impedance.
Seven different experiments showing pressure change–induced movements of outer hair cells in the apical part of the cochlea were analyzed using identical parameters in the optical flow algorithm. An example showing data from two preparations is given in Fig. 5 . As seen in Figs. 5 and 5, the orientation of these two preparations with respect to the optical axis of the microscope differed substantially. The optical section in Fig. 5 is well aligned with the long axis of the cylindrical outer hair cells, whereas the preparation in Fig. 5 was nearly perpendicular to this, with the short axis of the hair cells nearly parallel to the image plane. The optical flow vectors were therefore converted to the standard coordinate system shown in Fig. 1 where the transverse axis is perpendicular to the reticular lamina, the radial axis points away from the center of the cochlea, and the longitudinal axis is directed along the cochlear spiral, toward its base.
Positive ST pressures were used in Fig. 5. The resulting motions of the apex of a first row outer hair cell are plotted with dashed lines in Fig. 5 and 5. The transverse displacement, given on the -axis, was largest, with amplitude of . There was also a smaller motion component directed to the apex of the cochlea, along the cochlear spiral. This longitudinal component reached an amplitude of . As seen in Fig. 5, the smallest motion was directed to the center of the cochlea. In the preparation shown in Fig. 5, negative ST pressures were used. Despite the very different orientation of this preparation, the main axis of the motion vectors is quite similar to the ones described above, and the transverse component remains the largest one. This gives additional confidence in the optical flow algorithm. Evidently, the motion of the organ of Corti evoked by these pressure changes is complex, with significant components in all three directions.
Confocal microscopy and optical flow makes it possible to assess the deformation of cells inside the hearing organ. Image data were used to generate the 3-D wire-frame models shown in Fig. 6 . Figure 6 shows an inner hair cell with its characteristic pear-shaped cell body, and its apical surface perpendicular to the transversal axis. The original wire-frame model of the cell (red) was deformed according to the computed optical flow map. This generated a second wire frame, shown in gray. Following a pressure change, motion components in all three directions were seen, but the largest component was in the radial direction, and the cell therefore appears to be translated toward the right of the image. Figure 6 shows data from an outer hair cell in the same preparation as Fig. 6. The two cells were close to each other, but not at exactly the same longitudinal position. Note the different scales used in Figs. 6 and 6. At the outer hair cell, a small motion component in the transversal direction was present, but the largest motion was radial. This radial motion is much larger than that found at the inner hair cells. The radial motion was larger near the bottom of the cell, which therefore shows a swinging motion, but little apparent deformation. By comparing the data in Figs. 5 and 6, it is apparent that motions at the surface of the hearing organ, displayed in Fig. 5, are different from those found inside (Fig. 6). In particular, the cell bodies of outer hair cells make sideways motions that are larger than those seen at the surface. These sideways motions were bigger than the surface motion in six out of seven cases.
In this paper, we present a new method for brightness-adjusted 3-D optical flow calculation based on the discrete wavelet transform. Brightness adjustment is highly desirable because bleaching is a common experimental problem in confocal microscopy. Cells may also undergo physiologically relevant changes that alter their fluorescence. Such changes could severely limit the capacity for motion detection, unless a brightness-compensated algorithm is used.
Our algorithm was validated with artificial as well as real image sequences that were degraded by noise. These tests show that the performance of the algorithm depends on the properties of the image, a factor that needs to be considered when performing experiments. Obviously, an image with poor contrast, high noise levels, and small gradients will result in suboptimal motion estimation. Efforts should therefore be made to control noise and, because errors increase for motions of voxels, to tailor the voxel size to the expected magnitude of the motion. Fortunately, both of these factors can to a large extent be controlled on current confocal microscopes.
A fundamental physical problem that cannot be circumvented is that brightness changes originating from motion cannot be fully separated from other types of brightness changes (e.g., Ref. 21). In practice, this creates a trade-off between motion detection capabilities and brightness change detection. The primary purpose of our algorithm is motion detection, and as seen in the performance evaluations in Fig. 2, the algorithm performs very well even under conditions of elevated noise.
Hearing Organ Motion
Optical flow algorithms have previously been used in several studies in the context of hearing research.9, 10, 16, 22, 23 Their use arises naturally from the fact that the hearing organ is a complex structure, the function of which depends on interactions between different cell types. Probing such interactions requires a system capable of at least 2D measurements.
The quasi-static pressure changes investigated in the current study represent a first step toward investigating sound-evoked cellular interactions in the hearing organ. We show here that the motion evoked by such pressure changes is complex, with components in all three directions. At the surface of the hearing organ, the reticular lamina, the transversal motion component was the largest one, followed by the longitudinal one. The longitudinal component was directed toward the apex of the cochlea. This may be a consequence of the anatomical arrangement of cells within the tissue: The Deiters cells are supporting cells with long processes extending in the apical direction from the base of the outer hair cells. This results in longitudinal mechanical coupling that may promote motion directed at the apex, in the forward direction of the traveling wave.
Interestingly, the cell bodies of outer hair cells showed larger motion than those seen at the surface. A similar behavior was observed in a previous study using 2-D optical flow, where motion was provoked by pressure gradients much larger than those utilized here.18 This behavior also occurs during sound stimulation,10 and during electrical stimulation of isolated cochleae, where movements inside the hearing organ were substantially larger than those seen at the surface.24 Collectively, these results suggest that radial movements of outer hair cell bodies contribute substantially to the internal mechanics of the organ of Corti.
We thank Igor Tomo for help in performing experiments. We were supported by the Swedish Research Council, the Human Frontier Science Programme, the Tysta Skolan foundation, Hörselskadades Riksförbund, and funds of the Karolinska Institute.