Translator Disclaimer
7 August 2019 Measurement model and observability analysis for optical flow-aided inertial navigation
Author Affiliations +

We introduce a navigation filter fused with information of visual optical flow and data collected from an inertial measurement unit during GPS signal degradation. Under the assumption that the tracked feature points are located on a level plane, the feature depth can be explicitly expressed and an exact measurement model was derived. Moreover, an error model analysis for a block-matching-based optical flow algorithm has been investigated. The measurement error follows a Gaussian distribution, which is a prerequisite for leveraging the error-state Kalman filter. Subsequently, a local observability analysis of the proposed filter was performed yielding the expression of three unobservable directions. We emphasize the ability of the proposed filter to estimate the aircraft’s state, especially for accurate altitude estimation, without any help of prior knowledge or extra sensors. Finally, an extensive Monte Carlo analysis was used to verify the findings in the observability results showing that all states can be estimated except the absolute horizontal positions and rotation around the gravity vector. The effectiveness of the proposed filter is demonstrated through experimental hardware used to acquire outdoor flight test data.



During the past decades, there has been a growing research interest in developing vision-based or vision-aided navigation systems for unmanned aerial vehicles (UAVs) under circumstances without prior information.14 The traditional GPS/inertial navigation system (INS) may face severe drift and inaccuracy for the estimation, especially in environments where GPS is not reliable. Integrated inertial systems using sensors such as radar, LiDAR, or sonar can be regarded as a possible method to improve accuracy.5 However, radar or LiDAR may increase the cost, weight, and power consumption for an airborne platform, while low-cost sonar always shows limited performance for flights in outdoor scenes. Camera-aided navigation and visual odometry, on the other hand, provide rich information about the environment and the camera pose, which seems to be a superior solution to the above problems.

In recent years, simultaneous localization and mapping gathered a lot of attention since it can provide accurate and real-time estimates of six-dimensional parameters. Nevertheless, it is subject to a series of strict requirements such as high-quality images for feature extraction, hardware with powerful onboard processor needed as computational complexity increase related to the quantity of features, and repeated tracking for the same image features over longer periods of time is needed to realize loop-closure.6 Moreover, there are few applications for fixed-wing aircraft that are faster and higher than quadcopters in an outdoor unstructured environment. Optical flow-aided INS evades the need for mapping and landmark data and shows better tolerance in terms of handling scenes with less salient features and little contrast texture. In addition, it has a compact image representation and does not require the extra burden for image storage without loss of rich navigation information on the translation and rotational velocity as well as the scene depth.7 It is more suitable for implementation in systems with limited resources. Previous tests have been conducted to compare the performance between different optical flow algorithms for UAV navigation, as introduced in Ref. 8. The optical flow observation error is usually modeled with the default zero mean normal distribution without any explanations.9 We turn to computer vision scientists for a better answer, but it was found that they mainly focus on the evaluation methodology to give new statistics for indicating the performance.10 They have raised definitions such as “absolute flow endpoint error” and “frame interpolation error,”11 rather than proposing a specific model analysis for the flow error. Furthermore, the use of INS aided with optical flow generated by a single camera often adopts an additional ultrasound range sensor to measure the distance to the ground,12 which solves the ambiguity of the scale factor between the translational velocity magnitude and the scene depth.9 Another commonly used method assumes that the specific maneuvers for UAV could be designed to keep a constant height or attitude. Five optimization constraints-based methods for the estimation of ego-motion were presented in Florian Raudies’ review,10 which can successively solve the problem. As an example, Thomas P. Webb13 leveraged epipolar constraint of feature points as the measurement equation in an implicit extended Kalman filter, also termed the essential or coplanarity constraint, and optical flow subspace constraint is similar to its differential version. However, the filter does not perform well in estimating the velocity states as the epipolar constraint weakly relates to velocity. A bilinear constraint-based visual-inertial scheme for relative motion estimation has been introduced.7 The observability analysis of INS errors for explicit measurements of a subspace constraint-based optical flow method was investigated in Jacques Waldmann’s work.14 These approaches require sufficient numbers of optical flow features to obtain the corresponding linear equations for the unknowns. Xu et al.2 presented an autonomous landing method using planar terrain features tracked in sequential images. Using the assumption of a level plane, therefore allowing for the scale of the line-of-sight vector to be expressed explicitly.

Compared to existing work, the main contributions of the proposed research focused on (1) a detailed error model analysis for optical flow measurement based on block-matching algorithm was investigated and showed to obey a Gaussian distribution and therefore suitable for error-state Kalman filter (ESKF); (2) the feature depth was obtained without extra-measurement device through the assumption of flat terrain, which could be reasonably used for outdoor navigation missions when UAVs fly at a high altitude. In the flat terrain assumption, the height proved to be observable by local observability analysis, which was further verified by simulation and experimental results. In addition, a more detailed development of the ESKF is proposed, which fused the information from IMU and optical flow originally described in Refs. 15 and 16. The paper is organized as follows: Sec. 2 describes the problem formulation, introduces the adopted sensors’ dynamical model, reviews the computation of an optical flow block-matching algorithm, and derives the measurement error analysis. Section 3 outlines the structure of the ESKF to estimate the full states. Section 4 presents the observability analysis. Sections 5 and 6 show the simulation, experimental results, and corresponding discussion; the final conclusions are given in Sec. 7.


Problem Formulation

The geometry of the general case of an aircraft equipped with an IMU and a monocular camera is shown in Fig. 1. Symbol {B} represents the aircraft body-fixed frame, and the vehicle relative position to the global inertial reference frame {G} is denoted by the vector rG. The camera frame {C} is fixed to the aircraft with a mounting displacement ΔB. We denote the coordinates of the j’th feature point Pj in the frame {G} by PGj, while PjC denotes the relative position of Pj w.r.t. the camera frame {C}. The transformation matrices TGB and TBC describe the rotations from {G} to {B} and {B} to {C}, respectively, where TBC is dependent on the mounting of the camera to the aircraft. The relative position of Pj can be expressed in the camera frame {C} as

Eq. (1)


Fig. 1

Problem geometry and pinhole camera model.


Considering the differential equation with respect to the reference frame {G} would require calculating the derivative of the rotation transformation matrix. Therefore, the rotational velocities of the vehicle expressed in the frame {B} and {C} are represented by ωB and ωC, respectively, resulting as

Eq. (2)

where P˙Gj=0 derives from the assumption for static feature point. Consider a strapdown camera that can be assumed to have a fixed location coinciding with the center of gravity of the aircraft, so that {B} and {C} can be regarded as the same frame, i.e., ΔB=Δ˙B=0, and TBC is the identity matrix, which yields the simplified Eq. (3):

Eq. (3)



Pinhole Camera Model

This work used a pinhole camera model, whose details are shown in Fig. 1. The camera coordinate frame is defined by zC paralleling to the optical axis, xC and yC matching the horizontal and vertical image directions. The projection results of the feature Pj on the image plane are expressed as

Eq. (4)

where f is the focal length, [μj,νj]T is the projected pixel coordinates of the feature point, and PCj=[xCj,yjC,zCj]T. With the previous assumption on the identity of the frame {B} and {C}, the coordinate of the feature expressed in the camera frame is given as

Eq. (5)


When considering the flight environment for fixed-wing aircraft, which generally holds the flight height above hundreds of meters, flat terrain such as plain and farm field the topographic relief can be neglected compared to the flight height. In such scenes, we assume that the visual features lie in a level plane. Further, without loss of generality, by assuming the feature plane contains the origin of the global inertial frame, we obtained e3TPjG=0, where e3=[001]T. We also adopted the unit quaternion qGB=q0i+q1j+q2k+q3 to represent the rotation from the global coordinate frame {G} to the body-fixed coordinate frame {B} at the time t. Hence, TGB can be replaced by C(qGB) which describes the rotational matrix corresponding to qGB. Thus, the scale of line-of-sight vector and also the depth component of PjC, namely zCj in Eq. (5) can be represented as

Eq. (6)



Optical Flow Computation

After the characterization of the motion equation of the camera and pinhole camera model, we focused on the optical flow computation method. In this work, the optical flow was determined as the displacement between two successive images divided by the time interval between their capture. Based on the local brightness constancy assumption and small motion assumption, the block-matching algorithm approximates the image motion by a displacement, which yielded the best match between image regions. The concept of the procedure, shown in Fig. 2, as well as the specific algorithm for displacement computation outlined in Fig. 3, starts with distinct feature extraction and ends up with minimizing the SSD cost function among the search window,9,17 and the result provided the displacement d(Xk) of the current feature, which can be formulated as

Eq. (7)


Eq. (8)

where IO(·) and IC(·) are both image intensity functions. Thus, the two-dimensional (2-D) optical flow ζ˙ can be obtained, i.e., ζ˙=d(Xk)Δt.

Fig. 2

Block-matching method.


Fig. 3

Algorithm for displacement computation.


Furthermore, we proved that the error of optical flow computation based on the SSD block-matching algorithm is subject to a Gaussian distribution, which is not fully explained by most reviewed articles, such as Farid Kendoul’s work.9 Therefore, we redefined the whole algorithm as a maximum likelihood estimation problem. The displacement of the pixel can be seen as a warp function ω, which transforms a pixel point pk(xk,yk) in the original image frame (IO) into a point pk(xk,yk) in the current image frame (IC) by a 2-D deformation vector ζ, namely pk(xk,yk)IO, ω(pk,ζ)=pk(xk,yk)IC. Based on the local brightness constancy assumption, which is given as

Eq. (9)

without loss of generality, εk can be modeled as Gaussian noise as εkN(0,σ2). Given the b×b size block of pixels, the problem aimed at the estimation of ζ by a patch of pixel samples {IO(pk)}k=1n=b×b. According to Eq. (9), IO(pk)N{IC[ω(pk,ζ)],σ2}, thus the corresponding probability density function is provided as

Eq. (10)


Hence, we derived the likelihood function for the pixel samples as

Eq. (11)

where the log-likelihood can be written as

Eq. (12)

in which, nlogC only contains items irrelevant to ζ. The maximum likelihood estimator ζ^ maximizes the log-likelihood, i.e., ζ^=argmaxζlogL(ζ), which is equivalent to finding the optimal displacement value that minimizes the least square function according to Eqs. (7) and (8). Therefore, we obtained ζ^=argminwdx,dywSSD(Xk,dx,dy).

The maximum likelihood estimate (MLE) is proven to be asymptotically normally distributed,18,19 and the estimated covariance error of MLE is given by the inverse of the observed Fisher information matrix; therefore, the maximum likelihood estimator can be expressed as

Eq. (13)

where ζ0 denotes the true displacement parameter, which can be obtained by the SSD block-matching algorithm. It is worth mentioning that in this context, the SSD block-matching algorithm is a brute force method that would guarantee returns with the global optimum. Matrix J(ζ0) is the Fisher information for an individual observation, while J(ζ^ML) is the observed Fisher information matrix and is defined as

Eq. (14)

which is also the expectation of the negative Hessian matrix of the log-likelihood function.18 Since the optical flow ζ˙ was calculated with the displacement motion divided by interframe time, we saw that the optical flow error maintained a Gaussian distribution, which was the prerequisite for adopting the Kalman filter.

Next, we deduce the measurement equation of optical flow. The projected pixel coordinates of the feature point Pj on the image plane are given by Eq. (4), taking the time-derivatives of both sides of Eq. (4) leads to the expression for optical flow velocity

Eq. (15)


Consider the relative motion between the camera and feature point by replacement of P˙Cj in Eq. (3) into Eq. (15), and eliminating [xjC,yjC,zjC] with [μj,νj,f] based on Eqs. (4) and (6), the optical flow equation can be computed as

Eq. (16)

where vG is the absolute velocity of the aircraft in the global inertial frame, i.e., r˙G=vG, and ξj was modeled as zero mean, white Gaussian noise with covariance Rξj which is related to the inverse of the negative Hessian for log-likelihood function as shown in Eq. (14).


IMU Measurement

The IMU outputs are measured quantities in the body-fixed frame, angular rate ωm and acceleration am, which were modeled as20,21

Eq. (17)

where ωR3 and aR3 are the true value, which are corrupted by IMU biases bω and ba, IMU drift noise nω and na which assumed to be zero-mean Gaussian white noise. g is the gravitational acceleration. The IMU biases are nonstatic but rather are seen as a random walk process with the biases noise nbω and nba. All the noise terms n=[naT,nωT,nbaT,nbωT]T are specified by the corresponding scalar variance coefficient σna2, σnω2, σnba2, and σnbω2; the diagonal covariance matrices are given by Qa=σna2I3, Qω=σnω2I3, Qba=σnba2I3, and Qbω=σnbω2I3. According to Refs. 22 and 23, they can be obtained by the Allan variance technique.


Estimator Description

With the purpose to integrate the two different sources of information collected by IMU and monocular-camera, an ESKF was designed. While the IMU information served to make predictions to the filter, the optical flow vision information was used to correct the filter. Compared to the general extend Kalman filter, which consists of states prediction step and update step, the ESKF considers the error states induced by the reset process, producing more precise results by reducing the long-term error drift in odometry system.


System State and Propagation

The estimator vector was the 16×1 vector, and all the symbol definition can be found in Sec. 2.

Eq. (18)


The system kinematic equation that describes the time evolution of the state was similar to Hesch’s model,21 of which the corresponding linear discretized form of predicted state estimate is provided as

Eq. (19)


In order to derive the process noise covariance matrix, the error dynamic needed to be derived from the kinematic equation along its nominal form. We defined the error state in the estimate x^ of a quantity x for aircraft position, velocity, and IMU biases as δx=xx^. And it is exceptive for the quaternion error, which was defined by the 3×1 angle-error vector δθ, as δq[12δθT1]T. Then, the linearized first-order approximation of continuous-time error-state kinematics14,20,24 are provided as

Eq. (20)


In practical terms, the ESKF is realized in a time-discrete manner, with the update time interval Δt=tk+1tk, which was also seen as the sampling period of IMU signals. By using numerical integration of Eq. (20), we obtained the discrete error-state equation

Eq. (21)

where Φk was the transition matrix and Gk denoted the perturbation matrix of noise vector nk. In the process of integration and discretization, based on Joan Sola’s results,20 the following auxiliary series was introduced:

Eq. (22)

by assigning the value of n with {0,1,2,3}, we obtained

Eq. (23)


Here, Γ0T means the incremental rotation matrix if rotating an arbitrary coordinate frame with a rotational rate of ω for timespan Δt according to the Rodrigues rotation formula; therefore, the discrete linearized error dynamic transition matrix Φk as a function of above series24 (for readability Γ^nT=ΓnT[ωm(k)b^ω(k)]) can be written as in matrix

Eq. (24)


Eq. (25)


The discrete process noise covariance matrix Qk was calculated as

Eq. (26)


Thus, the error covariance propagation resulted as

Eq. (27)



Measurement Update Equation

According to Eq. (16), the optical flow measurement can be expressed by the estimated state vector. Since the aircraft angular rate was not part of the state vector, we replaced it with ω^B=ωm(t)b^ω(t). Substituting the coefficient matrices associating with μj, νj, and f in Eq. (16), and it would be clearer with notation listed as

Eq. (28)


Eq. (29)


Furthermore, by subtracting measured angular rate item and redefining the revised optical flow measurement ζ¯˙j as a new measurement of the filter, the right side was expressed only using the state vector x^; therefore, the following notation remains as

Eq. (30)


So that the measurement residual, also called innovation, was given as

Eq. (31)

where the Jacobian measurement was

Eq. (32)


Eq. (33)


Eq. (34)


Eq. (35)


Eq. (36)


Note that the residual noise ηkj was involved with the optical flow algorithm error and gyroscope drift noise, i.e., ηkj=ξkjBPj·nω,k, hence, the covariance matrix of the revised optical flow measurement was the sum of the covariance of optical flow algorithm Rξj and the gyroscope covariance Qω, which was given as

Eq. (37)


In general, there were a set of optical flow vectors measured at one time, stacking all the observation residuals of Eq. (31) into one vector, which yielded, yk=Hkδxk+ηk, where yk, Hk, and ηk were composed of the block elements ykj, Hkj, and ηkj, such that the measurement covariance matrix could be stacked together diagonally as

Eq. (38)


Then, the a priori state estimate would be updated with current measurements according to the following steps: (1) calculate the Kalman gain as

Eq. (39)

and (2) calculate the observed error state correction as

Eq. (40)

with these quantities, the nominal states and error covariance get updated as

Eq. (41)



Error-State Kalman Filter Reset

For the ESKF reset process, we considered the expression among the new error δx^k+, the old error δx^k, and the observed error correction Δxk, as the true value was constant, yielding x^k++δx^k+=x^k+δx^k. Considering that the observed error has been injected into the updated nominal state, i.e., x^k+=x^k+Δxk so that δx^k+=δx^kΔxk, we defined an error reset function δxg(δx)=δx^kΔxk, where the reset operation was to ensure the error would reset δx0, meanwhile, the covariance of error needed to be modified as well,14 thus leading to

Eq. (42)

where G represented the Jacobian matrix according to Joan Sola’s proof15 defined as

Eq. (43)



Observability Analysis

The estimator employed a linearized and discretized version of the nonlinear system model (to simplify the analysis, we only consider one single feature point):

Eq. (44)

where errors caused by linearization and discretization are combined with the noise items nk and ηk. With the purpose of revealing the theoretical limitation of the observer, a tool for the analysis of the observability properties—local observability matrix M(x) was adopted,25 which was defined by the error dynamic transition matrix Φk and measurement Jacobian Hk, i.e.,

Eq. (45)


Eq. (46)


Eq. (47)


Here Δtk=tkt1, and the computing method for obtaining analogous expressions of the subblock elements Φmn can be found in Joel Hesch’s work.26 Furthermore, using these expressions of Φmn, we obtained the k’th block row Mk of M(x), for k2

Eq. (48)

and both Φmn and Mk are presented in detail in Sec. 8. It is noted that Mk4 and Mk5, as containing the nonzero integrals are time-varying matrices and have the linearly independent columns. So that we can only consider the remaining block elements of Mk to form the basis of the right nullspace of M(x), which is stated as follows:

Theorem I:

The right nullspace N of the local observability matrix M(x) for the linearized model is spanned by three unobservable directions:

Eq. (49)


The proof of the Theorem I is detailed in Sec. 9.

Nullspace N reveals that global rotations about the gravity vector is not observable, and global horizontal position is unobservable while height is fully observable. The remaining states, i.e., the velocity, pitch, and roll angle as well as IMU biases are also observable. The fact that the rotation around the gravity vector is unobservable is due to the missing of an absolute inertial yaw orientation, namely the yaw angle in the inertial frame is unobservable, while the absolute roll and pitch angle of aircraft w.r.t the global inertial frame is made observable by the gravity vector measurement. Similarly, without a global reference, the position in the direction of x and y is not observable, but what is different from previous findings, the altitude is observable because the assumption that optical flow features used for filter updates lie in a level plane and the plane contains the origin of the inertial frame. Only use the feature-plane assumption, according to Eq. (6), the feature depth is explicitly determined by absolute altitude and aircraft attitude, which becomes part of the components in the optical flow measurement equation and it, therefore, can get corrected when the new measurement update is coming.


Numerical Simulation

To numerically investigate the previous analysis, the proposed navigation scheme was implemented in a simulation. As shown in Fig. 4, there were 100 features randomly placed in a [350,350  m]×[350,350  m] horizontal plane around the origin. For generating the ground truth and sensor readings, we simulated a flight with a piecewise continuous trajectory, beginning with constant velocity at 20  m/s straight line motion lasting for 4 s, following by banking turn with a maximal angle of bank of 30 deg, and then a spiral motion with a pitch angle of up to 9 deg at the altitude between 200 and 300  m, ending with a 50 s of circling. A full list of parameters is shown in Table 1, in which most of those quantities were referred in the commonly used IMU datasheet, e.g., MPU6050 or ADIS16448.

Fig. 4

IMU-camera trajectory and feature used in the simulation.


Table 1

Simulation parameters and initial conditions of the filter [“rand (3, 1)” here represents a 3×1 random vector obeying standard normal distribution].

IMU sampling rate (Hz)100
Camera process rate (Hz)30
Accelerometer noise [m/(ss)]σna=2.24×103
Gyro noise [deg/s]σnw=0.005
Accelerometer bias noise [m/(s2s)σnba=7.53×105
Gyro bias noise [rad/(ss)]σnbw=1.08×105
Field of view (FOV) (deg)90
Optical flow noise (pixel·rad/s)0.01
StateInitial valueTrue value
Position (m)True value + 50·rand(3,1)[-50 -180 -200]
Velocity (m/s)True value + 10·rand(3,1)[20 0 0]
QuaternionTrue value + 0.5·rand(3,1)[0 0 0 1]
Accelerometer bias (m/s2)[0 0 0][0.0981 0.0981 0.0981]
Gyro bias (deg/s)[0 0 0][0.5 0.5 -0.5]

The choices of the initial state values are given in Table 1, which is showing that the filter had fault tolerance to the initial state offset to an extent for convergence. In addition, the initial filter error covariance matrix was set as P0=diag([2500I3100I30.25I30.01I37.6×105I3]).

Even though using a single optical flow measurement ensured the observability, in a practical trial through proper setting of FOV and initial feature quantity, we would have at least seven features to be extracted every time step to obtain better convergence. After running 100 Monte Carlo simulations, we computed the Monte Carlo standard deviation, also known as the root mean squared error, denoted by the sample error covariance PjMonte through collecting N simulation results as

Eq. (50)


Here, at the j’th time step, xj was the true state, and x^ij was the estimated value in the i’th simulation. The Monte Carlo standard deviation and filter standard deviation are the square root of the elements in the main diagonal of PjMonte and ESKF estimation error covariance, respectively.

In Figs. 5Fig. 6Fig. 7Fig. 89, it is shown the results of the proposed filter tracking ground truth for the position, attitude, velocity, and IMU bias. The shown results outline that most of the state estimations converge to ground truth occurred within 20 s, except the unobservable states, positions along x and y axes and the yaw angle, whose Monte Carlo standard deviation as well as the filter standard deviation at the final time maintained around values at the initial time. The unobservable states could remain with a constant error rather than absolute divergence since they could have proper evolution and IMU biases correction but lacking the initial reference. This correlated well with the observability analysis of horizontal position and yaw in Sec. 4.

Fig. 5

Position estimation and corresponding Monte Carlo and filter standard deviation σ hull with respect to the ground truth.


Fig. 6

Attitude estimation and corresponding Monte Carlo and filter standard deviation σ hull with respect to the ground truth.


Fig. 7

Body-fixed velocity estimation and corresponding Monte Carlo and filter standard deviation σ hull with respect to the ground truth.


Fig. 8

Inertial velocity estimation and corresponding Monte Carlo and filter standard deviation σ hull with respect to the ground truth.


Fig. 9

IMU bias estimation and corresponding Monte Carlo and filter standard deviation σ hull with respect to the ground truth: (a) the accelerometer bias and (b) the gyro bias.


Among the observable states, as the observability analysis disregards process noise and initial offset, as the filter standard deviation decreases promptly, the Monte Carlo standard deviation seems to converge to zero slowly. Comparing with curves of the gyroscope bias, the uncertainty of accelerometer bias presented a much slower convergence, which was induced by the gyroscope bias being explicitly included in the measurement equation and therefore directly corrected while the accelerometer bias only relied on states propagation and obtained the effect via coupling with other states. For the estimation of roll and pitch angle, height as well as vertical velocity, a better agreement between the Monte Carlo sample covariance and the filter error covariance than other states are shown in plots; nevertheless, the estimation of horizontal inertial and body-fixed velocity relied on mutual tight coupling effect with estimated attitude contained with accelerometer bias and unobservable yaw angle.



The proposed filter has been tested on an X-star quadcopter which was equipped with Pixhawk 4 and PX4Flow board. The optical flow camera consisted of a 752×480 MT9V034 image CMOS sensor with global shutter and 16 mm M12 lens, 24×24  μm, whose output to optical flow data was 4×4 binned image at 10 Hz. The camera module could be considered as a pinhole camera with the mounting method for pointing the camera straight down. The employed IMU was a dual redundancy solution with BMI055 and ICM-20689, providing inertial measurements at 125 Hz.

Ground truth was collected by the conventional INS based on extended Kalman filter and attitude and heading reference system algorithm, which integrated IMU outputs, magnetometer readings, and barometric altitude with GPS data. Note that these extra sensors were only for generating true value and not a part of the proposed navigation system, nor used for estimates.

In Fig. 10 the outdoor trajectory during the autonomous controlled waypoint mode flight test is shown. The flight envelope was set with a maximum height at 10 m and maximum speed around 7  m/s. The quadcopter took off at the origin, as we had little information of the IMU, the estimated states were initialized with zero value as this was the only option. The covariance parameters could be selected based on hardware specifications, which did not require strong tuning as the framework could work well with a large range of parameters.

Fig. 10

Test flight path.


From results in Figs. 11Fig. 1213, the filter operation was started in the flight at around t=70  s and at a speed of 5  m/s. The roll and pitch angle exhibit a high-quality tracking accuracy except the growing error occurs when the vehicle was ready to descend for landing at the time after 150 s, as the horizontal velocity drops which was not favorable for the accuracy of optical flow measurement. Due to misalignment of the initial value, the yaw angle always kept a deviation with true value, which might make a major contribution to the estimated error of velocity along x and y axes, especially during t[120,140  s] as they deviate from ground truth with a greater oscillation. Under unknown noise of the hardware system or vehicle vibration, the horizontal velocity and horizontal position estimates approached their true value more slowly and less precisely, but vertical velocity, as well as altitude estimate, exhibited more satisfactory accuracy.

Fig. 11

Position in test flight.


Fig. 12

Attitude in test flight.


Fig. 13

Inertial velocity in the test flight.




This paper has presented an optical flow-based approach to vision-aided inertial navigation. Under the assumption that the environment of the observed feature points can be seen as located on a plane, the feature depth can be obtained by deducing distance expression through the optical flow equation with a geometry-coordinate relationship based on a vehicle-camera-feature system. In order to fill in previous works’ gap, an error analysis for optical flow measurement using a block-matching method was provided, which was the prerequisite for adopting and designing an ESKF. With the IMU dead reckoning used for prediction, optical flow, and feature measurement for correction, the estimates of the vehicle’s altitude, attitude, velocity, and drift IMU biases can be obtained. The theoretical limitations and feasibility of the filter were investigated by observability analysis, which has shown that only the absolute position along x and y axes along with the yaw angle were unobservable. Finally, the Monte Carlo simulation and hardware involved experiment validated the statement in observability analysis.

The numerical and real experiments have shown that the effectiveness and performance of the proposed filter may suffer from initialization errors and noise. Considering that the minimal sensor configuration was low-priced and the adopted method required less computational effort, the overall system could be an onboard estimator for providing real-time supplemental state estimate when the UAV would be placed in an unknown GPS-denied environment.


Appendix I: Expression of Error Dynamic Transition Matrix Subblock Elements Φmn and Calculation of Mk

According to Ref. 26, the sub-block elements used in this paper are shown as

Eq. (51)


Moreover, with these expressions, and based on Eqs. (33)–(36), the k’th block row Mk of M(x) was calculated, for k2

Eq. (52)


Eq. (53)


Eq. (54)


Eq. (55)


Eq. (56)

Mk5=Hδr,kΦ15+Hδv,kΦ25+Hδθ,kΦ35+Hδbω,k=e3TCkTπPe3TrkGrkGeT3APCkvkGe3T·t1tkt1t(asGg)×t1sCτTdτdsdte3TCkTπPe3TrkGAPCk·t1tk(asGg)×t1sCτTdτ  dsAPCkvkGe3Te3TrkGCkTπP×Ckt1tkCtTdte3TCkTπPe3TrkGAPCkvkG×t1tkCtTdtBP.


Appendix II: The Proof of Theorem I


To verify the result, we just directly multiplied each block row of M(x) with N.


Eq. (57)


Eq. (58)


Since e3T=[001], both e3TCkTπPe3TrkGrkGe3TAPCkvGke3T and APCkvkGe3Te3TrkG had such structure [00*00*], which were marked by Σk1 and Σk2, respectively, hence

Eq. (59)


Eq. (60)


Eq. (61)


Given Σk1·×g=02×1, Σk2·×g=02×1, and g×g=02×1, it yielded H1N=02×3, MkN=02×3, so that we had M(x)N=0.


The first author wishes to thank Professor Walter Fichter and Dr. Lorenz Schmitt for their useful comments and language editing, which have greatly improved the manuscript. The authors also would like to thank the China Scholarship Council for proving the funding that made this research possible.



L. Schmitt and W. Fichter, “Observability criteria and null-measurement Kalman filter for vision-aided inertial navigation,” J. Guid. Control Dyn., 39 (4), 770 –780 (2016). Google Scholar


C. Xu, D. Wang and X. Huang, “Autonomous navigation based on sequential images for planetary landing in unknown environments,” J. Guid. Control Dyn., 40 (10), 2587 –2602 (2017). Google Scholar


Y. Liu and C. Wang, “Hybrid real-time stereo visual odometry for unmanned aerial vehicles,” Opt. Eng., 57 (7), 073104 (2018). Google Scholar


J. Hosen et al., “Vision-aided nonlinear observer for fixed-wing unmanned aerial vehicle navigation,” J. Guid. Control Dyn., 39 (8), 1777 –1789 (2016). Google Scholar


X. Zhang et al., “Integrated navigation method based on inertial navigation system and Lidar,” Opt. Eng., 55 (4), 044102 (2016). Google Scholar


R. Mur-Artal, J. M. M. Montiel and J. D. Tardos, “ORB-SLAM: a versatile and accurate monocular SLAM system,” IEEE Trans. Rob., 31 (5), 1147 –1163 (2015). Google Scholar


H. He, Y. Li and J. Tan, “Relative motion estimation using visual–inertial optical flow,” Auton. Rob., 42 (3), 615 –629 (2018). Google Scholar


M. Mammarella et al., “Comparing optical flow algorithms using 6-dof motion of real-world rigid objects,” IEEE Trans. Syst. Man Cybern. Part C, 42 (6), 1752 –1762 (2012). Google Scholar


F. Kendoul, I. Fantoni and K. Nonami, “Optic flow-based vision system for autonomous 3D localization and control of small aerial vehicles,” Rob. Autom. Syst., 57 (6–7), 591 –602 (2009). Google Scholar


F. Raudies and H. Neumann, “A review and evaluation of methods estimating ego-motion,” Comput. Vision Image Understanding, 116 (5), 606 –633 (2012). Google Scholar


S. Baker et al., “A database and evaluation methodology for optical flow,” Int. J. Comput. Vision, 92 (1), 1 –31 (2011). IJCVEQ 0920-5691 Google Scholar


N. Gageik, M. Strohmeier and S. Montenegro, “An autonomous UAV with an optical flow sensor for positioning and navigation,” Int. J. Adv. Rob. Syst., 10 (10), 341 (2013). Google Scholar


T. P. Webb et al., “Vision-based state estimation for autonomous micro air vehicles,” J. Guid. Control Dyn., 30 (3), 816 –826 (2007). JGCODS 0731-5090 Google Scholar


J. Waldmann, R. I. G. da Silva and R. A. J. Chagas, “Observability analysis of inertial navigation errors from optical flow subspace constraint,” Inf. Sci., 327 300 –326 (2016). Google Scholar


J. Sola, “Quaternion kinematics for the error-state Kalman filter,” Barcelona, Spain (2017). Google Scholar


M. Bloesch et al., “State estimation for legged robots-consistent fusion of leg kinematics and IMU,” Robotics: Science and Systems VIII, 17 –24 The MIT Press, Sydney (2013). Google Scholar


K. Fujita et al., “A debris image tracking using optical flow algorithm,” Adv. Space Res., 49 (5), 1007 –1018 (2012). ASRSDW 0273-1177 Google Scholar


R. W. Keener, “Estimating equations and maximum likelihood,” Theoretical Statistics: Topics for a Core Course, 175 –201 1st ed.Springer, New York (2010). Google Scholar


Y. Pawitan, “Score function and Fisher information,” In All Likelihood: Statistical Modelling and Inference Using Likelihood, 213 –228 Oxford University Press, Oxford (2001). Google Scholar


A. B. Chatfield, “Inertial instrumentation,” Fundamentals of High Accuracy Inertial Navigation, 174 59 –78 AIAA, New York (1997). Google Scholar


A. I. Mourikis and S. I. Roumeliotis, “A multi-state constraint Kalman filter for vision-aided inertial navigation,” in IEEE Int. Conf. Rob. and Autom., 3565 –3572 (2007). Google Scholar


N. El-Sheimy, H. Hou and X. Niu, “Analysis and modeling of inertial sensors using Allan variance,” IEEE Trans. Instrum. Meas., 57 (1), 140 –149 (2008). IEIMAO 0018-9456 Google Scholar


J. Rehder et al., “Extending kalibr: calibrating the extrinsics of multiple IMUs and of individual axes,” in IEEE Int. Conf. Rob. and Autom., 4304 –4311 (2016). Google Scholar


D. P. Koch et al., “Relative multiplicative extended Kalman filter for observable GPS-denied navigation,” (2017). Google Scholar


J. A. Hesch et al., “Camera-IMU-based localization: observability analysis and consistency improvement,” Int. J. Rob. Res., 33 (1), 182 –201 (2014). IJRREL 0278-3649 Google Scholar


J. A. Hesch et al., “Consistency analysis and improvement of vision-aided inertial navigation,” IEEE Trans. Rob., 30 (1), 158 –176 (2014). Google Scholar


Jia Deng received her BS degree at Beihang University, Beijing, in 2010, and she is currently pursuing her PhD at Beihang University, Beijing, China. She mainly studies navigation, control theory, and corresponding application and recently has focused on the data fusion problems and cooperative formation flight control system.

Sentang Wu received his PhD in dynamics, ballistics, and aircraft motion control systems from Ukraine National Aviation University in 1992. He mainly studies the theory and application of nonlinear stochastic systems, computer information processing, and control and aircraft cooperative control. He is currently a professor of automation science and electrical engineering and a doctoral tutor at Beihang University.

Hongbo Zhao received her BS degree at Beihang University, Beijing, and she is currently pursuing her PhD at Beihang University, Beijing, China. She mainly studies the control theory and application, distributed state estimation of stochastic systems, which focus on multiagent systems, and cooperative formation flight control systems with uncertainties.

Da Cai received his BS degree in control science and engineering from Beihang University, Beijing, China, in 2012 and also received his PhD in control science and engineering in 2018. His research interests include multirobot networks, distributed algorithms, and aircraft cooperative control.

© The Authors. Published by SPIE under a Creative Commons Attribution 4.0 Unported License. Distribution or reproduction of this work in whole or in part requires full attribution of the original publication, including its DOI.
Jia Deng, Sentang Wu, Hongbo Zhao, and Da Cai "Measurement model and observability analysis for optical flow-aided inertial navigation," Optical Engineering 58(8), 083102 (7 August 2019).
Received: 14 April 2019; Accepted: 17 July 2019; Published: 7 August 2019

Back to Top