Convergence and differentiation of Zernike expansion: application for an analysis of odd-order surfaces

Abstract. Odd-order surfaces have begun to be used in optics. In order to investigate the aberration characteristics of such surfaces, Zernike expansion is widely used since it directly and explicitly corresponds to wavefront aberrations. Since the Zernike expansion of an odd-order surface contains an infinite number of terms, the convergence of the expanded sum and the possibility of termwise derivatives are not explicitly guaranteed mathematically. We give a complete proof for these problems. For an application of this result, we analyze the aberration characteristics of odd-order surfaces and present their effectiveness in optical design.

Introduction
New types of aspheric surfaces and their mathematical expressions have been proposed in optical design. [1][2][3][4][5] In particular, the effectiveness of odd-order surfaces, which are expressed with terms including an odd-integer power of the radius, has begun to be recognized via the designs of viewfinders, [6] microscopes, [7] camera lenses, [8] projection display optics, [9] EUV optical systems, [10][11][12] and so on. Since the aberration characteristics of odd-order surfaces differ largely from those of conventional aspheric surfaces, it is meaningful and important to analyze the features of odd-order surfaces mathematically.
Shibuya et al. [11] pointed out that not only the surface shape but also the surface slope should be fully approximated to express an optical surface because the refraction of light is determined by the tangent of the surface. From this viewpoint, by considering Taylor expansion, they proved that no odd-order surface can be fully represented by even-order terms and thus deduced that odd-order surfaces have different aberration characteristics from conventional even-order surfaces. They concluded that this peculiarity of odd-order surfaces leads to effectiveness in optical design. [11][12] However, they ignored the possibility that a finite number of Zernike polynomials fully approximates odd-order surface shapes and slopes.
To analyze the aberration characteristics of aspheric surfaces mathematically, Zernike expansion is the most common tool because each Zernike polynomial corresponds to a specific geometrical aberration. Zernike polynomials were originally introduced by Zernike [13][14] as the eigenfunctions of a rotationally symmetrical and self-adjoint differential equation on the unit disk. The orthogonality and the $L^2$-completeness of Zernike polynomials are direct consequences of this definition. [15] Another definition was given by Bhatia and Wolf [13][16] from the viewpoint of symmetry. In this literature, convergence of the Zernike expansion is not explicitly discussed, but is regarded as an implicit condition. The mathematician Szegö [17] presented an explicit and strong result for the convergence problem: uniform convergence holds for a wide class of orthogonal expansions.
Braat and Janssen [18] gave an example of explicit Zernike expansion coefficients in terms of factorial functions or the gamma function. For derivatives of Zernike polynomials, many researchers such as Nijboer [19] obtained recurrence formulae of derivatives to convert wavefront aberrations into geometrical aberrations. This derivation was based on the three-term relations of Gauss's hypergeometric function. Janssen [20] discussed the derivative formulae in connection with a Laplace operator. He presented the effectiveness of his method in the calculation and measurement of geometrical aberrations. He also mentioned that this method can be applied to solve the Neumann problem of differential equations. By studying the algebraic structure of differential operators on the complex unit disk, Wünsche [21] generalized these discussions and derived a generalized form of orthogonal polynomials in analogy with quantum theory.
As mentioned above, expansion by orthogonal polynomials and its derivatives have been deeply investigated. However, as far as we know, a concrete formulation for odd-order surfaces cannot be found. In particular, neither the convergence of the Zernike expansion nor the convergence of its termwise differentiation appears to have been discussed in the literature.
The goal of this study is to prove that both the shapes and slopes of odd-order surfaces are fairly approximated by finite numbers of Zernike polynomials. To show this, we address the convergence of the Zernike expansion and its derivatives specifically for odd-order surfaces. Section 2 defines Zernike polynomials and describes their derivative formulae. We present the expansion coefficients of odd-order surfaces and an estimation of the decreasing speed of Zernike coefficients, mainly for odd-order surfaces. In Sec. 3, we give an original proof of the convergence of the Zernike expansion in surface shapes and slopes for odd-order surfaces using the result of Sec. 2. In Sec. 4, we show the practical effectiveness of the expansion formula by numerical estimation. By applying the result to the lens design of a Schmidt surface, we demonstrate that both the corrector shape and slope are fully approximated by a finite number of Zernike polynomials.
In addition, the resultant approximation of an odd-order surface is not an example of Taylor expansion, which will be briefly explained in Sec. 2.1. Thus, the result of our method does not contradict the consequences of Ref. 11, which describes the impossibility of Taylor expansion of odd-order surfaces.
To express aspherical surfaces, expansion into power series, Zernike expansion, Qcon polynomials, and so on have been used. The method described in this paper allows us to analyze odd-order surfaces by Zernike polynomials and power series. Expressing odd-order surfaces by Qcon surfaces is left for future work.
The Zernike polynomials for the rotationally symmetric case are defined by
$$Q_n(t) = \frac{1}{n!}\,\frac{d^n}{dt^n}\left[t^n (t-1)^n\right], \tag{1}$$
where $t = r^2$, $r$ is the normalized radial coordinate, and $n$ is a non-negative integer. The differential expression on the right side is called the Rodrigues formula. [15] Explicitly, one obtains the first six polynomials as
$$\begin{aligned}
Q_0(t) &= 1, \\
Q_1(t) &= 2t - 1, \\
Q_2(t) &= 6t^2 - 6t + 1, \\
Q_3(t) &= 20t^3 - 30t^2 + 12t - 1, \\
Q_4(t) &= 70t^4 - 140t^3 + 90t^2 - 20t + 1, \\
Q_5(t) &= 252t^5 - 630t^4 + 560t^3 - 210t^2 + 30t - 1.
\end{aligned} \tag{2}$$
Their orthogonality is expressed as
$$\int_0^1 Q_m(t)\,Q_n(t)\,dt = \frac{\delta_{m,n}}{2n+1}, \tag{3}$$
where $\delta_{m,n}$ is Kronecker's delta. Since $\{Q_n(t)\}$ is a complete orthogonal set on the unit interval $[0, 1]$, any $L^2$ function $f(t)$ is expanded by $\{Q_n(t)\}$. Namely,
$$f(t) = a_0 Q_0(t) + a_1 Q_1(t) + a_2 Q_2(t) + \cdots, \tag{4}$$
where the coefficients are
$$a_n = (2n+1)\int_0^1 f(t)\,Q_n(t)\,dt. \tag{5}$$
The expansion by orthogonal polynomials such as Eq. (4) is also expressed as a polynomial $f(t) = c_0 + c_1 t + c_2 t^2 + \cdots + c_N t^N$ by taking the terms up to $N$ and rearranging the terms by Eq. (2). Although this formal expansion is not a Taylor expansion, it is actually a fair approximation of the original function $f(t)$. (If the expression $f(t) = c_0 + c_1 t + c_2 t^2 + \cdots$ were a Taylor expansion, the coefficients $c_k$ would not depend on the number of terms. However, in this case, the constant term $c_0 = \sum_{k=0}^{N} (-1)^k a_k$ obviously depends on the number of terms $N$. Hence this is not a Taylor expansion.) This fact will be demonstrated in Sec. 3.3, Eq. (31), using $f(t) = t^{3/2}$ as an example.
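As a quick check of these definitions, the following sketch (not part of the original paper) builds $Q_n(t)$ from the standard closed-form coefficients of the shifted Legendre polynomials, which are assumed here to coincide with the Rodrigues definition, and verifies the orthogonality relation of Eq. (3) exactly with rational arithmetic. The helper names `q_coeffs` and `inner` are illustrative only.

```python
from fractions import Fraction
from math import comb


def q_coeffs(n):
    """Coefficients [c_0, ..., c_n] of Q_n(t) in ascending powers of t.

    Assumption: Q_n is the shifted Legendre polynomial, whose coefficient of
    t**k is (-1)**(n + k) * C(n, k) * C(n + k, k).
    """
    return [(-1) ** (n + k) * comb(n, k) * comb(n + k, k) for k in range(n + 1)]


def inner(p, q):
    """Exact integral over [0, 1] of p(t) * q(t) for two coefficient lists."""
    return sum(Fraction(a * b, i + j + 1)
               for i, a in enumerate(p)
               for j, b in enumerate(q))


for n in range(4):
    print(n, q_coeffs(n))

# Orthogonality: the integral of Q_m * Q_n is delta_mn / (2n + 1).
print(inner(q_coeffs(2), q_coeffs(3)), inner(q_coeffs(3), q_coeffs(3)))
```

Exact fractions are used so that the zero off-diagonal integrals are not blurred by rounding.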

Theoretical Estimation of the Decreasing Speed of Zernike Coefficients
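For a generic monomial $f(t) = r^{2\alpha} = t^{\alpha}$ $(\alpha > 0)$, the coefficients of Eq. (5) are expressed by the gamma function. [18] That is,
$$a_n = \frac{(2n+1)\,\Gamma(\alpha+1)^2}{\Gamma(\alpha+n+2)\,\Gamma(\alpha-n+1)}. \tag{6}$$
Hence, the expansion becomes
$$t^{\alpha} = \sum_{n=0}^{\infty} \frac{(2n+1)\,\Gamma(\alpha+1)^2}{\Gamma(\alpha+n+2)\,\Gamma(\alpha-n+1)}\,Q_n(t). \tag{7}$$
Braat and Janssen [18] gave an equivalent formulation by analytic continuation of an even-order term. We present another derivation of this expression by use of the beta function $B(p, q)$. Substituting $f(t) = t^{\alpha}$ and the Rodrigues formula of Eq. (1) into Eq. (5) and integrating by parts $n$ times,
$$\begin{aligned}
a_n &= \frac{2n+1}{n!}\int_0^1 t^{\alpha}\,\frac{d^n}{dt^n}\left[t^n (t-1)^n\right]dt \\
&= \frac{2n+1}{n!}\,\frac{\Gamma(\alpha+1)}{\Gamma(\alpha-n+1)}\int_0^1 t^{\alpha}(1-t)^n\,dt \\
&= \frac{2n+1}{n!}\,\frac{\Gamma(\alpha+1)}{\Gamma(\alpha-n+1)}\,B(\alpha+1, n+1) \\
&= \frac{(2n+1)\,\Gamma(\alpha+1)^2}{\Gamma(\alpha+n+2)\,\Gamma(\alpha-n+1)}.
\end{aligned} \tag{8}$$
The last equality is derived by $B(p, q) = \Gamma(p)\Gamma(q)/\Gamma(p+q)$.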
In order to evaluate the convergence, we estimate the decreasing speed of $a_n$. For representing an odd-order aspherical surface, let $\alpha$ be a noninteger with $\alpha \ge 1/2$. The minimum $\alpha = 1/2$ corresponds to the first-order surface or the cone. By Euler's reflection formula $\Gamma(z)\Gamma(1-z) = \pi/\sin \pi z$ with $z = n - \alpha$,
$$\Gamma(\alpha-n+1) = \frac{\pi}{\sin \pi(n-\alpha)\,\Gamma(n-\alpha)}. \tag{9}$$
By the recurrence relation of the gamma function, $\Gamma(z+1) = z\Gamma(z)$,
$$\Gamma(\alpha+n+2) = (\alpha+n+1)(\alpha+n)\,\Gamma(n+\alpha). \tag{10}$$
The denominator of Eq. (6) is
$$\Gamma(\alpha+n+2)\,\Gamma(\alpha-n+1) = \frac{\pi\,(\alpha+n+1)(\alpha+n)\,\Gamma(n+\alpha)}{\sin \pi(n-\alpha)\,\Gamma(n-\alpha)}. \tag{11}$$
Since $\alpha \ge 1/2$, the inequality $n+\alpha > n+\alpha-1 \ge n-\alpha$ holds. Since the gamma function is increasing for arguments not smaller than $3/2$, $\Gamma(x) \ge \Gamma(y)$ for $x \ge y \ge 3/2$. Then, for sufficiently large $n$, $\Gamma(n+\alpha)$ is evaluated as
$$\Gamma(n+\alpha) = (n+\alpha-1)\,\Gamma(n+\alpha-1) \ge (n+\alpha-1)\,\Gamma(n-\alpha). \tag{12}$$
Therefore,
$$\left|\Gamma(\alpha+n+2)\,\Gamma(\alpha-n+1)\right| \ge \frac{\pi\,(\alpha+n+1)(\alpha+n)(\alpha+n-1)}{\left|\sin \pi(n-\alpha)\right|}. \tag{13}$$
Thus,
$$|a_n| \le \frac{M\,(2n+1)\,\Gamma(\alpha+1)^2}{(\alpha+n+1)(\alpha+n)(\alpha+n-1)}, \tag{14}$$
where $M = \left|\sin \pi(n-\alpha)/\pi\right|$ is a positive number. Hence, $|a_n|$ decreases faster than or equally to $n^{-2}$.
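The gamma-function expression for the coefficients and the $n^{-2}$ bound can be checked numerically. The sketch below is an illustration rather than the authors' code. It assumes the coefficient formula $a_n = (2n+1)\Gamma(\alpha+1)^2 / [\Gamma(\alpha+n+2)\Gamma(\alpha-n+1)]$ and compares it with a term-by-term integration of $t^{\alpha} Q_n(t)$, where $Q_n$ is taken to be the shifted Legendre polynomial. The function names are hypothetical.

```python
from math import comb, gamma


def a_gamma(n, alpha):
    """Coefficient of Q_n for f(t) = t**alpha (alpha must be a non-integer)."""
    return ((2 * n + 1) * gamma(alpha + 1) ** 2
            / (gamma(alpha + n + 2) * gamma(alpha - n + 1)))


def a_integral(n, alpha):
    """(2n+1) * integral of t**alpha * Q_n(t) over [0, 1], term by term.

    Uses the power-series coefficients of the shifted Legendre polynomial;
    only reliable for small n because of cancellation between large terms.
    """
    total = sum((-1) ** (n + k) * comb(n, k) * comb(n + k, k) / (alpha + k + 1)
                for k in range(n + 1))
    return (2 * n + 1) * total


# Third-order surface r**3, i.e. alpha = 3/2: both routes should agree.
for n in range(5):
    print(n, a_gamma(n, 1.5), a_integral(n, 1.5))

# Cone, alpha = 1/2: n**2 * |a_n| stays bounded, consistent with n**-2 decay.
print([round(n * n * abs(a_gamma(n, 0.5)), 4) for n in (1, 5, 10, 20, 39)])
```

`math.gamma` accepts negative non-integer arguments, which is what makes the direct evaluation of $\Gamma(\alpha-n+1)$ possible here; for large $n$ the factor $\Gamma(\alpha+n+2)$ would overflow, so this sketch is limited to moderate $n$.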

Another Representation of Coefficients for Odd-Order Surfaces and a More Strict Theoretical Estimation of Their Decreasing Speed
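We present a more detailed formulation for the odd-order case $f(t) = r^{2k-1} = t^{(2k-1)/2}$, i.e., $\alpha = (2k-1)/2$ with a positive integer $k$. As far as we know, this concrete estimation has not been studied. If $n \ge k$, substituting the values of the gamma function at half-integers [15]
$$\Gamma\!\left(m+\tfrac{1}{2}\right) = \frac{(2m-1)!!}{2^m}\sqrt{\pi}, \qquad \Gamma\!\left(\tfrac{1}{2}-m\right) = \frac{(-2)^m}{(2m-1)!!}\sqrt{\pi} \tag{15}$$
(where $m$ is a non-negative integer and $(-1)!! = 1$) into Eq. (6), we obtain
$$a_n = (-1)^{n-k}\,2\,(2n+1)\,\left[(2k-1)!!\right]^2\,\frac{(2n-2k-1)!!}{(2n+2k+1)!!}. \tag{16}$$
Since the ratio $(2n-2k-1)!!/(2n+2k+1)!!$ in Eq. (16) is the reciprocal of a product of $2k+1$ factors linear in $n$, while the prefactor $2n+1$ contributes one power of $n$, $a_n$ decreases as fast as $n^{-2k}$, which is faster than the estimation described in Eq. (14).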
For the third-order surface, substituting $k = 2$ into Eq. (16), we obtain an explicit expression as
$$a_n = (-1)^n (2n+1)\,\frac{18}{(2n-3)(2n-1)(2n+1)(2n+3)(2n+5)}. \tag{17}$$
Thus, the coefficient $a_n$ decreases as fast as $n^{-4}$. Furthermore, for the first-order surface, which describes the cone, substituting $k = 1$ into Eq. (16), we obtain an explicit expression as
$$a_n = (-1)^{n-1} (2n+1)\,\frac{2}{(2n-1)(2n+1)(2n+3)}. \tag{18}$$
This decreases as fast as $n^{-2}$, which corresponds to the minimum speed of decrease described in Eq. (14).
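The explicit odd-order coefficients can be cross-checked against the general gamma-function expression. The following sketch assumes a product form, $a_n = (-1)^{n-k}\,2(2n+1)[(2k-1)!!]^2 / \prod_{j=-k}^{k}(2n+2j+1)$, that is algebraically equivalent to the gamma-function coefficient for $\alpha = k - 1/2$; this form and the names `a_closed` and `a_gamma` are introduced here for illustration.

```python
from fractions import Fraction
from math import gamma


def a_closed(n, k):
    """Exact coefficient of Q_n for f(t) = t**(k - 1/2), i.e. r**(2k - 1)."""
    dfact = 1
    for odd in range(1, 2 * k, 2):      # (2k-1)!! = 1 * 3 * ... * (2k-1)
        dfact *= odd
    prod = 1
    for j in range(-k, k + 1):          # 2k+1 odd factors, never zero
        prod *= 2 * n + 2 * j + 1
    sign = -1 if (n - k) % 2 else 1     # (-1)**(n-k) without float powers
    return Fraction(sign * 2 * (2 * n + 1) * dfact ** 2, prod)


def a_gamma(n, alpha):
    """General gamma-function form of the same coefficient."""
    return ((2 * n + 1) * gamma(alpha + 1) ** 2
            / (gamma(alpha + n + 2) * gamma(alpha - n + 1)))


for k in (1, 2):                        # cone and third-order surface
    print(k, [str(a_closed(n, k)) for n in range(4)])

# Decay rate: n**(2k) * |a_n| tends to a constant (here k = 2, large n).
print(float(abs(a_closed(200, 2))) * 200 ** 4)
```

Working with `Fraction` keeps the small high-order coefficients exact, which is convenient when they are later summed or rearranged.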

Derivatives of Zernike Polynomials
Derivatives of Zernike polynomials have already been thoroughly discussed. [19][20][21] In this paper, we present a simple formulation for the rotationally symmetrical case.
According to Appendix A, it is enough to discuss the derivatives with respect to $t = r^2$ instead of the radial coordinate $r$.
Since $Q_n(t)$ is a polynomial of degree $n$, its derivative $Q_n'(t)$ must be described as a linear combination of $\{Q_k(t)\}_{k=0}^{n-1}$. Thus,
$$Q_n'(t) = \sum_{m=0}^{n-1} c_m Q_m(t). \tag{19}$$
By the orthogonality shown in Eq. (3), the coefficient $c_m$ is
$$c_m = (2m+1)\int_0^1 Q_n'(t)\,Q_m(t)\,dt = (2m+1)\left\{\Big[Q_n(t)\,Q_m(t)\Big]_0^1 - \int_0^1 Q_n(t)\,Q_m'(t)\,dt\right\}. \tag{20}$$
Since $Q_n(0) = (-1)^n$ and $Q_n(1) = 1$, the first term of the right side is $(2m+1)\{1 - (-1)^{n+m}\}$. The second term is zero because the degree of $Q_m'(t)$ is less than that of $Q_n(t)$. Thus
$$c_m = \begin{cases} 2(2m+1) & (n+m \text{ odd}), \\ 0 & (n+m \text{ even}). \end{cases} \tag{21}$$
Therefore, Eq. (19) is represented as
$$Q_{2k+1}'(t) = 2\sum_{j=0}^{k} (4j+1)\,Q_{2j}(t), \qquad Q_{2k}'(t) = 2\sum_{j=0}^{k-1} (4j+3)\,Q_{2j+1}(t). \tag{22}$$
For example, the first six derivatives are explicitly given as
$$\begin{aligned}
Q_0'(t) &= 0, \\
Q_1'(t) &= 2Q_0(t), \\
Q_2'(t) &= 6Q_1(t), \\
Q_3'(t) &= 2Q_0(t) + 10Q_2(t), \\
Q_4'(t) &= 6Q_1(t) + 14Q_3(t), \\
Q_5'(t) &= 2Q_0(t) + 10Q_2(t) + 18Q_4(t).
\end{aligned} \tag{23}$$
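The derivative rule can be confirmed with exact integer arithmetic. The sketch below is illustrative and assumes that $Q_n$ is the shifted Legendre polynomial and that $Q_n'(t) = \sum_m 2(2m+1) Q_m(t)$ over all $m < n$ with $n + m$ odd. It differentiates the power-series form of $Q_n$ directly and compares the result with that sum. The helper names are hypothetical.

```python
from math import comb


def q_coeffs(n):
    """Ascending power-series coefficients of the shifted Legendre Q_n(t)."""
    return [(-1) ** (n + k) * comb(n, k) * comb(n + k, k) for k in range(n + 1)]


def derivative(p):
    """Coefficients of dp/dt for a polynomial given in ascending powers."""
    return [k * c for k, c in enumerate(p)][1:]


def derivative_from_lower_orders(n):
    """Sum of 2(2m+1) Q_m over m = n-1, n-3, ... (the m with n + m odd)."""
    out = [0] * n                       # a degree n-1 polynomial
    for m in range(n - 1, -1, -2):
        for k, c in enumerate(q_coeffs(m)):
            out[k] += 2 * (2 * m + 1) * c
    return out


for n in range(1, 6):
    print(n, derivative(q_coeffs(n)), derivative_from_lower_orders(n))
```

For each $n$ the two printed lists should coincide, which is the content of the parity rule for $c_m$.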
Since $Q_n(0) = (-1)^n$ and $Q_n(1) = 1$, when substituting $t = 0, 1$ into Eq. (4) with $f(t) = t^{\alpha}$, the following relations hold:
$$f(0) = a_0 - a_1 + a_2 - \cdots = \sum_{n=0}^{\infty} (-1)^n a_n = 0, \qquad f(1) = a_0 + a_1 + a_2 + \cdots = \sum_{n=0}^{\infty} a_n = 1. \tag{24}$$
By adding and subtracting the two equations of Eq. (24),
$$a_0 + a_2 + a_4 + \cdots = \frac{1}{2}, \qquad a_1 + a_3 + a_5 + \cdots = \frac{1}{2}. \tag{25}$$

Uniform Convergence of Zernike Expansion in Shape and Slope for Odd-Order Aspherical Surface
In general, let us consider a function $g(t)$ and its expansion $g(t) = \sum h_n W_n(t)$, where $\{W_n(t)\}$ is an arbitrary sequence of functions. Even though $g(t) = \sum h_n W_n(t)$ converges uniformly, its termwise differential $\sum h_n W_n'(t)$ does not necessarily converge to $g'(t)$.
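A standard textbook illustration of this caveat, not taken from the paper, is the sequence $g_N(t) = \sin(N^2 t)/N$: it converges uniformly to zero on $[0, 1]$, yet its derivative at $t = 0$ equals $N$ and diverges. The short sketch below evaluates both quantities; the function names are illustrative.

```python
import math


def g(N, t):
    """g_N(t) = sin(N^2 t) / N, which tends to 0 uniformly as N grows."""
    return math.sin(N * N * t) / N


def dg(N, t):
    """Derivative of g_N with respect to t: N * cos(N^2 t)."""
    return N * math.cos(N * N * t)


for N in (1, 10, 100):
    sup_g = max(abs(g(N, i / 1000)) for i in range(1001))
    # sup |g_N| shrinks like 1/N while the slope at the origin grows like N.
    print(N, sup_g, dg(N, 0.0))
```

This is why the convergence of the slope has to be proven separately from the convergence of the shape.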
As shown in Sec. 2, when $f(t)$ represents an odd-order surface shape, its Zernike expansion $f(t) = \sum a_n Q_n(t)$ contains an infinite number of terms. In optical ray tracing, since the direction of the exit ray is determined by the surface slope, it is expected that $\sum a_n Q_n'(t)$ converges to the surface slope $f'(t)$. In this section, we prove the convergence of both $f(t) = \sum a_n Q_n(t)$ and $f'(t) = \sum a_n Q_n'(t)$.

Convergence of Surface Shapes
We present a proof of the uniform convergence of Eq. (7). For an odd-order surface, $|a_n|$ decreases faster than or equally to $n^{-2}$. Since $|Q_n(t)| \le 1$, there exist a positive number $C$ and an integer $N$ such that
$$\sum_{n=0}^{\infty} \left|a_n Q_n(t)\right| \le \sum_{n=0}^{N} |a_n| + C\sum_{n=N+1}^{\infty} \frac{1}{n^2} < \sum_{n=0}^{N} |a_n| + \frac{C\pi^2}{6}. \tag{26}$$
Thus, we conclude that the convergence of the expansion Eq. (7) for any odd-order surface is uniform and absolute by the Weierstrass M-test. [15]
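The uniform convergence can be observed numerically for the third-order surface $f(t) = t^{3/2}$. The sketch below is an illustration under stated assumptions: the coefficients are taken as $a_n = (-1)^n\,18/[(2n-3)(2n-1)(2n+3)(2n+5)]$, which is the gamma-function coefficient for $\alpha = 3/2$ after cancelling the factor $2n+1$, and $Q_n(t)$ is evaluated through the three-term Legendre recurrence at $x = 2t - 1$. The grid size and function names are arbitrary choices.

```python
def q_values(n_max, t):
    """Values Q_0(t), ..., Q_{n_max}(t) via the Legendre recurrence at x = 2t-1."""
    x = 2.0 * t - 1.0
    vals = [1.0, x]
    for m in range(1, n_max):
        vals.append(((2 * m + 1) * x * vals[m] - m * vals[m - 1]) / (m + 1))
    return vals[:n_max + 1]


def a3(n):
    """Assumed closed-form coefficient of Q_n for f(t) = t**1.5."""
    return (-1) ** n * 18.0 / ((2 * n - 3) * (2 * n - 1) * (2 * n + 3) * (2 * n + 5))


def max_shape_error(N, samples=401):
    """Largest |t**1.5 - sum_{n<=N} a_n Q_n(t)| on a uniform grid over [0, 1]."""
    worst = 0.0
    for i in range(samples):
        t = i / (samples - 1)
        q = q_values(N, t)
        approx = sum(a3(n) * q[n] for n in range(N + 1))
        worst = max(worst, abs(t ** 1.5 - approx))
    return worst


for N in (5, 10, 20):
    print(N, max_shape_error(N))
```

The printed maximum error shrinks steadily with $N$, as expected from the $n^{-4}$ decay of the coefficients.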

Convergence of Surface Slopes
To discuss the convergence of the surface slope of an odd-order surface, let us consider the monomial $f(t) = t^{(2k-1)/2}$. To guarantee the continuity of the derivative, we suppose that the parameter $k \ge 2$ (the third-order surface or higher). Since the case $k = 1$ corresponds to the first-order surface or the cone, the original function $f(t) = \sqrt{t}$ has a singular point at the origin. Although this case is treated as an exception in this section, a similar discussion that excludes the origin is given in Sec. 3.4.
Since the derivative $f'(t)$ is also continuous, this function is also expanded as
$$f'(t) = \sum_{n=0}^{\infty} b_n Q_n(t). \tag{27}$$
By analogy with the Fourier analysis of derivatives, the convergence in slopes is proven if the following two conditions are satisfied:
Condition 1. Convergence of slope: $\sum_{n=0}^{\infty} b_n Q_n(t)$ converges to $f'(t)$.
Condition 2. Consistency of termwise differential: the equality
$$\sum_{n=0}^{\infty} a_n Q_n'(t) = \sum_{n=0}^{\infty} b_n Q_n(t) \tag{28}$$
holds.
Since $f'(t)$ is also the odd-order surface of order $(2k-3)/2$, the proof of condition 1 is the direct conclusion of the discussion of Sec. 3.1.
Thus, the expansion of the derivatives is given by Eq. (33). The fitting errors of Eqs. (31) and (32) are shown in Fig. 1. To estimate the maximum error versus the number of Zernike polynomials, we present the list of coefficients up to $n = 15$ in Table 1. Since the maximum errors occur at the origin, the errors can be estimated by $\sum_{k=0}^{n} (-1)^k a_k$ for the shape and $\sum_{k=0}^{n} (-1)^k b_k$ for the slope. Both of them converge to zero when $n \to \infty$.
Since $f(t) = t^{3/2}$ and $f'(t) = (3/2)\,t^{1/2}$, the decreasing speed of the coefficients is as fast as $n^{-4}$ in shapes by Eq. (17) and $n^{-2}$ in slopes by Eq. (18), respectively. Thus, Zernike expansions both in shape and in slope converge to the original functions. The behaviors of the maximum errors as functions of the number of Zernike terms are shown in Fig. 2.
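The slower convergence of the slope can be seen by differentiating the truncated expansion term by term. The sketch below is illustrative only. It assumes the closed-form coefficients $a_n = (-1)^n\,18/[(2n-3)(2n-1)(2n+3)(2n+5)]$ for $f(t) = t^{3/2}$ and evaluates $Q_n'(t)$ with the standard Legendre relation $P_{m+1}' = P_{m-1}' + (2m+1)P_m$ together with the chain-rule factor $dx/dt = 2$ for $x = 2t - 1$. The names are hypothetical.

```python
def q_and_dq(n_max, t):
    """Lists of Q_n(t) and dQ_n/dt for n = 0..n_max (x = 2t - 1)."""
    x = 2.0 * t - 1.0
    p = [1.0, x]
    dp = [0.0, 1.0]                     # dP_n/dx
    for m in range(1, n_max):
        p.append(((2 * m + 1) * x * p[m] - m * p[m - 1]) / (m + 1))
        dp.append(dp[m - 1] + (2 * m + 1) * p[m])
    return p[:n_max + 1], [2.0 * d for d in dp[:n_max + 1]]


def a3(n):
    """Assumed closed-form coefficient of Q_n for f(t) = t**1.5."""
    return (-1) ** n * 18.0 / ((2 * n - 3) * (2 * n - 1) * (2 * n + 3) * (2 * n + 5))


def slope_partial(N, t):
    """Termwise derivative of the truncated expansion: sum_{n<=N} a_n Q_n'(t)."""
    _, dq = q_and_dq(N, t)
    return sum(a3(n) * dq[n] for n in range(N + 1))


for N in (10, 40, 160):
    # Exact slopes: f'(0) = 0 and f'(0.25) = 1.5 * 0.5 = 0.75.
    print(N, slope_partial(N, 0.0), slope_partial(N, 0.25))
```

In this sketch the residual at the origin falls off roughly like $1/N$, much more slowly than the shape error, while interior points converge quickly.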

Approximation for the Cone $f(t) = t^{1/2} = r$
The cone is not an example of smooth odd-order functions. However, as discussed in Secs. 2.3 and 3.1, the Zernike expansion of the cone converges uniformly to $f(t) = t^{1/2}$. Thus, the cone is sufficiently approximated by the Zernike expansion, including at the origin. Moreover, discussions similar to those in Secs. 3.2 and 3.3 can be applied to the cone except at the singular point, i.e., the origin. However, the Zernike expansion of $f'(t)$ diverges at the origin because its expansion gives a diverging series.
Actually, by Eq.
Thus, with $b_n = (-1)^n$ and $Q_n(0) = (-1)^n$,
$$\sum_{n=0}^{\infty} b_n Q_n(0) = \sum_{n=0}^{\infty} 1 = \infty. \tag{37}$$
Hence, the sum $\left|\sum_{n=0}^{\infty} b_n Q_n(t)\right|$ does not converge at the origin.
Another explanation is possible as follows. Since the cone is described as $f(t) = t^{1/2} = |r|$ on the interval $[-1, 1]$, the differential is
$$\frac{df}{dr} = \begin{cases} 1 & (0 < r \le 1), \\ -1 & (-1 \le r < 0), \end{cases} \tag{38}$$
which is discontinuous at $r = 0$. This conclusion agrees with the intuitive fact that the cone has a singular point at the vertex.
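The divergence of the cone slope at the vertex can also be seen directly from the termwise derivative of the shape expansion. The sketch below is an illustration, not the paper's calculation. It assumes the cone coefficients $a_n = (-1)^{n-1}\,2/[(2n-1)(2n+3)]$ and the Legendre endpoint identity $Q_n'(0) = (-1)^{n-1} n(n+1)$, so that every term $a_n Q_n'(0)$ is slightly larger than $1/2$ for $n \ge 1$ and the partial sums grow without bound.

```python
def cone_coeff(n):
    """Assumed coefficient of Q_n for the cone f(t) = sqrt(t)."""
    return (-1) ** (n + 1) * 2.0 / ((2 * n - 1) * (2 * n + 3))


def dq_at_origin(n):
    """dQ_n/dt at t = 0 for the shifted Legendre polynomial Q_n."""
    return (-1) ** (n + 1) * n * (n + 1)


def termwise_slope_at_origin(N):
    """Partial sum of a_n * Q_n'(0) for n = 0..N."""
    return sum(cone_coeff(n) * dq_at_origin(n) for n in range(N + 1))


for N in (10, 100, 1000):
    print(N, termwise_slope_at_origin(N))
```

The partial sums increase roughly like $N/2$, in contrast to the third-order surface, whose termwise slope settles at the origin.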

Schmidt Surface
In this section, we evaluate the approximation accuracy of the Zernike expansion and discuss the aberration properties of odd-order surfaces via the optical design of the Schmidt corrector plate. The corrector plate is incorporated into the Schmidt camera and corrects the spherical aberration of the primary mirror. The optical layout of the Schmidt camera is shown in Fig. 3. Even though corrector surfaces are conventionally designed only with even-order terms containing the power, [22][23] the design with odd-order terms has been known as the Schmidt surface. [24]
The specifications of the design example are shown in Table 2 and the basic lens data are shown in Table 3. The surface indicated by (*) in Table 3 is the corrector surface. The aperture stop is also placed there. For simplicity, we employ only the third-order term and even-order terms for the corrector plate. Thus, the aspherical sag of surface (*) is given by Eq. (39). Note that all optical design and analysis were carried out with Code-V™.

Design Results and Discussion
We compare the original third-order design with its approximation design by the even-order expansion. To obtain a diffraction-limited design, the third-order term and even orders up to the 16th are needed. By decomposing the third-order coefficient by Eq. (17) and adding the result to the original even-order coefficients, we obtain the diffraction-limited design by the expanded shape as well, in which even orders up to the 20th are needed. The aspherical coefficients are shown in Table 4.
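The decomposition of a third-order term into even-order coefficients can be sketched as follows. This is a hedged illustration on a normalized aperture ($0 \le r \le 1$, unit third-order coefficient); it does not reproduce the design values of Table 4, which would additionally require scaling by the actual aperture radius and the third-order coefficient. It assumes the closed-form coefficients $a_n = (-1)^n\,18/[(2n-3)(2n-1)(2n+3)(2n+5)]$ for $r^3 = t^{3/2}$ and the power-series form of the shifted Legendre polynomials. The function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def a3(n):
    """Assumed exact coefficient of Q_n for r**3 = t**1.5."""
    return Fraction((-1) ** n * 18,
                    (2 * n - 3) * (2 * n - 1) * (2 * n + 3) * (2 * n + 5))


def q_coeffs(n):
    """Ascending coefficients of Q_n(t) in powers of t = r**2."""
    return [(-1) ** (n + k) * comb(n, k) * comb(n + k, k) for k in range(n + 1)]


def even_order_coeffs(N):
    """Coefficients c_j of r**(2j), j = 0..N, of sum_{n<=N} a_n Q_n(r**2)."""
    c = [Fraction(0)] * (N + 1)
    for n in range(N + 1):
        for j, q in enumerate(q_coeffs(n)):
            c[j] += a3(n) * q
    return c


c = even_order_coeffs(10)               # even orders up to r**20
for j, cj in enumerate(c):
    print(2 * j, float(cj))

# The constant term changes with the truncation order N, so this is not a
# Taylor series even though it is a polynomial in r**2.
print(float(even_order_coeffs(5)[0]), float(c[0]))
```

The printed list includes a nonzero $r^2$ coefficient, which is the term that carries optical power in the rearranged even-order form.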
The error in shape is shown in Fig. 4. If we use more Zernike terms, the dominant error at the origin will be gradually reduced.
The peak-to-valley error in Fig. 4 is $3.88 \times 10^{-5}$ mm, which corresponds to $0.034\lambda$. The comparison of the wavefront maps and the RMS wavefront $\Delta W$ of the two designs is shown in Fig. 5. Note that the RMS wavefront is calculated without considering the central obscuration.
The expanded shape is a fair approximation of the original surface shape. From the viewpoint of practical optical design, the wavefront aberrations in both designs are almost identical. Thus, the third-order surface is almost completely approximated by a finite sum of even-order terms.
In addition, since there is the $A_2$ term in the expanded shape, the third-order surface acts as an aspherical surface containing the power, which is necessary in conventional designs. Although the power term induces a slight change in the paraxial relation, the third-order surface does not affect the paraxial relation formally. Consequently, we can conclude that this property is one reason for the effectiveness of odd-order surfaces. Even though we have proven that the third-order aspherical surface is fairly approximated by a finite sum of even-order terms, we infer that there are some other reasons for the effectiveness. Identifying them is left for future work.

Conclusion
In this paper, we have shown that odd-order surfaces are fairly approximated by a finite number of Zernike polynomials. By rearranging the monomials, odd-order surfaces are fully approximated by an even-order power series with a finite number of terms.
To show this, we have proven the uniform convergence of the Zernike expansion of odd-order surfaces both in surface shapes and slopes. By estimating the decreasing speed of the expansion coefficients, we have shown that any odd-order surface is approximated by a finite number of Zernike polynomials precisely enough to serve as an optical surface. In other words, odd-order surfaces are fairly approximated by ordinary even-order aspherical surfaces. For the first-order surface, the Zernike expansion approximates the shape. However, the expansion of the slope does not make sense because the coefficients give a diverging series.
We have demonstrated the result by a design of a Schmidt surface and confirmed that the effect of the third-order surface can be fully expressed by an even-order surface up to the 20th order. The power term is necessary in the classical design of Schmidt cameras. However, in the calculation of paraxial quantities, odd-order surfaces do not affect values such as focal lengths, paraxial magnifications, and so on. This is one reason why odd-order surfaces are effective in optical design. Since we infer that there are some other reasons for the effectiveness of odd-order surfaces, identifying them remains for future work.
Our analysis method is useful not only for the analysis of aberration characteristics of aspherical surfaces but also for optical measurement, especially in estimating the accuracy with which generic surfaces are approximated by a finite number of terms. Analyzing other expressions of odd-order surfaces, such as Qcon polynomials, is left for future work.

Appendix A: Conversion of the Radial Coordinate in the Derivative Formulae
In the definition of Zernike polynomials Eq. (1) and expansion Eq. (4), the converted coordinate $t = r^2$ is used instead of the original coordinate $r$. Considering that the surface is given by $z = F(x, y)$, due to the law of refraction, the direction of the exit ray is determined by the surface slopes $\partial F/\partial x$ and $\partial F/\partial y$. If $F(x, y)$ is rotationally symmetric, $F(x, y)$ should be a function of $t = r^2$, i.e., $F(x, y) = f(t)$.
Considering the Zernike expansion $f(t) = \sum a_n Q_n(t)$, the slopes are rewritten as
$$\frac{\partial F}{\partial x} = \frac{\partial t}{\partial x}\,\frac{df}{dt} = 2x\left(\sum a_n Q_n(t)\right)', \qquad \frac{\partial F}{\partial y} = \frac{\partial t}{\partial y}\,\frac{df}{dt} = 2y\left(\sum a_n Q_n(t)\right)'. \tag{40}$$
Thus, in the context of Zernike expansion and its derivatives, it is enough to prove the equality of the termwise derivative $\left(\sum a_n Q_n(t)\right)' = \sum a_n Q_n'(t)$ and its convergence.
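The chain-rule conversion between the slope in $x$ and the derivative in $t = r^2$ can be checked with a finite difference. The sketch below uses the third-order profile $f(t) = t^{3/2}$ purely as an example; the step size `h` and the function names are illustrative choices.

```python
def f(t):
    """Example profile in the converted coordinate t = r**2."""
    return t ** 1.5


def df_dt(t):
    """Analytic derivative of f with respect to t."""
    return 1.5 * t ** 0.5


def F(x, y):
    """Rotationally symmetric surface z = F(x, y) = f(x**2 + y**2)."""
    return f(x * x + y * y)


def dF_dx_chain(x, y):
    """Slope via the chain rule: dF/dx = (dt/dx) * df/dt = 2x * f'(t)."""
    return 2.0 * x * df_dt(x * x + y * y)


def dF_dx_numeric(x, y, h=1e-6):
    """Central finite-difference estimate of dF/dx."""
    return (F(x + h, y) - F(x - h, y)) / (2.0 * h)


print(dF_dx_chain(0.3, 0.4), dF_dx_numeric(0.3, 0.4))
```

At $(x, y) = (0.3, 0.4)$ the radius is $0.5$, so both values should be close to $2 \cdot 0.3 \cdot 1.5 \cdot 0.5 = 0.45$.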

Appendix B: Derivation of the Coefficients b k
We present a straightforward derivation of the expansion coefficients $b_k$. By differentiating Eq. (4) formally, we obtain
$$\left(a_0 Q_0(t) + a_1 Q_1(t) + a_2 Q_2(t) + \cdots\right)' = a_1 Q_1'(t) + a_2 Q_2'(t) + a_3 Q_3'(t) + \cdots. \tag{41}$$
At this stage, the convergence of Eq. (41) is not guaranteed.