The ray matrix formalism is a well-established tool in paraxial geometrical optics that uses 2×2 matrices to describe optical systems [1, 2]. It allows complex, multi-element optical systems to be analyzed and simplified with straightforward calculations. The formalism is also very useful for teaching Optics, and students value it because the mathematics involved is simple. It appears in many Geometrical Optics textbooks, although usually at a late stage and only as a practical tool for numerical calculations. However, it has also proved very useful for analytical derivations. For instance, in Refs. [3, 4] we presented a full derivation of Fourier Optics systems based on this matrix formalism. It is also commonly employed to analyze stability conditions in optical resonators, as well as to establish the connection with the fractional Fourier transform.
In practice, the paraxial approximation is not strictly fulfilled and “perfect” imaging is not obtained. Aberration theory is required to complete the description. This subject is included in advanced Optics courses, where it is usually formulated in terms of the distortion of the ideal spherical wavefront, called the wavefront aberration function, W. The aberration function is well defined and all primary aberrations can be expressed in terms of it. However, this theory can be cumbersome for students, since the polynomial expressions are long and analytical results are difficult to derive. Following our previous works [3, 6], we believe that applying a common tool, such as the ray matrix theory, to different modules of an Optics course helps students gain a deeper understanding of its practical aspects, and provides a uniform and elegant framework. Our purpose here is to add a further step in this sequence and use the ray matrix formalism to explain aberration concepts.
Spherical aberration is the most common aberration in optical imaging. We have therefore first attempted to derive the related analytical expressions using the ray matrix theory. We analyze non-paraxial rays, for which an axial object point is not imaged onto a single point. Instead, rays at different heights focus at different axial locations. Generalized ray transfer matrices that account for aberration were introduced in Ref. . There, two 2×2 matrices were required, one for the radial component and another for the azimuthal component. Here, however, we use the extended 3×3 ray matrix method introduced in the field of laser resonators. This extended method is based on two parameters, Δx and Δσ, which account respectively for possible errors in the height and angle coordinates of a ray traversing an optical system. While this method has been used to analyze errors caused by misaligned optical elements, we show next that it can also be applied to account for spherical aberration in optical systems. For that purpose, we regard these error parameters as produced by the deviation of the ray from the paraxial solution. Using this mathematical treatment we are able to derive, in a simple manner, analytical equations describing the longitudinal (LSA) and axial (ASA) spherical aberration.
The paper is organized as follows: in Section 2 we briefly review the usual ray matrix method, as well as the extended 3×3 matrix. Its application to spherical aberration is introduced in Section 3. The conclusions are presented in Section 4.
PARAXIAL AND EXTENDED RAY MATRICES
The ray matrix method considers rotationally symmetric optical systems under the paraxial approximation [1, 2]. The optical system is regarded as a set of optical components placed between two transverse planes, located at z=z1 and z=z2 (Fig. 1). The paraxial approximation implies that the angular coordinate (σ) satisfies the small-angle approximation, so it can be identified with the ray slope, σ=dx/dz. The optical system changes the position and the angle of the input ray. An input ray at the incidence plane, with coordinates (x1,σ1), is transformed into an output ray with coordinates (x2,σ2). In the paraxial approximation, the relations among these coordinates are linear, and they can be written in the form:
$$\begin{pmatrix} x_2 \\ \sigma_2 \end{pmatrix} = M \begin{pmatrix} x_1 \\ \sigma_1 \end{pmatrix} = \begin{pmatrix} A & B \\ C & D \end{pmatrix} \begin{pmatrix} x_1 \\ \sigma_1 \end{pmatrix}. \tag{1}$$
M is the ray matrix, with components A, B, C and D. The ray matrices corresponding to some basic building blocks of optical systems, namely a free propagation (FP), a spherical refractive surface (R), and a thin lens (TL), are respectively:
where d is the propagation distance, n and n′ are the refractive indices of the input and output media, R is the radius of curvature of the refractive surface, and f′ is the lens focal length.
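As an illustration, the following minimal Python sketch builds these three matrices and cascades them. It assumes that σ is the actual ray slope (not the reduced angle nσ) and that light travels from left to right, so the refraction matrix has C=(n−n′)/(n′R) and D=n/n′; other textbooks use different conventions. The function names are ours and purely illustrative.

```python
# Paraxial 2x2 ray matrices acting on column vectors (x, sigma).
# Assumption: sigma is the actual ray slope, not the reduced angle n*sigma.

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def apply(m, ray):
    """Apply matrix m to a ray given as a list of coordinates."""
    return [sum(row[k] * ray[k] for k in range(len(ray))) for row in m]

def free_propagation(d):
    return [[1.0, d], [0.0, 1.0]]

def refraction(n_in, n_out, radius):
    return [[1.0, 0.0], [(n_in - n_out) / (n_out * radius), n_in / n_out]]

def thin_lens(f):
    return [[1.0, 0.0], [-1.0 / f, 1.0]]

# A thin lens in air as two refractions in cascade (example values).
N, R1, R2 = 1.5, 100.0, -100.0
lens = matmul(refraction(N, 1.0, R2), refraction(1.0, N, R1))

# Lens maker formula for the paraxial focal length.
f0 = 1.0 / ((N - 1.0) * (1.0 / R1 - 1.0 / R2))

# A ray parallel to the axis at height 5 reaches the axis after a distance f0.
ray_out = apply(matmul(free_propagation(f0), lens), [5.0, 0.0])
print("lens matrix:", lens)
print("f0 =", f0, " ray at focal plane:", ray_out)
```

Note that the matrices are multiplied in reverse order of traversal: the first element encountered by the ray is the rightmost factor.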
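The extended ray matrix method was proposed to account for misalignments in laser resonators. A possible decentring and/or tilt of the optical elements is described by two error parameters, Δx and Δσ, which displace the height and the angle coordinates of the emerging ray, respectively. Therefore, Eq. (1) must be transformed to:
$$\begin{pmatrix} x_2 \\ \sigma_2 \\ 1 \end{pmatrix} = \begin{pmatrix} A & B & \Delta x \\ C & D & \Delta\sigma \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} x_1 \\ \sigma_1 \\ 1 \end{pmatrix}, \tag{3}$$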
which introduces the extended 3×3 ray matrix, whose top-left 2×2 submatrix is the standard ray matrix. The advantage of this method becomes clear when cascading multiple misaligned elements in an optical system. A whole optical system can then be treated as a group of elements, as is usual in this formalism, by multiplying the corresponding element matrices. The final 3×3 matrix gives the error propagated to the output ray coordinates as the elements Δx and Δσ in its third column.
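The cascading property can be sketched numerically. The short Python example below assumes the extended matrix has the form [[A, B, Δx], [C, D, Δσ], [0, 0, 1]] acting on (x, σ, 1); the three elements and their error values are hypothetical and chosen only to show how the third column accumulates.

```python
# Extended 3x3 ray matrices acting on column vectors (x, sigma, 1).

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def apply(m, ray):
    """Apply matrix m to a ray given as a list of coordinates."""
    return [sum(row[k] * ray[k] for k in range(len(ray))) for row in m]

def extend(m2, dx=0.0, dsigma=0.0):
    """Embed a 2x2 ray matrix in a 3x3 matrix with error terms dx, dsigma."""
    (a, b), (c, d) = m2
    return [[a, b, dx], [c, d, dsigma], [0.0, 0.0, 1.0]]

# Hypothetical system: a lens with an angular error, a gap, a lens with a height error.
lens1 = extend([[1.0, 0.0], [-1.0 / 100.0, 1.0]], dsigma=1.0e-3)
gap = extend([[1.0, 50.0], [0.0, 1.0]])
lens2 = extend([[1.0, 0.0], [-1.0 / 200.0, 1.0]], dx=0.2)

# Rightmost factor is the first element met by the ray.
system = matmul(lens2, matmul(gap, lens1))

# An input ray along the axis leaves with exactly the accumulated errors.
x2, sigma2, _ = apply(system, [0.0, 0.0, 1.0])
print("propagated dx =", system[0][2], " propagated dsigma =", system[1][2])
print("on-axis input ray exits at x2 =", x2, " sigma2 =", sigma2)
```

The top-left 2×2 block of `system` is the ordinary paraxial system matrix, so the aligned and misaligned descriptions are obtained from the same product.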
APPLICATION TO SPHERICAL ABERRATION
Here we apply the above 3×3 ray matrix method to analyze spherical aberration. Although this aberration cannot be considered a misalignment (perfectly centered elements also present spherical aberration), we consider that the aberration introduces errors in the ray coordinates with respect to the paraxial regime. Let us consider Fig. 2(a), which represents a spherical surface with radius of curvature R separating two media with refractive indices n and n′. S denotes the axial point on the spherical surface, and C its center of curvature. F′0 is the paraxial focus, located at the focal distance f′0 from S. This distance is given by the classical geometrical optics formula derived from the first-order (paraxial) approximation as:
$$f'_0 = \frac{n' R}{n' - n}. \tag{4}$$
However, due to the spherical aberration, a parallel ray impinging on the surface at height x will not focus at F′0 but at another axial point F′x (the true and paraxial ray trajectories are the blue and red lines in Fig. 2(a), respectively). The equation giving this focus point is probably the simplest result that can be derived in the third-order approximation, and it is given by:
where f′x denotes the distance from S to F′x. We can regard the deviation of the real ray trajectory from the paraxial ray in terms of the error parameters in the 3×3 ray matrix formalism. For simplicity, we can approximate Δx≈0 and consider that all errors are produced in the angular coordinate Δσ. Considering Eqs. (4) and (5), this angular deviation is given by:
Using this simple result, we can derive some properties of the spherical aberration of a lens. Let us consider a TL in air, composed of two spherical refracting surfaces with radii of curvature R1 and R2 and a medium of index N (Fig. 2(b)). We use the extended matrix with the angular error and model the thin lens as two successive optical elements in cascade, each with its own angular error according to Eq. (6). Then, the matrix describing the TL is given by:
The matrix product in Eq. (7) results in:
which recovers the TL ray matrix in Eq. (2), including the lens maker formula that provides the paraxial focal length of the lens, 1/f′0=(N−1)((1/R1)−(1/R2)). However, a TL angular deviation, Δσ, is also present in Eq. (9), given by:
Δσ depends on the cube of the height x, with a proportionality factor Ω defined in the equation. It now becomes a simple ray matrix exercise (the TL matrix, Eq. (6), followed by a free propagation) to find the distance f′x at which a parallel ray (σ=0) impinging on the thin lens at height x crosses the axis. The obtained result is:
This shows the second-order dependence of the LSA on the height x.
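The quadratic law can be checked independently of the matrix formalism in the simpler single-surface geometry of Fig. 2(a). The Python sketch below traces a parallel ray exactly with Snell's law, using the law of sines in the triangle formed by C, the point of incidence and F′x, and compares the focal shift with x². It assumes R>0, n′>n and 0<x<R. The third-order coefficient −n²/(2Rn′(n′−n)) used for comparison is the standard expansion of the exact expression and is stated here as our assumption, not as a transcription of the equations of this paper.

```python
import math

def paraxial_focus(n_in, n_out, radius):
    """First-order focal distance measured from the vertex S."""
    return n_out * radius / (n_out - n_in)

def exact_focus(x, n_in, n_out, radius):
    """Exact axial crossing distance from S for a ray parallel to the axis at height x."""
    sin_i = x / radius                    # incidence angle: the normal passes through C
    sin_r = n_in * sin_i / n_out          # Snell's law
    i, r = math.asin(sin_i), math.asin(sin_r)
    # Law of sines in triangle (C, incidence point, F'x): CF' = R sin(r) / sin(i - r)
    return radius + radius * sin_r / math.sin(i - r)

# Example values (assumed): air-to-glass surface.
n_in, n_out, radius = 1.0, 1.5, 100.0
f0 = paraxial_focus(n_in, n_out, radius)
third_order = -n_in**2 / (2.0 * radius * n_out * (n_out - n_in))

lsa = {}
for x in (1.0, 2.0, 4.0):
    lsa[x] = exact_focus(x, n_in, n_out, radius) - f0
    print(f"x = {x:4.1f}  LSA = {lsa[x]:+.6f}  LSA/x^2 = {lsa[x] / x**2:+.7f}")
print("third-order coefficient:", third_order)
```

For small heights the ratio LSA/x² stays nearly constant and close to the third-order coefficient, which is the behaviour expected from a second-order dependence on x.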
Analogously, we can propagate the ray emerging from the lens to the paraxial focal plane, located at distance f′0, and calculate its height there, which provides a measure of the axial spherical aberration (ASA). The result is:
which shows its third-order dependence on x.
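Both dependences can be reproduced with a short numerical version of this matrix exercise. The Python sketch below assumes an extended thin-lens matrix with Δx=0 and Δσ=Ωx³, and treats Ω as a given number; the value used is hypothetical and is not computed from R1, R2 and N. The sign conventions for LSA and ASA are also our assumptions.

```python
# Thin lens with spherical aberration in the extended 3x3 formalism.

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def apply(m, ray):
    """Apply matrix m to a ray given as a list of coordinates."""
    return [sum(row[k] * ray[k] for k in range(len(ray))) for row in m]

def free_propagation(d):
    return [[1.0, d, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

def aberrated_thin_lens(f0, omega, x):
    """Extended thin-lens matrix: dx = 0, dsigma = omega * x**3 (assumed model)."""
    return [[1.0, 0.0, 0.0], [-1.0 / f0, 1.0, omega * x**3], [0.0, 0.0, 1.0]]

f0 = 100.0        # paraxial focal length (example value)
omega = -1.0e-6   # hypothetical aberration factor, units of length**-3

def aberrations(x):
    """Return (LSA, ASA) for a ray parallel to the axis at height x."""
    lens = aberrated_thin_lens(f0, omega, x)
    x_l, sigma_l, _ = apply(lens, [x, 0.0, 1.0])
    f_x = -x_l / sigma_l                       # distance at which the ray crosses the axis
    x_f, _, _ = apply(matmul(free_propagation(f0), lens), [x, 0.0, 1.0])
    return f_x - f0, x_f                       # x_f: height at the paraxial focal plane

results = {x: aberrations(x) for x in (1.0, 2.0, 4.0)}
for x, (lsa, asa) in results.items():
    print(f"x = {x:3.1f}  LSA = {lsa:+.6f}  ASA = {asa:+.6e}")
```

Within this model the height at the paraxial focal plane is exactly Ωf′0x³, while the focal shift equals f′0/(1−Ωf′0x²)−f′0, which reduces to Ωf′0²x² for small x.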
In summary, we have introduced a new way to treat spherical aberration in terms of the extended 3×3 ray matrix method. We believe this is a powerful tool to derive related analytical expressions that otherwise require very lengthy derivations. It is therefore a useful tool to teach aberrations to students who are familiar with the standard ray matrix method. For example, the rigorous derivation of the last two results (the dependence of LSA and ASA on x² and x³, respectively) is rather laborious, and it is usually avoided in Optics textbooks. Even texts that include chapters specifically devoted to this issue present the result without a complete derivation. The ray matrix method proposed here constitutes a formal framework familiar to students in Optics, and it provides a relatively short derivation of these relations. We believe the method can be extended to obtain other related results.
We acknowledge financial support from Spanish Ministerio de Economía y Competitividad, through projects FIS2012-39158-C02-02 and FIS2010-16646, and by Fondo Europeo de Desarrollo Regional (FEDER).
A. Gerrard and J. M. Burch, Introduction to Matrix Methods in Optics, Dover Publications, New York (1975).
A. E. Siegman, Lasers, University Science Books (1986).
F. A. Jenkins and H. E. White, Fundamentals of Optics, McGraw-Hill (1981).