18 June 2019 Mathematical analysis of intensity distribution of the optical image in various degrees of coherence of illumination (representation of intensity by Hermitian matrices)
Abstract

This is a historical translation of the seminal paper by H. Gamo, originally published in Oyo Buturi (Applied Physics, a journal of The Japan Society of Applied Physics) Vol. 25, pp. 431–443, 1956. English translation by Kenji Yamazoe, with further editing by the translator and Anthony Yen.

Since optical systems have distinctive features as compared to electrical communication systems, some formulation should be prepared for the optical image in order to use it in information theory of optical systems. In this paper the following formula for the intensity distribution of the image by an optical system having a given aperture constant α in the absence of both aberration and defect in focusing is obtained by considering the nature of illumination, namely coherent, partially coherent, and incoherent:

$$I(y) = \sum_{n}\sum_{m} a_{nm}\, u_n(y)\, u_m^*(y),$$
where $u_n(y) = \sin\left[\frac{2\pi\alpha}{\lambda}\left(y - \frac{n\lambda}{2\alpha}\right)\right] \Big/ \left[\frac{2\pi\alpha}{\lambda}\left(y - \frac{n\lambda}{2\alpha}\right)\right]$ and $a_{nm} = (2\alpha/\lambda)^2 \iint \Gamma_{12}(x_1 - x_2)\, E(x_1)\, E^*(x_2)\, |A(x_1)|\,|A^*(x_2)|\, u_n(x_1)\, u_m(x_2)\, dx_1\, dx_2$. $I(y)$ is the intensity of the image at a point of coordinate $y$, $\Gamma_{12}$ the phase coherence factor introduced by H. H. Hopkins et al., $E(x)$ the complex transmission coefficient of the object and $A(x)$ the complex amplitude of the incident waves at the object, and the integration is taken over the object plane. The above expression has some interesting features: the “intensity matrix” composed of the elements $a_{nm}$ mentioned above is a positive-definite Hermitian matrix, the diagonal elements are given by the intensities sampled at points of the image plane separated by a distance $\lambda/2\alpha$, and the trace of the matrix, i.e., the sum of the diagonal elements, is equal to the total intensity integrated over the image plane. Since a Hermitian matrix can be reduced to diagonal form by a unitary transformation, the intensity distribution of the image can be expressed as
I(y)=λ1|S

1.

Introduction

Discussion of the relationship between the object and the image of an optical system using information theory is a recent topic. Some basic studies such as describing the optical system using the response function have been done. However, entropy or noise is not discussed enough, though it is important in information theory. Information theory has been successfully applied to electrical communication, but the way it is applied is not directly applicable to an optical system. Hence, we need to formulate the information theory taking specific properties of the optical system into account.

The properties specific to the optical system are the following. (1) The observable quantity is not the amplitude of the wave but its intensity, represented by the square of the amplitude; the intensity must be positive. (2) When we regard the optical system as a spatial filter, the phase and amplitude of the response function defined in the optical system are independent physical quantities; in an electric circuit, they are restricted by causality because the signals vary in time. (3) The information contained in the image depends on the coherence of the illumination; for example, both amplitude and phase information can be obtained with coherent illumination, whereas the phase information cannot be obtained when the illumination is incoherent. (4) If we regard the optical system as a communication channel, noise is an important quantity in deriving its capacity; in the optical system, noise can be stray light, disturbed observation due to random movement of the medium, graininess of a film, perception, and so on; these factors add to the nominal intensity or may partially reduce it; the resultant intensity will, however, never be negative. (5) The optical system behaves like a multi-dimensional spatial filter; in most cases, it is sufficiently described by a two-dimensional spatial filter.

These properties should be considered when we formulate information theory for the optical imaging system. If we do not take them into account, we cannot finalize the formulation of the information theory for the optical imaging system and may only get an insight that might help us understand the optical system.

In what follows, the author examines an aberration-free optical imaging system taking into account the first and second properties described above. In particular, the author will discuss how information contained in the image changes with respect to the third property, i.e., the coherence of illumination, using the sampling theorem that is often used in communication theory. Distribution of the image intensity will be described by a positive-definite Hermitian matrix, termed “intensity matrix,” and the physical meaning of the image will be revealed from the properties of the matrix. For this analysis, “phase coherence factor” introduced by H. H. Hopkins et al. will be used.

2.

Sampling Theorem

“The sampling theorem” used hereafter is briefly reviewed. We assume an optical imaging system in Fig. 1 consisting of a light source, an object, lens, and an image.

Fig. 1

$F(y)$: complex amplitude of the image; $A(x - z)$: complex amplitude of the incident wave from a point source at P; $E(x)$: complex transmission coefficient.


Let $A(x - z)$ be the incident wave from a point P on the source at a point Q on the object plane. When the complex transmission of the object is $E(x)$, the complex amplitude of the wave after the object is $E(x)A(x - z)$. As shown in Fig. 1, if we define $\theta$ as the maximum half-angle of the light cone captured by the lens and $\lambda$ as the wavelength, then $2\sin\theta/\lambda$ plays the role of the bandwidth of an electrical communication system. If we Fourier transform the wave $E(x)A(x - z)$, we obtain

Eq. (1)

$$f(X, z) = \int E(x)\, A(x - z)\, e^{-2\pi i X x}\, dx.$$

In Eq. (1), $X = \sin\theta/\lambda$ represents the directional cosine of the plane wave; thus, $f(X)$ is the amplitude of the plane wave that propagates at the angle $\theta$ given by $\sin\theta = X\lambda$. Here, the $f(X)$ that forms the image is band-limited by the aperture of the optical system. Since $2\sin\theta/\lambda$ behaves like the bandwidth of the optical system, as briefly noted above,

Eq. (2)

$$-\alpha/\lambda \le X \le \alpha/\lambda,$$
where $\alpha = \sin\theta$. Since the bandwidth of $f(X)$ is limited to $2\alpha/\lambda$, it can be expanded in a Fourier series as

Eq. (3)

$$f(X, z) = \sum_{n=-\infty}^{+\infty} a_n(z)\, e^{-2\pi i n (\lambda/2\alpha) X}$$

Eq. (3′)

$$a_n(z) = \frac{\lambda}{2\alpha} \int_{-\alpha/\lambda}^{\alpha/\lambda} f(X, z)\, e^{2\pi i n (\lambda/2\alpha) X}\, dX$$

The amplitude of the wave on the image plane F(y) is given by the inverse Fourier transform of f(X), which has the bandwidth of 2α/λ. Thus, from Eq. (3)

Eq. (4)

$$F(y, z) = \int_{-\alpha/\lambda}^{+\alpha/\lambda} f(X, z)\, e^{+2\pi i X y}\, dX$$

Eq. (4′)

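$$= \sum_{n=-\infty}^{\infty} a_n(z)\, \frac{\sin\left[\frac{2\pi\alpha}{\lambda}\left(y - \frac{n\lambda}{2\alpha}\right)\right]}{\pi\left(y - \frac{n\lambda}{2\alpha}\right)}$$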

On comparing Eq. (3′) and Eq. (4), we notice that $a_n(z)$ is obtained by multiplying $F(n\lambda/2\alpha, z)$, the image amplitude at $y = n\lambda/2\alpha$, by $\lambda/2\alpha$. Thus,

Eq. (5)

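$$F(y, z) = \sum_{n=-\infty}^{\infty} F\left(\frac{n\lambda}{2\alpha}, z\right) \frac{\sin\left[\frac{2\pi\alpha}{\lambda}\left(y - \frac{n\lambda}{2\alpha}\right)\right]}{\frac{2\pi\alpha}{\lambda}\left(y - \frac{n\lambda}{2\alpha}\right)}$$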

Therefore, the image amplitude obtained through the optical system with the bandwidth of $2\alpha/\lambda$ due to its numerical aperture can be determined if the image amplitude is sampled at an interval of $\lambda/2\alpha$. The function set that appears in Eq. (5), i.e., $\{u_n = \sin[(2\pi\alpha/\lambda)(y - n\lambda/2\alpha)]/[(2\pi\alpha/\lambda)(y - n\lambda/2\alpha)]\}$, forms a complete orthogonal system. Hence, with any integers $n$ and $m$,

Eq. (5′)

$$\int_{-\infty}^{+\infty} u_n(y)\, u_m(y)\, dy = \frac{\lambda}{2\alpha}\, \delta_{nm},$$
where $\delta_{nm}$ is the Kronecker delta, which is 1 when $n = m$ and 0 otherwise (see Sec. 10). This result corresponds to the sampling theorem applied to a band-limited electrical system, with the only difference that the amplitude to be sampled is a complex number. The amplitude is complex because of the second property of the optical imaging system noted in Sec. 1. Although this result is obtained for an aberration-free one-dimensional imaging system, we can extend the idea to a two-dimensional imaging system with a square aperture. If we take as the basis function set the products $\{\sin[(2\pi\alpha/\lambda)(\xi - n\lambda/2\alpha)]/[(2\pi\alpha/\lambda)(\xi - n\lambda/2\alpha)]\}\,\{\sin[(2\pi\alpha/\lambda)(\eta - m\lambda/2\alpha)]/[(2\pi\alpha/\lambda)(\eta - m\lambda/2\alpha)]\}$, any band-limited complex amplitude can be represented by a series expansion. A rigorous treatment of the sampling for a circular aperture is not straightforward, but the sampling grid of a circular aperture is the same as that of the square aperture. For example, given a circle, we can define a sampling grid using the circumscribed square, and the amplitude distribution can be determined by the sampled values, though these sampled values are not fully independent of each other.
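The expansion in Eq. (5) can be tried numerically. The following is a minimal sketch, not part of the original paper: the test amplitude `amplitude`, the off-grid centre, and the truncation `n_max` are assumptions chosen only so that the spectrum stays inside the band $|X| \le \alpha/\lambda$ and the truncated sum converges quickly. Here `step` stands for $\lambda/2\alpha$.

```python
import cmath
import math

def u(n, y, step):
    """Sampling function u_n(y) of Eq. (5); step = lambda / (2 * alpha)."""
    t = math.pi * (y - n * step) / step
    return 1.0 if abs(t) < 1e-12 else math.sin(t) / t

def amplitude(y, step):
    """Hypothetical complex amplitude, band-limited to |X| <= 1 / (2 * step)."""
    b = 0.25 / step                      # the sinc^2 envelope occupies |X| <= b
    x0 = 0.125 / step                    # frequency shift, still inside the cutoff
    t = math.pi * b * (y - 0.3 * step)   # 0.3 * step: off-grid centre (assumption)
    envelope = 1.0 if abs(t) < 1e-12 else (math.sin(t) / t) ** 2
    return envelope * cmath.exp(2j * math.pi * x0 * y)

def reconstruct(y, step, n_max=200):
    """Truncated Eq. (5): samples F(n * step) weighted by u_n(y)."""
    return sum(amplitude(n * step, step) * u(n, y, step)
               for n in range(-n_max, n_max + 1))

step = 1.0  # lambda / (2 * alpha) in arbitrary units
errors = [abs(reconstruct(y, step) - amplitude(y, step)) for y in (0.37, -1.21, 2.5)]
print(max(errors))
```

The residual printed at the end comes only from truncating the infinite sum; the samples themselves are complex, as the text points out.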

The above discussion holds when the illumination is a point source; in other words, it only holds when the illumination is coherent. Even when the illumination is coherent, the physical meaning of the result is revealed only when we can measure the phase and amplitude independently in some appropriate way, such as by measuring the phase difference. This is due to the first property noted in Sec. 1, which demands that only the intensity can be measured directly. In the following, a method to analyze the physical meaning of the image intensity will be introduced, which can be applied even when the illumination is partially coherent or incoherent.

3.

Intensity Matrix Related to the Phase Coherence Factor

We are able to obtain the image intensity distribution by Eq. (5) in Sec. 2. Letting I(y) be the image intensity at a point R on the image plane with coordinate y,

Eq. (6)

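$$I(y) = F(y, z)\, F^*(y, z) = \sum_{n}\sum_{m} F\left(\frac{n\lambda}{2\alpha}, z\right) F^*\left(\frac{m\lambda}{2\alpha}, z\right) u_n(y)\, u_m^*(y)$$
where
$$u_n(y) = \sin\left[\frac{2\pi\alpha}{\lambda}\left(y - \frac{n\lambda}{2\alpha}\right)\right] \Big/ \left[\frac{2\pi\alpha}{\lambda}\left(y - \frac{n\lambda}{2\alpha}\right)\right]$$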

We now consider the image intensity from a generalized source of a finite extent with an arbitrary intensity at each point. Letting J(z) be the point source intensity at a point P with coordinate z, the image intensity at a point R is given by

Eq. (7)

$$I(y) = \sum_{n=-\infty}^{+\infty} \sum_{m=-\infty}^{+\infty} a_{nm}\, u_n(y)\, u_m^*(y)$$

Eq. (7′)

$$a_{nm} = \int J(z)\, F\left(\frac{n\lambda}{2\alpha}, z\right) F^*\left(\frac{m\lambda}{2\alpha}, z\right) dz$$

Since point sources are mutually incoherent, the total intensity is the sum of the intensities formed by the individual point sources. This can be explained from the statistical viewpoint as follows. Intensity is found by taking the time average of the square of the amplitude of the light wave. Since each point source has no correlation with any other, the time average of the square of the superimposed light waves is equal to the sum of the time averages of the squares of the individual light waves.

The main feature of this paper is based on Eq. (7), which is a quadratic form in the variables $u_n$ and $u_m$ with coefficients $a_{nm}$. Since the intensity cannot be negative, this is a positive quadratic form. The matrix with elements $a_{nm}$ is termed the “intensity matrix” in this paper. (The intensity matrix has properties similar to those of the information matrix proposed for general systems by D. M. MacKay (Ref. 7). Although N. Wiener (Ref. 8) proposed the coherency matrix by assuming time-varying light waves, it is not applied to an optical imaging system.) Once this matrix is given, we can determine the image intensity distribution. In this sense, the intensity matrix contains all the information of the image intensity distribution. In what follows, the physical properties of the intensity matrix will be examined.
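As an illustration of Eqs. (7) and (7′), the sketch below builds a small intensity matrix from two mutually incoherent source points and evaluates the quadratic form. All numbers are hypothetical, the integral over $z$ is replaced by a two-term sum, and only three sampling points are kept; it is meant as a toy check, not as the author's computation.

```python
import math

def u(n, y, step):
    """Sampling function u_n(y); step = lambda / (2 * alpha)."""
    t = math.pi * (y - n * step) / step
    return 1.0 if abs(t) < 1e-12 else math.sin(t) / t

# Hypothetical data: sampled image amplitudes F(n * step, z) produced by two
# mutually incoherent source points z1 and z2, and their intensities J(z).
samples = {"z1": [1.0 + 0.0j, 0.5j, -0.2 + 0.1j],
           "z2": [0.3 - 0.4j, 1.0 + 0.0j, 0.6j]}
source_intensity = {"z1": 1.0, "z2": 0.5}
size = 3

# Eq. (7'): a_nm = sum over z of J(z) F_n(z) F_m(z)^* (discrete stand-in for the integral).
a = [[sum(source_intensity[z] * samples[z][n] * samples[z][m].conjugate()
          for z in samples) for m in range(size)] for n in range(size)]

def intensity_from_matrix(y, step=1.0):
    """Eq. (7): quadratic form in the (real) sampling functions u_n(y)."""
    us = [u(n, y, step) for n in range(size)]
    return sum(a[n][m] * us[n] * us[m]
               for n in range(size) for m in range(size)).real

def intensity_direct(y, step=1.0):
    """Sum of the intensities formed by each source point separately."""
    return sum(source_intensity[z]
               * abs(sum(samples[z][n] * u(n, y, step) for n in range(size))) ** 2
               for z in samples)

for y in (-0.4, 0.25, 1.7):
    print(y, intensity_from_matrix(y), intensity_direct(y))
```

Both routes give the same non-negative intensity, which is the sense in which the matrix carries all the information about the image intensity.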

The element of the intensity matrix anm is expressed using the Hopkins’ phase coherence factor. The use of the phase coherence factor is beneficial for our purpose because it has been studied in detail. First, Eq. (7′) is changed to (See Sec. 8.)

Eq. (8)

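$$a_{nm} = \iiint J(z)\, E(x_1)\, E^*(x_2)\, A(x_1 - z)\, A^*(x_2 - z)\, u\left(x_1 - \frac{n\lambda}{2\alpha}\right) u^*\left(x_2 - \frac{m\lambda}{2\alpha}\right) dx_1\, dx_2\, dz,$$
where $u(x_1 - n\lambda/2\alpha)$ or $u(x_2 - m\lambda/2\alpha)$ is the amplitude of light arriving at the point $x = n\lambda/2\alpha$ or $m\lambda/2\alpha$ on the image plane through the optical system with bandwidth $2\alpha/\lambda$ ($\alpha$: numerical aperture) when it has unit amplitude at point $Q_1$ or $Q_2$ on the object plane. As we assume an aberration-free optical system, it is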

Eq. (8′)

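$$u\left(x_1 - \frac{n\lambda}{2\alpha}\right) = \sin\left[\frac{2\pi\alpha}{\lambda}\left(x_1 - \frac{n\lambda}{2\alpha}\right)\right] \Big/ \left[\pi\left(x_1 - \frac{n\lambda}{2\alpha}\right)\right]$$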

Equation (8) shows the relationship of anm to the phase coherence factor. Inside Eq. (8), we single out the integral with respect to the source coordinate z, and set it as I(x1,x2). Then

Eq. (9)

$$I(x_1, x_2) = \int J(z)\, A(x_1 - z)\, A^*(x_2 - z)\, dz$$

This value can be regarded as a correlation between the light wave amplitudes at points $x_1$ and $x_2$ on the object plane. If the amplitudes at these points generated by a point source at $z$ are $U_1$ and $U_2$, we obtain

Eq. (9′)

$$I(x_1, x_2) = \int U_1 U_2^*\, dz$$

When the intensity at x1 is I1 and at x2 is I2,

Eq. (9″)

$$I(x_1, x_2) = \sqrt{I_1 I_2}\; \Gamma(x_1 - x_2)$$

Here, $\Gamma(x_1 - x_2)$ is the phase coherence factor, which is by definition given by

Eq. (10)

$$\Gamma(x_1 - x_2) = \frac{1}{\sqrt{I_1 I_2}} \int U_1 U_2^*\, dz$$

When the light source Σ illuminates the object, the intensity $I_i$ ($i = 1, 2$) is the square of the absolute value of the complex amplitude of the incident light at $x_i$, which is represented as $|A(x_i)|^2$. Therefore,

Eq. (11)

$$a_{nm} = \iint \Gamma(x_1 - x_2)\, E(x_1)\, E^*(x_2)\, |A(x_1)|\, |A^*(x_2)|\, u\left(x_1 - \frac{n\lambda}{2\alpha}\right) u^*\left(x_2 - \frac{m\lambda}{2\alpha}\right) dx_1\, dx_2,$$
where $E(x)$ is the complex transmittance of the object, $A(x)$ is the complex amplitude of the incident light, $u(x_1 - n\lambda/2\alpha)$ given by Eq. (8′) is the amplitude of the wave at $n\lambda/2\alpha$ on the image plane that results from the propagation of light with unit amplitude at $x_1$ on the object plane, and $\Gamma(x_1 - x_2)$ is the phase coherence factor. According to Hopkins, $\Gamma$ is given by

Eq. (12)

$$\Gamma(X_1 - X_2, Y_1 - Y_2) = \frac{1}{\sqrt{I_1 I_2}} \iint J(x, y)\, e^{i[x(X_1 - X_2) + y(Y_1 - Y_2)]}\, dx\, dy,$$
where J(x,y) is the intensity at (x,y) on the source. An illumination system with a condenser lens can also result in the same equation as Eq. (12) when an appropriate light source is given.

Equation (11) is our first formulation. We discuss a few examples of the intensity matrix $\|a_{nm}\|$ in the next section.

4.

Examples of the Intensity Matrix

4.1.

Coherent Source

When the light source is coherent, the phase coherence factor $\Gamma(x_1 - x_2)$ is 1 regardless of $x_1$ or $x_2$ (Ref. 2). Therefore, by substituting $\Gamma(x_1 - x_2) = 1$ into Eq. (11), the integral can be separated into a product of two integrals to yield

Eq. (13)

$$a_{nm} = F\left(\frac{n\lambda}{2\alpha}\right) F^*\left(\frac{m\lambda}{2\alpha}\right)$$

The matrix element is a product of the complex amplitudes on the image plane at $n\lambda/2\alpha$ and $m\lambda/2\alpha$. This result simply reproduces Eq. (6) of Sec. 3, but analyzing the meaning of the above matrix gives an important insight for the remaining part of this paper. In the coherent case, a remarkable feature of the matrix is that its rank is 1 and its single non-zero eigenvalue equals $\sum a_{nn}$. The rank of a matrix is said to be $r$ when at least one sub-matrix of order $r$ has a non-zero determinant while the determinants of all sub-matrices of order higher than $r$ are 0. If the matrix elements are given by Eq. (13), the determinants of all sub-matrices of order higher than 1 are 0. The characteristic equation that gives the eigenvalues is

$$\det|a_{nm} - \lambda\,\delta_{nm}| = 0,$$
which, writing $x$ for the eigenvalue, is equivalent to $x^N - (\sum a_{nn})\, x^{N-1} = 0$. The single non-zero eigenvalue is equal to the trace of the matrix.
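The rank-1 structure of Eq. (13) is easy to confirm on a toy example. The sketch below uses three hypothetical sampled amplitudes `F` (an assumption, not data from the paper), checks that every 2 × 2 minor of the matrix vanishes, and checks that the vector of samples is an eigenvector whose eigenvalue is the trace.

```python
from itertools import combinations

# Hypothetical sampled image amplitudes F(n * lambda / (2 * alpha)), n = 0, 1, 2.
F = [1.0 + 2.0j, -0.5j, 0.25 + 0.0j]
size = len(F)

# Eq. (13): a_nm = F_n * conj(F_m)
a = [[F[n] * F[m].conjugate() for m in range(size)] for n in range(size)]

trace = sum(a[n][n] for n in range(size)).real

# All 2 x 2 minors (rows i < j, columns k < l) vanish for a rank-1 matrix.
minors = [a[i][k] * a[j][l] - a[i][l] * a[j][k]
          for i, j in combinations(range(size), 2)
          for k, l in combinations(range(size), 2)]

# A F = (sum |F_m|^2) F, so the only non-zero eigenvalue equals the trace.
a_times_f = [sum(a[n][m] * F[m] for m in range(size)) for n in range(size)]

print(trace, max(abs(x) for x in minors))
```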

4.2.

Incoherent Source

In this case, the phase coherence factor Γ is

$$\Gamma(x_1 - x_2) = \begin{cases} 0, & x_1 \ne x_2 \\ 1, & x_1 = x_2 \end{cases}$$

When this Γ is substituted into Eq. (11), the element anm of the intensity matrix for the incoherent source is given by the single integral as

Eq. (14)

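$$a_{nm} = \int_{-\infty}^{+\infty} I(x)\, u\left(x - \frac{n\lambda}{2\alpha}\right) u^*\left(x - \frac{m\lambda}{2\alpha}\right) dx$$
where
$$I(x) = |E(x)A(x)|^2, \qquad u\left(x - \frac{n\lambda}{2\alpha}\right) = \sin\left[\frac{2\pi\alpha}{\lambda}\left(x - \frac{n\lambda}{2\alpha}\right)\right] \Big/ \left[\pi\left(x - \frac{n\lambda}{2\alpha}\right)\right]$$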

Equation (14) is connected to the equation that gives the image I(y) from an incoherent source with intensity I(x), which can be expressed as an integral by the incoherent imaging formula

Eq. (14′)

$$I(y) = \int_{-\infty}^{+\infty} I(x)\, |u(x - y)|^2\, dx$$

According to Eq. (14′), a single integral yields the image whereas the expression by the intensity matrix seems to involve more integrations which gives us an impression of extra cost of calculation. This point is discussed at the end.

The simplest example is an object with uniform brightness. If we set I(x)=A where A is a constant

Eq. (15)

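$$a_{nm} = A \int_{-\infty}^{+\infty} u\left(x - \frac{n\lambda}{2\alpha}\right) u^*\left(x - \frac{m\lambda}{2\alpha}\right) dx = \frac{2\alpha}{\lambda}\, A\, \delta_{nm} \qquad (\delta_{nm} = 1,\ n = m;\ \delta_{nm} = 0,\ n \ne m)$$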

[The intensities I(y) in Eq. (14′) and I(y) from Eq. (15) must be the same (See Sec. 9).] Our next example is simple and basic, in which the object has a sinusoidal amplitude over the object plane. The brightness on the object plane is

$$I(x) = A\cos^2 \omega x = \tfrac{1}{2} A\, (1 + \cos 2\omega x)$$

In this case, the element of the intensity matrix is represented by the following integral

Eq. (16)

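$$a_{nm} = \frac{1}{2} A \int_{-\infty}^{+\infty} (1 + \cos 2\omega x)\, \frac{\sin\left[\frac{2\pi\alpha}{\lambda}\left(x - \frac{n\lambda}{2\alpha}\right)\right] \sin\left[\frac{2\pi\alpha}{\lambda}\left(x - \frac{m\lambda}{2\alpha}\right)\right]}{\pi\left(x - \frac{n\lambda}{2\alpha}\right)\, \pi\left(x - \frac{m\lambda}{2\alpha}\right)}\, dx$$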

We now change the variables. Letting $\xi = 2\pi\alpha x/\lambda$ and $p = \lambda\omega/2\pi\alpha$,

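$$a_{nm} = \frac{A\alpha}{\pi\lambda} \int_{-\infty}^{+\infty} (1 + \cos 2p\xi)\, \frac{\sin(\xi - n\pi)\, \sin(\xi - m\pi)}{(\xi - n\pi)(\xi - m\pi)}\, d\xi$$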

This type of definite integral will appear in the following but since the result may not be found in standard integral tables, the principal result together with its derivation is listed in Sec. 10. According to Sec. 10, when 0<p<1, i.e., 0<ω<2πα/λ,

Eq. (16′)

$$a_{nm} = \frac{A\alpha}{\lambda}\, \frac{\sin[(n - m)\pi(1 - p)]}{(n - m)\pi}\, \cos[(n + m)p\pi], \qquad n \ne m$$

Eq. (16″)

$$a_{nm} = \frac{A}{2}\left[1 + (1 - p)\cos 2np\pi\right], \qquad n = m$$

When $p > 1$, or equivalently $\omega > 2\pi\alpha/\lambda$, $a_{nm}$ is 0. Because $a_{nm}$ vanishes above this spatial frequency threshold, the response function for the image intensity in Eq. (14) has a triangular shape (Ref. 1) and is zero outside the bandwidth of $4\pi\alpha/\lambda$.
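The off-diagonal closed form of Eq. (16′) can be compared with a direct numerical evaluation of the integral in the $\xi$ variable. The sketch below works in units of $A\alpha/\lambda$ and uses a plain midpoint rule on a finite window; the window `half_width` and step `h` are assumptions, so the agreement is only to about three decimal places because the tails of the integrand are cut off.

```python
import math

def s(t):
    """sin(t) / t with the removable singularity handled."""
    return 1.0 if abs(t) < 1e-12 else math.sin(t) / t

def a_numeric(n, m, p, half_width=400.0, h=0.02):
    """Midpoint-rule estimate of the xi-integral, in units of A * alpha / lambda."""
    steps = int(round(2.0 * half_width / h))
    total = 0.0
    for k in range(steps):
        xi = -half_width + (k + 0.5) * h
        total += (1.0 + math.cos(2.0 * p * xi)) * s(xi - n * math.pi) * s(xi - m * math.pi)
    return total * h / math.pi

def a_closed(n, m, p):
    """Eq. (16') for n != m and 0 < p < 1, in units of A * alpha / lambda."""
    d = n - m
    return math.sin(d * (1.0 - p) * math.pi) / (d * math.pi) * math.cos((n + m) * p * math.pi)

pairs = [(0, 1, 0.3), (1, -1, 0.25)]
results = [(a_numeric(n, m, p), a_closed(n, m, p)) for n, m, p in pairs]
for numeric, closed in results:
    print(numeric, closed)
```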

We now consider the rank and eigenvalues of the intensity matrix. First, when the object brightness is uniform, the intensity matrix in Eq. (15) is a diagonal matrix with non-zero diagonal elements. Next, when the object brightness changes sinusoidally, and in particular when $p = 1/2$, or equivalently $\omega = \pi\alpha/\lambda$, the intensity matrix is also a diagonal matrix, with diagonal elements $(1/2)A(\lambda/2\alpha)\,[1 + (1/2)(-1)^n]$. These examples show that incoherent illumination gives the greatest matrix rank, in contrast to coherent illumination, which gives the minimum matrix rank of 1. (When the object size is finite, the object intensity or complex transmittance can be represented by a Fourier series; hence we may be able to apply the results of Secs. 4.2 and 4.3 to each Fourier term.)

4.3.

Partially Coherent Source

We consider the one-dimensional case, under which the phase coherence factor is defined in Eq. (12) as

Eq. (17)

$$\Gamma(x_1 - x_2) = \frac{\sin\left[\frac{2\pi\alpha}{\lambda} S\, (x_1 - x_2)\right]}{\frac{2\pi\alpha}{\lambda} S\, (x_1 - x_2)},$$
where $S$ is a parameter that represents the extent of the source. Clearly, in the limit $S \to 0$, $\Gamma = 1$. If, in Eq. (17), $\alpha$ is equal to the numerical aperture of the imaging system, the condition $S = 1$ means that the numerical aperture of the imaging system is equal to the sine of the angle of view from the object to the source. Therefore, it is important to distinguish $S > 1$ and $S < 1$.

Let us consider the intensity matrix element for an image formed by uniform object transmittance. Substituting Γ(x1x2) in Eq. (17) into Eq. (11), and furthermore letting E(x)A(x)=K,

Eq. (18)

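$$a_{nm} = \frac{K^2}{\pi^2} \int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty} \frac{\sin S(\xi - \eta)}{S(\xi - \eta)}\, \frac{\sin(\xi - n\pi)}{(\xi - n\pi)}\, \frac{\sin(\eta - m\pi)}{(\eta - m\pi)}\, d\xi\, d\eta,$$
where $\xi = 2\pi\alpha x_1/\lambda$ and $\eta = 2\pi\alpha x_2/\lambda$. This integration yields (see Sec. 10)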

Eq. (18′)

$$S > 1: \qquad a_{nm} = \frac{K^2}{S}\, \delta_{nm}$$

Eq. (18″)

$$S < 1: \qquad a_{nm} = K^2\, \frac{\sin[S(n - m)\pi]}{S(n - m)\pi}$$
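Equations (18′) and (18″) can be wrapped in a small helper to see how the matrix moves between the two limits. This is only an illustrative sketch: the function name `a_uniform` and the argument `k2` (standing for $K^2$) are hypothetical, and the boundary value $S = 1$, which the text does not assign to either case, is placed in the first branch because both formulas agree there.

```python
import math

def a_uniform(n, m, s_param, k2=1.0):
    """Intensity-matrix element for uniform transmittance, Eqs. (18') and (18'')."""
    if s_param >= 1.0:                      # Eq. (18'); S = 1 included by assumption
        return k2 / s_param if n == m else 0.0
    if n == m:                              # limit of sin(t) / t as t -> 0
        return k2
    t = s_param * (n - m) * math.pi
    return k2 * math.sin(t) / t             # Eq. (18'')

# Nearly coherent source: every element approaches K^2, i.e. a rank-1 matrix.
nearly_coherent = [[a_uniform(n, m, 0.01) for m in range(4)] for n in range(4)]
# Broad source (S = 2): diagonal matrix with elements K^2 / S.
broad_source = [[a_uniform(n, m, 2.0) for m in range(4)] for n in range(4)]

for row in nearly_coherent:
    print([round(x, 4) for x in row])
for row in broad_source:
    print(row)
```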

Next, we consider an object with a sinusoidal transmittance. Substituting Γ(x1x2) in Eq. (17) into Eq. (11), and furthermore letting E(x)A(x)=Kcosωx,

Eq. (19)

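$$a_{nm} = \frac{K^2}{\pi^2} \int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty} \cos p\xi\, \cos p\eta\, \frac{\sin S(\xi - \eta)}{S(\xi - \eta)}\, \frac{\sin(\xi - n\pi)}{(\xi - n\pi)}\, \frac{\sin(\eta - m\pi)}{(\eta - m\pi)}\, d\xi\, d\eta,$$
where $\xi = 2\pi\alpha x_1/\lambda$, $\eta = 2\pi\alpha x_2/\lambda$, and $p = \lambda\omega/2\pi\alpha$. The mathematical details are described in Sec. 10 and only the results are shown here: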

Eq. (20‐1)

$$\text{(i)}\quad p > (1 + S)\ \text{and}\ (S > 1\ \text{or}\ S < 1): \qquad a_{nm} = 0$$

Eq. (20-2)

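$$\text{(ii)}\quad (1 + S) > p > 1\ \text{and}\ (S > 1\ \text{or}\ S < 1): \qquad \begin{cases} a_{nm} = \dfrac{K^2}{2S}\, \dfrac{A}{(n - m)\pi} & (n \ne m) \\[2ex] a_{nn} = \dfrac{K^2}{2S}\, (1 + S - p) \end{cases}$$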

Eq. (20-3)

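$$1 > p > (S - 1)\ \text{or}\ (1 - S)\ \text{and}\ (S > 1\ \text{or}\ S < 1): \qquad \begin{cases} a_{nm} = \dfrac{K^2}{2S}\, \dfrac{A + B}{(n - m)\pi} & (n \ne m) \\[2ex] a_{nn} = \dfrac{K^2}{2S}\left[1 + S - p + 2(1 - p)\cos 2np\pi\right] \end{cases}$$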
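where $A$ and $B$ are
$$A = \cos np\pi\, \sin\frac{(n - m)(S + 1 - p)\pi}{2}\, \cos\frac{[(n + m)p + (S - 1)(n - m)]\pi}{2} - \sin np\pi\, \sin\frac{(n - m)(p + 1 - S)\pi}{2}\, \sin\frac{[(n + m)p + (S + 1)(n - m)]\pi}{2}$$
$$2B = \sin[(n - m)(1 - p)\pi]\, \cos[(n + m)p\pi] - \sin[(n - m)p\pi]\, \cos[(n + m)p + (n - m)\pi]$$
$$\text{(iii)}\quad S > 1\ \text{and}\ (S - 1) > p > 1: \qquad a_{nm} = 0\ \ (n \ne m), \qquad a_{nn} = \frac{K^2}{2S}$$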

Eq. (20‐4)

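$$S > 1\ \text{and}\ 1 > p > 0: \qquad a_{nm} = \frac{K^2}{4S}\, \frac{\sin[(n - m)(1 - p)\pi]}{(n - m)\pi}\, \cos[(n + m)p\pi]\ \ (n \ne m), \qquad a_{nn} = \frac{K^2}{4S}\left[1 + (1 - p)\cos 2np\pi\right]$$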

Eq. (20-5)

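$$\text{(iv)}\quad S < 1\ \text{and}\ (1 - S) > p > 0: \qquad a_{nm} = K^2 \cos^2(p\eta\pi)\, \frac{\sin[S(n - m)\pi]}{S(n - m)\pi}\ \ (n \ne m), \qquad a_{nn} = K^2 \cos^2(p\eta\pi)$$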

We may note a few interesting points from the above example of partially coherent illumination. First, for incoherent illumination, $a_{nm} = 0$ when $p > 1$, or $\omega > 2\pi\alpha/\lambda$, leading to no contribution to the image, whereas for partially coherent illumination, $a_{nm} = 0$ only when $p > (1 + S)$, or $\omega > (1 + S)2\pi\alpha/\lambda$, meaning that there are more non-zero matrix elements than for incoherent illumination. Second, if $S > 1$ and $(S - 1) > p > 1$ as in case (iii), we obtain the same result as for incoherent illumination. Lastly, if $S < 1$ and $(1 - S) > p > 0$ as in case (iv), the result reduces to the coherent case in the limit $S \to 0$. Deriving the rank of the intensity matrix with the elements $a_{nm}$ given above in order to calculate its eigenvalues is an extremely difficult problem, except in some special cases. This fact implies that the use of the intensity matrix for image formation may be limited in practice. However, since the intensity matrix itself has interesting general properties, it can be useful in clarifying and organizing the physical concept of optical image formation.

Up to this point, we have considered an object of sinusoidal transmittance or brightness. If the object has an arbitrary distribution of transmittance or brightness, its transmittance or brightness can be expressed as a sum of periodic terms using the Fourier integral. Since each term can be treated as is explained above, we can obtain the elements of the intensity matrix when the object has an arbitrary transmittance or brightness distribution.

5.

General Properties of the Intensity Matrix

The intensity matrix was derived by squaring the amplitude obtained from the sampling theorem applied to a band-limited system. In electrical communication, since the square of the amplitude corresponds to the electric power, we may be able to derive an equation similar to the intensity matrix. However, this idea is not as crucial as the intensity matrix in an optical imaging system because the phase coherence factor explained in Sec. 1 is a unique feature only for an optical imaging system.

For detailed discussion in the following, Eq. (7) is repeated here. The intensity I(y) is

Eq. (7)

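$$I(y) = \sum_{n=-\infty}^{+\infty} \sum_{m=-\infty}^{+\infty} a_{nm}\, u_n(y)\, u_m^*(y),$$
where $u_n(y) = \sin\left[\frac{2\pi\alpha}{\lambda}\left(y - \frac{n\lambda}{2\alpha}\right)\right] \Big/ \left[\frac{2\pi\alpha}{\lambda}\left(y - \frac{n\lambda}{2\alpha}\right)\right]$.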

If $u_1, u_2, \ldots, u_n, \ldots$ are regarded as the components of a vector in a multi-dimensional space, the linear sums $\sum_m a_{nm} u_m = v_n$ form another vector. The intensity is then given by the inner product of these two vectors

Eq. (21)

$$I(y) = (\phi, A\phi),$$
where $\phi$ is the vector $\{u_1, u_2, \ldots, u_n, \ldots\}$ and $A$ is the matrix with elements $a_{nm}$.

Degrees of freedom for the image can be equated with the number of dimensions $N$ of the multi-dimensional space introduced above. The number of degrees of freedom $N$ is exactly the number of sampling points used to express the image under consideration. If the area of the image is $S$, $N = 4\alpha^2 S/\lambda^2$. Toraldo di Francia defined the degrees of freedom for a coherent source, which agrees with $N$ defined above. However, Toraldo di Francia gave a different definition for an incoherent source. In comparison, it seems mathematically consistent, regardless of the coherence of the source, to define the degrees of freedom as the number of dimensions used to determine the image. The source coherence is contained in the intensity matrix. Therefore, one can separate the number of dimensions determined by the numerical aperture of the imaging system from the source coherence.

Six important physical properties of the intensity matrix will be discussed next. The first physical property is related to the fact that the only observable quantity is the intensity, which must be real and positive. This fundamental fact leads to the following mathematical property:

  • i. The intensity matrix is a positive-definite Hermitian matrix.

If a matrix is a Hermitian matrix, its elements have the following property

Eq. (22)

$$a_{nm} = a_{mn}^*$$
which can be verified by putting $I(y) = I^*(y)$ in Eq. (7). The matrix is positive-definite because Eq. (7) is always positive regardless of $y$. (To satisfy this condition, all the principal sub-matrices of the matrix $\|a_{nm}\|$ need to have positive determinants.) We can now utilize the mathematical knowledge on positive-definite Hermitian matrices to investigate the intensity matrix for the image.

Next, let us consider the direct relationship between the elements of the intensity matrix and the observable physical quantity. The first relationship is:

  • ii. Diagonal elements of the intensity matrix are equal to the intensities sampled at the interval of λ/2α, and the trace of the intensity matrix is equal to the integrated image intensity over the image plane.

The first half of the statement can be understood from Eq. (7) in which un(y) is 1 at the sampling point of y=nλ/2α and zero at other sampling points. For the second half of the statement, the integrated image is obtained by taking the integral all over the image plane; therefore it is proved by integrating Eq. (7) with reference to Eq. (5′),

Eq. (23)

$$I_0 = \int I(y)\, dy = \frac{\lambda}{2\alpha}\left(\sum_{n=-\infty}^{+\infty} a_{nn}\right).$$
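The first half of property ii can be checked directly, since $u_n$ equals 1 at its own sampling point and 0 at the others. The sketch below uses a hypothetical 3 × 3 Hermitian, diagonally dominant matrix (an assumption made only for illustration) and evaluates Eq. (7) at the sampling points. The integrated intensity is computed from Eq. (23) rather than by numerical integration.

```python
import math

def u(n, y, step):
    """Sampling function u_n(y); step = lambda / (2 * alpha)."""
    t = math.pi * (y - n * step) / step
    return 1.0 if abs(t) < 1e-12 else math.sin(t) / t

# Hypothetical 3 x 3 intensity matrix (Hermitian and diagonally dominant).
a = [[2.0, 0.5 - 0.3j, 0.1j],
     [0.5 + 0.3j, 1.0, -0.2],
     [-0.1j, -0.2, 0.7]]
step = 0.5  # lambda / (2 * alpha), arbitrary units

def intensity(y):
    """Eq. (7) restricted to the three sampling functions u_0, u_1, u_2."""
    us = [u(n, y, step) for n in range(3)]
    return sum(a[n][m] * us[n] * us[m] for n in range(3) for m in range(3)).real

# Intensities at the sampling points y = k * step reproduce the diagonal elements.
sampled = [intensity(k * step) for k in range(3)]
# Eq. (23): integrated intensity = (lambda / (2 * alpha)) * trace.
integrated = step * sum(a[n][n].real for n in range(3))

print(sampled, integrated)
```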

The source coherence is contained in the properties of the intensity matrix, which are expressed more explicitly by diagonalizing the matrix with a unitary transformation. If the intensity $I$ is expressed as the inner product of vectors as in Eq. (21), let $\phi$ be expressed by basis vectors $\phi_1, \phi_2, \phi_3, \ldots$. These vectors are orthogonal, for example, $\phi_1 = (1, 0, 0, \ldots)$, $\phi_2 = (0, 1, 0, \ldots)$, and so on. With these basis vectors, the vector $\phi$ is given by

Eq. (24)

$$\phi = u_1\phi_1 + \cdots + u_r\phi_r + \cdots$$

Now, an orthogonal transformation of the basis will change Eq. (21) into the simplest form. The transformed basis vectors must be eigenvectors of the intensity matrix. That is, the eigenvectors $\psi_1, \psi_2, \ldots, \psi_i, \ldots$ satisfy the following condition

Eq. (25)

$$A\psi_i = \lambda_i \psi_i$$
where $\lambda_i$ is a constant called an eigenvalue; it is proved that a Hermitian matrix always has real eigenvalues. Letting the components of the eigenvector $\psi$ be $(S_1, S_2, S_3, \ldots)$, the equation that determines the eigenvalue and eigenvector, $A\psi = \lambda\psi$, is written as

Eq. (26)

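$$\left.\begin{aligned} (a_{11} - \lambda)S_1 + a_{12}S_2 + a_{13}S_3 + \cdots + a_{1n}S_n &= 0 \\ a_{21}S_1 + (a_{22} - \lambda)S_2 + a_{23}S_3 + \cdots + a_{2n}S_n &= 0 \\ \vdots \qquad & \\ a_{n1}S_1 + a_{n2}S_2 + a_{n3}S_3 + \cdots + (a_{nn} - \lambda)S_n &= 0 \end{aligned}\right\}$$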

For the $S_i$ to have non-trivial solutions, the determinant of the matrix formed by the coefficients has to be zero. The resulting equation is called the characteristic equation, whose solutions are the eigenvalues $\lambda_1, \lambda_2, \ldots$. If each eigenvalue is substituted into Eq. (26), we obtain a set of simultaneous equations whose solution is the eigenvector corresponding to that eigenvalue. Letting the components of the eigenvector $\psi_i$ corresponding to the eigenvalue $\lambda_i$ be $(S_{1i}, S_{2i}, S_{3i}, \ldots, S_{ni})$, the square of $|\psi_i|$ is given by $(\psi_i, \psi_i) = S_{1i}S_{1i}^* + S_{2i}S_{2i}^* + \cdots + S_{ni}S_{ni}^*$. Hereinafter, we normalize the square of $|\psi_i|$ to 1. In addition, it is proved that the eigenvectors are mutually orthogonal (Ref. 5). Therefore, the orthonormal condition can be written as

Eq. (27)

$$(\psi_i, \psi_j) = S_{1i}^* S_{1j} + S_{2i}^* S_{2j} + \cdots + S_{ni}^* S_{nj} = \delta_{ij}$$

Let us return to our original intention. Suppose that the original basis vectors $\phi_1, \phi_2, \ldots, \phi_n$ are transformed to the eigenvectors $\psi_1, \psi_2, \ldots, \psi_n$. Then the vector $\phi = (u_1, u_2, \ldots, u_n)$ can be represented by a new vector $\psi = (\xi_1, \xi_2, \ldots, \xi_n)$, i.e.,

Eq. (28)

$$\phi = \xi_1\psi_1 + \xi_2\psi_2 + \cdots + \xi_n\psi_n$$

If we denote it by the vector components

Eq. (29)

$$u_i = \sum_k S_{ik}\, \xi_k$$
we will obtain

Eq. (30)

$$\phi = S\psi$$
where the matrix $S$ has $S_{ik}$ as its $i$-th row, $k$-th column element.

The coefficients in Eq. (28), i.e., the vector components obtained by transforming vector ϕ satisfy the following condition due to the orthogonality in Eq. (27)

Eq. (31)

$$\xi_i = (\psi_i, \phi)$$

If Eq. (31) is written explicitly,

Eq. (31′)

$$\xi_i = \sum_k S_{ki}^*\, u_k$$
where $S_{ki}^*$ is the $i$-th row, $k$-th column element of the inverse transformation matrix, which is obtained by transposing the matrix $S$ of Eq. (29) and taking the complex conjugate of each component. Here, we denote the transpose of a matrix by a prime ′ and write the transformation matrix in Eq. (31) as $S^{*\prime}$. Therefore,

Eq. (32)

$$\psi = S^{*\prime}\phi$$

The square of vector ϕ is

Eq. (33)

$$(\phi, \phi) = \sum_k |u_k|^2$$

Since the components $u_i$ are expressed as in Eq. (29),

$$(\phi, \phi) = \left(\sum_k \xi_k\psi_k,\ \sum_r \xi_r\psi_r\right)$$

Because of the orthogonality, we obtain $(\phi, \phi) = \sum_k |\xi_k|^2$. Thus, the transformation used here leaves the square of the absolute value (norm) unchanged. This is proved by Eq. (27) in conjunction with Eq. (29). The invariance of the norm together with Eq. (31′) proves

Eq. (34)

$$\sum_k S_{ik}^*\, S_{jk} = \delta_{ij}$$

Equations (27) and (34) can be simplified with an identity matrix E as

Eq. (35)

$$S^{*\prime} S = S S^{*\prime} = E$$

In general, transformation that satisfies Eq. (35) is called a unitary transformation and matrix S is called a unitary matrix.

With the unitary transformation by the matrix $S$, we can examine how the image in Eq. (7) or (21) is transformed. As in Eq. (28), the vector $\phi$ is written as a linear sum of the transformed basis vectors $\psi_i$. Then,

$$I = (\phi, A\phi) = \left(\sum_i \xi_i\psi_i,\ A\sum_j \xi_j\psi_j\right)$$

If we use the property of eigenvalue and eigenvector in Eq. (25) and the orthogonality in Eq. (27), we obtain

Eq. (36)

$$I = \sum_i \lambda_i\, |\xi_i|^2$$

Since Eqs. (31) or (31′) shows the explicit form of ξi,

Eq. (37)

$$I = \sum_i \lambda_i\, |(\psi_i, \phi)|^2 = \sum_i \lambda_i \left|\sum_k S_{ki}^*\, u_k\right|^2$$

If this result is written in the matrix form,

$$I = (\phi, A\phi) = (S\psi, AS\psi) = (\psi, S^{*\prime}AS\psi)$$

From Eq. (37), $S^{*\prime}AS$ is a diagonal matrix, in which the off-diagonal elements are zero. Letting the diagonal matrix be $D$,

Eq. (38)

$$S^{*\prime}AS = D$$

The diagonal elements of the diagonal matrix are the eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$.

Next, because of Eqs. (35) and (38)

Eq. (39)

$$A = SDS^{*\prime}$$
or if Eq. (39) is written using the matrix elements

Eq. (40)

$$a_{nm} = \sum_i \lambda_i\, S_{ni}\, S_{mi}^*$$

Among the results shown above, Eqs. (37) and (40) are useful in revealing the physical properties of the intensity matrix. Thus, Eqs. (37) and (40) will be examined in more detail.
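For a 2 × 2 Hermitian matrix the diagonalization can be written in closed form, which gives a compact way to try out Eqs. (37) and (40). The sketch below is an illustration under stated assumptions: the matrix entries and the real test values `u_vals` are hypothetical, the helper `eig_hermitian_2x2` is not from the paper, and it requires a non-zero off-diagonal element.

```python
import math

def eig_hermitian_2x2(a, b, d):
    """Eigenvalues and unitary S for [[a, b], [conj(b), d]]; a, d real, b != 0."""
    mean = 0.5 * (a + d)
    radius = math.sqrt((0.5 * (a - d)) ** 2 + abs(b) ** 2)
    eigenvalues = [mean + radius, mean - radius]
    columns = []
    for lam in eigenvalues:
        v = [b, lam - a]                    # solves (A - lam) v = 0
        norm = math.sqrt(abs(v[0]) ** 2 + abs(v[1]) ** 2)
        columns.append([v[0] / norm, v[1] / norm])
    # S[i][k] is the i-th component of the k-th eigenvector, as in Eq. (29).
    s = [[columns[k][i] for k in range(2)] for i in range(2)]
    return eigenvalues, s

A = [[2.0, 0.5 - 0.3j], [0.5 + 0.3j, 1.0]]   # hypothetical intensity matrix
lams, S = eig_hermitian_2x2(2.0, 0.5 - 0.3j, 1.0)

# Eq. (40): a_nm = sum_i lambda_i S_ni conj(S_mi)
rebuilt = [[sum(lams[i] * S[n][i] * S[m][i].conjugate() for i in range(2))
            for m in range(2)] for n in range(2)]

u_vals = [0.8, -0.3]                          # real values of u_1(y), u_2(y) at some y
quadratic = sum(A[n][m] * u_vals[n] * u_vals[m]
                for n in range(2) for m in range(2)).real
# Eq. (37): I = sum_i lambda_i |sum_k conj(S_ki) u_k|^2
diagonal_form = sum(lams[i] * abs(sum(S[k][i].conjugate() * u_vals[k]
                                      for k in range(2))) ** 2 for i in range(2))

print(lams, quadratic, diagonal_form)
```

Because the $u_k$ are real, the quadratic form of Eq. (7) and the diagonal form of Eq. (37) agree term for term.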

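  • iii. The image intensity is given by the eigenvalues $\lambda_i$ and eigenfunctions $\psi_i = (S_{1i}, S_{2i}, \ldots, S_{ni})$ of the intensity matrix $A$ as

    Eq. (37)

    $$I(y) = \sum_i \lambda_i \left|\sum_k S_{ki}^*\, u_k(y)\right|^2, \qquad u_k(y) = \sin\left[\frac{2\pi\alpha}{\lambda}\left(y - \frac{k\lambda}{2\alpha}\right)\right] \Big/ \left[\frac{2\pi\alpha}{\lambda}\left(y - \frac{k\lambda}{2\alpha}\right)\right]$$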

When the light source is coherent, only one eigenvalue is non-zero. Therefore

Eq. (37′)

$$I(y) = \lambda_0 \left|\sum_k S_k^*\, u_k(y)\right|^2$$
where $\sqrt{\lambda_0}\, S_k$ corresponds to $F(k\lambda/2\alpha)$ in Eq. (5). When the light source is partially coherent, there is more than one non-zero eigenvalue in Eq. (37), and the eigenfunction associated with the greatest eigenvalue contributes most significantly to the image formation. When the light source is incoherent, each eigenfunction contributes almost equally to image formation. From Sec. 4.2, in special cases where the object brightness is uniform or the object transmittance changes sinusoidally across the object plane, we can obtain Eq. (37) without the unitary transformation. When the object brightness is uniform, all eigenvalues have the same value.

As a result, analyzing how the eigenvalue distributes reveals the degree of coherence of the light source, which is involved in image formation. When eigenvalues are λi, the sum of all eigenvalues I0 is equal to the trace of the intensity matrix and also equal to the integrated image intensity on the image plane. Let us consider the following quantity

Eq. (41)

$$d = -\sum_i (\lambda_i/I_0)\, \log(\lambda_i/I_0)$$

If the light source is coherent, we always obtain d=0 and if the light source is incoherent with uniform intensity, d takes the maximum value d0

$$d_0 = \log N,$$
where $N$ is the number of degrees of freedom of the image. For one-dimensional images with a length of $L$, $N = 2\alpha L/\lambda$, and for two-dimensional images with an area of $S$, $N = 4\alpha^2 S/\lambda^2$. If the image area becomes infinitely large, $N$ will also be infinitely large. For this case, we may consider $d/d_0$ and take the limit with an infinitely large $N$. Therefore we define another quantity $\delta$ as
$$\delta = (d_0 - d)/d_0$$

The value of δ changes continuously from 1 to 0 as the light source gradually changes its coherent state from coherent to incoherent. The quantity δ is an interesting quantity because we are able to measure coherence with δ. Therefore, we define the degree of coherence using δ. (The author noted in Ref. 9 that δ can be used for an object with uniform transmittance or brightness. However, according to Eq. (18′), δ from a light source with S>1 is the same as δ from incoherent source. We may need to investigate this point more carefully.)
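The quantities $d$, $d_0$, and $\delta$ are straightforward to evaluate once the eigenvalues are known. The sketch below is a minimal illustration with hypothetical eigenvalue lists; it assumes the usual convention that a zero eigenvalue contributes nothing to the sum (the limit of $x \log x$ as $x \to 0$), and it needs at least two eigenvalues so that $d_0 = \log N$ is non-zero.

```python
import math

def degree_of_coherence(eigenvalues):
    """delta = (d0 - d) / d0 with d from Eq. (41) and d0 = log N."""
    total = sum(eigenvalues)                 # I0, the trace of the intensity matrix
    d = -sum((lam / total) * math.log(lam / total)
             for lam in eigenvalues if lam > 0.0)   # zero eigenvalues contribute 0
    d0 = math.log(len(eigenvalues))
    return (d0 - d) / d0

coherent = degree_of_coherence([5.0, 0.0, 0.0, 0.0])      # single non-zero eigenvalue
incoherent = degree_of_coherence([1.0, 1.0, 1.0, 1.0])    # uniform, incoherent
partial = degree_of_coherence([3.0, 1.0, 0.0, 0.0])       # hypothetical intermediate case

print(coherent, incoherent, partial)
```

The coherent list gives $\delta = 1$, the uniform list gives $\delta = 0$ up to rounding, and the intermediate list falls in between.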

According to von Neumann (Ref. 6), $\sum \lambda_n \log \lambda_n$ is given in terms of the matrix $A$ as

Eq. (43)

$$\sum_n \lambda_n \log \lambda_n = \mathrm{Trace}(A \log A),$$
where the function of the matrix, log A, is a matrix given by substituting matrix A into the series expansion of the logarithm function. Equation (43) is considered to provide us a way to evaluate δ without the unitary transformation of A.

Next, let us summarize the result of Eq. (40).

  • iv. A positive-definite Hermitian matrix of order $N$, in our case the intensity matrix with elements $a_{nm}$, is given by $N$ positive real numbers $\lambda_i$ together with $N$ orthonormal vectors $\psi_i = (S_{1i}, S_{2i}, \ldots, S_{ni})$ as

    Eq. (40)

    $$a_{nm} = \sum_i \lambda_i\, S_{ni}\, S_{mi}^*$$

The number of variables that define the intensity matrix is determined by the $N$ eigenvalues and $N$ eigenvectors. The vector component $S_{ij}$ is in general complex, so that it can be represented by $S_{ij} = r_{ij}\exp(i\theta_{ij})$. The number of variables that determine the vector components is $2N^2$. As a result, the number of variables that determine the intensity matrix of order $N$ is $(2N + 1)N$. Here, Eq. (27) defines the orthonormal condition, which is a set of

$$N + \tfrac{1}{2}N(N - 1) = \tfrac{1}{2}N(N + 1)$$
equations. Hence,

  • v. The number of the independent variables, R, to define an intensity matrix with order N is

    Eq. (44)

    R=N(2N+1)12N(N+1)=12(3N+1)

Note that the simplest case for the intensity matrix is obtained when the light source is coherent, because all eigenvalues except a single eigenvalue $\lambda_0$ may then be set to 0. In this case only one vector determines the matrix elements, so the number of independent variables is

Eq. (45)

$$R(\text{coherent}) = 2N$$
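The counting in Eqs. (44) and (45) can be written out as a short sketch. The function names are illustrative, and the number of orthonormality conditions is taken exactly as counted in the text.

```python
def independent_variables(n):
    """R = N(2N+1) - N(N+1)/2 = N(3N+1)/2, following Eq. (44)."""
    total = n * (2 * n + 1)           # N eigenvalues plus 2N^2 real vector components
    constraints = n * (n + 1) // 2    # orthonormality conditions as counted in the text
    return total - constraints

def independent_variables_coherent(n):
    """R(coherent) = 2N, following Eq. (45)."""
    return 2 * n
```

For N = 4, for instance, this gives R = 26 in the general case and R = 8 for a coherent source.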

The number of independent variables is a fundamental quantity of the intensity matrix that depends on the coherence of the light source. However, as discussed next, it is not always meaningful for the image intensity. Rather, the image intensity is determined by 2N sampled values, where N is the number of sampling points for a coherent source. If the 2N sampled values could be chosen arbitrarily, the situation would be simpler; however, they must be samples of the image intensity of Eq. (7), which is obtained from the intensity matrix, and Eq. (7) introduces correlations among the 2N sampled values. The properties of the intensity matrix explained here are therefore needed to derive the amount of information in the image intensity, which will be explained elsewhere.

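  • vi. Regardless of the coherence of the light source, the image intensity I(y) in Eq. (7) is determined by its values sampled with a step of λ/4α, and the following relationship between the sampled values and the intensity matrix holds:

    Eq. (46)

    $$I(y) = \sum_k I\!\left(\frac{k\lambda}{4\alpha}\right) \frac{\sin\frac{4\pi\alpha}{\lambda}\left(y - \frac{k\lambda}{4\alpha}\right)}{\frac{4\pi\alpha}{\lambda}\left(y - \frac{k\lambda}{4\alpha}\right)}$$
    where
    $$I\!\left(\frac{k\lambda}{4\alpha}\right) = \sum_n \sum_m a_{nm}\, u_n\!\left(\frac{k\lambda}{4\alpha}\right) u_m\!\left(\frac{k\lambda}{4\alpha}\right)$$
    $$k = 2p:\quad u_n(2p\lambda/4\alpha) = 0 \ (p \neq n), \qquad u_n(2p\lambda/4\alpha) = 1 \ (p = n)$$
    $$k = 2p+1:\quad u_n((2p+1)\lambda/4\alpha) = \frac{2}{\pi}\,\frac{(-1)^{p-n}}{2(p-n)+1}$$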
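The sampling relation of Eq. (46) can be checked numerically with a sketch like the following. It uses the dimensionless coordinate t = 2αy/λ, so that $u_n$ becomes `np.sinc(t - n)` and the sampling step λ/4α becomes 1/2. The matrix index offset `n0` and the truncation `kmax` of the infinite sum are assumptions made for illustration.

```python
import numpy as np

def image_intensity(a, t, n0=0):
    """I = sum_nm a_nm u_n u_m with u_n(t) = sinc(t - n), t = 2*alpha*y/lambda."""
    a = np.asarray(a, dtype=complex)
    t = np.atleast_1d(np.asarray(t, dtype=float))
    n = n0 + np.arange(a.shape[0])
    u = np.sinc(t[:, None] - n[None, :])      # np.sinc(x) = sin(pi x)/(pi x)
    return np.einsum("tn,nm,tm->t", u, a, u).real

def reconstruct_from_samples(a, t, kmax=4000):
    """Eq. (46) in units of t: I(t) = sum_k I(k/2) sinc(2t - k), with |k| <= kmax."""
    t = np.atleast_1d(np.asarray(t, dtype=float))
    k = np.arange(-kmax, kmax + 1)
    samples = image_intensity(a, k / 2.0)     # intensities at step lambda/(4 alpha)
    return np.sinc(2.0 * t[:, None] - k[None, :]) @ samples
```

For a small Hermitian matrix, `reconstruct_from_samples` agrees with `image_intensity` at arbitrary t up to the truncation error of the finite sum.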

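The sampling step follows from the Fourier transform of the image I(y) in Eq. (7): the Fourier transform of $u_n(y)u_m(y)$ has twice the bandwidth of the Fourier transform of $u_n(y)$, and outside this band the Fourier transform of I(y) is 0. Referring to the appendix, the transform is given explicitly by

Eq. (47)

$$I_{nm}(\omega) = \int_{-\infty}^{+\infty} \frac{\sin(\xi - n\pi)}{(\xi - n\pi)}\,\frac{\sin(\xi - m\pi)}{(\xi - m\pi)}\, e^{-i\omega\xi}\, d\xi$$

If ω>2 or ω<−2,

$$I_{nm}(\omega) \equiv 0$$

If 0<ω<2,

$$I_{nm}(\omega) = \frac{1}{2i}\,\frac{e^{-i\pi[n\omega - (n-m)]} - e^{-i\pi[m\omega + (n-m)]}}{(n-m)}$$

If 0>ω>−2,

$$I_{nm}(\omega) = \frac{1}{2i}\,\frac{e^{-i\pi[m\omega - (n-m)]} - e^{-i\pi[n\omega + (n-m)]}}{(n-m)}$$

When n=m, we obtain the following.

If |ω|>2,

$$I_{nn}(\omega) \equiv 0$$

If 0<ω<2,

$$I_{nn}(\omega) = \pi\left(1 - \frac{\omega}{2}\right) e^{-i\pi\omega n}$$

If 0>ω>−2,

$$I_{nn}(\omega) = \pi\left(1 + \frac{\omega}{2}\right) e^{-i\pi\omega n}$$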

Since the bandwidth of I(ω) is limited to 4α/λ, the sampling theorem in Sec. 2 leads to the series expansion of Eq. (46).
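The band limit |ω| < 2 and the closed forms of Eq. (47) can be compared against direct quadrature, as in the sketch below. It assumes the transform kernel $e^{-i\omega\xi}$; the integration half-width and step are arbitrary illustrative choices that control the truncation error.

```python
import numpy as np

def spectrum_closed_form(n, m, w):
    """I_nm(w) of Eq. (47), assuming the kernel exp(-i w xi); zero for |w| >= 2."""
    if abs(w) >= 2:
        return 0.0 + 0.0j
    if n == m:
        return np.pi * (1 - abs(w) / 2) * np.exp(-1j * np.pi * w * n)
    d = n - m
    if w > 0:
        num = np.exp(-1j * np.pi * (n * w - d)) - np.exp(-1j * np.pi * (m * w + d))
    else:
        num = np.exp(-1j * np.pi * (m * w - d)) - np.exp(-1j * np.pi * (n * w + d))
    return num / (2j * d)

def spectrum_numeric(n, m, w, half_width=4000.0, step=0.01):
    """Trapezoidal estimate of the same integral over [-half_width, half_width]."""
    xi = np.arange(-half_width, half_width + step, step)
    # sin(xi - n*pi)/(xi - n*pi) written with np.sinc(x) = sin(pi x)/(pi x)
    f = np.sinc(xi / np.pi - n) * np.sinc(xi / np.pi - m) * np.exp(-1j * w * xi)
    return step * (f.sum() - 0.5 * (f[0] + f[-1]))
```

The two functions agree to within the truncation error for frequencies inside and outside the band.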

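In Eq. (46), writing the k-th sampled value as $B_k$,

Eq. (48)

$$B_{2p} = a_{pp}, \qquad B_{2p+1} = \sum_k \sum_r \frac{4}{\pi^2}\, a_{kr}\, \frac{(-1)^{k+r}}{\{2(p-k)+1\}\{2(p-r)+1\}}$$

Also, the integrated intensity $I_0$ is rewritten using Eq. (23) as

Eq. (49)

$$\frac{2\alpha}{\lambda} I_0 = \sum_p B_{2p} = \sum_p B_{2p+1} = \sum_k a_{kk}$$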
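A small sketch of Eqs. (48) and (49): it evaluates the odd-indexed samples $B_{2p+1}$ from the matrix elements and lets one check that their sum approaches the trace. The matrix indices are assumed to run over 0, ..., N−1, and the infinite sum over p has to be truncated in practice.

```python
import numpy as np

def odd_samples(a, p):
    """B_{2p+1} of Eq. (48) for integer p (scalar or array); a_kr indexed k, r = 0..N-1."""
    a = np.asarray(a, dtype=complex)
    p = np.atleast_1d(np.asarray(p, dtype=int))
    k = np.arange(a.shape[0])
    # weight (2/pi) (-1)^k / (2(p-k)+1); the product of two weights gives 4/pi^2 (-1)^(k+r) / ...
    w = (2.0 / np.pi) * (-1.0) ** k[None, :] / (2 * (p[:, None] - k[None, :]) + 1)
    return np.einsum("pk,kr,pr->p", w, a, w).real
```

Summing `odd_samples(a, p)` over a symmetric range |p| ≤ P gives the trace of `a` up to an error that decreases roughly like 1/P.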

6.

Closing Remarks

The results presented in this paper are an expanded version of the presentation given at a symposium on “Application of the information theory to optics” organized by The Japan Society of Applied Physics on April 6, 1956. This paper explains the physical meaning of the intensity matrix introduced here. This work is just the beginning; further work is needed on, for example, the application of the intensity matrix to calculating the amount of information, the extension of the intensity matrix to two-dimensional imaging, the evaluation of the change of the intensity matrix caused by a phase difference, and the examination of the intensity matrix in the presence of aberration. The author hopes to present these topics on another occasion.

In this paper, the discussion is limited to finite dimensions; for completeness, however, the infinite-dimensional case should also be treated. This requires the concept of Hilbert space, which the author was not able to address in this short paper.

The author thanks Professor Hidetoshi Takahashi of the Department of Physics of the University of Tokyo and Associate Professor Kazuo Miyake of Tokyo University of Education for helpful discussions and Professor Hiroshi Kubota of the Institute of Industrial Science of the University of Tokyo, who encouraged the author to apply the information theory to optics and provided the related references.

7.

Addendum

Since the submission of the manuscript, the author has noticed that D. Gabor of the United Kingdom independently proposed how to generally express the image by a Hermitian matrix. [Information Theory, Third London Symposium, edited by Colin Cherry (1956, Butterworths Scientific Publications) 4. Optical Transmission by D. Gabor, pp. 26-33.]

Although his concrete derivation differs from the one introduced in this paper, the author is pleased to find that he reached fundamentally the same conclusion. Those interested may read his paper together with this one. The author mentioned in this paper that the intensity matrix for a generalized case would be a very difficult problem. Since then, however, the author has been able to conclude that the intensity matrix can be generalized by a matrix transformation of the object and can be defined in the presence of aberration in the imaging optics, and reported these findings in a physics seminar at the University of Tokyo. The details will be published elsewhere. These points are not stated explicitly by Gabor; however, our concepts are in agreement.

8.

Appendix 1

Proof of Eq. (8).

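Let the complex transmittance of the object be E(x) and the complex amplitude of the incident light be A(x−z). The Fourier transform of the light amplitude at the object plane, E(x)A(x−z), is given by Eq. (1),

Eq. (1)

$$f(X, z) = \int E(x) A(x - z)\, e^{-2\pi i X x}\, dx$$

On the other hand, the light amplitude in the image plane, F(y, z), is given by Eq. (4)

Eq. (4)

$$F(y, z) = \int_{-\alpha/\lambda}^{+\alpha/\lambda} f(X, z)\, e^{+2\pi i X y}\, dX$$

Substituting Eq. (1) into Eq. (4) and changing the order of integration gives

Eq. (3)

$$F(y, z) = \int_{-\infty}^{+\infty} E(x) A(x - z)\, \frac{\sin\frac{2\pi\alpha}{\lambda}(y - x)}{\pi (y - x)}\, dx$$
Letting $u(y - x) = \sin\frac{2\pi\alpha}{\lambda}(y - x) \,/\, \pi(y - x)$, we get
$$F(y, z) = \int_{-\infty}^{+\infty} E(x) A(x - z)\, u(y - x)\, dx$$

We obtain $F(n\lambda/2\alpha, z)$ and $F(m\lambda/2\alpha, z)$ by substituting $y = n\lambda/2\alpha$ and $y = m\lambda/2\alpha$ into the above equation, and insert them into Eq. (7′) to obtain Eq. (8).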
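The step from Eq. (4) to the kernel u(y−x) rests on the integral of $e^{2\pi i X (y-x)}$ over the aperture band |X| ≤ α/λ. The sketch below compares that integral, evaluated by the trapezoidal rule, with the closed form; the values of α and λ used in any check are arbitrary, and the function names are illustrative.

```python
import numpy as np

def kernel_closed_form(d, alpha, lam):
    """u(d) = sin(2*pi*alpha*d/lam) / (pi*d), including the limit 2*alpha/lam at d = 0."""
    return (2 * alpha / lam) * np.sinc(2 * alpha * d / lam)

def kernel_by_quadrature(d, alpha, lam, points=20001):
    """Trapezoidal estimate of the integral of exp(2*pi*i*X*d) over |X| <= alpha/lam."""
    X = np.linspace(-alpha / lam, alpha / lam, points)
    f = np.exp(2j * np.pi * X * d)
    h = X[1] - X[0]
    return h * (f.sum() - 0.5 * (f[0] + f[-1]))
```

The imaginary part of the quadrature result vanishes by symmetry, and the real part matches `kernel_closed_form`.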

9.

Appendix 2

9.1.

Incoherent Illumination

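Substituting the intensity matrix of Eq. (15) into Eq. (7) gives

$$I(y) = \frac{2\alpha}{\lambda} A \sum_{n=-\infty}^{+\infty} \left\{ \frac{\sin\frac{2\pi\alpha}{\lambda}\left(y - \frac{n\lambda}{2\alpha}\right)}{\frac{2\pi\alpha}{\lambda}\left(y - \frac{n\lambda}{2\alpha}\right)} \right\}^2$$
and rewriting Eq. (14′) yields
$$I(y) = A \int_{-\infty}^{+\infty} \left\{ \frac{\sin\frac{2\pi\alpha}{\lambda}(y - x)}{\pi (y - x)} \right\}^2 dx = \frac{2\alpha}{\lambda} A\, \frac{1}{\pi} \int_{-\infty}^{+\infty} \frac{\sin^2(\xi - \eta)}{(\xi - \eta)^2}\, d\xi = \frac{2\alpha}{\lambda} A$$

For these two results to agree, we need $\sum_{n=-\infty}^{+\infty} \frac{\sin^2(\eta - n\pi)}{(\eta - n\pi)^2} = 1$. This relationship follows from the series expansion of $1/\sin^2\eta$. [See, for example, Magnus and Oberhettinger, Formeln und Sätze für die speziellen Funktionen der mathematischen Physik (Springer 1948) p. 215.] Similarly, if the object brightness changes periodically, substituting Eq. (16) into Eq. (7) yields

$$I(\xi) = \frac{A\alpha}{\lambda} \left[ 1 + (1 - p) \sum_{n=-\infty}^{+\infty} \cos 2np\pi\, \frac{\sin^2(\xi - n\pi)}{(\xi - n\pi)^2} + \frac{1}{\pi} \sum_{n \neq m} \frac{\sin (n-m)(1-p)\pi}{(n-m)} \cos (n+m)p\pi\, \frac{\sin(\xi - n\pi) \sin(\xi - m\pi)}{(\xi - n\pi)(\xi - m\pi)} \right]$$

On the other hand, if I(ξ) is derived from Eq. (14),

$$I(\xi) = \frac{A\alpha}{\lambda} \left[ 1 + (1 - p) \cos 2p\xi \right]$$

Equating these two expressions gives the required identity.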
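The identity $\sum_n \sin^2(\eta - n\pi)/(\eta - n\pi)^2 = 1$ used for the uniform incoherent object is easy to check numerically. The sketch below evaluates a symmetric partial sum; the cut-off `nmax` is an illustrative choice, and the neglected tail is of order 1/`nmax`.

```python
import numpy as np

def sinc_squared_sum(eta, nmax=200000):
    """Partial sum over |n| <= nmax of sin^2(eta - n*pi) / (eta - n*pi)^2."""
    n = np.arange(-nmax, nmax + 1)
    # np.sinc(x) = sin(pi x)/(pi x), so np.sinc(eta/pi - n) = sin(eta - n pi)/(eta - n pi)
    return float(np.sum(np.sinc(eta / np.pi - n) ** 2))
```

For any real η the partial sum approaches 1 as `nmax` grows.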

9.2.

Partially Coherent Illumination

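If the object transmittance is uniform, the image intensity is obtained by putting $a_{nm}$ from Eqs. (18′) and (18″) into Eq. (7)

$$S > 1:\quad I = \frac{K^2}{S} \left\{ \sum_n \frac{\sin^2(\eta - n\pi)}{(\eta - n\pi)^2} \right\} = \frac{K^2}{S}$$
$$S < 1:\quad I = K^2 \sum_n \sum_m \frac{\sin S(n-m)\pi}{S(n-m)\pi}\, \frac{\sin(\eta - n\pi)}{\eta - n\pi}\, \frac{\sin(\eta - m\pi)}{\eta - m\pi}$$

If the illumination is incoherent, we have Eq. (14′), and when the illumination is partially coherent, we have

$$I(y) = \iint \Gamma(x_1 - x_2)\, E(x_1) E^*(x_2)\, A(x_1) A^*(x_2)\, u(x_1 - y)\, u^*(x_2 - y)\, dx_1\, dx_2$$

If we set E(x)A(x)=K and substitute u(x−y) of Eq. (8′) and Γ(x1−x2) of Eq. (17) into the above equation, we obtain the following with the help of the integral in Eq. (18)

$$S > 1:\quad I = \frac{K^2}{S} \qquad\qquad S < 1:\quad I = K^2$$

Therefore, we obtain

$$\sum_n \sum_m \frac{\sin S(n-m)\pi}{S(n-m)\pi}\, \frac{\sin(\eta - n\pi)}{\eta - n\pi}\, \frac{\sin(\eta - m\pi)}{\eta - m\pi} = 1$$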

10.

Appendix 3

10.1.

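$$\int_{-\infty}^{+\infty} \frac{\sin\frac{2\pi\alpha}{\lambda}\left(y - \frac{n\lambda}{2\alpha}\right)}{\frac{2\pi\alpha}{\lambda}\left(y - \frac{n\lambda}{2\alpha}\right)}\, \frac{\sin\frac{2\pi\alpha}{\lambda}\left(y - \frac{m\lambda}{2\alpha}\right)}{\frac{2\pi\alpha}{\lambda}\left(y - \frac{m\lambda}{2\alpha}\right)}\, dy = \frac{\lambda}{2\alpha}\, \delta_{nm}$$

If we introduce the variable $\xi = \frac{2\pi\alpha}{\lambda} y$,

$$\frac{\lambda}{2\pi\alpha} \int_{-\infty}^{+\infty} \frac{\sin(\xi - n\pi)}{\xi - n\pi}\, \frac{\sin(\xi - m\pi)}{\xi - m\pi}\, d\xi = \frac{\lambda}{2\pi\alpha} \int_{-\infty}^{+\infty} \frac{1}{(2i)^2}\, \frac{\left[e^{i(\xi - n\pi)} - e^{-i(\xi - n\pi)}\right] \left[e^{i(\xi - m\pi)} - e^{-i(\xi - m\pi)}\right]}{(\xi - n\pi)(\xi - m\pi)}\, d\xi$$

The integral can be decomposed into four terms, each of which is evaluated by

$$\int_{-\infty}^{+\infty} \frac{e^{iax}}{(x - \lambda_1)(x - \lambda_2)}\, dx = \begin{cases} \pi i\, \dfrac{e^{ia\lambda_1} - e^{ia\lambda_2}}{\lambda_1 - \lambda_2} & (a > 0) \\[2ex] -\pi i\, \dfrac{e^{ia\lambda_1} - e^{ia\lambda_2}}{\lambda_1 - \lambda_2} & (a < 0) \end{cases}$$

To arrive at the above result, we have to choose an integration contour on which the contribution of exp(iax) vanishes along the arc of infinite radius. In addition, the integrand has poles on the real axis, so we have to take the Cauchy principal value. Note that if the residue at a pole on the real axis is $R_0$, the value of the contour integral is $2\pi i\left(\sum R + \tfrac{1}{2}\sum R_0\right)$, where R is a residue inside the integration contour. [Whittaker, Modern Analysis p.117 (1935).] With these procedures, we obtain the integral shown at the beginning.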

10.2.

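In Eq. (16), we need to calculate the following integral:

$$I_{nm} = \int_{-\infty}^{+\infty} \cos 2p\xi\, \frac{\sin(\xi - n\pi)}{(\xi - n\pi)}\, \frac{\sin(\xi - m\pi)}{(\xi - m\pi)}\, d\xi$$

This integral can be carried out in the same way as the one shown above, i.e., by evaluating complex integrals with an exponential function. As a result,

$$p > 1:\quad I_{nm} = 0$$
$$0 \le p \le 1:\quad \begin{cases} n \neq m: & I_{nm} = \dfrac{\sin \pi (n-m)(1-p)}{(n-m)}\, \cos \pi (n+m) p \\[2ex] n = m: & I_{nn} = \pi (1 - p) \cos 2\pi n p \end{cases}$$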
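As a cross-check of the result in 10.2, the sketch below compares the closed form with a direct trapezoidal evaluation of the integral. The integration half-width and step are illustrative choices, and the comparison is only approximate because the infinite range is truncated.

```python
import numpy as np

def cos_weighted_overlap(n, m, p):
    """Closed form of I_nm from 10.2 for p >= 0."""
    if p > 1:
        return 0.0
    if n == m:
        return np.pi * (1 - p) * np.cos(2 * np.pi * n * p)
    d = n - m
    return np.sin(np.pi * d * (1 - p)) / d * np.cos(np.pi * (n + m) * p)

def cos_weighted_overlap_numeric(n, m, p, half_width=4000.0, step=0.01):
    """Trapezoidal estimate of the integral over [-half_width, half_width]."""
    xi = np.arange(-half_width, half_width + step, step)
    f = np.cos(2 * p * xi) * np.sinc(xi / np.pi - n) * np.sinc(xi / np.pi - m)
    return step * (f.sum() - 0.5 * (f[0] + f[-1]))
```

Both functions agree for diagonal and off-diagonal index pairs, including the case p > 1 where the integral vanishes.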

10.3.

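In Eq. (18), the following integral is evaluated as

$$I = \int_{-\infty}^{+\infty} \frac{\sin S(\xi - \eta)}{S(\xi - \eta)}\, \frac{\sin(\xi - n\pi)}{(\xi - n\pi)}\, d\xi$$
$$S > 1:\quad I = \frac{\pi}{S}\, \frac{\sin(\eta - n\pi)}{(\eta - n\pi)} \qquad\qquad S < 1:\quad I = \frac{\pi}{S}\, \frac{\sin S(\eta - n\pi)}{(\eta - n\pi)}$$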

10.4.

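Equation (18) contains the following definite integral,

$$I = \int_{-\infty}^{+\infty} \cos p\xi\, \frac{\sin S(\xi - \eta)}{S(\xi - \eta)}\, \frac{\sin(\xi - n\pi)}{(\xi - n\pi)}\, d\xi$$

  • i. (S>1 or S<1) and p>(1+S)

    $$I \equiv 0$$

  • ii. [S>1 and (1+S)>p>(S−1)] or [S<1 and (1+S)>p>(1−S)]

    $$I = \frac{\pi}{2S}\, \frac{\sin[S(\eta - n\pi) + pn\pi] + \sin(\eta - n\pi - p\eta)}{\eta - n\pi}$$

  • iii. S>1 and (S−1)>p>0

    $$I = \frac{\pi}{S}\, \frac{\sin(\eta - n\pi)}{\eta - n\pi}\, \cos p\eta$$

  • iv. S<1 and (1−S)>p>0

    $$I = \frac{\pi}{S}\, \frac{\sin S(\eta - n\pi)}{\eta - n\pi}\, \cos pn\pi$$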
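The four cases (i) to (iv) above can be gathered into one function and compared with direct quadrature, as in the following sketch. It assumes S ≠ 1, p > 0, and η ≠ nπ, and it excludes the boundary values of p between cases; the integration half-width and step are illustrative.

```python
import numpy as np

def lowpass_integral(eta, n, S, p):
    """Closed forms (i)-(iv) of 10.4; assumes S != 1, p > 0, eta != n*pi."""
    x = eta - n * np.pi
    if p > 1 + S:                                   # case i
        return 0.0
    if p > abs(S - 1):                              # case ii
        return np.pi / (2 * S) * (np.sin(S * x + p * n * np.pi) + np.sin(x - p * eta)) / x
    if S > 1:                                       # case iii
        return np.pi / S * np.sin(x) / x * np.cos(p * eta)
    return np.pi / S * np.sin(S * x) / x * np.cos(p * n * np.pi)   # case iv

def lowpass_integral_numeric(eta, n, S, p, half_width=4000.0, step=0.01):
    """Trapezoidal estimate of the integral over [-half_width, half_width]."""
    xi = np.arange(-half_width, half_width + step, step)
    f = np.cos(p * xi) * np.sinc(S * (xi - eta) / np.pi) * np.sinc(xi / np.pi - n)
    return step * (f.sum() - 0.5 * (f[0] + f[-1]))
```

Each case can be exercised by choosing S and p in the corresponding range, for example S = 2, p = 0.5 for (iii) and S = 0.5, p = 1 for (ii).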

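The above results reduce the integration of Eq. (19) to a single integral, giving the following:

  • i. (S>1 or S<1) and p>(1+S)

    $$a_{nm} \equiv 0$$

  • ii. [S>1 and (1+S)>p>(S−1)] or [S<1 and (1+S)>p>(1−S)]

    $$a_{nm} = \frac{K^2}{2\pi S} \int_{-\infty}^{+\infty} \cos p\eta\, \frac{\sin[S(\eta - n\pi) + pn\pi] + \sin(\eta - n\pi - p\eta)}{\eta - n\pi}\, \frac{\sin(\eta - m\pi)}{\eta - m\pi}\, d\eta$$

  • iii. S>1 and (S−1)>p>0

    $$a_{nm} = \frac{K^2}{\pi S} \int_{-\infty}^{+\infty} \cos^2 p\eta\, \frac{\sin(\eta - n\pi)}{\eta - n\pi}\, \frac{\sin(\eta - m\pi)}{\eta - m\pi}\, d\eta$$

  • iv. S<1 and (1−S)>p>0

    $$a_{nm} = \frac{K^2}{\pi S} \cos pn\pi \int_{-\infty}^{+\infty} \cos p\eta\, \frac{\sin S(\eta - n\pi)}{\eta - n\pi}\, \frac{\sin(\eta - m\pi)}{\eta - m\pi}\, d\eta$$

Among the above four results, (iii) and (iv) can be evaluated using the results of 10.2 and 10.4. For (ii),

$$\frac{2\pi S}{K^2}\, a_{nm} = \sin pn\pi \int_{-\infty}^{+\infty} \cos p\eta\, \frac{\cos S(\eta - n\pi)}{\eta - n\pi}\, \frac{\sin(\eta - m\pi)}{\eta - m\pi}\, d\eta \quad \text{(a)}$$
$$\qquad + \cos pn\pi \int_{-\infty}^{+\infty} \cos p\eta\, \frac{\sin S(\eta - n\pi)}{\eta - n\pi}\, \frac{\sin(\eta - m\pi)}{\eta - m\pi}\, d\eta \quad \text{(b)}$$
$$\qquad + \int_{-\infty}^{+\infty} \cos^2 p\eta\, \frac{\sin(\eta - n\pi)}{\eta - n\pi}\, \frac{\sin(\eta - m\pi)}{\eta - m\pi}\, d\eta \quad \text{(c)}$$
$$\qquad - \int_{-\infty}^{+\infty} \sin p\eta \cos p\eta\, \frac{\cos(\eta - n\pi)}{\eta - n\pi}\, \frac{\sin(\eta - m\pi)}{\eta - m\pi}\, d\eta \quad \text{(d)}$$

For (b) and (c), we can use the formulas already shown. However, (a) and (d) require some further calculation.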

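In the end, the four integrals are

$$\int_{-\infty}^{+\infty} \cos p\eta\, \frac{\sin S(\eta - n\pi)}{\eta - n\pi}\, \frac{\sin(\eta - m\pi)}{\eta - m\pi}\, d\eta = \begin{cases} \dfrac{\sin (n-m)(S+1-p)\frac{\pi}{2}}{n-m}\, \cos\left[(n+m)p + (S-1)(n-m)\right]\dfrac{\pi}{2} & n \neq m \\[2ex] \dfrac{\pi}{2}(S+1-p) \cos n\pi p & n = m \end{cases}$$
$$\int_{-\infty}^{+\infty} \cos p\eta\, \frac{\cos S(\eta - n\pi)}{\eta - n\pi}\, \frac{\sin(\eta - m\pi)}{\eta - m\pi}\, d\eta = \begin{cases} -\dfrac{\sin (n-m)(p+1-S)\frac{\pi}{2}}{n-m}\, \sin\left[(n+m)p + (S+1)(n-m)\right]\dfrac{\pi}{2} & n \neq m \\[2ex] -\dfrac{\pi}{2}(p+1-S) \sin n\pi p & n = m \end{cases}$$
$$\int_{-\infty}^{+\infty} \sin 2p\eta\, \frac{\cos(\eta - n\pi)}{\eta - n\pi}\, \frac{\sin(\eta - m\pi)}{\eta - m\pi}\, d\eta$$

(1+S)>p>1

$$\begin{cases} = 0 & n \neq m \\ = \pi \cos 2np\pi & n = m \end{cases}$$

1>p>(S−1) or 1>p>(1−S)

$$\begin{cases} = \dfrac{\sin (n-m)p\pi}{n-m}\, \cos\left[(n+m)p + (n-m)\right]\pi & n \neq m \\[2ex] = \pi p \cos 2np\pi & n = m \end{cases}$$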
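$$\int_{-\infty}^{+\infty} \cos 2p\eta\, \frac{\sin(\eta - n\pi)}{\eta - n\pi}\, \frac{\sin(\eta - m\pi)}{\eta - m\pi}\, d\eta$$

(1+S)>p>1

$$= 0 \quad \text{for both } n \neq m \text{ and } n = m \text{ (as in 10.2)}$$

1>p>(S−1) or 1>p>(1−S)

$$= \begin{cases} \dfrac{\sin (n-m)(1-p)\pi}{n-m}\, \cos (n+m)p\pi & n \neq m \\[2ex] \pi (1-p) \cos 2np\pi & n = m \end{cases}$$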

References

1. H. Kubota, Kagaku, 26 (6), 285 (1956); H. Kubota and H. Gamo, Oyobutsuri, 11 (1), 22 (1956).

2. H. H. Hopkins, Proc. R. Soc. A, 208, 263 (1951); H. H. Hopkins, Proc. R. Soc. A, 218, 408 (1953); E. Wolf, Proc. R. Soc. A, 225, 96 (1954); E. Wolf, Proc. R. Soc. A, 230, 246 (1955); A. Blanc-Lapierre and P. Dumontet, Revue d’Optique, 34, 1 (1955).

3. G. T. di Francia, J. Opt. Soc. Am., 45, 497–501 (1955); G. T. di Francia, Opt. Acta, 2, 5–8 (1955); A. Blanc-Lapierre, Ann. L’Inst. H. Poincare, 13 (4), 283 (1953); E. H. Linfoot and P. B. Fellgett, Philos. Trans. R. Soc. London A, 247, 369 (1955).

4. C. E. Shannon and W. Weaver, Mathematical Theory of Communication, 61, University of Illinois Press, Champaign, Illinois (1949); C. E. Shannon, Proc. IRE, 37, 10–21 (1949).

5. T. Yamanouchi, Algebra and Geometry, Kawade Shobo; M. Fujiwara, Matrix and Determinant, Iwanami Zensho.

6. J. von Neumann, Mathematische Grundlagen der Quantenmechanik, Springer (1932).

7. D. M. MacKay, J. London Edinburgh Dublin Philos. Mag. J. Sci., 41, 289–311 (1950). https://doi.org/10.1080/14786445008521798

8. N. Wiener, “Generalized harmonic analysis,” Acta Math., 55, 117–258 (1930).

9. H. Gamo, “Matrix to give the intensity distribution of optical image and the degree of coherence,” Kagaku, 26 (9), 470 (1956).
© 1956, The Japan Society of Applied Physics. Published by SPIE under a Creative Commons Attribution 4.0 Unported License. Distribution or reproduction of this work in whole or in part requires full attribution of the original publication, including its DOI.
Hideya Gamo "Mathematical analysis of intensity distribution of the optical image in various degrees of coherence of illumination (representation of intensity by Hermitian matrices)," Journal of Micro/Nanolithography, MEMS, and MOEMS 18(2), 021101 (18 June 2019). https://doi.org/10.1117/1.JMM.18.2.021101
Published: 18 June 2019
JOURNAL ARTICLE
12 PAGES

