10 November 2016 Simulated annealing approach to temperature–emissivity separation in thermal remote sensing
Author Affiliations +
J. of Applied Remote Sensing, 10(4), 040501 (2016). doi:10.1117/1.JRS.10.040501
The method of simulated annealing is adapted to the temperature–emissivity separation problem. A patch of surface at the bottom of the atmosphere is assumed to be a graybody emitter with spectral emissivity ε(k) describable by a mixture of spectral endmembers. We prove that a simulated annealing search conducted according to a suitable schedule converges to a solution maximizing the a-posteriori probability that spectral radiance detected at the top of the atmosphere originates from a patch with stipulated T and ε(k). Any such solution will be nonunique. The average of a large number of simulated annealing solutions, however, converges almost surely to a unique maximum a-posteriori (MAP) solution for T and ε(k).
Morgan: Simulated annealing approach to temperature–emissivity separation in thermal remote sensing



The temperature–emissivity separation (TES) problem bedevils any attempt to extract spectral information from remote sensing in the thermal infrared. Numerous methods have been proposed for handling TES. In this letter we investigate the application of simulated annealing to a MAP solution of the TES problem. The approach is an extension of earlier work on Bayesian TES.3,5 Simulated annealing cannot give a unique solution to this problem, but we shall show that the average of a large number of simulated annealing TES solutions converges almost surely to a unique TES estimate.



Simulated annealing has traditionally been regarded as a preferred method of global solution for combinatorial optimization problems, such as the traveling salesman. In this letter, we adapt the Metropolis algorithm1819.20 to an optimization problem that lacks a unique global optimal solution: TES. The underdetermined TES problem, notoriously, has a continuous infinity of solutions that yield the identical optimum value for any cost or payoff function one cares to choose.

A key part of any simulated annealing algorithm is the choice of an annealing schedule that causes the posterior probabilities to transition from nearly uniform to very tight in such a way as to evade the risk of the MAP search from converging to a local, rather than a global, optimum. Factors that enter into the choice of the annealing schedule are described in Refs. 19 and 20. In what follows, we shall stipulate that a suitable annealing schedule has been supplied and shall concern ourselves with the existence of a solution to the simulated annealing TES problem.


Simulated Annealing and the Temperature–Emissivity Separation Problem


Metropolis Algorithm Search for MAP Solution

Suppose that we have in our possession prior knowledge that a target patch, forming part of the lower boundary of the atmosphere, is composed of an intimate mixture of m+1 spectral endmembers {εi(k)} at temperature T. The label k may, depending upon context, refer to the wavenumber or to a finite number of wavenumber-averaged spectral bands.

For the development that follows, it is desirable to restate some of the main definitions used in spectral mixing theory using language borrowed from topology.21,22 Let m+1 distinct points y0,y1,ym in Rm be chosen so that the vectors y1y0,y2y0ymy0 are linearly independent. Then the set




λi0,  i,


is an m-simplex.

A spectral mixture amounts to a mapping into a geometric m-simplex, whose vertices have spectral endmembers at a stipulated temperature T for “coefficients.” A mixture with stipulated weights λi corresponds to the vector


The “interior” of Km is the subset of Km for which λi>0, i.e., the closure of its interior. The “polyhedron” of Km, denoted as |Km|, is the set comprised of the points of xKm considered as a subset of Rm and is a convex compact subset of Rm.

In the case m+1=3, a familiar example of a 2-simplex is the ternary diagram used to classify phreatic igneous rocks. The double three-component diagram used in the quartz, alkali feldspar, plagioclase, feldspathoid classification23 scheme is a union of two 2-simplices, and is an example of a “simplicial complex.”

For present purposes, the i’th pure endmember for the n’th trial is assigned to the i’th vertex of Km


with the spectral mixture corresponding to a point in the polyhedron of Km.

It is necessary to account for surface temperature in a somewhat different way. Let the minimum and maximum physically admissible surface temperatures be Tmin and Tmax, respectively. Then the temperature of our target patch is given as




Corresponding to x introduced already, we have from Eq. (6)


the unit interval, with


The quantity that appears in the forward model for the n’th trial is


where Bk(Tn) is the (band-integrated, as needed) Planck function at temperature Tn. The parametrization of the choice {Tn,ε(k)} in terms of the vector x is a mapping into the topological product


of I1 and |Km|. The set Hm+1 is not a simplex nor is it necessarily a simplicial complex. It is, however, a convex polytope and is the convex hull of its vertices xi, 0im+1. Although we will not need it in what follows, Hm+1 can be decomposed into either a simplicial complex or a union of simplices.

We score trial mixtures by that we most wish to maximize; the posterior probability for the observed spectral radiance to originate from a surface patch with temperature T and spectral emissivity ε(k). A standard argument3,5 gives the posterior probability in terms of a MAXENT estimator


in terms of a forward model


that is some function of the n’th trial, in each spectral bin k. We note that while the equation of transfer is linear, the dependence of its solution IFM upon εi(k)Bk(Tn) need not be. The assumed noise variance σ2 is shown as having a formal dependence upon a parameter, the “annealing temperature” Ta, which governs the annealing schedule for the search for an MAP solution. The joint posterior probability in J spectral bands is proportional to


If radiance Ik in each of J bands originating from a patch on the Earth’s surface has been detected at the top of the atmosphere, the posterior probability that the surface patch is at a temperature T given prior knowledge K is given by Bayes’ theorem as


The noise variance is assumed to be known and the functional dependence of probabilities upon σi is omitted. The prior probability P({Ii}|K) for the radiances {Ik} has no dependence upon T and for our purposes may be absorbed into an overall normalization.24 Equation (15) is evaluated with the aid of the prior probability for the surface to be at temperature T and has spectral emissivity ε(k), given available knowledge K5


where P[T,εi(k)|{Ii},K] is the conditional probability for the hypothesis that the surface temperature is T, and the spectral emissivity εk, given observed radiances {Ii} and prior knowledge K.

Each trial is thus scored according to the joint posterior probability for observed spectral radiance Ii to result from surface temperature T and spectral emissivity εk


where x stands for {x,xm+1}. Thus, in going from the (n1)’th to the n’th trial, the n’th candidate mixture is selected by Monte Carlo draw and pn for the new trial is compared to pn1 for the last one. The probability that it is accepted is18,20


P={1if  pn/pn11P(Ta)otherwise,
where the probability P(Ta) of taking a downward step in pn is determined by the annealing schedule. The dependence of P(Ta) on the annealing schedule is symbolized by the annealing temperature Ta, which is taken to decrease systematically during the MAP search. The actual form P(Ta) takes in practical calculations and is determined empirically.



We now examine the question of convergence. Corresponding to the sequence of m-simplices, Km, as the number of trials n increases without bound is a sequence of trials {Tn,εi(k)} with associated loci {x}Hm+1.

As a closed bounded subset of Rm+1, Hm+1 is a compactum. Therefore, as n, the sequence of trials x contains a convergent subsequence, whatever the value of m. Correspondingly, the sequence of posterior probabilities likewise has a convergent subsequence that, by construction, tends to the maximum value of the posterior probability, i.e., to an MAP solution for T and ε(k).

Consider the map x=Φ(x) given by


Φ(x)={xif  pn(x)pn(x)0xotherwise..
The mapping of Eq. (19) gives the action of the Metropolis algorithm according to Eq. (18) at sufficiently late times in the annealing schedule so that a transition to a state of decreased posterior probability occurs rarely, in the limit, almost never. We have noted that at a sufficiently late point in the annealing schedule, trials that decrease the posterior probability [Eq. (17)] will become infrequent. We may elide any such trials without affecting the convergence of the subsequence, which then takes the form


For all n greater than some M, convergence of the subsequence implies the Cauchy condition


(with the Euclidean norm supplying a suitable metric for finite m) so that



The mapping of Eq. (19) generates a sequence of trials x for which pn is nondecreasing. By Zorn’s lemma, the set comprised of all admissible trials x has at least one element with a maximal value of pn. We note that maximizing pn also maximizes the information-theoretic entropy by Eq. (12). According to the usual statement of the second law, the state of maximum entropy is one of thermodynamic equilibrium. We may, in view of the thermodynamic analogy underlying the Metropolis algorithm, therefore call the limit Eq. (22) an equilibrium point.

This nomenclature is attractive for another reason. In the limit, Eq. (22) amounts to a fixed point of Eq. (19). Ordinary fixed-point theorems are inapplicable to Eq. (19) because it is neither continuous nor semicontinuous: It can map an open set Hm+1 to a singleton x. We can, however, adapt the celebrated construction introduced by Nash25 to prove the existence of a fixed (equilibrium) point of an equivalent self-mapping.

In fact, we shall prove a somewhat stronger result. Consider


for stipulated x. The function φ is continuous in the mixture xα. Define the mapping N:xx by


where the index α is taken to run over members of any finite set of admissible trials xα in the execution of the Metropolis algorithm. (One may think of the collection of all sequences xα in ensemble-theoretic terms.) Suppose that x is a fixed point under Eq. (24). In Eq. (24), some values of α correspond to choices for {Ti,εi(k)} for which the posterior probability does not increase


For these values of α


If the choice x is fixed under the mapping N in Eq. (24), then the contribution to x from any xβ must not decrease; therefore, φβ=0,   β, lest the denominator in Φ exceed unity. Put another way, no other choice of {Ti,εi(k)} can increase the posterior probability, but that is the definition of an equilibrium point. If, on the other hand, an equilibrium point x maximizes the posterior probability [Eq. (17)], every φα vanishes, so that x is a fixed point.

Equation (24) is continuous and maps points x into a convex compactum Rm+1. A fixed point


Therefore, exists according to the Brouwer fixed-point theorem that, by construction, maximizes the a-posteriori probability of x.

The mapping Eq. (19) generates a sequence of trials x for which pn is nondecreasing and gives the maximal value of pn in the limit, while Eq. (27) demonstrates the existence of a trial x* for which pn cannot be made greater. In view of the ensemble-theoretic freedom to choose xα, we may identify the limit in Eq. (22) with the fixed point in Eq. (27). Therefore, a convergent subsequence of annealing trials exists that tends to an equilibrium point. Moreover, Eq. (27) demonstrates that the annealing search can, in principle, find x* in a finite number of trials. We conclude that, granted a suitable annealing schedule, there exists at least one convergent sequence of trials that tends to MAP surface temperature and spectral emissivity estimates consistent with observed spectral radiances Ik.



Sequential compactness guarantees the existence of a convergent subsequence of trials. In practice, we must expect that there will be more than one such sequence. The nonuniqueness of solutions to the TES problem suggests that there will be a continuous infinity of possible trials {Tn,εi(k)} that yield any stipulated value for the posterior probability. In any realizable search strategy, however, we need only contend with a countable set of convergent subsequences. Among these, there will be one for which the posterior probability is greatest. Its limit will be the closest approach to the MAP solution achieved by simulated annealing. In the nature of things, more than one convergent subsequence may be expected to exist that yields this same maximal estimate, with the same asymptotic annealing temperature Ta. We ignore all subsequences except these maximal ones.

In Refs. 5 and 3, expectation values for T and {ε(k)} over the the posterior probability [Eq. (15)] were shown to give good estimates for physical surface temperatures and emissivities. We claim that the mean of a large number of subsequences that converge to the limiting MAP value will tend to the expectation values for T and {ε(k)} with respect to Eq. (15).

The MAXENT estimator is constructed from the posterior probability of noise power in a spectral bin. For the sake of simplicity, we assume identical noise power in each bin. (This assumption is inessential and may be relaxed.) A fully annealed MAP estimate may be thought of as an individual Bernoulli trial drawn from the likelihood function for {Tn,εi(k)}. By construction, all such trials are independent and identically distributed with bounded expectation values. Moments over Eq. (15) are bounded despite bad behavior of the Jeffreys prior at T=0, because of the rapid decay of the exponentials away from the MAP solution, as L’Hôpital’s rule demonstrates.





be the means of MAP surface temperature and spectral emissivity taken over N convergent subsequences. Suppose the covariance matrix Σ of the trials to be nonsingular. We invoke the multivariate central limit theorem to conclude the mean values converge weakly to the multivariate Gaussian distribution



Reliance on the mixing hypothesis in the form given by Eq. (10), however, brings with it the concern that the relevant covariance matrix might be singular. In that event, the strong law of large numbers26,27 ensures




but without giving estimated variances of the mean values, such as with Eq. (30).

To the extent that the estimator used in the simulated annealing search is zero-mean error, we conclude the estimates yield accurate values for the physical values of T and {ε(k)}. As the spectral weights x, lying as they do between zero and unity, possess bounded moments, this conclusion applies to the limiting mean values of {Tn,{xmn}} as well.



The practical utility of the mathematical development in this letter may be questioned. We briefly address two possible concerns.

A legitimate concern is that the spectral emissivity of natural ground covers in the wild will seldom be known to the level of accuracy found in Ref. 28. While true in general, this concern has not dissuaded other researchers from relying upon spectral unmixing.

While convergence of the algorithm has been proved to our satisfaction, we have no equally satisfactory estimates of the rate of convergence, with the consequence that the choice of annealing schedule remains a matter of trial and error. In response to this concern, the availability of massively parallel computation made possible by the ready availability of cheap graphics processing unit arrays means that massive processing requirements need not preclude the use of a resource-hungry algorithm if that algorithm can provide a performance not attainable by other approaches.



In this letter, the Metropolis algorithm has been adapted to formulate a simulated annealing approach to the TES problem for a spectral mixture. We have presented a proof of convergence of simulated annealing searches for candidate MAP TES solutions. We have additionally shown that the average of a large number of these candidate MAP solutions converges almost surely to a unique estimate of surface temperature and spectral emissivity that, given a forward model leading to an unbiassed estimator for T and {εk}, closely approximates the true values of these quantities.

The simulated annealing approach to TES by spectral unmixing offers something that other TES algorithms do not. By construction, it gives (in the limit) the unique best estimate in an MAP sense, for the remote determination of surface temperature and spectral emissivity of a patch of ground that is known to be comprised of a spectral mixture of a stipulated set of spectral end members.


Support from the Aerospace Corporation Technical Investment Program under USAF Contract No. FA8802-14-C-0001 is acknowledged.



A. Barducci et al., “Maximum entropy temperature-emissivity separation in the TIR spectral range using the MaxEnTES algorithm,” Infrared Phys. Technol. 56, 12–20 (2013).BAMOAD0273-0979http://dx.doi.org/10.1016/j.infrared.2012.09.002IPTEEY1350-4495Google Scholar


Z.-L. Li et al., “Land surface emissivity retrieval from satellite data,” Int. J. Remote Sens. 34, 3084–3127 (2013).IJSEDK0143-1161http://dx.doi.org/10.1080/01431161.2012.716540Google Scholar


J. A. Morgan, “Comparison of Bayesian land surface temperature algorithm performance with Terra MODIS observations,” Int. J. Remote Sens. 32, 8139–8159 (2012).IJSEDK0143-1161http://dx.doi.org/10.1080/01431161.2010.532824Google Scholar


J. A. Sobrino et al., “Land surface temperature derived from airborne hyperspectral scanner thermal infrared data,” Remote Sens. Environ. 102, 99–115 (2006).http://dx.doi.org/10.1016/j.rse.2006.02.001Google Scholar


J. A. Morgan, “Bayesian estimation for land surface temperature retrieval: the nuisance of emissivities,” IEEE Trans. Geosci. Remote Sens. 43, 1279–1288 (2005).IGRSD20196-2892http://dx.doi.org/10.1109/TGRS.2005.845637Google Scholar


P. Dash et al., “Land surface temperature and emissivity estimation from passive sensor data: theory and practice-current trends,” Int. J. Remote Sens. 23, 2563–2594 (2002).IJSEDK0143-1161http://dx.doi.org/10.1080/01431160110115041Google Scholar


Z.-M. Wan, MODIS Land-Surface Temperature Algorithm Theoretical Basis Document, Institute for Computational Earth System Science, University of California, Santa Barbara (1999).Google Scholar


Z.-L. Li et al., “Evaluation of six methods for extracting relative emissivity spectra from thermal infrared images,” Remote Sens. Environ. 69, 97–214 (1999).http://dx.doi.org/10.1016/S0034-4257(99)00049-8Google Scholar


A. B. Kahle and R. E. Alley, “Separation of temperature and emittance in remotely sensed radiance measurements,” Remote Sens. Environ. 42, 107–111 (1992).http://dx.doi.org/10.1016/0034-4257(92)90093-YGoogle Scholar


P. S. Kealy and S. J. Hook, “Separating temperature and emissivity in thermal infrared multispectral scanner data: implications for recovering land surface temperatures,” IEEE Trans. Geosci. Remote Sens. 31, 1155–1164 (1993).IGRSD20196-2892http://dx.doi.org/10.1109/36.317447Google Scholar


F. Petitcolin and E. F. Vermote, “Land surface reflectance, emissivity and temperature from MODIS middle and thermal infrared data,” Remote Sens. Environ. 83(1–2), 112–134 (2002).http://dx.doi.org/10.1016/S0034-4257(02)00094-9Google Scholar


Z.-L. Li and F. Becker, “Feasibility of land surface temperature and emissivity determination from AVHRR data,” Remote Sens. Environ. 43, 67–85 (1993).http://dx.doi.org/10.1016/0034-4257(93)90065-6Google Scholar


K. Watson, “Spectral ratio method for measuring emissivity,” Remote Sens. Environ. 42, 113–116 (1992).http://dx.doi.org/10.1016/0034-4257(92)90094-ZGoogle Scholar


C. C. Borel, J. Szymanski, “Physics-based water and land temperature retrieval,” in Handbook of Science Algorithms for the Multispectral Thermal Imager, and B. W. Smith, Ed., Los Alamos National Laboratory and Savannah River Technology Center, Los Alamos, New Mexico (1998).Google Scholar


Z.-M. Wan and J. Dozier, “A generalized split-window algorithm for retrieving land-surface temperature from space,” IEEE Trans. Geosci. Remote Sens. 34, 892–905 (1996).IGRSD20196-2892http://dx.doi.org/10.1109/36.508406Google Scholar


A. Barducci and I. Pippi, “Temperature and emissivity retrieval from remotely sensed images using the ‘Grey body emissivity’ method,” IEEE Trans. Geosci. Remote Sens. 34, 681–695 (1996).IGRSD20196-2892http://dx.doi.org/10.1109/36.499748Google Scholar


Z.-M. Wan and Z.-L. Li, “A physics-based algorithm for land-surface emissivity and temperature from EOS/MODIS data,” IEEE Trans. Geosci. Remote Sens. 35, 980–996 (1997).IGRSD20196-2892http://dx.doi.org/10.1109/36.602541Google Scholar


N. Metropolis et al., “Equations of state calculations by fast computing machines,” J. Chem. Phys. 21, 1087–1091 (1953).JCPSA60021-9606http://dx.doi.org/10.1063/1.1699114Google Scholar


S. Kirkpatrick, C. D. Gelatt and M. P. Vecchi, “Optimization by simulated annealing,” Science 220, 671–680 (1983).SCIEAS0036-8075http://dx.doi.org/10.1126/science.220.4598.671Google Scholar


W. Press et al., Numerical Recipes in C, pp. 326–334, Cambridge University Press, Cambridge (1988).Google Scholar


C. R. F. Maunder, Algebraic Topology, pp. 31–32, Dover, Mineola, New York (1996).Google Scholar


J. J. Rotman, An Introduction to Algebraic Topology, pp. 31–38, Springer-Verlag, New York (1988).Google Scholar


M. J. Le Bas and A. L. Streckeisen, “The IUGS systematics of igneous rocks,” J. Geol. Soc. London 148, 825–833 (1991).http://dx.doi.org/10.1144/gsjgs.148.5.0825Google Scholar


L. Bretthorst, “Bayesian spectrum analysis and parameter estimation,” in Lecture Notes in Statistics, and J. Berger et al., Eds., Vol. 48, Springer-Verlag, New York (1988).Google Scholar


J. F. Nash, “Non-cooperative games,” Ann. Math. 54, 286–295 (1951).ANMAAH0003-486Xhttp://dx.doi.org/10.2307/1969529Google Scholar


J. W. Salisbury et al., Infrared (2.1–25 μ) Spectra of Minerals, Johns Hopkins University Press, Baltimore, Maryland (1992).Google Scholar


A. W. van der Waart, Asymptotic Statistics, Cambridge University Press, Cambridge (1998).Google Scholar


Z. Artstein and R. A. Vitalem, “A strong law of large numbers for random compact sets,” Ann. Probab. 3, 879–882 (1975).APBYAE0091-1798http://dx.doi.org/10.1214/aop/1176996275Google Scholar

John A. Morgan, "Simulated annealing approach to temperature–emissivity separation in thermal remote sensing," Journal of Applied Remote Sensing 10(4), 040501 (10 November 2016). http://dx.doi.org/10.1117/1.JRS.10.040501
Submission: Received ; Accepted



Fourier transforms

Thermal remote sensing

Atmospheric sensing

Error analysis

Frequency modulation

Back to Top