Model of thermal infrared image texture generation based on the scenery space frequency

Abstract. Infrared texture is an important feature in identifying scenery. To simulate infrared image texture effectively at different distances, we propose a model of infrared image texture generation based on scenery space frequency and the image pyramid degradation principle. First, we build a spatial frequency filter model based on imaging distance, taking into account the detector’s maximum spatial frequency, and use the filter to process a “zero” distance infrared image texture. Second, taking into consideration the actual temperature difference of the scenery’s details due to variation of the imaging distance and the effect of atmospheric transmission, we compare the actual temperature difference with the minimum resolvable temperature difference of the thermal imaging system at a specific frequency and produce a new image texture. The results show that the simulated multiresolution infrared image textures produced by the proposed model are very similar (lowest mean square error=0.51 and highest peak signal-to-noise ratio=117.59) to the images captured by the thermal imager. Therefore, the proposed model can effectively simulate infrared image textures at different distances.


Introduction
Infrared texture is an important feature in identifying scenery and has been used in various applications such as target detection, precision guidance, and three-dimensional scene simulation. [1][2][3] Infrared texture generation has been studied for decades, but because of security considerations, progress on the topic was seldom reported in the public literature.
The few published papers on infrared texture reveal the two methods used to generate infrared image texture: infrared texture simulation based on visible light texture [4][5][6] and the random field model. [7][8][9][10] The former method uses Planck's equation to calculate the infrared radiation energy for each object in the scene; the energy value is then mapped to a specific gray level, whose deviation is computed from the variations of gray in the visible image. The final infrared image texture is obtained using the specific gray level and its deviation. This simulation method can be adapted for a large-scale scene that needs only a low amount of detail, but it is not suitable for a scene that requires a high amount of detail, because the infrared and visible textures have different principles of formation. The other simulation method, based on a random field model, e.g., long correlation models [7] and the Markov random field model, [8][9][10] can also generate infrared image texture. However, this method requires a large number of model parameter tests to determine the proper parameters, is highly complex, and has low fidelity.
To simulate infrared image texture at different distances, the simulated image is transformed by zooming in and out. The two simulation methods mentioned above do not take into consideration the attenuation of high frequency and the variation in the temperature difference of the scenery detail due to the atmospheric effect on transmission and different distances. The transformation of the simulated images obtained by the two methods discussed above is not reliable when the distance changes. Based on the image multiresolution pyramid principle, we propose an infrared image texture generation model based on scenery spatial frequency to generate infrared image texture at different distances. First, we calculate the scenery spatial frequency at a specific distance using the Nyquist frequency of the detector, and then we use the calculated scenery spatial frequency as the cut-off frequency to build a filter model based on distance. We use the filter to process the "zero"-distance infrared image texture captured by the thermal imager and downsample the filtered image. Second, given that the actual temperature difference corresponding to different scenery texture details will change with a change in distance due to the atmospheric transmission effect, we compare the changed temperature difference with the minimum resolvable temperature difference (MRTD) of the thermal image system. The comparative results are used to build a filter based on MRTD to decide whether the frequency should be recognized. Finally, after performing the above two steps of the filtering process, we obtain the final image texture.
Section 2 introduces the infrared image texture model based on scenery spatial frequency, Sec. 3 presents the experimental results and discussion, and Sec. 4 gives the conclusion of the paper.

Multiresolution Image Pyramid
The image size and resolution gradually decrease from the bottom image to the top image of the pyramid. The size of the base layer J (the original image) is N × N, or 2^J × 2^J, where J = log2 N. The size of peak layer 0 is 1 × 1, i.e., a single pixel. The size of layer j is 2^j × 2^j, where 0 ≤ j ≤ J. Therefore, a multiresolution pyramid is formed by starting with the N × N original image and reducing each successive layer by a factor of 2 per side.
In a photoelectric imaging system with a fixed number of pixels, when the distance changes, the imaging process becomes a series of multiresolution displays. Therefore, the generation of infrared image textures at different distances is equivalent to the formation of an image pyramid: the "zero"-distance infrared image is the bottom image in the pyramid and has the highest resolution. The effects of distance and atmospheric transmission on the scenery infrared textures are equivalent to low-pass filtering and downsampling in the image pyramid. A series of infrared image textures of different sizes and resolutions can be obtained by repeated filtering and downsampling. The filters are based on distance and MRTD, and all of the filtering processes act on the "zero"-distance infrared image.
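As a sketch of this multiresolution process, the following Python fragment builds such a pyramid by repeated low-pass filtering and 2× downsampling. It is illustrative only: the filter here is a plain Gaussian, standing in for the distance- and MRTD-based filters developed in the following sections, and all function names are our own.

```python
import numpy as np

def gaussian_blur(img, sigma=1.0):
    """Separable Gaussian low-pass filter (simple direct convolution)."""
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2 * sigma**2))
    k /= k.sum()
    # pad with reflected borders, then convolve rows and columns
    pad = np.pad(img, radius, mode="reflect")
    rows = np.apply_along_axis(lambda r: np.convolve(r, k, mode="valid"), 1, pad)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="valid"), 0, rows)

def build_pyramid(img, levels):
    """Repeated low-pass filtering and 2x downsampling, as in the image pyramid."""
    pyramid = [img]
    for _ in range(levels):
        img = gaussian_blur(img)[::2, ::2]  # filter, then keep every other pixel
        pyramid.append(img)
    return pyramid
```

Each successive layer is half the size per side, matching the 2^j × 2^j layer sizes described above.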

Spatial Frequency Filter Based on Distance
The results of the scenery imaging on the detector are shown in Fig. 2, where h and w are the height and width of the scenery, O is the optic center, f0 is the focal length of the infrared imaging system, and p_h × p_w and p_h′ × p_w′ are the image sizes at distances L0 and L, respectively.

Frequency filter model based on distance
For an infrared imaging system with a fixed number of pixels, the ability to distinguish scenery details decreases with increasing distance. The cut-off frequency D_L is the highest frequency that the detector can distinguish at distance L; it determines the level of detail of the scenery at L and is calculated from L. A filter model based on this cut-off frequency is then built and used to process the "zero"-distance image. We call this filter the spatial frequency filter, denoted by H_S and defined as

H_S(u, v) = 1 if f_L ≤ D_L, and H_S(u, v) = 0 if f_L > D_L,

where f_L is the spatial frequency of the "zero"-distance image and D_L is the cut-off frequency of the image at distance L.
We apply the Fourier transform F(u, v) to the "zero"-distance image of size M × N:

F(u, v) = Σ_{x=0}^{M−1} Σ_{y=0}^{N−1} f(x, y) e^{−j2π(ux∕M + vy∕N)},

where f(x, y) is the gray value at (x, y) in the "zero"-distance image. The filtered image G(u, v) in the frequency domain is then calculated as

G(u, v) = H_S(u, v) F(u, v).

The spatial-domain image g_p(x, y) is obtained by the inverse Fourier transform of G(u, v):

g_p(x, y) = real{ζ^{−1}[G(u, v)]},

where ζ^{−1} is the inverse Fourier transform. The image size (p_h′ × p_w′) of the scenery at distance L is determined by the relationship between the location of the scenery and the detector, as shown in Fig. 2. Then g_p(x, y) is filtered again using the downsampling window determined by p_h′ × p_w′ and the detector array, where col and row are the number of columns and rows of the detector, respectively. We use g_p(x′, y′) to denote the result of downsampling g_p(x, y). This filtered image is the simulated image when the detector is located at L and the atmospheric transmission effect is not taken into consideration.
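The filtering and downsampling steps above can be sketched with NumPy's FFT routines. This is a minimal illustration; the function names, the radial form of f_L, and the nearest-neighbour downsampling are our own choices, not prescriptions from the paper.

```python
import numpy as np

def spatial_frequency_filter(img, D_L):
    """Ideal low-pass H_S: keep frequencies f_L <= D_L, zero the rest.

    D_L is the distance-dependent cut-off, here expressed as a radius
    in the centred DFT plane (cycles per image).
    """
    M, N = img.shape
    F = np.fft.fftshift(np.fft.fft2(img))           # centred spectrum F(u, v)
    u = np.arange(M) - M // 2
    v = np.arange(N) - N // 2
    f_L = np.sqrt(u[:, None]**2 + v[None, :]**2)    # radial frequency at (u, v)
    H_S = (f_L <= D_L).astype(float)                # ideal filter H_S(u, v)
    G = H_S * F                                     # G(u, v) = H_S(u, v) F(u, v)
    return np.real(np.fft.ifft2(np.fft.ifftshift(G)))  # back to spatial domain

def downsample(img, out_rows, out_cols):
    """Sample one pixel per detector cell (nearest-neighbour window)."""
    M, N = img.shape
    r = np.arange(out_rows) * M // out_rows
    c = np.arange(out_cols) * N // out_cols
    return img[np.ix_(r, c)]
```

With out_rows × out_cols set to the detector's row × col counts scaled by the fill ratio p_h′ × p_w′, this reproduces the filter-then-downsample step of the pyramid.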

Image cut-off spatial frequency based on distance
The horizontal and vertical sampling frequencies f_w and f_h of the detector are

f_w = 1∕d_w, f_h = 1∕d_h,

where d_w and d_h are the width and height of a detector pixel. By similar triangles, the imaging height p_h′ and width p_w′ on the detector at distance L are

p_h′ = f0 h∕L, p_w′ = f0 w∕L,

where f0 is the focal length of the infrared imaging system and h and w are the height and width of the scenery. The cut-off spatial frequencies of the image at L are determined by the relationship between the scenery and the detector:

D_Lh = (p_h′∕h_dect) · (f_h∕2), D_Lw = (p_w′∕w_dect) · (f_w∕2),

where D_Lh and D_Lw are the vertical and horizontal cut-off spatial frequencies of the image at L, and h_dect and w_dect are the height and width of the image on the detector plane.
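Assuming the standard pinhole (similar-triangles) projection, these relations can be sketched as follows. The scaling of the Nyquist limit by the detector fill ratio is our reading of the geometry, not a formula quoted verbatim from the paper.

```python
def cutoff_frequency(L, f0, h, w, d_h, d_w, h_dect, w_dect):
    """Pinhole projection and distance-dependent cut-off (assumed forms).

    p_h, p_w   : image size of the scenery on the detector at distance L
    f_h, f_w   : detector sampling frequencies (1 / pixel pitch)
    D_Lh, D_Lw : cut-offs, Nyquist limit scaled by the detector fill ratio
    """
    p_h, p_w = f0 * h / L, f0 * w / L     # similar-triangles projection
    f_h, f_w = 1.0 / d_h, 1.0 / d_w       # samples per unit length
    D_Lh = (p_h / h_dect) * f_h / 2.0     # vertical cut-off at distance L
    D_Lw = (p_w / w_dect) * f_w / 2.0     # horizontal cut-off at distance L
    return p_h, p_w, D_Lh, D_Lw
```

Doubling the distance L halves both the projected image size and the cut-off frequency, which is the behaviour the filter model relies on.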

Infrared image texture filter model based on MRTD
For scenery with a single spatial frequency f, such as a bar target, the atmospheric transmission affects the temperature difference between the target and the background. If the actual temperature difference is still greater than the MRTD(f) of the thermal imaging system after the atmospheric transmission is considered, the thermal imaging system can distinguish details at frequency f. Otherwise, the details at f cannot be distinguished and the image becomes blurry. This yields the following formula: [11]

ΔT0 · τ(L) ≥ MRTD(f),   (11)

where ΔT0 is the "zero"-distance temperature difference between the target and the blackbody background, and τ(L) is the mean atmospheric transmittance along the direction from the detector to the target at distance L in the wave band of the thermal imaging system. In reality, the scenery contains different levels of detail, and the spatial frequency of the infrared image is a frequency range rather than one fixed value. Therefore, it is necessary to calculate the actual temperature differences for the different spatial frequencies of the image at distance L. Comparing the actual temperature differences at different spatial frequencies with MRTD(f) determines whether details at frequency f can be discriminated. If the thermal imaging system can distinguish scenery details at frequency f and distance L, the following condition must be met:

ΔT(f) · τ(L) ≥ MRTD(f),   (12)

where ΔT(f) is the mean temperature difference for frequency f in the image, and τ(L) is as defined above and can be calculated using the program MODTRAN. [11] A temperature filter model H_t based on the MRTD, according to Eq. (12), is defined as

H_t(u, v) = 1 if ΔT(f) · τ(L) ≥ MRTD(f), and H_t(u, v) = 0 otherwise.   (13)

We denote the Fourier transform of the filtered result g_p(x′, y′) as G′(u, v) and apply the filter based on MRTD to obtain the final filtered image R(u, v) in the frequency domain:

R(u, v) = H_t(u, v) G′(u, v).   (14)

To obtain the filtered image in the spatial domain, the inverse Fourier transform is applied to R(u, v):

R_p(x, y) = {real[ζ^{−1}[R(u, v)]]}(−1)^{x+y},   (15)

where R_p(x, y) is the final simulated image of the thermal infrared texture at distance L.
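A sketch of this second filtering stage, with ΔT(f), τ(L), and MRTD(f) supplied as user functions; the radial-frequency grid and the centred-spectrum convention are our assumptions for illustration.

```python
import numpy as np

def mrtd_filter(G, delta_T_of_f, tau_L, mrtd_of_f):
    """Zero out spectral components whose apparent temperature difference
    falls below the MRTD at that frequency (the Eq. (12) test per frequency).

    G            : centred DFT of the distance-filtered image, G'(u, v)
    delta_T_of_f : callable, mean temperature difference dT(f)
    tau_L        : mean atmospheric transmittance tau(L)
    mrtd_of_f    : callable, MRTD(f) of the imaging system
    """
    M, N = G.shape
    u = np.arange(M) - M // 2
    v = np.arange(N) - N // 2
    f = np.sqrt(u[:, None]**2 + v[None, :]**2)
    H_t = (delta_T_of_f(f) * tau_L >= mrtd_of_f(f)).astype(float)
    R = H_t * G                       # R(u, v) = H_t(u, v) G'(u, v)
    # ifftshift de-centres the spectrum; for even image sizes this is
    # equivalent to the (-1)^(x+y) factor in Eq. (15)
    return np.real(np.fft.ifft2(np.fft.ifftshift(R)))
```

When τ(L) is high and the MRTD is low, H_t passes every frequency and the image is unchanged; as τ(L) drops, high-frequency detail is cut first because MRTD(f) grows with f.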

Model of relationship between frequency distribution and temperature difference of scenery
For the "zero"-distance infrared image (L = L0), we can determine the temperature range (T_min, T_max) and calculate the gray-level range (G_min, G_max). The relationship between temperature and gray value can be approximated as linear within a particular temperature range. [12] Therefore, the temperature T in the "zero"-distance infrared image is

T = T_min + (G − G_min)(T_max − T_min)∕(G_max − G_min),

where G is the pixel gray level. The temperature difference ΔT_ij of a given pixel (i, j) is defined as the mean temperature difference between that pixel and its neighboring pixels, where T(i, j) is the temperature at pixel (i, j). The mean temperature difference ΔT_avg of the whole image at distance L0 is

ΔT_avg = (1∕mn) Σ_{i=1}^{m} Σ_{j=1}^{n} ΔT_ij,

where m and n are the numbers of pixels of the "zero"-distance infrared image in the horizontal and vertical directions, respectively, and f1 is the highest spatial frequency of the infrared image at L0. The temperature difference between neighboring pixels has the highest frequency at L0; therefore, the average temperature difference of the image at L0 corresponds to the highest frequency f1. Sceneries at different distances have different highest spatial frequencies, each of which is less than f1. For example, at distance L = 2L0, the highest spatial frequency for the scenery on the detector is f2 = (1∕4)f1, indicating that each pixel on the detector represents the average temperature of four pixels in the "zero"-distance infrared image. Therefore, the temperature of a pixel in the infrared image at distance L = 2L0 is

T_2L0(i, j) = (1∕4) Σ_{k=2i−1}^{2i} Σ_{l=2j−1}^{2j} T(k, l),

and the average temperature difference of the image corresponding to the highest spatial frequency f2 is computed from these averaged temperatures in the same way as ΔT_avg. Similarly, we can calculate all average temperature differences corresponding to the different highest spatial frequencies f3, f4, …, and then fit a curve relating ΔT_avg(f_i) and f_i using the discrete values of the highest spatial frequencies and the average temperature differences.
In this work, we used a two-term exponential function to model the relationship between ΔT_avg(f_i) and f_i:

ΔT_avg(f) = a e^{bf} + c e^{df},

where a, b, c, and d are coefficients obtained by curve fitting the relationship between the frequency distribution and the temperature difference of the scenery in the experimental step. Different "zero"-distance images yield different coefficients.
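The per-level computation of these (f_i, ΔT_avg(f_i)) pairs can be sketched as follows. The 4-neighbour definition of ΔT_ij and the normalisation f1 = 1 are our assumptions for illustration; the fitted coefficients a, b, c, d would then come from any standard nonlinear least-squares routine applied to the returned pairs.

```python
import numpy as np

def gray_to_temperature(G, T_min, T_max, G_min, G_max):
    """Linear gray-to-temperature mapping over the calibrated range."""
    return T_min + (G - G_min) * (T_max - T_min) / (G_max - G_min)

def mean_neighbor_diff(T):
    """Average absolute temperature difference between adjacent pixels,
    used as a stand-in for the per-pixel dT_ij (assumed 4-neighbourhood)."""
    dh = np.abs(np.diff(T, axis=0)).mean()
    dw = np.abs(np.diff(T, axis=1)).mean()
    return (dh + dw) / 2.0

def freq_temperature_pairs(T, levels):
    """dT_avg at the highest frequency of each level: each level replaces
    2x2 pixel blocks by their mean, and the paper takes f2 = f1 / 4."""
    pairs, f = [], 1.0                   # f1 normalised to 1 at "zero" distance
    for _ in range(levels):
        pairs.append((f, mean_neighbor_diff(T)))
        M, N = T.shape[0] // 2 * 2, T.shape[1] // 2 * 2
        T = T[:M, :N].reshape(M // 2, 2, N // 2, 2).mean(axis=(1, 3))
        f /= 4.0
        if min(T.shape) < 2:
            break
    return pairs
```

For a high-contrast texture, the pairs decay quickly with level, which is what motivates the exponential fit for ΔT_avg(f).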

MRTD of the thermal imaging system
The MRTD [13] of the thermal imaging system is expressed as

MRTD(f) = (π²∕4) · SNR_T · NETD · f∕MTF(f) · [αβ∕(t_e f_p τ_d Δf)]^{1∕2},

where NETD is the noise equivalent temperature difference, SNR_T is the threshold of the signal-to-noise ratio (SNR), α × β is the instantaneous field of view of the optical system, τ_d is the residence time, f_p is the frame frequency, t_e is the integration time of the eye, Δf is the noise equivalent bandwidth, and MTF(f) is the modulation transfer function of the thermal imaging system, [13] defined as

MTF(f) = MTF_o(f) · MTF_e(f) · MTF_d(f),

where MTF_o, MTF_e, and MTF_d are the modulation transfer functions of the optical system, the electronic circuit, and the detector in the thermal imaging system, respectively. More details about the modulation transfer functions are given in Ref. 13.
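The structure of these expressions can be sketched as below. Note that the prefactor π²∕4 is one commonly quoted constant for MRTD models and may differ from the exact form used in Ref. 13; it is an assumption here.

```python
import numpy as np

def mtf_system(f, mtf_o, mtf_e, mtf_d):
    """Cascaded system MTF: product of the optics, electronics and
    detector transfer functions at spatial frequency f."""
    return mtf_o(f) * mtf_e(f) * mtf_d(f)

def mrtd(f, netd, snr_t, alpha_beta, tau_d, f_p, t_e, delta_f, mtf):
    """Lloyd-style MRTD sketch (constant pi^2/4 assumed; texts differ):
    MRTD grows with frequency and with the inverse of the system MTF."""
    return (np.pi**2 / 4.0) * snr_t * netd * f / mtf(f) \
        * np.sqrt(alpha_beta / (t_e * f_p * tau_d * delta_f))
```

Because MTF(f) falls and the f factor rises with frequency, MRTD(f) increases steeply at high frequencies, which is why fine texture detail is the first to fail the Eq. (12) test.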

Experimental Results and Discussion
We simulated the infrared image texture of scenery at different distances based on the "zero"-distance image. The "zero"-distance image was captured by the VarioCAM Long Wave Thermal Imaging System (InfraTec GmbH, Dresden, Germany). The parameters were as follows: resolution = 240 × 320 pixels and wave band = 7.5 to 14 μm. Two "zero"-distance images were collected on October 18, 2013, and are shown in Figs. 3(a) and 3(b). They were taken at 40 deg north latitude under a cloudy sky. In addition, haze reduced visibility to 0.5 km, and the atmospheric transmissivity was <0.7. Using the model of the relationship between the frequency distribution and the temperature difference of the scenery, we calculated five typical frequency points and their corresponding average temperature differences. The infrared image textures shown in Fig. 4 were simulated as follows. First, we determined the distance of the simulated image; we assumed that it was 5 m. Second, we applied the spatial frequency filter based on distance and downsampled the "zero"-distance infrared image [Fig. 4(a)] using Eq. (3); the experimental results are shown in Figs. 4(b) and 4(c) in the frequency and spatial domains, respectively. Finally, we used the infrared texture image filter based on MRTD from Eq. (14) to process the filtered image shown in Fig. 4(c); the result is shown in Figs. 4(d) and 4(e).
We found that the image in Fig. 4(c) is fuzzier and smaller than that in Fig. 4(a), and the image in Fig. 4(e) is fuzzier than that in Fig. 4(c); some details are attenuated because of the atmospheric transmission effect. Figure 5 compares the simulated image with the infrared image captured by the thermal imager (the real infrared image) when the subject was 5 m from the imager. To compare the two images directly and analyze the simulation, the simulated image was extended to the whole field of view. Both images [Figs. 5(a) and 5(b)] are relatively similar from a subjective point of view. The slight discrepancy between the two [Fig. 5(c)] is caused mainly by the nonconformity of the scenery locations in the two images: the object in the captured infrared image is not exactly centered in the field of view, which introduces a mismatch with the simulated image. Figure 6 shows the histograms [14] of the infrared image captured by the thermal imager [Fig. 5(a)] and the simulated image [Fig. 5(b)]. Figures 6(a) and 6(b) are the full histograms, and Figs. 6(c) and 6(d) are the histograms in the gray-level range of 0 to 100 for the infrared image and the simulated image, respectively. The histograms in Figs. 6(a) and 6(b) each have a peak within the 0 to 255 gray-level range. The histograms in Figs. 6(c) and 6(d) show that the infrared image and the simulated image have similar gray-level distributions.
The simulated images and real infrared images at 10, 15, and 20 m are presented in Fig. 7. The details of the simulated images and the real infrared images decrease with increasing imaging distance. The simulated image has a texture similar to that of the real infrared image when the imaging distances of the two images are the same. Figure 8 presents the real infrared image [Fig. 3(b)] and the simulated images of the grass at different distances. The "zero"-distance captured image [Fig. 8(a)] is of a patch of grass 0.6-m wide and 0.45-m high. We used the proposed filter model to process the "zero"-distance infrared image at different distances to obtain the simulated infrared texture images. Because the simulated images should cover the entire field of view, texture matching based on the sample patch was applied to each simulated image; the results are shown in Figs. 8(b)-8(f). The figures show that as the distance increased, the details gradually became blurrier. These changes reflect the variations in the details of the scenery infrared texture at different distances.
Mean square error (MSE) and peak signal-to-noise ratio (PSNR) are often used as evaluation indices [15] to compare the similarity of two images. In general, if PSNR > 20, there is a strong similarity between the two images. [15] The similarity indices of the captured images and simulated images at different distances are presented in Table 1.
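For reference, the two indices can be computed as follows (a standard definition with an 8-bit peak value of 255 assumed; the paper's exact normalisation is not stated):

```python
import numpy as np

def mse(a, b):
    """Mean square error between two equally sized gray images."""
    return np.mean((a.astype(float) - b.astype(float)) ** 2)

def psnr(a, b, peak=255.0):
    """Peak signal-to-noise ratio in dB; larger means more similar."""
    e = mse(a, b)
    return np.inf if e == 0 else 10.0 * np.log10(peak**2 / e)
```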
The results in Table 1 show that when the distance increases, the MSE decreases and the PSNR increases, indicating that the similarities increase as the distance increases. All the PSNR values in this study were greater than 20, so the captured image and simulated images are very similar when the distance is between 5 and 20 m. The small MSE values and the large PSNR values in Table 1 suggest that the proposed filter model has high fidelity and is valid.
This study has one limitation: the proposed model was tested on only two thermal images, those of the person and the grass. We limited the number of images for three reasons. First, the performance of the model in simulating scenery depends on the imaging distance and viewing direction, not on the object in the scenery. Second, the experimental images of the person and the grass show the degradation of the image and the variation in texture detail that occur when the imaging distance changes. With these two images, we verified that the proposed model is valid when the "zero"-distance infrared image is captured perpendicular to the scenery (the grass was shot from above, whereas the person was shot horizontally), but it has not been validated for scenery simulated from other viewing directions; therefore, additional images taken perpendicular to the scenery would not have tested the model further. Third, capturing additional thermal infrared images at different viewing directions and distances would have required more complex experimental conditions and more equipment, e.g., unmanned drones. We will therefore consider capturing more images in future work, when the experimental conditions are appropriate.

Conclusion
Based on the principle of the multiresolution image pyramid, we proposed a new thermal infrared image texture generation model based on scenery spatial frequency. The model operates on a "zero"-distance infrared image. Two typical sceneries were simulated using the model, and the simulations were compared with the infrared image textures captured by a thermal imager. The experimental results validated the proposed model by showing that it reflects the features of infrared image texture and the imaging principle at different distances. In conclusion, the proposed model can effectively simulate infrared images with textures on a large-scale background and can meet some of the requirements of qualitative analysis. In the future, we will capture and simulate sceneries from different directions and distances and use them to improve the robustness of the proposed model.