Remote sensing images are usually acquired under dissimilar imaging conditions, such as different periods, illumination intensities, and sensor angles. As a result, brightness, contrast, and color are unevenly distributed within and among remote sensing images, which greatly restricts the use of the subsequent orthophoto mosaics and their applications. Image dodging is a process that reduces uneven brightness and hue within and between images; it is a relative equalization of the contrast and color information in images.1 It is an essential part of producing a seamless mosaic and ensures that mosaicked images faithfully represent the real world and are therefore suitable for real-world applications. Image dodging is particularly important when dealing with unmanned aerial vehicle (UAV) images collected from fixed-wing UAVs without gimbals, at different solar elevations, or on multiple flights under varying weather conditions.
UAV imaging technology has developed rapidly in recent years, gaining wide use in various applications because of its cost effectiveness, high flexibility, and high spatial resolution. However, brightness and hue vary widely among images because of different imaging conditions and sensor restrictions that are specific to UAV imagery. If images are acquired from a low and unstable platform such as a fixed-wing UAV without gimbals, the camera angle is subject to extreme shifts, and the brightness and hue of ground objects may vary among images taken at different camera angles. When UAV images are acquired at a small solar elevation angle, shadows cause a visible difference in brightness between the sunlit side and the opposite side of an image. The quality of consumer-level digital cameras is limited, which can lead to vignetting, an uneven brightness distribution within a single image. Coverage of a large survey area requires multiple flights, which are subject to changes in weather and thus to significant differences in global image illumination. Some imaging conditions can be controlled by using stable multirotors and gimbals, but others cannot, such as the weather during multiple image-collection flights. All these issues may cause a dark-bright interstrip effect in the final mosaic image. The radiometric inconsistency caused by these conditions can be reduced or even eliminated, however, through image dodging. Several studies on the radiometric correction of UAV images have been conducted2–5 that focus on the quantitative inversion of UAV images for agricultural applications. In this paper, however, we focus on a dodging method designed to enhance the visual appearance of the final mosaic image.
Over et al.6 and Pudale and Bhosle7 compared several representative image-dodging methods, including histogram matching, linear transformation, and statistical methods based on mean and standard deviation (SD). These methods can be broadly categorized into linear and nonlinear models.8,9
Histogram matching is a common nonlinear method that reduces differences in brightness and hue among images by correcting the shape of the histogram. Doutre and Nasiopoulos10 used histogram matching to correct differences in brightness and hue among camera video images. They computed brightness and hue histograms for the original and the reference image, and then created brightness- and hue-mapping functions based on the cumulative distribution of the shape difference function. Wang and Pan,11 however, point out that although histogram matching brings the mean and variance of the target image in line with those of the reference image, it does so by directly changing the shape of the histogram. Differences in the internal features of images are reflected in the shapes of their histograms; thus, when the difference is large, histogram matching changes the original relative distances among the gray levels, shifting the image color and causing the dodging process to fail.
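As an illustration of the technique discussed above, the following is a minimal sketch of single-band histogram matching in its generic cumulative-distribution form. It is not the specific procedure of Refs. 10 or 11; the function name `match_histogram` and the assumption of integer gray levels in [0, 256) are ours.

```python
import numpy as np

def match_histogram(source, reference, levels=256):
    """Remap gray levels of `source` so that its CDF follows that of `reference`.

    Both inputs are assumed to be integer arrays with values in [0, levels).
    """
    src_hist = np.bincount(source.ravel(), minlength=levels)
    ref_hist = np.bincount(reference.ravel(), minlength=levels)
    src_cdf = np.cumsum(src_hist) / source.size
    ref_cdf = np.cumsum(ref_hist) / reference.size
    # For each source level, take the first reference level whose CDF
    # reaches the CDF value of that source level.
    mapping = np.searchsorted(ref_cdf, src_cdf, side="left")
    mapping = np.clip(mapping, 0, levels - 1).astype(source.dtype)
    return mapping[source]
```

Because the mapping is driven entirely by the two cumulative histograms, gray levels that were evenly spaced in the source can end up unevenly spaced in the output when the histogram shapes differ strongly, which is the color-shift risk described above.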
To address the problems in histogram matching, linear modeling has received a great deal of attention.12–16 In these approaches, the combined value of hue and illumination variation among images is estimated statistically from pixels sampled from the overlapping areas of several images. This value is then used to reduce the differences among the images; however, it does not represent the true gray difference among the images. The advantage of linear modeling is that it considers the color consistency of multiple images as a whole, which facilitates quality control. In addition, the results do not depend on the order in which the images are processed. The disadvantage is that it does not reflect the nonlinear attributes of aerial images well. Although overall hue and illumination consistency are ensured, differences may still exist in local areas. The linear model is also likely to cause color distortion in images with complex gray-value distributions. The linear statistical method based on mean and SD rests on the idea that two images should have the smallest gray difference in the “least mean squares” sense, but this approach reduces local contrast in the images being processed.
These methods were proposed for satellite imagery or traditional aerial images. UAVs, by contrast, generally fly only tens to hundreds of meters above the ground. Camera angles, lighting conditions, and the properties of ground objects generate large highlighted areas and dark spots in images collected by UAVs. Because of occlusion, the area and position of the same highlighted building or shadow may vary among images, thus influencing image-dodging results. Traditional image-dodging methods based on global or local statistical parameters may abnormally stretch the hue and illumination of different areas. For example, large areas of vegetation become brighter, and buildings and other bright regions become darker. Commercial software packages, such as Pix4D and Agisoft Photoscan, have their own image-dodging processes embedded in the workflow; these processes balance color in images acquired under similar conditions in a single flight. However, traditional dodging approaches and commercial software cannot effectively reduce the dark-bright interstrip effect between UAV images or the overall brightness inconsistencies found in images acquired during different flights. Pan et al.17 proposed a global-then-local processing method in which each image is first treated with a linear model and then subjected to local optimization based on a nonlinear model.
In this paper, we address these problems and extend the research that follows the global-then-local principle by combining the concepts behind the Wallis filter and MASK dodging18 in our proposed image-dodging method. A flow chart of the algorithm is shown in Fig. 1. The overall reference background image is obtained from global statistics, and the overall brightness remains consistent with the two-dimensional (2-D) radiometric spatial attributes. We use the Contourlet transform to separate high- and low-frequency information before processing a single image, and apply relative radiometric consistency processing to the low-frequency section of the target image. We use only the overall mean difference for the radiometric adjustment of the low-frequency sections in order to keep the relative radiation distribution within an image. The bright and dark foreground objects are smoothed before the overall mean difference is obtained to reduce their adverse effect on the estimate of the mean difference.
Acquisition of Overall Reference Background Images
The aim of image dodging for image mosaics is to make the hue and illumination between images consistent as a whole, while maximally protecting texture information and true contrast. The radiometric differences among images stem from the background radiation. Image dodging therefore should target the background radiometric information of images. Thus, a key step in an image dodging process is obtaining this background radiometric information to produce an overall mosaic result.
Highly reflective objects such as buildings and water bodies, and nonreflective objects, such as shadows, may appear in images. These objects are considered as the foreground objects, which need to be excluded during the extraction of overall background radiation information.19 We use the object-level image smoothing (OLIS) method19 to minimize the negative effect from foreground objects in the images. We calculate the mean and SD of these foreground-smoothed images. After the foreground-smoothed single image is produced by relative radiometric consistency processing based on mean and SD (see the following step 3), these images are premosaicked. Because image stitching may produce some visible mosaic traces, and the edge information may be mixed into the images to be processed in a subsequent procedure, the premosaicked result needs to be smoothed to ensure the reliability of the overall reference background image.
Suppose there are several images to be mosaicked, from which one overall reference background image is to be derived. The steps can be summarized as follows:
1. Use OLIS to remove the bright and dark foreground objects in each image to obtain its foreground-smoothed version;
2. Calculate the mean and SD of each foreground-smoothed image and, taking the pixel count of each image into account, derive the mean and SD of the overall background image;
3. Perform relative radiometric consistency processing on each foreground-smoothed image based on mean and SD to obtain a radiometrically adjusted image;
4. Premosaic the adjusted images and smooth the premosaicked result to get the overall reference background image.
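Steps 2 and 3 can be sketched as follows, under two assumptions that the text does not confirm: the overall mean and SD are pooled from the per-image statistics weighted by pixel count, and the consistency processing is the usual linear mean/SD transformation. The function names are hypothetical, the inputs are taken to be single-band images that have already been foreground-smoothed (OLIS itself is not implemented), and step 4 is omitted.

```python
import numpy as np

def pooled_mean_sd(images):
    """Pixel-count-weighted mean and SD over a list of single-band images.

    The pooling rule is an assumption made for this sketch.
    """
    counts = np.array([img.size for img in images], dtype=float)
    means = np.array([img.mean() for img in images])
    variances = np.array([img.var() for img in images])
    total = counts.sum()
    overall_mean = (counts * means).sum() / total
    # Pooled variance = weighted within-image variance
    #                 + weighted spread of the image means.
    spread = (means - overall_mean) ** 2
    overall_var = (counts * (variances + spread)).sum() / total
    return overall_mean, np.sqrt(overall_var)

def match_mean_sd(image, target_mean, target_sd, eps=1e-6):
    """Linear transform that gives `image` the target mean and SD."""
    img = image.astype(float)
    gain = target_sd / max(img.std(), eps)  # eps guards constant images
    return (img - img.mean()) * gain + target_mean
```

With this pooling rule, the overall statistics equal those of all background pixels taken together, so the result does not depend on the order in which the images are listed.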
Separation of High and Low Frequencies of Images
Contourlet transform20 is a multiscale and multidirectional image analysis method that compensates for the shortcomings of wavelet transform in obtaining the intrinsic geometric structure, i.e., high-frequency information, of an image. Therefore, we conduct high- and low-frequency separation for images using Contourlet transform and better protect the high-frequency texture information.
Contourlet transform is divided into two analysis phases: the multiscale analysis and the multidirection analysis. It utilizes the Laplacian pyramid21 to perform multiscale analysis, decomposing images into a downsampled low-pass filtered image and a band-pass image isolating breakpoints of the edges. Then the 2-D directional filter bank (DFB)22 connects the breakpoints in the same direction into lines to form directional coefficients.
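To make the multiscale stage concrete, the sketch below implements one Laplacian pyramid level and its inverse. It is a simplification: 2x2 block averaging and nearest-neighbor upsampling stand in for the Gaussian-like filters normally used, image sides are assumed divisible by 2 at every level, and the directional filter bank stage is not included. All function names are illustrative.

```python
import numpy as np

def _upsample2(low):
    """Nearest-neighbor upsampling by a factor of 2 along both axes."""
    return np.repeat(np.repeat(low, 2, axis=0), 2, axis=1)

def lp_decompose(image):
    """One pyramid level: (downsampled low-pass image, band-pass residual)."""
    img = image.astype(float)
    h, w = img.shape
    # 2x2 block averaging acts as low-pass filter and downsampler in one step.
    low = img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
    band = img - _upsample2(low)
    return low, band

def lp_reconstruct(low, band):
    """Invert one pyramid level."""
    return _upsample2(low) + band

def lp_pyramid(image, levels):
    """Repeated decomposition: coarsest low-pass image plus band-pass images."""
    bands = []
    low = image.astype(float)
    for _ in range(levels):
        low, band = lp_decompose(low)
        bands.append(band)
    return low, bands
```

Each additional level halves the side length of the low-pass image, so one pixel of the coarsest level summarizes a square of 2^levels by 2^levels pixels in the input.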
Dodging for a Single Image
Contourlet transform is a local transformation of space and frequency domains. Since the DFB is variable, it has a stronger directionality than wavelet transform and can express the image texture features more effectively. The Contourlet transform is introduced in image dodging for effective protection of texture information.
To obtain true differences in background radiation between the reference image and the image to be processed, not only do the bright and dark objects in images need to be smoothed before performing Contourlet transform but also the low-frequency section in the transform result needs to be further processed using a low-pass filter. Even if the image has been processed by a multilevel Contourlet transform, a portion of high-frequency texture information still exists in the low-frequency section. Therefore, we use the low-pass filtered result of the low-frequency section for extracting the background radiation difference and add this difference back to the low-frequency section of the original image. We perform an inverse Contourlet transform to get the dodged image while keeping the high-frequency section intact. The high-frequency section is not involved in this process, which prevents the dodging process from altering the high-frequency texture information of the image. The dodging process of a single image can be summarized as the following steps:
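Three images take part in the dodging of a single image: the original image, the same image with its foreground objects smoothed, and the image at the corresponding position in the overall reference background image.
1. Perform the Contourlet transform, with the chosen number of levels, on the three images to obtain the low-frequency section and the high-frequency section of each.
2. Apply a low-pass filter to the low-frequency sections of the foreground-smoothed image and of the reference image to obtain the “true” low-frequency information, which better represents the image background.
3. Use the low-frequency section of the original image together with the two filtered low-frequency sections to compute the adjusted low-frequency section of the original image.
4. Replace the low-frequency section of the original image with the adjusted one and combine it with the original high-frequency section to perform the inverse Contourlet transform and obtain the dodged result.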
The Contourlet transform level can be set to 4 (Ref. 19).
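The data flow of the four steps can be sketched as below. This is not the Contourlet-based implementation: a box filter stands in for the Contourlet low-frequency section and the residual stands in for the high-frequency section. The text mentions both a background radiation difference and an overall mean difference, so the hypothetical `mean_only` flag exposes both readings; all function and parameter names are assumptions.

```python
import numpy as np

def box_lowpass(image, size):
    """Mean filter with an odd window `size`, using edge padding."""
    pad = size // 2
    padded = np.pad(image.astype(float), pad, mode="edge")
    rows, cols = image.shape
    out = np.zeros((rows, cols), dtype=float)
    for dy in range(size):
        for dx in range(size):
            out += padded[dy:dy + rows, dx:dx + cols]
    return out / (size * size)

def dodge_single(original, smoothed, reference,
                 split_size=9, window=5, mean_only=False):
    """Shift the low-frequency part of `original` toward `reference`.

    `smoothed` is the foreground-smoothed version of `original`;
    `reference` is the matching region of the reference background image.
    """
    orig = original.astype(float)
    # Step 1 (stand-in): split into low- and high-frequency parts.
    low_orig = box_lowpass(orig, split_size)
    high_orig = orig - low_orig
    # Step 2: low-pass filter the low-frequency parts once more.
    low_smooth = box_lowpass(box_lowpass(smoothed, split_size), window)
    low_ref = box_lowpass(box_lowpass(reference, split_size), window)
    # Step 3: background difference, per pixel or as one overall mean.
    diff = low_ref - low_smooth
    if mean_only:
        diff = diff.mean()
    # Step 4: recombine with the untouched high-frequency part.
    return (low_orig + diff) + high_orig
```

The high-frequency part is added back unchanged, so texture is altered only through the smooth offset applied to the low-frequency part. The output is floating point and is not clipped to the 8-bit range.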
Dodging for the Whole Mosaic Image
Since the orthophotos to be mosaicked already contain geoinformation, the position of each single image in the mosaic image is known. When a single image is processed, the corresponding region of the overall reference background image must be extracted at the same time. The larger the number of Contourlet transform levels, the more heavily the low-frequency section is downsampled, which means that each pixel processed in the low-frequency section corresponds to a larger area in the mosaic image. Mosaic images usually have a large pixel count, so it is practically impossible to process an entire mosaic image at once. Moreover, the Contourlet transform has a so-called edge effect; therefore, a special deblocking strategy with overlaps is introduced to perform dodging for a single image, as well as for the corresponding region of the overall reference background image. A schematic diagram for deblocking is shown in Fig. 2.
As shown in Fig. 2, each block has an overlap with its adjacent blocks. When processing a block, the valid range for the block is smaller than the block range itself. The overlap size depends on the Contourlet transform level number and the window size of the low-pass filter in the low-frequency section. The overlap region number must be more than , where is the Contourlet transform level and is the window size of the low-pass filter in the low-frequency section.
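A minimal sketch of the block partitioning with overlaps along one axis is given below. The function name and the `block` and `overlap` parameters are illustrative; the overlap is taken as an input and should be chosen large enough for the Contourlet level and low-pass window in use. Each block is read together with its overlap margins, but only its valid range is written back.

```python
def tile_ranges(length, block, overlap):
    """Split [0, length) into blocks with overlap margins.

    Returns a list of ((read_start, read_stop), (valid_start, valid_stop)).
    The valid ranges partition [0, length); the read ranges extend each
    valid range by `overlap` on both sides, clipped to the axis limits.
    """
    if block <= 0 or overlap < 0:
        raise ValueError("block must be positive and overlap non-negative")
    tiles = []
    for valid_start in range(0, length, block):
        valid_stop = min(valid_start + block, length)
        read_start = max(valid_start - overlap, 0)
        read_stop = min(valid_stop + overlap, length)
        tiles.append(((read_start, read_stop), (valid_start, valid_stop)))
    return tiles
```

Applying the function to rows and columns separately and pairing every row range with every column range yields the 2-D blocks.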
Experiment and Analysis
The test images used in our experiment were 1650 orthophotos produced after aerial triangulation and bundle adjustment. Because of the large coverage, the original images were acquired on three flights on three different days. The images were acquired with a fixed-wing UAV with no gimbal in the fuselage. The on-board camera was a Canon 5D II. The images were stored as regular 8-bit RGB color JPEG files. Pixel resolution for a single image was . The file size of a single orthophoto after rectification was . The total data size for all orthophotos was . The final file size of the mosaic image was 34 GB. The dark-bright interstrip effect is not very noticeable in the original images at 1:1 scale; it becomes apparent only in the overview. A traditional linear model cannot solve the nonlinear brightness distribution problem of a single image because of its model constraint. If the nonlinear histogram matching method is applied when the shape difference between image histograms is large, then the relative distances among the original gray levels change, resulting in hue and illumination shifts in images with different internal features. This set of test images involves all the situations noted in Sec. 1 that can cause different brightness distributions in UAV images. Therefore, these images can be used to test the effectiveness of the proposed method. Figure 3 shows a comparison between the mosaic result without image dodging and the result after processing with our proposed dodging method.
In the mosaic result without image dodging, shown in the left part of Fig. 3, there are three regions with different illumination levels from left to right. The left three flight strips are underexposed and their brightness is low. The flight strips in the middle part are properly exposed and their brightness is moderate, but the dark-bright interstrip effect is visible: one side is bright and the other side is dark because the illumination distribution in the individual images is uneven. In addition, the UAV flight path is a zigzag, so the overall dark-bright interstrip effect in the mosaic result becomes much more pronounced. The flight strips in the right part are slightly overexposed; these images were acquired on a sunny day and their brightness is high. After carefully analyzing the original images, we found that when the UAV flies along different strips, different amounts of shadow-casting objects are captured on different sides of the image because of occlusion, causing an uneven brightness distribution within a single image. This uneven brightness distribution causes the dark-bright interstrip effect. Figure 4 shows the mosaic results from two commercial software packages commonly used for UAV image processing. The left part is the mosaic result from Pix4D and the right part is the mosaic result from Agisoft Photoscan. Both mosaics exhibit the dark-bright interstrip effect. Furthermore, the three different global illumination levels from the three flights remain.
The enlarged views of three typical regions selected from mosaic results found in Fig. 3 are shown in Fig. 5, where a, b, and c regions correspond to a, b, and c regions in the mosaic results found in Fig. 3.
In Fig. 5, the left column shows the results without image dodging and the right column shows the dodged results using our proposed method. Regions a, b, and c are insets taken from three typical areas in Fig. 3 where the original UAV images are underexposed, properly exposed, and overexposed, respectively. They were taken at a 1:1 viewing scale from both mosaics, and each contains parts of two adjacent images. The brightness and hue in the lower right corner of each inset are visibly different from those of the main body of the inset. As can be seen from the right column in Fig. 5, however, such local differences can also be rectified. Mean and SD parameters of the true color RGB bands for the three regions are given in Table 1. The means of the three bands increase successively from region a to region c before dodging but settle at a consistent level after dodging. Moreover, the corresponding SD is almost unchanged for all three bands, indicating that our proposed method causes little interference with image information and maintains image texture information in the process. Our proposed method keeps the brightness of all images consistent and eliminates the dark-bright interstrip effect caused by the shadows of ground objects and vignetting.
Table 1 Statistical parameters of three typical areas.
In this paper, we propose an algorithm for automatic image dodging of UAV images that considers 2-D radiometric spatial attributes. During preprocessing, it removes dark and bright foreground objects to reduce their adverse effects and obtains an overall reference background image. Our method uses the Contourlet transform to separate the high- and low-frequency sections of images and calculates the average difference from the reference image in the low-frequency sections to achieve radiometric consistency. The key aspects of this method are the acquisition of the overall reference background image for the mosaic result and the targeted processing of the low-frequency sections of single images. Because images usually have a large pixel count, they need to be deblocked (divided into blocks) with overlaps, and only the effective area of each sub-block is retained in the resulting image. The size of the overlap region depends on the number of Contourlet transform levels and the window size of the low-pass smoothing in the low-frequency sections.
The proposed method can be used for image dodging to balance inconsistencies of hue and illumination in images acquired at different times while maintaining image detail. It can also be applied to push-broom images that exhibit striping due to varying light and shadow in different directions. Some problems remain, however, such as the contamination of colors in bright areas by the surrounding regions and a fogged visual effect in the final mosaic image; these will be addressed in follow-up studies.
This work was supported by the National Natural Science Foundation of China under Grant No. 41471354.
Wenzhuo Li received his BS and MS degrees in photogrammetry and remote sensing from Wuhan University, Wuhan, China, in 2011 and 2013, respectively. He is currently pursuing his PhD in photogrammetry and remote sensing with the School of Remote Sensing and Information Engineering, Wuhan University. His current research interests include image segmentation, image classification, land use and land cover changes detection, and object-oriented image analysis.
Kaimin Sun received his BS, MS, and PhD degrees in photogrammetry and remote sensing from Wuhan University, Wuhan, China, in 1999, 2004, and 2008, respectively. He is currently an associate professor in the State Key Laboratory of Information Engineering in Surveying, Mapping, and Remote Sensing, Wuhan University. His research interests include photogrammetry, object-oriented image analysis, and image change detection.
Deren Li received his PhD in photogrammetry and remote sensing from the University of Stuttgart, Stuttgart, Germany, in 1984. Currently, he is a PhD supervisor with the State Key Laboratory of Information Engineering in Mapping, Surveying, and Remote Sensing, Wuhan University, China. He is also an academician of the Chinese Academy of Sciences, the Chinese Academy of Engineering, and the Euro-Asia International Academy of Sciences. His research interests are spatial information science and technology represented by RS, GPS, and GIS.
Ting Bai received her BS degree in GIS from Huazhong Agricultural University, Wuhan, China, in 2014. She is currently pursuing her MS and PhD degrees in photogrammetry and remote sensing with the School of State Key Laboratory of Information Engineering in Surveying, Mapping, and Remote Sensing, Wuhan University. Her current research interests include remote sensing and feature fusion, machine learning, ensemble learning, land use, and land cover changes analysis of long-time series.