Super-resolution reconstruction of high dynamic range images in a perceptually uniform domain

Abstract. Super resolution is a signal processing method that utilizes information from multiple degraded images of the same scene in order to reconstruct an image with enhanced spatial resolution. The method is typically employed on similarly exposed pixel valued images, but it can be extended to differently exposed images with a high combined dynamic range. We propose a novel formulation of the joint super-resolution, high dynamic range image reconstruction problem, using an image domain in which the residual function of the inverse problem relates to the perception of the human visual system. Simulated results are presented, including a comparison with a conventional method, demonstrating that the proposed approach avoids some severe reconstruction artifacts.


Introduction
To obtain high visual quality in digital camera systems, a high spatial resolution is required. The most direct way to increase spatial resolution in a camera is to reduce the size of the sensor pixel elements or to increase the chip size. However, there are applications for which this may not be feasible. For example, to overcome sensor noise at low exposure levels, smaller pixels have to be exposed for a longer duration. Consequently, in terms of hardware, high spatial resolution conflicts with high temporal resolution, which is a requirement for high-quality video systems. Additionally, a longer exposure duration means that frames become susceptible to motion blur effects. Super-resolution reconstruction (SRR), as an alternative, is a signal processing approach to enhance spatial resolution. Multiple low-resolution (LR) images of the same original scene, captured in a sequence, are used to construct one high-resolution (HR) image. A further introduction to super resolution is given in Sec. 1.3.

Camera sensors that use a single fixed exposure duration for each captured image have dynamic range limitations. This is particularly the case for compact cameras, which have a smaller pixel size. The limitation is experienced, e.g., when capturing an image of an outdoor environment on a sunny day or an indoor scene with a window. Certain image areas will then be overexposed or underexposed, which has a large impact in terms of visual quality. The human visual system (HVS) is able to perceive detailed information in bright and dim areas simultaneously, and thus it is undesirable for the camera system to impose this limitation. By taking two or more images of the same scene, using different exposure settings, a single high dynamic range (HDR) image can be created, which is free from saturation. Digital display devices, for their part, also have a low dynamic range (LDR) compared to real-world scenes. The dynamic range limitation for display devices, however, can practically be
overcome by tonemapping HDR image information to an LDR image in such a way that it, to the HVS, resembles the HDR image that it was created from.1 Before proceeding to a review of relevant literature on HDR as well as SRR, the HVS and how it interprets light is briefly introduced.

The Human Visual System
Understanding the main principles of the HVS is of crucial importance to researchers in image processing. When dealing with standard 8-bit pixel image formats, much is abstracted into how those images are coded. For example, in the standard red, green, and blue (RGB) color spaces, data is coded approximately linearly with respect to the perception of brightness in the HVS. This implies that image operations and filters naturally work as expected. Also, storage, compression, or filtering of image data in this domain makes sense, because of the property that pixel value errors have the same perceptual impact across the whole range of pixel values.
Light that is incident on the camera sensor is nonlinear with regard to the HVS in several respects. As will be presented, various camera components are designed to mimic the HVS in order to produce an output image that is perceived to be of high fidelity. Consider a light spectrum observed by the human eye. The incident energy is registered by the photoreceptors of the eye, the three different types of cone cells (corresponding to, roughly, red, green, and blue light), which are responsible for human color vision. The three cone types each have their respective spectral responses. Their combined spectral response can be described in terms of grayscale vision by a single function of wavelength, the luminous efficacy curve. For simplicity, as well as to stay within the main scope of the paper, the discussion of the HVS is from here on reduced to grayscale vision. The luminous efficacy curve is a dimensionless function that tells what fraction of radiant power at any given wavelength contributes to the luminous properties of the HVS. It is nonzero for wavelengths between, roughly, 380 and 700 nm, the so-called visible spectrum.2

The term for the incident power spectrum of light is irradiance. The mathematical inner product of irradiance with the luminous efficacy curve is denoted illuminance. As opposed to irradiance, a radiometric unit, illuminance is defined based on the HVS, and is thus a photometric unit. While illuminance is the light registered by the human photoreceptors, the cones, it is still nonlinear with respect to perceived brightness. A given absolute change in illuminance has a much larger perceived impact at low illuminance levels than at high illuminance levels. Perceived brightness is subjective. Based on tests of human vision on large sample sizes, standardized units such as lightness and luma are defined and used in color spaces for LDR image processing. Table 1 summarizes the terms presented in this section and how they relate to the HVS and digital cameras. An incident light spectrum, irradiance, on a camera sensor is filtered according to the same principle as that of the cones of the eye. Thus, light contributes differently, depending on its wavelength, to the amount of current that is excited from sensor elements. The term illuminance is used here also when referring to the sensor output due to the filtered light, even though the term is really defined for the HVS. An image domain that is an approximation of perceived brightness, i.e., which is approximately linear to perceived brightness, will henceforth be called a perceptually uniform (PU) domain. Examples of (grayscale) PU domains are lightness and luma; however, they are only designed for LDR images. Any PU domain corresponds to a concave nonlinear mapping from the illuminance domain.
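As a toy illustration of the concave mapping just described, the short sketch below shows that a fixed illuminance error has a far larger impact in a PU domain when it occurs in a dim area than in a bright one. The power-law exponent 1/2.2 is a common display-gamma value used here only as a stand-in for a true lightness curve, not a standardized formula.

```python
import numpy as np

# Illustrative concave PU mapping: a power law on normalized illuminance.
# The exponent 1/2.2 is an assumption for the example, not a standard.
def to_pu(illuminance, gamma=1 / 2.2):
    """Map normalized illuminance in [0, 1] to an approximately PU domain."""
    return np.asarray(illuminance) ** gamma

# The same absolute illuminance error, evaluated in a dim and a bright area.
err = 0.01
dark_impact = to_pu(0.02 + err) - to_pu(0.02)
bright_impact = to_pu(0.80 + err) - to_pu(0.80)
print(dark_impact > 5 * bright_impact)  # True: errors in dim areas dominate
```

This is the core motivation for the proposed method later in the paper: equal illuminance-domain residuals are not equally visible.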

High Dynamic Range Images
The HVS can adapt to illuminance levels spanning about 10 orders of magnitude, and at any given adaptation level about five orders of magnitude are perceived. As mentioned, for a given exposure setting, camera sensors have a lower dynamic range than that of many natural scenes (and of the HVS), which will cause saturation of sensor elements, either from over- or underexposure. Thus, the sensors are said to have a LDR. In HDR image reconstruction, a single HDR image is constructed from two or more LDR images captured with different exposure settings. The LDR images must be aligned and should be captured in a quick sequence, such that local motion in the scene is kept to a minimum. The images are merged by a weighted average in a PU image domain. Weights of saturated pixels are naturally set to zero. Given a set of LDR images in the pixel value domain, their corresponding sensor exposures are retrieved by an inverse mapping with the camera response function (CRF). Illuminance domain images are subsequently obtained by scaling the images with the inverse of their respective exposure durations. Finally, the images are transformed to, e.g., the log-illuminance domain, which is an approximation of lightness (because the perceived brightness of the HVS is much closer to logarithmic than linear in illuminance), where the weighted average is taken.3

To reproduce an image of an HDR scene on a display device, e.g., a computer monitor, it has to be tonemapped to the lower dynamic range of the display device.1,4 The tonemapping essentially compresses the dynamic range uniformly in a suitable domain, but more sophisticated tonemappers also perform local processing. For example, image contrasts may be compressed more on edges between image segments than on details within image segments. Additionally, suitable color spaces are used in order to maintain high color fidelity. The output LDR image, typically stored in a standard 8-bit format such as sRGB, is perceived by the HVS to still contain HDR information. HDR applications have recently started to reach the general public. For example, smartphone applications that take two images in a sequence and generate a merged HDR image have reached widespread use, and similar features are common in modern commercial cameras. The result is often satisfactory, although artifacts sometimes occur due to spatial misalignment.
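A minimal sketch of the merging procedure described above. The gamma-curve CRF and the exposure values are assumptions for the example; a real pipeline would use a calibrated CRF and smoother weighting functions.

```python
import numpy as np

def merge_hdr(ldr_images, exposure_times, gamma=2.2, eps=1e-12):
    """Merge differently exposed LDR images (values in [0, 1]) into one HDR
    illuminance image via a weighted average in the log-illuminance domain."""
    log_sum = np.zeros_like(ldr_images[0], dtype=float)
    weight_sum = np.zeros_like(log_sum)
    for y, dt in zip(ldr_images, exposure_times):
        exposure = y ** gamma                        # invert the assumed CRF
        illuminance = exposure / dt                  # photometric alignment
        w = ((y > 0.0) & (y < 1.0)).astype(float)    # zero weight if saturated
        log_sum += w * np.log(illuminance + eps)
        weight_sum += w
    return np.exp(log_sum / np.maximum(weight_sum, 1.0))

# Usage: the second pixel saturates in the long exposure and is therefore
# recovered from the short exposure alone.
short_exp = np.array([[0.50, 0.40]])
long_exp = np.array([[0.90, 1.00]])
hdr = merge_hdr([short_exp, long_exp], [0.01, 0.08])
```

The averaging domain (log-illuminance) matters: averaging raw illuminance instead would let the bright exposures dominate the perceived dim regions.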

Super-Resolution Reconstruction
The signal processing technique SRR makes use of multiple degraded LR images of the same original scene.5,6 In order to be useful, each image must contain new information about the scene, which is the case if the images are shifted relative to each other on a subpixel scale (e.g., from natural camera movement) or if they are blurred by different blur functions.
In the corresponding mathematical model, this means that the images each contribute linearly independent observations of the desired HR image representation. To reconstruct an HR image, an inverse problem is solved, including image registration (alignment) with high precision as well as blur estimation, either as pre-processing or jointly with estimating the HR image. A property of SRR is that the resulting inverse problem is fundamentally ill-conditioned, which presents major challenges.7,8 If a larger number of observations is used, the conditioning improves, but only slightly. Furthermore, because complex motion between frames is undesired, a relatively small number of images is preferred, so that they are close temporally. Thus, to improve the conditioning, a regularization term is often added to the problem formulation. The regularization function should enforce a solution that matches the statistics of the image, or rather of natural images in general, since the statistics of the unknown image are unavailable. Bayesian interpretations of the SRR problem, in which the unknown image is viewed as stochastic, are common.9,10 Similarly to deterministic methods, they lead to minimizing a (regularized) objective function. A prior is assumed on the HR image and often also on other system parameters that should be estimated. For example, fine-scale image registration can be cast as a stochastic problem, with a prior selected for the registration error that remains after a rough pre-processing registration. Blind super resolution (BSR) addresses the joint estimation of the HR image and blur kernels (which include subpixel-level shifts).11 BSR has clear similarities with multichannel blind deconvolution,12 with the added extension of downsampling. BSR is very challenging in terms of finding the global optimal point, due to its nonconvex structure. Often, simple parameterizations of the blur kernel are made, or it is assumed static to model only the blur effect of the sensor. Applications of SRR exist both for offline processing, such as astronomy and forensics, and for real-time processing, as for video streaming. Work on real-time implementations is increasingly common, including SRR implementations on custom hardware such as field-programmable gate arrays. Custom sensors may also be used, which can, e.g., increase pixel size for a desired improvement of temporal resolution in situations with moving objects that are susceptible to motion blur.13 The spatial resolution is then maintained by performing SRR on the frames with reduced resolution. Optical flow models are implemented to handle local object motion within the image frames.

Super-resolution Reconstruction of HDR Images
Traditionally, SRR is performed on similarly exposed, LDR, pixel valued images. However, during the past few years, SRR has also been applied to HDR images.14,15 The combined dynamic range of the reconstructed HR, HDR image has no restrictions, as the exposure durations of the input images can be selected freely. A good reconstruction result requires accurate image registration, both for spatial shifts and to photometrically align the images in an HDR image domain. Facing both geometric and photometric registration adds some extra challenges, as compared to only geometric registration.16 Ambitious methods implement optical flow techniques to handle local spatial motion within the images.17 If the input images to the HDR SRR are given as 8-bit pixel valued images, the CRF is needed in order to retrieve the illuminance information from the earlier stage of the camera processing chain. However, if the device was designed to perform HDR SRR, the illuminance information (of a higher bit depth) could be more readily available.
A common factor of the published work on SRR for HDR images is that the reconstruction takes place in the illuminance domain, which, as discussed in Sec. 1.1, is highly nonlinear to human perception.18 This is opposed to the case of LDR SRR methods, where the reconstruction is performed in the PU pixel value domain. Because illuminance errors at low illuminance levels have such a high perceived impact, reconstruction errors in dim image areas cause noticeable artifacts. Therefore, in this work, the HDR SRR is performed in a tonemapped, PU domain. The HDR illuminance information is thus mapped by a nonlinear tonemapping function, modifying the inverse SRR problem from a least squares (LS) problem to a nonlinear problem of equal dimension. Note that the choice of tonemapping function is nontrivial; all such functions aim to mimic the HVS but should merely be seen as approximations.1,4 In the formulation of the proposed method, a simple tonemapping function is used, which nevertheless captures the main characteristics of the HVS well (but which should not be the preferred choice for the application of visualizing HDR data). Methods for image registration or the estimation of other system parameters are outside the scope of this paper; the focus is on the formulation of the inverse problem used in the SRR. The remainder of this paper is organized as follows. Section 2 describes the camera model for the image data acquisition. In Sec. 3, a method for SRR on HDR images in a PU domain is proposed. Section 4 provides experimental results for the proposed method and a related illuminance domain approach. Finally, Sec. 5 concludes the paper.

Camera Model
When acquiring an image, the camera sensor is illuminated by a real-world scene for a given exposure duration. Let the image X, of size X_1 × X_2, denote an HR representation with dynamic range X_max/X_min of an original continuous scene, and let x be its corresponding vectorized representation of length (X_1 X_2) × 1 ≜ n × 1. It is assumed that the camera provides a set of low-resolution and LDR images y_k according to the expression

y_k = f(Δt_k D C_{H_k} x + n_k) + q_k,  k = 1, ..., K.  (1)

For each of the multiple observations, C_{H_k}, of size n × n, performs 2-D convolution on the vectorized HR image x. Its convolution kernel H_k, of support H_1 × H_2, represents blurring as well as planar small-scale spatial shifts (assuming that rough image registration can be performed as pre-processing) for y_k relative to a reference image (e.g., y_1). The downsampling matrix D, of size (n/L^2) × n, decimates the spatial resolution by a factor L in the x- and y-directions, and Δt_k is the exposure duration. Additive Gaussian noise n_k of variance σ_n^2 models the noise in the image sensor. In addition, by somewhat exaggerating the value of σ_n^2, other error sources can be contained in n_k. The y_k images are quantized to 8-bit pixel values, with uniformly spaced quantization levels Y ∈ {0, ..., 1}. This quantization effect is modeled by q_k, while quantization in the A/D converter to a higher bit depth is considered to be part of n_k.
The low-resolution exposure of the camera sensor is

e_k = Δt_k D C_{H_k} x + n_k.

The e_k vector is, similar to y_k, n_k, and q_k, of size (n/L^2) × 1. At each pixel i, the exposure [e_k]_i is mapped by the pixelwise, nonlinear CRF, f, to a pixel value in the output image. A reasonable (simplified) model of a real CRF, adopted for the simulations in this paper, is

f(e) = 0 for e < E_min,
f(e) = [(e − E_min)/(E_max − E_min)]^γ_LDR for E_min ≤ e ≤ E_max,
f(e) = 1 for e > E_max,

where γ_LDR < 1 makes the curve concave in the unsaturated region. What causes the y_k images to be of LDR is the fixed operational range of exposure values of the CRF, [E_min, E_max]. Exposure values outside of this interval are clipped and cannot be recovered (from that single image).
The CRF is said to perform gamma compression on the exposure image, to an LDR, PU output domain.
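The acquisition chain above can be simulated at toy scale as follows. The parameter values, the Gaussian sensor noise level, and the 2×2 box blur standing in for C_{H_k} are all assumptions chosen for the example, not values from the paper.

```python
import numpy as np

E_MIN, E_MAX, GAMMA_LDR = 0.01, 1.0, 1 / 2.2  # illustrative CRF parameters

def crf(e):
    """Simplified CRF: clipping to [E_MIN, E_MAX] plus gamma compression."""
    e = np.clip(e, E_MIN, E_MAX)
    return ((e - E_MIN) / (E_MAX - E_MIN)) ** GAMMA_LDR

def observe(x_hr, dt, L=2, sigma_n=1e-3, bits=8, seed=0):
    """One observation y_k: box blur (stand-in for C_H), decimation by L
    (the matrix D), exposure by dt, sensor noise, CRF, 8-bit quantization."""
    b = 0.25 * (x_hr + np.roll(x_hr, 1, axis=0)
                + np.roll(x_hr, 1, axis=1)
                + np.roll(np.roll(x_hr, 1, axis=0), 1, axis=1))
    lr = b[::L, ::L]                                   # downsampling D
    e = dt * lr + np.random.default_rng(seed).normal(0.0, sigma_n, lr.shape)
    levels = 2 ** bits - 1
    return np.round(crf(e) * levels) / levels          # quantization q_k

x = np.full((8, 8), 0.5)          # flat HR scene
y_short = observe(x, dt=0.05)     # exposure stays inside [E_MIN, E_MAX]
y_long = observe(x, dt=40.0)      # overexposes: e >> E_MAX, clips to 1
```

The overexposed observation carries no recoverable information for those pixels, which is exactly what the weight matrices of the inverse problem must account for.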

Alternative Camera Models
It is interesting to note that the camera model used for LDR SRR on pixel domain images differs, without exception, from the camera model used in HDR SRR methods. The forward model assumed for LDR SRR is

y_k = D C_{H_k} f(Δt x) + n_k + q_k = D C_{H_k} z + n_k + q_k,  k = 1, ..., K,  (3)

for which the order of f and D C_{H_k} is interchanged as compared to Eq. (1).5 From the set of observations, {y_k}, the desired pixel valued HR image z = f(Δt x) is to be reconstructed. The model in Eq. (3) is thus used in the formulation of the inverse problem, where it relates the unknown HR image to the actual observed LR data. The inconsistency in the choice of camera model gives rise to the question of which model, Eq. (1) or Eq. (3), is closer to the real physics of the camera. One of the two, LDR SRR or HDR SRR, has a larger model mismatch that comes into play when real-world data is considered. The choice of camera model is not investigated further in this paper.

Image Reconstruction in a Perceptually Uniform Domain
To reconstruct, or rather to find a good estimate of x, an inverse problem should be formulated and solved. To do so, knowledge of the system parameters in H_k, including blur functions and subpixel-precision relative motion between image frames, is necessary. Assuming these, as well as the CRF f, to be known or to have been estimated, a compact notation is introduced in the beginning of this section to aid the inverse problem formulations. Then, in the remainder of the section, three different HDR SRR formulations are described, one which has an objective function that is expressed in the illuminance domain, and the other two in which a PU domain is employed. For mathematical convenience, the impact of the quantization noise q_k will be neglected in the inverse problem formulation (quantization is nevertheless used when generating y_k), as it typically has little impact on reconstruction results compared to other error sources, e.g., n_k (and, in the more general case, to errors in system parameter estimates).
To align the observed images y_k photometrically, the corresponding illuminance domain images i_k (of the same size) are recovered by mapping each pixel value in y_k with the inverse (not in the strict sense, due to the clipping effect) CRF, g, followed by a scaling with 1/Δt_k. To handle saturation in f, each observation y_k has a corresponding diagonal weight matrix W_k, of size (n/L^2) × (n/L^2), which is zero for diagonal elements corresponding to saturated pixels in y_k and one for the remaining diagonal elements. Saturated pixels are those where [e_k]_i ∉ [E_min, E_max]. However, since only the y_k are at hand, saturated pixels are taken to be those where [y_k]_i is either 0 or 1, the lowest and the highest quantization levels, respectively. Based on the camera model in Eq. (1),

W_k i_k = W_k (1/Δt_k) g(y_k) = W_k (D C_{H_k} x + n_k/Δt_k),

where the second equality holds if the quantization effect is neglected, which means that g(f(·)) is the identity function for the unsaturated region of f. By stacking the individual LR, LDR illuminance domain observations in i = [i_1^T, ..., i_K^T]^T, the noise vectors in n = [n_1^T/Δt_1, ..., n_K^T/Δt_K]^T, the system matrices in ℋ = [(D C_{H_1})^T, ..., (D C_{H_K})^T]^T, and the weight matrices in the block-diagonal W = diag(W_1, ..., W_K), the stacked relation Wi = W(ℋx + n) is obtained. More generally, W could be designed to weigh observed data differently, as a function of, e.g., its pixel value and exposure duration, according to the noise properties of Eq. (1).14 In the image reconstruction, the objective is to minimize the difference, in some sense, between Wℋx and the faithfully observed (unsaturated) information Wi. Thus, a distance measure of the residual function r(x) should be minimized in order to obtain

x̂ = arg min_x ρ[r(x)],  (8)

where ρ is a nonnegative, norm-like function. The objective functions to be minimized in this paper all have the structure above. The rank of Wℋ depends on the number of observations K, and specifically on the unsaturated regions of the images y_k, as well as on the selection of L.
For example, if there is no saturation (which, however, implies that x cannot be an HDR image) and L = 2, then K = L^2 = 4 images or more implies that Wℋ is full rank. However, due to the structure of Wℋ, or more precisely due to the blur functions H_k contained within, the inverse problem is inherently ill-conditioned. Thus, even for a full-rank problem, the solution is very sensitive to noise. By adding a regularization term of a certain weight, the condition number of the overall problem improves, at the cost of removing fidelity from the data term. The remainder of this section presents three specific formulations of the general inverse problem in Eq. (8). These are then evaluated in Sec. 4.
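The photometric alignment and weighting described above can be sketched as follows, reusing hypothetical CRF parameters (E_min, E_max, and γ_LDR are assumptions for the example):

```python
import numpy as np

E_MIN, E_MAX, GAMMA_LDR = 0.01, 1.0, 1 / 2.2  # assumed CRF parameters

def f(e):
    """Simplified CRF with clipping and gamma compression."""
    e = np.clip(e, E_MIN, E_MAX)
    return ((e - E_MIN) / (E_MAX - E_MIN)) ** GAMMA_LDR

def g(y):
    """Pseudo-inverse of f; exact only on the unsaturated range of f."""
    return E_MIN + (E_MAX - E_MIN) * y ** (1.0 / GAMMA_LDR)

def align_photometrically(y_k, dt_k):
    """Return the illuminance-domain image i_k = g(y_k)/dt_k and the
    diagonal of W_k (zero for saturated pixels, one otherwise)."""
    i_k = g(y_k) / dt_k
    w_k = ((y_k > 0.0) & (y_k < 1.0)).astype(float)
    return i_k, w_k

# g(f(.)) is the identity on the unsaturated region, as required.
e = np.array([0.05, 0.30, 0.90])
assert np.allclose(g(f(e)), e)
```

Saturated pixels get zero weight rather than a wrong illuminance value, so they simply contribute nothing to the residual.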

The Least Squares Solution
Under the assumption of additive Gaussian noise n, the LS estimate of x is optimal in the maximum likelihood sense. However, the corresponding inverse problem is equivalent to minimizing the sum of squared reconstruction errors in the illuminance domain, a domain that is nonlinearly related to the error perceived by the HVS, yet it is still used for SRR of HDR images.14,15,17 The LS estimate is included mainly as a benchmark for the proposed method. In order to improve the condition number of the inverse problem and provide a stable solution, a regularization term is appended. Thus, the LS estimate is

x̂_LS = arg min_x { ‖W(ℋx − i)‖_2^2 + λ‖Γx‖_2^2 },  (9)

for some linear regularization matrix Γ of size n × n with regularization weight parameter λ. Because images are typically piecewise smooth, Γ is selected as a matrix that performs 2-D convolution on the vectorized image x with a 3 × 3 Laplacian kernel, which enforces a smooth solution by penalizing the second-order x- and y-derivatives. The residual vector here is denoted r_LS(x), i.e., r_LS(x) = [(W(ℋx − i))^T, √λ (Γx)^T]^T. In the above case, the objective function is

O_1(x) = Σ_i r_LS,i(x)^2,

where r_LS,i(x) denotes the i'th component function in r_LS(x). Because the system of equations is very sparse, an analytic solution is computationally feasible for a fairly large problem size.
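On a toy scale, the regularized LS estimate can be computed by stacking the weighted data term and the scaled regularizer into one linear LS problem. Dense matrices and a 1-D circular difference operator stand in for the sparse 2-D operators of the paper; the identity "blur" and noise level are assumptions for the example.

```python
import numpy as np

def solve_regularized_ls(H, i_obs, W, Gamma, lam):
    """Minimize ||W(Hx - i)||^2 + lam * ||Gamma x||^2 by stacking both
    terms and solving one linear least-squares problem."""
    A = np.vstack([W @ H, np.sqrt(lam) * Gamma])
    b = np.concatenate([W @ i_obs, np.zeros(Gamma.shape[0])])
    x_hat, *_ = np.linalg.lstsq(A, b, rcond=None)
    return x_hat

# Trivial usage example: identity "blur", no saturation, mild noise.
n = 16
rng = np.random.default_rng(1)
H = np.eye(n)
x_true = rng.random(n)
i_obs = x_true + rng.normal(0.0, 0.01, n)
W = np.eye(n)
Gamma = np.eye(n) - np.roll(np.eye(n), 1, axis=1)  # 1-D difference stand-in
x_hat = solve_regularized_ls(H, i_obs, W, Gamma, lam=1e-3)
```

For realistic image sizes the same estimate would be computed with sparse operators or iterative solvers rather than dense matrices.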

The Proposed Objective Function
For the proposed objective function, the residual function r(x) is formulated in such a way that errors perceived by the HVS are penalized uniformly in the reconstruction. The tonemapping function f̃ used here is taken to be a power law function, like f, but with a different exponent, γ_HDR. Because the method that is used to generate an HDR ground truth image (from a set of LDR images) only provides relative illuminance information, and not the true absolute values,3 the illuminance information, i, is normalized before applying f̃. The same notation is kept. The normalized illuminance information is then tonemapped to a PU domain by

f̃(i) = i^γ_HDR.

In this work, no special care is taken to customize the value of γ_HDR depending on the dynamic range of i. The proposed estimate is

x̂_PU = arg min_x { ‖W[f̃(ℋx) − f̃(i)]‖_2^2 + λ‖Γx‖_2^2 },  (11)

with the corresponding residual vector r_PU(x) = [(W[f̃(ℋx) − f̃(i)])^T, √λ (Γx)^T]^T. It should be noted that to instead minimize f̃[W(ℋx − i)], i.e., the tonemapped difference, would be incorrect, since the absolute illuminance level determines how sensitive the perception of the HVS is. The objective function of Eq. (11) is

O_2(x) = Σ_i r_PU,i(x)^2,

where r_PU,i(x) denotes the i'th component function in r_PU(x).
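A sketch of the proposed residual, assuming the power-law tonemap f̃ and a simple 1-D difference stand-in for the regularizer Γ. The key point is that the tonemap is applied to ℋx and to i separately, never to their difference.

```python
import numpy as np

GAMMA_HDR = 1 / 6  # exponent of the power-law tonemap (value used later)

def tonemap(i):
    """Power-law tonemap f~(i) = i ** GAMMA_HDR on normalized illuminance."""
    return np.asarray(i) ** GAMMA_HDR

def r_pu(x, H, i_obs, w, Gamma, lam):
    """Stacked residual: PU-domain data term plus scaled regularizer."""
    data = w * (tonemap(H @ x) - tonemap(i_obs))
    reg = np.sqrt(lam) * (Gamma @ x)
    return np.concatenate([data, reg])

# For a perfect reconstruction the data term vanishes, while the
# regularizer still contributes to the residual.
n = 8
H = np.eye(n)
x = np.linspace(0.1, 0.9, n)
Gamma = np.eye(n) - np.roll(np.eye(n), 1, axis=1)
res = r_pu(x, H, H @ x, np.ones(n), Gamma, lam=0.05)
```

Because the tonemap makes the data term nonlinear in x, this residual must be minimized iteratively, as described next.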
To solve the nonlinear minimization problem in Eq. (11), an iterative nonlinear programming method must be employed. An initial estimate, x^(0), may be taken as an interpolated version of an i_k image. Furthermore, let x^(k) be the estimate of x at iteration k; then the estimate is updated according to

x^(k+1) = x^(k) + α^(k) d^(k),  (13)

where d^(k) is the search direction and α^(k) is the step length at iteration k. The search direction is of great importance with regard to the convergence rate. The first-order gradient descent method is typically very slow to converge for the SRR problem. Because of that, higher order information is considered in order to determine d^(k). A second-order Taylor approximation of O = ρ[r(x)] in the neighborhood of x^(k), as used in Newton methods, is

O(x) ≈ O(x^(k)) + G^T (x − x^(k)) + (1/2)(x − x^(k))^T H (x − x^(k)),

where G is the gradient vector and H is the Hessian matrix of O. For the specific case of a (possibly) nonlinear LS objective function, as for O_2 (and O_1), the gradient and Hessian can be written in terms of r as

G = 2 J_r^T r  (15)

and

H = 2 J_r^T J_r + S,  (16)

where J_r is the Jacobian matrix of r and S collects the second-order derivative terms of the r_i. Note that O(x), r(x), G(x), and H(x) all depend on x, and thus need to be re-evaluated for each update in the iterative optimization method. However, for compactness, this dependency will not always be emphasized by writing out the argument. The commonly used Gauss-Newton (GN) method fits nicely for the optimization problem in Eq. (11), which consists of a data term r_PU,1:m(x) and a regularization term r_PU,m+1:m+n(x). The data term elements have small residual values for x close to the true image. The regularization term is designed such that the true image should be penalized as little as possible, but in practice its contribution to the residual is certainly not zero. Because the residual elements are small near x, the Hessian H in Eq. (16) is well approximated by H_GN = 2 J_r^T J_r, dropping the term S, which contains the second order derivatives of the r_i. At each iteration, the linear system

2 J_r^T J_r d_GN = −G

is solved, and d_GN is obtained as the search direction to be used in the iterative minimization in Eq. (13). By taking the inner product of the gradient and the search direction d_GN, it is seen that the GN search direction is indeed a descent direction. This is a nice property of the GN method, and it is a consequence of the positive definiteness of H_GN. Standard GN does not include any step size α^(k) in its updates. However, while H_GN = 2 J_r^T J_r is positive definite (by construction), J_r is only valid in a region around the x^(k) for which it was evaluated, thus the update step may be taken too large. By including a line search method to determine the step size at each iteration, the robustness of the method is increased. The extended method is known as damped Gauss-Newton, and it is used here to minimize O_2.
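A compact damped Gauss-Newton sketch, using a numerical Jacobian and a simple backtracking line search; this is illustrative, not the exact solver used in the paper, and it is demonstrated on a small generic nonlinear LS problem rather than the SRR objective.

```python
import numpy as np

def numerical_jacobian(r, x, h=1e-6):
    """Forward-difference Jacobian of a residual function r at x."""
    r0 = r(x)
    J = np.empty((r0.size, x.size))
    for j in range(x.size):
        xp = x.copy()
        xp[j] += h
        J[:, j] = (r(xp) - r0) / h
    return J

def damped_gauss_newton(r, x0, iters=20):
    """Minimize ||r(x)||^2 with GN directions and backtracking steps."""
    x = x0.astype(float)
    for _ in range(iters):
        res = r(x)
        J = numerical_jacobian(r, x)
        # GN direction: least-squares solution of J d = -res,
        # i.e., (J^T J) d = -J^T res.
        d, *_ = np.linalg.lstsq(J, -res, rcond=None)
        alpha, obj = 1.0, res @ res
        while alpha > 1e-8:                  # backtracking line search
            trial = r(x + alpha * d)
            if trial @ trial < obj:
                break
            alpha *= 0.5
        x = x + alpha * d
    return x

# Usage: fit p to r(p) = p**2 - a, i.e., an elementwise square root.
a = np.array([4.0, 9.0])
p = damped_gauss_newton(lambda p: p * p - a, np.array([1.0, 1.0]))
```

The damping is what makes the full GN step safe far from the solution; near the solution the accepted step length is typically α = 1.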

Reconstruction Using Robust Norm and Robust Regularization
So far, the methods for reconstructing x have involved minimizing an objective function that is the L2-norm of a residual vector. The LS solution is suitable when the residual, at the optimum, has a Gaussian distribution. This was the case for the (unsaturated) observations i in the LS formulation of r_LS in Eq. (9). The residual function r_PU(x) in Eq. (11), however, does not follow a Gaussian distribution. Thus, using another norm function could give a more suitable estimate. A second reason to consider functions other than the L2-norm is that any errors in the system parameters of ℋ (which would be the case for a practical setup, where the system parameters themselves need to be estimated), as used in the inverse reconstruction model, give a model mismatch with the observation model. As mentioned, a regularization term is usually added to avoid noise amplification and thus stabilize the solution. Depending on the amount of noise, the weight parameter λ for the regularization term should be adjusted accordingly. The larger λ is, the more important it becomes that the solution preferred by the regularization term matches the true statistical properties of x. For this purpose, large values of r_{m+1:m+n}(x) should be interpreted as edges, which are a natural part of images, and should not be penalized. The Lorentzian "norm" (not really a norm, since it violates the triangle inequality),

ρ_Lor(r) = Σ_i log[1 + (1/2)(r_i/T_i)^2],

where T_i > 0 are constant parameters, has been introduced and has shown promising results for SRR in which several different noise models were tested.19 The T_i are interpreted as threshold values. For r_i smaller than T_i, ρ_Lor(·) acts as the L2-norm and penalizes residual components quadratically, while for r_i greater than T_i, ρ_Lor(·) acts more like the L1-norm, making the reconstruction less sensitive to large residuals that correspond to outliers in the data term or edges in the regularization term. In this regard, the simple linear Tikhonov regularization combined with the Lorentzian norm achieves the same edge-preserving property that motivates the use of the bilateral total variation regularization.6,19 Using the same residual function r = r_PU as for O_2, combined with the Lorentzian norm, a third estimate of x is

x̂_Lor = arg min_x ρ_Lor[r_PU(x)].  (21)

The corresponding objective function is

O_3(x) = Σ_i log[1 + (1/2)(r_PU,i(x)/T_i)^2],  (22)

where T_1 and T_2 are the threshold values of the Lorentzian norm for the data term and the regularization term, respectively. The nonlinear objective is minimized iteratively. The GN method used for Eq. (11) does not fit this problem. Instead, the update equation for the iterative minimization of Eq. (22) is

x^(k+1) = x^(k) − α^(k) P^(k) G^(k),  (23)

where P^(k) is a linear transformation performed on the gradient vector G^(k) = G(x^(k)). If P^(k) is the inverse Hessian matrix [H(x^(k))]^(−1), the iterative update in Eq. (23) is the Newton method. However, P^(k) may be any linear transformation; as long as it gives a higher convergence rate than that of the gradient direction, it is beneficial. It is preferably positive definite, as this guarantees a descent direction in the neighborhood of x^(k). The choice P^(k) = (J_r^T J_r)^(−1) in Eq. (23) has empirically proven to give a high convergence rate for Eq. (21), and is thus used in the experiments.
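The behavior of the Lorentzian penalty can be illustrated directly. The log(1 + u^2/2) form below is one common parameterization from robust statistics; the exact form used in the cited work may differ.

```python
import numpy as np

def rho_lorentzian(r, T=0.1):
    """Lorentzian penalty: quadratic for |r| << T, slow growth for |r| >> T."""
    u = np.asarray(r, dtype=float) / T
    return np.sum(np.log(1.0 + 0.5 * u ** 2))

small, large = 0.01, 1.0
# Near zero it matches the quadratic 0.5 * (r/T)**2 ...
assert abs(rho_lorentzian([small]) - 0.5 * (small / 0.1) ** 2) < 1e-4
# ... while a 100x larger residual costs only ~800x more, not 10000x,
# so outliers in the data term and edges hit by the regularizer are
# penalized far less severely than under the L2-norm.
ratio = rho_lorentzian([large]) / rho_lorentzian([small])
```

This soft saturation is exactly the edge-preserving property discussed above: an edge produces one large regularization residual, which the Lorentzian mostly ignores while the L2-norm would smooth it away.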

Experimental Results and Discussion
In this section, the three SRR formulations presented in Sec. 3 are evaluated in terms of experimental results. To compare the quality of the estimates, x̂_LS, x̂_PU, and x̂_Lor, their respective tonemapped results are presented. This provides a subjective comparison. As objective measures, PSNR and MSSIM values are given for the tonemapped results relative to the tonemapped original image x.20 The tonemapping function in MATLAB is used to display all HDR images, as it gives more visually appealing images than tonemapping with f̃. The pixelwise function f̃ is used in O_2 and O_3 in order to tonemap the normalized illuminance domain data, thus achieving a residual function r_PU that is PU with respect to the HVS. There is no established value for the exponent γ_HDR of f̃. Instead, it is selected here as γ_HDR = 1/6, based on empirical tests. Thus, it is lower than that of the LDR case, γ_LDR = 1/2.2. It is noted that the authors of the image appearance model iCAM similarly suggest that an exponent of 1/6 is required in order to clearly perceive details in dim areas (that is, to make that data roughly perceptually uniformly coded) when using a power law function such as f̃ for visualization of an HDR image.22 In their updated model, iCAM06, the HDR data is raised by an exponent of 1/3 at one step and later by an additional exponent of 0.7. Due to the use of several color spaces and additional manipulation, however, no clear value for the total gamma is given.1 As a lowest acceptable performance limit for SRR methods, consider the interpolated images in Figs. 1(c) and 2(c). These have been obtained by first creating an LR, HDR image using perfectly registered LR, LDR images, one for each unique exposure duration, and the exact g function for photometric alignment, followed by bicubic interpolation in order to increase the resolution by a factor of L = 2.
There is a slight improvement from using the Lorentzian norm of r_PU, as in O_3, rather than the L2-norm, as in O_2, but it is minor. Simulations using imperfectly estimated H_k in the SRR do not seem to further enhance the performance difference between the robust Lorentzian norm and the L2-norm; however, more rigorous investigations of this case would be interesting.
For the Mount Tam image, additional reconstruction results for the objective function O_1 are shown in Figs. 2(g)-2(i), with λ = 10, 0.1, and 10^−6, respectively. It is seen that the increased value of λ, from 0.05 to 0.1, gives a less noisy result compared to the LS reconstruction in Fig. 2(d); however, the edge artifacts are magnified. A too-high value of λ leads to oversmoothing, while a too-low value of λ results in severe noise amplification due to the poor condition number of the inverse problem. For the reconstruction result in Fig. 2(i), with λ = 10^−6, K = 8 images are used, four for each of the two different exposure durations, which means that the problem is full rank. Thus, it is clear that, due to the ill-conditioned structure, regularization is necessary, which in turn gives numerical reconstruction errors, particularly at image edges. These errors have a large perceptual impact at low illuminance levels, as the results show when performing SRR in the illuminance domain as for O_1. For O_2 and O_3, as an example, setting λ for SRR of the Mount Tam image to 0.01 or 0.1, instead of the current choice of 0.05, does not cause much noticeable difference in the results. The SRR method presented in this paper could straightforwardly be applied to color images by treating each RGB color channel individually. This gives satisfactory results, even though it is not optimal in terms of color fidelity. For optimal color fidelity, the channels should instead be coded jointly, as in the LDR L*a*b color space.

Conclusions
This paper treats SRR of differently exposed LDR images. A formulation of the HDR SRR problem in which the perceived difference is minimized in an appropriate, PU image domain has been proposed. As shown in the experimental results, even small numerical reconstruction errors have a large perceptual impact in dim image areas if the illuminance domain is used in the objective function, due to the nonlinear relation between the illuminance domain and the perceived brightness of the HVS. This issue is overcome by the proposed approach. Because the proposed residual function is non-Gaussian, an alternative to the L2-norm was suggested, which, however, only gave a minor improvement in the presented experiments.
Two HDR images are used in the presented experiments, Memorial Church and Mount Tam (images are courtesy of Greg Ward, available at Ref. 21), displayed in Figs. 1(a) and 2(a), respectively. From each of these images, a set of observed LR, LDR images is generated.
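A hypothetical sketch of how one observed LR, LDR image can be simulated from an HDR original: scale by the exposure duration, decimate by the factor L with block averaging (a simple stand-in for the true blur/decimation operator Hk), and apply a saturating linear camera response as the photometric function g. Neither the averaging kernel nor the linear response is claimed to match the paper's exact simulation model.

```python
import numpy as np

def simulate_lr_ldr(hdr, exposure, L=2):
    """Simulate one observed LR, LDR image y_k from an HDR scene."""
    exposed = hdr * exposure                      # exposure scaling
    h, w = exposed.shape
    h, w = h - h % L, w - w % L                   # crop to a multiple of L
    lr = exposed[:h, :w].reshape(h // L, L, w // L, L).mean(axis=(1, 3))
    return np.clip(lr, 0.0, 1.0)                  # g: linear response with saturation
```

Calling this with several exposure durations yields the differently exposed image set: short exposures keep bright areas unsaturated, while long exposures lift dim areas above the noise floor but clip highlights.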

Fig. 1 Top row, from left to right: (a) Original image Memorial Church. (b) The observed images yk. (c) Bicubic interpolation of an LR, HDR version of Memorial Church. Bottom row: reconstruction results using (d) O1, (e) O2, and (f) O3.
Figures 1(d)-1(f) show the reconstructed images x̂LS, x̂PU, and x̂Lor, obtained from minimizing O1, O2, and O3, respectively, for the Memorial Church image. The regularization weight parameters used are λ = 0.02, 0.03, and 0.02, respectively. Similarly, the results for the Mount Tam image are shown in Figs. 2(d)-2(f), all acquired with λ = 0.05. The threshold values for the Lorentzian norm in O3 are set to T1 = T2 = 0.1 for both experiments, selected to give the highest possible MSSIM value, the same criterion as used for selecting λ (parameter values were selected empirically). PSNR and MSSIM values for the tonemapped reconstruction results are given in Table 2. Since PSNR is a crude measure of image quality, emphasis should be on the MSSIM values. Edge artifacts are clearly visible in the dim regions of the LS illuminance domain estimate x̂LS. These artifacts are not present for the SRR results in the PU domain f∼(·). Thus, visual inspection is needed to complement the PSNR or MSSIM values, which do not fully convey the results of the reconstruction. However, the proposed objective function O2 and its
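The objective evaluation and the empirical parameter selection described above can be sketched as follows. `psnr` is the standard definition; `select_param` is a hypothetical helper that picks the candidate value (e.g. λ or T) whose reconstruction scores highest under a supplied quality metric such as MSSIM, for which the `reconstruct` and `quality` callables are assumptions of this sketch.

```python
import numpy as np

def psnr(ref, est, peak=1.0):
    """Peak signal-to-noise ratio in dB between two images."""
    mse = np.mean((np.asarray(ref) - np.asarray(est)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

def select_param(candidates, reconstruct, quality):
    """Empirical parameter selection: keep the candidate whose
    reconstruction maximizes the quality score."""
    return max(candidates, key=lambda p: quality(reconstruct(p)))
```

With `quality` set to MSSIM of the tonemapped estimate against the tonemapped original, this mirrors the selection criterion used for λ and T1, T2 in the experiments.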
19. V. Patanavijit and S. Jitapunkul, "A Lorentzian stochastic estimation for a robust iterative multiframe super-resolution reconstruction with Lorentzian-Tikhonov regularization," EURASIP J. Adv. Signal Process. 2007(2), 1-21 (2007).
20. Z. Wang et al., "Image quality assessment: from error visibility to structural similarity," IEEE Trans. Image Process. 13(4), 600-612 (2004).
21. G. Ward, "High dynamic range images," http://www.anyhere.com/gward/hdrenc/pages/originals.html.
22. M. Fairchild and G. Johnson, "iCAM framework for image appearance, differences, and quality," J. Electron. Imaging 13(1), 126-138 (2004).

Tomas Bengtsson is a PhD student in the signal processing research group at Chalmers University of Technology in Gothenburg, Sweden. He obtained a BS degree in electrical engineering and an MS degree in communication engineering at Chalmers University of Technology between 2004 and 2009. His research lies in signal processing and sensing, focusing on image processing, in particular high dynamic range imaging and super-resolution techniques. He is involved in a project on visual quality measures, with the aim to explore algorithms for the enhanced display of images and videos, with applications toward the automotive industry and, in particular, car safety.

Tomas McKelvey obtained an MS degree in electrical engineering at Lund University between 1987 and 1991 and received his PhD in automatic control at Linköping University in 1995. Between 1995 and 1999, he held research and teaching positions at Linköping University and became docent in 1999. Between 1999 and 2000, he was a visiting researcher at the University of Newcastle, Australia. Since 2000, he has been with Chalmers University of Technology, becoming a full professor in 2006 and the head of the signal processing group at Chalmers in 2011. Professor McKelvey's research interests are model-based statistical signal processing, system identification, and image processing, with applications to biomedical engineering and combustion engines.

Irene Yu-Hua Gu received her PhD degree in electrical engineering from Eindhoven University of Technology, Eindhoven, The Netherlands, in 1992. From 1992 to 1996, she was a research fellow at Philips Research Institute IPO, Eindhoven, The Netherlands, and did postdoctoral work at Staffordshire University, Staffordshire, United Kingdom. She was also a lecturer at the University of Birmingham, Birmingham, United Kingdom. Since 1996, she has been with the Department of Signals and Systems, Chalmers University of Technology, Göteborg, Sweden, where she has been a full professor since 2004. Her research interests include statistical image and video processing, object tracking and video surveillance, pattern classification, and signal processing with applications to electric power systems. She has been an associate editor of the EURASIP Journal on Advances in Signal Processing since 2005 and on the editorial board of the Journal on Ambient Intelligence and Smart Environments (JAISE) since 2011.

Optical Engineering 102003-10, October 2013/Vol. 52(10)
Bengtsson, McKelvey, and Gu: Super-resolution reconstruction of high dynamic range images. . .

Table 1 Description of light units in the context of the HVS and digital cameras.

Table 2 Experimental results. PSNR and MSSIM values are given for tonemapped SRR estimates t(x̂) relative to the tonemapped original image t(x), where t(·) denotes the MATLAB tonemapping function. The highest PSNR and MSSIM values for each of the original images are highlighted in bold.