## 1. Introduction

Image mosaicing has impacted a wide variety of application areas.^{1} It combines multiple images with overlapping areas into a single wide-field or high-resolution image.^{2} The process is divided into two steps. The first step is image registration, whose purpose is to find the coordinate transformation between a pair of images and align them. The second is image blending, which aims at smoothing the transition areas and minimizing visual flaws such as ghosting, exposure differences, or misalignments. The performance of a mosaicing algorithm therefore depends on both steps.

Quantitative evaluation of image mosaicing algorithms is indispensable for comparing and improving them. However, this aspect of research is still at an early stage of development because reference panoramas are difficult to obtain. The general solutions usually require manual intervention. Ref. 3 rates a mosaicing algorithm by comparing its output on standard data sets with ground truth stitched manually. In Ref. 4, the input data sets are collected by a virtual camera, a software module simulating the imaging process of an actual device. The shooting parameters are fully customizable, so that the ground truth can be composed. In Ref. 5, a reference image is taken as the real-world scenery in a test environment, and the inputs are similarly simulated by applying a group of common imaging distortions to narrow views of the reference image.

In this paper, an assessment approach for image mosaicing algorithms that requires no reference panoramas is proposed. Five evaluation indices are calculated and combined by the analytic hierarchy process (AHP) into a comprehensive assessment. The approach reduces manual intervention and makes the evaluation much more convenient.

## 2. Assessment Scheme for Image Mosaicing

A novel assessment scheme for image mosaicing is established, as shown in Fig. 1. Its most notable advantage is that it needs no reference panorama; the original narrow-angle input images are used for comparison instead. Each image gathered for stitching is deemed to be the most realistic view from its angle in its projection, which makes it a fair basis for partial comparison. The indices extracted in this way belong to full-reference assessment and are calculated by measuring the similarity between the original input images and their neighboring images, or their inverse-warped partial mosaics. The mosaic image as a whole is judged by its own perceptual quality; the related indices belong to no-reference assessment and are calculated by measuring the output mosaics globally.

## 3. Evaluation Criteria and Comprehensive Assessment

### 3.1. No-Reference Criteria

The no-reference criteria include two indices. The first one is entropy, which characterizes the texture and information amount of the output mosaic *I*_{mos} and is denoted as *C*_{1}. The higher the value of *C*_{1}, the richer the image information. The second one is clarity, denoted as *C*_{2} and defined as follows.

$$\begin{aligned} C_2 &= \mathrm{Clarity}_{\mathrm{Sobel8}}(I_{\mathrm{mos}}) = \sum_{x=1}^{M}\sum_{y=1}^{N}\left|H(x,y)\right|^{2}, \quad H(x,y) > T_N,\\ H(x,y) &= \sqrt{\sum_{j}\left(I_{\mathrm{mos}}(x,y) * S_j\right)^{2}}, \end{aligned} \tag{1}$$

where *S*_{j} are Sobel masks in eight directions (0 deg, 45 deg, …, 315 deg),^{6} *T*_{N} is the standard deviation of *I*_{mos}, and *M* × *N* is the size of *I*_{mos}. The higher the value of *C*_{2}, the less the blur of the mosaic image.
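The two no-reference indices can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the authors' implementation: entropy is taken as the Shannon entropy (in bits) of the gray-level histogram, the eight masks are generated by rotating the border ring of the standard 3 × 3 Sobel kernel in 45 deg steps (the exact masks are defined in Ref. 6), and border pixels are skipped. The function names `entropy`, `sobel8_masks`, and `clarity_sobel8` are hypothetical.

```python
import math
from collections import Counter


def entropy(img):
    """C1 (assumed form): Shannon entropy in bits of the gray-level histogram."""
    pixels = [p for row in img for p in row]
    total = len(pixels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(pixels).values())


def sobel8_masks():
    """Eight 3x3 masks obtained by rotating the Sobel border ring in 45 deg steps.

    Assumption: this rotation scheme stands in for the masks of Ref. 6.
    """
    ring = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    base = [-1, 0, 1, 2, 1, 0, -1, -2]  # ring values of the 0 deg Sobel mask
    masks = []
    for shift in range(8):
        mask = [[0] * 3 for _ in range(3)]
        for k, (r, c) in enumerate(ring):
            mask[r][c] = base[(k - shift) % 8]
        masks.append(mask)
    return masks


def clarity_sobel8(img):
    """C2 of Eq. (1): sum of |H|^2 over pixels whose H exceeds T_N."""
    rows, cols = len(img), len(img[0])
    pixels = [p for row in img for p in row]
    mean = sum(pixels) / len(pixels)
    # T_N is the standard deviation of the mosaic, as stated in the text.
    t_n = math.sqrt(sum((p - mean) ** 2 for p in pixels) / len(pixels))
    masks = sobel8_masks()
    total = 0.0
    for x in range(1, rows - 1):          # border pixels skipped (assumption)
        for y in range(1, cols - 1):
            h_sq = 0.0
            for mask in masks:
                resp = sum(mask[a][b] * img[x - 1 + a][y - 1 + b]
                           for a in range(3) for b in range(3))
                h_sq += resp * resp
            if math.sqrt(h_sq) > t_n:
                total += h_sq
    return total


if __name__ == "__main__":
    step_edge = [[0, 0, 255, 255] for _ in range(4)]
    print(entropy(step_edge), clarity_sobel8(step_edge))
```

Because each mask response is squared, it does not matter here whether the masks are applied by convolution or by correlation.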

### 3.2. Full-Reference Criteria

The full-reference criteria include registration error, peak signal-to-noise ratio (PSNR), and structural similarity (SSIM), which are denoted as *C*_{3}, *C*_{4}, and *C*_{5}, respectively, and defined as follows.

$$C_3 = \frac{1}{N-1}\sum_{n=1}^{N-1}\varepsilon_n = \frac{1}{N-1}\sum_{n=1}^{N-1}\left(\frac{1}{P}\sum_{p=1}^{P}\left\| x_{jp} - H_j^{-1}H_i x_{ip}\right\|\right), \tag{2}$$

$$C_4 = \frac{1}{N}\sum_{n=1}^{N}\mathrm{PSNR}(\hat{r}_n, s_n) = \frac{1}{N}\sum_{n=1}^{N} 20\lg\frac{255}{\sqrt{\mathrm{MSE}(\hat{r}_n, s_n)}}, \tag{3}$$

$$C_5 = \frac{1}{N}\sum_{n=1}^{N}\mathrm{SSIM}(\hat{r}_n, s_n) = \frac{1}{N}\sum_{n=1}^{N}\frac{1}{K}\sum_{k=1}^{K}\left[l_n(x_k,y_k)\right]^{\alpha}\left[c_n(x_k,y_k)\right]^{\beta}\left[s_n(x_k,y_k)\right]^{\gamma}, \tag{4}$$

where *P* is the number of feature pairs in neighboring images (the transform relations are shown in Fig. 2), *N* is the number of input images or regions of interest, $\hat{r}_n$ is the inverse-warped partial mosaic image, and *s*_{n} is the input image. Refer to Ref. 8 for more details on the SSIM parameters. The lower the value of *C*_{3}, the smaller the registration error. The higher the value of *C*_{4}, the more continuous the blending intensity. The higher the value of *C*_{5}, the smaller the structural difference between the input sequence and the synthetic mosaics.
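As an illustration of Eqs. (2) and (3), the sketch below computes the registration error of one image pair and the PSNR of one image against its inverse-warped counterpart. It rests on several assumptions that the text does not spell out: homographies are 3 × 3 nested lists acting on homogeneous pixel coordinates, the norm is Euclidean, and images are 8-bit gray-level arrays of equal size. All function names are hypothetical, and SSIM is left to Ref. 8.

```python
import math


def mat_mul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[r][k] * b[k][c] for k in range(3)) for c in range(3)]
            for r in range(3)]


def mat_inv(m):
    """Inverse of a 3x3 matrix via the adjugate."""
    (a, b, c), (d, e, f), (g, h, k) = m
    det = a * (e * k - f * h) - b * (d * k - f * g) + c * (d * h - e * g)
    if det == 0:
        raise ValueError("singular homography")
    adj = [[e * k - f * h, c * h - b * k, b * f - c * e],
           [f * g - d * k, a * k - c * g, c * d - a * f],
           [d * h - e * g, b * g - a * h, a * e - b * d]]
    return [[v / det for v in row] for row in adj]


def project(hom, pt):
    """Apply a homography to a 2-D point and de-homogenize."""
    x, y = pt
    u = hom[0][0] * x + hom[0][1] * y + hom[0][2]
    v = hom[1][0] * x + hom[1][1] * y + hom[1][2]
    w = hom[2][0] * x + hom[2][1] * y + hom[2][2]
    return (u / w, v / w)


def pair_registration_error(h_i, h_j, pts_i, pts_j):
    """epsilon_n of Eq. (2): mean distance between x_jp and H_j^-1 H_i x_ip."""
    h_ji = mat_mul(mat_inv(h_j), h_i)
    dists = [math.dist(pj, project(h_ji, pi)) for pi, pj in zip(pts_i, pts_j)]
    return sum(dists) / len(dists)


def mean_index(values):
    """Outer average used by C3 (over pairs) and C4 (over images)."""
    return sum(values) / len(values)


def psnr(warped, source, peak=255.0):
    """PSNR of Eq. (3) for one inverse-warped image against its input image."""
    sq = [(a - b) ** 2 for ra, rb in zip(warped, source) for a, b in zip(ra, rb)]
    mse = sum(sq) / len(sq)
    if mse == 0:
        return math.inf  # identical images
    return 20 * math.log10(peak / math.sqrt(mse))
```

In this reading, *C*_{3} would be `mean_index` applied to the per-pair errors and *C*_{4} would be `mean_index` applied to the per-image PSNR values.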

### 3.3. Comprehensive Assessment Based on AHP

The demands on image mosaicing usually differ between applications, so the above criteria play roles of varying importance. To determine the weight coefficients, the problem is modeled as a hierarchy, shown in Fig. 3, and the pairwise comparisons of the AHP are performed. The comprehensive assessment result is obtained in three steps.

The first step is constructing a set of pairwise comparison matrices *U* using a scale of values ranging from 1 (equal importance) to 9 (absolute importance).

$$U = \begin{bmatrix} u_{11} & u_{12} & \cdots & u_{1n} \\ u_{21} & u_{22} & \cdots & u_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ u_{n1} & u_{n2} & \cdots & u_{nn} \end{bmatrix}, \tag{5}$$

where *u*_{ij} is the relative importance of the *i*th factor compared to the *j*th; therefore, *u*_{ij} = 1/*u*_{ji} and *u*_{ii} = 1. The elements of *U* can be determined by statistical data, intensity of requirement on issues, or experts’ knowledge. Taking the application in photograph editing as an example, they can be defined as in Table 1. For the criterion layer, the local full-reference criteria are deemed slightly more important than the global no-reference criteria, since photograph editing requires high accuracy and the evaluation usually pays close attention to local details of the mosaics. The other matrices can be determined in the same way, as shown in Table 1.

## Table 1

An example on comparison matrices and weights in the hierarchy.

| Criterion layer A | B_{1} | B_{2} | w_{i} | Index layer B_{1} | C_{1} | C_{2} | w_{1}^{i} | Index layer B_{2} | C_{3} | C_{4} | C_{5} | w_{2}^{i} |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| B_{1} | 1 | 1/2 | 0.333 | C_{1} | 1 | 1/3 | 0.250 | C_{3} | 1 | 1 | 1/2 | 0.240 |
| B_{2} | 2 | 1 | 0.667 | C_{2} | 3 | 1 | 0.750 | C_{4} | 1 | 1 | 1/3 | 0.210 |
| | | | | | | | | C_{5} | 2 | 3 | 1 | 0.550 |

The second step is computing a vector of priorities in each layer. It equals the normalized eigenvector [*w*_{1}, *w*_{2}, …, *w*_{n}]^{T} corresponding to the maximum eigenvalue of *U*, where *w*_{i} is calculated by^{9}

$$w_i = \sqrt[n]{\prod_{j=1}^{n} u_{ij}} \Bigg/ \sum_{k=1}^{n}\sqrt[n]{\prod_{j=1}^{n} u_{kj}}\,. \tag{6}$$

The computation results of the above case are also listed in Table 1.
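The weight computation of Eq. (6) is small enough to check directly against Table 1. The following sketch takes the row-wise geometric mean of each comparison matrix and normalizes it; the matrices are copied from Table 1, while the function name `ahp_weights` and the variable names are hypothetical.

```python
import math


def ahp_weights(u):
    """Eq. (6): normalized n-th roots of the row products of matrix U."""
    n = len(u)
    roots = [math.prod(row) ** (1.0 / n) for row in u]
    total = sum(roots)
    return [r / total for r in roots]


# Pairwise comparison matrices from Table 1.
criterion_layer = [[1, 1 / 2],
                   [2, 1]]            # B1 (no-reference) vs B2 (full-reference)
index_layer_b1 = [[1, 1 / 3],
                  [3, 1]]             # C1 vs C2
index_layer_b2 = [[1, 1, 1 / 2],
                  [1, 1, 1 / 3],
                  [2, 3, 1]]          # C3, C4, C5

if __name__ == "__main__":
    for name, matrix in [("A", criterion_layer),
                         ("B1", index_layer_b1),
                         ("B2", index_layer_b2)]:
        print(name, [round(w, 3) for w in ahp_weights(matrix)])
```

Rounded to three decimals, the three weight vectors agree with the *w* columns of Table 1.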

The third step is synthesizing priority by aggregating the relative weights up the hierarchy. The comprehensive assessment result is expressed as

$$A = \sum_{i=1}^{m}\left(w_i\sum_{j=1}^{n} w_j^i C_i\right), \tag{7}$$

where *w*_{i} is the weight coefficient of the evaluation factor in the criterion layer, $w_j^i$ is the weight coefficient of the evaluation index of the *i*th criterion factor in the *j*th index layer, and *C*_{i} is the normalized index data. In the example, the computed overall weight vector becomes *W* = [0.083, 0.250, 0.160, 0.140, 0.367].
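The overall weight vector of the example can be reproduced by multiplying each criterion weight by the index weights of its layer. The sketch below does this with the rounded weights from Table 1. It assumes that each normalized index value is weighted by the product of its criterion weight and its within-layer weight, which is the reading of Eq. (7) that matches the stated *W*. The function names are hypothetical, and the normalization of the index data is not specified in the text, so it is left to the caller.

```python
def overall_weights(criterion_w, index_w):
    """Flatten the hierarchy: criterion weight times within-layer index weight."""
    return [w_c * w_idx
            for w_c, layer in zip(criterion_w, index_w)
            for w_idx in layer]


def comprehensive_score(criterion_w, index_w, normalized_indices):
    """Weighted sum of already-normalized index values C1..C5 (assumed reading of Eq. (7))."""
    weights = overall_weights(criterion_w, index_w)
    return sum(w * c for w, c in zip(weights, normalized_indices))


# Rounded weights taken from Table 1.
CRITERION_W = [0.333, 0.667]                    # B1, B2
INDEX_W = [[0.250, 0.750],                      # C1, C2 under B1
           [0.240, 0.210, 0.550]]               # C3, C4, C5 under B2

if __name__ == "__main__":
    print([round(w, 3) for w in overall_weights(CRITERION_W, INDEX_W)])
```

The printed vector, rounded to three decimals, agrees with *W* above.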

## 4. Experiments and Conclusions

Two algorithms for photograph editing, denoted by #1 and #2, are compared here. Both are based on a projective model but extract different features for registration. Their blending methods also differ: #1 uses center-weighting, whereas #2 uses optimal seam stitching. To test the validity of the proposed method, a series of mosaics stitched by these algorithms were evaluated by 15 volunteers. They observed each pair of mosaics under the same viewing conditions and voted for the better algorithm; the algorithm with more votes was deemed the better one. For all the test scenes, #2 was superior to #1 in our subjective experiments. Due to space limitations, we present five typical scenes with different complexities and textures for illustration. We also compared the proposed method with another evaluation method based on the rms intensity error between the final mosaics and reference panoramas stitched by hand. The mosaicing results and the reference panoramas are shown in Fig. 4. The computed index data in the proposed scheme are shown in Table 2; the relevant parameters are set as recommended in the references. The assessment results are shown in Table 3.

The results of the proposed method are all consistent with the subjective judgments. Although some local moving objects might be lost by #2, as seen in Fig. 4, which results in a lower full-reference evaluation, its better global impression, obtained by avoiding blending ghosts, outweighs the loss of local details. In contrast, some evaluation results of the compared method contradict the subjective judgments, as marked by “×” in Table 3. This is mainly caused by moving objects, which increase the rms intensity error if they occupy a large portion of the input image and are lost in the final mosaics.

## Table 2

Values of the evaluation indices.

| Scene | Algorithm | C_{1} | C_{2} | C_{3} | C_{4} | C_{5} |
|---|---|---|---|---|---|---|
| sc. 1 | #1 | 6.8308 | 22187.4 | 0.3278 | 24.0025 | 0.9991 |
| | #2 | 6.8020 | 45455.2 | 0.4309 | 25.6590 | 0.9991 |
| sc. 2 | #1 | 7.2475 | 16138.5 | 0.2098 | 27.8449 | 0.9998 |
| | #2 | 7.2059 | 28174.2 | 0.2166 | 26.7563 | 0.9997 |
| sc. 3 | #1 | 7.2618 | 12553.5 | 0.3816 | 26.9503 | 0.9998 |
| | #2 | 7.2386 | 23478.5 | 0.3699 | 26.2622 | 0.9998 |
| sc. 4 | #1 | 6.7956 | 16256.3 | 0.2025 | 26.4237 | 0.9997 |
| | #2 | 6.5323 | 30427.2 | 0.2739 | 26.1030 | 0.9996 |
| sc. 5 | #1 | 7.0709 | 15771.4 | 0.2677 | 25.1656 | 0.9996 |
| | #2 | 7.0754 | 32270.6 | 0.3801 | 24.1162 | 0.9995 |

## Table 3

The comparison of assessment results.

| Scene | Algorithm | No-ref. (proposed) | Full-ref. (proposed) | Overall (proposed) | Agrees with subjective judgment (proposed) | rms error (compared) | Agrees with subjective judgment (compared) |
|---|---|---|---|---|---|---|---|
| sc. 1 | #1 | 0.5061 | 0.6269 | 0.5867 | √ | 0.2556 | × |
| | #2 | 0.8504 | 0.5911 | 0.6775 | | 0.2623 | |
| sc. 2 | #1 | 0.5501 | 0.6132 | 0.5922 | √ | 0.1258 | √ |
| | #2 | 0.8271 | 0.6018 | 0.6769 | | 0.1126 | |
| sc. 3 | #1 | 0.5307 | 0.6068 | 0.5814 | √ | 0.1031 | √ |
| | #2 | 0.8379 | 0.6083 | 0.6848 | | 0.0905 | |
| sc. 4 | #1 | 0.5337 | 0.6355 | 0.6015 | √ | 0.1151 | √ |
| | #2 | 0.8348 | 0.5833 | 0.6671 | | 0.1068 | |
| sc. 5 | #1 | 0.5060 | 0.6422 | 0.5968 | √ | 0.1041 | × |
| | #2 | 0.8507 | 0.5778 | 0.6687 | | 0.1127 | |
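As a consistency check on Table 3, the overall score of the proposed method appears to be the criterion-layer combination of the two layer scores, using the weights 0.333 and 0.667 from Table 1. The sketch below recomputes it from the tabulated no-reference and full-reference values and compares the result with the tabulated overall value. The data are copied from Table 3; the names `TABLE3` and `overall_score` are hypothetical, and no attempt is made to reproduce the layer scores from Table 2, since the normalization of the raw indices is not specified.

```python
W_NO_REF, W_FULL_REF = 0.333, 0.667  # criterion-layer weights from Table 1

# (scene, algorithm) -> (no-ref., full-ref., overall), copied from Table 3.
TABLE3 = {
    ("sc. 1", "#1"): (0.5061, 0.6269, 0.5867),
    ("sc. 1", "#2"): (0.8504, 0.5911, 0.6775),
    ("sc. 2", "#1"): (0.5501, 0.6132, 0.5922),
    ("sc. 2", "#2"): (0.8271, 0.6018, 0.6769),
    ("sc. 3", "#1"): (0.5307, 0.6068, 0.5814),
    ("sc. 3", "#2"): (0.8379, 0.6083, 0.6848),
    ("sc. 4", "#1"): (0.5337, 0.6355, 0.6015),
    ("sc. 4", "#2"): (0.8348, 0.5833, 0.6671),
    ("sc. 5", "#1"): (0.5060, 0.6422, 0.5968),
    ("sc. 5", "#2"): (0.8507, 0.5778, 0.6687),
}


def overall_score(no_ref, full_ref):
    """Criterion-layer aggregation of the two layer scores."""
    return W_NO_REF * no_ref + W_FULL_REF * full_ref


# Largest gap between the recomputed and the tabulated overall scores.
max_deviation = max(abs(overall_score(nr, fr) - ov)
                    for nr, fr, ov in TABLE3.values())

# Algorithm preferred by the proposed method in each scene (higher overall wins).
scenes = sorted({scene for scene, _ in TABLE3})
preferred = {scene: max(("#1", "#2"), key=lambda alg: TABLE3[(scene, alg)][2])
             for scene in scenes}

if __name__ == "__main__":
    print(max_deviation, preferred)
```

The recomputed values agree with the tabulated overall scores to within rounding, and the higher overall score goes to #2 in every scene, matching the subjective votes.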

The above experiments show the objectivity and the stability of the proposed method. Owing to the use of multiple criteria and AHP-based weight selection, the assessment is robust to variations and disturbances across different scenes. Extension of the indices and parameter analysis will be the focus of future research.

## Acknowledgments

This work is supported by the Chinese National Science Fund for Distinguished Young Scholars, Grant No. 60925011.