Color quantization (cq) is a classical image processing operation that reduces the number of distinct colors in a given image. Although the idea of cq dates back to the early 1970s, the first true cq algorithm, median-cut, was not proposed until 1980. Since then, hundreds of publications have investigated the topic of cq, proposing dozens of algorithms. A vast majority of these publications demonstrate their results on small datasets containing a handful of images of mixed quality. Furthermore, the reproducibility of cq research is often limited due to the use of private test images or public test images with multiple non-identical copies on the World Wide Web or restrictive licenses. To address these problems, we curated a large, diverse, and high-quality dataset of 24-bit color images called cq100 and released it under a permissive license. We present an overview of cq100 and demonstrate its use in comparing cq algorithms.
1. Introduction
24-bit color images have become commonplace since the turn of the millennium.^{1}^{,}^{2} These images typically contain hundreds of thousands of distinct colors, which complicate their display, storage, transmission, processing, and analysis. Color quantization (cq) is a common image processing operation that reduces the number of distinct colors in a given image. Figure 1 shows the pencils image (Ref. 3, cc0 license, $768\times 512$ pixels) and its quantized versions with 4, 16, 64, and 256 colors obtained using the median-cut algorithm.^{4} It can be seen that the reproduction is reasonably accurate with only 64 colors and is nearly indistinguishable from its original with 256 colors. cq is composed of two phases: color palette design and pixel mapping. In the former phase, a small set of colors, called the color palette, representing the input colors is selected, whereas, in the latter phase, each pixel in the input image is assigned to one of the palette colors. The purpose of cq is to reduce the number of distinct colors in a given image to a significantly smaller number with minimal distortion. Since natural images often contain a wide range of colors, faithful reproduction of such images with a small color palette is a challenging problem. In fact, cq can be characterized as a large-scale combinatorial optimization problem.^{5} There are several ways to classify cq algorithms. Image-independent algorithms design a universal color palette without regard to any particular input image, whereas image-dependent ones design a custom color palette based on the distribution of the colors in a given input image. Unsurprisingly, a vast majority of cq algorithms are image-dependent. Another classification scheme is based on the nature of the underlying clustering algorithm.^{6} Hierarchical algorithms recursively find nested clusters in a top-down (or divisive) or bottom-up (or agglomerative) fashion.
In contrast, partitional algorithms find all the clusters simultaneously as a partition of the data without imposing a hierarchical structure on the data.^{7} Early cq algorithms were mostly hierarchical, whereas modern cq algorithms tend to be partitional.^{8} Yet another classification scheme is based on whether or not the palette size is allowed to change during the cq process. Static algorithms assume that the palette size is a constant to be specified by the user in advance, whereas dynamic algorithms compute the palette size automatically at run-time. A vast majority of cq algorithms proposed to date are image-dependent and static. Therefore, in this paper, we focus primarily on such algorithms. (The extraction of relevant colors (i.e., colors that stand out to an observer^{9}) from a color image, which is an important task in specific application domains such as fine arts^{9}^{,}^{10} and medicine,^{11}^{,}^{12} is outside the scope of this study.) Many popular partitional algorithms are based on the local optimization of an objective function, which is typically nonsmooth and nonconvex with numerous local optima. Hence, such algorithms are often highly dependent on initialization^{13} and can easily get stuck in a poor local optimum. By contrast, metaheuristic-based algorithms are formulated based on global optimization and, thus, are less sensitive to initialization. cq was a necessity in the past due to the limitations of display hardware, much of which could not handle the number of colors that can be present in a typical 24-bit image. Over the past decades, 24-bit display hardware has become ubiquitous. However, cq is still used in many visual computing applications as a preprocessing step.
Modern applications of cq include non-photorealistic rendering, image matting, image dehazing, image compression, color-to-grayscale conversion, image watermarking/steganography, image segmentation, content-based image retrieval, color analysis, saliency detection, and skin detection. For specific references, see Ref. 8. Despite over four decades of research on cq, there is currently no benchmark dataset on which cq algorithms can be developed, tested, and compared.^{8} Many cq studies employ a subset of the usc-sipi image database (available at Ref. 14) or the Kodak lossless true color image suite (available at Ref. 15). The usc-sipi dataset contains 14 scanned images of relatively poor quality (by modern standards). Furthermore, the copyright status of many of these images is unknown. The Kodak dataset contains 24 relatively high-quality images in the public domain. However, the dataset is neither sufficiently diverse nor sufficiently large to permit a comprehensive evaluation of cq algorithms. (For example, in the usc-sipi dataset, three ($\approx 21\%$) images (4.1.01, 4.1.03, and 4.1.04) are portrait photos, two images (4.1.04 and 4.1.02) show the same person, and two images (4.1.07 and 4.1.08) depict identical objects. On the other hand, one-third of the Kodak images portray water bodies. In addition, the dataset contains pairs of images featuring identical objects; i.e., kodim02 depicts a subregion of kodim01, whereas kodim06 and kodim11 depict the same boat captured in two separate yet similar scenes.) To address these problems, we curated a large, diverse, and high-quality dataset of 24-bit color images called cq100 and released it under a permissive license. The remainder of this paper is organized as follows. Section 2 gives a detailed description of our dataset. Section 3 demonstrates the use of our dataset in comparing cq algorithms. Finally, Sec. 4 concludes the paper and suggests future research directions.
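To make the two phases of cq concrete, the pixel-mapping phase described above can be sketched as follows. This is a minimal illustration under simplified assumptions (a fixed, hypothetical palette and plain Euclidean nearest-neighbor assignment in rgb space); it is not the implementation of any algorithm compared in this paper.

```python
import numpy as np

def map_pixels(image, palette):
    """Pixel-mapping phase: assign each pixel to its nearest palette color.

    image:   (H, W, 3) array of rgb values
    palette: (K, 3) array of palette colors
    Returns the quantized (H, W, 3) image.
    """
    pixels = image.reshape(-1, 3).astype(float)  # (H*W, 3)
    # Squared Euclidean distance from every pixel to every palette color
    d2 = ((pixels[:, None, :] - palette[None, :, :].astype(float)) ** 2).sum(axis=2)
    nearest = d2.argmin(axis=1)  # index of the closest palette color
    return palette[nearest].reshape(image.shape)

# Toy example: a 1x2 image quantized with a hypothetical 2-color palette
img = np.array([[[250, 10, 10], [10, 10, 250]]], dtype=np.uint8)
pal = np.array([[255, 0, 0], [0, 0, 255]], dtype=np.uint8)
out = map_pixels(img, pal)
```

In practice the palette would come from the palette-design phase; here it is fixed by hand purely to show the mapping step.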
2. Description of the Dataset
Our objective was to collect a large, diverse, and high-quality dataset of images and release it under a permissive license. We started by collecting over 100 images from three public sources:
During the image collection process, we paid particular attention to the diversity of content and license compatibility issues. Once we obtained a tentative dataset with 100+ images, we eliminated the images with outlying aspect ratios using the following iterative process. We first computed the mean ($m$) and standard deviation ($s$) of the aspect ratios (Ordinarily, the aspect ratio of an $H\times W$ image is given by $W/H$. However, to treat portrait and landscape images uniformly, we computed the aspect ratio as $\mathrm{max}(W/H,H/W)$.) of the current set of images and then eliminated those with outlying aspect ratios, that is, aspect ratios outside the range $[m-2s,m+2s]$. We added a few more images to the set (from the aforementioned public sources) and then repeated the above outlier removal process until we were left with 100 images. The characteristics of this dataset were:
The images in our initial dataset had dimensions ranging from $768\times 512$ to $6498\times 8123$. To facilitate comparisons, it is desirable to have all images contain the same number of pixels. A simple way to achieve this goal is to resize all images to the same dimensions (and thus the same aspect ratio). To this end, we performed a grid search between the minimum ($l=1.25$) and maximum ($u=1.64$) aspect ratios in the dataset with a step size of $(u-l)/1000$. The optimal aspect ratio minimizing the average relative deviation (Given an image, let $a$ and $\tilde{a}$ be its aspect ratios before and after resizing, respectively. The relative deviation in the aspect ratio caused by the resizing operation is then $\epsilon =|\tilde{a}-a|/a$.) turned out to be 1.49973, which is very close to the standard 3:2 aspect ratio commonly used in 35-mm film photography and digital single-lens reflex camera (dslr) photography. Accordingly, we resized (using the resize operator, with the default Lanczos resampling filter, of the convert program of ImageMagick 7.1.0-33, available at Ref. 17) all portrait and landscape images in our initial dataset to $512\times 768$ and $768\times 512$, respectively, to achieve our target aspect ratio of 1.5. These resizing operations caused an average of only 5.35% relative distortion in the aspect ratio, which is negligible. Other than resizing, file renaming and file format conversion were the only modifications made to the original images. Most of the files originally had meaningless or long and overly descriptive names, some reaching 120 characters. Therefore, we renamed each file concisely to accurately reflect its contents. This renaming operation reduced the mean filename length from about 40 characters to about 12. As for file formats, 95 of the images were originally stored in the jpg format and 5 in the png format. We converted all images to the uncompressed binary ppm format.
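The two curation computations described above can be sketched as follows: one pass of the aspect-ratio outlier elimination (the $[m-2s,m+2s]$ rule) and the grid search for the target aspect ratio minimizing the mean relative deviation. The image dimensions below are hypothetical; this is an illustration of the procedure, not the actual curation code.

```python
import statistics

def aspect_ratio(w, h):
    # Treat portrait and landscape images uniformly: max(W/H, H/W)
    return max(w / h, h / w)

def eliminate_outliers(ratios):
    """One pass of the elimination step: keep ratios in [m - 2s, m + 2s]."""
    m = statistics.mean(ratios)
    s = statistics.stdev(ratios)
    return [a for a in ratios if m - 2 * s <= a <= m + 2 * s]

def best_target_ratio(ratios, steps=1000):
    """Grid search between the minimum and maximum ratio for the target
    aspect ratio minimizing the mean relative deviation |t - a| / a."""
    l, u = min(ratios), max(ratios)
    step = (u - l) / steps
    best_t, best_err = l, float("inf")
    for i in range(steps + 1):
        t = l + i * step
        err = sum(abs(t - a) / a for a in ratios) / len(ratios)
        if err < best_err:
            best_t, best_err = t, err
    return best_t

# Hypothetical image dimensions; the panoramic 4000x1000 image is an outlier
dims = [(768, 512)] * 10 + [(800, 600)] * 5 + [(4000, 1000)]
ratios = [aspect_ratio(w, h) for w, h in dims]
kept = eliminate_outliers(ratios)   # drops the 4.0 ratio
target = best_target_ratio(kept)    # close to 1.5 for these ratios
```

In the paper's actual pipeline, the elimination pass is interleaved with adding new images until exactly 100 remain; the sketch shows a single pass only.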
ppm is popular in visual computing because it is uncompressed, and ppm files are easy to read and write, owing to their extremely simple structure. As mentioned earlier, each image in cq100 has one of four types of licenses (For an overview of these licenses, visit Ref. 18.) (listed in decreasing permissiveness):
Since these licenses are mutually compatible, we released cq100 under the least permissive of them, namely the cc by-sa 4.0 license. Nevertheless, this license allows anyone to share and modify our dataset with the restrictions noted above. The metadata associated with each image include the following 16 attributes: original image filename, modified image filename, image category, source url, license, license url, author, author url, modifications made to the original image, additional notes about the original image, original image width, original image height, original image number of colors, modified image width, modified image height, and modified image number of colors. To demonstrate the diversity of cq100, we show thumbnails for images 1 through 25, 26 through 50, 51 through 75, and 76 through 100 in Figs. 2, 3, 4, and 5, respectively. The numeric id around each thumbnail indicates the alphabetical order of the filename of the corresponding image, e.g., 1: adirondack_chairs; 2: astro_bodies; …; 99: wool_carder_bee; and 100: yasaka_pagoda. A more detailed view of a subset of cq100 is given in Fig. 6, which displays a sample image from each category. Note that for each image featured in Figs. 2–6, the original license is given in the respective caption. Figures 7 and 8 show box plots of the distribution of respectively the number of distinct colors and colorfulness^{19} per image for the cq100, Kodak, and usc-sipi datasets. It can be seen that cq100 has a greater color diversity compared to the other two datasets.
3. Demonstration of the Dataset
In this section, we demonstrate the use of our cq100 dataset on a cq task. First, we describe the experimental setup and present a brief qualitative assessment. Then, we conduct a more detailed quantitative assessment and discuss its implications.
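As an aside, the colorfulness metric^{19} used for Fig. 8 can be sketched as follows. This is a from-scratch illustration based on the rg-yb opponent-component statistics of Ref. 19, not the exact code used to produce the figure.

```python
import numpy as np

def colorfulness(image):
    """Colorfulness in the style of Hasler and Suesstrunk (Ref. 19):
    combine the spread and magnitude of the opponent components
    rg = R - G and yb = (R + G)/2 - B over an (H, W, 3) rgb image."""
    r = image[..., 0].astype(float)
    g = image[..., 1].astype(float)
    b = image[..., 2].astype(float)
    rg = r - g
    yb = 0.5 * (r + g) - b
    sigma = np.hypot(rg.std(), yb.std())  # combined standard deviation
    mu = np.hypot(rg.mean(), yb.mean())   # combined mean magnitude
    return sigma + 0.3 * mu

# A uniform gray image is maximally "colorless"
gray = np.full((8, 8, 3), 128, dtype=np.uint8)
```

A saturated image yields a large value, a gray image yields zero, which is what makes the metric a convenient one-number summary for the box plots.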
3.1. Experimental Setup and Qualitative Assessment
We compare the following 21 cq algorithms (listed in chronological order) with respect to their effectiveness (an effective cq algorithm is one that produces minimal distortion, as quantified by a chosen image fidelity metric; see Sec. 3.2):
These algorithms represent the cq work published between 1980 and 2022. (For a modern survey of cq, see Ref. 8.) However, it should be noted that our objective is not to perform an exhaustive comparison of the cq algorithms published over the past 40 years but to demonstrate the use of the presented dataset in comparing several popular cq algorithms. We execute each algorithm with the default parameter values suggested by its authors (see Table 1) and quantize each image in our dataset to 4, 16, 64, and 256 colors separately. The cases of 4 and 256 colors represent the two extremes. It can be argued that a typical natural image can hardly be quantized to fewer than 4 colors, and most images can be reproduced relatively accurately with no more than 256 colors.
Table 1. Parameter setting for each cq algorithm.
Figures 9–12 respectively show the shopping bags, motorcycle, umbrellas, and common jezebel images quantized using three algorithms (in each figure, going from top to bottom, the subfigures are arranged in progressively decreasing quality). Given an input image, for each algorithm, we display the corresponding reduced-color output image and a grayscale error image that allows us to visualize the differences between the input and output. The error image is obtained by amplifying the pixelwise normalized Euclidean differences between the input and output by a factor of four and then negating them for better visualization (see below). Hence, the cleaner/lighter the error image, the better the reproduction of the input image. Let ${I}_{\mathrm{RGB}}$ and ${\tilde{I}}_{\mathrm{RGB}}$ respectively denote the $H\times W$ original input and quantized output images in the standard rgb (srgb) color space.^{39} ${I}_{\mathrm{RGB}}(r,c)$ and ${\tilde{I}}_{\mathrm{RGB}}(r,c)$ are then three-dimensional vectors containing the rgb values of the pixel with (row, column) coordinate $(r,c)$ in ${I}_{\mathrm{RGB}}$ and ${\tilde{I}}_{\mathrm{RGB}}$, respectively ($r\in \{1,\dots ,H\}$ and $c\in \{1,\dots ,W\}$). The corresponding pixel in the 8-bit error image $E$ is then computed as Eq. (1)$$E(r,c)=255-\frac{4}{\sqrt{3}}{\Vert {I}_{\mathrm{RGB}}(r,c)-{\tilde{I}}_{\mathrm{RGB}}(r,c)\Vert}_{2}.$$
3.2. Quantitative Assessment and Discussion
Quantitative assessment remains one of the least explored aspects of cq.^{8} Most cq studies employ pixelwise image fidelity metrics such as the mean squared error (mse), or its variants such as the peak signal-to-noise ratio (psnr), computed in the rgb color space. Recently, Ref. 40 investigated 25 fidelity metrics and concluded that mse computed in the cielab color space was among the best. Following their recommendation, we compute the mse metric as follows: Eq.
(2)$$\mathrm{MSE}({I}_{\mathrm{CIELAB}},{\tilde{I}}_{\mathrm{CIELAB}})=\frac{1}{HW}\sum _{r=1}^{H}\sum _{c=1}^{W}{\Vert {I}_{\mathrm{CIELAB}}(r,c)-{\tilde{I}}_{\mathrm{CIELAB}}(r,c)\Vert}_{2}^{2}.$$The presented cq100 dataset includes the 100 (true-color) input images and their metadata (as described in Sec. 2), 8400 (reduced-color) output images (21 cq algorithms $\times$ 100 input images $\times$ $\{4, 16, 64, 256\}$ colors), and Microsoft Excel worksheets containing the mse for each input/output image combination. (Other image fidelity metrics can be computed over the provided input/output images.) By contrast, to the best of our knowledge, for the usc-sipi and Kodak datasets, there is no public repository containing the output images produced by a variety of algorithms and the corresponding mse values. To determine if there are any statistically significant differences among the cq algorithms, we employ two nonparametric statistical tests:^{41} the Friedman test^{42} and the Iman-Davenport test.^{43} These tests are alternatives to the parametric two-way analysis of variance (anova) test. Their advantage over anova is that they do not require normality or homoscedasticity, assumptions that are often violated in machine learning or optimization studies.^{44}^{–}^{48} Given $B$ blocks (subjects) and $T$ treatments (measurements), the null hypothesis (${H}_{0}$) of the Friedman test is that populations within a block are identical. The alternative hypothesis (${H}_{1}$) is that at least one treatment tends to yield larger (or smaller) values than at least one other treatment. The test statistic is computed as follows.^{49} In the first step, the observations within each block are ranked separately, so each block contains a separate set of $T$ ranks. If ties occur, the tied observations are given the mean of the rank positions for which they are tied.
If ${H}_{0}$ is true, the ranks in each block should be randomly distributed over the columns (treatments). Otherwise, we expect a lack of randomness in this distribution. For example, if a particular treatment is better than the others, we expect small ranks to “favor” that column. In the second step, the ranks in each column are summed. If ${H}_{0}$ is true, we expect the sums to be fairly close, so close that we can attribute differences to chance. Otherwise, we expect to see at least one difference between pairs of rank sums so large that we cannot reasonably attribute it to sampling variability. The test statistic is given as Eq. (3)$${\chi}_{r}^{2}=\frac{12}{BT(T+1)}\sum _{j=1}^{T}{R}_{j}^{2}-3B(T+1),$$where ${R}_{j}$ ($j\in \{1,2,\dots ,T\}$) is the rank sum of the $j$th column. ${\chi}_{r}^{2}$ is approximately chi-square with $(T-1)$ degrees of freedom. ${H}_{0}$ is rejected at the $\alpha $ level of significance if the value of (3) is greater than or equal to the critical chi-square value for $(T-1)$ degrees of freedom. Reference 43 proposed the alternative statistic Eq. (4)$${F}_{r}=\frac{(B-1){\chi}_{r}^{2}}{B(T-1)-{\chi}_{r}^{2}},$$which is distributed according to the $F$-distribution with $(T-1)$ and $(T-1)(B-1)$ degrees of freedom. Compared to ${\chi}_{r}^{2}$, this statistic is not only less conservative but also more accurate for small sample sizes.^{43} In this study, blocks and treatments correspond to images and cq algorithms, respectively. For each $K\in \{4,16,64,256\}$, our goal is to determine if at least one algorithm is significantly better than at least one other algorithm at the $\alpha =0.05$ level of significance. If this is the case, we perform multiple comparison testing to determine which pairs of algorithms differ significantly.
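The rank-based computation described above can be sketched as follows. This is a minimal illustration on a hypothetical score matrix; note that this simple form of the Friedman statistic omits the tie-correction factor that statistical packages may apply.

```python
import numpy as np

def mean_ranks(row):
    """Rank the values in a row (1 = smallest); tied values receive the
    mean of the rank positions for which they are tied."""
    idx = np.argsort(row, kind="stable")
    ranks = np.empty(len(row))
    vals = row[idx]
    i = 0
    while i < len(row):
        j = i
        while j + 1 < len(row) and vals[j + 1] == vals[i]:
            j += 1
        ranks[idx[i:j + 1]] = (i + j) / 2 + 1  # mean of 1-based positions
        i = j + 1
    return ranks

def friedman_iman_davenport(scores):
    """Friedman chi-square and Iman-Davenport F statistics for a
    (B blocks x T treatments) matrix, e.g., per-image MSE values
    with one column per cq algorithm (lower scores rank better)."""
    B, T = scores.shape
    ranks = np.vstack([mean_ranks(row) for row in scores])  # step 1: rank within blocks
    R = ranks.sum(axis=0)                                   # step 2: column rank sums
    chi2 = 12.0 / (B * T * (T + 1)) * (R ** 2).sum() - 3.0 * B * (T + 1)
    f = (B - 1) * chi2 / (B * (T - 1) - chi2)
    return chi2, f

# Hypothetical 4 images x 3 algorithms score matrix (lower is better)
scores = np.array([[1.0, 2.0, 3.0],
                   [2.0, 3.0, 1.0],
                   [1.5, 2.5, 3.5],
                   [1.0, 3.0, 2.0]])
chi2, f = friedman_iman_davenport(scores)
```

In the paper's setting, the matrix would be $100\times 9$ (images by surviving algorithms) for each $K$, and the resulting statistics would be compared against the chi-square and $F$ critical values, respectively.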
For this purpose, we employ the Bergmann-Hommel test^{50} (also at the $\alpha =0.05$ level), a powerful multiple comparison test that has been used successfully in various machine learning studies.^{13}^{,}^{41}^{,}^{51}^{–}^{53} (The power of a binary hypothesis test is the probability that the test correctly rejects the null hypothesis.) Bergmann-Hommel is a dynamic test that considers the logical relations among the hypotheses and is strictly more powerful^{54} than various alternative tests that control the familywise error rate (The familywise error rate is the probability of falsely rejecting at least one null hypothesis when performing multiple comparison tests.) such as Nemenyi,^{55} Holm,^{56} and Shaffer^{57} tests. Table 2 gives the mean rank of each cq algorithm over the dataset for $K\in \{\mathrm{4,16,64,256}\}$ (lower ranks are better). The last column gives the mean of the (mean) ranks over the four $K$ values. Unsurprisingly, the top ranks are occupied by partitional and metaheuristic-based algorithms. As mentioned earlier, the Bergmann-Hommel test is a powerful multiple comparison test. However, this power comes at the expense of a very high computational cost. More specifically, the test takes exponential time in the number of hypotheses;^{58} thus, it cannot handle more than ten algorithms, even on a high-performance cpu. To address this problem, we eliminate the patently inferior algorithms,^{48} namely all seven hierarchical algorithms (i.e., bs, mc, oct, sam, vcl, wan, and wu) and the five partitional or metaheuristic-based algorithms that rank in the bottom half (i.e., atcq, ffatcq, itatcq, pop, and som). 
Thus, the nonparametric statistical analyses below are conducted for the remaining nine algorithms (even if the Bergmann-Hommel test could handle all 21 algorithms, it would have been extremely unwieldy to analyze the test results involving $\binom{21}{2}=210$ hypotheses, as opposed to $\binom{9}{2}=36$ hypotheses, and such an analysis would have more than likely yielded complex and uninterpretable rules), namely abcatcq, adu, bsitatcq, iokm, psoatcq, sfla, wfcm, wsm, and wuatcq.
Table 2. Mean rank of each cq algorithm for $K\in \{4,16,64,256\}$.
For $K=4$ colors, both Friedman and Iman-Davenport tests detect a statistically significant difference in image fidelity (or effectiveness) among the cq algorithms: ${\chi}_{r}^{2}(8)=158.990$ with $p=9.613\mathrm{e}-11$ and ${F}_{r}(\mathrm{8,792})=24.555$ with $p=6.675\mathrm{e}-34$. The results of the Bergmann-Hommel test are given in column 2 of Table 3. For example, the test rejects the null hypothesis “abcatcq versus adu” (i.e., the two algorithms are equally effective). Since we know from Table 2 that abcatcq has a lower (or better) rank than adu for $K=4$, the rejection of the above hypothesis means abcatcq is more effective than adu for $K=4$ and the difference between the two algorithms is statistically significant at the $\alpha =0.05$ level. The results of the Bergmann-Hommel test can be summarized succinctly as follows: wfcm > {abcatcq, psoatcq, sfla, wsm} > {adu, bsitatcq, wuatcq}, where a notation such as {a, b} > c indicates that there is no statistically significant difference between algorithms a and b, and these two algorithms are significantly more effective than (or superior to) algorithm c. The above (summary) rule can be interpreted as follows:
Table 3. Results of the Bergmann-Hommel test for $K\in \{4,16,64,256\}$ (✓: rejected; ✗: not rejected).
Observe that the above summary does not include iokm. This is because, as Table 3 shows, while iokm is inferior to wfcm and superior to {adu, bsitatcq}, it cannot be included in group {abcatcq, psoatcq, sfla, wsm} since it is not superior to wuatcq. Hence, an alternative rule, including iokm but excluding wuatcq, is wfcm > {abcatcq, iokm, psoatcq, sfla, wsm} > {adu, bsitatcq}. For $K=16$ colors, both Friedman and Iman-Davenport tests detect a statistically significant difference in image fidelity (or effectiveness) among the cq algorithms: ${\chi}_{r}^{2}(8)=203.771$ with $p=9.240\mathrm{e}-11$ and ${F}_{r}(\mathrm{8,792})=33.835$ with $p=4.888\mathrm{e}-46$. The results of the Bergmann-Hommel test are given in column 3 of Table 3. In this case, the results can be summarized by a single rule that covers all nine algorithms: psoatcq > {abcatcq, iokm, sfla, wfcm, wsm} > {adu, bsitatcq, wuatcq}, which can be interpreted as follows:
For $K=64$ colors, both Friedman and Iman-Davenport tests detect a statistically significant difference in image fidelity (or effectiveness) among the cq algorithms: ${\chi}_{r}^{2}(8)=330.423$ with $p=1.416\mathrm{e}-10$ and ${F}_{r}(\mathrm{8,792})=69.663$ with $p=1.763\mathrm{e}-86$. The results of the Bergmann-Hommel test are given in column 4 of Table 3. In this case, the results can be summarized by three alternative rules:
Observe that {bsitatcq, wfcm, wuatcq} is the worst group of algorithms in every case. For $K=256$ colors, both Friedman and Iman-Davenport tests detect a statistically significant difference in image fidelity (or effectiveness) among the cq algorithms: ${\chi}_{r}^{2}(8)=394.704$ with $p=1.872\mathrm{e}-10$ and ${F}_{r}(\mathrm{8,792})=96.413$ with $p=1.441\mathrm{e}-111$. The results of the Bergmann-Hommel test are given in column 5 of Table 3. In this case, the results can be summarized by two alternative rules:
Observe that, in either case, {wfcm, wuatcq} is the worst group of algorithms; {bsitatcq, sfla} is the second-worst group; and iokm is superior to the preceding two groups. General remarks with respect to effectiveness (recall that lower ranks are better):
Finally, for each $K$ value, if we were to recommend a single algorithm based on effectiveness considerations, it would be: wfcm for $K=4$, psoatcq for $K\in \{16,64\}$, and adu for $K=256$. It is important to emphasize that in our experiment, we compared these cq algorithms along a single dimension (mse in the cielab color space). There are many other criteria on which one can compare cq algorithms, including computational efficiency, simplicity (conceptual and implementation), and ease of use (e.g., as measured by the number of user-defined parameters). For example, while iokm is inferior to psoatcq based on effectiveness (as quantified by mse), it is vastly superior based on the other three criteria mentioned above. Therefore, there is no such thing as a universal cq algorithm, and the best algorithm depends highly on the application requirements.
4. Conclusions and Future Research Directions
In this paper, we presented cq100, a large, diverse, and high-quality dataset of 24-bit color images released under the cc by-sa 4.0 license. We also demonstrated how cq100 can be used to compare cq algorithms based on the popular mse metric (computed in the cielab color space). Future work includes a multivariate comparison of cq algorithms over cq100 based on several image fidelity metrics. The use of cq100 is not restricted to cq. For example, it can also be used to develop, test, and compare filtering^{59} or segmentation^{60}^{,}^{61} algorithms. However, for some applications, the images in the dataset may need to be annotated by one or more human experts. For example, in a segmentation application, the annotation for a given image could include bounding boxes around objects of interest or a segmentation mask for the entire image. cq100 is significantly larger and more diverse than its two main competitors (i.e., usc-sipi and Kodak).
Although the current size of the dataset appears to be adequate for its primary target application, which is an unsupervised learning task, it should be expanded if it is to be used for supervised learning tasks.
Data, Materials, and Code Availability
The cq100 dataset presented in this paper is available at Ref. 62. The dataset includes the 100 (true-color) input images and their metadata, 8400 (reduced-color) output images (21 cq algorithms $\times$ 100 input images $\times$ $\{4, 16, 64, 256\}$ colors), and Microsoft Excel worksheets containing the mse for each input/output image combination.
Acknowledgments
This material is based upon work supported by the National Science Foundation (Grant No. OIA-1946391). Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation. The authors gratefully acknowledge the following open-source software: 1. median-cut algorithm (gimp, Spencer Kimball and Peter Mattis); 2. octree algorithm (ImageMagick, John Cristy); 3. marginal variance minimization algorithm (Utah Raster Toolkit, Craig E. Kolb); 4. binary splitting algorithm (Charles A. Bouman); 5. variance minimization algorithm (Xiaolin Wu); 6. self-organizing map algorithm (Anthony Dekker); 7. split-and-merge algorithm (Luc Brun); 8. multipletest software^{41} (Salvador García and Francisco Herrera); and 9. BoxPlotR software^{63} (Michaela Spitzer, Jan Wildenhain, Juri Rappsilber, and Mike Tyers). The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
References
G. Sharma, M. J. Vrhel and H. J. Trussell,
“Color imaging for multimedia,”
Proc. IEEE, 86
(6), 1088
–1108 https://doi.org/10.1109/5.687831
(1998).
Google Scholar
R. Ramanath et al.,
“Color image processing pipeline,”
IEEE Signal Process. Mag., 22
(1), 34
–43 https://doi.org/10.1109/MSP.2005.1407713 ISPRE6 1053-5888
(2005).
Google Scholar
P. Heckbert,
“Color image quantization for frame buffer display,”
ACM SIGGRAPH Comput. Graph., 16
(3), 297
–307 https://doi.org/10.1145/965145.801294
(1982).
Google Scholar
X. Wu,
“Color quantization by dynamic programming and principal analysis,”
ACM Trans. Graph., 11
(4), 348
–372 https://doi.org/10.1145/146443.146475 ATGRDF 0730-0301
(1992).
Google Scholar
A. K. Jain, M. N. Murty and P. J. Flynn,
“Data clustering: a review,”
ACM Comput. Surv., 31
(3), 264
–323 https://doi.org/10.1145/331499.331504 ACSUEY 0360-0300
(1999).
Google Scholar
Partitional Clustering Algorithms, Springer(
(2015). Google Scholar
M. E. Celebi,
“Forty years of color quantization: a modern, algorithmic survey,”
Artif. Intell. Rev., https://doi.org/10.1007/s10462-023-10406-6
(2023).
Google Scholar
J. L. Nieves et al.,
“Computing the relevant colors that describe the color palette of paintings,”
Appl. Opt., 59
(6), 1732
–1740 https://doi.org/10.1364/AO.378659 APOPAI 0003-6935
(2020).
Google Scholar
J. L. Nieves et al.,
“Psychophysical determination of the relevant colours that describe the colour palette of paintings,”
J. Imaging, 7
(4), 72 https://doi.org/10.3390/jimaging7040072
(2021).
Google Scholar
M. E. Celebi and A. Zornberg,
“Automated quantification of clinically significant colors in dermoscopy images and its application to skin lesion classification,”
IEEE Syst. J., 8
(3), 980
–984 https://doi.org/10.1109/JSYST.2014.2313671
(2014).
Google Scholar
C. Barata et al.,
“Clinically inspired analysis of dermoscopy images using a generative model,”
Comput. Vis. Image Underst., 151 124
–137 https://doi.org/10.1016/j.cviu.2015.09.011 CVIUF4 1077-3142
(2016).
Google Scholar
M. E. Celebi, H. Kingravi and P. A. Vela,
“A comparative study of efficient initialization methods for the k-means clustering algorithm,”
Expert Syst. Appl., 40
(1), 200
–210 https://doi.org/10.1016/j.eswa.2012.07.021
(2013).
Google Scholar
D. Hasler and S. E. Suesstrunk,
“Measuring colorfulness in natural images,”
Proc. SPIE, 5007 87
–95 https://doi.org/10.1117/12.477378
(2003).
Google Scholar
M. Gervautz, W. Purgathofer,
“A simple method for color quantization: octree quantization,”
New Trends in Computer Graphics, 219
–231 Springer(
(1988). Google Scholar
S. J. Wan, P. Prusinkiewicz and S. K. M. Wong,
“Variance-based color image quantization for frame buffer display,”
Color Res. Appl., 15 52
–58 https://doi.org/10.1002/col.5080150109 CREADU 0361-2317
(1990).
Google Scholar
M. Orchard and C. Bouman,
“Color quantization of images,”
IEEE Trans. Signal Process., 39
(12), 2677
–2690 https://doi.org/10.1109/78.107417 ITPRED 1053-587X
(1991).
Google Scholar
X. Wu, 126
–133 Academic Press(
(1991). Google Scholar
A. Dekker,
“Kohonen neural networks for optimal colour quantization,”
Netw. Computat. Neural Syst., 5
(3), 351
–367 https://doi.org/10.1088/0954-898X/5/3/003
(1994).
Google Scholar
L. Brun and M. Mokhtari,
“Two high speed color quantization algorithms,”
in Proc. 1st Int. Conf. Color in Graph. and Image Process.,
116
–121
(2000). Google Scholar
M. E. Celebi,
“Improving the performance of k-means for color quantization,”
Image Vis. Comput., 29
(4), 260
–271 https://doi.org/10.1016/j.imavis.2010.10.002
(2011).
Google Scholar
Q. Wen and M. E. Celebi,
“Hard vs. fuzzy c-means clustering for color quantization,”
EURASIP J. Adv. Signal Process., 2011
(1), 118
–129 https://doi.org/10.1186/1687-6180-2011-118
(2011).
Google Scholar
M. E. Celebi, S. Hwang and Q. Wen,
“Colour quantisation using the adaptive distributing units algorithm,”
Imaging Sci. J., 62
(2), 80
–91 https://doi.org/10.1179/1743131X13Y.0000000059
(2014).
Google Scholar
M. E. Celebi, Q. Wen and S. Hwang,
“An effective real-time color quantization method based on divisive hierarchical clustering,”
J. Real-Time Image Process., 10
(2), 329
–344 https://doi.org/10.1007/s11554-012-0291-4
(2015).
Google Scholar
M.-L. Pérez-Delgado, "Colour quantization with ant-tree," Appl. Soft Comput. 36, 656–669 (2015). https://doi.org/10.1016/j.asoc.2015.07.048

M.-L. Pérez-Delgado, "Artificial ants and fireflies can perform colour quantisation," Appl. Soft Comput. 73, 153–177 (2018). https://doi.org/10.1016/j.asoc.2018.08.018

M.-L. Pérez-Delgado, "The color quantization problem solved by swarm-based operations," Appl. Intell. 49(7), 2482–2514 (2019). https://doi.org/10.1007/s10489-018-1389-6

M.-L. Pérez-Delgado, "Color image quantization using the shuffled-frog leaping algorithm," Eng. Appl. Artif. Intell. 79, 142–158 (2019). https://doi.org/10.1016/j.engappai.2019.01.002

M.-L. Pérez-Delgado and J.-Á. Román-Gallego, "A hybrid color quantization algorithm that combines the greedy orthogonal bi-partitioning method with artificial ants," IEEE Access 7, 128714–128734 (2019). https://doi.org/10.1109/ACCESS.2019.2937934

M.-L. Pérez-Delgado, "Color quantization with particle swarm optimization and artificial ants," Soft Comput. 24(6), 4545–4573 (2020). https://doi.org/10.1007/s00500-019-04216-8

M.-L. Pérez-Delgado, "A mixed method with effective color reduction," Appl. Sci. 10(21), 7819 (2020). https://doi.org/10.3390/app10217819

M.-L. Pérez-Delgado, "Revisiting the iterative ant-tree for color quantization algorithm," J. Visual Commun. Image Represent. 78, 103180 (2021). https://doi.org/10.1016/j.jvcir.2021.103180

A. Abernathy and M. E. Celebi, "The incremental online k-means clustering algorithm and its application to color quantization," Expert Syst. Appl. 207, 117927 (2022). https://doi.org/10.1016/j.eswa.2022.117927
M. Anderson et al., "Proposal for a standard default color space for the internet—sRGB," in Proc. Color and Imaging Conf., 238–245 (1996).

B. Ortiz-Jaramillo et al., "Evaluation of color differences in natural scene color images," Signal Process. Image Commun. 71, 128–137 (2019). https://doi.org/10.1016/j.image.2018.11.009

S. García and F. Herrera, "An extension on 'statistical comparisons of classifiers over multiple data sets' for all pairwise comparisons," J. Mach. Learn. Res. 9, 2677–2694 (2008).
M. Friedman, "The use of ranks to avoid the assumption of normality implicit in the analysis of variance," J. Am. Stat. Assoc. 32(200), 675–701 (1937). https://doi.org/10.2307/2279372

R. L. Iman and J. M. Davenport, "Approximations of the critical region of the Friedman statistic," Commun. Stat. Theory Methods 9(6), 571–595 (1980). https://doi.org/10.1080/03610928008827904

J. Demšar, "Statistical comparisons of classifiers over multiple data sets," J. Mach. Learn. Res. 7, 1–30 (2006).
J. Luengo, S. García, and F. Herrera, "A study on the use of statistical tests for experimentation with neural networks: analysis of parametric test conditions and non-parametric tests," Expert Syst. Appl. 36(4), 7798–7808 (2009). https://doi.org/10.1016/j.eswa.2008.11.041

S. García et al., "A study of statistical techniques and performance measures for genetics-based machine learning: accuracy and interpretability," Soft Comput. 13, 959–977 (2009). https://doi.org/10.1007/s00500-008-0392-y

S. García et al., "A study on the use of non-parametric tests for analyzing the evolutionary algorithms' behaviour: a case study on the CEC'2005 special session on real parameter optimization," J. Heuristics 15(6), 617–644 (2009). https://doi.org/10.1007/s10732-008-9080-4

J. Carrasco et al., "Recent trends in the use of statistical tests for comparing swarm and evolutionary computing algorithms: practical guidelines and a critical review," Swarm Evol. Computat. 54, 100665 (2020). https://doi.org/10.1016/j.swevo.2020.100665

W. W. Daniel, Applied Nonparametric Statistics, 2nd ed., PWS-KENT Publishing Company (1990).
B. Bergmann and G. Hommel, "Improvements of general multiple test procedures for redundant systems of hypotheses," in Multiple Hypotheses Testing, pp. 100–115, Springer (1988).

J. Derrac et al., "A practical tutorial on the use of nonparametric statistical tests as a methodology for comparing evolutionary and swarm intelligence algorithms," Swarm Evol. Computat. 1(1), 3–18 (2011). https://doi.org/10.1016/j.swevo.2011.02.002

R. M. O. Cruz, R. Sabourin, and G. D. C. Cavalcanti, "Dynamic classifier selection: recent advances and perspectives," Inf. Fusion 41, 195–216 (2018). https://doi.org/10.1016/j.inffus.2017.09.010

U. Johansson et al., "Rule extraction with guarantees from regression models," Pattern Recognit. 126, 108554 (2022). https://doi.org/10.1016/j.patcog.2022.108554

G. Hommel and G. Bernhard, "Bonferroni procedures for logically related hypotheses," J. Stat. Plann. Inference 82(1–2), 119–128 (1999). https://doi.org/10.1016/S0378-3758(99)00035-X
P. B. Nemenyi, "Distribution-free multiple comparisons," PhD thesis, Princeton University (1963).
S. Holm, "A simple sequentially rejective multiple test procedure," Scand. J. Stat. 6(2), 65–70 (1979). https://doi.org/10.2307/4615733

J. P. Shaffer, "Modified sequentially rejective multiple test procedures," J. Am. Stat. Assoc. 81(395), 826–831 (1986). https://doi.org/10.2307/2289016

G. Hommel and G. Bernhard, "A rapid algorithm and a computer program for multiple test procedures using logical structures of hypotheses," Comput. Methods Programs Biomed. 43(3–4), 213–216 (1994). https://doi.org/10.1016/0169-2607(94)90072-8

M. E. Celebi, H. A. Kingravi, and Y. A. Aslandogan, "Nonlinear vector filtering for impulsive noise removal from color images," J. Electron. Imaging 16(3), 033008 (2007). https://doi.org/10.1117/1.2772639
Y. Deng and B. Manjunath, "Unsupervised segmentation of color-texture regions in images and video," IEEE Trans. Pattern Anal. Mach. Intell. 23(8), 800–810 (2001). https://doi.org/10.1109/34.946985

M. Mignotte, "Segmentation by fusion of histogram-based k-means clusters in different color spaces," IEEE Trans. Image Process. 17(5), 780–787 (2008). https://doi.org/10.1109/TIP.2008.920761

M. Spitzer et al., "BoxPlotR: a web tool for generation of box plots," Nat. Methods 11, 121–122 (2014). https://doi.org/10.1038/nmeth.2811
Biography

M. Emre Celebi received his PhD in computer science and engineering from the University of Texas at Arlington, United States. He is currently a professor and the chair of the Department of Computer Science and Engineering at the University of Central Arkansas, United States. He has published more than 170 articles on image processing/analysis and data mining; according to Google Scholar, his work has received more than 14,000 citations to date. He is a senior member of the IEEE and a fellow of the SPIE.

María-Luisa Pérez-Delgado received her engineering degree in computer science from the University of Valladolid, Spain, and her PhD from the University of Salamanca, Spain, where she is currently a professor. Her research interests include artificial intelligence, optimization, graph theory, and data mining. She has published several research articles and books in these areas.