Superpixel-guided preprocessing algorithm for accelerating hyperspectral endmember extraction based on spatial–spectral analysis

Abstract. Preprocessing is a major area of interest in the field of hyperspectral endmember extraction, for it can provide a few high-quality candidates for fast endmember extraction without sacrificing endmember accuracy. We propose a superpixel-guided preprocessing (SGPP) algorithm to accelerate endmember extraction based on spatial compactness and spectral purity analysis. The proposed SGPP first transforms a hyperspectral image into low-dimension data using principal component analysis. SGPP then utilizes the superpixel method, which normally has linear complexity, to segment the first three components into a set of superpixels. Next, SGPP transforms low-dimension superpixels into noise-reduced superpixels and calculates their spatial compactness and spectral purity based on Tukey’s test and data convexity. SGPP finally retains a few high-quality pixels from each superpixel with high spatial compactness and spectral purity indices for subsequent endmember identification. Based on the spectral angle distance, root-mean-square error, and speedup, experiments are conducted on synthetic and real hyperspectral datasets, and they indicate that SGPP is superior to current state-of-the-art preprocessing techniques.


Introduction
Due to the limited spatial resolution of a hyperspectral sensor, different materials can jointly occupy a single pixel, which then appears as a mixed pixel in a hyperspectral image (HSI). The process of decomposing a mixed pixel into a collection of constituent spectra, or endmembers, and a set of corresponding fractions, or abundances, is called hyperspectral unmixing. 1 The last three decades have witnessed a huge growth in endmember extraction algorithms (EEAs) because the exploitation of endmembers is a prerequisite to accurate estimation of abundance fractions. Most such algorithms only take spectral information into account. They generally investigate the convexity of the data structure and treat the vertices of the simplex as potential endmembers. If the desired vertices exist, they are identified as representatives of endmembers, which are determined by maximizing the determinant until the maximal simplex volume is reached or by capturing extreme projections when the pixels lie on a subspace. Otherwise, boundary pixels (at least p − 1 pixels in each facet) are used as replacements, and endmembers are generated by minimizing the volume of the enclosing simplex. The classic maximal simplex volume strategy-based EEAs include N-FINDR 2 and its important extensions, such as the simplex growing algorithm 3 and successive volume maximization. 4 Classic orthogonal subspace projection (OSP) strategy-based algorithms include the pixel purity index, 5 OSP, 6 and vertex component analysis (VCA). 7 Minimal simplex volume strategy-based algorithms include minimum volume transform, 8 minimum volume enclosing simplex, 9 simplex identification via variable splitting and augmented Lagrangian, 10 and minimum volume simplex analysis (MVSA). 11
*Address all correspondence to Wenxing Bao, baowenxing@nun.edu.cn
Recently, many spatial-spectral-based EEAs have been proposed; these integrate spatial context into spectral-based unmixing processes and include automatic morphological endmember extraction, 12 spatial-spectral information-based endmember bundle extraction, 13 spatially weighted simplex strategy, 14 and spatial energy-constrained maximum volume (SENMAV). 15 It is worth mentioning that the abovementioned EEAs, especially spectral-based EEAs, focus mainly on spectral information without considering spatial context. Such spectral-based EEAs also search for endmembers among all pixels, which is time-consuming. In this regard, numerous spatial-spectral-based preprocessing algorithms (PPAs) have been proposed; these are independent modules that generally utilize both spatial and spectral information to offer a few high-quality candidates for fast endmember extraction without loss of endmember accuracy. The current PPAs are mainly divided into two categories. The first focuses primarily on reconstructing each target pixel using its surroundings. A representative PPA of this strategy, proposed by Zortea and Plaza 16 and called spatial preprocessing (SPP), reconstructs the central pixel using a spectral weight scale in its neighborhood. The second preprocessing strategy assumes that the desired endmembers are far fewer than the HSI pixels, meaning that many pixels within the HSI are redundant and thus should be removed. Under this strategy, numerous PPAs have been proposed based on fusing spatial-spectral information and removing low-quality redundant pixels. To emphasize spatially homogeneous and spectrally pure regions, Martín et al. 17 proposed a region-based spatial preprocessing (RBSPP) algorithm that adaptively searches for the most spectrally pure local regions as endmember candidates using a hybrid procedure that combines unsupervised clustering and OSP.
To combine spatial and spectral information and find high-quality endmember candidates, Martín et al. 18 proposed a spatial-spectral information-based algorithm (SSPP) that fuses spatial and spectral information from the perspectives of multi-scale Gaussian filtering, homogeneous index calculation, and spectral clustering to select endmember candidates prior to endmember identification. To reduce the redundancy of intraclass pixels and improve the importance between class boundaries, Kowkabi et al. 19 proposed a spatial-spectral preprocessing module (SSPM) algorithm that performs clustering and boundary removal to retain homogeneous regions and captures pixels with high spectral purity as endmember candidates by projecting homogeneous pixels onto eigenvectors. To effectively explore homogeneous regions with a faster processing speed, Xu et al. 20 proposed a regional clustering-based spatial preprocessing algorithm (RCSPP) that identifies homogeneous areas using a modified simple linear iterative clustering (SLIC) algorithm and yields a set of endmember candidates by retaining pixels with high spectral purity from each superpixel. To simultaneously consider the local spatial correlations and spectral purity of each target pixel, Shen and Bao 21 proposed a spatial energy and spectral purity-based preprocessing module to remove many redundant low-quality pixels. To exploit geometrical correlations between the target pixel and its surroundings, Kowkabi and Keshawarz 22 developed a spectral geodesic and spatial Euclidean weights-based preprocessing, which considers two types of weights between target pixels and their neighborhoods to determine the final data subset for endmember extraction. To reduce the computational burden and noise, Shen et al. 23 recently proposed a subspace-based preprocessing module that transforms hyperspectral data into a low-dimensional subspace for endmember selection.
The above-mentioned PPAs fall into two groups (see Fig. 1). The first, which includes SPP, reconstructs each pixel using its spatial neighborhood to provide a better-quality image. The second removes redundant pixels from the image by jointly employing spatial and spectral information; representative methods include SSPP, RBSPP, RCSPP, and SSPM. However, such PPAs still have two main drawbacks: (1) they are sensitive to noise, which strongly affects the quality of the extracted data subset, and (2) they still require relatively long computation times, which can make PPA-EEA combinations more time-consuming than the EEAs run without a PPA.
To improve the noise robustness and reduce the computational burden, this paper outlines a superpixel-guided preprocessing (SGPP) algorithm based on spatial compactness and spectral purity. To quickly obtain the spatial context, SGPP reduces the HSI to low-dimensional data using principal component analysis (PCA) and uses SLIC 24 to segment the first three components into a set of superpixels. For each superpixel, SGPP transforms the low-dimensional superpixel into a noise-reduced superpixel and calculates its spatial compactness and spectral purity to find high-quality pixels. SGPP finally retains a few high-quality pixels from each superpixel as the input of the EEA. The preprocessing flowchart of SGPP is presented in Fig. 2. Compared with the current PPAs, SGPP follows a similar preprocessing strategy that removes redundant pixels from the image by jointly using spatial and spectral information, but it achieves a shorter computation time and higher noise robustness.
We make three primary contributions.
• We propose a superpixel-guided PPA that first generates superpixels from the first three principal components of the HSI and then transforms them into noise-reduced superpixels, which reduces both computation time and noise.
• We discuss spatial-spectral information exploitation and fusion processes, which rely heavily on Tukey's test and data convexity.
• Compared with existing PPAs, the proposed SGPP has a negligible computational burden, high endmember accuracy, and usable parameters.
The remainder of this paper is organized as follows. SGPP is introduced in Sec. 2. Section 3 displays experimental results comparing SGPP and other algorithms. Section 4 concludes this paper with some remarks.

Proposed SGPP Algorithm
Fig. 2 Visual descriptions of SGPP. SGPP first uses PCA to decompose hyperspectral data into p − 1 dimension components and corresponding eigenvectors. Then SGPP segments the first three components into a set of superpixels because the first three components normally contain over 99% of the total spectral variance. Next, SGPP projects each superpixel onto all of the eigenvectors to recover the noise-reduced data space, which is repeatedly projected onto the eigenvectors to capture the projections of each pixel within the superpixel. The projections are subsequently used to determine the pixels' spectral purity and spatial compactness using Tukey's test. Finally, SGPP fuses both spatial and spectral information and specifies high-quality subsets.

In linear spectral mixing analysis, the observed light is modeled as a linear combination of the spectra of different materials (or endmembers). Suppose $\mathbf{Y} = [\mathbf{y}_1, \mathbf{y}_2, \ldots, \mathbf{y}_N] \in \mathbb{R}^{B \times N}$ is a hyperspectral image with B bands and N total pixels. It is formulated in terms of the endmember, abundance, and noise matrices:

$$\mathbf{Y} = \mathbf{M}\mathbf{A} + \mathbf{W}, \tag{1}$$

where $\mathbf{M} \in \mathbb{R}^{B \times p}$, $\mathbf{A} \in \mathbb{R}^{p \times N}$, and $\mathbf{W} \in \mathbb{R}^{B \times N}$ are the endmember, abundance, and additive noise matrices, respectively, and p is the number of endmembers, which is estimated by classic techniques such as virtual dimensionality (VD). 25 EEAs concentrate on identifying the endmember matrix $\mathbf{M}$, i.e., the most spectrally pure signatures, from the whole hyperspectral data $\mathbf{Y}$, and on applying the fully constrained least-square method 26 to estimate the abundances with respect to the endmembers. Not all spectral signatures of hyperspectral data are useful in the search for endmembers, i.e., only a few high-quality pixels need to be selected from the data. From the perspective of the data simplex, only vertices and boundary pixels play a crucial role in specifying endmembers, with pure pixel assumption-based s-EEAs concentrating on the vertices and non-pure pixel assumption-based s-EEAs paying attention to the boundary pixels. In this regard, the aim of SGPP is to shrink hyperspectral data by removing as many interior pixels as possible.
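As a minimal illustration of the linear mixing model in Eq. (1), the following sketch builds a single pixel from a set of endmember spectra and abundance fractions. The function name `mix_pixel`, the two three-band spectra, and the abundance values are hypothetical and chosen only for readability; they are not taken from the experiments.

```python
def mix_pixel(endmembers, abundances, noise=None):
    """Return one pixel y = M a + w of the linear mixing model.

    endmembers: list of p spectra, each a list of B reflectance values
                (the columns of M).
    abundances: list of p fractions (one column of A).
    noise:      optional list of B additive noise values (one column of W).
    """
    bands = len(endmembers[0])
    if noise is None:
        noise = [0.0] * bands
    return [
        sum(a * m[b] for a, m in zip(abundances, endmembers)) + noise[b]
        for b in range(bands)
    ]


# Two hypothetical endmembers with three bands each.
M = [[0.2, 0.4, 0.6], [0.8, 0.6, 0.2]]

pure = mix_pixel(M, [1.0, 0.0])   # a pure pixel reproduces the first endmember
mixed = mix_pixel(M, [0.5, 0.5])  # an equal mixture lies between both spectra
```

A pure pixel sits at a vertex of the data simplex, whereas the mixed pixel lies in its interior, which is the kind of pixel a preprocessing step aims to discard.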

Step 1: Superpixel Generation Process
A superpixel is normally a local, irregular, homogeneous area comprising a set of spatially correlated and spectrally similar pixels. Many superpixel algorithms, such as SLIC, entropy rate, 27 and watershed, 28 have been applied to hyperspectral images to explore their spatial context. 29,30 Superpixel-based methods have lower computational complexity than traditional spatial context exploration methods, such as clustering or sliding windows. We exploit the promising segmentation performance and linear complexity of SLIC to generate a set of superpixels for the HSI.
SLIC converts an RGB image to the CIELAB color space. Each pixel has a five-dimensional vector $[l, a, b, r, s]^{\mathsf{T}}$, where $[l, a, b]^{\mathsf{T}}$ and $[r, s]^{\mathsf{T}}$ are the CIELAB color components and the pixel position, respectively. The distance $D_{ij}$ between pixels $\mathbf{y}_i$ and $\mathbf{y}_j$ is

$$\begin{cases} d_c = \sqrt{(l_i - l_j)^2 + (a_i - a_j)^2 + (b_i - b_j)^2}, \\ d_s = \sqrt{(r_i - r_j)^2 + (s_i - s_j)^2}, \\ D_{ij} = \sqrt{(d_c / M)^2 + (d_s / K)^2}, \end{cases} \tag{2}$$

where $d_c$ and $d_s$ are the color and spatial distances, M is a constant related to the compactness of the superpixel, and K is the maximum spatial distance used to specify the regional clustering area. Based on the initial clusters and the distance measurement, the initial clustering areas are finally grouped into irregular superpixels. Figure 3 shows the different clustering strategies of k-means and SLIC.

SLIC takes an RGB image as its input, yet a hyperspectral image contains hundreds of contiguous spectral bands. Two methods are considered to solve this problem. The first selects three bands with wavelengths corresponding to red, green, and blue as the RGB image, and the second uses a dimension reduction method such as PCA to capture the first three components as an RGB image. We adopt the second method because it has at least two advantages: (1) the first three components decomposed by PCA normally contain almost 99% of the information in the input hyperspectral data and can therefore serve as the input of SLIC, and (2) the use of PCA simultaneously alleviates noise.
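The distance measure of Eq. (2) can be sketched as follows. This assumes the widely used SLIC form in which the color distance is scaled by the compactness constant and the spatial distance by the maximum spatial distance; the function name `slic_distance`, the parameter names, and the sample values are illustrative assumptions rather than the settings used in the paper.

```python
import math


def slic_distance(p_i, p_j, compactness, max_spatial):
    """Combined SLIC distance between two pixels.

    Each pixel is a 5-tuple (l, a, b, r, s): three color components
    followed by the row and column position.
    compactness plays the role of M and max_spatial the role of K.
    """
    d_c = math.dist(p_i[:3], p_j[:3])  # color distance
    d_s = math.dist(p_i[3:], p_j[3:])  # spatial distance
    return math.hypot(d_c / compactness, d_s / max_spatial)


# Hypothetical pixels: color distance 5, spatial distance 10.
pixel_i = (50.0, 10.0, 10.0, 0.0, 0.0)
pixel_j = (50.0, 13.0, 14.0, 6.0, 8.0)
d = slic_distance(pixel_i, pixel_j, compactness=10.0, max_spatial=20.0)
```

A larger compactness value reduces the influence of the color term, so the spatial term dominates and the resulting superpixels become more regular in shape.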
SGPP first transforms the HSI into p − 1 dimension data. The first three components are then regarded as an RGB image, which is segmented by the SLIC algorithm. Each superpixel is converted to a noise-reduced data space of dimension L for the subsequent spatial compactness and spectral purity analysis.

Step 2: Spatial Compactness Analysis
Spatial information is traditionally obtained using a sliding window to calculate the spectral difference or label dependence between a pixel and its adjacent pixels in the local neighborhood system. 16,18,19 However, this is time-consuming because each pixel undergoes this operation in sequence. SGPP therefore discards the traditional spatial exploitation process and instead treats spatial information as a matter of specifying compactness. Specifically, spatial compactness relies on the fact that spatially correlated pixels should be close to each other when they are projected into a subspace. SGPP thus projects the pixels belonging to the same superpixel onto a set of projection bases decomposed by PCA. By applying Tukey's test, SGPP can quickly observe the distribution of the projections and remove outliers, because Tukey's test defines a normal data range by calculating upper and lower limits from the quartiles. Data outside these limits are regarded as abnormal. Details of Tukey's test can be found in Kraaikamp and Meester 31 and Aggarwal. 32

Suppose $\mathbf{S} = [\mathbf{s}_1, \mathbf{s}_2, \ldots, \mathbf{s}_m] \in \mathbb{R}^{B \times m}$ is a segmented superpixel with B bands and m total pixels, and $V = \{\mathbf{v}_i\}_{i=1}^{p}$ is a set of projection bases decomposed by PCA. The superpixel data are projected onto $\mathbf{v}_i$ as

$$\mathbf{x} = \mathbf{S}^{\mathsf{T}} \mathbf{v}_i, \tag{3}$$

where $\mathbf{x} = (x_1, x_2, \ldots, x_m) \in \mathbb{R}^{m}$. We arrange the elements of $\mathbf{x}$ in ascending order to obtain a new vector $\tilde{\mathbf{x}}$ with elements $\tilde{x}_1 \le \tilde{x}_2 \le \cdots \le \tilde{x}_m$. Then a quartile is given as

$$Q_q = \operatorname{quartile}_q(\tilde{\mathbf{x}}), \quad q \in \{1, 2, 3\}, \tag{4}$$

where $Q_1$, i.e., q = 1, denotes the lower or first quartile and $Q_3$, i.e., q = 3, denotes the upper or third quartile. It is noteworthy that the second quartile is the median of the data samples. The interquartile range (IQR) is

$$\mathrm{IQR} = Q_3 - Q_1. \tag{5}$$

Measuring a distance of 1.5 times the IQR below the lower quartile and above the upper quartile gives the lower and upper limits. 31 The lower and upper limits are $Q_1 - 1.5 \times \mathrm{IQR}$ and $Q_3 + 1.5 \times \mathrm{IQR}$, respectively. According to the data range between the lower and upper limits, we redefine the spatial information of the pixel $\mathbf{s}_j$ on $\mathbf{v}_i$ as

$$\varphi^{(i)}_{\mathbf{s}_j} = \begin{cases} 1, & Q_1 - 1.5 \times \mathrm{IQR} \le x_j \le Q_3 + 1.5 \times \mathrm{IQR}, \\ 0, & \text{otherwise}. \end{cases} \tag{6}$$

By calculating the spatial information of the pixel $\mathbf{s}_j$ on all of the projection bases, we capture its spatial compactness as

$$\varphi_{\mathbf{s}_j} = \prod_{i=1}^{p} \varphi^{(i)}_{\mathbf{s}_j}. \tag{7}$$

When the pixels located in the same superpixel are spatially homogeneous and contain no anomalous pixels, the spatial compactness vector of the superpixel, $\mathbf{sc} = (\varphi_{\mathbf{s}_1}, \varphi_{\mathbf{s}_2}, \ldots, \varphi_{\mathbf{s}_m}) \in \mathbb{R}^{m}$, should be a vector of ones, which implies that the superpixel $\mathbf{S}$ has high spatial compactness, whereas zero elements imply low spatial compactness. Figure 4(a) displays a flowchart of spatial compactness, where orange signatures or points are detected by the above-mentioned process.
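The Tukey-based compactness analysis described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: quartiles are computed with the inclusive interpolation rule of Python's `statistics.quantiles`, which may differ from the quartile definition of Eq. (4), and the per-basis indicators are combined by a product so that a pixel flagged as an outlier on any basis receives zero compactness. The function names and the sample projections are hypothetical.

```python
import math
from statistics import quantiles


def tukey_inliers(projections, k=1.5):
    """Return 1 for projections inside Tukey's fences and 0 for outliers."""
    q1, _, q3 = quantiles(projections, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [1 if lower <= x <= upper else 0 for x in projections]


def spatial_compactness(projections_per_basis):
    """Combine the per-basis indicators of every pixel by a product.

    projections_per_basis[i][j] is the projection of pixel j on basis i.
    """
    flags = [tukey_inliers(x) for x in projections_per_basis]
    return [math.prod(column) for column in zip(*flags)]


# Hypothetical projections of five pixels on two bases; the last pixel
# is anomalous on the first basis only.
projections = [
    [1.0, 2.0, 3.0, 4.0, 50.0],
    [0.1, 0.2, 0.3, 0.4, 0.5],
]
sc = spatial_compactness(projections)
```

In this example the first four pixels obtain a compactness of one, and the fifth pixel obtains zero because its projection on the first basis falls outside the fences.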

Step 3: Spectral Purity Calculation
Based on the projection bases, we project the data points $\mathbf{S}$ onto the i'th vector $\mathbf{v}_i$:

$$\mathbf{x}^{(i)} = \mathbf{S}^{\mathsf{T}} \mathbf{v}_i. \tag{8}$$

The mean of the maximum and minimum projections on the i'th vector is

$$\bar{x}^{(i)} = \frac{\max(\mathbf{x}^{(i)}) + \min(\mathbf{x}^{(i)})}{2}. \tag{9}$$

Over all of the vectors, the spectral purity of the pixel $\mathbf{s}_j$ is

$$\phi_{\mathbf{s}_j} = \sum_{i=1}^{p} \operatorname{abs}\left(x^{(i)}_j - \bar{x}^{(i)}\right), \tag{10}$$

where abs(·) denotes the absolute value. By considering the spectral purity of all of the pixels, we obtain the spectral purity vector of the superpixel, $\mathbf{sp} = (\phi_{\mathbf{s}_1}, \phi_{\mathbf{s}_2}, \ldots, \phi_{\mathbf{s}_m}) \in \mathbb{R}^{m}$. Figure 4(b) details the process of identifying spectral purity, with red points denoting higher spectral purity than blue points.

Step 4: Integration of Spatial Compactness and Spectral Purity
If a superpixel contains anomalous pixels, they will generally have a very high or low spectral reflectance that can lead to high spectral purity, but their spatial compactness is low since there are zero elements in the spatial compactness vector of the superpixel [see Eq. (6)]. We fuse the spatial compactness and spectral purity of the pixels and define the spatial compactness and spectral purity index (SCSPI) as

$$\mathrm{SCSPI} = \mathbf{sc} \odot \mathbf{sp}, \tag{11}$$

where ⊙ denotes the Hadamard product. By sorting the pixels in descending order of SCSPI, the pixels with the highest λ percent of SCSPI values are retained for endmember extraction.
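The purity calculation and the fusion step can be sketched together as follows. The sketch assumes that spectral purity is the sum, over all bases, of the absolute distance between a projection and the midpoint of the extreme projections, and that λ is given as a fraction between 0 and 1; keeping at least one pixel per superpixel is an additional assumption of this example. All function names and sample values are hypothetical.

```python
import math


def spectral_purity(projections_per_basis):
    """Sum of absolute distances to the midpoint of the extreme projections.

    projections_per_basis[i][j] is the projection of pixel j on basis i.
    """
    purity = [0.0] * len(projections_per_basis[0])
    for x in projections_per_basis:
        midpoint = (max(x) + min(x)) / 2.0
        for j, value in enumerate(x):
            purity[j] += abs(value - midpoint)
    return purity


def select_candidates(sc, sp, ratio):
    """Return the indices of the pixels with the highest SCSPI values."""
    scspi = [c * p for c, p in zip(sc, sp)]  # Hadamard product
    keep = max(1, math.ceil(ratio * len(scspi)))
    order = sorted(range(len(scspi)), key=lambda j: scspi[j], reverse=True)
    return sorted(order[:keep])


# Hypothetical superpixel with five pixels and one basis; the last pixel
# is anomalous and has already received zero spatial compactness.
sp = spectral_purity([[1.0, 2.0, 3.0, 4.0, 50.0]])
sc = [1, 1, 1, 1, 0]
selected = select_candidates(sc, sp, ratio=0.4)
```

The anomalous pixel ties for the highest spectral purity, yet it is not selected because its zero compactness cancels its SCSPI value, which is the behavior the fusion step is meant to provide.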

Results
This section details experimental results obtained from different algorithms on various experimental scenarios. All of the algorithms ran on a PC with an Intel Core i7-2600K (at 3.4 GHz) and 16 GB RAM.

Benchmark Methods
Four representative PPAs, i.e., SPP, SSPP, SSPM, and RCSPP, were compared with our proposed SGPP algorithm; SPP reconstructs each pixel using its neighborhood, and the others (i.e., SSPP, SSPM, RCSPP, and SGPP) remove redundant pixels from the image. Four representative spectral-based EEAs, i.e., OSP, N-FINDR, VCA, and MVSA, were combined with the PPAs to validate experimental performance. OSP and VCA find endmembers by specifying pixels with maximum subspace projections, whereas N-FINDR and MVSA find endmembers by forming a maximum inner simplex volume and a minimum external simplex volume, respectively. We consider these algorithms for experimental comparison mainly because they are the most representative and widely discussed methods.

Spectral angle distance
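SAD is used to assess the spectral similarity between the extracted endmember spectra and the library spectra and is given by

$$\mathrm{SAD}(\mathbf{y}_i, \tilde{\mathbf{y}}_i) = \arccos\left(\frac{\mathbf{y}_i^{\mathsf{T}} \tilde{\mathbf{y}}_i}{\lVert \mathbf{y}_i \rVert \, \lVert \tilde{\mathbf{y}}_i \rVert}\right),$$

where $\mathbf{y}_i$ and $\tilde{\mathbf{y}}_i$ are the extracted endmember spectrum and the library spectrum, respectively. A higher spectral similarity between $\mathbf{y}_i$ and $\tilde{\mathbf{y}}_i$ means a smaller SAD.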
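For illustration, the spectral angle can be computed as in the sketch below, which uses the usual arccosine of the normalized inner product. The function name `spectral_angle` and the sample spectra are hypothetical.

```python
import math


def spectral_angle(u, v):
    """Spectral angle in radians between two spectra of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norms = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    # Clamp to [-1, 1] to guard against rounding slightly outside the domain.
    cosine = max(-1.0, min(1.0, dot / norms))
    return math.acos(cosine)


# A spectrum and a scaled copy have (almost) zero angle, so the measure
# ignores overall brightness differences.
same_shape = spectral_angle([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
orthogonal = spectral_angle([1.0, 0.0], [0.0, 1.0])
```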

Root-mean-square error
Root-mean-square error (RMSE) is used to evaluate the hyperspectral image reconstruction error between the original and estimated images and is given by

$$\mathrm{RMSE}(\mathbf{Y}, \hat{\mathbf{Y}}) = \sqrt{\frac{1}{BN} \lVert \mathbf{Y} - \hat{\mathbf{Y}} \rVert_F^2},$$

where $\mathbf{Y}$ and $\hat{\mathbf{Y}}$ represent the original and estimated images, respectively. Both have B bands and N total pixels. A lower RMSE means a better reconstruction performance.
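A small sketch of the reconstruction error follows. It assumes that the squared differences are averaged over all B × N entries before the square root is taken; the function name `rmse` and the tiny two-band images are hypothetical.

```python
import math


def rmse(original, estimated):
    """Root-mean-square error between two images stored as B rows of N pixels."""
    total = 0.0
    count = 0
    for row_o, row_e in zip(original, estimated):
        for o, e in zip(row_o, row_e):
            total += (o - e) ** 2
            count += 1
    return math.sqrt(total / count)


# Two bands and two pixels; a single entry differs by 2.
Y = [[1.0, 2.0], [3.0, 4.0]]
Y_hat = [[1.0, 2.0], [3.0, 2.0]]
error = rmse(Y, Y_hat)
```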

Speedup
Speedup measures the ratio between the computational cost of an EEA without and with a coupled PPA and is defined as

$$\mathrm{Speedup} = \frac{T_{\mathrm{eea}}}{T_{\mathrm{ppa}} + \tilde{T}_{\mathrm{eea}}},$$

where $T_{\mathrm{eea}}$, $T_{\mathrm{ppa}}$, and $\tilde{T}_{\mathrm{eea}}$ are the EEA execution time on the original hyperspectral data, the PPA preprocessing cost, and the EEA execution time on the preprocessed hyperspectral data, respectively. A speedup >1 implies that the PPA accelerates endmember extraction.
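The definition can be checked with a short calculation. The timings below are made-up values used only to show how the ratio behaves; they are not measurements from the experiments.

```python
def speedup(t_eea, t_ppa, t_eea_reduced):
    """Ratio of the EEA time alone to the PPA time plus the reduced EEA time."""
    return t_eea / (t_ppa + t_eea_reduced)


# Hypothetical timings in seconds.
accelerated = speedup(t_eea=10.0, t_ppa=0.5, t_eea_reduced=2.0)
slowed_down = speedup(t_eea=10.0, t_ppa=15.0, t_eea_reduced=2.0)
```

The second case shows why the preprocessing cost matters: even though the EEA itself runs five times faster on the reduced data, an expensive PPA pushes the overall speedup below one.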

Synthetic dataset DS1
Using fractals to simulate spatial patterns, a 100 × 100 pixel synthetic image [see Fig. 5(a)] was generated with nine endmembers and 221 bands, 33 the endmembers of which were selected from the U.S. Geological Survey (USGS) library. 34 To simulate real-world scenarios, the pixels closer to the border of a region are more heavily mixed, whereas the pixels at the center of the region are more spectrally pure. 18 Zero-mean Gaussian noise was added to the fractal 1 dataset with signal-to-noise ratios varying from 10 to 60 dB.

Synthetic dataset DS2
To validate the computational cost of each PPA on scenes of different sizes, we used a well-known HSI generation toolbox 35 to obtain a set of synthetic images. All of the synthetic images are square, with a minimum size of 40 × 40 pixels and a maximum size of 500 × 500 pixels. The noise for all images was fixed at 30 dB, and 10 endmembers were randomly selected from the USGS library. Figure 5(b) shows an example of DS2 with the size fixed at 100 × 100.

Real dataset Jasper Ridge
The Jasper Ridge dataset 36 [see Fig. 5(c) for image] contains 100 × 100 pixels with 198 spectral bands retained from 224 bands (the excluded bands were 1 to 3 and 108 to 112). Four endmembers, i.e., road, soil, water, and tree, are observed in this dataset.

Parameters Setting
SGPP mainly involves only one parameter, λ, which is the retaining ratio of endmember candidates. If λ is set too large, it will strongly reduce the acceleration performance of the algorithm, whereas a small λ may remove many high-quality candidates. In this regard, λ is empirically set to 0.1. For SPP, SSPP, SSPM, and RCSPP, the corresponding parameters are carefully tuned according to the original research articles. 16,18–20

Experimental Performance

Experiment 1
The aim of experiment 1 is to validate the endmember extraction results obtained from different algorithms on the noisy dataset with different noise levels. Table 1 (best results are in bold and suboptimal results are in italics) shows that EEAs such as N-FINDR and OSP yield better SAD results for most noise scenarios after being combined with SGPP. MVSA generates lower SAD results under most noise levels when combined with SPP, but it also provides suboptimal results when combined with SGPP. VCA produces better SAD results under 20 and 30 dB and also shows good endmember extraction performance under 40, 50, and 60 dB when combined with SSPP.
To verify the reconstruction errors between the original hyperspectral data and the reconstructed image, Table 2 reports the RMSE results of the different algorithm combinations. As can be seen from Table 2, the EEAs coupled with SGPP provide lower RMSE results than other PPA-EEA combinations, especially for N-FINDR, OSP, and MVSA. In addition, SSPP-based VCA and MVSA produced similarly low results.

Experiment 2
The objective of experiment 2 is to verify the acceleration performance of PPAs on EEAs. PPAs normally introduce spatial information to assist in accurate endmember extraction, and SAD and RMSE are the two most commonly used measures of that accuracy. However, a PPA that takes too much time loses much of its practical value; ideally, a good PPA introduces spatial information for the subsequent endmember extraction process at a low computational complexity. Table 3 tabulates the speedup results. Most remarkably, under all noise scenarios, SGPP significantly improves the computational efficiency of endmember extraction because its computational cost is very low, even negligible, compared with that of other PPAs. It is noteworthy that, when VCA is combined with any PPA, including SGPP, the speedup is limited because VCA has a computational complexity of $2p^{2}N$, which is already lightweight for endmember extraction. The related complexity analysis of VCA can be found in Nascimento and Dias. 7

Experiment 3
The main purpose of experiment 3 is to provide the computation time, speedup results, and SAD results of all PPAs on DS2 with different numbers of pixels. The image size varies from 40 × 40 to 500 × 500, with stepwise increases of 20. The noise level in each image was fixed at 40 dB, and 10 endmembers were randomly selected from the USGS library. Figure 6(a) presents the computation time of four PPAs on synthetic images of different sizes. It can be seen from this figure that the computation time of SPP and RCSPP is high, whereas SSPP, SSPM, and SGPP take less time to process the whole image. To validate the acceleration performance of PPAs, Fig. 6(b) provides the speedup results for different image sizes. Compared with SPP, SSPP, and SSPM, SGPP achieves the best acceleration capability across all image sizes, with speedup values that exceed two and in some cases four. SGPP offers the best acceleration capability because it is a simple, low-complexity algorithmic framework that identifies only a few high-quality endmember candidates. Figure 6(c) presents the trend of the SAD results when N-FINDR is coupled with four PPAs in all image scenarios. It can be seen from this figure that, when N-FINDR is combined with SGPP, it provides lower SAD results than the other combinations. In addition, SSPP, SSPM, and RCSPP have a low impact on the endmember accuracy of N-FINDR, yet SPP reduces the endmember accuracy when combined with N-FINDR because SPP reconstructs each pixel vector using its neighborhood, which may overly affect several small land covers.

Experiment 4
The chief purpose of this experiment is to verify the impact of pixel purity on endmember accuracy. This experiment generates four datasets based on DS1 with the noise level fixed at 30 dB. The maximum pixel purity of the four datasets is fixed at 1, 0.9, 0.8, and 0.7, respectively.

Experiment 5
The aim of this experiment is to report the preprocessing performance, in terms of SAD, RMSE, speedup, and time, of four PPAs on the Jasper Ridge dataset. Compared with Cuprite, Jasper Ridge has relatively simple topographic features, with four ground covers. As seen from Table 4, among the four PPAs, SGPP combined with two EEAs (N-FINDR and OSP) maintained the best SAD results, compared with the same two EEAs run without SGPP. In addition, compared with other PPAs, three EEAs (N-FINDR, OSP, and MVSA) coupled with SGPP generated lower RMSE results. In terms of speedup and execution time, SGPP has a better acceleration performance than the other PPA-EEA combinations. Additionally, SGPP took only 0.15 s for preprocessing, and the total execution time was also on the low side. Although the four EEAs coupled with SSPP or SSPM had shorter endmember extraction times, their total time consumption was rarely lower than that of the SGPP-coupled EEAs.

Experiment 6
The aim of experiment 6 is to evaluate the endmember extraction accuracy, reconstruction error, acceleration performance, and computation time associated with four PPAs on the Cuprite dataset. Six representative minerals, i.e., alunite, buddingtonite, dumortierite, kaolinite, muscovite, and montmorillonite, were considered for comparison with the USGS library to assess endmember extraction. 18,19,39 Table 5 tabulates the overall experimental results, which show several notable points. First, N-FINDR and MVSA provided better mSAD when combined with SGPP, yet when MVSA was combined with SSPP and SSPM, mSAD was very high, perhaps because so many pixels were removed in preprocessing that determining the simplex became problematic. Second, when integrated with SGPP, several EEAs, such as OSP and MVSA, generated lower RMSE results. Third, compared with other PPAs, SGPP provided the best acceleration performance for most EEAs on the Cuprite dataset. Additionally, SGPP had a low computational cost, at about 1 s, whereas SPP required over 40 s, even more than the EEAs themselves, followed by RCSPP, which demanded over 20 s.

Conclusion
This paper proposes SGPP, a superpixel-guided preprocessing algorithm based on spatial compactness and spectral purity analysis. Specifically, SGPP utilizes spatial and spectral information simultaneously by capturing spatial homogeneity and spectral purity and fuses them to select a few high-quality pixels at the preprocessing stage. Experiments on different datasets indicate that SGPP significantly reduces the computational complexity of spectral-based EEAs with a negligible execution burden, while guaranteeing endmember accuracy. A potential limitation of SGPP is that the selection of λ requires consideration of the image size and spatial homogeneity level, which deserves further research.