The mixed pixel problem in remote sensing imagery has made spectral mixture analysis (SMA) a predominant method for accurately interpreting urban surface materials. High spatial resolution imagery makes it easier to extract pure pixels for SMA, but its high intraclass variability seriously degrades SMA accuracy. Multiple endmember spectral mixture analysis (MESMA) offers a good solution to high intraclass variability. Previous studies, however, were largely pixel-based and spectral-based and ignored the effects of neighboring pixels on endmember spectra. To address this problem, this study took full advantage of spatial–spectral information and proposed a multiple endmember object spectral mixture analysis (MEOSMA) approach for high spatial resolution imagery. Drawing on object-based image analysis, a segment-based endmember object extraction method was developed that uses both spatial and spectral attributes to extract “endmember objects.” An endmember object optimization method that considers spatial correlation was then put forward to select different endmember object combinations for different pixels. Compared with MESMA and simple endmember SMA, MEOSMA achieved a higher correctly unmixed proportion and a higher coefficient of determination (R²), indicating that it is more accurate and has great potential for applications in urban environmental monitoring.
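To make the idea of selecting different endmember combinations for different pixels concrete, the following is a minimal sketch of conventional pixel-based, MESMA-style unmixing. It is not the MEOSMA procedure of this study: it omits endmember objects and spatial correlation entirely. The function names (`unmix`, `mesma_pixel`), the toy four-band spectra, the soft sum-to-one weight, and the 0.05 fraction tolerance are all illustrative assumptions.

```python
import itertools
import numpy as np


def unmix(pixel, endmembers, weight=1e3):
    """Least-squares fractions for one pixel under a soft sum-to-one constraint.

    endmembers has shape (n_bands, n_endmembers). The weight is an assumed value.
    Returns (fractions, rmse).
    """
    n_em = endmembers.shape[1]
    # A heavily weighted row of ones pushes the fractions to sum to 1.
    A = np.vstack([endmembers, weight * np.ones((1, n_em))])
    b = np.concatenate([pixel, [weight]])
    fractions, *_ = np.linalg.lstsq(A, b, rcond=None)
    residual = pixel - endmembers @ fractions
    rmse = float(np.sqrt(np.mean(residual ** 2)))
    return fractions, rmse


def mesma_pixel(pixel, library, tol=0.05):
    """Try one candidate spectrum per class and keep the lowest-RMSE model.

    library maps a class name to an array of shape (n_bands, n_candidates).
    Returns (chosen candidate index per class, fraction per class, rmse),
    or None if no combination yields plausible fractions.
    """
    classes = list(library)
    index_ranges = [range(library[c].shape[1]) for c in classes]
    best = None
    for combo in itertools.product(*index_ranges):
        E = np.column_stack([library[c][:, i] for c, i in zip(classes, combo)])
        fractions, rmse = unmix(pixel, E)
        # Discard models whose fractions fall outside [0, 1] by more than tol.
        if np.any(fractions < -tol) or np.any(fractions > 1 + tol):
            continue
        if best is None or rmse < best[2]:
            best = (dict(zip(classes, combo)),
                    dict(zip(classes, fractions)),
                    rmse)
    return best


# Hypothetical four-band library: two candidate spectra (columns) per class.
library = {
    "vegetation": np.array([[0.05, 0.08],
                            [0.08, 0.12],
                            [0.04, 0.10],
                            [0.50, 0.35]]),
    "impervious": np.array([[0.20, 0.35],
                            [0.22, 0.30],
                            [0.25, 0.28],
                            [0.28, 0.26]]),
}

# Synthetic mixed pixel: 30% of vegetation candidate 1, 70% of impervious candidate 0.
pixel = 0.3 * library["vegetation"][:, 1] + 0.7 * library["impervious"][:, 0]

combo, fractions, rmse = mesma_pixel(pixel, library)
print(combo, fractions, rmse)
```

On this synthetic pixel the search recovers the combination and fractions used to build it, because only that pair of spectra reproduces the pixel exactly. An object-based variant along the lines described above would presumably replace the per-pixel candidate spectra with spectra derived from segmented endmember objects and weight the selection by spatial proximity, but those details are not specified here.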