The CMOS single-photon avalanche diode (SPAD) image sensor, a third-generation solid-state imaging device, offers single-photon sensitivity, picosecond-scale time resolution, and micron-scale spatial resolution. It is currently the mainstream device of choice for single-photon, picosecond time-resolved transient imaging, and it is increasingly applied to low-light and even single-photon ultrafast imaging tasks such as time-resolved spectral measurement, 3D ranging and imaging, fluorescence lifetime imaging, and quantum imaging and sensing. In this paper, we review the research progress of the CMOS SPAD image sensor and analyze its challenges and candidate solutions. In past years, mainstream CMOS SPAD image sensors have used front-illuminated SPADs in a planar pixel structure. In a planar pixel, the SPAD and the readout electronics share the same area, so the usual way to raise the SPAD fill factor is to shrink the share of pixel area occupied by the in-pixel readout electronics, which to some extent sacrifices readout functionality. Two other approaches have been used: more advanced process nodes improve the integration density of the electronics but tend to cause a high dark count rate, and micro-lens arrays integrated over the pixels limit the flexibility of pixel size and increase cost. In contrast, the three-dimensional (3D) stacked pixel structure places the SPAD and its corresponding readout electronics on two vertically coupled wafers, which removes the area competition between the SPAD and the in-pixel readout electronics and is likely to be the future direction of development.