This thesis proposes a stochastic gradient descent method with a self-adaptive learning rate, built on the Nesterov accelerated gradient (NAG) optimization algorithm. The method first approximates the second derivative of the cost function and then corrects the final update direction through the self-adaptive learning rate; its convergence is analyzed theoretically. The method requires no manual tuning of the learning rate, is robust to noisy gradient information and to the choice of hyper-parameters, and has high computational efficiency and low memory overhead. Finally, the method is compared with other stochastic gradient descent methods on the MNIST digit classification task. The experimental results show that the proposed method (Adan) converges faster and performs better than the other stochastic gradient descent optimization methods compared.
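For orientation, the baseline NAG update on which the method builds can be sketched as follows. This is a minimal illustration of standard NAG with a fixed learning rate, not the proposed method: it includes neither the second-derivative approximation nor the self-adaptive learning rate. The function names, the quadratic test problem, and the values of `lr`, `momentum`, and `steps` are all assumptions chosen for the example.

```python
def nag_minimize(grad, x0, lr=0.1, momentum=0.9, steps=200):
    """Standard Nesterov accelerated gradient with a fixed learning rate.

    grad: hypothetical callable returning the gradient as a list of floats.
    """
    x = list(x0)
    v = [0.0] * len(x)  # velocity (accumulated momentum)
    for _ in range(steps):
        # Evaluate the gradient at the look-ahead point x + momentum * v.
        lookahead = [xi + momentum * vi for xi, vi in zip(x, v)]
        g = grad(lookahead)
        # Update the velocity, then move the parameters along it.
        v = [momentum * vi - lr * gi for vi, gi in zip(v, g)]
        x = [xi + vi for xi, vi in zip(x, v)]
    return x


def quad_grad(x):
    # Gradient of the illustrative cost
    # f(x) = 0.5 * (3 * x[0]**2 + x[1]**2) - x[0] + 2 * x[1],
    # whose minimizer is (1/3, -2).
    return [3.0 * x[0] - 1.0, x[1] + 2.0]


x_hat = nag_minimize(quad_grad, [0.0, 0.0])
```

In this sketch the learning rate `lr` must be chosen by hand; removing that manual choice is the purpose of the self-adaptive scheme described above.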