Smoothing of cost function leads to faster convergence of neural network learning
2 March 1994
Abstract
One of the major problems in supervised learning of neural networks is the local minima inherent in the cost function f(W, D). These often render classic gradient-descent-based learning algorithms, which compute the weight update at each iteration as ΔW(t) = −η·∇_W f(W, D), powerless. In this paper we describe a new strategy to overcome this problem, which adaptively changes the learning rate and manipulates the gradient estimator simultaneously. The idea is to implicitly convert the local-minima-laden cost function f(·) into a sequence of its smoothed versions {f_{β_t}}_{t=1}^{T}, which, controlled by the parameter β_t, carries less detail at time t = 1 and gradually more later on; the learning is then performed on this sequence of functionals. The corresponding smoothed global minima obtained in this way, {W_t}_{t=1}^{T}, thus progressively approximate W*, the desired global minimum. Experimental results on a nonconvex function minimization problem and a typical neural network learning task are given, and analyses and discussions of some important issues are provided.
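The abstract describes the strategy only at a high level. As an illustration of the general idea, not the authors' exact algorithm, the sketch below minimizes a 1-D nonconvex toy cost by gradient descent on a sequence of Gaussian-smoothed versions of the cost, shrinking the smoothing parameter β_t and the learning rate η together. The toy cost, the Monte Carlo gradient estimator, and the (β, η) schedule are all illustrative assumptions.

```python
import math
import random

def f(w):
    """Toy nonconvex 1-D cost: the oscillatory term creates many
    local minima; the global minimum lies near w ≈ -0.19."""
    return w * w + 2.0 * math.sin(8.0 * w)

def fprime(w):
    """Analytic derivative of f."""
    return 2.0 * w + 16.0 * math.cos(8.0 * w)

def smoothed_grad(w, beta, n_samples=64):
    """Monte Carlo estimate of the gradient of the smoothed cost
    f_beta(w) = E[f(w + beta * eps)], eps ~ N(0, 1).
    beta = 0 recovers the exact (unsmoothed) gradient."""
    if beta == 0.0:
        return fprime(w)
    return sum(fprime(w + beta * random.gauss(0.0, 1.0))
               for _ in range(n_samples)) / n_samples

def descend(w, schedule, steps_per_stage=150):
    """Gradient descent on a sequence of smoothed costs: beta shrinks
    (less smoothing, more detail restored) and eta shrinks with it."""
    for beta, eta in schedule:
        for _ in range(steps_per_stage):
            w -= eta * smoothed_grad(w, beta)
    return w

random.seed(0)
schedule = [(1.0, 0.05), (0.5, 0.05), (0.25, 0.02), (0.1, 0.008), (0.0, 0.004)]
w_smooth = descend(2.0, schedule)        # tracks minima of the smoothed costs
w_plain = descend(2.0, [(0.0, 0.004)])   # plain GD: trapped near the start
```

Starting from w = 2.0, plain gradient descent settles into the nearest local minimum, while the smoothed sequence first sees only the convex envelope (β large) and then refines toward a near-global minimum as β shrinks.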
© 1994 Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Li-Qun Xu and Trevor J. Hall, "Smoothing of cost function leads to faster convergence of neural network learning", Proc. SPIE 2243, Applications of Artificial Neural Networks V (2 March 1994); doi: 10.1117/12.169959; https://doi.org/10.1117/12.169959
Proceedings paper, 9 pages.

