Probabilistic neural networks (PNNs) build internal density representations based on the kernel, or Parzen, estimator and use Bayesian decision theory to form arbitrarily complex decision boundaries. As in the classical kernel estimator, training is performed in a single pass over the data, and asymptotic convergence is guaranteed. One important factor affecting convergence is the kernel width: theory provides an optimal width only when the data are normally distributed, a problem that becomes acute in the multivariate case. In this paper we present an asymptotically optimal method of setting kernel widths for multivariate Gaussian kernels, based on the theory of filtered kernel estimators, and show how it can be realized as a filtered kernel PNN architecture. Performance comparisons are made with competing methods.
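For concreteness, the following is a minimal sketch in Python/NumPy of the baseline PNN described above: per-class Parzen (Gaussian-kernel) density estimates are combined with class priors under the Bayes decision rule. The single shared width `sigma`, the normal-reference width helper, and all names here are illustrative assumptions; the helper stands in for the "optimal width for normally distributed data" mentioned above and is not the paper's filtered-kernel method.

```python
import numpy as np

def normal_reference_width(X):
    # Rule-of-thumb width that is asymptotically optimal ONLY for
    # Gaussian data (Silverman's rule); NOT the filtered-kernel method.
    n, d = X.shape
    sigma_hat = X.std(axis=0, ddof=1).mean()
    return sigma_hat * (4.0 / ((d + 2) * n)) ** (1.0 / (d + 4))

def pnn_classify(X_train, y_train, X_test, sigma):
    # One pattern unit per training point, one summation unit per class;
    # the output unit applies the Bayes rule: argmax of prior * density.
    classes = np.unique(y_train)
    n, d = X_train.shape
    norm = (2.0 * np.pi * sigma ** 2) ** (-d / 2.0)  # Gaussian kernel constant
    scores = np.empty((X_test.shape[0], classes.size))
    for j, c in enumerate(classes):
        Xc = X_train[y_train == c]
        prior = Xc.shape[0] / n                       # class prior from the data
        # squared distances: test points x class-c training points
        d2 = ((X_test[:, None, :] - Xc[None, :, :]) ** 2).sum(axis=2)
        density = norm * np.exp(-d2 / (2.0 * sigma ** 2)).mean(axis=1)
        scores[:, j] = prior * density
    return classes[scores.argmax(axis=1)]

# Two well-separated Gaussian classes; "training" is the single pass
# of storing the data, as described above.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(3, 1, (50, 2))])
y = np.repeat([0, 1], 50)
print(pnn_classify(X, y, np.array([[0.0, 0.0], [3.0, 3.0]]),
                   sigma=normal_reference_width(X)))
```

Because every test point is compared against every stored training point, a poorly chosen `sigma` over- or under-smooths all class densities at once, which is the sensitivity the filtered-kernel width selection is meant to address.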