In this paper, a robust, and fast system for recognition as well as pose estimation of a 3-D object from a single 2-D perspective of it taken from an arbitrary viewpoint is developed. The approach is invariant to location, orientation, and scale of the object in the perspective. The silhouette of the object in the 2-D perspective is first normalized with respect to location and scale. A set of rotation invariant features derived from complex and orthogonal pseudo- Zernike moments of the image are then extracted. The next stage includes a bank of multilayer feed-forward neural networks (NN) each of which classifies the extracted features. The training set for these nets consists of perspective views of each object taken from several different viewing angles. The NNs in the bank differ in the size of their hidden layer nodes as well as their initial conditions but receive the same input. The classification decisions of all the nets are combined through a majority voting scheme. It is shown that this collective decision making yields better results compared to a single NN operating alone. After the object is classified, two of its pose parameters, namely elevation and aspect angles, are estimated by another module of NNs in a two-stage process. The first stage identifies the likely region of the space that the object is being viewed from. In the second stage, an NN estimator for the identified region is used to compute the pose angles. Extensive experimental studies involving clean and noisy images of seven military ground vehicles are carried out. The performance is compared to two other traditional methods, namely a nearest neighbor rule and a binary decision tree classifier and it is shown that our approach has major advantages over them.