The extreme learning machine (ELM) is of great interest to the machine learning community because of its extremely simple training step. Its performance sensitivity to the number of hidden neurons is studied in the context of hyperspectral remote sensing image classification. An empirical linear relationship between the number of training samples and the number of hidden neurons is proposed. This relationship can be easily estimated from two small training sets and then extended to large training sets, greatly reducing computational cost. The kernel version of ELM (KELM) is also implemented with the radial basis function kernel, and a linear relationship of this kind remains suitable. The experimental results demonstrate that, when the number of hidden neurons is appropriate, the performance of ELM may be slightly lower than that of the linear SVM, whereas the performance of KELM can be comparable to that of the kernel version of SVM (KSVM). The computational cost of ELM and KELM is much lower than that of the linear SVM and KSVM, respectively.
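As a minimal sketch of how a linear relationship between training-set size and hidden-neuron count might be estimated from two small training sets and extended to a larger one, consider the Python snippet below. The function names (`fit_line`, `predict_hidden_neurons`) and every numeric value are hypothetical and chosen only for illustration; they are not taken from the experiments, and the procedure for finding a suitable hidden-neuron count on each small set is assumed to have been carried out separately.

```python
def fit_line(p1, p2):
    """Return (slope, intercept) of the line through two points.

    Each point is (n_train, n_hidden): a training-set size and the
    hidden-neuron count found suitable for it (assumed known).
    """
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2:
        raise ValueError("the two training-set sizes must differ")
    slope = (y2 - y1) / (x2 - x1)
    intercept = y1 - slope * x1
    return slope, intercept


def predict_hidden_neurons(n_train, slope, intercept):
    """Extrapolate the fitted line to a new training-set size."""
    # Round to a whole neuron count and never return fewer than one.
    return max(1, round(slope * n_train + intercept))


# Hypothetical (n_train, n_hidden) pairs from two small training sets.
small_sets = [(100, 40), (200, 70)]
slope, intercept = fit_line(*small_sets)

# Hypothetical large training set whose hidden-neuron count is
# predicted from the line instead of being searched for directly.
n_hidden_large = predict_hidden_neurons(2000, slope, intercept)
print(slope, intercept, n_hidden_large)
```

The saving in such a scheme would come from searching for a suitable hidden-neuron count only on the two small sets, where each training run is cheap, and reusing the fitted line for the large set.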