Support vector machine (SVM) classifiers are widely applied to hyperspectral image (HSI) classification and offer significant advantages in accuracy, simplicity, and robustness. SVM is a well-known learning algorithm that maximizes the minimum margin. However, recent theoretical results show that the margin distribution, rather than the minimum margin, is more important for generalization, so maximizing the minimum margin alone generalizes worse than optimizing the margin distribution. In this paper, we apply the large margin distribution machine (LDM) to HSI classification and show that optimizing the margin distribution achieves better generalization performance than SVM. Because the raw HSI feature space is not the most effective representation of HSI, we adopt factor analysis to learn more effective features, and then filter the learned features with a structure-preserved filter to exploit the spatial structure information of HSI. In this way, spatial structure information is integrated into the feature learning process to obtain a better HSI feature. We then propose a multiclass LDM to classify the filtered features. Experimental results show that the proposed LDM with feature learning matches the classification performance of state-of-the-art methods in terms of visual quality and three quantitative evaluations, indicating that LDM has high generalization performance.
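
To make the contrast between the minimum margin and the margin distribution concrete, the following is a minimal Python sketch for a linear binary classifier. It is illustrative only and is not the implementation used in this work. The function name `margin_statistics`, the unnormalized margin y_i * (w . x_i), and the choice of mean and variance as summaries of the margin distribution are assumptions made for this example.

```python
import numpy as np


def margin_statistics(w, X, y):
    """Summarize the margins of a linear classifier sign(w . x).

    Hypothetical helper for illustration. Assumes labels y in {-1, +1}
    and uses the unnormalized margin y_i * (w . x_i).
    """
    w = np.asarray(w, dtype=float)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)

    margins = y * (X @ w)  # one margin per training sample

    return {
        # The single quantity a minimum-margin method focuses on.
        "min_margin": margins.min(),
        # Two summaries of the whole margin distribution (population variance).
        "margin_mean": margins.mean(),
        "margin_variance": margins.var(),
    }


if __name__ == "__main__":
    # Toy data: three samples on a line, all classified correctly.
    X = [[1.0, 0.0], [3.0, 0.0], [-2.0, 0.0]]
    y = [1, 1, -1]
    w = [1.0, 0.0]
    print(margin_statistics(w, X, y))
```

In this sketch, two classifiers with the same `min_margin` can differ in `margin_mean` and `margin_variance`; a method that optimizes the margin distribution can distinguish them, while a minimum-margin criterion cannot.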