Many classifiers have been proposed for automatic target recognition (ATR) applications. A classifier is built from labeled training data and then applied to predict the label of a new test point. If there is enough training data, and the test points are drawn i.i.d. from the same distribution as the training data, then many classifiers perform quite well. In practice, however, there is rarely enough training data, or limited computational resources allow only part of the training data to be used. Likewise, the distribution of new test points might differ from that of the training data, so that the training data is not representative of the test data. In this paper, we empirically compare several classifiers on IR imagery, namely support vector machines, regularized least squares classifiers, C4.4, C4.5, random decision trees, bagged C4.4, and bagged C4.5. We repeatedly reduce the training data by half, making it less representative of the test data each time, and evaluate the resulting classifiers on the test data. This allows us to assess the robustness of classifiers against a varying knowledge base. A robust classifier is one whose accuracy is the least sensitive to changes in the training data. Our results show that ensemble methods (random decision trees, bagged C4.4, and bagged C4.5) outlast single classifiers as the training data size decreases.
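The halving protocol described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' experimental code: the names `halving_curve`, `sensitivity`, `nearest_neighbor_fit`, and `make_blobs` are hypothetical, a 1-nearest-neighbor rule stands in for the classifiers under test, random subsampling is assumed as the way to halve the data, the max-minus-min accuracy spread is one assumed way to quantify sensitivity, and the data are synthetic rather than IR imagery.

```python
import random

def nearest_neighbor_fit(train):
    """Return a 1-NN predictor; a stand-in for any classifier under test."""
    def predict(x):
        # Label of the closest training point (squared Euclidean distance).
        nearest = min(train, key=lambda p: sum((a - b) ** 2 for a, b in zip(p[0], x)))
        return nearest[1]
    return predict

def accuracy(predict, test):
    """Fraction of test points whose predicted label matches the true label."""
    return sum(predict(x) == y for x, y in test) / len(test)

def halving_curve(train, test, fit, rounds, seed=0):
    """Accuracy on a fixed test set as the training set is repeatedly halved."""
    rng = random.Random(seed)
    current = list(train)
    curve = []
    for _ in range(rounds + 1):
        curve.append((len(current), accuracy(fit(current), test)))
        if len(current) < 2:
            break
        # Assumption: each halving keeps a uniformly random half.
        current = rng.sample(current, len(current) // 2)
    return curve

def sensitivity(curve):
    """Spread between best and worst accuracy; smaller suggests more robust."""
    accs = [a for _, a in curve]
    return max(accs) - min(accs)

def make_blobs(n, rng):
    """Synthetic two-class Gaussian data (hypothetical, not IR imagery)."""
    data = []
    for i in range(n):
        label = i % 2
        center = 0.0 if label == 0 else 3.0
        data.append(((rng.gauss(center, 1.0), rng.gauss(center, 1.0)), label))
    return data

rng = random.Random(42)
train, test = make_blobs(256, rng), make_blobs(100, rng)
curve = halving_curve(train, test, nearest_neighbor_fit, rounds=5)
for size, acc in curve:
    print(size, round(acc, 3))
print("sensitivity:", round(sensitivity(curve), 3))
```

Passing a different `fit` function (for example, one that trains a bagged ensemble) to `halving_curve` with the same seed would give curves that can be compared through `sensitivity`.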
A new construction algorithm for binary oblique decision tree classifiers, MESODT, is described. Multimembered (μ,λ) evolution strategies integrated with the perceptron algorithm are adopted as the optimization algorithm to find the split that minimizes the evaluation function at each node of a decision tree. To better explore the benefits of this optimization algorithm, two splitting rules are each applied to MESODT: a criterion based on the degree of linear separability, and one of the traditional impurity measures, information gain. Experiments conducted on public and artificial domains demonstrate that the trees generated by MESODT have, in most cases, higher accuracy and smaller size than the classical oblique decision trees (OC1) and axis-parallel decision trees (See5.0). A comparison with (1+1) evolution strategies is also described.
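To make the idea of searching for an oblique split with a (μ,λ) evolution strategy concrete, the following is a rough sketch under stated assumptions. It is not MESODT itself: the perceptron integration and the linear-separability criterion are omitted, only information gain is used (maximized here, which is equivalent to minimizing its negative), and the function names, the log-normal step-size adaptation, and the default values of `mu`, `lam`, and `generations` are illustrative choices rather than details taken from the paper.

```python
import math
import random

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def information_gain(w, data):
    """Gain of the oblique split w[0] + w[1:] . x > 0 over (x, y) pairs."""
    left, right = [], []
    for x, y in data:
        value = w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))
        (right if value > 0 else left).append(y)
    n = len(data)
    parent = entropy([y for _, y in data])
    return parent - len(left) / n * entropy(left) - len(right) / n * entropy(right)

def mu_comma_lambda_split(data, mu=5, lam=30, generations=40, seed=0):
    """Search hyperplane weights with a (mu, lambda) evolution strategy."""
    rng = random.Random(seed)
    dim = len(data[0][0]) + 1          # bias term plus one weight per feature
    tau = 1.0 / math.sqrt(dim)         # assumed learning rate for step sizes
    # Each individual is (weights, mutation step size).
    parents = [([rng.gauss(0, 1) for _ in range(dim)], 1.0) for _ in range(mu)]
    best_w, best_gain = None, -1.0
    for _ in range(generations):
        offspring = []
        for _ in range(lam):
            w, sigma = rng.choice(parents)
            new_sigma = sigma * math.exp(tau * rng.gauss(0, 1))
            child = [wi + new_sigma * rng.gauss(0, 1) for wi in w]
            offspring.append((information_gain(child, data), child, new_sigma))
        # Comma selection: parents are discarded; the best mu offspring survive.
        offspring.sort(key=lambda t: t[0], reverse=True)
        parents = [(w, s) for _, w, s in offspring[:mu]]
        # Remember the best split seen, since comma selection may lose it.
        if offspring[0][0] > best_gain:
            best_gain, best_w = offspring[0][0], offspring[0][1]
    return best_w, best_gain

# Synthetic data whose true boundary is oblique: x0 + x1 > 0.
rng = random.Random(1)
points = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(200)]
data = [(p, int(p[0] + p[1] > 0)) for p in points]
w, gain = mu_comma_lambda_split(data)
print("weights:", [round(v, 3) for v in w], "gain:", round(gain, 3))
```

A (1+1) strategy, by contrast, would keep a single parent and replace it only when its single offspring is at least as good; the comma variant sketched here always discards the parents, which is why the best split found so far is tracked separately.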