The increase of the spatial resolution of remote-sensing sensors helps to capture the abundant details related to the semantics of surface objects. However, it is difficult for the popular object-oriented classification approaches to acquire higher level semantics from the high spatial resolution remote-sensing (HSR-RS) images, which is often referred to as the “semantic gap.” Instead of designing sophisticated operators, convolutional neural networks (CNNs), a typical deep learning method, can automatically discover intrinsic feature descriptors from a large number of input images to bridge the semantic gap. Due to the small data volume of the available HSR-RS scene datasets, which is far away from that of the natural scene datasets, there have been few reports of CNN approaches for HSR-RS image scene classifications. We propose a practical CNN architecture for HSR-RS scene classification, named the large patch convolutional neural network (LPCNN). The large patch sampling is used to generate hundreds of possible scene patches for the feature learning, and a global average pooling layer is used to replace the fully connected network as the classifier, which can greatly reduce the total parameters. The experiments confirm that the proposed LPCNN can learn effective local features to form an effective representation for different land-use scenes, and can achieve a performance that is comparable to the state-of-the-art on public HSR-RS scene datasets.