This paper presents a novel dimensionality reduction algorithm that employs compressed sensing (CS) to improve the generalization capability of a classifier, especially on large-scale data. Unlike traditional dimensionality reduction methods, the proposed algorithm requires no problem-dependent parameters, nor does it require the additional eigenvalue decomposition used by PCA or LDA. Mathematically, the algorithm treats the input features as the dictionary in CS and iteratively selects the features that minimize the residual output error, so the resulting features correspond directly to the performance requirements of the given problem. Furthermore, the proposed algorithm can be regarded as a sparse classifier that selects discriminative features and classifies the training data simultaneously. Experimentally, the CS-based algorithm is evaluated within a hierarchical visual pattern recognition architecture. The simulation results show that the proposed method matches the test accuracy of the original full architecture while using only 25% of the features, and that its performance is competitive with existing dimensionality reduction methods.
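The iterative selection described above, treating feature columns as dictionary atoms and greedily reducing the residual output error, resembles orthogonal matching pursuit over the feature matrix. The following is a minimal sketch of that greedy scheme, not the authors' actual algorithm; the function name and interface are hypothetical:

```python
import numpy as np

def omp_feature_selection(X, y, k):
    """Greedy (OMP-style) selection of k feature columns of X that
    minimize the residual error of a least-squares fit to the target y.

    Hypothetical illustration of CS-style feature selection, not the
    paper's implementation.
    """
    residual = y.astype(float).copy()
    selected = []
    for _ in range(k):
        # Correlation of each feature column with the current residual;
        # already-selected columns are excluded from consideration.
        scores = np.abs(X.T @ residual)
        scores[selected] = -np.inf
        selected.append(int(np.argmax(scores)))
        # Re-fit on all selected features, then update the residual.
        coef, *_ = np.linalg.lstsq(X[:, selected], y, rcond=None)
        residual = y - X[:, selected] @ coef
    return selected, residual
```

Each iteration picks the feature most correlated with the remaining error and refits, so the selected subset is tied directly to the prediction task rather than to the data's variance structure, as in PCA.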