Feature selection with kernel class separability
Classification can often benefit from efficient feature selection. However, linearly nonseparable data, quick-response requirements, small sample sizes, and noisy features make feature selection challenging. In this work, a class separability criterion is developed in a high-dimensional kernel space, and features are selected by maximizing this criterion. To make this approach work, the issues of automatic kernel parameter tuning, numerical stability, and regularization for multiparameter optimization are addressed. Theoretical analysis uncovers the relationship of this criterion to the radius-margin bound of Support Vector Machines (SVMs), Kernel Fisher Discriminant Analysis (KFDA), and the kernel alignment criterion, thus providing more insight into feature selection with this criterion. The criterion is applied to a variety of selection modes using different search strategies. An extensive experimental study demonstrates its efficiency in delivering fast and robust feature selection.
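
The abstract does not state the criterion's formula, so the following is only a minimal sketch of the general idea under explicit assumptions: the criterion is taken to be the scatter ratio J = tr(S_B) / tr(S_W) computed in the kernel-induced feature space, the kernel is a Gaussian RBF with a fixed, hand-chosen `gamma` (no automatic tuning or regularization), and the search strategy is plain greedy forward selection. The function names `rbf_kernel`, `kernel_class_separability`, and `greedy_forward_selection` are hypothetical and not taken from the work itself.

```python
import numpy as np


def rbf_kernel(X, gamma):
    """Gaussian RBF kernel matrix K[i, j] = exp(-gamma * ||x_i - x_j||^2)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-gamma * np.maximum(d2, 0.0))


def kernel_class_separability(X, y, gamma=1.0):
    """Assumed criterion: tr(S_B) / tr(S_W) in the kernel feature space.

    Both traces are computed from the kernel matrix alone:
      tr(S_B) = sum_c Sum(K_cc) / n_c - Sum(K) / n
      tr(S_W) = tr(K) - sum_c Sum(K_cc) / n_c
    where K_cc is the block of K restricted to class c.
    """
    K = rbf_kernel(X, gamma)
    n = len(y)
    class_term = 0.0
    for c in np.unique(y):
        idx = np.flatnonzero(y == c)
        class_term += K[np.ix_(idx, idx)].sum() / len(idx)
    tr_sb = class_term - K.sum() / n
    tr_sw = np.trace(K) - class_term
    return tr_sb / tr_sw


def greedy_forward_selection(X, y, k, gamma=1.0):
    """Add, one at a time, the feature that most increases the criterion."""
    selected = []
    remaining = list(range(X.shape[1]))
    for _ in range(k):
        scores = [
            kernel_class_separability(X[:, selected + [j]], y, gamma)
            for j in remaining
        ]
        best = remaining[int(np.argmax(scores))]
        selected.append(best)
        remaining.remove(best)
    return selected
```

Calling `greedy_forward_selection(X, y, k)` on an array `X` of shape `(n_samples, n_features)` with integer labels `y` returns the indices of the `k` chosen features in the order they were added. Because each candidate subset requires only one kernel matrix and a few sums, no classifier is trained during the search, which is one plausible reading of why a criterion of this kind can be fast.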