Infomax principle based pooling of deep convolutional activations for image retrieval
Neural activations produced by deep convolutional networks have recently become a state-of-the-art representation for image retrieval. To obtain a global image representation, sum-pooling is frequently used to aggregate the activations of convolutional feature maps. This work first explains the effectiveness of sum-pooling through a probabilistic interpretation, proving that sum-pooling is an upper bound on the probability that a visual pattern is present in an image. To further examine the optimality of sum-pooling, a quantitative analysis based on the Infomax principle in neural networks is provided. It shows that sum-pooling aligns well with the leading eigenvector of principal component analysis (PCA) applied to the activations of a feature map. Moreover, considering the 2D matrix structure of feature maps, a two-directional 2DPCA-based pooling scheme is proposed to aggregate the convolutional activations. Experiments on multiple benchmark image retrieval datasets support the above analysis and demonstrate the superiority of the proposed pooling scheme.
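As a rough illustration of the pooling operations named above, the following is a minimal sketch and not the authors' implementation. It treats sum-pooling of an H x W feature map A as the bilinear form 1^T A 1, and treats two-directional pooling as z^T A x, where z and x are the leading eigenvectors of row-direction and column-direction scatter matrices. The function names (`sum_pool`, `leading_eigvec`, `two_directional_pool`), the use of uncentered scatter matrices, and the random non-negative data standing in for post-ReLU activations are all assumptions made for illustration.

```python
import numpy as np


def sum_pool(fmaps):
    """Sum-pool each feature map: (N, H, W) -> (N,)."""
    return fmaps.sum(axis=(1, 2))


def leading_eigvec(sym):
    """Unit-norm eigenvector for the largest eigenvalue of a symmetric matrix."""
    _, vecs = np.linalg.eigh(sym)  # eigh returns eigenvalues in ascending order
    v = vecs[:, -1]
    return v if v.sum() >= 0 else -v  # resolve the sign ambiguity


def two_directional_pool(fmaps):
    """Pool each map A to the scalar z^T A x (hypothetical 2DPCA-style sketch).

    Assumption: scatter matrices are uncentered, i.e. averages of A A^T
    (row direction) and A^T A (column direction) over the N maps.
    """
    n = fmaps.shape[0]
    g_row = np.einsum("nhw,ngw->hg", fmaps, fmaps) / n  # (H, H)
    g_col = np.einsum("nhw,nhv->wv", fmaps, fmaps) / n  # (W, W)
    z = leading_eigvec(g_row)
    x = leading_eigvec(g_col)
    pooled = np.einsum("h,nhw,w->n", z, fmaps, x)
    return pooled, z, x


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Stand-in for one channel's activations over 200 images (non-negative).
    fmaps = rng.random((200, 7, 7))

    pooled, z, x = two_directional_pool(fmaps)
    uniform = np.ones(7) / np.sqrt(7.0)

    # Cosine similarity between each learned direction and the uniform
    # (sum-pooling) direction; both vectors have unit norm.
    print("cos(z, uniform):", float(z @ uniform))
    print("cos(x, uniform):", float(x @ uniform))
    print("corr(pooled, sum_pool):",
          float(np.corrcoef(pooled, sum_pool(fmaps))[0, 1]))
```

On this synthetic data the learned directions come out close to uniform, so the two pooled values are nearly proportional. On real activations the directions can depart from uniform, which is where a learned bilinear pooling could differ from plain sum-pooling.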