Category attention transfer for efficient fine-grained visual categorization

Publisher:
ELSEVIER
Publication Type:
Journal Article
Citation:
Pattern Recognition Letters, 2022, 153, pp. 10-15
Issue Date:
2022-01-01
Fine-Grained Visual Categorization (FGVC) aims to distinguish subordinate-level categories with subtle inter-class differences. Although previous research has shown the impressive effectiveness of recurrent multi-attention models and second-order feature encoding, these methods often require an enormous amount of both computation and memory, making them unsuitable for mobile applications. This paper proposes a Category Attention Transfer CNN (CAT-CNN) to address the efficiency issue in FGVC. We transfer part attention knowledge from a very large-scale FGVC network to a small but efficient network to significantly improve its representation ability. Using the proposed CAT-CNN, the accuracy of efficient networks such as ShuffleNet, MobileNet, and EfficientNet can be improved by up to 5.7% on the CUB-200-2011 dataset without increasing computational complexity or memory cost. Our experiments show that the proposed CAT-CNN can be applied to multiple network structures to enhance their performance. With a single efficient network structure and a single inference, the proposed CAT-MobileNet-large-1.0 and CAT-EfficientNet-b0 achieve accuracies of 86.5% and 86.7%, respectively, on the CUB-200-2011 dataset. These results are close to or better than those of state-of-the-art methods that use large-scale networks and multiple inferences, making FGVC feasible on mobile devices.
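The abstract does not state the loss used to transfer attention, so the following is only a rough illustration of the general idea of attention transfer between a large "teacher" network and a small "student" network. It uses a generic activation-based formulation (channel-wise sum of squared activations, L2-normalized, compared by mean squared error), which is an assumption and not the CAT-CNN category attention described in the paper. The function names, the flattened `C x N` feature layout, and the toy values are all hypothetical.

```python
import math


def attention_map(features):
    """Collapse a C x N feature map (C channels, N flattened spatial
    positions) into a unit-length spatial attention vector by summing
    squared activations over channels. Generic formulation, assumed here."""
    n = len(features[0])
    att = [sum(channel[i] ** 2 for channel in features) for i in range(n)]
    norm = math.sqrt(sum(a * a for a in att))
    # Leave an all-zero map unchanged to avoid dividing by zero.
    return [a / norm for a in att] if norm > 0 else att


def attention_transfer_loss(student_feats, teacher_feats):
    """Mean squared difference between normalized student and teacher
    attention vectors. Channel counts may differ between the two networks;
    the number of spatial positions must match."""
    s = attention_map(student_feats)
    t = attention_map(teacher_feats)
    if len(s) != len(t):
        raise ValueError("student and teacher need the same spatial size")
    return sum((a - b) ** 2 for a, b in zip(s, t)) / len(s)


# Toy feature maps with 4 spatial positions (hypothetical values).
# The teacher responds most strongly at position 1.
teacher = [[0.1, 0.9, 0.2, 0.1],
           [0.0, 1.1, 0.1, 0.0],
           [0.2, 0.8, 0.3, 0.1]]
# A student that also focuses on position 1, using fewer channels.
student_aligned = [[0.1, 0.7, 0.1, 0.0],
                   [0.0, 0.6, 0.2, 0.1]]
# A student that focuses on positions 0 and 3 instead.
student_misaligned = [[0.8, 0.1, 0.1, 0.7],
                      [0.9, 0.0, 0.2, 0.6]]

loss_aligned = attention_transfer_loss(student_aligned, teacher)
loss_misaligned = attention_transfer_loss(student_misaligned, teacher)
print(loss_aligned < loss_misaligned)
```

In a training setup, a term like this would typically be added to the student's classification loss, so the teacher is needed only during training and the student's inference cost stays the same, which is consistent with the abstract's claim of no added computation or memory at inference.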