Phan, N. and Huy, T.D. and Duong, S.T.M. and Hoang, N.T. and Tran, S. and Hung, D.H. and Nguyen, C.D.T. and Bui, T. and Truong, S.Q.H. (2023) LoGoViT: Local-Global Vision Transformer for Object Re-Identification. In: UNSPECIFIED.
Full text not available from this repository.
Abstract
Object re-identification (ReID) is prone to errors under variations in scale and illumination, complex backgrounds, and object occlusion. To overcome these challenges, attention mechanisms are employed to focus on the object's characteristics, thereby extracting more discriminative features. This paper introduces a local-global vision transformer (LoGoViT) for object re-identification that learns a hierarchical representation ranging from fine-grained (local) to general (global) context features. It comprises two components: (i) shift and shuffle operations to generate robust local features and (ii) a local-global module to aggregate the multi-level hierarchy features of an object. Extensive experiments show that our method achieves state-of-the-art results on ReID benchmarks. We further investigate effective augmentation operations and discuss how the patch modifications improve the proposed model's generalization under occlusion scenarios. The source code is available at https://github.com/nguyenphan99/LoGoViT. © 2023 IEEE.
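The abstract names shift and shuffle operations but does not define them. The following is a minimal sketch of one plausible reading, in which a sequence of patch tokens is cyclically shifted and then interleaved into groups so that each group mixes patches from different image regions. The function names `shift_tokens`, `shuffle_into_groups`, and `local_groups`, and the parameters `shift` and `groups`, are hypothetical and are not taken from the paper or its repository.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def shift_tokens(tokens: Sequence[T], shift: int) -> List[T]:
    """Cyclically move the first `shift` tokens to the end (assumed shift op)."""
    n = len(tokens)
    if n == 0:
        return []
    s = shift % n
    return list(tokens[s:]) + list(tokens[:s])


def shuffle_into_groups(tokens: Sequence[T], groups: int) -> List[List[T]]:
    """Interleave tokens so group g receives tokens g, g + groups, ... (assumed shuffle op)."""
    if groups <= 0 or len(tokens) % groups != 0:
        raise ValueError("token count must be a positive multiple of groups")
    return [list(tokens[g::groups]) for g in range(groups)]


def local_groups(tokens: Sequence[T], shift: int, groups: int) -> List[List[T]]:
    """Apply the shift, then the interleaving shuffle, to form local token groups."""
    return shuffle_into_groups(shift_tokens(tokens, shift), groups)


if __name__ == "__main__":
    # Integers stand in for patch embeddings; a real model would use tensors.
    print(local_groups(list(range(8)), shift=2, groups=2))
```

With eight tokens, `shift=2`, and `groups=2`, each resulting group holds every second token of the shifted sequence, so neighbouring patches land in different groups. The authors' actual operations may differ; the linked repository is the authoritative source.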
Item Type: | Conference or Workshop Item (UNSPECIFIED) |
---|---|
Divisions: | Offices > Office of International Cooperation |
Identification Number: | 10.1109/ICASSP49357.2023.10096126 |
Uncontrolled Keywords: | Complex background; Complex objects; Global vision; Multi-scales; Object occlusion; Object re-id; Patch modification augmentation; Public security; Re identifications; Vision transformer; Computer vision |
Additional Information: | Cited by 0; Conference of 48th IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2023; Conference Date: 4 June 2023 through 10 June 2023; Conference Code: 193814 |
URI: | http://eprints.lqdtu.edu.vn/id/eprint/11007 |