Phan, H.-H. and Ha, C.T. and Nguyen, T.T. (2020) Improving the efficiency of human action recognition using deep compression. In: 3rd International Conference on Multimedia Analysis and Pattern Recognition, MAPR 2020, 8 October 2020 through 9 October 2020.
Abstract
Convolutional neural networks (CNNs) have become a powerful method for many computer vision applications, including action recognition. However, most are computationally and memory intensive, and are therefore challenging to use and deploy on systems with limited resources; the exceptions are a few recent networks designed specifically for mobile and embedded vision applications. In this paper, we propose a novel feature for human action recognition, named MOMP Image, to be used as CNN input. The idea is simple but beneficial, since existing CNN models can be fine-tuned on it directly. We also propose a novel pruning algorithm that decreases computational cost and improves the accuracy of action recognition. The strategy measures the redundancy of parameters from their relationships, using covariance and correlation criteria, and then prunes the less important ones. Our method applies directly to CNNs, on both convolutional and fully connected layers, and requires no specialized software/hardware accelerators. The proposed method is the first to apply network compression to human action recognition. We evaluate our system on action classification using large-scale action datasets. Our method obtains promising performance compared to other approaches. The proposed method reduces model size and decreases over-fitting, and therefore increases the overall performance of CNNs on large-scale datasets. © 2020 IEEE.
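The abstract does not give the details of the pruning criterion, so the following is only a rough illustration of correlation-based redundancy pruning in general, not the authors' algorithm. It assumes that each filter (or output neuron) is flattened to a vector, that redundancy is measured by the absolute Pearson correlation between those vectors, and that of two highly correlated filters the one with the smaller L2 norm is dropped. The function name `prune_correlated_filters` and the `threshold` parameter are hypothetical.

```python
import numpy as np


def prune_correlated_filters(weights, threshold=0.9):
    """Return sorted indices of the filters to keep.

    Illustrative sketch only (assumed criterion, not the paper's method).
    weights: array of shape (n_filters, ...), e.g. a convolution kernel
        (out_channels, in_channels, kh, kw) or a dense layer (out, in).
    threshold: hypothetical cutoff on |Pearson correlation| above which
        two filters are treated as redundant.
    """
    # One row per filter, so that each filter is one "variable".
    flat = np.asarray(weights, dtype=float).reshape(weights.shape[0], -1)

    # Pairwise Pearson correlation between filters, shape (n, n).
    # Constant filters yield NaN; treat them as uncorrelated.
    with np.errstate(invalid="ignore", divide="ignore"):
        corr = np.nan_to_num(np.corrcoef(flat))

    # Visit filters from largest to smallest L2 norm and keep a filter
    # only if it is not strongly correlated with one already kept.
    norms = np.linalg.norm(flat, axis=1)
    keep = []
    for i in np.argsort(-norms):
        if all(abs(corr[i, j]) < threshold for j in keep):
            keep.append(int(i))
    return sorted(keep)
```

The returned indices could then be used to slice the layer's weight tensor (and the matching input channels of the following layer) before fine-tuning; that bookkeeping depends on the framework and is left out here.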
Item Type: | Conference or Workshop Item (Paper) |
---|---|
Divisions: | Faculties > Faculty of Information Technology |
Identification Number: | 10.1109/MAPR49794.2020.9237772 |
Uncontrolled Keywords: | Classification (of information); Convolution; Convolutional neural networks; Embedded systems; Large dataset; Action classifications; Action recognition; Computational costs; Computer vision applications; Human-action recognition; Large-scale datasets; Network compression; Specialized software; Pattern recognition |
Additional Information: | Conference code: 164647. Language of original document: English. |
URI: | http://eprints.lqdtu.edu.vn/id/eprint/8911 |