STEFF: Spatio-temporal EfficientNet for dynamic texture classification in outdoor scenes

Heliyon. 2024 Feb 5;10(3):e25360. doi: 10.1016/j.heliyon.2024.e25360. eCollection 2024 Feb 15.

Abstract

In recent years, dynamic texture classification has become an important task in computer vision. The task is challenging because the spatial and temporal nature of dynamic textures is unknown in advance. To address this, we investigate the potential of deep learning and propose a novel spatio-temporal approach (STEFF) for dynamic texture classification that combines the representational power of motion and appearance by applying difference and average operators between video sequences. We extract deep texture features from outdoor scenes and integrate both spatial and temporal features into a pre-trained Convolutional Neural Network model, namely EfficientNet, with fine-tuning and regularization. The robustness of the proposed approach is reflected in the promising results obtained when it is compared with the other architectures we considered and with existing models. Experimental results on three datasets demonstrate the effectiveness and efficiency of the approach, with accuracies of 95.95%, 94.09%, and 98.01% on the outdoor scenes of the Yupenn, DynTex++, and Yupenn++ datasets, respectively.
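As a rough illustration of the difference and average operators mentioned above, the following minimal sketch computes both maps for each pair of consecutive frames. The abstract does not define the operators precisely, so several choices here are assumptions: the operators are applied to consecutive grayscale frames, the difference is taken as an absolute value (a motion cue), the average is a plain pixel-wise mean (an appearance cue), and the function names are hypothetical. Frames are plain nested lists to keep the example dependency-free; an array library would normally be used instead.

```python
from typing import List, Tuple

# A grayscale frame as rows of pixel intensities (assumed representation).
Frame = List[List[float]]


def average_frame(a: Frame, b: Frame) -> Frame:
    """Pixel-wise mean of two frames (assumed appearance cue)."""
    return [[(pa + pb) / 2.0 for pa, pb in zip(ra, rb)] for ra, rb in zip(a, b)]


def difference_frame(a: Frame, b: Frame) -> Frame:
    """Pixel-wise absolute difference of two frames (assumed motion cue)."""
    return [[abs(pa - pb) for pa, pb in zip(ra, rb)] for ra, rb in zip(a, b)]


def spatio_temporal_pairs(frames: List[Frame]) -> List[Tuple[Frame, Frame]]:
    """Return an (average, difference) pair for each two consecutive frames."""
    return [
        (average_frame(f0, f1), difference_frame(f0, f1))
        for f0, f1 in zip(frames, frames[1:])
    ]
```

In a pipeline of the kind the abstract describes, maps like these would presumably be resized and passed to a pre-trained EfficientNet for fine-tuning; that step is omitted here because the abstract gives no details about it.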

Keywords: CNN; Deep learning; Dynamic texture; EfficientNet; Outdoor scene classification; STEFF; Spatio-temporal features.