AI-Based Detection of Unwanted Behavior: The Paradoxical Effect of Standard Data Augmentation in Video Surveillance
MathAI 2026 Selected Papers
Special Issue
Video Anomaly Detection
Behavioral Recognition
Computer Vision
Data Augmentation
Spatiotemporal Modeling
ResNet3D
DenseNet-RNN
Abstract
Public spaces and commercial environments face persistent challenges regarding human misconduct. Traditional surveillance remains passive, while manual monitoring is labor-intensive and inefficient. Consequently, automating the detection of unwanted behavior via digital assistants is essential. This study explores the application of deep learning models to identify such behavior and analyzes the impact of data augmentation on model performance.
We utilized the UCF-Crime dataset (1,900 videos, 13 classes) and constructed a novel custom dataset comprising 5,236 videos with a focus on ``violent behavior.'' Algorithms based on ResNet3D and DenseNet-RNN were trained on both original and augmented versions of these datasets. On the UCF-Crime dataset, the DenseNet-RNN model showed stability only in detecting the ``arson'' class. However, the model's stability and recognition quality improved significantly when trained on the custom dataset.
Crucially, our experiments demonstrated that data augmentation negatively impacted recognition quality on the custom dataset ($F(1,36)=7.40$, $p=0.01$). Specifically, the ResNet3D method exhibited a significant degradation in performance, with the AUC dropping from 0.83 (95\% CI: $[0.80, 0.86]$) before augmentation to 0.73 (95\% CI: $[0.69, 0.77]$) after augmentation. The DenseNet-RNN method showed a similar downward trend (AUC 0.82, 95\% CI: $[0.79, 0.85]$ before vs. AUC 0.75, 95\% CI: $[0.71, 0.79]$ after). These findings suggest that the blind application of standard augmentation procedures, a common industry practice for both static images and video frames, can lead to a paradoxical decline in detection quality. This highlights the need to critically reevaluate how augmentations are selected and to develop task-specific methodologies that ensure transformations improve, rather than hinder, model efficacy in behavioral recognition.
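For readers who want to reproduce interval estimates of the kind reported above, the following is a minimal sketch of one common approach: a rank-based AUC with a percentile bootstrap confidence interval. The abstract does not state how the intervals were computed, so the bootstrap procedure, the function names `roc_auc` and `bootstrap_auc_ci`, and the parameters `n_boot`, `alpha`, and `seed` are assumptions made for illustration only.

```python
import numpy as np


def roc_auc(labels, scores):
    """AUC from the Mann-Whitney U statistic; tied scores get average ranks."""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    n_pos = int(labels.sum())
    n_neg = labels.size - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC is undefined unless both classes are present")
    # Average 1-based rank of every score, shared within each tie group.
    _, inverse, counts = np.unique(scores, return_inverse=True, return_counts=True)
    last_rank = np.cumsum(counts)
    first_rank = last_rank - counts + 1
    ranks = ((first_rank + last_rank) / 2.0)[inverse]
    u = ranks[labels].sum() - n_pos * (n_pos + 1) / 2.0
    return float(u / (n_pos * n_neg))


def bootstrap_auc_ci(labels, scores, n_boot=2000, alpha=0.05, seed=0):
    """Point AUC plus a percentile bootstrap interval (hypothetical helper).

    Assumption: videos are resampled with replacement as independent units.
    """
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    point = roc_auc(labels, scores)  # also validates that both classes exist
    rng = np.random.default_rng(seed)
    n = labels.size
    stats = []
    while len(stats) < n_boot:
        idx = rng.integers(0, n, size=n)
        sample_labels = labels[idx]
        if sample_labels.all() or not sample_labels.any():
            continue  # resample lost one class; AUC undefined, draw again
        stats.append(roc_auc(sample_labels, scores[idx]))
    lower, upper = np.quantile(stats, [alpha / 2.0, 1.0 - alpha / 2.0])
    return point, float(lower), float(upper)
```

Calling `bootstrap_auc_ci(y_true, y_score)` on per-video labels and model scores returns the point estimate with the lower and upper bounds. Comparing intervals obtained before and after augmentation gives a quick first check, although it does not replace the formal test reported above.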
Cite this article
Emelianov, M.V. AI-Based Detection of Unwanted Behavior: The Paradoxical Effect of Standard Data Augmentation in Video Surveillance. Mathematics & AI 2026, 1, 4. https://enigma.ist/j/mathematics-ai/1/2/4