AI-Based Detection of Unwanted Behavior: The Paradoxical Effect of Standard Data Augmentation in Video Surveillance

Emelianov, M.V.

Article #1004

Issue MathAI 2026 Selected Papers Special Issue

Received 31 Mar 2026

Accepted 15 May 2026

Published 22 May 2026

Maksim Emelianov *

Keywords: Video Anomaly Detection; Behavioral Recognition; Computer Vision; Data Augmentation; Spatiotemporal Modeling; ResNet3D; DenseNet-RNN

Abstract

Public spaces and commercial environments face persistent problems with human misconduct. Conventional surveillance is passive, and manual monitoring is labor-intensive and inefficient, so automating the detection of unwanted behavior with digital assistants is essential. This study applies deep learning models to identify such behavior and analyzes the impact of data augmentation on model performance. We used the UCF-Crime dataset (1,900 videos, 13 classes) and constructed a custom dataset of 5,236 videos focused on "violent behavior." Models based on ResNet3D and DenseNet-RNN were trained on both the original and augmented versions of these datasets. On UCF-Crime, the DenseNet-RNN model was stable only when detecting the "arson" class; its stability and recognition quality improved significantly when it was trained on the custom dataset. Crucially, data augmentation reduced recognition quality on the custom dataset ($F(1,36)=7.40$, $p=0.01$). ResNet3D showed a significant degradation, with the AUC dropping from 0.83 (95% CI: [0.80, 0.86]) before augmentation to 0.73 (95% CI: [0.69, 0.77]) after augmentation. DenseNet-RNN showed a similar downward trend, from an AUC of 0.82 (95% CI: [0.79, 0.85]) to 0.75 (95% CI: [0.71, 0.79]). These findings suggest that blindly applying standard augmentation procedures, a common industry practice for both static images and video frames, can lead to a paradoxical decline in detection quality. This highlights the urgent need to critically reevaluate how augmentations are selected and to develop task-specific methodologies that ensure transformations improve, rather than hinder, model efficacy in behavioral recognition.
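
The abstract reports AUC values with 95% confidence intervals but does not say how the intervals were obtained. As an illustration only, the sketch below shows one common way to produce such numbers: a percentile bootstrap over per-video scores. The function names (`auc`, `bootstrap_auc_ci`), the bootstrap procedure, and the toy data are assumptions made for this example and are not taken from the article.

```python
import random


def auc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) identity.

    labels: 0/1 ground truth; scores: classifier scores (higher = more anomalous).
    Tied positive/negative pairs count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=2000, alpha=0.05, seed=0):
    """Point AUC plus a percentile-bootstrap (1 - alpha) interval.

    Hypothetical helper: resamples videos with replacement and recomputes
    the AUC; resamples that contain only one class are redrawn.
    """
    point = auc(labels, scores)  # also validates that both classes are present
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # AUC is undefined without both classes
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return point, lo, hi


if __name__ == "__main__":
    # Toy scores, not data from the study.
    y = [0, 0, 1, 1, 0, 1, 0, 1]
    s = [0.10, 0.40, 0.35, 0.80, 0.20, 0.70, 0.55, 0.60]
    point, lo, hi = bootstrap_auc_ci(y, s, n_boot=500)
    print(f"AUC {point:.2f} (95% CI: [{lo:.2f}, {hi:.2f}])")
```

Running the same helper on scores from models trained with and without augmentation would give two intervals that can be compared side by side. The pairwise AUC is quadratic in the number of videos, so a rank-based implementation would be preferable for large test sets.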

Cite this article

Emelianov, M.V. AI-Based Detection of Unwanted Behavior: The Paradoxical Effect of Standard Data Augmentation in Video Surveillance. Mathematics & AI 2026, 1, 4. https://enigma.ist/j/mathematics-ai/1/2/4
