Novel Deep Feature Fusion Framework for Multi-Scenario Violence Detection
Abstract
1. Introduction
- A novel generalisation framework, the Concatenation model, has been proposed to address the generalisation problem in video violence detection. This approach offers the flexibility to incorporate new datasets without retraining the existing models for new tasks;
- An interior deep feature fusion approach, the Fusion model, has been adopted to enhance feature representation by integrating multiple DL models;
- Three models pre-trained on ImageNet have been utilised in this paper, leveraging transfer learning (TL) to tackle data scarcity, improve feature representation, and reduce the risk of overfitting;
- The Fusion model attained 97.66% accuracy on the RLVS dataset and 92.89% on the Hockey dataset with interior fusion, outperforming the best existing methods;
- The proposed Concatenation model attained 97.64% accuracy on the RLVS dataset and 92.41% on the Hockey dataset with the same classifier. To the best of our knowledge, no existing method applies a single model to multiple tasks in video anomaly detection in this way;
- The results are further validated and explained using the Grad-CAM technique.
2. Related Works
- The methods mentioned above have a significant issue with generalisation. They must be retrained from scratch when new datasets are added for new tasks, which limits their ability to perform well across varied situations. This problem makes anomaly detection systems less practical and efficient. Therefore, new approaches are needed that solve the generalisation problem without extensive retraining.
- The methodologies in [5,6,7,8,9,17,18,19,20,22,25,26,28] share a challenge in integrating new models into an existing framework: the existing models must be retrained from scratch, placing substantial demands on computational resources and time. This impedes the efficiency and practicality of incorporating new models and underscores the need for alternative approaches that alleviate the burden of extensive retraining while maintaining or improving performance.
- The existing approaches in [8,17,18,19,20,22] employed a single model for feature extraction, thereby missing the opportunity for a richer feature representation. Different models have different strengths in capturing specific features or patterns in the data; combining the capabilities of multiple models yields a more inclusive and diverse representation that covers a broader spectrum of patterns and relationships. Integrating multiple models also mitigates the risk of overfitting and enhances the overall generalisation capacity of the model.
3. Materials and Methods
3.1. Datasets
3.2. State-of-the-Art Architectures
3.2.1. InceptionV3 Model
3.2.2. InceptionResNetV2 Model
3.2.3. Xception Model
- The global average pooling (GAP) layer, which reduces the dimensions of the feature maps and produces a fixed-length feature vector by computing the average value of each feature map;
- The flatten layer, which transforms the multi-dimensional feature maps into a one-dimensional representation, facilitating subsequent processing;
- The dense (fully connected) layer, which aims to capture intricate patterns and establish complex relationships within the feature vector. This layer allows for comprehensive feature representation through its connectivity to every element of the preceding layer;
- The dropout layer, which aims to deactivate neurons during training to prevent overfitting. Doing so encourages the model to learn more robust and generalisable features;
- The SoftMax layer, which assigns class probabilities to the two types of human behaviour: violence and normal behaviour. The softmax activation function computes a probability distribution, ensuring that the predicted probabilities sum to 1. A minimal sketch of this classification head follows this list.
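To make the pipeline concrete, the following is a minimal Keras sketch of one backbone with this classification head attached. The paper does not specify its framework or hyperparameters in this excerpt, so TensorFlow/Keras, the input size, the dense width (256), and the dropout rate (0.5) are assumptions for illustration; InceptionResNetV2 and Xception would be wired identically.

```python
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.applications import InceptionV3

# ImageNet-pre-trained backbone without its original classifier.
backbone = InceptionV3(weights="imagenet", include_top=False,
                       input_shape=(224, 224, 3))
backbone.trainable = False  # transfer learning: keep pre-trained weights frozen

model = models.Sequential([
    backbone,
    layers.GlobalAveragePooling2D(),        # one average per feature map -> fixed-length vector
    layers.Flatten(),                       # pass-through after GAP, kept to mirror the layer list
    layers.Dense(256, activation="relu"),   # width 256 is an assumption
    layers.Dropout(0.5),                    # rate 0.5 is an assumption
    layers.Dense(2, activation="softmax"),  # violence vs. normal behaviour
])
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
```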
3.3. Proposed Solutions
3.3.1. The Architecture of the Proposed Fusion Model
- Flexibility: It offers a flexible approach for fusing multiple CNN models without training them from scratch. Instead, the new models are trained separately on specific datasets of interest, and the extracted features are added to the existing feature pool. This approach saves significant time and effort, reduces the computational resources required, and eliminates the need to retrain the pre-trained models.
- Better feature representation: It captures features from different models, achieving a better feature representation than any single model. Different models have different strengths in capturing certain features or patterns in the data, so fusing them creates a more comprehensive and diverse representation that captures a wider range of patterns and relationships. Moreover, combining multiple models can reduce the risk of overfitting and improve the model’s generalisation ability. A minimal sketch of this interior fusion appears after this list.
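Under the same Keras assumption, the sketch below shows the interior fusion: each frozen backbone maps the same frame to a pooled feature vector, and the vectors are concatenated into a single pool before classification. The input size and the single softmax classifier are illustrative choices, and per-model input preprocessing is omitted for brevity.

```python
import tensorflow as tf
from tensorflow.keras import layers, Model
from tensorflow.keras.applications import (InceptionV3, InceptionResNetV2,
                                           Xception)

frame = layers.Input(shape=(224, 224, 3))

# Three frozen ImageNet backbones act as parallel feature extractors.
features = []
for Backbone in (InceptionV3, InceptionResNetV2, Xception):
    net = Backbone(weights="imagenet", include_top=False, pooling="avg")
    net.trainable = False
    features.append(net(frame))

# Interior fusion: concatenate the per-model vectors into one feature pool.
fused = layers.Concatenate()(features)   # 2048 + 1536 + 2048 = 5632-d vector
output = layers.Dense(2, activation="softmax")(fused)

fusion_model = Model(frame, output)
```

Because only the final classifier has trainable weights here, adding a further backbone later enlarges the feature pool without touching the existing extractors.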
3.3.2. The Structure of the Concatenation Model
- It has improved generalisation capabilities compared to individual models trained on specific datasets. By fusing features from different pre-trained CNN models and incorporating dataset-specific models, the model leverages the strengths of each component to perform well on previously unseen data, combining knowledge from diverse violent scenarios.
- The Concatenation model has been developed with scalability in mind, ensuring that it can efficiently incorporate new datasets. This allows the model to adapt and excel in various anomaly scenarios, including violence, arson, and road accidents, without complete retraining of the entire model. The capability to integrate new datasets enhances the model’s versatility and practicality in real-world settings; a sketch of this incremental extension follows.
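A sketch of that incremental workflow, using NumPy and scikit-learn: features exported by dataset-specific extractors are concatenated per frame, and only a lightweight classifier is refit. The `.npy` file names and shapes are hypothetical placeholders.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical feature banks exported by dataset-specific extractors
# (e.g., a Fusion model trained on RLVS and another trained on Hockey).
rlvs_feats = np.load("rlvs_features.npy")      # shape (n_frames, d1), placeholder file
hockey_feats = np.load("hockey_features.npy")  # shape (n_frames, d2), placeholder file
labels = np.load("labels.npy")                 # 0 = normal, 1 = violence

# Concatenation model: pool the per-extractor features for every frame and
# train only a lightweight classifier on top.
X = np.concatenate([rlvs_feats, hockey_feats], axis=1)
classifier = LogisticRegression(max_iter=1000).fit(X, labels)

# A new scenario (e.g., arson) would only require training one new extractor
# on its dataset, appending its features as an extra column block here, and
# refitting this classifier; the existing extractors stay untouched.
```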
3.4. Training
- Training and evaluating the three models using the RLVS dataset;
- Training and evaluating the three models using the Hockey dataset;
- Assessing the performance of the proposed Fusion model, which incorporates features from the RLVS dataset, through separate tests on both the RLVS and Hockey datasets;
- Evaluating the proposed Fusion model using features from the Hockey dataset, and conducting separate tests on both the Hockey and RLVS datasets;
- Finally, evaluating the proposed Concatenation model, which integrates the extracted features from both the RLVS and Hockey datasets, through tests on the RLVS dataset followed by the Hockey dataset. A sketch of the classifier comparison used in these evaluations follows this list.
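Throughout these evaluations, the extracted features are fed to several shallow classifiers (those reported in Section 4). A self-contained scikit-learn sketch of that comparison is given below; the synthetic features stand in for the real fused features, and the tables’ SoftMax classifier (a single softmax layer) is omitted here since it is not a stock scikit-learn estimator.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.ensemble import AdaBoostClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, recall_score,
                             precision_score, f1_score)

# Synthetic stand-in for the fused deep features and binary labels.
X, y = make_classification(n_samples=1000, n_features=64, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

classifiers = {
    "Naïve Bayes": GaussianNB(),
    "KNN": KNeighborsClassifier(),
    "SVM": SVC(),
    "AdaBoost": AdaBoostClassifier(),
    "LogReg": LogisticRegression(max_iter=1000),
}

for name, clf in classifiers.items():
    y_pred = clf.fit(X_train, y_train).predict(X_test)
    print(f"{name}: acc={accuracy_score(y_test, y_pred):.4f}, "
          f"rec={recall_score(y_test, y_pred):.4f}, "
          f"prec={precision_score(y_test, y_pred):.4f}, "
          f"f1={f1_score(y_test, y_pred):.4f}")
```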
3.5. Experiment Setup and Training Options
3.6. Grad-CAM
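Grad-CAM localises the image regions that drive a prediction by weighting the last convolutional feature maps with the gradients of the class score, then averaging and rectifying the result. A generic TensorFlow sketch of this standard computation is given below; the model, the `last_conv_layer_name`, and the inspected class index are assumed inputs rather than the paper’s exact configuration.

```python
import tensorflow as tf

def grad_cam(model, image, last_conv_layer_name, class_index):
    """Return a [0, 1]-normalised Grad-CAM heatmap for one image."""
    # Expose both the chosen conv feature maps and the final prediction.
    grad_model = tf.keras.Model(
        model.inputs,
        [model.get_layer(last_conv_layer_name).output, model.output])

    with tf.GradientTape() as tape:
        conv_maps, preds = grad_model(image[None, ...])  # add batch dimension
        class_score = preds[:, class_index]

    grads = tape.gradient(class_score, conv_maps)  # d(score)/d(feature maps)
    weights = tf.reduce_mean(grads, axis=(1, 2))   # average over spatial dims

    # Weighted sum of feature maps, keeping only positive influence.
    cam = tf.reduce_sum(weights[:, None, None, :] * conv_maps, axis=-1)
    cam = tf.nn.relu(cam)[0]
    return (cam / (tf.reduce_max(cam) + 1e-8)).numpy()
```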
4. Results
4.1. Performance Evaluation Metrics
- TP—true positive; TN—true negative;
- FP—false positive; FN—false negative (the corresponding metric definitions are given below).
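In terms of these counts, the four reported metrics follow their standard definitions:

$$
\begin{aligned}
\text{Accuracy} &= \frac{TP + TN}{TP + TN + FP + FN}, &
\text{Recall} &= \frac{TP}{TP + FN}, \\[4pt]
\text{Precision} &= \frac{TP}{TP + FP}, &
F_1 &= \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}.
\end{aligned}
$$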
4.2. Evaluation of Individual Models
4.2.1. Experiment Results on RLVS Dataset
4.2.2. Experiment Results on Hockey Dataset
4.3. Experimental Results of the Fusion Model
4.3.1. Experimental Results on RLVS Dataset
4.3.2. Experimental Results on Hockey Dataset
4.4. Experimental Results of the Concatenation Model
4.5. State-of-the-Art Analysis
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Jebur, S.A.; Hussein, K.A.; Hoomod, H.K.; Alzubaidi, L.; Santamaría, J. Review on Deep Learning Approaches for Anomaly Event Detection in Video Surveillance. Electronics 2022, 12, 29.
- Amin, J.; Anjum, M.A.; Ibrar, K.; Sharif, M.; Kadry, S.; Crespo, R.G. Detection of Anomaly in Surveillance Videos Using Quantum Convolutional Neural Networks. Image Vis. Comput. 2023, 135, 104710.
- Abd, W.H.; Sadiq, A.T.; Hussein, K.A. Human Fall down Recognition Using Coordinates Key Points Skeleton. In Proceedings of the 2022 3rd Information Technology to Enhance E-Learning and Other Application (IT-ELA), Baghdad, Iraq, 27–28 December 2022; IEEE: Piscataway, NJ, USA, 2022; pp. 232–237.
- Ali, M.A.; Hussain, A.J.; Sadiq, A.T. Deep Learning Algorithms for Human Fighting Action Recognition. Int. J. Online Biomed. Eng. 2022, 18, 71–87.
- Naik, A.J.; Gopalakrishna, M.T. Deep-Violence: Individual Person Violent Activity Detection in Video. Multimed. Tools Appl. 2021, 80, 18365–18380.
- Traoré, A.; Akhloufi, M.A. Violence Detection in Videos Using Deep Recurrent and Convolutional Neural Networks. In Proceedings of the 2020 IEEE International Conference on Systems, Man, and Cybernetics (SMC), Toronto, ON, Canada, 11–14 October 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 154–159.
- Gadelkarim, M.; Khodier, M.; Gomaa, W. Violence Detection and Recognition from Diverse Video Sources. In Proceedings of the 2022 International Joint Conference on Neural Networks (IJCNN), Padova, Italy, 18–23 July 2022; IEEE: Piscataway, NJ, USA, 2022; pp. 1–8.
- Irfanullah; Hussain, T.; Iqbal, A.; Yang, B.; Hussain, A. Real Time Violence Detection in Surveillance Videos Using Convolutional Neural Networks. Multimed. Tools Appl. 2022, 81, 38151–38173.
- Vijeikis, R.; Raudonis, V.; Dervinis, G. Efficient Violence Detection in Surveillance. Sensors 2022, 22, 2216.
- Kang, M.; Park, R.-H.; Park, H.-M. Efficient Spatio-Temporal Modeling Methods for Real-Time Violence Recognition. IEEE Access 2021, 9, 76270–76285.
- Abdali, A.-M.R.; Al-Tuma, R.F. Robust Real-Time Violence Detection in Video Using CNN and LSTM. In Proceedings of the 2019 2nd Scientific Conference of Computer Sciences (SCCS), Baghdad, Iraq, 27–28 March 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 104–108.
- Ali, L.R.; Shaker, B.N.; Jebur, S.A. An Extensive Study of Sentiment Analysis Techniques: A Survey. In Proceedings of the AIP Conference Proceedings, Baghdad, Iraq, 8–9 December 2021; AIP Publishing: Baghdad, Iraq, 2023; Volume 2591.
- Alzubaidi, L.; Zhang, J.; Humaidi, A.J.; Al-Dujaili, A.; Duan, Y.; Al-Shamma, O.; Santamaría, J.; Fadhel, M.A.; Al-Amidie, M.; Farhan, L. Review of Deep Learning: Concepts, CNN Architectures, Challenges, Applications, Future Directions. J. Big Data 2021, 8, 53.
- Al-Khazraji, L.R.A.; Abbas, A.R.; Jamil, A.S. A Systematic Review of Deep Dream. Iraqi J. Comput. Commun. Control Syst. Eng. 2023, 23, 192–209.
- Ali, L.R.; Jebur, S.A.; Jahefer, M.M.; Shaker, B.N. Employing Transfer Learning for Diagnosing COVID-19 Disease. Int. J. Online Biomed. Eng. 2022, 18, 31–42.
- Abdulhadi, M.T.; Abbas, A.R. Human Action Behavior Recognition in Still Images with Proposed Frames Selection Using Transfer Learning. Int. J. Online Biomed. Eng. 2023, 19, 47.
- Jebur, S.A.; Hussein, K.A.; Hoomod, H.K. Improving Abnormal Behavior Detection in Video Surveillance Using Inception-v3 Transfer Learning. Iraqi J. Comput. Commun. Control Syst. Eng. 2023, 23, 201–221.
- Durães, D.; Santos, F.; Marcondes, F.S.; Lange, S.; Machado, J. Comparison of Transfer Learning Behaviour in Violence Detection with Different Public Datasets. In Progress in Artificial Intelligence, Proceedings of the 20th EPIA Conference on Artificial Intelligence, EPIA 2021, Virtual Event, 7–9 September 2021; Springer: Berlin/Heidelberg, Germany, 2021; pp. 290–298.
- Khan, S.U.; Haq, I.U.; Rho, S.; Baik, S.W.; Lee, M.Y. Cover the Violence: A Novel Deep-Learning-Based Approach towards Violence-Detection in Movies. Appl. Sci. 2019, 9, 4963.
- Mumtaz, A.; Sargano, A.B.; Habib, Z. Violence Detection in Surveillance Videos with Deep Network Using Transfer Learning. In Proceedings of the 2018 2nd European Conference on Electrical Engineering and Computer Science (EECS), Bern, Switzerland, 20–22 December 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 558–563.
- Alzubaidi, L.; Bai, J.; Al-Sabaawi, A.; Santamaría, J.; Albahri, A.S.; Al-dabbagh, B.S.N.; Fadhel, M.A.; Manoufali, M.; Zhang, J.; Al-Timemy, A.H. A Survey on Deep Learning Tools Dealing with Data Scarcity: Definitions, Challenges, Solutions, Tips, and Applications. J. Big Data 2023, 10, 46.
- Imah, E.M.; Wintarti, A. Violence Classification Using Support Vector Machine and Deep Transfer Learning Feature Extraction. In Proceedings of the 2021 International Seminar on Intelligent Technology and Its Applications (ISITIA), Virtual, 21–22 July 2021; IEEE: Piscataway, NJ, USA, 2021; pp. 337–342.
- Alzubaidi, L.; Duan, Y.; Al-Dujaili, A.; Ibraheem, I.K.; Alkenani, A.H.; Santamaría, J.; Fadhel, M.A.; Al-Shamma, O.; Zhang, J. Deepening into the Suitability of Using Pre-Trained Models of ImageNet against a Lightweight Convolutional Neural Network in Medical Imaging: An Experimental Study. PeerJ Comput. Sci. 2021, 7, e715.
- Albahri, A.S.; Duhaim, A.M.; Fadhel, M.A.; Alnoor, A.; Baqer, N.S.; Alzubaidi, L.; Albahri, O.S.; Alamoodi, A.H.; Bai, J.; Salhi, A. A Systematic Review of Trustworthy and Explainable Artificial Intelligence in Healthcare: Assessment of Quality, Bias Risk, and Data Fusion. Inf. Fusion 2023, 96, 156–191.
- Sernani, P.; Falcionelli, N.; Tomassini, S.; Contardo, P.; Dragoni, A.F. Deep Learning for Automatic Violence Detection: Tests on the AIRTLab Dataset. IEEE Access 2021, 9, 160580–160595.
- Chexia, Z.; Tan, Z.; Wu, D.; Ning, J.; Zhang, B. A Generalized Model for Crowd Violence Detection Focusing on Human Contour and Dynamic Features. In Proceedings of the 2022 22nd IEEE International Symposium on Cluster, Cloud and Internet Computing (CCGrid), Taormina, Italy, 16–19 May 2022; IEEE: Piscataway, NJ, USA, 2022; pp. 327–335.
- Kotkar, V.A.; Sucharita, V. Fast Anomaly Detection in Video Surveillance System Using Robust Spatiotemporal and Deep Learning Methods. Multimed. Tools Appl. 2023, 82, 34259–34286.
- Huillcen Baca, H.A.; de Luz Palomino Valdivia, F.; Solis, I.S.; Cruz, M.A.; Caceres, J.C.G. Human Violence Recognition in Video Surveillance in Real-Time. In Future of Information and Communication Conference (FICC); Springer Nature: Cham, Switzerland, 2023; pp. 783–795.
- Soliman, M.M.; Kamal, M.H.; Nashed, M.A.E.-M.; Mostafa, Y.M.; Chawky, B.S.; Khattab, D. Violence Recognition from Videos Using Deep Learning Techniques. In Proceedings of the 2019 Ninth International Conference on Intelligent Computing and Information Systems (ICICIS), Cairo, Egypt, 8–9 December 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 80–85.
- Bermejo Nievas, E.; Deniz Suarez, O.; Bueno García, G.; Sukthankar, R. Violence Detection in Video Using Computer Vision Techniques. In Proceedings of the International Conference on Computer Analysis of Images and Patterns, Seville, Spain, 29–31 August 2011; Springer: Berlin/Heidelberg, Germany, 2011; pp. 332–339.
- Szegedy, C.; Vanhoucke, V.; Ioffe, S.; Shlens, J.; Wojna, Z. Rethinking the Inception Architecture for Computer Vision. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 2818–2826.
- Peng, S.; Huang, H.; Chen, W.; Zhang, L.; Fang, W. More Trainable Inception-ResNet for Face Recognition. Neurocomputing 2020, 411, 9–19.
- Chollet, F. Xception: Deep Learning with Depthwise Separable Convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 June 2017; pp. 1251–1258.
- Huang, C.; Wang, X.; Cao, J.; Wang, S.; Zhang, Y. HCF: A Hybrid CNN Framework for Behavior Detection of Distracted Drivers. IEEE Access 2020, 8, 109335–109349.
- Selvaraju, R.R.; Das, A.; Vedantam, R.; Cogswell, M.; Parikh, D.; Batra, D. Grad-CAM: Why Did You Say That? arXiv 2016, arXiv:1611.07450.
- Saporta, A.; Gui, X.; Agrawal, A.; Pareek, A.; Truong, S.Q.H.; Nguyen, C.D.T.; Ngo, V.-D.; Seekins, J.; Blankenberg, F.G.; Ng, A.Y. Benchmarking Saliency Methods for Chest X-Ray Interpretation. Nat. Mach. Intell. 2022, 4, 867–878.
- Bi, Y.; Li, D.; Luo, Y. Combining Keyframes and Image Classification for Violent Behavior Recognition. Appl. Sci. 2022, 12, 8014.
- Deniz, O.; Serrano, I.; Bueno, G.; Kim, T.-K. Fast Violence Detection in Video. In Proceedings of the 2014 International Conference on Computer Vision Theory and Applications (VISAPP), Lisbon, Portugal, 5–8 January 2014; IEEE: Piscataway, NJ, USA, 2014; Volume 2, pp. 478–485.
- Huang, J.-F.; Chen, S.-L. Detection of Violent Crowd Behavior Based on Statistical Characteristics of the Optical Flow. In Proceedings of the 2014 11th International Conference on Fuzzy Systems and Knowledge Discovery (FSKD), Xiamen, China, 19–21 August 2014; IEEE: Piscataway, NJ, USA, 2014; pp. 565–569.
- Schwarz, K.; Fragkias, M.; Boone, C.G.; Zhou, W.; McHale, M.; Grove, J.M.; O’Neil-Dunne, J.; McFadden, J.P.; Buckley, G.L.; Childers, D. Trees Grow on Money: Urban Tree Canopy Cover and Environmental Justice. PLoS ONE 2015, 10, e0122051.
- Gao, Y.; Liu, H.; Sun, X.; Wang, C.; Liu, Y. Violence Detection Using Oriented Violent Flows. Image Vis. Comput. 2016, 48, 37–41.
- Serrano, I.; Deniz, O.; Bueno, G.; Garcia-Hernando, G.; Kim, T.-K. Spatio-Temporal Elastic Cuboid Trajectories for Efficient Fight Recognition Using Hough Forests. Mach. Vis. Appl. 2018, 29, 207–217.
- Garcia-Cobo, G.; SanMiguel, J.C. Human Skeletons and Change Detection for Efficient Violence Detection in Surveillance Videos. Comput. Vis. Image Underst. 2023, 233, 103739.
Dataset Name | Class Name | Group Name | No. of Clips | No. of Frames
---|---|---|---|---
RLVS | Violence | Training | 719 | 10,659
RLVS | Violence | Testing | 175 | 2526
RLVS | Non-violence | Training | 800 | 10,661
RLVS | Non-violence | Testing | 200 | 2521
Hockey | Violence | Training | 397 | 15,795
Hockey | Violence | Testing | 103 | 3950
Hockey | Non-violence | Training | 399 | 15,776
Hockey | Non-violence | Testing | 101 | 3882
Model | Accuracy (%) | Recall (%) | Precision (%) | F1 Score (%) |
---|---|---|---|---|
Inception | 96.0 | 95.88 | 96.18 | 96.0 |
InceptionResNet | 96.19 | 95.32 | 97.0 | 96.16 |
Xception | 96.17 | 96.75 | 95.65 | 96.20 |
Model | Accuracy (%) | Recall (%) | Precision (%) | F1 Score (%) |
---|---|---|---|---|
Inception | 93.75 | 91.59 | 95.82 | 93.66 |
InceptionResNet | 88.72 | 91.0 | 87.18 | 89.05 |
Xception | 92.41 | 95.50 | 90.0 | 92.69 |
Classifier | Accuracy (%) | Recall (%) | Precision (%) | F1 Score (%) |
---|---|---|---|---|
Naïve Bayes | 96.69 | 99.16 | 94.49 | 96.77 |
KNN | 97.12 | 97.98 | 96.34 | 97.15 |
SoftMax | 97.58 | 97.66 | 97.50 | 97.58 |
SVM | 97.60 | 97.70 | 97.51 | 97.60 |
AdaBoost | 97.60 | 98.06 | 97.17 | 97.61 |
LogReg | 97.66 | 98.06 | 97.28 | 97.67 |
Classifier | Accuracy (%) | Recall (%) | Precision (%) | F1 Score (%) |
---|---|---|---|---|
KNN | 85.88 | 80.31 | 90.60 | 85.15 |
SVM | 92.23 | 93.39 | 91.37 | 92.37 |
LogReg | 92.24 | 93.39 | 91.40 | 92.38 |
SoftMax | 92.29 | 93.65 | 91.28 | 92.45 |
AdaBoost | 92.30 | 93.09 | 91.76 | 92.42 |
Naïve Bayes | 92.89 | 88.06 | 97.60 | 92.59 |
Classifier | Accuracy (%) | Recall (%) | Precision (%) | F1 Score (%) |
---|---|---|---|---|
Naïve Bayes | 37.19 | 3.37 | 10.76 | 5.14 |
SVM | 41.50 | 0.43 | 2.55 | 0.73 |
SoftMax | 41.57 | 0.40 | 2.42 | 0.69 |
LogReg | 41.87 | 0.35 | 2.21 | 0.61 |
KNN | 52.47 | 71.45 | 52.08 | 60.25 |
AdaBoost | 58.82 | 60.81 | 58.85 | 59.82 |
Classifier | Accuracy (%) | Recall (%) | Precision (%) | F1 Score (%) |
---|---|---|---|---|
SoftMax | 37.05 | 16.19 | 27.84 | 20.47 |
SVM | 37.07 | 14.13 | 26.17 | 18.35 |
KNN | 51.71 | 57.0 | 51.59 | 54.16 |
LogReg | 62.88 | 85.78 | 58.86 | 69.82 |
AdaBoost | 62.98 | 85.15 | 59.02 | 69.72 |
Naïve Bayes | 65.70 | 69.12 | 64.73 | 66.82 |
Classifier | Accuracy (%) | Recall (%) | Precision (%) | F1 Score (%) |
---|---|---|---|---|
KNN | 95.00 | 93.03 | 96.86 | 94.91 |
Naïve Bayes | 97.34 | 98.57 | 96.21 | 97.37 |
SoftMax | 97.60 | 97.78 | 97.43 | 97.60 |
SVM | 97.60 | 97.70 | 97.51 | 97.60 |
AdaBoost | 97.60 | 98.06 | 97.17 | 97.61 |
LogReg | 97.64 | 97.98 | 97.32 | 97.65 |
Classifier | Accuracy (%) | Recall (%) | Precision (%) | F1 Score (%) |
---|---|---|---|---|
KNN | 90.91 | 90.32 | 91.53 | 90.92 |
SoftMax | 92.24 | 93.47 | 91.33 | 92.39 |
SVM | 92.24 | 93.39 | 91.40 | 92.38 |
AdaBoost | 92.25 | 94.0 | 90.93 | 92.44 |
LogReg | 92.30 | 93.77 | 91.20 | 92.47 |
Naïve Bayes | 92.41 | 96.03 | 89.64 | 92.73 |
Ref., Year | Method | Accuracy % |
---|---|---|
[29], 2019 | VGG16 + LSTM | 88.20 |
[6], 2020 | ValdNet2 (GRU) | 96.74 |
[18], 2021 | Flow Gated RGB | 87.25 |
[37], 2022 | keyframe-based ResNet18 | 94.60 |
[26], 2022 | HD-NET | 96.50 |
Proposed, 2023 | Fusion model | 97.66 |
Ref., Year | Method | Accuracy % |
---|---|---|
[30], 2011 | STIP (HOG) + HIK | 91.7 |
[38], 2014 | Histograms of frequency-based motion intensities + AdaBoost | 90.1 |
[39], 2014 | The variance of optical flow, SVM | 86.9 |
[40], 2015 | Motion blobs + Random Forests | 82.4 |
[41], 2016 | ViF, OViF, AdaBoost and SVM | 87.5 |
[42], 2018 | STEC + Hough Forests | 82.6 |
[19], 2019 | MobileNet | 87.0 |
[29], 2019 | VGG16 + LSTM | 86.20 |
[18], 2021 | Flow Gated RGB | 92.0 |
[43], 2023 | ConvLSTM | 91.0 |
Proposed, 2023 | Fusion model | 92.89 |