Search Results (30)

Search Parameters:
Keywords = crowd video monitoring

23 pages, 20665 KiB  
Article
Motion-Status-Driven Piglet Tracking Method for Monitoring Piglet Movement Patterns Under Sow Posture Changes
by Aqing Yang, Shimei Li, Shuqin Tu, Na Han, Lei Zhang, Yizhi Luo and Yueju Xue
Vet. Sci. 2025, 12(7), 616; https://doi.org/10.3390/vetsci12070616 - 24 Jun 2025
Viewed by 449
Abstract
Understanding how piglets move around sows during posture changes is crucial for their safety and healthy growth. Automated monitoring can reduce farm labor and help prevent accidents like piglet crushing. Current methods (called Joint Detection-and-Tracking-based, abbreviated as JDT-based) struggle with problems like misidentifying piglets or losing track of them due to crowding, occlusion, and shape changes. To solve this, we developed MSHMTracker, a smarter tracking system that introduces a motion-status hierarchical architecture to significantly improve tracking performance by adapting to piglets’ motion statuses. In MSHMTracker, a score- and time-driven hierarchical matching mechanism (STHM) was used to establish the spatio-temporal association by the motion status, helping maintain accurate tracking even in challenging conditions. Finally, piglet group aggregation or dispersion behaviors in response to sow posture changes were identified based on the tracked trajectory information. Tested on 100 videos (30,000+ images), our method achieved 93.8% tracking accuracy (MOTA) and 92.9% identity consistency (IDF1). It outperformed six popular tracking systems (e.g., DeepSort, FairMot). The mean accuracy of behavior recognition was 87.5%. In addition, the correlations (0.6 and 0.82) between piglet stress responses and sow posture changes were explored. This research showed that piglet movements are closely related to sow behavior, offering insights into sow–piglet relationships. This work has the potential to reduce farmers’ labor and improve the productivity of animal husbandry. Full article
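For reference, the MOTA and IDF1 scores quoted above are normally computed from error counts accumulated over all frames. The sketch below uses the standard CLEAR MOT and identity-metric definitions; the function names and the example counts are illustrative assumptions and are not taken from the paper.

```python
def mota(false_negatives, false_positives, id_switches, num_gt):
    """Multiple Object Tracking Accuracy: 1 - (FN + FP + IDSW) / GT."""
    if num_gt <= 0:
        raise ValueError("num_gt must be positive")
    return 1.0 - (false_negatives + false_positives + id_switches) / num_gt


def idf1(idtp, idfp, idfn):
    """Identity F1 score: 2*IDTP / (2*IDTP + IDFP + IDFN)."""
    denom = 2 * idtp + idfp + idfn
    return 2 * idtp / denom if denom else 0.0


# Hypothetical counts, for illustration only.
print(round(mota(10, 5, 5, 100), 3))
print(round(idf1(90, 10, 10), 3))
```

Both metrics penalize identity errors, but MOTA counts each switch once while IDF1 rewards trajectories that keep the same identity for longer.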

21 pages, 4777 KiB  
Article
Harnessing Semantic and Trajectory Analysis for Real-Time Pedestrian Panic Detection in Crowded Micro-Road Networks
by Rongyong Zhao, Lingchen Han, Yuxin Cai, Bingyu Wei, Arifur Rahman, Cuiling Li and Yunlong Ma
Appl. Sci. 2025, 15(10), 5394; https://doi.org/10.3390/app15105394 - 12 May 2025
Viewed by 402
Abstract
Pedestrian panic behavior is a primary cause of overcrowding and stampede accidents in public micro-road network areas with high pedestrian density. However, reliably detecting such behaviors remains challenging due to their inherent complexity, variability, and stochastic nature. Current detection models often rely on single-modality features, which limits their effectiveness in complex and dynamic crowd scenarios. To overcome these limitations, this study proposes a contour-driven multimodal framework that first employs a CNN (CDNet) to estimate density maps and, by analyzing steep contour gradients, automatically delineates a candidate panic zone. Within these potential panic zones, pedestrian trajectories are analyzed through LSTM networks to capture irregular movements, such as counterflow and nonlinear wandering behaviors. Concurrently, semantic recognition based on Transformer models is utilized to identify verbal distress cues extracted through Baidu AI’s real-time speech-to-text conversion. The three embeddings are fused through a lightweight attention-enhanced MLP, enabling end-to-end inference at 40 FPS on a single GPU. To evaluate branch robustness under streaming conditions, the UCF Crowd dataset (150 videos without panic labels) is processed frame-by-frame at 25 FPS solely for density assessment, whereas full panic detection is validated on 30 real Itaewon-Stampede videos and 160 SUMO/Unity simulated emergencies that include explicit panic annotations. The proposed system achieves 91.7% accuracy and 88.2% F1 on the Itaewon set, outperforming all single- or dual-modality baselines and offering a deployable solution for proactive crowd safety monitoring in transport hubs, festivals, and other high-risk venues. Full article

29 pages, 2763 KiB  
Review
A Review of Computer Vision Technology for Football Videos
by Fucheng Zheng, Duaa Zuhair Al-Hamid, Peter Han Joo Chong, Cheng Yang and Xue Jun Li
Information 2025, 16(5), 355; https://doi.org/10.3390/info16050355 - 28 Apr 2025
Viewed by 1468
Abstract
In the era of digital advancement, the integration of Deep Learning (DL) algorithms is revolutionizing performance monitoring in football. Due to restrictions on monitoring devices during games to prevent unfair advantages, coaches are tasked to analyze players’ movements and performance visually. As a result, Computer Vision (CV) technology has emerged as a vital non-contact tool for performance analysis, offering numerous opportunities to enhance the clarity, accuracy, and intelligence of sports event observations. However, existing CV studies in football face critical challenges, including low-resolution imagery of distant players and balls, severe occlusion in crowded scenes, motion blur during rapid movements, and the lack of large-scale annotated datasets tailored for dynamic football scenarios. This review paper fills this gap by comprehensively analyzing advancements in CV, particularly in four key areas: player/ball detection and tracking, motion prediction, tactical analysis, and event detection in football. By exploring these areas, this review offers valuable insights for future research on using CV technology to improve sports performance. Future directions should prioritize super-resolution techniques to enhance video quality and improve small-object detection performance, collaborative efforts to build diverse and richly annotated datasets, and the integration of contextual game information (e.g., score differentials and time remaining) to improve predictive models. The in-depth analysis of current State-Of-The-Art (SOTA) CV techniques provides researchers with a detailed reference to further develop robust and intelligent CV systems in football. Full article
(This article belongs to the Special Issue AI-Based Image Processing and Computer Vision)

15 pages, 5509 KiB  
Article
Multimodal Video Analysis for Crowd Anomaly Detection Using Open Access Tourism Cameras
by Alejandro Dionis-Ros, Joan Vila-Francés, Rafael Magdalena-Benedito, Fernando Mateo and Antonio J. Serrano-López
Appl. Sci. 2024, 14(23), 11075; https://doi.org/10.3390/app142311075 - 28 Nov 2024
Cited by 3 | Viewed by 1729
Abstract
In this article, we propose the detection of crowd anomalies through the extraction of information in the form of time series in video format using a multimodal approach. Through pattern recognition algorithms and segmentation, informative measures of the number of people and image occupancy are extracted at regular intervals, which are then analyzed to obtain trends and anomalous behaviors. Specifically, through temporal decomposition and residual analysis, intervals or specific situations of unusual behaviors are identified, which can be used in decision-making and the improvement of actions in sectors related to human movement such as tourism or security. This methodology introduces a novel, privacy-focused approach by analyzing anonymized metrics rather than tracking or recognizing individuals, setting a new standard for ethical crowd monitoring. Applied to the webcam of Turisme Comunitat Valenciana in the town of Morella (Comunitat Valenciana, Spain), this approach has shown excellent results, correctly detecting specific anomalous situations and unusual overall increases during the previous weekend and during the October 2023 festivities. These results have been obtained while preserving the confidentiality of individuals at all times by using measures that maximize anonymity, without trajectory recording or person recognition. Full article
(This article belongs to the Special Issue Advanced Image Analysis and Processing Technologies and Applications)
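The temporal decomposition and residual analysis described in the abstract above can be illustrated with a deliberately simplified sketch. It assumes a people-count series sampled at regular intervals with a known period, uses a per-phase median as the seasonal component, and flags residuals by z-score. The function names, the period, and the threshold are assumptions for illustration; the paper's actual decomposition method is not specified in the abstract.

```python
from statistics import mean, median, pstdev


def seasonal_residuals(series, period):
    """Remove a per-phase median profile (a crude seasonal component).

    Assumes len(series) >= period so that every phase has at least one sample.
    """
    profile = [median(series[p::period]) for p in range(period)]
    return [x - profile[i % period] for i, x in enumerate(series)]


def flag_anomalies(series, period, z_threshold=3.0):
    """Return indices whose residual z-score exceeds z_threshold."""
    residuals = seasonal_residuals(series, period)
    mu = mean(residuals)
    sigma = pstdev(residuals)
    if sigma == 0:
        return []  # perfectly periodic series: nothing stands out
    return [i for i, r in enumerate(residuals)
            if abs(r - mu) / sigma > z_threshold]


# Synthetic people counts with a period of 4 samples and one injected spike.
counts = [10, 20, 30, 20] * 6
counts[13] += 50
print(flag_anomalies(counts, period=4))
```

Because only aggregate counts are analyzed, a scheme of this kind needs no trajectories or identities, which matches the privacy argument made in the abstract.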

26 pages, 7339 KiB  
Article
Crowd Density Estimation via Global Crowd Collectiveness Metric
by Ling Mei, Mingyu Yu, Lvxiang Jia and Mingyu Fu
Drones 2024, 8(11), 616; https://doi.org/10.3390/drones8110616 - 28 Oct 2024
Cited by 2 | Viewed by 1566
Abstract
Drone-captured crowd videos have become increasingly prevalent in various applications in recent years, including crowd density estimation via measuring crowd collectiveness. Traditional methods often measure local differences in motion directions among individuals and scarcely handle the challenge brought by the changing illumination of scenarios. They are limited in their generalization. The crowd density estimation needs both macroscopic and microscopic descriptions of collective motion. In this study, we introduce a Global Measuring Crowd Collectiveness (GMCC) metric that incorporates intra-crowd and inter-crowd collectiveness to assess the collective crowd motion. An energy spread process is introduced to explore the related crucial factors. This process measures the intra-crowd collectiveness of individuals within a crowded cluster by incorporating the collectiveness of motion direction and the velocity magnitude derived from the optical flow field. The global metric is adopted to keep the illumination-invariance of optical flow for intra-crowd motion. Then, we measure the motion consistency among various clusters to generate inter-crowd collectiveness, which constitutes the GMCC metric together with intra-collectiveness. Finally, the proposed energy spread process of GMCC is used to merge the inter-crowd collectiveness to estimate the global distribution of dense crowds. Experimental results validate that GMCC significantly improves the performance and efficiency of measuring crowd collectiveness and crowd density estimation on various crowd datasets, demonstrating a wide range of applications for real-time monitoring in public crowd management. Full article

21 pages, 8252 KiB  
Article
Train Station Pedestrian Monitoring Pilot Study Using an Artificial Intelligence Approach
by Gonzalo Garcia, Sergio A. Velastin, Nicolas Lastra, Heilym Ramirez, Sebastian Seriani and Gonzalo Farias
Sensors 2024, 24(11), 3377; https://doi.org/10.3390/s24113377 - 24 May 2024
Cited by 4 | Viewed by 2613
Abstract
Pedestrian monitoring in crowded areas like train stations has an important impact on the overall operation and management of those public spaces. An organized distribution of the different elements located inside a station will contribute not only to the safety of all passengers but will also allow regular activities, including entering/leaving the station, boarding/alighting from trains, and waiting, to proceed more efficiently. This improved distribution is only possible with sufficiently accurate information on passengers’ positions and derived quantities such as speed, density, and traffic flow. The work described here addresses this need by using an artificial intelligence approach based on computer vision and convolutional neural networks. Two methods are tested on available videos taken regularly at subway stations. One is based on tracking each person’s bounding box, from which filtered 3D kinematics are derived, including position, velocity, and density. The other infers each person’s pose and activity by analyzing their main body key points. Measurements of these quantities would enable a sensible and efficient design of inner spaces in places like railway and subway stations. Full article
(This article belongs to the Special Issue Feature Papers in Intelligent Sensors 2024)

18 pages, 10470 KiB  
Article
An Improved CrowdDet Algorithm for Traffic Congestion Detection in Expressway Scenarios
by Chishe Wang, Yuting Chen, Jie Wang and Jinjin Qian
Appl. Sci. 2023, 13(12), 7174; https://doi.org/10.3390/app13127174 - 15 Jun 2023
Cited by 6 | Viewed by 2712
Abstract
Traffic congestion detection based on vehicle detection and tracking algorithms is one of the key technologies for intelligent transportation systems. However, in expressway surveillance scenarios, small vehicle size and vehicle occlusion present severe challenges for this method, including low vehicle detection accuracy and low traffic congestion detection accuracy. To address these challenges, this paper proposes an improved version of the CrowdDet algorithm by introducing the Involution operator and bi-directional feature pyramid network (BiFPN) module, which is called IBCDet. The proposed IBCDet module can achieve higher vehicle detection accuracy in expressway surveillance scenarios by enabling long-distance information interaction and multi-scale feature fusion. Additionally, a vehicle-tracking algorithm based on IBCDet is designed to calculate the running speed of vehicles, and it uses the average running speed to achieve traffic congestion detection according to the Chinese expressway level of serviceability (LoS) criteria. Adequate experiments are conducted on both the self-built Nanjing Raoyue expressway monitoring video dataset (NJRY) and the public dataset UA-DETRAC. The experimental results demonstrate that the proposed IBCDet outperforms the commonly used object detection algorithms in both vehicle detection accuracy and traffic congestion detection accuracy. Full article
(This article belongs to the Special Issue Applications of Machine Learning in Image Recognition and Processing)

21 pages, 6194 KiB  
Article
Fusion of CCTV Video and Spatial Information for Automated Crowd Congestion Monitoring in Public Urban Spaces
by Vivian W. H. Wong and Kincho H. Law
Algorithms 2023, 16(3), 154; https://doi.org/10.3390/a16030154 - 10 Mar 2023
Cited by 8 | Viewed by 4062
Abstract
Crowd congestion is one of the main causes of modern public safety issues such as stampedes. Conventional crowd congestion monitoring using closed-circuit television (CCTV) video surveillance relies on manual observation, which is tedious and often error-prone in public urban spaces where crowds are dense and occlusions are prominent. With the aim of managing crowded spaces safely, this study proposes a framework that combines spatial and temporal information to automatically map the trajectories of individual occupants and to assist in real-time congestion monitoring and prediction. By exploiting both features from CCTV footage and spatial information about the public space, the framework fuses raw CCTV video and floor plan information to create visual aids for crowd monitoring, as well as a sequence of crowd mobility graphs (CMGraphs) to store spatiotemporal features. The framework uses deep learning-based computer vision models, geometric transformations, and Kalman filter-based tracking algorithms to automate the retrieval of crowd congestion data, specifically the spatiotemporal distribution of individuals and the overall crowd flow. The resulting collective crowd movement data are then stored in the CMGraphs, which are designed to facilitate congestion forecasting at key exit/entry regions. We demonstrate the framework on two videos, one from a public train station dataset and the other recorded at a stadium following a crowded football game. Using both qualitative and quantitative insights from the experiments, we demonstrate that the proposed framework can help urban planners and infrastructure operators manage congestion hazards. Full article
(This article belongs to the Special Issue Recent Advances in Algorithms for Computer Vision Applications)
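To make the Kalman filter-based tracking mentioned in the abstract above more concrete, here is a minimal sketch of a one-dimensional constant-velocity Kalman filter. The class name, the noise parameters `q`, `r`, and `p0`, and the 1D simplification are all assumptions for illustration; the abstract does not describe the authors' state model or tuning.

```python
class ConstantVelocityKalman1D:
    """Tracks position and velocity along one axis from position measurements."""

    def __init__(self, q=1e-3, r=1.0, p0=100.0):
        self.x = [0.0, 0.0]                 # state: [position, velocity]
        self.P = [[p0, 0.0], [0.0, p0]]     # state covariance
        self.q = q                          # process noise (assumed value)
        self.r = r                          # measurement noise (assumed value)

    def predict(self, dt=1.0):
        p, v = self.x
        self.x = [p + v * dt, v]
        (a, b), (c, d) = self.P
        # F P F^T with F = [[1, dt], [0, 1]], plus diagonal process noise.
        self.P = [
            [a + dt * (b + c) + dt * dt * d + self.q, b + dt * d],
            [c + dt * d, d + self.q],
        ]

    def update(self, z):
        (a, b), (c, d) = self.P
        s = a + self.r                      # innovation covariance (H = [1, 0])
        k0, k1 = a / s, c / s               # Kalman gain
        y = z - self.x[0]                   # innovation
        self.x = [self.x[0] + k0 * y, self.x[1] + k1 * y]
        # (I - K H) P
        self.P = [
            [(1 - k0) * a, (1 - k0) * b],
            [c - k1 * a, d - k1 * b],
        ]


# A target moving one unit per frame; the filter should recover that velocity.
kf = ConstantVelocityKalman1D()
for t in range(20):
    kf.predict()
    kf.update(float(t))
position, velocity = kf.x
print(round(position, 2), round(velocity, 2))
```

In a multi-person tracker, one such filter (typically in 2D on the floor plan) would be kept per track, with the predict step bridging frames in which a person is occluded.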

17 pages, 2955 KiB  
Article
User Preference-Based Video Synopsis Using Person Appearance and Motion Descriptions
by Rasha Shoitan, Mona M. Moussa, Sawsan Morkos Gharghory, Heba A. Elnemr, Young-Im Cho and Mohamed S. Abdallah
Sensors 2023, 23(3), 1521; https://doi.org/10.3390/s23031521 - 30 Jan 2023
Cited by 3 | Viewed by 2431
Abstract
During the last decade, surveillance cameras have spread quickly; their spread is predicted to increase rapidly in the following years. Therefore, browsing and analyzing these vast amounts of created surveillance videos effectively is vital in surveillance applications. Recently, a video synopsis approach was proposed to reduce the surveillance video duration by rearranging the objects to present them in a portion of time. However, performing a synopsis for all the persons in the video is not efficacious for crowded videos. Different clustering and user-defined query methods are introduced to generate the video synopsis according to general descriptions such as color, size, class, and motion. This work presents a user-defined query synopsis video based on motion descriptions and specific visual appearance features such as gender, age, carrying something, having a baby buggy, and upper and lower clothing color. The proposed method assists the camera monitor in retrieving people who meet certain appearance constraints and people who enter a predefined area or move in a specific direction to generate the video, including a suspected person with specific features. After retrieving the persons, a whale optimization algorithm is applied to arrange these persons reserving chronological order, reducing collisions, and assuring a short synopsis video. The evaluation of the proposed work for the retrieval process in terms of precision, recall, and F1 score ranges from 83% to 100%, while for the video synopsis process, the synopsis video length compared to the original video is decreased by 68% to 93.2%, and the interacting tube pairs are preserved in the synopsis video by 78.6% to 100%. Full article
(This article belongs to the Special Issue Application of Semantic Technologies in Sensors and Sensing Systems)

14 pages, 1397 KiB  
Article
Adaptive Hierarchical Density-Based Spatial Clustering Algorithm for Streaming Applications
by Darveen Vijayan and Izzatdin Aziz
Telecom 2023, 4(1), 1-14; https://doi.org/10.3390/telecom4010001 - 22 Dec 2022
Cited by 4 | Viewed by 3180
Abstract
Clustering algorithms are commonly used in the mining of static data. Some examples include data mining for relationships between variables and data segmentation into components. The use of a clustering algorithm for real-time data is much less common. This is due to a variety of factors, including the algorithm’s high computation cost. In other words, the algorithm may be impractical for real-time or near-real-time implementation. Furthermore, clustering algorithms necessitate the tuning of hyperparameters in order to fit the dataset. In this paper, we approach clustering moving points using our proposed Adaptive Hierarchical Density-Based Spatial Clustering of Applications with Noise (HDBSCAN) algorithm, which is an implementation of an adaptive approach to building the minimum spanning tree. We switch between the Boruvka and the Prim algorithms as a means to build the minimum spanning tree, which is one of the most expensive components of the HDBSCAN. The Adaptive HDBSCAN yields an improvement in execution time by 5.31% without depreciating the accuracy of the algorithm. The motivation for this research stems from the desire to cluster moving points on video. Cameras are used to monitor crowds and improve public safety. We can identify potential risks due to overcrowding and movements of groups of people by understanding the movements and flow of crowds. Surveillance equipment combined with deep learning algorithms can assist in addressing this issue by detecting people or objects, and the Adaptive HDBSCAN is used to cluster these items in real time to generate information about the clusters. Full article
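The minimum spanning tree step singled out in the abstract above can be sketched with a dense O(n²) version of Prim's algorithm over 2D points. This is an illustration only: HDBSCAN builds its tree over mutual-reachability distances, whereas plain Euclidean distance is used here to keep the example short, and the rule the authors use to switch between Borůvka and Prim is not given in the abstract. The function name `prim_mst` is an assumption.

```python
import math


def prim_mst(points):
    """Return MST edges (i, j, weight) over 2D points using Prim's algorithm."""
    n = len(points)
    if n == 0:
        return []
    in_tree = [False] * n
    best_dist = [math.inf] * n   # cheapest known edge into the tree
    best_from = [-1] * n         # tree vertex at the other end of that edge
    best_dist[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the cheapest vertex not yet in the tree.
        u = min((i for i in range(n) if not in_tree[i]),
                key=best_dist.__getitem__)
        in_tree[u] = True
        if best_from[u] >= 0:
            edges.append((best_from[u], u, best_dist[u]))
        # Relax edges from u to every remaining vertex.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best_dist[v]:
                    best_dist[v] = d
                    best_from[v] = u
    return edges


# Four detections on a line; the tree should chain neighbours together.
detections = [(0, 0), (1, 0), (2, 0), (4, 0)]
tree = prim_mst(detections)
print(len(tree), sum(w for _, _, w in tree))
```

Prim's algorithm grows a single tree one vertex at a time, while Borůvka's merges many components per round, which is why the relative cost of the two can differ with the size and layout of the point set.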

20 pages, 3088 KiB  
Article
Suspicious Actions Detection System Using Enhanced CNN and Surveillance Video
by Esakky Selvi, Malaiyalathan Adimoolam, Govindharaju Karthi, Kandasamy Thinakaran, Nagaiah Mohanan Balamurugan, Raju Kannadasan, Chitapong Wechtaisong and Arfat Ahmad Khan
Electronics 2022, 11(24), 4210; https://doi.org/10.3390/electronics11244210 - 16 Dec 2022
Cited by 26 | Viewed by 8242
Abstract
Suspicious pre- and post-activity detection in crowded places is essential, as many suspicious activities may be carried out by culprits. Surveillance cameras are usually installed in such places, but the videos or images they capture are investigated by authorities afterwards, so suspicious activity is detected only post-event. This requires substantial human intervention, and no systems are available to protect valuable things from such incidents as they happen. Machine learning (ML)- and deep learning (DL)-based pre-incident warning alarm systems could now be adapted to monitor suspicious activity, with prediction based on human gestures and unusual activity detection. Even though some ML- or DL-based methods have been proposed, there remains a need for a highly accurate, highly precise, low-false-positive and low-false-negative prediction system, which hybrid or enhanced ML- or DL-based systems can address. This research work introduces an enhanced convolutional neural network (ECNN)-based suspicious activity detection system. Experiments were carried out, and the results were analyzed with the Statistical Package for the Social Sciences (SPSS) tool. The mean accuracy, mean precision, mean false-positive rate, and mean false-negative rate of suspicious activity detection were 97.050%, 96.743%, 2.957%, and 2.927%, respectively. These results were also compared with those of the convolutional neural network (CNN) algorithm. This research work can be applied to enhance pre-suspicious activity alert security systems to avoid risky situations. Full article
(This article belongs to the Section Computer Science & Engineering)

19 pages, 12434 KiB  
Article
3DMesh-GAR: 3D Human Body Mesh-Based Method for Group Activity Recognition
by Muhammad Saqlain, Donguk Kim, Junuk Cha, Changhwa Lee, Seongyeong Lee and Seungryul Baek
Sensors 2022, 22(4), 1464; https://doi.org/10.3390/s22041464 - 14 Feb 2022
Cited by 5 | Viewed by 4808
Abstract
Group activity recognition is a prime research topic in video understanding and has many practical applications, such as crowd behavior monitoring, video surveillance, etc. To understand the multi-person/group action, the model should not only identify the individual person’s action in the context but also describe their collective activity. A lot of previous works adopt skeleton-based approaches with graph convolutional networks for group activity recognition. However, these approaches are subject to limitation in scalability, robustness, and interoperability. In this paper, we propose 3DMesh-GAR, a novel approach to 3D human body Mesh-based Group Activity Recognition, which relies on a body center heatmap, camera map, and mesh parameter map instead of the complex and noisy 3D skeleton of each person of the input frames. We adopt a 3D mesh creation method, which is conceptually simple, single-stage, and bounding box free, and is able to handle highly occluded and multi-person scenes without any additional computational cost. We implement 3DMesh-GAR on a standard group activity dataset: the Collective Activity Dataset, and achieve state-of-the-art performance for group activity recognition. Full article

21 pages, 11989 KiB  
Article
A Social Distance Estimation and Crowd Monitoring System for Surveillance Cameras
by Mohammad Al-Sa’d, Serkan Kiranyaz, Iftikhar Ahmad, Christian Sundell, Matti Vakkuri and Moncef Gabbouj
Sensors 2022, 22(2), 418; https://doi.org/10.3390/s22020418 - 6 Jan 2022
Cited by 30 | Viewed by 5924
Abstract
Social distancing is crucial to restrain the spread of diseases such as COVID-19, but complete adherence to safety guidelines is not guaranteed. Monitoring social distancing through mass surveillance is paramount to develop appropriate mitigation plans and exit strategies. Nevertheless, it is a labor-intensive task that is prone to human error and tainted with plausible breaches of privacy. This paper presents a privacy-preserving adaptive social distance estimation and crowd monitoring solution for camera surveillance systems. We develop a novel person localization strategy through pose estimation, build a privacy-preserving adaptive smoothing and tracking model to mitigate occlusions and noisy/missing measurements, compute inter-personal distances in the real-world coordinates, detect social distance infractions, and identify overcrowded regions in a scene. Performance evaluation is carried out by testing the system’s ability in person detection, localization, density estimation, anomaly recognition, and high-risk areas identification. We compare the proposed system to the latest techniques and examine the performance gain delivered by the localization and smoothing/tracking algorithms. Experimental results indicate a considerable improvement, across different metrics, when utilizing the developed system. In addition, they show its potential and functionality for applications other than social distancing. Full article
(This article belongs to the Special Issue Computer Visions and Pattern Recognition)
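Once people have been localized in real-world coordinates, as the abstract above describes, detecting social distance infractions reduces to a pairwise distance check. The sketch below assumes ground-plane positions in metres are already available; the function name and the 2 m threshold are illustrative assumptions, not values from the paper.

```python
import math
from itertools import combinations


def distance_violations(positions, min_distance=2.0):
    """Return (i, j, distance) for every pair closer than min_distance."""
    return [
        (i, j, math.dist(p, q))
        for (i, p), (j, q) in combinations(enumerate(positions), 2)
        if math.dist(p, q) < min_distance
    ]


# Hypothetical ground-plane positions in metres.
people = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0)]
print(distance_violations(people))
```

The check is quadratic in the number of people, which is usually acceptable per frame; a spatial index would be the natural refinement for very dense scenes.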

16 pages, 5327 KiB  
Article
Abnormal Activity Recognition from Surveillance Videos Using Convolutional Neural Network
by Shabana Habib, Altaf Hussain, Waleed Albattah, Muhammad Islam, Sheroz Khan, Rehan Ullah Khan and Khalil Khan
Sensors 2021, 21(24), 8291; https://doi.org/10.3390/s21248291 - 11 Dec 2021
Cited by 36 | Viewed by 5657
Abstract
Background and motivation: Every year, millions of Muslims worldwide come to Mecca to perform the Hajj. In order to maintain the security of the pilgrims, the Saudi government has installed about 5000 closed-circuit television (CCTV) cameras to monitor crowd activity efficiently. Problem: These cameras generate an enormous amount of visual data, and monitoring it manually or offline requires considerable human resources. Therefore, there is an urgent need to develop an intelligent and automatic system to efficiently monitor crowds and identify abnormal activity. Method: Existing methods are incapable of extracting discriminative features from surveillance videos because they rely on pre-trained weights of different architectures. This paper develops a lightweight approach for accurately identifying violent activity in surveillance environments. In the first step of the proposed framework, a lightweight CNN model is trained on our own pilgrim dataset to detect pilgrims from the surveillance cameras. In the second step, these preprocessed salient frames are passed to a lightweight CNN model for spatial feature extraction. In the third step, a Long Short-Term Memory (LSTM) network is developed to extract temporal features. In the last step, in the case of violent activity or accidents, the proposed system generates an alarm in real time to inform law enforcement agencies to take appropriate action, thus helping to avoid accidents and stampedes. Results: We conducted multiple experiments on two publicly available violent activity datasets, Surveillance Fight and Hockey Fight; our proposed model achieved accuracies of 81.05 and 98.00, respectively. Full article
(This article belongs to the Section Sensor Networks)

23 pages, 4076 KiB  
Article
Estimating Interpersonal Distance and Crowd Density with a Single-Edge Camera
by Alem Fitwi, Yu Chen, Han Sun and Robert Harrod
Computers 2021, 10(11), 143; https://doi.org/10.3390/computers10110143 - 5 Nov 2021
Cited by 13 | Viewed by 4035
Abstract
For public safety and physical security, more than a billion closed-circuit television (CCTV) cameras are currently in use around the world. The proliferation of artificial intelligence (AI) and machine/deep learning (M/DL) technologies has led to significant applications, including crowd surveillance. State-of-the-art distance and area estimation algorithms need either multiple cameras or a reference object as a ground truth. How to obtain an estimate using a single camera without a scale reference remains an open question. In this paper, we propose a novel solution called E-SEC, which estimates the interpersonal distance between a pair of dynamic human objects, the area occupied by a dynamic crowd, and the crowd density using a single edge camera. The E-SEC framework comprises edge CCTV cameras responsible for capturing a crowd on video frames, leveraging a customized YOLOv3 model for human detection. E-SEC contributes an interpersonal distance estimation algorithm vital for monitoring the social distancing of a crowd, and an area estimation algorithm for dynamically determining the area occupied by a crowd with changing size and position. A unified output module generates the crowd size, interpersonal distances, social distancing violations, area, and density for every frame. Experimental results validate the accuracy and efficiency of E-SEC on a range of different video datasets. Full article
(This article belongs to the Special Issue Feature Paper in Computers)