Review
Peer-Review Record

A Review of Passenger Counting in Public Transport Concepts with Solution Proposal Based on Image Processing and Machine Learning

Eng 2024, 5(4), 3284-3315; https://doi.org/10.3390/eng5040172
by Aleksander Radovan 1,*, Leo Mršić 2, Goran Đambić 1 and Branko Mihaljević 3
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Submission received: 27 October 2024 / Revised: 1 December 2024 / Accepted: 3 December 2024 / Published: 10 December 2024
(This article belongs to the Special Issue Artificial Intelligence for Engineering Applications)

Round 1

Reviewer 1 Report (Previous Reviewer 2)

Comments and Suggestions for Authors

The authors have addressed all my questions, and I recommend the publication of the paper.

Author Response

Comment 1: The authors have addressed all my questions, and I recommend the publication of the paper.

Response: Thank you! Just in case, I'm sending the improved version.

Author Response File: Author Response.pdf

Reviewer 2 Report (Previous Reviewer 1)

Comments and Suggestions for Authors

The article deals with passenger counting in public transport and has been resubmitted with additions. I have also received the authors' responses, which briefly state that all comments have been incorporated.

I would like to make the following comments:

• the title of the article is misleading, including the documented abstract, as the article does not directly deal with image processing or machine learning;

• as the authors also state in their answers to questions, the article is an overview without the actual implementation of the APC system, including verification of measured data and testing on CNN type YOLO and other methods;

• the abstract should be fundamentally modified with respect to the other text in the article;

• as Table No. 1 is presented, Table No. 2 should also be commented on, i.e. with reference to references, because the authors take everything from the literature, as stated in the introduction in Chapter 2 on line 479-482. It is not clear how the accuracy was determined unless it was taken from the literature, which must be indicated with a reference;

• It is similar to table no. 3, where it is not clear what the descriptions originated from, it is necessary to document references or specific methodologies and procedures;

• chapter 3 shows generalities and information for which citation lines 543 – 619 are missing. At the same time, it is clear that the authors take over everything, but do not verify;

• chapter 4 describes the design that is already used in PT today;

• the discussion presents general conclusions that are not supported by the authors' research or verification. It is therefore recommended to add references to citations. At the same time, the recommendations are very general without a specific mention or reference to the literature or a directly suitable Usecase, etc.;

• The conclusion is also very general, without a specific recommendation for example the use of CCTV type, YOLO algorithm type, etc.

The article can be professional with clear outputs or research, which should be acknowledged and emphasized already in the abstract. As a professional article, I recommend reject and return.

If it will be a research study, I recommend supplementing the article with citations with greater interlinking in the text and stating this intention already in the abstract and in the introduction for the reader, including reworking the conclusion and expanding the discussion with interlinked citations.

best regards

Author Response

Comment 1: the title of the article is misleading, including the documented abstract, as the article does not directly deal with image processing or machine learning;

Response 1: Thank you for the comment. The title has been updated to "A Review of Passenger Counting in Public Transport Concepts with a Solution Proposal Based on Image Processing and Machine Learning", since part of the manuscript describes a solution proposal and machine learning combined with image processing is one of the most frequently used approaches to passenger counting in public transport. The sentence in the abstract was also updated to highlight this approach in the manuscript: "The paper explores various technologies and algorithms, like card swiping, infrared, weight and ultrasonic sensors, RFID, Wi-Fi, Bluetooth, LiDAR, thermal cameras, including CCTV cameras and traditional computer vision methods and advanced deep learning approaches, highlighting their strengths and limitations". All changes are marked in red in the updated version of the manuscript.

 

Comment 2: as the authors also state in their answers to questions, the article is an overview without the actual implementation of the APC system, including verification of measured data and testing on CNN type YOLO and other methods;

Response 2: Thank you for the comment. This fact about the manuscript is also highlighted in the updated abstract: "This paper provides a comprehensive review of current methodologies and technologies used for passenger counting, without the actual implementation of the automatic passenger counting system, but with a proposal based on image processing and machine learning techniques and concepts, since it represents one of the most used approaches.".

 

Comment 3: the abstract should be fundamentally modified with respect to the other text in the article;

Response 3: Thank you for the comment. The abstract has been rewritten to move the focus from solely machine learning concepts to a review of existing state-of-the-art technologies.

 

Comment 4: as Table No. 1 is presented, Table No. 2 should also be commented on, i.e. with reference to references, because the authors take everything from the literature, as stated in the introduction in Chapter 2 on line 479-482. It is not clear how the accuracy was determined unless it was taken from the literature, which must be indicated with a reference;

Response 4:  Thank you for the comment. All technologies with typical precision ranges have been referenced with existing references from the literature in Table 2.

 

Comment 5: It is similar to table no. 3, where it is not clear what the descriptions originated from, it is necessary to document references or specific methodologies and procedures;

Response 5:  Thank you for the comment. All technologies with typical computational complexity, real-time performance, and hardware requirements have been referenced with existing references from the literature in Table 3.

 

Comment 6: chapter 3 shows generalities and information for which citation lines 543 – 619 are missing. At the same time, it is clear that the authors take over everything, but do not verify;

Response 6:  Thank you for the comment. We have added 20 references to external sources that explain the mentioned concepts.

 

Comment 7:  chapter 4 describes the design that is already used in PT today;

Response 7: Thank you for the comment. Chapter 4 contains a proposal for implementing a system for automatic passenger counting in public transport, based on the best practices used in other works that have been analyzed and referenced. The proposal builds on the best ways of combining datasets and on the latest models, such as the recently released YOLO v11, which enables improved detection of passengers in public transport. Many existing APC systems are based on older datasets and YOLO implementations, so this proposal can serve as a recommendation for new implementations.

 

Comment 8: the discussion presents general conclusions that are not supported by the authors' research or verification. It is therefore recommended to add references to citations. At the same time, the recommendations are very general without a specific mention or reference to the literature or a directly suitable Usecase, etc.;

Response 8: Thank you for the comment. We added 18 references to support the general conclusions in the discussion section.

 

Comment 9: The conclusion is also very general, without a specific recommendation for example the use of CCTV type, YOLO algorithm type, etc.

Response 9: Thank you for the comment. Specific recommendations have been added for the number of cameras, placement positions, types of cameras, camera angles, dataset, and YOLO algorithm tasks that fit the requirements of an automatic passenger counting system.

Reviewer 3 Report (New Reviewer)

Comments and Suggestions for Authors

This paper provides a review of APC transit systems. The paper also includes a description of a Machine-learning Vision-based APC and a discussion regarding the European General Data Protection Regulation.

The paper includes interesting aspects, but my general perception is that it is not an in-depth review, and the additional sections (4. Solution proposal and 5. GDPR Compliance) in their present version add little value to a research article.

Please find detailed comments in the following.

Abstract: the online version is slightly different from the pdf version. Check "offers a solution proposal".

Section 2: Each subsection is a little shallow. Some important aspects are missing:

2.1 Card swiping: is not useful to provide the passenger's destination, especially for non-pendular travel patterns.
2.2 Infrared solutions, not only use infrared beams, but also infrared cameras with image processing. This solution is already in use by many transit operators globally (e.g. MTA in NYC using Irma Matrix from Iris Sensing).
Image processing over infrared images allows boarding and alighting differentiation.
2.3 Very incomplete description of this technique. Needs to extend regarding random MAC addresses and detailed functioning mechanism.
2.5 It is probably worth mentioning the difficulty of bulk counting with RFID.
2.6 It is probably worth mentioning these are not designed for APC. The camera angles are not adequate.
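
The boarding and alighting differentiation mentioned under item 2.2 can be illustrated with a minimal sketch. It assumes that some upstream detector and tracker (not shown, and not specified in this record) already provides per-person centroid tracks from an overhead camera, and that the image y-axis points from the door into the vehicle. The function name `count_crossings` and the virtual line `line_y` are hypothetical names chosen for this illustration.

```python
def count_crossings(tracks, line_y):
    """Count boardings and alightings from centroid tracks.

    tracks: dict mapping a track id to a time-ordered list of (x, y) centroids.
    line_y: y-coordinate of a virtual counting line near the door
            (assumption: y grows from the door toward the vehicle interior).
    """
    boarded = 0
    alighted = 0
    for points in tracks.values():
        if len(points) < 2:
            continue  # a single observation carries no direction
        start_y = points[0][1]
        end_y = points[-1][1]
        if start_y < line_y <= end_y:
            boarded += 1   # moved from the door side to the interior side
        elif end_y < line_y <= start_y:
            alighted += 1  # moved from the interior side to the door side
    return boarded, alighted


if __name__ == "__main__":
    example_tracks = {
        1: [(10, 5), (10, 30)],   # crosses the line inward
        2: [(20, 40), (20, 8)],   # crosses the line outward
        3: [(5, 5), (6, 9)],      # never reaches the line
    }
    print(count_crossings(example_tracks, line_y=20))
```

Comparing only the first and last positions of each track keeps the sketch short; a production system would also need to handle lost tracks and people who linger on the line.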

Section 3: Looks like a shallow description of each paper. The papers seem disconnected or unrelated as there is no grouping by subtopics.

KLT is not defined, is it an algorithm, a filter, a methodology...?

The sensor-grid mat is not included in section 2.

There is another section 2. "Methodology", after section 3.

Section 3: Concepts and techniques... Needs subsections instead of random concepts, unstructured and disconnected. It should also include repeated citations.

Does Figure 3 show an image from [60] as claimed in the text or from [61] as claimed in the figure's title?

The paragraph "The initial step in developing..." looks more like a subsection for Section 3 "Concepts", than a paragraph for the proposed solution.

A block diagram may facilitate the reader's comprehension in section 4.

As a general final comment: Crowd counting is not mentioned in the whole document. APC for in-line (one-by-one) boarding or alighting is somehow considered a solved problem, even with multiple technologies, with several commercial products in the ITS transit market. Crowded boarding or alighting instead remains an open technical challenge. This should be a structural point in any review on this topic.

A review should be as exhaustive as possible. Several important papers are missing in this review. Examples:

- Empirical Study on the Accuracy and Precision of Automatic Passenger Counting in European Bus Services, Olivo et al 2019.

- Benchmarking the Functional, Technical, and Business Characteristics of Automated Passenger Counting Products, Pronello et al 2024.

- Passenger Counting in Mass Public Transport Systems using Computer Vision and Deep Learning, Moreno et al 2023.

- Visual and automatic bus passenger counting based on a deep tracking-by-detection system, Labit-Bonis et al 2021.

- Development of a real-time automatic passenger counting system using head detection based on deep learning, Kim et al 2022.

Comments on the Quality of English Language

"in in the paper"

EM algorithm (Expectation Maximization?) not defined.

"The paper []" is repetitive and shows an unprofessional writing style.

"the research [10]" might be changed by "researchers in [10]"

"to train the machine learning model" -> "to train a machine learning model"

"opens to doors" -> "opens the doors" 

Check the sentence: "by using the Histogram of Oriented Gradients (HOG) [38] feature of passengers’s heads on the extracted"

Check the sentence: "in a defined space based on probe request messages the devices send when not connected to the network" (send what?)

Hard to read sentence: "The methodology of this research was based on the analysis of existing implementations of systems for counting passengers in public transport through the detection of the technology used in the research and the vehicle in which the counting was implemented,"

The word "techniques" is missing in -> "There are several machine learning suitable for using"

The initials APC should be used throughout the paper for readability.

Figure titles need punctuation.

"that bans IS system" IS is not defined.

Author Response

Comment 1: Abstract: the online version is slightly different from the pdf version. Check "offers a solution proposal".

Response 1: Thank you for the comment. A part of the abstract has been rewritten to better fit the rest of the manuscript, together with the manuscript title update.

 

Comment 2: 

Section 2: Each subsection is a little shallow. Some important aspects are missing:

2.1 Card swiping: is not useful to provide the passenger's destination, especially for non-pendular travel patterns.
2.2 Infrared solutions, not only use infrared beams, but also infrared cameras with image processing. This solution is already in use by many transit operators globally (e.g. MTA in NYC using Irma Matrix from Iris Sensing).
Image processing over infrared images allows boarding and alighting differentiation.
2.3 Very incomplete description of this technique. Needs to extend regarding random MAC addresses and detailed functioning mechanism.
2.5 It is probably worth mentioning the difficulty of bulk counting with RFID.
2.6 It is probably worth mentioning these are not designed for APC. The camera angles are not adequate.

Response 2: Thank you for the comment. 

The card swiping part was extended with the following part: "

Card swiping also has the limitation of accurately capturing passenger travel patterns, particularly for non-pendular trips. Pendular travel patterns are repetitive, such as daily commuting between home and work or school, but non-pendular travel patterns are irregular or occasional trips with varying routes and destinations and are less predictable. If the automatic passenger counting system only requires swiping at boarding, not at alighting, it creates challenges in providing accurate passenger counting.

"

The IR sensor section has been extended with the following text: "

Besides infrared sensors, infrared cameras provide reliability in counting passengers in public transport under various environmental conditions, because they detect the heat emitted by objects and people in their field of view. Since people have a body temperature higher than the surrounding environment, they stand out clearly in infrared imaging. Infrared cameras can be used to track the movement trajectories of passengers, which makes it possible to differentiate between boarding and alighting passengers and therefore to detect the change in the number of passengers in the vehicle. Since infrared cameras do not rely on capturing visual or personal data like regular cameras, they are more privacy-friendly. They detect only the heat signatures of individuals, making them compliant with privacy regulations, such as GDPR.

Infrared cameras may not perform as well in environments where the temperature variation is minimal, for example, in extremely hot weather when the contrast between passengers and the environment may be minimal. In situations where passengers are packed closely together, such as during rush hours, multiple people’s heat signatures may overlap and produce inaccurate passenger detection.

"

The generation of random MAC addresses is described in the updated section:

"If it is based on a Wi-Fi probe request, it sends out probe requests containing its MAC (Media Access Control) address and signal strength to identify the presence of a passenger and approximate their location relative to the receiver of the signal. Bluetooth-enabled devices periodically emit signals when they are set to a discoverable mode or when scanning for other devices, and receivers capture these signals to detect passengers. To protect user privacy, modern devices use randomized MAC addresses in their probe requests. Instead of broadcasting the actual hardware MAC address, the device sends out a pseudo-randomly generated MAC address during probe requests. These randomized addresses change frequently, for example, every few minutes, and are unique to each network or detection session, making them difficult to track across multiple locations."

The RFID section has been extended with this sentence to address the difficulty of bulk counting with RFID:

"In situations where multiple passengers enter a vehicle simultaneously, overlapping tag responses can overwhelm the RFID reader, where the system might fail to determine the exact number of passengers or differentiate between closely spaced tags. "

The section about CCTV cameras has been extended with the following sentence to address the challenges with angles: "Frequent problems with the use of CCTV cameras are related to the angles at which the cameras are directed, so to cover the entire space of the vehicle, either multiple cameras or wide-angle cameras must be used."
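
The MAC randomization described earlier in this response has a practical consequence for counting that a short sketch can make concrete. Randomized addresses normally have the locally administered bit set (the second-least-significant bit of the first octet, an IEEE 802 convention), so a receiver can separate them from globally unique hardware addresses. The function names below are hypothetical, and the capture of probe requests itself is assumed to happen elsewhere.

```python
def is_locally_administered(mac: str) -> bool:
    """Return True if the locally administered bit of the first octet is set.

    Randomized MAC addresses normally set this bit; globally unique
    (vendor-assigned) addresses leave it cleared.
    """
    first_octet = int(mac.replace("-", ":").split(":")[0], 16)
    return bool(first_octet & 0x02)


def summarize_probe_macs(macs):
    """Count distinct globally unique and locally administered addresses.

    macs: iterable of source MAC address strings seen in probe requests.
    The 'local' count is only an upper bound on devices, because one device
    may rotate through several randomized addresses during a trip.
    """
    unique = {m.lower() for m in macs}
    local = {m for m in unique if is_locally_administered(m)}
    return {"global": len(unique - local), "local": len(local)}


if __name__ == "__main__":
    seen = ["DA:A1:19:00:00:01", "da:a1:19:00:00:01", "00:1a:2b:3c:4d:5e"]
    print(summarize_probe_macs(seen))
```

Because randomized addresses rotate, the number of distinct locally administered addresses can exceed the number of devices present, which is one reason probe-request counts are usually treated as estimates.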

Comment 3:

Section 3: Looks like a shallow description of each paper. The papers seem disconnected or unrelated as there is no grouping by subtopics.

Response 3: The descriptions of the papers have been extended, connected, and grouped together.

 

Comment 4: KLT is not defined, is it an algorithm, a filter, a methodology...?

Response 4: KLT is referenced, described and defined when first mentioned in the manuscript: "It employs a Kanade-Lucas-Tomasi (KLT) tracker [23], as an approach to feature extraction aimed at addressing the traditional image registration techniques by using spatial intensity information for efficient matching, for feature detection and a unique clustering algorithm based on the appearance and disappearance of feature points. "

 

Comment 5: The sensor-grid mat is not included in section 2.

Response 5: Thank you for the comment. A new Section 2.10 has been added to the manuscript.

 

Comment 6: There is another section 2. "Methodology", after section 3.

Response 6: Thank you for the comment. The numbering of sections is fixed now.

 

Comment 7: Section 3: Concepts and techniques... Needs subsections instead of random concepts, unstructured and disconnected. It should also include repeated citations.

Response 7: Thank you for the comment. Subsections are added and repeated citations are used.

 

Comment 8: Does Figure 3 show an image from [60] as claimed in the text or from [61] as claimed in the figure's title?

Response 8: Thank you for the comment. The reference number has been corrected.

 

Comment 9: The paragraph "The initial step in developing..." looks more like a subsection for Section 3 "Concepts", than a paragraph for the proposed solution.

Response 9: Thank you for the comment. The paragraph has been moved to the section with concepts.

 

Comment 10: A block diagram may facilitate the reader's comprehension in section 4.

Response 10: A block diagram was added to section 4.

 

Comment 11: As a general final comment: Crowd counting is not mentioned in the whole document. APC for in-line (one-by-one) boarding or alighting is somehow considered a solved problem, even with multiple technologies, with several commercial products in the ITS transit market. Crowded boarding or alighting instead remains an open technical challenge. This should be a structural point in any review on this topic.

Response 11: The conclusion section was extended with a reflection on crowded boarding and a reference to a paper that proposes a solution. The manuscript was also extended with the YOLO v11 model and the CrowdHuman benchmark.

Comment 12: 

A review should be as exhaustive as possible. Several important papers are missing in this review. Examples:

- Empirical Study on the Accuracy and Precision of Automatic Passenger Counting in European Bus Services, Olivo et al 2019.

- Benchmarking the Functional, Technical, and Business Characteristics of Automated Passenger Counting Products, Pronello et al 2024.

- Passenger Counting in Mass Public Transport Systems using Computer Vision and Deep Learning, Moreno et al 2023.

- Visual and automatic bus passenger counting based on a deep tracking-by-detection system, Labit-Bonis et al 2021.

- Development of a real-time automatic passenger counting system using head detection based on deep learning, Kim et al 2022.

Response 12: Thank you for the great references. All papers were added as references.

Comments on the Quality of English Language - thank you, suggestions are implemented.

Round 2

Reviewer 2 Report (Previous Reviewer 1)

Comments and Suggestions for Authors

Dear

the article has been improved and most of my comments have been incorporated. I recommend considering whether the supplemented texts including citations in chapter 9 Conclusion would not be more appropriate to insert into chapter 8 Discussion. Chapter 9 should describe the main conclusions from the article. Overall, the proposals with recommendations would be clearer.

I recommend doing a final check of the text.

best regards

Author Response

Comment 1: the article has been improved and most of my comments have been incorporated. I recommend considering whether the supplemented texts including citations in chapter 9 Conclusion would not be more appropriate to insert into chapter 8 Discussion. Chapter 9 should describe the main conclusions from the article. Overall, the proposals with recommendations would be clearer.

Response 1: Thank you for your comment. The texts including citations in Chapter 9 Conclusion were moved to Chapter 8 Discussion.

Reviewer 3 Report (New Reviewer)

Comments and Suggestions for Authors

All my previous comments were extensively considered and corrections were pertinent. The article looks much better now.

Author Response

Comment: All my previous comments were extensively considered and corrections were pertinent. The article looks much better now.

Response: Thank you for your time, suggestions, and help in improving our manuscript.

This manuscript is a resubmission of an earlier submission. The following is a list of the peer review reports and author responses from that submission.


Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

Dear

The article deals with the issue of passenger counting in public transport, which is important for the optimization of PT in cities. The topic is mainly oriented toward image processing.

I recommend checking if the Topic "Unmanned Vehicles Technology and Embodied Intelligence Systems for Intelligent Transportation" is set well, because the topic is ruined for public transport, I recommend fixing this

- it is missing from how many means of transport the data from the bus was obtained, from how many cameras, what type, what settings, etc. The function of the cameras in the bus is often set differently depending on the type of BUS

- how many records were used and what method was used. YOLO is just a method using CNN to process your own image.

- why was there an 80/20 split in the data set and how many data samples were there? Was a 70/30 ratio also considered?

- the novelty and research contribution of the article is not stated

- no outputs from data processing are documented in the article. There are no documented graphs regarding the reliability of the method compared to the actual state of counting passengers in the vehicle, etc.

- some parts of the article, such as methods of detection, but also GDPR are known and are of a general nature. It is not clear how it was specifically used and implemented in this case, clarification can be recommended.

- the discussion is led in the right direction, but it is very general and it is not clear what specifically from the given research is usable or has been verified as recommendable for future use for PT or just buses

- in the conclusion, the use of Lidar is recommended, although it is not mentioned in the text. What does it mean that this is the right technology for counting passengers in PT?

- the conclusion does not state what was new and the specific outputs of the contribution of the article, such as the reliability of detection, comparison of values, etc., apart from general known information and formulations.

 

Although the article is interesting, it shows fundamental errors - the lack of a research method, a description of the own approaches used, a clear innovation and comparison with another adder or verification of the given reliability, etc. is not documented. Similarly, it is not documented how the given outputs were verified, on which sample, on which specific technology and settings including stating general conclusions.

I recommend that the article be rejected and returned, if necessary fundamentally and significantly reworked and refined in all parts, so that the foundations for a professional and scientific article are fulfilled.

 Best regards

Comments on the Quality of English Language

Dear

I don’t have major comments

Best regards

Reviewer 2 Report

Comments and Suggestions for Authors

The paper provides an in-depth review of various passenger counting technologies, encompassing both traditional methods and advanced deep learning approaches. Here are some comments.

1. The paper mentions that the latest version of YOLO is V9, but the current latest version is V10. It is recommended that the author verify and update this information to ensure accuracy.

2. When discussing the internal and external network infrastructure of public transport vehicles, the author only mentioned the use of Wi-Fi routers and cellular networks for wireless connections. It is suggested to include a discussion on the potential of 5G networks, as they offer significant advantages in low latency and high bandwidth, which are particularly suitable for real-time video transmission and monitoring systems.

3. This paper could benefit from a comparison of the accuracy of camera data, mobile signaling data, and card swiping data. The current version lacks specific data support; referring to related studies could help discuss the advantages and disadvantages of these methods.

4. The proposed passenger counting methods need to consider computational resources and real-time performance in practical applications. It is recommended to add a discussion on these aspects, analyzing the performance of different methods in terms of computational complexity, processing time, and hardware requirements.

5. Some abbreviations, such as GDPR, should be explained when first used to ensure readers understand their meanings.

6. As a review paper, having only 46 references may seem insufficient. It is recommended to include more relevant studies to enrich the literature review section and provide a comprehensive overview of the field.

Reviewer 3 Report

Comments and Suggestions for Authors

This study provides an overview of bus passenger counting methods using image recognition and machine learning techniques, mainly introducing the technologies and concepts of image processing and machine learning, as well as integration methods with other sensor devices. This review aims to explore in depth the effectiveness, scalability, and practicality of different passenger counting solutions, and propose solution recommendations. However, the content of the manuscript is not very complete, mainly discussed from a technical perspective, and the organization and evaluation of academic research methods are not thorough enough. Therefore, I suggest the author make significant revisions to the manuscript.

 

1. In Section 1, it is suggested to highlight the research background and value. The content on bus passenger counting technology is too fragmented and lacks a systematic introduction. Why introduce only the technology of video recognition of passengers without mentioning the technology of identifying the number of passengers by extracting card swiping data? It is necessary to add a table that displays the technology, related equipment, and technical characteristics currently used for bus passenger counting.

2. In Section 2, the article only briefly introduces the image recognition and deep learning techniques and lacks the necessary introduction to the characteristics, applicability, and specific application methods of the techniques, algorithms, and models.

3. In Section 3, the article provides a brief introduction to the solution proposal for camera position setting, communication, and image data collection, but does not explain the difficulties and current mainstream countermeasures in these parts. It is suggested to supplement the corresponding content.

4. Why mention 'GDPR Compliance and Passenger Counting Systems' in Section 4? What is the logical relationship with the previous text? It is suggested to add corresponding explanations.

5. In the conclusion, what dimensions should be considered in algorithm design and innovation, and what additional considerations are needed for future personalized, customized, and modular public transportation? It is recommended to supplement the corresponding introduction.
