Pedestrian Crossing Sensing Based on Hough Space Analysis to Support Visually Impaired Pedestrians

There are many visually impaired people globally, and it is important to support their ability to walk independently. Acoustic signals and escort zones have been installed at pedestrian crossings so that visually impaired people can walk safely; however, pedestrian accidents, including those involving the visually impaired, continue to occur. Therefore, to realize safe walking for the visually impaired on pedestrian crossings, we present an automatic sensing method for pedestrian crossings using images from a camera worn by the pedestrian. Because the white rectangular stripes that mark pedestrian crossings are aligned, we focused on the edges of these stripes and propose a novel pedestrian crossing sensing method based on the dispersion of the slopes of straight lines in Hough space. Our proposed method possesses unique characteristics that allow it to handle challenging scenarios that traditional methods struggle with. It detects crosswalks even in low-light conditions during nighttime, when illumination levels may vary. Moreover, it can detect crosswalks even when certain areas are partially obscured by objects or obstructions. By minimizing computational costs, our method achieves high real-time performance, ensuring efficient and timely crosswalk detection in real-world environments. Specifically, our proposed method demonstrates an accuracy rate of 98.47%. Additionally, the algorithm can be executed at almost real-time speeds (approximately 10.5 fps) using a Jetson Nano, a small single-board computer, showcasing its suitability for a wearable device.


Introduction
There are many visually impaired people in the world, including in Japan. For instance, in 2022, the number of visually impaired people in Japan was approximately 310,000, and the number of people who required guide dogs was approximately 3000, as illustrated in Figure 1. However, there are only 848 guide dogs currently in use in Japan [1]. Furthermore, as shown in Figure 2, when the number of guide dog users per one million people is compared across countries, Japan's figure is extremely small [2]. In particular, compared with the United Kingdom, there is an approximately 10-fold difference. If the utilization rate of the United Kingdom were realized in Japan, the number of guide dog users in Japan would be approximately 10,000, which would meet the needs of all guide dog applicants. In other countries as well, the number of guide dog users is small relative to the number of visually impaired people, and the environment in which visually impaired people can move freely is underdeveloped. The low usage level of guide dogs in Japan can be attributed to differences in sidewalks, breeding environments, and a lack of social understanding.
Therefore, there is a need for walking supports that do not rely on living organisms. Two types of devices have been used to improve safety at crosswalks, where accidents are particularly common. The first is an acoustic traffic signal with a device that emits a guiding sound. The second is an escort zone with braille blocks on the crosswalk. These devices and considerations to achieve safe crosswalks for all pedestrians, including the visually impaired, have become widespread.
However, accidents involving the visually impaired continue to occur at crosswalks. This may be because acoustic traffic signals do not emit sound from night until morning owing to noise considerations. Additionally, escort zones deteriorate over time, and their tactile unevenness wears away.
Therefore, as shown in Figure 3, crosswalk detection methods can be classified into three types according to camera placement: pedestrian-mounted [3,4], in-vehicle [5][6][7][8][9][10][11][12], and field-fixed [13,14]. However, in-vehicle and field-fixed crosswalk detection methods are not useful for walking support systems. We established that safer walking can be realized when a visually impaired person can perform the detection directly using a pedestrian-mounted system. Therefore, we propose a wearable pedestrian system.
In order to overcome the limitations of existing methods, we focused on developing a detection method specifically designed for pedestrians to wear [15][16][17][18][19][20][21][22][23][24][25][26][27]. This innovative approach addresses the challenges associated with conventional methods. One such challenge is the dominance of experiments conducted during the daytime, which often results in crosswalk images lacking pedestrians obstructing the view. This limitation restricts the applicability of the existing methods in real-world scenarios. Additionally, there is a scarcity of methods capable of efficient real-time processing, further impeding their practical usability.
To tackle these challenges, our study introduces a pedestrian-mounted (Figure 4) crosswalk detection method. Our primary goal was to not only address the aforementioned limitations but also create a hardware solution that is user-friendly, portable, and lightweight, ensuring ease of use and convenience for pedestrians. The core principle of our proposed method lies in leveraging the fact that crosswalks typically exhibit a distinctive pattern of continuous white rectangular shapes. By applying the Hough transform method [28] to the contours generated by a Canny edge detector [29], we capture the shapes of these rectangular patterns, enabling the accurate detection of crosswalks based on the variance of the slopes of the straight lines associated with the contours in Hough space.
The unique characteristics of our proposed method enable it to handle various challenging scenarios that traditional methods struggle with. For instance, it can effectively detect crosswalks in low-light conditions during nighttime, where illumination may vary. Furthermore, the method is adept at detecting crosswalks even when certain parts of the crosswalk are partially obscured by objects or obstructions. Importantly, our method achieves an almost real-time performance by minimizing computational costs, ensuring the efficient and timely detection of crosswalks in real-world environments.
The proposed method demonstrates a high accuracy rate of 98.47%. Additionally, the algorithm can be executed almost in real time (approximately 10.5 fps) using a Jetson Nano, a small single-board computer, which highlights its applicability as a wearable device.

Related Work
Many studies have been conducted on supporting the visually impaired with wearable devices [3,4], including crosswalk detection. Although the goal is to detect crosswalks, the objects to which the cameras are attached vary significantly. There are three main types of cameras: in-vehicle, fixed-in-place, and pedestrian-mounted cameras.
These cameras have different objectives and results. First, for the in-vehicle cameras, there are many studies, including those of Yuhua Fan et al. [5], J. Choi et al. [6], and many more [7][8][9][10][11]. These cameras aim to detect pedestrians on the crosswalk. However, it is difficult to put them into practical use because their detection accuracy is insufficient. In addition, they were developed from the perspective of the vehicle rather than that of visually impaired persons.
Second, there are many examples of detecting crosswalks using a fixed camera installed near a crosswalk [12,13]. The goal of these systems is to detect pedestrians on a crosswalk using a surveillance camera located at the site. They are not useful in assisting the visually impaired in walking because they cannot provide guidance.
Therefore, because it is difficult for in-vehicle and fixed-in-place cameras to provide walking support for the visually impaired, we aim to realize safer and more accurate walking by allowing visually impaired individuals to perform the detection directly.
Therefore, a pedestrian wearable system is proposed. There are several studies on pedestrian crossing detection, including the work of Ruiqi Cheng et al. [14] as an example of a similar method. The literature encompasses a wide range of innovative research efforts aimed at improving the detection and recognition of marked pedestrian crossings in various contexts, with a specific emphasis on addressing challenging scenarios and catering to the needs of individuals with visual impairments. Wu et al. [15] propose a block-based Hough transform approach that effectively identifies marked crosswalks in natural scene images, contributing to the development of robust detection methods. Radványi et al. [16] introduce advanced crosswalk detection techniques tailored for the Bionic Eyeglass, offering enhanced functionality and usability for visually impaired users. Cao et al. [17] present an image-based detection method specifically designed for pedestrian crossings, utilizing visual cues and patterns to identify these critical areas. Akbari et al. [18] propose a vision-based marked crosswalk detection method that caters to the unique needs of individuals with visual impairments, empowering them with improved mobility and safety. Mascetti et al. [19] introduce ZebraRecognizer, an advanced pedestrian crossing recognition system explicitly developed for individuals with visual impairment or blindness, offering real-time assistance and guidance. These papers collectively demonstrate a broad spectrum of approaches, including the integration of computer vision applications into wearable devices (Silva et al. [20]), leveraging the ubiquity of camera phones for crosswalk detection (Ivanchenko et al. [21]), employing sophisticated SVM-based column-level approaches for accurate detection in low-resolution images (Romić et al. [22]), and harnessing the power of deep convolutional neural networks for the precise identification and localization of marked crosswalk areas (Haider et al. [23]). Furthermore, the literature includes papers exploring diverse areas, such as image analysis techniques for crosswalks (Shioyama et al. [24]), the development of lightweight semantic segmentation networks for the rapid detection of blind roads and crosswalks (Cao et al. [25]), the creation of crosswalk guidance systems for the blind (Son et al. [26]), and the utilization of RGBD cameras for detecting both stairs and pedestrian crosswalks (Wang et al. [27]). These efforts contribute to improvements in crosswalk detection technology, but its practical application level is still insufficient in terms of achieving pedestrian safety, especially for individuals with visual impairments. One major reason for this is that the detection rate is not very high, particularly when a part of the crosswalk is obscured or during nighttime.
Therefore, based on the results of previous studies, this study proposes a method for the automatic detection of crosswalks from camera images worn by the visually impaired to realize safe walking at crosswalks. Furthermore, as shown in Figure 4, the proposed method can be implemented using small and lightweight hardware that can be worn and easily carried by the user. The proposed method exhibited a higher accuracy than any other method reported in the literature. Figure 4 shows the application of the proposed pedestrian crossing detection method. The method aims to detect the pedestrian crossing using images from the camera worn by the pedestrian and guide them across the crosswalk via audible cues.

Outline of Application
Pedestrian crossings are designated as white rectangular stripes. Therefore, in this study, we focused on the edges of white rectangular regions and propose a novel pedestrian crossing detection method based on the variance of the slope of a straight line in Hough space formed by the white stripes.

Image Acquisition
Images can be acquired using various two-dimensional cameras. Table 1 lists the specifications of the camera used in this study. We used the camera in an iPhone 7 to conduct the experiment. Figure 5 shows the input image obtained by the camera. As shown in Figure 4, the camera was fixed to the chest of the pedestrian and faced forward. The width of the pedestrian crossing was approximately 10 m, and video acquisition began approximately 3 m in front of the pedestrian crossing. In addition, because we assumed the need for support while walking, images were captured continuously.


Flow of the Proposed Pedestrian Crossing Detection Method
The process flow of the proposed automatic crosswalk detection method is shown in Figure 6. Here, we summarize the overall processing flow. First, the acquired image is grayscaled. Second, edge detection is performed using the Canny edge detection method. Thereafter, Hough transform is performed on the image to detect the straight-line parts of the edges. When three or more straight lines are obtained via Hough transform, the region is processed further as a candidate crosswalk area. Three or more straight lines are required here because the width spanned by three stripe boundaries is approximately 1 m, which can be used as a guide to maintain safety. Subsequently, the slopes of the straight lines are measured, and only lines with similar slopes are drawn; the extraction of the drawn lines is based on the variance of the inclination of the lines. We varied the threshold value and adopted 0.03 because it yielded the highest accuracy over many experiments. The drawn lines are then compared with the original edge image, and the range of the crosswalk is extracted. Finally, a labeling process is performed, and the detected labels, except for significantly smaller ones, are combined to determine the crosswalk area, completing the detection of the crosswalk. Detailed information on the major stages shown in Figure 6 is presented in the subsections below.

Edge Detection
This section presents the edge detection stage shown in Figure 6, including the grayscaling process. The Canny method, which is often used as preprocessing for recognizing objects in an image [30][31][32][33][34], was used to detect edges in the moving image. The Canny method comprises five stages.
First, smoothing was performed using a Gaussian filter, which weights the pixel values around the pixel being processed according to a Gaussian distribution. The relationship between the input image, I; the Gaussian filter kernel, K_g; and the smoothed image, G, is given by Equation (1), where * denotes convolution:
G = K_g * I (1)
Second, the smoothed image G was differentiated using the Sobel filter [35]. The relationship between the horizontal differential kernel, K_x, of the Sobel filter; the vertical differential kernel, K_y; the horizontal differential image, G_x; and the vertical differential image, G_y, is given by Equations (2) and (3):
G_x = K_x * G (2)
G_y = K_y * G (3)
Third, the gradient magnitude, |G|, and direction, θ, were obtained from the differential images according to Equations (4) and (5), respectively:
|G| = sqrt(G_x^2 + G_y^2) (4)
θ = arctan(G_y / G_x) (5)
Fourth, the contours of the differential image |G| were thinned via non-maximum suppression. Specifically, the value of the pixel of interest is compared with the values of the pixels adjacent to it along the gradient direction of the contour; if the value of the pixel of interest is not the maximum, the pixel is not considered an edge pixel.
Finally, hysteresis threshold processing was used to select reliable contours and unreliable contours based on the maximum and minimum thresholds, and only highly reliable contours were drawn. Specifically, this stage classifies contours into three types. First, a reliable contour is used when the pixel value is larger than the maximum threshold. Second, the contour is considered unreliable when the value is smaller than the minimum threshold value. Finally, if it is between the maximum and minimum thresholds, the contour is reliable if adjacent contours have high reliability and vice versa. In this study, the maximum and minimum thresholds were set to 300 and 50, respectively.
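Steps two and three (the Sobel differentiation and the gradient computation of Equations (2)-(5)) can be sketched as follows. The small `convolve2d` helper and the synthetic step-edge image are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def convolve2d(img, k):
    # correlate the image with a 3x3 kernel (no padding), which is how the
    # Sobel kernels below are conventionally applied
    h, w = img.shape
    out = np.zeros((h - 2, w - 2))
    for i in range(h - 2):
        for j in range(w - 2):
            out[i, j] = np.sum(img[i:i + 3, j:j + 3] * k)
    return out

Kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], float)  # horizontal differential kernel
Ky = Kx.T                                                   # vertical differential kernel

# synthetic input: dark left half, bright right half (one vertical step edge)
img = np.zeros((8, 8)); img[:, 4:] = 255.0
Gx = convolve2d(img, Kx)        # Equation (2)
Gy = convolve2d(img, Ky)        # Equation (3)
mag = np.sqrt(Gx**2 + Gy**2)    # gradient magnitude |G|, Equation (4)
theta = np.arctan2(Gy, Gx)      # gradient direction, Equation (5)
```

For this vertical edge, G_y is zero everywhere and the gradient direction at the edge is 0 (pointing along the x-axis), which is exactly what non-maximum suppression then uses to thin the contour.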
The result of this process is presented in Figure 7. The Canny method, which performs the above processing, is characterized by fewer false detections and fewer missed contours than the Sobel and Laplacian filters, which are also used for edge detection.

Drawing Straight Lines by Detecting the Edge Components Related to Them
Following Figure 6, we utilized Hough transform to detect the straight lines representing the edges. Once the lines were detected, solid lines were drawn along them using their equations. Hough transform is widely used in object recognition from images. Generally, a straight line is parameterized as in Equation (6):
y = ax + b (6)
However, when a line is parallel to the y-axis, its slope becomes ±∞, and the intercept cannot be represented. Hough transform addresses this problem. As shown in Figure 8, Hough transform is a method that performs calculations in the (θ, ρ) space, where ρ is the distance of the line from the origin and θ is the angle between the normal of the line and the x-axis. In this space, a line is represented according to Equation (7). In Figure 8, points A, B, and C lie on a straight line, while point D does not.
ρ = x·cos θ + y·sin θ (7)
According to Equation (7), countless lines pass through a single pixel. However, fixing the coordinates (x, y) of a pixel, each angle θ determines a unique distance ρ; therefore, each pixel traces a one-dimensional curve in (θ, ρ) space. Considering the symmetry of the figure, it suffices to search for solutions in the range 0 ≤ θ ≤ π. Pixels on the same straight line then share the same (θ, ρ), i.e., the same parameters of the line in Hough transform.
Therefore, if we calculate (θ, ρ) for all edge pixels and plot the resulting curves with angle θ on the horizontal axis and distance ρ on the vertical axis, the curves of pixels on the same line intersect at the same (θ, ρ). The greater the number of curves at an intersection, the more reliable the straight line.
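The key property, namely that collinear pixels share a single (θ, ρ), can be verified numerically with Equation (7). The helper name `hough_rho` and the example points are illustrative assumptions.

```python
import numpy as np

def hough_rho(x, y, theta):
    # Equation (7): rho = x*cos(theta) + y*sin(theta)
    return x * np.cos(theta) + y * np.sin(theta)

# three collinear points on the line y = x; its normal makes a 135-degree
# angle with the x-axis, and the line passes through the origin (rho = 0)
points = [(1.0, 1.0), (2.0, 2.0), (3.0, 3.0)]
theta = 3 * np.pi / 4
rhos = [hough_rho(x, y, theta) for x, y in points]
```

All three curves pass through the same point (θ, ρ) = (3π/4, 0) in Hough space, whereas a point off the line, such as (3, 0), yields a different ρ at the same θ; the accumulator cell where many curves intersect therefore identifies the line.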
There are two types of Hough transform: ordinary [36][37][38][39] and probabilistic [40]. The former has a high straight-line detection rate because it performs the calculation for all pixels, but it requires more processing. The latter performs the calculation on an arbitrarily selected minimum number of pixels, which lowers both the detection rate and the processing time.
In this study, the probabilistic Hough transform of the latter method did not provide a sufficient detection rate; therefore, the ordinary Hough transform of the former method was used. Figure 9 shows an image in which only the straight lines obtained through Hough transform are extracted.

Extraction of Parallel Lines Based on Variance in Angle Information in Hough Space
The contents of this section pertain to the calculation of the dispersion (variance) of the parallel lines, as indicated in Figure 6. The white rectangular stripes of a pedestrian crossing have parallel boundaries. Therefore, if the variance, S^2, of the angles is calculated using Equation (8) from the angle information of the straight lines obtained through Hough transform in the previous section, S^2 will be zero in the ideal case.
Therefore, when S^2 is close to zero, the area is recognized as a pedestrian crossing. However, S^2 does not completely approach zero in reality. In this study, the threshold value, S^2_th, of S^2 was set to 0.03 experimentally, because linear distortion exists due to perspective and other noise.
S^2 was obtained from the angle information of the straight lines obtained via Hough transform, and when the value satisfies Equation (9), the lines are processed as representing a pedestrian crossing.
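The variance test of Equations (8) and (9) amounts to a few lines. Assuming Equation (8) is the ordinary sample variance of the Hough-line angles, a sketch (with our own function name `is_crosswalk_candidate`) is:

```python
import numpy as np

S2_TH = 0.03  # threshold value adopted in this study

def is_crosswalk_candidate(line_angles):
    # Equation (8): variance S^2 of the angles theta of the detected lines;
    # Equation (9): accept the region when S^2 is below the threshold
    thetas = np.asarray(line_angles, dtype=float)
    return float(thetas.var()) < S2_TH

# near-parallel stripe boundaries (angles in radians, close to pi/2)
parallel = [1.55, 1.57, 1.58, 1.60]
# unrelated clutter edges with widely spread angles
clutter = [0.2, 1.0, 2.5]
```

The near-parallel set has a variance of about 3 × 10^-4, well under the 0.03 threshold, while the clutter set has a variance near 0.9 and is rejected.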

Combining Edge Image and Hough Transform Image
In this section, we primarily focus on the process of combining the edge image and the Hough transform image, following a comparison between the two, as depicted in Figure 6. The edge image (Figure 7) and the Hough transform image (Figure 9) were compared to extract only the pedestrian crossing area. Here, the Hough transform image consists of the lines drawn by detecting straight lines from the edge image using Hough transform. Both are binarized images. Therefore, from Equations (10) and (11), only the pixels that hold under the logical product of the edge image G_E and the Hough transform image G_H are extracted. With this processing, the straight parts of the pedestrian crossing in the edge image can be extracted. The results of this processing on the edge image in Figure 7 are presented in Figure 10.
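Since both images are binarized, the combination of Equations (10) and (11) is a per-pixel logical product. A toy sketch with assumed 2 × 4 images:

```python
import numpy as np

# toy binarized images: G_E is the edge image, G_H the Hough-line image
G_E = np.array([[0, 255, 255,   0],
                [0, 255,   0, 255]], dtype=np.uint8)
G_H = np.array([[0, 255,   0,   0],
                [255, 255,  0, 255]], dtype=np.uint8)

# keep only the pixels that are set in BOTH images (logical product)
combined = np.bitwise_and(G_E, G_H)
```

Only the pixels set in both inputs survive, which is how the straight parts of the crosswalk are isolated from the rest of the edge image.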

However, if there is an edge other than the pedestrian crossing in the extension of the edge of the pedestrian crossing, the edge will be output even though it is minute. This part was removed during the labeling step.

Labeling
To determine the crosswalk area, we performed labeling on the combined image of the edge image and the Hough transform image. Labeling is a process of concatenating consecutive output pixels in a binarized image and assigning the same number to them.
There are two types of labeling: four-connected (Figure 11), which gives the same label to consecutive pixels in the vertical and horizontal directions of the binarized image, and eight-connected (Figure 12), which gives the same label to pixels connected in the vertical, horizontal, and diagonal directions. The red area in the figure represents the pixels of interest. In this study, we used the eight-connected method, which allows labeling in the diagonal direction in consideration of the rotation of the object.
The multiple detected labels were then sorted by size, and only those with large areas were extracted, and the labels were merged to draw the entire crosswalk area. Figure 13 shows the labeling results, and Figure 14 shows the result of merging the labels.
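The labeling step above can be sketched in a few lines. The paper's implementation uses C++ and OpenCV (where `cv::connectedComponentsWithStats` provides this operation directly); the following is a minimal pure-Python illustration of 8-connected labeling followed by keeping only the largest components. The sample image, the `keep` count, and the helper names are illustrative, not taken from the paper.

```python
from collections import deque

def label_8_connected(binary):
    """Label connected components in a binary image (list of lists of 0/1)
    using 8-connectivity: pixels touching vertically, horizontally, or
    diagonally receive the same label."""
    h, w = len(binary), len(binary[0])
    labels = [[0] * w for _ in range(h)]
    next_label = 0
    sizes = {}
    for sy in range(h):
        for sx in range(w):
            if binary[sy][sx] and not labels[sy][sx]:
                next_label += 1
                labels[sy][sx] = next_label
                size = 1
                queue = deque([(sy, sx)])
                while queue:  # flood fill over the 8-neighborhood
                    y, x = queue.popleft()
                    for dy in (-1, 0, 1):
                        for dx in (-1, 0, 1):
                            ny, nx = y + dy, x + dx
                            if (0 <= ny < h and 0 <= nx < w
                                    and binary[ny][nx] and not labels[ny][nx]):
                                labels[ny][nx] = next_label
                                queue.append((ny, nx))
                                size += 1
                sizes[next_label] = size
    return labels, sizes

def keep_largest(labels, sizes, keep=2):
    """Sort labels by area and merge the `keep` largest components into a
    single mask (the candidate crosswalk area)."""
    kept = set(sorted(sizes, key=sizes.get, reverse=True)[:keep])
    return [[1 if v in kept else 0 for v in row] for row in labels]

# Two blobs that touch only diagonally merge under 8-connectivity,
# whereas 4-connectivity would split them:
img = [[1, 0, 0, 0],
       [0, 1, 0, 0],
       [0, 0, 0, 1],
       [0, 0, 1, 1]]
labels, sizes = label_8_connected(img)
```

With this input, the diagonal pair at the top-left becomes one component of size 2 and the bottom-right cluster one component of size 3, so `keep_largest(labels, sizes, keep=1)` retains only the bottom-right blob.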

Labeling
To determine the crosswalk area, we performed labeling on the combined imag the edge image and the Hough transform image. Labeling is a process of concatena consecutive output pixels in a binarized image and assigning the same number to the There are two types of labeling: four-connected (Figure 11), which gives the s label to consecutive pixels in the vertical and horizontal directions of the binarized im and eight-connected (Figure 12), which gives the same label to pixels connected in vertical, horizontal, and diagonal directions. The red area in the figure represents the els of interest. In this study, we used the eight-connected method, which allows labe in the diagonal direction in consideration of the rotation of the object.
The multiple detected labels were then sorted by size, and only those with large a were extracted, and the labels were merged to draw the entire crosswalk area. Figur shows the labeling results, and Figure 14 shows the result of merging the labels.   The multiple detected labels were then sorted by size, and only those with large areas were extracted, and the labels were merged to draw the entire crosswalk area. Figure 13 shows the labeling results, and Figure 14 shows the result of merging the labels.

Experimental Environment
We performed experiments using the proposed system and evaluated its performance. The experiments were conducted using 1390 crosswalk images taken during the day in different environments; that is, the images were captured while walking on various pedestrian crossings located on different roads. We calculated the true positive (TP) and false negative (FN) rates using these crosswalk images. In addition, we used 1100 images that did not include pedestrian crossings to confirm the false positive (FP) and true negative (TN) rates.
A similar experiment was conducted using 520 crosswalk images taken at night and 520 images without crosswalks.
The camera was fixed to the chest of the pedestrian during video acquisition. The pedestrian walked at a normal walking speed. We utilized the C++ programming language and the OpenCV libraries for our implementation.
We further verified the real-time performance using the Jetson Nano computer. Table 2 shows information on the Jetson Nano. A 30 s video was processed, and the processing time was calculated.

Evaluation
Video acquisition was performed at different times and places, and processing was performed for each input video. Selected results are presented in Figures 15-17. Figure 15 shows only the crosswalk, whereas Figures 16 and 17 show pedestrians on the crosswalk. In each figure, panel (a) shows the input image, panel (b) shows the edge image, panel (c) shows an image in which only the straight lines obtained via the Hough transform are extracted, panel (d) shows the composite image, panel (e) shows the labeling, and panel (f) shows the pedestrian crossing sensing results. Figures 18 and 19 show the sensing results for the input and output images.
Thus, it is shown that detection is possible even when the crosswalk is partially hidden by a person. If the labeling of parallel lines exceeding 1/100 of the screen size is possible, the region can be recognized as a pedestrian crossing. Therefore, what matters is not the pedestrian density itself but how the crossing appears from the camera.
Table 3 summarizes the results obtained during the day, and Table 4 summarizes the results obtained at night. Both tables report the following evaluation results: TP, TN, FP, and FN. TP and TN correspond to images in which the presence or absence of a pedestrian crossing is correctly recognized; FP and FN correspond to images in which it is not. We evaluated the identification results according to the accuracy, which is defined in Equation (12):

Accuracy = (TP + TN) / (TP + TN + FP + FN)    (12)

The accuracy was 98.5% when the proposed algorithm was tested with pedestrian crossing images and images without pedestrian crossings. Hence, the proposed method detected pedestrian crossings in both environments with good accuracy and a comparatively low FP rate. In addition, the accuracy of our results is higher than that of previous studies. For instance, Wu et al. [15] reported an average accuracy of 95.3%, and Cao et al. [17] achieved an accuracy of 94.9% (Table 5).

Table 5. Comparison of accuracy.

Methods             Accuracy
Wu et al. [15]      95.3%
Cao et al. [17]     94.9%
This method         98.5%

The processing time for a 30 s video at 30 fps was 85.6 s. These results indicate that real-time processing can be performed at approximately 10.5 fps. Considering the walking speed of a visually impaired person, this performance is considered sufficient.
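The two headline numbers follow directly from their definitions. In the sketch below, the per-class counts (`tp`, `tn`, `fp`, `fn`) are hypothetical values chosen only to be consistent with the reported image totals (1390 crosswalk and 1100 non-crosswalk daytime images) and the reported accuracy of about 98.47%; the actual counts are those in Tables 3 and 4.

```python
def accuracy(tp, tn, fp, fn):
    """Accuracy as defined in Equation (12)."""
    return (tp + tn) / (tp + tn + fp + fn)

def effective_fps(frames, elapsed_s):
    """Detector throughput: frames processed per second of wall-clock time."""
    return frames / elapsed_s

# Hypothetical counts consistent with 1390 + 1100 images and ~98.47% accuracy;
# the real per-class counts are in Tables 3 and 4 of the paper.
acc = accuracy(tp=1370, tn=1082, fp=18, fn=20)

# A 30 s clip at 30 fps is 900 frames; processing it in 85.6 s gives ~10.5 fps.
fps = effective_fps(30 * 30, 85.6)
```

This also makes the real-time claim easy to check: 900 frames in 85.6 s is about 10.5 fps, roughly a third of the camera's native 30 fps.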

Discussion
In this study, a pedestrian crossing was detected based on edge information and the characteristics of the slope of straight lines. Thus, only a minimal amount of ambient light is necessary for detection, and the results are not affected by the light level owing to the weather or time of day. The results depend more on the capturing sensitivity of the camera than on the specific lighting level, and modern high-sensitivity cameras can capture objects even under low-light conditions. Based on the results obtained thus far, if a pedestrian crossing can be captured in images similar to Figure 18(c1), the proposed method detects it successfully. We conducted our experiments under non-rainy and non-snowy conditions, both during the daytime and at night, and the proposed method functioned successfully under these weather conditions. Therefore, as shown in Figures 15-17, the method is considered effective in many cases. Additionally, we succeeded in detecting pedestrian crossings at night.
However, if a significant portion of the pedestrian crossing marking is missing, it will not be detected because no sufficiently straight lines can be detected in the edge image.
Regarding the number of pedestrians on the crossing, we have observed that the appearance of the white regions of the pedestrian crossing changes randomly based on the positions of the pedestrians. Therefore, in our experience, the detection results of the pedestrian crossing depend more on the area of the pedestrian crossing that can be captured in the camera images rather than the exact number of pedestrians. Based on the results obtained thus far, we have observed that if the edges of up to two white layers of the pedestrian crossing can be detected, even when pedestrians are present on those layers, this method can successfully detect a pedestrian crossing.
On the other hand, the proposed method primarily detects pedestrian crossings based on the parallel lines associated with the edges of the parallel white layers of a pedestrian crosswalk. This indicates that as long as the camera can capture the pedestrian crossing from any direction in the images, the proposed method will work effectively. The appearance of the parallel layers on the images is independent of the capturing direction.
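The core criterion, that stripe edges produce nearly parallel lines whose slopes cluster tightly in Hough space, can be illustrated with a dispersion measure over the detected line angles. The paper analyzes the variance of the slope directly; the sketch below substitutes a circular dispersion statistic (angles are doubled so that theta and theta + pi describe the same line direction), and both threshold values are hypothetical.

```python
import math

def angle_dispersion(thetas):
    """Circular dispersion of line angles (radians). Parallel stripe edges
    from a crosswalk yield tightly clustered theta values in Hough space,
    so a small dispersion suggests a crosswalk candidate."""
    # Double the angles so theta and theta + pi map to the same point
    # (a line's direction is ambiguous by 180 degrees).
    c = sum(math.cos(2 * t) for t in thetas) / len(thetas)
    s = sum(math.sin(2 * t) for t in thetas) / len(thetas)
    r = math.hypot(c, s)  # mean resultant length; 1.0 = perfectly parallel
    return 1.0 - r        # 0.0 = all lines parallel

def looks_like_crosswalk(thetas, max_dispersion=0.05, min_lines=4):
    """Hypothetical decision rule: enough detected lines, nearly parallel."""
    return len(thetas) >= min_lines and angle_dispersion(thetas) <= max_dispersion

parallel = [0.50, 0.52, 0.49, 0.51, 0.50]   # stripe edges: tight cluster
random_edges = [0.1, 1.2, 2.4, 0.7, 1.9]    # background clutter: widely spread
```

Because the statistic depends only on how tightly the angles cluster, not on the cluster's location, the decision is independent of the direction from which the camera views the crossing, matching the observation above.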

Conclusions
In this study, we developed a pedestrian crossing detection method to help visually impaired pedestrians walk safely. The proposed method detects edges in an image, processes the edge information in Hough space, and analyzes the variance of the edge inclination in Hough space. It has unique characteristics that make it effective in challenging scenarios where traditional methods struggle: it detects crosswalks in low-light conditions, even when visibility is limited or obstructed, and it achieves high real-time performance by minimizing computational costs, ensuring efficient and timely detection in real-world environments. The method demonstrates an accuracy of 98.47%, and the algorithm runs at almost real-time speed (approximately 10.5 fps) on a Jetson Nano small computer, highlighting its potential as a wearable device. Conducting a wide range of subjective experiments with visually impaired individuals using the proposed method and dedicated hardware will be a key focus of our future work.