Visual Positioning Indoors: Human Eyes vs. Smartphone Cameras

Artificial Intelligence (AI) technologies and their related applications are now developing at a rapid pace. Indoor positioning will be one of the core technologies that enable AI applications because people spend 80% of their time indoors. Humans can locate themselves relative to a visually well-defined object, e.g., a door, based on their visual observations. Can a smartphone camera do a similar job when it points to an object? In this paper, a visual positioning solution was developed based on a single image captured from a smartphone camera pointing to a well-defined object. The smartphone camera simulates the way human eyes locate themselves relative to a well-defined object. Extensive experiments were conducted with five types of smartphones in three different indoor settings, including a meeting room, a library, and a reading room. Experimental results showed that the average positioning accuracy of the solution based on five smartphone cameras is 30.6 cm, while that for the human-observed solution with 300 samples from 10 different people is 73.1 cm.


Introduction
With the application and development of technologies based on user location information, location-based services are now growing at a rapid pace. Especially in large and complex indoor environments such as museums, airports, shopping malls, and underground constructions, there is an urgent need for high-accuracy location services. For outdoor environments where an open sky is visible, a Global Navigation Satellite System (GNSS) can provide excellent positioning accuracy. However, GNSS signals are weak and can be easily blocked or attenuated by buildings [1]. Therefore, achieving a seamless indoor/outdoor positioning solution with high accuracy is still a challenge [2].
Indoor environments are characterized by all types of complex situations, such as obstacles, signal fluctuation or noise, environment setting changes, etc. [3]. The complex space topology and challenging signal propagation environment introduce a lot of difficulties in indoor positioning, though there are various signals available, including Wi-Fi, Bluetooth, radio-frequency identification, sensor measurements, images, ultrasound, light, magnetic fields, etc. [4]. Thus, indoor positioning is still a hot research topic though it has been studied for decades [5].
Humans can locate themselves in their ambient environment based on visual observations. In 1971, O'Keefe found place cells that form a storage facility for location information. The human brain can constitute a complete map of an indoor environment, and activate a place cell when a location is identified. The indoor location information in the place cells is fused with the information of multiple nerve cells [6]. May-Britt Moser and Edvard Moser explain that there are four types of cells at work in the human brain for the purpose of localization: grid cells, border cells, velocity cells, and head direction cells [7]. The brain navigation system is composed of a variety of different kinds of nerve cells that can obtain

Methods
It is assumed that a smartphone user is in an indoor environment and can take a picture of a doorframe with the smartphone. The size of the doorframe is available from the floor plan of the building. The pixel coordinates of the doorframe corners are obtained by the improved corner detection algorithm. Then the three angle elements and three linear elements of the smartphone are acquired by the rational functions model (RFM). Finally, the user location in the doorframe coordinate system is obtained by the coordinate transformation relationship.
Figure 1 shows the central projection model, in which three different coordinate systems are involved, i.e., the object space coordinate system, the plane coordinate system, and the pixel coordinate system. The object space coordinate system is established as a right-hand Cartesian coordinate system. Starting clockwise from the bottom left corner of the doorframe, the coordinates of the doorframe corners are $(0, 0, 0)$, $(0, 0, l)$, $(0, \omega, l)$, $(0, \omega, 0)$, where $l$ and $\omega$ are the length and width of the door. The pixel coordinate system is a two-dimensional (2D) plane coordinate system, in which the pixel coordinates corresponding to the door corners are $(u_1, v_1)$, $(u_2, v_2)$, $(u_3, v_3)$, $(u_4, v_4)$, and $(u_0, v_0)$ are the pixel coordinates of the main point projection, defined as $O_1$. The camera coordinate system is based on the main point $O_c$ and the $x_c - O_c - y_c$ plane, which is parallel to the pixel coordinate system.
In this paper, the positioning method mainly consists of the following steps. Firstly, when the image is acquired, the door corners in pixel coordinates are determined by applying an improved corner detection to the image. Secondly, the smartphone's exterior orientation elements, which include angle and linear elements, are calculated. Finally, the relative position between the user and the door is obtained from the transformation of the camera coordinate system to the object space coordinate system.
It should be noted that, in order to achieve accurate positioning results, the smartphone camera needs to be calibrated beforehand. The whole method (Algorithm 1) is as follows: (1) calibrate the camera with MATLAB's calibration tools (Section 2.1); (2) acquire the side lengths of the door from the floor plan of the building and obtain the pixel coordinates of the doorframe corners by the corner detection algorithm (Section 2.2); (3) obtain the exterior orientation elements by the rigorous imaging model recovery algorithm (Section 2.3); (4) calculate the user's position from the relationship between the two coordinate systems (Section 2.4).

Camera Calibration
Most smartphones on the market have a digital zoom that is able to enlarge the area of each pixel for image magnification. Since the lens of the camera is not perfect, image distortion occurs during the acquisition of the image [20]. The distortion types of the camera lens mainly include radial distortion, tangential distortion, and thin prism distortion. In more detail, the radial distortion is mainly caused by the defect in the shape of the "tube" or "fisheye" of the camera, which causes the pixel point to deviate from the ideal position along the radial direction. As shown in Figure 2, the tangential distortion and the thin prism distortion are mainly caused by lens fabrication and installation errors, which result in distortion along both the radial direction and the direction perpendicular to it [21].
Therefore, in order to obtain accurate measurements in pixel coordinates, the distortion parameters of the camera must be derived. The relationship between the pixel coordinates of the ideal image and those of the actual image is described in Equation (1), which considers two tangential distortions and three radial distortions:
$$
\begin{aligned}
x_d &= x_u\left(1 + k_1 r^2 + k_2 r^4 + k_3 r^6\right) + 2 p_1 x_u y_u + p_2\left(r^2 + 2 x_u^2\right) \\
y_d &= y_u\left(1 + k_1 r^2 + k_2 r^4 + k_3 r^6\right) + p_1\left(r^2 + 2 y_u^2\right) + 2 p_2 x_u y_u
\end{aligned}
\tag{1}
$$
where $(x_u, y_u)$ is the original pixel coordinate and $(x_d, y_d)$ is the corrected pixel coordinate. $\{k_1, k_2, k_3\}$ are the parameters of the radial distortions, $\{p_1, p_2\}$ are the parameters of the tangential distortions, and $r$ is the radius of the pixel, with $r^2 = x_u^2 + y_u^2$.
This work adopts the calibration method proposed by Zhang [21], which has been shown to offer high calibration accuracy, good robustness, a concise calibration procedure, and low hardware requirements. The method assumes that a black-and-white lattice plate lies on the plane of the world coordinate system, and the initial parameter values of the camera are obtained through the linear imaging model. Then the objective function of nonlinear distortion is calculated by using a nonlinear imaging model. Based on the nonlinear optimization algorithm, the optimal solution of the camera parameters can be obtained.
To further improve the accuracy of calibration, in particular to reduce the calibration error caused by bending of the calibration plate itself and by the coordinate error of the feature points, the method uses an LCD to display the calibration template, which maintains the high geometric precision and flatness of the template plane [20,21].
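As an illustration of Equation (1), the following minimal sketch evaluates the distortion model and inverts it numerically. It assumes that the coordinates are measured relative to the principal point in normalized units; the function names and the fixed-point inversion are illustrative choices rather than part of the calibration procedure described here.

```python
def apply_distortion(x_u, y_u, k1, k2, k3, p1, p2):
    """Evaluate Equation (1): map (x_u, y_u) to (x_d, y_d)."""
    r2 = x_u * x_u + y_u * y_u
    radial = 1.0 + k1 * r2 + k2 * r2 ** 2 + k3 * r2 ** 3
    x_d = x_u * radial + 2.0 * p1 * x_u * y_u + p2 * (r2 + 2.0 * x_u * x_u)
    y_d = y_u * radial + p1 * (r2 + 2.0 * y_u * y_u) + 2.0 * p2 * x_u * y_u
    return x_d, y_d


def remove_distortion(x_d, y_d, k1, k2, k3, p1, p2, iterations=20):
    """Invert Equation (1) by fixed-point iteration.

    Assumption: the distortion is small enough for the iteration to
    converge, which is typical for calibrated smartphone lenses.
    """
    x_u, y_u = x_d, y_d  # start from the distorted point
    for _ in range(iterations):
        r2 = x_u * x_u + y_u * y_u
        radial = 1.0 + k1 * r2 + k2 * r2 ** 2 + k3 * r2 ** 3
        dx = 2.0 * p1 * x_u * y_u + p2 * (r2 + 2.0 * x_u * x_u)
        dy = p1 * (r2 + 2.0 * y_u * y_u) + 2.0 * p2 * x_u * y_u
        x_u = (x_d - dx) / radial
        y_u = (y_d - dy) / radial
    return x_u, y_u
```

The coefficients passed to these functions would come from the calibration step; any values used to try the sketch are placeholders, not the parameters reported in Tables 1 and 2.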

Determination of the Pixel Coordinates of the Door Corners
To obtain the pixel coordinates of the door corners, the method first uses the Harris corner detection method [22,23], and then applies the SUSAN corner detection method [24] to remove redundant corner points and improve the accuracy of detection. The pixel coordinates of the door corners are then calculated by averaging the pixel coordinates of the corner points in a certain window.
In Harris corner detection, we consider a round window $N_0$ with center $(x_0, y_0)$ and radius $r_1$. The grayscale variation can be expressed as:
$$
E(\Delta x, \Delta y) = \sum_{(x,y) \in N_0} \omega(x, y)\left[I(x + \Delta x, y + \Delta y) - I(x, y)\right]^2 \tag{2}
$$
where $(\Delta x, \Delta y)$ is the unit pixel shift, the points $(x + \Delta x, y + \Delta y)$ belong to the round window $N_0$, and $\omega(x, y)$ represents a Gaussian kernel function in which $\sigma = 1$. By expanding Equation (2) with the Taylor polynomial, we obtain:
$$
E(\Delta x, \Delta y) = \sum_{(x,y) \in N_0} \omega(x, y)\left[I_x \Delta x + I_y \Delta y + o\left(\Delta x^2 + \Delta y^2\right)\right]^2 \tag{3}
$$
where $I_x$ and $I_y$ are the partial derivatives of $I$. Since $o(\Delta x^2 + \Delta y^2)$ in Equation (3) is negligible:
$$
E(\Delta x, \Delta y) \approx \begin{bmatrix} \Delta x & \Delta y \end{bmatrix} M \begin{bmatrix} \Delta x \\ \Delta y \end{bmatrix} \tag{4}
$$
with $M = \sum_{(x,y) \in N_0} \omega(x, y)\, \nabla I(x, y) \cdot \nabla I(x, y)^T$. Because $M$ is a symmetric positive semi-definite matrix, it can be diagonalized, and Equation (4) becomes:
$$
E(\Delta x, \Delta y) \approx \begin{bmatrix} \Delta x & \Delta y \end{bmatrix} Q \begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{bmatrix} Q^T \begin{bmatrix} \Delta x \\ \Delta y \end{bmatrix} \tag{5}
$$
where $\{\lambda_1, \lambda_2\}$ are the two eigenvalues of $M$ and $Q$ is the orthogonal matrix of its eigenvectors. The corner response function $f_R$ is defined as:
$$
f_R = \det(M) - k\left(\operatorname{tr} M\right)^2 = \lambda_1 \lambda_2 - k\left(\lambda_1 + \lambda_2\right)^2 \tag{6}
$$
Thus, the corner points can be detected according to the two eigenvalues of $M$ [22]. In this paper, $k = 0.05$ is chosen, and if $f_R(x, y) > 0$, the point is regarded as a corner. However, some of the detected corner points are redundant or erroneous. Thus, the method further uses the SUSAN corner detection method to eliminate the redundancy and obtain more accurate corner points.
The SUSAN corner detection is described as follows. Firstly, we compare the grayscale of each pixel in the template area with that of the template nucleus to determine whether the pixel belongs to the USAN area, and the rule is:
$$
c(r, r_0) = \begin{cases} 1, & \left|I(r) - I(r_0)\right| \le t \\ 0, & \left|I(r) - I(r_0)\right| > t \end{cases} \tag{7}
$$
where $I(r_0)$ is the gray value at the central point $r_0$, $I(r)$ is the gray value of a point $r$ inside the template, and $c(r, r_0)$ represents the result of comparing the gray values of the pixels $r$ and $r_0$. In this work, the threshold is set as $t = 50$ and the number of pixels in a template is set as 37. Secondly, we calculate the number of pixels whose gray values are close to that of the center of the template:
$$
n(r_0) = \sum_{r} c(r, r_0) \tag{8}
$$
Lastly, the point response function is used to eliminate the edges and internally redundant points. The threshold $g$ is set to half of the number of pixels, i.e., $g = 16$:
$$
R(r_0) = \begin{cases} g - n(r_0), & n(r_0) < g \\ 0, & n(r_0) \ge g \end{cases} \tag{9}
$$
After the corner detection, a round window with a radius of three pixels is used to calculate the average coordinate of the corner points as the pixel coordinates of each of the four door corners:
$$
u_i = \frac{1}{N} \sum x, \qquad v_i = \frac{1}{N} \sum y \tag{10}
$$
where $(x, y)$ is the corner point pixel coordinate inside the window, $(u_i, v_i)$ is the door corner point coordinate, and $N$ is the number of corners.
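The corner filtering described in this section can be sketched with a few small helpers. This is a minimal illustration rather than the authors' implementation: it assumes a grayscale image stored as a list of rows, a 2 × 2 matrix $M$ that has already been accumulated, and a 37-pixel circular template built from integer offsets; all function names are hypothetical.

```python
def harris_response(m, k=0.05):
    """Corner response f_R = det(M) - k * trace(M)^2 for a 2x2 matrix M."""
    (a, b), (c, d) = m
    return (a * d - b * c) - k * (a + d) ** 2


# Circular SUSAN template: rows of 3, 5, 7, 7, 7, 5, 3 pixels (37 in total).
SUSAN_TEMPLATE = [(dy, dx) for dy in range(-3, 4) for dx in range(-3, 4)
                  if dy * dy + dx * dx <= 10]


def usan_count(image, nucleus, t=50):
    """Count template pixels whose gray value is within t of the nucleus."""
    y0, x0 = nucleus
    center = image[y0][x0]
    count = 0
    for dy, dx in SUSAN_TEMPLATE:
        y, x = y0 + dy, x0 + dx
        # Pixels falling outside the image are simply skipped.
        if 0 <= y < len(image) and 0 <= x < len(image[0]):
            if abs(image[y][x] - center) <= t:
                count += 1
    return count


def susan_response(n, g=16):
    """Point response: positive only when the USAN area is smaller than g."""
    return g - n if n < g else 0


def average_corner(points, center, radius=3.0):
    """Average the candidate corners lying within `radius` of `center`."""
    cx, cy = center
    inside = [(x, y) for x, y in points
              if (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2]
    if not inside:
        return None
    n = len(inside)
    return (sum(x for x, _ in inside) / n, sum(y for _, y in inside) / n)
```

A candidate would be kept when both `harris_response` and `susan_response` are positive, after which `average_corner` merges nearby detections into one door corner.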

Determination of the Exterior Orientation Elements
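According to the transformation relationship between the camera coordinate system and the object coordinate system, Equation (11) can be obtained:
$$
\begin{bmatrix} X_c \\ Y_c \\ Z_c \end{bmatrix} = R \begin{bmatrix} X_w \\ Y_w \\ Z_w \end{bmatrix} + T \tag{11}
$$
where $(X_c, Y_c, Z_c)$ are the coordinates in the camera coordinate system, $R$ is the camera angle rotation matrix, $T$ is the camera translation vector, and $(X_w, Y_w, Z_w)$ are the coordinates of the homonymous points in the object space coordinate system:
$$
Z_c \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = \begin{bmatrix} f_x & 0 & u_0 \\ 0 & f_y & v_0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} X_c \\ Y_c \\ Z_c \end{bmatrix} \tag{12}
$$
Equation (12) shows the transformation relationship between the pixel coordinate system and the camera coordinate system, where $(u, v)$ are the corrected pixel coordinates of the four corners of the doorframe, $f_x$ and $f_y$ are the focal lengths in the $x$ and $y$ directions, and $u_0$ and $v_0$ are the coordinates of the principal point of the photograph in the pixel coordinate system. The transformation relation between the pixel coordinate system and the object coordinate system is:
$$
Z_c \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = \begin{bmatrix} f_x & 0 & u_0 \\ 0 & f_y & v_0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} R & T \end{bmatrix} \begin{bmatrix} X_w \\ Y_w \\ Z_w \\ 1 \end{bmatrix} \tag{13}
$$
As shown in Figure 1, the door corner points, the corresponding pixel points, and the principal point of the photograph are collinear. Thus, we transform Equation (13) to Equation (14):
$$
\begin{aligned}
u - u_0 &= f_x \frac{a_1 X_w + a_2 Y_w + a_3 Z_w + X}{c_1 X_w + c_2 Y_w + c_3 Z_w + Z} \\
v - v_0 &= f_y \frac{b_1 X_w + b_2 Y_w + b_3 Z_w + Y}{c_1 X_w + c_2 Y_w + c_3 Z_w + Z}
\end{aligned}
\tag{14}
$$
where $[a_1, a_2, a_3; b_1, b_2, b_3; c_1, c_2, c_3]$ are the elements of $R$ and $[X, Y, Z]$ are the elements of $T$. The basic principle of all the recovery algorithms for the rigorous imaging model is the linearization of the collinearity Equation (14). We adopt the classical rational polynomial model to restore the rigorous imaging model, i.e.:
$$
V = A\,\delta - L, \qquad \delta = \begin{bmatrix} \Delta X & \Delta Y & \Delta Z & \Delta\alpha & \Delta\omega & \Delta\kappa \end{bmatrix}^T \tag{15}
$$
where $A$ contains the partial derivatives of Equation (14) with respect to the six exterior orientation elements and $L$ contains the differences between the observed and computed pixel coordinates. The least squares solution of the above parameters can be obtained:
$$
\delta = \left(A^T P A\right)^{-1} A^T P L \tag{16}
$$
where $P$ is the weight of the observations. Since the control points have the same accuracy, $P$ is the unit matrix. In this paper, considering that the door is in the center of the picture, the starting value of $T$ is the mean of the four corner coordinates (a quarter of their sum) and the starting value of $R$ is given by $\alpha = 90°$, $\omega = 0°$, and $\kappa = 45°$.
Finally, if $\Delta X < 1 \times 10^{-3}$, the iteration is stopped and the exterior orientation elements are calculated as:
$$
\begin{aligned}
X &= X_0 + \sum_n \Delta X_n, & Y &= Y_0 + \sum_n \Delta Y_n, & Z &= Z_0 + \sum_n \Delta Z_n, \\
\alpha &= \alpha_0 + \sum_n \Delta\alpha_n, & \omega &= \omega_0 + \sum_n \Delta\omega_n, & \kappa &= \kappa_0 + \sum_n \Delta\kappa_n
\end{aligned}
\tag{17}
$$
where $\{X, Y, Z, \alpha, \omega, \kappa\}$ are the final results, $\{X_0, Y_0, Z_0, \alpha_0, \omega_0, \kappa_0\}$ are the starting values, and $\{\Delta X_n, \Delta Y_n, \Delta Z_n, \Delta\alpha_n, \Delta\omega_n, \Delta\kappa_n\}$ are the corrections in the $n$th iteration.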
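The forward model that the iteration linearizes can be sketched as follows: an object-space point is rotated and translated into the camera frame and then projected to pixel coordinates. This is a minimal sketch under stated assumptions. The order in which the three angles are composed into $R$ is not given in the text, so the convention $R = R_y(\alpha) R_x(\omega) R_z(\kappa)$ below is hypothetical, as are the function names.

```python
import math


def rot_x(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]


def rot_y(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]


def rot_z(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def rotation_matrix(alpha, omega, kappa):
    """Assumed convention: R = R_y(alpha) * R_x(omega) * R_z(kappa), radians."""
    return matmul(rot_y(alpha), matmul(rot_x(omega), rot_z(kappa)))


def project(point_w, rotation, translation, fx, fy, u0, v0):
    """Map an object-space point to pixel coordinates.

    Camera coordinates are R * X_w + T; the pixel coordinates follow from
    dividing by the depth Z_c and applying the focal lengths and principal
    point.
    """
    xc, yc, zc = (
        sum(rotation[i][j] * point_w[j] for j in range(3)) + translation[i]
        for i in range(3)
    )
    if zc <= 0.0:
        raise ValueError("point is behind the camera")
    return u0 + fx * xc / zc, v0 + fy * yc / zc
```

In an iterative solution, `project` would be evaluated at the four door corners for the current estimate of the six elements, and the differences to the detected pixel coordinates would form the observation vector of the least squares step.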

Computation of the Smartphone Camera Position in the Doorframe Coordinate System
After obtaining the optimal solution of the six exterior orientation elements, Equation (18) can be used to calculate the object space coordinates of the main point, which gives the relative position between the smartphone and the target at the moment the photograph is taken:
$$
\begin{bmatrix} X_s \\ Y_s \\ Z_s \end{bmatrix} = R^{-1}\left(\begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} - T\right) \tag{18}
$$
where the camera position in the camera coordinate system is $(0, 0, 0)$, and $(X_s, Y_s, Z_s)$ is the camera position in the object space coordinate system.
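Because a rotation matrix is orthogonal, its inverse equals its transpose, so the camera position reduces to $-R^T T$. The sketch below assumes that camera coordinates are obtained as $R X_w + T$, with $R$ stored as a nested list; the function name is hypothetical.

```python
def camera_position(rotation, translation):
    """Return -R^T T, the camera centre in object space coordinates."""
    return tuple(
        -sum(rotation[j][i] * translation[j] for j in range(3))
        for i in range(3)
    )
```

For example, with a rotation of 90° about the third axis and a camera centre $C$, setting $T = -RC$ and calling `camera_position` recovers $C$.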

Results
In this work, the method is tested in three typical office areas with different smartphones. As shown in Figure 4, three scenarios are selected as the experimental examples. Our first experiments are carried out in a typical meeting room in an office area, which is shown in Figure 4a. The area of the field test is approximately 8.5 m by 15 m. The second scene is the library shown in Figure 4b.
We chose five different brands of smartphones for the field tests, with prices ranging from 1000 to 6000 CNY. As shown in Figure 5, they include the Xiaomi 5, Huawei P9, Samsung Note 5, Lenovo Tango, and iPhone 7P. These are among the most popular smartphones in the current market in China. In addition, we also compare their border positioning capabilities with those of the human brain.
It should be noted that although most smartphones are equipped with a digital zoom camera, in which the focal length is constant, different smartphones have different distortion parameters and different coverage areas. The black and white standard plate is projected in the center of the photograph when we calibrate the phone's camera lens. Therefore, during the field tests, the target should be projected as close to the center of the image area as possible to reduce the error of the distortion correction.

Camera Calibration
This part mainly focuses on evaluating the ability of smartphones to acquire relative position information, and its accuracy, in different experimental areas. Tables 1 and 2 show the internal parameters and the distortion parameters of the five cameras. From the results, the pixel error of each smartphone is less than 0.3 pixels during the calibration.

Relative Positioning Accuracy Based on the iPhone 7P
In this part, we chose the iPhone 7P for experiments in three different environments. Each region is set with five lines whose angles with the door are 30°, 60°, 90°, 120°, and 150°, and there are six test points per line. Due to the size of each scene, the intervals between the testing points differ. Figure 6 shows the error distribution in each area. In Figure 6, the red lines represent the position of the door. The solid black spots represent the errors at the testing points, where a larger black point corresponds to a larger position error. The tendency of the accuracy is then plotted from the errors at these discrete testing points. As shown in Figure 6, the color changing from blue to yellow means the accuracy becomes worse. Thus, the blue area represents the smallest relative position error. As the relative position error increases, the region's color becomes lighter. The yellow area represents the largest relative position error. The white areas of the three scenes are regions where the camera cannot capture a picture of the door.
Figure 6. (a) Scene one; (b) Scene two; (c) Scene three.
Figure 6 only shows the performance of the iPhone 7P in the three scenes. Next, we test four other smartphones to explore their tendencies.
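The layout of the test points can be sketched as below. The door is assumed to lie along the x-axis of a floor-plane coordinate system with its reference point at the origin; the first distance of 226.4 cm matches the nearest point mentioned for scene one, while the 100 cm spacing and the Euclidean error measure are assumptions made only for illustration.

```python
import math


def make_test_points(angles_deg=(30, 60, 90, 120, 150), points_per_line=6,
                     first_distance_cm=226.4, spacing_cm=100.0):
    """Return {angle: [(x, y), ...]} for test points along each line.

    Assumption: distances are measured from the door reference point,
    and the spacing between points is constant along a line.
    """
    layout = {}
    for angle in angles_deg:
        theta = math.radians(angle)
        layout[angle] = [
            ((first_distance_cm + i * spacing_cm) * math.cos(theta),
             (first_distance_cm + i * spacing_cm) * math.sin(theta))
            for i in range(points_per_line)
        ]
    return layout


def position_error(estimated, truth):
    """Euclidean distance between an estimated and a true 2D position."""
    return math.hypot(estimated[0] - truth[0], estimated[1] - truth[1])
```

With the default arguments this yields five lines of six points each, i.e., 30 test points per scene.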

Tests with Various Smartphones
In order to study the universality of the visual positioning method based on smartphones, here we use four other smartphones to test the method. We evaluated the method and the error tendency by the absolute value of the relative positioning accuracy in different areas and with different smartphones.
Figure 7 shows the tendency of the absolute accuracy of the testing points on three different straight lines in test scenario 1. It can be seen from the three pictures of Figure 7 that the greater the relative distance, the larger the relative position errors. As shown in Figure 7a, when the relative distance ranges from 226.4 cm to 726.4 cm, the accuracy becomes worse. When the relative distance is 226.4 cm, the error of the Samsung Note 5 is 10.0 cm; however, when the relative distance is 1226.4 cm, the error is 45.2 cm. The same tendency is shown by the other four smartphones.
Meanwhile, by comparing the testing points of different lines, it can also be found that the relative position error becomes worse when the angle between the lines and the door decreases. As shown in Figure 7a,b, when the Samsung is 226.4 cm from the door, the error at the 90° line is 10.0 cm and the error at the 60° line is 16.0 cm. When the Samsung is 626.4 cm from the door, the error at the 90° line is still smaller than that at the 60° line.
Table 3 shows the comparison of five different smartphones in three areas in terms of the mean value, the variance, and the maximum of the relative position error. From Table 3, according to the comparison of the three scenes, the average of all smartphones is the best in scene one and worst in scene three. However, the iPhone's worst average is 39.2 cm in scene two, which can be treated as an experimental error. The maximum in scene one is also smaller than that in scene three. In scene one, the maximum error is only 56 cm, while the maximum values in scenes two and three are 120.3 cm and 109.3 cm. It may be that scene one has a more suitable environment for testing.
In addition, Table 3 also shows that various smartphones have different results. The iPhone 7P has the best relative position accuracy among the smartphones. The average error of the iPhone 7P is 7.2 cm in scene one, whereas the worst result is obtained from the Samsung Note 5 in scene three, with an average error of 46.6 cm. These differences are caused by the different camera lenses of the smartphones, as well as by the different test environments.
There are many differences between the three scenes; in spite of this, the smartphones show good performance in this test. The average positioning error of every smartphone is below 50 cm in each scene. This indicates that the method can provide useful positioning accuracy on common smartphones.

Comparison between the Smartphones and the Brain
This paper mainly aims to simulate the function of the brain's border cells: the smartphone's image sensor allows the smartphone to obtain its position relative to the border of an object and thereby provide a location information service for people. For comparison, at each test point in scene three, 10 testers were asked to estimate their position relative to the border by themselves. Table 4 shows the average error, the maximum error, and the standard deviation of the 10 individuals at 30 points.
Table 4. Comparison of the human brain and the smartphone brain in scene three (error in centimeters).
In Table 4, ten young people were tested in the third scene. Although the estimation accuracy of tester 5 is good, the other testers have a weak perception of distance. The worst is tester 9, whose average estimation error is 89.8 cm. In addition, tester 6 is accurate when close to the border, but his distance perception is very poor at relatively large distances. Compared with Table 3, the average result obtained from the smartphones is better than that of the testers, and the maximum human estimation error ranges from 119.7 cm to 236.4 cm, which is larger than the error of the smartphones. Furthermore, the larger standard deviation shows that the testers' estimates are not stable. This comparison shows that the smartphones perform much better than the human testers.

Discussion
In this section, we highlight some of our experience with smartphone visual positioning and discuss the experimental results in more depth.

Accuracy Analysis
In this section, we discuss the error equation that arises when the classical rigorous imaging model is restored by the rational function model used in this paper. Additionally, we discuss how the absolute distance error changes with distance and angle.
The restoration of the rigorous imaging model by the rational function model is mainly a process of solving the accumulated error: At the beginning, we calibrated the phone camera using an LCD screen. Thus, in this equation, we take the corrections to the principal point coordinates of the photograph and to the focal length to be 0. The number of control points is four (n = 4). Because the photos are taken horizontally and at an angle to the door, we assume that α = 90°, ω = 0°, and k = 0, as shown in Figure 1. In addition, we assume that x₁ = −x₂ = x₃ = −x₄ = x, y₁ = −y₂ = y₃ = −y₄ = y,

Analysis of Applicability
Although the smartphones tested in different places have different relative position errors, their average accuracy is much higher than that of humans and thus meets user demand. The method not only acquires the position relative to the border but also provides reliable border information for indoor positioning based on the smartphone brain.
In the above test, the difference in accuracy between scenes is mainly caused by environmental factors. First, the quality of the target images is affected by the surrounding environment. As shown in Figure 4a, the doorframe is flush with the wall, and the line between the doorframe and the wall is distinct. In Figure 4b,c, however, the doorframe protrudes from the wall and there are varying degrees of color confusion. The complexity of the environment leads to larger pixel errors. Thus, smartphones working in different environments will show some fluctuation in precision. Second, the doorframe size varies between environments, which affects how much of the photograph the doorframe occupies. Because lens distortion varies across the image, the distortion correction error depends on the doorframe size and may lead to positioning errors.
The differences between smartphones in the same place are mainly caused by differences in the camera lenses. First, the smartphones have different viewing angles. The iPhone 7P and XIAOMI 5 can capture the whole door in a single picture from some places very close to the door, while the others cannot. Second, the lenses have different distortion parameters, so the distortion correction accuracy differs slightly between them. Additionally, the focusing algorithms of the five smartphones are different, which leads to further differences in the distortion correction.
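To make the link between pixel errors, doorframe size, lens parameters, and positioning error more concrete, the sketch below uses a simplified pinhole approximation. It is not the rational function model or the error equation used in this paper. It assumes that a doorframe of width W seen from distance D at angle θ to the door plane appears about w = f·W·sin θ / D pixels wide, so a pixel error δw produces a distance error of about D² / (f·W·sin θ) · δw. The focal length in pixels and the door width are hypothetical values; the two distances are the test distances quoted in the results.

```python
import math


def distance_error_cm(distance_cm, angle_deg, width_cm, focal_px, pixel_err_px=1.0):
    """First-order distance error caused by a pixel error in the imaged doorframe width.

    Assumes w = f * W * sin(angle) / D, hence |dD| = D**2 / (f * W * sin(angle)) * |dw|.
    """
    scale = focal_px * width_cm * math.sin(math.radians(angle_deg))
    return distance_cm ** 2 / scale * pixel_err_px


# Hypothetical camera and door parameters (assumptions, not values from the paper).
FOCAL_PX = 3000.0
DOOR_WIDTH_CM = 90.0

for d in (226.4, 626.4):
    for a in (90, 60):
        err = distance_error_cm(d, a, DOOR_WIDTH_CM, FOCAL_PX)
        print(f"D = {d} cm, angle = {a} deg -> about {err:.2f} cm per pixel of error")
```

Under these assumptions the error grows with the square of the distance and increases as the angle to the door decreases, which is qualitatively consistent with the angle trend reported for Figure 7. A smaller doorframe or a shorter focal length likewise amplifies the effect of the same pixel error.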

Comparison of the Smartphone with the Brain
In this paper, the relative location predicted by the smartphones is generally better than that estimated by the human brain. Although human beings are not very good at judging their position relative to a border, the brain's fusion positioning system is still worth learning from. As smartphone performance improves, smartphone sensors become more abundant. Thus, the smartphone's perception of environmental information is bound to surpass human capabilities. Perhaps we can simulate the brain's GPS system to make full use of the environmental information perceived by the phone. In this paper, we simulated the function of border cells, and the result can be used in a smartphone indoor positioning system. In the future, we will simulate the brain's positioning system as a whole, which may yield better results.

Conclusions
We have presented a visual positioning method based on a doorframe. The method reproduces the function of border cells, which obtain the position relative to a border. We experimented with multiple phones in different environments, and the results show that the method is generally applicable. Moreover, compared with the border perception ability of the human brain, the method is more accurate and can therefore support indoor location perception services for people.