Non-Contact Measurement and Identification Method of Large Flexible Space Structures in Low Characteristic Scenes

The end-operation accuracy of the satellite-borne robotic arm is closely related to the satellite attitude control accuracy, and the influence of the vibration of the satellite’s flexural structure on the satellite attitude control is not negligible. Therefore, a stable and reliable vibration frequency identification method of the satellite flexural structure is needed. Different from the traditional non-contact measurement and identification methods of large flexible space structures based on marker points or edge corner points, the condition of non-marker points relying on texture features can identify more feature points, but there are problems such as low recognition and poor matching of features. Given this, the concept of ‘the comprehensive matching parameter’ of scenes is proposed to describe the scene characteristics of non-contact optical measurement from the two dimensions of recognition and matching. The basic connotation and evaluation index of the concept are also given in the paper. Guided by this theory, the recognition accuracy and matching uniqueness of features can be improved by means of equivalent spatial transformation and novel relative position relationship descriptor. The above problems in non-contact measurement technology can be solved only through algorithm improvement without adding hardware devices. On this basis, the Eigensystem Realization Algorithm (ERA) method is used to obtain the modal parameters of the large flexible space structure. Finally, the effectiveness and superiority of the proposed method are verified by mathematical simulation and ground testing.


Introduction
With the development of space technology, spacecraft and their accessory parts-such as solar wings, truss brackets, antennas, and robotic arms-are increasingly presented in the form of large flexible structures [1]. To reduce the spacecraft's load, the flexible parts are mostly made of lightweight and ultra-thin composite materials. This large deflection flexible material has the characteristics of low stiffness, low-frequency modal density, and relatively small modal damping ratio. When the satellite is on orbit, flexible attachments of the satellite are susceptible to external intensity, causing continuous vibration, which is difficult to attenuate. This will lead to changes in the overall dynamics model of the satellite, thus causing a great impact on the satellite attitude control [2], which will lead to the degradation and failure of the satellite performance [3]. Finally, the vibration will indirectly affect the accuracy of the end-effector of the on-board robotic arm. Therefore, many countries are studying the on-orbit identification of dynamic characteristics of flexible structures attached to spacecraft [4]. Due to the difference between the space environment and the ground environment, the vibration characteristics of the flexible structure are also different. The simulation often cannot effectively simulate the space environment, nor can it predict accidental external excitation. The simulation results cannot reflect the on-orbit state of the spacecraft. Therefore, the on-orbit identification of the vibration information of the satellite flexible appendage is urgently needed.
The existing on-orbit modal identification methods are mainly divided into two categories. One is based on the existing measurement devices or control system data on the star body to complete the measurement. For example, Tang et al. [1] proposed a method for on-orbit modal identification of flexible structures of spacecraft based on reaction wheel actuators. The other is to observe the vibration of large flexible structures by adding special measuring devices. This method is more accurate and reliable and can be divided into contact and non-contact [5]. In Japan, several experiments were conducted on the Engineering Test Satellite-6 and Engineering Test Satellite-8. The modal parameters of the solar wing are identified based on the data from the satellite's own attitude control system and the data measured by the acceleration sensors attached to the flexible solar wing at the same time [6,7]. This contact measurement method attaches the measurement device to the surface of the flexible structure, which affects the dynamics information of the flexible attachment itself and leads to errors in the identification information. At present, the better solution is non-contact measurement. Non-contact optical measurement based on binocular cameras is an effective technical approach, which can be subdivided into optical measurements with and without marking points. Some international research institutions have explored the measurement method of unmarked points. The Image Science and Analysis Group (ISAG) performed modal identification of the ISS 2A solar wing by binocular camera [8], but only two feature points at the end of the 2A solar wing could be identified. NASA obtained more dynamic information about feature points in the PASDE experiment with an optical measurement scheme, but six cameras were used for this purpose [9], which increased the complexity of the hardware system and algorithm. Relevant research has also been carried out in China, but the main examples of research are focused on the method with marker points. Wu et al. [10] proposed a visual measurement method based on circular marker points. Zang et al. [11] also used the method of attaching circular reflective markers on the solar wing for modal identification. Qiu et al. [5] used self-luminous light sources as measurement points, and then used a binocular camera for parameter identification. All these on-orbit measurement schemes add other physical devices, such as passive reflective marker points or active light sources, to the binocular vision system. The introduction of physical devices increases the complexity of on-orbit measurement and reduces the stability of the measurement system. Therefore, studies regarding a modal measurement identification method with rich identifiable features and good stability without marker points are urgently needed.
However, there are certain technical difficulties in this kind of measurement method without marking points. If the corner points of the edge of the flexible sails are used as identification points, there are fewer identifiable features, which is not conducive to improving the identification accuracy. If the texture features-such as the grid corner points-of the solar wing are identified, different feature points are highly similar to each other. When the optical camera identifies the feature points and matches the feature points, the similar descriptors will lead to a higher mismatch rate, so that the spatial threedimensional coordinates of the feature points cannot be accurately measured. Moreover, in the case of no identification points, the feature points extracted by the optical camera are often different between frames. Therefore, it is more difficult to obtain the continuous dynamic coordinate values of the feature points at fixed positions on the flexible solar wing compared to the case with identification points. In addition, the measurement accuracy of an optical camera is usually lower than that of traditional contact measurements, so the improvement of measurement accuracy is also an issue to be considered.
To solve the above problems, this paper first introduces the concept of the comprehensive matching parameter of scenes and then proposes a new feature point detection and matching method for the grid features on satellite flexible solar wings. This method improves the characteristic of the scene and the uniqueness of feature matching through mathematical methods such as projection transform and image processing methods such as clustering and binaryzation, as well as proposing new descriptors to solve the problems that features of the solar wing are highly repetitive and not easy to match. Finally, under the condition of no additional physical target, the spatial 3D coordinates of the grid feature points on the satellite flexible solar wings are continuously tracked by the binocular vision system and used for modal recognition. The on-orbit vibration recognition simulation test is completed for the low feature degree scene of the satellite flexible attachment, and the accuracy of the test results is relatively high, which verifies the feasibility of the theory in this paper.

Structural Dynamics Modeling Analysis of Flexible Solar Wing
In this paper, we construct a multi-body dynamics model [12] of the whole star based on the coupled satellite model of the central rigid body and flexible attachment shown in Figure 1, in which the central rigid body is C, the origin of the coordinate system, O C is located at the center of mass of the rigid body, and the two flexible solar wings are Flex 1 and Flex 2 , respectively. The two flexible solar wings are fixed on the side near the satellite body and the ends are free. According to the hypothetical modal method, the elastic displacement of any point on the flexible solar wing is shown in Equation (1): where φ(r) is 3 × ∞ order modal function matrix and η(t) is ∞ × 1 order modal coordinate vector. According to the Lagrange equation, the dynamics equation of the system can be established as follows: m efficient matrices, determined by the parameters of the system itself, reflecting the characteristics of the system. In this section, the dynamics model of the rigid-flexible coupled spacecraft is established, and the state-space equation of the system is obtained. This dynamic model can provide the output data of the system. We design the measurement scheme based on the dynamic model in this section to obtain the observed data for the identification of the vibration modal information of the flexible attachment.  Figure 2 is the on-orbit measurement simulation test scheme used in this paper. The flexible solar wing coordinate system and camera coordinate system are shown in the figure. The test device simulates the satellite body and the unilateral solar wing. The flexible solar wing mainly vibrates freely along the B Z direction of its coordinate system. In the process of camera selection, the camera frame rate is considered to be 5 to 10 times of the sampling frequency. The camera resolution needs to ensure that the measurement accuracy reaches the submillimeter level. The global shutter camera is used, and the external signal generator controls the camera to trigger synchronously by outputting a fixed frequency square wave signal. Appropriate cameras are arranged as far as possible on one edge of the satellite center rigid body to increase the baseline length. The longer baseline length is beneficial to improve the measurement accuracy of the binocular system. The angle between the axes of the two cameras and the baseline is acute to increase the overlapping range of the two cameras' field of view. Considering the camera image quality, the camera focal length is controlled in the middle of the flexible solar wing, thus ensuring the clarity of the acquired images. After the cameras take pictures, the image data are transmitted to the processor for post-processing through the network interface, and the whole system is powered by a DC power supply. x s +D s x s (5) where

On Orbit Measurement Scheme Design and Test Process
, C s and D s are the observation matrices of velocity and displacement, respectively. A s and B s are constant coefficient matrices, determined by the parameters of the system itself, reflecting the characteristics of the system. In this section, the dynamics model of the rigid-flexible coupled spacecraft is established, and the state-space equation of the system is obtained. This dynamic model can provide the output data of the system. We design the measurement scheme based on the dynamic model in this section to obtain the observed data for the identification of the vibration modal information of the flexible attachment. Figure 2 is the on-orbit measurement simulation test scheme used in this paper. The flexible solar wing coordinate system and camera coordinate system are shown in the figure. The test device simulates the satellite body and the unilateral solar wing. The flexible solar wing mainly vibrates freely along the Z B direction of its coordinate system. In the process of camera selection, the camera frame rate is considered to be 5 to 10 times of the sampling frequency. The camera resolution needs to ensure that the measurement accuracy reaches the submillimeter level. The global shutter camera is used, and the external signal generator controls the camera to trigger synchronously by outputting a fixed frequency square wave signal. Appropriate cameras are arranged as far as possible on one edge of the satellite center rigid body to increase the baseline length. The longer baseline length is beneficial to improve the measurement accuracy of the binocular system. The angle between the axes of the two cameras and the baseline is acute to increase the overlapping range of the two cameras' field of view. Considering the camera image quality, the camera focal length is controlled in the middle of the flexible solar wing, thus ensuring the clarity of the acquired images. After the cameras take pictures, the image data are transmitted to the processor for post-processing through the network interface, and the whole system is powered by a DC power supply.

On Orbit Measurement Scheme Design and Test Process
After setting up the hardware platform, the experiment is carried out according to the process shown in Figure 3. After the camera is installed, the calibration plate information is first collected to calibrate the camera, and the internal and external parameters of the camera are obtained. The calibrated camera is used to collect the vibration information of the flexible solar wing. The image pixel grayscale data are used as the input information of the feature enhancement and feature extraction module to detect the texture boundary of the solar wing. The feature points are identified, and the pixel coordinates of the feature points are used as output. This process is mainly completed under the guidance of the theory in Section 4. Then, pixel coordinates of the feature point are used as the input of the feature point spatial 3D information extraction module to calculate the spatial 3D position and velocity information of the target features. Finally, the three-axis velocity of the target feature point is input into the modal identification algorithm in Section 5  After setting up the hardware platform, the experiment is carried out according to the process shown in Figure 3. After the camera is installed, the calibration plate information is first collected to calibrate the camera, and the internal and external parameters of the camera are obtained. The calibrated camera is used to collect the vibration information of the flexible solar wing. The image pixel grayscale data are used as the input information of the feature enhancement and feature extraction module to detect the texture boundary of the solar wing. The feature points are identified, and the pixel coordinates of the feature points are used as output. This process is mainly completed under the guidance of the theory in Section 4. Then, pixel coordinates of the feature point are used as the input of the feature point spatial 3D information extraction module to calculate the spatial 3D position and velocity information of the target features. Finally, the three-axis velocity of the target feature point is input into the modal identification algorithm in Section 5 for modal analysis. The information of modal order, fundamental frequency, and modal coordinates are output. The related test equipment parameters, assembly relationships, and test results are presented in Section 6.

The Concept and Connotation of Comprehensive Matching Parameter
In the experimental process shown in Figure 2, feature point extraction and matching are the key steps. In the existing optical measurement schemes with marker points, the presence of marker points directly provides identifiable features at fixed locations on the solar wing. It is only necessary to find the marker points in each image frame. The significant difference between marker points and background information greatly reduces the difficulty of feature extraction. Moreover, the marker points are clearly distributed. It is easy to find the correspondence of the marker points in the left and right eye images of the binocular camera, so the problems of feature point matching and 3D spatial information measurement are also easily solved. However, in the optical measurement problem without marking points studied in this paper, the camera does not have a clear identification target, and the corner points or speckle features of the solar wing grids are highly similar, so it is difficult to establish the correspondence between the same feature points in the left and right eye images, which is prone to mismatching. In addition, in the low-feature scene studied in this paper, the solar wing grid texture features are not easy to extract stably during its vibration process, and the features are highly repetitive. When dealing with such issues, many traditional algorithms can be used, such as DOG operator, SIFT operator, SURF operator, ORB operator [13][14][15], and so on. However, these methods ignore the enhancement of the scene features themselves and cannot improve the distinguishability of the scene features. Therefore, even if the binocular camera can correctly extract feature points, it is difficult to achieve the correct matching between the left and right eye images. Due to the lack of theoretical guidance, the existing unmarked point feature extraction and matching methods usually have no ideal practical effect. To solve this problem and complete the on-orbit measurement scheme, this paper proposes the following concepts. Definition 1. The comprehensive matching parameter of the scene is an index to judge whether the features of the scene itself are easy to identify and distinguish, and whether the feature point matching is robust and correct. This index is expressed by the letter 'CMP'. It contains two contents: the characteristic of the scene and the uniqueness of the matching.

Definition 2.
The characteristic of the scene refers to the property that feature points in the scene can be easily identified continuously and stably, and feature points have clear distinguishable indicators from each other. It is expressed by index 'char'. Definition 3. The uniqueness of matching means that the matching process of binocular camera feature points can form a one-to-one mapping relationship, i.e., it is always possible to find a unique and correct corresponding feature point in the right camera image for the feature point in the left camera image. It is expressed by index 'MU'.

Definition 4.
The relative position relation descriptor is a descriptor that describes the positionrelated features of feature points, which is used to distinguish repetitive features in low-feature scenes.
This chapter will introduce the relevant content and usage of these definitions in detail. The Definitions 1-3 are used to solve the problem of optical measurement without marker points. Definitions 3 and 4 are used to solve the problem of how to extract the dynamic coordinates of feature points in time domain in different images without marker points.

Stable Extraction of Repetitive Features
As shown in Figure 4, the existing feature point extraction algorithms cannot guarantee that the feature points detected in one frame of the camera can still be detected in the next frame due to factors such as viewing angle, illumination, and resolution. However, the Sensors 2023, 23, 1878 7 of 21 method in this paper requires stable tracking of fixed feature points in each frame to ensure stable extraction and stable matching. In this way, we can obtain continuous feature point position coordinate data and use them as input information for modal recognition.  To achieve stable feature extraction, the scene images are first clustered using the Kmean algorithm. The criterion function of the algorithm when there are K pattern classes is as follows: where j S is the jth aggregation class and the clustering center is j Z . j N is the number of samples contained in the jth aggregation class j S . According to the clustering algorithm criterion of the K-means algorithm, the clustering centers should be chosen such that the value of the criterion function J is extremely small, i.e., Equation (8) can be solved as follows: For the satellite on-orbit scene, the K value of the pattern class is taken as 3, and the scene image is binarized according to the clustering results to highlight the desired features. Then, the Sobel operators x s and y s are introduced in the x and y directions of the image pixel coordinates, respectively, from which the gradient amplitude and direction at each pixel point of the image are calculated in turn as shown in Equation (10) and Equation (11), respectively.
When the gradient value is large, the corresponding pixel point is the boundary point on the image, so that the pixel points surrounding each grid of the flexible solar wing can be found. Pixel points  To achieve stable feature extraction, the scene images are first clustered using the K-mean algorithm. The criterion function of the algorithm when there are K pattern classes is as follows: where S j is the j-th aggregation class and the clustering center is Z j . N j is the number of samples contained in the j-th aggregation class S j . According to the clustering algorithm criterion of the K-means algorithm, the clustering centers should be chosen such that the value of the criterion function J is extremely small, i.e., Equation (8) can be solved as follows: For the satellite on-orbit scene, the K value of the pattern class is taken as 3, and the scene image is binarized according to the clustering results to highlight the desired features. Then, the Sobel operators s x and s y are introduced in the x and y directions of the image pixel coordinates, respectively, from which the gradient amplitude and direction at each pixel point of the image are calculated in turn as shown in Equation (10) and Equation (11), respectively.
When the gradient value is large, the corresponding pixel point is the boundary point on the image, so that the pixel points surrounding each grid of the flexible solar wing can be found. Pixel points x p (i), y p (j) of a grid boundary constitute a set B, and the number of all pixels in the set is size B . From this, the coordinates of the grid center feature points can be obtained as shown in Equation (12). Each grid center point can be found as a feature point. This method ensures the stability of feature point extraction and ensures that the found feature points can be evenly distributed on the surface of the solar wing, avoiding the aggregation of feature points.

Characteristic Enhancement
Scene feature points themselves have strong repeatability, that is, the 'appearances' of the feature points are similar, which is the inherent characteristic of the scene and cannot be changed. Therefore, this paper considers adding a unique 'label' to the extracted feature points from other angles as a distinction to enhance the scene characteristics. In the solar wing coordinate system shown in Figure 2, each detectable feature point in the solar wing plane is neatly arranged in the form of a "queue". If the solar wing coordinate system plane X B O B Y B is transformed to be parallel to the camera pixel plane, the pixel coordinates of each feature point will also be neatly arranged in the form of "queue" and easy to distinguish. This transformation relationship is expressed by Equation (13), where P N is the pixel coordinates after projection transformation, P P is the pixel coordinates of the detected feature points in the original image, R and T are the rotation and translation transformation matrices between the camera coordinate system and the solar wing coordinate system, respectively. The equivalent transformation is shown in Figure 5, in which the original tilted observation view is transformed to a forward view perpendicular to the solar wing plane X B O B Y B . In this process, the optical axis direction of the camera u = u 1 e 1 + u 2 e 2 + u 3 e 3 tends to coincide with the normal vector of the flexible solar wing plane v = v 1 e 1 + v 2 e 2 + v 3 e 3 , i.e., Equation (14).
points can be obtained as shown in Equation (12). Each grid center point can be found as a feature point. This method ensures the stability of feature point extraction and ensures that the found feature points can be evenly distributed on the surface of the solar wing, avoiding the aggregation of feature points.

Characteristic Enhancement
Scene feature points themselves have strong repeatability, that is, the 'appearances' of the feature points are similar, which is the inherent characteristic of the scene and cannot be changed. Therefore, this paper considers adding a unique 'label' to the extracted feature points from other angles as a distinction to enhance the scene characteristics. In the solar wing coordinate system shown in Figure 2, each detectable feature point in the solar wing plane is neatly arranged in the form of a "queue". If the solar wing coordinate system plane B B B X O Y is transformed to be parallel to the camera pixel plane, the pixel coordinates of each feature point will also be neatly arranged in the form of "queue" and easy to distinguish. This transformation relationship is expressed by Equation (13), where N P is the pixel coordinates after projection transformation, P P is the pixel coordinates of the detected feature points in the original image, R and T are the rotation and translation transformation matrices between the camera coordinate system and the solar wing coordinate system, respectively. The equivalent transformation is shown in Figure 5, in which the original tilted observation view is transformed to a forward view perpendicular to the solar wing plane B B B X O Y . In this process, the optical axis direction of the camera v (14).
Mathematical transformation  From Equation (14), it can be seen that this conversion process is only a change in the vector angle and does not change the nature of the feature points themselves. However, as shown in Figure 6, the figure illustrates the distribution of some feature points in one frame during the image acquisition. After characteristic enhancement, the feature points From Equation (14), it can be seen that this conversion process is only a change in the vector angle and does not change the nature of the feature points themselves. However, as shown in Figure 6, the figure illustrates the distribution of some feature points in one frame during the image acquisition. After characteristic enhancement, the feature points are neatly distributed in the pixel coordinate system with strong positional properties, and the feature point distribution tends to be more discrete, so that the originally chaotic and similar feature points are significantly distinguished. This means that the feature nature of the scene is enhanced. According to the coordinate attribute of the feature point, the scene Sensors 2023, 23, 1878 9 of 21 characteristic evaluation index char value is given based on the dispersion degree of the feature point distribution, as shown in Equation (15).
Sensors 2023, 23, x FOR PEER REVIEW 10 of 23 are neatly distributed in the pixel coordinate system with strong positional properties, and the feature point distribution tends to be more discrete, so that the originally chaotic and similar feature points are significantly distinguished. This means that the feature nature of the scene is enhanced. According to the coordinate attribute of the feature point, the scene characteristic evaluation index char value is given based on the dispersion degree of the feature point distribution, as shown in Equation (15) In Equation (16), the function nor is shown in Equation (17), which is a normalized function containing an inverse tangent function that maps variance values greater than 0 to the interval 0,1     . C is the size factor, which reflects the measurement target size and the acquisition image size, and is used to amplify the data difference. S X is the independent variable, i.e., the variance 2 x S or 2 y S , with expressions as in Equations (18) and (19).
After the above derivation, the comprehensive expression of char value can be obtained as shown in Equation (20). The char value is used as the evaluation index of the characteristic of the scene. The value of char describes the dispersion degree of feature points in the x and y directions of pixel coordinates. Its value usually ranges from 0 to 1. The larger the value is, the higher the dispersion of the feature points, and the more discrete and uniformly distributed feature points tend to be independent of each other whilst having better location differentiation. In Equation (15), S 2 x is the variance of pixel coordinates x of feature points, S 2 y is the variance of pixel coordinates y of feature points. The whole equation is the geometric mean of f di f S 2 x and f di f S 2 y . f di f is a normalization function considering the scene size and image size parameters. Since different image resolutions have different sizes, the variance size of the pixel coordinate distribution is not consistent. In order to unify the description, the variance value needs to be normalized, f di f as shown in Equation (16).
In Equation (16), the function nor is shown in Equation (17), which is a normalized function containing an inverse tangent function that maps variance values greater than 0 to the interval [0, 1]. C is the size factor, which reflects the measurement target size and the acquisition image size, and is used to amplify the data difference. X S is the independent variable, i.e., the variance S 2 x or S 2 y , with expressions as in Equations (18) and (19).
After the above derivation, the comprehensive expression of char value can be obtained as shown in Equation (20). The char value is used as the evaluation index of the characteristic of the scene. The value of char describes the dispersion degree of feature points in the x and y directions of pixel coordinates. Its value usually ranges from 0 to 1. The larger the value is, the higher the dispersion of the feature points, and the more discrete and uniformly distributed feature points tend to be independent of each other whilst having better location differentiation.
So far, without adding additional physical devices, the effective feature points are identified using binocular vision system and the pixel information of the feature points on the left and right eye images are obtained. Then, the location characteristics of these feature information in the scene are enhanced by mathematical methods. The existing feature point detection and matching algorithms do not work well in low-feature scenes precisely because they cannot form this one-to-one mapping relationship. Some people have improved the traditional method to improve the matching accuracy [16]. Some artificial intelligence algorithms build convolutional neural networks, improve descriptors and feature point processing methods, etc. [17][18][19][20][21], which achieves a good matching effect, but also cannot achieve continuous and stable tracking of certain feature points. Their detected feature points are still different in different frames so that the feature point position data cannot be used for the identification of flexible solar wing modalities. In order to solve this problem, this paper proposes the relative positional relationship descriptor in Definition 4, which is described in detail as follows.
First, the region of interest is found, and the distribution statistics of the feature points located in the region of interest are calculated according to the pixel coordinate x direction. Due to the presence of the boundary at the folded connections of the solar wing, the statistical graph of the characteristic pixel coordinate distribution appears as the distributed pixel interval shown in Figure 7. Let us assume that the left boundary of the compartment is b l and the right boundary is b r . When Equation (21) is satisfied, the interval is considered to be the boundary of the pixel coordinates, that is, the gap at the junction of the flexible solar wing. η boundary in Equation (21) is the threshold for determining the boundary. The boundary feature and the boundary coordinates reflect the position boundary characteristics of the pixel coordinates x direction of the feature points in the region of interest. The camera can completely capture the upper and lower boundaries of the solar wing in the y direction of the pixel coordinate, so the upper and lower boundaries of the solar wing in the image can be used as a marker to locate the feature points.
So far, without adding additional physical devices, the effective feature points are identified using binocular vision system and the pixel information of the feature points on the left and right eye images are obtained. Then, the location characteristics of these feature information in the scene are enhanced by mathematical methods.

Relative Position Relationship Descriptors
The existing feature point detection and matching algorithms do not work well in low-feature scenes precisely because they cannot form this one-to-one mapping relationship. Some people have improved the traditional method to improve the matching accuracy [16]. Some artificial intelligence algorithms build convolutional neural networks, improve descriptors and feature point processing methods, etc. [17][18][19][20][21], which achieves a good matching effect, but also cannot achieve continuous and stable tracking of certain feature points. Their detected feature points are still different in different frames so that the feature point position data cannot be used for the identification of flexible solar wing modalities. In order to solve this problem, this paper proposes the relative positional relationship descriptor in Definition 4, which is described in detail as follows.
First, the region of interest is found, and the distribution statistics of the feature points located in the region of interest are calculated according to the pixel coordinate x direction. Due to the presence of the boundary at the folded connections of the solar wing, the statistical graph of the characteristic pixel coordinate distribution appears as the distributed pixel interval shown in Figure 7. Let us assume that the left boundary of the compartment is l b and the right boundary is r b . When Equation (21) is satisfied, the interval is considered to be the boundary of the pixel coordinates, that is, the gap at the junction of the flexible solar wing. boundary η in Equation (21)  After finding the boundary with positioning function, the relative position relationship descriptor describing the position attribute of feature points is used to describe different feature points, and the feature points are matched. The descriptors of traditional feature point matching algorithms are usually associated with the gray gradient transform of the pixels around the feature points. As shown in Figure 8, the SIFT method, which is the most effective traditional algorithm [16], is used to compare with this method. The SIFT method uses a 4 × 4 × 8 vector of 128 dimensions to describe the magnitude and direction of the gray gradient of pixels within a fixed radius of the feature point. However, the similar "appearance" of the feature points in the solar wing scene means that the grayscale variations around the feature points are similar, i.e., the descriptors are similar, and the correct matching relationship cannot be found. In this paper, we describe the distance of each feature point to the x and y boundary using the relative position relationship descriptors based on the found boundary features, i.e., Param1-6 in Figure 8. Since the feature points tend to be neatly arranged according to the queue relation after the projection transformation, the relative position relation descriptors of different feature points have obvious differences. Suppose the pixel coordinate of a feature point is x p (i, j), y p (i, j) . The i and j in the coordinates describe this queue relationship. The feature point on the right of the current feature point can be described as x p (i + 1, j), y p (i + 1, j) . The feature point above the current feature point can be described as x p (i, j + 1), y p (i, j + 1) . Given the pixel coordinate x direction threshold η p 1 and y direction threshold η p 2 , there is only one feature point satisfying the pixel coordinate limit of Equation (22). At this point, the relative positional relationship descriptor RPD can be expressed in the form of Equation (23).

Evaluation of Matching Effect
To verify the feature point recognition and matching effect of this method, this paper's method is compared and analyzed with the SIFT method and SuperGlue algorithm, which is based on graph neural network [22,23]. Mikolajczyk et al. [18] used the correct rate and recall rate as the evaluation index of feature matching in their study. Here, the two are fused and the MU value is used as an evaluation metric for comparative analysis. MU is expressed as follows. In conjunction with the representation of the relative position relationship descriptor in Figure 8, it is clear that Param1 is not equal to Param3, so possible match point 1 can be excluded. When the distance between two feature points is less than a given threshold, the two feature points are considered to be corresponding. In this example, Param1 is approximately equal to Param5, Param2 is approximately equal to Param6, so that the correct matching is obtained.
Compared with traditional methods, the method in this paper can clearly distinguish different feature points with few parameters, which makes feature point matching efficient and accurate.

Evaluation of Matching Effect
To verify the feature point recognition and matching effect of this method, this paper's method is compared and analyzed with the SIFT method and SuperGlue algorithm, which is based on graph neural network [22,23]. Mikolajczyk et al. [18] used the correct rate and recall rate as the evaluation index of feature matching in their study. Here, the two are fused and the MU value is used as an evaluation metric for comparative analysis. MU is expressed as follows.
where, R is the recall rate of matching results, P is the correct rate of matching results, and b is the weight of correct rate in the evaluation index. The values of MU, P, and R range from 0 to 1. When the values of P and R are both 1, it means that all feature points are found and all of them are matched correctly, which is an ideal situation. It should be guaranteed that the location information of feature points on the solar wing could be collected accurately and continuously. We are more concerned about the correctness rate than the recall rate. To meet this requirement, the correctness rate of the matching result is required to be very high, so the value of P needs to be given a high weight. Select b = 20.
In order to solve the value of MU, we need to calculate the correct match rate P and the recall rate R, as in Equation (25).
According to Equation (28), the theoretical pixel coordinate value (u 2C, v 2C ) of the feature point identified in the left eye in the right eye can be calculated, and the actual value (u 2T, v 2T ) of the pixel coordinate of the relevant feature point can be read from the right eye image; given the threshold η as shown in Equation (29), when the threshold η is less than the set value, it means that the theoretical calculated value and the actual detected value match, that is, the correct feature point is matched, and then the evaluation index MU value can be found. Combined with the evaluation parameter char in Section 4.2, the comprehensive matching parameter (CMP) of the scene can be obtained by taking the summation average of the parameter MU and the parameter char from Equation (30). The value of CMP ranges from 0 to 1, and the closer the value is to 1, the stronger the characteristic of the scene and the better the matching of feature points. Ideally, the value of CMP can be taken to 1, which means that the characteristic of the scene is very high, the features are easy to extract and distinguish, and the matching is completely correct after extracting feature points.
So far, we have enhanced, identified, and extracted the scene features from two dimensions of the comprehensive matching parameter of the scene. We can extract the continuous pixel coordinate change data of certain feature points steadily under the premise of binocular vision system only. The continuous change data of the spatial 3D position of feature points can be obtained using these feature point pixel coordinates combined with camera parameters. The three-axis velocity of the spatial motion of the feature point in the camera coordinate system is obtained by differentiating the spatial three-dimensional coordinates in the time domain. This velocity information is used as the input parameter of the modal recognition algorithm.

Modal Identification Algorithms
In Section 4, a continuous sequence of spatial 3D coordinates of feature points in the time domain is obtained by a binocular vision system. The modal parameter information of the system can be obtained by feeding the data into the modal recognition algorithm. ERA and Stochastic Subspace Identification (SSI) are typical representatives of time-domain methods in modal parameter recognition algorithms, which have been successfully applied in specific models in the aerospace field. Based on them, a series of improved algorithms have been born, such as ERA based on data correlation improved by the random decrement (RDT+ERA/DC) technique, and the data-driven stochastic subspace identification (SSI-DATA) technique. The modal omission problem occurs when these two methods are used alone. However, we find that the operations of these two time-domain modal parameter identification methods are not completely independent, but are related. Therefore, in this paper, we consider the fusion of the two methods RDT+ERA/DC and SSI-DATA, aiming to solve their own modal omission problems. The fusion unified calculation flow is shown in Figure 9.

1.
The cross-correlation analysis of the vibration response output signal is carried out to obtain the time series of cross-correlation function enhanced by modal information.
The cross-correlation function R ijk (τ) of observation points i and j can be expressed in the form of Equation (31).
ϕ ir ϕ js a kr a ks a k −e λr τ λ r +λ s where a k denotes the constant associated with the excitation point k only, ϕ ir and ϕ js denote the r and s order mode oscillations of the observation points i and j, a kr and a ks denote the parameters associated with the excitation point k and the mode order r or s, respectively, b jr = 2n ∑ s=1 ϕ js a kr a ks a k −1 λ r +λ s , λ r and λ s are the main diagonal elements of the diagonal matrix. If we take f (t) as white noise, we obtain the following: where δ(·) denotes the pulse function. So far, based on analyzing the difference and connection between ERA and SSI system time domain identification methods, the correlation analysis technique is introduced, the generalized Hankel matrix is constructed, and the SVD decomposition and least squares method are used to solve the system state matrix and extract the modal parameter information. The false and real modes are distinguished according to MAC and MSV. Thus, the fusion of ERA and SSI time domain methods is realized. It effectively solves the problem of modal omission caused by RDT+ERA/DC and SSI-DATA algorithms because the modes with a small vibration response are eliminated as noise. In order to increase the signal-to-noise ratio of the cross-correlation function of the observed signal, the cross-power spectrum of the observed signal is calculated first, and then the cross-correlation function of the observed signal and its time series are obtained by inverse Fourier transform.

2.
The generalized Hankel matrix constructed from this time series is shown in Equation (33), and the H(0) and H(1) in the generalized Hankel matrix are extracted.
where h(t) is the system unit impulse response function at t time, α and β are arbitrary integers.

3.
Considering that the CVA weighted processing method is highly resistant to noise, the SVD decomposition of the matrix H(0) after weighting processing is performed to obtain the system observability matrix. The singular value decomposition of H(0) can be obtained as follows: where U is the (α + 1)l × 2n dimensional left singular matrix, V is the (β + 1)m × 2n dimensional right singular matrix, S is the 2n × 2n dimensional diagonal matrix. U, V, and S satisfy the requirements of Equation (35). 4.
The least squares method is used to extract the system matrix of the state space model. Introducing E T l = [I l 0 l · · · 0 l ], E T m = [I m 0 m · · · 0 m ] the system matrix of the system state space model is derived as follows:

5.
The information of the modal parameters is extracted from the system matrix. Based on the solutions of vibration theory and state equation, the i order modal frequency ω i , damping ratio ξ i , mode matrix Cψ and amplitude matrix of the system ψ −1 B are obtained based on the ERA method as follows: where s i = ln(z i )/∆t denotes the eigenvalues of the continuous-time system matrix A and ∆t denotes the sampling interval. Based on the SSI method, the i order modal frequency ω i , damping ratio ξ i , and the vibration matrix of the system Φ are obtained as follows: where s i = ln(z i )/∆t, ∆t denote the sampling time interval.

6.
The modal amplitude coherence coefficient (MAC) and modal singular value (MSV) criteria are used to distinguish the real and false modes.
Define the MAC as shown in Equation (39): where the row vector q H i represents the time series of the identified modal amplitude of the i order modal motion, and the value interval of γ i is [0, 1]. γ i → 1 when the i order modal is the true modal, and γ i → 0 when the i order modal is the false modal. MSV is a method for determining the degree of contribution of each order of the identified modal to the impulse response signal and is calculated as follows: where c i is the column vector of the system matrix C and b i is the row vector of the system matrix B. z i is the eigenvalue of the system matrix A. When |z i | < 1 and β → µ (µ is a sufficiently large number), Equation (40) becomes the form of Equation (41).
So far, based on analyzing the difference and connection between ERA and SSI system time domain identification methods, the correlation analysis technique is introduced, the generalized Hankel matrix is constructed, and the SVD decomposition and least squares method are used to solve the system state matrix and extract the modal parameter information. The false and real modes are distinguished according to MAC and MSV. Thus, the fusion of ERA and SSI time domain methods is realized. It effectively solves the problem of modal omission caused by RDT+ERA/DC and SSI-DATA algorithms because the modes with a small vibration response are eliminated as noise.

Mathematical Simulation
A three-dimensional model of the whole star which has a central rigid body coupled with flexible attachment is established. The dimension of the central rigid body is set to 600 × 600 × 800 mm and the dimension of the flexible solar wing is 1600 × 400 × 2 mm, which is the same as the physical model. When the flexible solar wing is subjected to external excitation frequency of 0.66573, the first-order vibration pattern is mainly shown in Figure 10.

Mathematical Simulation
A three-dimensional model of the whole star which has a central rigid body coupled with flexible attachment is established. The dimension of the central rigid body is set to × × 600 600 800 mm and the dimension of the flexible solar wing is × × 1600 400 2 mm, which is the same as the physical model. When the flexible solar wing is subjected to external excitation frequency of 0.66573, the first-order vibration pattern is mainly shown in Figure 10.

Relevant Parameters of the Ground Test
In order to verify the feasibility and measurement accuracy of the method of on-orbit measurement and identification of large flexible structures based on non-contact optical measurement in the low-feature scene of this paper, a flexible solar wing model dynamics test platform is built as shown in Figure 11. The length and width of the three-stage solar array is 1600 mm × 400 mm. The camera baseline length is about 450 mm. The camera resolution is 4096 × 3000. The maximum camera frame rate is 46 fps. Other cameras and lens related parameters are shown in Table 1. During the test, the flexible solar wing is subjected to random external excitation and vibrates freely. The vibration information is collected by the binocular camera, which is then transmitted to the processor for centralized processing.

Relevant Parameters of the Ground Test
In order to verify the feasibility and measurement accuracy of the method of on-orbit measurement and identification of large flexible structures based on non-contact optical measurement in the low-feature scene of this paper, a flexible solar wing model dynamics test platform is built as shown in Figure 11. The length and width of the three-stage solar array is 1600 mm × 400 mm. The camera baseline length is about 450 mm. The camera resolution is 4096 × 3000. The maximum camera frame rate is 46 fps. Other cameras and lens related parameters are shown in Table 1. During the test, the flexible solar wing is subjected to random external excitation and vibrates freely. The vibration information is collected by the binocular camera, which is then transmitted to the processor for centralized processing. test platform is built as shown in Figure 11. The length and width of the three-stage solar array is 1600 mm × 400 mm. The camera baseline length is about 450 mm. The camera resolution is 4096 × 3000. The maximum camera frame rate is 46 fps. Other cameras and lens related parameters are shown in Table 1. During the test, the flexible solar wing is subjected to random external excitation and vibrates freely. The vibration information is collected by the binocular camera, which is then transmitted to the processor for centralized processing.

Flexible Solar Wing
Binocular camera Satellite body equivalent support Figure 11. Physics experiment platform. Firstly, the binocular vision system is calibrated. The intrinsic parameters of the camera are shown in Table 2. The extrinsic parameters are shown in Equation (42). Then, the superiority of the method is verified in terms of the characteristic of the scene and the  Firstly, the binocular vision system is calibrated. The intrinsic parameters of the camera are shown in Table 2. The extrinsic parameters are shown in Equation (42). Then, the superiority of the method is verified in terms of the characteristic of the scene and the uniqueness of the matching. For the scene of this paper, C = 10 4 is selected, and the characteristic of the scene is calculated according to Equation (20), as shown in Table 3.   Table 3. Scene Characteristic Comparison.

Before Enhanced Features After Enhanced Features
Scene characterization char 0.4665 0.9091 The actual effect of the relative position descriptor matching method, SIFT algorithm, and SuperGlue algorithm between different frames is shown in Table 4. From the table, it can be seen that the SIFT algorithm, as the most stable and best performing traditional feature point recognition matching algorithm [16], only has an MU value of 0.5499 in this scene, which indicates that the SIFT algorithm has poor applicability in this scene. The SuperGlue method based on artificial intelligence still maintains a good matching effect in this low-feature scene, but it still has the problem shown in Figure 4. The feature points detected between different frames are quite different, and so the fixed feature points cannot be tracked stably, and the data cannot be used for modal identification. In contrast, the feature point matching algorithm based on the relative position relation descriptor proposed in this paper ensures a better matching effect, and in each frame of the image, the fixed feature points with clear location attributes are tracked stably. Then, the comprehensive matching parameter of the scene is calculated according to Equation (30), and the results are shown in Table 5. The data show that the comprehensive matching parameter of the SIFT algorithm is only 0.5082, indicating that this algorithm is not suitable for low-feature scenes, which is consistent with its actual performance in scene feature point matching. The matching effect of SuperGlue algorithm has been greatly improved due to the introduction of artificial intelligence and should have a better matching effect if the characteristic of the scene is enhanced. The method used in this paper has the highest CMP value, which not only ensures the correct rate of feature matching but also can stably extract the continuous position changes of the required feature points in each frame, which has better practical significance. Finally, the spatial 3D coordinates of the target measurement point were tracked using the calibrated binocular camera. The 3-axis spatial position and velocity of the feature points in the camera coordinate system were obtained as shown in Figures 12 and 13. To evaluate the measurement accuracy of the vision system in this experimental scheme, five OptiTrack series Prime X 22 cameras were used to form a high-precision camera set with a frame rate of 360 fps and a 3D measurement accuracy of ±0.15 mm. The targets used by OptiTrack cameras were pasted at the three target measurement points, and the 3D spatial positions of the three target points were measured. Due to the influence of the target's height, the coordinates of the spatial points obtained by the OptiTrack camera set and the camera in this experimental scheme are not exactly the same. To solve this problem, the measured data from the OptiTrack camera set were transformed to the camera coordinate system of this experimental binocular camera. Then, the Y coordinate difference of the adjacent measured points in the camera coordinate system was calculated. This value was compared with the Y coordinate difference of the feature points measured by the binocular camera in this paper's experimental scheme. The absolute error curve of measurement was obtained as shown in Figure 14. The dashed line in Figure 14 shows the average absolute error in 102 sampling frames, and the value is 0.937044118 mm. Hereby, the binocular camera scheme in this paper can achieve sub-millimeter level measurement accuracy, which is an effective on-orbit measurement method.
After obtaining the motion information of the free vibration process of the flexible solar array under external excitation, the modal identification is carried out by using the fusion method of RDT+ERA/DC and SSI-DATA. The results are shown in Table 6 and Figure 15. The true value of the vibration frequency of the flexible solar wing is 0.7146 Hz, so the experimental identification error is 2.82% and the mathematical analysis error is 6.84%. ular camera in this paper's experimental scheme. The absolute error curve of measurement was obtained as shown in Figure 14. The dashed line in Figure 14 shows the average absolute error in 102 sampling frames, and the value is 0.937044118 mm. Hereby, the binocular camera scheme in this paper can achieve sub-millimeter level measurement accuracy, which is an effective on-orbit measurement method.   After obtaining the motion information of the free vibration process of the flexible solar array under external excitation, the modal identification is carried out by using the fusion method of RDT+ERA/DC and SSI-DATA. The results are shown in Table 6 and Figure 15. The true value of the vibration frequency of the flexible solar wing is 0.7146 Hz , so the experimental identification error is 2.82% and the mathematical analysis error is 6.84%.   After obtaining the motion information of the free vibration process of the flexible solar array under external excitation, the modal identification is carried out by using the fusion method of RDT+ERA/DC and SSI-DATA. The results are shown in Table 6 and Figure 15. The true value of the vibration frequency of the flexible solar wing is 0.7146 Hz , so the experimental identification error is 2.82% and the mathematical analysis error is 6.84%.  Figure 14. Measurement error of this experimental vision scheme. System orders / n hankel matrix singular values

Conclusions
This paper proposes a non-contact optical measurement method for the on-orbit measurement and identification of the modal parameters of large flexible structures of satellites, and the following conclusions are obtained.
The concept of the comprehensive matching parameter of scenes is proposed, based on which the features of the scenes themselves are enhanced through theoretical derivation and mathematical transformation. The relative position relationship descriptor is proposed for the description and matching of feature points so that stable tracking measurements of specific feature points can be achieved by using only binocular cameras without relying on physical identification points.
To demonstrate the feasibility of the above scheme, a physical experimental platform is built for experimental testing and the binocular measurement results are evaluated. The average absolute measurement error is about 0.937 mm. Compared with the traditional targeting scheme, the scheme in this paper takes into account the measurement accuracy while simplifying the experimental setup and improving the stability of the system.
In this paper, the binocular vision system measurement data are applied to the modal identification method based on the fusion of Eigensystem Realization Algorithm and Stochastic Subspace Identification. The first-order frequency of the flexible solar wing vibration is 0.7364 Hz and the discrimination error is 2.82%, which proves that the non-contact optical measurement scheme has good modal identification accuracy and practical application value. In the future, we can consider combining the feature enhancement method or the feature point description method proposed in this paper with the neural network to improve feature point detection efficiency and recognition accuracy.

Conclusions
This paper proposes a non-contact optical measurement method for the on-orbit measurement and identification of the modal parameters of large flexible structures of satellites, and the following conclusions are obtained.
The concept of the comprehensive matching parameter of scenes is proposed, based on which the features of the scenes themselves are enhanced through theoretical derivation and mathematical transformation. The relative position relationship descriptor is proposed for the description and matching of feature points so that stable tracking measurements of specific feature points can be achieved by using only binocular cameras without relying on physical identification points.
To demonstrate the feasibility of the above scheme, a physical experimental platform is built for experimental testing and the binocular measurement results are evaluated. The average absolute measurement error is about 0.937 mm. Compared with the traditional targeting scheme, the scheme in this paper takes into account the measurement accuracy while simplifying the experimental setup and improving the stability of the system.
In this paper, the binocular vision system measurement data are applied to the modal identification method based on the fusion of Eigensystem Realization Algorithm and Stochastic Subspace Identification. The first-order frequency of the flexible solar wing vibration is 0.7364 Hz and the discrimination error is 2.82%, which proves that the noncontact optical measurement scheme has good modal identification accuracy and practical application value. In the future, we can consider combining the feature enhancement method or the feature point description method proposed in this paper with the neural network to improve feature point detection efficiency and recognition accuracy.