An Improved Method Combining ANN and 1D-Var for the Retrieval of Atmospheric Temperature Profiles from FY-4A/GIIRS Hyperspectral Data

In this study, we propose a temperature profile retrieval method that combines an improved one-dimensional variational algorithm (1D-Var) with an artificial neural network algorithm (ANN), using FY-4A/GIIRS (Geosynchronous Interferometric Infrared Sounder) infrared hyperspectral data. First, based on the characteristics of the FY-4A/GIIRS observation data under the conventional 1D-Var, we introduced a channel blacklist to discard the channels that have a large negative impact on retrieval, then used the information capacity method for channel selection and introduced a neural network to correct the satellite observation data. The improved 1D-Var effectively used the observation information of 1415 channels, reducing the impact of errors in the satellite observations and the radiative transfer model and improving retrieval accuracy. We subsequently used the improved 1D-Var and the ANN algorithm separately to retrieve temperature profiles from the GIIRS data. The results showed that ANN is more accurate than the improved 1D-Var at pressures from 800 hPa to 1000 hPa. Therefore, we combined the improved 1D-Var and ANN methods to retrieve temperature profiles at different pressure levels, calculating the error by taking sounding data published by the University of Wyoming as the true values. The results show that the average error of the retrieved temperature profiles is smaller than 2 K with our method, and that the retrieved temperature profiles are more accurate than the GIIRS products from 10 hPa to 575 hPa. Overall, by combining a physical retrieval method with a machine learning retrieval method, this paper can provide a reference for improving the accuracy of products.
[...] amount of observation data for the same area, which is very suitable for training machine learning models.
This paper combines traditional physical retrieval methods with a machine learning method, which can provide a reference for improving the accuracy of products. The utilization rate of the GIIRS observation information can be increased by improving the retrieval method. In short, we believe that GEO satellite-based infrared hyperspectral detection technology has broad development prospects.


Introduction
Infrared hyperspectral satellite sounding of the atmospheric state can obtain abundant atmospheric spectral information and distinguish more detailed atmospheric vertical structure [1][2][3]. The atmospheric temperature profile is an important atmospheric parameter. Retrieving atmospheric temperature profiles from infrared hyperspectral data, combined with relevant atmospheric infrared radiative transfer models and methods such as 1D-Var and neural networks, has become a powerful approach for the large-scale, high-frequency, and high-precision acquisition of atmospheric temperature and humidity profiles [4,5].
Fengyun-4A (FY-4A), one of China's new generation of geostationary meteorological satellites for quantitative remote sensing applications, was successfully launched on 11 December 2016. The Geosynchronous Interferometric Infrared Sounder (GIIRS) is the first instrument in the world to detect the vertical atmospheric structure by means of infrared interferometric spectroscopy from geostationary orbit [6][7][8]. Its primary instrument is a time-modulated Fourier transform spectrometer, whose main purpose is to observe the vertical structure of atmospheric temperature and humidity, to improve the resolution of meteorological vertical observation, and to provide services for numerical prediction, weather monitoring, and atmospheric chemical composition detection [9]. Luo et al. [10] evaluated the ability of FY-4A/GIIRS to retrieve atmospheric temperature, humidity, ozone, and other atmospheric constituents, using Entropy Reduction (ER) and Degrees of Freedom for Signal (DFS) as criteria together with the 1976 U.S. Standard Atmosphere profile database. The results showed that the ER of temperature and humidity was 37.53 and 28.79, and the DFS was 10.78 and 8.08, respectively.
At present, the main methods [11] for retrieving temperature and humidity profiles from infrared hyperspectral data include feature vector statistical regression, artificial neural networks, and physical methods. Guan et al. [12], Jiang et al. [13], and Zhang et al. [14] used spectral radiance observations from the Atmospheric Infrared Sounder (AIRS) to retrieve atmospheric temperature and humidity profiles over China using feature vector statistical regression. The results showed that the obtained temperature and humidity profiles were nearly consistent with the data distribution of the European Centre for Medium-Range Weather Forecasts (ECMWF). Ma et al. [15] used a feature vector method to obtain the initial profile, then used Newton's nonlinear iterative method to further improve the retrieval accuracy. Although the feature vector method is quick and simple to compute, it cannot account for the full physical process of atmospheric radiative transfer, resulting in poor retrieval accuracy.
The development of ANN can effectively improve the accuracy and speed of atmospheric remote sensing retrieval [16][17][18][19][20]. Cabrera-Mercader et al. [21] used an ANN retrieval algorithm to retrieve atmospheric humidity profiles separately over land and sea. The results showed that the RMSE of relative humidity was 6-14% over the ocean and 6-15% over the land. Cai [22] also retrieved FY-4A/GIIRS atmospheric temperature and humidity profiles using ANN and used AIRS-related products for verification. The results showed that the average temperature RMSE in the troposphere was less than 1 K, and the humidity RMSE below 200 hPa was better than 10%. However, the ANN retrieval algorithm is not suitable for areas where data are scarce and training samples are insufficient, because it cannot construct a representative retrieval model for these locations [23,24].
The physical retrieval method takes into account the process of atmospheric radiative transfer and does not depend on sample training [25,26]. Guan et al. [1] used IASI (Infrared Atmospheric Sounding Interferometer) infrared hyperspectral data and 1D-Var to retrieve atmospheric temperature and humidity profiles. The retrieval results showed that the mean absolute error of temperature was less than 0.6 K, and the mean absolute error of the water vapor mixing ratio was less than 0.022 g/kg. Shen et al. [27] used CrIS satellite data and obtained atmospheric temperature and humidity profiles using the NOAA Unique Combined Atmospheric Processing System (NUCAPS) algorithm, indicating that the error of the humidity profiles was less than 20% below the 300 hPa level but increased to 30% above the 300 hPa level. Zhu et al. [28] used HIRAS, an infrared hyperspectral atmospheric sounder carried by the FY-3D polar-orbiting meteorological satellite, to retrieve atmospheric temperature and humidity profiles using the 1D-Var retrieval algorithm. The retrieval results were superior to the priori profile as a whole: the RMSE of temperature profiles below 100 hPa was less than 1.5 K and was less than 1 K at some pressure levels, and the RMSE of humidity profiles was less than 10% as a whole. Polar-orbiting satellites are excellent at detecting temperature and humidity information. Geostationary satellites additionally combine high spectral resolution with high temporal resolution for retrieving atmospheric temperature and humidity profiles, and can detect the rapidly changing mesoscale atmospheric environment and provide temperature and humidity change information at higher time resolution over certain areas.
Based on the FY-4A/GIIRS interferometric atmospheric vertical sounder, this study proposes an improved FY-4A/GIIRS retrieval method that combines an ANN retrieval algorithm with an improved 1D-Var retrieval algorithm. The method mainly includes channel blacklist establishment, temperature channel selection through the information capacity method, observation error correction through the ANN algorithm, the Radiative Transfer for TOVS (RTTOV) rapid radiative transfer model, the iterative solution of the Optimal Estimation Method (OEM) in the 1D-Var retrieval method, and the integration of the ANN retrieval algorithm, which is used to retrieve specific parts of the vertical atmospheric pressure range. The accuracy of the retrieval results was verified using sounding data published by the University of Wyoming, and we compared the accuracy of the GIIRS temperature profile products with that of the temperature profiles retrieved by our method. We also analyzed the detection capability of the FY-4A/GIIRS infrared hyperspectral atmospheric vertical sounder for atmospheric temperature.

FY-4A/GIIRS Data
The performance parameters of the FY-4A/GIIRS instrument are shown in Table 1. GIIRS is composed of a 32 × 4 pixel array detector and improves detection sensitivity through long integration times. The interferometer has 1650 channels, which provide high-resolution, continuous spectral coverage of the infrared radiation of the earth and atmosphere [29][30][31]. The satellite observation data in our study were mainly the FY-4A/GIIRS interferometer L1 data after radiometric calibration and geolocation (http://satellite.nsmc.org.cn/PortalSite/Data/Satellite.aspx). The data used in our experiment were mainly located over India, at 00:00, 06:00, and 12:00 UTC on 1-6 August 2019 and on 1-10 February 2020.

ERA5 Data
In our experiment, the reanalysis dataset was used as the reference standard for training models. We used the ERA5 dataset provided by ECMWF (https://apps.ecmwf.int/datasets/) as the reanalysis dataset. Reanalysis datasets, with their high accuracy and wide spatial coverage, use the most advanced global assimilation systems to assimilate data from ground observation stations, sounding balloons, radiosondes, aircraft, satellites, etc., and contain a large number of meteorological elements. This experiment used the ERA5 hourly data from 1-6 August 2019 and 1-3 February 2020, which included pressure level and surface information such as atmospheric specific humidity, atmospheric temperature, 2 m temperature, skin temperature, 10 m U wind component, 10 m V wind component, and surface pressure.

Sounding Data
We used the sounding data (http://weather.uwyo.edu/upperair/sounding.html) published by the University of Wyoming as true values for experimental accuracy verification. Data were collected from 4 February 2020 to 10 February 2020. Sounding balloons were usually released at 0:00 every day, and some stations may also release balloons at 12:00. Our chosen region was India.

GIIRS Temperature Product Data
The GIIRS product (https://data.nsmc.org.cn/portalsite/Data/Satellite.aspx) was used in the test stage for comparison. The product type was Atmospheric Temperature and Humidity Profile, and the product name was the Atmospheric Vertical Sounding Product (regional synthesis). Launched on 24 May 2018, the product was updated to its second version on 27 May 2019. The product data used in this experiment covered India from 4-10 February 2020.

GFS Data
The Global Forecast System (GFS) is a weather forecast model produced by the National Centers for Environmental Prediction (NCEP). Dozens of atmospheric and land-soil variables are available through this dataset, including temperatures, winds, precipitation, soil moisture, and atmospheric ozone concentration. In our study, we used historical GFS forecast data issued at 00, 06, 12, and 18 UTC with a horizontal resolution of 0.5°, and we selected the GFS data that matched the FY-4A/GIIRS observations in time and space. The forecast lead time was 6 h.

RTTOV Model
RTTOV evolved from the rapid radiative transfer model for the TIROS Operational Vertical Sounder (TOVS), developed by ECMWF in the early 1990s (https://nwp-saf.eumetsat.int/site/software/rttov/). RTTOV can not only quickly and accurately simulate the observed brightness temperature of various satellite instruments for given atmospheric state parameters [32], but can also quickly calculate the Jacobian matrix of the observed radiation with respect to the atmospheric state (temperature and absorbing gases at each level).

Backpropagation Artificial Neural Network Model
This section mainly introduces the model principle of the backpropagation neural network algorithm, used for observation error correction and temperature profile retrieval. Detailed information, such as how to use the network, will be expanded in Section 3.
In our experiment, one hidden layer was set to connect the input layer and the output layer. As illustrated in Figure 1, the neurons in each layer were connected with the neurons in the adjacent layer through the weight W, the intercept b, and the activation function. The activation function selected in this experiment was the sigmoid function, $y = 1/(1 + e^{-x})$ (Equation (1), where x is the input and y is the output).
The parameter I is the neural network input data, i denotes the i-th input neuron, H is the data of the hidden layer, j denotes the j-th neuron of the hidden layer, O is the output data of the output layer, k denotes the k-th neuron of the output layer, and T is the target output data. The parameter Ee represents the mean square error between the network output and the target output. $W_{ij}$ represents the weights between input neuron i and hidden neuron j, $W_{jk}$ represents the weights between hidden neuron j and output neuron k, and $\eta$ represents the interval size (the step size of each weight update). Equation (2) quantifies the deviation between the target output and the actual output at each iteration. Equations (3) and (4) were used to update the weights so that the actual output of the network moved closer to the target output. The optimal weights were obtained after many iterations, and the network model was saved.
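The training loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the linear output layer, the weight initialization, and the helper names `init_params` and `train_step` are all hypothetical, and the gradient expressions are the standard ones for a mean-square-error loss with a sigmoid hidden layer.

```python
import numpy as np

def sigmoid(x):
    """Sigmoid activation used for the hidden layer."""
    return 1.0 / (1.0 + np.exp(-x))

def init_params(n_in, n_hidden, n_out, seed=0):
    """Small random weights and zero intercepts (initialization is an assumption)."""
    rng = np.random.default_rng(seed)
    return {
        "W_ij": rng.normal(0.0, 0.1, (n_in, n_hidden)),
        "b_j": np.zeros(n_hidden),
        "W_jk": rng.normal(0.0, 0.1, (n_hidden, n_out)),
        "b_k": np.zeros(n_out),
    }

def train_step(params, I, T, eta=0.05):
    """One gradient-descent update; returns the error Ee before the update."""
    H = sigmoid(I @ params["W_ij"] + params["b_j"])   # hidden layer
    O = H @ params["W_jk"] + params["b_k"]            # linear output (assumption)
    err = O - T
    Ee = float(np.mean(err ** 2))                     # mean square error
    dO = 2.0 * err / err.size                         # dEe/dO
    dH = dO @ params["W_jk"].T                        # back-propagate to hidden layer
    dZ = dH * H * (1.0 - H)                           # through the sigmoid derivative
    grads = {
        "W_ij": I.T @ dZ,
        "b_j": dZ.sum(axis=0),
        "W_jk": H.T @ dO,
        "b_k": dO.sum(axis=0),
    }
    for name in params:
        params[name] -= eta * grads[name]             # step of size eta downhill
    return Ee
```

Calling `train_step` repeatedly on the same batch should reduce `Ee` as long as `eta` is small enough. In practice the loop would stop on a validation criterion, which the text does not specify.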

Method
Using the infrared hyperspectral data of the FY-4A/GIIRS, an improved FY-4A/GIIRS retrieval method was proposed which effectively combined the advantages of the ANN retrieval algorithm and the 1D-Var retrieval algorithm. This method possessed higher retrieval accuracy. This section mainly introduces the principle of data preprocessing, the ANN retrieval algorithm (Met-ANN), improved 1D-Var retrieval algorithm (Met-I1DVar), and proposed improved strategies which included creating a channel blacklist and using ANN to correct observation data. Eventually, Met-Combine was proposed by combining the two methods.

Accuracy Evaluation Method
In this study, we used ME (Mean Error), RMSE (Root Mean Squared Error), and MAE (Mean Absolute Error) as the standards for accuracy evaluation. They were calculated from the following equations:

$$\mathrm{ME} = \frac{1}{n}\sum_{i=1}^{n}\left(x_i - x_i'\right) \qquad (5)$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(x_i - x_i'\right)^2} \qquad (6)$$

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|x_i - x_i'\right| \qquad (7)$$

where $x_i$ is the retrieved value, $x_i'$ is the actual value, and n is the number of samples. In our study, we paid more attention to RMSE: the higher the accuracy of the retrieval method, the smaller the RMSE between the retrieved temperature profile and the true temperature profile.
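As a small illustration of these three statistics, the sketch below computes them per pressure level for a batch of profiles. The function name `profile_errors` and the array layout (samples along the first axis, pressure levels along the second) are assumptions made for this example.

```python
import numpy as np

def profile_errors(retrieved, truth):
    """Per-level ME, RMSE and MAE for arrays shaped (n_samples, n_levels)."""
    diff = np.asarray(retrieved, dtype=float) - np.asarray(truth, dtype=float)
    return {
        "ME": diff.mean(axis=0),                    # signed bias
        "RMSE": np.sqrt((diff ** 2).mean(axis=0)),  # penalizes large errors more
        "MAE": np.abs(diff).mean(axis=0),           # average error magnitude
    }
```

ME keeps the sign of the error, so positive and negative deviations can cancel. RMSE and MAE cannot be reduced by such cancellation.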

Data Preprocessing
In the actual atmospheric temperature profile retrieval experiment, we preprocessed the data beforehand in order to improve data quality. To reduce the interference of clouds in the satellite field of view, we used the FY-4A/AGRI L2 Cloud Mask products (https://data.nsmc.org.cn/portalsite/Data/Satellite.aspx) to select L1 observation data for clear sky pixels (all pixels other than clear sky were treated as cloud pixels). The channel spectral radiance of the L1 observation data has a spectral side-lobe effect, so we applied a Hamming window function for apodization to suppress it [33]. Finally, the L1 observation data (FY-4A/GIIRS), reanalysis data (ERA5), and forecast data (GFS) were synchronized in time and space by linear interpolation. The other data were interpolated linearly to the times and locations (longitude and latitude) of the GIIRS long wave data, as in Equation (8). In terms of pressure level, the forecast data and sounding data were interpolated to the corresponding levels, taking the pressure levels of the ERA5 reanalysis data (37 levels in total, from 1 hPa to 1000 hPa near the ground) as the benchmark.

$$Y = Y_0 + \frac{x - x_0}{x_1 - x_0}\left(Y_1 - Y_0\right) \qquad (8)$$

In Equation (8), x is the independent variable of the interpolation function, which in this experiment represents time and space; $x_0$ and $x_1$ are the two closest points to x; and Y is the dependent variable, which in this experiment represents the brightness temperature, with $Y_0$ and $Y_1$ its values at $x_0$ and $x_1$.
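The vertical part of this matching step can be sketched with NumPy's one-dimensional linear interpolation. This is only an illustration: the function name `interp_to_levels` is hypothetical, and interpolating linearly in pressure (rather than in log-pressure) is an assumption, since the text only states that linear interpolation was used.

```python
import numpy as np

def interp_to_levels(p_src, t_src, p_target):
    """Linearly interpolate a temperature profile onto target pressure levels (hPa)."""
    p_src = np.asarray(p_src, dtype=float)
    t_src = np.asarray(t_src, dtype=float)
    order = np.argsort(p_src)   # np.interp requires increasing sample points
    # Targets outside the source range take the nearest endpoint value.
    return np.interp(p_target, p_src[order], t_src[order])
```

For example, a sounding reported at 1000, 500, and 100 hPa can be mapped to intermediate reanalysis levels such as 750 hPa and 300 hPa with a single call.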
The data used for model training was mainly collected in India on 1-6 August 2019 and 1-3 February 2020 respectively. Figure 2 shows the location of the data set utilized for training from 00:30 to 01:30 on 1 February. Its longitude range is approximately 68.7° E-91° E while the latitude range is mainly in the land area of 8.4° N-32° N. In particular, the blue point refers to the selected clear sky pixel, and the red point refers to the abandoned cloud pixel.

ANN Retrieval Algorithm
This section mainly introduces the principle and process of the ANN retrieval algorithm (Met-ANN). The process is shown in Figure 3. Data preprocessing (see Section 3.2) was performed on the L1 observation data (FY-4A/GIIRS) and reanalysis data (ERA5). To account for the influence of the satellite zenith angle on the observation data, the radiance of each channel of each sample was multiplied by the cosine of the satellite zenith angle [22]. The input layer was the observed brightness temperature of all channels (1650 neurons in total), and the target output layer was the temperature profile data of all pressure levels (37 neurons in total) [34]. The learning rate was 0.05. We randomly divided the dataset into training, validation, and test sets in proportions of 70%, 15%, and 15%, respectively. We built an ANN training model with one hidden layer; the number of hidden neurons was determined mainly with reference to [35], using the empirical Equation (9). In Equation (9), n is the number of neurons in the input layer, m is the number of neurons in the output layer, and h (rounded up to an integer) is the number of neurons in the hidden layer.
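Two of the preparation steps above, the zenith-angle scaling and the random 70/15/15 split, can be sketched as follows. The function names, the degree units for the zenith angle, and the fixed random seed are assumptions introduced for this example.

```python
import numpy as np

def scale_by_zenith(radiance, zenith_deg):
    """Multiply each sample's channel values by the cosine of its zenith angle.

    radiance: (n_samples, n_channels); zenith_deg: (n_samples,) in degrees.
    """
    cos_zen = np.cos(np.deg2rad(np.asarray(zenith_deg, dtype=float)))
    return np.asarray(radiance, dtype=float) * cos_zen[:, None]

def split_indices(n_samples, fractions=(0.70, 0.15, 0.15), seed=0):
    """Randomly partition sample indices into training, validation and test sets."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)
    n_train = int(round(fractions[0] * n_samples))
    n_val = int(round(fractions[1] * n_samples))
    # Whatever remains after the first two sets becomes the test set.
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]
```

Splitting indices rather than the arrays themselves keeps the observation and profile arrays aligned when the same index sets are applied to both.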

Improved 1D-Var Retrieval Method
As shown in Figure 4, this section mainly introduces the principle and process of the improved 1D-Var retrieval algorithm (Met-I1DVar), at the same time introducing the method for establishing the channel blacklist, and the proposed strategy for error correction of observation data through ANN.

Covariance of Priori Error and Covariance of Observation Error
We preprocessed the L1 observation data (FY-4A/GIIRS), reanalysis data (ERA5), and forecast data (GFS) (see Section 3.2). We fed the reanalysis data into the RTTOV model to obtain the simulated brightness temperature/radiance. We then calculated the priori error covariance and the observation error covariance. The priori error covariance $S_{ap}$ was obtained from the deviation statistics of the GFS forecast data and the ERA5 reanalysis data; its form is given in Equation (10) and the specific calculation in Equation (11). In Equation (11), $s_{ij}$ represents the priori error covariance element for levels i and j, $x_k^i$ is the k-th sample of the i-th level, $E(x^i)$ is the average error of the forecast value of the i-th level, and n is the total number of samples. Figure 5 shows the covariance of the temperature priori error. We considered the selected channels to be approximately independent, so the observation error covariance matrix $S_\varepsilon$ is a diagonal matrix whose diagonal elements are given in Equation (12), where m represents the number of channels.
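A minimal sketch of how these two matrices could be estimated from matched samples is given below. It is an illustration under assumptions rather than the paper's exact Equations (10)-(12): the 1/n normalization, the use of the observed-minus-simulated variance on the diagonal, and both function names are hypothetical.

```python
import numpy as np

def prior_error_covariance(forecast, reanalysis):
    """Covariance of forecast-minus-reanalysis deviations, shape (n_levels, n_levels)."""
    dev = np.asarray(forecast, dtype=float) - np.asarray(reanalysis, dtype=float)
    dev = dev - dev.mean(axis=0)          # remove the mean error of each level
    return dev.T @ dev / dev.shape[0]     # 1/n normalization (assumption)

def observation_error_covariance(observed_bt, simulated_bt):
    """Diagonal matrix of per-channel error variances (channels assumed independent)."""
    err = np.asarray(observed_bt, dtype=float) - np.asarray(simulated_bt, dtype=float)
    return np.diag(err.var(axis=0))
```

The off-diagonal elements of the first matrix capture how forecast errors at neighboring pressure levels vary together. In a variational retrieval, this is what lets an observation at one level inform adjacent levels.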

Creating the Channel Blacklist
We selected the matched FY-4A/GIIRS and reanalysis profile data, and fed the true-value profile, underlying surface, and angle data into the RTTOV model to obtain the simulated observation brightness temperature. We compared the simulated brightness temperature with the actual observed brightness temperature and then analyzed the RMSE, which is caused by both the observation error and the transfer model error (https://nwp-saf.eumetsat.int/downloads/rtcoef_rttov13/visir_lbl_comp/lbl_comp_rtcoef_fy4_1_giirs_v7pred_101L.html). The observation error analysis was carried out for each channel to identify channels whose observation error was significantly greater than that of adjacent channels, and a blacklist of observation channels was established to screen the observation data.
After data preprocessing, we randomly selected 30,000 sets of matching data, fed the reanalysis data into the RTTOV radiative transfer model to obtain the simulated brightness temperature, and calculated the RMSE between the simulated brightness temperature of each channel and the satellite-observed brightness temperature. The statistical results are shown in Figure 6. After analysis, the channel blacklist (235 channels in total) covered the following wave number ranges: 726.25-735.625 cm⁻¹ (RMSE greater than 2 K), 1067.5-1130 cm⁻¹ (RMSE greater than 2 K), and 1680-1753.125 cm⁻¹ (RMSE greater than 3 K). Since the channels with wave numbers between 2200 cm⁻¹ and 2250 cm⁻¹ are very important for temperature retrieval, these channels were not included in the blacklist.
In the following retrieval experiment, these channels were directly eliminated to reduce the influence of the errors in the data and the forward model.

Figure 6. The RMSE between the simulated brightness temperature of each channel and the satellite observation brightness temperature. The blue line represents the long wave channels and the red line represents the medium wave channels.
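The per-channel screening can be sketched as below. Note the simplification: the paper flags channels whose error is clearly larger than that of their neighbors and reports contiguous wave number ranges, whereas this sketch applies a single RMSE threshold and exempts a protected band. The function names and the single-threshold rule are assumptions.

```python
import numpy as np

def channel_rmse(observed_bt, simulated_bt):
    """RMSE per channel between observed and simulated brightness temperature."""
    diff = np.asarray(observed_bt, dtype=float) - np.asarray(simulated_bt, dtype=float)
    return np.sqrt(np.mean(diff ** 2, axis=0))

def blacklist_mask(wavenumber, rmse, threshold_k, protected=(2200.0, 2250.0)):
    """True for channels to discard: RMSE above threshold and outside the protected band."""
    wavenumber = np.asarray(wavenumber, dtype=float)
    rmse = np.asarray(rmse, dtype=float)
    in_protected = (wavenumber >= protected[0]) & (wavenumber <= protected[1])
    return (rmse > threshold_k) & ~in_protected
```

The protected band mirrors the decision in the text to retain the 2200-2250 cm⁻¹ channels despite their error level, because of their importance for temperature retrieval.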

Channel Selection
In our retrieval experiment, selecting more channels was not necessarily better. Selecting too many channels would introduce errors that interfered with the retrieval, leading to an increase in the retrieval error [36]. When the benefit of the added information outweighed the added error, the retrieval error decreased; otherwise, the retrieval error increased.
Entropy can represent the uncertainty of information. Assuming that the probability distribution of the vector x is Gaussian and that $\hat{S}$ is the error covariance matrix of the state vector x, the information entropy C is defined in Equation (13). In Equation (14), H is the difference in information entropy before and after retrieval, which is defined as the information capacity contained in the satellite observation. In the FY-4A/GIIRS retrieval process, the column vectors of the atmospheric state constitute the vector x, the priori error covariance matrix is $S_{ap}$, the error covariance matrix of the retrieval parameters is $\hat{S}$, and K is the Jacobian matrix. The estimate of $\hat{S}$ is given by Equations (15) and (16). We followed [37] in designing the channel selection scheme. We initialized the error covariance matrix $\hat{S}$ to $S_{ap}$. Figure 7 shows the process of one iteration. In each iteration, each standby channel was tried in turn, and the error covariance matrix $\hat{S}$ and the information capacity H were updated according to Equations (14)-(16). After traversing all the standby channels, the channel with the maximum information capacity was selected as the channel for this iteration. In total, 41 iterations were made for the temperature channel selection experiment.
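The greedy loop described above can be sketched with the widely used sequential update for independent channel errors. This is offered as an assumption about the form of the update, not as a transcription of Equations (13)-(16). For a noise-normalized Jacobian row k, adding the channel gains 0.5·log2(1 + kᵀŜk) bits and updates Ŝ by a rank-one correction. The function name `select_channels` and the argument layout are hypothetical.

```python
import numpy as np

def select_channels(K, sigma, S_ap, n_select):
    """Greedy information-content channel selection.

    K: (n_channels, n_state) Jacobian; sigma: (n_channels,) observation error
    standard deviations; S_ap: (n_state, n_state) priori error covariance.
    """
    K_norm = np.asarray(K, dtype=float) / np.asarray(sigma, dtype=float)[:, None]
    S_hat = np.array(S_ap, dtype=float)          # start from the priori covariance
    remaining = list(range(K_norm.shape[0]))
    chosen, gains = [], []
    for _ in range(n_select):
        cand = K_norm[remaining]
        # k^T S_hat k for every standby channel at once
        quad = np.einsum("ij,ij->i", cand @ S_hat, cand)
        gain = 0.5 * np.log2(1.0 + quad)         # information gained per channel
        best = int(np.argmax(gain))
        k = K_norm[remaining.pop(best)]
        Sk = S_hat @ k
        S_hat = S_hat - np.outer(Sk, Sk) / (1.0 + k @ Sk)   # rank-one update
        chosen.append(len(chosen) and None)      # placeholder replaced below
        chosen[-1] = int(np.where((K_norm == k).all(axis=1))[0][0])
        gains.append(float(gain[best]))
    return chosen, S_hat, gains
```

After the loop, `S_hat` equals the retrieval error covariance that the selected channels would give if assimilated together. A channel that duplicates information already selected therefore scores a small gain in later iterations.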

Use ANN to Correct Observation Data
In the actual retrieval iterations, the main factor affecting the final retrieval result was the difference between the brightness temperature simulated by the forward radiative transfer model and the brightness temperature observed by the satellite. This difference included both the radiative transfer model error and the satellite observation error. The correction of observation data in the original 1D-Var retrieval method only considered the satellite observation error. Our aim was to bring the corrected satellite observations closer to the values simulated by the transfer model.
To address the above shortcoming of the physical method, our FY-4A/GIIRS retrieval method introduced the ANN method to correct the spectral data from satellite observations. For 11,695 matching sets of data, we fed the reanalysis profile data, underlying surface data, and angle data into the RTTOV radiative transfer model to obtain the simulated brightness temperature. The satellite-observed brightness temperature (excluding the channels in the blacklist) was used as the input sample for training the neural network, and the corresponding simulated brightness temperature (for the channels used by the 1D-Var retrieval algorithm) was used as the target output sample. The number of hidden neurons was set according to Equation (9).
As shown in Figure 8, the horizontal axis is the target output brightness temperature (T) and the vertical axis is the output brightness temperature of the network (Y). Each sample is plotted as a blue dot according to its T and Y. The closer the dots lie to the line Y = T, the better the network training result. We used the correlation coefficient R as an index to evaluate the quality of the model training. The R of the training set was 0.9998, and the R of the test set was 0.9993. These test results indicate that the trained network performed well and could be used for error correction of the observation data.
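The index R used here can be computed as the Pearson correlation between target and network outputs. The sketch below assumes that all channels and samples are pooled into one vector before correlating, which the text does not state explicitly. The function name `correlation_r` is hypothetical.

```python
import numpy as np

def correlation_r(target_bt, network_bt):
    """Pearson correlation between target (T) and network output (Y), pooled over all values."""
    t = np.ravel(np.asarray(target_bt, dtype=float))
    y = np.ravel(np.asarray(network_bt, dtype=float))
    return float(np.corrcoef(t, y)[0, 1])
```

A value of R near 1 indicates that the dots cluster tightly around a straight line. It does not by itself guarantee that the line is Y = T, so a bias statistic such as ME is a useful complement.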

Building an Objective Function
In this section, we used Newton's non-linear iterations to minimize the objective function and obtain the temperature profiles. The objective function J [38][39][40][41] was defined as in Equation (17), where $\gamma$ is the Lagrange smoothing factor. The final iteration equation is given in Equation (18), where n is the number of iterations, $x_{n+1}$ and $x_n$ are the profile results of the (n + 1)-th and the n-th iterations respectively, the initial x is derived from the GFS forecast profile, $F(x_n)$ is the brightness temperature obtained from the forward transfer model in the n-th iteration, and $K_n$ is the Jacobian matrix calculated in the n-th iteration. The convergence condition is defined as Equation (19).
At the same time, the relative influence of the observation information and the priori information on the solution was adjusted through $\gamma$. The smaller $\gamma$ was, the greater the influence of the observed information; conversely, the larger $\gamma$ was, the greater the influence of the priori information. The smoothing factor $\gamma$ was initially set to 1.
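The iteration can be sketched with the usual Gauss-Newton form of optimal estimation, in which the smoothing factor multiplies the inverse priori covariance. This form is an assumption consistent with the description above (a smaller factor gives the observations more weight), not a transcription of Equations (17)-(19). The stopping rule, the callables `forward` and `jacobian` (stand-ins for the radiative transfer model), and the function name `one_d_var` are all hypothetical.

```python
import numpy as np

def one_d_var(y, x_a, forward, jacobian, S_eps, S_ap, gamma=1.0,
              max_iter=10, tol=1e-6):
    """Iteratively retrieve a state x from observations y.

    y: observed brightness temperatures; x_a: first-guess (priori) profile;
    forward(x) -> simulated observations; jacobian(x) -> K matrix.
    """
    x_a = np.asarray(x_a, dtype=float)
    y = np.asarray(y, dtype=float)
    S_eps_inv = np.linalg.inv(S_eps)
    S_ap_inv = np.linalg.inv(S_ap)
    x = x_a.copy()
    for _ in range(max_iter):
        K = jacobian(x)
        lhs = K.T @ S_eps_inv @ K + gamma * S_ap_inv
        rhs = K.T @ S_eps_inv @ (y - forward(x) + K @ (x - x_a))
        x_new = x_a + np.linalg.solve(lhs, rhs)
        if np.max(np.abs(x_new - x)) < tol:   # step-size stopping rule (assumption)
            return x_new
        x = x_new
    return x
```

With a linear forward model the loop settles after one update. With a very small `gamma` and a well-conditioned Jacobian, the result approaches the ordinary weighted least-squares fit to the observations.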
We randomly selected 1000 sets of preprocessed data from 1-6 August 2019 for comparison experiments. We defined Met-I1DVar without the 'Create Channel Blacklist' and 'Use ANN to Correct Observation Data' steps as the conventional 1D-Var. We used the conventional 1D-Var and Met-I1DVar separately to retrieve the temperature. As shown in Figure 9, we calculated the RMSE of the temperature profiles retrieved by each method against the reanalysis temperature profiles. According to the RMSE statistics, Met-I1DVar had an obvious advantage over the conventional 1D-Var retrieval algorithm. The RMSE of Met-I1DVar was less than 1 K in the 10 hPa-200 hPa and 300 hPa-800 hPa pressure ranges. The results showed that the accuracy of profile retrieval could be effectively improved by establishing a channel blacklist and introducing ANN for observation data correction.

Construct an Improved FY-4A/GIIRS Retrieval Method
After data preprocessing (see Section 3.2), ANN (see Section 3.3) and improved 1D-Var (see Section 3.4) were respectively used for the retrieval experiment. Since there were significant differences between the accuracy of some pressure levels for the two algorithms, after theoretical analysis of the algorithm and statistical results of a large number of experiments, we selected the appropriate algorithm for retrieval according to their different pressure levels, defining our method as Met-Combine. Finally, we carried out error analysis and accuracy evaluation. Figure 10 shows the process of Met-Combine. We selected 1000 sets of data from all samples for testing (1-6 August 2019 and 1-3 February 2020), and calculated RMSE and ME from the retrieval results of the two methods based on the reanalysis data respectively, then compared the retrieval accuracy of Met-ANN and Met-I1DVar. The relevant results are shown in Figure 11. Met-I1DVar had a larger RMSE at 800 hPa-1000 hPa near the surface, while the RMSE of the remaining pressure levels was smaller than Met-ANN. Finally, we built an improved temperature retrieval method based on the above retrieval results and theoretical analysis. In temperature retrieval experiments, Met-ANN was used at the pressure levels of 800 hPa-1000 hPa, while Met-I1DVar was selected for the remaining pressure levels.
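The level-wise combination that defines Met-Combine can be sketched in a few lines. The function name `combine_profiles` and the choice to treat both boundary levels (800 hPa and 1000 hPa) as belonging to the ANN range are assumptions for this example.

```python
import numpy as np

def combine_profiles(pressure_hpa, t_ann, t_1dvar, p_low=800.0, p_high=1000.0):
    """Take the ANN retrieval between p_low and p_high (inclusive), 1D-Var elsewhere."""
    pressure_hpa = np.asarray(pressure_hpa, dtype=float)
    use_ann = (pressure_hpa >= p_low) & (pressure_hpa <= p_high)
    return np.where(use_ann, t_ann, t_1dvar)
```

Because the two retrievals are stitched together at a fixed level, the combined profile can show a small discontinuity near 800 hPa. Whether any smoothing was applied across that boundary is not stated in the text.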

Test Data
Our experimental data were selected over India: the training data were from 1-3 February 2020, and the test data were from 4-10 February 2020 at 00:00 UTC. Figure 12 shows the sounding data (red dots) and GIIRS observation data (blue dots) used for testing at 00:00 on 4 February. GFS forecast data were interpolated and matched with the GIIRS observation data through linear interpolation, as described in Section 3.2. We matched GIIRS observation points to sounding points through the conditions specified in Equation (20). If fewer than 4 blue dots lay near a red dot, or if these blue dots were mostly located on one side of it, we deleted the sounding. The green box in the figure contains the sounding data points we deleted. Altogether, 118 sets of sounding data were used for testing on 4-10 February 2020. After retrieving the temperature profiles, the profile data and sounding data were also interpolated temporally and spatially using the interpolation method described in Section 3.2. Sounding data were taken as the true values in this experiment to evaluate the accuracy of the temperature profile retrieval method.

Correction of Observation Data
The training data was collected from the whole of the Indian region on 1-3 February. The simulated brightness temperature was obtained by substituting the reanalysis data into RTTOV. The error of the observed brightness temperature of the channel selected was also corrected as per the method in Section 3.4.4. See Section 4.1 for the descriptions of test data. Statistics on the MAE of the observed brightness temperature and the simulated brightness temperature of the test set before and after correction were made.
As shown in Figure 13, the MAE of all selected channels after correction was less than that before correction. Additionally, the MAE of most corrected channels was less than 0.5 K. We could find a trend: the greater the wave number, the more obvious the correction effect of the channel. Using the ANN method to correct the error of the observation data of the selected channel could effectively reduce the error of the observation data and the forward model.

Compare the Retrieval Results of Met-ANN with Met-I1DVar
Using the test data introduced in Section 4.1, we applied Met-ANN and Met-I1DVar to retrieve the temperature profiles and took the sounding data as the true values, so as to statistically analyze the MAE and RMSE of the temperature profiles retrieved by the two methods. As shown in Figure 14, the MAE and RMSE curves showed basically the same variation. The error of Met-ANN was smaller than that of Met-I1DVar from 650 hPa to 850 hPa and from 975 hPa to 1000 hPa, and the error of Met-I1DVar was smaller than that of Met-ANN at the other pressure levels. The errors of the two methods differed little between 600 hPa and 1000 hPa. However, Met-I1DVar had a clear advantage over Met-ANN between 10 hPa and 600 hPa.

Comparison of the Retrieval Results of Met-Combine and GIIRS Products
The test procedure is described in Section 4.1. We computed the MAE and RMSE of the Met-Combine retrievals and of the GIIRS temperature profile products. As shown in Figure 15, the MAE and RMSE curves follow basically the same pattern. The error of the GIIRS product was smaller than that of Met-Combine from 575 hPa to 850 hPa and from 925 hPa to 1000 hPa, while the error of Met-Combine was smaller at the other pressure levels. The error of the GIIRS product was generally below 2 K between 525 hPa and 1000 hPa. Averaged over 10 hPa to 1000 hPa, the error of Met-Combine was smaller than that of the GIIRS product.

Discussion
Using the ERA5 reanalysis data as the reference, we adopted the ANN method to correct the selected observation data for temperature retrieval. After correction, the errors of almost all channels are lower than before, and for most channels the MAE between the corrected observed brightness temperatures and the simulated ones is below 0.5 K. The ERA5 reanalysis data offer high temporal and spatial resolution and high accuracy and are available almost anywhere in the world, which makes them well suited as a basis for ANN training. The network inputs are the observed brightness temperatures of all channels not excluded by the blacklist. Many of these channels are discarded in the subsequent retrieval test. However, the discarded channels still contribute observation information during brightness temperature correction, so Met-I1DVar (see Section 3.4) effectively improves the utilization of the observation information.
We compared the accuracy of Met-ANN (see Section 3.3) with that of Met-I1DVar for temperature retrieval. In Section 3.5, we selected data sets for training based on ERA5 data and performed temperature retrieval tests. According to the ME and RMSE results of the two methods, the accuracy of Met-ANN is higher than that of Met-I1DVar from 800 hPa to 1000 hPa and lower at the other pressure levels. In Section 4.3, we applied Met-ANN and Met-I1DVar to the test data sets, using the sounding data published by the University of Wyoming as the true values, and compiled the MAE and RMSE statistics. We found that Met-ANN is better from 600 hPa to 1000 hPa, while Met-I1DVar is clearly better at the other levels. In both tests, Met-ANN is more accurate than Met-I1DVar at pressure levels near the ground, and Met-I1DVar is much more accurate at pressure levels farther from the ground.
In Section 4.4, we compared the accuracy of the temperature profiles retrieved by Met-Combine (see Section 3.5) with that of the GIIRS temperature profile products, using the sounding data published by the University of Wyoming for India in early February 2020 as the true values. We found that the GIIRS temperature profile products can be improved at some pressure levels (such as 10 hPa-500 hPa and 850 hPa-925 hPa). As a payload on a GEO satellite, GIIRS has a high temporal resolution and can provide a large number of samples for ANN training. Met-Combine brings in the advantages of machine learning and can extract more observation information than physical methods alone.
However, Met-Combine has some shortcomings. (1) From 550 hPa to 800 hPa, the accuracy of the temperature profiles retrieved by Met-Combine is lower than that of the GIIRS temperature profile products. The method in this paper is applicable to Version V3 of the GIIRS L1 data. (2) In Met-Combine, the retrieval results that the two methods produce at different pressure levels are merged, which may cause inconsistent sensitivity. We layered the atmosphere vertically by pressure, so the retrieved profiles are discrete, and then output profiles with a higher vertical resolution by fitting. This helps to reduce the influence of inconsistent sensitivity. In addition, we believe that the error (such as the RMSE) between a retrieved profile and the true value matters more than inconsistent sensitivity. Therefore, the inconsistent sensitivity between pressure levels caused by combining the methods does not affect the usability of the data. (3) Met-Combine shares the shortcomings of the ANN method. For example, incomplete training samples may lead to poor retrievals on some test data, the complicated steps involved in training the coefficients lead to a relatively long model training time, and the quality of the training data is hard to control.
Therefore, whether Met-Combine can be adopted in practice and obtain better retrieval results still needs further verification.
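The level-wise merging discussed in shortcoming (2) can be sketched as follows. This is a simplified illustration, not the Met-Combine implementation: the 800 hPa switch point is an assumption motivated by the reported range in which Met-ANN outperforms Met-I1DVar, and the unspecified "fitting" step is replaced here by plain linear interpolation in log-pressure. All function and parameter names are hypothetical.

```python
import numpy as np


def combine_profiles(pressure_hpa, t_ann, t_1dvar, split_hpa=800.0):
    """Take the ANN retrieval at levels with pressure >= split_hpa, 1D-Var elsewhere.

    split_hpa is an assumed switch point, not a value confirmed by the paper.
    """
    p = np.asarray(pressure_hpa, dtype=float)
    return np.where(p >= split_hpa,
                    np.asarray(t_ann, dtype=float),
                    np.asarray(t_1dvar, dtype=float))


def refine_profile(pressure_hpa, temperature_k, fine_pressure_hpa):
    """Resample a discrete profile onto a finer grid, linear in log-pressure.

    This stands in for the fitting step, whose exact form is not given.
    """
    p = np.asarray(pressure_hpa, dtype=float)
    t = np.asarray(temperature_k, dtype=float)
    order = np.argsort(p)  # np.interp needs increasing x values
    fine = np.asarray(fine_pressure_hpa, dtype=float)
    return np.interp(np.log(fine), np.log(p[order]), t[order])
```

Interpolating across the switch point smooths the transition between the two retrievals, but it does not remove a bias difference between them; that is one way the inconsistency mentioned above could still show up in the merged profile.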

Conclusions
In this paper, the 1D-Var retrieval algorithm is improved (see Met-I1DVar in Section 3.4). We performed a comparison test of temperature profile retrieval between the improved 1D-Var algorithm and the ANN algorithm (see Met-ANN in Section 3.3) using GIIRS data observed over India in early February. We found that the ANN retrieval algorithm is more suitable for temperature retrieval at pressure levels near the ground, while the 1D-Var algorithm is more suitable for pressure levels farther from the ground. By combining Met-ANN and Met-I1DVar, we proposed a method for retrieving atmospheric temperature profiles from GIIRS observation data (see Met-Combine in Section 3.5). Using the GIIRS data observed over India in early February, we evaluated the error of the temperature profiles retrieved by Met-Combine and the error of the GIIRS temperature profile products. At some pressure levels, the temperature profiles retrieved by Met-Combine are more accurate than the GIIRS temperature profile products. We found that Met-Combine is best suited to improving the accuracy of GIIRS temperature profile products at 10 hPa-500 hPa and 850 hPa-925 hPa.
Compared with the ANN retrieval algorithm, Met-Combine introduces the a priori information of forecast data and effectively uses the observation information of underlying surfaces and satellites. Compared with the traditional 1D-Var retrieval algorithm, Met-Combine can extract more channel observation information, reduce errors from the radiative transfer model and the satellite observations, and, by taking the characteristics of FY-4A/GIIRS observation data into account, effectively use the observation information of 1415 channels. Therefore, the accuracy of Met-Combine is higher than that of the other two methods.
GIIRS is a payload on the GEO satellite FY-4A and is the first infrared hyperspectral detector in geostationary orbit. Compared with other similar instruments, it has a markedly higher temporal resolution. In a short period of time, GIIRS can provide a large amount of observation data for the same area, which is very suitable for training machine learning models. This paper combines traditional physical retrieval methods with a machine learning method, which can provide a reference for improving the accuracy of products. The utilization of the GIIRS observation information can be increased by improving the retrieval method. In short, we believe that GEO satellite-based infrared hyperspectral detection technology has broad development prospects.

Data Availability Statement: All data generated or analyzed during this study are included in this article.