Communication

Optimal Variables for Retrieval Products

by
Simone Ceccherini
Istituto di Fisica Applicata “Nello Carrara” del Consiglio Nazionale delle Ricerche, Via Madonna del Piano 10, 50019 Sesto Fiorentino, Italy
Atmosphere 2024, 15(4), 506; https://doi.org/10.3390/atmos15040506
Submission received: 26 January 2024 / Revised: 6 April 2024 / Accepted: 18 April 2024 / Published: 20 April 2024
(This article belongs to the Section Atmospheric Techniques, Instruments, and Modeling)

Abstract

As the number of satellite instruments sounding the atmosphere grows, several instruments will more and more often measure simultaneously either the same vertical profile or vertical profiles at nearby geo-locations, and users will consult fused products rather than individual measurements. Therefore, the retrieval products should be optimized for use in data fusion operations, rather than for the representation of the profile. This change in paradigm raises the question of whether a more functional representation of the retrieval products exists. New variables for the retrieval products are proposed that have several advantages with respect to the standard retrieval products. These variables, in the linear approximation of the forward model, are independent of the a priori information used in the retrieval, allow us to represent the profile with any a priori information and can be used directly to perform the data fusion of a set of measurements. Furthermore, the use of these variables allows us to reduce the stored data to about one-third of its volume with respect to the use of standard retrieval products.

1. Introduction

The retrieval of the vertical profile of an atmospheric parameter requires the solution of an inverse problem [1,2,3] that is often ill-posed [4], and in order to obtain a stable solution, some a priori information has to be added in the retrieval process. A commonly used method to retrieve atmospheric parameters using remote sensing is the optimal estimation method [1], where the a priori information is represented by an a priori profile and by an a priori covariance matrix (CM) of the unknown parameter, and the solution is given by the profile corresponding to the maximum a posteriori probability calculated with the Bayes theorem (see e.g., [5]).
Since the number of satellite instruments sounding the atmosphere has increased at a high rate in the last few years, it is very likely that more instruments will simultaneously measure either the same vertical profile or vertical profiles corresponding to nearby geo-locations. In this case, the different retrieved profiles can be combined into a single product that includes all of the available information, and we refer to this combination as data fusion [6]. Accordingly, the choice of the a priori information and of the vertical grid has to take into account the possibility that the result of the retrieval will be fused with other measurements [7,8,9,10,11]. The data fusion approach is an alternative to the synergistic retrieval [12], in which all of the available observations are simultaneously used in a single retrieval; for a detailed description and comparison of the two methods, see [13] and references therein.
In light of the increased requirement of fused products, we consider the possibility of using new variables representing the retrieval products, with the purpose of simplifying the subsequent fusion processes. A change in the retrieval products is proposed in view of developing a shared formalism, which facilitates the interface between data providers and data users, while ensuring a full exploitation of the available information. The advantages of the new variables with respect to those currently used are analyzed on a theoretical basis.
When the result of the retrieval is used in subsequent data fusion operations, the vertical grid of the fusing products should be as fine as needed for the representation of the information content of the final fused product, rather than of the information content of the individual measurement, because, as shown in [14], in the latter case, some information is lost. This can be easily done because the use of the a priori information allows the profile to be represented on a vertical grid as fine as desired. Therefore, the retrieval products are no longer chosen with the objective of providing the user with a useful representation of the observed profile, but rather as the best input for the fusion process, possibly independent of a priori information. The question then arises of whether, once the objective of the graphical representation of the profile is removed, a more functional data transfer of the retrieval products can be considered.
Generally, in order to make complete use of the products in further processing such as data fusion or data assimilation, the retrieval products are represented by means of the retrieved profile, the averaging kernel matrix (AKM), the retrieval CM and the a priori information used in the retrieval.
We propose new variables calculated starting from these standard retrieval products that are a new way to save the information provided by the measurements and have several advantages with respect to the standard quantities. In the linear approximation of the forward model, the new variables are independent of the a priori information used in the retrieval and decrease the data volume requirement. Furthermore, they can be used to represent the profile with any a priori information and are quite suitable for subsequent data fusion operations.
In Section 2, we recall useful notations and equations, linearize the transfer function and introduce the new variables. In Section 3, we describe the advantages of the new variables with respect to the standard retrieval products concerning representation of the profile, data fusion and reduction of the data volume. Finally, in Section 4, we draw the conclusions.

2. The New Variables

2.1. Recall of Notations and Equations

We assume that the vertical profile $\hat{x}$ of an atmospheric parameter has been retrieved from a set of observations (radiances) $y$ with the optimal estimation method [1], using a profile $x_a$ and a CM $S_a$ as a priori information. We indicate with $f(x)$ the forward model, which allows us to express the observations $y$ as a function of the true profile $x_t$ by the following equation:
$$ y = f(x_t) + \varepsilon, \tag{1} $$
where $\varepsilon$ is the vector including both the noise errors of the observations and the forward model errors, due to parameter errors and physical approximations of the forward model. Generally, the forward model calculates the radiative transfer through the Earth’s atmosphere and, given the state of the atmosphere, the observation geometry (for example either limb or nadir) and the characteristics of the instrument, allows us to simulate the radiances measured in the given conditions. In order to simplify the formulation, we assume that there are no forward model errors and, therefore, $\varepsilon$ includes only the noise errors of the observations and is characterized by $\langle \varepsilon \rangle = 0$ and $\langle \varepsilon \varepsilon^T \rangle = S_{ny}$, where $\langle \cdot \rangle$ indicates the mean value and $S_{ny}$ is the CM of the noise errors of the observations. A formulation that takes into account forward model errors can be obtained by defining new observations corrected for the bias of the forward model errors and replacing $S_{ny}$ with the sum of $S_{ny}$ and the CM of the random part of the forward model errors.
The sensitivity of $\hat{x}$ to the true profile $x_t$ is described by the AKM $A = \partial \hat{x} / \partial x_t$, and the retrieval errors of $\hat{x}$ are described by the CM $S$, which is the sum of the CM of the noise errors $S_n$ and the CM of the smoothing errors $S_s$, the latter being due to the smoothing of the true profile caused by the averaging kernels. The AKM and the CMs are given by (see Equations (3.28)–(31) in [1]):
$$ A = (F + S_a^{-1})^{-1} F, \tag{2} $$
$$ S_n = (F + S_a^{-1})^{-1} F (F + S_a^{-1})^{-1}, \tag{3} $$
$$ S_s = (F + S_a^{-1})^{-1} S_a^{-1} (F + S_a^{-1})^{-1}, \tag{4} $$
$$ S = S_n + S_s = (F + S_a^{-1})^{-1}, \tag{5} $$
where
$$ F = K^T S_{ny}^{-1} K, \tag{6} $$
with $K$ being the Jacobian of the forward model $f(x)$ calculated at $\hat{x}$: $K = \left. \frac{\partial f(x)}{\partial x} \right|_{x = \hat{x}}$. The matrix $F$ is the Fisher information matrix [1,15], defined as
$$ F = \int P(y|x) \left( \frac{\partial \ln P(y|x)}{\partial x} \right) \left( \frac{\partial \ln P(y|x)}{\partial x} \right)^T dy, \tag{7} $$
where $P(y|x)$ is the conditional probability distribution of obtaining $y$ given $x$, which, considered as a function of $x$, is referred to as the likelihood function $L(x)$ [16]. In the case that the inverse problem can be solved without constraint ($S_a^{-1} = 0$), that is, when we can find the maximum likelihood solution, from Equations (3)–(5) we see that $F$ is equal to the inverse of the CM of the retrieval errors ($S$), which coincides with the CM of the noise errors ($S_n$). From this consideration, we can understand that the physical meaning of $F$ is that it quantifies the information provided by the observations $y$ about the retrieved vertical profile.
$F$ depends on the a priori information used in the retrieval through $K$ calculated at $\hat{x}$, which depends on $x_a$ and $S_a$. Therefore, the dependence of $F$ on the a priori information is due to the second order terms in the expansion of the forward model as a function of the profile $x$, and consequently, when the linear approximation of the forward model is valid, $F$ is independent of the a priori information.
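As a numerical illustration of Equations (2)–(6), the following minimal sketch builds the matrices for a toy linear forward model. The Jacobian `K`, the noise CM `S_ny`, the a priori CM and the sizes `n` and `m` are arbitrary values assumed here for illustration only; they do not come from any instrument. The sketch checks that the noise and smoothing CMs add up to the total CM and that, without constraint, the total CM reduces to the inverse of the Fisher information matrix.

```python
import numpy as np

def retrieval_matrices(K, S_ny, S_a_inv):
    """Return F, A, S_n, S_s and S following Equations (2)-(6)."""
    F = K.T @ np.linalg.inv(S_ny) @ K      # Fisher information matrix, Eq. (6)
    S = np.linalg.inv(F + S_a_inv)         # total retrieval CM, Eq. (5)
    A = S @ F                              # averaging kernel matrix, Eq. (2)
    S_n = S @ F @ S                        # CM of the noise errors, Eq. (3)
    S_s = S @ S_a_inv @ S                  # CM of the smoothing errors, Eq. (4)
    return F, A, S_n, S_s, S

rng = np.random.default_rng(0)
n, m = 4, 6                                # assumed profile and observation sizes
K = rng.normal(size=(m, n))                # assumed Jacobian (full column rank)
S_ny = 0.01 * np.eye(m)                    # assumed noise CM of the observations
S_a_inv = np.linalg.inv(4.0 * np.eye(n))   # inverse of an assumed a priori CM

F, A, S_n, S_s, S = retrieval_matrices(K, S_ny, S_a_inv)
sum_matches = np.allclose(S_n + S_s, S)

# Unconstrained case (S_a^{-1} = 0): S and S_n both reduce to F^{-1}, and A to I.
F0, A0, S_n0, S_s0, S0 = retrieval_matrices(K, S_ny, np.zeros((n, n)))
unconstrained_matches = np.allclose(S0, np.linalg.inv(F)) and np.allclose(S_n0, S0)
print(sum_matches, unconstrained_matches)
```

The unconstrained case is only possible here because the toy `K` has more rows than columns; for an ill-posed problem, $F$ would be singular and the constraint would be required.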

2.2. Linearization of the Transfer Function and Variables α

We can consider the whole measuring system, including both the observing system and the retrieval method, as an operation that transforms the true profile $x_t$ into the retrieved profile $\hat{x}$ and, accordingly, defines the retrieved profile $\hat{x}$ as a function of the true profile $x_t$. This function is referred to as the transfer function [1], and besides being a function of $x_t$, it is also a function of the noise errors $\varepsilon$ of the observations $y$. This dependence can be seen by recalling that $\hat{x}$ actually depends on $x_t$ through the observations $y$; therefore, using Equation (1) we can write $\hat{x} = \hat{x}(y) = \hat{x}(f(x_t) + \varepsilon)$, which we indicate as $\hat{x} = \hat{x}(x_t, \varepsilon)$. We note that $\frac{\partial \hat{x}(x_t, \varepsilon)}{\partial \varepsilon} = \frac{\partial \hat{x}(y)}{\partial y} \frac{\partial y}{\partial \varepsilon} = \frac{\partial \hat{x}(y)}{\partial y}$ because, from Equation (1), $\frac{\partial y}{\partial \varepsilon}$ is the identity matrix.
Expanding the transfer function to first order around the a priori profile $x_t = x_a$ and zero errors $\varepsilon = 0$, we obtain:
$$ \hat{x}(x_t, \varepsilon) \approx \hat{x}(x_a, 0) + \left. \frac{\partial \hat{x}(x_t, \varepsilon)}{\partial x_t} \right|_{x_t = x_a,\, \varepsilon = 0} (x_t - x_a) + \left. \frac{\partial \hat{x}(x_t, \varepsilon)}{\partial \varepsilon} \right|_{x_t = x_a,\, \varepsilon = 0} \varepsilon. \tag{8} $$
Concerning the first term of the expansion, we recall that the retrieved profile obtained with the optimal estimation method in the absence of errors is a weighted mean between the true profile and the a priori profile. Therefore, when the true profile coincides with the a priori profile, the retrieved profile in the absence of errors is the a priori profile, that is, $\hat{x}(x_a, 0) = x_a$. This result is peculiar to the optimal estimation method, and if we wish to extend the results of this article to retrieval methods other than optimal estimation, it is necessary to identify a linearization point for which we know the value assumed by the transfer function. This consideration also applies to the complete data fusion method [6,17] and to all the methods that are based on the expansion of the transfer function.
Under the approximation that the derivatives do not significantly depend on the point where they are calculated, we have $\left. \frac{\partial \hat{x}(x_t, \varepsilon)}{\partial x_t} \right|_{x_t = x_a,\, \varepsilon = 0} \approx \left. \frac{\partial \hat{x}(x_t, \varepsilon)}{\partial x_t} \right|_{x_t = \hat{x},\, \varepsilon = 0} = A$ and $\left. \frac{\partial \hat{x}(x_t, \varepsilon)}{\partial \varepsilon} \right|_{x_t = x_a,\, \varepsilon = 0} \approx \left. \frac{\partial \hat{x}(x_t, \varepsilon)}{\partial \varepsilon} \right|_{x_t = \hat{x},\, \varepsilon = 0} = \left. \frac{\partial \hat{x}(y)}{\partial y} \right|_{y = f(x_t)} = G$, where $G$ is the gain matrix and is given by
$$ G = (K^T S_{ny}^{-1} K + S_a^{-1})^{-1} K^T S_{ny}^{-1} = (F + S_a^{-1})^{-1} K^T S_{ny}^{-1}. \tag{9} $$
On the basis of these considerations, Equation (8) becomes
$$ \hat{x} = x_a + A (x_t - x_a) + G \varepsilon. \tag{10} $$
Following the approach described in the complete data fusion method [6,17], we define the vector $\alpha$:
$$ \alpha = \hat{x} - x_a + A x_a, \tag{11} $$
which can be calculated knowing the retrieved profile, the a priori profile and the AKM. Substituting $\hat{x}$ from Equation (10) into Equation (11), we see that $\alpha$ is equal to
$$ \alpha = A x_t + G \varepsilon \tag{12} $$
and provides a measurement of the true profile made using the rows of $A$ as weighting functions. Equation (12), together with Equations (2) and (9), shows that $\alpha$ (unlike $\hat{x}$) is, in the linear approximation of the forward model, independent of the a priori profile $x_a$; however, through the expressions of $A$ and $G$, it retains a dependence on the a priori CM $S_a$.
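The independence of $\alpha$ from the a priori profile can be checked numerically. The sketch below assumes a linear forward model $f(x) = Kx$, for which the optimal estimation solution is $\hat{x} = x_a + G(y - K x_a)$; the matrices, the true profile and the noise realization are arbitrary toy values, and `alpha_for_prior` is a hypothetical helper name.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 4, 6
K = rng.normal(size=(m, n))            # assumed Jacobian of a linear forward model
S_ny_inv = np.eye(m) / 0.01            # inverse of an assumed noise CM
S_a_inv = np.eye(n) / 4.0              # inverse of an assumed a priori CM
x_t = rng.normal(size=n)               # toy "true" profile
eps = 0.1 * rng.normal(size=m)         # one noise realization
y = K @ x_t + eps                      # Eq. (1) with f(x) = K x

F = K.T @ S_ny_inv @ K                 # Eq. (6)
S = np.linalg.inv(F + S_a_inv)         # Eq. (5)
A = S @ F                              # Eq. (2)
G = S @ K.T @ S_ny_inv                 # Eq. (9)

def alpha_for_prior(x_a):
    """Retrieve with a priori profile x_a, then apply Equation (11)."""
    x_hat = x_a + G @ (y - K @ x_a)    # linear optimal estimation solution
    return x_hat - x_a + A @ x_a

alpha_1 = alpha_for_prior(np.zeros(n))
alpha_2 = alpha_for_prior(rng.normal(size=n))

same_alpha = np.allclose(alpha_1, alpha_2)              # independent of x_a
matches_eq12 = np.allclose(alpha_1, A @ x_t + G @ eps)  # Eq. (12)
print(same_alpha, matches_eq12)
```

Changing `S_a_inv` would change `A` and `G` and therefore $\alpha$, which is the residual dependence on $S_a$ noted above.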

2.3. The New Variables β

We define the vector $\beta$ as
$$ \beta = S^{-1} \alpha = S^{-1} (\hat{x} - x_a + A x_a) \tag{13} $$
and using Equations (2), (5), (9) and (10), we obtain
$$ \beta = F x_t + \delta, \tag{14} $$
where the vector $\delta$ is given by
$$ \delta = K^T S_{ny}^{-1} \varepsilon. \tag{15} $$
Equation (14) provides the physical meaning of $\beta$, that is, the measurement of the true profile in which the weighting functions are the rows of $F$, and $\delta$ is the vector that includes the errors of this measurement. Furthermore, from Equations (14) and (15) we see that $\beta$, in the linear approximation of the forward model, is uniquely determined independently of both $x_a$ and $S_a$.
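The step from Equation (13) to Equation (14) can be made explicit. The following short derivation is a sketch that uses only Equations (2), (5), (9) and (12) and introduces no new symbols:

```latex
\begin{aligned}
S^{-1} A &= (F + S_a^{-1})\,(F + S_a^{-1})^{-1} F = F, \\
S^{-1} G &= (F + S_a^{-1})\,(F + S_a^{-1})^{-1} K^T S_{ny}^{-1} = K^T S_{ny}^{-1}, \\
\beta &= S^{-1} \alpha = S^{-1} \left( A x_t + G \varepsilon \right)
       = F x_t + K^T S_{ny}^{-1} \varepsilon = F x_t + \delta .
\end{aligned}
```

The factor $(F + S_a^{-1})$ coming from $S^{-1}$ cancels the only place where $S_a$ enters $A$ and $G$, which is why $\beta$ no longer depends on the a priori CM.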
Using Equation (14), we calculate the sensitivity of $\beta$ to the true profile, which is the AKM of $\beta$
$$ A_\beta = \frac{\partial \beta}{\partial x_t} = F \tag{16} $$
and from Equations (6), (14) and (15), we calculate the CM of $\beta$
$$ S_\beta = \langle (\beta - \langle \beta \rangle)(\beta - \langle \beta \rangle)^T \rangle = \langle \delta \delta^T \rangle = K^T S_{ny}^{-1} \langle \varepsilon \varepsilon^T \rangle S_{ny}^{-1} K = F. \tag{17} $$
Therefore, both the AKM and the CM of $\beta$ coincide with the Fisher information matrix $F$.
From Equation (13), we see that the dimensions of $\beta$ are the inverse of the dimensions of $\hat{x}$: $[\beta] = [\hat{x}]^{-1}$; therefore, $\beta$ does not represent a profile of the parameter that we aim to retrieve. However, as we noticed in the introduction, this is not a problem, because the objective of the retrieval products is no longer the graphical representation of the profile, but to efficiently provide all of the information of the observations to subsequent data analyses.
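A minimal numerical sketch of this property follows, again assuming a linear forward model $f(x) = Kx$ and arbitrary toy values. Two retrievals of the same observations are made with different a priori profiles and different a priori CMs; $\beta$ is then computed from the standard products of each retrieval with Equation (13). The helper names `standard_products` and `beta_from_products` are hypothetical.

```python
import numpy as np

def standard_products(K, S_ny_inv, y, x_a, S_a):
    """Linear optimal estimation: return x_hat, A and S for a given prior."""
    F = K.T @ S_ny_inv @ K
    S = np.linalg.inv(F + np.linalg.inv(S_a))
    x_hat = x_a + S @ K.T @ S_ny_inv @ (y - K @ x_a)
    return x_hat, S @ F, S

def beta_from_products(x_hat, A, S, x_a):
    """Equation (13): beta = S^{-1} (x_hat - x_a + A x_a)."""
    return np.linalg.solve(S, x_hat - x_a + A @ x_a)

rng = np.random.default_rng(2)
n, m = 4, 6
K = rng.normal(size=(m, n))              # assumed Jacobian
S_ny_inv = np.eye(m) / 0.01              # inverse of an assumed noise CM
x_t = rng.normal(size=n)                 # toy "true" profile
eps = 0.1 * rng.normal(size=m)           # one noise realization
y = K @ x_t + eps

# Two different priors: (a priori profile, a priori CM).
prior_1 = (np.zeros(n), 4.0 * np.eye(n))
prior_2 = (rng.normal(size=n), 0.5 * np.eye(n))

beta_1 = beta_from_products(*standard_products(K, S_ny_inv, y, *prior_1), prior_1[0])
beta_2 = beta_from_products(*standard_products(K, S_ny_inv, y, *prior_2), prior_2[0])

F = K.T @ S_ny_inv @ K
delta = K.T @ S_ny_inv @ eps             # Eq. (15)
same_beta = np.allclose(beta_1, beta_2)
matches_eq14 = np.allclose(beta_1, F @ x_t + delta)
print(same_beta, matches_eq14)
```

For a nonlinear forward model, $K$ is evaluated at $\hat{x}$ and the two values of $\beta$ would agree only to the extent that the linear approximation holds.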

3. Advantages of the Use of the Variables β

3.1. Representation of the Profile Using Any Constraint

Using Equations (2), (5) and (13), we can obtain $\beta$ from the retrieved profile $\hat{x}$
$$ \beta = (F + S_a^{-1}) \left[ \hat{x} - x_a + (F + S_a^{-1})^{-1} F x_a \right] = (F + S_a^{-1}) \left[ \hat{x} - x_a + (F + S_a^{-1})^{-1} (F + S_a^{-1} - S_a^{-1}) x_a \right] = (F + S_a^{-1}) \left[ \hat{x} - (F + S_a^{-1})^{-1} S_a^{-1} x_a \right] \tag{18} $$
and, left-multiplying both sides of this equation by $(F + S_a^{-1})^{-1}$, we can derive $\hat{x}$ from $\beta$:
$$ \hat{x} = (F + S_a^{-1})^{-1} (\beta + S_a^{-1} x_a). \tag{19} $$
Equation (19) can be used to recover the original retrieved profile using the a priori information $x_a$ and $S_a$ used in the retrieval procedure. However, since in the linear approximation of the forward model both $F$ and $\beta$ are independent of the a priori information, in this approximation Equation (19) can also be used to produce a profile with any a priori information we like.
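The use of Equation (19) with a new constraint can be sketched as follows. The example assumes a linear forward model and toy values; `retrieve` and `profile_from_beta` are hypothetical helper names. A retrieval is made with a first prior, $\beta$ is computed from its standard products, and the profile is then re-expressed with a second prior and compared with a direct retrieval that uses that second prior.

```python
import numpy as np

def retrieve(K, S_ny_inv, y, x_a, S_a_inv):
    """Linear optimal estimation: return x_hat, A and S."""
    F = K.T @ S_ny_inv @ K
    S = np.linalg.inv(F + S_a_inv)
    return x_a + S @ K.T @ S_ny_inv @ (y - K @ x_a), S @ F, S

def profile_from_beta(beta, F, x_a, S_a_inv):
    """Equation (19): x_hat = (F + S_a^{-1})^{-1} (beta + S_a^{-1} x_a)."""
    return np.linalg.solve(F + S_a_inv, beta + S_a_inv @ x_a)

rng = np.random.default_rng(3)
n, m = 4, 6
K = rng.normal(size=(m, n))                       # assumed Jacobian
S_ny_inv = np.eye(m) / 0.01                       # inverse of an assumed noise CM
y = K @ rng.normal(size=n) + 0.1 * rng.normal(size=m)
F = K.T @ S_ny_inv @ K

# Retrieval with the first prior, then beta from its standard products, Eq. (13).
x_a1, S_a1_inv = np.zeros(n), np.eye(n) / 4.0
x_hat1, A1, S1 = retrieve(K, S_ny_inv, y, x_a1, S_a1_inv)
beta = np.linalg.solve(S1, x_hat1 - x_a1 + A1 @ x_a1)

# Re-express the profile with a second, user-chosen prior.
x_a2, S_a2_inv = np.ones(n), np.eye(n) / 0.5
x_hat2_from_beta = profile_from_beta(beta, F, x_a2, S_a2_inv)
x_hat2_direct, _, _ = retrieve(K, S_ny_inv, y, x_a2, S_a2_inv)

# The original profile is recovered when the original prior is supplied.
x_hat1_recovered = profile_from_beta(beta, F, x_a1, S_a1_inv)
print(np.allclose(x_hat2_from_beta, x_hat2_direct),
      np.allclose(x_hat1_recovered, x_hat1))
```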

3.2. Data Fusion

If we suppose that we have $N$ independent measurements $\hat{x}_i$ of the same vertical profile $x_t$, obtained with the optimal estimation method and characterized by the AKMs $A_i$ and the CMs of the retrieval errors $S_i$, we can combine these measurements into a single vertical profile that includes the information of all of the $N$ measurements using the complete data fusion formula [17]
$$ x_f = \left( \sum_{i=1}^{N} S_i^{-1} A_i + S_a^{-1} \right)^{-1} \left( \sum_{i=1}^{N} S_i^{-1} \alpha_i + S_a^{-1} x_a \right), \tag{20} $$
where $x_a$ and $S_a$ are the a priori profile and CM used to constrain the fused profile $x_f$, and $\alpha_i$ are the vectors defined by Equation (11) for each measurement:
$$ \alpha_i = \hat{x}_i - x_{ai} + A_i x_{ai}, \tag{21} $$
where $x_{ai}$ is the a priori profile used in the retrieval of the $i$-th measurement. The use of Equation (20) is equivalent to performing the data fusion using the approach of the Kalman filter [1,18], as shown in [19].
Using Equations (2), (5) and (13), we can rewrite Equation (20) as
$$ x_f = \left( \sum_{i=1}^{N} F_i + S_a^{-1} \right)^{-1} \left( \sum_{i=1}^{N} \beta_i + S_a^{-1} x_a \right), \tag{22} $$
where $\beta_i$ and $F_i$ are the $\beta$ and $F$ quantities related to each one of the $N$ measurements. Equation (22) shows that the vectors $\beta_i$ and the Fisher information matrices $F_i$ are the only quantities needed to perform the data fusion of a set of measurements.
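The equivalence of Equations (20) and (22) can be illustrated with a small simulation. The sketch assumes three hypothetical instruments with linear forward models, each with its own Jacobian, number of observations and a priori information; all values are arbitrary. As an additional check that holds in this linear toy case, the fused profile is compared with a simultaneous retrieval that uses all observations at once.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 5
x_t = rng.normal(size=n)                       # toy "true" profile
x_a_f, S_a_f_inv = np.zeros(n), np.eye(n)      # assumed prior of the fused product

sum_SA, sum_Salpha = np.zeros((n, n)), np.zeros(n)   # sums in Eq. (20)
sum_F, sum_beta = np.zeros((n, n)), np.zeros(n)      # sums in Eq. (22)
sum_KTy = np.zeros(n)                                # for the simultaneous retrieval

for m_i in (3, 4, 6):                          # observations per assumed instrument
    K_i = rng.normal(size=(m_i, n))
    S_ny_inv = np.eye(m_i) / 0.01
    x_a_i = rng.normal(size=n)                 # instrument-specific a priori profile
    S_a_i_inv = np.eye(n) / rng.uniform(1.0, 5.0)
    y_i = K_i @ x_t + 0.1 * rng.normal(size=m_i)

    # Standard products of the individual retrieval.
    F_i = K_i.T @ S_ny_inv @ K_i
    S_i = np.linalg.inv(F_i + S_a_i_inv)
    A_i = S_i @ F_i
    x_hat_i = x_a_i + S_i @ K_i.T @ S_ny_inv @ (y_i - K_i @ x_a_i)

    alpha_i = x_hat_i - x_a_i + A_i @ x_a_i    # Eq. (21)
    S_i_inv = np.linalg.inv(S_i)
    beta_i = S_i_inv @ alpha_i                 # Eq. (13)

    sum_SA += S_i_inv @ A_i
    sum_Salpha += S_i_inv @ alpha_i
    sum_F += F_i
    sum_beta += beta_i
    sum_KTy += K_i.T @ S_ny_inv @ y_i

x_f_eq20 = np.linalg.solve(sum_SA + S_a_f_inv, sum_Salpha + S_a_f_inv @ x_a_f)
x_f_eq22 = np.linalg.solve(sum_F + S_a_f_inv, sum_beta + S_a_f_inv @ x_a_f)
x_simultaneous = np.linalg.solve(sum_F + S_a_f_inv, sum_KTy + S_a_f_inv @ x_a_f)

print(np.allclose(x_f_eq20, x_f_eq22), np.allclose(x_f_eq22, x_simultaneous))
```

With Equation (22), the fusion reduces to summing the stored $F_i$ and $\beta_i$ and solving one linear system; none of the individual priors is needed.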

3.3. Reduction in the Data Volume

In this subsection, we compare the data volume required by the standard retrieval products with that required by the new variables $\beta$. In the case of standard retrieval products, the quantities that have to be stored to allow for the complete use of the products in further processing of the data, such as data fusion or data assimilation, are $\hat{x}$, $A$, $S$ and $x_a$. $S_a$ is not necessary, because it can be obtained from $A$ and $S$ by means of
$$ S_a = (I - A)^{-1} S, \tag{23} $$
which is derived using Equations (2) and (5).
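The intermediate steps are not given in the text; one way to obtain Equation (23), using only Equations (2) and (5), is sketched below:

```latex
\begin{aligned}
A &= S F = S \left( S^{-1} - S_a^{-1} \right) = I - S S_a^{-1}, \\
S S_a^{-1} &= I - A
\;\Longrightarrow\; S_a^{-1} = S^{-1} (I - A)
\;\Longrightarrow\; S_a = (I - A)^{-1} S .
\end{aligned}
```

Since $I - A = S S_a^{-1}$ is a product of invertible matrices, the inverse in Equation (23) exists whenever $S$ and $S_a$ are invertible.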
If we suppose that the profile has $n$ components, then $A$ is composed of $n^2$ values and $S$ of $n(n+1)/2$ independent values (because it is a symmetric matrix). Therefore, in the case of standard retrieval products, we have to store $(3n^2 + 5n)/2$ values.
In the case of the new variables $\beta$, the quantities that have to be stored to allow for the complete use of the products are $\beta$ and $F$ (which is a symmetric matrix); therefore, the values that have to be stored are $(n^2 + 3n)/2$. If we wish to give more complete information by specifying where the Jacobian $K$ is calculated, we can also give $\hat{x}$, and the values that have to be stored are $(n^2 + 5n)/2$. In Table 1, we summarize the data volume of the quantities stored when using the standard retrieval products and the new variables.
Since the main storage requirement is due to the quadratic term, the use of the variables $\beta$ allows us to reduce the stored data to about one-third of its volume with respect to the use of the standard retrieval products.
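The counts above can be reproduced with a few lines of code. The function names and the grid sizes in the loop are illustrative choices, not values from the article.

```python
def standard_values(n):
    """x_hat (n) + A (n^2) + symmetric S (n(n+1)/2) + x_a (n)."""
    return n + n * n + n * (n + 1) // 2 + n

def new_values(n, store_x_hat=False):
    """beta (n) + symmetric F (n(n+1)/2), optionally plus x_hat (n)."""
    return n + n * (n + 1) // 2 + (n if store_x_hat else 0)

for n in (10, 50, 200):
    ratio = new_values(n) / standard_values(n)
    print(n, standard_values(n), new_values(n), round(ratio, 3))
```

The ratio approaches one-third from above as $n$ grows, because the linear terms become negligible with respect to the quadratic ones.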

4. Conclusions

With the increasing use of the atmospheric profiles retrieved from atmospheric satellite observations in data fusion operations, the requirement that these products provide a representation of the observed quantity is less important, and other features, such as completeness and compactness of the information, are becoming more relevant. In light of this, new retrieval variables have been proposed for the case in which the retrieval has been performed with the optimal estimation method and the first order approximation of the transfer function is appropriate. These variables, referred to as $\beta$, are the measurement of the true profile obtained using the rows of the Fisher information matrix as weighting functions. This measurement does not provide a representation of the profile, but has several useful properties: in the linear approximation of the forward model, it is independent of the a priori information used in the retrieval, and both the AKM and the CM of $\beta$ coincide with the Fisher information matrix. Furthermore, the variables $\beta$ can be used to obtain the representation of the vertical profile with a priori information selected by the user, and they can be directly used to perform the data fusion of a set of measurements performed with different instruments. For the exploitation of these products in the subsequent operations, it is sufficient to provide $\beta$ and the Fisher information matrix $F$, which fully characterizes the measurement, being both its AKM and its CM. Accordingly, the use of the variables $\beta$ allows us to reduce the stored data to about one-third of its volume with respect to the use of the standard products. These properties of the variables $\beta$ make them a perfect retrieval product when further processing is performed by the users and encourage the possibility of considering finer retrieval grids, possibly agreed upon by the scientific community rather than determined by instrumental considerations.
On the other hand, the standard products have the advantage of providing a graphical representation of the measured profiles. However, it is important to notice that the possibility of a graphical representation is obtained at the cost of a constraint on the adopted retrieval grid. The retrieval grid is usually limited in extension and density of points in order to avoid too large a bias from the a priori information, and different instruments freely use different retrieval grids, which complicates comparisons. A storage procedure that does not depend on the a priori information can use a retrieval grid shared with the other instruments and avoid these difficulties.
The communities of data providers and data users are invited to test and validate the efficiency of this new interface.

Funding

The results reported in the article were obtained in the context of the Earth-Moon-Mars (EMM) project, led by INAF in partnership with ASI and CNR, funded under the National Recovery and Resilience Plan (NRRP), Mission 4, Component 2, Investment 3.1: “Fund for the realisation of an integrated system of research and innovation infrastructures”-Action 3.1.1 funded by the European Union-NextGenerationEU.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

No new data were created or analyzed in this study. Data sharing is not applicable to this article.

Acknowledgments

The author is grateful to Bruno Carli for the useful discussions.

Conflicts of Interest

The author declares no conflicts of interest.

References

  1. Rodgers, C.D. Inverse Methods for Atmospheric Sounding: Theory and Practice; Series on Atmospheric, Oceanic and Planetary Physics; World Scientific: Singapore, 2000; Volume 2.
  2. Menke, W. Geophysical Data Analysis: Discrete Inverse Theory; Academic: San Diego, CA, USA, 1984.
  3. Twomey, S. Introduction to the Mathematics of Inversion in Remote Sensing and Indirect Measurements; Elsevier: New York, NY, USA, 1977.
  4. Doicu, A.; Trautmann, T.; Schreier, F. Numerical Regularization for Atmospheric Inverse Problems; Springer: Berlin/Heidelberg, Germany, 2010.
  5. Sivia, D.S.; Skilling, J. Data Analysis: A Bayesian Tutorial; Oxford University Press: Oxford, UK, 2006.
  6. Ceccherini, S.; Carli, B.; Raspollini, P. Equivalence of data fusion and simultaneous retrieval. Opt. Express 2015, 23, 8476–8488.
  7. Ceccherini, S.; Carli, B.; Cortesi, U.; Del Bianco, S.; Raspollini, P. Retrieval of the vertical column of an atmospheric constituent from data fusion of remote sensing measurements. J. Quant. Spectrosc. Radiat. Transf. 2010, 111, 507–514.
  8. Ceccherini, S.; Cortesi, U.; Del Bianco, S.; Raspollini, P.; Carli, B. IASI-METOP and MIPAS-ENVISAT data fusion. Atmos. Chem. Phys. 2010, 10, 4689–4698.
  9. Warner, J.X.; Yang, R.; Wei, Z.; Carminati, F.; Tangborn, A.; Sun, Z.; Lahoz, W.; Attié, J.-L.; El Amraoui, L.; Duncan, B. Global carbon monoxide products from combined AIRS, TES and MLS measurements on A-train satellites. Atmos. Chem. Phys. 2014, 14, 103–114.
  10. Cortesi, U.; Del Bianco, S.; Ceccherini, S.; Gai, M.; Dinelli, B.M.; Castelli, E.; Oelhaf, H.; Woiwode, W.; Höpfner, M.; Gerber, D. Synergy between middle infrared and millimeter-wave limb sounding of atmospheric temperature and minor constituents. Atmos. Meas. Tech. 2016, 9, 2267–2289.
  11. Schneider, M.; Ertl, B.; Tu, Q.; Diekmann, C.J.; Khosrawi, F.; Röhling, A.N.; Hase, F.; Dubravica, D.; García, O.E.; Sepúlveda, E.; et al. Synergetic use of IASI profile and TROPOMI total-column level 2 methane retrieval products. Atmos. Meas. Tech. 2022, 15, 4339–4371.
  12. Aires, F.; Aznay, O.; Prigent, C.; Paul, M.; Bernardo, F. Synergistic multi-wavelength remote sensing versus a posteriori combination of retrieved products: Application for the retrieval of atmospheric profiles using MetOp-A. J. Geophys. Res. Atmos. 2012, 117, D18.
  13. Ridolfi, M.; Tirelli, C.; Ceccherini, S.; Belotti, C.; Cortesi, U.; Palchetti, L. Synergistic retrieval and complete data fusion methods applied to simulated FORUM and IASI-NG measurements. Atmos. Meas. Tech. 2022, 15, 6723–6737.
  14. Ceccherini, S.; Carli, B.; Raspollini, P. Vertical grid of retrieved atmospheric profiles. J. Quant. Spectrosc. Radiat. Transf. 2016, 174, 7–13.
  15. Ceccherini, S.; Carli, B.; Raspollini, P. Quality quantifier of indirect measurements. Opt. Express 2012, 20, 5151–5167.
  16. Fisher, R.A. On the mathematical foundation of theoretical statistics. Philos. Trans. R. Soc. Lond. 1922, A222, 309.
  17. Ceccherini, S.; Zoppetti, N.; Carli, B. An improved formula for the complete data fusion. Atmos. Meas. Tech. 2022, 15, 7039–7048.
  18. Kalman, R.E. A New Approach to Linear Filtering and Prediction Problems. J. Basic Eng. ASME 1960, 82, 35–45.
  19. Ceccherini, S. Comment on “Synergetic use of IASI profile and TROPOMI total-column level 2 methane retrieval products” by Schneider et al. (2022). Atmos. Meas. Tech. 2022, 15, 4407–4410.
Table 1. Data volume of the quantities stored when using the standard retrieval products and the new variables.

| Standard products: quantity | Number of values | New variables: quantity | Number of values |
|---|---|---|---|
| $\hat{x}$ | $n$ | $\beta$ | $n$ |
| $A$ | $n^2$ | $F$ | $n(n+1)/2$ |
| $S$ | $n(n+1)/2$ | | |
| $x_a$ | $n$ | | |
| Total number of values | $(3n^2 + 5n)/2$ | Total number of values | $(n^2 + 3n)/2$ |