Py4CAtS—PYthon for Computational ATmospheric Spectroscopy

Radiation is a key process in the atmosphere. Numerous radiative transfer codes have been developed spanning a large range of wavelengths, complexities, speeds, and accuracies. In the infrared and microwave, line-by-line codes are crucial esp. for modeling and analyzing high-resolution spectroscopic observations. Here we present Py4CAtS—PYthon scripts for Computational ATmospheric Spectroscopy, a Python re-implemen-tation of the Fortran Generic Atmospheric Radiation Line-by-line Code GARLIC, where computationally-intensive code sections use the Numeric/Scientific Python modules for highly optimized array processing. The individual steps of an infrared or microwave radiative transfer computation are implemented in separate scripts (and corresponding functions) to extract lines of relevant molecules in the spectral range of interest, to compute line-by-line cross sections for given pressure(s) and temperature(s), to combine cross sections to absorption coefficients and optical depths, and to integrate along the line-of-sight to transmission and radiance/intensity. Py4CAtS can be used in three ways: in the (Unix/Windows/Mac) console/terminal, inside the (I)Python interpreter, or Jupyter notebook. The basic design of the package, numerical and computational aspects relevant for optimization, and a sketch of the typical workflow are presented. In conclusion, Py4CAtS provides a versatile environment for “interactive” (and batch) line-by-line radiative transfer modeling.


Introduction
Radiative transfer is an important aspect in various branches of physics, esp. in the atmospheric sciences, both for Earth and (exo-)planets [1][2][3].Many radiative transfer models have been developed in recent decades (see [4], for an early review and https://en.wikipedia.org/wiki/Atmospheric_radiative_transfer_codes for an up-to-date listing (All links in this paper were checked and active 4 April 2019.)),ranging from simple approximations to complex, often highly optimized codes and usually tailored to a specific spectral domain.
An essential prerequisite for the analysis of data recorded by atmospheric remote sensing instruments as well as for theoretical investigations such as retrieval assessments is a flexible, yet efficient and reliable radiative transfer model (RTM).Furthermore, as the retrieval of atmospheric parameters is in general a nonlinear optimization problem (inverse problem), the retrieval code must be closely connected to the radiative transfer code (forward model).
Radiative transfer depends on the state of the atmosphere (pressure, temperature, composition), the optical properties of the atmospheric constituents (molecules and particles), geometry, and spectral range.Because of the numerous parameters the setup of a radiative transfer calculation can be cumbersome and error prone, and several interfaces have been developed to ease the application.MODO ( [5], see also http://www.rese.ch/products/modo/) is a "graphical front-end" to the MODTRAN [6] radiative transfer code, and a Python interface to MODTRAN's predecessor, the low-resolution band model LOWTRAN, is available at https://pypi.org/project/lowtran/.Wilson [7], see also http://www.py6s.rtwilson.com/,has developed a Python interface to the widely used 6S (Second Simulation of the Satellite Signal in the Solar Spectrum, [8]) model.
In the infrared and microwave region, line-by-line (lbl) modeling of molecular absorption is mandatory esp.for high spectral resolution.Furthermore, lbl models are required for the setup and verification of fast parameterized models, e.g., band, k-distribution, or exponential sum fitting models [6,9,10].Moreover, lbl models are ideal for radiative transfer in planetary atmospheres, where fast models often parameterized for Earth-like conditions are not suitable [11,12].
Several web sites offer an Internet access to lbl databases and/or lbl models: "HITRANonline" [33] at http://hitran.org/ allows the downloading of line and other data from the most recent HITRAN version.With the "Information System HITRAN on the Web (HotW)" at http://hitran.tsu.ru(see also http://spectra.iao.ru/)several types of spectra can be simulated using HITRAN data.The HITRAN Application Programming Interface (HAPI) [34] is "a set of routines in Python which aims to provide remote access to functionality and data given by the HITRANonline".A similar service (visualization etc.) for the GEISA database is offered at https://cds-espri.ipsl.upmc.fr/geisa/.The "wavelength search engine" SpectraPlot ( [35], see also http://www.spectraplot.com/) is a "web-based application for simulating spectra of atomic and molecular gases" employing, among others, HITRAN and HITEMP.http://www.spectralcalc.com/provides a web interface to the widely used LinePak [29] model (with extended functionality for subscribed users).Goddard's Planetary Spectrum Generator (PSG, https://psg.gsfc.nasa.gov/,Villanueva et al. [36]) can be used to generate high-resolution spectra of planetary bodies (e.g., planets, moons, comets, exoplanets).Knowledge of the atmospheric transmission is also important for ground-based astronomy observations, and some tools for the correction of telluric absorption are available, e.g., MolecFit ( [37], see also https://www.uibk.ac.at/eso/software/molecfit.html.en)and TAPAS (Transmissions Atmosphériques Personnalisées Pour l'AStronomie, [38], online at http://cds-espri.ipsl.fr/tapas/)(both codes use LblRTM).An early attempt to a "virtual lab" providing a web interface to MIRART (the Fortran 77 predecessor of GARLIC, [39]) and FASCODE is described in Ernst et al. [40].
In general, RTMs incl.lbl models work as a kind of "black-box".Given a set of specifications on spectral range, atmospheric conditions, geometry etc. compiled in an input file (or configuration file etc., e.g., the TAPE5 used by LOWTRAN, MODTRAN, and FASCODE), the program is executed reading this input file (and probably some further auxiliary files, e.g., the lbl database) and one or several spectra (transmission, radiance, . . . ) are returned (e.g., saved in file(s)).This approach is clearly advantageous when numerous transmission and/or radiance spectra must be generated, e.g., for the operational analysis of thousands or millions of observed spectra.However, to better understand the physics of radiative transfer, inspection of intermediate variables can provide useful insight.For example, the selection of an appropriate spectral range or measurement geometry is crucial for the design and setup of a new measurement system, and analysis and visualization of line parameters, absorption cross sections and coefficients, layer optical depths, etc. can be useful.Methods and tools have been developed for the "automated" definition of "microwindows", i.e., spectral intervals optimized for retrieval of temperature or molecular concentrations from thermal emission infrared observations (e.g., [41][42][43][44][45][46]). Despite these and similar efforts interactive and flexible RTMs with an easy access to all kind of intermediate variables ("step-by-step", similar to a debugger) appear to be useful, also for pedagogical purposes.
Originally, the development of Py4CAtS was motivated by the idea of computational steering [56], i.e., numerically time-consuming tasks such as the lbl evaluation of molecular absorption cross sections are performed by a code (e.g., GARLIC subroutines) written in compiled languages such as Fortran (or C, C++, . . . ) that is made accessible from Python (or another scripting language) using a wrapper such as PyFort [57] or F2Py [58].More precisely, the intention was to call subroutines of GARLIC from Python.Unfortunately porting the wrapper code from one machine to another turned out to be difficult and time-consuming with early versions of PyFort.However, thanks to the increased performance of NumPy a Python implementation of lbl cross sections became feasible, i.e., the interface to GARLIC's subroutines became less important.Accordingly, the further development of GARLIC and Py4CAtS became largely independent, and Py4CAtS is now a full lbl radiative transfer tool kit delivering absorption cross sections and coefficients, optical depths, transmissions, weighting functions, and radiances.
The paper is organized as follows: After a brief review of the basic facts of lbl infrared and microwave (for brevity simply IR in the following) radiative transfer in the following section some numerical and implementation aspects are discussed in Section 2.2.Usage of the code as well as some implementation details are presented in Section 3. The paper continues with a discussion of several aspects in Section 4 before the final conclusion in Section 5. Implementation aspects are described in the Appendix A.

Molecular Absorption and Radiative Transfer
Radiative transfer in the IR is essentially described by the Schwarzschild equation and Beer-Lambert law (assuming negligible scattering and local thermal equilibrium in an inhomogeneous atmosphere [1][2][3]).Given pressure p and temperature T and line parameters from HITRAN, GEISA, etc. the first step of a simulation is to compute the absorption cross section (xs) for a particular molecule m k m (ν, p, T) = ∑ l S ml (T) g(ν; νml , γ ml (p, T)) . ( where ν is the wavenumber and the sum comprises all relevant lines characterized by position νml , strength S ml , and broadening parameter γ ml .The line strength S(T) at temperature T is obtained as (note that indices for lines and molecules have been omitted) with reference strength S(T 0 ) and lower state energy E i from the database; Q(T) is the product of rotational and vibrational partition functions (for details see [25], subsection 3.4 or [59]) and c, h, and k B are speed of light, the Planck constant, and the Boltzmann constant, respectively.The combined effect of pressure broadening (corresponding to a Lorentzian profile g L ) and Doppler broadening (corresponding to a Gaussian profile g G ) is described by a convolution resulting in the Voigt line profile [60] The Lorentz and Gauss half widths γ (at half maximum, HWHM) are pressure-and temperaturedependent, where the air-broadening coefficient γ 0 L and exponent n are given in the database for reference pressure p 0 and temperature T 0 , and µ is the molecular mass.Please note that contributions to the Lorentzian width (4) due to self-broadening have been ignored: Except for water at Earth's bottom of atmosphere (BoA), molecular mixing ratios are usually very small, and the self-broadening contribution would be small to negligible.Moreover, the self-broadening coefficient is often close to the foreign broadening coefficient (or even undefined, at least for older versions of HITRAN and GEISA).Clearly, for planetary atmospheres (e.g., Mars or Venus) self-broadening can be an important issue.
The increasing quality of spectroscopic observations has indicated deficiencies of the Voigt profile, and several more advanced profiles have been suggested [61,62].In addition to the Lorentz, Gauss, and Voigt profile, implementations of the Rautian and speed-dependent Voigt profile [63] are available.
The absorption coefficient (ac) is defined as the sum of all molecular cross sections scaled by the molecular number densities n m (in general depending on altitude z and the corresponding pressure) Integration of the absorption coefficient along the line-of-sight gives the optical depth (od), i.e., for a plane-parallel atmosphere and a vertical path from altitude z lo to z hi This is closely related to the monochromatic transmission describing the ratio of incoming and outgoing radiation Finally, the integral form of Schwarzschild's equation gives the (spectral) radiance or intensity (ri) where the first term describes an attenuated background contribution I b at a distance s b from the observer (corresponding to the optical depth τ b ), e.g., from Earth's (or a planet's) surface in case of nadir-viewing.The source function in the second term is Planck's function depending on temperature, The finite spectral resolution of measured spectra can be modeled by convolution of the monochromatic transmission (8) and/or radiance (9) with an appropriate spectral response function (SRF, a.k.a.instrument line shape, ILS), e.g., Weighting functions (wf ) are an important concept for atmospheric temperature sounding and are a measure of the contribution of a particular atmospheric layer to the radiation seen by an observer, see Equation (10).They are defined by ∂T ∂z for a vertical path, or more generally for a slant path with s = z/ cos θ and zenith angle θ (with θ = 0 and θ = 180 • for vertical uplooking and downlooking, respectively).

Lbl Molecular Absorption Cross Sections: Multigrid Voigt Function
For the computation of the Voigt profile g V (3) depending on three variables it is convenient to introduce the Voigt function K(x, y) depending on two variables, essentially the distance x to the center peak and the ratio y of the line widths As there is no closed-form solution for this convolution integral, numerical approximations are mandatory.Most modern algorithms consider the complex error function [64,65] whose real part is identical to the Voigt function Rational approximations, closely related to continued fractions and defined as the quotient of two polynomials [66,67], have been used successfully for a large variety of functions including the complex error function (e.g., [68][69][70]).Py4CAtS (like GARLIC) uses an optimized combination [71] of the Humlíček [69] and Weideman [70] algorithms where Z = L+iz L−iz and L = 2 −1/4 N 1/2 .For N = 24 this provides an accuracy better than 10 −4 everywhere except for very small y and intermediate x that are hardly relevant in practice; for N = 32 the relative error |∆K|/K is less than 8 × 10 −5 for all x, y of interest.
Because of the huge number of Voigt function evaluations even highly optimized complex error function algorithms are insufficient and further optimizations are mandatory.Py4CAtS evaluates the monochromatic cross sections on a uniform (equidistant) wavenumber grid with the spacing estimated as a fraction of the "typical" line widths, i.e., δν = γ /n with the mean half width γ and a default sampling rate n = 4.In case of the Voigt line profile (3), its width is estimated using an approximation due to Whiting [72] (accurate to about one percent) Please note that δν and hence the number of grid points and cross section values are dependent on pressure, temperature, and molecular mass (via Equations ( 4), (5), and ( 18)), and change with atmospheric level and molecule.These wavenumber steps are clearly appropriate in the line center region; however, in the line wings the profiles are relatively smooth (roughly g(ν − ν) ≈ 1/(ν − ν) 2  for the Lorentz or Voigt profile) and a coarser grid would be sufficient.
Py4CAtS uses a two-or three-grid approach [73].In the two-grid scheme, the Voigt line profile is calculated on the fine grid only in the line center region, and on a coarse grid (typically with spacing ∆ν = 8δν) in the entire spectral region.The line profiles are accumulated independently to the fine and coarse grid cross sections, i.e., the interpolation of the coarse grid cross section is done only once after all lines have been summed up.This allows a speed-up roughly in the order of the ratio of fine and coarse grid points.A significant further speed-up can be achieved with three grids of fine, medium, and coarse resolution and exploitation of the asymptotic properties of the profile function.

Integration of the Beer and Schwarzschild Equations
From the very beginning of the code development (MIRART, GARLIC, and Py4CAtS) numerical schemes have been used whenever possible.Hence, in contrast to most lbl radiative transfer codes dividing the atmosphere into a series of layers with constant pressure, temperature, and concentrations (Curtis-Godson approximation), Py4CAtS and GARLIC are level-based and exploit numerical quadrature for Equations (7) and (9).In particular, the optical depth is approximated by the sum of absorption coefficients according to the trapezoid rule (e.g., [66]).For the Schwarzschild Equation (9) both codes use optical depth τ as integration variable and assume that the Planck function varies linearly or exponentially with optical depth within a layer, for τ l ≤ τ ≤ τ l+1 (corresponding to the altitude interval [z l , z l+1 ]).Note the slightly different notation for the Planck function's argument here compared to Equation (11) (the optical depth depends on wavenumber and temperature).With both approximations the integral (9) can be computed analytically within a layer.
Because the overall runtime is essentially determined by the evaluation of the cross sections, the speed of level-based and layer-based radiative transfer is equivalent (assuming that the number of levels and layers is comparable).With respect to accuracy, none of the two approaches is superior or inferior; in fact, neither of the intercomparison studies [74][75][76] identified path integration as an issue.For an in-depth discussion of the "linear-in-τ" vs. "exponential-in-τ" approximations as well as some further quadrature schemes see subsection 4.1 of the GARLIC paper [25].

The Package: Usage and Implementation
In Py4CAtS the individual steps of an IR radiative transfer computation are implemented in a series of modules and functions, see Figure 1: All Python source files along with supplementary files are delivered as a single tarball.Unpacking this file will create four directories with the source files in src, data in data, documentation in doc, and executable scripts in bin.Actually the files in the bin directory are just links to the scripts in the src directory (e.g.bin/lbl2xs is a link to src/lbl2xs.py), and in the following subsection it is assumed that the bin directory is included in the shell's search path for executables (PATH environment variable for Unix like systems).When Py4CAtS is used inside the IPython interpreter (see Subsection 3.2), all functions etc. should have been imported using a statement such as %run /home/schreier/py4cats/src/py4cats.py.Importing Py4CAtS can be easily done automatically by an appropriate modification of the (I)Python configuration file, e.g. by adding this file to the list of files to run at IPython startup 3

Running Py4CAtS in the Classical Console
A typical session in a Unix/Linux environment (console/terminal) is shown in Fig. 2.After creating a new directory the first step is to extract the relevant lines from the HITRAN or GEISA data base 4 using higstract.Note that for this command at least one option is mandatory, i.e. the spectral range (here 50 to 60 cm −1 ) or a molecule has to be specified (for convenience, higstract -m main ... can be used to extract line data for the first seven molecules (H 2 O, CO 2 , . . ., O 2 ) of HITRAN or GEISA.) higstract saves the data in several tabular ASCII files, one file per molecule.By default, each file has five columns for line position ν, strengths S, energy E, air-broadening parameter γ 0 L , and the exponent n and is named as H2O.vSEan etc., where the extension indicates the columns (further parameters, e.g. for self-broadening, can be requested with the format option -f).
The next step is to compute absorption cross sections according to Eq. ( 1).Without any option, cross sections are computed for the reference pressure and temperature used in the line parameter database, i.e. p 0 = 1013.25 mb and T 0 = 296 K for HITRAN and GEISA.Alternatively, pressure(s) and temperature(s) can be specified with the -p and -T option.For atmospheric applications it is more All Python source files along with supplementary files are delivered as a single tarball.Unpacking this file will create four directories with the source files in src, data in data, documentation in doc, and executable scripts in bin.Actually, the files in the bin directory are just links to the scripts in the src directory (e.g., bin/lbl2xs is a link to src/lbl2xs.py), and in the following subsection it is assumed that the bin directory is included in the shell's search path for executables (PATH environment variable for Unix-like systems).When Py4CAtS is used inside the IPython interpreter (see Section 3.2), all functions etc. should have been imported using a statement such as %run /home/schreier/py4cats/src/py4cats.py.Importing Py4CAtS can be easily done automatically by an appropriate modification of the (I)Python configuration file, e.g., by adding this file to the list of files to run at IPython startup.(In IPython 6 this list is defined by c.InteractiveShellApp.exec_files in .ipython/profile_default/iypthon_config.py.)

Running Py4CAtS in the Classical Console
A typical session in a Unix/Linux environment (console/terminal) is shown in Figure 2.After creating a new directory, the first step is to extract the relevant lines from the HITRAN or GEISA data base using higstract.(Data are available at hitran.org and http://cds-espri.ipsl.upmc.fr/.Py4CAtS can read the original data file as is, i.e., the 160-byte fixed-width format used since HITRAN 2004 or the 100-byte format of the older HITRAN versions.Regarding GEISA, the format of a single record (corresponding to a "physical line") has changed slightly from version to version, and Py4CAtS parses the file name to select the appropriate reader.)Please note that for this command at least one option is mandatory, i.e., the spectral range (here 50 to 60 cm −1 ) or a molecule must be specified (for convenience, higstract -m main ... can be used to extract line data for the first seven molecules (H 2 O, CO 2 , . . ., O 2 ) of HITRAN or GEISA.)higstract saves the data in several tabular ASCII files, one file per molecule.By default, each file has five columns for line position ν, strengths S, energy E, air-broadening parameter γ 0 L , and the exponent n and is named as H2O.vSEan etc., where the extension indicates the columns (further parameters, e.g., for self-broadening, can be requested with the format option -f).column list altitudes, pressure, and temperature).Again, the cross sections will be saved in files, one per molecule, using NumPy's pickle format or alternatively ASCII tabular or HITRAN formatted files.
The computation continues to absorption coefficients by summing all cross sections (level-by-level) scaled with the molecular concentrations, cf.Eq. ( 6), and integrating these to the layer (or delta) optical depth, Eq. ( 7).Finally, the radiance seen by an observer at ground and by a spaceborne observer (actually an observer at top-of-atmosphere, ToA) is computed.
In each step of the calculation the results are written to file(s), and read in for the next step.Obviously this can be quite time-consuming esp.for ASCII files (reading NumPy pickled files is surprisingly fast).In particular, the cross section files can be very large (depending on the spectral region, the size of the spectral interval, the number of molecules and atmospheric levels, etc.).For many applications some of the intermediate quantities are not of interest, and the file transfer operations can be omitted, e.g.(see the dashed arrows in Fig. 2) • lbl2ac compute lbl cross sections and combine to absorption coefficients; • lbl2od compute lbl cross sections and absorption coefficients, then integrate to optical depth.
• xs2od multiply cross sections with densities, sum over molecules, and integrate to optical depth.
In addition to these scripts there are several further Python files that contain functions used by these "main" scripts (e.g. to read or write the data), but can also be executed stand-alone. 5lines.py,xSection.py,absCo.py,oDepth.pycan be used to read, extract subsets in the spectral or altitude regime, summarize, plot, or reformat the respective data.molecules.pycollects properties (e.g.mass) of the IR relevant molecules and also the ID numbers used in the HITRAN and GEISA database.atmos1D.py is useful to handle the atmospheric data (p, T, VMRs, . . .).The functions in hitran.pyand geisa.pyare used by higstract.py,but can be used as a "low-level" interface to the respective dataset.
Unit conversions are implemented in the cgsUnit.pymodule (note that internally Py4CAtS uses cgs units consistently, see Appendix A.6). Furthermore, radiance2radiance.py and radiance2Kelvin.pyallow the conversion of radiance spectra, e.g.wavenumber ←→ wavelength or to equivalent brightness temperature using the inverse of Planck's function (11).
The lbl2ac, lbl2od, or xs2od scripts clearly help to avoid the time-consuming reading and writing of some of the intermediate quantities, however, these scripts turn Py4CAtS back into the "black-box" radiative transfer discussed above.Output is not shown.All functions support the -h option to ask for help, in particular a list of all available options/flags.For some notes on option parsing see Appendix A.7.
The next step is to compute absorption cross sections according to Equation (1).Without any option, cross sections are computed for the reference pressure and temperature used in the line parameter database, i.e., p 0 = 1013.25 mb and T 0 = 296 K for HITRAN and GEISA.Alternatively, pressure(s) and temperature(s) can be specified with the -p and -T option.For atmospheric applications it is more convenient to read the data from a file (e.g., "midlatitude summer" (mls) data assuming the first three column list altitudes, pressure, and temperature).Again, the cross sections will be saved in files, one per molecule, using NumPy's pickle format or alternatively ASCII tabular or HITRAN formatted files.
The computation continues to absorption coefficients by summing all cross sections (level-by-level) scaled with the molecular concentrations, cf.Equation (6), and integrating these to the layer (or delta) optical depth, Equation (7).Finally, the radiance seen by an observer at ground and by a spaceborne observer (actually an observer at top of atmosphere, ToA) is computed.
In each step of the calculation the results are written to file(s), and read in for the next step.Obviously, this can be quite time-consuming esp.for ASCII files (reading NumPy pickled files is surprisingly fast).In particular, the cross section files can be very large (depending on the spectral region, the size of the spectral interval, the number of molecules, and atmospheric levels, etc.).For many applications some of the intermediate quantities are not of interest, and the file transfer operations can be omitted, e.g., (see the dashed arrows in Figure 2) • lbl2ac compute lbl cross sections and combine to absorption coefficients; • lbl2od compute lbl cross sections and absorption coefficients, then integrate to optical depth.

•
xs2od multiply cross sections with densities, sum over molecules, and integrate to optical depth.
In addition to these scripts there are several further Python files that contain functions used by these "main" scripts (e.g., to read or write the data), but can also be executed stand-alone.(Technically, the very first line of all executable scripts is something like #!/usr/bin/env python and the last segment of the module starts with if __name__=='__main__':.)The scripts lines.py,xSection.py,absCo.py,oDepth.pycan be used to read, extract subsets in the spectral or altitude regime, summarize, plot, or reformat the respective data.molecules.pycollects properties (e.g., mass) of the IR relevant molecules and the ID numbers used in the HITRAN and GEISA database.atmos1D.py is useful to handle the atmospheric data (p, T, VMRs, . . .).The functions in hitran.pyand geisa.pyare used by higstract.py,but can be used as a "low-level" interface to the respective dataset.Unit conversions are implemented in the cgsUnit.pymodule (note that internally Py4CAtS uses cgs units consistently, see Appendix A.6). Furthermore, radiance2radiance.py and radiance2Kelvin.pyallow the conversion of radiance spectra, e.g., wavenumber ←→ wavelength or to equivalent brightness temperature using the inverse of Planck's function (11).
The lbl2ac, lbl2od, or xs2od scripts clearly help to avoid the time-consuming reading and writing of some of the intermediate quantities, however, these scripts turn Py4CAtS back into the "black-box" radiative transfer discussed above.
When executing the series of individual steps, the subsequent steps must re-read the data saved in the previous step(s) from file(s) and check the consistency of the data.For example, the xs2ac.pyscript will test if all cross sections cover the same spectral interval, and if the pressures and temperatures match those in the atmospheric data file.Accordingly, a substantial part of the code is devoted to implement appropriate tests.A prerequisite for these tests is that all relevant attributes are saved in the file along with the spectra; for example, a single cross section is saved along with its pressure, temperature, and wavenumber grid specification.

Py4CAtS Used within the (I)Python and Jupyter Shell
Access to intermediate quantities is especially useful for visualization.In the (I)Python interpreter, graphics packages such as Matplotlib [77] allow the plotting of any kind of array.For plotting as well as for the subsequent processing steps it is important to have all attributes of a spectrum available.However, a function returning an array for the spectrum together with its attributes (e.g., with a call such as lines_h2o, pRef, tRef = higstract('/data/hitran/86/lines', 'H2O')) would be cumbersome, and collecting the return values in a dictionary would not significantly improve this.In Py4CAtS this problem is solved by means of subclassed and/or structured NumPy arrays that will be briefly described (in Appendices A.4 and A.5) after a presentation of the typical workflow illustrated in Figure 3.

Atmospheric Data
Knowledge of the atmospheric state, i.e., pressure, temperature, and molecular composition as a function of altitude, is indispensable for lbl modeling and radiative transfer.The Py4CAtS scripts called from the console automatically read these data from a file (along with the line data, cross section data, etc.)This is essentially accomplished by the function atmRead of the atmos1D.pymodule, e.g., the command mls = atmRead('/data/atmos/20/mls.xy')returns the data of the "midlatitude summer" atmosphere of the AFGL dataset [78] regridded to 20 levels (see the very first input labelled In [1]: in Figure 3).atmRead essentially expects a tabular ascii file with rows corresponding to the atmospheric levels and columns for altitudes, pressure (and/or air density), temperature, and molecular concentrations.Two comment lines are mandatory: #what: followed by the column identifiers and #units: followed by the physical units (e.g., "km").
mls is an example of a structured array (see Appendix A.5), where the individual profiles can be accessed by their name, e.g., mls['T'] or mls['H2O'], and the data for a specific level l are given by the row index, e.g., mls[0] or mls[-1] for the bottom and top level, respectively.Please note that "rows" and "columns" can be specified in two ways; for example, the BoA temperature is mls When executing the series of individual steps, the subsequent steps have to re-read the data saved in the previous step(s) from file(s) and check the consistency of the data.For example, the xs2ac.pyscript will test if all cross sections cover the same spectral interval, and if the pressures and temperatures match those in the atmospheric data file.Accordingly, a substantial part of the code is devoted to implement appropriate tests.A prerequisite for these tests is that all relevant attributes are saved in the file along with the spectra; for example, a single cross section is saved along with its pressure, temperature, and wavenumber grid specification.

Py4CAtS used within the (I)Python and Jupyter Shell
Access to intermediate quantities is especially useful for visualization.In the (I)Python interpreter, graphics packages such as Matplotlib [77] allow to plot any kind of array.For plotting as well as for the subsequent processing steps it is important to have all attributes of a spectrum available.
However, a function returning an array for the spectrum together with its attributes (e.g. with a call such as lines_h2o, pRef, tRef = higstract('/data/hitran/86/lines', 'H2O')) would be cumbersome, and collecting the return values in a dictionary would not significantly improve this.In Py4CAtS this problem is solved by means of subclassed and/or structured NumPy arrays that will be All data are stored internally in cgs units, e.g., pressure is converted to dyn/cm 2 = g/cm/s 2 .Molecular concentrations are stored as number densities (i.e., if the data file contains volume mixing ratios (VMR), these are converted to densities by multiplication with the air number density n = p/(kT)).VMR can be obtained with the vmr function, e.g., vmr(mls).Further convenience functions of the atmos1D.pymodule include vcd to integrate the (molecular and air) number densities along a vertical path through the atmosphere to the "Vertical Column Density" (N = n(z) dz with default limits given by top and bottom of atmosphere, z ToA and z BoA ), cmr for the "Column Mixing Ratio", i.e., the ratio of the molecular VCDs and the air VCD.
The atmRegrid (mls, zNew, ...) function allows interpolation of the atmospheric data to a new altitude grid.If the limits of the new grid exceed the old limits, atmRegrid prints a warning.The actual results are depending on the chosen interpolation scheme; in case of the default linear interpolation NumPy's interp function simply repeats the very first or last value at BoA or ToA, respectively.
Data from different files can be combined with the atmMerge function, e.g., combiAtm = atmMerge (mlw, traceGases) (here the second data set "traceGases" is likely comprising only concentration profiles that can be read with the vmrRead function.)If the two data sets are given on different altitude grids, profiles from the second set are interpolated to the grid of the first set.If a profile is defined in both data, then by default the second is ignored unless the flag replace is "switched on".
The data can be easily visualized using the standard Matplotlib [77] functions (e.g., plot(mls['T'],mls['z'])), but for convenience Py4CAtS provides a function atmPlot that expects a single structured array or a list thereof as first argument.(For some general remarks on visualization see Appendix A.2.) For example, atmPlot(mls) shows temperature vs. altitude, and atmPlot([mls,mlw], 'O3', 'mb') compares ozone profiles of the midlatitude summer and winter atmospheres with pressure as vertical axis.Finally, the function atmSave can be used to write the data to file (see also Appendix A.1).
Similar to the atmospheric data, the line data can be plotted using Matplotlib's functions, or, more conveniently, with Py4CAtS' function, e.g., atlas(llDict).The atlas function (named after Park et al. [82]) can be used with a single lineArray or a list or dictionary of lineArrays and displays line strength vs. position by default.

Cross Sections
In the next step the line parameters can be used to compute molecular cross sections, see the In [3]: block in Figure 3. xs = lbl2xs(llDict['CO']) returns the cross section of CO in the spectral range around 2140 cm −1 for the database (here HITRAN) reference pressure and temperature.A single cross section is stored in a subclassed NumPy array (i.e., type(xs) → xsArray) with "attributes" such as lower and upper wavenumber bound, pressure, temperature, and molecule stored in further items, e.g., xs.x or xs.p.As a default, the Voigt profile is considered.Please note that the cross sections are evaluated on a uniform wavenumber grid (with spacing defined by the mean line width, see Section 2.2.1), so it is sufficient (and more memory efficient) to save the very first and very last grid point only.
Cross sections for different pressures and temperatures are obtained by specifying the second or third function argument of lbl2xs.Please note that the data type returned by lbl2xs in these examples is different, i.e., the type is depending on the number of p, T pairs and the type of the line data (a single or a dictionary of lineArray, essentially the number of molecules).In the very first example (CO and one p, T) in Figure 3 a single subclassed NumPy array xsArray is returned, whereas a list of xsArray's is returned for a list of p, T pairs and a single molecule.Finally, the last example gives a dictionary of lists of xsArray's, each list for a single molecule and a dictionary entry for each molecule.
The xSection.pymodule has a function xsPlot to visualize the cross sections, e.g., xsPlot(xss).This function works recursively (cf.Appendix A.3), i.e., it can be called with a single xsArray, a list thereof, or a dictionary of (lists of) xsArray's.Figure 4 demonstrates the combined use of the atlas and xsPlot functions exploiting Matplotlib's function twinx to share the common wavenumber axis.The functions xsInfo and xsSave can be used with a single cross section array, a list of cross sections, or a dictionary of lists to summarize the cross sections' properties or to write the data to file(s); Reading cross section data from file(s) is possible with the xsRead function./molec] 1.01e+03mb 296.00KCO 1.01e+03mb 296.00KH2O 1.01e+03mb 296.00KCH4 1.01e+03mb 296.00KN2O Similar to the atmospheric data, the line data can be plotted using Matplotlib's functions, or, more conveniently, with Py4CAtS' function, e.g.atlas(llDict) .The atlas function (named after Park et al. [82]) can be used with a single lineArray or a list or dictionary of lineArrays and displays line strength vs. position by default.

Cross Sections
In the next step the line parameters can be used to compute molecular cross sections, see the In [3]: block in Fig. 3. xs = lbl2xs(llDict['CO']) returns the cross section of CO in the spectral range around 2140 cm −1 for the database (here HITRAN) reference pressure and temperature.A single cross section is stored in a subclassed NumPy array (i.e.type(xs) → xsArray) with "attributes" such as lower and upper wavenumber bound, pressure, temperature, and molecule stored in further items, e.g.xs.x or xs.p.As a default, the Voigt profile is considered.Note that the cross sections are evaluated on a uniform wavenumber grid (with spacing defined by the mean line width, see subsection 2.2.1), so it is sufficient (and more memory efficient) to save the very first and very last grid point only.
Cross sections for different pressures and temperatures are obtained by specifying the second or third function argument of lbl2xs.Note that the data type returned by lbl2xs in these examples is different, i.e. the type is depending on the number of p, T pairs and the type of the line data (a single or a dictionary of lineArray, essentially the number of molecules).In the very first example (CO and one p, T) in Fig. 3

Absorption Coefficients
Given cross sections of some molecules on a set of p, T levels along with the atmospheric data, in particular the molecular number densities, the absorption coefficient ( 6) for all levels are generated with acList = xs2ac (mls, xssDict).The list contains a spectrum for each atmospheric p, T level, where each spectrum is stored in a subclassed NumPy array: type(acList[0]) → acArray similar to the cross sections, i.e., with attributes stored as, e.g., ac.x and ac.z for the wavenumber range and altitude, respectively.Please note that the number of levels in the atmospheric data set (here mls) and the lengths of the cross section lists in the xssDict dictionary must be identical.Furthermore, all molecules with cross section data must be contained in the atmospheric data (but there can be some "unused" molecules in the atmospheric data set).
The absorption coefficients can be plotted with the standard Matplotlib functions, but Py4CAtS also has a function to make this easier: acPlot(acList).The function acInfo(acList) prints essential information about the absorption coefficients (Actually it is a loop calling the corresponding info method of acArray, i.e., for ac in acList: ac.info()).
The data can be saved to file (tabular ascii) with the standard NumPy savetxt or Py4CAtS' awrite function.The acSave function automatically saves the absorption coefficients along with the atmospheric data, and the acRead function allows reading of the data (incl.the associated atmosphere) back from file, e.g., absCo = acRead(acFile).Both acSave and acRead also support HITRAN formatted files or Python/NumPy's internal pickle format.

Optical Depths
The next step is to integrate the absorption coefficients along the (vertical) path through the atmosphere using the function ac2dod (see In [4] in Figure 3).Similar to cross sections and absorption coefficients this returns a list of (nLevels-1) subclassed NumPy arrays odArray, where each list member is essentially the delta / differential / layer optical depth spectrum along with its attributes lower and upper altitudes, pressures, and temperatures (and the wavenumber interval, too).
The optical depths instances can be combined by addition or subtraction, e.g., the delta optical depths of the first (bottom) two layers can be added by dodList[0]+dodList [1] (see Figure 5).
The main purpose of the __add__ special method is to combine the optical depths from neighboring layers, but it can also be used to sum the optical depths of different molecules in one layer.# delta optical depth list dodl = lbl2od(mls, dictOfLineLists) # the first two layers and their sum odPlot([dodl[0], dodl [1], dodl[0]+dodl [1]]) # also plot total optical depth odPlot(dod2tod(dodl)) thereof, or a dictionary of (lists of) xsArray's.Figure 4 demonstrates the combined use of the atlas and xsPlot functions exploiting Matplotlib's function twinx to share the common wavenumber axis.
The functions xsInfo and xsSave can be used with a single cross section array, a list of cross sections, or a dictionary of lists to summarize the cross sections' properties or to write the data to file(s); Reading cross section data from file(s) is possible with the xsRead function.

Absorption Coefficients
Given cross sections of some molecules on a set of p, T levels along with the atmospheric data, in particular the molecular number densities, the absorption coefficient ( 6) for all levels are generated with acList = xs2ac (mls, xssDict) .The list contains a spectrum for each atmospheric p, T level, where each spectrum is stored in a subclassed NumPy array: type(acList[0]) → acArray similar to the cross sections, i.e. with attributes stored as, e.g., ac.x and ac.z for the wavenumber range and altitude, respectively.Note that the number of levels in the atmospheric data set (here mls) and the lengths of the cross section lists in the xssDict dictionary has to be identical.Furthermore, all molecules with cross section data must be contained in the atmospheric data (but there can be some "unused" molecules in the atmospheric data set).
The absorption coefficients can be plotted with the standard Matplotlib functions, but Py4CAtS also has a function to make this easier: acPlot(acList) .The function acInfo(acList) prints essential information about the absorption coefficients (Actually it is a loop calling the corresponding info method of acArray, i.e. for ac in acList: ac.info()).
The data can be saved to file (tabular-ascii) with the standard NumPy savetxt or Py4CAtS' awrite function.The acSave function automatically saves the absorption coefficients along with the atmospheric data, and the acRead function allows to read the data (incl.the associated atmosphere) back from file, e.g.absCo = acRead(acFile) .Both acSave and acRead also support HITRAN formatted files or Python/NumPy's internal pickle format.

Optical Depths
The next step is to integrate the absorption coefficients along the (vertical) path through the atmosphere using the function ac2dod (see In [4] in Fig. 3).Similar to cross sections and absorption coefficients this returns a list of (nLevels-1) subclassed NumPy arrays odArray, where each list member is essentially the delta / differential / layer optical depth spectrum along with its attributes lower and upper altitudes, pressures, and temperatures (and the wavenumber interval, too).
The optical depths instances can be combined by addition or subtraction, e.g. the delta optical depths of the first (bottom) two layers can be added by dodList[0]+dodList [1] (see Fig. 5).The main purpose of the __add__ special method is to combine the optical depths from neighboring layers, but it can also be used to sum the optical depths of different molecules in one layer.Please note that the ac2dod function (and the xs2dod, lbl2od, . . .function discussed below) only return delta optical depths (dod), further functions can be used to convert to cumulative or total optical depth.Summation of all layer optical depths with the dod2tod function delivers the total optical depth and dod2cod returns the (ac)cumulated optical depth.By default, the accumulation is starting with the very first layer (usually at BoA) and the very last element of the generated list should be the total optical depth, whereas cod = dod2cod(dodList,True) starts accumulating with the very last layer and the very first cod[0] corresponds to the total optical depth.
A quick-look of optical depth(s) can be generated with the odPlot function, and the data (incl.the attributes) can be saved to file using odSave using ASCII, netcdf, or pickle format.Later, the optical depth data can be read from file into a (new) IPython session with oDepth = odRead (odFile).
The oDepthOne function returns the distance s 1 from the (uplooking or downlooking) observer to the point, where the optical depth is one, τ(ν, s 1 ) = 1.0, corresponding to a transmission that has decreased to T = 1/e.This distance should roughly correspond to the location of the weighting function maximum.
3.2.6.Weighting Functions wgtFct = ac2wf(acList[, angle, zObs]) computes weighting functions according to Equation (13) for an observer at altitude zObs looking in direction angle (default 180 • ) and returns a subclassed 2D NumPy array of type wfArray (with wgtFct.shape= (len(vGrid), len(sGrid))).By default, the observer is assumed to be at ToA or BoA for viewing angles larger or smaller than 90 • .The attributes define the wavenumber interval, path distance grid sGrid (relative to the observer (in cm), i.e., from ToA to BoA in case of a downlooking nadir view), observer altitude, and viewing angle.Optionally ac2wf also allows the treatment of finite field-of-view effects with an extra argument FoV to set the type and width (HWHM, in degree) (e.g., FoV='Gauss 7.5').
Alternatively, given the delta/layer optical depths the weighting functions can be approximated by finite differencing using the dod2wf(dodList,zObs,angle) function, but starting from the absorption coefficient is much more reliable.For weighting functions of a horizontal path (zenith angle θ = 90 • ) see the first remark in Section 4.2.
The function wfSave(wgtFct, ...) and wfRead(wfFile, ...) can be used to save and read the data, and wfPlot(wgtFct[, wavenumber, ...]) provides a simple visualization tool.If (a) specific wavenumber(s) are given, a 2D plot is presented, otherwise a contour plot is generated.An example of weighting functions for microwave temperature sounding in the region of the oxygen rotation band is given in Figure 6.Note that the ac2dod function (and the xs2dod, lbl2od, . . .function discussed below) only return delta optical depths (dod), further functions can be used to convert to cumulative or total optical depth.Summation of all layer optical depths with the dod2tod function delivers the total optical depth and dod2cod returns the (ac)cumulated optical depth.By default, the accumulation is starting with the very first layer (usually at BoA) and the very last element of the generated list should be the total optical depth, whereas cod = dod2cod(dodList,True) starts accumulating with the very last layer and the very first cod[0] corresponds to the total optical depth.
A quick-look of optical depth(s) can be generated with the odPlot function, and the data (incl.the attributes) can be saved to file using odSave using ASCII, netcdf or pickle format.Later on, the optical depth data can be read from file into a (new) IPython session with oDepth = odRead (odFile) .
The oDepthOne function returns the distance s 1 from the (uplooking or downlooking) observer to the point, where the optical depth is one, τ(ν, s 1 ) = 1.0, corresponding to a transmission that has decreased to T = 1/e.This distance should roughly correspond to the location of the weighting function maximum.
3.2.6.Weighting Functions wgtFct = ac2wf(acList[, angle, zObs]) computes weighting functions according to Eq. (13) for an observer at altitude zObs looking in direction angle (default 180 • ) and returns a subclassed 2D NumPy array of type wfArray (with wgtFct.shape= (len(vGrid), len(sGrid))).By default the observer is assumed to be at ToA or BoA for viewing angles larger or smaller than 90 • .The attributes define the wavenumber interval, path distance grid sGrid (relative to the observer (in cm), i.e. from ToA to BoA in case of a downlooking nadir view), observer altitude, and viewing angle.Optionally ac2wf also allows to treat finite field-of-view effects with an extra argument FoV to set the type and width (HWHM, in degree) (e.g.FoV='Gauss 7.5').
Alternatively, given the delta/layer optical depths the weighting functions can be approximated by finite differencing using the dod2wf(dodList,zObs,angle) function, but starting from the

Radiance/Intensity
The dod2ri function evaluates the Schwarzschild integral (9) and returns the radiance or intensity, again a subclassed NumPy array riArray with attributes for wavenumber interval, altitude, pressure, and temperature minimum/maximum, observer zenith angle, and background temperature, see the last block in Figure 3. Without optional arguments, the radiance seen by an uplooking observer at the surface (BoA) is computed, whereas dod2ri (dodList, 180.0, mls['T'][0]) gives the radiance for a nadir-viewing observer looking down from ToA with an angle of 180.0 • (relative to the zenith angle) to Earth; the third argument specifies the surface temperature T b (here the BoA temperature of the midlatitude summer (mls) atmosphere) that is used to evaluate a Planck background contribution in (9) with I b (ν) = B(ν, T b ).
Please note that dod2ri does not have any argument to specify the observer altitude, i.e., it computes the radiance at BoA or ToA for an angle smaller or larger than 90 • (a horizontal path with angle 90 • is not implemented).If you want to model the radiance, say, for an airborne observer downlooking from 10 km and have a list of layer optical depths for an atmosphere with a uniform altitude grid of 1 km (hence layer thickness 1 km), supply a list of the first ten optical depths only, i.e., dod2ri(dodList[:10],180).(See also the remark on limitations in Section 4.6.) A further Boolean optional argument can be given to switch to the "B exponential-in τ" approximation instead of the default "B linear-in τ", see Section 2.2.2.
To plot and save the radiance spectrum (along with the wavenumber grid) in a file use the riPlot and riSave functions, respectively.To convolve the radiance spectrum I with a spectral response function according to (12), a special method convolve has been implemented, e.g., radBox1 = radiance.convolve()uses the default "box" with a half width 1.0 cm −1 .Likewise, radGauss2 = radiance.convolve(2.0,'G')uses a Gaussian response function with HWHM 2.0 cm −1 .The convolve special method is also available for the odArray and wfArray classes (in the first case the convolution operates on the corresponding transmission).

Shortcuts
If some of the intermediate quantities are not required, it is possible to go directly from line and atmospheric data to optical depths using dodList = lbl2od (mls, llDict).Likewise, cross sections and absorption coefficients can be bypassed with the lbl2ac and xs2dod functions.Please note that lbl2ac and lbl2od "inherit" most options accepted by lbl2xs or xs2ac.

Selection of Spectral Range, Contributions from Line Wings
To compute cross sections, absorption coefficients, and optical depths for some spectral range ν lo . . .ν hi , all lines in an extended spectral range ν lo − δ . . .ν hi + δ should be considered, where δ is typically some wavenumbers ( cm −1 ).However, unless specified as fourth optional argument xLimits of the lbl2xs function or as option -x of the command line script lbl2xs.py(and similarly for the lbl2ac and lbl2od functions), the cross sections, absorption coefficients, and optical depths returned by Py4CAtS are computed on a uniform grid in the interval [ν first , ν last ] where the lower and upper limits correspond to the position of the very first and last line returned by higstract.As the higstract and lbl2xs scripts are completely independent, this extension is not done automatically.The impact of line wing contributions on cross sections is demonstrated in Figure 7.

Optical Depths, Transmissions, and Weighting Functions for a Horizontal View
The functions ac2dod or lbl2od do not have an angle as argument, so the optical depth returned is always the vertical optical depth through the atmosphere.If transmission (or weighting functions) for a horizontal path (i.e., zenith angle 90 • , hence a homogeneous atmosphere) are needed, the transmission T (ν, s) = exp (−α(ν)l) can be readily evaluated as a function of path length l given an appropriate absorption coefficient α.Likewise, the weighting functions are easily computed according to (13) as the product of transmission T (ν, s) times the absorption coefficient α for some lengths l.

Arbitrary Observer Positions
An observer "inside" a layer, i.e., with the observer altitude different from any atmospheric altitude grid point, is not supported by Py4CAtS.However, if one needs an observer at a specific altitude (e.g., 3.14159 km), one can interpolate the atmospheric profiles to a new grid including this point and proceed as usual.

GARLIC vs. Py4CAtS
As indicated above, Py4CAtS has originally been a (partial) re-implementation of GARLIC.GARLIC has been thoroughly verified by intercomparisons with other lbl codes such as ARTS and KOPRA (e.g., [74][75][76]).Furthermore, it has been recently validated by intercomparison with spectra of the ACE-FTS instrument [84][85][86].Figure 8 shows an intercomparison of GARLIC and Py4CAtS brightness temperature spectra.For the monochromatic spectra differences are mostly below one Kelvin except for a few spikes near strong lines.The convolved spectra agree within some hundredth Kelvin.
Despite the efficiency of NumPy's array processing GARLIC is significantly faster than Py4CAtS.In GARLIC the computation of molecular cross sections, absorption coefficients, and radiances is exploiting the multi-core architecture of modern CPUs, i.e., parallelization by means of OpenMP (see section 3.3 of the GARLIC paper [25]).This allows about a factor 50 speed-up for the Fortran-OpenMP implementation compared to the NumPy implementation on an eight-core desktop for the radiance spectra shown here.

Batch Processing
Given line and atmospheric data, a radiance spectrum is essentially generated by the sequence atmRead, higstract, lbl2od, and dod2ri.Likewise, atmospheric transmission can be modeled with the fourth and last step exp(-dod2tod(...)).If a series of synthetic spectra must be modeled, a combination of these functions into a single function might be convenient.However, such a function is currently not provided in Py4CAtS because the appropriate implementation is clearly depending on the actual task: For an assessment of the impact of spectroscopic data or spectral range on atmospheric radiances, a function combining higstract -lbl2od -dod2ri would be adequate (with atmospheric data ingested only once), whereas the impact of atmospheric data on radiances could be studied with the combination of atmRead -lbl2od -dod2ri.Furthermore, the main modeling task can easily be performed with a single statement such as dod2ri(lbl2od(atmData,llDict),...).Indeed, this can also be easily used for retrieval purposes, e.g., by a combination with the nonlinear least squares solvers of the scipy.optimizepackage.

•
Spherical atmospheres: modeling radiance and/or transmission for limb sounding is not possible; Py4CAtS is assuming a plane-parallel atmosphere.Please note that the quadrature schemes of Section 2.2.2 also work for limb geometry.

•
Jacobians: Several tools have been developed for automatic differentiation of Python code (see http://www.autodiff.org),so derivatives of spectra w.r.t.atmospheric parameters etc. could be implemented similar to the approach used in GARLIC [94].This is currently not foreseen.

•
Other line parameter databases (see also http://hitran.org/links/):In addition to HITRAN/HITEMP and GEISA, further databases have been developed such as ExoMol [19] and databases dedicated to specific spectral regions (e.g., JPL and CDMS catalogs, Pickett et al. [16], Endres et al. [18]), specific molecules (e.g., [95][96][97]), or satellite missions [96,98].Some of these databases have a format similar to HITRAN and GEISA and an appropriate reader could be readily implemented, whereas an implementation of other databases would require some more effort because of their different organization.Please note that a Python script ExoMol_to_HITRAN.py has been developed by the ExoMol consortium, see https://github.com/xnx/ExoMol_to_HITRAN.

•
Predefined cross sections: Both HITRAN and GEISA include a large collection of IR (and UV) cross sections esp.for heavy molecules that can be relevant for atmospheric absorption.Reading and further processing of these data is not yet implemented in Py4CAtS.

•
Scattering: So far only the Schwarzschild equation with thermal emission as source (i.e., Planck function) is supported.However, the optical depths can be used as input for any multiple scattering solver (e.g., DISORT, [99]), see libRadtran [100,101].

•
Line shapes beyond Voigt: some advanced profiles for modeling line-mixing, Dicke narrowing, or speed-dependence [61,62] have been implemented recently.

•
Py4CAtS stores only the "core" line parameters required for cross section modeling.In contrast, HAPI [34] can also keep track of line assignments, transition IDs etc.However, Py4CAtS has been developed with atmospheric spectroscopy as target application, but not molecular spectroscopy.

Conclusions
Py4CAtS, a Python package developed for "computational atmospheric spectroscopy", has been presented and its usage in a Unix-like console or within the (I)Python interpreter/notebook (recommended) has been demonstrated.Starting with line data from HITRAN or GEISA and atmospheric data, the scripts and functions allow the computation of cross sections, absorption coefficients, optical depths, weighting functions, and radiances.In particular, they also provide easy access to all intermediate quantities esp.for visualization.
When the development of Py4CAtS started, lbl modeling with Python appeared to be quite ambitious.However, thanks to NumPy and its dramatic performance improvements, atmospheric radiative transfer modeling with Python can now be done with reasonable speed.In fact, our recent benchmark tests of Voigt and complex error function algorithms indicated that NumPy implementations are not significantly slower than Fortran implementations [102].
As emphasized above, the intention for Py4CAtS has not been a highly efficient and accurate lbl radiative transfer code, and the package is not considered to be a competitor for the codes mentioned in the introduction.Nevertheless, despite the speed limitations of interpreters such as Python and the omission of limb geometry, continuum, or scattering Py4CAtS is believed to be attractive because of the flexibility and ease of use.The future development of Py4CAtS will certainly be driven by further optimizations and extensions of its functionality.
wfRead, and riRead.In all cases tabular ASCII files are supported, in some cases Python's / NumPy's pickle format or netcdf I/O is also available.Cross section and absorption coefficient files can also use the HITRAN format.(The HITRAN cross section format is extensively described at https:// hitran.org/docs/cross-sections-definitions/.In brief, each "portion" starts with a header line defining the molecule, wavenumber range, number of data in this set, temperature (K), pressure (Torr) etc.Following this header, the "values are arranged in records containing ten values of fields of ten for each cross-section".) For completeness, the higstract function to read the HITRAN and GEISA database (and extract some lines) should be mentioned here, too.The extracted lines can be saved to file with save_lines_core for the "core parameters" only (i.e., position, strength etc.) or save_lines_orig for the original format (Please note that there is no tool to convert data from HITRAN to GEISA format or back).To read a set of line data files (HITRAN/GEISA extracts of core parameters) use read_line_file that returns a single "lineArray" for a single file (molecule) or a dictionary thereof.
All routines saving data in ASCII format use the awrite function from the aeiou module, see Appendix A.8 for details. in the pairTypes.pymodule, see Appendix A.9). Pressure and temperature of the cross section and absorption coefficient are single floats for the corresponding atmospheric level, whereas for optical depths they are PairOfFloats corresponding to the atmospheric layer.
The same methods are also defined for acArray, odArray, and riArray, where in addition there is also a method truncate returning the spectrum in a smaller wavenumber interval.Furthermore, a method convolve has been implemented in the riArray class to smooth the radiance with a box, triangle, or Gaussian spectral response function.
For odArray there are also __add__ and __sub__ methods to add and subtract optical depths, see Figure 5.These combinations can only be performed if both spectra are defined in the same wavenumber interval, i.e., od1.x and od2.x are identical.In general, the two optical depths are given on different wavenumber grids, so the coarser spectrum is regridded to the resolution of the denser spectrum first.The __mul__ method can be used to scale optical depths with a (float) number, e.g., to account for a slant path od/cosdg(60).
The core parameters (position, strength, width, . . . ) of the lines extracted from the HITRAN or GEISA databases are also saved internally in a subclassed array lineArray.However, in contrast to the arrays mentioned above (that all have just a single dimension) this has "rows and columns", where the rows correspond to the spectral lines/transitions, and the columns correspond to center wavenumber νl , line strength S l , etc.To make these columns easily accessible, lineArray is a subclassed structured array (see next section Appendix A.5) with attributes holding information about molecule and reference pressure and temperature.

•Figure 1 .
Figure 1.From Hitran/Geisa via cross sections (xs) and absorption coefficients (ac) to optical depths (od) and radiation intensity (ri).Note that cross sections are pressure and temperature (pT) dependent, absorption coefficients also depend on composition (VMR), and optical depth and radiation intensity depends on path geometry (geo).

Figure 1 .
Figure 1.From HITRAN/GEISA via cross sections (xs) and absorption coefficients (ac) to optical depths (od) and radiation intensity (ri).Please note that cross sections are pressure and temperature (pT) dependent, absorption coefficients also depend on composition (VMR), and optical depth and radiation intensity depends on path geometry (geo).

Figure 2 .
Figure 2. Typical workflow for lbl modeling with Py4CAtS in a Unix-like shell: Hitran/Geisa → line parameter extracts → cross sections → absorption coefficients → optical depth → radiance / intensity.Output is not shown.All functions support the -h option to ask for help, in particular a list of all available options/flags.For some notes on option parsing see Appendix A.7.

Figure 3 .
Figure 3.Typical workflow of an IPython session (output not shown except for first two commands).

Figure 4 .
Figure 4. Combination of line atlas and xsPlot.This example has been generated with dll = higstract('/data/hitran/2000/lines',(4273,4312), 'main') # dict of line lists xss = lbl2xs(dll) # dict of cross sections atlas(dll); twinx(); xsPlot(xss) a single subclassed NumPy array xsArray is returned, whereas a list of xsArray's is returned for a list of p, T pairs and a single molecule.Finally, the last example gives a dictionary of lists of xsArray's, each list for a single molecule and a dictionary entry for each molecule.The xSection.pymodule has a function xsPlot to visualize the cross sections, e.g.xsPlot(xss) .This function works recursively (cf.Appendix A.3), i.e. it can be called with a single xsArray, a list

Figure 7 .
Figure 7. Impact of line wings on H 2 O cross section in the ODIN [83] 501 GHz channel.A series of cross sections has been computed taking into account more and more lines to the left and right of the 16 to 17 cm −1 window.