Data Needs for Modeling Low-Temperature Non-Equilibrium Plasmas: The LXCat Project, History, Perspectives and a Tutorial

Technologies based on non-equilibrium, low-temperature plasmas are ubiquitous in today’s society. Plasma modeling plays an essential role in their understanding, development and optimization. An accurate description of electron and ion collisions with neutrals and their transport is required to correctly describe plasma properties as a function of external parameters. LXCat is an open-access, web-based platform for storing, exchanging and manipulating data needed for modeling the electron and ion components of non-equilibrium, low-temperature plasmas. The data types supported by LXCat are electron- and ion-scattering cross-sections with neutrals (total and differential), interaction potentials, oscillator strengths, and electron- and ion-swarm/transport parameters. Online tools allow users to identify and compare the data through plotting routines, and use the data to generate swarm parameters and reaction rates with the integrated electron Boltzmann solver. In this review, the historical evolution of the project and some perspectives on its future are discussed together with a tutorial review for using data from LXCat.


Introduction
Technologies based on non-equilibrium, low-temperature plasmas (LTPs) are ubiquitous in today's society. Plasmas are used in many fields and are at the core of certain technologies, for example in the semiconductor industry, thin film deposition, treatment of textiles, ozone production, conversion of solid waste, lamps and satellite thrusters [1]. Additionally, LTPs are being intensively studied for new applications in the fields of medicine and hygiene [2], gas conversion and agriculture [3], plasma-assisted combustion [4], and plasma catalysis [5]. Plasma modeling plays an essential role in the development and optimization of plasma sources for these applications. However, as stated by Pitchford et al. [6]: "The integrity of the results obtained from numerical modeling depends [...] on the identification of a physical model appropriate for addressing the problem of interest, on a good choice and proper implementation of the numerical solution techniques, and on the availability of reliable input data."
The LXCat project aims to address the last of these issues, the availability of reliable data, and more broadly to be a platform for the exploration and curation of data.
LXCat is an open-access, web-based platform (http://www.lxcat.net) for storing, exchanging and manipulating data needed for modeling the electron and ion components of non-equilibrium, low-temperature plasmas. Modeling such plasmas requires extensive input data, the specific form of which depends on the model formulation, as described in Section 2. The initial impetus for the LXCat project was the desire to develop a convenient way of exchanging data for the electron component of LTPs. These are the key data needed for modeling LTPs, and while modeling groups have been generally willing to share data, there was no convenient means for doing so. The emphasis at the beginning was on compilations of "complete" sets of electron-neutral (ground state) scattering cross-sections, which are required as input to Boltzmann equation solvers or Monte Carlo simulations to determine the electron energy distribution function (EEDF). The term "complete" is used here to imply that the cross-sections account for all the important electron energy and momentum loss processes for electron collisions (with more or less detail). Thus, the calculated transport and rate coefficients (integrals over the EEDF) are consistent with measurements in swarm experiments [7]. Note that complete cross-section sets can also be used to describe the electron component of the plasma in more general situations. As discussed in Section 4, LXCat today includes other data types associated with electron collisions and transport in LTPs as well as data for modeling the ion component of LTPs.
LXCat is structured into individual databases containing data uploaded by individual contributors who retain control over their database. In some cases, the databases contain data generated by the contributors themselves, but in other cases, the data have been compiled by the contributors from the literature. In all cases, references are provided, and all users are urged to cite those references in publications making use of data downloaded from LXCat. LXCat does not provide recommendations, but intercomparisons of some data sets have been performed under the auspices of the GEC (Gaseous Electronics Conference) Plasma Data Exchange project [8]. Three papers have resulted from these efforts [6,9,10] with a focus on the electron-noble gas scattering cross-sections available at that time. Accompanying these was a review of computational methods for electron scattering from atoms [11]. Other intercomparisons of data sets for $\mathrm{H_2}$, $\mathrm{O_2}$, $\mathrm{N_2}$ and $\mathrm{CO_2}$ have been presented in posters at recent conferences, and these can be found on the LXCat website under the heading "docs and links".
Scientific research is usually evaluated and distributed in the form of journal articles where large data sets are often represented by small excerpts or plots. Excerpts are insufficient for the development of plasma models, and data retrieval from plots is tedious and not repeatable [12]. Figure 1 is an illustration of the "traditional" data extraction method from the pre-digital era. Although data extraction has improved with digital publishing and software tools, the process remains unsatisfactory. In contrast, obtaining similar data from LXCat is as simple as selecting the type of data desired, one (or more) of the data sets containing the desired type of data, and acknowledging the need for proper citation. There is no need to dig through physical volumes or to infer numerical values from tiny figures in a crooked scan. Not only does LXCat facilitate access to historical data, but it also enables broader access to data that are not easily published in typical venues. For example, large data sets such as those produced by the B-spline R-matrix method [13] are such that only a fraction can be presented in a typical journal article. Though journals have been making progress in allowing for the inclusion of supplementary material, there is no mechanism to ensure that data published in such a manner can be easily read by plasma modeling tools or compared to other data sets. In contrast, LXCat, by storing data sets under a common template, encourages reuse and accessibility.
Figure 1. Example of a "traditional way" of retrieving data from a physically printed plot in a paper by Biberman et al. [14], which was itself scanned in the present case from a printed journal volume (courtesy of J. van Dijk, TU/e).
Accessible LTP modeling data are key to reproducible studies and facilitate fast and efficient advances in science. As stated in an editorial for Nature Physics [15]: "Apart from the obvious advantages that open data could bring for reproducibility and scientific integrity (although there is devil in the detail of how to implement this), one argument in favour of data sharing is that having more pairs of eyes on data may lead to new discoveries and faster scientific progress."
Such considerations have always driven the LXCat project and the associated development of online tools to analyze and compare data. More recently, these considerations have led to the development of a Time Machine feature which acknowledges the need for modeling data to evolve while maintaining access to historical values.
In this paper, we start with a brief discussion of the data needs for modeling low-temperature, non-equilibrium plasmas. We then continue with the history of LXCat, review the status of the project while giving insight into the diversity of the data available on the website and the operations that can be performed on the data (e.g., use of a Boltzmann solver to compute swarm parameters), and finish with a discussion of recent work and potential future developments. The present paper particularly aims to give a tutorial introduction to the types of data present on LXCat and to serve as a guide for new users of the data and website, while providing some useful tips and highlighting the most common pitfalls.

Data Needs for Modeling LTPs
A plasma is composed of positively and negatively charged particles (electrons and ions) as well as neutral species which have several internal degrees of freedom. LXCat is focused on data for non-equilibrium LTPs. These plasmas are weakly ionized, with an ionization degree (ratio of electron to total particle number densities) often smaller than $10^{-4}$. Thus, the most numerous particles are background gas particles in the ground state. These plasmas are also typically far from local-thermal equilibrium, with electrons possessing much higher translational energy than the other species in the gas phase. This can be attributed to the fact that only charged particles are subject to acceleration by an applied electric field and that electrons are inefficient in transferring kinetic energy to the much more massive atoms and molecules [16]. Electron temperatures$^{1}$ are typically in the range 1-10 eV, with a significant fraction of the electrons above the ionization threshold of the background gas. It is the hot tail of the electron distribution that sustains the plasma by ionization of the gas. On the other hand, the gas temperature (i.e., the translational temperature) is typically between 300 K and 1500 K, leading to ratios between the temperatures of the electrons and neutrals in the range of 10-400, which is a measure of the degree of non-equilibrium. This contrasts with systems in equilibrium, such as thermal arcs, where gas temperatures can be on the order of 6000-15,000 K [17]. In general, ion temperatures lie between those of the gas and the electrons, though the ion distribution functions may be highly anisotropic, depending on the plasma reactor type and on the operating pressure, and ion energies can reach hundreds of eV [18][19][20].
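The temperature ratios quoted above can be checked with a short calculation. The following is a minimal sketch, not part of any LXCat tool: the function names are hypothetical, and the only inputs are the exact SI values of the elementary charge and the Boltzmann constant.

```python
# Exact SI (2019) values of the elementary charge and the Boltzmann constant.
E_CHARGE = 1.602176634e-19   # C
K_B = 1.380649e-23           # J/K


def ev_to_kelvin(te_ev):
    """Temperature (K) corresponding to a thermal energy k_B*T given in eV."""
    return te_ev * E_CHARGE / K_B


def temperature_ratio(te_ev, tgas_k):
    """Ratio of electron temperature (given in eV) to gas temperature (given in K)."""
    return ev_to_kelvin(te_ev) / tgas_k
```

Since 1 eV corresponds to about 11,600 K, `temperature_ratio(1.0, 1500.0)` is roughly 8 and `temperature_ratio(10.0, 300.0)` is roughly 390, which brackets the order of magnitude of the 10-400 range given in the text.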
LTPs are modeled by a variety of kinetic, fluid or hybrid (fluid-kinetic) modeling approaches. The primary focus is often on the behavior of electrons and ions, coupled with electromagnetic fields and colliding with the background gas. The most important model input data concern the collisions of the electrons and ions with the gas. These data depend on the type of model used, as we will discuss in the following sections. Each of these models can be implemented in 1, 2 or 3 spatial dimensions to describe the plasma geometry in more or less detail. Global models [21,22] (without spatial resolution) are also used, e.g., to quickly investigate a wide range of operating conditions.

Kinetic Models: Cross-Sections
The most detailed LTP models, known as kinetic models, describe the evolution of the electron and ion distribution functions $f(\mathbf{r}, \mathbf{v}, t)$ in phase space, i.e., as a function of position $\mathbf{r}$, particle velocity $\mathbf{v}$, and time $t$. This can be accomplished either through particle simulation or, equivalently, by solving the Boltzmann equation:
$$\frac{\partial f}{\partial t} + \mathbf{v} \cdot \nabla_{\mathbf{r}} f + \frac{q}{m}\left(\mathbf{E} + \mathbf{v} \times \mathbf{B}\right) \cdot \nabla_{\mathbf{v}} f = N\, C[f] \qquad (1)$$
The second and third terms on the left-hand side correspond to the transport in space ($\mathbf{r}$) and velocity space ($\mathbf{v}$), respectively, with $\mathbf{E}$ the electric field, $\mathbf{B}$ the magnetic field (if present), $q$ the particle charge and $m$ the particle mass. On the right-hand side, $N$ is the gas particle number density and $C$ is the collision integral operator, which expresses how the electrons or ions are scattered in velocity space by collisions with the background gas, considering all different elastic and inelastic collision processes. The main input data to these models are cross-sections $\sigma_j$, each characterizing the probability of a given collision process $j$ on a (microscopic) particle level.
More precisely, the product $N \sigma_j$ corresponds to the collision probability per unit traveled length, and since $N$ has the unit m$^{-3}$, $\sigma_j$ has the unit of area, m$^{2}$. In general, these cross-sections are not constant but are functions of the relative velocity of the two colliding particles, or, equivalently, of their relative kinetic energy. For electron-neutral collisions, the relative velocity is dominated by the electron motion (because of its small mass) so that the cross-sections can be considered functions of electron energy alone, $\sigma_j = \sigma_j(\varepsilon)$, with $\varepsilon = m_e v^2/2$. The energy dependence of the cross-sections is often very strong and non-linear, involving a threshold for inelastic collisions. Hence, they are usually given in the form of look-up tables. The energy range of interest is typically from 0 to hundreds of eV, so that it covers the entire electron or ion energy distribution function, including its high-energy tail.
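As an illustration of how such a look-up table might be used, the sketch below linearly interpolates a tabulated cross-section and evaluates the collision probability per unit length as the product of gas density and cross-section. The table values are invented for illustration and are not LXCat data; the function names and the choice to clamp outside the tabulated range are assumptions.

```python
import bisect

# Hypothetical look-up table (illustrative values, NOT real LXCat data):
# electron energy in eV and cross-section in m^2, with a 10 eV threshold.
ENERGY_EV = [0.0, 10.0, 12.0, 20.0, 50.0, 100.0]
SIGMA_M2 = [0.0, 0.0, 1.0e-21, 4.0e-21, 3.0e-21, 2.0e-21]


def cross_section(energy_ev):
    """Linearly interpolate the table; clamp to the end values outside its range."""
    if energy_ev <= ENERGY_EV[0]:
        return SIGMA_M2[0]
    if energy_ev >= ENERGY_EV[-1]:
        return SIGMA_M2[-1]
    i = bisect.bisect_right(ENERGY_EV, energy_ev)
    e0, e1 = ENERGY_EV[i - 1], ENERGY_EV[i]
    s0, s1 = SIGMA_M2[i - 1], SIGMA_M2[i]
    return s0 + (s1 - s0) * (energy_ev - e0) / (e1 - e0)


def collisions_per_metre(gas_density_m3, energy_ev):
    """Collision probability per unit travelled length, N * sigma, in 1/m."""
    return gas_density_m3 * cross_section(energy_ev)
```

Below the threshold the interpolated cross-section is zero, so the process contributes nothing; the inverse of `collisions_per_metre` is the mean free path for that process. How a real solver extrapolates beyond the last tabulated energy is a modeling choice that this sketch does not attempt to settle.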
Angular scattering of the velocity in collisions is usually not isotropic. The probability distribution of different scattering angles $\chi$ is given by differential scattering cross-sections $I(\varepsilon, \chi) = d\sigma(\varepsilon, \chi)/d\Omega$, where $d\sigma(\varepsilon, \chi)$ is a measure of the probability that the velocity vector is scattered into a solid angle element $d\Omega = 2\pi \sin\chi \, d\chi$ and $\chi$ is defined as the angle with respect to the velocity vector before the collision. The relation between the total and differential cross-section is
$$\sigma_T(\varepsilon) = 2\pi \int_0^{\pi} I(\varepsilon, \chi) \sin\chi \, d\chi \qquad (2)$$
Here the adjective "total" and subscript "T" are added to indicate explicitly that this cross-section is a measure of the probability that a collision occurs regardless of its scattering angle. Differential scattering cross-sections are generally not well known. Therefore, kinetic LTP models usually resort to the assumption of simple angular scattering distributions [23,24]. In the case of electrons, elastic collisions are modeled as an isotropic scattering process via the momentum-transfer cross-section
$$\sigma_m(\varepsilon) = 2\pi \int_0^{\pi} I(\varepsilon, \chi) \, (1 - \cos\chi) \sin\chi \, d\chi \qquad (3)$$
This is the main quantity representing elastic collisions in electron Boltzmann equation analyses, and it is generally known much better than the elastic total cross-section. From the perspective of particle models, this comes down to using simple isotropic elastic collisions with a cross-section $\sigma_j = \sigma_{j,m}$ to mimic the main effects of poorly known anisotropic collisions with a cross-section $\sigma_j = \sigma_{j,T}$. It is commonly assumed that an adequate description of the electron distribution function can be obtained from a complete cross-section set $\{\sigma_j\}$ composed of the elastic momentum-transfer cross-section $\sigma_{j,m}$ and total cross-sections $\sigma_{j,T}$ for inelastic processes.
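The two angular integrals described above (the total cross-section, and the momentum-transfer cross-section weighted by 1 - cos χ) can be evaluated numerically from a differential cross-section at fixed energy. The sketch below uses a plain trapezoidal rule and two analytic model distributions chosen only for checking; the function names and model shapes are assumptions, not LXCat data or tools.

```python
import math


def integrate_over_angle(dcs, weight, n=2000):
    """Return 2*pi * integral_0^pi dcs(chi) * weight(chi) * sin(chi) dchi (trapezoid)."""
    h = math.pi / n
    total = 0.0
    for i in range(n + 1):
        chi = i * h
        w = 0.5 if i in (0, n) else 1.0
        total += w * dcs(chi) * weight(chi) * math.sin(chi)
    return 2.0 * math.pi * h * total


def total_cross_section(dcs):
    """Total cross-section: every scattering angle counts equally."""
    return integrate_over_angle(dcs, lambda chi: 1.0)


def momentum_transfer_cross_section(dcs):
    """Momentum-transfer cross-section: weight (1 - cos chi) favours backscattering."""
    return integrate_over_angle(dcs, lambda chi: 1.0 - math.cos(chi))


# Two model differential cross-sections in m^2/sr (illustrative only).
A = 1.0e-20


def isotropic(chi):
    return A


def forward_peaked(chi):
    return A * (1.0 + math.cos(chi))
```

For the isotropic model both integrals equal 4πA, whereas for the forward-peaked model the momentum-transfer cross-section is only two thirds of the total one. This shows why the two quantities must not be interchanged when a data set labels a process as "elastic".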

Fluid Models: Transport Coefficients and Reaction Rate Coefficients
Fluid models describe each species of particles (electrons, ions) in terms of macroscopic quantities such as the particle number density $n(\mathbf{r}, t)$, the flux density $\boldsymbol{\Gamma}(\mathbf{r}, t)$ or the mean velocity $\mathbf{v}(\mathbf{r}, t)$, which in general depend on position $\mathbf{r}$ and time $t$. This implies that the particle distribution functions in velocity space are not directly described in these models, but are assumed to have a certain shape and behavior. These models need the input of transport coefficients and reaction rate coefficients that correspond to integrals of cross-section-related quantities over the (assumed) velocity distribution functions.
In many areas of physics and engineering, fluid models are based on the assumption of local-thermal equilibrium, leading to a Maxwellian velocity/energy distribution function, determined entirely by the temperature $T$. Hence, the transport coefficients and reaction rate coefficients in these models are functions of temperature. In LTP modeling, this standard assumption of a Maxwellian distribution function can be used for electrons in plasma discharges with low collisionality and relatively high ionization degree, where it may be more or less justified as a result of electron-electron interactions. In this case, macroscopic rate coefficients of different electron-impact processes, for example, can be computed from the corresponding cross-sections via the following integral:
$$k_j = \sqrt{\frac{8}{\pi m_e}} \, (k_B T)^{-3/2} \int_0^{\infty} \varepsilon \, \sigma_j(\varepsilon) \, \exp\!\left(-\frac{\varepsilon}{k_B T}\right) d\varepsilon \qquad (4)$$
where $k_B$ is the Boltzmann constant. These rate coefficients have units m$^{3}$·s$^{-1}$ and should be multiplied by the number densities of the reacting species (electrons and gas particles) to obtain the reaction rate per unit volume. However, in many non-equilibrium LTPs, the electron and ion distribution functions are strongly non-Maxwellian due to the electric field and the collisions with the low-temperature background gas. Therefore, rather than assuming a Maxwellian distribution function, it is often assumed that the distribution function in these plasmas is governed by local equilibrium between the acceleration by the electric field and the momentum and energy losses due to the collisions [25]. The distribution function then depends on the reduced electric field strength $E/N$ (the ratio of electric field strength $E$ to the gas density $N$), so that the transport coefficients and reaction rate coefficients can be expressed as functions of $E/N$.
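For the Maxwellian case described above, the rate coefficient is the average of cross-section times electron speed over the distribution, which reduces to sqrt(8/(π m_e)) (k_B T)^(-3/2) ∫ ε σ(ε) exp(-ε/k_B T) dε. The sketch below evaluates this integral with a trapezoidal rule; the function name, the grid parameters and the constant test cross-section are assumptions made for illustration.

```python
import math

M_E = 9.1093837015e-31   # electron mass, kg
EV = 1.602176634e-19     # one electronvolt in joules


def maxwellian_rate_coefficient(sigma, te_ev, n=20000, emax_factor=40.0):
    """Rate coefficient in m^3/s for a Maxwellian EEDF.

    sigma : callable giving the cross-section (m^2) for an energy in eV
    te_ev : electron temperature k_B*T expressed in eV
    The integral is truncated at emax_factor * k_B*T (an assumed cut-off).
    """
    kt = te_ev * EV
    h = emax_factor * kt / n
    acc = 0.0
    for i in range(n + 1):
        e = i * h                                  # energy in joules
        w = 0.5 if i in (0, n) else 1.0            # trapezoid end weights
        acc += w * e * sigma(e / EV) * math.exp(-e / kt)
    return math.sqrt(8.0 / (math.pi * M_E)) * kt ** -1.5 * h * acc


def constant_sigma(energy_ev):
    """Energy-independent test cross-section (illustrative, not real data)."""
    return 1.0e-20
```

For an energy-independent cross-section the result should equal the cross-section times the mean thermal speed sqrt(8 k_B T/(π m_e)), which gives a simple consistency check. For a tabulated cross-section with a threshold, the energy grid should resolve the threshold region, since the exponential tail makes the integral sensitive to it.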
This approach is relevant to a particular type of fluid model that is very popular for modeling collisional LTPs: the so-called drift-diffusion models, describing the electron and ion flux densities by equations of the form
$$\boldsymbol{\Gamma} = \mathbf{W} n - D \nabla n \qquad (5)$$
where the main transport coefficients are the drift velocity $\mathbf{W}$ and the diffusion coefficient $D$. The drift velocity is proportional to the electric field, $\mathbf{W} = \pm\mu \mathbf{E}$, with a coefficient of proportionality $\mu$ called mobility and $\pm$ the sign of the particle charge. These transport coefficients can be measured in swarm experiments (as functions of $E/N$), hence they are known as "swarm parameters". They can also be calculated from an appropriate set of cross-sections by solving the Boltzmann equation for a simple particle swarm system with a uniform electric field and no boundaries, or, alternatively, by Monte Carlo simulation of such a system [7,26]. This Boltzmann equation approach is routinely used in LTP modeling to obtain the electron transport coefficients. It is then common to invoke the two-term approximation [27]
$$f(\mathbf{v}) \propto F_0(\varepsilon) + F_1(\varepsilon) \cos\theta \qquad (6)$$
where $\theta$ is the velocity angle with respect to the direction of the electric force. The first term corresponds to the isotropic part of the distribution function, controlling the electron energy distribution function (EEDF), and the second term to an anisotropic perturbation, controlling transport. This approximation is valid if the distribution function is almost entirely isotropic, $F_1 \ll F_0$, which is often the case for electrons in LTPs due to the comparatively large frequency and small fractional energy transfer of elastic electron-neutral collisions.$^{2}$ Substitution of Equation (6) into the Boltzmann equation leads to a relation between $F_1$ and $F_0$ and then to an equation relating $F_0(\varepsilon)$ directly to $E/N$ and the set of cross-sections $\{\sigma_j(\varepsilon)\}$ [27]. Once this equation has been solved, the rate coefficients are calculated from
$$k_j = \sqrt{\frac{2}{m_e}} \int_0^{\infty} \varepsilon \, \sigma_j(\varepsilon) \, F_0(\varepsilon) \, d\varepsilon \qquad (7)$$
with $F_0$ normalized such that $\int_0^{\infty} \varepsilon^{1/2} F_0(\varepsilon) \, d\varepsilon = 1$, rather than Equation (4) above.
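The drift-diffusion flux described above, a drift term proportional to the density plus a diffusion term proportional to the density gradient, can be sketched on a one-dimensional grid as follows. This is a minimal central-difference discretization chosen for clarity, with constant transport coefficients; the function name and argument layout are assumptions and do not correspond to any particular code.

```python
def drift_diffusion_flux(density, x, e_field, mobility, diffusion, charge_sign):
    """Flux density (m^-2 s^-1) at the midpoints between neighbouring nodes.

    density     : particle number densities at the nodes (m^-3)
    x           : node positions (m), strictly increasing
    e_field     : electric field at the midpoints (V/m), len(x) - 1 values
    mobility    : mobility mu (m^2 V^-1 s^-1), taken as constant here
    diffusion   : diffusion coefficient D (m^2/s), taken as constant here
    charge_sign : +1 for positive ions, -1 for electrons and negative ions
    """
    fluxes = []
    for i in range(len(x) - 1):
        n_mid = 0.5 * (density[i] + density[i + 1])
        dn_dx = (density[i + 1] - density[i]) / (x[i + 1] - x[i])
        drift_velocity = charge_sign * mobility * e_field[i]   # W = +/- mu * E
        fluxes.append(drift_velocity * n_mid - diffusion * dn_dx)
    return fluxes
```

With a uniform density only the drift term survives, and with zero field only the diffusion term survives, which makes the two contributions easy to verify separately. In an actual fluid model the mobility and diffusion coefficient would be looked up as functions of E/N or of the mean energy rather than held constant.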
Similarly, the mobility and diffusion coefficient in this approximation are obtained from
$$\mu = -\frac{e}{3N} \sqrt{\frac{2}{m_e}} \int_0^{\infty} \frac{\varepsilon}{\sigma_{\mathrm{eff}}(\varepsilon)} \, \frac{\partial F_0}{\partial \varepsilon} \, d\varepsilon \qquad (8)$$
$$D = \frac{1}{3N} \sqrt{\frac{2}{m_e}} \int_0^{\infty} \frac{\varepsilon}{\sigma_{\mathrm{eff}}(\varepsilon)} \, F_0(\varepsilon) \, d\varepsilon \qquad (9)$$
where $e$ is the elementary charge and $\sigma_{\mathrm{eff}} = \sum_j \sigma_j$ is an effective momentum-transfer cross-section incorporating all different collision processes $j$, equal to the sum of the elastic momentum-transfer cross-section $\sigma_{j,m}$ plus all inelastic total cross-sections $\sigma_{j,T}$. More detailed solution methods of the Boltzmann equation, as well as swarm experiments, yield separate values for the longitudinal diffusion coefficient $D_L$, parallel to the electric field, and the transverse diffusion coefficient $D_T$ [7]. Moreover, in case the electron number density changes in time or space due to ionization or attachment processes, corrections apply to the above expressions of the transport coefficients [7,27]. Number-changing processes also give rise to additional swarm parameters that are commonly used in LTP modeling: the first Townsend coefficients$^{3}$ for ionization, $\alpha$, and for electron attachment, $\eta$, defined as the inverse characteristic length of spatial growth or decay of a steady-state electron swarm. The net electron production rate per unit volume can be calculated by multiplying $(\alpha - \eta)$ by the electron flux density in the drift direction.
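The two-term expressions for the mobility and diffusion coefficient discussed above can be illustrated numerically. The sketch below assumes one common convention, which is an assumption here rather than something stated in the text: energies in joules, an isotropic part F0 normalized so that ∫ √ε F0 dε = 1, μ = -(e/3N) sqrt(2/m_e) ∫ (ε/σ_eff) (∂F0/∂ε) dε and D = (1/3N) sqrt(2/m_e) ∫ (ε/σ_eff) F0 dε. All function names are hypothetical.

```python
import math

M_E = 9.1093837015e-31       # electron mass, kg
E_CHARGE = 1.602176634e-19   # elementary charge, C


def maxwellian_f0(energy_j, kt):
    """Maxwellian isotropic part, normalized so that int sqrt(e) * F0 de = 1."""
    return 2.0 / math.sqrt(math.pi) * kt ** -1.5 * math.exp(-energy_j / kt)


def two_term_transport(f0, sigma_eff, gas_density, emax, n=20000):
    """Return (mobility, diffusion coefficient) in SI units.

    f0        : callable F0(energy in J)
    sigma_eff : callable effective momentum-transfer cross-section (m^2), energy in J
    emax      : upper integration limit (J); F0 must be negligible there
    """
    gamma = math.sqrt(2.0 / M_E)
    h = emax / n
    mu_sum = 0.0
    d_sum = 0.0
    # Interior points only: the integrands vanish at e = 0 and are negligible at emax.
    for i in range(1, n):
        e = i * h
        df_de = (f0(e + h) - f0(e - h)) / (2.0 * h)   # central difference
        q = e / sigma_eff(e)
        mu_sum += q * df_de
        d_sum += q * f0(e)
    mobility = -E_CHARGE * gamma / (3.0 * gas_density) * h * mu_sum
    diffusion = gamma / (3.0 * gas_density) * h * d_sum
    return mobility, diffusion
```

For a Maxwellian F0 the derivative is -F0/(k_B T), so the ratio D/μ must equal k_B T/e (the Einstein relation) regardless of the cross-section, which provides a check on signs and prefactors. For a non-Maxwellian F0 obtained from a Boltzmann solver, the same integrals apply but the ratio D/μ defines a characteristic energy instead.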
Another parameter of interest in this context is the electron mean energy as a function of $E/N$,
$$\bar{\varepsilon} = \int_0^{\infty} \varepsilon^{3/2} F_0(\varepsilon) \, d\varepsilon \qquad (10)$$
This can be used to parameterize all the above electron transport coefficients and rate coefficients (in look-up tables) as functions of mean energy, disregarding the corresponding $E/N$ values, so that they can be compared with their "Maxwellian" counterparts. Fluid or global models [30][31][32][33] based on this approach contain a fluid energy equation to describe the evolution of the mean energy $\bar{\varepsilon}(\mathbf{r}, t)$ without assuming it to be directly related to $E/N$, similar to the thermal fluid models based on a Maxwellian distribution function. This is known as the "local mean energy approximation", in contrast to the "local field approximation" where the transport coefficients and rate coefficients are explicitly dependent on $E/N$. Depending on the type of LTP to be modeled, additional complications may be included in the above electron Boltzmann calculations, such as harmonic oscillations of the electric field, a magnetic field, collisions with different types of gas particles (including excited states), electron-electron Coulomb collisions, etc. This introduces additional dependencies of the rate coefficients and transport coefficients on the reduced angular field oscillation frequency or Larmor frequency $\omega/N$, the composition fractions of the gas, the ionization degree $n_e/N$, etc. Finally, it should also be mentioned that there are circumstances in LTPs where the velocity distribution function is affected by non-local kinetic effects that are not captured by the above assumptions of local equilibrium, so that the fluid transport coefficients and rate coefficients are no longer appropriate.

Additional Needs for Modeling Non-Equilibrium Plasmas
As discussed above, the LXCat project was initially developed for redistributing and comparing different sets of electron-neutral scattering cross-sections, the compilations of which were generally guided by the very accurate measurements of swarm parameters as functions of $E/N$ in different gases. As discussed in more detail in Section 5, good agreement with swarm parameters can be achieved using cross-section sets with varying levels of detail, including some which consider only a limited number of lumped excited states. However, there are situations which require more detail than can be obtained in this way. In the field of plasma diagnostics, state-to-state cross-sections are sometimes required to infer plasma properties such as the electron temperature, density or the local electric field strength. An example is the determination of the local electric field strength in nitrogen plasmas using the ratio of the $\mathrm{N_2}$ and $\mathrm{N_2^+}$ molecular band emissions [34]. For determining the population rates by electron impact, so-called optical emission cross-sections are measured experimentally. Knowledge of the detailed excitation pathways is also needed when one is interested in the production of specific excited states. For example, metastable species in noble gases are frequently of interest. Predicting the formation of these states requires state-to-state models which track the multitude of excited states that can exist. Similarly, in the case of plasma medicine, the production and destruction of particular reactive species can require immense reaction sets. The choice of detail for a particular plasma model requires careful consideration of the quantities of interest while balancing the model's complexity.
Beyond some power density and ionization degree, the electron and ion collisions start to modify the background gas, causing non-equilibrium populations of rotationally and vibrationally excited states and/or changing its chemical composition so strongly that this starts to affect the electron distribution function, so that the electron kinetics and plasma chemistry become strongly linked. Under these conditions, many additional collision data become important, such as cross-sections for electron collisions with all kinds of excited states (rotational, vibrational, electronic) [35] and state-to-state transition coefficients. Eventually, at sufficiently high ionization degree, the plasma transitions to the thermal plasma regime, where state populations and distribution functions are governed by local-thermal equilibrium laws and radiation transport losses become important. Data needs for modeling transport properties of plasmas in thermal equilibrium are beyond the present scope of LXCat.
From this short discussion, one can see that in contrast to other fields of research such as combustion or atmospheric chemistry, the non-equilibrium nature of LTPs places special requirements on modeling data. Only a handful of extremely detailed state-to-state data sets (mostly for noble gases) can make plausible claims to cover most LTP conditions [36]. However, it can be difficult to validate these datasets and added complications or concerns can limit their use. More frequently, one must use data sets with lumped transitions and effective states while exercising caution about the limitations of these models. Specifically, many such sets are developed with particular plasma models in mind and the individual cross-sections are not intended to be used outside of that context.

Origins
The LXCat project was originally developed by researchers at LAPLACE, a CNRS laboratory on the campus of the University of Toulouse III. The vision for the LXCat project came from Sergey Pancheshnyi in 2007, and by 2008 he had a working version of the website with the compilations of data used in-house by the group GREPHE at LAPLACE.
The first international presentation of the LXCat project was made at the 2010 ICAM-DATA meeting in Vilnius, Lithuania. At that time, there were six databases containing complete sets of cross-sections for electron-neutral interactions covering a total of 34 different neutral species explicitly intended to be used with two-term Boltzmann solvers. At that meeting, Prof. I. Bray from Perth, Australia asked about including results from his CCC calculations on LXCat. Though the results of these calculations feature meaningful differences from the data sets designed for use with a Boltzmann solver, a decision was made to include them because of their physical relevance. The result of this was the CCC database containing results from the group in Perth. Over time, the number of databases on LXCat has grown both in those designed for Boltzmann solvers and those that are not.
In October 2010, the GEC hosted a public discussion session on what was called the "GEC Plasma Data Exchange Project" where the LXCat website was presented for the first time and a call was made for participation from the larger LTPs community to make data available online. Following this discussion, the GEC initiated a series of evening workshops that have been held at the GEC almost every year. In 2011, Prof. L. Viehland from Chatham University in Pennsylvania joined the list of contributors with his large collection of data related to ion-neutral collisions. A second website (ICECat) [37] was created to host these data which include compilations of data published in reviews since the 1970s. Further data from Prof. A.V. Phelps and from Prof. J. d'Urquijo for ion-swarm parameters were also added to this new website.
In 2013, for technical reasons, it was decided to merge LXCat and ICECat into a single, larger project which would retain the name LXCat. This structure has endured, and both electron and ion data are now available through the same interface. Although the number and type of interactions provided by LXCat are greater than ever, the backend has remained essentially unchanged. As the maintainers of LXCat consider how to address the growing needs of the community, several changes are under consideration to improve the operation of LXCat. These aspects and more recent work related to this topic are discussed in Section 6. The community has reacted very positively to these initiatives, with over 50 people from more than 15 countries having so far participated in the project.

LXCat 10 Years Later
LXCat has grown into an international project with contributors from 4 different continents, as illustrated in Figure 2. The red markers on the map represent the locations of team members (including both the Tech and Outreach teams). The nations of the data set contributors are colored in blue, showing a broad coalition. Even this picture is somewhat conservative, as the data that comprise any given data set have their own diverse origins. This can also be seen in the extensive citations that accompany the data sets found on LXCat. A look at the website analytics reveals a similar picture: LXCat is a truly international project. Figure 3 illustrates the steadily increasing use of LXCat in the LTP community. The number of citations shows the extent of the project's impact, with over 1200 citations in the scientific literature (note that multiple references within each paper citing LXCat are not counted here). Also, as illustrated in Figure 3, the number of visitors to the site per week has been increasing since its inception and is now between 500 and 700 visitors per week. A significant fraction of visitors comes from the USA and China, which together represent about half of the total number of visitors, according to Google Analytics. The number of downloads follows the same variability as the number of visitors, indicating that most visits are made by active users.
Figure 2. Geographic location of the countries currently contributing to the LXCat project. Please note that this map reflects the location of direct contributors to the project and not necessarily the countries from which the data originally came.
Although these plots show a clear increase in the number of users, contributor activity has also been growing. The data sets are not static and are frequently updated by their respective contributors.
This posed a challenge for the traceability of data downloaded from LXCat and used in publications which was solved by the development of the Time Machine function which provides access to historical data. Because of the dynamic nature of the data, citing the date of retrieval of any given data set is critical for the traceability of the data. Proper citation of a data set from LXCat also extends to citing the original sources of the data that comprise a data set. For more information on how to cite data used from LXCat, we refer to Appendix A.

Data for Modeling of Plasmas (DMP) Association
The LXCat team has received several queries from individuals in the LTP community on how to support the project financially. To this end, the non-profit DMP association (https://assoc.lxcat.net/) was created in France with a mission "to facilitate the exchange of data and numerical tools for modeling the physics and chemistry of plasma." Although it serves as the semi-formal organizing structure for LXCat, it possesses a much larger scope. Specifically, the association's objectives are to:
Resources of the association so far include annual institutional membership dues, but public and private donations and grants are also possible.

What Is Available on LXCat?
LXCat makes a wide range of data and tools available to its users. The data include many different types of charged particle interaction data that originate from multiple contributors. There are also data that are still being incorporated into the main framework of the database, but are made available in a "staging area", as-is. In terms of the tools, LXCat offers an online Boltzmann solver, as well as links and downloads for other plasma modeling tools that use its data as well as other software of interest to the LTPs community. In this section, we give a brief overview of the range of resources that can be found on LXCat and give more detail on additions that were not covered by the last LXCat review [38].

List of Data Groups, Databases and Species
An extensive review and description of the databases available on LXCat until June 2016 was provided by Pitchford et al. [38]. We therefore provide here only a concise overview of the existing databases and discuss in more detail only notable changes and recent new contributions. In the following, except for Figure 4, we choose to present data "as-is" after generation by the LXCat online tools to illustrate the functionalities of the website.
Figure 4. Differential scattering cross-sections for elastic and inelastic scattering processes with ground-state argon. (a) Experimental measurements of elastic electron/Ar collisions from the ANU database [39,40] for different electron-impact energies, labeled from 1 to 8: 1 eV, 1.5 eV, 2 eV, 3 eV, 3.14 eV, 5 eV, 7.5 eV and 10 eV. Theoretical calculations from the BSR database [41][42][43], using the R-matrix method, of differential cross-sections for electrons with ground-state Ar: (b) elastic collision; (c) electronic excitation from the ground state to the $\mathrm{Ar}^{*}\,4s[1/2]_1$ state; (d) electronic excitation from the ground state to the $\mathrm{Ar}^{*}\,4s[3/2]_2$ state.

Electron-Scattering Cross-Sections
To calculate the electron energy distribution function accurately, any Boltzmann solver requires as input a complete set of collision cross-sections, i.e., a set accounting in a consistent manner for the total electron energy and momentum losses due to collisions. For electron-scattering cross-sections, LXCat recognizes 5 distinct processes, each of which relates back to a particular collision model used in solutions of the Boltzmann equation.
• Elastic: elastic momentum-transfer process.
• Excitation: electron-scattering process in which part of the electron kinetic energy is transferred into electronic excitation of atoms or molecules. Excitation of rotational and vibrational levels of molecules is likewise treated as an electron-impact excitation process.
• Ionization: inelastic collision in which a new electron is created.
• Attachment: an electron is lost in the collisional process. This can be due either to the formation of negative ions or to electron-ion recombination.
• Effective: momentum transfer including all electron scattering with a given target species. This is equal to the sum of the elastic momentum-transfer cross-section and the total cross-sections for all inelastic and ionization processes. For historical reasons, this is included instead of elastic momentum transfer in some data sets.
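The relation between the "Effective" and "Elastic" entries can be illustrated with a short sketch. It assumes that all cross-sections have already been evaluated on a common energy grid; the function name and the clipping at zero are choices made here for illustration and are not part of LXCat or of any particular solver.

```python
def elastic_from_effective(effective, inelastic_sets):
    """Estimate the elastic momentum-transfer cross-section.

    effective      : effective momentum-transfer values (m^2) on an energy grid
    inelastic_sets : one list per inelastic or ionization process, each
                     tabulated on the same energy grid as `effective`
    """
    elastic = []
    for k, sigma_eff in enumerate(effective):
        total_inelastic = sum(process[k] for process in inelastic_sets)
        # Clip at zero (an assumption of this sketch): negative values can
        # only come from rounding or inconsistent tables and are unphysical.
        elastic.append(max(sigma_eff - total_inelastic, 0.0))
    return elastic
```

Going the other way, an effective cross-section is simply the element-wise sum of the elastic momentum-transfer values and all inelastic and ionization values.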
For electron-scattering cross-sections with neutrals, there are in total 21 different databases with the following 18 already extensively discussed in [
Additionally, the UBC database [119] contains oscillator strengths for the scattering of electrons by neutral species in their ground state as a function of the electron energy. For optically allowed transitions, such data allow an estimate of the cross-section for the transition between two levels by electron impact. A well-known approximation is the Drawin formula for excitation of an atom between levels k and l [120]. In the case of electric-dipole-allowed transitions, oscillator strengths are also used quite successfully for rescaling plane-wave Born cross-sections for electron-impact excitation of neutral atoms [121][122][123].
Two additional databases are currently under construction and already available online: the eMol-LeHavre database [124] and the Laporta database [125]. The Le Havre group has calculated vibrationally and ion-core resolved recombination cross-sections for various molecular ions and is planning to make them available on LXCat, with cross-sections for N2+ [126] and BeH+ [127,128] already uploaded. The Laporta database is a compilation of vibrationally resolved cross-sections for electron-molecule scattering based on R-matrix calculations. It currently contains data for vibrationally resolved dissociative excitation and attachment for the NO [129,130] and O2 [131] molecules.

Swarm/Transport Data for Electrons
There are 9 different databases for swarm and transport parameters of electrons in gases. The Christophorou [95] (data for SF6) and IST-Lisbon [35] (data for Ar, CH4, H2, He, and N2) databases also contain electron-scattering cross-sections and were already briefly described in Section 4.1.2. The following were already described in [
The following databases are new since 2016:
• CDAP (State-to-state electron-impact excitation rate coefficients) [151]: State-to-state electron-impact excitation rate coefficients between argon neutral or xenon ionic states, obtained by a combination of plasma experiments and theoretical calculations. In particular, the data are used for collisional-radiative modeling and examined in the afterglow plasma of argon (CDAP method) or in the different regions of electric propulsion devices using xenon. Rate coefficients for both atoms and ions are given here, and can be used for plasma modeling and optical emission spectroscopy (OES) [152,153].
• Heidelberg [154]: database of electron transport parameters measured at the University of Heidelberg in the years 1978 to 1996. Different electron-swarm experiments were set up by Bernhard Schmidt and co-workers, which allowed the measurement of electron transport parameters in pure electric fields and in perpendicular crossed electric and magnetic fields. This database contains only the data for B = 0. Additional swarm data for non-zero B fields will be made available in tabular and graphical form when resources become available.
Electron rate coefficients, which are a category of swarm parameters on LXCat, are available both as a function of reduced electric field and as a function of electron temperature, which is typically defined as k_B T_e = (2/3) ε̄, where ε̄ is the electron mean energy. In many experiments, such as the ones from Zhu et al. [153] which are stored in the CDAP database, only the electron temperature of the plasma is known. It is therefore also a useful parameter for comparing rate coefficients from different data sets, which can be measured or calculated. Currently, the output of the online BOLSIG+ solver is stored only as a function of E/N for later comparison with other swarm parameters.
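Because solver output is typically expressed through the mean energy while many rate coefficients are tabulated against electron temperature, a small conversion helper based on the definition k_B T_e = (2/3) ε̄ can be convenient. The following is a minimal sketch; the function names are hypothetical.

```python
BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def electron_temperature_ev(mean_energy_ev):
    """Electron temperature (k_B * T_e, in eV) from the mean energy in eV."""
    return 2.0 * mean_energy_ev / 3.0


def electron_temperature_kelvin(mean_energy_ev):
    """Electron temperature in kelvin from the mean energy in eV."""
    return electron_temperature_ev(mean_energy_ev) / BOLTZMANN_EV_PER_K
```

Note that this definition does not require the distribution to be Maxwellian; it only fixes the convention used to label the data.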

Ion-Scattering Cross-Sections
Ion-scattering cross-sections with neutral species are another type of data required for the modeling of ion transport in weakly ionized plasmas. However, for ions, in contrast to the situation for electrons, there is usually no need for inelastic cross-section data, but there is a need for differential elastic scattering data or for some approximation thereof. For rare gas ions in their parent gas, Phelps proposed to approximate the total differential cross-section for elastic scattering as the sum of an isotropic component and a backscatter component [157]. Figure 5 shows as an example the ion scattering of Ar+ and He+ in their parent gases, taken from the Phelps database. The Phelps database also contains data for Xe+ in its parent gas and in Ne. In July 2017, a second database providing this type of data was introduced, namely the Viehland database [158]. This database contains data for ions of all elements highlighted in Figure 6 (excluding P) in the noble gases [159]. Viehland provides angular scattering information in the form of angle moments of the total differential scattering cross-sections. He calculated these from the ion-neutral interaction potentials, which are also present in the Viehland database.
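In a particle simulation, the isotropic-plus-backscatter approximation can be used to sample a scattering angle for each ion-neutral collision. The sketch below assumes that the two components are given at the relevant collision energy and that the angle is defined in the centre-of-mass frame; the function name and interface are hypothetical.

```python
import math
import random


def sample_scattering_angle(sigma_iso, sigma_back, rng=random):
    """Sample a scattering angle (radians) from an isotropic + backscatter model.

    sigma_iso  : isotropic component of the cross-section (m^2)
    sigma_back : backscatter component of the cross-section (m^2)
    rng        : object with a random() method returning a float in [0, 1)
    """
    total = sigma_iso + sigma_back
    if total <= 0.0:
        raise ValueError("total cross-section must be positive")
    if rng.random() < sigma_back / total:
        return math.pi  # backscatter: the velocity direction is reversed
    # Isotropic scattering: cos(chi) is uniformly distributed in [-1, 1].
    return math.acos(1.0 - 2.0 * rng.random())
```

Passing a seeded `random.Random` instance as `rng` makes the sampling reproducible.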

Interaction Potentials
Viehland has calculated interaction potentials between ions and neutral species and made them available on the LXCat website. Interaction potentials allow the evaluation of the collision integrals for the scattering of ions by neutral species. This allows one to calculate the ion mobility and diffusion coefficient under an electric field in a background gas [28,29]. In the Viehland database, there are currently interaction potential data for 91 ionic species, representing a large portion of the periodic table (see also Figure 6), with a variety of the noble gases and H2, resulting in 660 potentials. Figure 7 shows as an example the interaction potential of Hg+ (^2S_1/2) with the noble gases from the Viehland database [158,160]. One can see that the interaction potential becomes repulsive at a longer internuclear distance for atoms with a higher atomic number because of their larger radius. A potential well appears at a distance of about 6 Bohr and its depth increases with atomic number as well. At large internuclear distances the potential energy asymptotically tends to zero.
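As an illustration of how a collision integral leads to transport coefficients, the sketch below evaluates the first-order, low-field (Mason-Schamp) expression for the ion mobility and the Einstein relation for the diffusion coefficient. This is a textbook approximation chosen here for brevity and is not necessarily the procedure used to generate the Viehland data; the momentum-transfer collision integral is assumed to be already known, and all function and parameter names are hypothetical.

```python
import math

K_B = 1.380649e-23          # Boltzmann constant, J/K
E_CHARGE = 1.602176634e-19  # elementary charge, C
AMU = 1.66053906660e-27     # atomic mass unit, kg


def low_field_mobility(omega_m2, ion_mass_amu, gas_mass_amu,
                       temperature_k, gas_density_m3, charge_number=1):
    """First-order low-field ion mobility in m^2/(V s).

    omega_m2 is the momentum-transfer collision integral (m^2) evaluated
    at the gas temperature.
    """
    reduced_mass = ion_mass_amu * gas_mass_amu / (ion_mass_amu + gas_mass_amu) * AMU
    prefactor = 3.0 * charge_number * E_CHARGE / (16.0 * gas_density_m3)
    thermal = math.sqrt(2.0 * math.pi / (reduced_mass * K_B * temperature_k))
    return prefactor * thermal / omega_m2


def low_field_diffusion(mobility, temperature_k, charge_number=1):
    """Einstein relation (valid in the low-field limit): D = K k_B T / q."""
    return mobility * K_B * temperature_k / (charge_number * E_CHARGE)
```

At higher reduced fields the ion energy distribution departs from thermal equilibrium and higher-order theories are needed.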

Swarm/Transport Data for Ions
There are three databases for swarm parameters of ions: Phelps [112], UNAM [150] and Viehland [158]. These databases contain transport parameters of ions, derived either from experiments or from theory. Figure 8 illustrates a few of the various swarm parameters that can be found on LXCat for ions. The diffusion coefficients, drift velocities and mobilities of ions in their parent gas as well as in mixtures (e.g., air) have been tabulated as a function of other parameters (typically E/N and the gas temperature). The Phelps database has data for the ions He+, Ne+, Ar+, N+, and N2+ in their respective parent gases [112]. The UNAM database contains mobility parameters for CF3+ in CF4, CHF2+ in CHF3, and C+ in Ar [161]. The Viehland database is transcribed in part from the Gaseous Ion Transport and Rate Coefficient Database developed by Viehland [159]. This database contains several additional parameters derived from theory following a high-order moment expansion of the Boltzmann equation. Data for 424 positively or negatively charged atomic and molecular ions are available for a wide variety of elements across the periodic table (see also Figure 6). Some of the notable changes and extensions since 2016 in the Viehland database involve ions of C [162,163], Co, Cr, and Ni [164], Gd [165], K [166,167], Mg [168][169][170][171], O [172,173], S [174], and Si [175].

List of Species
LXCat contains many species, which can be atoms and molecules as well as atomic and molecular ions. These species can be either in their ground state or carry some internal degree(s) of excitation (e.g., electronic, rotational, vibrational). Because of their very large number, we do not list them all here and refer directly to the LXCat website. A visual representation of the diversity of species covered by LXCat is given in Figure 6, which shows a heatmap of the number of records distributed over the elements constituting the species. One can see that a large amount of data exists for noble gases, which is not surprising considering their importance for the modeling of gas discharges and their fundamental understanding. Other elements such as H, C, N, O, F, Cl and Br also have a fair amount of data due to various applications, such as in the semi-conductor industry, renewable energies and bio-medical applications, where plasmas are generated in (mixtures of) molecules such as H2, O2, H2O, CH4, CO, CO2, CF4, HBr and SF6.

Staging Areas on LXCat
Many groups in the LTP community have developed software for solving different aspects of low-temperature plasma physics and chemistry. Some of this software has been made freely available for reuse and further development (open-source codes) by other researchers. A short description, written by the developers, of the software currently hosted and/or linked on the LXCat website is given in the following with appropriate references:
• BOLSIG+ is a user-friendly Windows application for the numerical solution of the Boltzmann equation for electrons in weakly ionized gases in uniform electric fields, conditions which typically appear in the bulk of collisional low-temperature plasmas. Under these conditions the electron energy distribution is determined by the balance between acceleration in the electric field and momentum and energy loss in collisions with neutral gas particles [27,179]. An improved description of Coulomb collisions was developed by Hagelaar and is included in the more recent versions of BOLSIG+ [180].
• ZDPlasKin (Zero-Dimensional Plasma Kinetics solver) is a Fortran 90 module designed to follow the time evolution of the species densities and gas temperature in a non-thermal plasma with an arbitrarily complex chemistry [21]. BOLSIG+ is integrated in this package.
• EEDF is a user-friendly program for the numerical solution of the Boltzmann equation for the Electron Energy Distribution Function in a low-ionized plasma in an electric field. It is used for calculations of electron transport and kinetic coefficients in gas mixtures. EEDF comes with a Data Bank, several files containing data on cross-sections for electron scattering from atoms and molecules [118] compatible with the EEDF program. These cross-section data can be found in the TRINITI database on LXCat.
• PumpKin (pathway reduction method for plasma kinetic models) is a tool for postprocessing of results from zero-dimensional plasma kinetics solvers. The goal is to analyze the production and/or destruction mechanisms of selected species of interest, as well as to reduce complex plasma chemistry models. The user is required to solve the full chemical reaction system only once. The output is then used as an input for PumpKin. The PumpKin package analyzes the full chemical reaction system, and automatically determines all significant pathways in the system, i.e., all pathways with a rate above a user-specified threshold [182,183].
• METHES is an open-source Monte Carlo collision code written in MATLAB for the simulation of electron transport in arbitrary gas mixtures in the presence of electric fields. For steady-state electron transport in a uniform electric field, the program provides the transport coefficients, reaction rates and the electron energy distribution function. The program is compatible with the electron-scattering cross-section files from LXCat. Temporal studies of electron transport are possible by tracking position, velocity and number of electrons [184,185].
• LoKI-B can easily simulate the electron kinetics in any complex gas mixture (of atomic/molecular species), describing first- and second-kind electron collisions with any target state (electronic, vibrational and rotational), characterized by any user-prescribed population [187].
• THERMCAT: A fast and robust online tool for simulation of different modes of axially symmetric current transfer from high-pressure arc-discharge plasmas to cylindrical thermionic cathodes, created and maintained by Mikhail Benilov and co-workers. The code computes both the diffuse mode of current transfer and modes with axially symmetric spots and can be used in a wide range of arc currents, plasma-producing gases, and cathode materials and dimensions. The tool also serves as a tutorial that can help to make physicists and engineers working in the field comfortable with multiple solutions describing different modes of current transfer to electrodes in low-temperature plasmas [188,189].
• ELEM: A code for evaluation of the field to thermo-field to thermionic electron emission current density. The code is based on an accurate and computationally efficient method of evaluation of the Murphy-Good formalism [190].
• MOBION: A code for evaluation of the mobility and temperature of ions in a weakly ionized gas as functions of reduced electric field and gas temperature. The code is based on the two-temperature displaced-distribution theory [191].
Additionally, to accept data from more contributors, a new staging area was added for hosting databases in a "free format". This concerns, for instance, databases with different output formats or databases for which the number of target species is so large that it cannot currently be handled in a satisfactory fashion by the LXCat web interface. The staging area is not considered to be a permanent solution but only a means of allowing quicker distribution of the data. For a more detailed discussion of these aspects, we refer to Section 6. The present staging area makes available a new XML format, developed by Michele Renda (IFIN-HH, Particles Physics Department, Magurele, Romania), for the data available on LXCat. These data are regularly updated to reflect changes made by the contributors. This archive contains the LXCat integral cross-section data in ZCross XML format [192]. This format also contains some additional information, such as the complete bibliographic references (including DOI) for all the journal articles mentioned in the descriptions and comments.

Video Tutorial
A video tutorial has been prepared to complement this manuscript. This video tutorial begins with an introduction to the LXCat website and proceeds to the online Data Center, where the LXCat hosted data may be accessed by users. The different types of available data are reviewed. As an example, the tutorial demonstrates how to retrieve, view, and download electron-neutral cross-sections for argon from the Phelps database [112,193].
The video tutorial also demonstrates how to use the online BOLSIG+ tool and the previously retrieved Phelps database cross-sections for argon to perform an all-online calculation of swarm coefficients. The results of the online calculation are then saved for later comparison and viewing. At this point, the tutorial returns to the Data Center, this time to retrieve both experimentally measured swarm coefficients, as well as previously saved results of the online calculation. For comparison with the results of the online calculation, the experimental reduced mobility data of Nakamura and Kurachi [194], and the experimental reduced Townsend coefficient data of Kruithof [195] from the IST-Lisbon database are selected [35]. In the final step the tutorial concludes by demonstrating how the results of the online calculation can be compared with retrieved experimental swarm data.

Comparison of Data
On the LXCat website, it is possible to directly select different cross-sections or swarm coefficients and plot them without the need to export any data or use offline programs. Such an example is shown in Figure 9 with all the electron attachment cross-sections for CO2 currently present on LXCat. One can see an overall good agreement between the different databases.

LXCat Online Boltzmann Solver and Swarm Parameters Comparison
The online Boltzmann solver corresponds to the March 2016 version of the BOLSIG+ freeware with default options, i.e., the temporal growth model is assumed, and Coulomb collisions are not taken into account. Different output formats, which are not available online, are also provided in the standalone freeware version of BOLSIG+. The longitudinal diffusion coefficient is not calculated in the online version and other quantities, such as energy transport coefficients, are only available in the recent freeware versions [179].
An important tool on the LXCat website is the possibility of generating online a BOLSIG+ database from the online Boltzmann solver, which is a temporary database created, when requested, by the user following execution of the online Boltzmann solver. The contents of this database include swarm/transport parameters versus E/N calculated after selecting a complete dataset for each species considered in the calculation (see Figure 10 for an example of selected datasets and some computed quantities). The data existing in this temporary database can be plotted along with measured quantities available in the other databases or compared with results from other online calculations in different gas mixtures. This temporary database is deleted when the user exits the LXCat site. A video tutorial was made to show how to perform such calculations online and how to compare the results of the calculations with data available on the website [197].

Figure 10. (a) Selected datasets ([110,196]: cross-sections 1-13 belong to the CO2 and 14-38 to the N2 datasets) and computed quantities for a 0.5-0.5 CO2/N2 gas mixture as a function of reduced electric field E/N in Td; (b) electron energy distribution functions (EEDFs); (c) mean electron energy; (d) EEDF anisotropy.
Figure 11 shows swarm data for pure CO2 calculated using the Biagi database [80,198], compared as a function of reduced electric field with the experimental data available on LXCat (see the figure caption for the references). One can see a relative spread of the experimental data (see references in Section 4.1.3) but an overall good agreement in the intermediate reduced field strength range.

Figure 11. Swarm parameters calculated using the online Boltzmann solver for pure CO2 using the Biagi database [80,198] and comparison with experimental swarm measurements [133,135, from Dutton [132], ETHZ [134], Heidelberg [154], LAPLACE [149] and UNAM [150] databases as a function of reduced electric field. The computed quantities are: (a) characteristic electron energy, (b) product of the gas number density and electron diffusion coefficient, (c) product of the gas number density and electron mobility coefficient, (d) Townsend coefficients.

Definitions, Guidelines, and Best Practices

Historical origins of electron-neutral cross-section sets
To understand how these data may be best used, it is helpful to understand the origin of the data. Historically, cross-section sets were derived via a procedure of matching swarm coefficients calculated with a kinetic model (e.g., two-term Boltzmann, Monte Carlo) to experimentally measured swarm coefficients. This was achieved by tuning cross-sections, guided by available experimental data and theory. Cross-sections from sets derived in this manner may be considered to be tuning parameters for their respective kinetic model, as opposed to state-to-state cross-sections derived from theory or experiments. The accuracy of these cross-section sets was not assessed by the accuracy of any individual cross-section, but rather by how all these cross-sections together yielded the desired swarm coefficients when used in a specific kinetic model. The error associated with the kinetic model used in the tuning procedure directly contributed to the error of the cross-section set. That is, a cross-section set derived using a two-term Boltzmann equation model carries all the assumptions and limitations associated with the two-term approximation. Moreover, it has been demonstrated that a given set of swarm coefficients can be reproduced by more than one set of cross-sections [7]. The quality of a given data set can only be evaluated within the theoretical framework in which it was derived. Its reuse and evaluation using, for instance, a multi-term Boltzmann equation model will thus lead to differences that are not only related to intrinsic limitations of the physical model used. Applying a cross-section set derived using the two-term approximation in a multi-term Boltzmann equation model, or in a Monte Carlo/PIC model, is not recommended and will likely yield non-negligible error.
Today, the experimental data and theory available to guide the tuning procedure are far superior to those of decades past, which has yielded cross-section sets that feature individual cross-sections that are themselves accurate, independent of the remainder of the set, while retaining the ability to generate highly accurate swarm coefficients. In addition, the direct theoretical calculation of a "complete" set of cross-sections has also become practical for a few special cases in rare gases [6,9,10], yielding cross-section sets that are independent of the limitations associated with the cross-section tuning procedure.

Definition of complete set
The designation of a complete set of electron-neutral scattering cross-sections has historically indicated that the use of these cross-sections in a specific kinetic model will yield accurate swarm coefficients. This designation remains in use in LXCat, but is also used for theoretical cross-section sets that produce accurate swarm coefficients when used in a kinetic model. When using cross-sections to calculate swarm coefficients or to obtain correct electron energy and momentum conservation balances, it is necessary to use a complete set of cross-sections.

Mixing cross-sections data sets or using incomplete sets
Most cross-section sets hosted by LXCat were produced via the aforementioned procedure of matching calculated and measured swarm coefficients. These complete cross-section sets are considered to be self-consistent, where the various pathways for electron power loss, deflection, and so on, cumulatively contribute to achieving the desired power balance and correct swarm coefficients. Introducing or substituting cross sections from other sources will affect this power balance and violate the self-consistency of the set. Similarly, the omission of cross-sections from a complete set will also violate the self-consistency of the set.
Users are advised to use the complete set of cross-sections when applying these data in their models and are urged not to mix data from various sources. In cases where a complete set is not used, or data are compiled from various sources, the user is strongly encouraged to verify that the set of cross-sections used produces the anticipated results (e.g., by comparing with swarm experiments but also with other experimental results).

Selection of a cross-sections data set
Often, many different electron-neutral cross-section sets are available for a given neutral species. Users are strongly encouraged to review the original sources and publications associated with the cross-section set they are using, as well as the comments and notes provided with each LXCat file, to confirm that the data are well suited to the problem being addressed.

The elastic momentum-transfer cross-section and anisotropic scattering
The effects of anisotropic scattering are usually well approximated by simply using the elastic momentum-transfer cross-section for elastic collision events and assuming isotropic scattering [23,220,221]. This is especially useful as anisotropic scattering is often complex and not well quantified for many neutral species. When anisotropic scattering is treated explicitly, the total elastic cross-section should be used, not the elastic momentum-transfer cross-section. Confusing the two is a common mistake that can yield significant error.
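The distinction between the two quantities follows from their definitions as angular integrals of the differential cross-section: the total elastic cross-section weights all angles equally, whereas the momentum-transfer cross-section carries an additional factor (1 - cos θ). The sketch below evaluates both with a simple trapezoidal rule, assuming an azimuthally symmetric differential cross-section tabulated on a grid of polar angles; the function name is hypothetical.

```python
import math


def integrate_dcs(angles_rad, dcs):
    """Return (total elastic, momentum-transfer) cross-sections.

    angles_rad : increasing polar angles from 0 to pi (radians)
    dcs        : differential cross-section at those angles (m^2/sr)
    """
    total = 0.0
    momentum = 0.0
    for k in range(len(angles_rad) - 1):
        t0, t1 = angles_rad[k], angles_rad[k + 1]
        f0 = dcs[k] * math.sin(t0)
        f1 = dcs[k + 1] * math.sin(t1)
        # Trapezoidal rule on each angular interval.
        total += 0.5 * (f0 + f1) * (t1 - t0)
        momentum += 0.5 * (f0 * (1.0 - math.cos(t0))
                           + f1 * (1.0 - math.cos(t1))) * (t1 - t0)
    return 2.0 * math.pi * total, 2.0 * math.pi * momentum
```

For isotropic scattering the two results coincide; for forward-peaked scattering the momentum-transfer cross-section is smaller than the total elastic cross-section.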

Swarm configurations and growth models
Swarm coefficients depend on the configuration in which they were measured or calculated, which is most often either a pulsed Townsend (PT) or steady-state Townsend (SST) configuration. In calculations, these configurations may be accounted for via the selection of the growth model. The PT growth model assumes a temporally exponential growth or decay in the electron density, while the SST growth model assumes a spatially exponential growth or decay. It is emphasized that swarm coefficients, such as drift velocity, are not expected to be the same in these two configurations. When making a comparison between calculated and measured swarm coefficients, the kinetic model should be configured to represent the same configuration in which the experiment was conducted, i.e., experimental data from an SST apparatus should only be compared with a model configured to represent SST conditions. Similarly, experimental data from a PT configuration should only be compared with a model configured to represent PT conditions. Users are again advised to review the relevant sources and publications associated with the experimental swarm data, as well as the notes and comments included with the LXCat file containing the data, to determine the conditions under which these data were collected.
Many of the publicly available tools for performing kinetic calculations feature the ability to select the growth model, although not all codes use the same terminology. The standalone version of BOLSIG+ allows for the selection of either a PT or SST growth model, while the online BOLSIG+ tool uses a PT configuration. Similarly, tools such as BOLOS, LoKI-B, METHES, and MultiBolt are all able to calculate swarm coefficients for either PT or SST conditions. It is noted that the PT conditions are a special case of the more general hydrodynamic description of an electron swarm. Codes that employ the hydrodynamic assumption, such as MultiBolt, may produce so-called hydrodynamic swarm coefficients, which are analogous to PT coefficients.

Piecewise linear interpolation of cross-sections
In many Boltzmann solvers (including BOLSIG+), a piecewise linear function is used for the interpolation of data tables. As emphasized above, the cross-section data available on LXCat should be used consistently with how they were originally compiled. Unless otherwise stated in the headers of the downloaded files, linear interpolation should be used to evaluate cross-sections for energies between data points.
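A minimal sketch of such a piecewise linear evaluation is given below. The treatment of energies outside the tabulated range (here, holding the first or last tabulated value) is an assumption of this sketch; individual solvers may extrapolate differently, and the function name is hypothetical.

```python
from bisect import bisect_right


def interpolate_cross_section(energies, sigmas, energy):
    """Piecewise linear interpolation of a tabulated cross-section.

    energies : tabulated energies in increasing order (eV)
    sigmas   : cross-section values at those energies (m^2)
    energy   : energy at which the cross-section is wanted (eV)
    """
    # Outside the table: hold the end values (an assumption of this sketch).
    if energy <= energies[0]:
        return sigmas[0]
    if energy >= energies[-1]:
        return sigmas[-1]
    hi = bisect_right(energies, energy)  # first index with energies[hi] > energy
    lo = hi - 1
    fraction = (energy - energies[lo]) / (energies[hi] - energies[lo])
    return sigmas[lo] + fraction * (sigmas[hi] - sigmas[lo])
```

Using a different scheme (e.g., log-log interpolation) on a table that was compiled for linear interpolation can change the computed rate coefficients, particularly near thresholds where the tabulated points are sparse.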

Bulk and flux transport coefficients
Although rigorously defined [222], the topic of bulk and flux transport coefficients is still not well understood by much of the LTP community. Here, we briefly review the key points of this subject, and detail some general guidelines for applying and using these data.
If we apply the hydrodynamic assumption to an electron swarm, the electron phase-space distribution of the swarm is expanded as an infinite series in gradients of the electron density. Transport coefficients of a given order quantify the transport of electrons associated with a density gradient of that order. For uniform density (i.e., a zeroth-order gradient in density), the transport coefficient is simply the ionization rate. The transport coefficient for a first-order gradient is the drift velocity, and the diffusion tensor describes transport over second-order gradients. The presence of non-conservative collisions (i.e., ionization, attachment, etc.) results in modified values for these transport coefficients. The flux transport coefficients are the transport coefficients which neglect the effect of non-conservative collisions. Bulk transport coefficients are the flux transport coefficients plus a correction to account for non-conservative collisions. In the absence of non-conservative collisions, the flux and bulk transport coefficients are equal to one another. When comparing measured and calculated swarm coefficients, bulk transport coefficients should be used (except in an SST configuration). When these data are to be used in a fluid model, flux transport coefficients are recommended.
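For reference, these definitions can be written compactly in a commonly used notation. The symbols below are introduced here for illustration and are assumptions of this sketch: n is the electron density, f the phase-space distribution, W_F and D_F the flux drift velocity and diffusion tensor, W_B and D_B their bulk counterparts, and ν_eff the net rate of electron production by non-conservative collisions.

```latex
% Density-gradient (hydrodynamic) expansion of the phase-space distribution
f(\mathbf{r}, \mathbf{v}, t) = \sum_{k=0}^{\infty} f^{(k)}(\mathbf{v}) \odot (-\nabla)^{k} n(\mathbf{r}, t)

% Flux coefficients: defined through the electron flux, truncated at first order
\boldsymbol{\Gamma}(\mathbf{r}, t) = \mathbf{W}_{F}\, n - \mathbf{D}_{F} \cdot \nabla n

% Bulk coefficients: defined through the evolution of the electron density
\frac{\partial n}{\partial t} = \nu_{\mathrm{eff}}\, n - \mathbf{W}_{B} \cdot \nabla n + \mathbf{D}_{B} : \nabla \nabla n
```

When ν_eff vanishes and collisions conserve the electron number, substituting the flux expression into the continuity equation shows that W_B = W_F and D_B = D_F, consistent with the statement above.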

Temperature range for validity of a given set of cross-sections
For some gases it is necessary to consider the populations of lower rotational and vibrational states when deriving a set of cross-sections. These states are often assumed to be Boltzmann distributed with a distribution temperature equal to the background gas temperature. As a result, these cross-section sets are only strictly valid for the single gas temperature that was assumed in the generation of the cross-sections. Users should consult the literature and notes associated with a given set of cross-sections to verify that it is suitable for the temperature of their application. For some molecular gases, the necessary state-specific cross-sections are available in addition to a complete set of cross-sections, such that the data set may be extended to other temperatures.
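When state-specific cross-sections are available, the fractional populations needed to weight them at a new gas temperature follow from the Boltzmann distribution. The sketch below assumes that the level energies and degeneracies are supplied by the user; the function name is hypothetical.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def boltzmann_populations(energies_ev, degeneracies, gas_temperature_k):
    """Fractional populations of internal states at a given gas temperature.

    energies_ev  : level energies (eV)
    degeneracies : statistical weights of the levels
    """
    kt = BOLTZMANN_EV_PER_K * gas_temperature_k
    e_min = min(energies_ev)  # shift energies for numerical stability
    weights = [g * math.exp(-(e - e_min) / kt)
               for e, g in zip(energies_ev, degeneracies)]
    partition = sum(weights)
    return [w / partition for w in weights]
```

Each state-specific cross-section is then multiplied by the corresponding fraction before being used in the solver, in the same way as for any other target population.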

Treatment of three-body collisions
Some cross-section sets include three-body cross-sections, where a single electron collides with a pair of neutral targets, which results in a pressure dependence of the cross-section. These three-body cross-sections are often normalized to the neutral species density, so that they are easily incorporated into an existing model by simply multiplying the cross-section by the background gas density. The user should pay attention to the background density at which the apparent two-body cross-section was derived, which can differ between databases.
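The two conventions can be handled with a pair of small helpers, sketched below under the assumption that the apparent two-body cross-section scales linearly with the background gas density. The function names are hypothetical, and the reference density must be taken from the notes of the database in question.

```python
def apparent_from_normalized(sigma_normalized_m5, gas_density_m3):
    """Apparent two-body cross-section (m^2) from a density-normalized
    three-body cross-section (m^5) and the background gas density (m^-3)."""
    return [s * gas_density_m3 for s in sigma_normalized_m5]


def rescale_apparent(sigma_apparent_m2, reference_density_m3, gas_density_m3):
    """Rescale an apparent two-body cross-section that was tabulated at a
    reference density to a different background gas density."""
    factor = gas_density_m3 / reference_density_m3
    return [s * factor for s in sigma_apparent_m2]
```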

Sorting and filtering target species
The full list of species on LXCat is, in many cases, very large. The species list can be sorted by mass (in atomic mass units, a.m.u.), charge, element or name. It is also possible to apply a filter to the list to identify all states available for a given species, or to search for a specific state.

Accessing and viewing large data sets online
While selecting databases, species, and process types, one needs to pay attention to the number of selected data which can be viewed and compared simultaneously. At present, it is only possible to view and compare data for the first 101 processes in the output window. This limitation exists only for viewing and comparing the data. The exported ASCII or XML data files are not affected by this issue.

Chemistry models and how they affect calculations
In recent years, there have been efforts in the community to include the effect of plasma chemistry on calculated swarm coefficients. This introduced a new type of electron-neutral cross-section set, but the previously detailed points still apply. To reiterate, users should ensure that the manner in which they use these data is consistent with the way these data were generated. In this context, these cross-section sets must be used both with an electron kinetic model that is consistent with the kinetic model used to derive the data and with a plasma chemistry model that is consistent with the plasma chemistry model used in their derivation. One needs to be particularly wary whenever effective states are being used [223] and check for consistency in species definitions between different literature sources.

Formatting for excitation and superelastic processes in a cross-section file
For excitation processes, the name of the target state (i.e., the species involved in the collision with the electron) and the excited state (i.e., product(s) following the collision) can be specified on the same line, separated by either an arrow, "->", or by a double-head arrow, "<->" (e.g., "Ar->Ar*" or "Ar<->Ar*", where "Ar" is the target state and "Ar*" is the excited state). The latter case is used to denote that the excitation is a reversible process, with the reverse process corresponding to a superelastic collision (i.e., the electron gains energy after the collision). In the double-head arrow format, the online Boltzmann solver will automatically define and additionally use the corresponding superelastic collision in its kinetic computation.
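The arrow convention above can be illustrated with a short parsing sketch. It is not the parser used by LXCat or by the online Boltzmann solver; the function names and the returned structure are hypothetical, and only the two arrow forms described in the text are handled.

```python
def parse_excitation_line(line):
    """Split 'target -> product' or 'target <-> product' into its parts."""
    # Check the double-head arrow first, because "<->" contains "->".
    for arrow, reversible in (("<->", True), ("->", False)):
        if arrow in line:
            target, product = (part.strip() for part in line.split(arrow, 1))
            return {"target": target, "product": product, "reversible": reversible}
    raise ValueError(f"no arrow found in {line!r}")


def processes(line):
    """List the processes implied by one line: the excitation itself and,
    for a double-head arrow, the reverse (superelastic) process."""
    parsed = parse_excitation_line(line)
    result = [("excitation", parsed["target"], parsed["product"])]
    if parsed["reversible"]:
        result.append(("superelastic", parsed["product"], parsed["target"]))
    return result


forward_only = processes("Ar->Ar*")
with_reverse = processes("Ar<->Ar*")
```

With the single arrow only the excitation is returned; with the double-head arrow the reverse process, with target and product swapped, is added as well.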

LXCat under the Hood
The present design and implementation of LXCat have served the community well for many years. However, as the complexity of the data sets has grown, it has become clear that further improvements to the backend may substantially improve the LXCat user experience. In particular, ongoing work is focused on enabling more timely improvements and additions to LXCat. In the past year we have identified several bottlenecks, extracted a set of specifications for a new version of LXCat, and prototyped these ideas. In this section, we present the results of these experiments by contrasting old and new ideas about:
• The data exchange format, the data model and the choice for a database type;
• The overall framework design.

The Exchange Format and Choice for a Database Type
The data exchange format was originally designed for one particular purpose: the exchange of complete cross-section sets for a discharge that is governed by elastic electron scattering, electron-impact excitation and ionization and perhaps electron attachment. Such data are tuned to give excellent results for transport coefficients and rate coefficients when used in combination with a Boltzmann solver (see Section 2 for more detail). Interestingly, such calculations can be done without any knowledge of the target particle's precise electronic or, for molecules, its rovibrational configuration. This is demonstrated by the appearance of compound states such as Ar* in LXCat. In the case of an excitation process, the input file merely needs to provide the names of the initial and final states of the atom or molecule, the threshold energy of the process and the cross-section data. This can be seen in Figure 12, which shows an excerpt from the Phelps database [112]. Although highly flexible, the current LXCat file format poses a few challenges in practice. First, a custom parser is often required, and the lack of a formal schema makes it difficult to verify the integrity of the document. Second, some fields are used in a redundant manner (cf. lines 2-6), which can lead to confusion in parsing and about what is relevant to a given model. Third, there are several relevant quantities for new data sets which cannot be expressed in a consistent manner. As an example, if the database entry refers to a particular state of argon such as Ar(¹S₀), there is no method to detect that the state has been defined using the LS coupling scheme. Furthermore, the use of different notation schemes by different authors makes it difficult to reliably infer the parameters S = 0 and L = 0. This problem is often circumvented by the inclusion of such information in the form of contributor-specific annotations in the comment blocks.
As an example, the Lisbon database for N₂ [101] contains comments specifying detailed state information that is used by the LoKI-B Boltzmann solver [187]. This observation suggests the possibility of improving on the original format by the inclusion of optional lines such as "S: 0".
Unfortunately, this straightforward approach poses new complications for the backend. Presently, LXCat uses an SQL database to store its data with every row corresponding to a data item. If excited states were to be added in the way described above, both the name and the LS parameters would need to be encoded as new columns in the current schema. Conceptually, the layout would be similar to that found in Table 1. This table illustrates one explicit and one implicit problem with the aforementioned approach. The explicit problem is the proliferation of NULL values in the table. These are necessary for species where a particular coupling scheme is not applicable. In this case, neither e nor Ar* are described by the LS coupling scheme. The larger, implicit problem is that the addition of each new optional parameter affects the entire table, even those entries for which the parameter is not relevant. These issues suggest the need to reconsider the LXCat backend to better serve the community. Although some techniques exist to ameliorate these issues (e.g., table indirection), they do not solve the fundamental problem which is that table-based databases are not well suited for heterogeneous data sets.
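Both problems can be reproduced in a few lines with an in-memory SQLite database. The sketch below is only an illustration of the argument: the table name, the column names and the extra column J are hypothetical and do not reflect the actual LXCat schema or the exact layout of Table 1.

```python
import sqlite3

con = sqlite3.connect(":memory:")

# Hypothetical single-table layout: one row per species, with the
# LS parameters encoded as optional columns.
con.execute("CREATE TABLE state (id INTEGER PRIMARY KEY, name TEXT, S REAL, L INTEGER)")
con.executemany(
    "INSERT INTO state (name, S, L) VALUES (?, ?, ?)",
    [
        ("e", None, None),    # LS coupling does not apply
        ("Ar*", None, None),  # compound state, no LS description
        ("Ar(1S0)", 0, 0),    # the only row that uses the new columns
    ],
)

# Explicit problem: NULL cells for every species without LS parameters.
nulls_before = con.execute(
    "SELECT SUM((S IS NULL) + (L IS NULL)) FROM state"
).fetchone()[0]

# Implicit problem: one more optional parameter changes the whole table,
# including the rows for which it is irrelevant.
con.execute("ALTER TABLE state ADD COLUMN J REAL")
con.execute("UPDATE state SET J = 0 WHERE name = 'Ar(1S0)'")
nulls_after = con.execute(
    "SELECT SUM((S IS NULL) + (L IS NULL) + (J IS NULL)) FROM state"
).fetchone()[0]

print(nulls_before, nulls_after)
```

Each optional column added in this way increases the number of NULL cells by one for every row that does not use it.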
In recent months we have created an experimental implementation of LXCat that takes advantage of modern concepts in computer science, such as the JSON file format and graph or hybrid databases. Although considerably simplified, it provides an idea of the possibilities that are offered by reconsidering LXCat's data structures. Let us first look at Figure 13, which shows the JSON equivalent of the content of the SQL table in Table 1.
It is clear that this document does not suffer from the excessive number of unused table entries. Only meaningful data are stored, and those can be retrieved with commonly available JSON parsers which incorporate rigorous syntax checking. Moreover, the semantic correctness of the file can be checked with a JSON Schema document which describes exactly what fields and values are allowed. Again, the tools to implement and verify the JSON Schema are widely available for all commonly used platforms. Although the JSON file format appears to be a promising alternative to the current format, there is still the question of how to store and retrieve these documents. We have experimented with both graph and hybrid databases, in which JSON objects are treated as first-class citizens. Where traditional, relational databases store rows of data in tables, these databases store collections of JSON objects, quite like Figure 13. Furthermore, a graph database, as the name suggests, allows objects to be addressed in terms of their relations to each other. This type of data structure is expected to be very beneficial with the type of data on LXCat where hierarchical relations exist between states, and there are networks of reactions with arbitrary types. Switching to a flexible, yet structured framework, with a graph or a multi-model database at its core promises to enhance LXCat's current capabilities while simultaneously opening the door for a future chemistry database.
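The following sketch illustrates the idea of storing only meaningful fields and checking them against a list of allowed fields. The field names ("name", "coupling", "scheme") are hypothetical and are not those of Figure 13, and the hand-written check is only a stand-in for a real JSON Schema validator, which would be the natural tool in practice.

```python
import json

# Three states; only the one described by LS coupling carries LS fields.
STATES_JSON = """
[
  {"name": "e"},
  {"name": "Ar*"},
  {"name": "Ar", "coupling": {"scheme": "LS", "S": 0, "L": 0}}
]
"""

# Allowed fields and their expected Python types (assumed for illustration).
TOP_LEVEL_FIELDS = {"name": str, "coupling": dict}
COUPLING_FIELDS = {"scheme": str, "S": (int, float), "L": int}


def check_fields(obj, allowed, where):
    """Report unknown fields and fields with an unexpected type."""
    errors = []
    for key, value in obj.items():
        if key not in allowed:
            errors.append(f"{where}: unknown field {key!r}")
        elif not isinstance(value, allowed[key]):
            errors.append(f"{where}: wrong type for {key!r}")
    return errors


def check_state(state):
    """Return a list of problems found in one state object."""
    errors = check_fields(state, TOP_LEVEL_FIELDS, "state")
    if "name" not in state:
        errors.append("state: missing required field 'name'")
    coupling = state.get("coupling")
    if isinstance(coupling, dict):
        errors += check_fields(coupling, COUPLING_FIELDS, "coupling")
    return errors


states = json.loads(STATES_JSON)  # syntax errors are caught by the parser
problems = {state.get("name", "?"): check_state(state) for state in states}
```

Syntax checking is delegated to the JSON parser, while the semantic check is expressed as data (the dictionaries of allowed fields) rather than as extra table columns.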

Framework Implementation
Conceptually, the present implementation of LXCat consists of two halves: the SQL database and the website. The latter provides the interface that allows users to request data. This request is then translated into a database query, the results of which are served to the user in the form of a graph that is displayed on a screen or a file that can be downloaded. At present, the two halves are tightly coupled and there is no abstraction defining how the database can be accessed. This situation is depicted in Figure 14a. This approach has various disadvantages. First, from the project management point of view, the direct access of the database violates the principle of separation of concerns. This violation poses practical challenges. As an example, a graphic designer updating the appearance of LXCat must be careful not to accidentally alter database queries that appear in the same files. Moreover, this approach hampers code reuse both internally and externally. With a suitable abstraction, one can envision plasma modeling tools such as BOLSIG+ [27], LoKI-B [187], HPEM [224] and PLASIMO [225] providing integrated access to data sets on LXCat. Such an implementation would of course have to abide by the LXCat policy on data redistribution, with written consent required on a case-by-case basis between the contributors and the third parties (see Appendix B).
This situation can be improved by inserting an Application Programming Interface (API) layer between the database and the website or other front-end. This layer comprises a set of endpoints that provide well-defined, stable interfaces to the database backend. The endpoints take care of the communication with the database, making the website agnostic to the type of backend. This means that the database can be SQL, graph, or hybrid with no changes visible to users or applications. The endpoints can also validate, authenticate, or otherwise provide pre/post-processing for requests. An example of this would be the translation of a JSON-style output file into the legacy LXCat format that may be expected by an application. A prototype API, as seen in Figure 14b, has already been implemented in our experimental server, with highly favorable results.
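The role of such a layer can be sketched in a few lines. The example below is not the prototype API of the experimental server: the class and function names, the record fields and the simplified text layout produced by to_legacy_text are all assumptions made for illustration. It only shows how one endpoint can serve the same request from two different backends and post-process the result.

```python
import json
import sqlite3
from typing import Protocol


class Backend(Protocol):
    """Anything that can return one process as a plain dictionary."""

    def get_process(self, process_id: str) -> dict: ...


class DictBackend:
    """Stand-in for a document (graph or hybrid) store."""

    def __init__(self, documents: dict):
        self._documents = documents

    def get_process(self, process_id: str) -> dict:
        return self._documents[process_id]


class SqlBackend:
    """Stand-in for a relational store holding one JSON string per row."""

    def __init__(self, connection: sqlite3.Connection):
        self._connection = connection

    def get_process(self, process_id: str) -> dict:
        row = self._connection.execute(
            "SELECT body FROM process WHERE id = ?", (process_id,)
        ).fetchone()
        if row is None:
            raise KeyError(process_id)
        return json.loads(row[0])


def to_legacy_text(process: dict) -> str:
    """Post-processing step: render a record in a simplified, assumed
    text layout (process type, states, threshold, two-column data)."""
    lines = [
        process["type"].upper(),
        f'{process["target"]} -> {process["product"]}',
        f'{process["threshold_eV"]:.6e}',
    ]
    lines += [f"{energy:.6e}\t{sigma:.6e}" for energy, sigma in process["data"]]
    return "\n".join(lines)


def get_process_endpoint(backend: Backend, process_id: str, fmt: str = "json") -> str:
    """Endpoint: the caller never sees which kind of backend is used."""
    process = backend.get_process(process_id)
    if fmt == "json":
        return json.dumps(process, sort_keys=True)
    if fmt == "legacy":
        return to_legacy_text(process)
    raise ValueError(f"unsupported format {fmt!r}")


# Hypothetical record with made-up numbers.
record = {"type": "excitation", "target": "Ar", "product": "Ar*",
          "threshold_eV": 11.5, "data": [[11.5, 0.0], [20.0, 1.0e-21]]}

dict_backend = DictBackend({"ar-exc": record})
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE process (id TEXT PRIMARY KEY, body TEXT)")
connection.execute("INSERT INTO process VALUES (?, ?)", ("ar-exc", json.dumps(record)))
sql_backend = SqlBackend(connection)
```

Because both backends satisfy the same small interface, the front-end code that calls get_process_endpoint does not change when the storage technology does.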

Afternote
We hope that the results of our experimentation have made it clear that the LXCat project is alive and kicking, and that we have an open mind when it comes to technological innovation and an expansion of LXCat's data offerings. At the same time, we want the users of LXCat to know that backward compatibility has been a design criterion right from the start. Case in point, the above example of a JSON-LXCat format converter has already been developed. Although we intend to provide contributors and users finer grained control over the data they access, lumped species, such as He*, will always be available. Likewise, though LXCat now provides data sets that were not calibrated to swarm data, the "complete" data sets will continue to be at the heart of LXCat's offerings. By providing access to a wide range of data along with the appropriate context, it is the goal of LXCat to continue to advance the reliability and reproducibility of plasma models.
Several times in recent years, we have been in contact with colleagues who are interested in such data-scientific and implementation aspects of the development of LXCat. We hope that this exposition will incite enthusiasm in those colleagues and prompt them to join the LXCat project as we work to make it future-proof.

Recent Developments and Perspectives
To date, the execution of the LXCat project has depended upon volunteers from the LTP community with an informal organizational structure. This grew out of the original efforts by a core group of people located at the University of Toulouse. In recent years, the project has become much more international (and delocalized as well with the establishment of mirror sites), with people from France, the Netherlands, the US, China and Germany active in the maintenance of the website, answering queries, performing technical and scientific development of tools and of the website, as well as outreach activities (see Section 3). Since September 2019, a mirror website has been installed at Drake University in the US and the server in France has been decommissioned. The installation of additional nodes is currently being considered. The server in the Netherlands (at the Eindhoven University of Technology) now acts as the master node.
A driving force of the LXCat project is the constant stream of input the LXCat team receives from the LTP community (such as data and technical help, but also many valuable suggestions or requests). Meetings such as the evening LXCat workshop at the Gaseous Electronics Conference (GEC) are very valuable in that respect. With that in mind, we provide a short survey of some of the main discussion points that arose at recent international meetings.
A recurring query is whether LXCat should recommend data or not. In this paper, we have tried to address the reasons why LXCat does not (and cannot) recommend data (see Sections 1 and 2 and more particularly Section 5 for more detail). The LXCat project however aims to support activities toward exchange and comparison of data sets (see Section 3).
The LXCat project largely provides cross-sections for reactions which are required for modeling electron and ion-swarm parameters in non-equilibrium low-temperature plasmas. However, there are additional reactions which are needed for a full description of the electron and ion kinetics such as electron-ion recombination, dissociative recombination or attachment, three-body and charge exchange processes. Due to the internal complexities of the database organization, such an extension still requires a significant effort. Work is ongoing to support these additional reaction types, and new data formats that can help address such issues were outlined in Section 6.1.
In Section 6, we discussed possibilities for a new framework implementation of LXCat that would allow more functionalities on the website and for interactions with the databases. One benefit would be, for instance, the proper use and storage of the coupling schemes used for defining excited levels in atoms and molecules. Different communities of users of cross-sections have adopted different notations (e.g., L-S coupling, Racah notation, Paschen notation, ...) for historical reasons. These notations can be translated into one another provided that the notation used in the database is flagged and that sufficient information on the states is stored, such that an unambiguous conversion between notations can be made. The proposed implementation would also allow the species on the left-hand and right-hand sides of the collisional process to be considered on an equal footing. This is important for considering state-to-state processes, particularly in the case when forward and backward (i.e., superelastic processes for electron collisions) reactions are equally important.
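As a small illustration of such a flagged translation, the sketch below stores the four levels of the argon 3p⁵4s configuration with their Paschen, Racah and conventional LS-style labels and converts between them. The ASCII encoding of the labels, the field names and the translate function are hypothetical; a real implementation would store the underlying quantum numbers rather than label strings.

```python
# The four levels of the argon 3p^5 4s configuration. The label strings
# are a plain ASCII encoding chosen for this example only.
ARGON_4S_LEVELS = [
    {"paschen": "1s5", "racah": "4s[3/2]2", "ls": "3P2", "J": 2},
    {"paschen": "1s4", "racah": "4s[3/2]1", "ls": "3P1", "J": 1},
    {"paschen": "1s3", "racah": "4s'[1/2]0", "ls": "3P0", "J": 0},
    {"paschen": "1s2", "racah": "4s'[1/2]1", "ls": "1P1", "J": 1},
]

NOTATIONS = ("paschen", "racah", "ls")


def translate(label, source, target, levels=ARGON_4S_LEVELS):
    """Translate a state label between notations. The source notation
    must be stated explicitly: this is the 'flag' mentioned in the text."""
    if source not in NOTATIONS or target not in NOTATIONS:
        raise ValueError(f"notation must be one of {NOTATIONS}")
    matches = [level for level in levels if level[source] == label]
    if len(matches) != 1:
        raise LookupError(f"{label!r} is not a unique {source} label")
    return matches[0][target]


metastable_racah = translate("1s5", "paschen", "racah")
resonant_paschen = translate("1P1", "ls", "paschen")
```

The translation is unambiguous only because each record carries all notations for the same physical level; a label on its own, without the notation flag, would not be enough.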
Along the lines of needing to support cross-sections for new types of reactions, there is also a growing need to provide means of comparing and storing reaction mechanisms and/or rate coefficients for the charged and neutral components of LTPs. This is particularly prevalent in higher pressure LTP applications such as plasma medicine, plasma-assisted combustion, and water treatment. Discussions are ongoing in the LTP community regarding the requirements for such an effort [36]. It is worth noting that, for plasma chemistries and the development of so-called "reaction mechanisms", chemistry data sets also cannot so far be considered universal and are valid only within a certain framework. This is similar to most presently existing "complete data sets" for electron-neutral scattering data, which are developed upon some simplifying assumptions. Particularly in high density conditions, one needs to pay particular attention to the definition of excited states and to the underlying assumptions that were used while determining experimental rate coefficients between those states. Such issues already arise even for simple cases such as a pure helium plasma when 3-body collisions become important [223].

Conclusions
LTP modeling is a significant challenge that requires a great deal of atomic and molecular data. This applies regardless of whether it is a global, kinetic, or fluid model. LXCat was born out of this need for data, initially supplying electron-scattering databases that were carefully calibrated to swarm experiments but applicable more broadly. As a digital platform, it obviated the need for "traditional" data extraction, allowed wider access to essential data, and enabled reproducible plasma simulations.
Since its inception, LXCat has grown substantially. This began with the inclusion of electron-scattering data derived from quantum mechanical models, soon followed by the inclusion of ion-neutral interaction potentials, differential scattering cross-sections, oscillator strengths, and swarm data. It has also become a more general plasma modeling resource with the incorporation of a web interface for BOLSIG+, links to numerous compatible software packages, and its ongoing publication of comparative analyses of complete data sets for different gases. Not only has the content of LXCat grown, but its participants (contributors, users, and maintainers alike) now touch nearly every part of the globe, forming a truly international collaboration.
Looking forward, there are many plans in motion to improve LXCat for its users and contributors. Database enhancements are underway to make new data types available for use. In parallel, there is an ongoing internal research effort on improved data formats and database technologies. Improvements in this area will provide users with powerful new sorting and comparison mechanisms, increase the ways in which contributors can identify and categorize their data, and improve the resilience of the project as a whole. Finally, there are nascent efforts to address the growing need of the community for non-equilibrium plasma chemistry data in the form of reaction rate databases [36]. In summary, there are several efforts in progress to extend and continue LXCat's broad impact.
However, this work cannot continue without the ongoing support of LXCat's users and contributors. Of the users, we ask that you always cite the data that you use, we seek your feedback on what could be improved, and we hope that you consider joining the project in a more active role. Of our contributors, we gratefully acknowledge your ongoing support and likewise ask that you continue to tell us how to make the data sharing process easier for you; we also welcome your suggestions on future developments. Ultimately, LXCat is a grassroots organization whose health depends on the same community that it supports. Together, we can build a more robust understanding of LTP science and realize new applications.

Acknowledgments: The LXCat team would like to acknowledge the Gaseous Electronics Conference for having hosted annual workshops on the "Plasma Data Exchange Project" for three consecutive years (2011-2013) and, since 2014, the LXCat evening workshop series. This paper describes objective technical results and analysis. Any subjective views or opinions that might be expressed in the paper do not necessarily represent the views of the U.S. Department of Energy or the United States government.

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix A. How to Cite Data Retrieved from LXCat and Online Calculations
"Open access" should not mean that downloaded data become anonymous. Proper referencing of data downloaded from the website is essential for the survival of LXCat and for giving proper credit to the researchers who have contributed to build the data sets in the first place and to make them available.
Publications making use of data downloaded from LXCat should reference the site as well as the publications, if any, listed in each contributor's database. Original references should be used where possible and reference must be made to the specific database(s) from which data were retrieved, the LXCat site address, and the retrieval date.
Example: SIGLO database, www.lxcat.net, retrieved on 26 June 2020. This general reference makes it possible, using the Time Machine, to uniquely identify the data as they were retrieved from LXCat. The retrieval date, which is present in the header of the downloaded files, should be given in any publication. After selecting the type of data (e.g., differential scattering cross-section for electrons), the user can select one (or several) databases. A short description of the contents of the database is given and contributors can give additional information for referencing under "How to reference". The references (e.g., publications, website) given there should also be cited when using the database. Additionally, after the selection of the type of processes and the species, each species appears as a data group and additional references are often given by the contributors. They usually correspond to papers which discuss the compilation of the entire data group. Those references should also be cited when using the data set. At the level of individual cross-sections, citations to papers can be found for some databases as well. The users of LXCat are encouraged to read those papers and cite them.
Swarm coefficients and electron energy distribution functions calculated on the LXCat website should make reference to both the cross-section data set in the above format and to BOLSIG+, the online Boltzmann solver, by citing Hagelaar and Pitchford [27]. For general references on the LXCat project, the publications of Pancheshnyi et al. [37], Pitchford et al. [38] as well as the present paper are recommended for any citation(s).
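For users who generate their reference lists or file headers programmatically, the general reference given in the example above can be assembled as in the following sketch. The function name is hypothetical, and the output format simply mirrors the SIGLO example in this appendix; the additional references listed by each contributor still have to be cited separately.

```python
from datetime import date

# Month names are spelled out here so that the result does not depend
# on the locale settings of the machine running the script.
MONTHS = ("January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December")


def lxcat_reference(database_name, retrieved):
    """Build the general reference: database, site address, retrieval date."""
    day = f"{retrieved.day} {MONTHS[retrieved.month - 1]} {retrieved.year}"
    return f"{database_name} database, www.lxcat.net, retrieved on {day}."


reference = lxcat_reference("SIGLO", date(2020, 6, 26))
print(reference)
```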

Appendix B. Referencing and Redistribution of Data
Several guidelines were defined at the beginning of the LXCat project. These are: (1) anyone willing to contribute data to the site can request a password and set up a database; the contents and maintenance of the databases remain the responsibility of the individual contributors. (2) The site is open access and data can be downloaded without registering or paying a fee. (3) The databases are dynamic and contributors are free to make changes as they see fit. Consequences of these guidelines are that data for the same processes can be listed in multiple databases and that different versions of the same data can be accessed on the website. Hence users are strongly encouraged to reference data downloaded from LXCat in all publications making use of the data.
The database contributors retain ownership and are responsible for the contents and maintenance of the individual databases. LXCat does not authorize third parties (commercial interests, in particular) to redistribute data from the LXCat site. The following guidelines are prescribed for use of data on LXCat by third parties:
1. Third parties should direct their users to the LXCat website to obtain data. Before downloading data, visitors to the site are asked to click to confirm that "Yes. I understand the LXCat policy and I agree to properly reference these data". The header in downloaded files contains reference information that should be included in all publications making use of LXCat data, and further guidelines for proper referencing are given in Appendix A.
2. The LXCat team can provide customized interfaces to institutional members of the non-profit association "Data for Modeling Plasmas" (http://assoc.lxcat.net). This option is useful for code developers who recommend specific LXCat data sets for use in their software; that is, users who access the LXCat website through such a customized interface are directed to the specific data chosen by the code developers.
3. In cases where neither of the above options is practical, third parties can include data from LXCat in their packages under the following conditions:
• Explicit written permission must be obtained from the owner of the database (the contributor) whose data is being included in the third-party package.
• LXCat must be notified in advance and a Memorandum Of Understanding (MOU) must be prepared between the LXCat team and the contributors who agree to have their data set redistributed within software or any other shareable format; the MOU must be signed by all parties.
• A statement must be clearly visible with the redistributed data saying that "The data were downloaded from [database name], www.lxcat.net, [retrieval date]. LXCat is an open-access website with databases contributed by members of the scientific community."
• All output files generated by third-party software when using data from LXCat must include in the header the reference "[database name], www.lxcat.net, [retrieval date]" and all other references supplied by the contributors as included in the headers of files downloaded from LXCat.