BASECOL2020 New Technical Design

Abstract: The BASECOL database has been created and scientifically enriched since 2004. It contains collisional excitation rate coefficients of molecules for application to the interstellar medium and to cometary atmospheres. Recently, major technical updates have been performed in order to comply with international standards for data management and to provide a more user-friendly environment to query and present the data. This paper presents the key features of the technical updates and underlines the compatibility of the BASECOL database with the Virtual Atomic and Molecular Data Center (VAMDC). The latter aims to interconnect atomic and molecular databases, thus providing a single location where users can access atomic and molecular data.


Introduction to BASECOL
This publication describes the latest technical developments performed on the BASECOL database. The intention is to convey the technical quality of the BASECOL service in relation to the Virtual Atomic and Molecular Data Center (VAMDC) [1][2][3], which is supported by the VAMDC consortium 1. The current VAMDC e-infrastructure interconnects about 38 atomic and molecular databases that cover atomic and molecular spectroscopy and processes. About 90% of the interconnected databases handle data that are used for the interpretation of astronomical spectra and for modeling media in many fields of astrophysics. VAMDC offers a common entry point to all incorporated databases through the VAMDC portal 2, and it develops central services such as the species database 3 and the Query Store service 4. VAMDC also develops standalone tools to retrieve and handle the data, such as the SPECTCOL tool [4] 5. VAMDC provides software and support in order to include new databases within the VAMDC e-infrastructure 6. One current feature of the VAMDC e-infrastructure is the constrained environment for the description of data, in particular the XSAMS schema and other standardized protocols 7 that ensure a higher quality for the distribution of data.

The New Architecture of the BASECOL Service
In past years, major technical updates of BASECOL were prompted for the following reasons: (1) some of the technologies used in BASECOL2012 [5] were deprecated and difficult to maintain; (2) we needed to adapt our internal processes and procedures to the new international data-quality standards. Indeed, the data-feeding procedure of BASECOL2012 was manual with no strict trace of the actions performed by the scientific maintainer. In addition, it lacked an automatic ingestion of the metadata necessary to ensure the interoperability with the VAMDC ecosystem.

New Features
The new version of the BASECOL ecosystem is divided into three software components:

• The core component is a relational database built upon the MariaDB RDBMS. A critical point associated with this component is how to preserve the scientific quality of the data, and how to guarantee data-traceability and reproducibility when numerical data, metadata, documentation, and bibliographic references are ingested in the database (see Section 3).

• The second software component is the website, a web user interface (WebUI) accessed through a web browser: it provides public, human-readable access to the data (see Section 4).

• The last component is a REST-like API that provides a software interface to the database. It is a set of web services that perform the data-extraction from the central database. This API may be used not only by the website, but also by any other client. This API relies on the Spring Java framework. The endpoint of each web service composing the API, reachable through the HTTP protocol, returns JSON 9 formatted data.
As shown in Figure 1, the BASECOL system is composed of a private "ingestion instance" and of a public "production instance". The latter is interfaced with VAMDC. As shown in Figure 2, a BASECOL "instance" is composed of its relational database, its API, and its web user interface. The "ingestion instance" allows the database maintainer to verify and to validate the imported data. The "ingestion database" is dumped into the "production database" at regular intervals, and a specific procedure is put into place in order to ensure that only the validated data are visible in the "production instance" website.  Contrary to the previous version of the BASECOL environment, the new software organization follows the multitier architecture: the user interface and the database access are now completely decoupled, making future evolution of the system easier. The structure of the database can be modified without any impact on the user interface layer, as long as the REST API remains the same.

Removed Features
Now, the bibliographic section of the database only includes the publications that are referenced from the collisional data sets (cf. Section 4.2) and from the energy tables. This reduces the bibliographic entries to about 300 entries and this is justified by the overall coherence of the ingestion and versioning procedures.
The VAMDC portal and the SPECTCOL tool [4] implement the IVOA-SAMP protocol [8], created to connect client-side tools to improve productivity when working with multiple data types. VAMDC data can be directly piped into any tool implementing the SAMP protocol, e.g., TOPCAT 10 . Since the TOPCAT tool may be used to compare (or cross-match) different rate coefficients, we removed the cross-matching feature from the BASECOL web-interface.
We also removed the visualization of output files in a format compatible with the RADEX code 11 (the format of those files is explained in Van der Tak et al. [6]) as users can build their own RADEX files using the SPECTCOL tool. In addition, at least two websites already provide this kind of information: the LAMDA database [6] and the CASSIS database [9] 12 . The BASECOL database is rather a reference repository for collisional inelastic rate coefficients and the advice would be to take collisional data from the BASECOL database in order to build the RADEX files on those known websites [6,9] or in any other internal or public secondary display of collisional rate coefficients. This would ensure a proper versioning of the collisional data sets as a specific effort on versioning is made in BASECOL (see Section 3 and Appendix B), and a similar versioning effort is available in the molecular spectroscopic databases connected through VAMDC: CDMS, JPL, and HITRAN [10][11][12][13].

Technical Evolution for Improved Data Integrity and for VAMDC Content Requirements
We designed a new data import system that improves data integrity and authenticity (R7 of the CoreTrustSeal recommendations 13) and introduces rigorous procedures for managing long-term archival storage of data. The data import system is composed of an import ASCII file, a Java application that parses the import file and loads it into the database, and a git repository. We first describe the import ASCII file and then the import procedure.

Description of the Import ASCII File
The import ASCII file is composed of several "Data" sections, and the concatenation of these sections constitutes what is called a "collisional data set". Each section of the import file corresponds to an "object" that follows the versioning rules (cf. Appendix B). In this paper, we give the same name both to the section of the import ASCII file and to the "object", which is the concept to which the versioning rules are applied. An "object" may include metadata and numerical data, or just metadata. Some of the metadata ensure the compatibility of BASECOL with the VAMDC architecture, metadata are either mandatory or optional, and some metadata are related to the uniqueness of an "object". The concept of uniqueness is attached to the numerical data as well, and is used in the versioning process handled by the import script (cf. Section 3.2 and Appendix B). The different data sections are the "Element" section, the "Energy Table" section, the "Energy Origin" section, the "Rate Coefficients" section, the optional "Fitting Coefficients" section, and the "Publications" section. The "Element", "Energy Table", and "Energy Origin" sections must be provided for the molecule called the "target", i.e., the molecule whose collisional excitation is of interest for astrophysical users, and for the "collider", i.e., the perturbing atom, molecule, or electron. These sections are described below.

Element Section
The "Element" section is compulsory and corresponds to the object "Element" of Appendix B, which is composed of a list of metadata given in Table 1. An example of an "Element" section input for N₂H⁺ is given in Table A1. The metadata of the "Element" object allowing compatibility with VAMDC concern the identification of the species. The VAMDC standards provide standardization of the species through the metadata "stoichiometric formula" (the atoms are listed in alphabetical order and the number of occurrences of each atom is given after the atomic symbol), and through the "inchikey". When a species is already present in BASECOL, it is possible to retrieve the species metadata from BASECOL through a web user interface (cf. Appendix C); otherwise, information can be retrieved from the species database 14 when the metadata are linked to VAMDC, with the exception of "mass". If neither BASECOL nor the species database contains the species information, the BASECOL data provider can contact the VAMDC team to get advice (support@vamdc.eu).

Table 1. List of properties/metadata for the object "Element". The symbols [M] and [O] indicate that the metadata are either mandatory or optional in the import file. In the column "Uniq.", we indicate whether the metadata defines the uniqueness of the object. In the column "Purpose", we indicate whether the information is meant for the BASECOL (B) public WebUI display and/or for VAMDC (V) interoperability. In "V(parameter)", "parameter" indicates the name of the column in the species database. The mass unit is the atomic mass unit (a.m.u.) and the default value for mass is zero.

Energy Table Section
The "Energy Table" section is compulsory and corresponds to the object "Energy Table" of Appendix B, which is composed of a list of metadata given in Table 2 and of a numerical table. The imported numerical energy table identifies the labels of the collisional transitions in the rate coefficients table.
The administrator can import the energy table used in the rate coefficients calculations, or an energy table created with data from spectroscopic databases such as CDMS, JPL, and HITRAN [10][11][12][13]. A template of an "Energy Table" section input is given in Table A2. The numerical energy table is composed of labels, values of the quantum numbers, and numerical values for the energy. The energy tables are currently being harmonized so that the default energy unit is cm⁻¹. The description of the quantum numbers follows the VAMDC standards: the atomic quantum numbers are standardized in the XSAMS document 15 and the molecular quantum numbers are described in the case-by-case document 16, where different situations are identified (for example, case = "dcs" means diatomic closed-shell molecules, case = "hunda" means Hund's case (a) diatomics, etc.). For each "case", a set of quantum numbers is defined, and the import file must follow those standards. If a given pattern of quantum numbers characterization exists in BASECOL (similar type of molecule and coupling case), the choice of quantum numbers can be retrieved from the import file WebUI (see Appendix C).

Table 2. List of properties/metadata for the "Energy Table" object. "case" refers to the case-by-case document (http://www.vamdc.eu/documents/cbc-1.0/). The symbols [M] and [O] indicate that the metadata are either mandatory or optional in the import file. In the column "Uniq.", we indicate whether the metadata defines the uniqueness of the object. In the column "Purpose", we indicate whether the information is meant for the BASECOL (B) public WebUI display and/or for VAMDC (V) interoperability.

Energy Origin Section
The "Energy Origin" section is compulsory and corresponds to the object "Energy Origin" of Appendix B, which is composed of a list of metadata given in Table 3 and of a numerical table with a single line. This single line gives the values of the quantum numbers associated with the energy origin of the corresponding "Energy Table" section (cf. Section 3.1.2). A template of an "Energy Origin" section input is given in Table A3.

Table 3. List of properties/metadata for the "Energy Origin" object. "case" refers to the case-by-case document (http://www.vamdc.eu/documents/cbc-1.0/). The symbols [M] and [O] indicate that the metadata are either mandatory or optional in the import file. In the column "Uniq.", we indicate whether the metadata defines the uniqueness of the object. In the column "Purpose", we indicate whether the information is meant for the BASECOL (B) public WebUI display and/or for VAMDC (V) interoperability.

Rate Coefficients Section
The "Rate Coefficients" section comes next; it is compulsory and corresponds to the object "Rate Coefficients" of Appendix B, which is composed of a list of metadata given in Table 4 and of a numerical table. A template of a "Rate Coefficients" section input is given in Table A4.
BASECOL allows the collider to undergo transitions, and therefore the initial and final levels of the collider must be provided. The BASECOL collisional rate coefficients (in units of cm³ s⁻¹) can be state-to-state rate coefficients, effective rate coefficients, and thermalized coefficients; those items are scientifically defined in Dubernet et al. [5] and must be included in the metadata "rateType" of Table A4.

Table 4. List of properties/metadata for the object "Rate Coefficients". The symbols [M] and [O] indicate that the metadata are either mandatory or optional in the import file. In the column "Uniq.", we indicate whether the metadata defines the uniqueness of the object. In the column "Purpose", we indicate whether the information is meant for the BASECOL (B) public WebUI display and/or for VAMDC (V) interoperability. The unit of "temperatures" is Kelvin.

Fitting Coefficients Section
The import script optionally processes the "Fitting Coefficients" section. This section corresponds to the object "Fitting Coefficients" of Appendix B, which is composed of a list of metadata given in Table 5 and of a numerical table. A template of a "Fitting Coefficients" section input is given in Table A5. The transitions of the numerical fitting coefficients table must be organized identically to the transitions of the related rate coefficients table.

Table 5. List of properties/metadata for the object "Fitting Coefficients". The symbols [M] and [O] indicate that the metadata are either mandatory or optional in the import file. In the column "Uniq.", we indicate whether the metadata define the uniqueness of the object. In the column "Purpose", we indicate whether the information is meant for the BASECOL (B) public WebUI display and/or for VAMDC (V) interoperability.

Currently, 86 collisional rate coefficients sets have fitting tables, and those fits have been either provided by the authors or calculated by the BASECOL maintainers. The list of fitted collisional sets is indicated in Table A1 of our previous publication [5]. No additional sets have been fitted since 2013 because of lack of manpower 17. The fitting functions are the so-called "common fit equation":

with T the temperature in Kelvin, R the rate coefficients in cm³/s, a_k, a_N, and the fitted parameters, and specific fitting functions used in various publications for their data: "Faure et al. 2004" [14], "Faure et al. 2001" [15], "Sarpal et al. 1993" [16], "Lim et al. 1999" [17] as indicated in Table A5.

Publications Section
Finally, the import file must contain the "Publications" section. This section may include several "Publication" objects; each object is composed of a list of metadata given in Table 6 and follows the versioning defined in Appendix B. The "Publications" section input file must contain at least one "Publication" object, which is the reference to the paper where the collisional data set has been published. This main reference is indicated in the publication input file with $$mainArticle = yes; this reference appears in red in the "References" section of the public WebUI (cf. Section 4.2). If the collisional data set is obtained through several papers, additional references, identified with $$mainArticle = no, can be included. The BASECOL team strongly recommends including the references to the potential energy surface used in the collisional calculations. Additional references may be considered, such as references linked to the collider and to the target energy levels, as well as to the method used in the calculations. We do not provide an input file for this entry as some minor format modifications are currently being made. In the case of old papers with no DOI, a random DOI is created with the string "tmp_doi" followed by a random unique identifier. This random internal DOI is not provided to VAMDC, as the DOI is not mandatory in VAMDC. In the import file, the DOI is mandatory as we want to enforce the good practice of using unique identifiers. The reference entries of the BASECOL database are currently being updated with their DOI. The keywords metadata are composed of a key metadata followed by its value. The key metadata are "Perturbing Element", "Target Element", "Possible systems", "Transitions", "Type of data", "Possible Method", and "Miscellaneous". To each reference, a set of "key metadata:value" pairs is attached that characterizes the content of the corresponding paper. These keywords allow the search by keywords in the "Articles" part of the BASECOL website (cf. Section 4).

17 We are considering re-offering this service through subscription.

Table 6. List of properties/metadata for the object "Publications". The symbols [M] and [O] indicate that the metadata are either mandatory or optional in the import file. In the column "Uniq.", we indicate whether the metadata define the uniqueness of the object. In the column "Purpose", we indicate whether the information is meant for the BASECOL (B) public WebUI display and/or for VAMDC (V) interoperability.

It should be stressed that the publication part of the BASECOL database is extremely important because it ensures provenance of the data, and it ensures that BASECOL follows FAIR principles [7] (cf. Section 6), which is an intrinsic feature in VAMDC.

In addition, the BASECOL policies ask the users to cite the authors of the collisional data sets. This section could be improved in the future by deploying software similar to that developed for the HITRAN and the AMBDAS databases [18], which allows the information to be retrieved using a DOI.
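The handling of papers without a DOI described in this section can be illustrated with a minimal Python sketch. Only the "tmp_doi" prefix comes from the text; the function names, the underscore separator, and the use of a UUID as the "random unique identifier" are assumptions made for illustration and do not describe the actual BASECOL import code.

```python
import uuid

# Prefix named in the text; everything else in this sketch is hypothetical.
TMP_DOI_PREFIX = "tmp_doi"


def make_placeholder_doi() -> str:
    """Create an internal identifier for a paper that has no DOI.

    Assumption: a UUID4 plays the role of the random unique identifier.
    """
    return f"{TMP_DOI_PREFIX}_{uuid.uuid4().hex}"


def is_placeholder_doi(doi: str) -> bool:
    """True when the identifier was generated internally."""
    return doi.startswith(TMP_DOI_PREFIX)


def doi_for_vamdc(doi: str):
    """Internal placeholders are not forwarded to VAMDC; real DOIs are."""
    return None if is_placeholder_doi(doi) else doi
```

With such a scheme, the import file can keep the DOI field mandatory while the export layer filters out the internal placeholders.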

Import Procedure
The import script performs integrity and format checks on the import ASCII file. The importation process starts when the import file is valid. The import script is designed either to import a whole new "collisional data set" or to update an existing "collisional data set". In each "Data" section of the "collisional data set" (Section 3.1), the import script checks the values of the metadata related to the uniqueness of an "object", and, in addition, for the "Energy Table", "Energy Origin", "Rate Coefficients", and "Fitting Coefficients" sections, it checks the values of the numerical tables. We call each of these metadata or numerical values a "uniqueness item". If the value of any single uniqueness item does not correspond to what is already in the database, the "collisional data set" is considered a new collisional data set; if all the uniqueness items have the same values as in the database, the "collisional data set" is considered an update of an existing "collisional data set".
When a new "collisional data set" is imported in the database, a collisional entry is created. In order to ensure data-traceability and reproducibility, a version number is attributed to the freshly created set and a comment describing the details of this creation operation is stored. Alternatively, when an existing "collisional data set" is updated, the metadata not linked to the uniqueness concept and affected by the change are modified, a new version number of the "collisional data set" is created, and a comment describing the type of change is stored. Detailed information on the versioning is provided in Appendix B.
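The decision between creating and updating a "collisional data set" can be sketched as follows. This is a simplified Python illustration under stated assumptions: the class and function names are hypothetical, uniqueness items are represented as a plain dictionary, and the version number is a simple integer counter, whereas the actual rules are those of Appendix B.

```python
from dataclasses import dataclass, field


@dataclass
class DataSet:
    """Hypothetical in-memory stand-in for a stored collisional data set."""
    uniqueness_items: dict   # metadata and numerical values defining identity
    other_metadata: dict     # metadata not linked to the uniqueness concept
    version: int = 1
    history: list = field(default_factory=list)  # (version, comment) pairs


def import_data_set(database, uniqueness_items, other_metadata, comment):
    """Create a new data set, or update the one with identical uniqueness items."""
    for existing in database:
        if existing.uniqueness_items == uniqueness_items:
            # All uniqueness items match: treat the import as an update.
            existing.other_metadata.update(other_metadata)
            existing.version += 1
            existing.history.append((existing.version, comment))
            return existing
    # At least one uniqueness item differs from every stored set: new entry.
    created = DataSet(dict(uniqueness_items), dict(other_metadata))
    created.history.append((created.version, comment))
    database.append(created)
    return created
```

In this sketch, changing a single numerical value in the rate coefficients produces a new entry, while changing only a descriptive field increments the version of the existing one and records the comment.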
Once the import file has been processed, its scientific content, i.e., the "collisional data set", is copied into the database of the BASECOL "ingestion instance". This "collisional data set" is visible on the website of the BASECOL "ingestion instance", and the administrator can check the imported data. Once the information is checked, the administrator allows its visibility in the "production instance" of BASECOL. Indeed, by default, the status of any new "collisional data set" is "non visible" in the production database, thus protecting the "production instance" of the database from unchecked information when the "ingestion" database is dumped into the "production" database. The visibility on the "production instance" website of BASECOL implies that VAMDC can access the "collisional data set".
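The visibility rule can be summarized with a short Python sketch. It assumes, for illustration only, that each data set carries a boolean "visible" flag that is copied along with the dump; the dictionary layout and function names are hypothetical and are not taken from the BASECOL schema.

```python
def new_data_set(name):
    """A freshly imported set is non-visible by default."""
    return {"name": name, "visible": False}


def validate(data_set):
    """The administrator marks a checked set as visible."""
    data_set["visible"] = True


def dump_to_production(ingestion_db):
    """Copy every row, flags included, from ingestion to production."""
    return [dict(data_set) for data_set in ingestion_db]


def public_view(production_db):
    """Only validated sets are exposed on the production website."""
    return [ds["name"] for ds in production_db if ds["visible"]]
```

The dump itself copies everything; the protection comes from the default value of the flag, so a set that has not been checked never appears on the public side.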
The processed import ASCII file is additionally copied into a git versioning system. The git versioning system allows traceability in case of a later update and allows restoration of the database in case of corruption and/or loss of the database. This process guarantees the data integrity.

Public Access to BASECOL via the Web User Interface (WebUI) at basecol.vamdc.org
The public interface of BASECOL has been simplified, and we describe here its current features. The "Home" section is the landing page for the URL https://basecol.vamdc.org; it provides the citation policies and the units. There are two ways to query the database: a guided query with the "Browse Collisions" section (see Section 4.1.1), which corresponds to a simplified version of the BASECOL2012 collisional query browser, and the "Search Collisions" section (see Section 4.1.2), which provides queries by several free search criteria. The "Search articles" section (cf. Figure 3) provides a search on the bibliographic entries that are associated with numerical data. The "Tools" section gives access to information on the SPECTCOL tool [4] and on a scientific package for the water-H₂ collisions [19][20][21][22][23][24]. The search on energy tables has been removed following recommendations from the French PCMI community 18.

Browse Collisions
The query part of the "Browse Collisions" section is a four-step process: from a single page, the user successively chooses a target, its symmetry, then a collider and its symmetry. The interface provides auto-completion for the species name and can suggest available colliders according to the selected target. The available data sets corresponding to the restrictions submitted by the user are listed immediately below the query section. Figure 4 shows an example for the request of CO colliding with ortho-H₂.
In this example, we see that the data set of Yang et al. [25] has the flag recommended = "yes" as it is more complete and more recent than the previous sets [26,27] which have recommendation = "no".

Search Collisions
The query part of the "Search Collisions" section allows for freely querying the database by the year of publication and the authors of the paper attached to the data sets, and by the target and collider species. It is therefore possible to see the complete content of the database when no criteria are chosen, to see all collisional sets with electrons as the collider, and so on. This is an extremely useful tool for both the users and the managers of the database. We provide an example in Figure 5.

General Display
Once the user has selected a collisional data set, the user interface displays on the same single page the complete numerical and documentary information. Figure 6 shows the return page for the selection of the Yang et al. [25] data set. From this page, one can access the rate coefficients, the fits to the rate coefficients, the energy tables with state labels that are used to identify states in the rate coefficient tables, the bibliographic references attached to the collisional data set (the main reference where the data set is published is shown in red), and then a set of information that allows for rapidly characterizing the methodology used in the calculations. In the BASECOL2012 [5] user interface, those items were split between several tabs.

Figure 6. Return page containing the CO-ortho-H₂ data set when the reference of Yang et al. [25] is selected. In a future release we will simplify the interface so that the sections "Range of Energy", "Basis Set", and "Method" are combined into a single item called "Methodology".
The user can visualize the rate coefficients and energy tables on the WebUI and can download the data as text files. When fitting coefficients are available, a new graphical display allows for comparing the calculated rate coefficients with the fitting function.

History and Versioning
In the "History" part of the return page, the different versions of the collisional data sets will be available. At the time of publication, none of the sets have been updated since the creation of the versioning system. Therefore, we illustrate how it works in Figure 7. The versions correspond to major changes in the database (numerical data), and each version has a creation date and might have some minor modifications made at a later date. The currently displayed version is explicitly indicated to the user by the "(this page)" tag after the corresponding version number. Clicking on any of the other versions brings up a similar return page (such as Figure 6) corresponding to the selected version. The history part reflects the complex versioning system described in this paper and is an important achievement, as it allows full traceability and reproducibility of the data.

VAMDC Access to BASECOL
The BASECOL database is interoperable within the VAMDC ecosystem 19 , and the interoperability is ensured by the metadata associated with the imported data (cf. Section 3.1), and by the implementation of VAMDC software configured for the BASECOL database data model.

BASECOL Implementation of VAMDC Standards
By implementing the node software 20 , the BASECOL database becomes a VAMDC-federated resource. BASECOL is currently the sole VAMDC connected database that implements the Java version of the node software 21 . The node software is a wrapping between the internal BASECOL structure and the VAMDC standards: it implements a data access which is a variant of the TAP protocol from the IVOA 22 , deals with the query language 23 used in the VAMDC e-infrastructure, returns data in the XSAMS format 24 , and uses standardized field names described in a dictionary: the "Restrictables" 25 contain the list of field names that are used to filter the query to the database and the "Returnables" 26 refer to the type of data that the database returns to VAMDC. The current BASECOL "Restrictables" and "Returnables" are reported in Tables 7 and 8. Finally, the database must be registered in the VAMDC registry 27 . The VAMDC registry is a database of metadata describing the VAMDC nodes and web applications/services. These metadata are collected by querying directly the services' endpoints. The information stored into the registries allows humans and/or machines to find the address of a given node, in order to select nodes by the kinds of data they offer and to find out which query terms are supported at a node. The parameters of Tables 7 and 8 can be found in the registries description of BASECOL 28 .
Members of the VAMDC Consortium have a password allowing them to register a new database or a web application, once they have followed the quality procedures concerning the resource to be registered.
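To make the role of the query language and of the "Restrictables" more concrete, the following Python sketch assembles a synchronous query URL for a VAMDC node. It rests on several assumptions that should be checked against the VAMDC standards documentation and against Tables 7 and 8: the parameter names (LANG, REQUEST, FORMAT, QUERY), the "SELECT * WHERE" form of the query, and the restrictable MoleculeStoichiometricFormula reflect our reading of the VAMDC-TAP conventions, and the node address is a placeholder, not the BASECOL endpoint.

```python
from urllib.parse import urlencode


def build_vamdc_tap_url(node_base_url, restrictions):
    """Assemble a synchronous VAMDC-TAP-style query URL (illustrative only).

    `restrictions` maps restrictable names to values, combined with AND.
    Values are inserted verbatim: no escaping of quotes is attempted here.
    """
    if not restrictions:
        raise ValueError("at least one restriction is required")
    where = " AND ".join(
        f"{name} = '{value}'" for name, value in restrictions.items()
    )
    params = {
        "LANG": "VSS2",          # query language version (assumed)
        "REQUEST": "doQuery",
        "FORMAT": "XSAMS",       # output format named in the text
        "QUERY": f"SELECT * WHERE {where}",
    }
    return f"{node_base_url.rstrip('/')}/sync?{urlencode(params)}"


# Placeholder node address; the real one is found through the VAMDC registry.
url = build_vamdc_tap_url(
    "https://node.example.org/tap/",
    {"MoleculeStoichiometricFormula": "CO"},
)
```

A client would send an HTTP GET to the resulting URL and parse the XSAMS document returned by the node; which restrictables a given node accepts is exactly the information published in the registry.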

BASECOL and VAMDC Ecosystem
Within the VAMDC ecosystem, BASECOL can be accessed from different VAMDC tools such as the VAMDC portal 29 and the SPECTCOL tool [4], and from any other software that has implemented the VAMDC protocols and standards. On the VAMDC portal, the BASECOL data may be displayed with the "Collisional XSAMS to HTML" display, and Figure 8 gives an example of output that corresponds to querying the CO molecule on the VAMDC portal without any selection of the collider. When the BASECOL data are queried through the VAMDC portal and the SPECTCOL tool, each query is stored in the VAMDC query store 30 and can be linked to a DOI. When this DOI is used in publication, the citations associated with the datasets are acknowledged, therefore increasing the impact factor of the data producers [28][29][30].

BASECOL and FAIR Principles
We would like to conclude this paper with a discussion about the compliance of BASECOL with the FAIR principles [7].
• Findable: On the one hand, as described in Section 5, BASECOL is registered in public registries and may be found via the VAMDC discovery facilities. On the other hand, once BASECOL is discovered as a useful data repository, the collisional data it contains may be searched (cf. Section 4.1) using the collision search engine or browsed using the collisional browser.

• Accessible: Once found, data may be accessed directly on the BASECOL website (cf. Section 4.2) or extracted by the VAMDC infrastructure.

• Interoperable: The BASECOL interoperability comes from its integration within the VAMDC infrastructure. BASECOL data extracted via VAMDC may be handled using VAMDC compliant tools and processors (cf. Section 5.2).

• Reusable: Trust in data is a crucial element for data reusability. With BASECOL, scientists are able to cite where data are coming from, with a fine granularity that makes scientific reproducibility achievable. Indeed, data are both versioned and timestamped, and a user may refer to a specific version. All the "collisional data set" versions are stored in the BASECOL database and are available via the web interface. Another aspect that increases the trust in the BASECOL data is the direct link between a given collisional set and the refereed publication where the data set was published: the methods, hypotheses, and algorithms used for producing a given data set always come with the data.

Conclusions
This paper provides a detailed description of the latest technical developments on the BASECOL database and its environment. The major upgrades concern its very careful versioning system linked to an ingestion system that verifies the imported data with respect to the content of the database, as well as the possibility for users to retrieve previous versions of the collisional data sets. Another major update is the dual system of a validation environment and a production environment, coupled to the storage of the collisional data sets' ASCII files in a github repository. BASECOL is now designed to be a database fully compliant with the VAMDC e-infrastructure, in particular through the import script that contains the parameters necessary for the VAMDC interoperability. Citing one referee: "The 2020 version is undoubtedly a major update compared to the 2012 version: simpler, clearer with excellent traceability and reproducibility. The website is well designed and user friendly." Nevertheless, some issues must still be solved: they concern a more user-friendly import management environment and, of course, the scientific update of the database. In the future, we might foresee partnerships with websites that provide the above-cited RADEX files, and we might adopt a different business model for the sustainability of the database.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript:

Appendix A. Examples or Templates of the Import File Sections
The import ASCII file is composed of several "Data" sections. Examples or templates of those sections are provided in this appendix.

Table A1 provides an example of an "Element" section input for N2H+; it corresponds to the metadata of Table 1.

Table A1. Example of an "Element" section input for N2H+ as requested by the import script. The metadata are in red and the values in blue. The metadata "elementType" has fixed values chosen among [molecule | atom | negative molecular ion | positive molecular ion | particle]. The metadata $$mass gives the molecular weight in atomic mass units (in the species database, it is used to infer the mass number). The metadata "molecularConstant" is historical, concerns the rotational constant of a linear molecule, and has units of MHz. "htmlName" uses normal HTML language and is used in the BASECOL web interface.

Table A2 provides a template of an "Energy table" input; it includes the metadata of Table 2.

Table A2. Template of an energy table input as requested by the import script. The metadata are in red and the values in blue. The "case" metadata are dcs | hunda | hundb | ltcs | nltcs | stcs | lpcs | asymcs | asymos | sphcs | sphos | ltos | lpos | nltos. Q1 and Q2 are characters for quantum numbers. Both the case and the quantum numbers are described in the case-by-case document (http://www.vamdc.eu/documents/cbc-1.0/). In the second part of the table, all fields are mandatory. The first column gives the label attached to the energy level; this label identifies the levels in the rate coefficient table. The number of levels must be the number of levels necessary to identify the energy levels of the rate coefficients table. The second column gives the energy (in cm⁻¹); the two following columns give the values of the quantum numbers identified in the parameter $$quantumNumbers. In the current template, there are two quantum numbers and two levels.
$$case = case (see caption)
$$symmetry = none | A | E | meta | ortho | para (for atoms it is "none")
$$title = Free Text (for example: Rotational energy levels of HCN)
$$comments = Free Text
$$type = theoretical | experimental
$$energyUnit = cm⁻¹

Table A3 provides a template of an "Energy Origin" input; it includes the metadata of Table 3.

Table A3. Template of an "Energy origin" section input as requested by the import script. The metadata are in red and the values in blue. The "case" metadata are dcs | hunda | hundb | ltcs | nltcs | stcs | lpcs | asymcs | asymos | sphcs | sphos | ltos | lpos | nltos. Q1 and Q2 are characters for quantum numbers. Both the case and the quantum numbers are described in the case-by-case document (http://www.vamdc.eu/documents/cbc-1.0/). In the second part of the table, all fields are mandatory. The logic of the numerical line is the same as in Table A2.
$$case = case (see caption)
$$symmetry = none | A | E | meta | ortho | para (for atoms it is "none")
$$quantumNumbers = Q1; Q2
# [M]
1 0.0 value value [M]

Table A4 provides a template of a "Rate Coefficients" input; it includes the metadata of Table 4.

Table A4. Template of a "Rate Coefficients" input as requested by the import script. The metadata are in red and the values in blue. In the second part of the table, all fields are mandatory. The first column corresponds to the target's initial level, the second column to the target's final level, the third column to the collider's initial level, and the fourth column to the collider's final level. Then, the rate coefficients at the different temperatures indicated as values of the parameter $$temperatures are provided. In the current template, the transition is between the third and the first rotational levels of a target with the collider in its fundamental state. Two temperatures are considered at T1 and T2. The values 0.130607E-10 and 0.161225E-10 are the rate coefficients in cm³ s⁻¹.
$$title = Free text (Author1 & al, year)
$$comment = Free text
$$pesComment = Free text
$$methodComment = Free text
$$precision = Free text
$$rateType = state to state | effective | thermalized
$$process = none | fine | hyperfine | ro-torsional | ro-vibrational | rotation | vibration
$$year = value
$$contributor = Free text
$$temperatures = T1 ; T2; ..

Table A5 provides a template of a "Fitting Coefficients" input; it includes the metadata of Table 5.

Table A5. Template of a "Fitting coefficients" input as requested by the import script. The metadata are in red and the values in blue. None of the information is requested by VAMDC. The metadata $$designation has the following choices explained in the text (common fit equation  (see Table A4); this is followed by the values of fitting coefficients whose symbols are indicated in the parameter $$coefficients.
[M] stands for mandatory. In the current template, the transition is between the third and the first energy levels of the target with the collider in its fundamental state, and the fitting functions have two coefficients.

the BASECOL database, in order to modify and to add information, and then to create an import file in a flawless format that can be treated by the import script. This is particularly useful for the retrieval of metadata associated with species, for the discovery of the right format for quantum numbers, and in general for the retrieval of any information from the BASECOL database in order to create a new import file. Figure A1 shows an example for "species": a species is selected in the auto-completion bar of "Available elements", and then the fields are automatically filled. Figure A2 shows an example for "quantum numbers": a case is chosen; then, the list of standardized VAMDC quantum numbers appears, and the administrator can choose the relevant ones for their application.

Figure A1. Extract from the WebUI for the creation of an "Element" input file (Table A1). In this example, the information related to the H2O species is retrieved from BASECOL.

Figure A2. Extract from the WebUI for the creation of an "Energy Table" input (Table A2). In this example, the information related to the case "dcs" (diatomic closed-shell molecules) is retrieved from BASECOL. On the right of the "Quantum numbers" box, a list of quantum numbers appears. The administrator chooses the relevant quantum numbers and includes them in the box.
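To make the layout of the "Energy table" and "Rate Coefficients" templates of this appendix more concrete, the following Python sketch reads one section into its metadata and its numerical rows. It is only an illustration and not the BASECOL import script: the function names, the handling of "#" lines and blank lines, the quantum numbers J and v, the energies, and the temperatures 10 and 20 are assumptions made for the example. The lists of allowed "case" and "symmetry" values are copied from the table captions, and the numerical row of the rate coefficient sample is reconstructed from the description given in the caption of Table A4.

```python
import re

# Allowed values copied from the captions of the energy table templates.
CASES = {"dcs", "hunda", "hundb", "ltcs", "nltcs", "stcs", "lpcs",
         "asymcs", "asymos", "sphcs", "sphos", "ltos", "lpos", "nltos"}
SYMMETRIES = {"none", "A", "E", "meta", "ortho", "para"}

# A metadata line has the form "$$name = value".
METADATA_LINE = re.compile(r"^\$\$(\w+)\s*=\s*(.*)$")


def parse_section(text):
    """Split one section into a metadata dict and a list of numeric rows."""
    metadata, rows = {}, []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # assumption: blank lines and '#' lines carry no data
        match = METADATA_LINE.match(line)
        if match:
            metadata[match.group(1)] = match.group(2).strip()
        else:
            rows.append(line.split())
    return metadata, rows


def parse_energy_table(text):
    """Return {label: {"energy": float, "quantum_numbers": {...}}}."""
    metadata, rows = parse_section(text)
    if metadata.get("case") not in CASES:
        raise ValueError(f"unknown case: {metadata.get('case')!r}")
    if metadata.get("symmetry") not in SYMMETRIES:
        raise ValueError(f"unknown symmetry: {metadata.get('symmetry')!r}")
    names = [q.strip() for q in metadata["quantumNumbers"].split(";") if q.strip()]
    levels = {}
    for row in rows:
        # Columns: label, energy, then one value per declared quantum number.
        if len(row) != 2 + len(names):
            raise ValueError(f"unexpected number of columns: {row}")
        label, energy, *values = row
        levels[int(label)] = {"energy": float(energy),
                              "quantum_numbers": dict(zip(names, values))}
    return levels


def parse_rate_coefficients(text):
    """Return (temperatures, {(ti, tf, ci, cf): [rate at each temperature]})."""
    metadata, rows = parse_section(text)
    temperatures = [float(t) for t in metadata["temperatures"].split(";") if t.strip()]
    rates = {}
    for row in rows:
        # Columns: target initial/final, collider initial/final, then rates.
        if len(row) != 4 + len(temperatures):
            raise ValueError(f"unexpected number of columns: {row}")
        rates[tuple(int(x) for x in row[:4])] = [float(x) for x in row[4:]]
    return temperatures, rates


# Hypothetical inputs that follow the templates (values are illustrative).
ENERGY_SAMPLE = """$$case = dcs
$$symmetry= none
$$quantumNumbers = J; v
#
1 0.0 0 0
2 3.0 1 0
"""
RATES_SAMPLE = """$$rateType = state to state
$$temperatures = 10 ; 20
3 1 1 1 0.130607E-10 0.161225E-10
"""

levels = parse_energy_table(ENERGY_SAMPLE)
temperatures, rates = parse_rate_coefficients(RATES_SAMPLE)
```

Checking the number of columns against the declared quantum numbers and temperatures mirrors the statement in the captions that all fields of the numerical part are mandatory; a real ingestion step would also compare the level labels of the rate coefficients with those of the energy table.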