1. Introduction
In the context of this paper, Trajectory Determination Systems (TDS) [1] are software components that compute trajectories using a variety of sensor measurements as input. The word trajectory usually refers to the path of a flying object. In this article, we first generalize its use to the path of any moving object through 3D space as a function of time. We then generalize it further to paths of entities (objects or mathematical abstractions) whose coordinates change with time in some navigation, state or parameter space N; e.g., the actual 3D path plus the velocity and attitude angles of a 3D rigid body. Clearly, this concept is closely related to the mathematical one of a stochastic process, since we describe the coordinates of the object as a time-dependent N-valued random variable. More to the point, a TDS estimates trajectories that are optimal, in some sense, from sensor measurements and mathematical models. A TDS may be either a real-time or a post-processing tool.
Today, the applications of trajectory determination are becoming increasingly relevant and demanding: unmanned aircraft navigation [2], precision agriculture [3,4], indoor mapping, indoor positioning [5] and robotics, including autonomous vehicle driving [6], are blooming examples among many others. Moreover, the trend is not bounded by the traditionally small size of the professional markets, since accurate trajectory determination is also an enabler of mass-market applications [7]. In general, these applications cannot be served by mono-sensor systems due to the complexity and variety of the navigation environments. In parallel, sensing technology, including satellite navigation technology, is evolving fast and contributing sensors that are new either in their very concept or in their features (size, weight, error properties and others) [8,9]. To leverage their potential for trajectory determination in the context of extremely competitive markets, the measurements of new sensors must be modelled and integrated in TDS as quickly as possible.
The challenges discussed above are not only scientific but also software engineering ones, at both the internal computational level and the external interface level. The situation described, a large variety and rapid evolution of input data, is not particular to TDS but a general condition of modern software systems. In fact, coping with change and evolution of software systems is an old challenge of software engineering. Many software development methods, like Extreme Programming [10] or, more generally, the family of agile software development methods [11], were, to a significant extent, motivated and influenced by the need to cope with unforeseeable change and rapid evolution. Many programming languages were also shaped by these challenges, object-oriented languages [12] being among the most popular.
One solution to the problem is to develop systems, TDS in particular, that are easily extensible. Over the past years we have developed a number of TDS [13,14,15] that achieved extensibility through genericity. For this purpose, we designed internal estimation engines and external interface engines for a wide class of data and mathematical models, in such a way that dealing with new data types or new models requires modifying neither the estimation engines nor the input/output ones [14,15], following a plug-and-play paradigm. As opposed to the simultaneous approach in [13,14], ASTROLABE targets sequential estimation and is based on seed concepts anticipated in [15]. More specifically, by “wide class” we mean an abstraction that supports, in principle, all existing and future sensor models and data for TDS. A similar approach is reported in [16] for a “Reconfigurable Integration Filter Engine” (RIFE). While [16] describes a generic TDS, in ASTROLABE we concentrate on a generic interface independent of the internal navigation engine and target all types of trajectories. A related article [17] mentions the Unified Aiding Information Drives (UAID) generic sensor interface for RIFE. However, in [16] the use of a new sensor requires that the sensor type be available in a system library, whereas we pursue a different level of abstraction where new types of sensors can also be integrated in a plug-and-play fashion.
We note that there are other approaches to the extensibility of TDS. One possibility is to develop, for each sensor type, a “maximalist” model and interface in the hope that it is general enough. Another approach is to use sensor replacement models (RPM) [18]. This technique is popular in small-scale photogrammetry and remote sensing. It is based on general analytical formulations, usually unrelated to the physics of the sensor, like rational functions (algebraic fractions where both the numerator and the denominator are polynomials). Its benefits notwithstanding, an RPM is nothing more than a model that works for a wide variety of imaging sensors. A truly generic interface must allow for the seamless use of any type of model and leave entirely to the sensor specialists the decision of which particular model to use.
As with software engineering, extensibility is an old issue in geomatics, with some early contributions that have influenced our work: in [19] early attempts to classify geodetic entities under a common umbrella were made; in [20] a pioneering generic adjustment engine was presented; in [21] the advantages of the object-oriented approach to generic adjustment systems were discussed; and in [22,23] the underlying general structure of network adjustment systems was analysed. Though loosely related to our problem, we acknowledge the research on ISO standards for geomatics reported in [24], which has the potential to generate generic interfaces for TDS. These contributions from geomatics are relevant to TDS, as they apply equally to real-time and post-processing tools.
Generally speaking, and surprisingly to some extent, the literature on navigation and orientation, whether broad general surveys like [25] or descriptions of specific generic systems like [26,27] or [28], does not tackle generic interfaces. Only recently has the design of plug-and-play systems made apparent the need for generic interfaces; automation (e.g., in robotics) and harmonisation (e.g., in large organisations) are drivers of this. Thus, as already referenced, [17] mentions the UAID generic sensor interface for their universal plug-and-play navigation system. [29] refers to the Defense Advanced Research Projects Agency (DARPA) All-Source Positioning and Navigation (ASPN) programme and its set of generic interface documents known as the “ASPN Interface Control Documents (ICDs)”. ASPN was launched by DARPA in November 2010 with the aim of developing low-cost sensor fusion technologies and achieving a plug-and-play architecture; i.e., that sensors can be added, removed, or replaced on the fly and that their measurements can be included in the input data streams with minimal delay [30]. The DARPA initiative is an independent confirmation of the relevance of the topic of this article.
To at least alleviate this problem, this research work proposes a data abstraction. Identifying the common traits that define the essence of the information provided by sensors leads to a data model general enough to support such diversity, that is, able to represent, metadata included, any kind of sensor whose error distribution may be fully modelled by a mean and a covariance matrix (see Section 2). Data (values) by themselves are meaningless; a rigorous data model that aims to be complete must include the metadata that unequivocally characterize the data. Ignoring metadata has been a constant source of problems in the past, and it still is [31]. Unfortunately, applications in production environments regularly tend to ignore metadata for the sake of performance (The absence of metadata implies a series of assumptions on default values concerning data, such as the physical units or the expected reference frame and coordinate system, among others. As a consequence, no conversion tasks (units, coordinate systems) need to be performed on input data, which reduces the computational effort required to provide a solution and, consequently, improves performance.).
Standards covering sensor and observation data models already exist [32,33], and so do efforts to bring them closer to geomatics [24]. Although these data models would have been powerful (and general and extensible) enough for the purposes of our work, we focus on a much narrower field of applications than those targeted by the aforementioned standards; a suitable data model for TDSs needs to specify fewer characteristics to be sufficient, which leads to a more concise specification. In short, ASTROLABE may be seen as a subset of [32,33]. Additionally, terseness is an advantage from several standpoints, for instance: (1) the amount of information to store when a TDS is working in real time is noticeably reduced, thus decreasing the time needed to save data; (2) transmitting such data through a network connection requires less bandwidth and less transfer time; (3) the burden related to the preparation and management of data and metadata is significantly reduced in actual production environments; and (4) the implementation of an Application Programming Interface (API) managing such a data model is much simpler and more compact.
On the other hand, standards designed to represent data from specific types of sensors exist and are widely used. A well-known example is the Receiver INdependent EXchange (RINEX) format [34], specifically designed for GNSS range measurements, and its relatives, the Antenna Exchange (ANTEX) format, the IONosphere map EXchange (IONEX) format and the Standard Product 3 (SP3) orbit format. We note that each of these formats is defined independently and that the inclusion of new GNSS systems requires modification and recompilation of the IO software in most cases. Another popular example is the LASer file format (LAS) [35] for point clouds derived from laser scanning. The main advantage of such standards, being tailored and optimized for specific sensors, is, at the same time, the reason to avoid them; one of the goals of the research work presented in this paper is to devise a generic data model able to manage diversity and evolution.
This paper presents ASTROLABE, developed since 2009 at the former Institute of Geomatics (IG) and since 2014 at the Geomatics Division of the Centre Tecnològic de Telecomunicacions de Catalunya (CTTC) with the cooperation of GeoNumerics. Its goal is to create a generic, extensible, complete and efficient data model suitable specifically for TDSs. This work is inspired by the generic data models and interface specifications of two generic and extensible network adjustment tools, GeoTeX [13] and GENA [14]. At the time when the decision to develop the generic standard was made, to the best knowledge of the authors, there was nothing comparable, academic or industrial. It is true that the standards defined in [32,33] already existed but, as stated above, these were too ambitious for our purposes.
ASTROLABE is the generic interface of CTTC’s NAVEGA [
15] and GeoNumerics’ NEXA trajectory determination systems. In the rest of the article, for the sake of clarity and readability, we will use NAVEGA examples and refer to NAVEGA functions whenever helpful.
This work is structured as follows: Section 2 identifies and introduces the data entities involved in the process of trajectory estimation; these data entities are the pillars on which the ASTROLABE data model relies. Section 3 describes this data model in full detail from the conceptual standpoint, including data and metadata. In Section 4 the ASTROLABE data model is translated into a formal specification comprising a file interface and a network interface, which are briefly described. An almost complete C++ library implementing these two interfaces is briefly discussed in Section 5. Some considerations on the kinds of sensors (and related observations) that may be modeled using ASTROLABE are included in Section 6. Finally, conclusions are presented in Section 7. The interested reader may find a more detailed description of the ASTROLABE file interface in Appendix A. An example describing ASTROLABE data and metadata used in a real-life project (GAL) may be found in Appendix B.
  2. The Foundations of the Data Model
Briefly stated, the goal of a TDS is to derive trajectories out of sensor data, whatever these are, be it in post-processing or real-time mode.
Like any other kind of application, a TDS transforms input data into some kind of result using specialized logic appropriate for the problem to solve. In the particular case of TDSs, the inputs are observations coming from instruments (sensors), to which a series of mathematical models specific to the intervening observations and instruments is applied to compute the states. Computing these values at consecutive points in time delivers a time-ordered series of states, the trajectory.
The most common methods used to determine trajectories in TDSs are sequential least-squares estimators. Kalman filters [36] are the best known among them. Although these filters do not specifically require Gaussian distributions for errors, they yield the exact conditional probability estimate when the errors are Gaussian. Further, to fix ideas, a Gaussian distribution of the observation error will be assumed for the data model described hereafter. Nevertheless, the data model can be used in non-Gaussian situations. In fact, the proposed data model supports a broad range of error probability distribution functions, namely those for which the first and second moments exist.
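To make the role of the first and second moments concrete, the following minimal sketch shows one measurement update of a scalar Kalman filter in its textbook form. It is an illustration only: the function name `kalman_update` and the numeric values are assumptions of this sketch and are not taken from ASTROLABE or NAVEGA.

```python
def kalman_update(x_prior, p_prior, z, r):
    """One scalar Kalman measurement update for a direct observation z = x + noise.

    x_prior, p_prior: prior state estimate and its variance.
    z, r: observation (expectation) and its variance (a 1x1 covariance matrix).
    """
    k = p_prior / (p_prior + r)           # Kalman gain
    x_post = x_prior + k * (z - x_prior)  # state corrected by the weighted residual
    p_post = (1.0 - k) * p_prior          # variance after using the observation
    return x_post, p_post


# Illustrative values only (hypothetical): prior 1000 with variance 4,
# observation 1004 with variance 4.
x, p = kalman_update(1000.0, 4.0, 1004.0, 4.0)
```

Only the expectation and the variance of the observation enter the update, which is why a data model carrying these two moments is sufficient for this family of estimators.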
Four relevant entities arise from the previous description: states (a.k.a. parameters), observations (a.k.a. measurements), instruments and mathematical models. The data model is precisely based on them. In our context, it is convenient to add a “new” one, namely, the observation equation, intended to identify a particular combination of observations, states, instruments and mathematical models.
Below, a short formal description of these entities is provided to help focus the discussion.
      
- (Observable and) Observation. An observable is a numerical property of a physical system that can be determined by a sequence of physical or mathematical operations. Technically, it is a random variable. An observation or measurement is one of the values that an observable or random variable may take; i.e., it is a random sample of a random variable. (For example, the various repeated measurements between A and B made with a distance meter, the measured distances, are observations, and the abstract concept of distance between A and B is the observable.)
  It is worth mentioning that for ASTROLABE, and this is an abuse of language, an observation is, in fact, a vector of measurements for a set of observables (an array of observations as defined just above).
  In the context of TDSs, typical observables are acceleration, temperature, range or angular speed. Observations for these observables might be −1.3 m/s², +23.556°, +12,832.34 m and −27.42 rad/s respectively.
- State. Similar to observations, a state is a time-dependent random vector whose expectation and covariance have to be estimated from known observations, instruments and mathematical models. Note that in simultaneous least-squares estimation the states are called parameters.
  Typical states in TDS environments are position or attitude random vectors.
- Instrument. An instrument (or sensor) is a device used to measure some observable, thus delivering observations. From the data abstraction point of view, it is an entity that contains the constants that characterize an actual instrument.
  Within the context of TDSs, inertial measuring units, gyroscopes or GNSS receivers are examples, among others, of instruments.
- (Mathematical) Model. A mathematical description of a system or process. In the context of ASTROLABE, it is a stochastic equation or a stochastic differential equation; traditionally, this has also been described as a functional model plus a stochastic model.
The last data entity in ASTROLABE’s data model is the observation equation which, in our context, is defined as:
	  
The last part of the former sentence introduces a new facet in the definition of observation equations. From the processing standpoint, these may be viewed as commands or triggers that tell the trajectory determination software that the right moment to derive a new state has come (and that the information to use is the one specified by the equation). A series of observations, all of them including measurements reported by (probably) different, cooperating sensors, may be held in stock until the best moment to derive the new state comes. This moment is signalled by the observation equation. This is especially important when working with systems that hybridize a set of sensors to compute trajectories; some usual examples combine, for instance, GNSS receivers, inertial measuring units and magnetometers. The use of the simultaneous data reported by these sensors noticeably increases the quality of the solution (see [37]).
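The trigger behaviour described above can be sketched as follows. This is a simplified illustration, not the NAVEGA implementation: the event labels `"obs"` and `"eq"`, the dictionary keys and the function name `process_stream` are hypothetical, and the sketch keeps only the most recent pending observation per source.

```python
def process_stream(events):
    """Hold observations in stock and release them when an equation arrives.

    `events` is an iterable of (kind, payload) pairs. The kinds "obs" and "eq"
    are hypothetical labels chosen for this sketch.
    """
    stock = {}    # (identifier, instance identifier) -> latest pending observation
    updates = []  # one (time, observations used) entry per triggered state update
    for kind, payload in events:
        if kind == "obs":
            # Observations do not trigger anything; they are only stored.
            stock[(payload["id"], payload["n"])] = payload
        elif kind == "eq":
            # The equation names the observations it needs and acts as the trigger.
            used = [stock.pop(key) for key in payload["uses"] if key in stock]
            updates.append((payload["time"], used))
    return updates


events = [
    ("obs", {"id": "imu1", "n": 41, "time": 124.88}),
    ("obs", {"id": "baro1", "n": 32, "time": 124.88}),
    ("eq", {"time": 124.88, "uses": [("imu1", 41), ("baro1", 32)]}),
]
updates = process_stream(events)
```

In this sketch a state update happens only when an equation event is read, so observations from several cooperating sensors can accumulate before they are used together.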
Figure 1 depicts a hypothetical observation equation. This example assumes that there exist j different kinds of models, l kinds of observations, o different kinds of states and z kinds of instruments. The observation equation in the figure relates the model whose identifier is h, two observations with identifiers 1 and k respectively, two states, whose identifiers are m and o, and, finally, a single instrument with identifier y.
   3. The Data Model in Detail
The sections to follow will describe in detail the observation, state, instrument and observation equation entities just introduced. Such a description takes place from the data and metadata standpoints; it is the result of the abstraction process that leads to the ASTROLABE data model. However, before such a description may be detailed, it is necessary to discuss how data and metadata are identified and cross-referenced. 
Section 3.1 will introduce the [meta]data identification mechanism used in ASTROLABE. 
Section 3.2, 
Section 3.3, 
Section 3.4 and 
Section 3.5 will then describe both data and metadata.
  3.1. On Data Identification and Cross-Referencing: Types, Identifiers and Instance Identifiers
ASTROLABE relies on a hierarchical, three-level mechanism to unequivocally identify and cross-reference data. Metadata employ two of these levels, namely types and identifiers, while data complete the identification triplet incorporating instance identifiers.
A type is a code tagging a unique kind of object (be it an observation, state, instrument or model) with a specific, and also unique, set of properties. An example of such object would be a certain kind of instrument, e.g., a temperature-compensated barometer. A unique type code would be assigned to this kind of barometer. Should a different kind of barometer exist, as for instance, one whose measurements were not compensated by means of temperature, a new, different type code should be used. Examples of these type codes could be barometer_t_compensated and barometer_basic respectively.
There may exist, nonetheless, many brands and models of barometers that use temperature to correct their measurements. All these would be incarnations of barometer_t_compensated (instrument) objects, that, in spite of sharing a common type, might behave differently—e.g., being more or less accurate or delivering data using different physical units. In other words, even though all the temperature-compensated barometers may be described by a common set of properties, the actual values of these properties may differ.
The identifier is used to take these differences into account. An identifier is a unique code used to tell apart different incarnations of objects (temperature-compensated barometers in the example above) that, in spite of being characterized by the same set of properties, have different values for them.
A tuple composed of a type and an identifier unequivocally identifies any kind of object in ASTROLABE metadata. Figure A2, lines 01–29, shows the full XML specification of temperature-compensated barometer observations using the ASTROLABE file interface (Although this section describes a data model that may be implemented in many different ways, examples are presented using the XML syntax defined by ASTROLABE to materialize its file interface. This file interface is detailed in Section 4.1.). Note the type/identifier codes in lines 02 and 04 of this example.
Actual data records use the aforementioned identifier to characterize themselves. That is, these records include the identifier as an extra field pointing to the metadata that will serve to describe them. 
Figure 2 depicts the type/identifier hierarchy and shows as well how actual data use the identifier code to state what kind of information is stored in data records.
The identification schema just depicted is not complete yet from the data (not metadata) standpoint, since it does not take into account that multiple, identical instruments might be used simultaneously when collecting data. That is, data originating from two or more identical instruments—sharing the same type and identifier codes—may be present in a dataset. To solve this problem, the third element in ASTROLABE’s data identification schema is used. It is the instance identifier.
The instance identifier is a code used to distinguish between several instances of objects whose types and identifiers are identical. Its purpose is to allow multiplicity.
Thus, actual data records use a combination of an identifier plus an instance identifier to fully characterize themselves. The identifier points to metadata (and therefore, to the exhaustive description of the information the data record contains), while the instance identifier is used to discriminate between several instances of identical data sources. Figure 2 depicts this situation graphically. Figure 3 shows several actual barometer data records in ASTROLABE XML format including the same metadata identifier (baro1, see Figure A2) but two different instance identifiers (1 and 2). That is, data from two identical barometers have been collected between times 124.88 and 124.92 (two readings from barometer 1, only one from barometer 2).
Note that the values of the instance identifiers may be chosen arbitrarily, provided they are different for every object instance they represent.
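The three-level identification mechanism can be illustrated with a small sketch. It assumes a plain dictionary for metadata and tuples for data records; the property name `units`, its value, the pressure readings and the function name `group_by_source` are hypothetical and only serve the illustration.

```python
# Metadata level: each identifier is bound to a type and to the actual values
# of the properties that the type defines (property and value are hypothetical).
metadata = {
    "baro1": {"type": "barometer_t_compensated", "units": "hPa"},
}

# Data level: (identifier, instance identifier, time, reading).
# The readings below are made-up values for two identical barometers.
records = [
    ("baro1", 1, 124.88, 1023.44),
    ("baro1", 2, 124.90, 1023.51),
    ("baro1", 1, 124.92, 1023.47),
]


def group_by_source(records, metadata):
    """Group data records by (identifier, instance identifier)."""
    groups = {}
    for identifier, instance, time, value in records:
        if identifier not in metadata:
            # A record whose identifier has no metadata cannot be interpreted.
            raise KeyError(f"no metadata for identifier {identifier!r}")
        groups.setdefault((identifier, instance), []).append((time, value))
    return groups


groups = group_by_source(records, metadata)
```

Both instances resolve to the same metadata entry through the identifier, while the instance identifier keeps their readings apart.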
  3.2. Data: Observations
Observations are one of the main pivots around which the ASTROLABE data model turns; not in vain, observations are the primary source of data used to derive results, either in TDS or in many other kinds of software tools. In fact, many of these tools consider observations the only source of information, often forgetting metadata completely. Therefore, finding a powerful abstraction for observations is of capital importance for a data model that aims to be generic and extensible.
The type of information provided by sensors is, apparently, very different. It is evident that a barometer will not deliver the same kind of data as a spectrometer, a gravimeter or a densitometer, to mention some. Moreover, and now moving to the software realm, the algorithms used to deliver useful results out of sensor readings are different too. Taking into account how information is processed is not, however, the task of a data model; it must deal only with those aspects that are intrinsic to data.
In spite of the aforementioned dissimilarities in observations, a common structure, their essence, may be identified in all of them. The observations that may be modelled by ASTROLABE consist of a set of expectations (the measurements reported by the device) and an assessment of their quality (the covariance matrix). (The word expectation in the context of this paper stands for the best estimate of the expectation of the random variable associated to the measurement. ASTROLABE assumes observations with error distributions fully defined by the first and second moments; therefore, an observation always consists of an array of expectations plus a covariance matrix.) No matter what the observation is, these elements will always be present (Some sensors may not provide actual values for the covariance matrices that should accompany every measurement. In such situations the nominal quality values reported by the manufacturer may be used. Such default values may be specified by means of metadata. For instance, the example in Figure A2 (line 18) provides a default value (1 hPa) for the standard deviation component of the covariance matrix by means of the <c> tag.).
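The fallback to metadata defaults can be sketched as below. The function name `covariance_matrix` is hypothetical, and the sketch assumes that quality is given as one standard deviation per component with no correlations, which is a simplification of a general covariance matrix.

```python
def covariance_matrix(record_std, default_std):
    """Build a diagonal covariance matrix for one observation.

    record_std:  standard deviations carried by the data record, or None when
                 the sensor did not report any.
    default_std: nominal standard deviations declared once in the metadata.
    Assumes uncorrelated components (a simplification made for this sketch).
    """
    std = record_std if record_std is not None else default_std
    n = len(std)
    return [[std[i] ** 2 if i == j else 0.0 for j in range(n)] for i in range(n)]


# 1 hPa is the default standard deviation quoted in the text for Figure A2.
baro_cov = covariance_matrix(None, [1.0])
```

A record that does carry its own values simply overrides the default, for example `covariance_matrix([0.5], [1.0])`.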
Of course, the number of values present in each kind of observation will vary; a 3-axial accelerometer will report three readings (acceleration along the three axes) while a thermometer will provide just one (the temperature). Even when considering sensors of the same type providing the same number of readings, the units used to deliver such values may vary: m/s² or cm/s² in the case of accelerometers, for example.
These differences are, however, accidental and may easily be modeled through the proper use of metadata (see Section 3.5). What remains is the fact that every observation will be composed of a set of expectations and the related covariance matrix.
Therefore, a first approach to defining the common structure of observations would state that it is composed of two elements:
		observation = (expectations, covariance matrix)
It is necessary to be able to tell apart different kinds of observations; otherwise, it would be impossible to interpret the semantics of these measurements: observations from a distance meter would be indistinguishable from those coming from an odometer. To do so, the identifier and instance identifier defined in Section 3.1 are introduced:
		observation = (identifier, instance identifier, expectations, covariance matrix)
The unique identifier will point to the appropriate metadata, where aspects such as the dimension of the vector of expectations or the units used, as well as other accidental properties, will be specified. The instance identifier, as stated in Section 3.1, will serve to tell apart observations originating from different instances of equivalent data sources (for instance, two identical barometers).
Note that the need to identify different kinds of observations responds to the (geodetic) principle of heterogeneity (Heterogeneity refers to the use of different types of observables for the determination of the same set or subset of parameters or states. Better systematic error correction is also the goal of heterogeneity.) on which the ASTROLABE data model relies. The need to identify several instances of the same kind of observation responds to the (geodetic) principle of redundancy (Redundancy refers to the repetition of measurements, apparently superfluous and unneeded, with the objective of mitigating the random dispersion of the observables (random variables) being measured (samples of the observable). Better outlier detection is also the goal of redundancy.).
Time is a requirement in TDSs because of the nature of their purpose. Observations must therefore be time-tagged to register the moment when the sensor delivered its readings.
		observation = (identifier, instance identifier, time, expectations, covariance matrix)
Sometimes, it is interesting to keep track of extra data that, although not an intrinsic component of an observation, helps to complement it. This auxiliary data may be of any kind. For instance, an accelerometer delivering acceleration measurements may be seriously affected by temperature. This observable (temperature) is not, from the conceptual standpoint, part of the acceleration observation; however, ignoring it may lead to incorrect results when used by a TDS. On the contrary, extending the observation with this information may help the TDS to, for instance, apply a calibration method to correct the data coming from the accelerometer.
The former is just an example to illustrate why auxiliary information should be a companion of pure observation data. Auxiliary values are called tags in the context of ASTROLABE. The number of tags is arbitrary; other accelerometers, for instance, might be affected by other kinds of observables. Other sensors may not need auxiliary values at all.
Therefore, the ASTROLABE data model makes tags an integral part of its observation data entity; the number of tags (which may be zero) and their properties are defined for each kind of observation by means of metadata (see <t_spec>, Figure A2, lines 20–28).
		observation = (identifier, instance identifier, time, tags, expectations, covariance matrix)
Leaving aside metadata, and from the conceptual—ASTROLABE’s, at least—standpoint, there is nothing else that an observation should include. There are, however, two more items that the current observation model has adopted: an extra event tag and an activation flag.
The typical dataset used in trajectory determination (either a file or a network stream, for instance) will include at least two kinds of data entities: observations and observation equations. Since both data entities will be merged in datasets, a mechanism to tell these entities apart is necessary. A simple marker identifying the kind of entity it precedes will suffice for that purpose. ASTROLABE calls this marker the event tag.
The final element composing the observation model is provided just for (processing) convenience purposes: it is the activation flag. When processing data, some observations may be detected as wrong (for instance, because a magnetometer starts producing invalid readings when approaching a powerful source of electromagnetic interference, such as power lines). If these observations are not removed from the computation of the output trajectory, the results will be distorted. Of course, it is possible to eliminate such observations from the input dataset, thus removing their harmful effects; however, this distorts the record of how data was captured, since the original observation will no longer be available in the input dataset.
The solution to this (apparent) problem is to provide the aforementioned activation flag. An observation set to “inactive” by the human operator must be ignored by any TDS just as if it had never existed.
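This leads to the final step in the definition of the observation entity data model as seen by ASTROLABE:
		observation = (event tag, activation flag, identifier, instance identifier, time, tags, expectations, covariance matrix)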
Table 1 summarizes the discussion above.
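As a compact illustration of the entity just summarized, the following sketch models an observation as a Python data class. It is not the C++ library discussed later in the paper; the class and field names are assumptions of this sketch, the event tag is left out because it only matters when records are serialized, and the second record uses made-up values.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Observation:
    identifier: str                # points to the metadata describing the record
    instance: int                  # tells identical data sources apart
    time: float                    # time tag of the reading
    expectations: List[float]      # the measurements themselves
    covariance: List[List[float]]  # quality assessment, n x n for n expectations
    tags: List[float] = field(default_factory=list)  # auxiliary values, may be empty
    active: bool = True            # False: ignore as if it had never existed

    def __post_init__(self):
        n = len(self.expectations)
        if len(self.covariance) != n or any(len(row) != n for row in self.covariance):
            raise ValueError("covariance must be n x n for n expectations")


def active_only(observations):
    """Keep the observations a TDS is allowed to use, without deleting the rest."""
    return [o for o in observations if o.active]


# The first record mirrors the barometer example in the text (unit variance is
# an assumption); the second one is hypothetical and flagged as inactive.
good = Observation("baro1", 32, 124.88, [1023.44], [[1.0]], tags=[23.44])
bad = Observation("baro1", 32, 124.90, [0.0], [[1.0]], active=False)
usable = active_only([good, bad])
```

Filtering on the flag leaves the input dataset untouched, which is the point of flagging rather than deleting.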
Figure 4 shows two examples of observations using the ASTROLABE XML syntax. The metadata describing these observations may be found in Figure A2. Section 3.5 describes metadata in detail.
 It is possible to know that these are observations because of the <l> tag opening (and closing) data records. This XML tag corresponds to the event tag described above. The active flag is represented as the “s” (status) attribute, which may take two values: “a” for active or “r” meaning removed or inactive. Both observations include their identifiers (pointing to the metadata that characterize them) in the “id” attribute. The first observation states that its identifier is “baro1” while the second one identifies itself as “imu1”. To link these observations to specific instances of sensors the instance identifier, represented by the “n” attribute, is used, whose values, for the “baro1” and “imu1” observations are, respectively, 32 and 41.
The next field is the time stamp (124.88 in both observations). In the case of the “baro1/32” observation, one tag (auxiliary value) has been included (23.44); this is so because the sensor related to this observation is a temperature-compensated barometer; the tag corresponds to the temperature reading at the moment when atmospheric pressure was measured. There are no tags for the “imu1/41” observation.
Then, the measurements themselves are included. In the case of “baro1/32” a single value (pressure, 1023.44) is provided. The “imu1/41” observation, on the contrary, provides the readings for three angular velocities and three accelerations (0.01 0.02 0.015 0.32 0.43 9.95).
Lines 2 and 5 provide the covariance matrices (a single standard deviation in the case of “baro1/32”) for these observations respectively.
See 
Figure A1 for a complete ASTROLABE XML example showing both observations and observation equations.
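As a minimal sketch of how a client could split such a record into its parts, the Python fragment below parses one <l> element. The attribute names (“s”, “id”, “n”) are those described above; the ordering of the values inside the element (time stamp, tags, expectations, covariance values), the names `Observation` and `parse_observation`, and the choice of treating a missing “s” attribute as active are assumptions made for illustration only. The number of tags and the dimension would normally be taken from the metadata.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import List


@dataclass
class Observation:
    identifier: str            # "id" attribute, points to the metadata
    instance: int              # "n" attribute, identifies the sensor instance
    active: bool               # "s" attribute: "a" (active) or "r" (removed)
    time: float
    tags: List[float]
    expectations: List[float]
    covariance: List[float]    # whatever values follow the expectations


def parse_observation(xml_text: str, n_tags: int, dimension: int) -> Observation:
    """Parse one <l> record; the value layout is an assumption of this sketch."""
    elem = ET.fromstring(xml_text)
    if elem.tag != "l":
        raise ValueError("not an observation record")
    values = [float(v) for v in (elem.text or "").split()]
    first = 1 + n_tags  # index of the first expectation value
    return Observation(
        identifier=elem.get("id"),
        instance=int(elem.get("n")),
        active=elem.get("s", "a") == "a",  # assumed default: active
        time=values[0],
        tags=values[1:first],
        expectations=values[first:first + dimension],
        covariance=values[first + dimension:],
    )
```

For the barometer observation discussed above, the call would use `n_tags=1` and `dimension=1`; for the IMU observation, `n_tags=0` and `dimension=6`.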
  3.3. Data: States, Instruments and Mathematical Models
States and instrument data may be modelled using exactly the same structure just discussed in 
Section 3.2, at least in the context of ASTROLABE and its goals (In a trajectory determination and, more generally, in a parameter or state estimation “ecosystem”, the roles of measurements, states and instruments are often interchanged. More specifically, an instrument (i.e., its calibration parameters, time-varying or not) could be the result of a trajectory determination exercise; in a subsequent software run, however, that set of calibration states can be treated as an instrument constant. Similarly, a GNSS-derived position can be a state of a GNSS trajectory determination based on GNSS measurements; in a subsequent loosely-coupled INS/GNSS trajectory determination run, those position states become the position observations. In ASTROLABE, a trade-off between the engineering model and the mathematical model is made. We could refer to measurements, states and instruments as known or unknown stochastic processes. However, by telling measurements, states and instruments apart, we facilitate the use of the standard and the modelling process. This is why, mathematically and structurally, observations, states and instruments are alike.). The following are the definitions for state and instrument data (provided for the sake of completeness, since they are exactly the same as the definition of an observation):
		
In the case of observations and states, the coincidence of their definitions should not surprise the reader; in fact, when working in post-processing environments, TDSs usually compute trajectories in three steps: the so-called forward, backward and smoothing steps. The output (trajectories, made of states) produced by the first two steps becomes the input of the third one. The role of states thus changes to that of observations at that point.
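To illustrate why forward and backward states can play the role of observations in the third step, the sketch below combines two independent scalar estimates of the same quantity by inverse-variance weighting. This is a textbook simplification chosen for illustration and is not claimed to be the smoothing algorithm used by any particular TDS; the function name `combine_estimates` and the scalar, uncorrelated setting are assumptions.

```python
from typing import Tuple


def combine_estimates(x_fwd: float, var_fwd: float,
                      x_bwd: float, var_bwd: float) -> Tuple[float, float]:
    """Fuse forward and backward estimates, weighting each by 1/variance."""
    if var_fwd <= 0.0 or var_bwd <= 0.0:
        raise ValueError("variances must be positive")
    w_fwd = 1.0 / var_fwd
    w_bwd = 1.0 / var_bwd
    var = 1.0 / (w_fwd + w_bwd)                  # variance of the fused value
    x = var * (w_fwd * x_fwd + w_bwd * x_bwd)    # weighted mean
    return x, var
```

Each input pair (value, variance) has exactly the shape of an observation (expectation plus covariance), which is the structural coincidence the text refers to.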
Observations and states may easily be told apart by the context in which they appear. Observations will always be included in input files/network streams. States are always the output of TDSs, so they will always be found in output files/network streams. For an example of states written in the ASTROLABE XML syntax, the reader may refer to 
Figure 4, which, in fact, depicts observations.
Instruments, on the other hand, are not random variables. That is, time does not affect instrument data, which are considered constant information. From the structural standpoint, instrument data are, once more, a set of values and their quality information (that is, the expectations and covariance matrices found in observations and states). Note that, in the case of instrument (constant) data, ASTROLABE considers the covariance a mere indication of the quality of the list of values.
Since instruments must also be identified, the three-level hierarchy used in observations and states is adopted here. Tags are provided for the same reason as with observation data; the difference, in the case of instrument constants, is that these tags are purely informative and contain their values at the moment the constants were computed, that is, when the instrument was calibrated.
The use of time stamps and activation flags as an integral part of the instrument data needs some justification. Although purely informative, time stamps are used to keep track of the time when the instrument constants were calculated. This is very important in the case of unstable instruments that must be recalibrated often; an explicit calibration time helps to avoid problems due to the use of outdated information. Note, finally, that the time stamp must use the same reference frame and coordinate system as those used in observation records.
The presence of the activation flag might seem controversial: if observations from some kind of instrument are included in a dataset, the constants defining the instrument itself should be included as well. Conversely, if the observations contain no data from some kind of instrument, the inclusion of the instrument data would be superfluous. Therefore, the possibility of activating or deactivating instruments seems useless. This is true; however, the flag is included for purely practical reasons: it allows storing the constants of several instruments in a single file. Depending on the sensors (and therefore, observations) used to compute a particular trajectory, the instruments available in such a file may be activated or deactivated accordingly, and the single instruments file easily reused.
Typical examples of instrument constants would be the focal length of a camera or the position of the center of a GNSS antenna.
The following are two instrument data records that adhere to the definition above but include no tags. The instrument is described in the metadata file fragment shown in 
Figure A3, more specifically in lines 138–171.
		
These records show the expected pressure readings (1024.01 and 1023.99 mBar, see the <units> tag in 
Figure A3, line 155) for two (instance identifiers 51 and 52 respectively) identical (id = “p0_h0” in both cases) barometers, while their heights are 0.0 meters (the units are defined in line 166 of 
Figure A3). The value of the time tag (informational only) is 124.88 in both records. No covariance matrices have been included; should these be needed (for informational purposes only), default values may be retrieved from the corresponding metadata (<c> tags in lines 156 and 167 in 
Figure A3).
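The reuse of a single instruments file through the activation flag can be sketched as follows. The two records mirror the barometer constants just described (identifier “p0_h0”, instances 51 and 52); the class `InstrumentRecord`, the function `activate_for_run` and the in-memory representation are hypothetical and only illustrate the idea of switching instruments on or off per run without editing their constants.

```python
from dataclasses import dataclass, replace
from typing import Iterable, List, Set, Tuple


@dataclass(frozen=True)
class InstrumentRecord:
    identifier: str              # metadata identifier, e.g. "p0_h0"
    instance: int                # instance identifier
    active: bool                 # activation flag
    values: Tuple[float, ...]    # instrument constants


def activate_for_run(instruments: Iterable[InstrumentRecord],
                     needed: Set[Tuple[str, int]]) -> List[InstrumentRecord]:
    """Return copies whose flag is active only for the (id, instance) pairs needed."""
    return [
        replace(rec, active=(rec.identifier, rec.instance) in needed)
        for rec in instruments
    ]


# Constants shared by all runs: reference pressure (mBar) and height (m).
SHARED_FILE = [
    InstrumentRecord("p0_h0", 51, True, (1024.01, 0.0)),
    InstrumentRecord("p0_h0", 52, True, (1023.99, 0.0)),
]
```

A run that only uses the barometer with instance 51 would call `activate_for_run(SHARED_FILE, {("p0_h0", 51)})`; the shared records themselves are left untouched.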
Concerning mathematical models, there are no data for these; in fact, ASTROLABE models are mathematical equations implemented by TDSs. The only information related to mathematical models that is included in the data model is their metadata (see 
Section 3.5 and 
Figure A4).
  3.4. Data: Observation Equations
As already discussed in 
Section 2 and shown in 
Figure 1, the observation equation relates all the intervening data entities needed to derive a new output state (models, observations, states and instruments).
The first attempt to define the data model for observation equations directly mirrors this definition:
        
Note that the identifiers used to refer to states, observations and instruments are instance identifiers (see 
Section 3.1). This means that if data coming from different instances of identical sensors are available, it is possible to refer individually to each of these sensors in the observation equation.
For instance, assuming an observation equation relating one model to two observations and one state whose respective identifiers were compute_position (the model), GPS and IMU (the observations) and position (the state), the observation equation would roughly look like this:
		
If two identical GPS receivers were used to collect data, using directly the metadata identifiers to characterize the observation equation records would not leave room for multiplicity. On the contrary, the use of instance identifiers solves this problem. Assuming that the instance identifiers are 20 and 21 (GPS receivers 1 and 2) and 30 (IMU) and 10 (state), the following observation equations would involve, respectively, the first and second GPS receiver:
		
As seen in the example above, the cardinality of the different lists of instance identifiers may be zero in some cases. The model identifier, however, must always be present.
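The structure discussed so far can be sketched as a small record type. The identifiers (compute_position, instances 20 and 21 for the two GPS receivers, 30 for the IMU and 10 for the state) are those of the running example; the class name `ObservationEquation`, its field names and the decision to reject an empty model identifier in the constructor are assumptions of this sketch, not part of the specification.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ObservationEquation:
    model: str                                             # always required
    observations: List[int] = field(default_factory=list)  # instance identifiers
    states: List[int] = field(default_factory=list)        # instance identifiers
    instruments: List[int] = field(default_factory=list)   # may be empty

    def __post_init__(self) -> None:
        if not self.model:
            raise ValueError("the model identifier must always be present")


# One equation per GPS receiver; both share the IMU (30) and the state (10).
eq_gps1 = ObservationEquation("compute_position", observations=[20, 30], states=[10])
eq_gps2 = ObservationEquation("compute_position", observations=[21, 30], states=[10])
```

Because the lists hold instance identifiers rather than metadata identifiers, the two equations differ only in which receiver they involve.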
Extra items must be added, as happens with observations, states and instruments themselves. An event tag, an activation flag and a time stamp must be added to complete the data model for this entity. Their purpose is described in 
Section 3.2; although that section talks about observations, the explanations given there apply here as well, so they are not repeated. In this context, however, the activation flag may be interpreted as a quick and practical way to avoid the (erroneous) computation of states when a significant number of the observations involved are clearly wrong.
		
Table 2 briefly describes all the elements making up an observation equation.
 The explicit inclusion of the model identifier opens the way to the use of different kinds of models in structurally identical observation equations. Revisiting the example above, the model used to derive the new state may be changed while involving exactly the same lists of observations and states:
		
One positive consequence of this feature is that researchers may test new algorithms simply by changing the model identifier used to derive new states in observation equations.
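One way a TDS could exploit this is a lookup table from model identifiers to implementations, so that swapping the identifier swaps the algorithm while the inputs stay the same. The sketch below is purely illustrative: the identifiers “model_a” and “model_b”, the placeholder computations (mean and median) and the function `derive_state` are hypothetical and stand in for real mathematical models.

```python
from typing import Callable, Dict, Sequence

Model = Callable[[Sequence[float]], float]


def mean_model(values: Sequence[float]) -> float:
    """Placeholder model: arithmetic mean of the input values."""
    return sum(values) / len(values)


def median_model(values: Sequence[float]) -> float:
    """Placeholder model: median of the input values."""
    ordered = sorted(values)
    mid = len(ordered) // 2
    if len(ordered) % 2:
        return ordered[mid]
    return 0.5 * (ordered[mid - 1] + ordered[mid])


# Hypothetical registry: model identifier -> implementation.
MODELS: Dict[str, Model] = {"model_a": mean_model, "model_b": median_model}


def derive_state(model_id: str, observation_values: Sequence[float]) -> float:
    """Dispatch on the model identifier found in the observation equation."""
    if model_id not in MODELS:
        raise ValueError(f"unknown model identifier: {model_id}")
    return MODELS[model_id](observation_values)
```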
Figure 5 includes three examples of observation equations in ASTROLABE XML syntax. All of them may be easily identified by the opening tag <o>, which represents the observation equation event tag. The activation flag is materialized by means of the “s” attribute (whose values, “a” and “r”, stand for “active” and “inactive” or “removed” respectively). Note that the equation at line 3 has no explicit activation flag: it is assumed active by default.
 The observation equation identifiers (attribute “id”) point to the respective models involved in the equation, namely “pva1_d”, “imu1_bias_d” and “height_update”. After these, the time tag (124.88 in all cases) and the lists of observation, state and instrument instance identifiers are included. The metadata for these equations may be found in the example in 
Figure A4 using the model identifiers (id) shown in 
Figure 5. (ASTROLABE metadata are discussed in 
Section 3.5). Such metadata state that, for example, model “height_update” points to 
one observation whose identifier is “baro1”, 
two states (“baro1” and “pva1”) and just 
one instrument (“p0_h0”). This implies that a total of 
four instance identifiers must be present in the observation equation, as shown in line 3 of 
Figure 5: instance identifier 27 for the observation, 34 and 32 for the states and 51 for the instrument.
See 
Figure A1 for a complete ASTROLABE XML example showing both observations and observation equations.
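The cardinality rule just described for “height_update” (one observation, two states and one instrument, hence four instance identifiers) lends itself to a simple consistency check. In the sketch below the metadata are held in a plain dictionary and the instance identifiers are assumed to appear in the order observations, states, instruments, each list following the order of the metadata; the names `MODEL_METADATA` and `split_instances` are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

# Metadata for one model, as described for "height_update" in the text.
MODEL_METADATA: Dict[str, Dict[str, List[str]]] = {
    "height_update": {
        "observations": ["baro1"],
        "states": ["baro1", "pva1"],
        "instruments": ["p0_h0"],
    }
}

KINDS = ("observations", "states", "instruments")


def split_instances(model_id: str, instance_ids: Sequence[int],
                    metadata: Dict[str, Dict[str, List[str]]]
                    ) -> Dict[str, List[Tuple[str, int]]]:
    """Pair each instance identifier with the metadata identifier it refers to."""
    spec = metadata[model_id]
    expected = sum(len(spec[kind]) for kind in KINDS)
    if len(instance_ids) != expected:
        raise ValueError(
            f"{model_id} needs {expected} instance identifiers, "
            f"got {len(instance_ids)}"
        )
    result: Dict[str, List[Tuple[str, int]]] = {}
    start = 0
    for kind in KINDS:
        count = len(spec[kind])
        result[kind] = list(zip(spec[kind], instance_ids[start:start + count]))
        start += count
    return result
```

Applied to the third equation of Figure 5, the instance identifiers 27, 34, 32 and 51 are split into one observation, two states and one instrument.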
  3.5. Metadata
Section 3.2, 
Section 3.3 and 
Section 3.4 describe the structure of the different data entities involved in ASTROLABE’s data model. For this definition to be rigorous, some additional aspects related to these entities must be specified. These aspects constitute the metadata.
 Observations and states share the same metadata structure, that is, the same kind of facets are specified. For this reason, these will be described below within a common frame. Instruments and models, on the contrary, are a case apart, so their metadata will be specified separately.
The main items constituting the metadata for observations and states are the following:
		
- Type and identifier. See Section 3.1 for a detailed explanation of type and identifier codes. 
 
- Toolbox. Name of the software module—typically, a Dynamic Link Library (DLL, Windows environment) or Shared Library (Linux environment)—including an implementation of the logic related to the observation or state. The way this name is used by the underlying operating system to find the library is not defined by ASTROLABE. 
- Dimension. Number of elements in the observation/states expectations vector. 
- Referencing. The reference frame plus coordinate system to which data refer are specified here. Alternatively, it is possible to define a coordinate reference frame instead. 
- Units. Specifies the units of the individual elements of the expectations vector for observations or states. 
- Covariance matrix. Default covariance matrix for the expectations vector for observations and states. The units of the covariance matrix are the same as those provided for the expectations vector, even in the default case. ASTROLABE accepts “reduced” or “full” covariance matrices. A reduced covariance matrix consists of just standard deviations, assuming zero correlations. This is an optional field. 
- Scale factors. This optional field contains a list of positive scale factors for the standard deviations included in the covariance matrix. The number of elements in this list equals the dimension of the expectations vector. Scale factor values of 1 are assumed when this field is not present. 
- Tags (auxiliary values) characterization. Since observations or states may optionally be accompanied by tags (Section 3.2), it is necessary to characterize these. The way to do it is to define:
  - The number of tags related to the observation or state (dimension).
  - For each of these tags, the referencing data (again, reference frame plus coordinate system or coordinate reference frame), as well as the units in use. 
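The relation between a reduced covariance matrix, its scale factors and the corresponding full matrix can be sketched as below. The sketch assumes that each scale factor multiplies the matching standard deviation before squaring, which is one plausible reading of the description above; the function name `full_covariance` is hypothetical.

```python
from typing import List, Optional, Sequence


def full_covariance(std_devs: Sequence[float],
                    scale_factors: Optional[Sequence[float]] = None
                    ) -> List[List[float]]:
    """Expand standard deviations into a diagonal covariance matrix."""
    n = len(std_devs)
    if scale_factors is None:
        scale_factors = [1.0] * n          # default when the field is absent
    if len(scale_factors) != n:
        raise ValueError("one scale factor per element is required")
    if any(s <= 0.0 for s in scale_factors):
        raise ValueError("scale factors must be positive")
    scaled = [s * sd for s, sd in zip(scale_factors, std_devs)]
    # Zero correlations are assumed, so only the diagonal is populated.
    return [[scaled[i] ** 2 if i == j else 0.0 for j in range(n)]
            for i in range(n)]
```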
 
Figure A2 shows how three different observations are defined (lines 1–29, 30–53 and 54–77), using the ASTROLABE XML syntax. Note the XML tags used to describe the metadata elements: <l_spec> starts the definition of an observation; <type> is used to input the observation’s type code. The identifier is specified by means of the <id> tag, which is embedded in a higher-level structure named <lineage>. Toolbox codes are specified by means of the <toolbox> tag, while reference frames and coordinate systems are described by means of the (<ref>, <ref_frame_VC>, <coor_system_VC>) triplet. For units, the <units> tag is used, <c> is for covariance matrices and <s> is for their scale factors. The tags are characterized by means of the <t_spec> XML tag, which in turn uses other ones already defined.
 Figure A3 shows the specification of three states (lines 78–94, 95–119 and 120–137). (This figure also depicts the specification of an instrument, and it will be discussed below.) Note that, as stated before, observation and states are equivalent from the structural standpoint, so the XML tags used to describe these are almost the same. The only difference is the XML tag <p_spec> (opening the definition of a state).
 Instruments are characterized as follows:
		
Figure A3 includes the definition of an instrument (see lines 138–171). Most of the XML tags used in this example should now be obvious (as <id>, <toolbox>, <units>, <c> or <s>, already commented when describing observation and state metadata). Instruments are described by means of the <i_spec> tag. Once more, their types are input by means of the <type> tag. <c_list> is used to describe the list of constants; each constant is specified by means of the <item> tag, where units, covariances, scale factor and referencing information may be detailed.
 The following are the most important fields constituting the metadata characterizing a model.
        
- Type, identifier and toolbox. Same meanings as for observation/state/instrument metadata. See Section 3.1 for details on type and identifier codes. 
 
- List of observations. The set of identifiers of the observations involved in the model. 
- List of states. The set of identifiers of the states involved in the model. The role played by each state is also specified. These roles may be either free (the TDS will be responsible for estimating its value) or constant (an input, immutable value will be provided for the state). 
- List of instruments. The set of identifiers of the instruments involved in the model. This list is optional, since some models may work without the need of instrument constants. 
- List of sub-models. A model may rely on other models to perform its task. This optional list includes their identifiers. 
In 
Figure A4 three models are defined (each one starting with an <m_spec> tag). The lists of observations and states are input by means of the <l_list> and <p_list> tags respectively. When present—these are optional—instrument and sub-model lists are specified by means of the <i_list> and <sub-m_list> tags. All lists are made of items (XML tag <item> in all cases) providing the identifiers (tag: <id>) of the involved observations, states, instruments or sub-models. State items also describe their role by means of the <role> tag.
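A reader of such model metadata could be sketched as follows. The tag names <m_spec>, <l_list>, <p_list>, <i_list>, <sub-m_list>, <item>, <id> and <role> are those mentioned above; the sample fragment itself, the placement of the model identifier inside <lineage> (by analogy with observations) and the roles assigned to the two states are invented for illustration and do not reproduce Figure A4.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

# Hypothetical fragment; the real layout is the one shown in Figure A4.
SAMPLE_M_SPEC = """
<m_spec>
  <lineage><id>height_update</id></lineage>
  <l_list><item><id>baro1</id></item></l_list>
  <p_list>
    <item><id>baro1</id><role>free</role></item>
    <item><id>pva1</id><role>constant</role></item>
  </p_list>
  <i_list><item><id>p0_h0</id></item></i_list>
</m_spec>
"""


def _ids(root: ET.Element, list_tag: str) -> List[str]:
    """Identifiers of all items in one list; an absent list yields []."""
    node = root.find(list_tag)
    if node is None:
        return []
    return [item.findtext("id", default="").strip()
            for item in node.findall("item")]


def parse_model_spec(xml_text: str) -> Dict[str, object]:
    root = ET.fromstring(xml_text.strip())
    roles = {
        item.findtext("id", default="").strip():
            item.findtext("role", default="").strip()
        for item in root.findall("p_list/item")
    }
    return {
        "id": root.findtext("lineage/id", default="").strip(),
        "observations": _ids(root, "l_list"),
        "states": _ids(root, "p_list"),
        "instruments": _ids(root, "i_list"),    # optional list
        "sub_models": _ids(root, "sub-m_list"),  # optional list
        "roles": roles,
    }
```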
It is worth mentioning that the identifiers of observations, states and instruments referenced in these lists correspond to those found in the examples included in 
Figure A2 and 
Figure A3. These, together with 
Figure A4 make a complete example of ASTROLABE metadata.
Note that metadata for models describe the model structure and not the logic needed to derive new states out of the involved inputs—this is the task of TDSs themselves. The model type and identifier are the pointers which such software must use to ascertain both the specific mathematical model and the kind of data intervening in the computation of new states.
  5. The ASTROLABE C++ Library
The CTTC has implemented a portable, object-oriented C++ library including reader and writer (file interface) and sender and receiver (network interface) classes. It offers all the necessary tools to process ASTROLABE data and metadata.
Specific classes have been provided to manage each available data format (as for instance, text or binary data files, see 
Section 4.1), so a client software module may use the specific implementation for each of these formats if desired. However, the use of external ASTROLABE header files describing how and where data are to be found opened the way to a pair of generic classes (a reader or receiver, and a writer or sender) able to manage all the available formats in a completely transparent way.
For instance, the generic reader class uses the information found in the ASTROLABE header file to ascertain how and where actual data are stored. Once this is determined, it instantiates (transparently) the appropriate specific reader class. The advantage of using such generic reader lies in the fact that client modules do not need to be aware of the details concerning how data are stored; thus, code is simpler and valid for any available format.
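The idea of a generic reader that delegates to a format-specific one can be sketched as a small factory. This is not the API of the ASTROLABE C++ library: the header is modelled as a plain dictionary, and the keys “format” and “values_per_record”, the classes `TextReader` and `BinaryReader` and the function `open_reader` are all hypothetical.

```python
import struct
from typing import Dict, Iterator, List


class TextReader:
    """Reads one whitespace-separated record per line."""

    def __init__(self, data: bytes, header: Dict[str, str]) -> None:
        self._lines = data.decode("ascii").splitlines()

    def __iter__(self) -> Iterator[List[float]]:
        for line in self._lines:
            if line.strip():
                yield [float(v) for v in line.split()]


class BinaryReader:
    """Reads fixed-size records made of big-endian doubles."""

    def __init__(self, data: bytes, header: Dict[str, str]) -> None:
        self._data = data
        self._n = int(header["values_per_record"])

    def __iter__(self) -> Iterator[List[float]]:
        size = 8 * self._n
        for start in range(0, len(self._data) - size + 1, size):
            chunk = self._data[start:start + size]
            yield list(struct.unpack(f">{self._n}d", chunk))


READERS = {"text": TextReader, "binary": BinaryReader}


def open_reader(header: Dict[str, str], data: bytes):
    """Pick the specific reader named in the header; callers never see which."""
    if header.get("format") not in READERS:
        raise ValueError("unsupported format in header")
    return READERS[header["format"]](data, header)
```

Client code only iterates over whatever `open_reader` returns, so the same loop works for both storage formats.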
Another feature, available only in the file interface of the ASTROLABE C++ library, is the ability to read files in either forward or backward direction. This feature has been included since TDSs working in post-processing mode usually generate trajectories in three steps: (1) computing the output in the forward direction (that is, from its beginning to its end); (2) computing it in the backward direction; and, finally, (3) filtering the two previous results to obtain the final output. Processing backwards implies the need to read data in the backward direction, so adding this feature to the library facilitates the development of client software modules.
Performance has also been addressed in this library. All readers and writers use a technique known as “buffered reading (writing)” to speed up the process. It is worth mentioning that NAVEGA doubled its performance when the ASTROLABE library replaced the former readers and writers, which did not implement this technique; no other changes were made to the tool. This is a noticeable improvement, especially when post-processing long datasets covering several hours of data, which might take an hour or more of elapsed time to process.
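Backward reading and buffering can be combined, as the following sketch suggests for a file of fixed-size binary records. It is only an illustration of the two techniques under simplifying assumptions (fixed record size, seekable stream) and does not describe how the ASTROLABE library implements them; the function name `read_records_backward` and its parameters are hypothetical.

```python
import io
import struct
from typing import BinaryIO, Iterator


def read_records_backward(stream: BinaryIO, record_size: int,
                          buffer_records: int = 1024) -> Iterator[bytes]:
    """Yield fixed-size records from last to first, one buffer at a time."""
    stream.seek(0, io.SEEK_END)
    end = stream.tell()
    pos = end - end % record_size           # ignore a trailing partial record
    while pos > 0:
        count = min(buffer_records, pos // record_size)
        pos -= count * record_size
        stream.seek(pos)
        chunk = stream.read(count * record_size)   # one read per buffer
        for i in range(count - 1, -1, -1):         # newest record first
            yield chunk[i * record_size:(i + 1) * record_size]
```

Each buffer is fetched with a single read, so the number of I/O calls is roughly the number of records divided by `buffer_records`.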
The network interface always transmits binary data for efficiency reasons. However, the representation (and interpretation) of binary information may differ depending on the architecture of the processor managing it, which becomes a problem when data are transferred from one computer to another via a network connection. To avoid such problems, the ASTROLABE library encodes and decodes all data using the XDR (eXternal Data Representation) standard [
41]. This encoding/decoding is performed transparently.
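To make the role of XDR concrete, the sketch below encodes a list of double-precision values as an XDR variable-length array: a 4-byte big-endian element count followed by 8-byte IEEE 754 doubles in big-endian order. It only illustrates the XDR primitives; how ASTROLABE arranges its network messages is not described here, and the function names are hypothetical.

```python
import struct
from typing import List, Sequence


def xdr_encode_doubles(values: Sequence[float]) -> bytes:
    """Encode as an XDR variable-length array of doubles."""
    # ">" selects big-endian byte order without any native padding.
    return struct.pack(f">I{len(values)}d", len(values), *values)


def xdr_decode_doubles(payload: bytes) -> List[float]:
    """Decode a payload produced by xdr_encode_doubles."""
    (count,) = struct.unpack_from(">I", payload, 0)
    return list(struct.unpack_from(f">{count}d", payload, 4))
```

Because the byte order is fixed by the encoding, sender and receiver obtain the same values regardless of their native architectures.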
  6. Use Cases. Sensors Supported. Sensors Not Supported
The ASTROLABE data model has been put to the test in real use cases over the years, in several projects involving a variety of sensors. Such exposure to real-life problems made the original data model evolve into what it is now. Obviously, not all the concepts described in this paper were implemented from the very beginning; the proper handling of metadata is one example.
So far, the NAVEGA/ASTROLABE tandem has been able to deal with several kinds of sensors and their related observations. 
Table 4 summarizes a subset of the projects where NAVEGA has been used, including the sensors and related observation types that were involved in these projects. The interested reader will find more details about one of these projects, GAL, in 
Appendix B.
In general, ASTROLABE will be able to incorporate any kind of sensor providing observations whose error distributions are fully defined by their first and second moments (see 
Section 2). This is so because the structure (model) of an ASTROLABE observation (see 
Section 3.2) includes an expectations array plus a covariance matrix, which is all that an observation with such kind of error distribution needs to be properly modeled.
This is true at least from the data standpoint, that is, ASTROLABE’s. Obviously, and assuming that it is possible to model some kind of observation using ASTROLABE’s approach, it will also be necessary to devise the proper mathematical model(s) for such an observation to be useful from the TDS software’s point of view. In short, it is necessary to provide the appropriate mathematical machinery to transform the input observation into output states. That said, this is not a problem of the ASTROLABE data model at all, since it takes care of data, not algorithms.
The other side of the coin is that observations with error distributions that cannot be fully defined by their first two moments may not be modeled using ASTROLABE. For example, map-matching contexts are usually affected by this kind of problem. See for instance [
48] for a description of a situation where the constraints set by the environment (walls) make observations exhibit other types of error distributions.
  7. Conclusions and Outlook
The ASTROLABE data model, file and network specifications and successive implementations of the C++ library have been put to the test for several years now. This includes a variety of European projects where different kinds of sensors had to be incorporated (see 
Table 4 for details). Being conceived as a generic and extensible system, the addition of such new sensors posed no insurmountable problems for the devised abstraction. Needless to say, this continuous exposure to real-life conditions served to improve the data model progressively. It is possible to state that ASTROLABE has been, at least so far, able to manage change, evolution and innovation in the field it is specifically targeted at: data representation in the context of TDSs dealing with observations whose error distributions are fully characterized by the first and second moments. This is of special importance, since the decision to develop a specific data model instead of using existing, mature ones such as those described in [
32,
33] entailed a non-negligible risk.
The ASTROLABE data model and its two specifications (file and network) opened the path to the development of a C++ library exposing a very terse API and a high level of abstraction. Third-party software modules accessing ASTROLABE data by means of this library may use very high-level code that is independent of details like how and where data are stored (or transmitted).
The development of the ASTROLABE data model, specifications and library went hand in hand with the implementation of CTTC’s TDS, NAVEGA. NAVEGA was designed using the same principles on which ASTROLABE relies, that is, it is a generic and extensible software tool mirroring internally the concepts present in the data model. This correspondence notably eased its development and, as expected, reduced to a minimum the maintenance tasks required to cope with change. Other TDSs may also benefit from the characteristics of the ASTROLABE specification by incorporating the library and its features related to data input/output and metadata management.
ASTROLABE is not yet complete. Although both the specification and the C++ library are very close to their final versions, some work remains to be done; for instance, no proper metadata to define the coordinate reference frame for time stamps exist yet. The foreseen work for the coming months will serve to address these minor issues.
The CTTC is considering putting the ASTROLABE specification as well as the portable C++ library in the public domain as soon as the work on these is finished. For more details, contact the authors.