This study addresses some of the problems that historians experience when approaching knowledge of the past, together with tested solutions. Following the principles of constructed past theory, published recently [1], we aim to introduce and discuss the validity of our information system to gather and exploit historical data, and the underpinning concepts of our methodological approach. Our main goal is to strengthen the chances of building the historical discourse on a scientific basis, taking into account the risks of bias and ideological implication inherent in Social Sciences and Humanities (SSH), and describing our methodology as a measure to mitigate those risks.
Indeed, the narration of the past offered by historians—the term is understood here in the broadest possible sense, including archaeologists, palaeographers, anthropologists, philologists, and all scholars dealing with the past in some way at some point of their research—is a construction built from what is left, a collection of remains of different nature and kind. The uncovering of these remains and their articulation within a creative process in no way compromises the ontology of the past itself —that it did happen [2
] (pp. 591–592). Therefore, the past is represented, mediated by witnessing or speaking for it in its absence and connecting it with contemporary understanding.
Nowadays, so-called ‘digital humanities’ offer new ways to develop and disseminate humanistic research. Their greatest interest, however, lies in the new opportunities to approach research in SSH in a different way: not only faster, but more effective in terms of data gathering and exploitation, and hence capable of transforming investigation itself by asking new and more complex questions. Surprisingly, history as a discipline does not seem to have been the most enthusiastic participant in this digital turn, and has occasionally become caught up in endless debates about the usefulness of digital tools themselves. However, some historians are too committed to the reliability and traceability of past construction to disregard the opportunities digital humanities offer in this domain, even when their digital skills have developed through inquisitiveness and everyday use rather than a regular training program [3
The fact is that—beyond the conceptual changes—Digital Humanities actually made us face a change of paradigm in the processes of historical research in which:
Available tools allow us to deal with massive datasets, some of which were disregarded until recently as marginal or non-significant.
Interdisciplinary teamwork is key to building a richer, fairer, and more precise construction of the past.
Information sources of diverse origin and nature must be integrated within a cross-disciplinary perspective.
Open datasets become compulsory within a new research Open Science framework.
All these new scenarios require new management of information flows, which calls for the theoretical definition and practical development of information systems for safe and efficient information management and good research practices. Building on the main research developed in recent years and summarized in the following section, our main goal within this paper is to propose a specific ontology-mediated data modeling and a research information system (RIS) built accordingly. Regardless of the origin or nature of information sources that can be considered a vestige or a reflection [1] (p. 14) of the past, it is possible to exploit them within a shared code, and so we aim at offering our conceptual proposal and some examples of successful application. One of these examples is the implementation of an Archaeological application of the Research Information System (ARIS), called SigArq.
2. State-of-the-Art: Constructed Past Theory and Theoretical Approaches from Archaeology and Record Management
Thibodeau’s Constructed Past Theory (CPT) [1] has shed new light on the conceptualization of past construction. The idea is not new, particularly in archaeological theory, where material vestiges never speak for themselves and it is the archaeologist who must give them their significance [4]. Figure 1 provides a synthesis of CPT, wherein the Constructed Past is the final product of a process in which a Target Past evolves during the In-Progress Construction process according to the Intentional Domain. This comprises the Intent of Construction and the Sphere of Interest, both classes being determined by the researcher.
Intent of Construction and Sphere of Interest are interdependent classes, as the first shapes the process and its results and the second specifies the period under investigation and what is of interest. Therefore, the final construction of the past should satisfy the intent of construction and be about the sphere of interest. Four subclasses are involved within the Sphere of Interest: Entity, Event, Process, and State of Affairs. An ‘Entity’ is something, whether conceptual or physical, that existed and that had, at least, one inherent and persistent property. An Event, in contrast, is a change in an Entity. What changes and the nature of the change are two defining properties of the Event. Several events aggregated as steps may define a Process. Finally, Thibodeau defined the State of Affairs as a set of one or more assertions, all of which are true for the same chronological period and concern the same or related objects that are either instances of the Entity, Event, or Relationship, or their subclasses.
Entities and Events have an Involvement relationship, also expressed as a class, which includes four subclasses: Participant, Observer, Altered, and Instrument, according to the nature of the change undergone by the Entity during the Event. When human action comes into play, it introduces a subclass of Event: Action. Thus, an Action is an Event in which human beings have an active role as participants.
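The classes described above lend themselves to a small data-structure sketch. The Python below is only an illustration under stated assumptions: the names `Entity`, `Event`, `Process`, `Involvement`, `is_human`, and `is_action` are hypothetical and are not taken from CPT or from the RIS, and State of Affairs is left out for brevity. It shows how an Action can be derived as an Event in which a human being is a Participant.

```python
from dataclasses import dataclass, field
from enum import Enum


class Involvement(Enum):
    """The four Involvement subclasses named in the text."""
    PARTICIPANT = "participant"
    OBSERVER = "observer"
    ALTERED = "altered"
    INSTRUMENT = "instrument"


@dataclass(frozen=True)
class Entity:
    name: str
    is_human: bool = False  # assumption: a simple flag marks human beings


@dataclass
class Event:
    what_changes: str      # first defining property of an Event
    nature_of_change: str  # second defining property of an Event
    involved: list[tuple[Entity, Involvement]] = field(default_factory=list)

    def is_action(self) -> bool:
        """An Action is an Event with a human being as active participant."""
        return any(entity.is_human and role is Involvement.PARTICIPANT
                   for entity, role in self.involved)


@dataclass
class Process:
    steps: list[Event]  # several events aggregated as steps


# Hypothetical example data
mason = Entity("mason", is_human=True)
wall = Entity("wall")
building = Event("wall", "construction",
                 [(mason, Involvement.PARTICIPANT), (wall, Involvement.ALTERED)])
collapse = Event("wall", "collapse", [(wall, Involvement.ALTERED)])
life_of_wall = Process([building, collapse])
```

In this sketch, Action is computed from the Involvement roles rather than declared as a separate subclass; a real ontology might model it either way.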
The Involvement relationship between Entity and Event, and the role of humans within it, is particularly interesting to us because our recent research on ontology-mediated data modeling fits well within Thibodeau’s CPT as a practical case and a proposal for further development, as we will show in the following sections. Any In-Progress Construction of the past under a given Intentional Domain uses Construction Materials, which may be Item or Composite, Vestiges or Reflections. Vestiges are objects that existed in the past and survive the time of the sphere of interest, and Reflections are information objects produced in the course of construction. Being archaeologists ourselves, we have been deeply concerned with the selection and use of vestiges within Historical Science, as material archaeological evidence has sometimes been disregarded in favour of Archival Capital or written sources in general. Fortunately, this perception has changed in recent times, particularly for the medieval and post-medieval periods. Medieval archaeology, especially under the methodological perspective of landscape archaeology, has contributed to the material turn [6] in Archaeology, Anthropology, History, and other past-constructing disciplines, even though this turn has not always been free of debate [7
Apart from Thibodeau’s recent abstract approach—as stated by the author [1
] (p. 16)—few references address the problem of ‘managing information’ in practical terms in order to build an integrated historical discourse dealing at once with sources of information from many different origins and supports [10
]. Some of them are still highly theoretical [11
] or archaeology-based [12
], and, frequently, specific research about ontology-based data management only addresses historical problems tangentially [13
]. Indeed, we are indebted to the archaeological theory and practice in the development of our data modeling, particularly regarding landscape archaeology [15
]. Recently, considerable efforts have been made to develop common standards that ease data exploitation and empower digital research in History, Language Studies, Cultural Heritage, Archaeology, and related fields in the domain of Social Sciences and Humanities (SSH). The already completed PARTHENOS project [18] was a good example of these efforts, and it hosted an archaeological experience of research infrastructure implementation called ARIADNE [19]. Through this project, based on archaeological data standardization, and other subsequent experiences, the archaeological domain has developed data integration practices through metadata introduction by means of controlled vocabularies.
Despite this, most of these experiences, particularly in the archaeological discipline, tackle data standardization in order to make the results of research exchangeable under FAIR conditions, but very few suggestions are provided to deal with the FAIR character of basic research itself. Before reaching exportable results, Historians in general and Archaeologists in particular are file generators, and our aim is to introduce and share the way in which we deal with the records produced from the very beginning of our research until the final product or report is obtained. Hence, the Research Information System (RIS) introduced below and its practical cases are part of an interdisciplinary proposal built upon the basis of Records Management [20
] and Conceptual Modeling [22
]. Our RIS proposal and its practical application through the SigArq ARIS can be used as a domain solution or an example of International Research on Permanent Authentic Records in Electronic Systems (InterPARES). In fact, most of the design of the system and of its File Management Classification Chart, which results from identifying research processes and responding to them, follows the InterPARES methodological approach [23
] (pp. 6–7). A. Mauri [24
] set the conceptual basis of our Research Information System (RIS) as part of his MSc and PhD dissertations and applied it to the study of the County of Barcelona in the Early Medieval Period [25
] (pp. 103–384). Although some experiences in historical data management and computing have been known since 2005 [26
], most of them deal almost exclusively with written evidence while data integration experiences through interoperable minimum information units are rare. Mauri’s study [25
] built an integrated construction of the past for the first time, gathering information from different sources regardless of their origin or nature. The concepts of ‘Unit of Topography’ (UT) and ‘Actor’ (Ac) were then defined as minimal ‘Units of Information’ (UI) of historical knowledge, and the RIS was built accordingly [28
]. Further crosscutting research used those information units at the basis of archaeological data management in connection with E. C. Harris’ concept ‘Unit of Stratigraphy’ (US), and P. Del Fresno conceptualized the main structure of his Archaeological Research Information System (ARIS) [29
] as advanced development of the original RIS. More recently, the concepts of UT and Ac were applied to mercantile accounting books from the 15th Century AD and the RIS was improved [31
More than 20 years after its first conceptualization, we are now in a good position to review the system structure and to offer, through different examples, a more complete and tested proposal according to its underpinning concepts and ontology. In Section 3 and Section 4, the main concepts are defined, their relationships established, and the software applications described. Our RIS is nevertheless still a tool under construction, and our contribution also aims at exchanging our thoughts and perceptions with other researchers in order to build a better and more functional information system.
4. Results: Experiences of Past Construction and Tool Development
Data gathering and exploitation based on UT and Ac as minimal units of information linked through values and relationships led to a more precise knowledge of the County of Barcelona [25
]. Past Constructed under those principles was generally accepted as a reflection—in Thibodeau’s terms [1
] (p. 14)—and used in further research [10
]. Research developed in this framework particularly reflects the chances of data integration from written sources, ethnographic approaches, field archaeology, and material sciences [47
] (pp. 125–129,143).
Nevertheless, the most complete application of our methodological approach is the conceptualization and development of the Archaeological RIS SigArq, although some other experiences are currently under construction [31
]. Figure 5
shows the general design of this ARIS as a system of subsystems in which different existing computing tools are independent, interconnected, and distributed along its architectural levels. All these devices are concomitant to the ARIS as long as they support a transversal management of the heritage, concerning both their agents and subjects. The ARIS is obviously web-supported and allows all subsystems to be interconnected.
In order to make this interconnection possible and according to the integration of archaeological record within the principles of Records Management [48
], the documentation and analysis of vestiges follow a normalized protocol of data collection summarized in a file classification chart. These are supported within SigArq Geographic Information System, which allows researchers to gather all informative dimensions of US in one single database.
4.1. Archaeological Information System: Classification Chart as File Management Tool
Building the archaeological record according to the principles of Records Management implies identifying the main processes in research from the creation of the file production context until the delivery of the final report attesting the research project completion, as suggested by the InterPARES’s methodological approach [23
] (pp. 6–7). These processes originate items which are ordered and hierarchized in files and series accordingly. Building the archaeological record, in particular, produces several files that are representative of the different dimensions of US as a minimal information unit. These dimensions are:
Descriptive dimension: US description includes information about its class (deposit, structure or interface), definition (e.g., layer, wall, hole, etc.), natural or anthropic origin, and interpretation (e.g., construction, destruction, use, erosion).
Graphic dimension: Photographic register records the graphic dimension of US through different pictures (aerial, general, detail) that are stored and identified through normalized metadata.
Cartographic dimension: As stated in archaeological method, a single-layer plan for every US records the basic cartographic data by drawing the boundary contours of the US and placing some evenly distributed elevations on the plotted area [35] (pp. 95–104).
Temporal dimension: Defining the stratigraphic sequence is the way to record the temporal dimension of the US, according to the physical relations between them [35] (pp. 34–39).
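The four dimensions listed above can be pictured as one record per US. The following Python is a minimal sketch, assuming hypothetical field names (`us_class`, `photos`, `plan_points`, `lies_above`) that are not taken from SigArq; it simply checks which dimensions are still missing before a US can be considered fully recorded.

```python
from dataclasses import dataclass, field
from typing import Optional

US_CLASSES = {"deposit", "structure", "interface"}  # classes named in the text
US_ORIGINS = {"natural", "anthropic"}


@dataclass
class StratigraphicUnit:
    number: int
    # Descriptive dimension (always required in this sketch)
    us_class: str
    definition: str
    origin: str
    interpretation: str
    # Graphic dimension: picture file names (hypothetical)
    photos: list[str] = field(default_factory=list)
    # Cartographic dimension: contour and elevation points as (x, y, z)
    plan_points: list[tuple[float, float, float]] = field(default_factory=list)
    # Temporal dimension: numbers of the US physically below; None = not recorded
    lies_above: Optional[list[int]] = None

    def __post_init__(self) -> None:
        if self.us_class not in US_CLASSES:
            raise ValueError(f"unknown US class: {self.us_class}")
        if self.origin not in US_ORIGINS:
            raise ValueError(f"unknown origin: {self.origin}")

    def missing_dimensions(self) -> list[str]:
        """Names of the dimensions that have not been recorded yet."""
        missing = []
        if not self.photos:
            missing.append("graphic")
        if not self.plan_points:
            missing.append("cartographic")
        if self.lies_above is None:
            missing.append("temporal")
        return missing


# Hypothetical example: a wall recorded so far only through its description
us12 = StratigraphicUnit(12, "structure", "wall", "anthropic", "construction")
```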
These four dimensions are part of the raw data gathering included in the first of the processes (building the archaeological record), as shown in Figure 6. Defining the methodological work processes precisely guides the researcher through project management and execution by means of different activities, series, and composite files [50], according to the classification chart shared by the authors and their team.
Data gathering is the first step of the research process. Each US is recorded according to its four dimensions (descriptive, graphic, cartographic, and temporal) by filling in a registration form, taking some pictures, drawing its plan, and recording the physical relation between different units. During this process, primary assessment of data takes place, as researchers select the correct and valid data that will enter the system. Primary management of the archaeological record consists of a two-step procedure: file classification and storage, and file description. The first step demands naming each item file according to the composite file or series it belongs to. Table 1 shows this normalized form. Classified and stored files are then described through metadata introduction. Metadata include general information such as site identification, fieldwork campaign, authorship, or origin; specific data such as trench identification and US number; and, finally, the information resulting from secondary assessment. Through this secondary assessment, researchers select those files that best represent the four dimensions of the US, and the next step, secondary management of the archaeological record, starts.
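The two steps of primary management (naming and describing each file) can be sketched as follows. This is a hedged illustration only: the naming pattern, the series code `PHO`, and the field names are hypothetical and do not reproduce the normalized form of Table 1; the sketch merely shows a name derived from the series and the metadata, and a selection flag set during secondary assessment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileMetadata:
    # General information
    site: str
    campaign: str
    author: str
    # Specific data
    trench: str
    us_number: int
    # Outcome of the secondary assessment (assumed to be a simple flag)
    selected: bool = False


def item_file_name(series_code: str, meta: FileMetadata,
                   sequence: int, extension: str) -> str:
    """Hypothetical pattern: SITE_CAMPAIGN_SERIES_USnnnn_nnn.ext"""
    return (f"{meta.site}_{meta.campaign}_{series_code}"
            f"_US{meta.us_number:04d}_{sequence:03d}.{extension}")


def representative_files(files: dict[str, FileMetadata]) -> list[str]:
    """File names kept after secondary assessment, in alphabetical order."""
    return sorted(name for name, meta in files.items() if meta.selected)


# Hypothetical example data
general = FileMetadata("SITE", "2020", "PDF", "T1", 12, selected=True)
detail = FileMetadata("SITE", "2020", "PDF", "T1", 12)
files = {
    item_file_name("PHO", general, 1, "jpg"): general,
    item_file_name("PHO", detail, 2, "jpg"): detail,
}
```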
Secondary management is where data exploitation and interpretation take place. Specific lab studies of archaeological materials, field survey results, exploitation of archival capital according to UT/Ac identification and registration, archaeology of architecture, and heritage preservation are interdisciplinary approaches that have a specific place within the ARIS structure. This will lead to the creation of documents that will be an integral part of the reflections presented as the final report. Within this research system, SigArq is, properly speaking, a web-supported platform built upon GIS software that allows us to gather and exploit data within a single application and, more importantly, that monitors data introduction into the system in order to avoid internal contradictions and minimize errors. This is possible by using normalized thesauri and guiding the user through the entire process according to the theoretical management of information established in the classification chart.
4.2. GIS and Archaeology. SigArq Tool Development
SigArq is an application still in development. It currently allows researchers to incorporate and manage the primary excavation record, exploit these data, and generate preliminary and final reports. Archaeological material databases are part of the ARIS but are not included in the application, since it is currently GIS-based software. Data introduction takes five consecutive steps that must be completed in order. Users are not allowed to skip any of them, as this would jeopardize the quality of the final record. These steps are:
US definition: US identification requires briefly defining what it is/was according to the materiality of the remains. A few examples of short definition are wall, filling, levelling layer, silo, pit, landslide, tomb, individual within a burial, etc.
Form Completion: Once the US is identified and defined, users are allowed to record US descriptive and temporal dimensions within a form. Although Description, Composition, and Interpretation are free-text entry input fields, fundamental attributes such as Origin or Type of US amongst others are single-choice input fields controlled through thesauri. Within the US form, the temporal dimension is introduced by means of recording the physical relation between the US under examination and those below. System crosscheck returns the relations with the US above and verifies the non-existence of contradictions in the stratigraphic sequence through a green-shaded status field. If contradictory data are introduced, they will be highlighted in red.
Picture upload: General and detailed pictures of each US, identified with metadata (as shown in Table 2) and stored in the appropriate ARIS series, attest the graphic dimension of the US. In this step, one selected picture file is uploaded into the application together with an XYZ-coordinate capture table.
Cartography production: The cartographic dimension is the last item recorded within the system. The previously uploaded XYZ-table is now used to produce the plan and elevation, reproducing the cartography of the US within the general excavation plan.
US Metrics and exploitation: Like any GIS software, SigArq has geometric functions enabled that calculate the total surface of a US, measure distances, and provide any other metrics desired for exploitation.
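The crosscheck described in the form completion step (recording the US below, returning the US above, and flagging contradictions) can be sketched as a graph problem. The Python below is an illustration, not SigArq's implementation: the function names are hypothetical, and cycle detection with the standard-library `graphlib` module is one possible technique, chosen here as an assumption. A contradiction is understood as a loop in which a US would end up above itself.

```python
from collections import defaultdict
from graphlib import CycleError, TopologicalSorter


def units_above(lies_above: dict[int, set[int]]) -> dict[int, set[int]]:
    """Invert 'US -> units directly below it' into 'US -> units directly above it'."""
    above = defaultdict(set)
    for upper, lowers in lies_above.items():
        for lower in lowers:
            above[lower].add(upper)
    return dict(above)


def crosscheck(lies_above: dict[int, set[int]]) -> tuple[str, list[int]]:
    """Return ('green', []) for a consistent sequence, or ('red', loop) otherwise.

    The recorded relations are read as 'the units below were formed earlier',
    so they act as predecessors in a topological sort.
    """
    try:
        TopologicalSorter(lies_above).prepare()
    except CycleError as err:
        return "red", err.args[1]  # the list of US forming the loop
    return "green", []


# Hypothetical record: US 12 lies above 13; 13 and 14 lie above 15
record = {12: {13}, 13: {15}, 14: {15}}
```

With a consistent record, `TopologicalSorter(record).static_order()` would also yield one valid order of formation, from the earliest US to the latest.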
To sum up, once the recording process is completed, archaeologists can interrogate all US dimensions on a single screen, as shown in Figure 7, and data are compared with the entire US assemblage for archaeological interpretation. Since these data are available from the other software and platforms represented in Figure 5 above, loading them into previously designed templates produces US tables, Harris-Matrix diagrams, and final reports easily and efficiently. The controlled process of data introduction ensures the quality and traceability of the entire process. The result is FAIR (findable, accessible, interoperable, and reusable) archaeological research, as claimed recently by European stakeholders [51
5. Discussion: Past Construction through UT/US Dialectics
We have shown the building architecture of our ARIS and derived software application SigArq as a practical case of Research Information Systems for past construction. Still, we should discuss further implications of UT and US information units bearing in mind that SigArq is a US-built software according to Harris’ principles of archaeological stratigraphy [35
]. It is worth insisting on the main issue to solve when combining archaeological data and information gathered from any other source: the need to define a common and exchangeable unit of information. As pointed out in Section 3, we can consider all US as UT. However, the existence of the four informative dimensions (descriptive, graphic, cartographic, and temporal) is an ontological requirement for a US to exist, while a UT can exist regardless of its materiality or the lack of it. From the ontological perspective, the UT exists as long as there is a vestige or reflection of the past informing about it, while the US needs the existence of an archaeological context.
Our contribution to an Integrated Construction of the Past rests on the fact that, since we as archaeologists need other sources of information for an accurate interpretation of material data, we have developed a code valid for Reflection construction independently of the source. To that extent, UT/US reliability has to be evaluated, and the SigArq application contributes to monitoring the process of data gathering and exploitation in order to ensure its quality, based upon rigorous and FAIR data collection, storage, and exploitation. This is ensured by the inclusion and use of standardized metadata for every single file produced during the research process, and it is guaranteed by the use of shareable platforms such as those shown in Figure 5.
Ontological differences between UT and US have to be considered, even though they never compromise unit interoperability. Both are Units of Information (UI), that is, data or evidence of the recent or remote past, for which a spatiotemporal context has to be provided in a precise or generic way. Place and time are essential ontological attributes in all cases, regardless of the origin of the source (vestige or reflection), its materiality, the scientific discipline that produces it, the methodological specificities of data gathering, or the reliability of the information [29
] (pp. 65–75). Table 3
summarizes the main differences and similarities between UT and US, which are mostly related to their materiality, and the possible relations with other UI for past construction such as Actors or Values, defined above.
As UI, they are fully comparable, and together they shape the past construction. Their main difference is their materiality, which is essential for a US to exist, and the consequences materiality has for the data gathering process. Other differences concern the relations that UT/US may or may not establish between them or with other UI. As stated in Section 3, the dichotomy between UT and US is a matter of scale and inclusion. A US cannot include other US, but US can be added together. The addition of US yields a new unit of information originated as a reflection, and it will therefore necessarily be a UT. The interpretation of material evidence leads unavoidably to the grouping of different US into Activities, Groups of Activities, or Phases, according to the events that occurred within a site and are identifiable through the archaeological record. Although we are content to use the terms proposed by A. Carandini [44] (p. 143) here, in the new scenario proposed, Activities, Groups of Activities, and Phases are in any case equivalent to UT.
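The rule that US cannot contain other US, while their addition always yields a UT, can be illustrated with a small sketch. The Python below is an assumption-laden illustration: the class names `US` and `UT`, the `level` labels, and the helper `add_units` are hypothetical and only mirror the grouping into Activities, Groups of Activities, and Phases described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class US:
    number: int  # a US has no members: it cannot include other US


@dataclass(frozen=True)
class UT:
    label: str
    level: str           # e.g. "activity", "group of activities", "phase"
    members: tuple = ()  # US or other UT grouped by interpretation

    def stratigraphic_units(self) -> set[int]:
        """All US numbers reachable through the grouping hierarchy."""
        found: set[int] = set()
        for member in self.members:
            if isinstance(member, US):
                found.add(member.number)
            else:
                found |= member.stratigraphic_units()
        return found


def add_units(label: str, level: str, *units) -> UT:
    """Adding US (or UT) together always produces a UT, never a US."""
    return UT(label, level, tuple(units))


# Hypothetical example: two US form an Activity, which belongs to a Phase
activity = add_units("building of the wall", "activity", US(12), US(13))
phase = add_units("Phase I", "phase", activity, US(20))
```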
UT registered from many other sources of information can appear at every level of this interpretational hierarchy of the archaeological record. Past construction then becomes completely interoperable and urgently interdisciplinary. As equivalent and comparable units, UT and US are part of the well-known Harris matrix diagram [35] (pp. 34–39) and, when stored in knowledge bases, could be analyzed, exploited, and represented in terms of knowledge graph technology [53]. If we consider, as L. Ehrlinger and W. Wöß did [55] (p. 3), that a Knowledge Graph acquires and integrates information into an ontology and applies a reasoner to derive new knowledge, the potential of our ontology-mediated data modeling could increase significantly.
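To make the idea of a reasoner deriving new knowledge concrete, here is a deliberately small sketch in plain Python, with no knowledge graph library involved. The predicate name `earlier_than`, the unit labels, and the single transitivity rule are assumptions chosen for illustration; they are not part of the authors' ontology.

```python
Triple = tuple[str, str, str]


def infer(triples: set[Triple]) -> set[Triple]:
    """Apply one rule until nothing new appears:
    if A earlier_than B and B earlier_than C, then A earlier_than C."""
    known = set(triples)
    while True:
        derived = {
            (a, "earlier_than", d)
            for (a, p, b) in known if p == "earlier_than"
            for (c, q, d) in known if q == "earlier_than" and b == c
        }
        if derived <= known:  # fixed point reached
            return known
        known |= derived


# Hypothetical facts mixing stratigraphic and documentary units
facts = {
    ("US15", "earlier_than", "US13"),
    ("US13", "earlier_than", "US12"),
    ("US12", "earlier_than", "UT_1450_ledger_entry"),
}
closure = infer(facts)
```

The derived triples relate a US to a UT coming from a written source, which is the kind of cross-source statement that a shared unit of information makes possible.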
The link between archaeology and other SSH, History in particular, is in no way unidirectional. Examples of UT identification from graphic vestiges (e.g., drawings or photographs) are fundamental for the interpretation of the archaeological record. Certainly, old photographs or other reflections (for instance, an archaeological diary with drawings made 40 or 50 years ago about a site no longer available for excavation) might inform about the material evidence of an action, regardless of whether an updated archaeological record according to the four dimensions of US is possible or not. This is demonstrated by UT identified in examples provided in Appendix B
] (p. 154, Figure 4
), as they offered a good correlation with the archaeological register built several decades after the pictures or notes were taken [57
] (p. 147, Figure 3
Finally, returning to Thibodeau’s CPT, US as entities have one unique and specific class of involvement in events, which is Altered, since they are the material evidence (result or impact) of those events. Thibodeau’s distinction between Action and Event, the former being a specific class of the latter, is also reflected in SigArq’s US record. The US description form considers their natural or anthropic origin as an attribute. Hence, the human character of an Actor’s role is attested even when the Actor cannot be identified as a UI in the fieldwork. US of natural origin are then vestiges informing about events, and US of anthropic origin are the material evidence of actions.
6. Future Prospective
Since this is an ongoing project, we would like to outline the future potential and prospects of this approach instead of a set of fast-changing conclusions. In the framework of the so-called Digital Humanities, the opportunities offered by ICT for an integrated historical discourse have been our major concern for many years. Consequently, we designed a Research Information System built upon ontology-mediated data modeling. Throughout this paper, we have summarized a methodological proposal for Past Construction in accordance with our perception of how historical discourses originate and with our scientific experience as archaeologists and historians, which we consider two inseparable categories of past-constructing scientists. The archaeological development of this research information system is an ARIS currently in use at different archaeological sites in Spain. The SigArq application, as a quality-monitoring tool for data gathering and exploitation, is a further improvement towards the creation of ‘FAIRer’ reflections of the constructed past.
Our self-awareness of being file generators when fulfilling our archaeological duties led us to look to Records Management for the principles of file classification and preservation, in an effort to ensure the backward traceability of research processes as one of the underpinning elements of scientific reliability. The definition of RIS in Historical Science requires another key element for making interdisciplinary collaboration possible: the use of a common and exchangeable code. Thus, we introduce our conceptualization of UT as a main Unit of Information obtainable from many different domains and sources, regardless of their vestige or reflection character and the nature of the science that provided them. This provides a shareable language for different domains in SSH focused on integrated Past Construction. The key advantage of using the UT and Ac concepts is that the traditional fragmentation of SSH could be overcome to some extent. The archaeological origin of our data modeling is also due to the high level of fragmentation of archaeological data and the traditionally limited capability for collaborative research across boundaries, as the ARIADNE project’s partners have already diagnosed [19
] (p. 2).
Thibodeau’s CPT, as a germinal articulation of a challenging theoretical framework for past construction, moved us to review our in-progress data systematization and ontology creation. Precisely because the past is far more complex than what can be represented in any diagram [1
] (p. 19), ontology-mediated data modeling has proved useful in the interdisciplinary past constructions generated by the authors and their colleagues, but it needs a permanent revision of ontologies and terms in order to achieve a terminological consensus. It is time for SSH to get hands-on with past construction, building reflections from an integrated perspective, taking advantage of electronic devices, using ICT as a requirement for more ambitious research aims in History, and making Digital Humanities a term that becomes meaningful for Historical Science.