In 2016, the PARTHENOS Project (https://www.parthenos-project.eu/
accessed on 7 December 2021) defined Research Infrastructures as “complex agglomerations of knowledge, data, people, and services that bring together diverse resources for a wide user base and make these resources (re)usable and available for an appropriately long term in order to support research (either individual or collaborative) and share the results of that research” (http://training.parthenos-project.eu/for-trainers/other-teaching-resources/#1548176339890-0f835e40-964b
accessed on 7 December 2021). This quote encapsulates several aspects that are generally included in research infrastructures. Nevertheless, it is clear that the concept and goals of research infrastructures are still evolving, and that certain features are more valued than others by stakeholders and policy-makers. For example, the European Strategy Forum on Research Infrastructures (ESFRI) (https://www.esfri.eu/
accessed on 7 December 2021) emphasises data sharing and preservation over the lifecycle of a research infrastructure, as can be seen in the European Roadmap for Research Infrastructures [1
]. In turn, the American Council of Learned Societies’ Commission on Cyberinfrastructure for the Humanities and Social Sciences highlights the contribution of research infrastructures in facilitating collaborative research and in building networks and communities, as shown in the “Our Cultural Commonwealth” report [3
]. These reports show that research infrastructures, along with the contents and services they provide, can vary widely depending, for example, on the scientific domain to which they belong. Based on the aforementioned reports, there are three kinds of research infrastructures: (i) single-sited infrastructures, whose buildings and equipment are located in the same area; (ii) distributed infrastructures, whose resources and services are distributed across several national nodes; (iii) digital infrastructures, which are characterised by the importance of their technological components.
The ESFRI has maintained, since 2006, a strategy to foster national and international research infrastructures in countries of the European Union, promoting synergies between the infrastructures, which in turn has influenced scientific policies at a national level. With a view to integrating Portuguese institutions in this setting, in 2013, the Foundation for Science and Technology (FCT) created the National Roadmap for Research Infrastructures of Strategic Interest (RNIE 2014–2020) to map and evaluate the research infrastructures in Portugal. At first, the RNIE comprised 40 infrastructures, including the ROSSIO Infrastructure—Social Sciences, Arts and Humanities (https://rossio.fcsh.unl.pt/en/, accessed on 7 December 2021). As of 2020, there are 56 infrastructures in the RNIE, of which only seven belong to the domain of the Social Sciences, Arts and Humanities (SSAH) [4
]. These infrastructures cooperate with European counterparts, leading to the development of international networks (known as European Research Infrastructure Consortia, or ERIC). For instance, the Social Sciences DataLab (http://datalab.novasbe.pt/
accessed on 7 December 2021) is the Portuguese node of SHARE ERIC—Survey of Health, Ageing and Retirement in Europe (http://www.share-project.org/home0.html
accessed on 7 December 2021), and ROSSIO has the same role in DARIAH-EU ERIC (Digital Research Infrastructure for the Arts and Humanities) (https://www.dariah.eu/
accessed on 7 December 2021).
The ROSSIO Infrastructure is formed by a consortium of seven Portuguese educational and cultural institutions. The consortium is coordinated by the NOVA School of Social Sciences and Humanities (NOVA University Lisbon). It further includes six Portuguese institutions in the cultural heritage domain: Direção Geral do Livro, dos Arquivos e das Bibliotecas (Directorate-General for Books, Archives, and Libraries, DGLAB), Direção Geral do Património Cultural (Directorate-General for Cultural Heritage, DGPC), Teatro Nacional D. Maria II (D. Maria II National Theatre), Arquivo Municipal de Lisboa (Lisbon Municipal Archive), Cinemateca Portuguesa (Portuguese Film Archive), and Biblioteca de Arte e Arquivos da Fundação Calouste Gulbenkian (Calouste Gulbenkian Art Library and Archives). ROSSIO also works with content providers, namely, ARQUIVO.pt (Portuguese Web Archive) and the Diplomatic Institute of the Portuguese Ministry of Foreign Affairs. ROSSIO aims to apply best practices of research infrastructures in the SSAH, such as those of the Digital Public Library of America (DPLA), Historiana, or Trove.
There are five major objectives in the ROSSIO Infrastructure: (i) to aggregate, organise, link, contextualise, and provide free and open access to digital collections in the SSAH belonging to the previously mentioned consortium members and content providers (also including the digitisation and cataloguing of resources, in some cases, as was the case of the José Marques Photographic Studio collection (1924–2012) from Teatro Nacional D. Maria II, and the archive of the landscape architect Gonçalo Ribeiro Teles (1922–2020) from the Directorate-General for Books, Archives, and Libraries); (ii) to develop high-quality research in SSAH, fostering new studies and the exchange of ideas; (iii) to create synergies between individuals and institutions in view of promoting scientific innovation and the dissemination of cultural heritage; (iv) to contribute towards the internationalisation of Portuguese SSAH studies, facilitating the access to contents in the Portuguese language, following best practices set by other infrastructures and the FAIR principles of data (Findability, Accessibility, Interoperability, and Reuse); (v) to develop a sustainable network integrating academic and non-academic communities in order to meet the challenges of a rapidly changing society.
This paper provides an overview of the ROSSIO Infrastructure and reflects on how its services will contribute to a higher quality of research, collaborative work, and knowledge dissemination in SSAH. The paper has two parts. In the first part, we will present the platform under development, including the metadata aggregation technologies and the applications developed within ROSSIO, and how the platform will support the work carried out by its potential users. The second part of the paper presents the services integrated in the platform, namely, a discovery portal, digital exhibitions, collections, and a virtual research environment (VRE). The beta version of the platform should be released in 2022.
2. Software and Data Architecture
The digital resources that are objects of research in SSAH originate from numerous institutions of different natures, mainly academic, diplomatic, and cultural heritage institutions. Such dispersion brings discoverability and usage challenges to the resources. A typical approach to these challenges is metadata aggregation, where an organisation facilitates the discovery and use of the digital resources by aggregating their associated metadata into a central repository. ROSSIO is establishing itself as a central aggregation platform; based on these aggregated datasets of metadata, it will be in a position to further promote the usage of the Portuguese digital resources by means that cannot be efficiently accomplished by each data provider alone.
ROSSIO realises the metadata-aggregation approach based on the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) [5
]. This protocol was developed in 1999 by the scholarly communication community, which was looking for a technical solution for the discovery of e-prints across institutional repositories. Soon after the first applications of OAI-PMH to the metadata aggregation of e-prints, the cultural heritage domain also adopted the technology, since the discovery of cultural heritage digital resources faced challenges similar to those of e-prints. In the case of cultural heritage resources, discovery is, in many cases, only feasible if based on metadata instead of full texts [6
], requiring, therefore, the centralisation of metadata. Presently, OAI-PMH is widely applied in both cultural heritage and academic institutions and is the technological base for cooperative networks, such as the Digital Public Library of America, Europeana, and OpenAIRE.
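The harvesting step can be illustrated with a minimal sketch in Python. It is not ROSSIO's implementation: it only shows how a harvester might build OAI-PMH `ListRecords` requests and read one page of simple Dublin Core records, following the resumption token that the protocol uses for paging. The sample response, the base URL, and the function names are hypothetical and serve illustration only.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# XML namespaces defined by the OAI-PMH 2.0 and Dublin Core specifications.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, resumption_token=None, metadata_prefix="oai_dc"):
    """Build the ListRecords request URL for one page of a harvest."""
    if resumption_token:
        # resumptionToken is an exclusive argument: it replaces all the others.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def parse_list_records(xml_text):
    """Return (records, resumption_token) from one ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iterfind(".//oai:record", NS):
        identifier = rec.findtext("oai:header/oai:identifier", namespaces=NS)
        fields = {}
        for el in rec.iterfind("oai:metadata/oai_dc:dc/*", NS):
            name = el.tag.split("}", 1)[1]  # strip the "{namespace}" prefix
            fields.setdefault(name, []).append((el.text or "").strip())
        records.append({"identifier": identifier, "metadata": fields})
    token = root.findtext(".//oai:resumptionToken", namespaces=NS)
    return records, (token.strip() if token and token.strip() else None)


# Hypothetical response from a data provider (illustration only).
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Retrato de actor</dc:title>
          <dc:date>1924</dc:date>
        </oai_dc:dc>
      </metadata>
    </record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

records, token = parse_list_records(SAMPLE)
next_url = list_records_url("https://example.org/oai", resumption_token=token)
```

A harvester would repeat the request with each returned token until the response carries no token, at which point the dataset is complete.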
2.1. Applications Architecture
The ROSSIO platform consists of a complex information system. Several applications are used to aggregate the metadata, to centrally process it, and to provide access and search functionalities on the metadata. The VRE, the digital exhibitions, and the collections applications are built based on this core information system.
The aggregated datasets, along with other datasets created by the researchers while using the infrastructure, are also published by ROSSIO, according to the FAIR principles.
The application architecture of ROSSIO is shown in Figure 1
. It presents the applications that form the infrastructure, how they relate internally and with external systems, and the types of users that they interact with.
The general types of users, or actors, that this applications architecture considers are the following:
Data Manager and Curator—This actor operates the process of ingesting metadata and also publishes datasets. To perform these roles, they use the Public Datasets Repository and the Metadata Harvesting and Ingestion Application;
Vocabulary Manager—This actor creates and maintains the controlled vocabularies used in the ROSSIO Infrastructure. Typically, this role is performed by information professionals and terminology specialists;
End-user—This is the target user group of the ROSSIO Infrastructure, which includes students, teachers, researchers, as well as the general public.
Ten applications are deployed by ROSSIO. These are the following:
Metadata Harvesting and Ingestion Application—This application performs the OAI-PMH harvesting process of the data providers’ datasets into ROSSIO. The initial data-processing tasks are also performed by this application. It ingests the harvested datasets into ROSSIO’s internal repository, creates the search indexes, and publishes the datasets in ROSSIO’s public datasets repository. The harvested metadata is also enriched during the ingestion process by using the Data Normalisation and Enrichment Application;
Internal Repository—The datasets from the data providers are stored in this repository. This storage is designed for fast access to individual records, and assigns each record an identifier in the form of a linked data URI;
Data Normalisation and Enrichment Application—During the ingestion process, this application is used to enrich the metadata harvested from the data providers. The application matches specific field values from the metadata with entities and concepts included in one of the ROSSIO vocabularies. The application establishes links with the matched ROSSIO entities in order to enable semantic searching in ROSSIO via its vocabularies. In addition, this application normalises the values of date properties to enable a date search index and timeline searching in ROSSIO’s end-user applications;
Public Repository (Dataverse)—In order to publish datasets to the public, ROSSIO uses the Dataverse software (https://dataverse.org/
accessed on 7 December 2021). Two kinds of datasets will be published by ROSSIO: the datasets aggregated from the data providers and the datasets that will be created by researchers while using one of the ROSSIO applications. The datasets are assigned persistent identifiers by the Dataverse software;
Linked Data Resolution Application—This application makes the metadata accessible by following the best practices for linked data and implementing the relevant specifications. The application receives and processes the access requests to URIs in the ROSSIO’s namespaces. The data for responding to these URI requests is fetched in real time from other applications in ROSSIO, as follows:
From the Internal Repository—for the aggregated metadata of cultural and scientific items;
From the Public Dataset Repository—for dataset-level metadata;
From the Vocabularies RDF Triple Store—for concepts and entities defined in any of the ROSSIO vocabularies;
Search Engine (Apache Solr™)—This application provides searching functionality across the aggregated metadata about cultural and scientific items. The indexing process is operated by the Data Manager and Curator, who uses the Metadata Harvesting and Ingestion Application. The Search Engine implements a search schema designed for the search requirements of the ROSSIO applications for end-users. It consists of a deployment of Apache Solr™ (accessed on 7 December 2021), which makes an API available for searching. It is via this API that the applications in ROSSIO search in the aggregated dataset;
Vocabularies Triple Store (Apache Fuseki)—This application stores and indexes the ROSSIO vocabularies as RDF data. The Triple Store makes available a SPARQL endpoint [7
] that allows semantic queries to be made on the vocabularies by other applications. The SPARQL endpoint is available to the public; therefore, the vocabularies can also be queried by internal applications, by applications from third parties, and by end-users proficient in the SPARQL language. The Triple Store is an installation of the Apache Fuseki software (https://jena.apache.org/documentation/fuseki2/
accessed on 7 December 2021);
Vocabularies Management Application (Vocbench)—By using this application, the Vocabularies Manager creates and maintains the controlled vocabularies of the ROSSIO infrastructure. It is a deployment of the Vocbench (http://vocbench.uniroma2.it/
accessed on 7 December 2021) software;
Vocabularies Publication Application (Skosmos)—The vocabularies created in Vocbench are published for human usage in this application. The Vocabularies Manager is responsible for the publication process, which consists in exporting the vocabularies from the Vocabulary Editor and then importing them into the Vocabularies Publication Application. This application is a deployment of the Skosmos software;
End-user systems—Four independent applications are being developed under a common framework, which will offer the services of the ROSSIO platform to end-users, based on the aggregated dataset. These services are described in detail in Section 3
. The applications that implement these services use the Search Engine for querying the dataset, use the Linked Data Resolution Application to obtain the metadata about the digital objects in RDF, and query the ROSSIO vocabularies using the Vocabularies Triple Store. The services are supported by four applications:
Discovery Portal—This application provides search and retrieval functionality to all end-users of ROSSIO;
Virtual Research Environment—This application provides functionalities for researchers to work with the resources available in ROSSIO;
Digital Exhibitions—This application provides functionalities for creating and publishing exhibitions of digital resources available in ROSSIO;
Digital Collections—This application provides functionalities for creating and publishing explanatory resources that use digital resources available in ROSSIO (the technical development of the VRE and the digital collections and exhibitions benefited from the work of three MA students from FCT, Joana Barbosa, Henrique Raposo, and Luís Coelho, who wrote their theses within the project).
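To make the interaction between the end-user applications and the core services more concrete, the following Python sketch shows how a client might address two of them: the Search Engine, through Solr's standard `/select` handler, and the Linked Data Resolution Application, through HTTP content negotiation. The Solr base URL, the core name `rossio`, the field name `provider`, and the choice of Turtle as the RDF format are all assumptions made for illustration; the actual schema and supported formats are not specified here.

```python
from urllib.parse import urlencode
from urllib.request import Request


def solr_select_url(solr_base, core, text, filters=None, rows=10):
    """Build a request URL for Solr's standard /select search handler."""
    params = [("q", text), ("rows", str(rows)), ("wt", "json")]
    for field, value in (filters or {}).items():
        # Escape backslashes and quotes so the value stays one quoted phrase.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        # Each "fq" narrows the result set without affecting relevance ranking.
        params.append(("fq", f'{field}:"{escaped}"'))
    return f"{solr_base.rstrip('/')}/{core}/select?{urlencode(params)}"


def rdf_request(resource_uri, media_type="text/turtle"):
    """Prepare a request that asks a linked data service for RDF via the Accept header."""
    return Request(resource_uri, headers={"Accept": media_type})


# Hypothetical deployment details (illustration only).
url = solr_select_url("http://localhost:8983/solr", "rossio", "fado",
                      filters={"provider": "DGLAB"}, rows=5)
request = rdf_request("http://example.org/item/123")
```

Neither function sends anything over the network; passing `url` or `request` to `urllib.request.urlopen` would perform the actual call.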
2.2. Internal Data Model
The data providers of ROSSIO describe their digital objects using a variety of data models that are suitable for their individual needs. Supporting all these different data models in ROSSIO’s data-ingestion systems would not be feasible; therefore, ROSSIO has adopted a data model with general semantics that is capable of representing the digital objects from the different domains. This data model is a Europeana Data Model (EDM) [8
] profile that was defined by representatives from Portuguese academic and cultural heritage institutions. This EDM profile was named “EDM-DRD application profile”. Although the EDM-DRD profile is not yet widely used in Portuguese institutions, it is expected that supporting it from the start will give ROSSIO the benefit of better data-exchange capabilities in the future, both for aggregating richer data from data providers and for allowing the re-use of its datasets by national and international third parties.
During the initial operation of the ROSSIO Infrastructure, data providers are not required to implement the EDM-DRD profile because its implementation would be too challenging for the timeframe of the project. Therefore, the metadata aggregated from data providers consists of a simple model based on the 15 most-generic elements of the Dublin Core Metadata Terms, as defined by the OAI-PMH protocol. Nevertheless, ROSSIO’s data systems are being developed based on the richer semantic model of EDM-DRD. By using EDM-DRD, ROSSIO is capable of representing the administrative metadata required for its operation and also the semantic enrichments made during the ingestion process. Our expectation is that EDM-DRD will eventually be used by data providers, allowing ROSSIO to operate with higher-quality metadata for the benefit of its end-users.
The use of the EDM-DRD profile allows ROSSIO to represent semantically richer data than is currently being received from data providers. In turn, this allows ROSSIO to apply data-enrichment operations on the simple metadata received and to represent the results of the enrichment process with the EDM-DRD model. Currently, the following normalisation and enrichment operations are performed:
Date normalisation—Data providers use different date formats and values with partial dates, such as only a year, or the month of a year. The values of the date properties are normalised into one format that allows date ranges to be represented. Using these normalised values, a special date range index is created in the Search Engine, allowing end-users to query by periods of time;
Hyperlink enrichment—This enrichment operation analyses the hyperlinks present in the metadata, and tries to determine if the links point to a web page, directly to an image file, or to another kind of media file. This allows ROSSIO to make particular uses of the links, such as for linking back to the page of the digital objects at the data providers’ websites, or displaying miniature images representing the digital objects;
Entity linking—This enrichment process aims to allow semantic searching on the ROSSIO dataset by linking entity names expressed in the metadata of data providers to entities and concepts of the ROSSIO vocabularies (described in Section 2.3
). It includes four enrichment operations:
Georeferencing enrichment—Place names, found in metadata fields about subjects and coverage, are matched against entities in the Places vocabulary, allowing, in some cases, an enrichment of the metadata with accurate geographic coordinates;
Agent enrichment—Names of persons and organisations, found in metadata fields about authors, contributors, and subjects, are matched against entities in the Agents vocabulary;
Temporal enrichment—Names of historical periods, found in metadata fields about subjects and coverage, are matched against entities in the Periods vocabulary;
Concept enrichment—Terms of concepts, found in metadata fields about subjects, are matched against entities in the ROSSIO Thesaurus.
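The date normalisation operation listed above can be sketched as follows. This is a simplified illustration in Python rather than the rules actually applied by ROSSIO: it assumes that partial dates arrive as `YYYY`, `YYYY-MM`, or `YYYY-MM-DD` and expands each into an inclusive range, leaving any other value unparsed. The bracketed `[start TO end]` output follows the range notation accepted by Solr date range fields; the function names are hypothetical.

```python
import calendar
import re
from datetime import date

_PARTIAL_DATE = re.compile(r"^(\d{4})(?:-(\d{1,2}))?(?:-(\d{1,2}))?$")


def normalise_date(value):
    """Map 'YYYY', 'YYYY-MM' or 'YYYY-MM-DD' to an inclusive (start, end) range.

    Returns None for any other value, so unparseable dates can be left as-is.
    """
    match = _PARTIAL_DATE.match(value.strip())
    if not match:
        return None
    year, month, day = (int(g) if g else None for g in match.groups())
    try:
        if month is None:  # only a year: the range covers the whole year
            return date(year, 1, 1), date(year, 12, 31)
        if day is None:  # year and month: the range covers the whole month
            last_day = calendar.monthrange(year, month)[1]
            return date(year, month, 1), date(year, month, last_day)
        exact = date(year, month, day)
        return exact, exact
    except ValueError:  # e.g. month 13 or 30 February
        return None


def to_range_literal(start, end):
    """Format a range in the bracketed '[start TO end]' notation."""
    return f"[{start.isoformat()} TO {end.isoformat()}]"


print(to_range_literal(*normalise_date("1924")))     # [1924-01-01 TO 1924-12-31]
print(to_range_literal(*normalise_date("2012-02")))  # [2012-02-01 TO 2012-02-29]
```

Representing every value as a range, including exact days, lets a single index answer period queries regardless of how precise the original date was.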
2.3. The Development and Publication of Controlled Vocabularies
The ROSSIO Infrastructure will make services available, based on VocBench3 and Skosmos, for authorised users to develop controlled vocabularies and to publish them as linked open data. (The vocabulary-management application will only be accessible to registered users of the platform. The published vocabularies are available as linked open data at http://vocabs.rossio.fcsh.unl.pt/, accessed on 7 December 2021.) The vocabularies are intended to support knowledge organisation and metadata normalisation and enrichment in the ROSSIO platform, both for aggregated resources and for the digital collections and exhibitions produced within the platform. Publishing the vocabularies as linked open data will further contribute to the infrastructure’s compliance with the FAIR data principles.
The ROSSIO vocabularies, a set of core vocabularies in the SSAH, are being developed for the purposes stated above. The following vocabularies are under development at this time:
ROSSIO Thesaurus. This vocabulary is composed of designations of subjects or topics in the SSAH, which mostly correspond to general concepts (e.g., “Arts”, “Artists”). The ROSSIO Thesaurus also includes several individual concepts, such as disciplines (e.g., “History”) and conceptual objects (e.g., “Identity”). The distinction between general and individual concepts is standardised both in terminology [9
] and information science [10
]. The development of the ROSSIO Thesaurus has already been described in [11];
ROSSIO Agents. This vocabulary consists of personal and organisational names. For example, it lists every consortium member and data provider of the ROSSIO Infrastructure, along with relevant agents in SSAH;
ROSSIO Places. This vocabulary includes toponyms. It will include names of geopolitical entities (e.g., countries), geographical features (e.g., rivers), areas (e.g., neighbourhoods), and points of interest (e.g., buildings). “Place” is understood broadly as any physical entity that is inherently located and, therefore, has stable geographic coordinates;
ROSSIO Periods. This vocabulary is composed of names for periods, including historical, cultural, artistic, or geological periods of time. It will comprise “absolute” time-intervals (e.g., millennia and centuries) as well as more or less variable periods in history (e.g., “Renaissance”) and individual events (e.g., “World War I, 1914–1918”).
Figure 2 provides an overview of the ROSSIO Vocabularies. These vocabularies are modelled in SKOS [12
], the W3C recommendation for modelling knowledge organisation systems (KOS) and sharing them as linked data. It should be noted that name authority lists and gazetteers are also KOS [13
]. While SKOS was designed for modelling thesauri-like resources with hierarchical and associative concept relations [14
], we have also applied it in modelling vocabularies of agents and places, in line with many other linked data initiatives (e.g., the Getty Vocabularies Program, which includes vocabularies of artists and geographic names modelled in SKOS (https://www.getty.edu/research/tools/vocabularies/
accessed on 7 December 2021)). This option has the advantage of enabling the use of tools that are already in production within ROSSIO, namely, VocBench and Skosmos, for publishing these vocabularies as linked open data.
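Because all the vocabularies share the SKOS model, a single query pattern can look up an entity in any of them. The Python sketch below builds such a SPARQL query and encodes it as a GET request, as permitted by the SPARQL 1.1 Protocol. It is only an illustration: the endpoint URL is a placeholder, the function names are hypothetical, and exact matching on the preferred label is a simplifying assumption.

```python
from urllib.parse import urlencode


def label_lookup_query(label, lang="pt"):
    """SPARQL query finding SKOS concepts whose preferred label equals `label`."""
    # Escape backslashes and quotes so the label is a valid SPARQL string literal.
    escaped = label.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "PREFIX skos: <http://www.w3.org/2004/02/skos/core#>\n"
        "SELECT ?concept ?broader WHERE {\n"
        f'  ?concept a skos:Concept ; skos:prefLabel "{escaped}"@{lang} .\n'
        "  OPTIONAL { ?concept skos:broader ?broader }\n"
        "}"
    )


def sparql_get_url(endpoint, query):
    """Encode a query as a GET request against a SPARQL endpoint."""
    return endpoint + "?" + urlencode({"query": query})


# Placeholder endpoint (illustration only).
url = sparql_get_url("http://example.org/sparql", label_lookup_query("Artes"))
```

The optional `skos:broader` pattern returns the parent concept where one exists, which would apply to the hierarchical thesaurus but may simply be absent for agents or places.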
While SKOS provides the backbone for modelling controlled vocabularies, it does not account for the full specificity of the ROSSIO vocabularies, which range over different types of entities. To close this gap, we have included elements from other widely-used ontologies in the linked open data cloud:
BIBFRAME ontology (accessed on 7 December 2021). The BIBFRAME ontology provides classes for modelling types of entities for each vocabulary. These classes are Topic, Person, Organisation, Place, and Temporal. Each entity in the vocabularies is simultaneously an instance of skos:Concept and of one of the above-mentioned BIBFRAME classes. This is intended to facilitate the use of the ROSSIO vocabularies in the platform and, also, their potential reuse in the linked open data cloud;
Getty Vocabulary Program (GVP) ontology (http://vocab.getty.edu/ontology, accessed on 7 December 2021). The ROSSIO vocabularies make use of properties from this ontology for representing types of agents (agentType) and places (placeType) by means of concepts from the ROSSIO Thesaurus. These properties allow, for example, declarations that Amália Rodrigues was a fado singer and that Portugal is a country by linking “Amália Rodrigues” in ROSSIO Agents to the “Fado singers” concept in the ROSSIO Thesaurus, and “Portugal” in ROSSIO Places to the “Countries” concept in the ROSSIO Thesaurus, respectively;
SKOS extension of ISO 25964 (http://purl.org/iso25964/skos-thes, accessed on 7 December 2021). This extension aligns the ISO 25964 data model for thesauri [15] with SKOS. The ROSSIO vocabularies use the Thesaurus array class for modelling guide terms (e.g., <People by occupation>), which is a common design pattern found in many thesauri and terminologies;
Schema.org ontology (http://schema.org/, accessed on 7 December 2021). ROSSIO Agents makes use of properties from Schema.org for modelling birth and death dates, since BIBFRAME only contains a generic date property, which is intended to model publication dates.
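The combination of ontologies described in the list above can be made concrete with a small Python sketch that serialises one agent as N-Triples. It reproduces the example given for the GVP ontology: the agent is typed both as a `skos:Concept` and as a BIBFRAME `Person`, and is linked to a thesaurus concept through `agentType`. The entity URIs under `example.org` are hypothetical, and the function is an illustration rather than the way the vocabularies are actually produced, which is through VocBench.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
BF = "http://id.loc.gov/ontologies/bibframe/"
GVP = "http://vocab.getty.edu/ontology#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def agent_triples(agent_uri, label, agent_type_uri, lang="pt"):
    """Return N-Triples lines describing one person in an agents vocabulary."""

    def iri(value):
        return f"<{value}>"

    subject = iri(agent_uri)
    # Escape backslashes and quotes so the label is a valid N-Triples literal.
    escaped = label.replace("\\", "\\\\").replace('"', '\\"')
    return [
        # Double typing: a SKOS concept and a BIBFRAME Person at the same time.
        f"{subject} {iri(RDF_TYPE)} {iri(SKOS + 'Concept')} .",
        f"{subject} {iri(RDF_TYPE)} {iri(BF + 'Person')} .",
        f'{subject} {iri(SKOS + "prefLabel")} "{escaped}"@{lang} .',
        # The agent type points to a concept in the thesaurus.
        f"{subject} {iri(GVP + 'agentType')} {iri(agent_type_uri)} .",
    ]


# Hypothetical URIs (illustration only).
for line in agent_triples(
    "http://example.org/agents/amalia-rodrigues",
    "Amália Rodrigues",
    "http://example.org/thesaurus/fado-singers",
):
    print(line)
```

Birth and death dates would be added in the same way with the corresponding Schema.org properties; they are left out here to avoid asserting values that are not given in the text.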
The ROSSIO vocabularies are based on existing structured and unstructured vocabulary resources, such as subject heading lists and in-development thesauri provided by the ROSSIO Infrastructure’s partner institutions, and also on sections of the Getty’s Art and Architecture Thesaurus (AAT), which has become a reference vocabulary in the SSAH. Reusing information from the AAT is facilitated by Getty’s Vocabulary Program’s linked open data infrastructure. The concepts included in the ROSSIO vocabularies are required to be identified by Portuguese labels with English-equivalent forms. The form of the lexical labels follows the recommendations outlined in the international standards regarding thesauri for information retrieval [15
], along with Portuguese language standards for headings of personal, organisational, and geographic names [16].
Since the ROSSIO vocabularies are published as linked data, they are linked to external resources identified by URIs. In our case, this is carried out by using SKOS mapping properties to align concepts in the ROSSIO vocabularies with those of external KOS. At this time, the ROSSIO Thesaurus and ROSSIO Periods are aligned with the AAT, either manually or by means of Silk Workbench (an automatic alignment tool) (http://silkframework.org/
accessed on 7 December 2021). The ROSSIO Thesaurus is further aligned with the Backbone Thesaurus (BBT) (https://vocabs.dariah.eu/backbone_thesaurus/en/
accessed on 7 December 2021), a top-level vocabulary for the interoperability of different thesauri and taxonomies in the arts and humanities, which is managed by a working group within DARIAH-EU, the European infrastructure of which ROSSIO is part. ROSSIO Agents, on the other hand, is aligned with the Virtual International Authority File (VIAF) (https://viaf.org/, accessed on 7 December 2021), while ROSSIO Places is aligned with the Getty Thesaurus of Geographic Names (TGN) and GeoNames.
Future work should be carried out for extending the alignments of the ROSSIO vocabularies to other third-party vocabularies available as linked open data. Concepts from the ROSSIO Thesaurus should be further aligned with specialised vocabularies, such as GEMET (General Multilingual Environmental Thesaurus), to which several vocabularies are mapped, including DARIAH’s BBT. The ROSSIO Thesaurus should also be aligned with widely used general vocabularies in the linked open data cloud, such as the Library of Congress Subject Headings (LCSH). Concepts from ROSSIO Periods should be aligned with the canonical dataset of PeriodO, a gazetteer of spatio-temporal periods based on scholarly definitions (https://perio.do/
accessed on 7 December 2021). Finally, entities from ROSSIO Agents should be mapped to the Getty Union List of Artist Names, which includes the names and biographical information of artists and associated people and organisations.
In addition to the core vocabularies described in this section, the ROSSIO vocabulary services will also publish relevant third-party vocabularies for metadata enrichment and for promoting their use among the ROSSIO Infrastructure’s partner and member institutions. At this time, we have included a SKOS version of the Lexvo.org knowledge base (http://lexvo.org/
accessed on 7 December 2021) in our Skosmos installation, which includes the ISO 639 two- and three-character codes for languages. Our vocabulary repository also includes the COAR (Confederation of Open Access Repositories) vocabulary of resource types (https://vocabularies.coar-repositories.org/resource_types/
accessed on 7 December 2021). A selected number of resource types from this vocabulary should be used for classifying the aggregated metadata descriptions in the platform (e.g., “text”, “image”, “video”, “other”).
Going forward, the ROSSIO vocabulary services should also enable the development, publishing, and alignment of subject-specific vocabularies by our partners and member institutions. This is the case of the SIPA Thesaurus, a vocabulary focusing on architectural heritage, which is currently under development by the Directorate-General for Cultural Heritage. We expect the ROSSIO vocabulary services to become a local hub for developing, publishing, and promoting the use of controlled vocabularies in SSAH, in line with similar initiatives in Europe.
3. ROSSIO Infrastructure Services: Discovery Portal, Exhibitions and Digital Collections, Virtual Research Environment
As explained in the previous sections, metadata aggregation and controlled vocabularies will be at the core of the ROSSIO platform. The architecture of the ROSSIO portal has been developed in seven phases: (i) Data model and metadata mapping; (ii) Development and web publishing of controlled vocabularies; (iii) System architecture design; (iv) Portal prototype; (v) Testing phase; (vi) First version; (vii) Final version, on which we are currently working.
ROSSIO will employ a vast array of information and communication technology (ICT) tools, defined as devices, applications, and systems that allow different agents (including individuals and organisations) to interact digitally. In the following, we will delve into the discovery portal, VRE, and digital exhibitions and collections. The description of the applications is far more succinct than the explanation of the technologies used in the platform, for two reasons. First, the platform and the applications present different challenges. While the challenges of building a platform such as ROSSIO are highly technical (due to the large amount of data and the need for interoperability across systems), the challenges for the applications are mainly functional: to ensure that the end-users’ needs are fulfilled. Second, the implementation of these applications is still ongoing. During the last few months, these services were tested by focus groups, which allowed us to highlight their advantages and identify possible issues and queries that our team has been working to resolve.
3.1. The Discovery Portal
The discovery portal will provide many search options—including simple and advanced searches—of the digital resources located in the different consortium institutions and content providers (Figure 3
). For example, the advanced search option will provide more concrete results, which are enriched with controlled vocabularies and filters. As in the cases of the DPLA and Library of Congress, this function will prove instrumental for different users, especially within the research community [18
], by providing more precise results while fostering new approaches based on the connections between the discovered items. Moreover, it should be noted that the discovery portal is the core of the ROSSIO Platform since all other services are highly dependent on its successful implementation.
Users do not need to be registered on the platform to use the discovery portal, which allows them to explore more than 5 million catalogued digital resources. Currently, although every type of institution makes a considerable amount of content available, most of it comes from archives (Table 1). The aggregation of content from some institutions, such as the audio-visual files of Cinemateca, is still in progress.
Through a simple or advanced search, users can access a wealth of data that is diverse on several levels:
Chronological diversity: The digital objects available date from different historical periods, ranging from Prehistory to the present day. The user can find a decorated menhir from the Ancient Neolithic period, preserved at DGPC, or temporary websites, such as those linked to presidential campaigns, searchable only in the Portuguese web-archive;
Geographic variety: Although the name of the Infrastructure (ROSSIO) might be associated with the large square located in Lisbon, the data provided covers the entire country, including the islands (Madeira and Azores), and former Portuguese colonies (e.g., Brazil, Cape Verde, Angola, Mozambique, Timor). It also covers other regions and countries of the world, some of which have been less documented for earlier periods, such as Oman;
Thematic heterogeneity: The resources available focus on diverse subjects, including Festivities, Monuments, Music, Architecture, and Theatre, which are relevant to SSAH disciplines such as History, Anthropology, Sociology, Linguistics, Art History, and Musicology;
Typological diverseness: The available resource types include handwritten documentation (e.g., manuscripts, codices), published studies (e.g., scientific papers, books), iconography (e.g., engravings, paintings, photographs, sculptures), audio-visual material (e.g., sound recordings, videos), and other online resources (e.g., databases, websites).
3.2. Exhibitions and Digital Collections
Other services provided by the platform are exhibitions and digital collections, generally defined as activities that aim to present and develop a given subject by using hypermedia (texts, audio, graphics, images, and video) and digital resources arranged according to a predetermined and accessible narrative. This service was designed taking into account the studies and reflections of authors such as M. Kalfatovic [19], C. Leong [20], M. T. Natale [22], and A. Antoniou [23], as well as other relevant international cases. For instance, ROSSIO’s team was careful to distinguish digital exhibitions from other similar and very common initiatives, such as photo galleries, by introducing a contextualising narrative. Furthermore, we can apply different storytelling techniques, such as turning the exhibition titles and their subsections into clear questions that arouse the user’s curiosity. The exhibition’s narrative adopts clear and direct language, as well as a predetermined character limit, in order to retain the user’s attention. The scientific curation of exhibitions and collections will be ensured by ROSSIO’s researchers in collaboration with renowned experts.
ROSSIO’s digital collections, not to be confused with collections of digitised resources, are similar to the exhibitions described above, aside from some slight differences (Figure 4). Drawing on the work developed by other Infrastructures (e.g., DPLA, Europeana, Torve, Culturaitalia) and cultural institutions (e.g., British Library, Gallica), the digital collections created by ROSSIO are small-sized exhibitions: introductory texts and the resources’ explanatory texts should not exceed 3500 and 800 characters, respectively, since they are aimed at both academic and non-academic audiences, such as students, teachers, tourism personnel, or representatives from different cultural industries.
The ROSSIO digital exhibitions and collections will address and explore themes related mostly to Portuguese Cultural Heritage (either in the country or in other parts of the world) through the perspective of SSAH. At the time of writing, five exhibitions and ten collections have been concluded and will be made available on the platform’s beta version. One of the exhibitions, Tempos de Doença, Tempos de Cura (Times of Illness, Times of Healing), has been published at NOVA FCSH (https://www.fcsh.unl.pt/faculdade/bibliotecas/tempos-de-doenca-tempos-de-cura/ accessed on 7 December 2021).
The exhibitions feature around 50 resources aggregated and connected within the platform, while the collections consist of 10 to 15. The selection of resources follows a set of pre-defined norms: (I) guarantee the scientific pertinence of the chosen objects for the subject analysed; (II) ensure that the majority of the consortium institutions are represented, including at least four partner institutions; (III) use materials of different types, historical periods, and geographic areas; (IV) collaborate with researchers from NOVA FCSH’s research units to ensure the scientific validation and innovative character of the content created. Some of these guidelines were inspired by the work developed by the aforementioned international infrastructures and heritage and cultural institutions. Their application will promote the value of the materials presented, and will also highlight and reinforce their visibility, richness, and diversity within a single platform.
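The quantitative parts of these norms (10 to 15 resources per collection, at least four partner institutions, and the 3500 and 800 character limits for collection texts) lend themselves to an automatic check on a draft. The following Python sketch is one possible way to express such a check; the data structure and function name are hypothetical, and norms such as scientific pertinence remain a matter for curatorial judgement rather than code.

```python
from dataclasses import dataclass

# Limits taken from the text; the constant names are illustrative.
MAX_INTRO_CHARS = 3500
MAX_RESOURCE_TEXT_CHARS = 800
MIN_PARTNER_INSTITUTIONS = 4
COLLECTION_SIZE = range(10, 16)  # 10 to 15 resources, inclusive


@dataclass
class Item:
    """Hypothetical entry of a draft collection."""
    institution: str
    explanatory_text: str


def check_collection(intro, items):
    """Return a list of problems; an empty list means the draft passes."""
    problems = []
    if len(intro) > MAX_INTRO_CHARS:
        problems.append("introductory text exceeds 3500 characters")
    if len(items) not in COLLECTION_SIZE:
        problems.append("collection should hold 10 to 15 resources")
    institutions = {item.institution for item in items}
    if len(institutions) < MIN_PARTNER_INSTITUTIONS:
        problems.append("fewer than four partner institutions represented")
    for position, item in enumerate(items, start=1):
        if len(item.explanatory_text) > MAX_RESOURCE_TEXT_CHARS:
            problems.append(
                f"resource {position}: explanatory text exceeds 800 characters"
            )
    return problems
```

Returning a list of problems, rather than stopping at the first one, would let a curator see every issue in a draft at once.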
3.3. The Virtual Research Environment
The VRE is a web-based working environment available to registered users that aims to enhance the research experience and streamline the sharing of resources. Following principles of technical interoperability, sustainability, security, and ease of use, the VRE is a fundamental tool for intuitive Research Infrastructures. The VRE places collaboration at the core of the experience, enabling the reuse of digital resources, promoting dialogue between different interlocutors, and strengthening scientific networks and communities [24]. Currently, we are implementing several ICT tools to accomplish these goals and to strengthen the application of FAIR principles (Figure 5):
My Folders: to store and organise the selected resources;
Search: connected with the discovery portal (search, visualise, and select data);
Resources: selected resources and personal annotations;
Text editor: to write notes on the chosen resources;
Share: to share resources and personal notes on them with other users, strengthening collaborative work and the community of practice [24].
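To make the relationship between these tools more concrete, the sketch below models folders, saved resources, personal notes, and sharing as a small Python data structure. It is a simplified illustration under our own assumptions: the class names, fields, and identifiers are hypothetical and do not reflect the VRE's actual data model.

```python
from dataclasses import dataclass, field


@dataclass
class SavedResource:
    """A resource selected through the search tool (hypothetical fields)."""
    identifier: str
    title: str
    notes: list = field(default_factory=list)


@dataclass
class Folder:
    """A personal folder that can optionally be shared with other users."""
    name: str
    owner: str
    resources: dict = field(default_factory=dict)
    shared_with: set = field(default_factory=set)

    def add(self, resource):
        # Store the resource under its identifier.
        self.resources[resource.identifier] = resource

    def annotate(self, identifier, note):
        # Attach a personal note to a resource already in the folder.
        self.resources[identifier].notes.append(note)

    def share(self, user):
        # Grant another registered user access to the folder.
        self.shared_with.add(user)

    def visible_to(self, user):
        # The owner always has access; others only after sharing.
        return user == self.owner or user in self.shared_with
```

For example, a teacher could create a folder, add a photograph, annotate it, and then call `share("student")`, after which `visible_to("student")` returns `True`.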
The implementation of these tools required the organisation of focus groups and test sessions (e.g., usability, intelligibility). These groups are composed of potential users of different ages, professional experiences, and levels of digital literacy, including researchers, professionals linked to tourism and the creative industries, museum technicians, and representatives of professional and educational associations. This process will help the ROSSIO Infrastructure address some of its main objectives and concerns, such as building intuitive, user-friendly tools, especially for those with less digital literacy or with disabilities (e.g., colour blindness; see Directive (EU) 2016/2102 of the European Parliament and of the Council of 26 October 2016: https://eur-lex.europa.eu/legal-content/EN/TXT/PDF/?uri=CELEX:32016L2102&from=EN accessed on 7 December 2021), and for those capable of encouraging and strengthening ICT policies in various institutions, such as schools. For example, the collaboration with tourist guide associations is already making a fundamental contribution to adjusting the search engine to the general audience.
Furthermore, the focus group meetings will help the team address some user challenges that had already been considered in the project’s initial stages, but became more evident in the pandemic context and the unprecedented digital migration that followed. The contacts that we have already established with professors, for example, drew attention to three issues: the general distrust, or even resistance, of some teachers towards online learning and the use of digital resources; the difficulty schools face in accessing digital content because suitable software and hardware are outdated or entirely absent; and the fact that, although present-day students are considered digital natives, they do not necessarily know how to use and manage effectively the tools and information at their disposal. These challenges, among many others, will be at the core of the work developed between ROSSIO’s team and the focus groups.
4. Final Remarks: Potentialities for Public Dissemination of Science Research
ROSSIO is committed to developing scientific and technical innovation and to respecting the international guidelines of open science (open access, open data, open source, open reproducible research, and open methodology) that have been promoted since 2014 by the European Commission and national stakeholders. It contributes to digital preservation and increases research reproducibility, scientific dissemination, and social inclusion, facilitating access to knowledge, education, lifelong learning, and community empowerment.
The specific context of the COVID pandemic reinforced the importance and urgency of making content widely and freely available as a fundamental tool to overcome lockdown constraints and an ever-growing infodemic, taking into account internationally defined methodologies such as the FAIR principles [27]. These processes will undoubtedly accelerate in the coming years; hence, platforms such as ROSSIO are important as quick and effective ways of providing access to fundamental cultural, scientific, and diplomatic resources related to SSAH that are otherwise dispersed across different Portuguese institutions.
The discovery portal will provide an accessible and adjustable research tool for the study of many relevant SSAH themes and disciplines, highlighting important sources of Portuguese history (e.g., Medieval Royal Chancelleries and the first Portuguese videos), as well as documents relevant to a global approach to the history of humanity (e.g., UNESCO Memory of the World Programme). The resources available can be applied in many ways. Although most people would probably expect these objects to be used in more traditional scientific outputs, such as articles, books, and physical exhibitions, they are also being prepared for innovative formats that aim for more effective public outreach. In addition to the digital exhibitions and collections, ROSSIO’s resources can be used to build a storytelling narrative or a Massive Open Online Course (MOOC).
In fact, the ROSSIO Infrastructure team is currently developing a MOOC with the UNESCO Chair “The ocean’s cultural heritage” (https://cham.fcsh.unl.pt/catedra/index.html accessed on 7 December 2021) for the NAU Platform (a service developed and managed by the FCCN Unit of the Foundation for Science and Technology that allows the creation of courses in MOOC format). The MOOC focuses on the history of whale hunting in Portugal (from the Middle Ages to the 20th century). It is intended for teachers of geography, history, and biology in middle and secondary education, as well as for professionals linked to whale watching and cultural heritage management. Regarding the latter, although whale watching is present in various parts of the country, this economic sector assumes greater importance in the Azores islands. For instance, in 2014, the total income generated by this activity exceeded 3 million euros, a figure that covers only the sale of whale-watching tickets (https://greensavers.sapo.pt/acores-observacao-de-cetaceos-vale-e3-milhoesano-so-em-bilhetes-com-video/ accessed on 7 December 2021). By attending this course and consulting its materials, the aforementioned professionals will be able to access innovative information that is validated and accredited by scientific institutions. This is an example of how the platform can enhance the quality and competitiveness of professionals while simultaneously supporting a central economic sector of a peripheral region in a post-pandemic context.
The exhibitions and digital collections aim to present these resources in an interesting and thoughtful way, always targeting a wider audience while simultaneously promoting the public outreach of scientific knowledge and the digital literacy of society. These outputs will also provide options for users to search for related resources on other platform services in order to clarify and further contextualise the topic at hand. Furthermore, ROSSIO can inspire tourism professionals and creative industries to create innovative, advanced, highly customisable, personalised, and competitive routes. These are important aspects if we consider, for example, the demanding requirements of customers in some sectors of tourism, such as cultural tourism, and the growing desire of tourists to live and share experiences of the local communities’ everyday habits and traditions [29]. Among the exhibitions and digital collections already developed, Sites of Memory(s): the Alcobaça, Batalha and Tomar complex is a good example of this.
Currently, visits and tourist routes to these monastic complexes, classified as World Heritage sites, focus mostly on the daily lives of monks and friars and the artistic styles (e.g., Gothic, Manueline, Mannerism, Baroque) that are present there. This collection will not only disclose new data to enrich these tours, but also allow the development of other routes that present these spaces as “sites of memories”, defined as spaces that were used, over time, to build, perpetuate, and celebrate a memory and an identity. Visitors can thus explore new information about these monasteries and also observe how the local and national communities, as well as foreign visitors, felt and experienced those historically relevant spaces through time.
ROSSIO is also committed to providing additional research tools, as is the case of the VRE, a web-based workspace that will allow users to manage their personal saved resources while enabling collaborative efforts. Aside from benefiting scientific research, the VRE also intends to target other communities from different areas of expertise, such as students and teachers, following the model of other infrastructures (e.g., Historiana). Hopefully, this interactive and dynamic space will bring different user groups (e.g., teachers, students, senior citizens) closer to these scientific and cultural institutions, stimulating hands-on initiatives [32]. As in other cases of virtual reality [34], the VRE can be a fundamental tool for improving learning capabilities by creating a working environment that is more familiar and better suited to the new digital generations. It will also help schools overcome the lack of technological resources and, thus, facilitate the digital transition within schools. If this objective is important in the general Portuguese context, especially in the interior and peripheral regions, it is even more important in the case of developing Portuguese-speaking countries. For example, using a mobile phone, a tablet, or a computer, a history teacher in Atsabe, near the Foho Tatamailau Mountains (Timor), will be able to present students with texts and photographs of the island produced during the Portuguese presence and before the Indonesian occupation, which are presently preserved in DGLAB. A good example is the interview conducted in 1965 with Guilherme Gonçalves, a resident of Atsabe, about the Japanese invasion during the Second World War (https://digitarq.arquivos.pt/details?id=8144642 accessed on 7 December 2021). The teacher and the students will be able to work on the photographs, taking notes and highlighting the elements that they consider important.
In the context of ROSSIO, developing controlled vocabularies should facilitate the semantic enrichment and normalisation of the metadata produced and aggregated in the platform. The ROSSIO vocabularies are published as linked open data, which not only follows the FAIR principles, but is also an important contribution to the Portuguese linguistic linked open data cloud, which has very few resources. For instance, LingHub, a directory of more than 100,000 linked data language resources, only provides access to 164 resources in the Portuguese language (data gathered from LingHub, http://linghub.org/, accessed on 30 July 2021). Additionally, deploying applications for the management and publication of SKOS vocabularies facilitates the collaborative development of controlled vocabularies within the ROSSIO consortium. The ROSSIO vocabulary services could function as a hub for information organisation in the SSAH and in the Portuguese language, promoting the construction and use of controlled vocabularies.
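As a rough illustration of what publishing a controlled vocabulary as SKOS linked open data involves, the Python sketch below serialises two concepts in Turtle syntax using only the standard library. The base URI, the concept identifiers, and the labels are placeholders invented for this example, not actual ROSSIO vocabulary entries; only the SKOS namespace and property names (`skos:Concept`, `skos:prefLabel`, `skos:altLabel`, `skos:broader`) come from the W3C standard.

```python
SKOS_PREFIX = "@prefix skos: <http://www.w3.org/2004/02/skos/core#> ."
BASE = "https://example.org/vocab/"  # placeholder, not a ROSSIO URI


def literal(text, lang):
    """Quote a label as a language-tagged Turtle literal."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"@{lang}'


def concept_to_turtle(slug, pref_labels, alt_labels=(), broader=None):
    """Serialise one concept; labels are (text, language) pairs."""
    lines = [f"<{BASE}{slug}> a skos:Concept"]
    for text, lang in pref_labels:
        lines.append(f"    skos:prefLabel {literal(text, lang)}")
    for text, lang in alt_labels:
        lines.append(f"    skos:altLabel {literal(text, lang)}")
    if broader is not None:
        lines.append(f"    skos:broader <{BASE}{broader}>")
    # Predicates of one subject are separated by ';' and closed by '.'.
    return " ;\n".join(lines) + " ."


def vocabulary_to_turtle(concepts):
    """Serialise a list of concept dictionaries as one Turtle document."""
    blocks = [SKOS_PREFIX] + [concept_to_turtle(**c) for c in concepts]
    return "\n\n".join(blocks)


# Invented concepts with Portuguese and English labels.
CONCEPTS = [
    {"slug": "artes-performativas",
     "pref_labels": [("Artes performativas", "pt"), ("Performing arts", "en")]},
    {"slug": "teatro",
     "pref_labels": [("Teatro", "pt"), ("Theatre", "en")],
     "broader": "artes-performativas"},
]
```

Calling `print(vocabulary_to_turtle(CONCEPTS))` emits a small Turtle document in which each concept carries language-tagged labels and the narrower concept points to its broader one. A production service would more plausibly rely on a dedicated RDF library or a vocabulary management application than on string formatting.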
We expect the ROSSIO platform, and the services provided, to contribute towards the implementation of best international practices within Portuguese scientific and cultural institutions. More specifically, ROSSIO promotes state-of-the-art procedures for digitally preserving documentation and for its subsequent connection, contextualisation, and dissemination to the general public. The knowledge and experience acquired throughout the development of the platform should also facilitate and encourage the future entry of additional content providers, including central and local Portuguese cultural institutions. In 2021, this was exemplified by the incorporation of the Diplomatic Institute of the Ministry of Foreign Affairs (Instituto Diplomático do Ministério dos Negócios Estrangeiros) as a content provider. Currently, ROSSIO is preparing to extend the consortium. One of the new partners will be the National Library of Portugal (Biblioteca Nacional de Portugal, BNP), whose arrival will allow the aggregation of more than 39,338 digital resources in the first phase. ROSSIO should also be an asset for small and local cultural institutions, such as regional and municipal historical archives, since many of these institutions lack the necessary technical skills and resources to carry out the semantic enrichment of their collections’ metadata on digital platforms.
Considering Tim Sherratt’s thoughts about platforms and their ability to “unlock the cultural heritage of a nation or a continent” [18], ROSSIO is a reference hub and service provider that fosters high-quality research, collaborative work, and knowledge dissemination. ROSSIO also promotes inclusion, allowing free and open access to Portuguese memory and cultural heritage, especially in Portuguese-speaking countries and the Portuguese diaspora.