1. Introduction
In the last few years, the locution “digital archive” has become widely used. The etymology of “archive” leads back to “arché”. Arkhe, we recall, names at once the commencement and the commandment. This name apparently coordinates two principles in one: the principle according to nature or history, there where things commence (a physical, historical, or ontological principle), but also the principle according to the law, there where men and gods command, there where authority and social order are exercised, in this place from which order is given (a nomological principle) [1].
The term refers to the philosophical meaning of “beginning”, but also to the meaning of “command”, as the document has an intrinsic probative value, unless proven otherwise.
The profession of the archivist requires continuous training: not only do archivists work among documents of a heterogeneous nature created by different producers, but they must also master the new information and communication technologies (ICT) and have a deep knowledge of the cultural, scientific, economic and institutional context in which the documents are produced and in which the archival profession develops.
«This professional figure needs strong specialisation and skills modelled on the specific areas of concrete application of his knowledge. Talking about the archival profession means alluding to a series of activities that sometimes have only the name of the object to which they apply in common. Regardless of any other consideration, in fact, both the freelance professional who often places reordering at the centre of their specificities, and the mediator archivist called upon to work in cultural institutions where problems of use and enhancement arise, exercise the profession of archivist. Archivists are also those who develop descriptive models, define standards, study systems of organisation and restitution of archival information» [2].
The Italian Code of Cultural Heritage provides for «activities aimed at promoting knowledge of cultural heritage» (D. Lgs. 22 Gennaio 2004, n. 42, art. 6, Valorizzazione del patrimonio culturale), which must be entrusted «to the responsibility and implementation, according to their respective skills, of archaeologists, archivists, librarians, demoethnoanthropologists, physical anthropologists, restorers of cultural heritage and collaborating restorers of cultural heritage, experts in diagnostics and in science and technology applied to cultural heritage, and art historians, in possession of adequate training and professional experience» (D. Lgs. 22 Gennaio 2004, n. 42, art. 9-bis, Professionisti competenti ad eseguire interventi sui beni culturali).
Archives are considered cultural heritage and must therefore be catalogued, where cataloguing means the registration, description and classification of all types of cultural heritage. The methodology is defined by the Ministry, the regions, local public bodies and universities through studies, research and scientific initiatives. Catalogue cards are an indispensable means of identifying and getting to know a cultural asset, as they collect, in coded form, all the information relating to it.
The adoption of standards shared by a scientific community not only simplifies routine operations such as cataloguing but also, in the IT field, ensures interoperability between different systems, so that they can communicate and share descriptive metadata. According to the ISO 15489 standard, metadata are structured data that describe the content, structure and context of documents and their management over time.
In Italy, the office that oversees the methodological coordination of cataloguing activities is the Central Institute for Catalogue and Documentation (ICCD) (http://www.iccd.beniculturali.it), which has created various cataloguing standards to describe effectively the various document types (works of art, photographs, coins, archaeological finds, naturalistic assets, demo-ethno-anthropological assets, etc.), as such heterogeneous assets require different descriptive fields.
Since 2000, the year of publication of the Italian edition, the ICCD has been responsible for the translation of the Iconclass iconographic classification system (http://www.iconclass.nl) to ensure its correct compilation according to the international and national standards defined by the RKD (Rijksbureau voor Kunsthistorische Documentatie/Netherlands Institute for Art History).
Iconclass is a classification system designed for art and iconography. It is the most widely accepted scientific tool for the description and retrieval of subjects represented in images (works of art, book illustrations, reproductions, photographs, etc.) and is used by museums and art institutions around the world. The Iconclass system is accessible through the Iconclass Browser and available as Linked Open Data (LOD).
Iconclass was developed by Henri van de Waal (1910–1972), a Professor of Art History at the University of Leiden. His ideas for a systematic overview of subjects, themes and motifs in Western art, which later became the Iconclass system, took shape in the early 1950s. The complete Iconclass system was finished in the years after 1972 by a large group of scholars and was published between 1973 and 1985 by the Royal Netherlands Academy of Arts and Sciences (KNAW), of which Van de Waal was a member. The publication was followed by the development of several computerised editions of Iconclass by the University of Utrecht in the years 1990 to 2001.
Numerous institutions across the world use Iconclass to describe and classify their collections in a standardised manner. In turn, users ranging from art historians to museum visitors use Iconclass to search and retrieve images from these collections. As a research tool, Iconclass is also used to identify the significance of entire scenes or individual elements represented within an image. Iconclass applications used around the world have made it the most widely accepted classification system for visual documents.
In October 2001, Iconclass management was transferred to the KNAW. The KNAW was actively involved in supporting Iconclass translation projects. The multilingual Iconclass Libertas Browser became freely accessible as the default online browser for Iconclass in November 2004.
In September 2006, Iconclass management was transferred to the RKD (Rijksbureau voor Kunsthistorische Documentatie/Netherlands Institute for Art History) in The Hague. In cooperation with the RKD, Etienne Posthumus and Hans Brandhorst developed the new Iconclass Browser, which was launched on 10 November 2009 (http://www.iconclass.nl/about-iconclass/history-of-iconclass).
The multilingual Iconclass Browser serves as a search tool that allows an indexer to find the concepts to tag an image. Needless to say, it can also be used to establish the correct meaning of a notation.
The Iconclass Browser contains English, German, French and Italian keywords and descriptions. Besides that, there are partial translations in Finnish and Norwegian and experimental translations in Chinese and Dutch (not yet online).
Iconclass is a subject-specific classification system. It is a hierarchically ordered collection of definitions of objects, people, events and abstract ideas that serve as the subject of an image. Art historians, researchers and curators use it to describe, classify and examine the subject of images represented in various media such as paintings, drawings and photographs.
The three main components of Iconclass are:
Classification System: 28,000 hierarchically ordered definitions divided into ten main divisions. Each definition consists of an alphanumeric classification code (notation) and the description of the iconographic subject (textual correlate). The definitions are used to index, catalogue and describe the subjects of images represented in works of art, reproductions, photographs and other sources;
Alphabetical Index: 14,000 keywords used for locating the notation and its textual correlate needed to describe and/or index an image;
Bibliography: 40,000 references to books and articles of iconographical interest (not yet online).
The unique elements of the Iconclass system are its alphanumeric classification codes, called notations. Notations always begin with one of the digits 0–9, corresponding with the ten main divisions of Iconclass.
The main divisions of the Iconclass system are represented by digits 0 to 9. Of these ten “main divisions”, the numbers 1 to 5 are “general” topics, designed to comprise all the principal aspects of what can be represented (1 Religion and Magic; 2 Nature; 3 Human being, Man in general; 4 Society, Civilisation, Culture; 5 Abstract Ideas and Concepts). Divisions 6 through 9 accommodate “special” topics, coherent subject matter of a narrative nature (6 History; 7 Bible; 8 Literature; 9 Classical Mythology and Ancient History). A tenth division, represented by the number zero, was added in 1996, at the request of Iconclass users, to accommodate abstract art.
Within each division of Iconclass, definitions are organised according to a logic of increasing specificity. A main division is divided further into a maximum of nine subdivisions by adding a second digit to the right of the first one. For example, Division 2 Nature is subdivided in the following way: 21 the four elements, and ether, the fifth element; 22 natural phenomena; 23 time; 24 the heavens (celestial bodies); 25 earth, the world as celestial body; 26 meteorological phenomena; 29 surrealia, surrealistic representations. The third level of specificity is attained by adding a letter in upper case and allows up to 25 subcategories. For example, 25 earth is subdivided in this way: 25A maps, atlases; 25B continents represented allegorically; 25C geological phenomena; 25D rock types; minerals and metals; soil types; 25E geological chronological division; historical geology; 25F animals; 25G plants; vegetation; 25H landscapes; 25I city view and landscape with manmade constructions; 25K landscapes in the non-temperate zone, exotic landscapes; 25L cities represented allegorically or symbolically; 25M the Seven Wonders of the World; 25N fictitious countries. At the next level, a further digit is added, so that 25F animals is in turn subdivided as follows: 25F1 groups of animals; 25F2 mammals; 25F3 birds; 25F4 reptiles; 25F5 amphibians; 25F6 fishes; 25F7 lower animals; 25F8 extinct animals; 25F9 misshapen animals; monsters.
Through Iconclass, it is possible to view and select the various items that make up the classification system. Keys can be used to indicate variants of an item: they are preceded by the plus sign “+” and placed between brackets. The colon “:” is used to separate multiple codes.
An example of this hierarchical structure can be taken from the Bible division. Consider Rembrandt van Rijn’s oil painting Bathsheba with King David’s Letter, which shows a moment from the Old Testament story in which King David sees Bathsheba bathing and, entranced, seduces and impregnates her. Every descent in the hierarchy takes place by extending the notation to the right with one more character: 7 Bible; 71 Old Testament; 71H story of David; 71H7 David and Bathsheba; 71H71 David sees Bathsheba bathing; 71H713 Bathsheba receives a letter from David.
The Iconclass coding of this painting is “7:71:71H:71H7:71H71:71H713”.
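The descent through the hierarchy described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `iconclass_path` is hypothetical, and the code assumes the plain notation pattern discussed here (digit, digit, upper-case letter, further digits), ignoring any bracketed part such as a key. It does not claim to cover every form of notation that the full Iconclass system allows.

```python
import re

# Only the plain pattern described in the text: digit, digit, upper-case letter, digits.
_NOTATION = re.compile(r"\d(\d([A-Z]\d*)?)?")


def iconclass_path(notation: str) -> list[str]:
    """List a notation's ancestors, from its main division down to itself."""
    base = notation.split("(", 1)[0].strip()  # drop any bracketed part, e.g. a key
    if not _NOTATION.fullmatch(base):
        raise ValueError(f"unsupported notation: {notation!r}")
    # Each level of the hierarchy is the notation truncated to one more character.
    return [base[:i] for i in range(1, len(base) + 1)]


path = iconclass_path("71H713")
print(":".join(path))  # 7:71:71H:71H7:71H71:71H713
```

Because each level simply adds one character on the right, the ancestors of a notation are its prefixes, which is why a single list comprehension is enough for this simplified case.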
Another useful tool for cataloguing artistic heritage is the set of Getty vocabularies, whose data are available in open format and in XML. The Getty Research Institute collaborates with other institutions, such as museums, libraries and archives, on the preparation of these vocabularies, which are useful for indexing and retrieving information relating to art and architecture. The complete list of institutions is available at https://www.getty.edu/research/tools/vocabularies/contributors.html.
The Getty vocabularies are resources containing structured terminology for art, architecture, decorative arts, archival materials, visual surrogates, conservation and bibliographic materials. Compliant with international standards, they provide authoritative information for cataloguers, researchers and data providers. The vocabularies grow through contributions. In the new linked, open environments, they provide a powerful conduit for research and discovery for digital art history.
The Getty vocabularies can be used in three ways: at the data entry stage, by cataloguers or indexers who are describing works of art, architecture, material culture, archival materials, visual surrogates or bibliographic materials; as knowledge bases, providing information for researchers; and as search assistants, to enhance end-user access to online resources. The Getty vocabularies are available in MARC format for easy mapping.
Five thesauri and a set of guidelines have currently been released. Thesauri are controlled vocabularies, that is, lists of normalised terms that identify themes, events, concepts, entities, companies, places and people. The standards used for drafting controlled vocabularies are ANSI/NISO Z39.19-2005, Guidelines for the Construction, Format and Management of Monolingual Controlled Vocabularies (Bethesda: National Information Standards Organization, 2005); ISO 25964-1:2011, Information and documentation: Thesauri and interoperability with other vocabularies: Part 1: Thesauri for information retrieval; and ISO 25964-2:2013, Information and documentation: Thesauri and interoperability with other vocabularies: Part 2: Interoperability with other vocabularies. The five thesauri are:
Art & Architecture Thesaurus (AAT) is a thesaurus containing generic terms, dates, relationships, sources and notes for work types, roles, materials, styles, cultures, techniques and other concepts related to art, architecture and other cultural heritage (amphora, oil paint, Renaissance, Buddhism, watercolours, etc.);
Getty Thesaurus of Geographic Names (TGN) focuses on places relevant to art, architecture, and related disciplines, recording names, relationships, place types, dates, notes and coordinates for current and historical cities, nations, empires, archaeological sites, lost settlements and physical features. TGN is a thesaurus that may be linked to GIS and maps (Thebes, Ottoman Empire, Ch’ien-fu-tung, Ganges River, etc.);
Cultural Objects Name Authority (CONA) compiles titles/names and other metadata for works of art, architecture and other cultural works, current and historical, documented as items or in groups, whether works are extant, destroyed or never built (Florentine Codex, Guernica, Girl with a Pearl Earring, Chayasomesvara Temple, Venus de Milo, etc.);
Union List of Artist Names (ULAN) contains names, relationships, notes, sources and biographical information for artists, architects, firms, studios, repositories, patrons, and other individuals and corporate bodies, both named and anonymous (Tiziano Vecellio, Altobelli and Molins, Rajaraja Museum, etc.);
Getty Iconography Authority (IA) is a thesaurus that covers topics relevant to art, architecture and related disciplines; it includes multilingual proper names, relationships and dates for iconographical narratives, religious or fictional characters, themes, historical events and named literary works and performing arts (Bouddha, Adoration of the Magi, French Revolution, Shiva, etc.).
A set of guidelines also exists for the description of art, architecture and other cultural works: the Categories for the Description of Works of Art (CDWA), which represent common practice and advise best practice for cataloguing, based on surveys and consensus building with the user community. CDWA is arranged in a framework to which existing art information data structures may be mapped and new data modelling may be referenced. CDWA is mapped to other standards (CONA, CIDOC CRM, LIDO, VRA CORE, MARC/AACR, MODS, DUBLIN CORE, DACS, EAD, CIMI, FDA).
The AAT, TGN and ULAN are now available as LOD, published under the Open Data Commons Attribution License (ODC-By) 1.0. In addition to LOD, they are currently released monthly in relational tables and XML formats; however, users are urged to transition to the LOD release formats, because the relational tables and XML formats will likely be discontinued in the future. CONA and IA are not yet available as LOD, but their data are currently available as XML through APIs.
A current trend in managing art information is to increasingly make data about art, architecture and other cultural heritage objects available as Linked Open Data (LOD). This applies to the metadata about the objects, their creators, patrons, associated places, style, work type and other terminology concerning their description, history, scholarly research and conservation.
When data are linked and open, it means that data are structured and published according to the principles of Linked Data, so that they can be both interlinked and made openly accessible and shareable on the Semantic Web. The goal of linked open data is to allow data from different resources to be interconnected and queried, thus making them more useful. Although the idea of linking data in an open way is not new, the widespread practice of doing so is relatively new, thus the protocols, standards and licensing options used for linked open data are still evolving.
In order for data to be understood and processed automatically by computers, data in records or about resources must be expressed in a standard format. Each thing (for example, a museum object, a place or a person) must be represented by a persistent identifier, known as a Uniform Resource Identifier (URI). The Resource Description Framework (RDF) is a language or format for describing things, and the relationships between things, as simple statements of subject, property and value (known as “triples”), in which things are represented by URIs. Among the most often-used formats for publishing art vocabularies are the Simple Knowledge Organisation System (SKOS) and the Web Ontology Language (OWL).
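As a minimal sketch of the triple model just described, the following Python fragment represents a few statements as (subject, property, value) tuples and prints them in N-Triples syntax. Several things here are assumptions made only for illustration: the work URI under example.org is hypothetical, the `http://vocab.getty.edu/aat/` URI pattern is assumed for AAT concepts, and the use of the Dublin Core `medium` property to link a work to its materials is one possible modelling choice, not one prescribed by the vocabularies. The AAT identifiers and labels are those that appear in the CONA record reproduced at the end of this section.

```python
from typing import NamedTuple

# Widely used property URIs (SKOS and Dublin Core Terms).
SKOS_PREF_LABEL = "http://www.w3.org/2004/02/skos/core#prefLabel"
DCT_MEDIUM = "http://purl.org/dc/terms/medium"

# Assumed URI pattern for AAT concepts; the work URI is purely hypothetical.
AAT = "http://vocab.getty.edu/aat/"
WORK = "http://example.org/work/700001956"


class Triple(NamedTuple):
    subject: str            # URI of the thing being described
    predicate: str          # URI of the property
    obj: str                # URI of another thing, or literal text
    literal: bool = False   # True when obj is plain text rather than a URI


def to_ntriples(t: Triple) -> str:
    """Serialise one triple as a line of N-Triples."""
    if t.literal:
        escaped = t.obj.replace("\\", "\\\\").replace('"', '\\"')
        obj = f'"{escaped}"'
    else:
        obj = f"<{t.obj}>"
    return f"<{t.subject}> <{t.predicate}> {obj} ."


triples = [
    Triple(WORK, DCT_MEDIUM, AAT + "300015050"),
    Triple(WORK, DCT_MEDIUM, AAT + "300014078"),
    Triple(AAT + "300015050", SKOS_PREF_LABEL, "oil paint", literal=True),
    Triple(AAT + "300014078", SKOS_PREF_LABEL, "canvas", literal=True),
]

for t in triples:
    print(to_ntriples(t))
```

The point of the sketch is that the work does not store the string “oil paint” itself; it points to a shared identifier, and the label is attached to that identifier once, where any other dataset can reuse it.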
If data are to be open to the community for linking and discovery, traditional licensing and copyright practices for images, art information and associated vocabularies and metadata must be adjusted. Data are considered open if the community is free to use, reuse and redistribute the data, subject either to no restriction or to only the requirements of attribution or share-alike. Among the licenses most often applied to art information are Open Data Commons (https://opendatacommons.org) and Creative Commons (https://creativecommons.org) licenses, each of which offers a range of levels of openness.
Linked documents on the web are connected by hypertext links that users traverse with a web browser, while the underlying data are made available in formats such as CSV or XML, or marked up as HTML tables. Linked Data on the web instead connect data from diverse domains such as people, books, scientific publications, films, genes, drugs and clinical trials, online communities, and statistical and scientific data. These links enable a new generation of search engines that follow the links between data sources and deliver more complete answers as new data sources appear; such applications operate on top of an unbound, global data space and use the web to create typed links between data from different sources. The standard user and application programming interface (API) of a relational database is the Structured Query Language (SQL), the standard means of manipulating and querying data in relational databases. A relational database is a set of formally described tables from which data can be accessed or reassembled in many different ways without having to reorganise the database tables. Each table (called a relation) contains one or more data categories in columns (called attributes); each row (called a record or tuple) contains a unique instance of data for the categories defined by the columns; and each table has a unique primary key, which identifies each record in the table.
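The relational model described above can be illustrated with a small sketch using Python’s built-in `sqlite3` module. The table and column names (`works`, `work_materials`, `creation_year` and so on) are hypothetical and chosen only for this example; the values are borrowed from the CONA record reproduced at the end of this section. It is not meant to reflect how the Getty relational tables are actually organised.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# One relation (table) per kind of thing; "id" is the primary key of works.
conn.execute(
    "CREATE TABLE works ("
    " id INTEGER PRIMARY KEY,"
    " title TEXT NOT NULL,"
    " creation_year INTEGER)"
)
conn.execute(
    "CREATE TABLE work_materials ("
    " work_id INTEGER REFERENCES works(id),"
    " material TEXT NOT NULL)"
)

# Each row (record, tuple) is one instance of the categories in the columns.
conn.execute(
    "INSERT INTO works VALUES (?, ?, ?)",
    (700001956, "Rembrandt’s son in a monk’s habit", 1660),
)
conn.executemany(
    "INSERT INTO work_materials VALUES (?, ?)",
    [(700001956, "oil paint"), (700001956, "canvas")],
)

# SQL reassembles data from both tables without reorganising either of them.
rows = conn.execute(
    "SELECT w.title, w.creation_year, m.material"
    " FROM works AS w JOIN work_materials AS m ON m.work_id = w.id"
    " WHERE w.id = ? ORDER BY m.material",
    (700001956,),
).fetchall()

for row in rows:
    print(row)
```

Unlike the typed links of Linked Data, the connection between the two tables here exists only inside this one database: the key 700001956 means nothing to an outside system unless it is published as a shared identifier.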
Three major projects comprise the work completed to date on the Getty vocabularies. First, the Vocabulary Coordination System (VCS) project created a single production system that replaced three separate, outdated and disparate data collection and editorial systems that had been used to produce the three vocabularies. The new, more powerful production engine allows Getty staff to efficiently collect, analyze, edit, merge and distribute terminology from Getty departments, as well as from external collaborating institutions. Second, the Vocabularies on the Web project produced unified Web-based access to the three Getty vocabularies and made them available to hundreds of thousands of researchers, scholars and members of the general public who are interested in the subject areas covered by the vocabularies. This project also enhanced security to protect the Getty’s intellectual property, and added measurement metrics to allow the Getty to gauge the usage volume, usage patterns and the success of these efforts. Finally, the Vocabulary Contributions project created processes and procedures for making use of and contributing to the vocabulary databases, an integral part of the work in all relevant Getty and external projects.
Example of a CONA record.
ID: 700001956
Record Type: Movable Work
Images: 1
Rembrandts zoon Titus in monniksdracht, mogelijk voorgesteld als de heilige Franciscus van Assisi (easel painting (painting by form); Rembrandt Harmensz. van Rijn (Leiden 1606-07-15—Amsterdam ...; 1660; Rijksmuseum (Amsterdam, North Holland, Netherlands); SK-A-3138; RM001.collect.5227)
Titles:
Rembrandts zoon Titus in monniksdracht, mogelijk voorgesteld als de heilige Franciscus van Assisi (preferred,C,U,RP,Dutch,U,U)
Rembrandt’s son in a monk’s habit, possibly in the guise of St Francis of Assisi (C,U,DE,English,U,U)
Catalog Level: item
Work Types:
easel painting (painting by form) [300177435] (preferred)
.....(Objects Facet, Visual and Verbal Communication (hierarchy name), Visual Works (hierarchy name), visual works (works), <visual works by material or technique>, paintings (visual works), <paintings by form>)
Classifications:
paintings (preferred)
Creation Date: 1660
Creator Display:
Rembrandt Harmensz. van Rijn (Leiden 1606-07-15—Amsterdam 1669-10-08) [preferred,VP]
painter Rembrandt, Harmensz van Rijn (Dutch painter, printmaker, 1606-1669) [500011051]
Locations:
Current: Rijksmuseum (Amsterdam, North Holland, Netherlands) [500246547] Corporate Bodies (Corp. Body)
Repository Numbers: SK-A-3138; RM001.collect.5227
Credit Line: Aankoop met steun van de Vereniging Rembrandt
Display Materials: oil paint on canvas
oil paint (paint) [300015050]
(Materials Facet, Materials (hierarchy name), materials (substances), <materials by function>, coating (material), <coating by form>, paint (coating), <paint by composition or origin>)
canvas (textile material) [300014078]
(Materials Facet, Materials (hierarchy name), materials (substances), <materials by form>, <materials by physical form>, <fibre and fibre by product>, <fibre by product>, textile materials, <textile materials by process or technique>)
Dimensions: 79.5 × 67.7 cm; 94.1 × 81.9 × 7.2 cm (framed)
Events:
exhibition: Kie(k) naw, Helmond gefotografeerd; 2009-02-01–2009-05-31
exhibition: Chinese verleiding. Chinese exportkunst van de zestiende tot de negentiende eeuw; 2009-11-19–2010-04-25
exhibition: U vraagt, wij draaien. Chinees porselein rond de wereld; 2009-11-29–2010-03-21
exhibition: French Artists and Laymen in Eighteenth-century Rome; 2011-05-27–2011-08-21
exhibition: Rembrandt and the Golden Age of Dutch Art: Treasures from the Rijksmuseum—II—C; 2007-01-27–2007-05-06
exhibition: Saladin und die Kreuzfahrer—II; 2006-07-23–2006-11-05
exhibition: Shunga—Japanse erotische prenten; 2005-01-22–2005-04-10
exhibition: Het Rijksmuseum aan de Merwede—B; 2004-05-20–2007-05-28
exhibition: Schatzkammer Polen. Sammellust und Sammelwesen im alten Polen; 2002-12-01–2003-03-02
General Subject:
portraits (preferred)
human figures
Specific Subjects:
Ryn, Titus Rembrandtsz. van (Dutch painter and sitter, 1641-1668) [500008252]
.....(Persons, Artists) (ULAN)
habit [300224226]
.....(Objects Facet, Furnishings and Equipment (hierarchy name), Costume (hierarchy name), costume (mode of fashion), <costume by function>) (AAT)
Copyright: Public Domain
Provenance: aankoop 1933-07-10
Inscriptions: signatuur en datum Rembrandt f. 1660
List/Hierarchical Position:
..... Movable Works
.......... Movable Works by class: drawings, paintings, prints, other two-dimensional media
Additional Notes:
Dutch ..... Portret van Rembrandts zoon Titus in monniksdracht, mogelijk voorgesteld als de heilige Franciscus van Assisi.
Zittend, ten halven lijve, naar links.
Sources and Contributors:
Rembrandts zoon Titus in monniksdracht, mogelijk voorgesteld als de heilige Franciscus van Assisi ........ [VP,Rijksmuseum]
Rijksmuseum XML files (2012)
Rembrandt’s son in a monk’s habit, possibly in the guise of St Francis of Assisi ........ [VP]
Rijksmuseum XML files (2012)
Subject: ....... [VP,Rijksmuseum]
....................... Rijksmuseum [online] (2000-)
....................... Rijksmuseum XML files (2012)
Note:
Dutch..... [VP]