4.1. The HIRMEOS Project
HIRMEOS (High Integration of Research Monographs in the European Open Science infrastructure) is an infrastructural project funded in the framework of Horizon 2020 to support open access (OA) monographs by enhancing five publishing platforms: OpenEdition Books (France), OAPEN (Netherlands), ΕΚΤ Open Book Press (Greece), Universitätsverlag Göttingen (Germany), and Ubiquity Press (UK).
OpenEdition Books (France) is the OpenEdition platform dedicated to open access books. OpenEdition Books is run by the OpenEdition Center, the French national infrastructure supported by CNRS, Aix-Marseille University, EHESS (École des hautes études en sciences sociales) and Avignon University. It currently distributes more than 6000 books from 87 publishers. OpenEdition works with Lodel, an open source software developed by the OpenEdition Center and disseminates open access books under different models, including the freemium model.
The OAPEN Library (Netherlands) is managed by the OAPEN Foundation and, like OpenEdition, aims to provide a highly qualified and certified collection of books. The platform currently presents more than 5000 books from more than 150 publishers. OAPEN also offers publishing houses, libraries, and research funding institutions services in the fields of digital preservation and long-term archiving, quality certification, and dissemination. The OAPEN Library works with XTF, an open source platform developed by the California Digital Library (CDL).
ΕΚΤ Open Book Press (Greece), financed with its own and structural funds, is the service provider for electronic publishing for the Greek National Documentation Center. EKT offers advanced e-Infrastructures and services for institutional partners (universities, research centers, scientific societies, and memory institutions), in particular, to enable the OA publication of peer-reviewed journals, conference proceedings, and monographs in the SSH. EKT works with Open Monograph Press (OMP), an Open Source software developed by the Public Knowledge Project (PKP) to organize peer review and editorial processes. OMP can also operate as a website.
The Universitätsverlag Göttingen (Germany) is the dedicated publishing house of the Georg-August-Universität Göttingen and is part of the group Electronic Publishing, in which several services and projects of the Niedersächsische und Universitätsbibliothek Göttingen are operated. These include projects on publishing and Open Science, advisory services, Open Science Campus activities and various repositories. The university publishing house is managed by an editorial board from the university, which consists of members of all 13 faculties, ensuring the quality of the publications. The university press publishes about 60 books per year, mainly from the SSH, which are also distributed through print on demand.
Ubiquity Press (UK): Ubiquity Press is an open access publisher of peer reviewed journals, academic books, and data. Ubiquity provides its own platform and various services. Ubiquity works with the Book Management System RUA, an Open Source application developed by Ubiquity to assist with the monograph publishing life cycle, from submission to both internal and peer review, from copy editing to production and publication.
In order to simplify the integration of monographs in the universe of Open Science, HIRMEOS operates as a distributed system in which the homogeneity of the platforms is not achieved simply by using a single software for all of them, but by the adoption of common standards. The different, independent publishing platforms participating in HIRMEOS are willing to use the same metadata as a result of the project and to implement common services accordingly. Thus, the publication system remains open to the future participation of other platforms; this participation is to be simplified by an implementation guide created during the course of the project.
4.2. HIRMEOS Data and Services Providers
Encouraging the annotation of Open Access monographs in order to foster conversation between scholars, between teachers and students, and between the academic community and a non-academic audience is one of the aims of the HIRMEOS project. HIRMEOS provides the five publishing platforms with the same standards for various services and tools intended to make the use of Open Access monographs in the SSH more attractive to readers. These services are intended to make Open Access monographs easier to identify and discover, as well as to certify their quality, measure their impact, and intensify their usage through annotations.
By adding these different services and tools, the HIRMEOS project aims to facilitate various activities related to the use of digital monographs, of which annotating is one of the most important. In fact, interacting with a printed book, by underlining portions of the text or writing comments in the margins, is a key way to grasp and process the content of a book and to fix ideas inspired by reading.
The centrality of these practices and the long tradition behind them make it clear why many students and researchers, although able to work with digital monographs on electronic devices, still prefer printed books. Digital monographs will therefore increase in popularity only when they provide the key functionality that users enjoy in print. However, the fact that digital monographs should be annotatable does not mean that the ultimate goal of a digital annotation tool is merely to replicate the dynamics of print annotation in the digital environment. The adoption of PDF as the main format for digital documents is an example of such an attempt to emulate print. At the beginning of the digital age, this strategy made reading user-friendly and thus encouraged interaction with digital publications, but in the long run it can be an obstacle to realizing the full potential of digital technologies. One should consider whether the popularity of PDF has slowed the acceptance, in some disciplines, of other formats, such as machine-readable ones, that enable the use of the many tools developed for the digital humanities.
4.2.1. Metadata for the Identification of Books and Authors
As part of the project, a workflow was established to add the following metadata to each monograph published on the participating platforms:
All documents published on the platforms are identified by Crossref DOIs. Digital object identifier (DOI) technology enables usable, interoperable, and persistent identification of digital objects. DOI technology uses an identification syntax and a network resolution mechanism (Handle System®), as well as a stable and sustainable infrastructure.
If the authors have an ORCID ID (Open Researcher and Contributor ID), the platforms involved in the project display it next to the author name. ORCID is a non-proprietary alphanumeric code for the unique identification of authors. It addresses the problem that the contributions of a given author can be difficult to recognize, since most personal names are not unique, can change (e.g., through marriage), differ across cultures in the order in which they are displayed, are abbreviated inconsistently, and are written in different writing systems. The ORCID organization offers an open and independent registry, which is already the de facto standard for the identification of authors of scientific publications.
Through FundRef Data, it will be possible to identify the funding institution and the research project behind a specific publication. Publishers can provide financing information for articles and other content using a standard taxonomy of the sponsor’s name. A taxonomy of standardized names of funding agencies is offered by the Open Funder Registry, and associated funding data is then made available via Crossref search interfaces and APIs for sponsors and other interested parties.
All in all, aligning the platforms with these standards should significantly improve the findability of Open Access monographs, which is not always optimal today.
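The identifier conventions described above can be illustrated with a minimal sketch. It assumes a commonly used heuristic pattern for DOIs (not the full DOI syntax) and uses the ISO 7064 MOD 11-2 check digit that ORCID identifiers carry. The function names and the example DOI `10.1234/example.book` are hypothetical; the sketch is not the workflow used by the HIRMEOS platforms.

```python
import re

# Heuristic pattern that matches most modern DOIs; an assumption, not the full syntax.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")
DOI_PREFIXES = ("https://doi.org/", "http://doi.org/",
                "https://dx.doi.org/", "http://dx.doi.org/", "doi:")


def normalize_doi(raw: str) -> str:
    """Strip resolver prefixes and lowercase a DOI (DOI names are case-insensitive)."""
    doi = raw.strip()
    for prefix in DOI_PREFIXES:
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    doi = doi.lower()
    if not DOI_PATTERN.match(doi):
        raise ValueError(f"not recognized as a DOI: {raw!r}")
    return doi


def doi_url(raw: str) -> str:
    """Build the resolver link that a platform could display for a monograph."""
    return "https://doi.org/" + normalize_doi(raw)


def orcid_check_digit(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Check the shape and the check digit of an ORCID such as 0000-0002-1825-0097."""
    compact = orcid.replace("-", "")
    if not re.fullmatch(r"\d{15}[\dX]", compact):
        return False
    return orcid_check_digit(compact[:15]) == compact[15]
```

A platform could run such checks before displaying an identifier next to a book or an author name, so that malformed values are caught at ingestion rather than surfacing as broken links.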
4.2.2. Entity Recognition Tool for Discoverability and Enrichment of Texts
Entity extraction and resolution is the task of determining the identity of entities mentioned in a text against a knowledge base representing the reality of the domain under consideration. This could be the recognition of generic Named Entities suitable in general purpose subjects, like people, location, organizations, and so on, but it could also involve the resolution of specialist entities in different domains.
To increase the discoverability and usability of OA monographs, HIRMEOS experimented with the integration of a Named Entity Recognition and Disambiguation (NERD) service for visualizing key words and disambiguating concepts in the full text. The software powering this service, called entity-fishing, was initially developed by Inria in the context of the EU FP7 project CENDARI4 and provides automatic entity recognition and disambiguation using Wikipedia and Wikidata data sets. The application is distributed under an open-source license, and it has been deployed as a web service in DARIAH’s infrastructure hosted by the French HumaNum.5 The interface implements a variety of functionalities, such as language recognition, sentence segmentation, and modules for accessing and looking up concepts in the knowledge base. One of the most widely implemented use cases was the improvement of the platform’s search interface using entities extracted from the library content. This was done by extracting specific Named Entities and then enabling users to filter their searches with these complementary parameters.
For example, OpenEdition extended its Books Catalogue6 by adding two facets to filter books by entities of the type PERSON and LOCATION. Along the same lines, Goettingen State Library also extracted mentions of organizations from the library corpus in its book catalog.7 Once the annotation data is collected and stored, it can also be used for the automatic generation of word clouds at the repository level, with words displayed by their importance (relevance, frequency, etc.).8 Users are then able to access the most important concepts at library level. The underlying data is effectively the same as that used for facet searching, but with different visualization options.
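The reuse of the same entity data for facets and for word clouds can be sketched as follows. The record layout (`book`, `name`, `type`), the sample values, and the function names are assumptions made for illustration; they do not reproduce the output format of entity-fishing or the catalogue code of any platform.

```python
from collections import Counter, defaultdict

# Hypothetical entity mentions, as they might be stored after extraction.
annotations = [
    {"book": "book-1", "name": "Immanuel Kant", "type": "PERSON"},
    {"book": "book-2", "name": "Immanuel Kant", "type": "PERSON"},
    {"book": "book-1", "name": "Göttingen", "type": "LOCATION"},
    {"book": "book-3", "name": "CNRS", "type": "ORGANIZATION"},
]


def build_facets(records, facet_types=("PERSON", "LOCATION")):
    """Index books by entity type and entity name for faceted search."""
    facets = defaultdict(lambda: defaultdict(set))
    for record in records:
        if record["type"] in facet_types:
            facets[record["type"]][record["name"]].add(record["book"])
    return facets


def filter_books(facets, entity_type, name):
    """Return the books that mention a given entity, sorted for stable display."""
    return sorted(facets.get(entity_type, {}).get(name, set()))


def cloud_weights(records, top=10):
    """Rank entity names by frequency, e.g. to size the words of a word cloud."""
    return Counter(record["name"] for record in records).most_common(top)


facets = build_facets(annotations)
kant_books = filter_books(facets, "PERSON", "Immanuel Kant")
weights = cloud_weights(annotations)
```

Both views are derived from the same list of mentions: the facet index groups books per entity, while the word cloud only needs the frequency of each entity name.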
4.2.3. Certification of Scientific Quality
The majority of scientific monographs undergo intensive quality assurance and evaluation procedures, which are, however, less standardized in the SSH than in other disciplines. Regardless of which review procedure would be optimal for monographs, HIRMEOS is developing a certification system that categorizes and standardizes review procedures. In this way, users can immediately recognize which review procedure a publication has undergone. A peer review certificate and an OA license certificate will be added to each document published on the five platforms.
The peer review and OA license certificates are delivered to the various partners of the Directory of Open Access Books (DOAB). DOAB offers a quality-controlled list of peer-reviewed Open Access monographs (including book chapters) and book publishers. By developing this quality-controlled list, DOAB enables researchers, libraries, and discovery services to easily identify and search peer-reviewed Open Access monographs, improving the discoverability, access, and use of monographs around the world. After an application process, publishers that meet the DOAB peer review and Open Access requirements and have the corresponding licenses are listed in DOAB and can then upload the metadata of their Open Access books. Such metadata can then be disseminated via OAI-PMH (Open Archives Initiative Protocol for Metadata Harvesting), which is implemented by third-party providers such as libraries and search services, thus improving the findability of the books. The DOAB certification service includes a classification system and also allows certified publishers and publishing platforms to obtain DOAB certification and icons through the DOAB API. Certified publishers and publishing platforms that meet DOAB’s requirements agree to the conditions of DOAB certification and commit to pass an audit to verify their peer review procedures.
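As an illustration of how a harvester addresses an OAI-PMH endpoint, the following minimal sketch builds `ListRecords` request URLs. The verb and the arguments `metadataPrefix`, `set`, `from`, and `resumptionToken` are defined by the protocol; the base URL `https://repository.example.org/oai` and the function names are placeholders and do not refer to the DOAB endpoint.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None, from_date=None):
    """Build the first ListRecords request for a harvest."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    if from_date:
        params["from"] = from_date  # selective harvesting by datestamp
    return f"{base_url}?{urlencode(params)}"


def resume_url(base_url, token):
    """Build a follow-up request; resumptionToken excludes all other arguments."""
    return f"{base_url}?{urlencode({'verb': 'ListRecords', 'resumptionToken': token})}"


# Placeholder endpoint, used only to show the shape of the request.
first_request = list_records_url("https://repository.example.org/oai", from_date="2019-01-01")
```

A harvester would fetch the first URL, parse the XML response, and keep requesting `resume_url(...)` for as long as the response contains a non-empty resumption token.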
4.2.4. Metadata for Metrics and Legacy Metrics
The measurement of impact and resonance presents specific challenges for Open Access monographs. Keeping track of current downloads, readership, and reuse across multiple hosting platforms is difficult, but important if one wants to understand and track the reach of Open Access monographs. The use of alternative metrics (Altmetrics), which measure the number of mentions of a document in social networks and other kinds of publications, has also increased significantly in recent years because it helps to better understand the impact of scientific publications by documenting the resonance of scientific content in broader communities and beyond the specific academic context.
HIRMEOS partners have enabled their platforms to collect usage metrics and alternative metrics and to display these directly on the documents. The altmetrics record the following measures for books: tweets, Facebook shares, Wikipedia quotes, and annotations. The service is designed to operate on a daily basis; it can therefore also make this data available in chronological order. Since the data sources often change their access methods, licensing and conditions, this is a maintenance-intensive system.
The metrics service extracts usage data from the platform in question and uses the identifiers database to enforce the use of a particular type of identifier, such as a DOI. The service maps the identifiers used by the reporting platform to the one desired by the data collector. The collected data is stored in the form of events, each recording the measure collected, the timestamp, the identifier of the work concerned, and the number of times the event was repeated (e.g., four downloads of a given book on a given platform). The standardization process not only normalizes identifiers; it also tags each event with a URI (Uniform Resource Identifier) identifying the measurement it represents (e.g., views) and the platform reporting the event (e.g., Google Books), and provides a link to the definition of the measure to give a user-friendly description.
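The event model described above can be sketched as a small data structure. Everything named here is an assumption made for illustration: the identifier table, the URI base `https://metrics.example.org/measures`, the example DOI, and the field names do not reproduce the schema of the HIRMEOS metrics service.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Hypothetical identifier table: platform-specific identifiers mapped to one DOI.
IDENTIFIERS = {
    "https://books.example.org/123": "10.1234/example.book",
    "https://library.example.net/record/abc": "10.1234/example.book",
}

# Hypothetical base for measure URIs; each URI names the platform and the measure.
MEASURE_BASE = "https://metrics.example.org/measures"


@dataclass(frozen=True)
class Event:
    measure_uri: str     # what was measured and by which platform
    timestamp: datetime  # when the measure was collected
    work_id: str         # normalized identifier of the work
    value: int           # how many times the event occurred


def normalize_event(platform, measure, raw_id, timestamp, value):
    """Map a platform identifier to a DOI and tag the event with a measure URI."""
    doi = IDENTIFIERS.get(raw_id)
    if doi is None:
        raise KeyError(f"no DOI registered for {raw_id!r}")
    return Event(
        measure_uri=f"{MEASURE_BASE}/{platform}/{measure}",
        timestamp=timestamp,
        work_id=f"info:doi:{doi}",
        value=value,
    )


# Four downloads of one book reported by one platform on a given day.
event = normalize_event(
    "example-platform", "downloads", "https://books.example.org/123",
    datetime(2019, 3, 1, tzinfo=timezone.utc), 4,
)
```

Because every event carries the normalized work identifier, events reported by different platforms for the same book can later be summed or plotted in chronological order.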
4.3. Annotating Monographs in the Framework of HIRMEOS
Aiming to create the best conditions for digital annotation, the HIRMEOS project has introduced technical implementations and given specific support to the academic community. During the project, an annotation layer was implemented directly on the publication platforms involved, so that users are not required to install the Hypothes.is plug-in in their browser when reading the monographs published on the platform. In addition, some of the platforms provided a small space on their homepage to inform users about the tool and the possibility of setting up working groups. Concerning support for the academic community and the stimulation of annotation experiments, two initiatives should be mentioned in more detail: the support offered by the Göttingen University Library to annotation activities in a philosophy seminar at the University of Göttingen, and the post-publication peer review experiment coordinated by OpenEdition Books.
4.3.1. Annotating in A Seminar: Toward an Interactive Seminar Reader for Philosophy Students?
In the following we would like to describe the structure and the essential elements of the usage scenario that will be tested in the summer semester 2019 in the context of a philosophy seminar at the University of Göttingen with the support of the HIRMEOS process. The results of this experiment will be discussed in a later publication.
It is a so-called “Hauptseminar,” i.e., a seminar for MA students and for BA students in the final phase of their studies. The course is open to a maximum of 20 participants interested in the history of philosophy and practical philosophy, and it deals with Immanuel Kant’s conception of virtue. As usual in philosophical seminars, the course focuses on exercises in interpreting various texts under the guidance of the lecturers. In this case, passages from Kant’s Metaphysische Anfangsgründe der Tugendlehre (1797) will be discussed.
The lecturers will use the Hypothesis tool to enable the preparation and follow-up of the individual seminar sessions, which each revolve around a specific text. In particular, the tool will be used to perform different tasks. During the preparatory reading, the tool will enable the creation of notes on comprehension problems or possible interpretation paths, and these notes will be directly accessible to the lecturers and all other participants. These notes should then serve as a basis for discussions in individual seminar sessions and can then be revised or supplemented as a follow-up with regard to the results achieved.
The lecturers have the following expectations:
Easy access and quick control/response options should significantly increase the scope and quality of the preparation and follow-up of the individual sessions by the students.
This may have a positive influence on the deeper understanding of the text as well as on the discussion in the seminar (especially in the case of the present text, which is relatively unstructured and whose interpretation requires such references all the more).
An efficient way for lecturers to communicate with students on factual issues both within and outside the seminar itself.
Better communication and cooperative work between the students, which in particular also involves those in the discourse who are otherwise rather reserved in the seminar sessions. (Ideally in the long run this will also lead to their becoming more active in the sessions themselves.)
Streamlined management of the ‘small work’ to be done by the students (ungraded preliminary work, which should have a total volume of 2–4 pages depending on the module).
The resulting corpus of annotations can ultimately also serve as documentation of the work in the seminar (and thus replace seminar protocols, for example) and at the same time provide students with a good basis for preparing their examination performance (term paper or oral examination).
A particularly important aspect of this activity will be that the teachers will create three different classes of tags to describe the realized annotations:
1. Tags to describe the annotation type:
?: Understanding Question (What exactly is incomprehensible and why?)
!: Important text passage (To what extent and why is it a central statement?)
I: Interpretation required (How can this be understood? Are there different readings? If necessary, what speaks for or against the reading(s)?)
K: Commentary (critical examination of a concrete statement: Is the statement factually (im)plausible and why?)
2. Tags to describe the function of the annotated text passage:
3. Tags to categorize the annotation content:
This usage scenario not only points the way to a new form of learning in philosophy, but also to new forms of publication. Even if not yet planned, it would be very useful to publish the texts discussed in the seminar in a digital seminar reader, which would also contain a selection of the annotations realized in the course of the seminar. The benefits of such a publication are obvious: first, the material would be reusable for new seminars, and students would be able to find questions and answers that are likely to recur. Second, such digital editions of the seminar readers could be constantly updated, also taking into account the new annotations generated in later seminars.
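Selecting annotations for such a seminar reader could start from the annotation-type tags introduced above. The following minimal sketch groups annotations by those four tags. The annotation layout (a `text` field and a `tags` list) and the sample entries are assumptions made for illustration, not data from the seminar.

```python
# The four annotation-type tags defined by the lecturers.
TYPE_TAGS = {
    "?": "Understanding question",
    "!": "Important text passage",
    "I": "Interpretation required",
    "K": "Commentary",
}


def group_by_type(annotations):
    """Collect annotation texts under each type tag; other tags are ignored."""
    grouped = {tag: [] for tag in TYPE_TAGS}
    for annotation in annotations:
        for tag in annotation.get("tags", []):
            if tag in grouped:
                grouped[tag].append(annotation["text"])
    return grouped


# Hypothetical sample annotations.
sample = [
    {"text": "What does 'duty of virtue' mean here?", "tags": ["?"]},
    {"text": "Central definition of the section.", "tags": ["!", "definition"]},
    {"text": "Two readings seem possible.", "tags": ["I"]},
]
grouped = group_by_type(sample)
```

A reader editor could then, for example, publish only the entries under `!` and `K`, while the open questions under `?` remain material for the next seminar.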
4.3.2. Open Peer Review Experiment by OpenEdition9
In OpenEdition, the implementation of the annotation tool was part of an open post-publication peer review experiment. The objective of this experiment is both to create a space for scientific conversation around publications and to stimulate new forms of peer review. Through the annotation tool, users can discuss theses and arguments presented in the scholarly books and also enrich them by providing new ideas. Developed as a form of post-publication peer review, the experiment will play a strategic role in the framework of the HIRMEOS project, which provides (see Section 4.2.3) a certification system to describe different kinds of peer review.
In this way, we will explore the value of opening new peer review processes and extracting the ongoing discussions.
The project focuses on 13 books from four publishers10 opened for annotation from February to June 2019. Since February, readers have been invited, either through general campaigns (social networks, mailing lists, blogs) or by direct individual contact, to read and annotate the selected books. Annotators are supported in learning to use Hypothes.is and in understanding the annotation process and its objectives. Annotations made by readers and authors will ultimately be studied and described in the overall results of the experiment, which will be the subject of a report.
The following aspects deserve particular attention:
Community outreach activities and clear guidelines are essential. The launch of this experiment was preceded by an important preparatory phase aimed at defining the technical framework of the annotation activities. First, community outreach activities involved publishers and authors, who worked together with the staff of OpenEdition to give visibility to the experiment and invite other potential commentators. Second, in order to provide users with the best possible support, OpenEdition provided documentation and a user guide for Hypothes.is.11 We also established rules of good conduct to regulate annotations.12 These rules are broad enough to allow considerable freedom of use for annotators, but restrictive enough to protect authors from malicious or inappropriate comments.
Creation of publisher groups. For the open peer review process, OpenEdition needed to create a specific annotation group for each publisher of the monographs to be annotated. To make this possible, Hypothesis enabled the creation of publisher-branded and moderated annotation groups: publisher groups. This presents two main advantages: (a) readers and annotators can activate different layers depending on the read-write experience they want to have; (b) every publisher maintains the ability to moderate the annotations made as part of this experiment.
Fast and reliable notification of authors and commentators. One of the main features that motivated OpenEdition to use the Hypothesis tool was the reply feature. When a few annotations are made, authors are notified by the publishing secretary in charge of monitoring the project and encouraged to respond to their readers’ annotations if they deem it appropriate. This gives readers the opportunity to react directly to the annotations of other annotators, and thus to achieve one of the objectives of this experiment, i.e., to create a real conversation and provide feedback to authors.
4.3.3. Checking the Quality of HTML Content at Ubiquity Press
Ubiquity Press used Hypothes.is in the quality assurance for the transfer of published content that was moving to their platform.13 Although the example presented refers to a journal, it is very likely that similar best practices will also be applied to monographs, which are also published on the Ubiquity platform. Transferring HTML versions of content can be challenging: because the environment of the HTML file changes, the HTML may not display as expected. The transfer can also be an opportunity to improve old HTML (for example, to make it reflowable for small screens or to add accessibility features). An annotation group was set up with the Editor-in-Chief and their team of volunteer students, and the transferred HTML content was quality checked. Where transfer problems were identified, in-situ comments described the problem in context and could be replied to for clarification. Hashtags were used for recurring problems (CSS (Cascading Style Sheets) incompatibility issues, Unicode encoding issues, etc.), so that frequent problems could be solved as a batch or with a global fix. Ubiquity Press will consider using Hypothes.is for other complex transfer projects in future, as it helped track and manage the issues and worked as a great communications tool.