Enabling A Conversation Across Scholarly Monographs through Open Annotation

Andrea C. Bertino; Heather Staines

doi:10.3390/publications7020041

Abstract

The digital format opens up new possibilities for interaction with monographic publications. In particular, annotation tools make it possible to broaden the discussion on the content of a book, to suggest new ideas, to report errors or inaccuracies, and to conduct open peer reviews. However, this requires the support of the users who might not yet be familiar with the annotation of digital documents. This paper will give concrete examples and recommendations for exploiting the potential of annotation in academic research and teaching. After presenting the annotation tool of Hypothesis, the article focuses on its use in the context of HIRMEOS (High Integration of Research Monographs in the European Open Science Infrastructure), a project aimed to improve the Open Access digital monograph. The general line and the aims of a post-peer review experiment with the annotation tool, as well as its usage in didactic activities concerning monographic publications are presented and proposed as potential best practices for similar annotation activities.

Keywords:

open annotation; monographs; open access; higher education; open peer review

1. Introduction: From Print to Digital Annotation

Annotation has roots that stretch back far into the history of the written word itself, as scholars and readers added notes and interpretations to the Talmud and other texts more than a thousand years ago. Tracing conversations between scholars makes it possible to map a conversation over time. Ultimately, classical Jewish literature is one that is steeped in annotation and reference. It is the quintessential network [1]. From the efforts of medieval scribes to the earliest days of the printing press, annotations for personal use or for sharing with others supported individual research and facilitated collaboration with others. Annotation has also played an important role in education, promoting literacy and improving memory, particularly in regard to social or shared annotations among students or between students and instructors [2]. Annotated editions, such as Norton Critical Editions or Folger Shakespeare Library, enabled students to see definitions, review added context, or gather scholarly interpretations to support their reading.

The earliest conception of the internet as we know it today was detailed in a 1945 article, “As We May Think,” by Vannevar Bush. To tackle the ever-increasing amount of knowledge being created, Bush suggested a machine, which he called the memex, for storage and consultation of information, complete with a keyboard for input and a display for reading, with full cross-indexing, where one could also “add marginal notes and comments” [3]. In 1993, when Marc Andreessen and Eric Bina were building Netscape, they included a feature called “group annotations” to complete this missing piece envisioned by Bush almost 50 years before. Unfortunately, this functionality was short-lived, as the team was not able at the time to build a server adequate to store those user-generated annotations [4]. Nearly a quarter century would pass before Bush’s vision would be realized.

On 23 February 2017, the World Wide Web Consortium (W3C), which serves as the standards body for the web, would officially publish web annotation as a standard, giving tool creators something to aim for [5]. It is likely that in coming years browsers will enable a user to select their preferred annotation client in the same way they designate their preferred search engine today, thus removing the need for plug-ins or bookmarklets. Annotations made with different services will be able to interact with each other seamlessly, in the same way that email can be sent easily between those using different email clients today. Users and organizations will also be able to move their annotations from one service to another, exporting for analysis, repurposing, or preservation. As more scholarly communication functions shift from the offline world to the online environment, publishers and educators will need workflow tools that can break through proprietary silos. There is clearly a need for both human-readable and machine-readable annotations as infrastructure, enabling deep dives into author credentials, methods, lab equipment, and identifiers. Depending on their needs, readers will select which channels or layers to monitor. Use of deep linking to connect specific points in readings with other readings, data, and events will open up new possibilities for linked data. All of these possibilities may lead to substantial noise or clutter around the content of the book in terms of the reader experience, so it is important that the reader be able to control how much or how little connectivity or context is needed.

A handful of annotation tools were already in use prior to the publication of the standard. Perhaps the best known is Genius (genius.com), previously known as Rap Genius due to its original purpose of annotating lyrics to rap songs. Although the tool remains in use by the Washington Post and Los Angeles Times, the company announced in March 2017 that it was moving away from general purpose annotation and into the music video format.1 In the scholarly communication space, Hypothesis (hypothes.is), a non-profit and open source annotation tool, began to take shape after 2011 and began to work in earnest with publishers in 2016. Proprietary tools also in use by some publishers include PaperHive (paperhive.org), launched in 2015 with a model in which users move a copy of the published content to their hive for collaboration, and Redlink’s Remarq tool (remarqable.com), announced in May 2017, which focuses on pre-vetting of all annotators before their comments can be made publicly visible for other users. The standard now gives tool creators a common target to build towards.

For the purposes of this study, experiments around annotation were done using Hypothesis, which joined the HIRMEOS (High Integration of Research Monographs in the European Open Science Infrastructure) project as a partner during 2017.

2. The Lifecycle of Annotation

Open annotation is a flexible tool that can be used in nearly any part of the research lifecycle. It enables organization and collaboration atop research materials; inline peer review; augmentation of articles with additional information, links, images, or video; elaboration around citations, content corrections, or updates; and extensive use cases in the teaching and learning space. The existence of the standard supports current and future interoperability and provides a mechanism both to make annotations comply with the FAIR2 Data Principles and to use annotations to make content itself more FAIR. Three activities around annotations are crucial, and each presents challenges for any attempt to turn a digital text and a collection of annotations into a citable research output: storing annotations, sharing annotations, and reusing annotations.

2.1. Storing Annotations

To expand the uptake of open annotation, those who create annotations need assurance that their annotations can be securely stored and preserved as part of the scholarly record. Organizations integrating annotation have a choice between using regular Hypothesis accounts or integrating with their own account systems through single sign-on. In both cases, annotations are stored on the Hypothesis server. The vision for open annotation and the open source code that powers Hypothesis anticipate the need for certain organizations to host annotations on their own server for privacy or business purposes. This is already happening, as eJournalPress hosts an annotation server to store their peer review annotations and the open access publisher MDPI has spun up their own Hypothesis instance with some unique features across their journal site. Currently, annotation accounts on these different servers remain independent of one another, but Hypothesis plans to work on a multi-service client that will enable users to connect different accounts and collectively view annotations across different servers.

Closely related to the need to store annotations safely is the ability to move annotations from one storage mechanism to another. Individual users and organizations who host their own annotation layers can currently access their annotations via the Hypothesis API (Application Programming Interface), and an export button is planned to make this easier for non-technical users. This removes the fear of vendor lock-in, as users should be able to download standards-based annotations created in one tool and upload them for use in another tool.
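As a rough illustration of the API-based access described above, the following Python sketch builds a search request for the annotations anchored to a given document and reduces a response to simple records that could be written out for export. The endpoint URL and the `uri`, `group`, and `limit` parameters reflect the public Hypothesis search API as we understand it and should be checked against the current API documentation; the helper names, the example document URL, and the sample payload are hypothetical, and no network request is made.

```python
import json
from urllib.parse import urlencode

# Assumed public search endpoint of the Hypothesis API.
SEARCH_ENDPOINT = "https://api.hypothes.is/api/search"


def build_search_url(uri, group=None, limit=50):
    """Return a search URL for annotations anchored to the document `uri`."""
    params = {"uri": uri, "limit": limit}
    if group is not None:
        params["group"] = group
    return SEARCH_ENDPOINT + "?" + urlencode(params)


def flatten_rows(payload):
    """Reduce a search response to plain records suitable for export."""
    return [
        {
            "id": row.get("id"),
            "user": row.get("user"),
            "text": row.get("text", ""),
            "tags": row.get("tags", []),
        }
        for row in payload.get("rows", [])
    ]


# Hypothetical payload, shaped like a search response with one annotation.
sample = {
    "total": 1,
    "rows": [
        {
            "id": "abc123",
            "user": "acct:reader@hypothes.is",
            "text": "Possible typo in this paragraph.",
            "tags": ["erratum"],
        }
    ],
}

url = build_search_url("https://example.org/book/chapter-1", limit=20)
export = json.dumps(flatten_rows(sample), indent=2)
```

The JSON string in `export` could then be saved locally or transformed for upload into another standards-based annotation tool.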

While annotations are stored securely with industry-standard backups, many users ask about the ability to store annotations in a digital preservation service like CLOCKSS and Portico.3 Discussions are currently underway with both initiatives. One proof of concept to be tested is ensuring that annotations from Hypothesis can successfully anchor to a journal that has been triggered and hosted openly on a preservation initiative site. A second factor to consider is whether, in the event that Hypothesis is no longer available as a service, a preservation network could run their own instance of the open source code and serve the annotations directly onto the content. This is part of a longer term discussion and framework about which Hypothesis welcomes user and reader feedback.

2.2. Sharing Annotations

A key feature of open annotation is the ability for someone who creates an annotation to be able to share and utilize it to collaborate with others. This process might entail sharing via social networks, by email, or through incorporating a link to an annotation on a website. All of these options are possible now. To share the collaborative process of annotation with others, the tool enables users to create on-the-fly private groups, which can then be shared with others via a link. About 60% of the annotations made using Hypothesis are made within the context of private groups, of which there are currently more than 28,000.

A second aspect around sharing annotations is their discoverability. Annotations made on content that has a digital object identifier (DOI) or that refer to content that has a DOI (or both), are shared with Crossref Event Data for indexing by Google and end user discovery. This expands the visibility of annotations and their associated content beyond the immediate context of the annotator, making them part of a wider scholarly communication infrastructure and again placing them in the context of FAIR data.
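To make the discovery route above more concrete, the sketch below builds a query for annotation events about a given DOI and extracts the annotation URLs from a response. It assumes that the Crossref Event Data query API is reachable at the URL shown and accepts `obj-id`, `source`, and `rows` filters, and that events carry `subj_id` and `source_id` fields; these names should be verified against the Event Data documentation. The sample payload is hypothetical and no request is sent.

```python
from urllib.parse import urlencode

# Assumed query endpoint of Crossref Event Data.
EVENTS_ENDPOINT = "https://api.eventdata.crossref.org/v1/events"


def build_events_url(doi, source="hypothesis", rows=100):
    """Return a query URL for events whose object is the given DOI."""
    params = {"obj-id": doi, "source": source, "rows": rows}
    return EVENTS_ENDPOINT + "?" + urlencode(params)


def annotation_subjects(payload, source="hypothesis"):
    """Collect the subject URLs (the annotations) of events from one source."""
    events = payload.get("message", {}).get("events", [])
    return [e["subj_id"] for e in events if e.get("source_id") == source]


# Hypothetical response payload with one annotation event and one other event.
sample_events = {
    "message": {
        "events": [
            {"subj_id": "https://hypothes.is/a/example1", "source_id": "hypothesis"},
            {"subj_id": "https://example.org/post", "source_id": "other-source"},
        ]
    }
}

events_url = build_events_url("10.3390/publications7020041")
subjects = annotation_subjects(sample_events)
```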

2.3. Reusing Annotations

Annotations made in the Hypothesis public channel carry a CC0 license and can thus be reused freely. Annotations made for private or private group activity are all rights reserved to their creators. It is also anticipated that organizations sponsoring an annotation layer might want to apply a different license.

In February 2018, PubMed announced that it would discontinue PubMedCommons commenting as a result of low usage [6]. Working together with Europe PMC, the Hypothesis team converted the existing comments into page notes (article level annotations) and added DOIs and PMIDs (PubMed Identifiers) to make the resulting annotations more FAIR. Great care was taken to ensure that the original CC-BY licenses of the comments would carry over and be visible on the new annotations. As a result, depending on metadata tags applied by the publisher on their home content platforms, these annotations may be visible to Hypothesis users there as well. In fall 2018, Europe PMC enabled the annotations to be visible by default in a restricted group across their content site.

A key outcome of this migration, through application of a PubMedCommonsArchive tag, is the ability for a reader to view all PubMedCommons annotations as one corpus of data. It is now easy to see the care, detail, and attention that went into creating these annotations. While the number added may have been small in contrast to the number of articles on PubMed, the number of readers and researchers who benefited from viewing them had to be considerably higher. We should not focus unduly on quantity while quality is also key. We tend to think of both comments and annotations from the perspective of what is publicly visible. However, public annotations make up only about 20% of the total.

3. Annotating Monographs: Needs and Expectations

When books moved from print to digital, a common result was a replication of the print version in a version that could be read on a screen. The expectations that many readers may have had for truly interactive or innovative experiences have been slow to materialize [7]. Open annotation promises numerous ways that the online environment might move beyond mere replication of the printed page. These possibilities fall into two larger categories: human generated and auto-generated. Human generated annotations might include the incorporation of reader feedback or reviewer remarks, opportunities to engage with the author or publisher, additional context around citations, connections to related resources or multimedia, or interactive assessments or code. Auto-generated might include additional information around identifiers, controlled vocabulary, or recommendations. With the success of online gaming, some interactive books may be somewhat indistinguishable from video games with reader/player engagement in annotation panes. With interactive learning elements—easily displayable as annotations—some textbooks may in effect be the course.

In fact, along similar lines to the transformation of the book, digital annotation does not merely replicate but also enhances the practice of annotating. In particular, the open annotation of digital documents offers two essential functionalities: sharing annotations with other people and making them searchable by tags that make it possible to identify the type of annotation or its content. In addition, digital annotations can be analyzed using text mining techniques that allow rapid categorization.

So, if traditional annotations have shaped researchers’ reading and learning habits, we can expect the same to happen with digital annotation if tools are properly implemented. For example, if inviting students to comment on a printed text seems to motivate them to interact more intensively and reflectively with the content of the text [8], we can expect the use of open annotation tools to reinforce this effect, since annotations on digital texts can be read by many different people. Further, Wolfe shows that the positive effects of annotating are increased when students receive guidance and coaching. This is even more important for digital annotation practices, considering that tools providing new functionalities must be learned and that, however simple the use of the tools might be, they are still alien to the average student and researcher in the humanities and social sciences (SSH). In particular, the teacher would introduce students to the use of open annotation groups and, most importantly, create a system of tags to describe the different kinds of annotation (e.g., a critical comment, a link to other texts, a question) and to specify the topic to which the annotation refers.
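A tag system of the kind described above could be sketched as follows. The `type:` and `topic:` prefixes, the three annotation types, and the sample annotations are purely illustrative assumptions; they are not a convention prescribed by Hypothesis or used in the project.

```python
from collections import defaultdict

# Hypothetical set of annotation types agreed on by the seminar group.
ALLOWED_TYPES = {"comment", "link", "question"}


def parse_tags(tags):
    """Split tags such as 'type:question' into a mapping prefix -> set of values."""
    parsed = defaultdict(set)
    for tag in tags:
        prefix, sep, value = tag.partition(":")
        if sep and value:
            parsed[prefix.strip().lower()].add(value.strip().lower())
    return parsed


def group_by_type(annotations):
    """Group annotation texts by their declared type; untagged ones go to 'untyped'."""
    groups = defaultdict(list)
    for ann in annotations:
        types = parse_tags(ann["tags"]).get("type", set()) & ALLOWED_TYPES
        for kind in sorted(types) or ["untyped"]:
            groups[kind].append(ann["text"])
    return dict(groups)


def by_topic(annotations, topic):
    """Return the texts of annotations tagged with the given topic."""
    return [
        ann["text"]
        for ann in annotations
        if topic.lower() in parse_tags(ann["tags"]).get("topic", set())
    ]


annotations = [
    {"text": "What does 'strength' mean here?", "tags": ["type:question", "topic:virtue"]},
    {"text": "Compare with the Groundwork.", "tags": ["type:link", "topic:duty"]},
    {"text": "Unclear passage.", "tags": []},
]

grouped = group_by_type(annotations)
virtue_notes = by_topic(annotations, "virtue")
```

Agreeing on such a scheme before the first session keeps later searches and any text mining over the annotations consistent across participants.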

4. Developing Best Practices for Book Annotation in the Framework of the HIRMEOS Project

4.1. The HIRMEOS Project

HIRMEOS (High Integration of Research Monographs in the European Open Science infrastructure) is an infrastructural project funded in the framework of Horizon 2020 to support open access (OA) monographs by enhancing five publishing platforms: OpenEdition Books (France), OAPEN (Netherlands), ΕΚΤ Open Book Press (Greece), Universitätsverlag Göttingen (Germany), and Ubiquity Press (UK).

1: OpenEdition Books (France) is the OpenEdition platform dedicated to open access books. OpenEdition Books is run by the OpenEdition Center, the French national infrastructure supported by CNRS, Aix-Marseille University, EHESS (École des hautes études en sciences sociales) and Avignon University. It currently distributes more than 6000 books from 87 publishers. OpenEdition works with Lodel, an open source software developed by the OpenEdition Center and disseminates open access books under different models, including the freemium model.
2: The OAPEN Library (Netherlands) is managed by the OAPEN Foundation and, like OpenEdition, aims to provide a highly qualified and certified collection of books. The platform currently presents more than 5000 books from more than 150 publishers. OAPEN also offers publishing houses, libraries, and research funding institutions services in the fields of digital preservation and long-term archiving, quality certification, and dissemination. The OAPEN Library works with XTF, an open source platform developed by the California Digital Library (CDL).
3: ΕΚΤ Open Book Press (Greece), financed with its own and structural funds, is the service provider for electronic publishing for the Greek National Documentation Center. EKT offers advanced e-Infrastructures and services for institutional partners (universities, research centers, scientific societies, and memory institutions), in particular, to enable the OA publication of peer-reviewed journals, conference proceedings, and monographs in the SSH. EKT works with Open Monograph Press (OMP), an Open Source software developed by the Public Knowledge Project (PKP) to organize peer review and editorial processes. OMP can also operate as a website.
4: The Universitätsverlag Göttingen (Germany) is the dedicated publishing house of the Georg-August-Universität Göttingen and is part of the group Electronic Publishing, in which several services and projects of the Niedersächsische Staats- und Universitätsbibliothek Göttingen are operated. These include projects on publishing and Open Science, advisory services, Open Science Campus activities and various repositories. The university publishing house is managed by an editorial board from the university, which consists of members of all 13 faculties, ensuring the quality of the publications. The university press publishes about 60 books per year, mainly from the SSH, which are also distributed through print on demand.
5: Ubiquity Press (UK) is an open access publisher of peer reviewed journals, academic books, and data. Ubiquity provides its own platform and various services. Ubiquity works with the Book Management System RUA, an Open Source application developed by Ubiquity to assist with the monograph publishing life cycle, from submission to both internal and peer review, from copy editing to production and publication.

In order to simplify the integration of monographs in the universe of Open Science, HIRMEOS operates as a distributed system in which the homogeneity of the platforms is not achieved simply by using a single software for all of them, but by the adoption of common standards. The different, independent publishing platforms participating in HIRMEOS are willing to use the same metadata as a result of the project and to implement common services accordingly. Thus, the publication system remains open to the future participation of other platforms; this participation is to be simplified by an implementation guide created during the course of the project.

4.2. HIRMEOS Data and Services Providers

Encouraging the annotation of Open Access monographs in order to foster conversation between scholars, between teachers and students, and between the academic community and a non-academic audience is one of the aims of the HIRMEOS project. HIRMEOS provides the five publishing platforms with the same standards for various services and tools intended to make the use of Open Access monographs in the SSH more attractive to readers. These services are intended to better identify and discover Open Access monographs, as well as to certify their quality, measure their impact, and intensify their usage through annotations.

By adding these different services and tools, the HIRMEOS project aims to facilitate various activities related to the use of digital monographs, of which annotating is one of the most important. In fact, interacting with a printed book, by underlining portions of the text or writing comments in the margins, is a key way to grasp and process the content of a book and fix ideas inspired by reading.

The centrality of these practices and the long tradition behind them make it clear why many students and researchers, while able to work using digital monographs with electronic devices today, still prefer to work with printed books. This means that digital monographs will increase in popularity only when they can provide the key functionality that users enjoy in a printed format. However, the fact that digital monographs should also be annotatable does not mean that the ultimate goal of a tool for digital annotation is merely to replicate the dynamics of printed text annotation in the new digital environment. We already have an example of such an attempt to emulate the printed format: the adoption of PDF as the main format for digital documents. At the beginning of the digital age, PDF made reading user-friendly and thus encouraged interaction with digital publications, but in the long run this strategy can be an obstacle to realizing the full potential of digital technologies. One should consider whether the popularity of PDF has slowed the acceptance, in some disciplines, of other, machine-readable formats that enable the use of the many tools developed for the digital humanities.

4.2.1. Metadata for the Identification of Books and Authors

As part of the project, a workflow was established to add the following metadata to each monograph published on the participating platforms:

All documents published on the platforms are identified by Crossref DOIs. Digital object identifier (DOI) technology enables usable, interoperable, and persistent identification of digital objects. DOI technology uses an identification syntax and a network resolution mechanism (Handle System®), as well as a stable and sustainable infrastructure.
If the authors have an ORCID ID (Open Researcher and Contributor ID), the platforms involved in the project display it next to the author name. ORCID is a non-proprietary alphanumeric code for the unique identification of authors. This addresses the problem that the contributions of certain authors can be difficult to recognize since most names of persons are not unique, can change (e.g., in the case of marriages), can have cultural differences in the display order of names, may contain inconsistent use of first name abbreviations, and utilize different writing systems. The ORCID organization offers an open and independent register, which is already the de facto standard for the identification of authors of scientific publications.
Through FundRef Data, it will be possible to identify the funding institution and the research project behind a specific publication. Publishers can provide financing information for articles and other content using a standard taxonomy of the sponsor’s name. A taxonomy of standardized names of funding agencies is offered by the Open Funder Registry, and associated funding data is then made available via Crossref search interfaces and APIs for sponsors and other interested parties.

All in all, the adoption of these common metadata standards across the platforms should significantly improve the findability of Open Access monographs, which is not always optimal today.
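Because such identifiers are entered and exchanged by many parties, platforms typically validate and normalize them before display. The sketch below checks the final character of an ORCID iD using the ISO 7064 MOD 11-2 check digit that ORCID documents for its identifiers, and reduces a DOI to a bare lowercase form (DOIs are case-insensitive). It is an illustration only, not the workflow used by the HIRMEOS platforms, and the function names are hypothetical.

```python
def orcid_check_digit(base_digits):
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid):
    """Validate the format and check digit of an ORCID iD such as 0000-0002-1825-0097."""
    compact = orcid.replace("-", "").upper()
    if len(compact) != 16 or not compact[:15].isdigit():
        return False
    return orcid_check_digit(compact[:15]) == compact[15]


def normalize_doi(value):
    """Strip common resolver prefixes and lowercase the DOI."""
    doi = value.strip()
    for prefix in ("https://doi.org/", "http://doi.org/", "http://dx.doi.org/", "doi:"):
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    return doi.lower()


valid = is_valid_orcid("0000-0002-1825-0097")
invalid = is_valid_orcid("0000-0002-1825-0098")
doi = normalize_doi("https://doi.org/10.3390/Publications7020041")
```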

4.2.2. Entity Recognition Tool for Discoverability and Enrichment of Texts

Entity extraction and resolution is the task of determining the identity of entities mentioned in a text against a knowledge base representing the reality of the domain under consideration. This could be the recognition of generic Named Entities suitable in general purpose subjects, like people, location, organizations, and so on, but it could also involve the resolution of specialist entities in different domains.

To increase discoverability and usability of OA monographs, HIRMEOS experimented with the integration of a Named Entity Recognition and Disambiguation (NERD) service for visualizing key words and disambiguating concepts in the full text. The software powering this service, called entity-fishing, was initially developed by Inria in the context of the EU FP7 project CENDARI4 and provides automatic entity recognition and disambiguation using Wikipedia and Wikidata data sets. The application is distributed with an open-source license, and it has been deployed as a web service in DARIAH’s infrastructure hosted by the French Huma-Num.5 The entity-fishing interface implements a variety of functionalities, like language recognition, sentence segmentation, and modules for accessing and looking up concepts in the knowledge base. One of the most widely implemented use cases was the improvement of the platform’s search interface using entities extracted from the library content. This was done by extracting specific Named Entities and then enabling users to filter their searches with these complementary parameters.

For example, OpenEdition extended their Books Catalogue6 by adding two additional facets to filter books by entities of the type PERSON and LOCATION. Along the same lines, Göttingen State and University Library also extracted mentions of organizations from the library corpus in their book catalog.7 Once the annotation data is collected and stored, it can also be used for automatic generation of word clouds at the repository level, with words displayed by their importance (relevance, frequency, etc.).8 Users are then able to access the most important concepts at library level. The underlying data is effectively the same as that used for facet searching, but with different visualization options.
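The shared data behind facets and word clouds can be pictured with a small sketch. It assumes that each extracted mention has already been reduced to a record with a book identifier, an entity type, and a disambiguated entity identifier; the record layout and the placeholder identifiers (`Q_KANT`, `Q_GOETTINGEN`) are hypothetical and do not reproduce the entity-fishing output format.

```python
from collections import Counter, defaultdict

# Hypothetical mention records after recognition and disambiguation.
mentions = [
    {"book": "book-1", "name": "Immanuel Kant", "type": "PERSON", "entity_id": "Q_KANT"},
    {"book": "book-1", "name": "Kant", "type": "PERSON", "entity_id": "Q_KANT"},
    {"book": "book-2", "name": "Kant", "type": "PERSON", "entity_id": "Q_KANT"},
    {"book": "book-2", "name": "Göttingen", "type": "LOCATION", "entity_id": "Q_GOETTINGEN"},
]


def build_facets(records):
    """Map entity type -> entity id -> set of books that mention the entity."""
    facets = defaultdict(lambda: defaultdict(set))
    for rec in records:
        facets[rec["type"]][rec["entity_id"]].add(rec["book"])
    return facets


def filter_books(facets, entity_type, entity_id):
    """Return the books matching one facet value, e.g. PERSON = Q_KANT."""
    return sorted(facets.get(entity_type, {}).get(entity_id, set()))


def cloud_weights(records):
    """Count mentions per entity; the counts can drive word sizes in a cloud."""
    return Counter(rec["entity_id"] for rec in records)


facets = build_facets(mentions)
kant_books = filter_books(facets, "PERSON", "Q_KANT")
weights = cloud_weights(mentions)
```

Because disambiguation maps the surface forms "Immanuel Kant" and "Kant" to one identifier, both the facet and the word cloud treat them as the same entity.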

4.2.3. Certification of Scientific Quality

The majority of scientific monographs undergo intensive quality assurance and evaluation procedures, which are, however, less standardized in the SSH than in other disciplines. Regardless of which review procedure would be optimal for monographs, HIRMEOS is developing a certification system that categorizes and standardizes review procedures. In this way, users can immediately recognize which review procedure a publication has undergone. A peer review certificate and an OA license certificate will be added to each document published on the five platforms.

The peer review and OA license certificates are delivered to the Directory of Open Access Books’ (DOAB) various partners. DOAB offers a quality-controlled list of peer-reviewed Open Access monographs (including book chapters) and book publishers. By developing this quality-controlled list, DOAB enables researchers, libraries, and discovery services to easily identify and search peer reviewed Open Access monographs, improving the discoverability, access, and use of monographs around the world. After an application process, publishers that meet the DOAB peer review and Open Access requirements and have the corresponding licenses are listed in DOAB and can then upload the metadata of their Open Access books. Such metadata can then be disseminated through the OAI-PMH (Open Archives Initiative Protocol for Metadata Harvesting) protocol implemented by third-party providers such as libraries and search services, thus improving the findability of books. The DOAB certification service includes a classification system and also allows certified publishers and publishing platforms to collect DOAB certification and icons through the DOAB API. Certified publishers and publishing platforms that meet DOAB’s requirements agree to the conditions of DOAB certification and commit to pass an audit to verify their peer review procedures.
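For readers unfamiliar with OAI-PMH, the following sketch shows what a harvesting request and a minimal response look like. The `ListRecords` verb, the `oai_dc` metadata prefix, and the XML namespaces are part of the protocol; the base URL, the sample record, and the function names are hypothetical and do not describe DOAB's actual endpoint.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespaces defined by OAI-PMH 2.0 and Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build a ListRecords request for an OAI-PMH repository."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


# Hypothetical, heavily shortened response with a single record.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An Example Monograph</dc:title>
          <dc:identifier>https://doi.org/10.0000/example</dc:identifier>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def parse_records(xml_text):
    """Extract (title, identifier) pairs from a ListRecords response."""
    root = ET.fromstring(xml_text)
    pairs = []
    for record in root.iterfind(".//oai:record", NS):
        title = record.findtext(".//dc:title", default="", namespaces=NS)
        ident = record.findtext(".//dc:identifier", default="", namespaces=NS)
        pairs.append((title, ident))
    return pairs


request_url = list_records_url("https://example.org/oai")
records = parse_records(SAMPLE)
```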

4.2.4. Metadata for Metrics and Legacy Metrics

The measurement of impact and resonance presents specific challenges for Open Access monographs. Keeping track of current downloads, readership, and reuse across multiple hosting platforms is difficult, but important if one wants to understand and track the reach of Open Access monographs. The use of alternative metrics (Altmetrics), which measure the number of mentions of a document in social networks and other kinds of publications, has also increased significantly in recent years because it helps to better understand the impact of scientific publications by documenting the resonance of scientific content in broader communities and beyond the specific academic context.

HIRMEOS partners have enabled their platforms to collect usage metrics and alternative metrics and to display these directly on the documents. The altmetrics record the following measures for books: tweets, Facebook shares, Wikipedia quotes, and annotations. The service is designed to operate on a daily basis; it can therefore also make this data available in chronological order. Since the data sources often change their access methods, licensing and conditions, this is a maintenance-intensive system.

The metric service extracts usage data from the platform in question and uses the identifiers database to enforce the use of a particular type of identifier, such as a DOI. The service connects the identifiers used by the reporting platform with the one desired by the data collector. The data collected is stored in the form of events, each of them recording the measure collected, the timestamp, the identifier of the work concerned by the event, and the number of times the event was repeated, e.g., there were four downloads of this book on this platform. The standardization process not only normalizes identifiers, it also tags each event with a URI (Uniform Resource Identifier) identifying the measurement it represents (e.g., views) and the platform reporting the event (e.g., Google Books), and provides a link to the definition of the measure to give a user-friendly description.
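The event model described above can be sketched as a small data structure with a normalization step. The field names, the measure URIs, and the identifier mapping below are hypothetical stand-ins for the project's identifiers database and measure definitions, which are not specified here.

```python
from collections import defaultdict
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Event:
    measure_uri: str   # URI identifying the measure and reporting platform
    timestamp: str     # when the event was recorded (ISO 8601 string)
    work_id: str       # identifier of the work as reported by the platform
    value: int         # number of times the event was repeated


# Hypothetical excerpt of an identifiers database: platform identifier -> DOI.
IDENTIFIER_MAP = {
    "isbn:9780000000002": "doi:10.0000/example.book",
    "url:https://example.org/books/42": "doi:10.0000/example.book",
}


def normalize(event):
    """Rewrite the work identifier to the preferred type (here, a DOI)."""
    return replace(event, work_id=IDENTIFIER_MAP.get(event.work_id, event.work_id))


def totals(events):
    """Sum event values per (work, measure) after identifier normalization."""
    sums = defaultdict(int)
    for event in map(normalize, events):
        sums[(event.work_id, event.measure_uri)] += event.value
    return dict(sums)


DOWNLOADS = "https://metrics.example.org/platform-a/downloads/v1"

events = [
    Event(DOWNLOADS, "2019-03-01T00:00:00Z", "isbn:9780000000002", 4),
    Event(DOWNLOADS, "2019-03-02T00:00:00Z", "url:https://example.org/books/42", 3),
]

summary = totals(events)
```

Normalizing before aggregating is what allows counts reported under an ISBN and under a landing-page URL to be attributed to the same work.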

4.3. Annotating Monographs in the Framework of HIRMEOS

HIRMEOS has implemented a tool on all its publishing platforms to make the open annotation of monographs possible. The Hypothesis annotation tool allows annotations at a sentence (or phrase) level, such as criticism or notes on news, blogs, scientific articles, books, terms of use, campaign initiatives, legislative procedures, and more. The tool is based on an open source JavaScript library and annotation standards developed by the Web Annotation Working Group created by the World Wide Web Consortium (W3C), a non-profit, in 2014. A coalition of over 60 scientific publishers, including PLOS, Wiley, and Oxford University Press, partner with Hypothes.is through the Annotating All Knowledge Initiative.
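To give a sense of what a sentence-level or phrase-level annotation looks like under the W3C Web Annotation Data Model, the sketch below builds an annotation with a textual body and a `TextQuoteSelector` target, and then re-locates the quoted phrase in a page's text. The JSON-LD context, the `Annotation` and `TextualBody` types, and the selector fields follow the W3C model; the function names, the example URL, and the naive string matching are illustrative assumptions and much simpler than the anchoring a production client performs.

```python
import json


def make_annotation(source, exact, comment, prefix="", suffix=""):
    """Build a minimal Web Annotation that comments on a quoted phrase."""
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "type": "Annotation",
        "body": {"type": "TextualBody", "value": comment, "purpose": "commenting"},
        "target": {
            "source": source,
            "selector": {
                "type": "TextQuoteSelector",
                "exact": exact,
                "prefix": prefix,
                "suffix": suffix,
            },
        },
    }


def anchor(annotation, text):
    """Return (start, end) offsets of the quoted phrase in `text`, or None."""
    selector = annotation["target"]["selector"]
    needle = selector["prefix"] + selector["exact"] + selector["suffix"]
    start = text.find(needle)
    if start == -1:
        return None
    begin = start + len(selector["prefix"])
    return (begin, begin + len(selector["exact"]))


page_text = "Virtue is the moral strength of the will."
annotation = make_annotation(
    "https://example.org/book/chapter-1",
    exact="moral strength",
    comment="How does this relate to duty?",
    prefix="the ",
    suffix=" of",
)

span = anchor(annotation, page_text)
serialized = json.dumps(annotation, indent=2)
```

The prefix and suffix allow the same phrase to be found again even if it occurs several times on the page or if the page layout changes.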

Aiming to create the best conditions for digital annotation, the HIRMEOS project has introduced technical implementations and given specific support to the academic community. During the project, an annotation layer was implemented directly on the publication platforms involved in the project, so that users are not required to install the Hypothes.is plug-in in their browser when reading the monographs published on the platform. In addition, some of the platforms provided a small space on their homepage to inform the user about the use of the tool and the possibility of setting up working groups. Concerning the provision of support to the academic community and stimulating the annotation experiment, two initiatives should be mentioned in more detail: the support offered by the Göttingen University Library to annotation activities in a seminar of Philosophy at the University of Göttingen and the post-publication peer review experiment coordinated by OpenEdition Books.

4.3.1. Annotating in A Seminar: Toward an Interactive Seminar Reader for Philosophy Students?

In the following, we describe the structure and essential elements of the usage scenario that will be tested in the summer semester of 2019 in a philosophy seminar at the University of Göttingen with the support of the HIRMEOS project. The results of this experiment will be discussed in a later publication.

It is a so-called “Hauptseminar,” i.e., a seminar for MA students and for BA students in the final phase of their studies. The course is open to a maximum of 20 participants interested in the history of philosophy and practical philosophy, and it deals with Immanuel Kant’s conception of virtue. As usual in philosophical seminars, the course focuses on exercises in interpreting various texts under the guidance of the lecturers. In this case, passages from Kant’s Metaphysische Anfangsgründe der Tugendlehre (1797) will be discussed.

The lecturers will use the Hypothesis tool to enable the preparation and follow-up of the individual seminar sessions, which each revolve around a specific text. In particular, the tool will be used to perform different tasks. During the preparatory reading, the tool will enable the creation of notes on comprehension problems or possible interpretation paths, and these notes will be directly accessible to the lecturers and all other participants. These notes should then serve as a basis for discussions in individual seminar sessions and can then be revised or supplemented as a follow-up with regard to the results achieved.

The lecturers have the following expectations:

Easy access and quick control/response options should significantly increase the scope and quality of the preparation and follow-up of the individual sessions by the students.
This may have a positive influence on the deeper understanding of the text as well as on the discussion in the seminar (especially in the case of the present text, which is relatively unstructured and whose interpretation requires such references all the more).
An efficient way for lecturers to communicate with students on factual issues both within and outside the seminar itself.
Better communication and cooperative work between the students, which in particular also involves those in the discourse who are otherwise rather reserved in the seminar sessions. (Ideally in the long run this will also lead to their becoming more active in the sessions themselves.)
Streamlined management of the ‘small work’ to be done by the students (ungraded preliminary work, which should have a total volume of 2–4 pages depending on the module).
The resulting corpus of annotations can ultimately also serve as documentation of the work in the seminar (and thus replace seminar protocols, for example) and at the same time provide students with a good basis for preparing their examination performance (term paper or oral examination).

A particularly important aspect of this activity will be that the teachers will create three different classes of tags to describe the realized annotations:

1. Tags to describe the annotation type:

?: Understanding Question (What exactly is incomprehensible and why?)
!: Important text passage (To what extent and why is it a central statement?)
I: Interpretation required (How can this be understood? Are there different readings? If necessary, what speaks for or against the reading(s)?)
K: Commentary (critical examination of a concrete statement: Is the statement factually (un)plausible and why?)

2. Tags to describe the function of the annotated text passage:

‘definition’ = definition
‘beweis’ = proof
‘beispiel’ = example
‘erläuterung’ = explanation

3. Tags to categorize the annotation content:

‘tugend’ = virtue
‘recht’ = right
‘ethik’ = ethics
‘vollkommenheit’ = perfection
‘glückseligkeit’ = bliss
‘liebe’ = love
‘achtung’ = respect
‘selbst’ = oneself
‘andere’ = other
‘emotion’ = emotion
‘erziehung’ = education
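
As an illustration of how the three tag classes could be used to sort annotations automatically, the following minimal sketch assigns each tag of an annotation to its class. The function and variable names are assumptions made for this example, only a few of the content tags are listed for brevity, and nothing here reflects how the seminar will actually process its annotations.

```python
# Class 1: annotation type (case-sensitive single-character tags).
TYPE_TAGS = {
    "?": "understanding question",
    "!": "important text passage",
    "I": "interpretation required",
    "K": "commentary",
}

# Class 2: function of the annotated passage.
FUNCTION_TAGS = {
    "definition": "definition",
    "beweis": "proof",
    "beispiel": "example",
    "erläuterung": "explanation",
}

# Class 3: content category (a subset of the seminar's list, for brevity).
CONTENT_TAGS = {
    "recht": "right",
    "ethik": "ethics",
    "liebe": "love",
    "achtung": "respect",
}


def classify_tags(tags):
    """Sort an annotation's tags into the three classes, plus 'unknown'."""
    groups = {"type": [], "function": [], "content": [], "unknown": []}
    for tag in tags:
        if tag in TYPE_TAGS:
            groups["type"].append(tag)
        elif tag.lower() in FUNCTION_TAGS:
            groups["function"].append(tag.lower())
        elif tag.lower() in CONTENT_TAGS:
            groups["content"].append(tag.lower())
        else:
            groups["unknown"].append(tag)
    return groups


result = classify_tags(["?", "Beweis", "achtung", "foo"])
```

Collecting unrecognized tags separately, rather than discarding them, would let the lecturers spot misspellings or new categories that students introduce during the semester.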

This usage scenario not only points the way to a new form of learning in philosophy, but also to new forms of publication. Even if not yet planned, it would be very useful to publish the texts discussed in the seminar in a digital seminar reader, which would also contain a selection of the annotations realized in the course of the seminar. The benefits of such a publication are obvious: first, the material would be reusable for new seminars, and students would be able to find questions and answers that are likely to recur. Second, such digital editions of the seminar readers could be constantly updated, also taking into account the new annotations generated in later seminars.

4.3.2. Open Peer Review Experiment by OpenEdition9

In OpenEdition, the implementation of the annotation tool was part of an open post-publication peer review experiment. The objective of this experiment is both to create a space for scientific conversation around publications and to stimulate new forms of peer review. Through the annotation tool, users can discuss theses and arguments presented in the scholarly books and also enrich them by providing new ideas. Developed as a form of post-publication peer review, the experiment will play a strategic role in the framework of the HIRMEOS project, which provides (see Section 4.2.3) a certification system to describe different kinds of peer review.

In this way, we will explore the value of opening new peer review processes and extracting the ongoing discussions.

The project focuses on 13 books from four publishers10 opened for annotation from February to June 2019. Since February, readers have been invited, either through general campaigns (social networks, mailing lists, blogs) or by direct individual contact, to read and annotate the selected books. Annotators are supported in learning to use Hypothes.is and in understanding the annotation process and its objectives. Annotations made by readers and authors will ultimately be studied and described in the overall results of the experiment, which will be the subject of a report.

The following aspects deserve particular attention:

Community outreach activities and clear guidelines are essential. The launch of this experiment was preceded by an important preparatory phase aimed at defining the technical framework of the annotation activities. First, community outreach activities involved publishers and authors. They worked together with the staff of OpenEdition to give visibility to the experiment and invite other potential commentators. Second, in order to provide users with the best possible support, OpenEdition provided documentation and a user guide for Hypothes.is.11 We also established rules of good conduct to regulate annotations.12 These rules are broad enough to allow considerable freedom of use for annotators, but restrictive enough to protect authors from malicious or inappropriate comments.
Creation of publisher groups. OpenEdition needed to create a specific annotation group for the open peer review process for each publisher of the monographs to be annotated. To make this possible, Hypothesis enabled the creation of publisher-branded and moderated annotation groups: publisher groups. This has two main advantages: (a) readers and annotators can activate different layers depending on the read-write experience they want to have; (b) every publisher retains the ability to moderate the annotations made as part of this experiment.
Fast and reliable notification of authors and commentators. One of the main features that motivated OpenEdition to use the Hypothesis tool was the reply feature. When a few annotations are made, authors are notified by the publishing secretary in charge of monitoring the project and encouraged to respond to their readers’ annotations if they deem it appropriate. This gives readers the opportunity to react directly to the annotations of other annotators, and thus to achieve one of the objectives of this experiment, i.e., to create a real conversation and provide feedback to authors.
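
The reply-based conversation described above can be sketched as a small threading routine that also shows which annotations still await a response. This is a minimal sketch under stated assumptions: each annotation is taken to be a dictionary with an `id` and, for replies, a `references` list whose first entry is the thread's root annotation (our understanding of the JSON returned by the Hypothesis API). The sample data and function names are invented for illustration.

```python
from collections import defaultdict


def group_threads(annotations):
    """Map each top-level annotation id to the ids of its replies."""
    threads = defaultdict(list)
    for ann in annotations:
        refs = ann.get("references") or []
        if refs:
            # A reply: file it under the root of its thread.
            threads[refs[0]].append(ann["id"])
        else:
            # A top-level annotation: make sure it has an entry.
            threads.setdefault(ann["id"], [])
    return dict(threads)


def unanswered(annotations):
    """Return ids of top-level annotations that have no replies yet."""
    return [root for root, replies in group_threads(annotations).items()
            if not replies]


# Invented sample data: one answered annotation and one still open.
SAMPLE = [
    {"id": "a1", "user": "reader1"},
    {"id": "a2", "user": "author", "references": ["a1"]},
    {"id": "a3", "user": "reader2"},
]

threads = group_threads(SAMPLE)
open_items = unanswered(SAMPLE)
```

A list of unanswered annotations of this kind is the sort of overview a publishing secretary could use when deciding which authors to notify.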

4.3.3. Checking the Quality of HTML Content at Ubiquity Press

Ubiquity Press used Hypothes.is in the quality assurance of published content that was being transferred to their platform.13 Although the example presented refers to a journal, it is very likely that similar best practices will also be applied to monographs, which are also published on the Ubiquity platform. Transferring HTML versions of content can be challenging: the environment in which the HTML file sits changes, so the HTML may not display as expected. It can also be an opportunity to improve old HTML (for example, to make it reflowable for small screens or to add accessibility features). An annotation group was set up with the Editor-in-Chief and their team of volunteer students, and the transferred HTML content was quality checked. Where transfer problems were identified, in-situ comments described the problem in context and could be replied to for clarification. Hashtags were used for recurring problems (CSS (Cascading Style Sheets) incompatibility issues, Unicode encoding issues, etc.), so that frequent problems could be solved as a batch or with a global fix. Ubiquity Press will consider using Hypothes.is for other complex transfer projects in the future, as it helped track and manage the issues and also worked as a great communication tool.
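
The use of hashtags to find problems worth fixing as a batch can be illustrated with a short tally. This is only a sketch, not Ubiquity Press's workflow: the annotations are assumed to be dictionaries with a `tags` list, and the tag names, sample data, and threshold are invented for the example.

```python
from collections import Counter


def tally_tags(annotations):
    """Count how many annotations carry each tag (once per annotation)."""
    counts = Counter()
    for ann in annotations:
        counts.update(set(ann.get("tags", [])))
    return counts


def batch_candidates(annotations, threshold=3):
    """Return tags frequent enough to justify a batch or global fix."""
    counts = tally_tags(annotations)
    return sorted(tag for tag, n in counts.items() if n >= threshold)


# Invented QA annotations from a hypothetical content transfer.
QA_NOTES = [
    {"text": "Table borders missing", "tags": ["css"]},
    {"text": "Accents garbled", "tags": ["unicode"]},
    {"text": "Figure overflows column", "tags": ["css"]},
    {"text": "Heading font wrong", "tags": ["css"]},
    {"text": "Broken link in footnote", "tags": []},
]

counts = tally_tags(QA_NOTES)
candidates = batch_candidates(QA_NOTES)
```

Problems below the threshold would still be handled individually through the in-context comments and replies.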

5. Conclusions

What came out of experimenting with new forms of annotation on digital monographs during the HIRMEOS project suggests that there is a continuous, albeit slow, process of fluidification of the traditional concepts of books and monographs. After a first phase in which digitization aimed to replicate the structure of the traditional paper format in the new one, mass annotation contributes to a process in which the monograph becomes a kind of hub or container for a multitude of contributions. However, this should not lead to a chaotic coexistence of contributions from different authors and, as can be seen in the examples described in this paper, there are already tools and best practices to regulate and structure the annotation process and thus better define the type of output expected from such interaction with the monographs. This also means that the breaking down of the traditional boundaries of the book reveals the need to develop new editorial solutions to adequately present texts and annotations.

For publishers and digital platforms, new scenarios are opening up in response to the need for new publications that, on the one hand, must still have their own identity and authorship, which makes them recognizable and quotable, but, on the other hand, can also be living entities in constant development. It is also a matter of developing graphic solutions that make it possible to present a multitude of contents without impairing the user-friendliness of a text. Finally, it is necessary to clarify how best to archive annotations created with mass annotation tools over a long period of time, so that they remain permanently associated with a certain work and are easily discoverable. Once these questions have satisfactory answers, new ways of exchanging ideas and presenting them in digital publications will be possible. However, we cannot yet fully foresee the impact this will have on research, education and intellectual life in general.

Author Contributions

Both authors co-wrote the article with contributions throughout.

Funding

This research was funded by the European Union’s Horizon 2020 research and innovation program under grant agreement 731102.

Conflicts of Interest

The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

References

Arbesman, S. The Network Structure of Jewish Texts. Wired. 10 July 2014. Available online: https://www.wired.com/2014/07/the-network-structure-of-jewish-texts/ (accessed on 2 May 2019).
Wolfe, J.L.; Neuwirth, C.M. From the Margins to the Center: The Future of Annotation. J. Bus. Tech. Commun. 2001. Available online: https://journals.sagepub.com/doi/10.1177/105065190101500304 (accessed on 3 May 2019).
Bush, V. As We May Think. The Atlantic. 1945. Available online: https://www.theatlantic.com/magazine/archive/1945/07/as-we-may-think/303881/ (accessed on 2 May 2019).
Andreessen, M. Why Andreessen Horowitz is Investing in Rap Genius. Genius Blog. Available online: https://genius.com/Marc-andreessen-why-andreessen-horowitz-is-investing-in-rap-genius-annotated (accessed on 2 May 2019).
W3C. Three Recommendations to Enable Annotations on the Web. W3C Blog. 23 February 2017. Available online: https://www.w3.org/blog/news/archives/6156 (accessed on 2 May 2019).
NCBI Insights. PubMedCommons to be Discontinued. 2018. Available online: https://ncbiinsights.ncbi.nlm.nih.gov/2018/02/01/pubmed-commons-to-be-discontinued/ (accessed on 2 May 2019).
Mod, C. The Future Book is Here, But It’s Not What We Expected. Wired. 20 December 2018. Available online: https://www.wired.com/story/future-book-is-here-but-not-what-we-expected/ (accessed on 2 May 2019).
Wolfe, J. Annotations and the collaborative digital library: Effects of an aligned annotation interface on student argumentation and reading strategies. Int. J. Comput. Support. Collab. Learn. 2008, 3, 141–164.
Bertino, A.; Foppiano, L.; Romary, L.; Mounier, P. Leveraging Concepts in Open Access Publications. Preprint. Available online: https://hal.inria.fr/hal-01981922 (accessed on 2 May 2019).

1	https://musically.com/2017/03/16/genius-pivots-towards-easy-consume-formats-like-video/
2	FAIR (findable, accessible, interoperable, reusable)
3	CLOCKSS, or Controlled LOCKSS (Lots of Copies Keep Stuff Safe), is a shared dark archive that runs on LOCKSS technology (https://clockss.org/); PORTICO is a digital preservation service funded by libraries and publishers (https://www.portico.org/).
4	Collaborative European Digital Archival Research Infrastructure (www.cendari.eu)
5	DARIAH, Digital Research Infrastructure for the Arts and Humanities (www.dariah.eu). HumaNum, a research infrastructure aimed at facilitating digital change in the SSH (humanities and social sciences) (www.huma-num.fr/).
6	http://books.openedition.org/catalogue
7	https://www.univerlag.uni-goettingen.de/handle/3/Goettingen_studies_in_cultural_property_series
8	Further details and examples of the entity-fishing based services tested on the platforms participating in HIRMEOS can be found in [9].
9	We thank Claire Dandieu (OpenEdition Center) for her contribution to this chapter.
10	https://oep.hypotheses.org/2122
11	http://www.maisondesrevues.org/1281
12	https://www.openedition.org/22530
13	https://journal.digitalmedievalist.org

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
