Publications 2014, 2(1), 38-43; doi:10.3390/publications2010038

Article
The Means of (Re-)Production: Expertise, Open Tools, Standards and Communication
Martin Paul Eve
University of Lincoln, Brayford Pool, Lincoln LN6 7TS, UK; E-Mail: martin@martineve.com
Received: 23 January 2014; in revised form: 7 February 2014 / Accepted: 10 February 2014 / Published: 13 February 2014

Abstract

This article examines the current difficulties faced in penetrating the world of scholarly communication technology. While there have been large strides forward in the disintermediation of digital publishing expertise (most notably by the Public Knowledge Project), a substantial number of barriers remain. This paper examines a case study in terms of scholarly typesetting and the Journal Article Tag Suite (JATS) standard before moving to suggest three potential solutions: (1) The formation of open, non-commercial and inclusive (but structured) organizations dedicated to the group exploration and standardisation of scholarly publishing technology; (2) The collective authoring of as much technological and process documentation on scholarly publishing as is possible; (3) The modularisation of platforms and agreement on standards of interoperability. Only through such measures is it possible for researchers to reclaim the means of (re)production, for the remaining barriers are not difficult to understand, merely hard to discover.
Keywords: publishing; open access; technology; typesetting; JATS; XML

1. Introduction

A significant weakness in the current system of scholarly publishing lies in the esoteric black-box of the commercial toolchain: “how”, the naive newcomer asks, “do they get from my messy Word document through to those polished HTML and PDF files?” “What”, asks her friend, “do they use to organize their peer review? Whence this polished platform?” Thus, while we have a proliferation of academic-run open access journals, running mostly on the Public Knowledge Project’s Open Journal Systems, there are significant gaps in the process that non-technical users are unable to correctly implement, including, but not limited to: digital preservation, CrossRef DOI setup and typesetting.

In this article, I want to concentrate on the barriers to entry to the technical side of publishing, using systems of typesetting as an example, and suggest that reforms to scholarly publishing must include openness of the tools that are used as a criterion. I also want to analyse what is meant by publishers having “expertise” in technological systems and, through a quick detour into studies of expertise, I note that the barriers to entry here may be predicated more on commercial obscurity than on any true underlying complexity. Until we control these means of production, however, none of the economic benefits that are often extolled as a prime concern of the Open Access movement will come to fruition. I therefore propose three steps that could let us test whether this expertise is really something beyond the grasp of the masses.

2. Typesetting: A Case Study

In answer to the hypothetical newcomer I posited at the start of this article: scholarly articles are often transformed into the final products through an XML interchange specification called the Journal Article Tag Suite (JATS), previously known as the National Library of Medicine (NLM) Journal Publishing Tagset and now standardised as NISO Z39.96, or through a similar type of proprietary XML-based setup [1]. XML is a markup language that encloses textual elements in tags that encapsulate information about that text. For instance, the start of a JATS document might look a little like this:

<front>
  <journal-meta>
   <journal-id>Orbit: Writing Around Pynchon</journal-id>
   <issn>2044-4095</issn>
  …
  </journal-meta>
</front>

In this instance, the markup represents the front-matter of the article: the metadata about the journal in which the article is published, including the ID of the journal and its ISSN. Each opening tag has a corresponding closing tag that prefixes its element name with a forward slash: <open></open>.
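As a hedged illustration of how readily such markup can be read by a machine, the following minimal sketch parses the fragment above with Python's standard xml.etree.ElementTree module. The elided portion of the fragment is simply left out here, and the function name read_journal_meta is a hypothetical choice for this example rather than part of JATS or of any publisher's toolchain.

```python
import xml.etree.ElementTree as ET

# The fragment from the text, with the elided portion omitted.
JATS_FRONT = """\
<front>
  <journal-meta>
    <journal-id>Orbit: Writing Around Pynchon</journal-id>
    <issn>2044-4095</issn>
  </journal-meta>
</front>
"""


def read_journal_meta(xml_text):
    """Return the children of <journal-meta> as a tag -> text dictionary."""
    root = ET.fromstring(xml_text)      # raises ParseError if a tag is left unclosed
    meta = root.find("journal-meta")    # direct child of <front>
    return {child.tag: (child.text or "").strip() for child in meta}


journal_meta = read_journal_meta(JATS_FRONT)
print(journal_meta)
```

Because every opening tag must have a matching closing tag, the parser rejects a malformed fragment outright, which is part of what makes the format dependable as an interchange layer.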

These XML formats promise a great deal: the combined human- and machine-readability that they offer makes them an appealing interchange format between the moderately complex Microsoft Word docx format (which is, in actuality, a zip file containing a series of complex and interconnected XML documents that are difficult to parse), its predecessor .doc (which is an even harder task to process) and the likewise complex PDF outputs. They also promise a way in which it becomes theoretically simple to synchronise different output formats: after all, through such a process, it becomes possible to generate a range of output objects (PDF, HTML, EPUB, etc.) from a single source document, thus ensuring a consistency between the formats.
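The single-source idea can be sketched in a few lines. The example below is an illustration only, assuming a drastically simplified article that uses a handful of JATS element names (article, front, article-meta, title-group, article-title, body, p); the sample content and the helper functions extract, to_html and to_text are hypothetical and stand in for the far richer transformations that a production workflow would need.

```python
import xml.etree.ElementTree as ET
from html import escape

# A deliberately tiny, made-up article using a few JATS element names.
ARTICLE = """\
<article>
  <front>
    <article-meta>
      <title-group>
        <article-title>A Sample Article</article-title>
      </title-group>
    </article-meta>
  </front>
  <body>
    <p>First paragraph.</p>
    <p>Second paragraph, with an ampersand: R&amp;D.</p>
  </body>
</article>
"""


def extract(xml_text):
    """Pull the title and paragraph texts out of the single XML source."""
    root = ET.fromstring(xml_text)
    title = root.findtext("front/article-meta/title-group/article-title", default="")
    paragraphs = ["".join(p.itertext()).strip() for p in root.findall("body/p")]
    return title.strip(), paragraphs


def to_html(title, paragraphs):
    """Render an HTML fragment, escaping characters that are special in HTML."""
    parts = ["<h1>{}</h1>".format(escape(title))]
    parts += ["<p>{}</p>".format(escape(p)) for p in paragraphs]
    return "\n".join(parts)


def to_text(title, paragraphs):
    """Render a plain-text version with an underlined title."""
    heading = title + "\n" + "=" * len(title)
    return "\n\n".join([heading] + paragraphs)


title, paragraphs = extract(ARTICLE)
html_output = to_html(title, paragraphs)
text_output = to_text(title, paragraphs)
print(html_output)
print(text_output)
```

Both renderings are derived from the same parsed source, so a correction made in the XML propagates to every output format on the next run.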

As utopian as JATS sounds, however, and despite its canonical open credentials (it is a NISO standard), there are several problems with the format that can roughly be divided (as most can) into the technological and the social (and, in this instance, the two are deeply interconnected). Let me start with the technological. Publishers spend tens of thousands of dollars each year to license tools such as eXtyles that plug in to Microsoft Word to create typesetting workflows that enable them to get from Word, to JATS, to their output format. These are often hideously complex and, in almost all instances that I have encountered, proprietary. Take for instance a description of a desirable workflow from the 2010 JATS Conference:

  • The Federation of Animal Science Societies recently implemented an XML-based journal publishing workflow with the eXtyles software and Typéfi Publish at the core. With eXtyles, we export XML (validated to the NLM Journal Publishing DTD) from edited Word files, which is used by the Typéfi Publish system to automatically generate a composed journal article […] Our technical editors work in the familiar environment of Word, compositors export XML (validated to the NLM Journal Publishing DTD) from Word using eXtyles, and the Typéfi Publish engine uses InDesign server, a journal-specific template, NLM XML, and graphics files (figures and math equations) to compose and produce an InDesign article and PDF [2].

The complexity and interaction between various proprietary procedures here fundamentally undermine the notion that JATS lives up to its promise: the commercial roots of the format have played out such that, with a limited target audience, tools that enable its usage are proprietary, expensive and opaque, since the profit incentive discourages inter-organisational participation. Indeed, that no price or purchasing options are available on the eXtyles website gives an indication of the type of corporate customer at which their product is aimed.

This exclusionary mode of technological availability, triggered by the commercial histories of the JATS format, leads to the second problem in this area of typesetting, but one that is replicated through most of the publishing toolchain: a lack of social awareness of processes, fostered by a closed commercial climate. The efforts of John Willinsky and the Public Knowledge Project have done much to facilitate the creation of journals that correctly comply with many discoverability protocols with little to no effort and/or technical knowledge on behalf of the creator. In other areas, though, there is still a long way to go. Consider that, at present, typesetting, the case under consideration here, remains an enormous gap in Open Journal Systems’ (OJS) otherwise comprehensive workflow. There is simply nothing between the submission document’s copyediting and “upload galley” stages, and the editor must produce galleys from (in all likelihood) Microsoft Word, with no guidance on how to get to the PDF and HTML versions that most journals provide. Likewise, there is a hopeful-looking CrossRef Export plugin for OJS, but little guidance on how one goes about obtaining the, for all intents and purposes, mystical DOIs.

3. Expertise and Commercial Obscurity

Extrapolating from this niche area to the wider field, many publishers tout their esoteric knowledge of technological publication practices as the zone in which they add value; it is one of their areas of expertise. It is worth considering, however, what is meant by the term “expertise”. Expertise is defined by The Cambridge Handbook of Expertise and Expert Performance as “the characteristics, skills, and knowledge that distinguish experts from novices”, experts being those who “exhibit superior performance for representative tasks in a domain” [3]. While there are many ways in which expertise can be analysed and in which experts are assessed, it is notable that, in some domains, experts perform no better than less-trained individuals and that sometimes experts’ decisions are no more accurate than beginners’ decisions and simple decision aids [4]. In some domains, therefore, such as sprinting, “experts” can easily be specified as those who are faster and it is clear that experts fare better than non-experts. In others, such as academia, experts seem to be those who can consistently produce work that their peers deem valuable within the disciplining constraints of their fields. In areas of commercial endeavour, such as the form of publishing under discussion here, experts are those who can create the outputs that fulfil the expectations of their customers (predominantly academic libraries and, in turn, faculty). These last two fields form an interesting basis for a case study on the effects of commercial closure on “expertise”.

The university is interesting as a counter-example to a commercial landscape here because the motivation of this type of education from the side of the pedagogue is, ideally, non-rivalrous and anti-competitive; it can be applied in other environments with no detriment to the university/academic/teacher. Academics are those who show their students how to create outputs that look like, and hence are understood as, academic work. (I do not dismiss the loftier goal and ultimate purpose of furthering knowledge and understanding in the field but am more interested here in ascertaining the parameters that constitute an expert, the baseline for which seems to be disciplinary conventions.) This is done, however, on the understanding that this learning will be applicable in many different fields of endeavour across multiple organisations and institutions. Although academics are institutionally rooted, the education they facilitate is not tied to a single institution (in an academic context, doctoral students are not supervised in the expectation that they will work at that institution, for instance), and this is the case even in more vocational courses where the skills taught could apply to work in a number of competing commercial entities. The university teaches for its own financial benefit, perhaps, but what it teaches, it ideally teaches autonomously and in abstraction from a single institution in most cases.

Commercial entities, conversely, have a competitive incentive to ensure that the expertise (“training”) that they might impart to their staff remains rivalrously rooted within and for the sole benefit of their own institution. In the example of academic publishers, the knowledge that allows the individual to learn of the best practices for the reproduction of the objects recognised as “scholarly publications” is commercially sensitive and so is not widely talked or written about, despite its open standardisation. This is exemplified in non-disclosure agreements and funded placements at universities, where industry partners specify a period of service for which the employee must work solely for them in exchange for covering the tuition fees. This aspect is exacerbated, also, by outsourcing procedures, meaning that even fewer publishers have this knowledge in-house and there is a sub-field of expertise to which publishers are beholden.

This background is necessary because it clarifies why aspects of scholarly publishing seem so obscured. Although enshrined in open standards, the original NLM specification, as just one example, was formed from the desire to unify the XML formats of several different commercial publishers, thus saving each of the publishers labour while not truly compromising their trade secrets with one another. Nonetheless, a culture of competitive secrecy seems endemic here and deeply ingrained; there are no “how to” guides written by publishers, even for sale, encouraging amateurs to educate themselves in scholarly typesetting and there are relatively few blog/forum posts discussing the format and its lack of tools, indicating a low level of awareness of the process by which scholarly articles come into being.

Interestingly, though, it is not that these practices are incomprehensible or arcane (despite the jargon, it is relatively easy with a basic technical knowledge of XML to understand the schemata and processes involved) but that, from commercial roots, the communities of practice are obscure and the domains niche with limited general availability/inclusion. It is also not that there is no room for expertise here. Indeed, expertise, although often founded on obscurantism, should, in reality, be considered as the skilled application of knowledge to understanding in order to achieve a meaningful outcome; the ability to pursue a methodology that facilitates an interpretation through a specific knowledge-base. In the world of scholarly publishing (although hardly the sole area where this happens), a system has instead developed where technical obscurantism fostered within a small enclave environment has engendered a domain that is disproportionately difficult to enter relative to the difficulty of understanding the precepts involved. Rephrased: the technical mechanisms are not difficult to understand; they are hard to discover, an aspect that I attribute to the commercial roots and secretive culture of the technical processes, even when publicly standardised. Although I am sure that publishers would point to the fact that these are open standards and that there are public conferences on the topic, the facts that all the tools designed to parse these standards are proprietary and that so little documentation on process is available are symptomatic of the commercial climate in which these technologies circulate. That, then, is not a function of technology but a social function of bodies that cloak their use of simple technologies under the masquerade of expertise.

4. Solutions: Open Tools, Communication and Standards

While it is wise to be wary of “openness” as a panacea, a simple empirical test can be formed to establish the basis for traditional publishers’ claimed expertise in the area of publishing technology. Through a proliferation of discourse and non-rivalrous education on the technological means and processes through which our publications come into being, we can ascertain whether the barrier to expertise in this field is actually technological or corporate-social. By putting in place a series of open initiatives that have an educational and communicative function, rather than “skills training” and commercial-secretive aspects, we can test whether the community can understand and formulate best practices outside of the commercial environment.

Specifically, there are a number of steps that I feel could be beneficial:

  • The formation of open, non-commercial and inclusive (but structured) organizations dedicated to the group exploration and standardisation of scholarly publishing technology. The Open Access Toolset Alliance initiative, which I founded in July 2013, is one such body where we hold working group discussions and also advertise the technology that we are building, working out what different groups share in common [5]. We are open to working with commercial publishers, but they must be interested in developing open platforms and demonstrate a commitment to a “spirit of openness” to be involved.

  • The collective authoring of as much technological and process documentation on scholarly publishing as is possible. This should take the form not only of blog posts, an incredibly important form of dissemination for wider community and cross-disciplinary engagement, but also of articles in digitally preserved journal formats. Journals of librarianship and scholarly communication should encourage articles that document the current state-of-the-art publishing practices in forms that are accessible to newcomers to the field, perhaps through issues on “breaking into scholarly publishing”. Through such publications, the boundaries of expertise could be broken down.

  • The modularisation of platforms and agreement on standards of interoperability. One of the problems with existing platforms is that they are difficult to alter (“hack” in the non-pejorative sense). Open Journal Systems and Ambra, the two stand-out open platforms for submission, document management, review process and hosting, are both relatively monolithic, undocumented and single purpose [6]. Other platforms such as Annotum are tightly integrated with other systems, like WordPress. Through abstraction and interface design, we can create next-generation platforms that are repurposable and easier to comprehend for new developers, thereby enlarging the potential contributor pool and also allowing for the implementation of new, unforeseen workflows.
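To make the third proposal slightly more concrete, the following is a minimal sketch of what “abstraction and interface design” might look like in code. Everything in it is an assumption made for illustration: the Stage interface, the Typesetter and IdentifierStub classes and the run_pipeline function are hypothetical names, and they do not describe the architecture of Open Journal Systems, Ambra, Annotum or any other existing platform.

```python
from html import escape
from typing import Dict, List, Protocol

Document = Dict[str, str]


class Stage(Protocol):
    """Hypothetical interface: any object with a name and a run() method qualifies."""

    name: str

    def run(self, document: Document) -> Document:
        ...


class Typesetter:
    """Illustrative stage that turns raw text into a trivial HTML galley."""

    name = "typeset"

    def run(self, document: Document) -> Document:
        out = dict(document)  # copy, so earlier stages' results are not mutated
        out["galley"] = "<p>{}</p>".format(escape(document["text"]))
        return out


class IdentifierStub:
    """Illustrative stage that attaches a placeholder identifier (not a real DOI)."""

    name = "identifier"

    def run(self, document: Document) -> Document:
        out = dict(document)
        out["identifier"] = "placeholder-id"
        return out


def run_pipeline(stages: List[Stage], document: Document) -> Document:
    """Pass the document through each stage in order."""
    for stage in stages:
        document = stage.run(document)
    return document


result = run_pipeline([Typesetter(), IdentifierStub()], {"text": "Hello"})
print(result)
```

Because each stage depends only on the shared interface, a new developer could replace or add a single stage without needing to understand the rest of the platform, which is the kind of repurposability the proposal has in mind.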

This process cannot guarantee success, but it can at least help us to begin to understand the ownership mechanisms and systems of expertise that govern the current means of re-production.

Conflicts of Interest

The author works on open-source typesetting technologies funded by the Public Knowledge Project. The author is also establishing an open access publishing company.

References and Notes

  1. Beck, J. NISO Z39.96 The Journal Article Tag Suite (JATS): What happened to the NLM DTDs? J. Electron. Publ. 2011, 14.
  2. Adam, L.R.; Chandi, P. eXtyles, Typefi, and the NLM Journal Publishing DTD. J. Artic. Tag Suite Conf. Available online: http://www.ncbi.nlm.nih.gov/books/NBK47080/ (accessed on 11 February 2014).
  3. Ericsson, K.A. An Introduction to Cambridge Handbook of Expertise and Expert Performance: Its Development, Organization, and Content. In The Cambridge Handbook of Expertise and Expert Performance; Ericsson, K.A., Feltovich, P.J., Hoffman, R.R., Eds.; Cambridge University Press: Cambridge, UK, 2006; pp. 3–19.
  4. Camerer, C.F.; Johnson, E.J. The Process-Performance Paradox in Expert Judgment. How Can Experts Know so Much and Predict so Badly? In Toward a General Theory of Expertise; Cambridge University Press: Cambridge, UK, 1991; pp. 195–217.
  5. Eve, M.P. Evaluating the Open Access Software Toolchain. Available online: https://www.martineve.com/2013/07/07/evaluating-the-open-access-software-toolchain/ (accessed on 7 July 2013).
  6. The lengthy “quick start” guide for Ambra. Available online: http://ambraproject.org/trac/wiki/QuickStart (accessed on 11 February 2014).
Publications EISSN 2304-6775. Published by MDPI AG, Basel, Switzerland.