Case Report
Peer-Review Record

The Fast and the FRDR: Improving Metadata for Data Discovery in Canada

Publications 2020, 8(2), 25; https://doi.org/10.3390/publications8020025
by Clara Turp 1,*, Lee Wilson 2, Julienne Pascoe 3 and Alex Garnett 4
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3:
Submission received: 14 February 2020 / Revised: 27 March 2020 / Accepted: 27 April 2020 / Published: 2 May 2020

Round 1

Reviewer 1 Report

This paper covers several topics of current interest, and the problem of inconsistent metadata is a serious concern in the field of LIS.  Discovering new methods of reconciling metadata with minimum manual effort will be critical to improving the interoperability of our systems and consequently making records discoverable.  It is worthwhile to explore the merits of using an open tool to clean up metadata programmatically, but the execution falls short in several respects.

The biggest issue is that the project of using OpenRefine to establish a metadata workflow is described, but the intended significance is unclear.  Is the argument that this methodology could be viable in the future with some adjustments?  The low number of automatic matches reported in the paper and the ongoing amount of human mediation required to match keywords to FAST (when they can be matched at all) suggests otherwise.  If the authors intend to argue the merits of continuing to develop this workflow and hopefully make it feasible for real adoption, they need to do so more explicitly and provide reasons why the initial lack of success is not as big of an obstacle as it seems.  If, on the other hand, this paper is intended as a report of “negative results”, more or less, that is also fine and worthy of publication, but then the lessons learned need more focus.  What was it about OpenRefine that made it fail as a tool?  Or, why was it ultimately not workable to translate the keywords into FAST headings/vocabulary?  In other words, a negative result still needs to be actionable so that those who try similar projects know what not to do.

As might be expected from a paper that is unclear on what is fundamentally being said, organization is another serious problem.  Substantial reorganization of the paper is required so that there is a logical order: from background to problem to attempted solution to significance.  Sections are often written out of order, which is fine, but then another pass of the full paper needs to be given for flow.  One example of a problem is that acronyms such as FAST and FRDR are used before they are written out completely (in the body text), then written out completely later on as if it is the first time they are being used.  There is also an issue with major section headings.  While the intro is clear and concise and the next heading, literature review, is standard, the subsequent nonstandard heading (which also repeats “2”) is about context and then the next major heading is labeled “Discussion”, which is typically found near the end of the paper.  Recommended to organize the paper around more standard headings – understood that this is a case study, but one section could be “rationale” (in place of methods) following on from the lit review/context and then “procedure” (in place of results) describing what was done.  Then the discussion can go into the significance and the conclusion can sum up.

Suggested course of action is to step back and agree on what, specifically, you want to communicate about this attempt to improve the FRDR user experience, then focus the paper around that.  Clarifying the most important points will make this report a positive contribution to the field.

Comments for author File: Comments.docx

Author Response

Please see the attachment

Author Response File: Author Response.docx

Reviewer 2 Report

This is an interesting case study of a unique and ambitious project. Both the project and its description have value for the broader libraries community, but there are areas where this paper could be strengthened. My comments on this paper are enumerated below.

1. There are relatively minor but distracting grammatical errors throughout the paper. For example, in the introduction, the authors write that "Libraries, having both a high degree of confidence in their own data and a high expectation of the labour involved in producing it, have been fairly slow to adopt automatic entity resolution for bibliographic control systems, preferring wherever possible to perform this canonical lookup manually." One would assume that the "it" in "the labour involved in producing it" would refer to the Libraries' own data based on the sentence structure, but then it doesn't follow that libraries would be slow to adopt an automatic solution and would instead prefer manual lookup if the latter is associated with a great deal of labour.

2. The literature review overall feels disjointed and awkward. For example, OpenRefine is introduced with bullet points and then is not mentioned again until the "Developing a Proof of Concept" section much further in the paper. It's not until the reader reaches that section that the purpose of introducing OpenRefine becomes clear. Given that the current literature review is only about two paragraphs long, it would be useful to expand this section to provide greater flow.

3. I was surprised not to see reference to the Tri-Agencies Council and their promotion of data management and sharing. This seems like an important piece of context. It would also be helpful for the authors to provide more information about data sharing and repositories as a part of their literature review, since this would provide context but also make the work more meaningful outside of their specific environment.

4. Although the number of repositories is reported, it would be useful to have a sense of the number of institutions overall in the country. A numerator without a denominator isn't particularly meaningful.

5. It would be useful to have more details about the institutions involved in the working groups. Were these entirely large, research intensive institutions, or was there a mixture of sizes and institution types? Can you provide any insight into the types of systems these institutions were using, how they were staffed, etc.?

6. The screenshot of OpenRefine is not going to translate well to an article. The text is small and difficult to read. For people unfamiliar with OpenRefine, it just looks like a spreadsheet editor. For those familiar with OpenRefine, it is unnecessary. If the authors feel that a visual would be useful in highlighting their workflows or processes, it would be preferable to develop a visual outlining that as opposed to a screenshot of an interface.

Author Response

Please see the attachment

Author Response File: Author Response.docx

Reviewer 3 Report

This case report is potentially valuable to the library community; however, significant revisions would be needed for this draft to be publishable. This is mostly due to the lack of academic and scholarly literature supporting the claims made in the article. Specifically, none of the claims in the introduction are supported by citations of any kind. Additionally, the literature review contains only two citations, one of which is the results of a survey. Please consider consulting the research data management and data curation literature to support statements made in your introduction and literature review. In general, the literature review should be more robust.

This report should also be reformatted into a more logical structure. Please consider reviewing other case reports in MDPI publications, as well as other articles in Publications, for ideas. Specifically, there should be a dedicated methodology section that houses the technical workflows in your article. The discussion section is currently the largest part of the paper and includes much material that belongs in the methods and other sections.

While this work is valuable to the profession and should be shared as a case report, significant revisions to the introduction, literature review, and overall arrangement/structure of the article are needed in order for it to be suitable for publication.

Author Response

Please see the attachment

Author Response File: Author Response.docx

Round 2

Reviewer 1 Report

This version of the paper has improved significantly over the last version.  There are a few sections that could be made more readable, but there are no longer structural issues and the broader significance of the project is now clear.  The explanation of the Canadian RDM context has improved as well.  Further changes would mostly need to focus on sentence clarity and flow.

Reviewer 3 Report

The authors have put significant efforts towards citing additional literature in the study. The literature cited is relevant and backs the claims being made. Additionally, the general reformatting of the paper added a great deal of clarity to the article. The quality of this article has been improved to a point where I believe that this is ready for publication. 
