Review Reports - Investigating the Impacts of Misspellings in Patent Search by Combining Natural Language Tools and Rule-Based Approaches

Round 1

Reviewer 1 Report

The title of the paper suggests that it is devoted to the evaluation of typo-robust word/subword/sentence embeddings. Nevertheless, the actual content of the paper includes a typo ontology. As far as I know, several such ontologies have been presented over the past decades, e.g. [1]. I suggest the authors do a background check on this topic. I would therefore expect a comparison of the proposed ontology with the existing ones. Possible aspects of comparison include recall in terms of types, coverage, and applicability.

The other crucial point is that there is no evaluation in the paper. The ontology, while interesting in itself, is not enough to justify the title of the paper. I would expect a generation algorithm that uses this ontology, and an evaluation of this algorithm against real data. I would then expect to see a description of the corpus of queries and patents (each of which could be misspelled). Finally, I would expect actual word/subword embedding models to be used in a retrieval system, with a demonstration of the improvement gained from using a) such models, b) noise generation, and c) the typo ontology.

In closing, I would like to say that the paper cannot be accepted in its current form. It needs serious effort from the authors to be considered complete.

Reference:

1. Suzuki, H., & Gao, J. (2012, May). A unified approach to transliteration-based text input with online spelling correction. In Proceedings of EMNLP.

Author Response

All the comments are listed in the attached file.

Author Response File: Author Response.pdf

Reviewer 2 Report

The authors present a timely and interesting article about misspellings in patent texts. The reviewer concurs with the authors that this is a common but grave issue facing the scientific community at large. The only issue I found with the current manuscript is the presence of a few misspellings. Examples include ‘volontary’ in Table 1; Ln 203: ‘Kunming University of ….’; and Ln 296, which should read ‘Robert Bosch’.

It is ironic that a paper on misspellings should itself contain such grave issues. The authors are encouraged to proofread the manuscript properly for the next round.

Author Response

We thank you for the comments provided. The manuscript was proofread and the typos have been corrected.

Round 2

Reviewer 1 Report

I see the improvement in the description of typo generation, which is great. But there is still no empirical evaluation of the method on open-source (or at least easily available) datasets. There is also no baseline comparison.

The paper cannot be accepted in its present form.