Investigating the Impacts of Misspellings in Patent Search by Combining Natural Language Tools and Rule-Based Approaches
Abstract
1. Introduction
- HP. 1: Misspellings can be present in any part of the text of a patent, e.g., the description, the applicant, or the inventor name.
- HP. 2: Different types of misspellings, contained in the same or different parts of a patent text, can influence the information retrieval from patents more or less severely.
- Different search strategies (e.g., single or multi-words);
- The specific applicability to patent search;
- Different and random searched arguments (e.g., inventor, applicant, technology);
- The presence of different misspellings (in the queries used), generated with misspelling-generation tools;
- Validity for general, domain-independent application fields.
2. Literature Background about Misspellings Definition and Classification
2.1. Accidental Misspellings
2.2. Voluntary Misspellings
2.3. Misspellings Generation
2.3.1. Rule-Based
2.3.2. Statistical Method
2.3.3. Deep Learning-Based Method
3. Methodology
- Step 1 (selecting the patent database) identifies the patent database in which the misspellings are to be searched.
- Step 2 (classical patent search) defines the queries used to search for patents on a given topic, using correctly spelled keywords that define that topic. The goal here is to count the patents that contain the correctly spelled keywords.
- Step 3 (patent search including misspellings) defines the queries used to search for patents that contain the same information but may be written incorrectly. The keywords in these queries therefore combine all the incorrect ways in which that information can be written. Within step 3, the misspellings are generated by combining natural language tools with rule-based approaches.
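The rule-based side of step 3 can be pictured with a minimal Python sketch. It assumes, purely for illustration, that candidate misspellings are limited to single-character edits (deletion, transposition, substitution, insertion) over the uppercase Latin alphabet. The function name `single_edit_variants` is hypothetical, and this is not the generation tool used in the study.

```python
import string


def single_edit_variants(word: str, alphabet: str = string.ascii_uppercase) -> set[str]:
    """Return every string exactly one character edit away from `word`.

    Illustrative assumption: only deletions, adjacent transpositions,
    substitutions and insertions are considered.
    """
    # All ways to cut the word into a left part and a right part.
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]

    deletes = {left + right[1:] for left, right in splits if right}
    transposes = {
        left + right[1] + right[0] + right[2:]
        for left, right in splits
        if len(right) > 1
    }
    replaces = {
        left + ch + right[1:] for left, right in splits if right for ch in alphabet
    }
    inserts = {left + ch + right for left, right in splits for ch in alphabet}

    variants = deletes | transposes | replaces | inserts
    variants.discard(word)  # substitutions and transpositions can reproduce the word
    return variants


variants = single_edit_variants("BICYCLE")
```

Several of the misspellings reported for BICYCLE in the results (BICYLE, BICYLCE, BYCYCLE, BIICYCLE) fall inside this single-edit set, whereas forms such as BI CYCLE or BICYCCLETTE would need additional rules.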
3.1. Step 1—Selecting the Patent Database
3.2. Step 2—Classical Patent Search
3.3. Step 3—Patent Search including Misspellings
- (+) Right truncation replaces any number of characters at the end of a term (bicycle+);
- (?) Truncation replaces zero or one character (bicycle?);
- (#) Truncation replaces exactly one character (b#c#cle);
- (_) The underscore allows simultaneous searching of terms that may be written as one or two words. It also retrieves results where the terms are joined by a hyphen, and it can be used in chemical formulas.
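The behaviour of the four operators above can be approximated with regular expressions. The sketch below is only an illustration and is not the database's actual matching engine. It assumes that a "character" means a word character (`\w`), that `+` may appear anywhere in a term (as in the rule examples such as `+ful`), and that `_` matches nothing, a space, or a hyphen. The names `wildcard_to_regex` and `matches` are hypothetical.

```python
import re

# Assumed mapping from each search operator to a regex fragment.
_OPERATOR_PATTERNS = {
    "+": r"\w*",    # any number of characters
    "?": r"\w?",    # zero or one character
    "#": r"\w",     # exactly one character
    "_": r"[ -]?",  # written as one word, two words, or hyphenated
}


def wildcard_to_regex(term: str) -> re.Pattern:
    """Translate a wildcard search term into a compiled, case-insensitive regex."""
    parts = [_OPERATOR_PATTERNS.get(ch, re.escape(ch)) for ch in term]
    return re.compile("".join(parts), re.IGNORECASE)


def matches(term: str, candidate: str) -> bool:
    """Return True when the whole candidate string is matched by the term."""
    return wildcard_to_regex(term).fullmatch(candidate) is not None
```

For example, `matches("b#c#cle", "bicicle")` and `matches("Not_critical", "Not-critical")` both hold under these assumptions, while `matches("bicycle?", "bicyclette")` does not, because `?` allows at most one extra character.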
4. Case Studies
5. Results
5.1. BRAYTON CYCLE Search
5.2. BICYCLE Search
5.3. BOSCH Search
5.4. CO2 Search and CARBON DIOXIDE Search
5.5. SUPERCRITICAL Fluid Search
5.6. SERGEI ALEXANDROVICH IKOVENKO Search
5.7. Final Considerations
6. Discussions
7. Conclusions
- Misspellings can be present in all the constituent parts of a patent, i.e., description, title, claims, applicant, and inventor.
- Different misspellings, whether voluntary or accidental, single or repeated, affect the patent search differently. The main problem is that the longer or more complex a word is, the greater the number of possible misspellings. Some misspellings are easy to identify and still lead to the patent sought; others do not or, worse, lead to wrong and misleading patents.
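The growth of the number of possible misspellings with word length can be made concrete under a simplifying assumption that is ours rather than the paper's: only single-character edits are counted, over an alphabet of $k$ symbols, for a word of $n$ characters. An upper bound on the number of distinct variants is then:

```latex
\[
N_1(n,k) \;\le\;
\underbrace{n}_{\text{deletions}}
+ \underbrace{(n-1)}_{\text{transpositions}}
+ \underbrace{n\,(k-1)}_{\text{substitutions}}
+ \underbrace{(n+1)\,k}_{\text{insertions}}
\;=\; 2nk + n + k - 1 .
\]
% With k = 26 this gives 54n + 25:
%   n = 7  (BICYCLE)        ->  at most 403 variants
%   n = 13 (SUPERCRITICAL)  ->  at most 727 variants
```

The bound is linear in $n$ for a single edit, but allowing two or more edits applies it repeatedly, so the candidate set grows much faster for long or compound terms.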
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Accidental (Ignorance) | Accidental (Typographic) | Voluntary |
---|---|---|
Phonetically plausible misspelling: knok/knock | Thumbo, twypo, writo: bicylce, "receive" as "recieve" | Typosquatting: gogole.com (accessed on 28 July 2022) |
Difficult words (i.e., Latin origins): diarrhoea | Speako: ate/eight | Neologisms: wake cup |
Misuse and orthographic errors: "than" and "then" | Format conversion: universit?/università | Atomic misspellings: prostate instead of prostrate |
Compound (hyphen or dash): email or e-mail | Transliteration of texts from non-Latin alphabets: ko = co = cho | |
Example | Goal Terms | Rules | Example of Results |
---|---|---|---|
Compounds | Anglo Saxon | Anglo_saxon | Anglo-Saxon |
| | Anglo 0W saxon | Anglo Saxon |
| | Anglo?saxon | Anglossaxon |
Common inflectional suffixes on base words (-s, -ed, -ing, -er, -est) | Bikes | Bike? | Bike, Bikes, Biker |
| Measured, Measuring, Measurer | Measur+ | Measure, Measures, Measuring, Measured, Measurement |
| | Measur?? | Measure, Measures, Measured, Measuring (n.f.), Measurement (n.f.) |
| | Measure? | Measured, Measures, Measuring (n.f.), Measurement (n.f.), Measure (n.f.) |
| | Measuri?? | Measuring, Measure (n.f.), Measures (n.f.), Measured (n.f.), Measurement (n.f.) |
| | Measur??? | Measure, Measures, Measuring, Measured, Measurement (n.f.) |
Most common prefixes | Not critical | Not_critical | Not critical, Not-critical, Notcritical |
| Bi-Phase | Bi_phase? | Biphase, Bi phase, Bi-phase |
| Multi-Object | Multi_object | Multi object, Multiobject, Multi-object |
| | Multi+_object | Multi object, Multiobject, Multi-object, Multiple object, Multipart object, Multiplying object, Multipurpose object, Multipart object |
More prefixes (fore, inter, trans, over, sub, semi, anti, mid, ex, post) | All prefix | Over+ | Over, Overs, Over tube, Overwrapping, Overstrike, Overvoltage, Overlapping |
| | Over##+ | Overwrapping, Overstrike, Overvoltage, Overlapping, Over, Overs, Over tube |
| Prefix + known word | Fore_casting | Forecasting, Fore casting, Fore-casting |
| | Fore_cast+ | Forecasting, Fore casting, Fore-casting, Forecast, Forecasted, Forecasts, Forecastable, Forecastiong, Forecastting |
| | Over_pressure | Over pressure, Over-pressure, Overpressure |
| | Over+_pressure | Over pressure, Over-pressure, Overpressure, Overwrapped pressure, Overall pressure, Overly pressure |
Common suffixes beginning with a consonant (-ly, -ful, -ment, -hood, -less, -ness) | -ful | +ful | Ful, Useful, Powerful, Harmful |
| | +#ful | Ful, Useful, Useful, Powerful, Harmful |
| -ful variation for known word | Useful | Useful |
| | Useful+ | Useful, Usefull, Usefulness, Usefully, Usefulness, Usefullness |
Suffixes with ti, ci, si (tion, sion, tious, sious, cial, tial) | -tious | +tious | Infectious, Sedimentitious, Nutritious, Surreptitious |
| -tious variation for known word | Nutritio?s | Nutritious, Nutritios |
| | Nutriti?us | Nutritius, Nutritious |
| | Nutrit#ous | Nutrituous, Nutritious |
In (immigrate, illegal, irregular) ad (address, approach, aggressive) ob (obstruct, opportunity) sub (subtract, suppose, surround) com (commit, collide, corrode) dis (dissuade, difference) | Double letter | Im?igrate | Immigrate, Imigrate |
| | Il?egal | Illegal, Ilegal |
| | Dif?erence | Difference, Diference |
| Double letter replaced | Su#pose | Suppose, Subpose |
Graphemes unique to Greek-based words ch = /k/ (chorus, monochrome) y = [I] or [#j] (dyslexia, cytoplasm) ph = /f/ (phonology, grapheme) x = /z/ (xylophone) | CH → K | Mono#?rome | Monochrome, Monokrome |
| Y → I/J | Bic#cle | Bicycle, Bicicle |
| PH → F | Gra#?ene | Graphene, Grafene |
| X → Z | #ylophone | Xylophone, Zylophone |
Silent letter spellings rh (rheumatoid) mn (mnemonic) pt (pterodactyl) | Rh, ps, pn, mn, pt | R?eumatoid | Reumatoid, Rheumatoid |
| | M?emonic | Mnemonic, Memonic |
Connectives that join the root and suffix i (menial, lenient, anxious) and u (superfluous, disingenuous, factual) | Presence or absence of connection | Superfl?ous | Superfluous, Superflous |
Case Studies | Used Queries with Correct Keywords | Used Queries with Keywords with Misspellings |
---|---|---|
BRAYTON CYCLE | (BRAYTON+ 5D (CYCL+ OR CICLE+ OR THERM+_DYNAM+))/ETI/ECLM/DESC | ((MISSPELLINGS OF BRAYTON+) 5D (CYCL+ OR CICLE+ OR THERM+_DYNAM+))/ETI/ECLM/DESC |
BICYCLE OR BICYCLES | (BICYCLE OR BICYCLES)/TI/CLMS/DESC AND (B62+)/CPC/IPC | ((MISSPELLINGS OF BICYCLE OR BICYCLES) 5D (CYCL+ OR CICLE+ OR THERM+_DYNAM+))/ETI/ECLM/DESC |
BOSCH | BOSCH/PA/OPA | BOSCH/PA/OPA AND (ROBERT 1W MISSPELLINGS OF BOSCH) |
CO2 | (CO2)/ETI/ECLM/DESC | (MISSPELLINGS OF CO2)/ETI/ECLM/DESC |
CARBON DIOXIDE | (CARBON 0W DIOXIDE)/ETI/ECLM/DESC | (MISSPELLINGS OF (CARBON 0W DIOXIDE))/ETI/ECLM/DESC |
SUPERCRITICAL (fluid) | (SUPERCRITICAL)/ETI/ECLM/DESC | (MISSPELLINGS OF SUPERCRITICAL)/ETI/ECLM/DESC |
SERGEI ALEXANDROVICH IKOVENKO | (ЯКОВЕНКО AND (СЕРГЕЙ OR АЛЕКСАНДРОВИЧ))/IN/OIN/INH/INV | ((MISSPELLINGS OF IKOVENKO) AND ((MISSPELLINGS OF SERGEI) OR (MISSPELLINGS OF ALEXANDROVICH)))/IN/OIN/INH/INV |
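The "MISSPELLINGS OF ..." placeholders in the queries above stand for a disjunction of misspelled variants. A minimal Python sketch of that expansion follows. The helper names `or_group` and `expand_query` are hypothetical, the use of the `0W` adjacency operator for multi-word variants is an assumption borrowed from the CARBON DIOXIDE query, and whether a given database accepts the resulting string unchanged has not been verified.

```python
def or_group(variants: list[str]) -> str:
    """Join variants into a parenthesised OR group.

    Multi-word variants are joined with the 0W adjacency operator (assumption).
    """
    terms = []
    for variant in variants:
        words = variant.split()
        if len(words) > 1:
            terms.append("(" + " 0W ".join(words) + ")")
        else:
            terms.append(variant)
    return "(" + " OR ".join(terms) + ")"


def expand_query(template: str, placeholder: str, variants: list[str]) -> str:
    """Replace a 'MISSPELLINGS OF ...' placeholder with an explicit OR group."""
    return template.replace(placeholder, or_group(variants))


template = "((MISSPELLINGS OF BRAYTON+) 5D (CYCL+ OR CICLE+ OR THERM+_DYNAM+))/ETI/ECLM/DESC"
query = expand_query(
    template,
    "(MISSPELLINGS OF BRAYTON+)",
    ["BRAITON", "BRYTON", "BRIGHTON"],
)
```

Keeping the template and the variant list separate means the same field codes and proximity constraints are reused while only the keyword group changes.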
Category | Keyword | Results (N° Patents) |
---|---|---|
Correct keyword | BRAYTON CYCLE | 5395 |
Keywords with misspellings | BRAITON CYCLE | 10 |
| BRAY-TON CYCLE | 33 |
| BRYTON CYCLE | 57 |
| BRIGHTON CYCLE | 96 |
| BRETTON CYCLE | 99 |
| BRITTON CYCLE | 4 |
| BREE TON CYCLE | 1 |
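To put these counts in proportion, the short Python calculation below sums the hits for the misspelled keywords and relates them to the hits for the correct keyword. It is a rough sketch: it assumes the result sets do not overlap and that every hit is a genuine misspelling of BRAYTON, neither of which the table guarantees (BRIGHTON and BRITTON, for instance, are also proper names).

```python
# Counts copied from the BRAYTON CYCLE table above.
correct_hits = 5395
misspelling_hits = {
    "BRAITON CYCLE": 10,
    "BRAY-TON CYCLE": 33,
    "BRYTON CYCLE": 57,
    "BRIGHTON CYCLE": 96,
    "BRETTON CYCLE": 99,
    "BRITTON CYCLE": 4,
    "BREE TON CYCLE": 1,
}

# Upper bound on additional patents, assuming no overlap between result sets.
total_misspelled = sum(misspelling_hits.values())
share_of_correct = total_misspelled / correct_hits

print(f"{total_misspelled} additional hits, {share_of_correct:.2%} of the correct-keyword hits")
```

Under these assumptions the misspelled variants add 300 hits, about 5.6% of the 5395 patents found with the correct keyword.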
Contained Misspellings | Patent Number and Applicant | Sentences with Misspellings |
---|---|---|
BRAITON CYCLES | WO2012/114367 HITACHI | The heat cycle of the gas turbine power generation system basically follows Braiton cycles, and the thermal efficiency is determined by the air compression ratio. |
BRYTON | WO2022/117228 NUOVO PIGNONE TECNOLOGIE | The thermodynamic system 35 may include an open thermodynamic cycle, such as a Bryton cycle, using a gas turbine engine. |
BRAITON CYCLE | ES2643558 SHELL | Any suitable liquefaction cycle known in the art may be used, including the Claude cycle, the Braiton cycle, the Joule Thompson cycle, and any modifications or combinations thereof. |
BRYTON CYCLE | WO2019/194670 HYUNDAI HEAVY INDUSTRY | In the present embodiment is provided with a refrigerant heat (1275) exchanger N2 Bryton cycle the refrigerant supply portion (127) may be provided but, in any shape including a refrigerant heat (1275) exchanger is the first embodiment. |
BRYTON CYCLE | WO2014/087344 ENEL INGEGNERIA & RICERCA-POLITECNICO DI MILANO | In this further secondary exchanger, the gas transfers the heat amount necessary to feed a Bryton cycle for micro-generation in an appropriate section of the plant by heating of air under pressure. |
BRIGHTON | RU2719413 GENERAL ELECTRIC | Figure 1 depicts schematic diagram of the traditional system with Brighton’s enclosed regenerative cycle for electricity generation; |
BRIGHTON | RU2018129741 NUOVO PIGNONE TECNOLOGIE | Floating heat can be converted into useful energy through various thermal engines using thermodynamic cycles such as Renkin steam cycles, organic cycles of Renkin or Brighton, CO cycles [2] or other energy cycles. |
Category | Keyword | Results (N° Patents) |
---|---|---|
Correct keyword | BICYCLE | 188,699 |
Keywords with misspellings providing more patents (>100 each) | BICY | 823 |
| BICYLE | 417 |
| BYCYCLE | 132 |
| BI-CYCLE | 128 |
| BI CYCLE | 128 |
| BICI | 126 |
| BYCICLE | 117 |
| ABICYCLE | 109 |
Other keywords with misspellings | BICICLE | 78 |
| BI CICLE | 0 |
| BI-CICLE | 0 |
| BICICICLE | 1 |
| BICIRCLE | 0 |
| BICYRCLE | 2 |
| BICYKLE | 6 |
| BICYELE | 50 |
| BICCYLE | 1 |
| BICYLCE | 42 |
| BICYC!E | 6 |
| BICYCE | 13 |
| BIICYCLE | 1 |
| BIYCYCLE | 4 |
| BIVCYCLE | 45 |
| BIECYCLE | 4 |
| BICIYCLE | 1 |
| BICLYCLE | 3 |
| BICYCILE | 4 |
| BICSYCLE | 1 |
| BICYLCLE | 14 |
| BICYCVLE | 13 |
| BICYCELE | 6 |
| BICYCLIES | 4 |
| BICYCLYE | 1 |
| BICYCLLE | 2 |
| BICYCLEE | 1 |
| BICYCI | 2 |
| BICYDE | 42 |
| BICYCCLETTE | 1 |
| BICYCLEELETTE | 1 |
| BICICLE CLETTE | 1 |
| BICYYCLETTE | 1 |
| EBICYCLE | 4 |
Misspellings in the applicant | WO2015/005943 | SLIPSTREAM BYCYCLES |
| US20070010376 | TAIWAN BICYLCE INDUSTRY R & D CENTER |
| EP2103512 | CANNONDALE BICYLE |
Misspellings in the title | EP0825101 | Electric bicyle |
| CA3053537 | Biycycle seat post travel adjustment assembly |
Correct keyword | (BOSCH AND (ROBERT 1W BOSCH))/PA/OPA (124,538 patents) |
Keywords with misspellings | BOSH (6), BOSCHE (5), BOSGH (2), BOCH (2), BOECH (2), BOACH (1), BOSTH (1), BOBCH (1), BOEOH (1), BOCSH (1) |
Category | Keyword | Results (N° of Patents) | Keyword | Results (N° of Patents) |
---|---|---|---|---|
Correct keyword | CO2 | 536,459 | | |
Keywords with misspellings | CO 2; CO-2; CO.2; CO2-; CO2~; CO2; CO.2; CO_2; CO–2; CO:2 | 831,022 | | |
| C02 (zero instead of O) | 76,624 | | |
| C0 2; C0-2; C0.2; C02-; C02~; C02; C0.2; C0_2; C0–2; C0:2 | 118,156 | | |
| Cθ2 (theta instead of O) | 2288 | | |
| Cθ 2; Cθ-2; Cθ.2; Cθ2-; Cθ2~; Cθ2; Cθ.2; Cθ_2; Cθ–2; Cθ:2 | 3982 | | |
| CO sub 2; CO.sub.2; CO.sub2; CO sub2 | 20,482 | | |
| C0 sub 2; C0.sub.2; C0.sub2; C0 sub2 | 25 | | |
Correct keyword | CARBON DIOXIDE | 1,164,394 | | |
Keywords with misspellings providing more patents (>1000 each) | DIOXIDE CARBON | 42,424 | | |
| CARBON OXIDE | 31,867 | | |
| CARBONDIOXIDE | 4279 | | |
| CARBON DIOXID | 2888 | | |
| CARBON DI OXIDE | 1990 | | |
Other keywords with misspellings | CARBON BIOXIDE | 200 | CARBON DIOX7D | 2 |
| CARBON BI OXIDE | 12 | CAROBN DIOXIDE | 19 |
| CARBON DOUBLE OXIDE | 2 | CARBBN DIOXIDE | 27 |
| CARBON TWO OXIDE | 2 | CACBON DIOXIDE | 6 |
| CARBN DIOXIDE | 25 | CARTON DIOXIDE | 120 |
| CARBONE DIOXIDE | 148 | CANBON DIOXIDE | 5 |
| CARBONIC DIOXIDE | 263 | CAHON DIOXIDE | 3 |
| CARBONIC DI OXIDE | 6 | CARBEN DIOXIDE | 38 |
| CARBON DIOXYDE | 146 | CATBON DIOXIDE | 48 |
| CARBON DIOXDE | 178 | CARLBON DIOXIDE | 34 |
| CARBON DEOXIDE | 34 | CARHON DIOXIDE | 83 |
| CARBONS DIOXIDE | 50 | CARBAN DIOXIDE | 28 |
| CARBON DI OXID | 60 | CARLION DIOXIDE | 10 |
| CARBONDIOXID | 107 | CAIRBON DIOXIDE | 20 |
| CARON DIOXIDE | 44 | CARBON DIOXCDE | 1 |
| CARTOON DIOXIDE | 31 | CARBOIN DIOXIDE | 9 |
| CARBON DIOXODE | 58 | CARBON DIOXFDE | 5 |
| CARHON DI OXID | 1 | CARIBON DIOXIDE | 20 |
| CARHON DIOXID | 83 | CAFBON DIOXIDE | 15 |
| CARDON DIOXIDE | 41 | CARBDN DIOXIDE | 7 |
| CARBLON DIOXIDE | 32 | CANON DIOXIDE | 5 |
| CAROON DIOXIDE | 40 | CAMBON DIOXIDE | 4 |
| CABON DIOXIDE | 132 | CARDAN DIOXIDE | 1 |
| CARBON DIOXJD | 3 | | |
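Many of the CO2 variants in the table above combine two simple rules: a look-alike character in place of the letter O, and a separator inserted before the digit. The Python sketch below enumerates such combinations. The character and separator lists are assumptions chosen to mirror the table, the function name `co2_like_variants` is hypothetical, and the output also contains the correct form `CO2`.

```python
from itertools import product


def co2_like_variants(
    letters_for_o: tuple[str, ...] = ("O", "0", "θ"),
    separators: tuple[str, ...] = ("", " ", "-", ".", "_", ":"),
) -> list[str]:
    """Combine look-alike characters for 'O' with separators placed before the '2'."""
    return sorted(
        {f"C{o}{sep}2" for o, sep in product(letters_for_o, separators)}
    )


candidates = co2_like_variants()
```

With the default lists this yields 18 strings, including `C02`, `CO 2` and `Cθ-2`; trailing-symbol forms such as `CO2-` or the `CO.sub.2` markup variants would need further rules.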
Category | Keyword | Results (N° of Patents) | Keyword | Results (N° of Patents) |
---|---|---|---|---|
Correct keyword | SUPERCRITICAL | 158,899 | | |
Keywords with misspellings providing more patents (>100 each) | SUPER CRITICAL | 10,627 | | |
| SUPERCRITICALLY | 1213 | | |
| SUPERCRITICALITY | 364 | | |
| SUPERIOR CRITICAL | 120 | | |
| SUPER CRITICALLY | 108 | | |
Other keywords with misspellings | SUPERCRITICALS | 12 | SUPERCCRITICAL | 1 |
| SUPER CRITICALS | 1 | SUPERECRITICAL | 2 |
| SUPERCRITIC | 90 | SUPERHEAT CRITICAL | 6 |
| SUPER CRITIC | 7 | SUPERHEATER CRITICAL | 4 |
| SUPERCRITICALNESS | 1 | SUPERICRITICAL | 2 |
| SUPERCRITICALY | 2 | SUPERIORCRITICAL | 1 |
| SUPERCRITICF | 1 | SUPERLCRITICAL | 1 |
| SUPERCRITICISM | 1 | SUPERRCRITICAL | 2 |
| SUPER CRITICISM | 2 | SUPERS CRITICAL | 1 |
| SUPERCRITICALL | 5 | SUPERSCRITIC | 1 |
| SUPERCRITICA | 51 | SUPERSCRITICAL | 3 |
| SUPER CRITICA | 9 | SUPERSUPERCRITICAL | 77 |
| SUPERCRYTICAL | 5 | SUPERTCRITICAL | 2 |
| SUPER CRITICALITY | 28 | SUPERTRICRITICAL | 9 |
| SUPER CRITCAL | 4 | SUPPER CRITICAL | 15 |
| SUPERCRITCAL | 60 | SUPPERCRITICAL | 3 |
| SUPERCRITICALIZATION | 3 | SUPRA CRITICAL | 23 |
| SUPERCRYTICALLY | 2 | SUPRA CRITICALLY | 3 |
| SUPERCRTICAL | 22 | SUPRACRITIC | 1 |
| SUPERC CRITICAL | 1 | SUPRACRITICAL | 25 |
| SUPRACRITICALLY | 4 | | |
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Russo, D.; Spreafico, C.; Avogadri, S.; Precorvi, A. Investigating the Impacts of Misspellings in Patent Search by Combining Natural Language Tools and Rule-Based Approaches. Knowledge 2022, 2, 487-507. https://doi.org/10.3390/knowledge2030029