Recently, sustainable growth and development has become an important issue for governments and corporations. However, maintaining sustainable development is very difficult. These difficulties can be attributed to sociocultural and political backgrounds that change over time [1
]. Because of these changes, the technologies for sustainability also change, so governments and companies attempt to predict and manage technology using patent analyses, but it is very difficult to predict the rapidly changing technology markets. The best way to achieve insight into technology management in this rapidly changing market is to build a technology management direction and strategy that is flexible and adaptable to the volatile market environment through continuous monitoring and analysis. Quantitative patent analysis using text mining is an effective method for sustainable technology management. There have been many studies that have used text mining and word-based patent analyses to extract keywords and remove noise words. Because the extracted keywords are considered to have a significant effect on the further analysis, researchers need to carefully check out whether they are valid or not. However, most prior studies assume that the extracted keywords are appropriate, without evaluating their validity. Therefore, the criteria used to extract keywords needs to change. Until now, these criteria have focused on how well a patent can be classified according to its technical characteristics in the collected patent data set, typically using term frequency–inverse document frequency weights that are calculated by comparing the words in patents. However, this is not suitable when analyzing a single patent. Therefore, we need keyword selection criteria and an extraction method capable of representing the technical characteristics of a single patent without comparing them with other patents. In this study, we proposed a methodology to extract valid keywords from single patent documents using relevant papers and their authors’ keywords. We evaluated the validity of the proposed method and its practical performance using a statistical verification experiment. First, by comparing the document similarity between papers and patents containing the same search terms in their titles, we verified the validity of the proposed method of extracting patent keywords using authors’ keywords and the paper. We also confirmed that the proposed method improves the precision by about 17.4% over the existing method. It is expected that the outcome of this study will contribute to increasing the reliability and the validity of the research on patent analyses based on text mining and improving the quality of such studies.
This is an open access article distributed under the Creative Commons Attribution License
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited