How ChatGPT Is Shaping Next-Generation Patent Solutions

Liu, Chong; Fomin, Nikita Igorevich; Xiao, Shuoting; Sun, Guofeng; Lyu, Yuelong

doi:10.3390/buildings15132273

Open AccessArticle

How ChatGPT Is Shaping Next-Generation Patent Solutions

by

Chong Liu

^*

,

Nikita Igorevich Fomin

,

Shuoting Xiao

,

Guofeng Sun

and

Yuelong Lyu

Institute of Civil Engineering and Architecture, Ural Federal University, Yekaterinburg 620002, Russia

^*

Author to whom correspondence should be addressed.

Buildings 2025, 15(13), 2273; https://doi.org/10.3390/buildings15132273

Submission received: 31 May 2025 / Revised: 23 June 2025 / Accepted: 26 June 2025 / Published: 28 June 2025

(This article belongs to the Section Construction Management, and Computers & Digitization)

Download

Browse Figures

Versions Notes

Abstract

With the rapid advancement of artificial intelligence, large language models (LLMs) such as ChatGPT have garnered increasing attention in the field of text generation. However, when applied to patent drafting in the architectural domain, these models often produce verbose texts and omit critical elements, leading to issues such as limited claim scope and insufficient disclosure in the specifications. To address these challenges, this study conducted experimental research on the application of AI technologies, including GPT-4o, in the drafting and publication of patents related to prefabricated integral buildings. The results indicate that GPT-4o performs well in generating claims and technical descriptions. Nevertheless, problems such as repetitive descriptions of technical achievements, ambiguity in result interpretation, and a lack of detailed implementation methods were observed during the generation process. Therefore, we recommend incorporating expert review mechanisms when using GPT-4o for patent drafting, to enhance the accuracy and professionalism of the output. Moreover, enriching training datasets and enforcing a rigorous textual review workflow can further optimize model outputs, aligning them more closely with the high standards required in patent documentation.

Keywords:

large language models; patent; prefabricated integral buildings; multiple linear regression

1. Introduction

Patent drafting serves as the critical bridge between technological achievements and legal rights, directly influencing the scope of protection and the competitive advantage of innovators. This process is highly specialized, requiring not only a deep understanding of the technical domain but also a solid grasp of relevant legal principles. Traditional patent applications are often time-consuming, costly, and demand high standards of drafting quality. These challenges are particularly pronounced in the field of prefabricated integral buildings, where the complexity of technical descriptions and the precision required in claims further complicate the drafting process [1,2]. Prefabricated integral buildings (PIBs) are factory-finished volumetric modules, three-dimensional units that arrive on-site with a structural frame, envelope, building-services systems and finishes already integrated, requiring only stacking and mechanical connection. Volumetric modular systems have become a major focus of industrialized construction policy and produce an especially rich corpus of patents on inter-module joints; the combination of detailed 3D descriptions and high document density makes PIB patents an ideal test-bed for evaluating large language model-based validation methods.

The effectiveness of large language models (LLMs) in generating patent documents depends fundamentally on their capacity for accurate semantic understanding and logical coherence, since deficiencies in these aspects frequently manifest as internal inconsistencies and errors in generated texts. To systematically assess these critical capabilities before selecting our primary baseline model, we conducted a targeted error-detection experiment. Specifically, candidate LLMs were tasked with identifying deliberately introduced semantic and logical inconsistencies within a carefully constructed dataset comprising 100 prefabricated building patent text excerpts (containing 300 seeded errors in total).

Precision refers to the proportion of correctly identified errors among all detected errors, recall refers to the proportion of correctly identified errors among all actual errors, and the F1 score represents the harmonic mean of precision and recall. The results of this evaluation, as detailed in Table 1, indicated that GPT-4o achieved superior performance, obtaining the highest precision (0.90), the highest recall (0.58), and consequently, the best overall F1 score (0.70). These outcomes confirm GPT-4o’s strong ability to consistently interpret complex technical information and maintain logical clarity, thereby justifying its adoption as the central LLM in our subsequent patent-drafting experiments. In addition, ChatGPT’s advanced natural language capabilities can markedly accelerate patent preparation. It can rapidly generate multiple candidate claims and detailed technical descriptions, thereby reducing the time patent professionals spend on drafting and revision [3,4]. Second, ChatGPT improves the linguistic quality of patent documents by translating complex technical language into more accessible and accurate patent descriptions, which is crucial for ensuring clarity and professionalism in patent applications [5]. Recent research shows that ChatGPT outperforms other LLMs in API validation tasks, especially in low-latency, high-precision scenarios [6]. Moreover, its o3-mini-high variant likewise surpasses competing models in scientific computing, combining top-tier accuracy with the fastest reasoning speed [7]. Complementing these findings, Jurišević et al. have demonstrated the practical viability of GPT-3.5 in kindergarten energy management, underscoring the growing relevance of LLMs for building-related optimization tasks [8].

However, despite the numerous advantages that ChatGPT offers, several challenges remain in practical applications. A primary concern is that ChatGPT may not yet fully comprehend intricate technical concepts or adhere strictly to precise legal requirements, potentially resulting in outputs that lack technical depth or legal accuracy [9,10,11]. Additionally, due to the broad and diverse sources of training data, the model may inadvertently introduce biases or omit critical information, which could compromise the fairness and comprehensiveness of patent texts. Moreover, using AI tools such as GPT for patent drafting raises concerns regarding data security and confidentiality. It is essential to safeguard the sensitive information of inventors and enterprises and to ensure compliance with the disclosure regulations stipulated in patent law [12,13,14].

Over the past five years, China has granted more than 2.53 million patents, with an average annual growth rate of 13.4%. In 2023 alone, approximately 798,347 patents were granted in China, as illustrated in Figure 1. Figure 2 shows a continuous increase in the number of patents granted per 10,000 R&D personnel, indicating a rise in productivity or innovation output among Chinese researchers and a growing demand for patent filings.

Beyond China, several developed countries have maintained relatively stable patent authorization volumes, with a general trend of steady growth. In contrast, some emerging economies are still in the accumulation phase of patent development, with room for improvement in both the quantity and quality of patent output [17].

In summary, GPT-4o demonstrates significant potential in generating structured and logically coherent patent texts in the architectural domain, particularly in supporting the drafting of legal and technical documents. The aim of this study is to explore the use of GPT-4o in drafting patent texts for prefabricated integral buildings and to document the model’s performance in generating initial patent drafts, thereby providing guidance for applicants in the patent writing process.

2. Methodology

To comprehensively evaluate the effectiveness of GPT-4o in drafting patents for prefabricated integral buildings, the research methodology was systematically designed combining research strategy, data collection, text generation, and statistical analysis. First, a hybrid framework combining experimental design with expert evaluation was established to ensure both the professionalism and practical applicability of the generated texts. Second, during the data collection phase, representative patent cases from China and Russia were carefully selected. A stepwise input approach was adopted to simulate realistic drafting scenarios, enabling a systematic assessment of the model’s performance under varying input conditions. This stepwise design closely simulates the real-world patent drafting process and ensures that the evaluation comprehensively captures both linguistic quality and technical accuracy.

Building on this foundation, the quality of the generated texts was evaluated through a combination of self-assessment and expert scoring, which served as the basis for the iterative refinement of the generation process. Finally, multiple linear regression analysis was employed to quantitatively analyze the evaluation scores, identify key variables influencing expert assessments, and verify the validity and consistency of the model’s output.

2.1. Research Strategy

This study adopted a mixed research strategy combining experimental design and expert evaluation to assess the potential of artificial intelligence in the patent drafting process, as illustrated in Figure 3 Specifically, focus was placed on evaluating the performance of GPT-4o in generating claims for prefabricated integral building technology patents. By comparing the model’s outputs with expert assessments, the study aimed to verify its generation capabilities and linguistic quality under varying levels of input specificity [18,19].

The research began by gradually refining the input content to observe GPT-4o’s performance under different input conditions. Interactive adjustments were made based on both system self-assessment and expert evaluation. This strategy was designed to simulate the iterative drafting and optimization process inherent to real-world patent applications, while its mixed-method structure (quantitative probe + qualitative expert review) provides both statistical rigor and practical relevance. With the goal of maximizing the technical accuracy and legal validity of the generated texts while improving drafting efficiency. The inclusion of multiple rounds of self-assessment and expert review ensured the reliability of the findings and their practical relevance in real application contexts [20]. This strategy ensured that both the drafting logic and expert validation were fully integrated into the experimental workflow.

Through the research strategy, this study aims to conduct an in-depth analysis of GPT-4o’s performance in the patent drafting process, with a focus on evaluating the quality of Claims, Descriptions, Linkage of Features to Disclosure, and Linkage of Text to Figures and Other Text. In addition, the study explores the correlation between the model’s self-assessment capabilities and expert evaluation results, thereby providing both theoretical and practical foundations for the future application of AI-assisted patent drafting.

2.2. Data Collection Methods

The research design centered around a step-by-step experimental procedure aimed at stimulating the real-world patent drafting process. Initially, a total of 230 representative patents were selected from the Google Patents database—115 from Russia and 115 from China—covering key technological innovations in the field. This ensured both the breadth and representativeness of the research sample. For each patent, the title, abstract, and relevant figures were extracted to provide GPT-4o with the initial input, thereby simulating the basic information typically available to patent drafters during the claims drafting stage.

During the experiment, GPT-4o was instructed to generate corresponding patent claims based on the provided abstracts and figures. The model first produced an initial draft, after which a self-evaluation was conducted manually by the research team using a five-level rubric: “Very Bad,” “Bad,” “Fair,” “Good,” and “Very Good.” GPT-4o was not involved in performing the self-evaluation. If the result of the manual self-evaluation did not reach the “Good” level, the input information was incrementally enriched—for example, by supplying basic component descriptions, lists of involved elements, technical details, connection methods, materials used, and their advantages. This iterative process was repeated until the research team rated the output as “Good” or higher. Only those drafts that met this threshold were forwarded to domain experts for external review, ensuring the overall quality and completeness of the generated texts [21,22].

The iterative generation process followed a structured prompt refinement strategy aligned with the three evaluation dimensions outlined in Table 2. The initial prompt provided only the abstract and figure to simulate minimal input. If the resulting draft lacked key components or implementation steps, the next prompt supplemented this information. If the language was vague or the alignment between text and figures was unclear, further refinement addressed terminological precision and structural consistency. When claim boundaries were ambiguous or no embodiment was included, a final prompt was issued to clarify scope and add at least one concrete example. In most cases, two rounds of refinement were sufficient to reach the required “Good” level.

The regional focus on Chinese and Russian patent cases was informed by both methodological and practical considerations. These two jurisdictions represent distinct legal traditions and technical disclosure practices, allowing for a meaningful comparison of GPT-4o’s adaptability to different institutional contexts. Furthermore, the composition of the research team—consisting primarily of Chinese and Russian researchers—ensured strong linguistic and contextual familiarity with both patent systems. This background not only facilitated accurate interpretation of the model outputs but also enabled the effective recruitment of qualified domain experts from both countries, thereby enhancing the reliability of the expert evaluation process.

2.3. Data Analysis Methods

The objective of this study was to evaluate the impact of the quality and terminological standardization of GPT-4o-generated patent texts in the field of prefabricated integral buildings on expert comprehensive evaluations. Based on the 230 collected patent cases, data analysis was conducted using IBM SPSS Statistics 27.0.1 and a multiple linear regression model, which is a statistical method used to assess the relationship between a dependent variable and multiple independent variables [23,24].

The data collection involved multidimensional evaluations of each case, with the Expert Comprehensive Evaluation serving as the dependent variable to reflect the overall assessment of the patent texts. The independent variables included Quality of Claims, Quality of Description Text, Linkage of Features to Disclosure, Linkage of Text to Figures, and Other Text. This model enabled us to investigate how each explanatory variable independently influences the experts’ overall evaluation of the patent documents.

Once a claim draft received a “Good” rating from GPT-4o’s self-evaluation, it was submitted to two domain experts specializing in prefabricated integral building technologies for evaluation. These experts were deliberately selected for their highly relevant expertise: one is a professor in construction engineering who has authored dozens of granted patents, and the other is a senior engineer actively engaged in the research and development of prefabricated construction technologies, with extensive experience in patent drafting and examination. If both experts also rated the output as “Good,” the claims were accepted without modification. However, if the expert assessments did not meet expectations, the input information was further refined based on expert feedback, and the generation and evaluation process was repeated accordingly, as outlined in Table 3. This iterative loop ensured that each generated draft met both the model’s internal standards and the expert validation thresholds before proceeding to the next experimental stage. During this self-evaluation phase, one to two additional regeneration rounds were typically sufficient for a draft to reach a “Good” rating before it was forwarded to the human experts. This phase aimed to verify the acceptability and practical applicability of GPT-4oT-generated texts within a specialized professional context.

Building on the validated claim drafts, GPT-4o was then guided to generate a complete patent application document, including sections such as “Field of Application,” “Description of the Prior Art,” “Statement of the Problem,” and “Disclosure of the Invention (Utility Model).” Experts assessed these outputs based on textual quality, degree of terminological precision, and technical accuracy, using a five-point rating scale (1 to 5), and provided detailed feedback (see Figure 4).

The original Russian-annotated version is available in Appendix A, Figure A1.

In our experiments, most patent cases required at least one additional iteration after the initial AI draft before receiving a “Good” rating from the experts, as shown in Table 4. On average, each patent underwent approximately two rounds of generation and refinement (including the initial attempt) to achieve final expert approval. The generation process similarly followed an iterative loop of self-assessment and expert evaluation to ensure the final document met high standards of quality and professionalism.

First, a descriptive statistical analysis was conducted on the expert evaluation scores to understand the basic characteristics of the data. Specifically, the mean, standard deviation, and score distribution were calculated. The mean represents the central tendency of the overall ratings, while the standard deviation measures the degree of dispersion in the scores [25]. The formulas for calculating these statistical measures are as follows:

\bar{x} = \frac{1}{n} \sum_{i = 1}^{n} x_{i}

(1)

S = \sqrt{\frac{1}{n - 1} \sum_{i = 1}^{n} {(x_{i} - \bar{x})}^{2}}

(2)

where

\bar{x}

denotes the mean, S represents the standard deviation,

x_{i}

is the expert rating for the i-th patent, and n is the sample size (n = 115).

To assess the correlation between the explanatory variables and the dependent variable, a Pearson correlation analysis was conducted. The resulting correlation matrix illustrated the relationships between each pair of variables, with particular attention paid to their levels of statistical significance. By analyzing the correlation coefficients, we identified potential linear associations between the explanatory variables and the expert evaluation scores [26].

A multiple linear regression model was then applied to evaluate the influence of each variable on expert ratings. The model is expressed as follows:

Y_{i} = β_{0} + β_{1} X_{1 i} + β_{2} X_{2 i} + & \cdot \cdot \cdot \cdot \cdot \cdot + β_{n} X_{n i} + ϵ_{i}

(3)

where

Y_{i}

is the comprehensive evaluation of the i-th expert,

β_{0}

is constant,

X_{1 i}, X_{2 i}, X_{3 i}, X_{4 i}

are the quality of claims, quality of description text, and precision of input text,

β_{1}, β_{2}, β_{3}, β_{4}

are the regression coefficients of each explanatory variable, and

ϵ_{i}

is the error term, which represents the impact of other factors not explained by the model.

To evaluate the fitness of the regression model, the adjusted R-squared (

R_{adj}^{2}

) was used to assess the explanatory power of the model. The values of

R_{adj}^{2}

were employed to compare model fit across different conditions. In addition, an F-test was conducted to examine the overall significance of the regression model. If the models for both countries were statistically significant at the 0.05 level (p < 0.05), this indicated that the explanatory variables had a significant overall effect on the dependent variable [27,28].

Finally, to verify whether the assumptions of the regression model were satisfied, a residual analysis was performed, including tests for the normality and independence of residuals. The Durbin–Watson statistic indicated that the residuals of both models did not exhibit significant autocorrelation, thus supporting the assumption of independent errors [29].

3. Results

Descriptive statistics provide the foundation for subsequent statistical analyses and help to reveal differences in variable performance across regions. A higher mean combined with a lower standard deviation typically indicates that the corresponding variable exhibits more consistent and concentrated performance within the sample [30]. In contrast, a lower mean and higher standard deviation suggest greater variability in evaluations, which may be attributed to sample heterogeneity or inconsistencies in assessment criteria. The following results are presented in a sequential structure: starting with descriptive statistics, followed by correlation analysis, regression modeling, and residual diagnostics. This stepwise approach ensures a clear and logical presentation of the findings.

By comparing the evaluation results of patent documents from Russia and China, we observed that both models received relatively high scores for the Claims section, indicating that the quality of this part was generally perceived as superior. Nevertheless, both countries showed relatively lower performance in expert evaluations related to Other Text components, suggesting room for improvement in those areas. Overall, despite some differences in scoring, the two countries demonstrated similar performance across most evaluation indicators, indicating a degree of consistency in the quality of patent document drafting.

Figure 5 presents the correlation matrix among different sections of the patent applications. Each matrix element represents the correlation coefficient between two variables, ranging from −1.0 (perfect negative correlation) to 1.0 (perfect positive correlation). The color gradient from orange to green reflects the direction and strength of the correlations, where orange indicates positive correlation and green indicates negative correlation, with darker shades representing stronger relationships.

As illustrated in Figure 6, a notably strong positive correlation is observed between the Description section and expert evaluation in both Russia (r = 0.52) and China (r = 0.56). This suggests that more detailed and comprehensive descriptive content tends to be associated with higher expert appraisal scores. Furthermore, a moderate correlation between Claims and expert evaluations is also evident (Russia: r = 0.40; China: r = 0.47), indicating that the substantive quality of the claims contributes meaningfully to expert assessments. In contrast, structural components such as LFD and LTF exhibit weak or negligible correlations with expert evaluations, underscoring the relatively limited impact of formal layout features on perceived patent quality.

The R² values (Russia: 0.647; China: 0.681) in Table 5 indicate that approximately 64.7% of the variance in expert comprehensive evaluations in Russia and 68.1% in China can be explained by the predictive variables in the model, suggesting a good model fit. The Standard Error of the Estimate was 0.645 for Russia and 0.581 for China, reflecting the average deviation between the predicted and actual values, with the Chinese model exhibiting slightly lower prediction error. The Durbin–Watson statistics (Russia: 1.797; China: 1.786) were both close to 2, indicating that the residuals were independent, and no significant autocorrelation was present in the data.

The model summary is further validated through analysis of variance, confirming the statistical robustness of the regression models.

The results of the analysis of variance (ANOVA) in Table 6 further confirmed the statistical significance of the models and demonstrated a high level of predictive reliability. The total regression sum of squares was 32.547 for Russia and 31.794 for China, indicating that the variables included in the models effectively accounted for the variance in the dependent variable. The F-values for the two models were 15.658 and 18.844, respectively, with p < 0.001, suggesting that at least one predictor variable had a significant effect on the dependent variable and thus rejecting the null hypothesis of no relationship. These findings indicate that the models possess strong statistical power in explaining patent evaluation outcomes.

In summary, the regression models, as detailed in Table 7, demonstrated good fit and explanatory capability, offering valuable insights into how independent variables influence expert comprehensive evaluation.

Based on the regression analysis results, the contribution of each independent variable to the expert evaluations can be quantified through the following formulas:

Y_Russia = −1.360 + 0.505 × Claims + 0.418 × Description + 0.043 × LFD + 0.074 × LTF − 0.166 × Other

(4)

Y_China = −1.182 + 0.499 × Claims + 0.367 × Description + 0.064 × LFD + 0.117 × LTF + 0.076 × Other

(5)

The analysis results indicated that Claims and Description had the most significant effects (p < 0.001), suggesting that detailed and well-articulated claims and specifications are critical in achieving higher expert evaluations. Additionally, other variables exhibited a significant positive impact in the Russian model (p = 0.007), possibly reflecting a greater emphasis by Russian experts on the completeness and practical relevance of elements such as application fields, objectives, and implementation examples. In contrast, these aspects had a weaker influence in the Chinese model, implying that Chinese experts may place more importance on descriptions of technological innovation and technical detail.

The Linkage of Text to Figures (LTF) showed a significant positive effect in the Chinese model (p = 0.042), indicating that Chinese experts are more attentive to the consistency between figures and textual descriptions and their role in clarifying technical solutions. This effect was not significant in the Russian model, which may be due to a greater focus on textual depth and logical coherence. These differences highlight cultural and evaluative distinctions in patent assessment practices across countries, offering valuable insights into international patent applications.

The variance inflation factors (VIFs) were all close to 1, indicating no serious multicollinearity among the variables, thus ensuring the stability and reliability of the model. Overall, the results underscore the importance of clear claims and comprehensive descriptive documentation in the patent application process, while the influence of other factors was relatively limited [31].

Table 8 shows that the predicted value ranges and means of the Russian and Chinese models are comparable, indicating similar performance in predicting expert evaluations. The mean residuals are zero, suggesting a good model fit. However, the Chinese model exhibits slightly smaller residual ranges and standard deviations, implying a marginally higher prediction accuracy compared to the Russian model. Both standardized predicted values and residuals fall within the normal distribution range, with no outliers observed, which further confirms the stability of the models. The residual statistics collectively indicate that both regression models are stable and reliable, exhibiting satisfactory predictive performance.

The histogram (Figure 7a,b) illustrates the distribution of residuals for both models in comparison with the normal distribution. Ideally, if the residuals approximate a normal distribution, this suggests an absence of systematic error in the model. The histograms for both the Russian and Chinese models show that the residuals are generally normally distributed, with only slight deviations. The mean residuals are close to zero, and the standard deviations are near 1, which aligns with the characteristics of a well-fitting regression model.

The normal probability plots (Figure 7c,d) further confirmed the normality of the residuals. The data points largely align along the reference line (the 45-degree line), with most points closely clustered around the line, especially in the mid-range. This supports the assumption of normal residual distribution, indicating that the observed cumulative probabilities closely match those expected from a theoretical normal distribution. Although a few points deviate slightly from the line, such deviations are not sufficient to challenge the overall assumption of normality for the models [32,33,34].

In summary, the results are consistently presented across descriptive, correlational, and regression analyses, with additional diagnostic evidence reinforcing the statistical reliability of the models.

4. Discussion

This study verified both the potential and limitations of GPT-4o in the context of patent drafting for prefabricated integral buildings. While the model significantly enhanced the efficiency of writing patent claims and descriptive sections, several issues emerged during its application—issues that necessitate further model refinement and the integration of human expert input.

First, GPT-4o demonstrated the ability to generate high-quality claims and technical descriptions, greatly accelerating the early stages of patent preparation. Its natural language processing capabilities were particularly effective in transforming complex technical concepts into accurate and comprehensible language. However, the model still exhibited notable deficiencies in handling intricate technical constructs and domain-specific legal terminology. Notably, several expert reviewers found that some AI-generated drafts lacked concrete technical details, such as stepwise implementation methods or key engineering parameters, underscoring the need for future improvements in the model’s technical specificity.

Specifically, one recurring issue observed during the generation of claims was redundancy in the description of technical outcomes. This redundancy not only lengthened the text unnecessarily but also, in some cases, introduced ambiguity that detracted from the legal precision required in patent documentation. Moreover, the model often proposed multiple similar technical solutions, resulting in vague or overly broad patent scopes; however, patent claims must define distinct and well-delimited innovations. This reflects the current limitations of AI models in maintaining the consistency and uniqueness of technical disclosures.

During manual review, experts frequently noted repetitive phrases and ambiguous language in certain AI-generated drafts, which can undermine both the distinctiveness and legal enforceability of patent claims. To counter these problems, we recommend that future workflows combine automated redundancy/ambiguity checks with expert editing, and leverage model parameter adjustments (e.g., increased penalty for n-gram repetition) to further constrain output. In our study, all drafts were manually checked and edited to minimize such issues prior to expert evaluation.

Another key shortcoming was the lack of detailed descriptions of how the technical solutions are to be implemented. In many cases, the generated text failed to provide clear explanations of practical implementation, thereby reducing the utility and operational clarity of the claims. In patent examinations, detailed disclosure of implementation methods is critical to ensure that the application meets the standards of sufficiency and enablement.

Additionally, the evaluation methodology employed in this study has inherent limitations. While a structured manual self-evaluation was performed by the research team, it must be acknowledged that relying on the same researchers or model developers for evaluating outputs introduces potential subjectivity and bias. In principle, utilizing GPT-4o or any automated system for self-evaluation of its own outputs lacks logical validity, as such an approach is susceptible to systemic biases and limited objectivity. Therefore, we emphasize that our internal ratings primarily served as a filtering and pre-screening mechanism to improve efficiency, whereas the final validation of text quality and legal robustness was exclusively carried out by independent domain experts. Future work should incorporate additional layers of external evaluation—such as automated rating frameworks or assessments by a broader panel of experts—to further enhance the objectivity and reproducibility of the evaluation process.

Moreover, as this study focused only on Chinese and Russian patent data, the findings may not be fully generalizable to other legal and technical systems. Future research should extend the analysis to broader jurisdictions. A key limitation of this study is the regional and technical focus of the patent data used for both model prompting and evaluation. For future research, we believe it is essential to incorporate a wider variety of real-world patents—covering different fields, legal systems, and innovation types—to equip AI models with more comprehensive technical and legal knowledge. Such expansion is critical for generalizing findings and for further advancing the field of AI-driven patent drafting.

These limitations suggest that GPT-4o’s ability to handle technical depth and legal complexity remains constrained. To address these issues, future research may consider the following measures:

(1) Increasing the diversity of training data: Incorporating a broader range of legal texts and technical documents related to patents—especially from different technical fields—can improve the model’s understanding of legal and technical nuances.

(2) Integrating expert-assisted feedback mechanisms: Introducing expert review into the generation workflow can help refine the model output based on human feedback, thereby improving the accuracy and professionalism of the generated text.

(3) Implementing a rigorous text review process: After automated generation, a strict review process should be employed to detect and correct redundancy, inaccuracy, or ambiguity in the output [35].

(4) Enhancing the evaluation framework: Combining internal screening with independent assessments and transparent criteria can increase the credibility, reproducibility, and academic rigor of future LLM-based patent generation studies.

5. Conclusions

This study conducted an in-depth investigation into the practical application potential of GPT-4o in the patent drafting process within the field of prefabricated integral buildings. The results demonstrated that GPT-4o exhibits notable advantages in generating Claims and Description texts, particularly in terms of linguistic expression, structural organization, and the completeness of technical descriptions. These capabilities highlight the model’s value in improving drafting efficiency and reducing labor costs, making it especially suitable for generating initial drafts and supporting revisions.

However, several critical issues remain in the model’s current implementation. For instance, a high degree of redundancy in the description of technical achievements was observed, which compromises textual conciseness and may blur the boundaries of patent protection. The model also tends to generate multiple similar but insufficiently differentiated technical solutions, leading to reduced novelty and clarity in the patent text. In addition, the lack of detailed descriptions regarding the implementation of technical solutions undermines the practical applicability and sufficiency of disclosure—both of which are essential for patent approval.

To address these issues, this study recommends establishing an expert-driven review mechanism when using GPT-4o to assist in patent drafting. Such a mechanism would ensure that model outputs are scrutinized and refined to meet high standards of legal compliance and technical accuracy. Improving the model’s performance also requires optimizing the structure and quality of training data, particularly by incorporating more diverse and high-quality patent samples, including real-world cases from various fields and technological levels. Furthermore, implementing a rigorous review and feedback process is crucial in enhancing the precision, professionalism, and usability of AI-generated patent documents.

In summary, while ChatGPT is not without limitations, it already demonstrates strong application potential in specific dimensions. With the integration of appropriate human oversight mechanisms and continuous optimization strategies, ChatGPT holds the promise of delivering transformative improvements to the patent drafting workflow.

Author Contributions

Conceptualization, C.L. and N.I.F.; methodology, C.L.; software, C.L.; validation, N.I.F., C.L. and S.X.; formal analysis, C.L.; investigation, G.S.; resources, C.L.; data curation, C.L. and Y.L.; writing—original draft preparation, C.L.; writing—review and editing, C.L.; visualization, C.L.; supervision, N.I.F.; project administration, C.L.; funding acquisition, C.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors on request.

Acknowledgments

During the preparation of this manuscript, the authors used ChatGPT (OpenAI, GPT-4o) for the purposes of generating patent texts, which were subsequently evaluated through expert questionnaires to assess the quality of the generated content. The authors have reviewed and edited the output and take full responsibility for the content of this publication.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

AI	Artificial Intelligence
LLM	Large Language Model
GPT	Generative Pre-trained Transformer
LoRA	Low-Rank Adaptation
ANOVA	Analysis of Variance
MSE	Mean Squared Error
RMSE	Root Mean Squared Error
R&D	Research and Development
VIF	Variance Inflation Factor
PIB	Prefabricated integral building

Appendix A

Figure A1. Expert evaluation process (original Russian-annotated version).

References

Li, L.; Wang, L.; Zhang, X. Technology Innovation for Sustainability in the Building Construction Industry: An Analysis of Patents from the Yangtze River Delta, China. Buildings 2022, 12, 2205. [Google Scholar] [CrossRef]
Xue, X.; Tan, X.; Huang, Q.; Zhu, H.; Chen, J. Exploring the Innovation Path of the Digital Construction Industry Using Mixed Methods. Buildings 2022, 12, 1840. [Google Scholar] [CrossRef]
Ren, R.; Ma, J.; Luo, J. Large Language Model for Patent Concept Generation. Adv. Eng. Inform. 2025, 65, 103301. [Google Scholar] [CrossRef]
Jiang, L.; Goetz, S.M. Natural Language Processing in the Patent Domain: A Survey. Artif. Intell. Rev. 2025, 58, 214. [Google Scholar] [CrossRef]
Lee, J.-S. InstructPatentGPT: Training Patent Language Models to Follow Instructions with Human Feedback. Artif. Intell. Law 2024. [Google Scholar] [CrossRef]
Altin, M.; Mutlu, B.; Kilinc, D.; Cakir, A. Automated Testing for Service-Oriented Architecture: Leveraging Large Language Models for Enhanced Service Composition. IEEE Access 2025, 13, 89627–89640. [Google Scholar] [CrossRef]
Jiang, Q.; Gao, Z.; Karniadakis, G.E. DeepSeek vs. ChatGPT vs. Claude: A comparative study for scientific computing and scientific machine learning tasks. Theor. Appl. Mech. Lett. 2025, 15, 100583. [Google Scholar] [CrossRef]
Jurišević, N.; Gordić, D.; Nikolić, D.; Nešović, A.; Kowalik, R. Exploring the Potential of Emerging Digitainability—GPT Reasoning in Energy Management of Kindergartens. Buildings 2024, 14, 4038. [Google Scholar] [CrossRef]
Kamateri, E.; Salampasis, M.; Perez-Molina, E. Will AI Solve the Patent Classification Problem? World Pat. Inf. 2024, 78, 102294. [Google Scholar] [CrossRef]
Chiarello, F.; Giordano, V.; Spada, I.; Barandoni, S.; Fantoni, G. Future Applications of Generative Large Language Models: A Data-Driven Case Study on ChatGPT. Technovation 2024, 133, 103002. [Google Scholar] [CrossRef]
Rane, N.L.; Choudhary, S.P.; Rane, J. Enhancing Sustainable Construction Materials Through the Integration of Generative Artificial Intelligence, Such as ChatGPT. SSRN Electron. J. 2024, 1, 98–122. [Google Scholar] [CrossRef]
Chesterman, S. Good Models Borrow, Great Models Steal: Intellectual Property Rights and Generative AI. Policy Soc. 2025, 44, 23–37. [Google Scholar] [CrossRef]
Rane, N.; Choudhary, S.; Rane, J. Transforming the Civil Engineering Sector with Generative Artificial Intelligence, Such as ChatGPT or Bard. SSRN J. 2024. [Google Scholar] [CrossRef]
Koubaa, A.; Boulila, W.; Ghouti, L.; Alzahem, A.; Latif, S. Exploring ChatGPT Capabilities and Limitations: A Survey. IEEE Access 2023, 11, 118698–118721. [Google Scholar] [CrossRef]
World Population Review Patents by Country/Number of Patents Per Country 2023. Available online: https://worldpopulationreview.com/country-rankings/patents-by-country (accessed on 24 January 2025).
National Bureau of Statistics Research on China’s Innovation Index. Available online: https://www.stats.gov.cn/sj/zxfb/202302/t20230203_1901640.html (accessed on 25 January 2025).
Son, J.; Moon, H.; Lee, J.; Lee, S.; Park, C.; Jung, W.; Lim, H. AI for Patents: A Novel Yet Effective and Efficient Framework for Patent Analysis. IEEE Access 2022, 10, 59205–59218. [Google Scholar] [CrossRef]
Ding, Y.; Ma, J.; Luo, X. Applications of Natural Language Processing in Construction. Autom. Constr. 2022, 136, 104169. [Google Scholar] [CrossRef]
Kawano, S.; Nonaka, H.; Yoshino, K. ClaimBrush: A Novel Framework for Automated Patent Claim Refinement Based on Large Language Models. In Proceedings of the 2024 IEEE International Conference on Big Data (BigData), Washington, DC, USA, 15–18 December 2024; IEEE: Washington, DC, USA, 2024; pp. 6594–6603. [Google Scholar]
Sun, H.; Burton, H.V.; Huang, H. Machine Learning Applications for Building Structural Design and Performance Assessment: State-of-the-Art Review. J. Build. Eng. 2021, 33, 101816. [Google Scholar] [CrossRef]
Baduge, S.K.; Thilakarathna, S.; Perera, J.S.; Arashpour, M.; Sharafi, P.; Teodosio, B.; Shringi, A.; Mendis, P. Artificial Intelligence and Smart Vision for Building and Construction 4.0: Machine and Deep Learning Methods and Applications. Autom. Constr. 2022, 141, 104440. [Google Scholar] [CrossRef]
Olu-Ajayi, R.; Alaka, H.; Sulaimon, I.; Sunmola, F.; Ajayi, S. Building Energy Consumption Prediction for Residential Buildings Using Deep Learning and Other Machine Learning Techniques. J. Build. Eng. 2022, 45, 103406. [Google Scholar] [CrossRef]
Petralia, S. Mapping General Purpose Technologies with Patent Data. Res. Policy 2020, 49, 104013. [Google Scholar] [CrossRef]
Wang, G.; Zhou, Y.; Cao, D. Artificial Intelligence in Construction: Topic-Based Technology Mapping Based on Patent Data. Autom. Constr. 2025, 172, 106073. [Google Scholar] [CrossRef]
Li, C.Z.; Tam, V.W.; Zhou, M.; Liu, L.; Wu, H. Quantifying the Coupling Coordination Effect between the Prefabricated Building Industry and Its External Comprehensive Environment in China. J. Clean. Prod. 2024, 434, 140238. [Google Scholar] [CrossRef]
Park, Y.-N.; Lee, Y.-S.; Kim, J.-J.; Lee, T.S. The Structure and Knowledge Flow of Building Information Modeling Based on Patent Citation Network Analysis. Autom. Constr. 2018, 87, 215–224. [Google Scholar] [CrossRef]
Zaker Esteghamati, M.; Flint, M.M. Developing Data-Driven Surrogate Models for Holistic Performance-Based Assessment of Mid-Rise RC Frame Buildings at Early Design. Eng. Struct. 2021, 245, 112971. [Google Scholar] [CrossRef]
Ding, Y.; Fan, L.; Liu, X. Analysis of Feature Matrix in Machine Learning Algorithms to Predict Energy Consumption of Public Buildings. Energy Build. 2021, 249, 111208. [Google Scholar] [CrossRef]
Sharma, V. A Comprehensive Exploration of Regression Techniques for Building Energy Prediction. Eur. J. Adv. Eng. Technol. 2021, 8, 83–87. [Google Scholar]
Ragot, S. A Novel Approach to Measuring the Scope of Patent Claims Based on Probabilities Obtained from (Large) Language Models. World Pat. Inf. 2024, 79, 102321. [Google Scholar] [CrossRef]
Ardabili, S.F.; Abdilalizadeh, L.; Mako, C.; Torok, B.; Mosavi, A. Systematic Review of Deep Learning and Machine Learning for Building Energy. Front. Energy Res. 2022, 10, 786027. [Google Scholar] [CrossRef]
Khalil, M.; McGough, A.S.; Pourmirza, Z.; Pazhoohesh, M.; Walker, S. Machine Learning, Deep Learning and Statistical Analysis for Forecasting Building Energy Consumption—A Systematic Review. Eng. Appl. Artif. Intell. 2022, 115, 105287. [Google Scholar] [CrossRef]
Zhao, X.; Wu, W.; Wu, D. How Does AI Perform in Industry Chain? A Patent Claims Analysis Approach. Technol. Soc. 2024, 79, 102720. [Google Scholar] [CrossRef]
Nkolongo, F.T.; Echchakoui, S.; Mehdi, A. Harnessing Large Language Models for Precision Topic Extraction and Technology Patent Nomination: A GPT-Centric Methodology. Procedia Comput. Sci. 2025, 257, 904–912. [Google Scholar] [CrossRef]
Fernandes, D.; Garg, S.; Nikkel, M.; Guven, G. A GPT-Powered Assistant for Real-Time Interaction with Building Information Models. Buildings 2024, 14, 2499. [Google Scholar] [CrossRef]

Figure 1. Total patent grants in 2023 [15].

Figure 2. Index of patents granted per 10,000 R&D personnel, China, 2012–2021 [16].

Figure 3. Experimental design.

Figure 4. Example of the expert evaluation process.

Figure 5. Comparative boxplot of descriptive statistics for patent evaluation indicators in Russia and China (squares = mean, diamonds = outliers).

Figure 6. Pearson correlation matrix among different sections of the patent applications. (a) Russia. (b) China.

Figure 7. Normal diagnostics of regression-standardized residuals. (a) Histogram of standardized residuals—Russia. (b) Histogram of standardized residuals—China. (c) Normal P-P plot of standardized residuals—Russia. (d) Normal P-P plot of standardized residuals—China.

Table 1. Performance of candidate large language models on patent error detection task.

Model	Precision	Recall	F1
GPT-4o	0.90	0.58	0.70
Claude Sonnet 4	0.88	0.51	0.65
Gemini 2.5 Pro	0.81	0.46	0.58
DeepSeek V3	0.77	0.45	0.56
Qwen3 235B A228	0.76	0.48	0.59
Llama 4 Maverick	0.70	0.41	0.51

Table 2. Self-evaluation rubric for pre-screening GPT-4o-generated patent texts.

Dimension	Evaluation Question	Criteria for “Good” Rating
Completeness	Does the text cover all key technical features and necessary implementation elements?	(1) All key components or procedural steps are included; (2) Claim logic is coherent and complete;
Clarity	Is the language accurate and easy to understand? Do the text and figures align?	No vague terms or code-like expressions; text and figure numbers correspond one-to-one
Legal adequacy	Does the claim meet patent law requirements for disclosure and protection scope clarity?	Claim boundaries are clear; no exaggerated or subjective phrases; At least one implementation example provided.

Table 3. Patent claim generation and evaluation step-by-step.

Step	Input	Output	Self-Evaluation	Expert Evaluation
1	Generate a claim based on the provided abstract and image	Preliminary claims text	Very bad	-
2	Provide a basic component description, briefly listing the main elements involved in the invention	Improved claim text	Bad	-
……	……	……	……	-
n	Provide technical details of components, connection methods, materials used and their advantages	Further improved claim text	Good	Good

All ratings (Very bad–Good) and qualitative comments were provided by two independent domain experts.

Table 4. Variables in the analysis.

Variable Type	Variable Name	Abbreviation	Description
Response variable	Expert Comprehensive Evaluation	Expert	The aggregate score is given by experts assessing the overall patent application
Explanatory variables (continued)	Quality of Claims	Claims	The quality of the patent claims in terms of specificity and legal robustness
	Quality of Description Text	Description	The quality of the description text within the patent application, assessing clarity and completeness
	Linkage of Features to Disclosure	LFD	Relationship between the characteristic features from the formula and invention disclosure
	Linkage of Text to Figures	LTF	The relationship between the text of the application and the figures, showing how the illustrations support and clarify the described technical solution
	Other Text	Other	Field of application, objective, summary, embodiments and so on
	Quality of Claims	Claims	The quality of the patent claims in terms of specificity and legal robustness

Table 5. Model summary.

Model	R	R²	Adjusted R²	Std. Error of the Estimate	Durbin-Waston
Russia	0.647 ^a	0.441	0.391	0.645	1.797
China	0.681 ^a	0.464	0.439	0.581	1.786

a. Predictors: (constant): expert comprehensive evaluation.

Table 6. ANOVA.

Model	Variables	Sum of Squares	df	Mean Square	F	Siq.
Russia	Regression	32.547	5	6.509	15.658	<0.001
	Residual	45.314	109	0.416
China	Regression	31794	5	6.359	18.844	<0.001
	Residual	36.780	109	0.337

Independent variables: Claims, Description, Linkage of Features to Disclosure, Linkage of Text to Figures and Other Text.

Table 7. Coefficients.

Model	Variables	B	Std. Error	Beta	t	Siq.	Tolerance	VIF
Russia	(Constant)	−1.360	0.647	-	−2.101	0.038	-	-
	Claims	0.505	0.129	0.293	3.904	<0.001	0.951	1.052
	Description	0.418	0.068	0.454	6.117	<0.001	0.968	1.033
	LFD	0.043	0.066	0.048	0.655	0.514	0.991	1.009
	LTF	0.074	0.061	0.089	1.213	0.228	0.989	1.011
	Other	0.166	0.061	0.202	2.725	0.007	0.969	1.032
China	(Constant)	−1.182	0.543	-	−2.177	0.032	-	-
	Claims	0.499	0.119	0.313	4.189	<0.001	0.884	1.132
	Description	0.367	0.067	0.420	5.520	<0.001	0.850	1.177
	LFD	0.064	0.066	0.072	0.971	0.334	0.885	1.129
	LTF	0.117	0.057	0.153	2.058	0.042	0.892	1.121
	Other	0.076	0.062	0.091	1.231	0.221	0.904	1.106

Table 8. Residual statistics.

Model	Variable	Min	Max	Mean	Std. Deviation	N
Russia	Predicted Value	2.09	4.48	3.17	0.534	115
	Residual	−1.850	1.374	0.000	0.630	115
	Std. Predicted Value	−2.007	2.456	0.000	1.000	115
	Std. Residual	−2.869	2.130	0.000	0.978	115
China	Predicted Value	2.13	4.36	3.06	0.528	115
	Residual	−1.523	1.511	0.000	0.568	115
	Std. Predicted Value	−1.769	2.456	0.000	1.000	115
	Std. Residual	−2.622	2.602	0.000	0.978	115

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Liu, C.; Fomin, N.I.; Xiao, S.; Sun, G.; Lyu, Y. How ChatGPT Is Shaping Next-Generation Patent Solutions. Buildings 2025, 15, 2273. https://doi.org/10.3390/buildings15132273

AMA Style

Liu C, Fomin NI, Xiao S, Sun G, Lyu Y. How ChatGPT Is Shaping Next-Generation Patent Solutions. Buildings. 2025; 15(13):2273. https://doi.org/10.3390/buildings15132273

Chicago/Turabian Style

Liu, Chong, Nikita Igorevich Fomin, Shuoting Xiao, Guofeng Sun, and Yuelong Lyu. 2025. "How ChatGPT Is Shaping Next-Generation Patent Solutions" Buildings 15, no. 13: 2273. https://doi.org/10.3390/buildings15132273

APA Style

Liu, C., Fomin, N. I., Xiao, S., Sun, G., & Lyu, Y. (2025). How ChatGPT Is Shaping Next-Generation Patent Solutions. Buildings, 15(13), 2273. https://doi.org/10.3390/buildings15132273

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

How ChatGPT Is Shaping Next-Generation Patent Solutions

Abstract

1. Introduction

2. Methodology

2.1. Research Strategy

2.2. Data Collection Methods

2.3. Data Analysis Methods

3. Results

4. Discussion

5. Conclusions

Author Contributions

Funding

Data Availability Statement

Acknowledgments

Conflicts of Interest

Abbreviations

Appendix A

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI