Identification and Prediction Methods for Frontier Interdisciplinary Fields Integrating Large Language Models

Wu, Yu; Lin, Qiao; Wu, Jinming; Yao, Ru; Zhang, Xuefu

doi:10.3390/systems13080677


by Yu Wu ¹, Qiao Lin ¹, Jinming Wu ¹, Ru Yao ² and Xuefu Zhang ¹,*

¹ Agricultural Information Institute of Chinese Academy of Agricultural Sciences, Beijing 100081, China

² Institute of Data Science and Agricultural Economy, Beijing Academy of Agriculture and Forestry Sciences, Beijing 100097, China

* Author to whom correspondence should be addressed.

Systems 2025, 13(8), 677; https://doi.org/10.3390/systems13080677

Submission received: 30 June 2025 / Revised: 5 August 2025 / Accepted: 5 August 2025 / Published: 8 August 2025

(This article belongs to the Topic Data Science and Intelligent Management)


Abstract

Identifying frontier interdisciplinary domains is essential for tracking scientific evolution and informing strategic research planning. This study proposes a comprehensive framework that integrates (1) semantic disciplinary classification using a large language model (GPT-3.5-Turbo), (2) quantitative metrics for interdisciplinarity (degree and integration strength) and frontierness (novelty, growth, and impact), and (3) trend prediction using time series models, including Transformer, LSTM, GRU, Random Forest, and Linear Regression. The framework systematically captures both structural and temporal dimensions of emerging research fields. Compared to conventional citation-based or topic modeling approaches, it enhances semantic precision, supports multi-label classification, and enables forward-looking forecasts. Empirical validation shows that the Transformer model achieved the highest predictive performance, outperforming other deep learning and baseline models. As an illustrative example, the framework was applied to synthetic biology, which demonstrated high interdisciplinarity, strong novelty, and growing academic influence. These results underscore the field’s strategic position as a frontier interdisciplinary domain. Beyond this case, the proposed framework is generalizable to other domains and provides a scalable, data-driven solution for dynamic monitoring of emerging interdisciplinary areas. It holds promise for applications in science and technology intelligence, research evaluation, and policy support.

Keywords: frontier research; interdisciplinarity; deep learning forecasting; transformer model; trend prediction; science and technology intelligence

1. Introduction

With the rapid development of science and technology, the boundaries between disciplines have gradually become blurred [1], leading to an increasing trend toward interdisciplinary and integrative scientific research. Interdisciplinary research has emerged as a critical domain for innovation breakthroughs [2], especially in addressing complex global challenges such as climate change [3], biotechnology [4], and artificial intelligence [5]. In this context, interdisciplinary approaches have created significant opportunities for innovation. At the same time, frontier scientific fields play an essential role in driving technological advancement [6] and tackling global issues. As emerging fields continue to evolve, the timely identification of potential frontier interdisciplinary areas can provide valuable guidance for policymakers, scholars, and industries, enabling the strategic allocation of research resources in advance [7,8].

Frontier interdisciplinary fields are characterized by both “frontierness” and “interdisciplinarity”. They represent the cutting edge of scientific exploration, integrating the latest achievements and technologies from different disciplines, supporting breakthrough innovation, and promoting the birth of new knowledge. These fields significantly expand the boundaries of scientific research and deepen its applications [9,10]. However, effectively identifying these potential frontier interdisciplinary areas from the vast amount of academic output and predicting their future development trajectories remains a major challenge in current research [11,12].

At present, methods for identifying frontier fields largely rely on citation analysis [13], textual content mining [14], and expert judgment [15]. Although these approaches have, to some extent, revealed patterns of field evolution, they also face pressing limitations. For instance, citation analysis suffers from issues such as self-citation anomalies [16] and time-lag effects [17]. Content analysis, often based on topic modeling, is known to encounter serious conceptual and practical problems, including the lack of well-grounded Bayesian priors, discrepancies with the true statistical characteristics of text data, and difficulties in accurately determining the number of topics [18]. Meanwhile, expert judgment is inherently subject to biases and subjectivity [19].

Current approaches to identifying frontier interdisciplinary fields primarily rely on topic modeling [14,20], focusing on subfield-level analysis while lacking mechanisms to evaluate and forecast the overall interdisciplinarity and frontierness of entire research domains.

To address this gap, we propose an integrated framework that combines large language models, quantitative metrics, and time series forecasting. GPT-3.5-Turbo is used to classify multidisciplinary papers from the Web of Science, enabling fine-grained disciplinary tagging. Interdisciplinarity is measured through both static indicators—such as the degree and strength of cross-disciplinary integration—and dynamic network-based metrics. Frontierness is assessed using a composite index of novelty, growth, and impact. Time series models are then compared to select the optimal method for forecasting development trajectories.

Applying this framework to synthetic biology, we demonstrate its utility in identifying frontier interdisciplinary domains, supporting evidence-based decisions in research management and innovation policy. This study offers not only a new methodological pathway for frontier identification but also practical tools for guiding interdisciplinary innovation and optimizing resource allocation.

The proposed framework offers both a novel technological approach for quantitative frontier identification and critical decision support for research management, resource allocation, and the construction of innovation systems. It holds significant implications for advancing academic innovation and driving societal progress.

This study develops a structured and integrative framework that consolidates existing methods—LLM-based semantic classification, multidimensional frontierness evaluation, and time series trend forecasting—into a coherent pipeline aimed at supporting the strategic identification of frontier interdisciplinary domains. The value of this framework lies in its ability to systematically combine and apply recent advances across several methodological strands. Specifically, the framework offers the following:

(1): Methodological Innovation: We propose an integrated framework that combines semantic classification via GPT-3.5-Turbo, novel interdisciplinarity and frontierness metrics, and time series forecasting. This overcomes the limitations of traditional citation-based or keyword-based approaches and enhances the precision and scalability of frontier field identification.
(2): Metric System for Interdisciplinarity and Frontierness: A set of indicators is introduced to quantify the degree and depth of disciplinary integration, as well as the novelty, growth, and impact of a field. These metrics provide a systematic basis for identifying and comparing interdisciplinary dynamics across domains.
(3): Time Series Forecasting for Frontier Trend Analysis: We evaluate and compare linear, machine learning, and deep learning models for predicting research field trajectories. The Transformer model is found to perform best, offering a robust, data-driven tool for prospective trend analysis in science and technology studies.
(4): Empirical Validation: Using synthetic biology as a case study, the framework demonstrates its practical effectiveness in capturing the interdisciplinary structure, research frontierness, and developmental trajectory of an emerging field.

This study contributes both methodologically and practically to the field of science and technology intelligence, offering a decision-support tool for researchers, institutions, and policymakers in managing and anticipating interdisciplinary innovation.

The remainder of this paper is organized as follows. Section 2 reviews related work. Section 3 presents the proposed methodology. Section 4 reports the empirical results. Section 5 discusses the methodology and research findings. Finally, Section 6 concludes the study.

2. Literature Review

2.1. Research Front Identification

Frontier research generally refers to the most innovative and promising directions within a discipline or across interdisciplinary fields [7]. Accurately identifying research frontiers is crucial for research management agencies and policymakers, as it facilitates the optimization of resource allocation and drives breakthroughs in key technologies [21,22]. For researchers, understanding and grasping research frontiers not only ensures the academic value and practical impact of their work but also promotes interdisciplinary collaboration and knowledge innovation [23].

In bibliometric studies, “Research Fronts” is a key concept for assessing and identifying frontier areas. Among the first to address it, de Solla Price [24] proposed that focusing on approximately 30 to 50 recently published highly cited papers and their associated research topics within a domain could effectively reflect the research frontiers of that field. Later, Small and Griffith [25] further defined research fronts as clusters of highly cited or co-cited documents.

Subsequently, scholars have proposed a variety of methods for frontier identification, mainly categorized into qualitative methods based on expert knowledge and quantitative methods leveraging computational technologies [26]. Among qualitative approaches, the Delphi method is a classic and widely applied technique for identifying frontier interdisciplinary fields [27]. Quantitative approaches, which have become a research focus, include citation network-based analysis and content-based analysis [26].

Citation analysis mainly includes co-citation analysis [28] and bibliographic coupling [29]. Small and Griffith [25] proposed that co-cited documents often share certain content similarities. Through co-citation analysis, the intrinsic relationships between documents can be revealed, and by applying clustering methods [30,31] and community detection techniques [32], researchers can further explore research fronts within academic fields.

On the other hand, bibliographic coupling [33] posits that documents sharing common references are likely to be closely related in content. Based on this concept, numerous studies have explored frontier topics through bibliographic coupling [33,34]. For instance, Liu et al. [35] used bibliographic coupling networks to detect scientific research fronts and trends, while Wei et al. [36] applied this approach to investigate research fronts in low-carbon technologies. Although these methods have achieved certain successes in revealing the dynamics of disciplinary development, they commonly suffer from time lag issues: newly published papers need time to accumulate citations, making it difficult to promptly capture emerging research areas and potential frontiers.

Content-based approaches, in contrast, directly analyze the textual content of documents to identify research fronts. Content analysis typically operates at two granularities: lexical and thematic. From the lexical perspective, researchers detect emerging terms and concepts in papers, patents, and other texts, or apply co-word analysis methods. For example, Kleinberg [37] proposed a burst detection algorithm capable of quickly capturing the emergence of research hotspots or new ideas in a field. Li and Chu [38] explored the identification of specific research frontiers based on enhanced co-word analysis techniques. From the thematic perspective, studies have employed models such as Latent Dirichlet Allocation (LDA) [39,40,41] and BERTopic [42] for thematic frontier discovery. Within the framework of topic modeling, several studies have further proposed corresponding frontier evaluation indicators encompassing dimensions such as impact, novelty, and growth [43,44].

2.2. Interdisciplinary Research Identification

Given the close methodological association between frontier detection and interdisciplinarity assessment, certain techniques such as topic modeling and citation analysis are discussed in both sections, though applied to different analytical purposes.

Research on interdisciplinary identification has increasingly adopted topic modeling approaches, either by first identifying interdisciplinary documents or by directly evaluating topic-level interdisciplinarity [45]. These methods often draw from keyword co-occurrence, semantic modeling, and citation-based linkages—many of which are also used in research front detection.

Keyword mining uses extracted terms to infer interdisciplinary trends. For instance, Xu et al. [46] and Wang et al. [47] analyzed keyword diversity and introduced measures such as “topic term interdisciplinarity” [48]. While intuitive and scalable, this approach is sensitive to keyword quality and struggles with synonym recognition and latent semantic links.

Topic modeling, especially using LDA or BERTopic, has become a mainstream method for capturing thematic interdisciplinarity. Recent work has integrated semantic embeddings [49], cosine similarity and Rao–Stirling diversity [50], and even graph neural networks [51] to enhance topic coherence and interdisciplinary linkage prediction. However, these models often face challenges in dynamic tracking and interpretability, particularly when modeling the evolving nature of interdisciplinary fields.

In the area of citation network analysis, citation-based interdisciplinary topic identification methods are categorized into co-citation analysis, bibliographic coupling, and direct citation analysis [45]. Omodei et al. [52] proposed a bipartite multilayer network analysis approach based on citations and disciplines to assess the interdisciplinary importance of scholars, institutions, and countries. Adams and Light [53] constructed bibliographic coupling networks for AIDS research papers and identified topics spanning multiple disciplinary communities as interdisciplinary topics.

Currently, interdisciplinary field identification methods exhibit a trend of increasing diversification. Nevertheless, each category of methods faces inherent limitations. Keyword mining depends on the precision and completeness of terms and struggles to detect implicit interdisciplinary linkages. Text content mining approaches (e.g., BERTopic, graph neural networks) offer advantages in semantic understanding and knowledge recombination but still face challenges regarding model interpretability and dynamic tracking. Citation analysis methods effectively reveal knowledge flows and disciplinary associations; however, due to the time lag inherent in citation accumulation, they are less capable of reflecting the latest frontier dynamics.

2.3. Time Series Modeling in Interdisciplinary Frontier Identification

Time series analysis, as a predictive method, has been widely applied in scientific trend analysis in recent years [54,55,56]. Particularly in the prediction of disciplinary frontiers, it enables the identification of development trajectories across different fields [57], thereby facilitating the targeted allocation of scientific and technological resources [58]. The basic idea of time series analysis is to use historical data to build models that infer future trends, making it particularly suitable for complex systems characterized by continuity and cumulative effects.

2.4. Limitations of Existing Research

Despite notable advancements, current studies on the identification and prediction of frontier fields still face several limitations:

(1): Methodological Fragmentation:

Most approaches rely on a single method—such as bibliometrics, machine learning, or topic modeling—without integrating their respective strengths. This constrains the capacity to conduct comprehensive, multidimensional analyses of interdisciplinarity and frontier dynamics.

(2): Narrow Scope of Interdisciplinary Focus:

Existing research tends to emphasize subfield-level topic detection, lacking tools for identifying and modeling interdisciplinarity at the domain level. As disciplinary boundaries become increasingly fluid, there is a growing need for more holistic and scalable frameworks.

(3): Static Analysis and Forecasting Gaps:

Many studies are based on retrospective data, with limited adoption of predictive models. The absence of dynamic forecasting restricts their utility for prospective technology assessment and policy planning.

(4): Semantic and Temporal Limitations:

Conventional methods (e.g., keyword mining, LDA, citation analysis) often struggle with semantic nuance, time sensitivity, and early detection of emerging topics. They are hindered by citation delays, rigid topic structures, and limited ability to capture deep contextual interdisciplinarity.

To clearly present the strengths and weaknesses of existing methods, we summarize them in Table 1.

In response, this study proposes an integrated framework combining large language models, bibliometric metrics, and time series forecasting to enable more accurate and dynamic identification of frontier interdisciplinary domains.

3. Research Framework

Identifying frontier interdisciplinary domains requires integrating three core dimensions: interdisciplinarity, frontierness, and development trends. Interdisciplinarity reflects cross-domain integration, while frontierness captures innovation potential and emerging influence. By combining LLM-based semantic classification, bibliometric analysis, and trend forecasting, this study proposes a unified framework to systematically identify high-potential domains and support strategic decision-making in science and technology development.

The research process is divided into five major steps (Figure 1): (1) Data Collection and Disciplinary Classification: Extract paper data from the WOS database covering the past ten years to construct a broad academic dataset; utilize OpenAI’s GPT-3.5-Turbo for disciplinary classification. (2) Interdisciplinarity Identification: Based on the classification results, design interdisciplinarity measurement indicators to calculate the degree of interdisciplinarity and the strength of interdisciplinary integration. (3) Frontierness Assessment: Evaluate the frontierness of each domain by integrating three dimensions: influence, growth, and novelty. (4) Field Trend Analysis: Employ five different time series models to analyze the future trends in the academic impact of papers from each field, modeling and forecasting future development trajectories. (5) Identification of Frontier Interdisciplinary Fields: Integrate interdisciplinarity and frontierness indicators, combined with the technological trend analysis, to identify potential frontier interdisciplinary fields, thereby providing empirical support for technological innovation and policymaking within the respective fields.

3.1. Interdisciplinarity Identification Method

This study analyzes interdisciplinarity at three levels: (1) disciplinary classification, (2) static indicator construction, and (3) dynamic network analysis. First, we use ChatGPT to perform semantic classification of papers, allowing for the identification of multiple relevant disciplines and addressing the limitations of traditional single-label systems. Second, we construct interdisciplinarity metrics based on bibliometric distributions to quantify integration breadth and depth. Third, we extract dynamic indicators from both static and temporal co-occurrence networks to reveal the structural evolution of interdisciplinary relationships over time.

3.1.1. Paper Classification System Based on ChatGPT

Conventional classification systems such as Web of Science (WoS), Scopus, and Fields of Research (FOR) offer standardized, journal-level taxonomies with broad coverage; however, they face limitations in handling interdisciplinary and heterogeneous data. These systems often assign a single discipline per document, lack contextual understanding, and require manual effort, making them less effective for large-scale or interdisciplinary analyses.

To address these issues, this study employs GPT-3.5-turbo-0125 for paper classification. As a large language model, ChatGPT enables deep semantic understanding, allowing for the extraction of both explicit disciplinary labels and implicit interdisciplinary signals. This improves classification precision, adaptability across data sources, and processing efficiency.

Unlike traditional systems, ChatGPT supports multi-label classification, capturing the full disciplinary scope of publications. It also accommodates varying classification standards, reducing inconsistencies in multi-source data integration [59].

In this study, papers on synthetic biology were classified into 19 disciplines, such as Biology, Engineering, Computer Science, Medicine, and Environmental Science, based on the Biglan [60] framework. This classification underpins the subsequent measurement of interdisciplinarity metrics.
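The multi-label step can be sketched in a few lines. This is only an illustration: the prompt wording, the function names `build_prompt` and `parse_labels`, and the semicolon-separated reply format are assumptions, and the discipline list contains only the five names mentioned above rather than the full set of 19. The call to the language model itself is deliberately left out.

```python
# Illustrative subset: only five of the 19 disciplines are named in the text.
DISCIPLINES = [
    "Biology",
    "Engineering",
    "Computer Science",
    "Medicine",
    "Environmental Science",
]


def build_prompt(title, abstract):
    """Assemble a hypothetical classification prompt (wording is an assumption)."""
    options = "; ".join(DISCIPLINES)
    return (
        "Assign every applicable discipline to the paper below.\n"
        f"Allowed disciplines: {options}\n"
        "Answer with a semicolon-separated list only.\n\n"
        f"Title: {title}\nAbstract: {abstract}"
    )


def parse_labels(reply):
    """Turn a model reply into a list of valid, de-duplicated discipline labels."""
    lookup = {name.lower(): name for name in DISCIPLINES}
    labels = []
    for part in reply.split(";"):
        name = lookup.get(part.strip().lower())
        # Drop anything outside the allowed list and skip repeats.
        if name is not None and name not in labels:
            labels.append(name)
    return labels
```

Validating the reply against a fixed list keeps free-text model output from introducing labels outside the taxonomy, which matters for the co-occurrence counts computed later.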

To evaluate the ChatGPT-based classification system, we randomly sampled 400 articles from the full corpus. Each article was associated with a WoS subject category. To establish a reliable ground truth, two domain experts independently reviewed each article’s title and abstract and either confirmed the WoS label or provided a revised label. Disagreements were resolved through discussion, resulting in a consensus-based reference set.

Evaluation was conducted using standard metrics for multi-class classification:

(1): Precision: indicates the proportion of papers that were classified into a certain category by the model and indeed belong to that category. It is calculated as follows:

Precision = \frac{TP}{TP + FP}

(1)

where TP denotes true positives and FP denotes false positives.

(2): Recall: indicates the proportion of papers that actually belong to a certain category and were successfully identified by the model. It is calculated as follows:

Recall = \frac{TP}{TP + FN}

(2)

where FN denotes false negatives.

(3): F1 score: the harmonic mean of precision and recall, used to comprehensively assess the model’s accuracy and coverage. It is calculated as follows:

F_{1} = \frac{2 \cdot Precision \cdot Recall}{Precision + Recall}

(3)
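As a minimal sketch of Equations (1) to (3), the function below computes precision, recall, and F1 for one discipline at a time. It assumes each paper carries a set of labels in both the expert reference set and the model output; the function name and data layout are illustrative, not taken from the study.

```python
def prf_for_label(gold, predicted, label):
    """Precision, recall and F1 for one discipline.

    gold, predicted: parallel lists of label sets, one set per paper.
    """
    tp = fp = fn = 0
    for gold_labels, pred_labels in zip(gold, predicted):
        in_gold = label in gold_labels
        in_pred = label in pred_labels
        if in_gold and in_pred:
            tp += 1  # correctly assigned
        elif in_pred:
            fp += 1  # assigned by the model, rejected by the experts
        elif in_gold:
            fn += 1  # missed by the model
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


gold = [{"Biology"}, {"Biology", "Engineering"}, {"Medicine"}, {"Engineering"}]
pred = [{"Biology"}, {"Biology"}, {"Biology", "Medicine"}, {"Engineering"}]
biology_scores = prf_for_label(gold, pred, "Biology")
```

In the toy data, Biology has two true positives, one false positive, and no false negatives, so precision is 2/3, recall is 1, and F1 is 0.8.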

3.1.2. Indicators for Disciplinary Interdisciplinarity Degree and Disciplinary Integration Strength

To quantitatively assess the interdisciplinary nature of a research field, this study adopts two complementary indicators: Disciplinary Interdisciplinarity Degree (I) and Disciplinary Integration Strength (W).

The Interdisciplinarity Degree (I) reflects the breadth of disciplinary participation—that is, how evenly research output is distributed across multiple disciplines. A higher value suggests more balanced involvement from diverse fields, indicating wider interdisciplinary coverage. Based on Ma et al. [61], the formula is as follows:

I = 1 - \frac{\sum_{i = 1}^{γ} m_{i}^{2}}{{(\sum_{i = 1}^{γ} m_{i})}^{2}}

(4)

This formula assesses the breadth of interdisciplinary integration within a field. An I value closer to 1 indicates a greater level of interdisciplinarity, meaning that the field involves a more diverse range of disciplines.

A higher interdisciplinarity score I indicates that research output is more evenly distributed across disciplines. For example, if a field includes equal contributions from biology and computer science, I will be high. In contrast, if one discipline dominates, I decreases, signaling limited interdisciplinary breadth.
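A short sketch of Equation (4) makes the behavior of I concrete. It assumes that m_i is the number of papers assigned to discipline i and that γ is the number of disciplines; the text does not define these symbols explicitly, so this reading is an assumption.

```python
def interdisciplinarity_degree(counts):
    """Equation (4): 1 minus the sum of squared counts over the squared total.

    counts: papers per discipline (assumed meaning of m_i).
    """
    total = sum(counts)
    if total == 0:
        raise ValueError("at least one paper is required")
    return 1 - sum(m * m for m in counts) / (total * total)


balanced_two = interdisciplinarity_degree([50, 50])          # 0.5
dominated_two = interdisciplinarity_degree([90, 10])         # about 0.18
balanced_four = interdisciplinarity_degree([25, 25, 25, 25]) # 0.75
```

Under this reading, γ equally sized disciplines give I = 1 − 1/γ, so two balanced disciplines yield 0.5 and values near 1 require balanced participation from many disciplines.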

The Integration Strength (Wᵢⱼ) measures the depth of interaction between two specific disciplines by evaluating how frequently they co-occur in the same paper. A higher value indicates closer interdisciplinary collaboration. Following Leydesdorff [62], the formula is as follows:

W_{ij} = \frac{N(i \cap j)}{N(i \cup j)} \times 100

(5)

where N(i ∩ j) represents the number of papers associated with both discipline i and discipline j, and N(i ∪ j) represents the total number of papers associated with either discipline i or j. A higher W value indicates a stronger interdisciplinary connection between the two disciplines.

By combining I (broad disciplinary diversity) with W (micro-level integration closeness), the framework captures both the extent and strength of interdisciplinary collaboration within a given research domain.
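Equation (5) can be illustrated with a small sketch that counts co-occurrences over per-paper label sets. The function name and the toy data are assumptions made for illustration.

```python
def integration_strength(papers, i, j):
    """Equation (5): share of papers carrying both labels among those carrying either, times 100.

    papers: iterable of label sets, one per paper.
    """
    both = sum(1 for labels in papers if i in labels and j in labels)
    either = sum(1 for labels in papers if i in labels or j in labels)
    return 100 * both / either if either else 0.0


papers = [
    {"Biology", "Engineering"},
    {"Biology"},
    {"Engineering", "Computer Science"},
    {"Biology", "Engineering", "Medicine"},
]
w_bio_eng = integration_strength(papers, "Biology", "Engineering")  # 50.0
```

Here two of the four papers that mention Biology or Engineering mention both, giving W = 50.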

3.1.3. Temporal Indicators of Interdisciplinary Integration Strength

To capture the structural evolution of interdisciplinarity in synthetic biology, this study analyzes both the overall network structure and its yearly dynamics. We constructed a static disciplinary co-occurrence network based on ChatGPT-derived classifications, where nodes represent disciplines and edges reflect their co-occurrence within the same paper, weighted by frequency. This network visualizes the roles of core and peripheral disciplines and the patterns of disciplinary interaction.

To examine temporal dynamics, we divided the data into annual time windows and constructed a series of disciplinary evolution networks from 2015 to 2024. On this basis, we selected representative network topology indicators (Table 2).

These indicators provide insights into how disciplinary integration has changed over time, revealing key turning points and long-term collaboration trends.
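The yearly networks described above can be sketched with the standard library alone. The input layout (a year plus a set of disciplines per paper) is an assumption, and network density is used here only as an example of a topology indicator, not necessarily one of those listed in Table 2.

```python
from collections import Counter, defaultdict
from itertools import combinations


def yearly_cooccurrence(papers):
    """Build one weighted co-occurrence edge list per publication year.

    papers: iterable of (year, set_of_disciplines).
    Returns {year: Counter({(disc_a, disc_b): number_of_papers})}.
    """
    networks = defaultdict(Counter)
    for year, labels in papers:
        # Sorting gives each undirected edge a single canonical key.
        for pair in combinations(sorted(labels), 2):
            networks[year][pair] += 1
    return dict(networks)


def density(edges, n_nodes):
    """Fraction of possible undirected edges that are present."""
    possible = n_nodes * (n_nodes - 1) / 2
    return len(edges) / possible if possible else 0.0


sample = [
    (2015, {"Biology", "Engineering"}),
    (2015, {"Biology", "Engineering", "Medicine"}),
    (2016, {"Biology", "Computer Science"}),
]
networks = yearly_cooccurrence(sample)
```

Papers with a single label add no edges, so a year consisting only of such papers would not appear in the result.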

3.2. Frontierness Identification of Research Fields

In the bibliometric analysis of scientific research, measuring frontierness has become an important method for identifying emerging research areas and innovation trends. Scholars have proposed various indicators for evaluating frontierness, including novelty, growth, and impact.

This study systematically characterizes the frontierness of research fields by integrating three dimensions: novelty, growth, and impact. (1) Novelty evaluates the emerging nature of research content, reflecting the speed of knowledge renewal within a field. (2) Growth measures research activity and expansion potential, indicating the developmental trends of the field. (3) Impact reflects the degree to which research outputs are cited, indicating their academic recognition and dissemination. These three dimensions collectively provide a comprehensive quantitative assessment of frontierness, capturing the degree of content innovation, the rate of field expansion, and the academic influence of research achievements.

3.2.1. Novelty Indicator

Novelty is a critical metric for determining whether the research content of a field belongs to an emerging domain. It evaluates the level of innovation within a field during a specific time period by analyzing the publication dates of papers. The calculation formula is as follows:

N (D) = \frac{1}{M_{t} (D)} \sum_{m \in M_{t} (D)} T_{m}

(6)

where M_{t}(D) represents the number of papers in field D during time period t, and T_{m} denotes the publication year of paper m within the same period. A higher concentration of recent publications indicates that field D is characterized by newer research activities, suggesting a higher degree of novelty.
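Read literally, Equation (6) is the mean publication year of a field's papers within the period. A minimal sketch, with an illustrative function name and toy data:

```python
def novelty(pub_years):
    """Equation (6): mean publication year of the papers in a field."""
    if not pub_years:
        raise ValueError("at least one paper is required")
    return sum(pub_years) / len(pub_years)


older_field = novelty([2015, 2015, 2016, 2016])  # 2015.5
newer_field = novelty([2015, 2016, 2024, 2024])  # 2019.75
```

Because the result is an average year, it is only comparable between fields measured over the same time window.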

3.2.2. Growth Indicator

Growth reflects whether research activities in a field show an increasing trend over a specific time period. By calculating the growth rate of the number of publications across different periods, the activity level and development potential of the field can be assessed. The growth indicator is calculated as follows:

I (D) = \frac{1}{4} \sum_{t = 1}^{4} \frac{M_{t + 1} (D) - M_{t} (D)}{M_{t} (D)}

(7)

where M_{t}(D) denotes the number of papers in field D during time period t. A higher growth value suggests that the field is rapidly developing and possesses strong potential for future expansion.
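Equation (7) averages four period-over-period growth rates, which implies five consecutive publication counts. The sketch below generalizes to any number of periods; that generalization, the function name, and the sample counts are assumptions for illustration.

```python
def growth(counts):
    """Mean relative change between consecutive periods (Equation (7) uses five periods)."""
    if len(counts) < 2:
        raise ValueError("need at least two periods")
    rates = []
    for current, following in zip(counts, counts[1:]):
        if current == 0:
            raise ValueError("growth rate is undefined for an empty period")
        rates.append((following - current) / current)
    return sum(rates) / len(rates)


# Rates are 0.20, 0.25, 0.20 and 0.50, so the mean is 0.2875.
example_growth = growth([100, 120, 150, 180, 270])
```

A period with zero papers makes the ratio undefined, so the sketch raises an error rather than guessing a value.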

3.2.3. Impact Indicator

Impact reflects the academic influence of research outputs within a field, typically measured through citation data. The calculation formula for impact is as follows:

E (D) = \frac{1}{M_{t} (D)} \sum_{m \in M_{t} (D)} C_{m}

(8)

where C_{m} represents the citation count of paper m during time period t. The average number of citations per paper provides a measure of the academic influence of the field. A higher impact value indicates broader academic recognition and higher citation rates for the field’s research outputs.
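Equation (8) is the mean citation count per paper. A minimal sketch with made-up citation counts:

```python
def impact(citations):
    """Equation (8): average citations per paper in the field and period."""
    if not citations:
        raise ValueError("at least one paper is required")
    return sum(citations) / len(citations)


example_impact = impact([0, 3, 12, 45])  # 15.0
```

As the toy data shows, a mean can be driven by a few highly cited papers, which is worth keeping in mind when comparing fields of very different sizes.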

3.3. Field Trend Analysis Methods

3.3.1. Time Series Models

Given the temporal variability [63] and complexity [64] of frontier interdisciplinary fields, traditional static analysis methods are often insufficient for capturing the dynamic relationships among multidimensional data. In contrast, time series models can achieve both long-term trend forecasting and short-term fluctuation analysis [65], making them widely applied in the dynamic tracking and prospective research of scientific frontiers.

This study employs five representative time series models for trend analysis: LSTM [66], GRU [67], Transformer [68], Random Forest [69], and Linear Regression [70]. LSTM and GRU are selected for their strong capability in capturing temporal dependencies in sequential data; Transformer, with its attention mechanism and parallel computation, offers enhanced efficiency and scalability for long sequence modeling. Random Forest is employed for its robustness in handling nonlinear relationships, while Linear Regression serves as a baseline for performance comparison.

By integrating deep learning and traditional machine learning approaches, this study aims to comprehensively assess the dynamic trends of frontier fields under different modeling perspectives.
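To make the baseline concrete, the sketch below fits a straight line to a yearly series with ordinary least squares and extrapolates it. Regressing on the time index is an assumption: the text does not state which features the Linear Regression baseline uses, and the function names are illustrative.

```python
def fit_linear_trend(values):
    """Closed-form least squares fit of value = intercept + slope * t, with t = 0, 1, 2, ..."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    t_mean = (n - 1) / 2
    y_mean = sum(values) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(values))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return intercept, slope


def forecast(values, horizon):
    """Extrapolate the fitted line for the next `horizon` time steps."""
    intercept, slope = fit_linear_trend(values)
    n = len(values)
    return [intercept + slope * (n + k) for k in range(horizon)]


next_two = forecast([2.0, 4.0, 6.0, 8.0], horizon=2)  # close to [10.0, 12.0]
```

A baseline of this kind captures only a constant trend, which is why it serves as a reference point for the nonlinear and sequence models rather than as a competitor.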

3.3.2. Evaluation Metrics for Time Series Models

To assess the performance of time series models in forecasting frontier trends, this study adopts five standard evaluation metrics: Mean Absolute Error (MAE), Mean Squared Error (MSE), Root Mean Squared Error (RMSE), Mean Absolute Percentage Error (MAPE), and the Coefficient of Determination (R²).

MAE and MAPE offer intuitive measures of average error; MSE and RMSE emphasize larger deviations, with RMSE reflecting error dispersion. R² quantifies the proportion of variance explained, indicating model fit.

Together, these indicators support a robust comparative analysis of forecasting accuracy and inform model selection in science and technology trend prediction.
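The five metrics can be computed together from paired actual and predicted values. The sketch below uses their standard definitions; the function name and the sample numbers are illustrative and are not results from this study.

```python
import math


def forecast_metrics(actual, predicted):
    """Return MAE, MSE, RMSE, MAPE (in percent) and R² for paired observations."""
    n = len(actual)
    if n == 0 or n != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [a - p for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    # MAPE is undefined when an actual value is zero; this sketch assumes none are.
    mape = 100 * sum(abs(e / a) for e, a in zip(errors, actual)) / n
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    ss_res = sum(e * e for e in errors)
    # R² is undefined for a constant series (ss_tot == 0); assumed not to occur here.
    r2 = 1 - ss_res / ss_tot
    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "MAPE": mape, "R2": r2}


scores = forecast_metrics([10, 20, 30, 40], [12, 18, 33, 39])
```

For the sample data the errors are −2, 2, −3, and 1, giving MAE = 2, MSE = 4.5, MAPE = 10.625%, and R² = 0.964.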

3.4. Identification of Frontier Interdisciplinary Fields

To identify frontier interdisciplinary domains, this study integrates three dimensions—interdisciplinarity, frontierness, and development trend—into a unified analytical framework. Based on threshold values for each dimension, domains are classified into four types (Figure 2), enabling a structured interpretation of their innovative potential and scientific positioning.

The conceptual function for determining frontier interdisciplinary fields is defined as follows:

F(x, y, z) = \begin{cases} R_{1}, & \text{if } x < \theta_{x},\ y < \theta_{y},\ z < \theta_{z} \\ R_{2}, & \text{if } x < \theta_{x},\ y \geq \theta_{y},\ z < \theta_{z} \\ R_{3}, & \text{if } x \geq \theta_{x},\ y < \theta_{y},\ z \geq \theta_{z} \\ R_{4}, & \text{if } x \geq \theta_{x},\ y \geq \theta_{y},\ z \geq \theta_{z} \end{cases}

(9)

where

x

: interdisciplinarity,

y

: frontierness,

z

: future potential. The thresholds θ_x, θ_y, and

θ_{z}

distinguish the following domain types:

(1) R₁, Traditional Domains: Low in all three dimensions, representing mature fields with established frameworks but limited growth and integration, such as classical thermodynamics.

(2) R₂, Traditional Frontier Domains: High frontierness but low interdisciplinarity and future momentum, often focused on deepening existing core knowledge.

(3) R₃, Emerging Interdisciplinary Domains: High interdisciplinarity and future potential but low current frontierness. These areas are undergoing theoretical accumulation and cross-domain integration, e.g., agriculture + big data.

(4) R₄, Frontier Interdisciplinary Domains: High across all three dimensions. These domains are deeply integrated, highly innovative, and poised for continued expansion; examples include bioinformatics, quantum computing, and AI.

The classification into R₁–R₄ research types is built on a conceptual thresholding function, using θ_x, θ_y, and θ_z to denote decision boundaries along interdisciplinarity, frontierness, and trend potential axes, respectively. In this study, we do not implement fixed numerical thresholds, but rather present a logic-based typology that can be adapted in future applications. These thresholds can be parameterized using statistical (e.g., percentile cutoffs), expert-driven, or clustering-based methods depending on specific analytical contexts.
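The thresholding logic of Equation (9) can be sketched in a few lines. The example below is only illustrative: the function names `classify_domain` and `percentile_threshold` are hypothetical, and the nearest-rank percentile is just one of the parameterization options mentioned above (the study itself fixes no numerical thresholds). Equation (9) covers four of the eight possible high/low combinations, so the sketch returns `None` for the remaining ones rather than guessing a type.

```python
import math


def percentile_threshold(values, q):
    """Nearest-rank percentile of `values` for q in (0, 100] (one possible cutoff rule)."""
    if not values or not 0 < q <= 100:
        raise ValueError("need a non-empty list and 0 < q <= 100")
    ordered = sorted(values)
    rank = max(1, math.ceil(q / 100 * len(ordered)))
    return ordered[rank - 1]


def classify_domain(x, y, z, theta_x, theta_y, theta_z):
    """Map (interdisciplinarity, frontierness, future potential) to R1-R4 per Equation (9)."""
    is_high = (x >= theta_x, y >= theta_y, z >= theta_z)
    regions = {
        (False, False, False): "R1",  # traditional domain
        (False, True, False): "R2",   # traditional frontier domain
        (True, False, True): "R3",    # emerging interdisciplinary domain
        (True, True, True): "R4",     # frontier interdisciplinary domain
    }
    # Combinations not listed in Equation (9) are left unclassified.
    return regions.get(is_high)
```

For instance, with all three thresholds set to 0.5, a domain scoring (0.8, 0.9, 0.7) falls in R₄, while a domain scoring (0.8, 0.2, 0.1) matches none of the four defined cases.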

This framework provides a scalable approach for classifying emerging research domains and supporting evidence-based science policy and information management decisions.

4. Empirical Analysis

4.1. Data Source

This study uses synthetic biology as a representative domain to validate the proposed AI-assisted framework for identifying and predicting frontier interdisciplinary fields. As a rapidly evolving area, synthetic biology integrates biology, engineering, and information science, with expanding applications in medicine, agriculture, and environmental systems.

We retrieved data from the Web of Science Core Collection, focusing on articles and reviews published between 2015 and October 2024. The SCI-EXPANDED and SSCI databases were selected for their high-quality coverage and robust citation metadata. A total of 13,683 relevant records were obtained after screening.

The 2024 data used in this study include publications up to October only, due to the timing of data retrieval. While this may lead to a slight underestimation of full-year output, it does not compromise the methodological validity of the proposed framework. On the contrary, one of the framework’s key strengths is its ability to operate with partial-year or real-time data, enabling timely identification and forecasting of frontier fields without the need to wait for complete annual records.

To ensure comprehensive topic coverage, a tailored search strategy targeting key terms in synthetic biology was applied. The full query is provided in File S1.

4.2. Interdisciplinarity Identification

4.2.1. Disciplinary Classification and Interdisciplinary Scale Analysis

The ChatGPT-based (GPT-3.5-Turbo) literature classification system performed well across evaluation metrics, reaching an average precision of 0.850, recall of 0.801, and F1 score of 0.835 across the 19 disciplinary categories. These results indicate that the model captures both explicit and implicit disciplinary signals and provides multi-label classification reliable enough for interdisciplinary analysis. Overall, the system's performance aligns with commonly accepted benchmarks for high-quality multi-label classification tasks in literature analysis.

Building on this robust classification foundation, we analyzed the temporal evolution of disciplinary composition in synthetic biology. Figure 3 provides a longitudinal heatmap of publication counts by discipline from 2015 to 2024. The results reveal a clear trajectory of interdisciplinary convergence:

Phase I: Foundational Phase (2015–2016)—Biology dominates the field, contributing over 60% of total publications. This reflects the early-stage emphasis on core biological processes, such as gene editing, metabolic pathways, and synthetic constructs.

Phase II: Expansion Phase (2017–2022)—A marked rise is observed in contributions from Computer Science, Environmental Science, and Engineering. The surge in Computer Science correlates with the integration of machine learning and bio-design automation. Environmental Science contributes to risk assessment and sustainable deployment, indicating the field’s expanding application scope.

Phase III: Convergence Phase (2023–2024)—The disciplinary composition stabilizes, reflecting a matured interdisciplinary system. Biology remains central, but the sustained presence of Computer Science, Medicine, and Engineering points to established role-sharing: biology for foundational knowledge, CS for modeling and automation, and engineering for system integration.

Importantly, less traditional disciplines—notably Law, Ethics, and Philosophy—begin to appear more consistently post-2020. Their increasing visibility aligns with the growing recognition of biosafety, governance, and public acceptability issues, signaling a shift toward responsible innovation in synthetic biology.

These results collectively illustrate that synthetic biology is no longer a domain confined to life sciences. Instead, it is evolving into a socio-technical enterprise that increasingly draws from diverse knowledge systems, reinforcing its classification as a frontier interdisciplinary field.

Synthetic biology, as a highly interdisciplinary field, has seen notable shifts in its integration patterns over the past decade. As shown in Figure 4, interdisciplinary publications began gaining momentum in 2016, followed by rapid growth until 2022. Although publication counts declined slightly after peaking in 2022, they remained high, underscoring the sustained vitality of cross-disciplinary collaboration.

The proportion of interdisciplinary papers also steadily increased from 2016 to 2022, reflecting the expanding role of external fields such as Computer Science, Engineering, and Environmental Science in synthetic biology. These disciplines have supported advances in bioinformatics, system design, and ecological assessment, respectively.

Post-2022, the proportion of interdisciplinary work remained elevated despite a drop in total publications, indicating a stabilization phase in the field’s integration trajectory. Overall, synthetic biology has firmly established itself as a typical interdisciplinary domain, with continued expansion expected as new disciplines engage in its development.
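The yearly counts and proportions discussed above can be derived directly from multi-label classification output. The sketch below assumes, for illustration only, that a paper counts as interdisciplinary when it carries two or more distinct discipline labels; the function name `interdisciplinary_share` and the input layout are hypothetical.

```python
from collections import defaultdict


def interdisciplinary_share(records):
    """Return {year: (interdisciplinary_count, proportion)} from (year, labels) pairs.

    A record is treated as interdisciplinary when it has at least two
    distinct discipline labels (an assumption made for this sketch).
    """
    totals = defaultdict(int)
    cross = defaultdict(int)
    for year, labels in records:
        totals[year] += 1
        if len(set(labels)) >= 2:
            cross[year] += 1
    return {year: (cross[year], cross[year] / totals[year]) for year in sorted(totals)}
```

Tracking the count and the proportion separately matters here: after 2022 the total output fell while the share of interdisciplinary work stayed high, a pattern that would be hidden by either series alone.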

4.2.2. Analysis of Interdisciplinary Integration Strength

The interdisciplinary nature of synthetic biology is reflected not only in the growth of interdisciplinary publication numbers but also in changes in interdisciplinary integration strength. Figure 5 shows the trends in the Interdisciplinary Index (I) and the number of involved disciplines (Discipline Counts) from January 2015 to October 2024.

Overall, the interdisciplinary strength of synthetic biology has exhibited relatively small fluctuations over the past decade, consistently maintaining a value around 0.7, indicating a stable level of disciplinary integration.

Specifically, the interdisciplinary strength was approximately 0.7 in 2016 and rose slightly through 2018. It then declined slightly between 2018 and 2020, returning to around 0.7, and has remained stable at that level through 2024.

These changes suggest that, although integration strength fluctuates in the short term, disciplinary integration in synthetic biology has become structurally stable, with relatively steady collaborative relationships across fields.

Meanwhile, the trend in discipline counts also reflects the breadth of interdisciplinary integration. In 2016, approximately 16 disciplines were involved in synthetic biology research. This number increased significantly between 2016 and 2018, and stabilized around 17 disciplines during 2018–2020.

This period marked a deepening of multidisciplinary integration, as more fields became actively engaged in synthetic biology research. However, after 2022, a downward trend in discipline counts appeared, with the number falling back to about 16 disciplines by 2024. This suggests that although the breadth of interdisciplinary involvement expanded from 2016 to 2022, in recent years, the intensity of interest from certain disciplines may have declined, leading to a slight contraction in the scope of interdisciplinary engagement.

In summary, although the number of involved disciplines has slightly decreased in recent years, the interdisciplinary integration strength has remained stable. This reflects that synthetic biology continues to maintain a high level of disciplinary fusion.

The stability of interdisciplinary strength indicates that the core disciplines have established mature and deep collaborative frameworks. Moving forward, the development of synthetic biology may rely more on the deep integration among existing disciplines rather than the large-scale influx of new disciplines.

4.2.3. Overall Network Analysis of Interdisciplinary Connections

Figure 6 visualizes the disciplinary co-occurrence network in synthetic biology, highlighting the structural characteristics of interdisciplinary collaboration in the field.

At the core of the network is biology, which holds the largest node and highest connectivity. This reflects its foundational role and its extensive interaction with other disciplines.

Computer science is closely linked to biology, driven by the rise of computational biology, bioinformatics, and AI-assisted research. Environmental science and sociology also maintain strong ties with biology, contributing to ecological risk assessments and societal impact evaluations.

Other disciplines such as medicine, engineering, and neuroscience connect through direct or intermediate paths. Linguistics, linked via computer science, reflects the growing role of natural language processing in biological knowledge mining.

Peripheral fields such as law, philosophy, and political science appear less frequently but remain important in shaping ethical frameworks, regulatory policies, and public engagement. Ethical debates on genome editing, regulatory frameworks for biosafety, and legal mechanisms for intellectual property all affect funding directions, institutional behavior, and even the public legitimacy of emerging technologies. The appearance of these fields in the disciplinary network and semantic classification suggests that frontier science is increasingly embedded within broader societal dialogues, where governance structures co-evolve with technical innovation. This co-evolutionary dynamic deserves further attention in future studies of frontier field development.

This network structure provides three key insights: (1) Synthetic biology is not merely interdisciplinary—it is polycentric, involving diverse epistemic communities that contribute at different layers (technical, methodological, institutional). (2) Interdisciplinary integration is asymmetric: biology collaborates intensively with computational and environmental fields, while legal, philosophical, and political domains maintain functional but looser ties. (3) The presence of peripheral social sciences signals an ongoing shift toward responsible and anticipatory governance, which will likely intensify as applications of synthetic biology broaden.

Overall, the network demonstrates a biology-centered, multi-level integration structure, with core disciplines enabling intensive collaboration and peripheral ones providing institutional and ethical context. As synthetic biology advances, deeper interdisciplinary expansion is expected.

4.2.4. Evolutionary Analysis of the Interdisciplinary Network

The evolution of interdisciplinary research in synthetic biology is reflected in the structural dynamics of its disciplinary co-occurrence network (Figure 7). Key topological metrics from 2015 to 2024 (Table S1) reveal distinct shifts in connectivity and collaboration intensity over the past decade.

From 2015 to 2019, network density remained relatively stable (0.40–0.44), indicating consistent interdisciplinary linkages. In 2020, however, both density and average degree rose sharply—reaching 0.57 and 9.06 respectively—suggesting intensified disciplinary interaction, likely catalyzed by the global COVID-19 crisis. While these values slightly declined in subsequent years, they remained elevated, pointing to a lasting structural shift.

The clustering coefficient remained high throughout the period (0.76–0.81), implying stable subgroup formation and strong internal cohesion within the interdisciplinary network. Meanwhile, the network maintained a short diameter (2–3) and average path length (1.43–1.62), reflecting efficient disciplinary connectivity.

The number of disciplines and their interactions also increased, expanding from 17 nodes and 55 edges in 2015 to 18 nodes and 68 edges by 2024. This growth highlights not only the diversification of participating fields but also the intensification of interdisciplinary collaboration.
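For readers who want to reproduce the basic topological indicators, the sketch below builds an undirected co-occurrence network from per-paper discipline labels and computes density and average degree with the standard formulas 2E / (N(N - 1)) and 2E / N. Treating the network as a simple undirected graph, and the function names used here, are assumptions for illustration.

```python
from itertools import combinations


def cooccurrence_network(label_sets):
    """Return (nodes, edges); an edge joins two disciplines assigned to the same paper."""
    nodes, edges = set(), set()
    for labels in label_sets:
        unique = sorted(set(labels))
        nodes.update(unique)
        # Sorting first gives each pair a single canonical ordering.
        edges.update(combinations(unique, 2))
    return nodes, edges


def density(n_nodes, n_edges):
    """Share of possible undirected links that are present."""
    if n_nodes < 2:
        return 0.0
    return 2 * n_edges / (n_nodes * (n_nodes - 1))


def average_degree(n_nodes, n_edges):
    """Mean number of links per discipline."""
    if n_nodes == 0:
        return 0.0
    return 2 * n_edges / n_nodes
```

With the 17 nodes and 55 edges reported for 2015, this formula gives a density of about 0.40, in line with the 0.40–0.44 range reported for 2015–2019.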

Overall, 2020 marked a structural turning point. Despite modest fluctuations thereafter, synthetic biology has sustained high levels of integration and connectivity, underscoring its status as a mature and enduringly interdisciplinary field.

4.3. Frontierness Analysis of the Field

Synthetic biology demonstrates strong frontier characteristics, reflected in its novelty, growth, and impact indicators.

4.3.1. Novelty

Based on Equation (6), the novelty score of synthetic biology is 2019.860, indicating that research hotspots in the field are highly concentrated in recent years, reflecting strong temporal novelty.

This is primarily attributed to breakthroughs in key technologies, such as CRISPR gene editing, automated laboratory platforms, and AI-assisted synthetic pathway optimization, which have continually pushed the research frontiers forward. The high novelty score suggests that synthetic biology remains in a phase of rapid evolution, with the potential for further technological and application breakthroughs in the future (Table 3).

Compared to other research topics, synthetic biology’s novelty score is significantly higher than that of Information Technology and Services (2006.220) and Algorithm Optimization (2011.191), and is close to the scores for emerging cross-disciplinary technologies such as Medical Data Modeling (2021.260) and Machine Learning and Deep Learning (2021.219) [71], highlighting its position at the forefront of research.
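Equation (6) is defined earlier in the paper and is not restated in this section. Scores such as 2019.860 fall inside the publication window, which is consistent with a (possibly weighted) average publication year; the sketch below adopts that reading purely as an assumption to show how such a score behaves. The function name `mean_publication_year` and the optional weights are hypothetical.

```python
def mean_publication_year(years, weights=None):
    """Weighted average of publication years (assumed reading of a novelty score)."""
    if not years:
        raise ValueError("at least one publication year is required")
    if weights is None:
        weights = [1.0] * len(years)
    if len(weights) != len(years):
        raise ValueError("years and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(y * w for y, w in zip(years, weights)) / total
```

Under this reading, a higher score means the body of work is concentrated in more recent years, which is how the comparison with the older topics above is interpreted.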

4.3.2. Growth

Based on Equation (7), the growth rate of synthetic biology is calculated as 4.46%, indicating steady research expansion. Although its growth rate is not as explosive as that of emerging technologies such as artificial intelligence, it maintains a healthy upward trend. Such stable growth typically suggests that the field has entered a sustainable development phase, supported by a stable research community and a gradually maturing technological system. This trend is closely linked to the application-driven nature of synthetic biology, with advances in biomanufacturing, precision medicine, and agricultural biotechnology driving the steady translation of research results into practical applications.

4.3.3. Impact

Based on Equation (8), the impact score of synthetic biology is 25.03, positioning it at a moderate-to-high level among various research fields. Although lower than Information Technology and Services (55.73) and Scientometrics (43.08), it is notably higher than Medical Data Modeling (10.73) and Machine Learning and Deep Learning (11.12) [71]. This indicates that synthetic biology has achieved considerable academic influence, policy support, and patent activity.

The strong impact of the field is largely due to its critical position within a multidisciplinary context, particularly through its deep integration with fields such as biomedicine, agricultural engineering, and environmental remediation, thus enabling its research outcomes to generate broad and lasting value in real-world applications.

4.3.4. Field Potential

Synthetic biology is fundamentally a technology- and application-driven discipline, propelled by advances in gene editing, bioengineering methodologies, and computational simulation technologies. In recent years, the maturation of tools like CRISPR has contributed significantly to the simultaneous growth in publication output and academic influence within the field. The field’s interdisciplinary nature is another crucial factor supporting its continuous expansion. By integrating Biology, Engineering, Computer Science, and other disciplines, synthetic biology continually explores new research directions and enhances its academic impact.

Based on the above analyses, synthetic biology exhibits substantial potential for future growth, particularly in cross-disciplinary applications such as AI-assisted gene design, computational modeling for pathway optimization, and intelligent biomanufacturing processes.

Although the current growth rate (4.46%) is moderate, the field's high impact score (25.03) and high novelty score (2019.860) point to the high quality of ongoing research, suggesting that more breakthrough innovations are likely to emerge. Furthermore, as industry interest in synthetic biology continues to rise, the trend toward integrating academic research with industrial applications is expected to strengthen, providing additional momentum for field development. Consequently, synthetic biology is likely to maintain strong frontier characteristics and exert profound scientific and societal impacts across multiple domains in the future.

4.4. Trend Analysis of the Field

4.4.1. Model Construction and Error Evaluation

To forecast synthetic biology’s research trajectory, five models were implemented: three deep learning models (LSTM, GRU, Transformer) and two baseline models (Random Forest and Linear Regression). All models were trained using TensorFlow 2.6.0 and Keras 2.9.0, and evaluated using standard error metrics (MAE, MSE, RMSE, MAPE, R²).

As shown in Figure 8, deep learning models outperformed traditional ones. The Transformer exhibited the lowest and most stable error values, suggesting strong generalization performance. In contrast, LSTM and GRU showed fluctuating MAPE curves, indicating potential instability. Random Forest and Linear Regression failed to capture the underlying nonlinear temporal dynamics.

4.4.2. Model Comparison and Best Model Selection

Table 4 summarizes the prediction performance of all models on the test set using the five metrics: MAE, MSE, RMSE, MAPE, and R². Among all models, the Transformer achieved the best overall performance, with the lowest values on all four error metrics and a high R² of 0.96, indicating excellent predictive accuracy and data fitting.

The LSTM ranked second with relatively low error values but showed a limited capacity for generalization (R² = 0.23). GRU exhibited higher error rates, particularly underperforming in MAPE and R². Traditional models such as Random Forest and Linear Regression failed to capture the nonlinear patterns inherent in synthetic biology trends, performing poorly across all metrics.

To further validate model effectiveness, Figure 9 compares historical publication data with forecasts from each model. The Transformer aligned most closely with the observed slowdown in growth after 2022, projecting steady expansion rather than an explosive increase. In contrast, LSTM predicted exponential growth, likely overfitting early acceleration trends. GRU showed more conservative forecasts, while Random Forest and Linear Regression projected near-flat growth, clearly underestimating the field’s dynamics. These results underscore the Transformer’s superior ability to capture complex temporal patterns in emerging research domains.
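To make the role of the simplest baseline concrete, the sketch below fits an ordinary least-squares trend line to yearly publication counts and extrapolates it. It is a minimal stand-in for the Linear Regression baseline, not the study's implementation; the function names are hypothetical and any counts passed to them in examples are invented for illustration rather than taken from the dataset.

```python
def fit_linear_trend(years, counts):
    """Ordinary least-squares fit of counts = slope * year + intercept."""
    if len(years) != len(counts) or len(years) < 2:
        raise ValueError("need at least two (year, count) pairs of equal length")
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(counts) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    if sxx == 0:
        raise ValueError("years must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, counts))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def forecast(slope, intercept, future_years):
    """Extrapolate the fitted line to the given years."""
    return [slope * year + intercept for year in future_years]
```

A single straight line cannot represent acceleration followed by a slowdown, which is one plausible reason a baseline of this kind tracks the post-2022 change in growth poorly.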

4.5. Identification of Frontier Interdisciplinary Domain

Based on the interdisciplinarity–frontierness–trend framework proposed in this study (Figure 2), synthetic biology can be identified as a typical frontier interdisciplinary domain.

4.5.1. Interdisciplinarity and Frontierness

Synthetic biology integrates knowledge from biology, computer science, engineering, environmental science, and other disciplines, reflecting strong interdisciplinary collaboration. Research in areas such as AI-assisted gene design, computational biology, and biomanufacturing has reinforced its role as a convergence hub for diverse technologies.

From a frontierness perspective, synthetic biology demonstrates both novelty and impact: it features recent high-intensity research activity and increasing academic and industrial attention around topics like CRISPR, synthetic circuits, and artificial cells. These indicators confirm its status as a highly innovative and scientifically forward-looking field.

4.5.2. Trend Outlook and Strategic Role

After 2010, synthetic biology entered a phase of accelerated growth, peaking around 2020. Forecasting results, especially those from the Transformer model, suggest sustained upward momentum, with new opportunities emerging at the intersection of synthetic biology and artificial intelligence. Applications in agriculture, medicine, and environmental engineering are also expanding, reinforcing the domain's practical relevance.

4.5.3. Final Positioning

Synthesizing findings across all three dimensions, synthetic biology clearly qualifies as a frontier interdisciplinary domain. It combines a high degree of disciplinary integration with robust innovation capacity and stable development potential. As such, it is well positioned to drive cross-disciplinary technological transformation and contribute strategically to the future landscape of science, policy, and industry.

5. Discussion

This study constructs an integrated framework combining large language models, interdisciplinary metrics, and time series forecasting to identify and predict frontier interdisciplinary domains. By applying this framework to synthetic biology, we demonstrate its capability in extracting disciplinary patterns, measuring frontierness, and forecasting future trajectories. The discussion unfolds from both methodological and domain-specific perspectives.

5.1. Effectiveness of the Methodological Framework and Forecasting Algorithms

In the context of information management and science foresight, accurately identifying disciplinary frontiers and anticipating their development is a growing priority [72]. Our proposed framework addresses this need by combining semantic classification, network-based interdisciplinary indicators, and predictive modeling.

Empirical validation shows that the framework can differentiate disciplinary evolution types and offers scalable tools for research monitoring and strategic planning. Although synthetic biology is used as a test case, the methodology is adaptable to other domains, supporting broader policy and research agenda-setting.

Among the forecasting models compared, the Transformer significantly outperformed LSTM, GRU, Random Forest, and Linear Regression in both short- and long-term trend prediction, confirming its value in modeling complex nonlinear dynamics [73].

5.2. Interdisciplinary Integration and Frontier Characteristics of Synthetic Biology

Synthetic biology exemplifies a frontier interdisciplinary field, bringing together biology, computer science, engineering, environmental science, and medicine. Recent advances have driven the field toward greater automation and intelligence [74], with rapid progress in AI-assisted gene design [75] and automated biomanufacturing [76]. Its integration with environmental and medical sciences has enabled impactful applications in ecological restoration, precision medicine, and bioenergy. From a frontier perspective, the field shows high novelty and wide academic recognition. Research hotspots such as CRISPR, synthetic circuits, and artificial cells highlight its leadership in technological innovation.

Overall, synthetic biology demonstrates strong interdisciplinarity and transformative potential, reinforcing its position as a priority domain in science policy and innovation management.

5.3. Development Trends and Future Growth Forecast

Historical data and model forecasts show that synthetic biology entered a rapid growth phase after 2010, peaking around 2020. In the past two years, however, growth has slightly slowed—possibly due to resource constraints, policy regulations, or technical bottlenecks. According to the Transformer-based forecast, the field is expected to continue steady expansion, though the growth rate may stabilize, suggesting a transition into a phase of technological refinement and application-oriented expansion. Additionally, based on the frontier-interdisciplinarity-trend classification framework, future research hotspots in synthetic biology may continue to extend toward multi-domain integration. The convergence of synthetic biology with areas such as artificial intelligence, biomedical engineering, and green technology is likely to reshape its innovation ecosystem, reinforcing its strategic role in both academic and industrial contexts.

5.4. Challenges, Limitations, and Future Directions

Despite its growing influence, synthetic biology faces challenges in technology development, ethical governance, and industrial translation. Technologically, core capabilities such as gene editing, synthetic pathway optimization, and artificial cell construction remain limited in stability, scalability, and reproducibility [77,78]. While AI has begun to assist gene design and modeling [79], enhancing biological interpretability and experimental feasibility remains a key bottleneck [80].

Ethical and biosafety concerns are intensifying as synthetic biology advances. Issues surrounding artificial life and genetic modification require stronger governance in biosafety, ethics, and ecological risk [81,82,83].

On the policy side, industrial adoption faces regulatory fragmentation, cost barriers, and slow market translation [84]. Harmonizing global standards and ethical frameworks is urgently needed [84,85,86].

GPT-3.5-Turbo was selected for its accessibility and stability. Despite its strong performance, the ChatGPT-based classification system raises concerns regarding hallucinations [87], prompt sensitivity [88], and interpretability. In some cases, the model generated field labels not present in standard taxonomies (e.g., WOS), reflecting its tendency to infer or “hallucinate” plausible categories [89]. Moreover, slight variations in prompt wording can lead to inconsistent outputs, particularly for interdisciplinary abstracts. Although temperature was fixed at 0 to reduce output randomness [90], such behaviors highlight the need for robustness testing, ensemble prompting, and possibly fine-tuning in future implementations. Future work could also explore newer models (e.g., GPT-4, DeepSeek, Gemini) for improved robustness, reasoning, and domain adaptability. We consider these factors important for responsible use of LLMs in scientific evaluation workflows.
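One simple safeguard against labels that fall outside the taxonomy is to validate each model response against the allowed category list before it enters the analysis. The sketch below is a hypothetical post-processing step, not part of the study's pipeline; it assumes the model returns a comma-separated string of labels, and the function name `split_labels` is invented for this example.

```python
def split_labels(raw_output, allowed):
    """Split a comma-separated label string into (accepted, rejected) lists.

    Matching is case-insensitive; accepted labels are returned in the
    taxonomy's own spelling and without duplicates.
    """
    canonical = {name.lower(): name for name in allowed}
    accepted, rejected = [], []
    for part in raw_output.split(","):
        label = part.strip()
        if not label:
            continue
        match = canonical.get(label.lower())
        if match is None:
            rejected.append(label)  # not in the taxonomy: flag for review
        elif match not in accepted:
            accepted.append(match)
    return accepted, rejected
```

Logging the rejected labels, rather than silently discarding them, would also give a running estimate of how often out-of-taxonomy categories occur.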

While the Transformer model demonstrated superior forecasting accuracy based on standard metrics (MAE, MSE, RMSE, MAPE, R²), its practical utility in supporting strategic foresight also depends on robustness to data perturbation. In real-world settings, historical research trajectories may be noisy, incomplete, or distorted by lagging indicators (e.g., delays in publication or indexing [91]). A small shift in forecasted growth curves, especially near inflection points, could influence funding strategies, institutional planning, or technology prioritization.

Although this study did not include a formal sensitivity analysis, future work will explore the stability of forecasting outputs under simulated data noise, varying time horizons, and missing data. Such robustness checks are essential to ensure responsible deployment of trend analysis tools in policy or investment contexts.
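A robustness check of the kind described could take the following form: perturb the historical series with small random noise, rerun the forecaster, and summarize how far the forecasts spread. This is a sketch under stated assumptions, not an analysis performed in this study. The function names, the uniform multiplicative noise model, and the toy `last_step_growth` forecaster are all hypothetical; any model that maps a series to a one-step forecast could be passed in instead.

```python
import random
import statistics


def noise_sensitivity(series, forecaster, noise_level=0.05, trials=200, seed=0):
    """Summarize one-step forecasts under uniform multiplicative noise on the inputs."""
    if trials < 2:
        raise ValueError("at least two trials are needed to estimate spread")
    rng = random.Random(seed)  # fixed seed so the check is repeatable
    outcomes = []
    for _ in range(trials):
        noisy = [v * (1 + rng.uniform(-noise_level, noise_level)) for v in series]
        outcomes.append(forecaster(noisy))
    return {
        "baseline": forecaster(series),
        "mean": statistics.mean(outcomes),
        "stdev": statistics.stdev(outcomes),
        "min": min(outcomes),
        "max": max(outcomes),
    }


def last_step_growth(series):
    """Toy forecaster: extend the series by its most recent year-over-year change."""
    return series[-1] + (series[-1] - series[-2])
```

A forecast whose spread under modest noise is wide relative to the baseline value would warrant caution before it informs funding or planning decisions.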

This study, while offering a robust framework, is limited by its reliance on bibliometric data and assumptions inherent to the modeling approaches. Future work should incorporate multi-source datasets—including patents, policy texts, and industrial records—to improve the precision of trend forecasts and assess interdisciplinary spillovers. Expanding the framework to other emerging domains would further validate its generalizability and enhance its utility for information management and policy intelligence.

The results demonstrate that our integrated framework not only enables the identification of interdisciplinary research areas but also facilitates their dynamic monitoring and future projection. Compared with traditional methods such as LDA-based topic modeling and citation burst detection, our framework incorporates semantic classification, multi-dimensional indicators, and deep learning models, which together offer a more robust and forward-looking approach. Notably, the Transformer model showed superior accuracy and stability in forecasting publication trends, likely due to its ability to capture long-range dependencies and nonlinear growth patterns. The rising indicator curves of interdisciplinarity, novelty, and growth in synthetic biology further validate its status as a frontier interdisciplinary domain. This triangulation across multiple dimensions improves both the reliability and interpretability of the results, supporting data-driven decision-making in research, policy, and innovation management.

6. Conclusions

This study proposed an integrated framework for identifying and predicting frontier interdisciplinary fields by combining large language model-based semantic classification, multidimensional indicator evaluation, and time series forecasting. Using synthetic biology as a case study, we demonstrated that the framework effectively captures both the structural and temporal characteristics of emerging domains.

Key findings include the following: (1) ChatGPT-based disciplinary classification enables flexible, high-granularity field labeling; (2) the multidimensional evaluation system, comprising interdisciplinarity, novelty, growth, impact, and trend, provides a comprehensive lens for characterizing research frontiers; and (3) the Transformer model outperforms other forecasting methods in predicting future publication trends, confirming its suitability for modeling nonlinear scientific growth.

The proposed framework offers methodological advancements in science mapping, with practical implications for research management, strategic funding, and innovation policy. Future work could extend the model by incorporating additional data modalities (e.g., patents, policy documents), testing on other emerging fields, and integrating multimodal large language models for deeper semantic analysis.

Future research will apply the framework to additional interdisciplinary domains (e.g., quantum computing, bioinformatics) to further assess its generalizability and scalability.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/systems13080677/s1, Supplementary File S1: Search Query Used in the Web of Science Core Collection; Table S1: Evolution of Key Topological Indicators in the Synthetic Biology Interdisciplinary Network (2015–2024).

Author Contributions

Y.W.: Conceptualization, Methodology, Writing—original draft. R.Y.: Investigation, Data curation. J.W.: Writing—Review and Editing. Q.L.: Writing—Review and Editing. X.Z.: Methodology, Funding acquisition, Supervision. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Office for Philosophy and Social Sciences, Beijing, China [Grant number 23&ZD225 2023].

Data Availability Statement

The data that support the findings of this study are available upon request to the corresponding author.

Acknowledgments

ChatGPT was used for language polishing to improve the readability of this article.

Conflicts of Interest

The authors declare no conflicts of interest.

Figure 1. Research framework diagram. The framework includes five steps: (1) multi-label disciplinary classification via GPT-3.5-Turbo; (2) interdisciplinarity assessment using structural and network metrics; (3) frontier potential evaluation through growth, influence, and novelty indicators; (4) trend forecasting using five time series models; and (5) identification of high-potential frontier domains through integrated analysis.

Figure 2. Framework for identifying frontier interdisciplinary fields. The diagram illustrates a classification model based on three dimensions: interdisciplinarity (horizontal axis), frontierness (vertical axis), and future development trend (indicated by arrows). The four quadrants represent distinct domain types: R₁—Traditional Domains, R₂—Traditional Frontier Domains, R₃—Emerging Interdisciplinary Domains, and R₄—Frontier Interdisciplinary Domains. This framework enables structured identification of high-potential research areas.
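
The quadrant logic described in the Figure 2 caption can be sketched in a few lines of Python. This is an illustration only: the function name `classify_domain`, the assumption that both scores are normalized to [0, 1], and the 0.5 cut-offs are hypothetical and not taken from the study. The assignment of R₂ and R₃ to the axes is inferred from the quadrant names.

```python
from dataclasses import dataclass

# Hypothetical cut-offs; the caption does not state numeric thresholds.
INTERDISCIPLINARITY_CUTOFF = 0.5
FRONTIERNESS_CUTOFF = 0.5


@dataclass(frozen=True)
class DomainScores:
    interdisciplinarity: float  # horizontal axis, assumed normalized to [0, 1]
    frontierness: float         # vertical axis, assumed normalized to [0, 1]


def classify_domain(scores: DomainScores) -> str:
    """Map a pair of scores to one of the four quadrants R1..R4."""
    high_inter = scores.interdisciplinarity >= INTERDISCIPLINARITY_CUTOFF
    high_front = scores.frontierness >= FRONTIERNESS_CUTOFF
    if high_inter and high_front:
        return "R4: Frontier Interdisciplinary Domain"
    if high_inter:
        return "R3: Emerging Interdisciplinary Domain"
    if high_front:
        return "R2: Traditional Frontier Domain"
    return "R1: Traditional Domain"
```

The third dimension in the figure, the future development trend, is not modeled here; in practice it would come from the forecasting step and could be attached to the returned label as a direction of movement between quadrants.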

Figure 3. Annual publication heatmap showing the disciplinary composition of synthetic biology literature from 2015 to 2024. The chart illustrates the transition from a biology-dominated domain to a stabilized, multidisciplinary configuration with increasing participation from computer science, environmental science, law, and ethics.

Figure 4. Proportion of interdisciplinary publications in synthetic biology (2015–2024).

Figure 5. Trends in interdisciplinary integration strength (Interdisciplinary Index, I) and number of involved disciplines in synthetic biology from 2015 to 2024. The figure illustrates the relative stability of interdisciplinary collaboration intensity over the past decade, alongside fluctuations in the breadth of disciplinary participation.

Figure 6. Interdisciplinary co-occurrence network of synthetic biology publications (2015–2024). Node size represents publication volume; edge thickness reflects co-occurrence frequency. The network reveals a biology-centered structure with strong computational integration and emerging governance-oriented peripheries.
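
As a rough sketch of how the quantities encoded in Figure 6 could be derived from multi-label classification output, the following Python snippet counts publications per discipline (node size) and joint occurrences per discipline pair (edge thickness). The function name `build_cooccurrence` and the toy input are assumptions for illustration and are not data from the study.

```python
from collections import Counter
from itertools import combinations


def build_cooccurrence(papers):
    """Return (node_sizes, edge_weights) from per-paper discipline labels.

    node_sizes[d]        -> number of papers labelled with discipline d
    edge_weights[(a, b)] -> number of papers labelled with both a and b,
                            with a < b in alphabetical order
    """
    node_sizes = Counter()
    edge_weights = Counter()
    for labels in papers:
        unique = sorted(set(labels))  # ignore duplicate labels on one paper
        node_sizes.update(unique)
        edge_weights.update(combinations(unique, 2))
    return node_sizes, edge_weights


# Toy input, made up for illustration only.
papers = [
    ["Biology", "Computer Science"],
    ["Biology", "Computer Science", "Ethics"],
    ["Biology"],
    ["Biology", "Law"],
]
node_sizes, edge_weights = build_cooccurrence(papers)
```

Sorting the labels before pairing keeps each undirected edge under a single key, so ("Biology", "Law") and ("Law", "Biology") are never counted separately.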

Figure 7. Disciplinary evolution network of synthetic biology (2015–2024).

Figure 8. Comparison of convergence curves for the deep learning temporal models (Transformer, GRU, LSTM).

Figure 9. Comparison of model predictions across temporal forecasting techniques. The Transformer model best fits the recent publication plateau and predicts steady future growth. In contrast, LSTM forecasts an exponential rise, GRU remains conservative, and the traditional models underestimate the trend, visually confirming the Transformer’s superiority.

Table 1. Comparison of common approaches for frontier field identification.

Method	Advantages	Limitations
Citation Analysis	Reveals knowledge flow; quantifiable; useful for structural mapping	Time lag in citation accumulation; sensitive to self-citations
Keyword/Content Mining	Captures semantic patterns; scalable; identifies emerging terms	Dependent on keyword quality; limited semantic depth; difficulty with synonyms
Topic Modeling (e.g., LDA, BERTopic)	Detects thematic structure; supports trend analysis	Struggles with dynamic evolution; interpretability challenges; topic count sensitivity
Expert Judgment	Leverages domain knowledge; useful for early-stage fields	Subjective bias; lacks scalability; low reproducibility

Table 2. Temporal indicators of interdisciplinary integration strength.

Primary Indicator	Secondary Indicator	Indicator Type	Calculation Method
Temporal Integration Strength	Disciplinary Cohesion	Network Density	Measures the overall connectivity tightness
Temporal Integration Strength	Interdisciplinary Connectivity	Average Degree	Describes the average number of connections per discipline
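
Table 2 names the two indicators without giving formulas. A minimal sketch, assuming the standard definitions for an undirected simple graph with n disciplines and m distinct links (density = 2m / (n(n − 1)), average degree = 2m / n), is shown below. The function names and the per-year input layout are hypothetical.

```python
def network_density(num_nodes: int, num_edges: int) -> float:
    """Share of possible undirected edges that are present: 2m / (n(n-1))."""
    if num_nodes < 2:
        return 0.0
    return 2 * num_edges / (num_nodes * (num_nodes - 1))


def average_degree(num_nodes: int, num_edges: int) -> float:
    """Mean number of connections per discipline: 2m / n."""
    if num_nodes == 0:
        return 0.0
    return 2 * num_edges / num_nodes


def yearly_indicators(edges_by_year):
    """Compute (density, average degree) for each year's discipline pairs.

    Only disciplines that appear in at least one pair are counted as nodes,
    so isolated disciplines are ignored in this sketch.
    """
    result = {}
    for year, edges in edges_by_year.items():
        # Deduplicate pairs and drop self-loops to get a simple graph.
        unique_edges = {tuple(sorted(e)) for e in edges if e[0] != e[1]}
        nodes = {d for e in unique_edges for d in e}
        n, m = len(nodes), len(unique_edges)
        result[year] = (network_density(n, m), average_degree(n, m))
    return result
```

Both indicators ignore edge weights here; a weighted variant would need a different normalization, and this sketch makes no claim about which the study used.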

Table 3. Novelty scores of some research areas.

Research Area	Novelty Score
Synthetic Biology	2019.860
Information Technology and Services	2006.220
Algorithm Optimization	2011.191
Medical Data Modeling	2021.260
Machine Learning and Deep Learning	2021.219

Table 4. Performance comparison of forecasting models.

Model	MAE	MSE	RMSE	MAPE/%	R²
LSTM	0.21	0.09	0.30	1.52	0.23
GRU	0.35	0.19	0.43	2.42	0.00
Transformer	0.06	0.00	0.07	0.22	0.96
Random Forest	0.58	0.45	0.67	8.85	0.00
Linear Regression	0.62	0.47	0.69	8.57	0.00
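
For readers who want to reproduce the columns of Table 4 on their own forecasts, the following Python sketch computes the five metrics using their standard textbook definitions. The function name `forecast_metrics` is hypothetical, and the study's exact preprocessing (for example, any scaling applied before scoring) is not specified here and is therefore not reproduced.

```python
import math


def forecast_metrics(actual, predicted):
    """Return MAE, MSE, RMSE, MAPE (in percent) and R^2 for two sequences."""
    n = len(actual)
    if n == 0 or n != len(predicted):
        raise ValueError("actual and predicted must be non-empty and equal in length")

    errors = [a - p for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)

    # MAPE is undefined when an actual value is zero; this sketch assumes none are.
    mape = 100 * sum(abs(e / a) for e, a in zip(errors, actual)) / n

    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    ss_res = sum(e * e for e in errors)
    r2 = 1 - ss_res / ss_tot if ss_tot else float("nan")

    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "MAPE": mape, "R2": r2}
```

Because RMSE is the square root of MSE, an RMSE of 0.07 corresponds to an MSE of about 0.0049, which rounds to the 0.00 reported for the Transformer at two decimal places.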

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Wu, Y.; Lin, Q.; Wu, J.; Yao, R.; Zhang, X. Identification and Prediction Methods for Frontier Interdisciplinary Fields Integrating Large Language Models. Systems 2025, 13, 677. https://doi.org/10.3390/systems13080677
