Open Access Article

Peer-Review Record

Multi-Hop Trajectory Prediction of Aircraft Taxiing Using Spatio-Temporal Knowledge Graph with Vector-Index Support

Electronics 2026, 15(12), 2613; https://doi.org/10.3390/electronics15122613 (registering DOI)

by Jing Shan ¹,*, Jianan Yin ¹, Beijing Zhou ² and Minghua Hu ¹

Reviewer 1: Anonymous

Reviewer 2: Kamal Berahmand

Reviewer 3: Anonymous

Submission received: 6 May 2026 / Revised: 7 June 2026 / Accepted: 9 June 2026 / Published: 12 June 2026

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

This paper presents a vector-index-supported multi-hop prediction method for spatio-temporal knowledge graphs of aircraft taxiing trajectories. There are some issues that need to be addressed.

1) The Methods section contains no figures at all, which is highly inappropriate. Figures illustrating the model architecture and workflow should be added to enrich the content of the paper.

2) The idea of combining the proposed spatio-temporal embedding with IndexIVFFlat has been explored in existing works. What are the specific characteristics and innovations in the combination process of this paper?

3) The paper only deals with static historical trajectories and does not explain how to handle dynamic sequential data. This is inconsistent with the claim of “real-time application”.

4) Only nprobe is verified in the parameter analysis, and the sensitivity analysis of key hyperparameters such as embedding dimension and the number of negative samples is missing. It is recommended to supplement and improve the relevant content.

5) There are numerous spelling and grammatical errors throughout the paper, and some sentences are overly verbose. It is recommended that the language be uniformly polished and revised. In line 453, spati-temporal is misspelled and should be corrected to spatio-temporal.

Author Response

Comments 1: The Methods section contains no figures at all, which is highly inappropriate. Figures illustrating the model architecture and workflow should be added to enrich the content of the paper.

Response 1: Thank you for pointing this out. We agree with the comment. Therefore, we have added an overall architecture diagram (Figure 1) in Section 3 to illustrate the workflow of the proposed method, along with an explanation highlighted in yellow on pages 5 and 6.

Comments 2: The idea of combining the proposed spatio-temporal embedding with IndexIVFFlat has been explored in existing works. What are the specific characteristics and innovations in the combination process of this paper?

Response 2: Thank you for your comments. It should be clarified that existing works have not introduced an indexing mechanism into spatio-temporal knowledge graph embedding. Previous studies have mainly focused on improving the accuracy of single-hop predictions and have never employed IndexIVFFlat for multi-hop acceleration. Therefore, the innovation of this paper does not lie in simply combining embedding with indexing, but rather in achieving, for the first time, the organic synergy among "spatio-temporal knowledge graph embedding, multi-hop flight trajectory prediction, and vector indexing." Specifically, the embedding model is designed specifically for the spatio-temporal semantics of multi-hop paths, the index construction leverages the distribution characteristics of taxiing trajectories to guide clustering partition, and the multi-hop algorithm utilizes this index to perform fast pruning. These three aspects together constitute an end-to-end acceleration framework, rather than a direct splicing of existing techniques.
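For readers unfamiliar with the indexing mechanism discussed here, the following is a minimal pure-Python sketch of the inverted-file (IVF) idea behind IndexIVFFlat: vectors are clustered with k-means, each vector is stored in the list of its nearest centroid, and a query scans only the `nprobe` closest lists. It is an illustration under stated assumptions, not the authors' implementation: the class name `IVFFlatSketch`, the helper names, and the plain Lloyd's k-means are all hypothetical, and the manuscript's spatio-temporal guidance of the clustering step is not modeled.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, nlist, iters=10, seed=0):
    """Plain Lloyd's k-means (illustrative only); returns nlist centroids."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, nlist)]
    for _ in range(iters):
        buckets = [[] for _ in range(nlist)]
        for v in vectors:
            nearest = min(range(nlist), key=lambda c: sq_dist(v, centroids[c]))
            buckets[nearest].append(v)
        for c, members in enumerate(buckets):
            if members:  # keep the previous centroid if a cell became empty
                dim = len(members[0])
                centroids[c] = [
                    sum(m[d] for m in members) / len(members) for d in range(dim)
                ]
    return centroids


class IVFFlatSketch:
    """Hypothetical stand-in for an IVF-Flat index: coarse cells + exact scan."""

    def __init__(self, vectors, nlist, seed=0):
        self.vectors = vectors
        self.centroids = kmeans(vectors, nlist, seed=seed)
        # One inverted list of vector ids per centroid.
        self.lists = [[] for _ in range(nlist)]
        for idx, v in enumerate(vectors):
            cell = min(range(nlist), key=lambda c: sq_dist(v, self.centroids[c]))
            self.lists[cell].append(idx)

    def search(self, query, k, nprobe):
        """Return ids of the k nearest vectors found in the nprobe closest cells."""
        cells = sorted(
            range(len(self.centroids)),
            key=lambda c: sq_dist(query, self.centroids[c]),
        )[:nprobe]
        candidates = [i for c in cells for i in self.lists[c]]
        candidates.sort(key=lambda i: sq_dist(query, self.vectors[i]))
        return candidates[:k]
```

Calling `search` with `nprobe` equal to `nlist` scans every list and therefore reduces to exhaustive search; smaller `nprobe` values scan fewer candidates at the risk of missing true neighbors that fall in unvisited cells.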

Comments 3: The paper only deals with static historical trajectories and does not explain how to handle dynamic sequential data. This is inconsistent with the claim of “real-time application”.

Response 3: Thank you for your comments. We have provided additional clarification on this issue in the revised manuscript. The specific modification can be found in the newly added highlighted paragraph following Equation (3) in Section 3.2.1 on pages 7 and 8.

Comments 4: Only nprobe is verified in the parameter analysis, and the sensitivity analysis of key hyperparameters such as embedding dimension and the number of negative samples is missing. It is recommended to supplement and improve the relevant content.

Response 4: Thank you for your comments. We have supplemented the sensitivity analysis of the embedding dimension and the number of negative samples in the revised manuscript. Please refer to the newly added content in Section 4.3.4 of the revised manuscript on pages 18 and 19, which is highlighted in yellow.

Comments 5: There are numerous spelling and grammatical errors throughout the paper, and some sentences are overly verbose. It is recommended that the language be uniformly polished and revised. In line 453, spati-temporal is misspelled and should be corrected to spatio-temporal.

Response 5: Thank you for your careful review. We will thoroughly polish the language throughout the paper, correct the spelling and grammatical errors, and revise verbose sentences. The typo “spati-temporal” in line 453 has been corrected to “spatio-temporal”.

Reviewer 2 Report

Comments and Suggestions for Authors

The manuscript proposes a vector-index-supported multi-hop prediction framework for spatio-temporal knowledge graphs of aircraft taxiing trajectories. By integrating a spatio-temporal-aware knowledge graph embedding (KGE) model with an IndexIVFFlat-based hierarchical vector index, the method aims to accelerate multi-hop retrieval while preserving prediction accuracy. The core idea—leveraging domain-specific spatio-temporal clustering to partition the embedding space and enable efficient approximate nearest neighbor search—addresses a practical efficiency bottleneck in large-scale aviation trajectory analysis. Experiments on real taxiing datasets and general benchmarks show promising gains in prediction speed with competitive MRR and Hits@K metrics.

1) The spatio-temporal embedding in Equations (2)–(3) combines object projection and temporal rotation in complex space. While rotation via complex multiplication captures cyclic temporal patterns, the additive incorporation of spatial coordinates in multi-hop embeddings lacks a clear geometric interpretation or theoretical justification for preserving trajectory continuity. A more rigorous analysis of the resulting metric space (e.g., distortion bounds under successive rotations and translations) or comparison with hyperbolic/temporal rotation baselines would strengthen the modeling foundation.

2) The distance function in Equation (5) uses L1/L2 norms on combined subject-relation vectors. However, the interaction between temporal rotation and spatial addition is not analyzed for properties such as triangle inequality preservation or sensitivity to noise in timestamp/position sequences. Deriving error bounds for multi-hop composition or proving Lipschitz continuity with respect to small perturbations in 𝑡 or 𝑙 would provide mathematical grounding for the claimed robustness.

3) The IndexIVFFlat construction relies on k-means clustering of spatial vectors (Equation 7). The paper does not discuss convergence guarantees of the clustering step with respect to the underlying trajectory manifold or the impact of initialization on subspace quality. Theoretical analysis of quantization error or empirical evaluation of different clustering objectives (e.g., k-means++ vs. spectral clustering guided by spatio-temporal density) would better justify the partitioning strategy.

4) The loss function (Equation 6) follows a standard margin-based contrastive form. However, with multi-hop queries, the negative sampling strategy (random replacement of answers) may not adequately capture hard negatives along plausible but incorrect trajectory paths. A mathematically motivated negative sampling scheme (e.g., based on geodesic distance in the embedding space or temporal order violation) would improve discriminative power and training stability.

5) Ablation studies isolate the index and multi-hop components but lack statistical rigor. Results should include standard deviations over multiple runs, paired Wilcoxon or t-tests across datasets, and detailed sensitivity analysis of hyperparameters (nlist, nprobe, embedding dimension d) with respect to both accuracy (MRR) and efficiency (query time, memory). Variance analysis would clarify whether observed gains are robust or dataset-specific.

6) The approximate search in Equation (8) trades exactness for speed via nprobe control. While empirical trade-offs are shown, there is no theoretical characterization of recall or approximation ratio as a function of nprobe relative to the true nearest neighbors in the complex vector space. Deriving PAC-style guarantees or using concentration inequalities for the retrieved set would elevate the methodological contribution.

7) Efficiency comparisons focus on prediction time but omit detailed scalability analysis (e.g., index build time, memory footprint as |E| and hop count grow, incremental update cost for streaming trajectories). Given the claimed real-time applicability in aviation monitoring, asymptotic complexity and profiling under increasing data volumes (e.g., 10×–100× more trajectories) are essential.

8) The case studies illustrate qualitative behavior but do not quantitatively link retrieved multi-hop paths to operational metrics such as taxi time prediction error, conflict avoidance, or fuel savings. Stronger downstream evaluation (e.g., integrating predictions into a simulation of ground movement optimization) would better demonstrate practical impact beyond MRR/Hits@K.

9) The positioning and literature review would benefit from citing Knowledge Tracing with A Temporal Hypergraph Memory Network and Predicting Short-Term Bike-Sharing Demand at Station Level: A Multi-Task Dynamic Graph-based Spatiotemporal Approach. These works offer relevant insights into temporal hypergraph modeling and multi-task spatio-temporal graph prediction that could further contextualize the vector-index mechanism and multi-hop trajectory reasoning within broader dynamic graph learning frameworks.

Author Response

Comments 1: The spatio-temporal embedding in Equations (2)–(3) combines object projection and temporal rotation in complex space. While rotation via complex multiplication captures cyclic temporal patterns, the additive incorporation of spatial coordinates in multi-hop embeddings lacks a clear geometric interpretation or theoretical justification for preserving trajectory continuity. A more rigorous analysis of the resulting metric space (e.g., distortion bounds under successive rotations and translations) or comparison with hyperbolic/temporal rotation baselines would strengthen the modeling foundation.

Response 1: Thank you for your comments. We have supplemented the geometric interpretation of the spatial coordinate addition operation in Equation (3) in the revised manuscript. Given the time cost, the comparison with hyperbolic embedding and pure temporal rotation baselines, which would further strengthen the theoretical foundation of our model, will be pursued in future work. For the specific modifications, please refer to the highlighted paragraph following Equation (3) in Section 3.2.1 of the revised manuscript on pages 7 and 8.

Comments 2: The distance function in Equation (5) uses L1/L2 norms on combined subject-relation vectors. However, the interaction between temporal rotation and spatial addition is not analyzed for properties such as triangle inequality preservation or sensitivity to noise in timestamp/position sequences. Deriving error bounds for multi-hop composition or proving Lipschitz continuity with respect to small perturbations in t or l would provide mathematical grounding for the claimed robustness.

Response 2: Thank you for your comments. In response to your concern regarding the insufficient analysis of the interaction between temporal rotation and spatial addition in the distance function, we have provided supplementary analysis in the revised manuscript. Please refer to the highlighted paragraph following Equation (5) in Section 3.2.2 of the revised manuscript on page 8.
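As a hedged illustration of the kind of bound requested in this comment, suppose, purely as an assumption since Equations (3) and (5) are not reproduced in this record, that a hop embedding takes the form h(t, l) = s ∘ r_t + l, where each entry of r_t has unit modulus, r_{t,j} = e^{iθ_j(t)}. The symbols h, s, r_t and θ_j are illustrative and are not the authors' notation. Under that assumed form, standard norm inequalities (valid for both the L1 and L2 norms) give:

```latex
\begin{aligned}
\|h(t,l) - h(t,l')\| &= \|l - l'\|, \\
\|h(t,l) - h(t',l)\| &= \|s \circ (r_t - r_{t'})\|
  \le \max_j \bigl|e^{i\theta_j(t)} - e^{i\theta_j(t')}\bigr| \,\|s\|
  \le \max_j \bigl|\theta_j(t) - \theta_j(t')\bigr| \,\|s\| .
\end{aligned}
```

The last step uses |e^{ia} - e^{ib}| = 2|sin((a - b)/2)| ≤ |a - b|. Under this assumption the embedding is 1-Lipschitz in the spatial term l, and Lipschitz in t whenever each phase θ_j is Lipschitz in t. Because the distance is induced by a norm, the triangle inequality holds automatically; whether the manuscript's actual equations satisfy the assumed form would need to be checked against the paper.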

Comments 3: The IndexIVFFlat construction relies on k-means clustering of spatial vectors (Equation 7). The paper does not discuss convergence guarantees of the clustering step with respect to the underlying trajectory manifold or the impact of initialization on subspace quality. Theoretical analysis of quantization error or empirical evaluation of different clustering objectives (e.g., k-means++ vs. spectral clustering guided by spatio-temporal density) would better justify the partitioning strategy.

Response 3: Thank you for your comments. In response to your concern regarding the insufficient analysis of convergence, initialization impact, and quantization error of k-means clustering in IndexIVFFlat construction, we have provided supplementary explanations in the revised manuscript. Furthermore, considering the time cost, a systematic empirical comparison between different clustering objectives (e.g., spectral clustering guided by spatio-temporal density) and the k-means strategy adopted in this paper will be pursued in our future work. For the specific modifications, please refer to the highlighted paragraph following Equation (7) in Section 3.3.1 of the revised manuscript on pages 9 and 10.

Comments 4: The loss function (Equation 6) follows a standard margin-based contrastive form. However, with multi-hop queries, the negative sampling strategy (random replacement of answers) may not adequately capture hard negatives along plausible but incorrect trajectory paths. A mathematically motivated negative sampling scheme—e.g., based on geodesic distance in the embedding space or temporal order violation—would improve discriminative power and training stability.

Response 4: Thank you for pointing this out. We agree with this comment. In multi-hop query scenarios, the negative sampling strategy that randomly replaces answers may indeed struggle to adequately capture hard negatives, thereby affecting the discriminative power of the model. Considering the time cost, we will pursue the design and validation of more mathematically motivated negative sampling schemes, such as those based on geodesic distance in the embedding space or temporal order violation, in our future work.
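To make the "temporal order violation" idea concrete, here is a minimal sketch of one possible reading of it: candidates whose timestamp does not come after the last hop of the query path are preferred as hard negatives, and random candidates fill any remaining slots. This is not the authors' method (they defer it to future work); the `TimedNode` type, the function name, and the selection rule are all hypothetical.

```python
import random
from typing import NamedTuple


class TimedNode(NamedTuple):
    """Hypothetical candidate answer: a taxiway node with a timestamp."""
    node: str
    timestamp: float


def temporal_violation_negatives(path, candidates, true_answer, num_negatives, seed=0):
    """Pick negatives, preferring candidates that violate temporal order.

    A candidate "violates temporal order" here if its timestamp is not
    strictly after the last hop of the query path (an assumed definition).
    """
    last_t = path[-1].timestamp
    pool = [c for c in candidates if c != true_answer]
    hard = [c for c in pool if c.timestamp <= last_t]
    easy = [c for c in pool if c.timestamp > last_t]
    rng = random.Random(seed)
    rng.shuffle(hard)
    rng.shuffle(easy)
    # Hard negatives first; fall back to ordinary random negatives.
    return (hard + easy)[:num_negatives]
```

A geodesic-distance variant would replace the timestamp test with a distance threshold in the embedding space; that variant is not sketched here.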

Comments 5: Ablation studies isolate the index and multi-hop components but lack statistical rigor. Results should include standard deviations over multiple runs, paired Wilcoxon or t-tests across datasets, and detailed sensitivity analysis of hyperparameters (nlist, nprobe, embedding dimension d) with respect to both accuracy (MRR) and efficiency (query time, memory). Variance analysis would clarify whether observed gains are robust or dataset-specific.

Response 5: Thank you for your comments. We have supplemented the analysis of prediction performance and standard deviation based on multiple random experiments in Section 4.3.3 to verify the statistical stability of the model. In addition, we have added a sensitivity analysis of the embedding dimension and its impact on memory performance in Section 4.3.4. Please refer to the newly added content in Sections 4.3.3 and 4.3.4 of the revised manuscript on pages 16 to 19, highlighted in yellow.
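The statistics requested in this comment can be computed with the standard library alone. The sketch below reports the mean and sample standard deviation over repeated runs and a paired t statistic between two configurations evaluated on the same seeds. The function names are hypothetical, no values from the manuscript are used, and converting the statistic to a p-value (or running a Wilcoxon signed-rank test) would need a statistics package.

```python
import math
import statistics


def summarize_runs(scores):
    """Return (mean, sample standard deviation) of a metric over repeated runs."""
    return statistics.mean(scores), statistics.stdev(scores)


def paired_t_statistic(scores_a, scores_b):
    """Paired t statistic for two configurations run on the same seeds.

    The statistic has len(scores_a) - 1 degrees of freedom.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("paired samples must have equal length")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    sd = statistics.stdev(diffs)
    if sd == 0:
        raise ValueError("all paired differences are identical")
    return statistics.mean(diffs) / (sd / math.sqrt(len(diffs)))
```

For example, `summarize_runs` would be applied to the MRR values of one configuration across seeds, and `paired_t_statistic` to the per-seed MRR values of the indexed and non-indexed variants.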

Comments 6: The approximate search in Equation (8) trades exactness for speed via nprobe control. While empirical trade-offs are shown, there is no theoretical characterization of recall or approximation ratio as a function of nprobe relative to the true nearest neighbors in the complex vector space. Deriving PAC-style guarantees or using concentration inequalities for the retrieved set would elevate the methodological contribution.

Response 6: Thank you for your comments. Regarding the theoretical characterization of the relationship between recall and nprobe, we have added a new analysis in the revised manuscript, providing the definition of recall and its expected lower bound. Regarding the derivation of PAC-style guarantees or the use of concentration inequalities for theoretical characterization, since rigorous theoretical upper bound derivation requires complex modeling of data distribution and clustering quality, which itself constitutes an independent research direction beyond the scope of this paper, we will pursue this in our future work.

Please refer to the highlighted paragraph following Equation (8) in Section 3.3.2 of the revised manuscript for the specific modifications on page 10.
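The empirical counterpart of the recall discussed in this response can be measured by comparing the approximate result list with the exact one for each query. The sketch below uses the usual definition, the fraction of true nearest neighbors that the approximate search returns; the function names are hypothetical and the manuscript's exact definition and lower bound are not reproduced here.

```python
def recall_at_k(exact_ids, approx_ids):
    """Fraction of the exact top-k neighbors that the approximate search returned."""
    exact = set(exact_ids)
    if not exact:
        raise ValueError("exact_ids must not be empty")
    return len(exact & set(approx_ids)) / len(exact)


def mean_recall(exact_per_query, approx_per_query):
    """Average recall over a batch of queries (lists aligned by query)."""
    if len(exact_per_query) != len(approx_per_query):
        raise ValueError("query batches must have equal length")
    recalls = [recall_at_k(e, a) for e, a in zip(exact_per_query, approx_per_query)]
    return sum(recalls) / len(recalls)
```

Sweeping nprobe and plotting `mean_recall` against it gives the empirical recall curve; for an IVF-Flat index, setting nprobe equal to nlist scans every list, so recall reaches 1 at that point.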

Comments 7: Efficiency comparisons focus on prediction time but omit detailed scalability analysis (e.g., index build time, memory footprint as |E| and hop count grow, incremental update cost for streaming trajectories). Given the claimed real-time applicability in aviation monitoring, asymptotic complexity and profiling under increasing data volumes (e.g., 10×–100× more trajectories) are essential.

Response 7: Thank you for your comments. We have supplemented the analysis of memory usage under different hop counts in Section 4.3.4 to evaluate the scalability of the model. Due to the limited computational resources and data scale of the current experimental environment, the analysis of index construction time, incremental update cost, and asymptotic complexity and profiling under 10× to 100× more trajectories will be pursued as future work. Please refer to the newly added content in Section 4.3.4 of the revised manuscript on pages 18 and 19, highlighted in yellow.

Comments 8: The case studies illustrate qualitative behavior but do not quantitatively link retrieved multi-hop paths to operational metrics such as taxi time prediction error, conflict avoidance, or fuel savings. Stronger downstream evaluation—e.g., integrating predictions into a simulation of ground movement optimization—would better demonstrate practical impact beyond MRR/Hits@K.

Response 8: Thank you for pointing this out. We agree with this comment. The current case study mainly demonstrates the qualitative behavior of the predictions and does not quantitatively link the multi-hop prediction results to operational metrics such as taxi time prediction error, conflict avoidance, or fuel savings. This is primarily because the accurate calculation of these operational metrics relies on a complete airport surface simulation environment (including dynamic taxiing models, flight scheduling, conflict detection, etc.), which itself constitutes an independent system integration topic. Integrating our method into a surface movement optimization simulation and conducting quantitative operational evaluation is an important direction for our future work.

Comments 9: The positioning and literature review would benefit from citing Knowledge Tracing with A Temporal Hypergraph Memory Network and Predicting Short-Term Bike-Sharing Demand at Station Level: A Multi-Task Dynamic Graph-based Spatiotemporal Approach. These works offer relevant insights into temporal hypergraph modeling and multi-task spatio-temporal graph prediction that could further contextualize the vector-index mechanism and multi-hop trajectory reasoning within broader dynamic graph learning frameworks.

Response 9: Thank you for your comments. The two works you recommended indeed provide valuable insights into temporal hypergraph modeling and multi-task spatiotemporal graph prediction. We have supplemented the citation and discussion of these two works in Section 2.2 of the revised manuscript to better position our paper within the context of dynamic graph learning frameworks. Please refer to the highlighted paragraph in Section 2.2 of the revised manuscript on page 4. The added references are [45] and [46].

Reviewer 3 Report

Comments and Suggestions for Authors

The study presents a framework based on a spatio-temporal knowledge graph for predicting multi-hop aircraft taxi trajectories. The framework uses knowledge graph embedding and a vector-index-based retrieval mechanism to enhance the efficiency of predicting based on a large dataset. The technical background of this research is appropriate for the content of the paper, which includes the model definition, indexing method, experiments, ablation study, and case studies, all of which are described in an orderly manner. The integration of IndexIVFFlat with spatio-temporal embeddings will be helpful to researchers looking to reduce the retrieval costs associated with multi-hop predictive models. However, the authors present unusually high-performance gaps while using benchmark datasets, and this requires clarification to understand the true significance of the accuracy vs. efficiency trade-off, as well as the ability to replicate the study's experimental conditions. Additionally, while the authors present evidence to support many of their claims, some of their sections contain repetitions and inconsistent presentation, thus resulting in diminished readability and causing parts of the evaluation to appear less than rigorous considering the paper represents an advanced experimental study.

However, the following elements need to be addressed in the manuscript:

1) It appears that the reported performance enhancements to Wikidata33k and YAGO13k are unusually large when compared to all of the baselines, and especially the increase from single digits for the previous MRR to above 60 for the proposed model. Could the authors confirm if the evaluation procedure, candidate filtering approach, or scaling of the metrics differs from that used normally in knowledge graph completion? Based on the current presentation there is not enough information available to ascertain whether the comparison of the different methods is indeed a fair one.

2) While the proposed framework integrates Approximate Nearest Neighbor Retrieval with Multi-Hop Spatiotemporal Reasoning, the paper focuses primarily on ranking measures and aggregate prediction time. Could the authors provide a more comprehensive analysis to support their claims of the retrieval error introduced by the use of the IndexIVFFlat approximation during multi-hop reasoning? Specifically, how does the application of approximate pruning to resolve entries in the knowledge graph impact semantic coherence through longer hop chains, and do the accumulated approximation errors invalidate paths?

3) Table 1 indicates that aircraft taxiing datasets have only a single type of relationship, but the proposed approach relies on multiple hops of complex semantic reasoning with respect to spatio-temporal knowledge graphs as input data. Could the authors explain how to maintain meaningful diversity of relationships between nodes in the graph structure while carrying out trajectory continuation prediction or general multi-relational reasoning?

4) Combining both sampling projections and using temporal rotations of spatial cumulative aggregations through addition of time and space sequences yields the embedding equations. There has been no reference to stability, identifiability, or over-smoothing issues that arise as the number of hops increases. Did any analyses provide insight into embedding collapse, sensitivity to trajectory length, or the effects of cumulative spatial accumulation on predicted behaviour over long time frames?

5) The analysis indicates that dropping the clustering index increases predictive accuracy (in some cases greatly) on all the tested datasets and moderately on smaller datasets. Can the authors further justify why the proposed approximation trade-off is deemed to be advantageous in a practical sense, especially in aviation safety-critical applications where even minor predictive error may lead to serious consequences?

6) There are multiple places in the paper that contain repeated headings for individual subsections of the experiment section, as well as repeated text used to evaluate the same experiment multiple times throughout this section. This leads to concerns about reproducibility because of the questionable statements contained within the data and formatting choices. Can the Author(s) please provide a final verification of the consistency of all material in the original draft and whether it matches what was used to generate the results in this report?

Author Response

Comments 1: It appears that the reported performance enhancements to Wikidata33k and YAGO13k are unusually large when compared to all of the baselines, and especially the increase from single digits for the previous MRR to above 60 for the proposed model. Could the authors confirm if the evaluation procedure, candidate filtering approach, or scaling of the metrics differs from that used normally in knowledge graph completion? Based on the current presentation there is not enough information available to ascertain whether the comparison of the different methods is indeed a fair one.

Response 1: Thank you for your comments. Regarding your concern about the unusually large performance improvement on Wikidata33k and YAGO13k, we clarify the following:

First, Wikidata33k and YAGO13k are not public benchmark datasets, but rather spatio-temporal knowledge graph evaluation datasets we constructed from the open-source knowledge bases Wikidata and YAGO. The reason for selecting these two knowledge bases is that they inherently contain rich spatio-temporal information. For example, YAGO records spatial attributes such as a person's birthplace and place of death, as well as the temporal scope of events. Similarly, Wikidata maintains timestamps and geographic coordinates for many facts. Extracting facts that contain both temporal and spatial annotations to form (subject, relation, object, timestamp, location) quintuples is a reasonable and natural data construction approach.

Second, among the baseline methods listed, only ST-NewDE is a genuine spatio-temporal knowledge graph embedding model capable of handling quintuples containing both temporal and spatial information. However, it is primarily designed for single-hop entity prediction tasks and has limited effectiveness in predicting sequential spatial positions. The remaining baselines (e.g., TransE, RotatE, TeRo) were originally designed for static triples (without time or space) or temporal quadruples (with time but without space), and thus cannot leverage the spatial position supervision signal, leading to low performance in the evaluation. In contrast, our method explicitly models spatial positions and achieves significantly higher performance than these baselines. Therefore, the reported performance gap primarily reflects the gain brought by incorporating the spatial supervision signal, rather than any bias in the evaluation procedure or metric calculation.

In summary, since most baseline methods lack the capability to handle spatial information, the direct comparison demonstrates the advantage of our method in spatio-temporal information modeling and in predicting spatial information with sequential semantics.
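Since the reviewer's question turns on candidate filtering and metric scaling, a minimal sketch of the conventional filtered ranking protocol may help readers judge the comparison. It assumes a distance-style score (lower is better) and hypothetical function names; it does not describe the evaluation code actually used in the manuscript.

```python
def filtered_rank(scores, target, known_true=frozenset()):
    """Rank of `target` among candidates, lower score = better.

    Other candidates that are themselves known correct answers are skipped
    (the "filtered" setting); pass an empty set for the raw setting.
    """
    target_score = scores[target]
    better = sum(
        1
        for cand, s in scores.items()
        if cand != target and cand not in known_true and s < target_score
    )
    return better + 1


def mean_reciprocal_rank(ranks):
    """MRR in [0, 1]; papers often report it multiplied by 100."""
    return sum(1.0 / r for r in ranks) / len(ranks)


def hits_at_k(ranks, k):
    """Fraction of queries whose correct answer is ranked within the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)
```

Two choices in this protocol can move the reported numbers substantially: whether known-true candidates are filtered, and whether the candidate set is the full entity vocabulary or a reduced subset. Stating both explicitly makes comparisons across methods easier to verify.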

Comments 2: While the proposed framework integrates Approximate Nearest Neighbor Retrieval with Multi-Hop Spatiotemporal Reasoning, the paper focuses primarily on ranking measures and aggregate prediction time. Could the authors provide a more comprehensive analysis to support their claims of the retrieval error introduced by the use of the IndexIVFFlat approximation during multi-hop reasoning? Specifically, how does the application of approximate pruning to resolve entries in the knowledge graph impact semantic coherence through longer hop chains, and do the accumulated approximation errors invalidate paths?

Response 2: Thank you for your comments. In response to your concern regarding the approximate retrieval error and its cumulative effect in multi-hop reasoning, we have provided supplementary analysis in the revised manuscript. Please refer to the newly added highlighted paragraph in Section 4.3.1 of the revised manuscript for the specific modifications on page 15.

Comments 3: Table 1 indicates that aircraft taxiing datasets have only a single type of relationship, but the proposed approach relies on multiple hops of complex semantic reasoning with respect to spatio-temporal knowledge graphs as input data. Could the authors explain how to maintain meaningful diversity of relationships between nodes in the graph structure while carrying out trajectory continuation prediction or general multi-relational reasoning?

Response 3: Thank you for your comments. We have provided supplementary analysis in the revised manuscript. It should be clarified that the aircraft taxiing trajectory dataset indeed contains only one relation type (i.e., the "adjacent pass" relation), which is determined by its inherent characteristics: a taxiing path is essentially a chain-like sequence composed of physical adjacency relations between taxiway nodes. However, the "complexity" of multi-hop reasoning in this paper does not stem from the diversity of relation types, but rather from the multi-dimensional constraints of spatio-temporal semantics. Specifically, in each multi-hop query, the model must simultaneously satisfy three dimensions of constraints: (1) temporal order constraint, i.e., the timestamps of consecutive hops must be increasing with reasonable intervals; (2) spatial continuity constraint, i.e., adjacent spatial positions must have a physical connection in the taxiway network; and (3) entity consistency constraint, i.e., the flight identifier and operation type must remain consistent throughout the multi-hop path. The joint satisfaction of these three constraints constitutes the "complex semantics" in multi-hop reasoning. Even with a single relation type, the model still needs to simultaneously encode and coordinate the above multi-dimensional information in the embedding space.

Furthermore, this paper also validates the proposed method on the general-domain datasets Wikidata33k and YAGO13k, which contain multiple relation types (e.g., "born in", "located in", "occurred in", etc.). Experimental results show that our method also achieves excellent performance in multi-relational scenarios, further demonstrating the model's capability to handle diverse relations.

In summary, the single relation type does not affect the effectiveness of our method in capturing complex spatio-temporal semantics in multi-hop reasoning.

Please refer to the newly added highlighted paragraph in Section 4.1.2 of the revised manuscript for the specific modifications on page 12.
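The three constraints named in this response (temporal order, spatial continuity, entity consistency) can be expressed as a simple validity check on a candidate multi-hop path. The sketch below is illustrative only: the `Hop` fields, the adjacency dictionary, and the `max_gap` threshold standing in for "reasonable intervals" are assumptions, and the manuscript encodes these constraints in the embedding space rather than as explicit rules.

```python
from typing import NamedTuple


class Hop(NamedTuple):
    """Hypothetical record for one hop of a taxiing path."""
    flight_id: str
    operation: str   # e.g. arrival or departure
    node: str        # taxiway node identifier
    timestamp: float


def is_valid_path(path, adjacency, max_gap):
    """Check temporal order, spatial continuity and entity consistency."""
    for prev, cur in zip(path, path[1:]):
        gap = cur.timestamp - prev.timestamp
        # (1) Temporal order: timestamps increase with a bounded interval.
        if not (0 < gap <= max_gap):
            return False
        # (2) Spatial continuity: consecutive nodes are physically connected.
        if cur.node not in adjacency.get(prev.node, set()):
            return False
        # (3) Entity consistency: same flight and operation type throughout.
        if (cur.flight_id, cur.operation) != (prev.flight_id, prev.operation):
            return False
    return True
```

A check of this kind could also serve as a post-filter on retrieved paths, for example to count how many approximate retrievals produce physically impossible sequences.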

Comments 4: Combining both sampling projections and using temporal rotations of spatial cumulative aggregations through addition of time and space sequences yields the embedding equations. There has been no reference to stability, identifiability, or over-smoothing issues that arise as the number of hops increases. Did any analyses provide insight into embedding collapse, sensitivity to trajectory length, or the effects of cumulative spatial accumulation on predicted behaviour over long time frames?

Response 4: Thank you for your comments. We have supplemented the analysis of embedding dimension and performance under different hop counts in Section 4.3.4. Please refer to the newly added content in Section 4.3.4 of the revised manuscript on pages 18 and 19, highlighted in yellow.

Comments 5: The analysis indicates that dropping the clustering index increases predictive accuracy (in some cases greatly) on all the tested datasets and moderately on smaller datasets. Can the authors further justify why the proposed approximation trade-off is deemed to be advantageous in a practical sense, especially in aviation safety-critical applications where even minor predictive error may lead to serious consequences?

Response 5: Thank you for your comments. We have provided supplementary clarification on this issue in the revised manuscript. Please refer to the newly added paragraphs in Section 4.3.3 of the revised manuscript on pages 16 and 17, which are highlighted in yellow.

Comments 6: The experiment section contains repeated headings for individual subsections, as well as repeated text that evaluates the same experiment multiple times. This raises concerns about reproducibility, because the duplicated statements and formatting choices cast doubt on the data. Can the authors please provide a final verification that all material in the manuscript is consistent and matches what was used to generate the reported results?

Response 6: Thank you for your comments. We have carefully re-reviewed the manuscript and acknowledge that there were duplicate section headings and partial content redundancy in the experimental section. These issues have now been addressed. The section structure has been unified. The duplicate content has been removed, and the consistency of all data as well as results has been verified.

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

The revised version of the paper has shown improvement, but there are still some issues that need to be addressed.

1) The standardization of writing and formatting needs to be further improved. For instance, the sizes of the two subplots in Figures 6 and 7 should be consistent, and the number of decimal places in Table 7 should be uniform.

2) Could quantitative data be added at the end of the abstract to demonstrate the advantages of the method?

3) Where does the real taxiing data come from? What data preprocessing was carried out?

Author Response

Comments 1: The standardization of writing and formatting needs to be further improved. For instance, the sizes of the two subplots in Figures 6 and 7 should be consistent, and the number of decimal places in Table 7 should be uniform.

Response 1: Thank you for pointing this out. We have proofread the entire text, adjusted the sizes of Figure 6 and Figure 7 to be consistent, and unified the number of decimal places for the data in Table 7. Please refer to the corresponding sections in the revised version for details, which are marked in red.

Comments 2: Could quantitative data be added at the end of the abstract to demonstrate the advantages of the method?

Response 2: Agree. Thank you for your comments. We have added quantitative data at the end of the abstract, which is marked in red, as follows:

Efficient multi-hop prediction over large-scale spatio-temporal knowledge graphs of aircraft taxiing trajectories remains challenging, as existing methods focus either on static multi-hop relations or on accuracy improvement for spatio-temporal single-hop predictions, leading to computational inefficiency. This paper proposes a vector-index-supported multi-hop prediction method. First, a knowledge graph embedding technique that integrates spatio-temporal features maps the trajectory graph into a low-dimensional complex vector space. Then, a hierarchical query acceleration structure based on IndexIVFFlat is constructed. A clustering strategy guided by the distribution of trajectory data partitions the vector space into subspaces, and approximate nearest neighbor search within those subspaces rapidly prunes the candidate set to accelerate multi-hop retrieval. Experiments on real aircraft taxiing trajectory datasets and general benchmarks show that the proposed method substantially improves prediction efficiency while maintaining competitive accuracy. The results demonstrate that the vector index mechanism effectively balances accuracy and efficiency, and the efficiency has been improved by at least 56.65%. This work provides a key technical foundation for real-time analysis and intelligent prediction of large-scale aircraft taxiing trajectories.
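To make the accuracy and efficiency trade-off in the abstract more concrete, here is a minimal pure-Python sketch of the inverted-file idea behind an IVF-Flat index: vectors are assigned to the cell of their nearest centroid, and a query scans only the `nprobe` closest cells. This is a toy written for illustration, not the authors' implementation and not the FAISS API; the class name `ToyIVFFlat` is hypothetical, the centroids are assumed to be given (for example from a prior k-means step), and real-valued vectors are used instead of the complex embeddings described in the paper.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


class ToyIVFFlat:
    """Toy inverted-file index with exhaustive (flat) search inside each cell."""

    def __init__(self, centroids):
        self.centroids = list(centroids)
        self.lists = [[] for _ in self.centroids]  # one inverted list per cell

    def add(self, vec_id, vec):
        # Store the vector in the cell of its nearest centroid.
        cell = min(range(len(self.centroids)),
                   key=lambda i: sq_dist(vec, self.centroids[i]))
        self.lists[cell].append((vec_id, vec))

    def search(self, query, k=1, nprobe=1):
        # Rank cells by centroid distance and scan only the nprobe closest ones.
        order = sorted(range(len(self.centroids)),
                       key=lambda i: sq_dist(query, self.centroids[i]))
        candidates = [item for i in order[:nprobe] for item in self.lists[i]]
        candidates.sort(key=lambda item: sq_dist(query, item[1]))
        return [vec_id for vec_id, _ in candidates[:k]]


index = ToyIVFFlat(centroids=[(0.0, 0.0), (10.0, 10.0)])
for vec_id, vec in [("a", (1.0, 1.0)), ("b", (0.0, 2.0)),
                    ("c", (9.0, 9.0)), ("d", (5.2, 5.2))]:
    index.add(vec_id, vec)

query = (4.9, 4.9)
fast = index.search(query, k=1, nprobe=1)   # scans one cell only
exact = index.search(query, k=1, nprobe=2)  # scans every cell
```

In this toy data the true nearest neighbour of the query lies just across a cell boundary, so the single-cell probe misses it while probing both cells finds it. This is the kind of approximation error that a larger `nprobe` reduces at the cost of scanning more candidates.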

Comments 3: Where does the real taxiing data come from? What data preprocessing was carried out?

Response 3: Thank you for your comments. The taxiing data were acquired from Shenzhen Bao'an International Airport. Because the raw data contain observational jitter and redundancy, we carried out data preprocessing. The coordinates were first converted from degrees-minutes-seconds to decimal degrees for standardization. Continuous positional observations were then mapped to discrete taxiway nodes. On this basis, dwell modeling and observation aggregation were applied to repeated records at the same node, forming robust basic trajectory units.
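The three preprocessing steps described above can be sketched as follows. This is only an illustrative outline under stated assumptions: the function names, the nearest-node matching rule, the node coordinates, and the layout of a trajectory unit as `(node, t_enter, t_exit, count)` are all hypothetical, since the response does not specify how the mapping or dwell modeling was implemented.

```python
import math


def dms_to_decimal(degrees, minutes, seconds, hemisphere="N"):
    """Convert degrees-minutes-seconds to signed decimal degrees."""
    value = abs(degrees) + minutes / 60.0 + seconds / 3600.0
    return -value if hemisphere in ("S", "W") else value


def nearest_node(lat, lon, nodes):
    """Map a position to the closest taxiway node (assumed matching rule).

    nodes maps a node id to (lat, lon) in decimal degrees. An equirectangular
    approximation is used, which is adequate over airport-scale distances.
    """
    def dist2(item):
        _, (nlat, nlon) = item
        dx = (lon - nlon) * math.cos(math.radians(lat))
        dy = lat - nlat
        return dx * dx + dy * dy
    return min(nodes.items(), key=dist2)[0]


def aggregate_dwell(observations):
    """Collapse consecutive records at the same node into one trajectory unit.

    observations: list of (timestamp, node) sorted by time.
    Returns a list of (node, t_enter, t_exit, count) tuples.
    """
    units = []
    for t, node in observations:
        if units and units[-1][0] == node:
            n, t_enter, _, count = units[-1]
            units[-1] = (n, t_enter, t, count + 1)  # extend the current dwell
        else:
            units.append((node, t, t, 1))           # start a new unit
    return units


# Made-up coordinates and node ids, for illustration only.
lat = dms_to_decimal(22, 38, 24, "N")
lon = dms_to_decimal(113, 48, 36, "E")
node = nearest_node(lat, lon, {"N1": (22.64, 113.81), "N2": (22.65, 113.82)})
units = aggregate_dwell([(0, "A"), (5, "A"), (9, "B"), (12, "B"), (20, "C")])
```

Aggregating repeated records at a node keeps the entry time, the exit time, and the number of observations, so the dwell duration remains recoverable while jittery duplicates no longer appear as separate hops.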

Reviewer 2 Report

Comments and Suggestions for Authors

The authors have adequately addressed the concerns raised by the previous reviewers. The paper is well-structured, clearly written, and presents reliable results. It meets the necessary standards for publication.

Author Response

Dear Reviewer,

Thank you very much for the time you spent reviewing the manuscript and for all your helpful comments, which supported its publication.

Best Regards,

Jing Shan, et al.

Reviewer 3 Report

Comments and Suggestions for Authors

The authors have addressed all the comments.

Author Response

Dear Reviewer,

Thank you very much for the time you spent reviewing the manuscript and for all your helpful comments, which supported its publication.

Best Regards,

Jing Shan, et al.
