Article
Peer-Review Record

Probabilistic Coarsening for Knowledge Graph Embeddings

by Marcin Pietrasik and Marek Z. Reformat
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Submission received: 21 January 2023 / Revised: 26 February 2023 / Accepted: 1 March 2023 / Published: 6 March 2023
(This article belongs to the Special Issue Applied Mathematics and Information Sciences)

Round 1

Reviewer 1 Report

In this paper, the authors propose a probabilistic coarsening method for knowledge graph embedding.

Recently, knowledge graph embedding has come to play a key role in classifying entities and predicting relationships between entities in increasingly large-scale knowledge graphs.

However, massive hardware resources are required, and the time needed for learning is also a significant barrier to using knowledge graphs.

The paper shows quantitatively, on four datasets with different characteristics, that the proposed technique can address these problems.

 

A quantitative evaluation, or at least an explanation, of how much the reverse mapping process proposed in the paper can reduce the information loss caused by coarsening is missing. The results in Table 3 show that more than half of the entities in the AIFB and BGS datasets are collapsed. If such an evaluation were added to this paper, the superiority of the method proposed by the authors would be explained more clearly to readers.

The following minor editing errors need to be fixed:

In Figure 2, the positions of the first and second subfigures (R-GCN, RDF2Vec) for the AIFB dataset differ from those of the other experimental results; to help readers, the subfigures should be arranged consistently with the other results. It also seems that the word 'Table' was omitted in line 204.

Author Response

Point 1: The explanation or evaluation of how the reverse mapping procedure can reduce the information lost in the coarsening procedure is missing.

Response 1: Indeed, this issue was not explored in the original manuscript. The evaluation of the reverse mapping procedure could be performed by a simple pairwise comparison: specifically, comparing the embedding quality with and without reverse mapping and fine-tuning. We anticipate, however, that on the entity classification task such a comparison would not be indicative of the extent to which information is regained in this step. This is because entity classification is performed on a subset of graph entities which are highly connected and thus unlikely to be collapsed in the coarsening process, since the requirements for collapsing are more likely to be satisfied by peripheral entities. As such, the embeddings of classified entities will not change during the reverse mapping and fine-tuning process. We make note of this in the revised manuscript.
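As an illustration, a minimal sketch of such a pairwise comparison protocol; every function here is a toy stand-in (random embeddings and a nearest-centroid classifier), not the actual coarsening, reverse-mapping, or fine-tuning pipeline:

```python
import numpy as np

rng = np.random.default_rng(0)
n_coarse, n_entities, dim = 50, 200, 16
mapping = rng.integers(0, n_coarse, size=n_entities)  # entity -> coarse node
labels = rng.integers(0, 4, size=n_entities)          # toy class labels
coarse_emb = rng.normal(size=(n_coarse, dim))         # "trained" on coarse graph

def reverse_map(coarse, mapping):
    # Each entity inherits the embedding of the coarse node it collapsed into.
    return coarse[mapping]

def fine_tune(emb):
    # Stand-in for continued training epochs on the original graph.
    return emb + 0.01 * rng.normal(size=emb.shape)

def nearest_centroid_accuracy(emb, labels):
    # Toy entity classifier: assign each entity to its nearest class centroid.
    centroids = np.stack([emb[labels == c].mean(axis=0) for c in np.unique(labels)])
    pred = ((emb[:, None, :] - centroids[None, :, :]) ** 2).sum(-1).argmin(1)
    return float((pred == labels).mean())

base = nearest_centroid_accuracy(reverse_map(coarse_emb, mapping), labels)
tuned = nearest_centroid_accuracy(fine_tune(reverse_map(coarse_emb, mapping)), labels)
print(f"accuracy without fine-tuning: {base:.3f}  with fine-tuning: {tuned:.3f}")
```

With the real pipeline swapped in, the gap between the two accuracies would quantify how much information the reverse mapping and fine-tuning step recovers.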

 

Point 2: The position of the subfigures in Figure 2 and the incorrect reference to Table 2.

 

Response 2: Thank you for drawing attention to these formatting errors. They have been fixed.

Reviewer 2 Report

Dear Editors and Authors,

The reviewer would like to submit the review result in the attached file. 

My goal is to assist you in enhancing your work, expanding your audience, and making a positive impact on academia.

Thank you very much.

Comments for author File: Comments.pdf

Author Response

Point 1: The strategy could use or compare against additional methods for graph coarsening, such as community detection methods or graph clustering algorithms.

 

Response 1: This is a valid point and serves as a potential avenue for future work. With this said, however, our paper attempts to put forward the idea that even a simple coarsening procedure, based merely on collapsing structurally similar entities, can yield better embedding results. In this sense, the coarsening procedure, due to its simplicity, can be seen as picking low-hanging fruit to enhance embeddings. Indeed, this is similar in spirit to models like HARP and MILE (cited in the paper). Furthermore, the introduction of more complex coarsening schemes often comes with higher time complexity. For instance, stochastic blockmodels used for community detection, with inference schemes utilizing methods such as Gibbs sampling or expectation maximization, may not scale to the datasets used in our work. In general, though, other coarsening techniques do warrant investigation.
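To make this concrete, a minimal sketch of one simple structure-based collapsing rule, merging entities that share an identical set of incident triples; this is an illustrative simplification, not the exact probabilistic criterion used in the paper:

```python
# Collapse entities with identical structural signatures into one coarse node.
from collections import defaultdict

triples = [
    ("a", "type", "Paper"), ("b", "type", "Paper"),
    ("a", "cites", "c"), ("b", "cites", "c"),
    ("d", "type", "Author"),
]

# Signature of an entity: the set of its (direction, predicate, neighbour) contexts.
sig = defaultdict(set)
for s, p, o in triples:
    sig[s].add(("out", p, o))
    sig[o].add(("in", p, s))

# Group entities by signature and map each to a coarse representative.
groups = defaultdict(list)
for e, contexts in sig.items():
    groups[frozenset(contexts)].append(e)
coarse_of = {e: min(g) for g in groups.values() for e in g}

# Rewrite the graph over coarse nodes, dropping duplicate triples.
coarse_triples = {(coarse_of[s], p, coarse_of[o]) for s, p, o in triples}
print(coarse_of)        # 'a' and 'b' collapse into one coarse node
print(coarse_triples)   # the coarsened, smaller graph
```

The point the sketch conveys is that a single pass over the triples suffices, which is why such coarsening adds little overhead compared with community detection or clustering.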

 

Point 2: The strategy could incorporate elements of graph convolutional networks such as graph attention mechanisms.

 

Response 2: This is a good point and another avenue for future work. MILE, for instance, uses principles from graph convolutions to refine its reverse mapping process. This is perhaps a limitation of our paper in that our reverse mapping process uses a simple HARP-inspired mapping, without leveraging more sophisticated methods. Using graph convolutions and graph attention mechanisms for the coarsening process is a less explored, although interesting, area as well. Indeed, models such as [1,2] have been developed for knowledge graph embeddings. Here we again appeal to the simplicity and speed of our coarsening procedure.
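For illustration, a minimal sketch of a HARP-style reverse mapping expressed as a prolongation matrix, followed by a single GCN-like smoothing step in the spirit of MILE's refinement; the toy adjacency matrix and the smoothing step are stand-ins, not the exact MILE model:

```python
import numpy as np

n_coarse, n_orig, dim = 3, 5, 4
coarse_emb = np.random.default_rng(1).normal(size=(n_coarse, dim))

# mapping[i] = coarse node that original entity i was collapsed into
mapping = np.array([0, 0, 1, 2, 2])
M = np.zeros((n_orig, n_coarse))
M[np.arange(n_orig), mapping] = 1.0

# HARP-style copy: each entity inherits its coarse node's embedding.
emb = M @ coarse_emb

# One GCN-like propagation over the original graph's row-normalised
# adjacency (with self-loops), in the spirit of MILE's learned refinement.
A = np.array([[0, 1, 1, 0, 0],
              [1, 0, 0, 1, 0],
              [1, 0, 0, 0, 1],
              [0, 1, 0, 0, 1],
              [0, 0, 1, 1, 0]], dtype=float)
A_hat = (A + np.eye(n_orig)) / (A.sum(axis=1, keepdims=True) + 1)
emb = np.tanh(A_hat @ emb)  # smoothed initial embeddings for fine-tuning
```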

 

Point 3: In this paper, please indicate how you can eliminate or reduce the limitations of the proposed baseline methods.

 

Response 3: Upon reflection, our original manuscript did not adequately address why embeddings were improved over the baseline methods. We have addressed this in the revised manuscript. In short, the impact of our strategy on the limitations of the baseline models is as follows:

  • RDF2Vec: The main difference in using our strategy in conjunction with RDF2Vec is in how it changes the sequences obtained from the random walks on the graph. Because densely connected entities are less likely to be collapsed in the coarsening process, they are more likely to be sampled in walks on the coarsened graph. This bears some similarity to RDF2Vec Light [3], which only performs walks on entities of interest, if we assume that densely connected entities are more likely to be of interest. Furthermore, on all four datasets we see a larger reduction in the percentage of entities as opposed to triples. This too changes the nature of sampled walks, namely in that predicates are more likely to be sampled. The difference between sampling entities versus predicates was discussed in [4] and termed e-walks and p-walks, respectively. In short, it was shown that e-walks are better at capturing the relatedness of entities and p-walks are better at capturing their semantic similarity. Because coarsening shifts the sampled walks in the direction of p-walks, our strategy is theorized to be better at capturing semantic similarity between entities. In general, the idea of biasing random walks in order to improve the performance of RDF2Vec is well studied and was extensively evaluated in [5].
  • R-GCN: The advantage of using our strategy in conjunction with an R-GCN baseline is that of parameter reduction. Recall that R-GCN relies on computations made on the adjacency matrix of the knowledge graph. Such a formulation results in an increasing number of parameters as the size of the knowledge graph increases, making its scalability poor for very large datasets. By coarsening the knowledge graph, we reduce the number of parameters in the model which must be learned at the coarse level, improving its efficiency. We note in our paper, however, that of the three baselines, R-GCN shows the least improvement. The reason for this may be that coarsening produces graphs with a larger proportion of highly connected hub entities, which is a structural weakness of R-GCN, as pointed out by its authors.
  • TransE: The limitations of the translational assumption inherent to the TransE model are not addressed by our strategy directly, as the underlying embedding procedure is never changed. These limitations were largely addressed by subsequent models in the translational family such as TransR [6] and TransH [7] (the relevant score functions are sketched after this list). Specifically, TransR finessed the issue of entities and predicates being embedded in the same space by introducing a separate predicate embedding space. Entities are then projected into the predicate space by a predicate-specific projection matrix, and the original translation assumptions of the TransE model are applied. This solves the problem of embedding entities which may or may not be similar to one another depending on their predicate context. TransH was proposed to handle the problem of embedding one-to-many and many-to-one relationships. To this end, it too introduced the notion of predicate-specific projections, although, unlike TransR, entities and predicates are mapped in the same space. Unlike these approaches, our strategy may be seen as improving on TransE indirectly, namely by augmenting the input data. One-to-many and many-to-one relationships are likely candidates for second-order collapsing, ensuring tight embeddings of these entities by TransE. Such a feature does not rectify TransE's challenges in handling these types of relationships but merely accepts this inadequacy and ensures that less computation is spent on embedding these entities. Such is also the case for symmetric relationships, whose entities are candidates for first-order collapsing. Thus, in order to overcome the obstacles of TransE, we would advise applying our strategy in conjunction with one of its successors which explicitly deals with the limitations mentioned.
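For reference, a sketch of the standard score functions of the three translational models discussed above, as given in the cited papers; bold symbols denote embedding vectors, and lower scores indicate more plausible triples:

```latex
% TransE: entities and predicates share one embedding space; a triple
% (h, r, t) is plausible when translating the head by the predicate
% lands near the tail.
\[ f_{\text{TransE}}(h, r, t) = \lVert \mathbf{h} + \mathbf{r} - \mathbf{t} \rVert \]

% TransR: entities are first projected into a predicate-specific space
% by the projection matrix M_r, then the same translation is applied.
\[ f_{\text{TransR}}(h, r, t) = \lVert \mathbf{h} M_r + \mathbf{r} - \mathbf{t} M_r \rVert \]

% TransH: entities are projected onto a predicate-specific hyperplane
% (normal vector w_r), keeping a single shared space; d_r is the
% translation vector lying in that hyperplane.
\[ \mathbf{h}_\perp = \mathbf{h} - \mathbf{w}_r^\top \mathbf{h}\,\mathbf{w}_r,
   \qquad
   \mathbf{t}_\perp = \mathbf{t} - \mathbf{w}_r^\top \mathbf{t}\,\mathbf{w}_r \]
\[ f_{\text{TransH}}(h, r, t) = \lVert \mathbf{h}_\perp + \mathbf{d}_r - \mathbf{t}_\perp \rVert \]
```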

 

We note that, despite our strategy's positive impact with regard to the aforementioned limitations, it is formulated to be embedding-method agnostic. As such, it does not inherently seek to overcome the limitations of any particular embedding method.

 

Point 4: In the conclusion, the authors should indicate more about the limitations of your study and your future research.

 

Response 4: We have added to the conclusion to incorporate the limitations and directions for future research. These additions are largely related to the points made in previous responses.

 

Point 5: The references for this article are not sufficient to support the literature review.

 

Response 5: In expanding upon the areas of our paper which were addressed in the previous responses, we increased the number of citations in the revised manuscript to 50. 

 

[1] Chen, Meiqi, et al. "r-GAT: Relational Graph Attention Network for Multi-Relational Graphs." arXiv preprint arXiv:2109.05922 (2021).

[2] Liu, Xiyang, et al. "RAGAT: Relation aware graph attention network for knowledge graph completion." IEEE Access 9 (2021): 20840-20849.

[3] Portisch, Jan, Michael Hladik, and Heiko Paulheim. "RDF2Vec Light: A Lightweight Approach for Knowledge Graph Embeddings." arXiv preprint arXiv:2009.07659 (2020).

[4] Portisch, Jan, and Heiko Paulheim. "Walk This Way! Entity Walks and Property Walks for RDF2Vec." The Semantic Web: ESWC 2022 Satellite Events: Hersonissos, Crete, Greece, May 29–June 2, 2022, Proceedings. Cham: Springer International Publishing, 2022. 133-137.

[5] Cochez, Michael, et al. "Biased graph walks for RDF graph embeddings." Proceedings of the 7th International Conference on Web Intelligence, Mining and Semantics. 2017.

[6] Lin, Yankai, et al. "Learning entity and relation embeddings for knowledge graph completion." Proceedings of the AAAI conference on artificial intelligence. Vol. 29. No. 1. 2015.

[7] Wang, Zhen, et al. "Knowledge graph embedding by translating on hyperplanes." Proceedings of the AAAI conference on artificial intelligence. Vol. 28. No. 1. 2014.

Round 2

Reviewer 2 Report

Dear Authors and Editors,

It is great that the authors acknowledge the potential for future work and improvements.

The points raised by the reviewers are valid and have been addressed in the revised manuscript.

The limitations of the baseline methods have also been discussed in detail, and the impact of the authors' strategy on each of them has been explained.

The coarsening procedure has proven to be a simple yet effective approach to enhancing embeddings.

 

This article should be processed for publication.

Thank you very much.
