Article
Peer-Review Record

Beyond Weisfeiler–Lehman with Local Ego-Network Encodings

Mach. Learn. Knowl. Extr. 2023, 5(4), 1234-1265; https://doi.org/10.3390/make5040063
by Nurudin Alvarez-Gonzalez 1,*, Andreas Kaltenbrunner 2,3 and Vicenç Gómez 1,*
Reviewer 1: Anonymous
Reviewer 2:
Reviewer 3: Anonymous
Submission received: 1 August 2023 / Revised: 10 September 2023 / Accepted: 17 September 2023 / Published: 22 September 2023
(This article belongs to the Section Network)

Round 1

Reviewer 1 Report

The paper proposes a new method called IGEL for encoding the local ego-network structure around nodes in a graph. The encoding is based on histograms of node degrees at different distances in the ego-network. The authors show this encoding is more expressive than 1-WL for distinguishing non-isomorphic graphs. They evaluate the encoding by introducing it as input features to various graph neural network models on tasks like graph classification, regression, link prediction, etc. Results show improved performance across models and datasets by incorporating the IGEL encoding, matching state-of-the-art expressivity techniques.
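
For illustration, the following is a minimal sketch of such a distance/degree histogram encoding, written with networkx. It is a hedged reconstruction based on the description above; the function name, default depth, and library choice are assumptions rather than the authors' implementation.

```python
# Minimal sketch of a per-node distance/degree histogram over an ego-network,
# in the spirit of the IGEL encoding summarized above. Illustrative only:
# the function name and the use of networkx are assumptions.
from collections import Counter
import networkx as nx

def ego_distance_degree_histogram(G: nx.Graph, node, alpha: int = 2) -> Counter:
    """Histogram of (distance-from-center, degree-within-ego-network) pairs."""
    ego = nx.ego_graph(G, node, radius=alpha)                # depth-alpha ego-network
    dist = nx.single_source_shortest_path_length(ego, node)  # distances to the center
    return Counter((dist[v], ego.degree(v)) for v in ego.nodes)

# Example: encode every node of a small benchmark graph with alpha = 1.
G = nx.karate_club_graph()
encodings = {v: ego_distance_degree_histogram(G, v, alpha=1) for v in G.nodes}
```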

Strong points:

  • Provides both theoretical analysis and empirical evaluation of the proposed encoding. Formally relates it to 1-WL and MATLANG expressivity.
  • Achieves improved performance across diverse tasks without changing model architecture, just incorporating IGEL as input features.
  • Matches state-of-the-art methods for improving GNN expressivity while being simpler and more efficient to compute.

Weak points:

  • Does not explore the effect of different ego-network depths α in detail. It is unclear what the optimal value is.
  • Upper bound analysis on expressivity focuses only on a limited family of graphs (strongly regular). Can it distinguish other non-isomorphic graphs?


Author Response

We thank the reviewer for their helpful comments and consideration of our work. We address their comments below:

  1. Does not explore the effect of different ego-network depths α in detail. It is unclear what the optimal value is. Indeed, the role of α is not analyzed theoretically, but we provide an extended experimental discussion in Appendix A and, more specifically, show the best-performing values of α in Table A1.
    We found that the trade-off between ego-network depth and model performance favored values of α ∈ {1, 2} for the tasks detailed in Section 5.
    During early development, α = 3 provided marginal performance improvements at increased time and memory costs, matching the algorithmic analysis in Section 3. We believe that the optimal value is task-dependent, depending mostly on network density, and we leave it for future work to explore whether the depth can be adapted to ignore or downsample uninformative nodes and edges within the ego-network.
  2. Upper bound analysis on expressivity focuses only on a limited family of graphs (strongly regular). Can it distinguish other non-isomorphic graphs? As noted by the reviewer, our focus is on expressivity for regular, co-spectral, and strongly regular graphs.
    In general, IGEL can distinguish any pair of non-isomorphic graphs for which at least one ego-network exhibits different distance/degree histograms (see the illustrative sketch after this list).
    Additionally, note that in Appendix D we show that IGEL is also more expressive than Shortest Path Neural Networks on unattributed graphs, which can be understood as a distance-based measure of expressivity that distinguishes other families of graphs.
    Our work focuses on identifying relationships with state-of-the-art methods, and we leave the analysis of families of graphs beyond strongly regular ones that IGEL cannot distinguish for future work.
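
To make the criterion above concrete, the sketch below compares the multisets of per-node distance/degree histograms of two graphs. It is an illustrative, hedged reconstruction: the helper names are hypothetical and the snippet is not the paper's reference implementation. The example pair (a 6-cycle versus two disjoint triangles) is a standard case that 1-WL cannot separate, while the depth-1 histograms already differ because the neighbors of a triangle node stay connected inside the ego-network.

```python
# Hedged sketch: two graphs are told apart whenever the multisets of their
# per-node ego-network histograms differ. Helper names are hypothetical.
from collections import Counter
import networkx as nx

def node_histogram(G: nx.Graph, node, alpha: int) -> Counter:
    """(distance-from-center, degree-within-ego-network) histogram for one node."""
    ego = nx.ego_graph(G, node, radius=alpha)
    dist = nx.single_source_shortest_path_length(ego, node)
    return Counter((dist[v], ego.degree(v)) for v in ego.nodes)

def graph_encoding(G: nx.Graph, alpha: int) -> Counter:
    """Multiset (as a Counter) of hashable per-node histograms."""
    return Counter(frozenset(node_histogram(G, v, alpha).items()) for v in G.nodes)

# A 6-cycle vs. two disjoint triangles: both are 2-regular on 6 nodes and
# indistinguishable by 1-WL, but their depth-1 histograms already differ.
C6 = nx.cycle_graph(6)
two_triangles = nx.disjoint_union(nx.cycle_graph(3), nx.cycle_graph(3))
print(graph_encoding(C6, alpha=1) != graph_encoding(two_triangles, alpha=1))  # True
```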

Additionally, we address their comments on the quality of the English language (Minor editing of English language required) by improving the manuscript with additional proof-reading. The corresponding changes highlighted in the revised manuscript are listed below: 

- Unified the references to empirical questions (Q1, Q2, Q3, Q4) by removing unnecessary periods.
- Unified usages of "message-passing" and "message passing" to the latter, as it is more common in the literature.
- Figure 2: 1-WL [and] produces -> removed "and".
- L45: Removed "simply".
- L49: for graphs that [are] match sub-graph... -> removed "are"
- L70: correspondence between [the network layers connectivity] and -> replaced with "the connectivity of network layers"
- Pg 3: additional parametrized functions Readout produce -> "an additional parametrized function dubbed Readout produces a"
- L94: for more details [of] the algorithm -> "on".
- L110: Removed "high-level" and introduced reference to Appendix I.
- L113: [ocurrs] -> "occurs"
- L136: which embed each [node incorporating] identity information -> introduced while in "node while incorporating"
- L234: whose graph [representation is] not distinguishable -> replaced with "representations are".
- L284: Proof of Theorem 1, replaced "By def." with "By definition".
- L236: Equivalency -> "equivalence", for consistency throughout the manuscript.
- L379-380: degrades [when] IGEL is introduced -> added missing "when".
- L399: Equivalency -> "equivalence", for consistency throughout the manuscript.
- L451-454: Clarified the metrics used in Table 5, introducing ↓ and ↑ indicators to clarify the metrics in which lower or higher is better respectively.
- L463: [both] ZINC-12K -> removed "both".
- L481: [methids] -> "methods".
- L490: We report AUC results [reported] -> [averaged on]
- L499: underperforms [compared to] other baselines -> removed "compared to" to improve sentence structure.
- L506: Rewrote the sentence to [In light of our graph-level results on graphlet counting...].
- Table 7: Moved citations in the [Results as reported by] footnote to the superscript of each method to clarify the origin of each result.

Reviewer 2 Report

This manuscript introduces an innovative graph representation method within Graph Neural Networks (GNNs) to address the graph isomorphism problem. The representation captures the structural information of graph data by encoding node distances and degrees, yielding a more expressive encoding. This enhanced expressiveness comes with increased computational and space complexity; to mitigate these costs, the authors suggest employing multiple processors to reduce computation time and using a sparse vector representation with linear space requirements.

The most remarkable aspect of this paper lies in the authors' formal proofs demonstrating that ego-networks can generate a structural encoding scheme for arbitrary graphs with greater expressivity than the 1-WL test. Furthermore, the authors show the limitations of their proposed encoding method, particularly its inability to distinguish fully meshed graphs.

The manuscript is well-written, with clear and comprehensible explanations of the methodology and proofs. The proposed encoding method has undergone rigorous experimental evaluations, affirming its potential significance in solving the graph isomorphism problem. In conclusion, this paper presents a promising approach to address graph isomorphism. 

Only some minor typos. For example, on page 8, Figure 2: "1-WL (Algorithm 1) and produces"; remove "and".

Author Response

We thank the reviewer for their helpful comments and consideration of our work.

We address their comments on the quality of the English language (Only some minor typos; Minor editing of English language required) by improving the manuscript with additional proof-reading. The corresponding changes highlighted in the revised manuscript are listed below:

- Unified the references to empirical questions (Q1, Q2, Q3, Q4) by removing unnecessary periods.
- Unified usages of "message-passing" and "message passing" to the latter, as it is more common in the literature.
- Figure 2: 1-WL [and] produces -> removed "and".
- L45: Removed "simply".
- L49: for graphs that [are] match sub-graph... -> removed "are"
- L70: correspondence between [the network layers connectivity] and -> replaced with "the connectivity of network layers"
- Pg 3: additional parametrized functions Readout produce -> "an additional parametrized function dubbed Readout produces a"
- L94: for more details [of] the algorithm -> "on".
- L110: Removed "high-level" and introduced reference to Appendix I.
- L113: [ocurrs] -> "occurs"
- L136: which embed each [node incorporating] identity information -> introduced while in "node while incorporating"
- L234: whose graph [representation is] not distinguishable -> replaced with "representations are".
- L284: Proof of Theorem 1, replaced "By def." with "By definition".
- L236: Equivalency -> "equivalence", for consistency throughout the manuscript.
- L379-380: degrades [when] IGEL is introduced -> added missing "when".
- L399: Equivalency -> "equivalence", for consistency throughout the manuscript.
- L451-454: Clarified the metrics used in Table 5, introducing ↓ and ↑ indicators to clarify the metrics in which lower or higher is better respectively.
- L463: [both] ZINC-12K -> removed "both".
- L481: [methids] -> "methods".
- L490: We report AUC results [reported] -> [averaged on]
- L499: underperforms [compared to] other baselines -> removed "compared to" to improve sentence structure.
- L506: Rewrote the sentence to [In light of our graph-level results on graphlet counting...].
- Table 7: Moved citations in the [Results as reported by] footnote to the superscript of each method to clarify the origin of each result.

Reviewer 3 Report

I recommend accepting the article after proofreading. I suggest adding a table to the Related Works section with a comparison of existing solutions.

I recommend proofreading.

Author Response

We thank the reviewer for their helpful comments and consideration of our work.

I suggest adding a table to the Related Works section with a comparison of existing solutions. Following the recommendation of the reviewer, we have introduced a new Appendix I including Table A3, in which we compare existing solutions for increasing the expressivity of GNNs beyond 1-WL, highlighting model architectures, their extensions to the message passing mechanism, and their limitations. We reference Appendix I in the Related Work section covering GNNs beyond 1-WL (Section 2.2.1) to provide readers with additional context, as suggested by the reviewer.

We address their comments on the quality of the English language (I recommend accepting the article after proofreading; Minor editing of English language required) by improving the manuscript with additional proof-reading. The corresponding changes highlighted in the revised manuscript are listed below:

- Unified the references to empirical questions (Q1, Q2, Q3, Q4) by removing unnecessary periods.
- Unified usages of "message-passing" and "message passing" to the latter, as it is more common in the literature.
- Figure 2: 1-WL [and] produces -> removed "and".
- L45: Removed "simply".
- L49: for graphs that [are] match sub-graph... -> removed "are"
- L70: correspondence between [the network layers connectivity] and -> replaced with "the connectivity of network layers"
- Pg 3: additional parametrized functions Readout produce -> "an additional parametrized function dubbed Readout produces a"
- L94: for more details [of] the algorithm -> "on".
- L110: Removed "high-level" and introduced reference to Appendix I.
- L113: [ocurrs] -> "occurs"
- L136: which embed each [node incorporating] identity information -> introduced while in "node while incorporating"
- L234: whose graph [representation is] not distinguishable -> replaced with "representations are".
- L284: Proof of Theorem 1, replaced "By def." with "By definition".
- L236: Equivalency -> "equivalence", for consistency throughout the manuscript.
- L379-380: degrades [when] IGEL is introduced -> added missing "when".
- L399: Equivalency -> "equivalence", for consistency throughout the manuscript.
- L451-454: Clarified the metrics used in Table 5, introducing ↓ and ↑ indicators to clarify the metrics in which lower or higher is better respectively.
- L463: [both] ZINC-12K -> removed "both".
- L481: [methids] -> "methods".
- L490: We report AUC results [reported] -> [averaged on]
- L499: underperforms [compared to] other baselines -> removed "compared to" to improve sentence structure.
- L506: Rewrote the sentence to [In light of our graph-level results on graphlet counting...].
- Table 7: Moved citations in the [Results as reported by] footnote to the superscript of each method to clarify the origin of each result.
