Article
Peer-Review Record

Local Community Detection in Graph Streams with Anchors

Information 2023, 14(6), 332; https://doi.org/10.3390/info14060332
by Konstantinos Christopoulos 1,*, Georgia Baltsou 2 and Konstantinos Tsichlas 1
Reviewer 2: Anonymous
Reviewer 3:
Reviewer 4:
Submission received: 10 April 2023 / Revised: 29 May 2023 / Accepted: 6 June 2023 / Published: 12 June 2023
(This article belongs to the Special Issue Multidimensional Data Structures and Big Data Management)

Round 1

Reviewer 1 Report (Previous Reviewer 2)

Unfortunately the remarks were not taken into account.

- Concerning a practical application, they just cited some examples and did not perform a real application using the proposed method.

- Concerning the complexity, in the worst case the algorithm seems very time-consuming, O(N^3). This result is not compared to existing approaches, nor is there a clear study of the execution time.

- Concerning the F1 metric used to evaluate their approach, there is no clear explanation of why this metric was used (or of the steps of its calculation). They just stated that other papers did the same.

 

I still found some mistakes and grammatical errors.

Author Response

Dear reviewer,

Thank you very much for your time; we appreciate your effort. We believe that all comments have been addressed in our revised submission.

First, we would like to thank you for the time you have invested in considering our paper. In the first round, the reviewers helped us a lot in revising our paper, and certain parts of it were rewritten or clarified.
Comment 1: The reviewer insists that the paper does not contain a real application. Community detection is a well-known problem with many applications and our paper contributes to the case of local incremental community detection with minimal external (non-structural) knowledge. We have done experiments on real datasets, exhibiting the effectiveness of our approach. In fact, it was this effectiveness that we wished to highlight in the paper.
Comment 2: The reviewer suggested that we discuss the worst-case complexity of the incremental algorithm based on a single parameter. The worst-case complexity based on only one parameter is a very pessimistic measure and, in fact, it is avoided in the analysis of community detection algorithms, since in the worst case a single edge update can change everything. This is why the literature prefers to use several parameters (such as the mean degree of the graph), although even in this case the final complexity is again rather pessimistic. Nevertheless, we provided such an analysis. We also have experimental results concerning the execution time, but we focused on the effectiveness of the method in discovering the local community, since we wanted to show that our approach provides good results.
Comment 3: The reviewer thought that we have not justified the use of the F1 score. Setting aside the fact that we have added a discussion of why we use precision/recall and the F1 score, which is related to the Jaccard similarity as stated in the paper, the reviewer argues that a list of papers/surveys on community detection that use these metrics is not a valid argument in their favor.
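
For reference, the relation between the F1 score and the Jaccard similarity mentioned in Comment 3 can be written out as follows. This is a sketch under the assumption that both measures are computed on node sets; the symbols C (detected community) and G (ground-truth community) are introduced here for illustration and are not taken from the paper.

```latex
% C: detected community, G: ground-truth community (both are node sets)
\[
P = \frac{|C \cap G|}{|C|}, \qquad
R = \frac{|C \cap G|}{|G|}, \qquad
F_1 = \frac{2PR}{P + R} = \frac{2\,|C \cap G|}{|C| + |G|}
\]
\[
J = \frac{|C \cap G|}{|C \cup G|}
  = \frac{|C \cap G|}{|C| + |G| - |C \cap G|}
\quad\Longrightarrow\quad
F_1 = \frac{2J}{1 + J}
\]
```

Under this set-based reading, F1 is a monotonically increasing function of J, so the two measures rank detected communities in the same order.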

 

Yours sincerely,

Konstantinos Christopoulos

Georgia Baltsou

Konstantinos Tsichlas

Reviewer 2 Report (Previous Reviewer 1)

Minor comments from the previous round are fixed.

 

Figures 4-13 with results are visually improved.

 

Time complexity of Algorithm 2 is updated.

 

Multiple parts of the manuscript are enhanced to make the work clearer.

 

The pseudocode of Algorithm 2 is missing the initialization of Steps 1 and 2.

Author Response

Dear reviewer,

Thank you very much for your time and we appreciate your effort.

Yours sincerely,

Konstantinos Christopoulos

Georgia Baltsou

Konstantinos Tsichlas

Reviewer 3 Report (New Reviewer)

Very interesting and timely article. I think it deserves publication and I am recommending accept with corrections. There are some issues that require your attention. I list these corrections below as feedback / comments, and I am looking forward to reading the updated version of this article. 

 

- This tracking is directly related to what we tried to achieve in the COVID-19 pandemic, and yet it is not discussed in relation to the existing literature on this topic.

- I have finished reading the article and I didn’t see any mention of the ethics of data privacy risks arising from these new Local Community Detection in Graph Streams with Anchors processes. You have done a really good job of reviewing so many articles, but there is not a single article on ethics and risk. There are recent articles that review recent and relevant literature, for example, on the related topic of ‘ethics of shared Covid-19 risks’ (see: https://doi.org/10.1007/s12553-021-00565-3) and on the related topic of ‘Ethics and Shared Responsibility in Health Policy’ (see: https://doi.org/10.3390/su13158355). It would be interesting to see a few sentences reviewing and comparing your work in relation to these recent studies on related topics.

 

Otherwise, well done for a very interesting paper. The updated conclusion seems very well designed. 

Language seems OK. 

Author Response

Dear reviewer,

Thank you very much for your time; we appreciate your effort. Your comments regarding ethics and data privacy are really important, and we added a few sentences with relevant literature to the introduction. Furthermore, we added an application of our method related to COVID-19 pandemic cases and extended the discussion section. Finally, we added a paragraph in Section 4.4.1 regarding the execution time.

Yours sincerely,

Konstantinos Christopoulos

Georgia Baltsou

Konstantinos Tsichlas

Reviewer 4 Report (New Reviewer)

In this paper, the authors study the evolution of a single community containing a particularly important node called the anchor.

The topic considered by the authors is extremely interesting, although it is already much studied in the literature. However, the proposed idea has some innovative points in the presence of the anchor.

The authors say they want to study communities from a dynamic point of view and not from a static point of view. However, the related literature lacks some papers that deal with this very topic. Among them I highlight "Investigating community evolutions in TikTok dangerous and non-dangerous challenge" and "A framework for investigating the dynamics of user and community sentiments in a social platform." The authors should enhance this part of the paper with these and additional citations.

The technical description of the approach is interesting and the experiments conducted appear convincing.

The authors should enhance the Discussion part, which is extremely sketchy.

The English is good

Author Response

Dear reviewer,

Thank you very much for your time; we appreciate your effort. Your comments regarding the related literature are really important, and we added a few sentences on the highlighted literature to the introduction and the related work section. Furthermore, we added an application of our method related to COVID-19 pandemic cases and enhanced the discussion section. Finally, we added a paragraph in Section 4.4.1 regarding the execution time.

Yours sincerely,

Konstantinos Christopoulos

Georgia Baltsou

Konstantinos Tsichlas

Round 2

Reviewer 1 Report (Previous Reviewer 2)

Unfortunately, none of the remarks were taken into account. Furthermore, some were even misunderstood, namely those concerning the time complexity calculation. There is no practical application on a real dataset in which the authors highlight the interest or usefulness of their approach, rather than limiting themselves to the calculation of a score. Clear examples can be found in any quality paper on community detection, for instance, ones that interpret the detected communities.

Second, they mentioned that they performed experiments concerning the execution time, but they do not present them in this paper. Why? It is well known that a community detection method, to be relevant, should not only provide good results but also run in a reasonable amount of time. Anybody can design a method that outputs good results, but if it takes many years it is not practical.

Finally, concerning the use of the F1-score, they simply cannot justify the use of this metric. The results of the paper should stand on their own without referring to already published papers.

 

Author Response

Dear reviewer,

Thank you very much for your time; we appreciate your effort. Regarding your comments about the execution time of our approach, we added a paragraph in Section 4.4.1. Furthermore, we added two motivating examples in the introduction and extended the discussion section. Finally, we believe that the remaining comments have been answered.

Yours sincerely,

Konstantinos Christopoulos

Georgia Baltsou

Konstantinos Tsichlas

Reviewer 3 Report (New Reviewer)

well done. 

Reviewer 4 Report (New Reviewer)

The authors have striven to comply with all my suggestions. Therefore, in my opinion, the paper can be published.

This manuscript is a resubmission of an earlier submission. The following is a list of the peer review reports and author responses from that submission.


Round 1

Reviewer 1 Report

The paper considers the problem of identifying local communities around a specific node, called anchor, of a dynamic graph, where edges are added and deleted over time. 

They consider the streaming graph model for getting edges that are added or deleted.

Their main goal is to explore the community structure evolution of an anchor node that holds an important role in the network.

Algorithm Local Community Detection with Anchors (LCD-A) consists of 5 steps. They evaluate their proposed algorithm by performing experiments on three synthetic and two real world networks, with setting different hyperparameters, and comparing to an existing dynamic algorithm and an existing static algorithm.

 

 

Innovations of their approach: the idea of anchor nodes based on importance, adding rewards to all the edges of the triangles that the anchor node participates in, and detecting stable communities of the anchor node in time evolving networks.
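
As an illustration of the triangle-reward idea summarized above, here is a minimal sketch. It is not the authors' LCD-A implementation: the function name `reward_anchor_triangles`, the adjacency-dict graph representation, the default edge weight of 1.0, and the additive reward are all assumptions made for this example.

```python
from collections import defaultdict
from itertools import combinations


def reward_anchor_triangles(adj, anchor, reward=1.0):
    """Return edge weights after rewarding every triangle that contains `anchor`.

    `adj` maps each node to the set of its neighbors (undirected graph).
    Assumption: every edge starts at weight 1.0 and each triangle adds
    `reward` to each of its three edges. Edges that lie in no anchor
    triangle are simply absent from the returned mapping.
    """
    weights = defaultdict(lambda: 1.0)
    neighbors = adj.get(anchor, set())
    # A triangle through the anchor is a pair of its neighbors that are adjacent.
    for u, v in combinations(sorted(neighbors), 2):
        if v in adj.get(u, set()):
            for edge in ((anchor, u), (anchor, v), (u, v)):
                weights[frozenset(edge)] += reward
    return weights
```

With this convention, an edge that belongs to several anchor triangles accumulates several rewards; whether the paper caps or normalizes the reward is not stated in this record.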

The intuition is to limit the community update to a region of influence around the anchor only; here, this region is its immediate neighbors.

The decision of which nodes to delete is based on a quality measure, which they call fitness score. The two quality measures are: the conductance and the fmonc (no name other than that is given).
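
For readers unfamiliar with the first of these quality measures, the sketch below computes conductance in its common textbook form, cut(S) / min(vol(S), vol(V \ S)), on an unweighted undirected graph. The paper may use a different variant, and the function name and adjacency-dict representation are assumptions for this example.

```python
def conductance(adj, community):
    """Conductance of `community` in an undirected, unweighted graph.

    `adj` maps each node to the set of its neighbors. Lower values mean
    the community is better separated from the rest of the graph.
    Convention assumed here: return 0.0 when either side has zero volume.
    """
    community = set(community)
    cut = 0       # edges with exactly one endpoint inside the community
    vol_in = 0    # sum of degrees of nodes inside the community
    vol_out = 0   # sum of degrees of nodes outside the community
    for node, nbrs in adj.items():
        if node in community:
            vol_in += len(nbrs)
            cut += sum(1 for n in nbrs if n not in community)
        else:
            vol_out += len(nbrs)
    denom = min(vol_in, vol_out)
    return cut / denom if denom else 0.0
```

For two triangles joined by a single edge, taking one triangle as the community gives one cut edge over a volume of seven, which is the kind of low score a well-separated community should receive.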

 

The experimental results are promising. The problem itself is interesting, and the rewards mechanism is meaningful.

 

 

Some minor comments:

Add citation to Algorithm 1 in algo environment

 

In Algorithm 2, step 1, is RW the reward?

 

For Figures 3-6 clarify that the x-coordinate is actually the iterations

 

In Figure 6 the red line is not shown clearly (in iteration 541 does it drop?)

For Dynamic with reward, which one gives better results from iteration 335 onward? It is not clear; it seems to be the one with the orange line, not the grey line.

 

Figure 12 the blue line is missing.

 

What are the tuples under figures 3,6,8, 9, 10, 11, 12 showing?

 

Author Response

1. Add citation to Algorithm 1 in algo environment
REPLY: On page 7, a citation has been added in the algo environment.

2. Algorithm 2 step 1 is RW the reward?
REPLY: On page 8, a comment has been added in the algo environment.

3. For Figures 3-6 clarify that the x-coordinate is actually the iterations
REPLY: On page 11, in line 361, the x-coordinate is clarified.
 
4. In Figure 7 the red line is not shown clearly (in iteration 541 does it drop?)
REPLY: The quality of all images has been improved. Yes, in Figure 7 the red line drops in iteration 541.

5. For Dynamic with reward, which one gives better results from iterations 335? It is not clear, it seems the one with the orange line not the grey line.
REPLY: From iteration 335 onward, in Figure 7, the grey line (static with reward, now shown in yellow) performs better, on average, than the orange one (now shown in red).
 
6. Figure 12 the blue line is missing.
REPLY: The quality of the image has been improved and now it is obvious that the blue line is not missing.

7. What are the tuples under figures 3,6,8, 9, 10, 11, 12 showing?
REPLY: We have added extra comments under the figures in order to explain what the tuples represent.

Reviewer 2 Report

This paper tackles the problem of local community detection in dynamic networks using what the authors call an anchor, which is a specific node in the network. Although the authors motivate the application of this approach in the introduction, there is no clear, specific application where this work can be applied. In Section 4, they evaluate their approach using the F1 score, and they do not highlight the importance or interest of following the evolution of the community of a specific node with a clear practical application. So, finally, the interest of this approach does not seem justified.

Additional issues:

- Furthermore, the complexity of the proposed approach should be calculated and compared to the existing ones.

- The acronym RW is defined nowhere.

- Some images are blurred.

Finally, from my point of view, evaluating the approach using synthetic networks is not enough. The authors should clearly state the definition of community that they are considering. For instance, in [1] the authors show that there exist different definitions of (local) community and that, depending on the definition, different approaches are more or less suited. Finally, I do not agree that the anchor node should be at the center of a community; if it has a low degree, it is usually on the border because it does not occupy a central position.

[1] P. Conde-Céspedes, B. Ngonmang and E. Viennet, "An efficient method for mining the maximal α-quasi-clique-community of a given node in complex networks", Social Network Analysis and Mining, December 2018, 8:20

Author Response

This paper tackles the problem of local community detection in dynamic networks using what the authors call an anchor, which is a specific node in the network. Although the authors motivate the application of this approach in the introduction, there is no clear, specific application where this work can be applied.

REPLY: Introduction Section, lines 53–64. One possible application of the proposed approach is an IoT network. In such networks, the connections between nodes change as time passes. As a result, someone may be interested in uncovering the evolution of the community to which a particular node, for example a switch device, belongs. This node would act as the anchor of this community. More generally, the community detection problem is a fundamental problem in network analysis, and as such, especially for graph streams, any new approach has considerable merit.

 

In Section 4, they evaluate their approach using the F1 score, and they do not highlight the importance or interest of following the evolution of the community of a specific node with a clear practical application.
So, finally, the interest of this approach does not seem justified.

REPLY: On page 10, in Section 4.2.2, we describe the importance of following the evolution of the community of a specific node with a clear practical application.


Additional issues:

- Furthermore, the complexity of the proposed approach should be calculated and compared to the existing ones.
REPLY: On page 9, we present the time complexity of the proposed method.

- The acronym RW is defined nowhere.
REPLY: On page 8, a comment has been added in the algo environment.

- Some images are blurred.
REPLY: The quality of all images has been improved.


Finally, from my point of view, evaluating the approach using synthetic networks is not enough.
REPLY: Experiments in both synthetic (page: 11) and real datasets (page: 15) have been conducted.

The authors should clearly state the definition of community that they are considering. For instance, in [1] the authors show that there exist different definitions of (local) community and that, depending on the definition, different approaches are more or less suited. Finally, I do not agree that the anchor node should be at the center of a community; if it has a low degree, it is usually on the border because it does not occupy a central position.
[1] P. Conde-Céspedes, B. Ngonmang and E. Viennet, "An efficient method for mining the maximal α-quasi-clique-community of a given node in complex networks", Social Network Analysis and Mining, December 2018, 8:20

REPLY:  Problem formulation and Methodology Section, Preliminaries subsection, lines 138-155. We provide the definition of Local Community that is considered throughout the paper. We also provide a general scheme of local community to help the reader better understand the former definition.

Round 2

Reviewer 2 Report

Definitely, there are issues in this article.

First of all, concerning a practical application to emphasize the utility of this approach, only some examples were cited in the introduction, without a real application being contributed in the present article.

Second, the complexity is calculated in terms of the parameters of the algorithm, which is incorrect. It only has to take into account the input data, which is out of our hands! Since the number of "updates" and "actions" depends on the algorithm, they can be set as small as possible. Furthermore, the complexity should be calculated for the worst-case scenario. Once the complexity is properly calculated, it should be compared to existing approaches to highlight the pertinence of the proposed method.

Finally, the metric they use to evaluate the results is the F1 score. Just to recall, the F1 score is a metric coming from supervised machine learning; mathematically, it is the harmonic mean of precision and recall. The problem of community detection is rather similar to unsupervised learning approaches, more specifically clustering. In the unsupervised context, the notions of precision and recall make no sense. The authors should clarify this issue. Other metrics, such as the Jaccard index, are more appropriate.

In addition, there is no explicit mathematical definition of community. In Fig 1, it is not even clear which node is the anchor. The local community is composed of two non-connected components which are not even densely connected, which runs counter to the notion of community.
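
To make the metrics discussed in this report concrete, the following minimal sketch computes precision, recall, F1 and the Jaccard index for one detected community against one ground-truth community. It assumes that a ground-truth node set is available (as in benchmark datasets) and that all four measures are computed on node sets; the function name `overlap_scores` and the example sets are hypothetical and not taken from the paper.

```python
def overlap_scores(detected, truth):
    """Set-overlap scores between a detected and a ground-truth community.

    Both arguments are iterables of node identifiers. Empty inputs yield
    0.0 for the affected scores instead of raising a division error.
    """
    detected, truth = set(detected), set(truth)
    common = len(detected & truth)
    precision = common / len(detected) if detected else 0.0
    recall = common / len(truth) if truth else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    union = len(detected | truth)
    jaccard = common / union if union else 0.0
    return {"precision": precision, "recall": recall,
            "f1": f1, "jaccard": jaccard}


scores = overlap_scores({1, 2, 3, 4}, {3, 4, 5, 6, 7, 8})
```

In this example two of the four detected nodes are correct and two of the six true members are found, so precision is 0.5, recall is 1/3, F1 is 0.4 and the Jaccard index is 0.25.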
